Unification of multi-lingual scientific terminological resources using the ISO 16642 standard. The TermSciences initiative

Authors

  • Majid Khayari
  • Stéphane Schneider
  • Isabelle Kramer
  • Laurent Romary
Abstract

The TermSciences initiative aims at building a multi-purpose and multi-lingual knowledge system from different source vocabularies produced by major French research institutions, which were initially intended for indexing and cataloguing scientific literature. Since the construction of language resource repositories is costly and time-consuming, the producers of these vocabularies wished both to share their terminological material and to develop common tools for the collaborative management of the integrated resource. Sharing terminologies poses some problems because of the heterogeneous nature of the source data (coverage, granularity and compositionality of concepts, etc.) and because of the discrepancy between partner needs (simple diffusion of the terminological material, use of the shared material to enhance information engineering tasks, etc.). This paper presents the TermSciences portal (www.termsciences.fr), which implements a conceptual model based on the recent ISO 16642 standard (Terminological Markup Framework). This standard turned out to be suitable for concept modeling, since it allows the original resources to be organized by concept and the various terms for a given concept to be associated with it. Additional structuring is produced by sharing conceptual relationships, that is, by cross-linking the resources through the introduction of semantic relations that may initially have been missing. Special emphasis is put on the medical resources used in this project: the French translation by the Institut National de la Santé et de la Recherche Médicale (INSERM) of the MeSH thesaurus from the US National Library of Medicine, the public health thesaurus of the Banque de Données de Santé Publique (BDSP), and the dictionary of human and mammalian reproduction biotechnology of the Institut National de la Recherche Agronomique (INRA).
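
The concept-oriented organization described in the abstract can be illustrated with a minimal sketch. It is not the TermSciences implementation and does not reproduce the ISO 16642 data model: the `ConceptEntry` class, the `merge_records` function, the source names, the concept identifiers and the sample terms are all hypothetical, chosen only to show terms from several vocabularies being grouped under one concept and a missing semantic relation being added afterwards.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ConceptEntry:
    """One concept with its terms grouped by language (hypothetical structure)."""
    concept_id: str
    # language code -> set of (term, source vocabulary) pairs
    terms: dict = field(default_factory=lambda: defaultdict(set))
    # relation name -> set of target concept identifiers
    relations: dict = field(default_factory=lambda: defaultdict(set))


def merge_records(records):
    """Group flat (concept_id, lang, term, source) records by concept."""
    entries = {}
    for concept_id, lang, term, source in records:
        entry = entries.setdefault(concept_id, ConceptEntry(concept_id))
        entry.terms[lang].add((term, source))
    return entries


# Illustrative records only: identifiers, terms and sources are invented.
records = [
    ("C1", "fr", "insuffisance cardiaque", "SourceA"),
    ("C1", "en", "heart failure", "SourceA"),
    ("C1", "fr", "défaillance cardiaque", "SourceB"),
    ("C2", "fr", "cardiopathie", "SourceB"),
]

entries = merge_records(records)

# Cross-link the concepts with a relation that neither source carried.
entries["C1"].relations["broader"].add("C2")

for lang, terms in sorted(entries["C1"].terms.items()):
    print(lang, sorted(term for term, _ in terms))
```

Keeping the source vocabulary alongside each term is one possible design choice; it lets a merged entry still show which partner contributed which term.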

Similar resources

Using ISO and Semantic Web standards for creating a Multilingual Medical Interface Terminology: A use case for Heart Failure

The correct registration and encoding of medical data in Electronic Health Records is still a major challenge for health care professionals. Efficient terminological systems are lacking to enable multilingual semantic interoperability between general practitioners, patients, medical specialists, and allied health personnel. The aim of this paper is to propose an architectural structure for a Mu...

Knowledge Exchange and Terminology Interchange: The role of standards

The emergence of standards for storing and retrieving language resources, including terminology and lexicographical data, documents and text corpora, will benefit system developers and users of a range of language and knowledge engineering systems. The developers will be able to cope better with the vagaries of natural language since standardised entries in term databases, or structured documen...

Lexicography for Specialised Languages - Terminology and Terminography Integrated Bilingual Specialist Dictionaries as Added Value for Translation Memories: the LexTerm Initiative

Market surveys in industrial companies have pointed out translators' demand for integrated specialist dictionaries in translation memory tools which they could use in addition to their own compiled dictionaries or stored parts of text. For this purpose the German specialist dictionary publisher, Langenscheidt Fachverlag in Munich proposes a global solution together with experts from the Univers...

A model oriented approach to the mapping of annotation formats using standards

In this paper, we present Salt, a framework for mapping heterogeneous linguistic annotation formats into each other using a model-based approach, i.e. independently of the actual formats in which the corresponding linguistic data is being expressed. As we describe the underlying concept of this framework, we identify how it echoes ongoing standardisation activities within ISO committee TC 37/S...

Automatic Acquisition of Terminological Resources for Information Extraction Applications

In this paper we present a method aiming at (semi-)automating the process of eliciting domain specific terminological resources, in the framework of information extraction applications. The method aims at linguistically processing machine-readable text corpora and extracting lists of candidate multi-word terms of the domain, that would then be validated by domain experts. The method proceeds in...

Journal:
  • CoRR

Volume: abs/cs/0604027

Publication date: 2006