Drug Terminology. A multilingual term database. The AVENTINUS project

Author

  • Christian Sjögreen
Abstract

This paper starts with a brief overall presentation of the AVENTINUS project, merely a list of the different included modules and some comments. Then follows a discussion about drug terminology and finally a description of the design and implementation of a tailored multilingual drug terminology database in MS Access. The tags and links used are presented and discussed, and the inputting situation is described. Then some numerical details of the database and a short concluding remark are given.

Project description

The AVENTINUS Project* aims at supporting an Advanced Information System for Multinational Drug Enforcement. It is funded by the European Union in the Linguistic Engineering (LE) Program, and has several development and user partners. The goal of the project is to support drug enforcement with multilingual linguistic expertise. AVENTINUS will support communication by providing linguistic tools to overcome language communication barriers. Users should be able to access information and receive results of search requests in their own native language, even if the information is derived from foreign language sources. The languages dealt with in the first phase are English, German, Spanish, and Swedish.

AVENTINUS will provide modules and components that can be linked to and integrated into the users' domestic environments. Modularity and integratability are the most prominent features of the software solutions to be provided. The participating users are domestic police organisations and intelligence agencies and the Europol Drug Unit (EDU). Interest from other authorities has also been noticed, though.

* For an exhaustive description of the project, see [THUR97] or take a look at our AVENTINUS web page.

NODALIDA, Copenhagen, January 1998

The data to be handled by AVENTINUS are of several types, the most important ones for AVENTINUS being textual data. Texts from open sources, mainly newswire texts, and internal communication texts, like police reports, will be considered.

According to the User Requirements Report, there are two main scenarios that should be supported, the Indexing (Data Entry) Scenario and the Retrieval (Analysis) Scenario.

In the Indexing Scenario, users are confronted with incoming texts from different sources (fax, telex, electronic) in different languages. They have to decide whether a given input text is relevant or not. AVENTINUS will support this scenario by providing translation support tools, and by providing information-understanding tools (indexing, information extraction).

In the Retrieval Scenario, search support will comprise tools like name search (transliteration, similarity of names) or text search both in structured and textual databases. Translation support tools will be responsible for translating the search requests as well as the search results (structured or textual) into the native language of the searcher.

To support the scenarios, AVENTINUS will provide different types of components, such as Translation Support, Information Processing Support, and Search Support.

The project will have three types of translation support tools: Term Substitution, Translation Memory, and Machine Translation. All of them will be available as stand-alone tools accessing common lexical resources, and as components to be called from standard Windows editors such as WinWord.

There will be several components to process the incoming texts and provide further information for later retrieval, for instance Information Extraction and Indexing.

Search Support refers to several requirements, for instance (i) requests in natural language, as well as in some structured form, (ii) requests in a native language instead of the foreign languages of the database to be searched and (iii) query expansion and navigation possibilities in the area of text search. Search in both structured databases and in textual ones will be supported. The components to be offered comprise the following: Name Search, Search in texts, and Search in structured databases.

In order to support the AVENTINUS application, three types of linguistic resources will be set up which have to do with both multilingual issues and domain modelling: Lexical Database, Thesaurus, and Domain Model.

The architecture of AVENTINUS follows two basic principles. It must be based on components that can be integrated in a very flexible way into the existing system environments of the users, and it must be very flexible in the interaction of the internal components. In many components the AVENTINUS functionality may be called from a standard text processing system. The interface will be available on several platforms.

A first version of the AVENTINUS data pool, including test texts and terminology, is implemented, as a pool to create resources and test specifications. The complete system specifications have been written and reviewed, some of the AVENTINUS components (translation memory, machine translation) are operational, and a test plan is available, with Europol as the first testing environment.
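The abstract mentions a multilingual drug terminology database whose entries carry tags and links, and a Term Substitution tool that draws on common lexical resources. The excerpt does not give the actual MS Access design, so the following is only a minimal sketch of how such a lookup might work. It assumes a concept-oriented layout in which each term carries a language tag and is linked to a shared concept. The table names, column names, the `translate` helper, and the single sample entry are all hypothetical, and SQLite stands in for MS Access.

```python
import sqlite3

# Hypothetical schema (not the paper's actual design): every term is tagged
# with a language code and linked to one language-independent concept.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
    id   INTEGER PRIMARY KEY,
    note TEXT
);
CREATE TABLE term (
    id         INTEGER PRIMARY KEY,
    concept_id INTEGER NOT NULL REFERENCES concept(id),
    lang       TEXT NOT NULL,   -- 'en', 'de', 'es' or 'sv'
    form       TEXT NOT NULL
);
""")

# One illustrative entry covering the four first-phase languages.
conn.execute("INSERT INTO concept (id, note) VALUES (1, 'illustrative entry')")
conn.executemany(
    "INSERT INTO term (concept_id, lang, form) VALUES (?, ?, ?)",
    [(1, "en", "heroin"), (1, "de", "Heroin"),
     (1, "es", "heroína"), (1, "sv", "heroin")],
)


def translate(form, source_lang, target_lang):
    """Return all target-language terms that share a concept with `form`."""
    rows = conn.execute(
        """
        SELECT t2.form
        FROM term AS t1
        JOIN term AS t2 ON t1.concept_id = t2.concept_id
        WHERE t1.form = ? AND t1.lang = ? AND t2.lang = ?
        ORDER BY t2.id
        """,
        (form, source_lang, target_lang),
    ).fetchall()
    return [row[0] for row in rows]


print(translate("heroin", "en", "es"))
```

Routing every lookup through the concept table means that adding a further language only requires new rows in `term`, not a new table per language pair. Whether the AVENTINUS database was organised this way is not stated in the excerpt.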

Similar resources

Multilingual Information Processing: the AVENTINUS Project

The AVENTINUS project is supported by the EU Commission under the Telematics Program. Its aim is to provide software components and linguistic resources to drug enforcement authorities to improve multilingual communication and information processing. Components of the system are tools for translation such as machine translation, translation memory and terminology databases, information extracti...

The LexALP Information System: Term Bank And Corpus For Multilingual Legal Terminology Consolidated

Standard techniques used in multilingual terminology management fail to describe legal terminologies as they are bound to different legal systems and terms do not share a common meaning. In the LexALP project, we use a technique defined for general lexical databases to achieve cross language interoperability between languages of the Alpine Convention. In this paper we present the methodology an...

Reusing the Mikrokosmos Ontology for Concept-based Multilingual Terminology Databases

This paper reports work carried out within a multilingual terminology project (OncoTerm) in which the Mikrokosmos (μK) ontology (Mahesh, 1996; Viegas et al 1999) has been used as a language independent conceptual structure to achieve a truly concept-based terminology database (termbase, for short). The original ontology, containing nearly 4,700 concepts and available in Lisp-like format (Januar...

Multilingual Terminology Extraction and Validation

This paper presents the automatic terminology extraction approach developed within project LIQUID. This project aims at developing a cost-effective solution for the problem of cross-language access to multilingual text databases in technical and scientific domains. Cross-Language Information Retrieval faces a major challenge: organizing unstructured textual information according to its contents...

EASTIN-CL: A multilingual front-end to a database of Assistive Technology products

The document describes an application of language technology to improve the access to a database of Assistive Technology in the EASTIN-CL project. It focuses on engineering aspects of language technology integration. The paper describes the collection of a multilingual terminology database of the domain, and its use in multilingual and multimodal frontend components, especially the design, impl...

ProTermino: a comprehensive web-based terminological management tool based on knowledge representation

This paper aims to describe ProTermino, a comprehensive web-based terminological management system that has been recently developed. This new system is a prototype that intends to fulfill translators’ terminological needs regarding multilingual terminological resources and fill the existing gap of comprehensive terminological management tools. In this paper, we present the main functionalities ...

Publication date: 1998