Accessing and Standardizing Wiktionary Lexical Entries for Supporting the Translation of Labels in Taxonomies for Digital Humanities

نویسندگان

  • Thierry Declerck
  • Karlheinz Mörth
  • Piroska Lendvai
چکیده

We describe the usefulness of Wiktionary, the freely available web-based lexical resource, in providing multilingual extensions to catalogues that serve content-based indexing of folktales and related narratives. We develop conversion tools between Wiktionary and TEI, using ISO standards (LMF, MAF), to make such resources available to both the Digital Humanities community and the Language Resources community. The converted data can be queried via a web interface, while the tools of the workflow are to be released with an open source license. We report on the actual state and functionality of our tools and analyse some shortcomings of Wiktionary, as well as potential domains of application.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Accessing Multilingual Data on the Web for the Semantic Annotation of Cultural Heritage Texts

Our study targets interoperable semantic annotation of Cultural Heritage or eHumanities texts in German and Hungarian. A semantic resource we focus on is the Thompson Motif-index of folk-literature (TMI), the labels of which are available only in English. We investigate the use lexical data on the Web in German and Hungarian for supporting semi-automatic translation of TMI: lexical resources of...

متن کامل

Mental Representation of Cognates/Noncognates in Persian-Speaking EFL Learners

The purpose of this study was to investigate the mental representation of cognate and noncognate translation pairs in languages with different scripts to test the prediction of dual lexicon model (Gollan, Forster, & Frost, 1997). Two groups of Persian-speaking English language learners were tested on cognate and noncognate translation pairs in Persian-English and English-Persian directions with...

متن کامل

Leveraging the Crowdsourcing of Lexical Resources for Bootstrapping a Linguistic Data Cloud

We present a declarative approach implemented in a comprehensive open-source framework based on DBpedia to extract lexicalsemantic resources – an ontology about language use – from Wiktionary . The data currently includes language, part of speech, senses, definitions, synonyms, translations and taxonomies (hyponyms, hyperonyms, synonyms, antonyms) for each lexical word. Main focus is on flexibi...

متن کامل

Manipulation As an Ideological Tool in the Persian Translations of Ervand Abrahamian’s The Coup: A Multimodal CDA Approach

The present Critical Discourse Analysis (CDA) study aimed to explore the probable ideological manipu- lations exerted in three translations of an English political book entitled The Coup by Ervand Abraha- mian. This comparative qualitative study was conducted based on Farahzad‘s three-dimensional CDA model. The textual, paratextual, and ...

متن کامل

Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework

Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012