DOMCAT: A Bilingual Concordancer for Domain-Specific Computer Assisted Translation

نویسندگان

  • Ming-Hong Bai
  • Yu-Ming Hsieh
  • Keh-Jiann Chen
  • Jason S. Chang
چکیده

In this paper, we propose a web-based bilingual concordancer, DOMCAT 1 , for domain-specific computer assisted translation. Given a multi-word expression as a query, the system involves retrieving sentence pairs from a bilingual corpus, identifying translation equivalents of the query in the sentence pairs (translation spotting) and ranking the retrieved sentence pairs according to the relevance between the query and the translation equivalents. To provide high-precision translation spotting for domain-specific translation tasks, we exploited a normalized correlation method to spot the translation equivalents. To ranking the retrieved sentence pairs, we propose a correlation function modified from the Dice coefficient for assessing the correlation between the query and the translation equivalents. The performances of the translation spotting module and the ranking module are evaluated in terms of precision-recall measures and coverage rate respectively.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

TS3: an Improved Version of the Bilingual Concordancer TransSearch

Computer Assisted Translation tools remain the preferred solution of human translators when publication quality is of concern. In this paper, we present our ongoing efforts conducted within TS3, a project which aims at improving the commercial bilingual concordancer TransSearch. The core technology of this Web-based service mainly relies on sentence-level alignment. In this study, we discuss an...

متن کامل

Enhancing the Bilingual Concordancer TransSearch with Word-Level Alignment

Despite the impressive amount of recent studies devoted to improving the state of the art of Machine Translation (MT), Computer Assisted Translation (CAT) tools remain the preferred solution of human translators when publication quality is of concern. In this paper, we present our perspectives on improving the commercial bilingual concordancer TransSearch, a Web-based service whose core technol...

متن کامل

Subsentential Translation Memory for Computer Assisted Writing and Translation

This paper describes a database of translation memory, TotalRecall, developed to encourage authentic and idiomatic use in second language writing. TotalRecall is a bilingual concordancer that support search query in English or Chinese for relevant sentences and translations. Although initially intended for learners of English as Foreign Language (EFL) in Taiwan, it is a gold mine of texts in En...

متن کامل

Using sign language corpora as bilingual corpora for data mining: Contrastive linguistics and computer-assisted..

More and more sign languages nowadays are now documented by large scale digital corpora. But exploiting sign language (SL) corpus data remains subject to the time consuming and expensive manual task of annotating. In this paper, we present an ongoing research that aims at testing a new approach to better mine SL data. It relies on the methodology of corpus-based contrastive linguistics, exploit...

متن کامل

TANGO: Bilingual Collocational Concordancer

In this paper, we describe TANGO as a collocational concordancer for looking up collocations. The system was designed to answer user’s query of bilingual collocational usage for nouns, verbs and adjectives. We first obtained collocations from the large monolingual British National Corpus (BNC). Subsequently, we identified collocation instances and translation counterparts in the bilingual corpu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012