ISI's Participation in the Romanian-English Alignment Task

Authors

  • Alexander M. Fraser
  • Daniel Marcu
Abstract

We discuss results on the shared task of Romanian-English word alignment. The baseline technique is that of symmetrizing two word alignments automatically generated using IBM Model 4. A simple vocabulary reduction technique results in an improvement in performance. We also report on a new alignment model and a new training algorithm based on alternating maximization of likelihood with minimization of error rate.
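The abstract does not say which symmetrization heuristic the baseline uses, so the following is only a minimal sketch of the general idea: two directional alignments are combined by intersection or by union. The function name `symmetrize`, the `method` parameter, and the link representation (pairs of word indices) are assumptions made for illustration, not the authors' implementation.

```python
from typing import Set, Tuple

# A link pairs a source-word index with a target-word index (hypothetical representation).
Link = Tuple[int, int]


def symmetrize(src_to_tgt: Set[Link], tgt_to_src: Set[Link],
               method: str = "intersection") -> Set[Link]:
    """Combine two directional word alignments into one symmetric alignment.

    src_to_tgt holds (source_index, target_index) links.
    tgt_to_src holds (target_index, source_index) links from the reverse model.
    """
    # Flip the reverse-direction links so both sets use (source, target) order.
    flipped = {(s, t) for (t, s) in tgt_to_src}
    if method == "intersection":
        # Keep only links that both directions agree on (favors precision).
        return src_to_tgt & flipped
    if method == "union":
        # Keep every link proposed by either direction (favors recall).
        return src_to_tgt | flipped
    raise ValueError(f"unknown method: {method}")


# Toy example with made-up links for a three-word sentence pair.
forward = {(0, 0), (1, 1), (2, 1)}
backward = {(0, 0), (1, 1), (2, 2)}
both_agree = symmetrize(forward, backward, "intersection")
either = symmetrize(forward, backward, "union")
```

Intersection tends to yield fewer, more reliable links, while union yields more links at lower precision; heuristics between the two extremes are also common in practice.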


Related resources

The Duluth Word Alignment System

The Duluth Word Alignment System participated in the 2003 HLT-NAACL Workshop on Parallel Text shared task on word alignment for both English–French and Romanian–English. It is a Perl implementation of IBM Model 2. We used approximately 50,000 aligned sentences as training data for each language pair, and found the results for Romanian–English to be somewhat better. We also varied the Model 2 di...


Word Alignment for Languages with Scarce Resources

This paper presents the task definition, resources, participating systems, and comparative results for the shared task on word alignment, which was organized as part of the ACL 2005 Workshop on Building and Using Parallel Texts. The shared task included English–Inuktitut, Romanian–English, and English–Hindi sub-tasks, and drew the participation of ten teams from around the world with a total of...


An Evaluation Exercise for Word Alignment

This paper presents the task definition, resources, participating systems, and comparative results for the shared task on word alignment, which was organized as part of the HLT/NAACL 2003 Workshop on Building and Using Parallel Texts. The shared task included Romanian-English and English-French sub-tasks, and drew the participation of seven teams from around the world. 1 Defining a Word Alignme...


TREQ-AL: A word alignment system with limited language resources

We provide a rather informal presentation of a prototype system for word alignment based on our previous translation equivalence approach, discuss the problems encountered in the shared-task on word-aligning of a parallel Romanian-English text, present the preliminary evaluation results and suggest further ways of improving the alignment accuracy.


Transferring Coreference Chains through Word Alignment

This paper investigates the problem of automatically annotating resources with NP coreference information using a parallel corpus, English-Romanian, in order to transfer, through word alignment, coreference chains from the English part to the Romanian part of the corpus. The results show that we can detect Romanian referential expressions and coreference chains with over 80% F-measure, thus usi...



Publication date: 2005