Generation of Vietnamese for French-Vietnamese and English-Vietnamese Machine Translation

Author

  • Hai Doan-Nguyen
Abstract

This paper presents the implementation of the Vietnamese generation module in ITS3, a multilingual machine translation (MT) system based on Government and Binding (GB) theory. Despite the system's well-designed generic mechanisms, generating Vietnamese posed non-trivial problems. We therefore had to deviate from the generic code and produce new designs and implementations in many important cases. By developing the corresponding bilingual lexicons, we obtained prototypes of French-Vietnamese and English-Vietnamese MT, the former being the first known prototype of its kind. Our experience suggests that in a principle-based generation system, the parameterized modules, which contain language-specific and lexicalized properties, deserve more attention, and that the generic mechanisms should be flexible enough to facilitate the integration of these modules.

Similar resources

Mining a Comparable Text Corpus for a Vietnamese-French Statistical Machine Translation System

This paper presents our first attempt at constructing a Vietnamese-French statistical machine translation system. Since Vietnamese is an under-resourced language, we concentrate on building a large Vietnamese-French parallel corpus. A document alignment method based on publication date, special words and sentence alignment result is proposed. The paper also presents an application of the obtaine...

The UMD Machine Translation Systems at IWSLT 2015

We describe the University of Maryland machine translation systems submitted to the IWSLT 2015 French-English and Vietnamese-English tasks. We built standard hierarchical phrase-based models, extended in two ways: (1) we applied novel data selection techniques to select relevant information from the large French-English training corpora, and (2) we experimented with neural language models. Our ...

Implementing Project Work in Teaching English at High School: The Case of Vietnamese Teachers’ Challenges

Research on using project work in teaching various disciplines has pointed out a number of challenges facing teachers. Comparable research in the EFL classroom, however, remains scarce. This study aimed to fill the gap with a report on the Vietnamese high school teachers’ challenges in implementing project-based learning in the setting of curricular innovation in English instruction nat...

EVBCorpus - A Multi-Layer English-Vietnamese Bilingual Corpus for Studying Tasks in Comparative Linguistics

Bilingual corpora play an important role as resources not only for machine translation research and development but also for studying tasks in comparative linguistics. Manual annotation of word alignments is of significance to provide a gold-standard for developing and evaluating machine translation models and comparative linguistics tasks. This paper presents research on building an English-Vi...

Building an Annotated English-Vietnamese Parallel Corpus for Training Vietnamese-related NLPs

In NLP (Natural Language Processing) tasks, the greatest difficulty that computers face is the built-in ambiguity of natural languages. Formerly, disambiguation was based on human-devised rules. Building such a complete rule set is a time-consuming and labor-intensive task, and it still doesn’t cover all the cases. Besides, when the scale of system increases, it is very difficult t...

Publication date: 2001