ESANA: Hybrid Machine Translation Approach for English-to-Sinhala Language Translation

Author

  • A. E. Ekanayake
Abstract

In modern society, the Internet has become the most popular and efficient communication medium. Most Internet resources are available in English. However, the English fluency rate in the majority of countries is not at a satisfactory level. In Sri Lanka, the English fluency rate has been observed to decline over the past 30 years. The need for an English-to-Sinhala translator is therefore increasing, especially because no such tool is available in some popular resources such as Google Translator. Different types of translation tools exist, developed using different techniques. Translation tools developed using the rule-based machine translation (RBMT) approach rely on a large number of built-in linguistic rules and are not capable of handling language ambiguity. Translation tools developed using the statistical machine translation (SMT) approach require a huge bilingual text corpus. In this paper, we developed an English-to-Sinhala translator called the Esana translator. It was developed using a hybrid machine translation approach, which combines the RBMT and SMT approaches. The evaluation results show that the Esana translator provides more accurate translations than the other approaches. We argue that its higher accuracy is mainly due to its ability to handle language ambiguity properly. Our concept can easily be applied to other language pairs with slight modifications.
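The general idea of combining the two approaches can be illustrated with a toy sketch. This is not the Esana implementation, which the abstract does not describe in detail. It assumes a hypothetical lexicon (`LEXICON`), a hypothetical target-side corpus (`TARGET_CORPUS`), placeholder target tokens rather than real Sinhala words, and a single example rule that reorders subject-verb-object input into subject-object-verb output. The rule step enumerates candidate translations, and a simple bigram count, standing in for a statistical model, picks among them to resolve the ambiguous word "bank".

```python
from collections import Counter
from itertools import product

# Hypothetical lexicon; target tokens are placeholders, not real Sinhala.
LEXICON = {
    "he": ["HE"],
    "crossed": ["CROSSED"],
    "river": ["RIVER"],
    "bank": ["BANK_FINANCE", "BANK_SHORE"],  # ambiguous source word
}

# Hypothetical target-side corpus standing in for SMT training data.
TARGET_CORPUS = [
    ["HE", "RIVER", "BANK_SHORE", "CROSSED"],
    ["HE", "BANK_FINANCE", "VISITED"],
    ["SHE", "RIVER", "BANK_SHORE", "SAW"],
]


def bigram_counts(corpus):
    """Count adjacent token pairs in the target-side corpus."""
    counts = Counter()
    for sentence in corpus:
        counts.update(zip(sentence, sentence[1:]))
    return counts


def rule_based_candidates(subject, verb, obj):
    """Rule step: reorder SVO to SOV, then expand every lexicon option."""
    ordered = subject + obj + verb
    options = [LEXICON[word] for word in ordered]
    return [list(choice) for choice in product(*options)]


def score(candidate, counts):
    """Statistical step: crude fluency score from bigram counts (plus one each)."""
    return sum(counts[pair] + 1 for pair in zip(candidate, candidate[1:]))


def translate(subject, verb, obj):
    """Return the rule-generated candidate that the bigram counts favor."""
    counts = bigram_counts(TARGET_CORPUS)
    candidates = rule_based_candidates(subject, verb, obj)
    return max(candidates, key=lambda c: score(c, counts))


if __name__ == "__main__":
    print(translate(["he"], ["crossed"], ["river", "bank"]))
```

In this sketch the rules alone cannot choose between the two senses of "bank"; the corpus statistics make the choice, which is the division of labor a hybrid design aims for.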

Similar resources

A Computational Grammar of Sinhala for English-sinhala Machine Translation

Communication is fundamental to the evolution and development of all kinds of living beings. Without dispute, languages should be recognized as the most amazing artifacts ever developed by mankind to enable communication. The computer has also become such a unique machine, due to its capacity to communicate with humans through languages. It is worth mentioning that the languages understood by comp...

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

A Statistical Machine Translation Approach to Sinhala-Tamil Language Translation

Data-driven approaches to Machine Translation have come to the fore of Language Processing Research over the past decade. The relative success, in terms of robustness, of Example-Based and Statistical approaches has given rise to a new optimism and an exploration of other data-driven approaches such as Maximum Entropy language modeling. Much of the work in the literature however, largely report ...

A Comparative Study of English-Persian Translation of Neural Google Translation

Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...

Sinhala-Tamil Machine Translation: Towards better Translation Quality

Statistical Machine Translation (SMT) is a well-known and well-established data-driven approach used for language translation. The focus of this work is to develop a statistical machine translation system for the Sri Lankan language pair Sinhala and Tamil. This paper presents a systematic investigation of how Sinhala-Tamil SMT performance varies with the amount of parallel training data us...

Publication date: 2014