ESANA: Hybrid Machine Translation Approach for English-to-Sinhala Language Translation
Abstract
The Internet has become the most popular and efficient communication medium in modern society, and most Internet resources are available in English. However, English fluency in the majority of countries is not at a satisfactory level. In Sri Lanka, the English fluency rate has been observed to decline over the past 30 years. The need for an English-to-Sinhala translator is therefore growing, especially because such a tool is unavailable in some popular services such as Google Translate. Several types of translation tools exist, built with different techniques. Tools built with the rule-based machine translation (RBMT) approach rely on a very large number of built-in linguistic rules and cannot handle language ambiguity. Tools built with the statistical machine translation (SMT) approach require a huge bilingual text corpus. In this paper, we present an English-to-Sinhala translator called the Esana translator. It was developed using a hybrid machine translation approach that combines RBMT and SMT. The evaluation results show that the Esana translator provides more accurate translations than the other approaches. We argue that this higher accuracy is mainly due to its ability to handle language ambiguity properly. Our concept can easily be applied to other language pairs with slight modifications.
Similar resources
A Computational Grammar of Sinhala for English-Sinhala Machine Translation
Communication is fundamental to the evolution and development of all kinds of living beings. With no disputes, languages should be recognized as the most amazing artifacts ever developed by mankind to enable communication. Computer has also become such a unique machine, due to its capacity to communicate with humans through languages. It is worth mentioning that the languages understood by comp...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
A Statistical Machine Translation Approach to Sinhala-Tamil Language Translation
Data-driven approaches to Machine Translation have come to the fore of Language Processing Research over the past decade. The relative success in terms of robustness of Example Based and Statistical approaches have given rise to a new optimism and an exploration of other data-driven approaches such as Maximum Entropy language modeling. Much of the work in the literature however, largely report ...
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
Sinhala-Tamil Machine Translation: Towards better Translation Quality
Statistical Machine Translation (SMT) is a well-known and well-established data-driven approach used for language translation. The focus of this work is to develop a statistical machine translation system for the Sri Lankan language pair Sinhala and Tamil. This paper presents a systematic investigation of how Sinhala-Tamil SMT performance varies with the amount of parallel training data us...