Improving Translation Quality Of Rule-Based Machine Translation
Abstract
This paper proposes machine learning techniques that help disambiguate word meaning. These methods consider the relationship between a word and its surroundings, which the paper calls context information. Context information, such as part-of-speech tags, semantic concepts, and case relations, is produced by the rule-based translation. To extract the context information automatically, we apply three machine learning algorithms: C4.5, C4.5rule, and RIPPER. We test on ParSit, an interlingual-based machine translation system for English to Thai. To evaluate our approach, we select the verb to be, because it has increased in frequency and is difficult to translate into Thai using linguistic rules alone. The results show that the accuracies of C4.5, C4.5rule, and RIPPER are 77.7%, 73.1%, and 76.1% respectively, whereas ParSit alone achieves an accuracy of only 48%.
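As a rough illustration of how context information can drive a learner such as C4.5, the sketch below computes gain ratio, the criterion C4.5 uses to choose which attribute to split on. It is not the authors' implementation: the feature names (`next_pos`, `subj_concept`), their values, and the translation classes (`copula`, `locative`, `omit`) are hypothetical placeholders, and the six examples are invented toy data rather than anything taken from ParSit.

```python
import math
from collections import Counter

# Hypothetical toy data: context features around an occurrence of "to be",
# paired with an abstract translation class. None of this comes from the paper.
EXAMPLES = [
    ({"next_pos": "NOUN", "subj_concept": "person"}, "copula"),
    ({"next_pos": "NOUN", "subj_concept": "object"}, "copula"),
    ({"next_pos": "PREP", "subj_concept": "person"}, "locative"),
    ({"next_pos": "PREP", "subj_concept": "object"}, "locative"),
    ({"next_pos": "ADJ", "subj_concept": "person"}, "omit"),
    ({"next_pos": "ADJ", "subj_concept": "object"}, "omit"),
]


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(examples, feature):
    """Information gain of splitting on `feature`, normalised by split info."""
    labels = [label for _, label in examples]
    # Group the class labels by the value the feature takes.
    groups = {}
    for features, label in examples:
        groups.setdefault(features[feature], []).append(label)
    total = len(examples)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy([features[feature] for features, _ in examples])
    return gain / split_info if split_info > 0 else 0.0


def best_feature(examples):
    """Pick the context feature with the highest gain ratio."""
    return max(examples[0][0].keys(), key=lambda f: gain_ratio(examples, f))
```

On this toy data, `best_feature(EXAMPLES)` selects `next_pos`, because that feature separates the three classes perfectly while `subj_concept` carries no information about them. A full C4.5 learner would apply this choice recursively and then prune the resulting tree.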
Similar Resources
Using machine learning to improve rule-based machine translation
We present an experiment of using transformation-based learning for improving translation quality of a rule-based machine translation system by means of post-processing. Transformation rules are learned based on a parallel corpus of machine translation output and a human-corrected version of the output. The experiment resulted in a significant increase in translation quality of 0.8 measured usi...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Applying Statistical Post-Editing to English-to-Korean Rule-based Machine Translation System
Conventional rule-based machine translation systems suffer from weak fluency in target language generation. In particular, when translating spoken English into Korean, the fluency of the translation is as important as its adequacy for readability and understanding. This problem is more severe in language pairs such as English-Korean. It’s because Englis...
Rule Extraction Applied in Language Translation
Machine translation (MT) has been used to address inherent problems from human translators. However, the quality of machine translations is usually unacceptable. Research has focused on improving quality by incorporating machine learning for translation. One example is TWiRL, which translates English sentences to Filipino. However, TWiRL’s approach presented a strict requirement of a...
A New Subtree-Transfer Approach to Syntax-Based Reordering for Statistical Machine Translation
In this paper we address the problem of translating between languages with word order disparity. The idea of augmenting statistical machine translation (SMT) by using a syntax-based reordering step prior to translation, proposed in recent years, has been quite successful in improving translation quality. We present a new technique for extracting syntax-based reordering rules, which are derived ...