The Effects of Word Order and Segmentation on Translation Retrieval Performance
نویسندگان
چکیده
This research looks at the effects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and word order-sensitive similarity metrics, and test each over characterbased and word-based indexing. The translation retrieval performance of each system configuration is evaluated empirically through the notion of word edit distance between translation candidate outputs and the model translation. Our results indicate that character-based indexing is consistently superior to word-based indexing, suggesting that segmentation is an unnecessary luxury in the given domain. Word order-sensitive approaches are demonstrated to generally outperform bag-of-words methods, with source language segment-level edit distance proving the most effective similarity metric.
منابع مشابه
Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction
The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in the short term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...
متن کاملBaldwin, Timothy (2010) The Hare and the Tortoise: Speed and Accuracy in Translation Retrieval, Machine Translation 23(4), pp. 195-240
This research looks at the effects of segment order and segmentation on translation retrieval performance for an experimental Japanese–English translation memory system. We implement a number of both bag-of-words and segment-ordersensitive string comparison methods, and test each over character-based and wordbased indexing using n-grams of various orders. To evaluate accuracy, we propose an aut...
متن کاملThe E ects of Word Order and Segmentation on TranslationRetrieval
This research looks at the eeects of word order and segmentation on translation retrieval performance for an experimental Japanese-English translation memory system. We implement a number of both bag-of-words and word order-sensitive similarity metrics, and test each over character-based and word-based indexing. The translation retrieval performance of each system connguration is evaluated empi...
متن کاملLow-cost, High-Performance Translation Retrieval: Dumber is Better
In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both characterand word-segmented data, in combination with a range of local segment contiguity models (in the form of N-g...
متن کاملTranslation Memory Engines: A Look under the Hood and Road Test
In this paper, we compare the relative effects of segment order, segmentation and segment contiguity on the retrieval performance of a translation memory system. We take a selection of both bag-of-words and segment order-sensitive string comparison methods, and run each over both characterand word-segmented data, in combination with a range of local segment contiguity models (in the form of N-g...
متن کامل