Search results for: translation test

Number of results: 934217

2015
Michel Simard

Users of Statistical Machine Translation (SMT) sometimes turn to the Web to obtain data to train their systems. One problem with this approach is the potential for “MT contamination”: when large amounts of parallel data are collected automatically, there is a risk that a nonnegligible portion consists of machine-translated text. Theoretically, using this kind of data to train SMT systems is lik...

2013
Xiaoyin Fu Wei Wei Shixiang Lu Zhenbiao Chen Bo Xu

We present a phrase-based method to extract parallel fragments from comparable corpora. We do this by introducing a force decoder based on the hierarchical phrase-based (HPB) translation model to detect the alignments in comparable sentence pairs. This method enables us to extract useful training data for statistical machine translation (SMT) systems. We evaluate our method by fragment detec...

2008
Jin'ichi Murakami Masato Tokuhisa Satoru Ikehara

In this study, we focused on the reliability of the phrase table. We have been building the phrase table using Och’s method [2], which sometimes generates completely wrong phrase tables. We found that such phrase tables are caused by long parallel sentences. Therefore, we removed these long parallel sentences from the training data. Also, we utilized general tools for statistical machine translat...

2009
Preslav Nakov Chang Liu Wei Lu Hwee Tou Ng

We describe the system developed by the team of the National University of Singapore for the Chinese-English BTEC task of the IWSLT 2009 evaluation campaign. We adopted a state-of-the-art phrase-based statistical machine translation approach and focused on experiments with different Chinese word segmentation standards. In our official submission, we trained a separate system for each segmenter ...

2012
Volha Petukhova Rodrigo Agerri Mark Fishel Sergio Penkale Arantza del Pozo Mirjam Sepesy Maucec Andy Way Panayota Georgakopoulou Martin Volk

This paper describes the data collection and parallel corpus compilation activities carried out in the FP7 EU-funded SUMAT project. This project aims to develop an online subtitle translation service for nine European languages combined into 14 different language pairs. This data provides bilingual and monolingual training data for statistical machine translation engines which will semi-automat...

2014
Saab Mansour Hermann Ney

In this work, we tackle the problem of language and translation model domain adaptation without explicit bilingual in-domain training data. In such a scenario, the only information about the domain can be induced from the source-language test corpus. We explore unsupervised adaptation, where the source-language test corpus is combined with the corresponding hypotheses generated by the translatio...

2014
Markus Freitag Stephan Peitz Joern Wuebker Hermann Ney Matthias Huck Rico Sennrich Nadir Durrani Maria Nadejde Philip Williams Philipp Koehn Teresa Herrmann Eunah Cho Alexander H. Waibel

This paper describes one of the collaborative efforts within EU-BRIDGE to further advance the state of the art in machine translation between two European language pairs, German→English and English→German. Three research institutes involved in the EU-BRIDGE project combined their individual machine translation systems and participated with a joint setup in the shared translation task of the eva...

2015
Jinsong Su Deyi Xiong Shujian Huang Xianpei Han Junfeng Yao

Lexical selection is of great importance to statistical machine translation. In this paper, we propose a graph-based framework for collective lexical selection. The framework is established on a translation graph that captures not only local associations between source-side content words and their target translations but also target-side global dependencies in terms of relatedness among target i...

2017
Rico Sennrich Alexandra Birch Anna Currey Ulrich Germann Barry Haddow Kenneth Heafield Antonio Valerio Miceli Barone Philip Williams

This paper describes the University of Edinburgh’s submissions to the WMT17 shared news translation and biomedical translation tasks. We participated in 12 translation directions for news, translating between English and Czech, German, Latvian, Russian, Turkish and Chinese. For the biomedical task we submitted systems for English to Czech, German, Polish and Romanian. Our systems are neural mac...

Game-based practicing of materials can be seen as a method of capturing an essence of real-life experience which is commonly missing in the conventional face-to-face classrooms. To serve the L2 learners' immediate communicative needs in wider classroom and societal contexts, this study sought to place L2 English learners within an interactional social framework through reinforcing their Eng...
