Search results for: user generated translation

Number of results: 651427

2012
Rasoul Samad Zadeh Kaljahi Raphael Rubino Johann Roturier Jennifer Foster

This paper describes a range of automatic and manual comparisons of phrase-based and syntax-based statistical machine translation methods applied to English-German and English-French translation of user-generated content. The syntax-based methods underperform the phrase-based models and the relaxation of syntactic constraints to broaden translation rule coverage means that these models do not n...

2013
Johanna Gerlach Victoria Porro Pierrette Bouillon Sabine Lehmann

The poor quality of user-generated content (UGC) found in forums hinders both readability and machine-translatability. To improve these two aspects, we have developed human- and machine-oriented pre-editing rules, which correct or reformulate this content. In this paper we present the results of a study which investigates whether pre-editing rules that improve the quality of statistical machine t...

2016
Marlies van der Wees Arianna Bisazza Christof Monz

A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic. When facing such texts, a translation system trained on conventional Arabic-English data will suffer from extremely low model coverage. In addition, Arabizi is not regulated by any official standardization and therefore highly am...

2014
Orphée De Clercq Sarah Schulz Bart Desmet Véronique Hoste

In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluating text normalization approaches. With the combination of text messages, message board posts and tweets, these datasets represent a variety of user generated content. All data was manually normalized to their standard form using newly-developed guidelines. We perform automatic lexical normalizatio...

2012
Johann Roturier Linda Mitchell Robert Grabowski Melanie Siegel

This paper investigates the usefulness of automatic machine translation metrics when analyzing the impact of source reformulations on the quality of machine-translated user-generated content. We propose a novel framework to quickly identify rewriting rules which improve or degrade the quality of MT output, by trying to rely on automatic metrics rather than human judgments. We find that this appr...

2017
Pintu Lohar Haithem Afli Andy Way

The advent of social media has shaken the very foundations of how we share information, with Twitter, Facebook, and LinkedIn among many well-known social networking platforms that facilitate information generation and distribution. However, the maximum 140-character restriction in Twitter encourages users to (sometimes deliberately) write somewhat informally in most cases. As a result, machine ...

2014
Violeta Seretan Johann Roturier David Silva Pierrette Bouillon

With the development of Web 2.0, a lot of content is nowadays generated online by users. Due to its characteristics (e.g., use of jargon and abbreviations, typos, grammatical and style errors), the user-generated content poses specific challenges to machine translation. This paper presents an online platform devoted to the pre-editing of user-generated content and its post-editing, two main typ...

Thesis: Islamic Azad University, Kermanshah Branch - Research Institute of Language and Dialect, 1393

Abstract: The purpose of this study is twofold. On the one hand, it is intended to see which kind of noticing-the-gap activity (teacher-generated vs. learner-generated) is more efficient in teaching L2 grammar in classroom language learning. On the other hand, it is an attempt to determine which approach to the noticing-the-gap activity is more effective in the long-term retention of grammar...

2016
Meritxell Fernández Barrera Vladimir Popescu Antonio Toral Federico Gaspari Khalid Choukri

This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them. In this sense, it firstly proposes a typology of e-commerce static and dynamic textual genres and it identifies those that may be more successfully targeted by SMT. The specifi...

2014
Alexandra Balahur Hristo Tanev Erik Van der Goot

In the past years, there has been an increasing amount of research done in the field of Sentiment Analysis. This was motivated by the growth in the volume of user-generated online data, the information flood in Social Media and the applications Sentiment Analysis has to different fields – Marketing, Business Intelligence, e-Law Making, Decision Support Systems, etc. Although many me...
