Persian parallel corpus

Strategies Used in Translating Clichés and Their Related Norms in Dubbed Films

Thesis: Ministry of Science, Research and Technology - University of Birjand, 1390 (Solar Hijri calendar)

اکبر آذرکمند, محمدحسین قرشی, علی عتیزاده

Abstract: This study attempted to investigate the strategies used to translate clichés of emotions in dubbed movies in the Iranian dubbing context for home video companies. The corpus of the current study was parallel and comparable in nature, consisting of five original American movies and their dubbed versions in Persian, and five original Persian movies which served as a touchstone for judging n...

TEP: Tehran English-Persian Parallel Corpus

2011

Mohammad Taher Pilehvar, Heshaam Faili, Abdol Hamid Pilehvar

Parallel corpora are one of the key resources in natural language processing. In spite of their importance in many multilingual applications, no large-scale English-Persian corpus has been made available so far, given the difficulties in its creation and the intensive labor required. In this paper, the construction process of the Tehran English-Persian parallel corpus (TEP) using movie subtitles,...

The First Parallel Multilingual Corpus of Persian: Toward a Persian BLARK

Journal: CoRR 2007

Behrang Q. Zadeh, Saeed Rahimi, Behrooz Mahmoodi Bakhtiari

In this article, we introduce the first parallel corpus of Persian with more than 10 European languages. This article describes the primary steps toward preparing a Basic Language Resources Kit (BLARK) for Persian. Up to now, we have proposed a morphosyntactic specification of Persian based on the EAGLES/MULTEXT guidelines and specific resources of MULTEXT-East. The article introduces Persian ...

Extracting Persian-English Parallel Sentences from Document Level Aligned Comparable Corpus using Bi-Directional Translation

2014

Ebrahim Ansari, Mohammad Hadi Sadreddini, Alireza Tabebordbar, Richard Wallace

Bilingual parallel corpora are very important in various fields of natural language processing (NLP). The quality of a Statistical Machine Translation (SMT) system depends strongly on the amount of training data. For low-resource language pairs such as Persian-English, there are not enough parallel sentences to build an accurate SMT system. This paper describes a new approach to use the Wiki...

Constructing a Large-Scale English-Persian Parallel Corpus

Journal: Études et prospectives 2009

Comparative Genre Analysis of English Newspaper Editorials across English and Persian

Thesis: Ministry of Science, Research and Technology - Yazd University, 1388 (Solar Hijri calendar)

زهره شیامی زاده, حمید علامی, گلنار مزدایسنا

The present research was conducted to accomplish two purposes. Firstly, it aimed to explore and describe the schematic structure, or what Halliday and Hasan (1989, p. 64) have called “generic structure potential” (GSP), of American English, Iranian Persian and Iranian English newspaper editorials within systemic functional linguistics. Secondly, a quantitative cross-comparison was made to investigate...

Supervised Morphology Generation Using Parallel Corpus

2013

Alireza Mahmoudi, Mohsen Arabsorkhi, Heshaam Faili

Translating from English, a morphologically poor language, into morphologically rich languages such as Persian comes with many challenges. In this paper, we present an approach to rich morphology prediction using a parallel corpus. We focus on verb conjugation as the most important and problematic phenomenon in Persian morphology. We define a set of linguistic features usi...

Cross-Lingual Word Sense Disambiguation for Languages with Scarce Resources

2011

Bahareh Sarrafzadeh, Nikolay Yakovets, Nick Cercone, Aijun An

Word Sense Disambiguation has long been a central problem in computational linguistics. Word Sense Disambiguation is the ability to identify the meaning of words in context in a computational manner. Statistical and supervised approaches require a large amount of labeled resources as training datasets. In contradistinction to English, the Persian language has neither any semantically tagged cor...

Using English as Pivot to Extract Persian-Italian Parallel Sentences from Non-Parallel Corpora

Journal: CoRR 2017

Ebrahim Ansari, Mohammad Hadi Sadreddini, Mostafa Sheikhalishahi, Richard Wallace, Fatemeh Alimardani

Ebrahim Ansari et al. 2017. Using English as pivot to extract Persian-Italian parallel sentences from non-parallel corpora. In "Applications of Comparable Corpora", edited book, Berlin Linguistic Press (ed.). The effectiveness of a statistical machine translation (SMT) system is very dependent upon the size of the parallel corpus used in the training phase. For low-resource l...

Developing Bilingual Plagiarism Detection Corpus Using Sentence Aligned Parallel Corpus: Notebook for PAN at CLEF 2015

2015

Habibollah Asghari, Khadijeh Khoshnava, Omid Fatemi, Heshaam Faili

Plagiarism detection is the process of locating text reuse within a suspicious document. Plagiarism detection corpora are used for evaluating plagiarism detection systems. In this paper, we present a bilingual Persian-English plagiarism detection corpus. We provide our corpus for the task of text alignment corpus construction in the PAN 2015 competition. Our approach is based on parallel cor...
