Constructing a Large-Scale English-Persian Parallel Corpus

نویسندگان

چکیده

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

MIZAN: A Large Persian-English Parallel Corpus

One of the most major and essential tasks in natural language processing is machine translation that is now highly dependent upon multilingual parallel corpora. Through this paper, we introduce the biggest Persian-English parallel corpus with more than one million sentence pairs collected from masterpieces of literature. We also present acquisition process and statistics of the corpus, and expe...

متن کامل

Constructing of a Large-Scale Chinese-English Parallel Corpus

This paper describes the constructing of a large-scale (above 500,000 pair sentences) Chinese-English parallel corpus. The current status of Chinese corpora is overviewed with the emphasis on parallel corpus. The XML coding principles for Chinese–English parallel corpus are discussed. The sentence alignment algorithm used in this project is described with a computer-aided checking processing. F...

متن کامل

TEP: Tehran English-Persian Parallel Corpus

Parallel corpora are one of the key resources in natural language processing. In spite of their importance in many multi-lingual applications, no large-scale English-Persian corpus has been made available so far, given the difficulties in its creation and the intensive labors required. In this paper, the construction process of Tehran English-Persian parallel corpus (TEP) using movie subtitles,...

متن کامل

PEN: Parallel English-Persian News Corpus

Parallel corpora are the necessary resources in many multilingual natural language processing applications, including machine translation and cross-lingual information retrieval. Manual preparation of a large scale parallel corpus is a very time consuming and costly procedure. In this paper, the work towards building a sentence-level aligned EnglishPersian corpus in a semi-automated manner is p...

متن کامل

Comparing k-means clusters on parallel Persian-English corpus

This paper compares clusters of aligned Persian and English texts obtained from k-means method. Text clustering has many applications in various fields of natural language processing. So far, much English documents clustering research has been accomplished. Now this question arises, are the results of them extendable to other languages? Since the goal of document clustering is grouping of docum...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Études et prospectives

سال: 2009

ISSN: 1492-1421,0026-0452

DOI: 10.7202/029804ar