Search results for: persian parallel corpus

Number of results: 300662

Journal: CoRR 2018
Omid Kashefi

One of the most important and essential tasks in natural language processing is machine translation, which is now highly dependent on multilingual parallel corpora. In this paper, we introduce the biggest Persian-English parallel corpus, with more than one million sentence pairs collected from masterpieces of literature. We also present the acquisition process and statistics of the corpus, and expe...

Journal: Journal of English Studies 2011
Farid Ghaemi Janin Benyamin

This study was an attempt to identify the interlingual strategies employed to translate English subtitles into Persian and to determine their frequency. Unlike in many countries, subtitling is a new field in Iran. The study, a corpus-based, comparative, descriptive, non-judgmental analysis of an English-Persian parallel corpus, comprised English audio scripts of five movies of differ...

Journal: Journal of AI and Data Mining 2015
A. Khazaei M. Ghasemzadeh

This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing. So far, much research on clustering English documents has been carried out. This raises the question: are its results extendable to other languages? Since the goal of document clustering is grouping of docum...

2011
Mohammad Amin Farajian

Parallel corpora are necessary resources in many multilingual natural language processing applications, including machine translation and cross-lingual information retrieval. Manual preparation of a large-scale parallel corpus is a very time-consuming and costly procedure. In this paper, the work towards building a sentence-level aligned English-Persian corpus in a semi-automated manner is p...

Thesis: Ministry of Science, Research and Technology - Sheikh Bahaei University - Faculty of Foreign Languages 1391

Simplification, as a universal feature of translation, means that translated texts tend to use simpler language than original texts in the same language. It can be critically investigated through common concepts: type/token ratio, lexical density, and mean sentence length. Although steps have been taken to test this hypothesis in various text types in different linguistic communities, in ...

2010
Chris Irwin Davis Dan I. Moldovan

In this paper we describe a proof-of-concept for the bootstrapping of a Persian WordNet. This effort was motivated by previous work done at Stanford University on bootstrapping an Arabic WordNet using a parallel corpus and an English WordNet. The principle of that work is based on the premise that paradigmatic relations are by nature deeply semantic, and as such, are likely to remain intact bet...

2014
Habibollah Asghari Jalal Maleki Heshaam Faili

In this paper, we investigate the problem of Ezafe recognition in the Persian language. Ezafe is an unstressed vowel that is usually not written, but is intelligently recognized and pronounced by humans. The Ezafe marker can be placed in noun phrases, adjective phrases and some prepositional phrases, linking the head and modifiers. Ezafe recognition in Persian is indeed a homograph disambiguation probl...

Journal: CoRR 2017
Akbar Karimi Ebrahim Ansari Bahram Sadeghi Bigham

Parallel data are an important part of a reliable Statistical Machine Translation (SMT) system. The more of these data are available, the better the quality of the SMT system. However, for some language pairs such as Persian-English, parallel sources of this kind are scarce. In this paper, a bidirectional method is proposed to extract parallel sentences from English and Persian document aligned...
