Search results for: persian parallel corpus

Number of results: 300662

Journal: CoRR 2018
Pedram Hosseini Ali Ahmadian Ramaki Hassan Maleki Mansoureh Anvari Seyed Abolghasem Mirroshandel

Sentiment Analysis (SA) is a major field of study in natural language processing, computational linguistics and information retrieval. Interest in SA has been constantly growing in both academia and industry over the recent years. Moreover, there is an increasing need for generating appropriate resources and datasets in particular for low resource languages including Persian. These datasets pla...

2008
Tayebeh Mosavi Miangah

Information retrieval is a crucial area of natural language processing (NLP) and can be defined as finding documents whose content is relevant to the query need of a user. Cross-language information retrieval refers to a kind of information retrieval in which the language of the query and that of the searched documents are different. This paper tries to construct a bilingual lexicon from an English...

2011
Mahsa Mohaghegh Abdolhossein Sarrafzadeh

This paper documents recent work carried out for PeEn-SMT, our Statistical Machine Translation system for translation between the English-Persian language pair. We give details of our previous SMT system, and present our current development of significantly larger corpora. We explain how recent tests using much larger corpora helped to evaluate problems in parallel corpus alignment, corpus cont...

Journal: Linguistics and Khorasan Dialects
Mahdis Zoorvarz Azita Afrashi Seyed Mostafa Assi

This study investigates the conceptual metaphors of happiness in a representative corpus of modern Persian. Making use of the Persian Linguistic Database, we sampled a corpus of contemporary written texts to represent modern colloquial Persian; then we tried to extract the relevant conceptual metaphors of happiness. The sample corpus contains 14 texts written by contemporary Iranian writers. Analy...

2016
Murad Abouammoh Kashif Shah Ahmet Aker

Statistical Machine Translation (SMT) relies on the availability of rich parallel corpora. However, in the case of under-resourced languages or some specific domains, parallel corpora are not readily available. This leads to under-performing machine translation systems in those sparse data settings. To overcome the low availability of parallel resources the machine translation community has rec...

2006
Mohsen Arabsorkhi Mehrnoush Shamsfard

This paper reports the present results of research on unsupervised Persian morpheme discovery. In this paper we present a method for discovering the morphemes of the Persian language through automatic analysis of corpora. We utilized a Minimum Description Length (MDL) based algorithm with some improvements and applied it to a Persian corpus. Our improvements include enhancing the cost function usin...

Journal: Bulletin of the Indian Institute of History of Medicine 1998
S K Majumdar

The Hippocratic Corpus was attributed to all branches of healing including internal medicine, surgery, and obstetrics. The Hippocratic collection of treatises (or corpus) was mostly written between 430 and 330 B.C. and some are later works. Some 600 years after Hippocrates, the Corpus was further systematized by Galen and later still by the Persian Islamic physician Avicenna and others. The Co...

2010
Benoît Sagot Géraldine Walther

We introduce PerLex, a large-coverage and freely-available morphological lexicon for the Persian language. We describe the main features of the Persian morphology, and the way we have represented it within the Alexina formalism, on which PerLex is based. We focus on the methodology we used for constructing lexical entries from various sources, as well as the problems related to typographic norm...

Journal: Journal of Teaching Persian to Speakers of Other Languages
Reza Rezvani (Assistant Professor of English Language Teaching, Yasouj University); Abbas Gholtash (Assistant Professor of Educational Sciences, Marvdasht Branch, Islamic Azad University, Marvdasht); Gerannaz Zamani (PhD student in English Language Teaching, Razi University)

This article generates the first Persian Academic Word List (PAWL), which comprises the most frequently used academic vocabulary in Persian academic texts. The PAWL was compiled from a corpus of 927,008 running words from academic resources. Two principles of range and frequency of word families guided the selection and arrangement of the word list. The corpus included seven books and one hundre...
