Search results for: persian parallel corpus

Number of results: 300662

2006
Sarvnaz Karimi Andrew Turpin Falk Scholer

Persian is an Indo-European language written using Arabic script, and is an official language of Iran, Afghanistan, and Tajikistan. Transliteration of Persian to English (that is, the character-by-character mapping of a Persian word that is not readily available in a bilingual dictionary) is an unstudied problem. In this paper we make three novel contributions. First, we present performance compar...

2013
Nicola Bertoldi M. Amin Farajian Prashant Mathur Nicholas Ruiz Marcello Federico

This paper describes the systems submitted by FBK for the MT track of IWSLT 2013. We participated in the English-French as well as the bidirectional Persian-English translation tasks. We report substantial improvements in our English-French systems over last year’s baselines, largely due to improved techniques for combining translation and language models. For our Persian-English and English-Pers...

1999
S. M. Ahadi

Speech recognition in Persian (Farsi) has recently been addressed by a few native-speaking researchers, and some approaches to isolated word and phoneme recognition have been reported. A main bottleneck in this research field is the lack of a recognition-specific speech corpus. In this work, a phonetically balanced speech database of Persian has been modified and used in continuous speech recogn...

Journal: JDCTA 2009
Majid Iranpour Mobarakeh Behrouz Minaei-Bidgoli

A novel technique is introduced for verb and inflection detection in Persian texts. This recognition can be useful in the preprocessing phase of natural language processing (NLP) and text mining tasks such as part-of-speech (POS) tagging and sentence boundary detection (SBD) in Persian texts. Our technique employs structural information of the Persian verb for the first phase of this detection and then uses the...

Journal: Quaderni di studi arabi 2021

This article maps the mainly lost Sasanian historiographical literature through Arabic translations of Middle Persian works and information preserved in early sources. Although only two texts have been original Persian, sources reveal a sizeable corpus translation.

2001
S. Mostafa ASSI M. Haji ABDOLHOSSEINI

The purpose of this article is to briefly introduce an interactive P.O.S. tagging system developed as a project at the Institute for Humanities and Cultural Studies in Tehran, Iran. The system is designed as part of the annotation procedure for a Persian corpus called The Farsi Linguistic Database (FLDB), and is the first attempt ever to tag a Persian corpus. In section I, the project itself wi...

2012
Zakieh Shakeri Neda Noormohammadi Shahram Khadivi Noushin Riahi

In this paper, we investigate the effects of using linguistic information to improve statistical machine translation for the English-Persian language pair. We choose POS tags as the auxiliary linguistic feature. A monolingual Persian corpus with POS tags is prepared, and the tag set is kept small. Using the POS tagger trained on this corpus, we apply a factored translation model. We al...

2016
Fatemeh Mashhadirajab Mehrnoush Shamsfard Razieh Adelkhah Fatemeh Shafiee Chakaveh Saedi

This paper describes how a Persian text alignment corpus was constructed to evaluate plagiarism detection systems. This corpus is in PAN format and contains 11,089 documents and more than 11,603 plagiarism cases. Efforts were made to simulate various types of plagiarism manually, semi-automatically, or automatically in this large-scale corpus.

Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. An NER system is tasked with determining the boundaries of each named entity, recognizing its type, and classifying it into predefined categories. The categories of named...
