Search results for: farsi 1

Number of results: 2754034

Journal: :CoRR 2014
Behrang Q. Zadeh Saeed Rahimi Mehdi Safaee Ghalati

Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. Farsi enjoys a unified Arabic script as its writing system. In this paper we briefly introduce the writing standards of Farsi and highlight problems one would face when analyzing Farsi electronic texts, especially during development of Farsi corpora regarding to...

2015
Peyman Passban Andy Way Qun Liu

Statistical machine translation (SMT) suffers from various problems which are exacerbated where training data is in short supply. In this paper we address the data sparsity problem in the Farsi (Persian) language and introduce a new parallel corpus, TEP++. Compared to previous results the new dataset is more efficient for Farsi SMT engines and yields better output. In our experiments using TEP+...

Journal: :Pattern Recognition 1981
Behrooz Parhami M. Taraghi

The automatic recognition of printed Farsi (Persian) texts is complicated by several properties of the Farsi script: (a) connectivity of symbols, (b) similarity of groups of symbols, (c) highly variable widths, (d) subword overlap, and (e) line overlap. In this paper, a technique for the automatic recognition of printed Farsi texts is presented and its steps are discussed as follows: (1) digi...

2008
Chia-Lin Kao Shirin Saleem Rohit Prasad Fred Choi Premkumar Natarajan David Stallard Kriste Krstovski Matin Kamali

Significant advances have been achieved in Speech-to-Speech (S2S) translation systems in recent years. However, rapid configuration of S2S systems for low-resource language pairs and domains remains a challenging problem due to a lack of human-translated bilingual training data. In this paper, we report on an effort to port our existing English/Iraqi S2S system to the English/Farsi language pair ...

2004
Mortaza Kokabi

Zarnegar (gold writer) is a word processor widely used by publishers of both scholarly journals and books in Iran. Although it is gradually being replaced by Word for Windows, which is much more powerful than Zarnegar, the process seems to be slow and most Iranian publishers still prefer to receive manuscripts in Zarnegar rather than Word. There are many reasons for this preference: Word, though having man...

2009
H. Izakian

Nowadays, OCR systems have several applications and are increasingly employed in daily life. Much research has been done regarding the identification of Latin, Japanese, and Chinese characters. However, very little investigation has been performed regarding Farsi/Arabic character recognition. Probably the reason is the difficulty and complexity of identifying those characters compared to th...

2007
Vahab Pournaghshband

OCR (Optical Character Recognition) is the digital encoding of printed and handwritten characters from an image file created through a scanner or other optical imaging devices. In other words, OCR is a software program that converts image-texts into computerized or digital text (figure 1). While OCR has been extensively used as the basic application of different learning methods in machine lea...

2006
F. Shahabi M. Rahmati

Writer identification recently has been studied and it has a wide variety of applications. Most studies are based on English documents with the assumption that the written text is fixed (text-dependent methods) and no research has been reported on Farsi or Arabic documents. In this paper, we have proposed a method for off-line writer identification based on Farsi handwriting, which is text-inde...

2004
E. Darrudi M. R. Hejazi F. Oroumchian

The development of Language Engineering (LE) and Information Retrieval (IR) applications requires the availability of sizeable, reliable and representative corpora. This paper describes how we have constructed a well-structured 345 MB tagged corpus of news, and presents some beneficial statistics of this corpus based upon the characteristics of the Farsi language. It also goes into particular detail on...

2006
Behrang Q. Zadeh Saeed Rahimi

Farsi, also known as Persian, is the official language of Iran and Tajikistan and one of the two main languages spoken in Afghanistan. It is an Indo-European agglutinating language, written in Arabic script. This paper presents the first step in creating a Farsi basic language resources kit. This step comprises the specifications for morphosyntactic encoding, which is based on the EAGLES/MULTEXT mod...
