Search results for: urdu

Number of results: 1158

2010
Wajid Ali, Sarmad Hussain

Urdu has a variety of verb phrases, including simple, conjunct and compound verb phrases. This paper explains the structure of Urdu verb phrases and details a series of experiments to tag them automatically. Initially, a rule-based model is developed using 21 linguistic rules for automatic VP chunking. A 100,000 word Urdu corpus is manually tagged with VP chunk tag...

2010
Nadir Durrani, Sarmad Hussain

Word segmentation is a prerequisite for almost all NLP applications, since the initial phase requires tokenizing the input into words. Urdu is among the Asian languages that face a word segmentation challenge. However, unlike in other Asian languages, word segmentation in Urdu involves not only space omission errors but also space insertion errors. This paper discusses how orthographic...

2016
Naila Habib Khan, Awais Adnan, Sadia Basar

This research article presents a detailed analysis of various offline and online character recognition systems for Urdu script from 2002 to 2012. The analysis is based on the methodology, text type, font, recognition level, sample and accuracy level achieved by each Urdu script recognition system. This paper attempts to cover various aspects of offline and online c...

2013
Asad Abdul Malik, Asad Habib

The diversity of source and target languages, coupled with source language ambiguity, makes Machine Translation (MT) an exceptionally hard problem. Highly information-intensive corpus-based MT leads the MT research field today, with Example Based MT and Statistical MT representing two dissimilar frameworks in the data-driven paradigm. Example Based MT is another approach that involves matchin...

2014
Hazrat Ali, Nasir Ahmad, Xianwei Zhou, Khalid Iqbal, Sahibzada Muhammad Ali

This paper presents work on Automatic Speech Recognition for the Urdu language, comparing features based on the Discrete Wavelet Transform (DWT) with Mel Frequency Cepstral Coefficients (MFCC). These features have been extracted for one hundred isolated words of Urdu, each word uttered by ten different speakers. The words have been selected from the most frequently used words of ...

2010
Karthik Visweswariah, Vijil Chenthamarakshan, Nanda Kambhatla

Hindi and Urdu share a common phonology, morphology and grammar but are written in different scripts. In addition, the vocabularies have diverged significantly, especially in the written form. In this paper we show that we can get reasonable quality translations (we estimated the Translation Error Rate at 18%) between the two languages even in the absence of a parallel corpus. Linguistic resour...

2016
Sebastian Sulger

This paper discusses genitive phrases in Hindi/Urdu in general and puts a particular focus on genitive scrambling, a process whereby the basic order of constituents is changed. In Hindi/Urdu, genitive phrases may not only occur at different structural positions within the NP that they modify; under the right circumstances, they can also be found outside of the NP, yielding discontinuous structu...

2010
Ghulam Raza

This paper describes an approach for inferring syntactic frames of verbs in Urdu from an untagged corpus. Urdu, like many other South Asian languages, is a free word order and case-rich language. Separable lexical units, called case clitics, mark different constituents for case in phrases and clauses. There is not always a one-to-one correspondence between case clitic form and case, and c...

2009
Muhammad Ghulam Abbas Malik, Laurent Besacier, Christian Boitet, Pushpak Bhattacharyya

We report in this paper a novel hybrid approach for Urdu to Hindi transliteration that combines finite-state machine (FSM) based techniques with a statistical word language model based approach. The output from the FSM is filtered with the word language model to produce the correct Hindi output. The main problem handled is the omission of diacritical marks from the input Urdu text. Our sy...

2013
Abbas Raza Ali, Sarmad Hussain

The Urdu language is written in Arabic script. In this script, the consonantal context is clearly represented, but the vocalic sounds are represented (mostly) by marks or diacritics, which are optional and normally not written. Readers can guess the diacritics, and thus pronounce words correctly, based on their knowledge of the language. But un-diacritized Urdu text creates ambiguity for novice ...
