Search results for: phnilogical error urdu language

Number of results: 671206

2012
Faryal Jahangir Waqas Anwar Usama Ijaz Bajwa Xuan Wang

Extraction of named entities (NEs) from text is an important operation in many natural language processing applications, such as information extraction, question answering, and machine translation. Since the early 1990s, researchers have taken greater interest in this field, and a lot of work has been done on Named Entity Recognition (NER) in different languages of the world. Unfortunately...

2012
Khalil Khan Muhammad Siddique Rehanullah Khan

This paper describes an efficient method for Urdu text search in computer-generated and handwritten scanned images. An efficient text search technology is necessary because the number of documents handled increases every day. This method is unique and simple in the sense that no features are extracted. The proposed method is script independent. The input image is directly matched with a set of prototype c...

2015
Francisco M. Rangel Pardo Fabio Celli Paolo Rosso Martin Potthast Benno Stein Walter Daelemans

In this paper we describe and evaluate the corpora submitted to the PAN 2015 shared task on plagiarism detection for text alignment. We received mono- and cross-language corpora in the following languages (pairs): English, Persian, Chinese, and Urdu-English, English-Persian. We present an independent section for each submitted corpus including statistics, discussion of the obfuscation techniques ...

2015
Marc Franco-Salvador Imene Bensalem Enrique Flores Parth Gupta Paolo Rosso

In this paper we describe and evaluate the corpora submitted to the PAN 2015 shared task on plagiarism detection for text alignment. We received mono- and cross-language corpora in the following languages (pairs): English, Persian, Chinese, and Urdu-English, English-Persian. We present an independent section for each submitted corpus including statistics, discussion of the obfuscation techniques ...

Journal: Engineering Letters 2008
Hemant A. Patil Tapan Kumar Basu

identifying an unknown language from the test utterances. In this paper, a new method of feature extraction, viz., Teager Energy Based Mel Frequency Cepstral Coefficients (T-MFCC) is developed for identification of perceptually similar languages. Finally, an LID system is presented for Hindi and Urdu (perceptually similar Indian languages) to demonstrate effectiveness of newly proposed feature ...

Journal: Pakistan Languages and Humanities Review 2023

This study compares and contrasts the usage of /and/ and /aur/ within the framework of the Contrastive Analysis Hypothesis (CAH). Data on Urdu conjunctions appearing in various structures were obtained from an unpublished research paper on Conversation Analysis, while data on English were obtained from several past studies. One hundred graduating students (twenty-five from each institution) from different higher education institutions in Kotli, AJ&...

2016
Saeeda Naz Arif Iqbal Umar Riaz Ahmed Muhammad Imran Razzak Sheikh Faisal Rashid Faisal Shafait

The recognition of Arabic script and its derivatives, such as Urdu, Persian, and Pashto, is a difficult task due to the complexity of this script. Urdu text recognition is particularly difficult due to its Nasta'liq writing style. The Nasta'liq style has a complex calligraphic nature, which presents major issues for the recognition of Urdu text owing to diagonality in writing, high cursivene...

2010
Bushra Jawaid Mike Rosner Ondrej Bojar

One of the difficulties that statistical machine translation (SMT) systems face is differences in word order. When translating from a language with rather fixed SVO word order, such as English, to a language where the preferred word order is dramatically different (such as the SOV order of Urdu, Hindi, Korean, ...), the system has to learn long-distance reordering of the words. Higher degree of fre...

2015
Mayce Ibrahim Ali Al Azawi

The goal of this work is to develop statistical natural language models and processing techniques based on Recurrent Neural Networks (RNN), especially the recently introduced Long Short-Term Memory (LSTM). Due to their adapting and predicting abilities, these methods are more robust and easier to train than traditional methods, i.e., word lists and rule-based models. They improve the output of ...

2012
Riyaz Ahmad Bhat Dipti Misra Sharma

In this paper we describe a treebanking effort, currently underway, for Urdu, a South Asian language. The treebank is built from a newspaper corpus and uses a Karaka-based grammatical framework inspired by Paninian grammatical theory. Thus far 3366 sentences (0.1M words) have been annotated with the linguistic information at morpho-syntactic (morphological, part-of-speech and chunk information) an...
