Machine Learning Approach for the Classification of Demonstrative Pronouns for Indirect Anaphora in Hindi News Items
Abstract
In this paper, we present a machine learning approach for the classification of indirect anaphora in a Hindi corpus. A direct anaphor has a noun phrase antecedent that can be found within the same sentence or across a few sentences. An indirect anaphor, on the other hand, has no explicit referent in the discourse. We suggest looking for certain patterns following the indirect anaphor and marking the demonstrative pronoun as directly or indirectly anaphoric accordingly. Our study focuses on pronouns without a noun phrase antecedent. We analyzed 177 news items containing 1334 sentences and 780 demonstrative pronouns, of which 97 (12.44%) were indirectly anaphoric. We also carried out experiments with machine learning approaches that classify these pronouns based on the semantic cue provided by the collocation patterns following the pronoun.
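The idea of marking a demonstrative pronoun by the pattern that follows it can be illustrated with a minimal rule-based sketch. This is not the authors' classifier or feature set, which the abstract does not specify. The pronoun forms in `DEMONSTRATIVES`, the cue tokens in `INDIRECT_CUES`, and both function names are hypothetical choices made only for illustration. The last helper simply reproduces the proportion reported in the abstract.

```python
# Illustrative assumption: a small set of Hindi demonstrative forms.
DEMONSTRATIVES = {"यह", "वह", "ये", "वे", "इस", "उस"}

# Illustrative assumption: tokens that, when they directly follow a
# demonstrative, are treated here as a cue for indirect anaphora.
# These cues are NOT taken from the paper.
INDIRECT_CUES = {"प्रकार", "तरह"}


def classify_demonstratives(tokens):
    """Label each demonstrative in a token list as 'direct' or 'indirect'.

    A pronoun is labelled 'indirect' when the token right after it is in
    INDIRECT_CUES, and 'direct' otherwise.
    """
    results = []
    for i, tok in enumerate(tokens):
        if tok not in DEMONSTRATIVES:
            continue
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        label = "indirect" if following in INDIRECT_CUES else "direct"
        results.append((i, tok, label))
    return results


def indirect_share(indirect_count, total_count):
    """Percentage of pronouns that are indirectly anaphoric, to 2 decimals."""
    return round(100.0 * indirect_count / total_count, 2)


if __name__ == "__main__":
    print(classify_demonstratives("इस प्रकार बैठक समाप्त हुई".split()))
    print(indirect_share(97, 780))  # counts reported in the abstract
```

A learned classifier would replace the fixed cue lookup with features derived from the following collocation, but the input (pronoun plus following tokens) and output (direct or indirect label) would have the same shape.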
Similar resources
Automation and Validation of Annotation for Hindi Anaphora Resolution
The process of labelling any language genre so that useful information can be extracted from it is called annotation. This provides syntactic information about a word or a word phrase. In this paper, an effort has been made to provide an algorithm for semiautomatic annotation of Hindi text, catering to anaphora resolution only. The study was conducted on twelve files of Ranchi Express available in EMILLE...
La reconnaissance automatique de la fonction des pronoms démonstratifs en langue arabe (Automatic recognition of demonstrative pronouns function in Arabic) [in French]
Anaphora resolution is one of the most difficult tasks in NLP. Classifying pronouns before attempting a task of anaphora resolution is important because to handle the cataphoric pronoun, the system should determine the antece...
The DAD Parallel Corpora and their Uses
This paper deals with the uses of the annotations of third person singular neuter pronouns in the DAD parallel and comparable corpora of Danish and Italian texts and spoken data. The annotations contain information about the functions of these pronouns and their uses as abstract anaphora. Abstract anaphora have constructions such as verbal phrases, clauses and discourse segments as antecedents ...
Demonstrative Pronouns in Natural Discourse
We examine demonstrative pronouns in a portion of the Santa Barbara Corpus of American English and propose a coding scheme that classifies pronouns with nominal as well as non-nominal antecedents into direct and indirect, depending on whether their referent is the same as the referent/denotation of the antecedent. In agreement with previous studies, we find that demonstratives more often have n...
Zero Pronominal Anaphora Resolution for the Romanian Language
This paper presents a new study on the distribution, identification, and resolution of zero pronouns in Romanian. A Romanian corpus, including legal, encyclopaedic, literary, and news texts has been created and manually annotated for zero pronouns. Using a morphological parser for Romanian and machine learning methods, experiments were performed on the created corpus for the identification and ...
Journal:
- Prague Bull. Math. Linguistics
Volume 95
Publication year: 2011