Anaphora Resolution of Demonstrative Noun Phrases in Medline Abstracts
Abstract
This paper reports our investigation of machine learning methods applied to anaphora resolution in biology texts. Our primary concern is identifying features, and combinations of features, that make anaphora resolution effective. We focus on resolving demonstrative anaphoric noun phrases. We propose several novel features, which we call highlighting features, and consider their utility particularly for processing abstracts. The system using the highlighting features achieved 78% accuracy on our test corpus for demonstrative anaphora. Using the highlighting features reduces errors by 10% compared with the system without them.
Similar resources
Natural Language Processing Scientific Literature DEMONSTRATIVE ANAPHORA: FORMS AND FUNCTIONS IN FULL-TEXT SCIENTIFIC ARTICLES
This study examines the functions and characteristics of demonstrative anaphora (this, these, that, those) in a collection of full-text scientific documents, confirming that they play an important role in maintaining discourse focus and binding together cohesive sections of text. Unlike corpora in other subject domains, the Cystic Fibrosis database contains more demonstrative expressions than a...
A Study of Anaphoric Expressions in Human Produced Scientific Abstracts
One of the main reasons for having low quality automatic extracts is the presence of dangling anaphors. This paper analyses the referential expressions in a corpus of human written scientific summaries and tries to identify ways for improving the quality of automatic extracts. By recording the distance between the anaphoric expressions and their referents we noticed that humans do not use an ag...
Extracting noun phrases for all of MEDLINE
A natural language parser that could extract noun phrases for all medical texts would be of great utility in analyzing content for information retrieval. We discuss the extraction of noun phrases from MEDLINE, using a general parser not tuned specifically for any medical domain. The noun phrase extractor is made up of three modules: tokenization; part-of-speech tagging; noun phrase identificati...
Eliminating Non-Referring Noun Phrases from Coreference Resolution
Indefinite noun phrases in certain contexts are unable to support anaphoric coreference to an individual entity, and therefore should be ignored when searching for coreferent antecedents of anaphoric pronouns. However, many algorithms for anaphora resolution utilize noun phrase chunking or shallow parsing, and therefore do not make the needed distinctions to avoid this type of spurious antecede...
Corpus-Based Identification of Non-Anaphoric Noun Phrases
Coreference resolution involves finding antecedents for anaphoric discourse entities, such as definite noun phrases. But many definite noun phrases are not anaphoric because their meaning can be understood from general world knowledge (e.g., "the White House" or "the news media"). We have developed a corpus-based algorithm for automatically identifying definite noun phrases that are non-anaphoric, w...
Publication date: 2005