Corpus-based Development and Evaluation of a System for Processing Definite Descriptions
Abstract
We present an implemented system for processing definite descriptions. The system is based on the results of a previously reported corpus analysis, which showed how common discourse-new descriptions are in newspaper corpora and identified several problems to be dealt with when developing computational methods for interpreting bridging descriptions. The annotated corpus produced in this earlier work was used to extensively evaluate the proposed techniques for matching definite descriptions with their antecedents, segmenting discourse, recognizing discourse-new descriptions, and suggesting anchors for bridging descriptions.
Similar resources
An Empirically-based System for Processing Definite Descriptions
We present an implemented system for processing definite descriptions in arbitrary domains. The design of the system is based on the results of a corpus analysis previously reported, which highlighted the prevalence of discourse-new descriptions in newspaper corpora. The annotated corpus was used to extensively evaluate the proposed techniques for matching definite descriptions with their antec...
Corpus based coreference resolution for Farsi text
"Coreference resolution", or "finding all expressions that refer to the same entity" in a text, is one of the important requirements in natural language processing. Two expressions are coreferent when both refer to a single entity in the text or the real world. So the main task of coreference resolution systems is to identify terms that refer to a unique entity. A coreference resolution tool could be...
A'lam Corpus: A Standard Named-Entity Corpus for the Persian Language
Named entity recognition (NER) is a natural language processing (NLP) problem that is mainly used for text summarization, data mining, data retrieval, question answering, machine translation, and document classification systems. An NER system is tasked with determining the border of each named entity, recognizing its type and classifying it into predefined categories. The categories of named...
Processing definite descriptions in corpora
We discuss in this paper a system that resolves definite descriptions in written texts. A preliminary study of definite descriptions in a collection of 20 texts revealed that about 30% of the 1040 definites in the collection were cases of anaphoric definites whose antecedents had the same head noun, and 50% introduced novel discourse referents. An algorithm which resolves anaphoric definite des...
The or That: Definite and Demonstrative Descriptions in Second Language Acquisition
Since Huebner's (1985) pioneering study, there have been many studies on (mis)use/non-use of articles by L2 learners from article-less and article languages. The present study investigated how Persian L2 learners of English produce and interpret English definite descriptions and demonstrative descriptions. It was assumed that definite and demonstrative descriptions share the same central sema...