Comparing Knowledge Sources for Nominal Anaphora Resolution

Authors

  • Katja Markert
  • Malvina Nissim
Abstract

We compare two ways of obtaining lexical knowledge for antecedent selection in other-anaphora and definite noun phrase coreference. Specifically, we compare an algorithm that relies on links encoded in the manually created lexical hierarchy WordNet and an algorithm that mines corpora by means of shallow lexico-semantic patterns. As corpora we use the British National Corpus (BNC), as well as the Web, which has not been previously used for this task. Our results show that (a) the knowledge encoded in WordNet is often insufficient, especially for anaphor–antecedent relations that exploit subjective or context-dependent knowledge; (b) for other-anaphora, the Web-based method outperforms the WordNet-based method; (c) for definite NP coreference, the Web-based method yields results comparable to those obtained using WordNet over the whole data set and outperforms the WordNet-based method on subsets of the data set; (d) in both case studies, the BNC-based method is worse than the other methods because of data sparseness. Thus, in our studies, the Web-based method alleviated the lexical knowledge gap often encountered in anaphora resolution and handled examples with context-dependent relations between anaphor and antecedent. Because it is inexpensive and needs no hand-modeling of lexical knowledge, it is a promising knowledge source to integrate into anaphora resolution systems.
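
The corpus-mining idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern "X and other Ys" is an assumed example of a shallow lexico-semantic pattern, the small in-memory string stands in for BNC or Web hit counts, and the names `pattern_count` and `select_antecedent` are hypothetical.

```python
import re

# Toy stand-in for a large corpus or Web search index (assumption).
CORPUS = (
    "Analysts watched bonds and other securities closely. "
    "Traders sold bond and other securities. "
    "The bank and other lenders objected."
)


def pattern_count(corpus: str, antecedent: str, anaphor_plural: str) -> int:
    """Count instantiations of the assumed pattern '<antecedent>(s) and other <anaphor>'."""
    pattern = rf"\b{re.escape(antecedent)}s? and other {re.escape(anaphor_plural)}\b"
    return len(re.findall(pattern, corpus, flags=re.IGNORECASE))


def select_antecedent(corpus: str, candidates: list[str], anaphor_plural: str):
    """Return the candidate with the highest pattern count, or None if no candidate matches."""
    scores = {c: pattern_count(corpus, c, anaphor_plural) for c in candidates}
    best = max(candidates, key=lambda c: scores[c])
    return (best if scores[best] > 0 else None), scores


# Resolving the anaphor "other securities" against three candidate antecedents.
best, scores = select_antecedent(CORPUS, ["bank", "bond", "week"], "securities")
print(best, scores)
```

With a corpus this small most candidates score zero, which mirrors the data sparseness the abstract reports for the BNC; a larger source such as the Web makes nonzero counts more likely.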

Related resources

A hybrid approach to resolve nominal anaphora

In order to resolve nominal anaphora, especially definite description anaphora, various sources of information have to be taken into account. These range from morphosyntactic information to domain knowledge encoded in ontologies. As the acquisition of ontological knowledge is a time-consuming task, existing resources often model only a small set of information. This leads to a knowledge gap that...

Using the Web for Nominal Anaphora Resolution

We present a novel method for resolving non-pronominal anaphora. Instead of using handcrafted lexical resources, we search the Web with shallow patterns which can be predetermined for the type of anaphoric phenomenon. In experiments for other-anaphora and bridging, our shallow, almost knowledge-free and unsupervised method achieves state-of-the-art results.

Comparing Domain-Specific and Non-Domain-Specific Anaphora Resolution Techniques

A quantification is provided for the improvements made in traditional salience-based pronominal anaphora resolution precision when input text that has been parsed using a large-scale grammar to locate the syntactic function of noun phrases is used instead of input text where more shallow syntactic analysis techniques were used for identifying grammatical function. In addition, domain-specific tech...

Automatic Construction of Nominal Case Frames and its Application to Indirect Anaphora Resolution

This paper proposes a method to automatically construct Japanese nominal case frames. The point of our method is the integrated use of a dictionary and example phrases from large corpora. To examine the practical usefulness of the constructed nominal case frames, we also built a system of indirect anaphora resolution based on the case frames. The constructed case frames were evaluated by hand, ...

Automated Acquisition of Anaphora Resolution Strategies

We describe one approach to build an automatically trainable anaphora resolution system. In this approach, we used Japanese newspaper articles tagged with discourse information as training examples for a machine learning algorithm which employs the C4.5 decision tree algorithm by Quinlan (Quinlan 1993). Then, we evaluate and compare the results of several variants of the machine learning-based ...

Journal:
  • Computational Linguistics

Volume 31  Issue 

Pages  -

Publication date 2005