Effect of Pronoun Resolution on Document Similarity

Authors

  • Atul Kumar
  • Sudip Sanyal
  • Thomas Hofmann
  • Tuomo Kakkonen
  • Niko Myller
  • Jari Timonen
  • Erkki Sutinen
Abstract

This paper studies the effect of pronoun resolution on the measurement of document similarity within the framework of the Vector Space Model (VSM) and Probabilistic Latent Semantic Analysis (PLSA). For this purpose we developed a benchmark corpus of documents whose similarity scores were assigned by human judges. We measured the inter-document similarity on these documents using VSM and PLSA. We then performed pronoun resolution on the documents and recalculated the similarity using both methods. Next, we computed the correlation coefficient between each set of scores and the human-generated scores. The correlation coefficients showed substantial and consistent improvement after pronoun resolution.
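The evaluation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes raw term-frequency vectors with cosine similarity for the VSM step, uses Pearson correlation for the comparison with human scores, and replaces pronouns by hand in a toy example instead of running a resolver. The PLSA step is omitted. All function names and example sentences are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Build a lowercased bag-of-words term-frequency vector (assumed weighting)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson(xs, ys):
    """Pearson correlation between system scores and human scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)


# Toy documents (hypothetical). The "resolved" text has its pronouns
# replaced by hand; a real pipeline would use a pronoun resolver here.
original_a = "Mary bought a car. She drives it daily."
resolved_a = "Mary bought a car. Mary drives the car daily."
document_b = "Mary drives the car daily."

sim_before = cosine_similarity(term_vector(original_a), term_vector(document_b))
sim_after = cosine_similarity(term_vector(resolved_a), term_vector(document_b))
```

With similarity scores collected for many document pairs, `pearson(system_scores, human_scores)` would be computed once before and once after resolution, and the two coefficients compared. In this toy pair, resolution raises the cosine score because the antecedents now contribute matching terms that the pronouns had hidden.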

Similar resources

A Deep Neural Network for Chinese Zero Pronoun Resolution

This paper investigates the problem of Chinese zero pronoun resolution. Most existing approaches are based on machine learning algorithms that use hand-crafted features, which are labor-intensive to build. Moreover, semantic information that is essential in the resolution of noun phrases has not been addressed enough by previous approaches to zero pronoun resolution. This is because zero pronouns have...

Pronoun Resolution and Summary Extraction From English Documents

This paper presents an approach to generating a precise and meaningful summary of a document in English. Here, we adopt a modified version of Sidner's Focusing algorithm to perform pronoun resolution. We devise an algorithm to divide compound and complex sentences of any kind into simple sentences. Anaphora resolution is performed on these simple sentences. Then we find the lexical cohesion between...

The Role of Gender Information in Pronoun Resolution: Evidence from Chinese

Although previous studies have consistently demonstrated that gender information is used to resolve pronouns, the mechanisms underlying the use of gender information continue to be controversial. The present study used event-related potentials (ERPs) to investigate whether working memory modulates the effect of gender information on pronoun resolution. The critical pronoun agreed or disagreed w...

A Document-Level SMT System with Integrated Pronoun Prediction

This paper describes one of Uppsala University’s submissions to the pronoun-focused machine translation (MT) shared task at DiscoMT 2015. The system is based on phrase-based statistical MT implemented with the document-level decoder Docent. It includes a neural network for pronoun prediction trained with latent anaphora resolution. At translation time, coreference information is obtained from th...

Disambiguation of the Neuter Pronoun and Its Effect on Pronominal Coreference Resolution

Coreference resolution, determining the appropriate discourse referent for an anaphoric expression, is an essential but difficult task in natural language processing. It has been observed that an important source of errors in machine-learning-based approaches to this task is the wrong disambiguation of the third person singular neuter pronoun as either referential or non-referential. In this p...

Publication date: 2010