Who's Who? Identifying Concepts and Entities across Multiple Documents

Authors

  • Zunaid Kazi
  • Yael Ravin
Abstract

A number of research and software development groups have developed technology for identifying terms and names in documents and associating them with concepts and named entities, but few have addressed coreference of concepts and entities across multiple documents in a collection. Cross-document coreference is challenging, since a collection of documents consists of multiple discourse contexts, with a many-to-many correspondence between terms and names on one hand and the concepts and entities they refer to on the other. In this paper we describe extensions to our intra-document term and name identification for coreferencing concepts and entities across documents.

Similar resources

Solving the "Who's Mark Johnson Puzzle": Information Extraction Based Cross Document Coreference

Cross Document Coreference (CDC) is the problem of resolving the underlying identity of entities across multiple documents and is a major step for document understanding. We develop a framework to efficiently determine the identity of a person based on extracted information, which includes unary properties such as gender and title, as well as binary relationships with other named entities such ...

Is Hillary Rodham Clinton The President? Disambiguating Names Across Documents

A number of research and software development groups have developed name identification technology, but few have addressed the issue of cross-document coreference, or identifying the same named entities across documents. In a collection of documents, where there are multiple discourse contexts, there exists a many-to-many correspondence between names and entities, making it a challenge to automa...

Identifying Similar and Co-referring Documents Across Languages

This paper presents a methodology for finding similarity and co-reference of documents across languages. The similarity between the documents is identified according to the content of the whole document and co-referencing of documents is found by taking the named entities present in the document. Here we use Vector Space Model (VSM) for identifying both similarity and co-reference. This can be ...

Conceptual and institutional gaps: understanding how the WHO can become a more effective cross-sectoral collaborator

BACKGROUND Two themes consistently emerge from the broad range of academics, policymakers and opinion leaders who have proposed changes to the World Health Organization (WHO): that reform efforts are too slow, and that they do too little to strengthen WHO's capacity to facilitate cross-sectoral collaboration. This study seeks to identify possible explanations for the challenges WHO faces in add...

A Bibliometric Analysis of Open Strategy: A new Concept in Strategic Management

Strategy development has traditionally been an exclusive and secretive matter. However, some organizations have recently used IT to enable openness for making a strategy. The aim of this paper was to research the trends of open strategy by applying bibliometric mapping. The method involves identifying open strategy-related documents, including a sample of 1717 existing documents from 2000 to 20...

Publication date: 2000