AIDA-light: High-Throughput Named-Entity Disambiguation

Authors

  • Dat Ba Nguyen
  • Johannes Hoffart
  • Martin Theobald
  • Gerhard Weikum
Abstract

To advance the Web of Linked Data, mapping ambiguous names in structured and unstructured content onto knowledge bases would be a vital asset. State-of-the-art methods for Named Entity Disambiguation (NED) face a major tradeoff between efficiency/scalability and accuracy. Fast methods use relatively simple context features and avoid computationally expensive algorithms for joint inference. While they do very well on prominent entities in clear input texts, these methods achieve only moderate accuracy on difficult inputs. On the other hand, methods that rely on rich context features and joint inference for mapping names onto entities pay the price of being much slower. This paper presents AIDA-light, which achieves high accuracy on difficult inputs while also being fast and scalable. AIDA-light uses a novel kind of two-stage mapping algorithm. The first stage identifies a set of “easy” mentions with low ambiguity and links them to entities in a very efficient manner. This stage also determines the thematic domain of the input text as an important and novel kind of feature. The second stage harnesses the high-confidence linkage of the “easy” mentions to establish more reliable contexts for disambiguating the remaining mentions. Our experiments with four different datasets demonstrate that the accuracy of AIDA-light is competitive with the very best NED systems, while its run-time is comparable to or better than that of the fastest systems.
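The two-stage idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the candidate dictionary, the prior scores, the domain labels, the `easy_threshold` and `domain_bonus` parameters, and the use of a high prior as a stand-in for "low ambiguity" are all assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical toy dictionary: mention -> list of (entity, prior, domain).
# The names, priors, and domains are made up for this example.
CANDIDATES = {
    "Bundesliga": [("Bundesliga", 0.95, "football")],
    "Bayern": [("Bavaria", 0.6, "geography"),
               ("FC_Bayern_Munich", 0.4, "football")],
    "Dortmund": [("Dortmund", 0.7, "geography"),
                 ("Borussia_Dortmund", 0.3, "football")],
}


def disambiguate(mentions, easy_threshold=0.9, domain_bonus=0.5):
    """Toy two-stage linking: easy mentions first, then the rest."""
    easy = {}
    # Stage 1: link mentions whose best candidate has a very high prior
    # (an assumed proxy for "low ambiguity").
    for mention in mentions:
        candidates = CANDIDATES.get(mention, [])
        if candidates:
            best = max(candidates, key=lambda c: c[1])
            if best[1] >= easy_threshold:
                easy[mention] = best

    # Derive the thematic domain of the text from the easy links.
    counts = Counter(entity[2] for entity in easy.values())
    domain = counts.most_common(1)[0][0] if counts else None

    result = {mention: entity[0] for mention, entity in easy.items()}

    # Stage 2: score the remaining mentions, boosting candidates whose
    # domain matches the domain found in stage 1.
    for mention in mentions:
        if mention in result:
            continue
        candidates = CANDIDATES.get(mention, [])
        if not candidates:
            result[mention] = None  # no known candidate entity
            continue
        best = max(
            candidates,
            key=lambda c: c[1] + (domain_bonus if c[2] == domain else 0.0),
        )
        result[mention] = best[0]
    return result, domain


links, domain = disambiguate(["Bundesliga", "Bayern", "Dortmund"])
print(domain, links)
```

In this toy setting, the single unambiguous mention fixes the domain, which then shifts the two ambiguous mentions away from their most popular candidates. Without any easy mention, the sketch falls back to the prior alone.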


Similar resources

Extending AIDA framework by incorporating coreference resolution on detected mentions and pruning based on popularity of an entity

Named Entity Disambiguation (NED) is gaining popularity due to its applications in the field of information extraction. Entity linking or Named Entity Disambiguation is the task of discovering entities such as persons, locations, organizations, etc. and is challenging due to the high ambiguity of entity names in natural language text. In this paper, we propose a modification to the existing sta...


AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables

We present AIDA, a framework and online tool for entity detection and disambiguation. Given a natural-language text or a Web table, we map mentions of ambiguous names onto canonical entities like people or places, registered in a knowledge base like DBpedia, Freebase, or YAGO. AIDA is a robust framework centred around collective disambiguation exploiting the prominence of entities, similarity b...


U-AIDA: a customizable system for named entity recognition, classification, and disambiguation

Recognizing and disambiguating entities such as people, organizations, events or places in natural language text are essential steps for many linguistic tasks such as information extraction and text categorization. A variety of named entity disambiguation methods have been proposed, but most of them focus on Wikipedia as a sole knowledge resource. This focus does not fit all application scenari...


Combining Mention Context and Hyperlinks from Wikipedia for Named Entity Disambiguation

Named entity disambiguation is the task of linking entity mentions to their intended referent, as represented in a Knowledge Base, usually derived from Wikipedia. In this paper, we combine local mention context and global hyperlink structure from Wikipedia in a probabilistic framework. Our results show that the two models of context, namely, words in the context and hyperlink pathways to other ...


Joint Named Entity Recognition and Disambiguation

Extracting named entities in text and linking extracted names to a given knowledge base are fundamental tasks in applications for text understanding. Existing systems typically run a named entity recognition (NER) model to extract entity names first, then run an entity linking model to link extracted names to a knowledge base. NER and linking models are usually trained separately, and the mutua...




Publication date: 2014