Fine-grained Named Entity Classification in Machine Reading

Author

  • Xi Lin
Abstract

Fine-grained named entity classification (FG-NEC) refers to the process of classifying named entities from naturally occurring text at the maximum granularity. It is essentially different from traditional coarse-grained NEC (PER, LOC, ORG) in that it requires deep semantic analysis and the fine-grained semantic classes are highly ambiguous. While research has been conducted in an application-oriented manner, few works have addressed the problem per se. This thesis addresses it, with a special focus on the person category. Our methodology is to first extract the key property of each candidate instance and then automatically classify the instances according to a reference taxonomy. The classification takes into account the non-uniformity and insufficiency of context clues and uses a cascade framework in which named entities with different kinds of context clues are resolved at different stages. The cascade framework is highly efficient because simple instances are filtered out at early stages, so the system can focus on the more difficult ones. We also developed a joint-inference-based property extraction algorithm for entities whose target properties are explicitly specified in the text. Evaluated on the Wall Street Journal corpus, the extractor achieves an F1 score of 91.91, which is quite competitive. Trained on newswire texts, this framework can be easily tuned to apply to texts in other styles.
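The cascade idea in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the stage functions, the clue lists, the labels, and the fallback class `person` are all hypothetical, chosen only to show how instances with strong context clues are resolved early while the rest fall through to later stages.

```python
from typing import Callable, Optional, Sequence, Tuple

# A stage looks at a name and its context and returns a fine-grained label,
# or None to defer the decision to the next stage.
Stage = Callable[[str, str], Optional[str]]


def appositive_stage(name: str, context: str) -> Optional[str]:
    """Hypothetical early stage: a title stated right after the name."""
    marker = name + ", "
    idx = context.find(marker)
    if idx == -1:
        return None
    tail = context[idx + len(marker):].lower()
    for title in ("chief executive", "senator", "analyst"):  # assumed titles
        if tail.startswith(title):
            return title
    return None


def keyword_stage(name: str, context: str) -> Optional[str]:
    """Hypothetical later stage: weaker clues anywhere in the context."""
    clues = {"legislation": "senator", "earnings forecast": "analyst"}
    lowered = context.lower()
    for clue, label in clues.items():
        if clue in lowered:
            return label
    return None


def classify_cascade(
    name: str, context: str, stages: Sequence[Stage]
) -> Tuple[str, int]:
    """Return (label, index of the stage that resolved the instance)."""
    for depth, stage in enumerate(stages):
        label = stage(name, context)
        if label is not None:
            return label, depth  # resolved here; later stages never run
    return "person", len(stages)  # assumed coarse fallback class


STAGES = (appositive_stage, keyword_stage)
```

For example, `classify_cascade("John Smith", "John Smith, senator from Ohio, spoke.", STAGES)` is resolved by the first stage, so the second stage only ever sees instances that lack an explicit title.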

Similar resources

Name Translation based on Fine-grained Named Entity Recognition in a Single Language

We propose named entity abstraction methods with fine-grained named entity labels for improving statistical machine translation (SMT). The methods are based on a bilingual named entity recognizer that uses a monolingual named entity recognizer with transliteration. Through experiments, we demonstrate that incorporating fine-grained named entities into statistical machine translation improves th...

Fine-Grained Named Entity Recognition Using Conditional Random Fields for Question Answering

In many QA systems, fine-grained named entities are extracted by coarse-grained named entity recognizer and fine-grained named entity dictionary. In this paper, we describe a fine-grained Named Entity Recognition using Conditional Random Fields (CRFs) for question answering. We used CRFs to detect boundary of named entities and Maximum Entropy (ME) to classify named entity classes. Using the pr...

Fine-grained Arabic named entity recognition

Named Entity Recognition (NER) is a Natural Language Processing (NLP) task, which aims to extract useful information from unstructured textual data by detecting and classifying Named Entity (NE) phrases into predefined semantic classes. This thesis addresses the problem of fine-grained NER for Arabic, which poses unique linguistic challenges to NER; such as the absence of capitalisation and sho...

Improving Related Entity Finding via Incorporating Homepages and Recognizing Fine-grained Entities

This paper describes experiments on the TREC entity track that studies retrieval of homepages representing entities relevant to a query. Many studies have focused on extracting entities that match the given coarse-grained types such as organizations, persons, locations by using a named entity recognizer, and employing language model techniques to calculate similarities between query and support...

Domain Information for Fine-Grained Person Name Categorization

Named Entity Recognition became the basis of many Natural Language Processing applications. However, the existing coarse-grained named entity recognizers are insufficient for complex applications such as Question Answering, Internet Search engines or Ontology population. In this paper, we propose a domain distribution approach according to which names which occur in the same domains belong to th...

Publication date: 2011