ARE: Instance Splitting Strategies for Dependency Relation-Based Information Extraction

نویسندگان

  • Mstislav Maslennikov
  • Hai-Kiat Goh
  • Tat-Seng Chua
چکیده

Information Extraction (IE) is a fundamental technology for NLP. Previous methods for IE were relying on co-occurrence relations, soft patterns and properties of the target (for example, syntactic role), which result in problems of handling paraphrasing and alignment of instances. Our system ARE (Anchor and Relation) is based on the dependency relation model and tackles these problems by unifying entities according to their dependency relations, which we found to provide more invariant relations between entities in many cases. In order to exploit the complexity and characteristics of relation paths, we further classify the relation paths into the categories of ‘easy’, ‘average’ and ‘hard’, and utilize different extraction strategies based on the characteristics of those categories. Our extraction method leads to improvement in performance by 3% and 6% for MUC4 and MUC6 respectively as compared to the state-of-art IE systems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Discovery of Dependency Tree Patterns for Relation Extraction

Relation extraction is to identify the relations between pairs of named entities. In this paper, we try to solve the problem of relation extraction by discovering dependency tree patterns (a pattern is an embedded sub dependency tree indicating a relation instance). Our approach is to find an optimal rule (pattern) set automatically based on the proposed dependency tree pattern mining algorithm...

متن کامل

Preventing Key Performance Indicators Violations Based on Proactive Runtime Adaptation in Service Oriented Environment

Key Performance Indicator (KPI) is a type of performance measurement that evaluates the success of an organization or a partial activity in which it engages. If during the running process instance the monitoring results show that the KPIs do not reach their target values, then the influential factors should be identified, and the appropriate adaptation strategies should be performed to prevent ...

متن کامل

Automatic Acquisition of Ranked IS-A Relation from Unstructured Text

In this paper, we present a weakly-supervised, general-purpose algorithm for IS-A relation extraction. The algorithm automatically identifies highly relevant triples for IS-A relation from a text collection and rank the triples through a combination of linguistic and statistical processing. The main features are: i) a method based on dependency structure analysis of texts, ii) a method to explo...

متن کامل

Extraction of Drug-Drug Interaction from Literature through Detecting Linguistic-based Negation and Clause Dependency

Extracting biomedical relations such as drug-drug interaction (DDI) from text is an important task in biomedical NLP. Due to the large number of complex sentences in biomedical literature, researchers have employed some sentence simplification techniques to improve the performance of the relation extraction methods. However, due to difficulty of the task, there is no noteworthy improvement in t...

متن کامل

A Weakly-Supervised Rule-Based Approach for Relation Extraction

Resumen Rule-based approaches for information extraction usually achieve good precision values, even if they often need a lot of manual effort to be implemented. In this paper, we present a novel rule-based strategy for semantic relation extraction that takes advantage of partial syntactic parsing in order to simplify the linguistic structures containing instances of semantic relations. We also...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006