Performance Measures for Information Extraction
Authors
Abstract
While precision and recall have served the information extraction community well as two separate measures of system performance, we show that the F-measure, the weighted harmonic mean of precision and recall, exhibits certain undesirable behaviors. To overcome these limitations, we define an error measure, the slot error rate, which combines the different types of error directly, without having to resort to precision and recall as preliminary measures. The slot error rate is analogous to the word error rate that is used for measuring speech recognition performance; it is intended to be a measure of the cost to the user for the system to make the different types of errors.
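The contrast drawn in the abstract can be illustrated with a minimal Python sketch. The abstract itself gives no formulas, so everything here is an assumption: the function names are hypothetical, precision and recall are computed from counts of correct, substituted, deleted, and inserted slots, and the slot error rate follows the word-error-rate analogy (total errors divided by the number of reference slots).

```python
def precision(correct, substitutions, insertions):
    """Fraction of hypothesized slots that are correct."""
    hypothesized = correct + substitutions + insertions
    return correct / hypothesized if hypothesized else 0.0


def recall(correct, substitutions, deletions):
    """Fraction of reference slots that are correct."""
    reference = correct + substitutions + deletions
    return correct / reference if reference else 0.0


def f_measure(p, r, beta=1.0):
    """Weighted harmonic mean of precision p and recall r."""
    if p == 0.0 and r == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * p * r / (b2 * p + r)


def slot_error_rate(correct, substitutions, deletions, insertions):
    """Assumed WER-style definition: all errors over reference slots."""
    reference = correct + substitutions + deletions
    if reference == 0:
        raise ValueError("reference must contain at least one slot")
    return (substitutions + deletions + insertions) / reference


# Hypothetical counts, chosen only for illustration.
C, S, D, I = 80, 10, 10, 10
p = precision(C, S, I)
r = recall(C, S, D)
f = f_measure(p, r)
ser = slot_error_rate(C, S, D, I)
print(f"P={p:.2f} R={r:.2f} F={f:.2f} 1-F={1 - f:.2f} SER={ser:.2f}")
```

With these illustrative counts, 1 - F comes out lower than the slot error rate. Under the assumed definitions, the balanced F-measure reduces to C / (C + S + (D + I) / 2), so deletions and insertions each weigh half as much as a substitution, whereas the slot error rate counts every error type equally.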
Similar resources
A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model
Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the growing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...
A Geometric View of Similarity Measures in Data Mining
The main objective of data mining is to acquire information from a set of data for prospective applications using a measure. A concern is that one often has to deal with large-scale data. Several dimensionality reduction techniques, such as various feature extraction methods, have been developed to resolve the issue. However, the geometric view of the applied measure, as an additional consid...
Presenting a method for extracting structured domain-dependent information from Farsi Web pages
Extracting structured information about entities from web texts is an important task in web mining, natural language processing, and information extraction. Information extraction is useful in many applications including search engines, question-answering systems, recommender systems, machine translation, etc. An information extraction system aims to identify the entities from the text and extr...
Optimizing Multivariate Performance Measures for Learning Relation Extraction Models
We describe a novel max-margin learning approach to optimize non-linear performance measures for distantly-supervised relation extraction models. Our approach can be generally used to learn latent variable models under multivariate non-linear performance measures, such as Fβ-score. Our approach interleaves Concave-Convex Procedure (CCCP) for populating latent variables with dual decomposition t...
Face Recognition Based Rank Reduction SVD Approach
Standard face recognition algorithms that use standard feature extraction techniques always suffer from image performance degradation. Recently, singular value decomposition and low-rank matrix methods have been applied in many applications, including pattern recognition and feature extraction. The main objective of this research is to design an efficient face recognition approach by combining many tech...