Complementarity, F-score, and NLP Evaluation

Author

  • Leon Derczynski
Abstract

This paper addresses the problem of quantifying the differences between entity extraction systems, where in general only a small proportion of a document should be selected. Comparing overall accuracy is not very useful in these cases, as small differences in accuracy may correspond to huge differences in selections over the target minority class. Conventionally, one may use per-token complementarity to describe these differences, but it is not very useful when the set is heavily skewed. In such situations, which are common in information retrieval and entity recognition, metrics like precision and recall are typically used to describe performance. However, precision and recall fail to describe the differences between the sets of objects selected by different decision strategies; they describe only the proportions of correct and incorrect objects selected. This paper presents a method for measuring complementarity for precision, recall and F-score, quantifying the difference between entity extraction approaches.
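The idea in the abstract can be illustrated with a minimal sketch. It assumes that complementarity is the share of one system's errors that the other system does not also make, and that this is applied to false positives for precision and to false negatives for recall. The function names, the harmonic-mean combination for F-score, and the zero-error convention are assumptions made for illustration; the abstract does not give the paper's exact formulation.

```python
def complementarity(errors_a, errors_b):
    """Share of A's errors that B does not also make: 1 - |common| / |A's errors|."""
    errors_a, errors_b = set(errors_a), set(errors_b)
    if not errors_a:
        # Assumed convention: if A makes no errors, B has nothing to complement.
        return 0.0
    return 1.0 - len(errors_a & errors_b) / len(errors_a)


def prf_complementarity(gold, selected_a, selected_b):
    """Return (precision, recall, F) complementarity of system B with respect to A.

    gold, selected_a and selected_b are collections of hashable item identifiers
    (for example token offsets or entity spans).
    """
    gold, a, b = set(gold), set(selected_a), set(selected_b)
    fp_a, fp_b = a - gold, b - gold  # false positives: selected but not in gold
    fn_a, fn_b = gold - a, gold - b  # false negatives: in gold but not selected
    comp_p = complementarity(fp_a, fp_b)
    comp_r = complementarity(fn_a, fn_b)
    # Assumption: combine the two as a harmonic mean, by analogy with F1.
    total = comp_p + comp_r
    comp_f = 2 * comp_p * comp_r / total if total else 0.0
    return comp_p, comp_r, comp_f


gold = {1, 2, 3, 4}
system_a = {1, 2, 5, 6}
system_b = {1, 3, 5, 7}
print(prf_complementarity(gold, system_a, system_b))  # (0.5, 0.5, 0.5)
```

In this toy case both systems have the same precision and recall (0.5 each), yet half of A's false positives and half of A's false negatives are not shared by B, which is exactly the kind of difference that precision and recall alone do not show.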

Similar resources

Word Embedding Evaluation and Combination

Word embeddings have been successfully used in several natural language processing (NLP) and speech processing tasks. Different approaches have been introduced to calculate word embeddings through neural networks. In the literature, many studies have focused on word embedding evaluation, but to our knowledge, there are still some gaps. This paper presents a study focusing on a rigorous comparison o...

Solving Mathematical Programs with Complementarity Constraints with Nonlinear Solvers

Summary. MPCC can be solved with specific MPCC codes or in its nonlinear equivalent formulation (NLP) using NLP solvers. Two NLP solvers, NPSOL and the line search filter SQP, are used to solve a collection of test problems in AMPL. Both are based on the SQP (Sequential Quadratic Programming) philosophy, but the second one uses a line search filter scheme.

Recognition of medication information from discharge summaries using ensembles of classifiers

Background: Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different ...

On an enumerative algorithm for solving eigenvalue complementarity problems

In this paper, we discuss the solution of linear and quadratic eigenvalue complementarity problems (EiCPs) using an enumerative algorithm of the type introduced by Júdice et al. [1]. Procedures for computing the interval that contains all the eigenvalues of the linear EiCP are first presented. A nonlinear programming (NLP) model for the quadratic EiCP is formulated next, and a necessary and suf...

A Supervised Machine Learning Approach for Temporal Information Extraction

Temporal information extraction is an interesting research area in Natural Language Processing (NLP). Here, the main task involves identification of the different relations between various events and time expressions in a document. The relations are then classified into some predefined categories like BEFORE, AFTER, OVERLAP, BEFORE-OR-OVERLAP, OVERLAP-OR-AFTER and VAGUE. In this paper, we repor...

Publication date: 2016