A unified view of performance metrics: translating threshold choice into expected classification loss
نویسندگان
چکیده
Many performance metrics have been introduced in the literature for the evaluation of classification performance, each of them with different origins and areas of application. These metrics include accuracy, unweighted accuracy, the area under the ROC curve or the ROC convex hull, the mean absolute error and the Brier score or mean squared error (with its decomposition into refinement and calibration). One way of understanding the relations among these metrics is by means of variable operating conditions (in the form of misclassification costs and/or class distributions). Thus, a metric may correspond to some expected loss over different operating conditions. One dimension for the analysis has been the distribution for this range of operating conditions, leading to some important connections in the area of proper scoring rules. We demonstrate in this paper that there is an equally important dimension which has so far received much less attention in the analysis of performance metrics. This dimension is given by the decision rule, which is typically implemented as a threshold choice method when using scoring models. In this paper, we explore many old and new threshold choice methods: fixed, score-uniform, score-driven, rate-driven and optimal, among others. By calculating the expected loss obtained with these threshold choice methods for a uniform range of operating conditions we give clear interpretations of the 0-1 loss, the absolute error, the Brier score, the AUC and the refinement loss respectively. Our analysis provides a comprehensive view of performance metrics as well as a systematic approach to loss minimisation which can be summarised as follows: given a model, apply the threshold choice methods that correspond with the available information about the operating condition, and compare their expected losses. In order to assist in this procedure we also derive several connections between the aforementioned performance metrics, and we highlight the role of calibration in choosing the threshold choice method.
منابع مشابه
Threshold Choice Methods: the Missing Link
Many performance metrics have been introduced in the literature for the evaluation of classification performance, each of them with different origins and areas of application. These metrics include accuracy, macro-accuracy, area under the ROC curve or the ROC convex hull, the mean absolute error and the Brier score or mean squared error (with its decomposition into refinement and calibration). ...
متن کاملVacation model for Markov machine repair problem with two heterogeneous unreliable servers and threshold recovery
Markov model of multi-component machining system comprising two unreliable heterogeneous servers and mixed type of standby support has been studied. The repair job of broken down machines is done on the basis of bi-level threshold policy for the activation of the servers. The server returns back to render repair job when the pre-specified workload of failed machines is build up. The first (seco...
متن کاملA COMPARATIVE ANALYSIS OF WAVELET-BASED FEMG SIGNAL DENOISING WITH THRESHOLD FUNCTIONS AND FACIAL EXPRESSION CLASSIFICATION USING SVM AND LSSVM
This work presents a technique for the analysis of Facial Electromyogram signal activities to classify five different facial expressions for Computer-Muscle Interfacing applications. Facial Electromyogram (FEMG) is a technique for recording the asynchronous activation of neuronal inside the face muscles with non-invasive electrodes. FEMG pattern recognition is a difficult task for the researche...
متن کاملA Unified View on Multi-class Support Vector Classification
A unified view on multi-class support vector machines (SVMs) is presented, covering most prominent variants including the one-vs-all approach and the algorithms proposed by Weston & Watkins, Crammer & Singer, Lee, Lin, & Wahba, and Liu & Yuan. The unification leads to a template for the quadratic training problems and new multi-class SVM formulations. Within our framework, we provide a comparat...
متن کاملNorms of Translating Taboo Words and Concepts from English into Persian after the Islamic Revolution in Iran
The research attempted to discover the norms of translating taboo words and concepts after the Islamic Revolution in Iran using Toury’s (1995) framework for classification of norms. The corpus of the study composed of Coelho’s novels between 1990 and 2005 and their Persian translations which were prepared and analyzed manually to discover the norms. During both the selection of novels for trans...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Machine Learning Research
دوره 13 شماره
صفحات -
تاریخ انتشار 2012