On the bayes-optimality of F-measure maximizers
نویسندگان
چکیده
The F-measure, which has originally been introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure is a statistically and computationally challenging problem, since no closed-form solution exists. Adopting a decision-theoretic perspective, this article provides a formal and experimental analysis of different approaches for maximizing the F-measure. We start with a Bayes-risk analysis of related loss functions, such as Hamming loss and subset zero-one loss, showing that optimizing such losses as a surrogate of the F-measure leads to a high worst-case regret. Subsequently, we perform a similar type of analysis for F-measure maximizing algorithms, showing that such algorithms are approximate, while relying on additional assumptions regarding the statistical distribution of the binary response variables. Furthermore, we present a new algorithm which is not only computationally efficient but also Bayesoptimal, regardless of the underlying distribution. To this end, the algorithm requires only a quadratic (with respect to the number of binary responses) number of parameters of the joint distribution. We illustrate the practical performance of all analyzed methods by means of experiments with multi-label classification problems.
منابع مشابه
Degree of Optimality as a Measure of Distance of Power System Operation from Optimal Operation
This paper presents an algorithm based on inter-solutions of having scheduled electricity generation resources and the fuzzy logic as a sublimation tool of outcomes obtained from the schedule inter-solutions. The goal of the algorithm is to bridge the conflicts between minimal cost and other aspects of generation. In the past, the optimal scheduling of electricity generation resources has been ...
متن کاملUniqueness and Characterization of the Maximizers of Integral Functionals with Constraints
Optimization problems appear in economics, mathematics, to name only a few areas. They refer to minimizing, respectively maximizing a certain functional under several constraints. These problems are also connected with rearrangements. Often times the minimizers, or maximizers satisfy certain symmetry and monotonicity conditions, i.e. increasing or decreasing. For (X,μ) a measure space, and F a ...
متن کاملOptimality conditions for maximizers of the information divergence from an exponential family
The information divergence of a probability measure P from an exponential family E over a nite set is de ned as in mum of the divergences of P from Q subject to Q ∈ E . All directional derivatives of the divergence from E are explicitly found. To this end, behaviour of the conjugate of a log-Laplace transform on the boundary of its domain is analysed. The rst order conditions for P to be a maxi...
متن کاملEnsemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in different areas such as banking has been increased. Credit card fraud is a crucial problem in banking and its danger is over increasing. This paper proposes an advanced data mining method, considering both feature selection and decision cost for accuracy enhancement of credit card fraud detection. After selecting the best and most effec...
متن کاملخوشهبندی اسناد مبتنی بر آنتولوژی و رویکرد فازی
Data mining, also known as knowledge discovery in database, is the process to discover unknown knowledge from a large amount of data. Text mining is to apply data mining techniques to extract knowledge from unstructured text. Text clustering is one of important techniques of text mining, which is the unsupervised classification of similar documents into different groups. The most important step...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of Machine Learning Research
دوره 15 شماره
صفحات -
تاریخ انتشار 2014