Probability estimation for large margin classifiers
Authors
Abstract
Large margin classifiers have proven effective in delivering high predictive accuracy, particularly those that focus on the decision boundary and bypass the requirement of estimating the class probability given the input. As a result, these classifiers may not directly yield an estimated class probability, which is of interest in its own right. To overcome this difficulty, this article proposes a novel method to estimate the class probability through sequential classifications, by utilising features of interval estimation of large margin classifiers. The method uses sequential classifications to bracket the class probability, yielding an estimate up to the desired level of accuracy. The method is implemented for support vector machines and ψ-learning, together with an estimated Kullback-Leibler loss for tuning. A solution path of the method is derived for support vector machines to further reduce its computational cost. Theoretical and numerical analyses indicate that the method is highly competitive against alternatives, especially when the dimension of the input greatly exceeds the sample size. Finally, an application to leukaemia data is described.
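The bracketing idea in the abstract can be sketched as follows: a large margin classifier trained with asymmetric class costs (π for the negative class, 1 − π for the positive class) approximates the Bayes rule sign(p(x) − π), so the predicted label at a point indicates on which side of π the class probability lies, and repeated classifications bisect the bracket down to a chosen tolerance. This is a minimal illustration using a weighted SVM from scikit-learn, not the authors' exact procedure; the kernel, cost parameter, and tolerance are illustrative assumptions.

```python
# Illustrative sketch (assumed setup, not the paper's implementation):
# estimate p(y = +1 | x_new) by bisecting on the threshold pi, using a
# cost-weighted SVM whose weighted Bayes rule is sign(p(x) - pi).
import numpy as np
from sklearn.svm import SVC

def bracket_probability(X, y, x_new, tol=1/8):
    """Bracket p(y = +1 | x_new) via sequential weighted classifications."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        pi = (lo + hi) / 2
        # Cost (1 - pi) for misclassifying class +1 and pi for class -1
        # makes the weighted Bayes rule predict +1 exactly when p(x) > pi.
        clf = SVC(kernel="linear", class_weight={1: 1 - pi, -1: pi})
        clf.fit(X, y)
        if clf.predict(x_new.reshape(1, -1))[0] == 1:
            lo = pi  # classifier says p(x_new) > pi: raise the lower bound
        else:
            hi = pi  # classifier says p(x_new) <= pi: lower the upper bound
    return (lo + hi) / 2  # midpoint of the final bracket

rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(-2, 1, 50), rng.normal(2, 1, 50)]).reshape(-1, 1)
y = np.array([-1] * 50 + [1] * 50)
p_hat = bracket_probability(X, y, np.array([2.0]))
```

Each bisection step halves the bracket, so reaching accuracy 2^(−k) costs k classifier fits; the solution-path result mentioned in the abstract is what makes this sequence of weighted fits cheap for support vector machines.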
Similar articles
Robust Model-Free Multiclass Probability Estimation.
Classical statistical approaches for multiclass probability estimation are typically based on regression techniques such as multiple logistic regression, or density estimation approaches such as linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). These methods often make certain assumptions on the form of probability functions or on the underlying distributions of subc...
Hard or Soft Classification? Large-margin Unified Machines.
Margin-based classifiers have been popular in both machine learning and statistics for classification problems. Among numerous classifiers, some are hard classifiers while some are soft ones. Soft classifiers explicitly estimate the class conditional probabilities and then perform classification based on estimated probabilities. In contrast, hard classifiers directly target on the classificatio...
Multicategory large-margin unified machines
Hard and soft classifiers are two important groups of techniques for classification problems. Logistic regression and Support Vector Machines are typical examples of soft and hard classifiers respectively. The essential difference between these two groups is whether one needs to estimate the class conditional probability for the classification task or not. In particular, soft classifiers predic...
Biometrika Advance Access published November 19, 2007
Large margin classifiers have proven to be effective in delivering high predictive accuracy, particularly those focusing on the decision boundaries and bypassing the requirement of estimating the class probability given input for discrimination. As a result, these classifiers may not directly yield an estimated class probability, which is of interest itself. To overcome this difficulty, this ar...
Gini Support Vector Machine: Quadratic Entropy Based Robust Multi-Class Probability Regression
Many classification tasks require estimation of output class probabilities for use as confidence scores or for inference integrated with other models. Probability estimates derived from large margin classifiers such as support vector machines (SVMs) are often unreliable. We extend SVM large margin classification to GiniSVM maximum entropy multi-class probability regression. GiniSVM combines a q...