Selective sampling algorithms for cost-sensitive multiclass prediction
نویسنده
چکیده
In this paper, we study the problem of active learning for cost-sensitive multiclass classification. We propose selective sampling algorithms, which process the data in a streaming fashion, querying only a subset of the labels. For these algorithms, we analyze the regret and label complexity when the labels are generated according to a generalized linear model. We establish that the gains of active learning over passive learning can range from none to exponentially large, based on a natural notion of margin. We also present a safety guarantee to guard against model mismatch. Numerical simulations show that our algorithms indeed obtain a low regret with a small number of queries.
منابع مشابه
Logistic Online Learning Methods and Their Application to Incremental Dependency Parsing
We investigate a family of update methods for online machine learning algorithms for cost-sensitive multiclass and structured classification problems. The update rules are based on multinomial logistic models. The most interesting question for such an approach is how to integrate the cost function into the learning paradigm. We propose a number of solutions to this problem. To demonstrate the a...
متن کاملReduction from Cost-Sensitive Multiclass Classification to One-versus-One Binary Classification
Many real-world applications require varying costs for different types of mis-classification errors. Such a cost-sensitive classification setup can be very different from the regular classification one, especially in the multiclass case. Thus, traditional meta-algorithms for regular multiclass classification, such as the popular one-versus-one approach, may not always work well under the cost-s...
متن کاملA Simple Cost-sensitive Multiclass Classification Algorithm Using One-versus-one Comparisons
Many real-world applications require varying costs for different types of misclassification errors. Such a cost-sensitive classification setup can be very different from the regular classification one, especially in the multiclass case. Thus, traditional meta-algorithms for regular multiclass classification, such as the popular one-versus-one approach, may not always work well under the cost-se...
متن کاملGuess-Averse Loss Functions For Cost-Sensitive Multiclass Boosting
Cost-sensitive multiclass classification has recently acquired significance in several applications, through the introduction of multiclass datasets with well-defined misclassification costs. The design of classification algorithms for this setting is considered. It is argued that the unreliable performance of current algorithms is due to the inability of the underlying loss functions to enforc...
متن کاملNew Algorithms for Optimizing Multi-Class Classifiers via ROC Surfaces
We study the problem of optimizing a multiclass classifier based on its ROC hypersurface and a matrix describing the costs of each type of prediction error. For a binary classifier, it is straightforward to find an optimal operating point based on its ROC curve and the relative cost of true positive to false positive error. However, the corresponding multiclass problem (finding an optimal opera...
متن کامل