Online Cost-Sensitive Learning for Efficient Interactive Classification
Authors
Abstract
Many practical machine learning applications involve interactive classification, in which trained classifiers help humans find the positive examples that interest them. Typically, these classifiers label a large number of test examples and present the reviewers with a ranked list. The reviewers are often expensive domain experts with limited time. We present an online cost-sensitive learning approach (more-like-this) that focuses on reducing the time experts need to review and label examples in interactive machine learning systems. We target the scenario where a batch classifier has already been trained for a given classification task, and we optimize the interaction between that classifier and the domain experts who consume its results. The goal is to make these experts more efficient and effective at their task while also improving the classifier over time. We validate our approach on the problem of detecting errors in health insurance claims and show a significant reduction in labeling time together with an increase in overall system performance.
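The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general workflow it describes: a trained model ranks a pool of examples, an expert reviews the top-ranked one, and the model is updated online with a cost-weighted step. The linear model, the logistic update, the cost values `cost_fn` and `cost_fp`, and all function names are assumptions made for illustration; this is not the authors' more-like-this method.

```python
import math


def score(weights, x):
    """Linear score passed through a sigmoid; higher means more likely positive."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def rank(weights, pool):
    """Return pool indices sorted from most to least likely positive."""
    return sorted(range(len(pool)), key=lambda i: score(weights, pool[i]), reverse=True)


def online_update(weights, x, label, cost_fn=5.0, cost_fp=1.0, lr=0.5):
    """One cost-weighted logistic-regression gradient step.

    cost_fn and cost_fp are hypothetical misclassification costs: the step on
    a positive example (label 1) is scaled by cost_fn, on a negative by cost_fp.
    """
    cost = cost_fn if label == 1 else cost_fp
    error = label - score(weights, x)
    return [w + lr * cost * error * xi for w, xi in zip(weights, x)]


def review_loop(weights, pool, expert_label, budget):
    """Show the expert the top-ranked unreviewed example, then update the model."""
    remaining = list(pool)
    reviewed = []
    for _ in range(min(budget, len(remaining))):
        top = rank(weights, remaining)[0]   # re-rank after every update
        x = remaining.pop(top)
        y = expert_label(x)                 # hypothetical expert callback
        reviewed.append((x, y))
        weights = online_update(weights, x, y)
    return weights, reviewed
```

In this sketch, re-ranking after each label is what lets one confirmed positive pull similar examples toward the top of the list, so the expert's limited review budget is spent on the most promising items first.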
Similar resources
Survey of Novel Method for Online Classification in Data Mining
Cost-sensitive classification and online learning have been widely studied in the data mining and machine learning communities. Even though these topics are receiving more and more attention, very few studies address the important problem of cost-sensitive online classification. This problem can be explored more widely, and new techniques can be developed to deal with it. By dire...
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique that classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training amounts to solving an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
Cost-Sensitive Double Updating Online Learning and Its Application to Online Anomaly Detection
Although both cost-sensitive classification and online learning have been well studied separately in data mining and machine learning, there have been very few comprehensive studies of cost-sensitive online classification in the literature. In this paper, we formally investigate this problem by directly optimizing cost-sensitive measures for an online classification task. As the first comprehensive study, ...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Online Streaming Feature Selection Using Geometric Series of the Adjacency Matrix of Features
Feature Selection (FS) is an important pre-processing step in machine learning and data mining. All the traditional feature selection methods assume that the entire feature space is available from the beginning. However, online streaming features (OSF) are an integral part of many real-world applications. In OSF, the number of training examples is fixed while the number of features grows with t...