Robust supervised classification with mixture models: Learning from data with uncertain labels
نویسندگان
چکیده
In the supervised classification framework, human supervision is required for labeling a set of learning data which are then used for building the classifier. However, in many applications, human supervision is either imprecise, difficult or expensive. In this paper, the problem of learning a supervised multiclass classifier from data with uncertain labels is considered and a modelbased classification method is proposed to solve it. The idea of the proposed method is to confront an unsupervised modelling of the data with the supervised information carried by the labels of the learning data in order to detect inconsistencies. The method is able afterward to build a robust classifier taking into account the detected inconsistencies into the labels. Experiments on artificial and real data are provided to highlight the main features of the proposed method as well as an application to object recognition under weak supervision.
منابع مشابه
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...
متن کاملSupervised classification of categorical data with uncertain labels for DNA barcoding
In the supervised classification framework, the human supervision is required for labeling a set of learning data which are then used for building the classifier. However, in many applications, the human supervision is either imprecise, difficult or expensive and this gives rise to non robust classifiers. An interesting application where this situation occurs is DNA barcoding which aims to deve...
متن کاملLearning from partially supervised data using mixture models and belief functions
This paper addresses classification problems in which the class membership of training data is only partially known. Each learning sample is assumed to consist in a feature vector xi ∈ X and an imprecise and/or uncertain “soft” label mi defined as a Dempster-Shafer basic belief assignment over the set of classes. This framework thus generalizes many kinds of learning problems including supervis...
متن کاملA Statistical Approach to Increase Classification Accuracy in Supervised Learning Algorithms
Probabilistic mixture models have been widely used for different machine learning and pattern recognition tasks such as clustering, dimensionality reduction, and classification. In this paper, we focus on trying to solve the most common challenges related to supervised learning algorithms by using mixture probability distribution functions. With this modeling strategy, we identify sub-labels an...
متن کاملRobust Method for E-Maximization and Hierarchical Clustering of Image Classification
We developed a new semi-supervised EM-like algorithm that is given the set of objects present in eachtraining image, but does not know which regions correspond to which objects. We have tested thealgorithm on a dataset of 860 hand-labeled color images using only color and texture features, and theresults show that our EM variant is able to break the symmetry in the initial solution. We compared...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- Pattern Recognition
دوره 42 شماره
صفحات -
تاریخ انتشار 2009