Robust Semi-Supervised Learning through Label Aggregation
Authors
Abstract
Semi-supervised learning exploits both labeled and unlabeled data. However, as the scale of data in real-world applications grows, conventional semi-supervised algorithms usually incur massive computational cost and cannot be applied to large-scale datasets. In addition, label noise is usually present in practical applications because of human annotation, and it is very likely to degrade the performance of semi-supervised methods substantially. To address these two challenges, we propose an efficient RObust Semi-Supervised Ensemble Learning (ROSSEL) method, which generates pseudo-labels for unlabeled data using a set of weak annotators and combines them to approximate the ground-truth labels to assist semi-supervised learning. We formulate the weighted combination process as a multiple label kernel learning (MLKL) problem, which can be solved efficiently. Compared with other semi-supervised learning algorithms, the proposed method has linear time complexity. Extensive experiments on five benchmark datasets demonstrate the superior effectiveness, efficiency, and robustness of the proposed algorithm.
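The aggregation idea in the abstract (several weak annotators produce pseudo-labels, which are then combined with weights) can be illustrated with a minimal sketch. This is not the ROSSEL algorithm and does not solve the MLKL problem: the nearest-centroid annotators, the stratified bootstrap, and the accuracy-based weights are all assumptions chosen for brevity, and every function name below is hypothetical.

```python
import numpy as np


def fit_centroid_annotator(X, y):
    """Hypothetical weak annotator: nearest-centroid rule, labels in {-1, +1}."""
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y == -1].mean(axis=0)

    def predict(Z):
        d_pos = np.linalg.norm(Z - mu_pos, axis=1)
        d_neg = np.linalg.norm(Z - mu_neg, axis=1)
        return np.where(d_pos <= d_neg, 1, -1)

    return predict


def aggregate_pseudo_labels(X_lab, y_lab, X_unl, n_annotators=10, seed=0):
    """Combine pseudo-labels from several weak annotators by a weighted vote.

    Assumption: weights are proportional to each annotator's accuracy on the
    labeled set. This is a simple stand-in for learned combination weights,
    not the MLKL formulation named in the abstract.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y_lab == 1)
    neg = np.flatnonzero(y_lab == -1)
    votes, weights = [], []
    for _ in range(n_annotators):
        # Stratified bootstrap so every annotator sees both classes.
        idx = np.concatenate([
            rng.choice(pos, size=len(pos)),
            rng.choice(neg, size=len(neg)),
        ])
        annotator = fit_centroid_annotator(X_lab[idx], y_lab[idx])
        votes.append(annotator(X_unl))
        weights.append(np.mean(annotator(X_lab) == y_lab))
    votes = np.array(votes)                      # shape (n_annotators, n_unl)
    weights = np.array(weights) / np.sum(weights)
    score = weights @ votes                      # weighted vote per point
    return np.where(score >= 0, 1, -1), weights


def demo(seed=0):
    """Synthetic two-blob data; returns pseudo-label accuracy and weights."""
    rng = np.random.default_rng(seed)

    def blobs(n):
        y = np.where(np.arange(n) % 2 == 0, 1, -1)
        X = rng.normal(size=(n, 2)) + 3.0 * y[:, None]
        return X, y

    X_lab, y_lab = blobs(20)
    X_unl, y_unl = blobs(200)
    y_hat, w = aggregate_pseudo_labels(X_lab, y_lab, X_unl, seed=seed)
    return float(np.mean(y_hat == y_unl)), w
```

Each annotator in this sketch labels the unlabeled set in one pass, so the cost grows linearly with the number of unlabeled points. That mirrors the scalability goal stated in the abstract but says nothing about the complexity of the actual method.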
Similar resources
READER: Robust Semi-Supervised Multi-Label Dimension Reduction
Multi-label classification is an appealing and challenging supervised learning problem, where multiple labels, rather than a single label, are associated with an unseen test instance. To remove possible noise in labels and in high-dimensional features, multi-label dimension reduction has attracted more and more attention in recent years. The existing methods usually suffer from several pro...
Simple, Robust, Scalable Semi-supervised Learning via Expectation Regularization
Although semi-supervised learning has been an active area of research, its use in deployed applications is still relatively rare because the methods are often difficult to implement, fragile in tuning, or lacking in scalability. This paper presents expectation regularization, a semi-supervised learning method for exponential family parametric models that augments the traditional conditional lab...
Label Propagation for Semi-Supervised Learning in Self-Organizing Maps
Semi-supervised learning aims at discovering spatial structures in high-dimensional input spaces when insufficient background information about clusters is available. A particularly interesting approach is based on propagation of class labels through proximity graphs. The Emergent Self-Organizing Map (ESOM) itself can be seen as such a proximity graph that is suitable for label propagation. It t...
A Novel Multi label Text Classification Model using Semi supervised learning
Automatic text categorization (ATC) is a prominent research area within information retrieval. This paper discusses a classification model for ATC in the multi-label domain. We propose a new multi-label text classification model for assigning a more relevant set of categories to every input text document. Our model is greatly influenced by a graph-based framework and Semi supervised le...
Towards Multi Label Text Classification through Label Propagation
Classifying text data has been an active area of research for a long time. A text document is a multifaceted object and is often inherently ambiguous. Multi-label learning deals with such ambiguous objects. Classifying such ambiguous text objects often makes the classifier's task difficult when assigning relevant classes to an input document. Traditional single label and multi class text cla...