Semi-supervised learning using greedy max-cut

Authors

Jun Wang

Tony Jebara

Shih-Fu Chang

Abstract

Graph-based semi-supervised learning (SSL) methods play an increasingly important role in practical machine learning systems, particularly in agnostic settings when no parametric information or other prior knowledge is available about the data distribution. Given the constructed graph represented by a weight matrix, transductive inference is used to propagate known labels to predict the values of all unlabeled vertices. Designing a robust label diffusion algorithm for such graphs is a widely studied problem and various methods have recently been suggested. Many of these can be formalized as regularized function estimation through the minimization of a quadratic cost. However, most existing label diffusion methods minimize a univariate cost with the classification function as the only variable of interest. Since the observed labels seed the diffusion process, such univariate frameworks are extremely sensitive to the initial label choice and any label noise. To alleviate the dependency on the initial observed labels, this article proposes a bivariate formulation for graph-based SSL, where both the binary label information and a continuous classification function are arguments of the optimization. This bivariate formulation is shown to be equivalent to a linearly constrained Max-Cut problem. Finally, an efficient solution via greedy gradient Max-Cut (GGMC) is derived which gradually assigns unlabeled vertices to each class with minimum connectivity. Once convergence guarantees are established, this greedy Max-Cut based SSL is applied to both artificial and standard benchmark data sets, where it obtains superior classification accuracy compared to existing state-of-the-art SSL methods. Moreover, GGMC shows robustness with respect to the graph construction method and maintains high accuracy over extensive experiments with various edge linking and weighting schemes.
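
As a rough illustration of the greedy assignment idea mentioned in the abstract, the Python sketch below runs a plain greedy Max-Cut heuristic on a small weight matrix. It is not the authors' GGMC: it omits the bivariate cost, the gradient computation, and the linear constraints, and it assumes the weights are already those of the Max-Cut graph. The function names `greedy_max_cut` and `cut_value`, the two-class restriction, and the toy matrix are all assumptions made for the example.

```python
def greedy_max_cut(weights, seeds):
    """Greedily place each unassigned vertex in the class it is least connected to.

    weights: symmetric n x n list of lists of non-negative edge weights
             (assumed here to be the weights of the Max-Cut graph).
    seeds:   dict mapping already-labeled vertices to class 0 or 1.
    """
    n = len(weights)
    assignment = dict(seeds)
    unassigned = set(range(n)) - set(assignment)

    while unassigned:
        best = None  # (connectivity, vertex, class)
        for v in sorted(unassigned):
            for cls in (0, 1):
                # Total weight between v and vertices already placed in cls.
                conn = sum(weights[v][u] for u, c in assignment.items() if c == cls)
                if best is None or conn < best[0]:
                    best = (conn, v, cls)
        _, v, cls = best
        assignment[v] = cls
        unassigned.remove(v)

    return assignment


def cut_value(weights, assignment):
    """Sum of the weights of edges whose endpoints fall in different classes."""
    n = len(weights)
    return sum(
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if assignment[i] != assignment[j]
    )


if __name__ == "__main__":
    # Toy path graph 0 - 1 - 2 - 3 with unit weights; vertex 0 is the only seed.
    W = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    labels = greedy_max_cut(W, {0: 0})
    print(labels, cut_value(W, labels))
```

On this toy graph the heuristic alternates the two classes along the path, so all three edges are cut. Ties are broken by vertex order, which is an arbitrary choice in this sketch.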

Related resources

A Graph-based Semi-Supervised Learning Approach and its Feature Selection

With the lack of labeled data, the learning accuracy of a supervised learning algorithm deteriorates. Meanwhile, it is easier to collect plenty of unlabeled data. Furthermore, a graph can be used to express the underlying distribution of data in the dataset. Thus, a classification problem is converted to a graph partition problem. One typical graph-based semi-supervised learning algorithm is...

Semi-Supervised Learning with Max-Margin Graph Cuts

This paper proposes a novel algorithm for semisupervised learning. This algorithm learns graph cuts that maximize the margin with respect to the labels induced by the harmonic function solution. We motivate the approach, compare it to existing work, and prove a bound on its generalization error. The quality of our solutions is evaluated on a synthetic problem and three UCI ML repository dataset...

Convergence rate of the semi-supervised greedy algorithm

This paper proposes a new greedy algorithm combining the semi-supervised learning and the sparse representation with the data-dependent hypothesis spaces. The proposed greedy algorithm is able to use a small portion of the labeled and unlabeled data to represent the target function, and to efficiently reduce the computational burden of the semi-supervised learning. We establish the estimation o...

Submodular Optimization for Efficient Semi-supervised Support Vector Machines

In this work we present a quadratic programming approximation of the Semi-Supervised Support Vector Machine (S3VM) problem, namely approximate QP-S3VM, that can be efficiently solved using off-the-shelf optimization packages. We prove that this approximate formulation establishes a relation between the low density separation and the graph-based models of semi-supervised learning (SSL) which is ...

Fast SDP Relaxations of Graph Cut Clustering, Transduction, and Other Combinatorial Problem

The rise of convex programming has changed the face of many research fields in recent years, machine learning being one of the ones that benefitted the most. A very recent development, the relaxation of combinatorial problems to semi-definite programs (SDP), has gained considerable attention over the last decade (Helmberg, 2000; De Bie and Cristianini, 2004a). Although SDP problems can be solv...

Journal:

Journal of Machine Learning Research

Volume: 14

Publication date: 2013
