Scaling up Structured Multi-Label Prediction using Discriminative Mean Field Networks
Abstract
Multi-label classification is an important task in many modern machine learning applications. Accurate methods model the correlations and relationships between labels, either by assuming a low-dimensional embedding of the labels or a graph structure of label dependencies. While such interactions can be achieved using feed-forward predictors, problems with tight coupling between labels are better posed as structured prediction problems. Unfortunately, prior applications of graphical models to multi-label classification scale poorly. In response, we introduce discriminative mean field networks, an iterative structured prediction technique applicable to substantially larger label sets. We employ a deep architecture to define an energy function of candidate labels, and form predictions using backpropagation to iteratively optimize the energy with respect to the labels. This deep architecture captures dependencies between labels that would lead to completely intractable graphical models, and enables a form of structure learning by automatically learning discriminative features of the structured output. The technique is effective on a variety of benchmarks, and generalizes easily to other structured prediction applications.
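As a rough illustration of forming predictions by iteratively optimizing an energy with respect to the labels, the sketch below runs projected gradient descent over labels relaxed to the interval [0, 1]. It is a toy under stated assumptions, not the authors' model: a hand-written quadratic energy with an analytic gradient stands in for the deep architecture and backpropagation, and the names `energy`, `energy_grad`, `predict`, `local`, and `coupling`, along with all numeric values, are hypothetical.

```python
# Toy sketch only: a quadratic energy replaces the deep network from the
# abstract, and its gradient is written by hand instead of backpropagated.
# All names and numbers are hypothetical.

def energy(y, local, coupling):
    """E(y) = sum_i local[i]*y[i] + 0.5 * sum_ij coupling[i][j]*y[i]*y[j]."""
    n = len(y)
    unary = sum(local[i] * y[i] for i in range(n))
    pair = sum(coupling[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    return unary + 0.5 * pair


def energy_grad(y, local, coupling):
    """Gradient of E with respect to y (coupling is symmetric, zero diagonal)."""
    n = len(y)
    return [local[i] + sum(coupling[i][j] * y[j] for j in range(n))
            for i in range(n)]


def predict(local, coupling, steps=200, lr=0.1):
    """Minimize the energy over relaxed labels with projected gradient descent."""
    y = [0.5] * len(local)  # start from the uninformative midpoint
    for _ in range(steps):
        grad = energy_grad(y, local, coupling)
        # Gradient step, then clip each coordinate back into [0, 1].
        y = [min(1.0, max(0.0, yi - lr * gi)) for yi, gi in zip(y, grad)]
    return y


# Per-label scores (negative means the input favors the label being on).
local = [-1.0, 0.5, 0.2]
# Labels 0 and 1 tend to co-occur; labels 0 and 2 tend to exclude each other.
coupling = [[0.0, -2.0, 2.0],
            [-2.0, 0.0, 0.0],
            [2.0, 0.0, 0.0]]

relaxed = predict(local, coupling)
labels = [1 if v >= 0.5 else 0 for v in relaxed]
print(labels)  # [1, 1, 0]
```

In this toy, label 1 has a positive local score and would be switched off if each label were predicted independently. The coupling with label 0 turns it on, which is the kind of label interaction the abstract attributes to structured prediction.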
Similar resources
Structured Prediction Energy Networks
We introduce structured prediction energy networks (SPENs), a flexible framework for structured prediction. A deep architecture is used to define an energy function of candidate labels, and then predictions are produced by using backpropagation to iteratively optimize the energy with respect to the labels. This deep architecture captures dependencies between labels that would lead to intractabl...
Embedding Inference for Structured Multilabel Prediction
A key bottleneck in structured output prediction is the need for inference during training and testing, usually requiring some form of dynamic programming. Rather than using approximate inference or tailoring a specialized inference method for a particular structure—standard responses to the scaling challenge— we propose to embed prediction constraints directly into the learned representation. ...
Maximum Margin Output Coding
In this paper we study output coding for multi-label prediction. For a multi-label output coding to be discriminative, it is important that codewords for different label vectors are significantly different from each other. In the meantime, unlike in traditional coding theory, codewords in output coding are to be predicted from the input, so it is also critical to have a predictable label encodi...
End-to-End Learning for Structured Prediction Energy Networks
Structured Prediction Energy Networks (Belanger & McCallum, 2016) (SPENs) are a simple, yet expressive family of structured prediction models. An energy function over candidate structured outputs is given by a deep network, and predictions are formed by gradient-based optimization. Unfortunately, we have struggled to apply the structured SVM (SSVM) learning method of Belanger & McCallum (2016) ...
Joint Kernel Support Estimation for Structured Prediction
We present a new technique for structured prediction that works in a hybrid generative/discriminative way, using a one-class support vector machine to model the joint probability of (input, output)-pairs in a joint reproducing kernel Hilbert space. Compared to discriminative techniques, like conditional random fields or structured output SVMs, the proposed method has the advantage that its trai...