A Boosting Algorithm for Label Covering in Multilabel Problems

نویسندگان

  • Yonatan Amit
  • Ofer Dekel
  • Yoram Singer
چکیده

We describe, analyze and experiment with a boosting algorithm for multilabel categorization problems. Our algorithm includes as special cases previously studied boosting algorithms such as Adaboost.MH. We cast the multilabel problem as multiple binary decision problems, based on a user-defined covering of the set of labels. We prove a lower bound on the progress made by our algorithm on each boosting iteration and demonstrate the merits of our algorithm in experiments with text categorization problems.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Online Boosting Algorithms for Multi-label Ranking

We consider the multi-label ranking approach to multilabel learning. Boosting is a natural method for multilabel ranking as it aggregates weak predictions through majority votes, which can be directly used as scores to produce a ranking of the labels. We design online boosting algorithms with provable loss bounds for multi-label ranking. We show that our first algorithm is optimal in terms of t...

متن کامل

Incorporating Prior Knowledge into Boosting for Multi-Label Classification XiaoWang

Multi-label learning deals with the problem where each instance may belong to multiple labels simultaneously. The task of the learning paradigm is to output the label set whose size is unknown a priori for each unseen instance, through analyzing the training data set with known label sets. Existing multi-label learning algorithms are almost based on the purely data-driven method. The larger the...

متن کامل

Log-Linear Models for Label Ranking

Label ranking is the task of inferring a total order over a predefined set of labels for each given instance. We present a general framework for batch learning of label ranking functions from supervised data. We assume that each instance in the training data is associated with a list of preferences over the label-set, however we do not assume that this list is either complete or consistent. Thi...

متن کامل

Fast Label Embeddings via Randomized Linear Algebra

Many modern multiclass and multilabel problems are characterized by increasingly large output spaces. For these problems, label embeddings have been shown to be a useful primitive that can improve computational and statistical efficiency. In this work we utilize a correspondence between rank constrained estimation and low dimensional label embeddings that uncovers a fast label embedding algorit...

متن کامل

ATPboost: Learning Premise Selection in Binary Setting with ATP Feedback

ATPboost is a system for solving sets of large-theory problems by interleaving ATP runs with state-of-the-art machine learning of premise selection from the proofs. Unlike many previous approaches that use multi-label setting, the learning is implemented as binary classification that estimates the pairwise-relevance of (theorem, premise) pairs. ATPboost uses for this the XGBoost gradient boosti...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007