Excess Risk Bounds for Multi-Task Learning

Author

  • Ambuj Tewari
Abstract

The idea that it should be easier to learn several tasks if they are related in some way is quite intuitive and has been found to work in many practical settings. There has been some interest in obtaining theoretical results to better understand this phenomenon (e.g. [3, 4]). Maurer [4] considers the case when the "relatedness" of the tasks is captured by requiring that all tasks share a common "preprocessor". Different linear classifiers are learned for the tasks, where these classifiers all operate on the "preprocessed" input. Maurer obtains dimension-free and data-dependent bounds in this setting. He bounds the average error over tasks in terms of the margins of the classifiers and a complexity term involving the Hilbert-Schmidt norm of the selected preprocessor and the Frobenius norm of the Gram matrix for all tasks. We work in the same setting as Maurer. However, we introduce a loss function to measure the performance of the selected classifiers. Our aim is to obtain bounds for the difference between the average risk per task of the classifiers learned from the data and the least possible value of the average risk per task. Suppose we have m binary classification tasks with a common input space X, the unit ball {x : ‖x‖ ≤ 1} in some Hilbert space H. Since we deal with binary classification, the output space is Y = {+1, −1}. Let v denote a tuple of classifiers (v_1, …, v_m) with v_l ∈ H for all l ∈ {1, …, m}. Let A be a set of symmetric Hilbert-Schmidt operators with ‖T‖_HS ≤ t for all T ∈ A. Denote the input distribution for task l by P^l and let
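To make the setting concrete, here is a small numeric sketch (not the paper's algorithm — dimensions, data, and variable names are illustrative assumptions) of the quantity being bounded: m tasks sharing a symmetric preprocessor T, one linear classifier v_l per task, and the average empirical risk per task under the 0-1 loss.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, d = 3, 50, 5              # tasks, examples per task, input dimension (illustrative)
X = rng.normal(size=(m, n, d))  # inputs x_i^l for each task l
X /= np.maximum(np.linalg.norm(X, axis=2, keepdims=True), 1.0)  # project into the unit ball
y = np.sign(rng.normal(size=(m, n)))  # labels in {+1, -1}

T = rng.normal(size=(d, d))
T = (T + T.T) / 2               # symmetric preprocessor shared by all tasks
V = rng.normal(size=(m, d))     # task-specific linear classifiers v_1, ..., v_m

def average_risk_per_task(T, V, X, y):
    """Average empirical 0-1 risk over tasks; each v_l classifies the preprocessed input T x."""
    scores = np.einsum('ld,de,lne->ln', V, T, X)  # <v_l, T x_i^l> for all l, i
    return float(np.mean(scores * y <= 0))

risk = average_risk_per_task(T, V, X, y)
```

The excess risk studied in the paper is the gap between this quantity for the (T, v) learned from data and its infimum over the admissible class A and classifier tuples.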


Related articles

Excess risk bounds for multitask learning with trace norm regularization

Trace norm regularization is a popular method of multitask learning. We give excess risk bounds with explicit dependence on the number of tasks, the number of examples per task and properties of the data distribution. The bounds are independent of the dimension of the input space, which may be infinite as in the case of reproducing kernel Hilbert spaces. A byproduct of the proof are bounds on t...
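Trace norm regularization penalizes the sum of singular values of the matrix whose columns are the task weight vectors, encouraging a shared low-rank structure. A minimal sketch of the norm and its proximal (singular-value soft-thresholding) step — a standard construction, not code from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(5, 3))   # columns = weight vectors for 3 tasks (illustrative)

def trace_norm(W):
    """Trace (nuclear) norm: the sum of singular values."""
    return float(np.linalg.svd(W, compute_uv=False).sum())

def prox_trace_norm(W, lam):
    """Proximal operator of lam * trace norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

W_shrunk = prox_trace_norm(W, 0.5)
```

Shrinking singular values toward zero reduces the effective rank of W, which is how the regularizer couples the tasks.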


Bounds for Vector-Valued Function Estimation

We present a framework to derive risk bounds for vector-valued learning with a broad class of feature maps and loss functions. Multi-task learning and one-vs-all multi-category learning are treated as examples. We discuss in detail vector-valued functions with one hidden layer, and demonstrate that the conditions under which shared representations are beneficial for multitask learning are equal...


Exploiting Task Relatedness for Multiple Task Learning

The approach of learning multiple "related" tasks simultaneously has proven quite successful in practice; however, theoretical justification for this success has remained elusive. The starting point of previous work on multiple task learning has been that the tasks to be learnt jointly are somehow "algorithmically related", in the sense that the results of applying a specific learning algori...



Bounds for Linear Multi-Task Learning

We give dimension-free and data-dependent bounds for linear multi-task learning where a common linear operator is chosen to preprocess data for a vector of task specific linear-thresholding classifiers. The complexity penalty of multi-task learning is bounded by a simple expression involving the margins of the task-specific classifiers, the Hilbert-Schmidt norm of the selected preprocessor and ...



Journal:

Volume   Issue

Pages  -

Publication date: 2006