Ranking and Empirical Minimization of U - Statistics

نویسندگان

  • STÉPHAN CLÉMENÇON
  • GÁBOR LUGOSI
  • NICOLAS VAYATIS
چکیده

The problem of ranking/ordering instances, instead of simply classifying them, has recently gained much attention in machine learning. In this paper we formulate the ranking problem in a rigorous statistical framework. The goal is to learn a ranking rule for deciding, among two instances, which one is “better,” with minimum ranking risk. Since the natural estimates of the risk are of the form of a U -statistic, results of the theory of U -processes are required for investigating the consistency of empirical risk minimizers. We establish, in particular, a tail inequality for degenerate U -processes, and apply it for showing that fast rates of convergence may be achieved under specific noise assumptions, just like in classification. Convex risk minimization methods are also studied.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ranking and Scoring Using Empirical Risk Minimization

A general model is proposed for studying ranking problems. We investigate learning methods based on empirical minimization of the natural estimates of the ranking risk. The empirical estimates are of the form of a U -statistic. Inequalities from the theory of U -statistics and U processes are used to obtain performance bounds for the empirical risk minimizers. Convex risk minimization methods a...

متن کامل

Scaling-up Empirical Risk Minimization: Optimization of Incomplete $U$-statistics

In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by U-statistics of degree d ≥ 1, i.e. functionals of the training data with low variance that take the form of averages over k-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sample siz...

متن کامل

Maximal Deviations of Incomplete U-statistics with Applications to Empirical Risk Sampling

It is the goal of this paper to extend the Empirical Risk Minimization (ERM) paradigm, from a practical perspective, to the situation where a natural estimate of the risk is of the form of a K-sample U -statistics, as it is the case in the K-partite ranking problem for instance. Indeed, the numerical computation of the empirical risk is hardly feasible if not infeasible, even for moderate sampl...

متن کامل

Ranking - convex risk minimization

The problem of ranking (rank regression) has become popular in the machine learning community. This theory relates to problems, in which one has to predict (guess) the order between objects on the basis of vectors describing their observed features. In many ranking algorithms a convex loss function is used instead of the 0−1 loss. It makes these procedures computationally efficient. Hence, conv...

متن کامل

SGD Algorithms based on Incomplete U-statistics: Large-Scale Minimization of Empirical Risk

In many learning problems, ranging from clustering to ranking through metric learning, empirical estimates of the risk functional consist of an average over tuples (e.g., pairs or triplets) of observations, rather than over individual observations. In this paper, we focus on how to best implement a stochastic approximation approach to solve such risk minimization problems. We argue that in the ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008