Generalization Bounds for the Area Under an ROC Curve

Authors

  • Shivani Agarwal
  • Thore Graepel
  • Ralf Herbrich
  • Sariel Har-Peled
  • Dan Roth
Abstract

We study generalization properties of the area under an ROC curve (AUC), a quantity that has been advocated as an evaluation criterion for bipartite ranking problems. The AUC is a different and more complex term than the error rate used for evaluation in classification problems; consequently, existing generalization bounds for the classification error rate cannot be used to draw conclusions about the AUC. In this paper, we define a precise notion of the expected accuracy of a ranking function (analogous to the expected error rate of a classification function), and derive distribution-free probabilistic bounds on the deviation of the empirical AUC of a ranking function (observed on a finite data sequence) from its expected accuracy. We derive both a large deviation bound, which serves to bound the expected accuracy of a ranking function in terms of its empirical AUC on a test sequence, and a uniform convergence bound, which serves to bound the expected accuracy of a learned ranking function in terms of its empirical AUC on a training sequence. Our uniform convergence bound is expressed in terms of a new set of combinatorial parameters that we term the bipartite rank-shatter coefficients; these play the same role in our result as do the standard shatter coefficients (also known variously as the counting numbers or growth function) in uniform convergence results for the classification error rate. We also compare our result with a recent uniform convergence result derived by Freund et al. (2003) for a quantity closely related to the AUC; as we show, the bound provided by our result is considerably tighter.
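
As a concrete illustration of the empirical AUC discussed in the abstract, the following minimal sketch computes it in its usual Wilcoxon-Mann-Whitney form: the fraction of positive-negative pairs that the ranking function orders correctly. The function name `empirical_auc` and the convention of counting ties as one half are assumptions made for this illustration; the abstract itself does not state the formula.

```python
from typing import Sequence


def empirical_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly.

    pos_scores: ranking-function scores of the positive examples.
    neg_scores: ranking-function scores of the negative examples.
    Ties count as one half (an assumed convention, see the lead-in).
    """
    m, n = len(pos_scores), len(neg_scores)
    if m == 0 or n == 0:
        raise ValueError("need at least one positive and one negative score")

    total = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                total += 1.0   # positive ranked above negative
            elif p == q:
                total += 0.5   # tie: half credit
    return total / (m * n)


if __name__ == "__main__":
    # Three positives and two negatives give six pairs to compare.
    print(empirical_auc([0.9, 0.8, 0.4], [0.5, 0.3]))
```

The double loop is quadratic in the sample size, which keeps the pairwise definition visible; a sort-based rank-sum computation would give the same value faster on large samples.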

Related resources

A tree-based ranking algorithm and approximation of the optimal ROC curve

Recursive partitioning methods are among the most popular techniques in machine learning. It is the purpose of this paper to investigate how such an appealing methodology may be adapted to the bipartite ranking problem. In ranking, the goal pursued is global: the matter is to learn how to define an order on the whole feature space X, so that positive instances take up the top ranks with maximu...

Upper and Lower Bounds of Area Under ROC Curves and Index of Discriminability of Classifier Performance

Area under an ROC curve plays an important role in estimating discrimination performance – a well-known theorem by Green (1964) states that ROC area equals the percentage correct in a two-alternative forced-choice setting. When only a single data point is available, the upper and lower bounds of discrimination performance can be constructed based on the maximum and minimum area of legitimate ROC c...

Pointwise ROC Confidence Bounds: An Empirical Evaluation

This paper is about constructing and evaluating pointwise confidence bounds on an ROC curve. We describe four confidence-bound methods, two from the medical field and two used previously in machine learning research. We evaluate whether the bounds indeed contain the relevant operating point on the “true” ROC curve with a confidence of 1−δ. We then evaluate pointwise confidence bounds on the regi...

Generalization Bounds for the Area Under the ROC Curve

We study generalization properties of the area under the ROC curve (AUC), a quantity that has been advocated as an evaluation criterion for the bipartite ranking problem. The AUC is a different term than the error rate used for evaluation in classification problems; consequently, existing generalization bounds for the classification error rate cannot be used to draw conclusions about the AUC. I...

Receiver Operating Characteristic (ROC) Curve Analysis for Medical Diagnostic Test Evaluation

This review provides the basic principles and rationale for ROC analysis of rating and continuous diagnostic test results versus a gold standard. Derived indexes of accuracy, in particular the area under the curve (AUC), have a meaningful interpretation for distinguishing diseased from healthy subjects. The methods of estimating the AUC and testing it in a single diagnostic test and also in comparative studies...

Publication date: 2004