PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R
Authors
Abstract
Precision-recall (PR) and receiver operating characteristic (ROC) curves are valuable measures of classifier performance. Here, we present the R-package PRROC, which allows for computing and visualizing both PR and ROC curves. In contrast to available R-packages, PRROC allows for computing PR and ROC curves and areas under these curves for soft-labeled data using a continuous interpolation between the points of PR curves. In addition, PRROC provides a generic plot function for generating publication-quality graphics of PR and ROC curves.
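As a rough illustration of the quantities discussed in the abstract, the following minimal Python sketch computes the points of a ROC curve and of a PR curve from hard-labeled scores, plus the area under the ROC curve by the trapezoidal rule. It is not PRROC's API, and it does not implement the continuous interpolation between PR points or the soft-label support that the abstract describes; the function names and the toy data are assumptions made only for this example. No PR area is computed here, since the abstract states that PRROC derives it from a continuous interpolation rather than from straight lines between points.

```python
def _sweep(scores, labels):
    """Yield cumulative (tp, fp) counts after each distinct score threshold.

    Items are visited in order of decreasing score; tied scores are
    processed together so they contribute a single curve point.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        yield tp, fp


def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) points, starting at (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    return [(0.0, 0.0)] + [(fp / neg, tp / pos) for tp, fp in _sweep(scores, labels)]


def pr_points(scores, labels):
    """Return (recall, precision) points, one per distinct threshold."""
    pos = sum(labels)
    return [(tp / pos, tp / (tp + fp)) for tp, fp in _sweep(scores, labels)]


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical toy data: classifier scores and hard 0/1 labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 0, 0]

roc = roc_points(scores, labels)
pr = pr_points(scores, labels)
print("ROC points:", roc)
print("PR points:", pr)
print("AUC-ROC (trapezoidal):", trapezoid_area(roc))
```

For this toy data the ROC area equals the fraction of positive/negative pairs in which the positive item has the higher score, here 8 of 9 pairs.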
Similar resources
ROCR: visualizing classifier performance in R
ROCR is a package for evaluating and visualizing the performance of scoring classifiers in the statistical language R. It features over 25 performance measures that can be freely combined to create two-dimensional performance curves. Standard methods for investigating trade-offs between specific performance measures are available within a uniform framework, including receiver operati...
Precision-Recall-Gain Curves: PR Analysis Done Right
Precision-Recall analysis abounds in applications of binary classification where true negatives do not add value and hence should not affect assessment of the classifier’s performance. Perhaps inspired by the many advantages of receiver operating characteristic (ROC) curves and the area under such curves for accuracy-based performance assessment, many researchers have taken to report PrecisionRe...
Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation
Precision-recall (PR) curves and the areas under them are widely used to summarize machine learning results, especially for data sets exhibiting class skew. They are often used analogously to ROC curves and the area under ROC curves. It is known that PR curves vary as class skew changes. What was not recognized before this paper is that there is a region of PR space that is completely unachieva...
What ROC Curves Can't Do (and Cost Curves Can)
This paper shows that ROC curves, as a method of visualizing classifier performance, are inadequate for the needs of Artificial Intelligence researchers in several significant respects, and demonstrates that a different way of visualizing performance – the cost curves introduced by Drummond and Holte at KDD’2000 – overcomes these deficiencies.
Boosting First-Order Clauses for Large, Skewed Data Sets
Creating an effective ensemble of clauses for large, skewed data sets requires finding a diverse, high-scoring set of clauses and then combining them in such a way as to maximize predictive performance. We have adapted the RankBoost algorithm in order to maximize area under the recall-precision curve, a much better metric when working with highly skewed data sets than ROC curves. We have also expl...