Ranking and combining multiple predictors without labeled data.
Abstract
In a broad range of classification and decision-making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier's accuracy can be assessed using available labeled data, and raises two questions: Given only the predictions of several classifiers over a large set of unlabeled test data, is it possible to (i) reliably rank them and (ii) construct a metaclassifier more accurate than most classifiers in the ensemble? Here we present a spectral approach to address these questions. First, assuming conditional independence between classifiers, we show that the off-diagonal entries of their covariance matrix correspond to a rank-one matrix. Moreover, the classifiers can be ranked using the leading eigenvector of this covariance matrix, because its entries are proportional to their balanced accuracies. Second, via a linear approximation to the maximum likelihood estimator, we derive the Spectral Meta-Learner (SML), an unsupervised ensemble classifier whose weights are equal to these eigenvector entries. On both simulated and real data, SML typically achieves a higher accuracy than most classifiers in the ensemble and can provide a better starting point than majority voting for estimating the maximum likelihood solution. Furthermore, SML is robust to the presence of small malicious groups of classifiers designed to veer the ensemble prediction away from the (unknown) ground truth.
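The two steps described in the abstract (rank by the leading eigenvector, then take an eigenvector-weighted vote) can be illustrated with a minimal sketch. It rests on several assumptions that are not from the source: predictions are coded as ±1, classes are balanced in the simulation, and the leading eigenvector is taken from the full sample covariance matrix, diagonal included, even though the rank-one structure is stated only for the off-diagonal entries. The sign of the eigenvector is fixed by assuming that most classifiers are better than random. The function names `simulate_predictions` and `spectral_meta_learner` are hypothetical.

```python
import numpy as np


def simulate_predictions(accuracies, n_samples, seed=0):
    """Simulate conditionally independent +/-1 classifiers (illustrative only)."""
    rng = np.random.default_rng(seed)
    accuracies = np.asarray(accuracies, dtype=float)
    y = rng.choice([-1, 1], size=n_samples)  # hidden ground truth
    # Each classifier is correct independently with its own probability.
    correct = rng.random((accuracies.size, n_samples)) < accuracies[:, None]
    preds = np.where(correct, y, -y)  # shape (M, S)
    return y, preds


def spectral_meta_learner(preds):
    """Rank classifiers and combine them using only their predictions.

    preds: (M, S) array of +/-1 predictions, one row per classifier.
    Returns (weights, labels): the leading eigenvector and the weighted vote.
    """
    preds = np.asarray(preds, dtype=float)
    cov = np.cov(preds)  # rows are classifiers, so cov is M x M
    _, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    v = eigvecs[:, -1]  # leading eigenvector
    # Assumption: most classifiers beat random guessing, so entries sum to > 0.
    if v.sum() < 0:
        v = -v
    labels = np.where(v @ preds >= 0, 1, -1)  # eigenvector-weighted vote
    return v, labels


true_acc = np.linspace(0.55, 0.85, 10)
y, preds = simulate_predictions(true_acc, n_samples=20000)
weights, y_hat = spectral_meta_learner(preds)
print("ranking (best first):", np.argsort(-weights))
print("SML accuracy:", np.mean(y_hat == y))
```

The ground truth `y` is used only to score the result; the ranking and the ensemble prediction are computed from `preds` alone. A more faithful version would first estimate the diagonal of the rank-one matrix from the off-diagonal entries before taking the eigenvector.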
Similar resources
An approach to rank efficient DMUs in DEA based on combining Manhattan and infinity norms
In many applications, discriminating among decision making units (DMUs) is a problematic technical task for decision makers in data envelopment analysis (DEA). DEA models are unable to discriminate between extremely efficient DMUs. Hence, there is a growing interest in improving discrimination power in DEA. The aim of this paper is ranking extreme efficient DMUs in DEA based on exp...
Estimation of Variance of Normal Distribution using Ranked Set Sampling
Introduction: In some biological, environmental or ecological studies, there are situations in which obtaining exact measurements of sample units is much harder than ranking them in a set of small size without referring to their precise values. In these situations, ranked set sampling (RSS), proposed by McIntyre (1952), can be regarded as an alternative to the usual simple random sampling ...
Improving the Calculation of RPN in the FMEA Method by Combining a Nonlinear Model with Revised TOPSIS and Fuzzy Logic
Introduction: Failure Mode and Effects Analysis (FMEA) is a structured way to find and understand the states of a system’s failure and to calculate the resulting effects. In this method, which has been criticized by many researchers, the risk priority number is obtained for each failure mode based on the multiplication of the three parameters of occurrence (O), severity (S) and detection (D). I...
Microsoft Research Asia at the NTCIR-10 Intent Task
Microsoft Research Asia participated in the Subtopic Mining subtask and Document Ranking subtask of the NTCIR-10 INTENT Task. In the Subtopic Mining subtask, we mine subtopics from query suggestions, clickthrough data and top results of the queries, and rank them based on their importance for the given query. In the Document Ranking subtask, we diversify top search results by estimating the int...
A New Group Data Envelopment Analysis Method for Ranking Design Requirements in Quality Function Deployment
Data envelopment analysis (DEA) is an objective method for priority determination of decision making units (DMUs) with the same multiple inputs and outputs. DEA is an efficiency estimation technique, but it can be used for solving many problems of management such as ranking of DMUs. Many researchers have found similarity between DEA and MCDM techniques. One of the earliest techniques in MCDM is...
Journal:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 111, Issue 4
Pages: -
Publication date: 2014