A method to increase the power of multiple testing procedures through sample splitting.

Authors

  • Daniel Rubin
  • Sandrine Dudoit
  • Mark van der Laan
Abstract

Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis is associated with a test statistic, and large test statistics provide evidence against the null hypotheses. One proposal to provide probabilistic control of Type-I errors is the use of procedures ensuring that the expected number of false positives does not exceed a user-supplied threshold. Among such multiple testing procedures, we derive the most powerful method, meaning the test statistic cutoffs that maximize the expected number of true positives. Unfortunately, these optimal cutoffs depend on the true unknown data generating distribution, so could never be used in a practical setting. We instead consider splitting the sample so that the optimal cutoffs are estimated from a portion of the data, and then testing on the remaining data using these estimated cutoffs. When the null distributions for all test statistics are the same, the obvious way to control the expected number of false positives would be to use a common cutoff for all tests. In this work, we consider the common cutoff method as a benchmark multiple testing procedure. We show that in certain circumstances the use of estimated optimal cutoffs via sample splitting can dramatically outperform this benchmark method, resulting in increased true discoveries, while retaining Type-I error control. This paper is an updated version of the work presented in Rubin et al. (2005), later expanded upon by Wasserman and Roeder (2006).
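The split-sample idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes, purely for illustration, one-sided tests whose statistics are standard normal under the null and normal with unit variance and mean mu_j under the alternative. Under that assumed model, a Lagrangian argument gives cutoffs of the form c_j = mu_j/2 + log(lambda)/mu_j for mu_j > 0, with lambda chosen so that the null tail probabilities sum to the allowed expected number of false positives k. All function names (`upper_tail`, `common_cutoff`, `estimated_cutoffs`, `split_and_test`) are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed null distribution of every test statistic


def upper_tail(c):
    """P(Z > c) for a standard normal Z, computed as Phi(-c)."""
    if c == math.inf:
        return 0.0
    if c == -math.inf:
        return 1.0
    return STD_NORMAL.cdf(-c)


def common_cutoff(m, k):
    """Benchmark: a single cutoff c for all m tests with m * P(Z > c) = k."""
    return STD_NORMAL.inv_cdf(1.0 - k / m)


def estimated_cutoffs(mu_hat, k, iterations=200):
    """Cutoffs c_j = mu_j/2 + log(lambda)/mu_j under the assumed normal shift
    model, with log(lambda) found by bisection so that the summed null tail
    probabilities do not exceed k. Tests with mu_hat <= 0 are never rejected."""
    def cutoffs_for(log_lam):
        return [mu / 2.0 + log_lam / mu if mu > 0 else math.inf for mu in mu_hat]

    def expected_fp(log_lam):
        return sum(upper_tail(c) for c in cutoffs_for(log_lam))

    n_positive = sum(1 for mu in mu_hat if mu > 0)
    if n_positive <= k:
        # Rejecting every candidate already respects the error budget.
        return [-math.inf if mu > 0 else math.inf for mu in mu_hat]

    lo, hi = -1.0, 1.0
    while expected_fp(lo) < k:   # expected_fp is decreasing in log_lam
        lo *= 2.0
    while expected_fp(hi) > k:
        hi *= 2.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if expected_fp(mid) > k:
            lo = mid
        else:
            hi = mid
    return cutoffs_for(hi)       # hi is the conservative end: expected_fp <= k


def split_and_test(data, k, n_est):
    """data: list of observations, each a list of m values (assumed unit variance).
    The first n_est rows estimate the cutoffs; the remaining rows are tested."""
    est, test = data[:n_est], data[n_est:]
    m, n_test = len(data[0]), len(test)
    scale = math.sqrt(n_test)    # puts estimates on the test-statistic scale
    mu_hat = [scale * sum(row[j] for row in est) / len(est) for j in range(m)]
    cutoffs = estimated_cutoffs(mu_hat, k)
    stats = [scale * sum(row[j] for row in test) / n_test for j in range(m)]
    return [t > c for t, c in zip(stats, cutoffs)]
```

Because the cutoffs are computed only from the estimation portion, they are fixed with respect to the testing portion, so under the assumed null model the expected number of false positives on the testing portion stays at or below k. The price is that testing uses fewer observations than the `common_cutoff` benchmark would on the full sample, which is why the gain appears only in certain circumstances.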




Journal title:
  • Statistical Applications in Genetics and Molecular Biology

Volume 5

Publication date: 2006