The frequentist implications of optional stopping on Bayesian hypothesis tests.
Abstract
Null hypothesis significance testing (NHST) is the most commonly used statistical methodology in psychology. The probability of achieving a value as extreme or more extreme than the statistic obtained from the data is evaluated, and if it is low enough, the null hypothesis is rejected. However, because common experimental practice often clashes with the assumptions underlying NHST, these calculated probabilities are often incorrect. Most commonly, experimenters use tests that assume that sample sizes are fixed in advance of data collection but then use the data to determine when to stop; in the limit, experimenters can use data monitoring to guarantee that the null hypothesis will be rejected. Bayesian hypothesis testing (BHT) provides a solution to these ills because the stopping rule used is irrelevant to the calculation of a Bayes factor. In addition, there are strong mathematical guarantees on the frequentist properties of BHT that are comforting for researchers concerned that stopping rules could influence the Bayes factors produced. Here, we show that these guaranteed bounds have limited scope and often do not apply in psychological research. Specifically, we quantitatively demonstrate the impact of optional stopping on the resulting Bayes factors in two common situations: (1) when the truth is a combination of the hypotheses, such as in a heterogeneous population, and (2) when a hypothesis is composite (taking multiple parameter values), such as the alternative hypothesis in a t-test. We found that, for these situations, while the Bayesian interpretation remains correct regardless of the stopping rule used, the choice of stopping rule can, in some situations, greatly increase the chance of experimenters finding evidence in the direction they desire. We suggest ways to control these frequentist implications of stopping rules on BHT.
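The contrast drawn in the abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis. It covers only the simplest case, where both hypotheses are simple (point) hypotheses and the null is true, which is the setting in which a bound on misleading evidence is known to hold: the probability that the Bayes factor ever reaches a threshold k in favour of the alternative is at most 1/k. The effect size `mu1`, the threshold `k`, the maximum sample size `n_max`, and the function name `simulate` are all hypothetical choices made for illustration.

```python
import math
import random


def simulate(n_sims=2000, n_max=200, mu1=0.5, k=10.0, z_crit=1.96, seed=1):
    """Optional stopping under a true null: data ~ N(0, 1), checked after every observation.

    Returns (nhst_rate, bf_rate):
      nhst_rate: share of runs in which a two-sided z-test ever had |z| > z_crit
      bf_rate:   share of runs in which BF10 (H1: mu = mu1 vs H0: mu = 0) ever reached k
    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)
    log_k = math.log(k)
    nhst_hits = 0
    bf_hits = 0
    for _ in range(n_sims):
        total = 0.0
        nhst_done = False
        bf_done = False
        for n in range(1, n_max + 1):
            total += rng.gauss(0.0, 1.0)  # one new observation generated under H0
            # z statistic for the mean with known unit variance
            z = total / math.sqrt(n)
            if abs(z) > z_crit:
                nhst_done = True
            # log Bayes factor for N(mu1, 1) versus N(0, 1)
            log_bf10 = mu1 * total - n * mu1 ** 2 / 2.0
            if log_bf10 >= log_k:
                bf_done = True
            if nhst_done and bf_done:
                break
        nhst_hits += nhst_done
        bf_hits += bf_done
    return nhst_hits / n_sims, bf_hits / n_sims


if __name__ == "__main__":
    nhst_rate, bf_rate = simulate()
    print(f"z-test ever significant at 0.05 level: {nhst_rate:.3f}")
    print(f"BF10 ever >= 10 under H0:              {bf_rate:.3f} (bound: 0.100)")
```

With repeated looks, the z-test's false-positive rate should rise well above its nominal 5%, while the Bayes factor's rate of misleading evidence should stay below 1/k. The situations the abstract highlights (a truth that mixes the hypotheses, or a composite alternative) fall outside this simple setup, and the sketch does not cover them.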
Similar resources
Optional stopping: no problem for Bayesians.
Optional stopping refers to the practice of peeking at data and then, based on the results, deciding whether or not to continue an experiment. In the context of ordinary significance-testing analysis, optional stopping is discouraged, because it necessarily leads to increased type I error rates over nominal values. This article addresses whether optional stopping is problematic for Bayesian inf...
Comparison between Frequentist Test and Bayesian Test to Variance Normal in the Presence of Nuisance Parameter: One-sided and Two-sided Hypothesis
This article is concerned with the comparison of the P-value and a Bayesian measure for the variance of a Normal distribution with the mean as a nuisance parameter. First, the P-value of the null hypothesis is compared with the posterior probability when a fixed prior distribution is used and the sample size increases. In the second stage, the P-value is compared with the lower bound of the posterior probability when the ...
Testing Simple Hypotheses
We introduce a neutral statistic S that makes the Conditional Frequentist error reports identical to Bayesian posterior probabilities of the hypotheses. In symmetrical cases we can show this strategy to be optimal from the Frequentist perspective. A Conditional Frequentist who uses such a strategy can exploit the consistency of the method with the Likelihood Principle—for example, the validity ...
How To Remove the Ad Hoc Features of Statistical Inference within a Frequentist Paradigm
Our aim is to develop a frequentist theory of decision-making. The resulting unification of the seemingly unrelated theories of hypothesis testing and parameter estimation is based on a new definition of the optimality of a decision rule within an ensemble of token experiments. It is the introduction of ensembles that enables us to avoid the use of subjective Bayesian priors. We also consider t...
Unified Conditional Frequentist and Bayesian Testing of Composite Hypotheses
Testing of a composite null hypothesis versus a composite alternative is considered when both have a related invariance structure. The goal is to develop conditional frequentist tests that allow the reporting of data-dependent error probabilities, error probabilities that have a strict frequentist interpretation and that reflect the actual amount of evidence in the data. The resulting tests are...
Journal title:
- Psychonomic bulletin & review
Volume 21, Issue 2
Publication date: 2014