Descriptive vs. inferential cheating
Abstract
Given the recent and highly publicized scandals involving psychology researchers who cheated, the proliferation of articles on related topics is unsurprising. For example, Simmons et al. (2011) pointed out subtle ways in which researchers can increase their false positive rate above the nominal level of 0.05. From my perspective, a major limitation of the literature on cheating has been a failure to distinguish between two kinds of cheating (bias might be a kinder word), which I term descriptive and inferential cheating. I intend to demonstrate that inferential cheating is not as destructive as descriptive cheating.

So what are descriptive and inferential cheating? Descriptive cheating involves the false reporting of descriptive data, such as sample means, proportions, standard deviations, and so on. The harm of descriptive cheating is obvious, has been demonstrated by previous scandals, and needs no further elaboration here. In contrast, when a researcher cheats inferentially, the descriptive data are true but the reported p-values (and associated t-tests, F-tests, and so on) are not. My conclusion that inferential cheating causes only limited harm is based on demonstrations that the null hypothesis significance testing procedure (NHSTP) is invalid. Put simply, although providing false information that matters a lot, such as wrong descriptive statistics, can do much harm, providing false information that matters hardly at all, such as false p-values, does not do much harm.

So what is wrong with the NHSTP? The basic idea is that if we are to reject the null hypothesis, it should be shown to have a low probability of being true, given the finding. But a p-value does not provide this; rather, a p-value only shows that a finding is rare given the null hypothesis (Nickerson, 2000). As Kass and Raftery (1995) pointed out, knowing that a finding is rare given a hypothesis is not useful unless one knows how rare the finding is given a competing hypothesis.
Also, Trafimow (2003) demonstrated that (1) the null hypothesis can have a very high probability (including a probability of 1) of being true even when p < 0.05, (2) p-values generally are inaccurate estimators of probabilities of null hypotheses, and (3) the conditions needed to make p-values valid indicators of probabilities of null hypotheses preclude the researcher from gaining much information from the NHSTP. Furthermore, Trafimow and Rice (2009) demonstrated that the correlation between p-values and probabilities of null hypotheses is low to begin with, and decreases to triviality when dichotomous "accept" or "reject" decisions are made based on cutoff numbers such as 0.05 or 0.01.

The famous theorem by Bayes provides examples whereby the null hypothesis will be rejected even when it has a high probability of being true. Suppose that the prior probability of the null hypothesis is 0.95, the probability of the finding given the null hypothesis is the traditional value of 0.05 (so the null hypothesis is rejected), and the probability of the finding given that the null hypothesis is not true is 0.06. In that case, the posterior probability of the rejected null hypothesis is

$$\frac{(0.95)(0.05)}{(0.95)(0.05) + (0.06)(1 - 0.95)} = 0.94.$$

In the foregoing example, I tacitly allowed the null hypothesis to represent a range of values. Worse yet, however, in most empirical psychology articles, the null hypothesis refers to a single value (e.g., that the difference between two conditions is zero). But when the null hypothesis refers to a specific value, it is a practical certainty that the value is not exactly true. With an infinite number of possible values, the probability that the single value specified by the null hypothesis is exactly true approaches zero (e.g., Meehl, 1967; Loftus, 1996; Trafimow, 2006), and so it should be rejected.
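The arithmetic in the Bayes example above (prior of 0.95 for the null hypothesis, 0.05 for the finding given the null, 0.06 for the finding given that the null is not true) can be checked with a short Python sketch. The function name `posterior_null` and its parameter names are illustrative assumptions introduced here, not part of the original argument.

```python
def posterior_null(prior_null, p_finding_given_null, p_finding_given_alt):
    """Posterior probability of the null hypothesis via Bayes' theorem.

    All three arguments are probabilities in [0, 1]; the names are
    hypothetical labels chosen for this sketch.
    """
    # Joint probability of the null being true and the finding occurring.
    numerator = prior_null * p_finding_given_null
    # Total probability of the finding under the null and its complement.
    denominator = numerator + (1.0 - prior_null) * p_finding_given_alt
    return numerator / denominator


# Values taken from the example in the text.
posterior = posterior_null(0.95, 0.05, 0.06)
print(round(posterior, 2))  # 0.94
```

Varying the third argument illustrates the point attributed to Kass and Raftery (1995): the posterior depends on how rare the finding is under the competing hypothesis, which the p-value alone does not convey.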
The NHSTP has been demonstrated to be invalid, and it results in p-values that have little correlation with actual probabilities of null hypotheses. We also have seen that when the null hypothesis specifies a point, as opposed to a range, it is almost certainly false regardless of the obtained p-value. Thus, whether the null hypothesis specifies a range or a point, the NHSTP is invalid. Arguably, because of its invalidity, the NHSTP should not be performed, and so inferential cheating bypasses a procedure that should not be used anyway. Thus, where is the harm in avoiding the use of a procedure that is blatantly invalid and only trivially correlated with what we really need to know (the probabilities of null hypotheses)?

Let me be clear about what I am not saying. First, I am not disagreeing with various prescriptions for avoiding inferential cheating, particularly because many of them would reduce descriptive cheating too, and the latter is much more important. Second, I am not arguing that all inferential cheating is harmless; for example, harm can result when one makes improper estimates of population parameters based on poor inferential procedures even with accurate sample statistics. Third, it is quite possible that in attempting heroic measures to obtain p < 0.05, descriptive statistics also might be influenced, and this would be harmful to psychology. Fourth, from a deontological point of view, cheating is unethical in its own right, even apart from specific demonstrable consequences, and so the present argument should not be taken as a justification for any cheating whatsoever.

With the foregoing caveats in place, my main point is as follows. Although