Proper Model Selection with Significance Test
Authors
Abstract
Model selection is an important and ubiquitous task in machine learning. To select models with the best future classification performance measured by a goal metric, an evaluation metric is often used to select the best classification model among the competing ones. A common practice is to use the same goal and evaluation metric. However, in several recent studies, it is claimed that using an evaluation metric (such as AUC) other than the goal metric (such as accuracy) results in better selection of the correct models. In this paper, we point out a flaw in the experimental design of those studies, and propose an improved method to test the claim. Our extensive experiments show convincingly that only the goal metric itself can most reliably select the correct classification models.
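The distinction between a goal metric and an evaluation metric can be made concrete with a minimal sketch. The example below is illustrative only: the labels, the two candidate models (`model_a`, `model_b`), and their scores are invented toy data, not taken from the paper, and the code does not reproduce the paper's significance test or experimental design. It only shows that selecting by accuracy and selecting by AUC on the same validation data can pick different models.

```python
from typing import Callable, Dict, Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples whose thresholded score matches the 0/1 label."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_model(labels: Sequence[int],
                 candidates: Dict[str, Sequence[float]],
                 metric: Callable[[Sequence[int], Sequence[float]], float]) -> str:
    """Return the name of the candidate with the highest metric value."""
    return max(candidates, key=lambda name: metric(labels, candidates[name]))


# Hypothetical validation labels and predicted scores (toy data).
labels = [1, 1, 1, 0, 0, 0]
candidates = {
    # Ranks every positive above every negative, but is poorly calibrated
    # around the 0.5 threshold.
    "model_a": [0.9, 0.45, 0.4, 0.3, 0.2, 0.1],
    # Misranks one positive/negative pair, but makes fewer threshold errors.
    "model_b": [0.9, 0.8, 0.6, 0.7, 0.2, 0.1],
}

chosen_by_accuracy = select_model(labels, candidates, accuracy)
chosen_by_auc = select_model(labels, candidates, auc)
print("selected by accuracy:", chosen_by_accuracy)
print("selected by AUC:", chosen_by_auc)
```

On this toy data the two metrics disagree, which is the situation the abstract is concerned with: if the goal metric is accuracy, the question is which evaluation metric more reliably picks the model that performs best on future data.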
Similar Resources
Haplotype Block Partitioning and tagSNP Selection under the Perfect Phylogeny Model
Single Nucleotide Polymorphisms (SNPs) are the most common form of polymorphism in the human genome. Analyses of genetic variations have revealed that individual genomes share common SNP-haplotypes. The particular pattern of these common variations forms a block-like structure on the human genome. In this work, we develop a new method based on the Perfect Phylogeny Model to identify haplo...
Vector Autoregressive Model Selection: Gross Domestic Product and Europe Oil Prices Data Modelling
We consider the problem of model selection in vector autoregressive models with normal innovations. Tests such as Vuong's and Cox's tests are provided for order and model selection, i.e. for selecting the order and a suitable subset of regressors in a vector autoregressive model. We propose a modified log-likelihood ratio test for selecting subsets of regressors. The Europe oil prices, ...
Testing Significance in Bayesian Classifiers
The Fully Bayesian Significance Test (FBST) is a coherent Bayesian significance test for sharp hypotheses. This paper explores the FBST as a model selection tool for general mixture models, and gives some computational experiments for Multinomial-Dirichlet-Normal-Wishart models.
Significance Tests Harm Progress in Forecasting
Based on a summary of prior literature, I conclude that tests of statistical significance harm scientific progress. Efforts to find exceptions to this conclusion have, to date, turned up none. Even when done correctly, significance tests are dangerous. I show that summaries of scientific research do not require tests of statistical significance. I illustrate the dangers of significance tests by...
A New Approach to Project Risk Responses Selection with Inter-dependent Risks
Risks are natural and inherent characteristics of major projects. Risks are usually considered independently in the analysis of risk responses. However, most risks depend on each other, and independent risks are rare in the real world. This paper proposes a model for proper risk response selection from the responses portfolio with the purpose of optimization of defined criteria for projects. Th...