Credit scoring and the sample selection bias
Abstract
For creating or adjusting credit scoring rules, usually only the accepted applicants’ data and default information are available. The missing information for the rejected applicants and the sorting mechanism of the preceding scoring can lead to a sample selection bias: rules derived from the accepted applicants alone mostly achieve inferior classification results when they are applied to the whole population of applicants. Methods for coping with this problem are known by the term “reject inference.” These techniques attempt to get additional data for the rejected applicants or try to infer the missing information. We apply some of these reject inference methods as well as two extensions to a simulated and a real data set in order to test the adequacy of different approaches. The suggested extensions improve on the known techniques. Furthermore, the size of the sample selection effect and its influencing factors are examined.
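The selection mechanism described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's simulated data set; the variables `x` (a characteristic available to the new scoring model) and `z` (information the preceding scoring used but the new model does not see), the acceptance rule, and all coefficients are assumptions chosen only to make the effect visible. It compares the default rate of low-`x` applicants in the full population with the rate seen among the accepted ones, which is the only group for which default information would normally exist.

```python
import math
import random


def simulate(n=50000, seed=0):
    """Simulate applicants as (x, accepted, defaulted) tuples.

    Hypothetical setup: default risk falls as x + z rises, and the
    preceding scoring accepted an applicant only when x + z > 0.
    """
    rng = random.Random(seed)
    applicants = []
    for _ in range(n):
        x = rng.gauss(0, 1)  # characteristic observed by the new model
        z = rng.gauss(0, 1)  # used by the old scoring, hidden from the new model
        p_default = 1.0 / (1.0 + math.exp(2.0 + x + z))
        defaulted = rng.random() < p_default
        accepted = (x + z) > 0
        applicants.append((x, accepted, defaulted))
    return applicants


def default_rate(rows):
    """Share of applicants in rows who defaulted."""
    rows = list(rows)
    return sum(1 for _, _, d in rows if d) / len(rows)


applicants = simulate()
# Applicants who look weak on the observed characteristic.
low_x_all = [r for r in applicants if r[0] < -1.0]
# The subset of them that the preceding scoring let through.
low_x_accepted = [r for r in low_x_all if r[1]]

print("low-x default rate, whole population:", round(default_rate(low_x_all), 3))
print("low-x default rate, accepted only:   ", round(default_rate(low_x_accepted), 3))
```

In this setup, accepted applicants with a low `x` got through only because of a favorable `z`, so their observed default rate understates the risk of low-`x` applicants in general. A rule fitted on accepted applicants alone would inherit that distortion, which is the gap reject inference methods try to close.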
Similar resources
The impact of sample bias on consumer credit scoring performance and profitability
This article seeks to gain insight into the influence of sample bias in a consumer credit scoring model. Considering the vital implications on revenues and costs concerned with the issuing and repayment of commercial credit, predictive performance of the model is crucial, and sample bias has been suggested to pose a sizeable threat to profitability due to its implications on either population d...
The Economic Value of Reject Inference in Credit Scoring
We use data with complete information on both rejected and accepted bank loan applicants to estimate the value of sample bias correction using Heckman’s two-stage model with partial observability. In the credit scoring domain such correction is called reject inference. We validate the model performances with and without the correction of sample bias by various measurements. Results show that it...
Credit scoring in banks and financial institutions via data mining techniques: A literature review
This paper presents a comprehensive review of the work done during 2000–2012 on the application of data mining techniques in credit scoring. Yet there isn’t any literature in the field of data mining applications in credit scoring. Using a novel research approach, this paper investigates academic and systematic literature review and includes all of the journals in the ScienceDirect onli...
Sample selection in credit-scoring models
We examine three models for sample selection that are relevant for modeling credit scoring by commercial banks. A binary choice model is used to examine the decision of whether or not to extend credit. The selectivity aspect enters because such models are based on samples of individuals to whom credit has already been given. A regression model with sample selection is suggested for predicting e...