Dimension Reduction for Multinomial Models Via a Kolmogorov-Smirnov Measure (KSM)
Abstract
Due to advances in technology and data collection techniques, the number of measurements often exceeds the number of samples in ecological datasets. As such, standard models that attempt to assess the relationship between variables and a response are inapplicable and require a reduction in the number of dimensions to be estimable. Several filtering methods exist to accomplish this, including Indicator Species Analyses and Sure Information Screening, but these techniques often have questionable asymptotic properties or are not readily applicable to data with multinomial responses. As such, we propose and validate a new metric called the Kolmogorov-Smirnov Measure (KSM) to be used for filtering variables. In the paper, we develop the KSM, investigate its asymptotic properties, and compare it to group equalized Indicator Species Values through simulation studies and application to a well-known biological dataset.
Similar Resources
A comparison of the discrete Kolmogorov-Smirnov statistic and the Euclidean distance
Goodness-of-fit tests gauge whether a given set of observations is consistent (up to expected random fluctuations) with arising as independent and identically distributed (i.i.d.) draws from a user-specified probability distribution known as the “model.” The standard gauges involve the discrepancy between the model and the empirical distribution of the observed draws. Some measures of discrepan...
Monitoring Multinomial Logit Profiles via Log-Linear Models (Quality Engineering Conference Paper)
In certain statistical process control applications, the quality of a process or product can be characterized by a function commonly referred to as a profile. Some of the potential applications of profile monitoring are cases where the quality characteristic of interest is modelled using binary, multinomial or ordinal variables. In this paper, profiles with multinomial response are studied. For this purpo...
Lipschitz Stability for Stochastic Programs with Complete Recourse
This paper investigates the stability of optimal solution sets to stochastic programs with complete recourse, where the underlying probability measure is understood as a parameter varying in some space of probability measures. Shapiro proved Lipschitz upper semicontinuity of the solution set mapping. Inspired by this result, we introduce a subgradient distance for probability distributions and esta...
How to Measure the Quality of Credit Scoring Models
Credit scoring models are widely used to predict the probability of client default. To measure the quality of such scoring models it is possible to use quantitative indices such as the Gini index, Kolmogorov-Smirnov statistics (KS), Lift, the Mahalanobis distance, and information statistics. This paper reviews and illustrates the use of these indices in practice.
Constructive dimension equals Kolmogorov complexity
We derive the coincidence of Lutz’s constructive dimension and Kolmogorov complexity for sets of infinite strings from Levin’s early result on the existence of an optimal left computable cylindrical semi-measure M via simple calculations.