Testing for Equal Distributions in High Dimension

Authors

  • Gábor J. Székely
  • Maria L. Rizzo
Abstract

We propose a new nonparametric test for equality of two or more multivariate distributions based on Euclidean distance between sample elements. Several consistent tests for comparing multivariate distributions can be developed from the underlying theoretical results. The test procedure for the multisample problem is developed and applied for testing the composite hypothesis of equal distributions, when distributions are unspecified. The proposed test is universally consistent against all fixed alternatives (not necessarily continuous) with finite second moments. The test is implemented by conditioning on the pooled sample to obtain an approximate permutation test, which is distribution free. Our Monte Carlo power study suggests that the new test may be much more sensitive than tests based on nearest neighbors against several classes of alternatives, and performs particularly well in high dimension. Computational complexity of our test procedure is independent of dimension and number of populations sampled. The test is applied in a high dimensional problem, testing microarray data from cancer samples.
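The permutation procedure described in the abstract can be illustrated with a minimal two-sample sketch. The abstract does not state the test statistic, so the formula below (the commonly cited two-sample energy statistic built from mean pairwise Euclidean distances) is an assumption, as are the function names `pooled_distances`, `two_sample_statistic`, and `permutation_test` and the default of 199 permutations. The sketch covers only two samples, not the multisample procedure.

```python
import numpy as np


def pooled_distances(pooled):
    """Euclidean distance matrix of the pooled sample (rows are observations).

    Note: this simple version builds an n x n x d array, so it suits small n only.
    """
    diff = pooled[:, None, :] - pooled[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def two_sample_statistic(dist, idx_x, idx_y):
    """Assumed statistic: n1*n2/(n1+n2) * (2*mean|X-Y| - mean|X-X'| - mean|Y-Y'|)."""
    n1, n2 = len(idx_x), len(idx_y)
    between = dist[np.ix_(idx_x, idx_y)].mean()
    within_x = dist[np.ix_(idx_x, idx_x)].mean()
    within_y = dist[np.ix_(idx_y, idx_y)].mean()
    return n1 * n2 / (n1 + n2) * (2.0 * between - within_x - within_y)


def permutation_test(x, y, n_perm=199, seed=0):
    """Approximate permutation test, conditioning on the pooled sample."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1, n = len(x), len(x) + len(y)
    # Distances are computed once; each permutation only re-indexes this matrix,
    # so the per-permutation cost does not depend on the dimension of the data.
    dist = pooled_distances(pooled)
    observed = two_sample_statistic(dist, np.arange(n1), np.arange(n1, n))
    exceed = 0
    for _ in range(n_perm):
        idx = rng.permutation(n)
        if two_sample_statistic(dist, idx[:n1], idx[n1:]) >= observed:
            exceed += 1
    # Count the observed labelling as one of the permutations.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

Calling `permutation_test(x, y)` with two arrays of shape `(n1, d)` and `(n2, d)` returns the observed statistic and an approximate p-value; a small p-value is evidence against equal distributions.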


Similar resources

Comparing the Shape Parameters of Two Weibull Distributions Using Records: A Generalized Inference

The Weibull distribution is a widely applicable model for lifetime data. For inference about two Weibull distributions using records, the shape parameters of the distributions are usually considered equal. However, the literature offers no appropriate method for comparing the shape parameters. Therefore, comparing the shape parameters of two Weibull distributions is very important. I...


On the Number of Modes of Finite Mixtures of Elliptical Distributions

We extend the concept of the ridgeline from Ray and Lindsay (2005) to finite mixtures of general elliptical densities with possibly distinct density generators in each component. This can be used to obtain bounds for the number of modes of two-component mixtures of t distributions in any dimension. In case of proportional dispersion matrices, these have at most three modes, while for equal degr...


Testing a Point Null Hypothesis against One-Sided for Non Regular and Exponential Families: The Reconcilability Condition to P-values and Posterior Probability

In this paper, the reconcilability between the P-value and the posterior probability in testing a point null hypothesis against a one-sided hypothesis is considered. Two essential families of distributions, non-regular and exponential, are studied. It is shown that, in a non-regular family of distributions, it is in some cases possible to find a prior distribution function under which P-va...


The distance correlation t-test of independence in high dimension

AMS subject classifications: primary 62G10; secondary 62H20. Keywords: dCor; dCov; multivariate independence; distance covariance; distance correlation; high dimension. Abstract: Distance correlation is extended to the problem of testing the independence of random vectors in high dimension. Distance correlation characterizes independence and determines a test of multivariate independence for rand...


Indistinguishability of Absolutely Continuous and Singular Distributions

It is shown that there are no consistent decision rules for the hypothesis testing problem of distinguishing between absolutely continuous and purely singular probability distributions on the real line. In fact, there are no consistent decision rules for distinguishing between absolutely continuous distributions and distributions supported by Borel sets of Hausdorff dimension 0. It follows that...



Publication date: 2004