Equivalence of Distance-based and RKHS-based Statistics in Hypothesis Testing by Dino Sejdinovic

Authors

  • BHARATH SRIPERUMBUDUR
  • ARTHUR GRETTON
  • KENJI FUKUMIZU
Abstract

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the case where the energy distance is computed with a semimetric of negative type, a positive definite kernel, termed distance kernel, may be defined such that the MMD corresponds exactly to the energy distance. Conversely, for any positive definite kernel, we can interpret the MMD as energy distance with respect to some negative-type semimetric. This equivalence readily extends to distance covariance using kernels on the product space. We determine the class of probability distributions for which the test statistics are consistent against all alternatives. Finally, we investigate the performance of the family of distance kernels in two-sample and independence tests: we show in particular that the energy distance most commonly employed in statistics is just one member of a parametric family of kernels, and that other choices from this family can yield more powerful tests.
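
The correspondence described in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper's code. It assumes the semimetric is the Euclidean distance raised to a power q (of negative type for 0 < q <= 2), that the distance kernel has the form k(x, y) = (rho(x, z0) + rho(y, z0) - rho(x, y)) / 2 for an arbitrarily chosen centre point z0, and that the energy distance is normalised so that it equals twice the squared MMD. The function names, the sample sizes, and the values of q and z0 are all hypothetical choices made for this example.

```python
import math
import random


def semimetric(x, y, q=1.0):
    """Euclidean distance raised to the power q (assumed exponent, 0 < q <= 2)."""
    return math.dist(x, y) ** q


def distance_kernel(x, y, z0, q=1.0):
    """Kernel induced by the semimetric, centred at an arbitrary point z0."""
    return 0.5 * (semimetric(x, z0, q) + semimetric(y, z0, q) - semimetric(x, y, q))


def mean_over_pairs(f, xs, ys):
    """Average f(x, y) over all pairs (V-statistic style, diagonal included)."""
    return sum(f(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys, q=1.0):
    """Empirical energy distance: 2 E d(X, Y) - E d(X, X') - E d(Y, Y')."""
    d = lambda a, b: semimetric(a, b, q)
    return (2 * mean_over_pairs(d, xs, ys)
            - mean_over_pairs(d, xs, xs)
            - mean_over_pairs(d, ys, ys))


def mmd_squared(xs, ys, z0, q=1.0):
    """Empirical squared MMD: E k(X, X') + E k(Y, Y') - 2 E k(X, Y)."""
    k = lambda a, b: distance_kernel(a, b, z0, q)
    return (mean_over_pairs(k, xs, xs)
            + mean_over_pairs(k, ys, ys)
            - 2 * mean_over_pairs(k, xs, ys))


# Two small synthetic samples in the plane; the second is shifted along one axis.
rng = random.Random(0)
xs = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(40)]
ys = [(rng.gauss(0.5, 1), rng.gauss(0, 1)) for _ in range(50)]
z0 = (0.0, 0.0)  # arbitrary centre point; the MMD does not depend on it

for q in (0.5, 1.0, 1.5):
    # The two printed statistics should agree up to floating-point error.
    print(q, energy_distance(xs, ys, q), 2 * mmd_squared(xs, ys, z0, q))
```

In this sketch the terms involving z0 cancel when the three kernel averages are combined, which is why the choice of centre point does not affect the statistic. Varying q gives different members of the family of distance kernels, with q = 1 corresponding to the ordinary Euclidean energy distance.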

Similar resources

Equivalence of Distance-based and RKHS-based Statistics in Hypothesis Testing by Dino Sejdinovic, Bharath Sriperumbudur,

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the ...

Equivalence of distance-based and RKHS-based statistics in hypothesis testing

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, Maximum Mean Discrepancies (MMD), i.e., distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the cas...

Hypothesis testing using pairwise distances and associated kernels

We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. The equivalence holds when energy distances are co...

TESTING STATISTICAL HYPOTHESES UNDER FUZZY DATA AND BASED ON A NEW SIGNED DISTANCE

This paper deals with the problem of testing statistical hypotheses when the available data are fuzzy. In this approach, we first obtain a fuzzy test statistic based on fuzzy data, and then, based on a new signed distance between fuzzy numbers, we introduce a new decision rule to accept/reject the hypothesis of interest. The proposed approach is investigated for two cases: the case without nuisance p...

A Wild Bootstrap for Degenerate Kernel Tests

A wild bootstrap method for nonparametric hypothesis tests based on kernel distribution embeddings is proposed. This bootstrap method is used to construct provably consistent tests that apply to random processes, for which the naive permutation-based bootstrap fails. It applies to a large group of kernel tests based on V-statistics, which are degenerate under the null hypothesis, and nondegener...

Publication date: 2013