Department of Statistics and Probability COLLOQUIUM

Author

  • Hui Zou
Abstract

Distance weighted discrimination (DWD) is a margin-based classifier with an interesting geometric motivation. DWD was originally proposed as a superior alternative to the support vector machine (SVM); however, DWD has yet to become as popular as the SVM. The main reasons are twofold. First, the state-of-the-art algorithm for solving DWD is based on second-order cone programming (SOCP), while the SVM is a quadratic programming problem, which is much more efficient to solve. Second, the current statistical theory of DWD mainly focuses on the linear DWD in the high-dimension, low-sample-size setting and on data piling, while the learning theory for the SVM mainly focuses on the Bayes risk consistency of the kernel SVM. In fact, the Bayes risk consistency of DWD is presented as an open problem in the original DWD paper. In this work, we advance the current understanding of DWD from both computational and theoretical perspectives. We propose a novel, efficient algorithm for solving DWD, and our algorithm can be several hundred times faster than the existing state-of-the-art algorithm based on SOCP. In addition, our algorithm can handle the generalized DWD, while the SOCP algorithm works well only for a special case of DWD and not for the generalized DWD. Furthermore, we consider a natural kernel DWD in a reproducing kernel Hilbert space and then establish the Bayes risk consistency of the kernel DWD. We compare DWD and the SVM on several benchmark data sets and show that the two have comparable classification accuracy, but DWD equipped with our new algorithm can be much faster to compute than the SVM.

To request an interpreter or other accommodations for people with disabilities, please call the Department of Statistics and Probability at 517-355-9589.


Similar resources

FUZZY INFORMATION AND STOCHASTICS

In applications there occur different forms of uncertainty. The two most important types are randomness (stochastic variability) and imprecision (fuzziness). In modelling, the dominating concept to describe uncertainty is using stochastic models which are based on probability. However, fuzziness is not stochastic in nature and therefore it is not considered in probabilistic models. Since many years t...

Testing a Point Null Hypothesis against One-Sided for Non Regular and Exponential Families: The Reconcilability Condition to P-values and Posterior Probability

In this paper, the reconcilability between the P-value and the posterior probability in testing a point null hypothesis against a one-sided hypothesis is considered. Two essential families of distributions, the non-regular and the exponential, are studied. It is shown that, for a non-regular family of distributions, it is in some cases possible to find a prior distribution function under which P-va...

COLLOQUIUM Department of Statistics and Probability Michigan State University

The basic message of this talk could have been delivered long ago, maybe even soon after the publication of the classical papers of K. Pearson (1900) and R. A. Fisher (1922, 1924). However, the tradition of using the chi-square goodness of fit statistic became so widely spread, and the point of view that, for the case of discrete distributions, statistics “have to” have their asymptotic ...

Probability-possibility DEA model with Fuzzy random data in presence of skew-Normal distribution

Data envelopment analysis (DEA) is a mathematical method to evaluate the performance of decision-making units (DMU). In the performance evaluation of an organization based on the classical theory of DEA, input and output data are assumed to be deterministic, while in the real world, the observed values of the inputs and outputs data are mainly fuzzy and random. A normal distribution is a contin...

Probability Distribution Fitting to Maternal Mortality in Nigeria.

The consequences of Maternal Mortality (MM) cannot be overemphasized. It inhibits population growth and results in loss of lives, among other effects. This work aims to obtain the maternal mortality rates (MMR) in Nigeria, identify some fitted distributions to MMR and determine which of the distributions best fits the data. A comprehensive Exploratory Data Analysis (EDA) was carried out on MM and the MMRs...

A continuous approximation fitting to the discrete distributions using ODE

Fitting probability density functions to discrete probability functions has long been needed and is very important. This paper fits continuous curves, which are probability density functions, to the binomial, negative binomial, geometric, Poisson and hypergeometric probability functions. The key to these fits is the use of the derivative concept and common differential ...

Publication date: 2015