A Probability for Classification Based on the Dirichlet Process Mixture Model
نویسندگان
چکیده
In this paper we provide an explicit probability distribution for classification purposes when observations are viewed on the real line and classifications are to be based on numerical orderings. The classification model is derived from a Bayesian nonparametric mixture of Dirichlet process model; with some modifications. The resulting approach then more closely resembles a classical hierarchical grouping rule in that it depends on sums of squares of neighboring values. The proposed probability model for classification relies on a numerical procedure based on a reversible Markov chain Monte Carlo (MCMC) algorithm for determining the probabilities. Some numerical illustrations comparing with alternative ideas for classification are provided.
منابع مشابه
Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work
Statistical models are utilized to learn about the mechanism that the data are generating from it. Often it is assumed that the random variables y_i,i=1,…,n ,are samples from the probability distribution F which is belong to a parametric distributions class. However, in practice, a parametric model may be inappropriate to describe the data. In this settings, the parametric assumption could be r...
متن کاملRecognizing the Emotional State Changes in Human Utterance by a Learning Statistical Method based on Gaussian Mixture Model
Speech is one of the most opulent and instant methods to express emotional characteristics of human beings, which conveys the cognitive and semantic concepts among humans. In this study, a statistical-based method for emotional recognition of speech signals is proposed, and a learning approach is introduced, which is based on the statistical model to classify internal feelings of the utterance....
متن کاملUsing Regression based Control Limits and Probability Mixture Models for Monitoring Customer Behavior
In order to achieve the maximum flexibility in adaptation to ever changing customer’s expectations in customer relationship management, appropriate measures of customer behavior should be continually monitored. To this end, control charts adjusted for buyer’s/visitor’s prior intention to repurchase or visit again are suitable means taking into account the heterogeneity across customers. In the ...
متن کاملA Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
متن کاملSemiparametric Bayesian classification with longitudinal markers.
We analyse data from a study involving 173 pregnant women. The data are observed values of the β human chorionic gonadotropin hormone measured during the first 80 days of gestational age, including from one up to six longitudinal responses for each woman. The main objective in this study is to predict normal versus abnormal pregnancy outcomes from data that are available at the early stages of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- J. Classification
دوره 27 شماره
صفحات -
تاریخ انتشار 2010