A Bayesian Method for Fitting Parametric and Nonparametric Models to Noisy Data
Authors: M. Werman, D. Keren
Abstract
We present a simple paradigm for fitting models, parametric and nonparametric, to noisy data, which resolves some of the problems associated with classical MSE algorithms. This is done by considering each point on the model as a possible source for each data point. The paradigm can be used to solve problems which are ill-posed in the classical MSE approach, such as fitting a segment (as opposed to a line). It is shown to be nonbiased and to achieve excellent results for general curves, even in the presence of strong discontinuities. Results are shown for a number of fitting problems, including lines, circles, elliptic arcs, segments, rectangles, and general curves, contaminated by Gaussian and uniform noise.
Index Terms: Bayesian fitting, parametric models, nonparametric models.
1 INTRODUCTION
It is common practice to fit models (lines, circles, implicit polynomials, etc.) to data points by minimizing the sum of squared distances from the points to the model (the MSE, or Mean Square Error, approach). While the MSE algorithm may seem natural, it in fact implicitly assumes that each data point is the noised version of the point on the model which is closest to it. This assumption is clearly false and leads to bias, for instance, when fitting circles to data contaminated by strong noise.
The MSE algorithm suffers from another drawback: it cannot differentiate between a "large" model and a "small" one. For instance, when fitting a line segment to image data, one would often like to know not only the slope and location of the fitted segment, but also its end points. The MSE criterion does not differentiate between the "correct" segment and a segment which is too long, because both have the same MSE with respect to the data.
We offer a simple paradigm for fitting parametric models which solves these problems. This is done by considering each point on the parametric model as a possible source for each data point.
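The claim that the MSE criterion cannot separate a correct segment from one that is too long can be checked numerically. The following is a minimal sketch, not the authors' code: the helper names `sq_dist_to_segment` and `mse_cost` and the sample points are illustrative assumptions, and the data are chosen so that every point projects inside the shorter segment.

```python
import math

def sq_dist_to_segment(p, a, b):
    """Squared distance from point p to the segment with end points a and b.

    Assumes a != b (a degenerate segment would divide by zero).
    """
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    # Parameter of the orthogonal projection, clamped so it stays on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy
    return (px - cx) ** 2 + (py - cy) ** 2

def mse_cost(points, a, b):
    """Sum of squared point-to-segment distances (the classical MSE function)."""
    return sum(sq_dist_to_segment(p, a, b) for p in points)

# Hypothetical noisy samples of the segment from (0, 0) to (1, 0).
points = [(0.1, 0.05), (0.3, -0.02), (0.5, 0.04), (0.7, -0.03), (0.9, 0.01)]

cost_true = mse_cost(points, (0.0, 0.0), (1.0, 0.0))   # the "correct" segment
cost_long = mse_cost(points, (-1.0, 0.0), (2.0, 0.0))  # three times too long

# The two costs agree up to floating-point rounding.
print(cost_true, cost_long, math.isclose(cost_true, cost_long, rel_tol=1e-9))
```

Because the closest point on either segment is the same for every sample, the MSE function carries no information about where the end points lie.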
The paradigm is also extended to nonparametric models and gives good results even for data with strong discontinuities. We show results of the method for lines, segments, circles, elliptic arcs, rectangles, and general curves. Both Gaussian and uniform noise models are considered.
1.1 Previous Work
There are many papers which describe least-squares techniques for fitting parameters to noisy data, along with the numerical techniques and linear approximations needed for the computations. See, for example, [11], [18] and their references and also [2], where an ordinary least-squares estimate is shown to be consistent for a regression problem. There are also many papers with different solutions and heuristics for fitting circles, ellipses, and other parametric curves using different statistical or optimization techniques; see, for example, [20], [14], [16], [4], [19]. There have been a few papers related to Bayesian techniques for specific cases of parametric or nonparametric curve and surface fitting [7], [9], [8], [1], [5], [6]. The idea of associating a "cloud of influence" with each data point is used to compute a better straight-line fit in [10], [12] by using a more general error criterion than the point-line distance. In [3], a very interesting approximate solution to the traveling salesman problem is offered, in which a (nonparametric) path is pulled towards the cities, controlled by a term which tries to keep it as short as possible. That work differs from ours in the Bayesian formulation and in that no treatment of parametric models is offered.
In general, this paper differs from previous work mainly in that precisely the MAP estimate of the model is found, whereas usually the MAP estimate of the model together with the denoised data points is computed or approximated. Also, we extend the fitting to the general, nonparametric case.
1.2 Suggested Algorithm
Given data points $D = \{p_i\}_{i=1}^{n}$ and a parametric model $M(d_1, \ldots, d_m)$ defined by a set of parameters $\{d_j\}_{j=1}^{m}$, a very common fitting algorithm is to choose the instance of the model $M(d_1, \ldots, d_m)$ such that the so-called MSE (Mean Square Error) function, defined by
$$\sum_{i=1}^{n} \mathrm{dist}(M(d_1, \ldots, d_m), p_i),$$
attains its minimum at $\{d_1, \ldots, d_m\}$. Here $\mathrm{dist}(M(d_1, \ldots, d_m), p_i)$ is the squared distance between $p_i$ and the model.
The "Bayesian justification" of minimizing the MSE function is as follows: one wishes to maximize the probability of a certain model instance, given the data. Using Bayes' formula and assuming a uniform distribution over the different model instances and independent data,
$$\Pr(M \mid D) = \frac{\Pr(D \mid M)\,\Pr(M)}{\Pr(D)} \propto \Pr(D \mid M) = \prod_{i=1}^{n} \Pr(p_i \mid M).$$
Assuming isodirectional Gaussian measurement noise with a variance of $\sigma^2$, it is common to approximate $\Pr(p_i \mid M)$ by
$$\mathrm{const}_n \exp\left(-\frac{\mathrm{dist}^2(p_i^*, p_i)}{2\sigma^2}\right),$$
where $p_i^*$ is the point on the model $M$ closest to $p_i$, and $\mathrm{dist}$ between two points denotes their Euclidean distance, so that $\mathrm{dist}^2(p_i^*, p_i)$ is the squared distance between $p_i$ and the model. Multiplying over $i$, taking logarithms, and ignoring constants, it is easy to see that maximizing this approximate probability is equivalent to minimizing the MSE function.
However, this is only an approximation, which fails for some cases (notably, for instance, for large values of $\sigma$). The correct expression is
$$\Pr(p_i \mid M) = \mathrm{const}_n \int_M \exp\left(-\frac{\mathrm{dist}^2(p, p_i)}{2\sigma^2}\right) \Pr(p \mid M)\, dp,$$
where $p$ is a point on $M$, or more generally, Bayes' rule:
$$\Pr(M \mid p_i) = \frac{\Pr(p_i \mid M)\,\Pr(M)}{\Pr(p_i)}.$$
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 5, May 2001.
M. Werman is with the Institute of Computer Science, Hebrew University at Jerusalem, Jerusalem 91904, Israel.
D. Keren is with the Department of Computer Science, University of Haifa, Haifa 31905, Israel.
Manuscript received 21 Sept. 1999; revised 10 May 2000; accepted 15 May 2000. Recommended for acceptance by P. Meer. IEEECS Log Number 110634. © 2001 IEEE.
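The difference between the nearest-point approximation and the exact integral over the model can be made concrete for a circle. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the circle is centered at the origin, the density of source points on the model is taken to be uniform along the circle, the common normalizing constant is dropped from both expressions, and the function names and the midpoint-rule integration are choices made here.

```python
import math

def nearest_point_likelihood(p, radius, sigma):
    """Approximation: only the closest circle point p* is allowed to explain p."""
    rho = math.hypot(p[0], p[1])  # distance of p from the circle center (origin)
    return math.exp(-((rho - radius) ** 2) / (2.0 * sigma ** 2))

def integrated_likelihood(p, radius, sigma, steps=2000):
    """Exact form: average the Gaussian term over every point on the circle.

    With a uniform density 1 / (2*pi*r) along the circle and arc element
    r * d(theta), the integral reduces to a mean over theta (midpoint rule).
    """
    total = 0.0
    for k in range(steps):
        theta = 2.0 * math.pi * (k + 0.5) / steps
        qx, qy = radius * math.cos(theta), radius * math.sin(theta)
        d2 = (p[0] - qx) ** 2 + (p[1] - qy) ** 2
        total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total / steps

point = (1.3, 0.4)  # hypothetical data point near a unit circle
for sigma in (0.05, 0.5, 2.0):
    approx = nearest_point_likelihood(point, 1.0, sigma)
    exact = integrated_likelihood(point, 1.0, sigma)
    print(sigma, approx, exact)
```

Under these assumptions the integrated value never exceeds the nearest-point value, since every point on the circle is at least as far from the data point as the closest one. The two coincide for a data point at the center, where all circle points are equidistant.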
Similar resources
Introducing of Dirichlet process prior in the Nonparametric Bayesian models frame work
Statistical models are used to learn about the mechanism that generates the data. Often it is assumed that the random variables y_i, i=1,…,n, are samples from a probability distribution F which belongs to a parametric class of distributions. However, in practice, a parametric model may be inappropriate to describe the data. In this setting, the parametric assumption could be r...
A Novel Bayesian Method for Fitting Parametric and Non-Parametric Models to Noisy Data
We offer a simple paradigm for fitting models, parametric and non-parametric, to noisy data, which resolves some of the problems associated with classic MSE algorithms. This is done by considering each point on the model as a possible source for each data point. The paradigm also allows one to solve problems which are not defined in the classical MSE approach, such as fitting a segment (as opposed to a l...
Bayesian Nonparametric and Parametric Inference
This paper reviews Bayesian Nonparametric methods and discusses how parametric predictive densities can be constructed using nonparametric ideas.
A Comparison of Thin Plate and Spherical Splines with Multiple Regression
Thin plate and spherical splines are nonparametric methods suitable for spatial data analysis. Thin plate splines yield efficient, practical, and high-precision solutions in spatial interpolation. Two components of the model fitting are considered: spatial deviations of the data and the model roughness. On the other hand, in parametric regression, the relationship between explanatory and response v...
Bayesian Sample size Determination for Longitudinal Studies with Continuous Response using Marginal Models
Introduction: Longitudinal study designs are common in many areas of scientific research, especially in the medical, social, and economic sciences. The reason is that longitudinal studies allow researchers to measure changes in each individual over time and often have higher statistical power than cross-sectional studies. Choosing an appropriate sample size is a crucial step in a successful study. A st...
Journal title:
- IEEE Trans. Pattern Anal. Mach. Intell.
Volume 23 Issue
Pages -
Publication date 2001