Outlier Detection with Globally Optimal Exemplar-Based GMM

نویسندگان

  • Xingwei Yang
  • Longin Jan Latecki
  • Dragoljub Pokrajac
چکیده

Outlier detection has recently become an important problem in many data mining applications. In this paper, a novel unsupervised algorithm for outlier detection is proposed. First we apply a provably globally optimal Expectation Maximization (EM) algorithm to fit a Gaussian Mixture Model (GMM) to a given data set. In our approach, a Gaussian is centered at each data point, and hence, the estimated mixture proportions can be interpreted as probabilities of being a cluster center for all data points. The outlier factor at each data point is then defined as a weighted sum of the mixture proportions with weights representing the similarities to other data points. The proposed outlier factor is thus based on global properties of the data set. This is in contrast to most existing approaches to outlier detection, which are strictly local. Our experiments performed on several simulated and real life data sets demonstrate superior performance of the proposed approach. Moreover, we also demonstrate the ability to detect unusual shapes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Exemplar-based Image Completion methods using Selecting the Optimal Patch

Image completion is one of the subjects in image and video processing which deals with restoration of and filling in damaged regions of images using correct regions. Exemplar-based image completion methods give more pleasant results than pixel-based approaches. In this paper, a new algorithm is proposed to find the most suitable patch in order to fill in the damaged parts. This patch selection ...

متن کامل

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data, often, modeled using vector autoregressive moving average (VARMA) model. But presence of outliers can violates the stationary assumption and may lead to wrong modeling, biased estimation of parameters and inaccurate prediction. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

متن کامل

Outlier Detection Using Extreme Learning Machines Based on Quantum Fuzzy C-Means

One of the most important concerns of a data miner is always to have accurate and error-free data. Data that does not contain human errors and whose records are full and contain correct data. In this paper, a new learning model based on an extreme learning machine neural network is proposed for outlier detection. The function of neural networks depends on various parameters such as the structur...

متن کامل

Effective Image and Video Error Concealment using RST-Invariant Partial Patch Matching Model and Exemplar-based Inpainting

An effective visual error concealment method has been presented by employing a robust rotation, scale, and translation (RST) invariant partial patch matching model (RSTI-PPMM) and exemplar-based inpainting. While the proposed robust and inherently feature-enhanced texture synthesis approach ensures the generation of excellent and perceptually plausible visual error concealment results, the outl...

متن کامل

Optimal Feature Based Density Clustering for Outlier Detection in Multivariate Data

Efficient outlier detection in a large-sized big data environment incurs much of complexity in processing the information and to handle it in a proficient way. For segregating outliers from those normal data items, many of the prevailing methodologies experiences complexity in accordance with the features involved in every single attribute. On recognizing appropriate features associated the cha...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009