Internal Evaluation of Unsupervised Outlier Detection
Similar resources
Unsupervised Outlier Profile Analysis
In much of the analysis of high-throughput genomic data, "interesting" genes have been selected based on assessment of differential expression between two groups or generalizations thereof. Most of the literature focuses on changes in mean expression or the entire distribution. In this article, we explore the use of C(α) tests, which have been applied in other genomic data settings. Their use f...
FP-outlier: Frequent pattern based outlier detection
An outlier in a dataset is an observation or a point that is considerably dissimilar to or inconsistent with the remainder of the data. Detection of such outliers is important for many applications and has recently attracted much attention in the data mining research community. In this paper, we present a new method to detect outliers by discovering frequent patterns (or frequent itemsets) from...
A comparative evaluation of outlier detection algorithms: Experiments and analyses
We survey unsupervised machine learning algorithms in the context of outlier detection. This task challenges state-of-the-art methods from a variety of research fields to applications including fraud detection, intrusion detection, medical diagnoses and data cleaning. The selected methods are benchmarked on publicly available datasets and novel industrial datasets. Each method is then submitted...
Evaluation of Different Outlier Detection Methods for GPS Networks
GPS (Global Positioning System) devices can be used in many applications which require accurate point positioning in geosciences. Accuracy of GPS decreases due to outliers resulting from the errors inherent in GPS observations. Several approaches have been developed to detect outliers in geodetic observations. It is important to determine which method is most effective at distinguishing outliers...
Outlier Detection by Boosting Regression Trees
A procedure for detecting outliers in regression problems is proposed. It is based on information provided by boosting regression trees. The key idea is to select the most frequently resampled observation along the boosting iterations and reiterate after removing it. The selection criterion is based on Tchebychev’s inequality applied to the maximum over the boosting iterations of ...
Journal
Journal title: ACM Transactions on Knowledge Discovery from Data
Year: 2020
ISSN: 1556-4681,1556-472X
DOI: 10.1145/3394053