Outlier mining in high-dimensional data using the Jensen–Shannon divergence and graph structure analysis
Abstract
Reliable anomaly/outlier detection algorithms have practical applications in many fields. For instance, anomaly detection makes it possible to filter and clean the data used to train machine learning algorithms, improving their performance. However, outlier mining is challenging when the data is high-dimensional, and different approaches have been proposed for different types of data (temporal, spatial, network, etc.). Here we propose a methodology to mine outliers in generic datasets in which it is possible to define a meaningful distance between elements of the dataset. The methodology is based on defining a fully connected, undirected graph, where the nodes are the elements of the dataset and the links have weights that are the distances between the nodes. Outlier scores are defined by analyzing the structure of the graph, in particular, by using the Jensen–Shannon (JS) divergence to compare the distributions of weights of different nodes. We demonstrate the method using a publicly available database of credit-card transactions, where some of the transactions are labeled as frauds. We compare with the performance obtained when using Euclidean distances and graph percolation, and show that the JS divergence leads to a performance improvement, but increases the computational cost.
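The graph construction and JS-based scoring described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact procedure: the abstract does not say how the distributions are estimated or how divergences are turned into a score, so the Euclidean distance, the histogram with `n_bins` bins, and the choice to compare each node against the pooled distribution of all nodes are assumptions made here. The function names `js_divergence` and `js_outlier_scores` are hypothetical.

```python
import numpy as np


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2, so the value lies in [0, 1])."""
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing to the sum
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def js_outlier_scores(X, n_bins=20):
    """Score each row of X by how atypical its distribution of link weights is.

    Assumption: the score is the JS divergence between a node's histogram of
    link weights and the histogram pooled over all nodes.
    """
    n = X.shape[0]
    # Fully connected, undirected graph: weights[i, j] is the Euclidean
    # distance between elements i and j. This uses O(n^2 * d) memory, so the
    # sketch is only suitable for small datasets.
    diff = X[:, None, :] - X[None, :, :]
    weights = np.sqrt((diff ** 2).sum(axis=-1))

    # Shared bin edges so that all node histograms are comparable.
    edges = np.linspace(0.0, weights.max(), n_bins + 1)
    off_diag = ~np.eye(n, dtype=bool)  # exclude self-links
    hists = np.array(
        [np.histogram(weights[i, off_diag[i]], bins=edges)[0] for i in range(n)],
        dtype=float,
    )

    pooled = hists.sum(axis=0)
    return np.array([js_divergence(h, pooled) for h in hists])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 50 points from one cluster plus a single far-away point (index 50).
    X = np.vstack([rng.normal(size=(50, 5)), np.full((1, 5), 10.0)])
    scores = js_outlier_scores(X)
    print("most anomalous index:", int(np.argmax(scores)))
```

Computing all pairwise distances and one histogram per node scales quadratically with the number of elements, which is consistent with the abstract's remark that the JS-based approach increases the computational cost.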
Similar resources
the clustering and classification data mining techniques in insurance fraud detection: the case of iranian car insurance
Given the ever-increasing spread of fraud in the insurance sector, especially in car insurance, and its negative consequences for insurance companies, applying suitable and efficient methods to identify and detect fraud in this area is essential. Understanding the patterns in data on previously reported claims can help determine whether a damage claim is genuine or not. One of the most common and widely used ways of discovering patterns in data is the use of ...
data mining rules and classification methods in insurance: the case of collision insurance
In Iran, the premium assigned to an insurance contract is mostly based on old rules authorized by the government; in such a situation, predicting the premium by analyzing the database and its characteristics would definitely be a big mistake. Therefore, the most beneficial information one can gather from these data is the amount of loss that occurs during a contract, for predicting insurance ...
Detecting Outlier in Graph Structure Data Using Centrality
This study describes an outlier detection technique for graph structure data that uses the centrality index. Existing techniques set thresholds for link and node regularity. However, existing techniques are not objective and do not apply to data without the link strength information. Therefore, we pay attention to centrality, which is an index used in network analysis. We perform outlier detect...
Journal
Journal title: Journal of physics
Year: 2022
ISSN: 0022-3700, 1747-3721, 0368-3508, 1747-3713
DOI: https://doi.org/10.1088/2632-072x/aca94a