Discovery of feature-based hot spots using supervised clustering

Authors

  • Wei Ding
  • Tomasz F. Stepinski
  • Rachana Parmar
  • Dan Jiang
  • Christoph F. Eick

Abstract

Feature-based hot spots are localized regions where the attributes of objects attain high values. There is considerable interest in the automatic identification of feature-based hot spots. This paper approaches the problem of finding feature-based hot spots from a data mining perspective and describes a method that relies on supervised clustering to produce a list of hot spot regions. Supervised clustering uses a fitness function that rewards isolation of the hot spots to optimally subdivide the dataset. The clusters in the optimal division are ranked by an interestingness measure that captures their utility as hot spots. Hot spots are associated with the top-ranked clusters. The effectiveness of supervised clustering as a hot spot identification method is evaluated for four conceptually different clustering algorithms using a dataset describing the spatial distribution of ground ice on Mars. Clustering solutions are visualized by specially developed raster approximations, which are also used to further assess the ability of the different algorithms to yield hot spots. A density-based clustering algorithm is found to be the most effective for hot spot identification. The results of hot spot discovery by supervised clustering are comparable to those obtained using the G statistic, but the new method offers a high degree of automation, making it well suited to mining large datasets for potential hot spots. © 2009 Elsevier Ltd. All rights reserved.
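
The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fitness function or the interestingness measure, so everything specific here is an assumption made for illustration: the `interestingness` function (mean feature value above a threshold), the size exponent `beta`, the `threshold` value, and the toy data are all hypothetical and are not taken from the paper. The cluster labels are assumed to come from some clustering algorithm run beforehand.

```python
from collections import defaultdict
from statistics import mean


def interestingness(values, threshold=1.0):
    """Hypothetical measure: amount by which the cluster's mean
    feature value exceeds a threshold (0 if it does not)."""
    return max(0.0, mean(values) - threshold)


def rank_clusters(labels, values, beta=1.1, threshold=1.0):
    """Score each cluster and rank clusters by their reward.

    The reward form (interestingness times size**beta) is an
    assumption for this sketch, not the paper's stated formula.
    """
    clusters = defaultdict(list)
    for label, value in zip(labels, values):
        clusters[label].append(value)

    rewards = {
        label: interestingness(vals, threshold) * len(vals) ** beta
        for label, vals in clusters.items()
    }
    fitness = sum(rewards.values())  # overall quality of this clustering
    ranking = sorted(rewards, key=rewards.get, reverse=True)
    return fitness, ranking, rewards


# Toy data: cluster labels and one feature value per object (made up).
labels = ["a", "a", "a", "b", "b", "c", "c", "c"]
values = [2.5, 3.0, 2.8, 0.2, 0.4, 1.5, 1.2, 1.6]

fitness, ranking, rewards = rank_clusters(labels, values)
print(ranking)  # clusters ordered from most to least hot-spot-like
```

In this sketch the top entries of `ranking` play the role of hot spot candidates, and `fitness` is the quantity a supervised clustering algorithm would try to maximize when choosing among alternative subdivisions of the data.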


Similar resources

Composite Kernel Optimization in Semi-Supervised Metric

Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...


Supervised Feature Extraction of Face Images for Improvement of Recognition Accuracy

Dimensionality reduction methods transform or select a low-dimensional feature space to efficiently represent the original high-dimensional feature space of the data. Feature reduction is an important step in many pattern recognition problems across different fields, especially in the analysis of high-dimensional data. Hyperspectral images are acquired by remote sensors and human face images ar...


A Social Interaction Model For Crime Hot Spots

A common feature of mapped crime patterns is a strong spatial and temporal clustering into crime “hot spots”. In this paper we explore a social interaction model for the evolution of the attractiveness of the crime environment for criminal activity. We see how hot spots may arise when the idiosyncratic attractiveness of the environment is not encouraging for criminal activity. The stability of ...


Pattern mining analysis of pulmonary TB cases in Hamadan province: Using space-time cube

Background and aims: One of the most common approaches to understanding spatial and temporal trends in event data is to break the data up into a series of time snapshots. The space-time cube method was therefore applied to portray the likely trend in the occurrence of pulmonary tuberculosis (TB) cases. Methods: In this study, information of all patients with pul...


Using Supervised Clustering Technique to Classify Received Messages in 137 Call Center of Tehran City Council

Supervised clustering is a data mining technique that assigns a set of data to predefined classes by analyzing dataset attributes. It is considered an important technique for information retrieval, management, and mining in information systems. Since customer satisfaction is the main goal of organizations in modern society, to meet the requirements, 137 call center of Tehran city council is ...



Journal:
  • Computers & Geosciences

Volume 35

Publication date: 2009