Interestingness Measures for Association Patterns: A Perspective

Authors

  • Pang-Ning Tan
  • Vipin Kumar
Abstract

Association rules are valuable patterns because they offer useful insight into the types of dependencies that exist between attributes of a data set. Due to the completeness of algorithms such as Apriori, the number of patterns extracted is often very large. Therefore, there is a need to prune or rank the discovered patterns according to their degree of interestingness. In this paper, we will examine the various interestingness measures proposed in the statistics, machine learning and data mining literature. We will compare these measures and investigate how closely they reflect the statistical notion of correlation. We will show that support-based pruning, which is often used in association rule discovery, is appropriate because it removes mostly uncorrelated and negatively correlated patterns. Our experimental results verified that many of the intuitive measures (such as Piatetsky-Shapiro's rule-interest, confidence, Laplace, entropy gain, etc.) are very similar in nature to the correlation coefficient (in the region of support values typically encountered in practice). Finally, we will introduce a new metric, called the IS measure, and show that it is highly linear with respect to the correlation coefficient for many interesting association patterns.
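To make the measures named in the abstract concrete, the following is a minimal sketch of how a few of them could be computed for a rule A -> B from a 2x2 contingency table of counts. The function name `association_measures` and its parameters are hypothetical. The abstract does not define the IS measure, so the sketch assumes the cosine-style form P(A,B) / sqrt(P(A) P(B)), and it assumes the probability form P(A,B) - P(A) P(B) for Piatetsky-Shapiro's rule-interest.

```python
import math


def association_measures(n11, n10, n01, n00):
    """Illustrative measures for a rule A -> B from 2x2 contingency counts.

    n11: transactions containing both A and B
    n10: transactions containing A but not B
    n01: transactions containing B but not A
    n00: transactions containing neither
    Assumes A and B each occur in some, but not all, transactions.
    """
    n = n11 + n10 + n01 + n00
    p_ab = n11 / n            # joint probability P(A,B), i.e. the support
    p_a = (n11 + n10) / n     # marginal P(A)
    p_b = (n11 + n01) / n     # marginal P(B)

    confidence = p_ab / p_a
    # Assumed probability form of Piatetsky-Shapiro's rule-interest.
    rule_interest = p_ab - p_a * p_b
    # Correlation (phi) coefficient for two binary variables.
    phi = rule_interest / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    # Assumed definition of IS (not stated in the abstract).
    is_measure = p_ab / math.sqrt(p_a * p_b)

    return {
        "support": p_ab,
        "confidence": confidence,
        "rule_interest": rule_interest,
        "phi": phi,
        "IS": is_measure,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in association_measures(30, 10, 20, 40).items():
        print(f"{name}: {value:.4f}")
```

Under these assumed definitions, statistically independent items give a rule-interest and correlation of zero, while IS stays positive because it depends only on the joint and marginal probabilities.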

Similar resources

Defining Interestingness for Association Rules

Interestingness in Association Rules has been a major topic of research in the past decade. The reason is that the strength of association rules, i.e. its ability to discover ALL patterns given some thresholds on support and confidence, is also its weakness. Indeed, a typical association rules analysis on real data often results in hundreds or thousands of patterns creating a data mining proble...


Interestingness Measures for Rare Association Rules and Periodic-Frequent Patterns

Data mining is the process of discovering significant and potentially useful knowledge in the form of patterns from the data. As a result, the notion of interestingness is very important for extracting useful knowledge patterns. Numerous interestingness measures have been discussed in the literature to assess the interestingness of a knowledge pattern. In this thesis, we focus on selecting a ri...


ARQAT: An Exploratory Analysis Tool For Interestingness Measures

Finding interestingness measures to evaluate association rules has become an important knowledge quality issue in KDD. Many interestingness measures may be found in the literature, and many authors have discussed and compared interestingness properties in order to help choose the best measures for a given application. As interestingness depends both on the data structure and on the decision-mak...


Numeric Multi-Objective Rule Mining Using Simulated Annealing Algorithm

Abstract as a single objective one. Measures like support, confidence and other interestingness criteria which are used for evaluating a rule, can be thought of as different objectives of association rule mining problem. Support count is the number of records, which satisfies all the conditions that exist in the rule. This objective represents the accuracy of the rules extracted from the da...


Development of Subjective Measures of Interestingness: From Unexpectedness to Shocking

Knowledge Discovery of Databases (KDD) is the process of extracting previously unknown but useful and significant information from large massive volume of databases. Data Mining is a stage in the entire process of KDD which applies an algorithm to extract interesting patterns. Usually, such algorithms generate huge volume of patterns. These patterns have to be evaluated by using interestingness...


Publication date: 2000