New Algorithms for Fast Discovery of Association Rules

Authors

  • Mohammed J. Zaki
  • Srinivasan Parthasarathy
  • Mitsunori Ogihara
  • Wei Li
Abstract

Association rule discovery has emerged as an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent itemsets, and then forming conditional implication rules among them. In this paper we present efficient algorithms for the discovery of frequent itemsets, which forms the compute-intensive phase of the task. The algorithms utilize the structural properties of frequent itemsets to facilitate fast discovery. The related database items are grouped together into clusters representing the potential maximal frequent itemsets in the database. Each cluster induces a sub-lattice of the itemset lattice. Efficient lattice traversal techniques are presented, which quickly identify all the true maximal frequent itemsets, and all their subsets if desired. We also present the effect of using different database layout schemes combined with the proposed clustering and traversal techniques. The proposed algorithms scan a (pre-processed) database only once, addressing the open question in association mining, whether all the rules can be efficiently extracted in a single database pass. We experimentally compare the new algorithms against the previous approaches, obtaining improvements of more than an order of magnitude for our test databases. The University of Rochester Computer Science Department supported this work. This work was supported in part by an NSF Research Initiation Award (CCR-9409120) and ARPA contract (F19628-94-C-0057).
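The two phases named in the abstract (finding frequent itemsets, then forming implication rules) can be illustrated with a minimal sketch. This is not the paper's cluster-based lattice traversal: it is a plain level-wise search, and it assumes a vertical layout (each item mapped to the set of transaction ids containing it) as one possible database layout. The function names `frequent_itemsets` and `association_rules`, and the thresholds `min_support` and `min_confidence`, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting min_support.

    Illustrative level-wise search over a vertical (item -> tid set) layout;
    an assumption for this sketch, not the algorithm from the paper.
    """
    # Build the vertical layout in one pass over the transactions.
    tidlists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(frozenset([item]), set()).add(tid)

    # Keep only the frequent single items.
    current = {s: t for s, t in tidlists.items() if len(t) >= min_support}
    supports = {s: len(t) for s, t in current.items()}

    while current:
        next_level = {}
        for a, b in combinations(list(current), 2):
            union = a | b
            # Join only pairs that differ in exactly one item.
            if len(union) != len(a) + 1 or union in next_level:
                continue
            # Support of the union is the size of the tid-set intersection.
            tids = current[a] & current[b]
            if len(tids) >= min_support:
                next_level[union] = tids
        supports.update({s: len(t) for s, t in next_level.items()})
        current = next_level
    return supports


def association_rules(supports, min_confidence):
    """Form rules X -> Y from frequent itemsets, keeping confident ones."""
    rules = []
    for itemset, support in supports.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so its support is always present in the table.
                confidence = support / supports[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
supports = frequent_itemsets(transactions, min_support=3)
rules = association_rules(supports, min_confidence=0.75)
```

After the single pass that builds the tid sets, supports are computed purely by set intersection, which loosely mirrors the abstract's point that a pre-processed database need only be scanned once.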


Similar resources

The Introduction of a Heuristic Mutation Operator to Strengthen the Discovery Component of XCS

The extended classifier system (XCS) tries to solve learning problems online by producing a set of rules (classifiers). XCS is a rather complex combination of a genetic algorithm and reinforcement learning: it uses the genetic algorithm to discover encouraging rules and values them by reinforcement learning. Among the important factors in the performance of XCS is the possibility to...


ADtrees for Fast Counting and for Fast Learning of Association Rules

The problem of discovering association rules in large databases has received considerable research attention. Much research has examined the exhaustive discovery of all association rules involving positive binary literals (e.g. Agrawal et al. 1996). Other research has concerned finding complex association rules for high-arity attributes such as CN2 (Clark and Niblett 1989). Complex association ...


Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...


New Approach to Optimize the Time of Association Rules Extraction

Knowledge discovery algorithms have become ineffective given the abundance of data, and fast algorithms or optimization methods are required. To address this limitation, the objective of this work is to adapt a new method for optimizing the time of association rule extraction from large databases. Indeed, given a relational database (one relation) represented as a set of tuples, als...



Publication date: 1997