Matrix Apriori: Speeding Up the Search for Frequent Patterns

Authors

  • Judith Pavón
  • Sidney Viana
  • Santiago Gómez
Abstract

This work addresses the problem of generating association rules from a set of transactions in a relational database, taking performance and accuracy of the results as the essential criteria for comparing association mining algorithms. We critically analyze two existing methods, Apriori and FP-growth, emphasizing their strengths and weaknesses, and based on this analysis we propose an algorithm called Matrix Apriori that combines the best features of both. Matrix Apriori uses simple structures such as matrices and vectors to generate frequent patterns and minimizes the number of candidate sets, thus achieving more efficient computation than Apriori and FP-growth. The proposed algorithm can easily be extended to incorporate multiple user-defined minimum supports, with the aim of improving the method's efficacy.
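To make the matrix-based idea concrete, here is a minimal sketch of level-wise frequent itemset mining over a binary transaction matrix. It is a plain Apriori-style search and should not be read as the authors' Matrix Apriori algorithm, whose details are not given in the abstract. The function names (`build_matrix`, `support`, `frequent_itemsets`) and the sample data are hypothetical, and a single absolute minimum support is assumed.

```python
from itertools import combinations


def build_matrix(transactions, items):
    """Binary matrix: one row per transaction, one column per item."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def support(matrix, columns):
    """Count the rows in which every listed column holds a 1."""
    return sum(1 for row in matrix if all(row[c] for c in columns))


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support.

    Illustrative Apriori-style search, not the paper's method.
    """
    items = sorted({i for t in transactions for i in t})
    matrix = build_matrix(transactions, items)

    # Level 1: keep only columns whose item is frequent on its own.
    current = [(c,) for c in range(len(items))
               if support(matrix, (c,)) >= min_support]
    result = {}
    k = 1
    while current:
        for cols in current:
            result[frozenset(items[c] for c in cols)] = support(matrix, cols)

        # Join frequent k-itemsets into (k+1)-candidates and prune any
        # candidate that has an infrequent k-subset.
        frequent = set(current)
        candidates = set()
        for a in current:
            for b in current:
                union = tuple(sorted(set(a) | set(b)))
                if len(union) == k + 1 and all(
                    sub in frequent for sub in combinations(union, k)
                ):
                    candidates.add(union)

        current = [c for c in sorted(candidates)
                   if support(matrix, c) >= min_support]
        k += 1
    return result


if __name__ == "__main__":
    sample = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in frequent_itemsets(sample, min_support=3).items():
        print(sorted(itemset), count)
```

The subset-pruning step is what keeps the candidate set small. Per-item thresholds, as the abstract suggests for multiple minimum supports, would replace the single `min_support` comparison.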


Similar resources

A fast Algorithm for mining fuzzy frequent itemsets

In this paper, a fuzzy frequent itemset (FFI)-Miner algorithm is developed to mine the complete set of FFIs without candidate generation. It uses a novel fuzzy-list structure to keep the essential information for the later mining process. An efficient pruning strategy is also developed to reduce the search space, thus speeding up the mining process to directly discover the FFIs. Experiments are con...


Max-FTP: Mining Maximal Fault-Tolerant Frequent Patterns from Databases

Mining Fault-Tolerant (FT) Frequent Patterns in real-world (dirty) databases is considered a fruitful direction for future data mining research. In the last couple of years, a number of different algorithms have been proposed on the basis of the Apriori-FT frequent pattern mining concept. The main limitation of these existing FT frequent pattern mining algorithms is that they try to find all FT f...


CPM Algorithm for Mining Association Rules from Databases of Engineering Design Instances

In this paper, we propose an algorithm for mining association rules based on transaction combination, attribute combination, pattern comparison and comparative pattern mapping (CPM), aimed at databases with a large number of attributes but a small number of transactions, which are common in engineering design. There are four main steps in the CPM algorithm. First, it scans and expands the d...


Comparative Analysis of Various Approaches Used in Frequent Pattern Mining

Frequent pattern mining has become an important data mining task and a central theme in data mining research. Frequent patterns are patterns that appear frequently in a data set. Frequent pattern mining searches for recurring relationships in a given data set. Various techniques have been proposed to improve the performance of frequent pattern mining algorithms. This paper presents revi...


A New Approach for Extracting Closed Frequent Patterns and their Association Rules using Compressed Data Structure

In data mining, frequent pattern extraction is widely used for finding association rules. Generally, association rule mining approaches are applied bottom-up or top-down on a compressed data structure. In the past, different works have proposed different approaches to mine frequent patterns from given databases. In this paper, we propose a new approach by applying the closed & inter...




Publication date: 2006