Mining Best Closed Itemsets for Projection-antimonotonic Constraints in Polynomial Time

Authors

  • Aleksey Buzmakov
  • Sergei O. Kuznetsov
  • Amedeo Napoli
Abstract

The exponential explosion of the set of patterns is one of the main challenges in pattern mining. This challenge is approached by introducing a constraint for pattern selection. One of the first constraints proposed in pattern mining is the support (frequency) of a pattern in a dataset. Frequency is an anti-monotonic function, i.e., given an infrequent pattern, none of its superpatterns is frequent. However, many other constraints for pattern selection are neither monotonic nor anti-monotonic, which makes it difficult to generate patterns satisfying these constraints. In order to deal with nonmonotonic constraints, we introduce the notion of “projection antimonotonicity” and the Σοφια algorithm, which allow generating the best patterns for a class of nonmonotonic constraints. Cosine interest, robustness, stability of closed itemsets, and the associated Δ-measure are among these constraints. Σοφια starts from light descriptions of the transactions in a dataset (a small set of items in the case of itemset descriptions) and then iteratively adds more information to these descriptions (more items, with an indication of the tidsets they describe). In the experiments, we compute the best itemsets w.r.t. some measures and show the advantage of our approach over postpruning approaches.
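
As a rough illustration of the notions named in the abstract, the sketch below computes support, tidsets, and closures on a toy dataset and checks that support never grows when an item is added. It is not the Σοφια algorithm. The transactions, all function names, and the reading of Δ as the smallest drop in support when one more item is added to a closed itemset are assumptions made for this example and are not taken from the abstract.

```python
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c", "d"},
]
ITEMS = set().union(*TRANSACTIONS)


def tidset(itemset):
    """Indices of the transactions that contain every item of `itemset`."""
    return {t for t, row in enumerate(TRANSACTIONS) if set(itemset) <= row}


def support(itemset):
    """Number of transactions containing `itemset`."""
    return len(tidset(itemset))


def closure(itemset):
    """Largest itemset shared by all transactions that contain `itemset`."""
    tids = tidset(itemset)
    if not tids:
        return set(ITEMS)
    return set.intersection(*(TRANSACTIONS[t] for t in tids))


def delta(closed_itemset):
    """Assumed reading of the Delta-measure: the smallest loss of support
    caused by adding a single item to a closed itemset."""
    base = support(closed_itemset)
    rest = ITEMS - set(closed_itemset)
    drops = (base - support(set(closed_itemset) | {i}) for i in rest)
    return min(drops, default=base)


def support_is_antimonotonic():
    """Exhaustively check that adding an item never increases support."""
    for k in range(1, len(ITEMS) + 1):
        for combo in combinations(sorted(ITEMS), k):
            for item in ITEMS - set(combo):
                if support(set(combo) | {item}) > support(combo):
                    return False
    return True


if __name__ == "__main__":
    print(support({"a"}), support({"a", "b"}))
    print(sorted(closure({"d"})))
    print(delta({"a"}))
    print(support_is_antimonotonic())
```

The exhaustive check is only feasible because the toy dataset has four items; it is meant to make the anti-monotonicity of support concrete, not to suggest a mining strategy.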

Similar resources

Fast Algorithms for Mining Generalized Frequent Patterns of Generalized Association Rules

Mining generalized frequent patterns of generalized association rules is an important process in knowledge discovery systems. In this paper, we propose a new approach for efficiently mining all frequent patterns using a novel set enumeration algorithm with two types of constraints on two generalized itemset relationships, called subset-superset and ancestor-descendant constraints. We also show a...

CLOLINK: An Adapted Algorithm for Mining Closed Frequent Itemsets

Mining of the complete set of frequent itemsets will lead to a huge number of itemsets. Fortunately, this problem can be reduced to the mining of closed frequent itemsets, which results in a much smaller number of itemsets. Methods for efficient mining of closed frequent itemsets have been studied extensively by many researchers using various strategies to prove their efficiencies such as Aprio...

CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets

Association mining may often derive an undesirably large set of frequent itemsets and association rules. Recent studies have proposed an interesting alternative: mining frequent closed itemsets and their corresponding rules, which has the same power as association mining but substantially reduces the number of rules to be presented. In this paper, we propose an efficient algorithm, CLOSET, for mi...

Mining Frequent Itemsets Using Support Constraints

Interesting patterns often occur at varied levels of support. The classic association mining based on a uniform minimum support, such as Apriori, either misses interesting patterns of low support or suffers from the bottleneck of itemset generation. A better solution is to exploit support constraints, which specify what minimum support is required for what itemsets, so that only necessary itemse...

Journal title:
  • CoRR

Volume abs/1703.09513

Publication date 2017