RP-growth: Top-k Mining of Relevant Patterns with Minimum Support Raising
Authors
Abstract
One practical inconvenience in frequent pattern mining is that it often yields a flood of common or uninformative patterns, so the minimum support must be adjusted carefully. To alleviate this inconvenience, this paper proposes RP-growth, an efficient algorithm based on FP-growth for top-k mining of discriminative patterns that are highly relevant to the class of interest. RP-growth conducts a branch-and-bound search using anti-monotonic upper bounds of relevance scores such as F-score and χ², and the pruning in the branch-and-bound search is translated to minimum support raising, a standard, easy-to-implement pruning strategy for top-k mining. Furthermore, by introducing a notion called weakness and an additional, aggressive pruning strategy based on weakness, RP-growth efficiently finds k patterns of wide variety and high relevance to the class of interest. Experimental results on text classification demonstrate the efficiency and usefulness of RP-growth.
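The translation from branch-and-bound pruning to minimum support raising can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain depth-first itemset enumeration instead of an FP-tree, it omits the weakness-based pruning, and all function names are hypothetical. It assumes the standard F-score F = 2·TP / (|P| + TP + FP), where |P| is the number of positive transactions. Under that assumption, every superpattern of a pattern with positive support TP scores at most 2·TP / (|P| + TP), so once the k-th best score is θ, any pattern with positive support below θ·|P| / (2 − θ) can be skipped together with all of its superpatterns.

```python
import heapq

def f_score(tp, fp, n_pos):
    # F1 = 2*TP / (|P| + TP + FP); assumed definition, not taken from the paper.
    return 2.0 * tp / (n_pos + tp + fp) if tp else 0.0

def f_upper_bound(tp, n_pos):
    # Best F1 any superpattern can reach: TP cannot grow, FP is at least 0.
    return 2.0 * tp / (n_pos + tp) if tp else 0.0

def raised_min_support(theta, n_pos):
    # Positive support below this value implies f_upper_bound < theta.
    return theta * n_pos / (2.0 - theta)

def top_k_relevant(transactions, labels, k):
    """Return up to k (score, pattern) pairs, best first (hypothetical helper)."""
    pos = [set(t) for t, y in zip(transactions, labels) if y]
    neg = [set(t) for t, y in zip(transactions, labels) if not y]
    n_pos = len(pos)
    items = sorted({i for t in pos for i in t})
    heap = []        # min-heap holding the current top-k candidates
    min_sup = 1.0    # minimum positive support, raised during the search

    def search(prefix, start, pos_cover, neg_cover):
        nonlocal min_sup
        for idx in range(start, len(items)):
            item = items[idx]
            pc = [t for t in pos_cover if item in t]
            tp = len(pc)
            if tp < min_sup:
                continue  # prunes this pattern and all of its superpatterns
            nc = [t for t in neg_cover if item in t]
            pattern = prefix + (item,)
            score = f_score(tp, len(nc), n_pos)
            if len(heap) < k:
                heapq.heappush(heap, (score, pattern))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, pattern))
            if len(heap) == k:
                # Raise the minimum support from the current k-th best score.
                theta = heap[0][0]
                min_sup = max(min_sup, raised_min_support(theta, n_pos))
            search(pattern, idx + 1, pc, nc)

    search((), 0, pos, neg)
    return sorted(heap, reverse=True)
```

Calling `top_k_relevant(transactions, labels, k)` with a list of item collections and a parallel list of boolean class labels returns the best-scoring patterns found. Because the threshold only grows as better patterns enter the heap, the search tightens itself without a user-supplied minimum support.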
Related resources
TGP: Mining Top-K Frequent Closed Graph Pattern without Minimum Support
In this paper, we propose a new mining task: mining top-k frequent closed graph patterns without minimum support. Most previous frequent graph pattern mining works require the specification of a minimum support threshold to perform the mining. However it is difficult for users to set a suitable value sometimes. We develop an efficient algorithm, called TGP, to mine patterns without minimum supp...
TFP-growth: An Efficient Algorithm for Mining Frequent Patterns without any Thresholds
Conventional frequent pattern mining algorithms require some user-specified minimum support, and then mine frequent patterns with support values that are higher than the minimum support. As it is difficult to predict how many frequent patterns will be mined with a specified minimum support, the Top-k mining concept has been proposed. The Top-k Mining concept is based on an algorithm for mining ...
ExMiner: An Efficient Algorithm for Mining Top-K Frequent Patterns
Conventional frequent pattern mining algorithms require users to specify some minimum support threshold. If the specified value is large, users may lose interesting information. In contrast, a small minimum support threshold results in a huge set of frequent patterns that users may not be able to screen for useful knowledge. To solve this problem and make algorithms more user-friendly, an idea...
Mining Top-K Click Stream Sequences Patterns
Sequential pattern mining is not only important in the data mining field but is also the basis of many applications. However, running these applications costs time and memory, especially when dealing with dense datasets. Setting the proper minimum support threshold is one of the factors that consume more memory and time. However, it is difficult for users to get the appropriate patterns; it may p...
Mining Top-K Frequent Closed Patterns without Minimum Support
In this paper, we propose a new mining task: mining top-k frequent closed patterns of length no less than min_ℓ, where k is the desired number of frequent closed patterns to be mined, and min_ℓ is the minimal length of each pattern. An efficient algorithm, called TFP, is developed for mining such patterns without minimum support. Two methods, closed node count and descendant sum are proposed to...
Journal title:
Volume, issue
Pages -
Publication date: 2012