Similar resources
Efficiently Mining Maximal Frequent Itemsets
We present GenMax, a backtrack search based algorithm for mining maximal frequent itemsets. GenMax uses a number of optimizations to prune the search space. It uses a novel technique called progressive focusing to perform maximality checking, and diffset propagation to perform fast frequency computation. Systematic experimental comparison with previous work indicates that different methods have...
Optimizing inductive queries in frequent itemsets mining
Let Q = {Q1, . . . , Qn} be a set of past queries and let R = {R1, . . . , Rn} be their results. Moreover, let Q0 be a query newly submitted to the system and let R0 be its result. The task of optimizing the extraction of R0 using the knowledge provided by Q and R has been addressed with two distinct approaches. In the first approach we search for a query Qi ∈ Q, such that R0 ⊆ Ri (in such a ...
Distributed Frequent Itemsets Mining in Heterogeneous Platforms
Large numbers of datasets of different sizes are naturally distributed over the network. In this paper we propose a distributed algorithm for frequent itemsets generation on heterogeneous clusters and grid environments. In addition to the disparity in performance and workload capacity in these environments, other constraints are related to the datasets distribution and their nature, an...
Efficiently Mining Frequent Itemsets in Transactional Databases
Discovering frequent itemsets is an essential task in association rule mining, and it is considered computationally expensive. Frequent pattern growth (FP-growth) is one of the best algorithms for finding frequent itemsets. However, many experimental results have shown that building conditional FP-trees during mining data using this FP-growth...
Set Overlap in Mining of Frequent Itemsets
An important module of soft computing methods is the set overlap operation. If a query set is tested against a large pool of source sets, signature-based or inverted-file methods are used to reduce the cost of the operation. The paper introduces a modified version of the inverted-file approach, which yields the lowest cost for sparse input samples, i.e. where the number of records containing a...
Journal
Journal title: Information Systems
Year: 2014
ISSN: 0306-4379
DOI: 10.1016/j.is.2012.01.005