Summarizing data succinctly with the most informative itemsets
Authors
Abstract
Similar resources
Data Summarization with Informative Itemsets
Data analysis is an inherently iterative process. That is, what we know about the data greatly determines our expectations, and hence, what result we would find the most interesting. With this in mind, we introduce a well-founded approach for succinctly summarizing data with a collection of informative itemsets; using a probabilistic maximum entropy model, we iteratively find the most interesti...
Distributed Submodular Cover: Succinctly Summarizing Massive Data
How can one find a subset, ideally as small as possible, that well represents a massive dataset? I.e., its corresponding utility, measured according to a suitable utility function, should be comparable to that of the whole dataset. In this paper, we formalize this challenge as a submodular cover problem. Here, the utility is assumed to exhibit submodularity, a natural diminishing returns condit...
Summarizing Frequent Itemsets via Pignistic Transformation
Since the proposal of the well-known Apriori algorithm and the subsequent establishment of the area known as Frequent Itemset Mining, most of the scientific contributions in the data mining area have focused on the study of methods that improve its efficiency and its applicability in new domains. The interest in the extraction of this sort of pattern lies in its expressiveness and syntacti...
Mining N-most Interesting Itemsets
Previous methods on mining association rules require users to input a minimum support threshold. However, there can be too many or too few resulting rules if the threshold is set inappropriately. It is difficult for end-users to find the suitable threshold. In this paper, we propose a different setting in which the user does not provide a support threshold, but instead indicates the amount of result...
Mining Frequent Most Informative Subgraphs
The main practical problem encountered with frequent subgraph search methods is the tens of thousands of returned graph patterns, which make visual analysis impossible. To address this problem, a very restricted family of relevant graph patterns, called the most informative patterns, is introduced, along with an algorithm to mine them and associated experimental results. In graph-based ...
Journal
Journal title: ACM Transactions on Knowledge Discovery from Data
Year: 2012
ISSN: 1556-4681,1556-472X
DOI: 10.1145/2382577.2382580