Clustering Frequent Graph Patterns
Abstract
In recent years, graph mining has attracted much attention in the data mining community, and several efficient frequent subgraph mining algorithms have been proposed. However, the number of frequent graph patterns these algorithms generate may be too large for users to explore effectively, especially when the support threshold is low. In this paper, we propose to summarize frequent graph patterns by a much smaller number of representative graph patterns. Several novel concepts, such as the δ-cover graph, the jump value and the δ-jump pattern, are proposed for efficiently summarizing frequent graph patterns. Based on the property that all δ-jump patterns are representative graph patterns, we propose two efficient algorithms for summarizing frequent graph patterns, RP-FP and RP-GD. RP-FP computes representative graph patterns from a set of closed frequent graph patterns, whereas RP-GD mines representative graph patterns directly from graph databases. RP-FP has a tight approximation (summarization) bound but higher computational complexity; RP-GD has no approximation bound guarantee but is far more efficient. We conducted extensive experimental studies on various real and synthetic datasets. The results show that both RP-FP and RP-GD obtain compact summarizations on real and synthetic graph databases. When the number of closed graph patterns is very large, RP-GD is much more efficient than RP-FP while achieving comparable summarization quality.
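To make the idea of summarizing patterns by representatives concrete, the following is a minimal sketch and not the paper's RP-FP or RP-GD algorithm. It rests on several assumptions that the abstract does not state: a pattern is simplified to a set of labelled edges, so a subset test stands in for subgraph isomorphism; a representative is assumed to cover a pattern when the pattern is contained in it and `1 - support(rep) / support(pat)` is at most a tolerance `delta`; and representatives are chosen by the standard greedy set-cover heuristic. All function and variable names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Edge = Tuple[str, str]
Pattern = FrozenSet[Edge]  # simplification: a pattern is a set of labelled edges


def delta_covers(rep: Pattern, pat: Pattern,
                 support: Dict[Pattern, int], delta: float) -> bool:
    """Assumed cover test: pat is contained in rep and the relative
    support drop from pat to rep is at most delta."""
    if not pat <= rep:  # subset test stands in for subgraph isomorphism
        return False
    return 1.0 - support[rep] / support[pat] <= delta


def greedy_representatives(support: Dict[Pattern, int],
                           delta: float) -> List[Pattern]:
    """Greedy set cover: repeatedly pick the pattern that covers the
    largest number of still-uncovered patterns."""
    patterns = list(support)
    covered_by: Dict[Pattern, Set[Pattern]] = {
        rep: {p for p in patterns if delta_covers(rep, p, support, delta)}
        for rep in patterns
    }
    uncovered: Set[Pattern] = set(patterns)
    representatives: List[Pattern] = []
    while uncovered:
        # Every pattern covers itself, so each round covers at least one.
        best = max(patterns, key=lambda r: len(covered_by[r] & uncovered))
        representatives.append(best)
        uncovered -= covered_by[best]
    return representatives


# Toy input: three nested patterns with made-up support counts.
p1 = frozenset({("a", "b")})
p2 = frozenset({("a", "b"), ("b", "c")})
p3 = frozenset({("a", "b"), ("b", "c"), ("c", "d")})
toy_support = {p1: 10, p2: 9, p3: 4}

reps = greedy_representatives(toy_support, delta=0.2)
print(len(reps), "representatives for", len(toy_support), "patterns")
```

In the toy input, `p2` stands in for `p1` because their supports differ by only 10 percent, while `p3` loses too much support to be covered by anything but itself. A real implementation would replace the subset test with a subgraph isomorphism check on labelled graphs.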
Related articles
arXiv:0705.0593v1 [cs.AI] 4 May 2007. Clustering with Lattices in the Analysis of Graph Patterns. Edgar
Mining frequent subgraphs is an area of research where we have a given set of graphs (each graph can be seen as a transaction), and we search for (connected) subgraphs contained in many of these graphs. In this work we will discuss techniques used in our framework Lattice2SAR for mining and analysing frequent subgraph data and their corresponding lattice information. Lattice information is prov...
Generating Recurrent Patterns Using Clique Algorithm
Bipin Nair B J, Lecturer, Department of Computer Science, Amrita Vishwa Vidyapeetham, Mysore Campus, Karnataka, India. Abstract: Clustering is one of several machine learning techniques for finding frequent patterns. Most of the clustering methods are...
MUSK: Uniform Sampling of k Maximal Patterns
Recent research in frequent pattern mining (FPM) has shifted from obtaining the complete set of frequent patterns to generating only a representative (summary) subset of frequent patterns. Most of the existing approaches to this problem adopt a two-step solution; in the first step, they obtain all the frequent patterns, and in the second step, some form of clustering is used to obtain the summa...
Application of a Mining Algorithm to Finding Frequent Patterns in a Text Corpus: A Case Study of the Arabic
Information repositories containing text data of different languages are abundant on the World Wide Web. Digital corpora of sacred text of Islam related to Quran containing Arabic language are also publicly available. The availability of these corpora and intelligent application to analyze them are vital to better comprehend the religious text of Islam. In this paper I propose a method of repre...
Journal title:
- JDIM
Volume 5
Publication date 2007