Top-k Correlative Graph Mining
Abstract
Correlation mining has been widely studied because it can reveal the underlying occurrence dependency between objects. However, correlation mining in graph databases is expensive due to the complexity of graph data. In this paper, we study the problem of mining the top-k correlative subgraphs in a graph database, that is, the subgraphs that share similar occurrence distributions with a given query graph. The search space of the problem is prohibitively large since every subgraph in the database is a candidate. We propose an efficient algorithm, TopCor, which mines the top-k correlative graphs by exploring only the candidate graphs in the projected database of the query graph. We develop three key techniques for TopCor: an effective correlation checking mechanism, a powerful pruning criterion, and a set of useful rules for candidate exploration. These techniques are very effective in directing the search to highly correlative candidate graphs. We verify the effectiveness of the three techniques experimentally and show that TopCor is more than an order of magnitude faster than CGSearch, the state-of-the-art threshold-based correlative graph mining algorithm.
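To make the problem statement concrete, the following is a minimal brute-force sketch of top-k correlation ranking over occurrence sets. It is not TopCor and uses none of its three techniques. The abstract does not name a correlation measure, so the Pearson (phi) coefficient on binary occurrences is an assumption here, as are the function names `phi_correlation` and `top_k_correlative` and the premise that each candidate's occurrence set (the ids of the database graphs containing it) has already been computed.

```python
import heapq
from math import sqrt


def phi_correlation(occ_a, occ_b, n_graphs):
    """Pearson (phi) correlation of two binary occurrence patterns.

    occ_a, occ_b: sets of ids of the database graphs containing each pattern.
    n_graphs: total number of graphs in the database.
    The choice of measure is an assumption; the abstract does not specify one.
    """
    supp_a = len(occ_a) / n_graphs
    supp_b = len(occ_b) / n_graphs
    supp_ab = len(occ_a & occ_b) / n_graphs
    denom = sqrt(supp_a * (1 - supp_a) * supp_b * (1 - supp_b))
    if denom == 0:
        # A pattern that occurs in no graph or in every graph has zero variance.
        return 0.0
    return (supp_ab - supp_a * supp_b) / denom


def top_k_correlative(query_occ, candidates, n_graphs, k):
    """Return the k (score, label) pairs most correlated with the query.

    candidates: dict mapping a hypothetical candidate label to its occurrence set.
    Candidates that never co-occur with the query are skipped, loosely
    mirroring the idea of searching only the query's projected database.
    """
    scored = []
    for label, occ in candidates.items():
        if not (occ & query_occ):
            continue
        scored.append((phi_correlation(query_occ, occ, n_graphs), label))
    return heapq.nlargest(k, scored)


# Toy database of 10 graphs; the query occurs in graphs 0..3.
query_occ = {0, 1, 2, 3}
candidates = {
    "g1": {0, 1, 2, 3},
    "g2": {0, 1, 2},
    "g3": {7, 8, 9},
    "g4": {0, 5, 6, 7, 8},
}
result = top_k_correlative(query_occ, candidates, n_graphs=10, k=2)
```

In this toy input, `g3` is never scored because it shares no graph with the query, and `g4` scores below `g1` and `g2` because it mostly occurs where the query does not. A practical miner would also have to enumerate the candidate subgraphs, which is the expensive step the abstract addresses.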
Similar resources
Mining Top-K Graph Patterns that Jointly Maximize Some Significance Measure
Most graph pattern mining algorithms focus on finding frequent subgraphs and their compact representations, such as closed frequent subgraphs and maximal frequent subgraphs. However, little attention has been paid to mining graph patterns with a user-specified significance measure. In this paper, we study a new problem of mining top-k graph patterns that jointly maximize some significance measur...
Pushing Constraints to Generate Top-K Closed Sequential Graph Patterns
In this paper, the problem of finding sequential patterns from graph databases is investigated. The two main issues addressed in this paper are the efficiency and effectiveness of the mining algorithm. A huge volume of sequential patterns is generated, most of which are uninteresting. The users have to go through a large number of patterns to find interesting results. In order to improve th...
TGP: Mining Top-K Frequent Closed Graph Pattern without Minimum Support
In this paper, we propose a new mining task: mining top-k frequent closed graph patterns without minimum support. Most previous frequent graph pattern mining works require the specification of a minimum support threshold to perform the mining. However, it is sometimes difficult for users to set a suitable value. We develop an efficient algorithm, called TGP, to mine patterns without minimum supp...
Top-K Correlation Sub-graph Search in Graph Databases
Recently, due to its wide applications, (similar) subgraph search has attracted a lot of attention from the database and data mining communities, such as [13, 18, 19, 5]. In [8], Ke et al. first proposed the correlation sub-graph search problem (CGSearch for short) to capture the underlying dependency between subgraphs in a graph database, that is, the CGS algorithm. However, the CGS algorithm requires the speci...
Efficient Mining of Top-k Breaker Emerging Subgraph Patterns from Graph Datasets
This paper introduces a new type of discriminative subgraph pattern called breaker emerging subgraph pattern by introducing three constraints and two new concepts: base and breaker. A breaker emerging subgraph pattern consists of three subpatterns: a constrained emerging subgraph pattern, a set of bases and a set of breakers. An efficient approach is proposed for the discovery of top-k breaker ...