Dynamic Community Detection in Weighted Graph Streams

Authors

  • Jian-Huang Lai
  • Chang-Dong Wang
  • Philip S. Yu
Abstract

In this paper, we address the problem of discovering dynamic communities in weighted graph streams, especially when the underlying social behavior of individuals varies considerably across graph regions. A novel structure, termed the Local Weighted-Edge-based Pattern (LWEP) Summary, is proposed to describe a local homogeneous region. Computing LWEPs efficiently requires maintaining statistics under the principle of preserving maximum weighted-neighbor information within limited memory. To this end, the proposed approach is divided into online and offline components. The online component tracks two statistics, termed top-k neighbor lists and top-k candidate lists. The key idea is to maintain, for each node, only the top-k neighbors with the largest link weights. To allow less active neighbors to transition into top-k neighbors, an auxiliary data structure, the top-k candidate list, is used to identify emerging active neighbors. Both statistics can be maintained efficiently in the online component. In the offline component, these statistics are used at each snapshot to compute LWEPs efficiently. Clustering is then performed to consolidate LWEPs into high-level clusters. Finally, clusters of consecutive snapshots are mapped to one another to generate temporally smooth communities. Experimental results are presented to illustrate the effectiveness and efficiency of the proposed approach.
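
The abstract does not give the exact update rules for the online statistics, so the following Python sketch only illustrates the general idea of a per-node top-k neighbor list backed by a bounded candidate list. Everything specific here is an assumption rather than the authors' method: the class name `TopKTracker`, the `candidate_size` parameter, the choice to accumulate edge weights, the undirected treatment of edges, and the rule that evicts the lightest candidate when the candidate list overflows.

```python
from collections import defaultdict


class TopKTracker:
    """Illustrative per-node top-k neighbor list with a bounded candidate list."""

    def __init__(self, k, candidate_size):
        self.k = k                            # size of each top-k neighbor list
        self.candidate_size = candidate_size  # assumed bound on each candidate list
        self.top = defaultdict(dict)          # node -> {neighbor: accumulated weight}
        self.candidates = defaultdict(dict)   # node -> {neighbor: accumulated weight}

    def add_edge(self, u, v, weight):
        # Assumption: the stream is undirected, so both endpoints are updated.
        self._update(u, v, weight)
        self._update(v, u, weight)

    def _update(self, node, neighbor, weight):
        top, cand = self.top[node], self.candidates[node]
        if neighbor in top:
            # Already a top-k neighbor: just accumulate the new weight.
            top[neighbor] += weight
            return
        # Otherwise accumulate on top of any weight held in the candidate list.
        total = cand.pop(neighbor, 0.0) + weight
        if len(top) < self.k:
            top[neighbor] = total
            return
        weakest = min(top, key=top.get)
        if total > top[weakest]:
            # Emerging neighbor overtakes the weakest top-k neighbor: swap them.
            cand[weakest] = top.pop(weakest)
            top[neighbor] = total
        else:
            cand[neighbor] = total
        if len(cand) > self.candidate_size:
            # Assumed eviction rule: forget the lightest candidate.
            del cand[min(cand, key=cand.get)]

    def top_neighbors(self, node):
        # Top-k neighbors of a node, heaviest first.
        return sorted(self.top.get(node, {}).items(), key=lambda kv: -kv[1])


tracker = TopKTracker(k=2, candidate_size=2)
for u, v, w in [("a", "b", 5.0), ("a", "c", 3.0), ("a", "d", 1.0),
                ("a", "d", 1.5), ("a", "d", 1.0)]:
    tracker.add_edge(u, v, w)
print(tracker.top_neighbors("a"))
```

In this usage example, node `d` starts in the candidate list of `a` and is promoted once its accumulated weight exceeds that of `c`, which is the kind of transition the candidate list is meant to enable. Memory per node stays bounded by `k + candidate_size` entries under these assumptions.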

Similar resources

Approximate Integration of streaming data

We approximate analytic queries on streaming data with weighted reservoir sampling. For a stream of tuples from a data warehouse, we show how to approximate some OLAP queries. For a stream of graph edges from a social network, we approximate the communities as the large connected components of the edges in the reservoir. We show that for a model of random graphs which follow a power law degree di...

Sublinear Estimation of Weighted Matchings in Dynamic Data Streams

This paper presents an algorithm for estimating the weight of a maximum weighted matching by augmenting any estimation routine for the size of an unweighted matching. The algorithm is implementable in any streaming model including dynamic graph streams. We also give the first constant estimation for the maximum matching size in a dynamic graph stream for planar graphs (or any graph with bounded...

Outlier Detection for Dynamic Data Streams Using Weighted K-means

This paper presents a new k-means type clustering algorithm that can calculate weights for the variables. This method is efficient for dynamic data streams in order to overcome the global optimum problems. The variable weights produced by the algorithm measure the importance of each variable in clustering and can be used in variable selection in which the data items with similar properties are group...

A Scalable Approach for Outlier Detection in Edge Streams Using Sketch-based Approximations

Dynamic graphs are a powerful way to model an evolving set of objects and their ongoing interactions. A broad spectrum of systems, such as information, communication, and social, are naturally represented by dynamic graphs. Outlier (or anomaly) detection in dynamic graphs can provide unique insights into the relationships of objects and identify novel or emerging relationships. To date, outlier...

An Optimized Firefly Algorithm based on Cellular Learning Automata for Community Detection in Social Networks

Community structure is one of the important features of social networks. A community is a subgraph whose nodes have many connections to nodes inside the community and very few connections to nodes outside it. The objective of community detection is to separate groups or communities that are linked more closely. In fact, community detection is the clustering...

Publication date: 2013