Probabilistic Graph Summarization
Abstract
We study group summarization of probabilistic graphs, which arise naturally in social networks, semistructured data, and other applications. Our proposed framework groups the nodes and edges of the graph based on a user-selected set of node attributes. We present methods to compute useful graph aggregates without creating all possible graph instances of the original probabilistic graph. We also present an algorithm for graph summarization based on pure relational (SQL) technology. We analyze our algorithm and evaluate its efficiency in practice using an extended Epinions dataset as well as synthetic datasets. The experimental results show the scalability of our algorithm and its efficiency in producing highly compressed summary graphs in reasonable time.
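As a rough illustration of computing an aggregate without enumerating graph instances, the sketch below groups nodes by a single attribute and sums edge probabilities per group pair. By linearity of expectation, that sum is the expected number of edges between the two groups. This is not the paper's algorithm: the function name `summarize`, the input format, and the choice of aggregate are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize(node_attrs, prob_edges):
    """Group nodes by attribute value and aggregate probabilistic edges.

    node_attrs: dict mapping node -> attribute value (hypothetical format).
    prob_edges: list of (src, dst, p), where p is the edge's existence probability.
    Returns (group_sizes, expected_edges); expected_edges[(g1, g2)] is the
    expected number of edges from group g1 to group g2.
    """
    group_sizes = defaultdict(int)
    for attr in node_attrs.values():
        group_sizes[attr] += 1

    expected_edges = defaultdict(float)
    for src, dst, p in prob_edges:
        # Linearity of expectation: the expected edge count between two
        # groups is the sum of the individual edge probabilities, so no
        # possible graph instance is ever materialized.
        expected_edges[(node_attrs[src], node_attrs[dst])] += p

    return dict(group_sizes), dict(expected_edges)


# Toy data (made up for this example).
nodes = {"a": "US", "b": "US", "c": "CA"}
edges = [("a", "b", 0.5), ("a", "c", 0.25), ("b", "c", 0.5)]
sizes, exp_edges = summarize(nodes, edges)
```

In a relational setting, the same aggregate corresponds to a join of the edge table with the node table followed by a `GROUP BY` on the two attribute columns with `SUM(p)`.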
Similar resources
Graph Hybrid Summarization
One approach to processing and analyzing massive graphs is summarization. Generating a high-quality summary is the main challenge of graph summarization. To generate a better-quality summary for a given attributed graph, both structural and attribute similarities must be considered. There are two measures, density and entropy, to evaluate the quality of structural and at...
Graph Summarization in Annotated Data Using Probabilistic Soft Logic
Annotation graphs, made available through the Linked Data initiative and Semantic Web, have significant scientific value. However, their increasing complexity makes it difficult to fully exploit this value. Graph summaries, which group similar entities and relations for a more abstract view on the data, can help alleviate this problem, but new methods for graph summarization are needed that han...
Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms
This paper proposes an improved approach for spoken lecture summarization, in which random walk is performed on a graph constructed with automatically extracted key terms and probabilistic latent semantic analysis (PLSA). Each sentence of the document is represented as a node of the graph and the edge between two nodes is weighted by the topical similarity between the two sentences. The basic i...
Intra-Speaker Topic Modeling for Improved Multi-Party Meeting Summarization with Integrated Random Walk
This paper proposes an improved approach to extractive summarization of spoken multi-party interaction, in which integrated random walk is performed on a graph constructed on topical/ lexical relations. Each utterance is represented as a node of the graph, and the edges’ weights are computed from the topical similarity between the utterances, evaluated using probabilistic latent semantic analys...
Graph-based models for multi-document summarization
University of Ljubljana, Faculty of Computer and Information Science. Ercan Canhasi. Graph-based models for multi-document summarization. This thesis is about automatic document summarization, with experimental results on general, query, update and comparative multi-document summarization (MDS). We describe prior work and our own improvements on some important aspects of a summarization system, incl...