PERSEUS-HUB: Interactive and Collective Exploration of Large-Scale Graphs

نویسندگان

  • Di Jin
  • Aristotelis Leventidis
  • Haoming Shen
  • Ruowang Zhang
  • Junyue Wu
  • Danai Koutra
چکیده

Graphs emerge naturally in many domains, such as social science, neuroscience, transportation engineering, and more. In many cases, such graphs have millions or billions of nodes and edges, and their sizes increase daily at a fast pace. How can researchers from various domains explore large graphs interactively and efficiently to find out what is ‘important’? How can multiple researchers explore a new graph dataset collectively and “help” each other with their findings? In this article, we present PERSEUS-HUB, a large-scale graph mining tool that computes a set of graph properties in a distributed manner, performs ensemble, multi-view anomaly detection to highlight regions that are worth investigating, and provides users with uncluttered visualization and easy interaction with complex graph statistics. PERSEUS-HUB uses a Spark cluster to calculate various statistics of large-scale graphs efficiently, and aggregates the results in a summary on the master node to support interactive user exploration. In PERSEUS-HUB, the visualized distributions of graph statistics provide preliminary analysis to understand a graph. To perform a deeper analysis, users with little prior knowledge can leverage patterns (e.g., spikes in the power-law degree distribution) marked by other users or experts. Moreover, PERSEUS-HUB guides users to regions of interest by highlighting anomalous nodes and helps users establish a more comprehensive understanding about the graph at hand. We demonstrate our system through the case study on real, large-scale networks.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Perseus: An Interactive Large-Scale Graph Mining and Visualization Tool

Given a large graph with several millions or billions of nodes and edges, such as a social network, how can we explore it e ciently and find out what is in the data? In this demo we present Perseus, a large-scale system that enables the comprehensive analysis of large graphs by supporting the coupled summarization of graph properties and structures, guiding attention to outliers, and allowing t...

متن کامل

VCExplorer: A Interactive Graph Exploration Framework Based on Hub Vertices with Graph Consolidation

Graphs have been widely used to model different information networks, such as the Web, biological networks and social networks (e.g. Twitter). Due to the size and complexity of these graphs, how to explore and utilize these graphs has become a very challenging problem. In this paper, we propose, VCExplorer, a new interactive graph exploration framework that integrates the strengths of graph vis...

متن کامل

ROBE - Knitting a Tight Hub for Shortest Path Discovery in Large Social Graphs

Scalable and efficient algorithms are needed to compute shortest paths between any pair of vertices in large social graphs. In this work, we propose a novel ROBE scheme to estimate the shortest distances. ROBE is based on a hub serving as the skeleton of the large graph. In order to stretch the hub into every corner in the network, we first choose representative nodes with highest degrees that ...

متن کامل

Multimodal Transportation p-hub Location Routing Problem with Simultaneous Pick-ups and Deliveries

Centralizing and using proper transportation facilities cut down costs and traffic. Hub facilities concentrate on flows to cause economic advantage of scale and multimodal transportation helps use the advantage of another transporter. A distinctive feature of this paper is proposing a new mathematical formulation for a three-stage p-hub location routing problem with simultaneous pick-ups and de...

متن کامل

Provenance Map Orbiter: Interactive Exploration of Large Provenance Graphs

Provenance systems can produce enormous provenance graphs that can be used for a variety of tasks from determining the inputs to a particular process to debugging entire workflow executions or tracking difficult-to-find dependencies. Visualization can be a useful tool to support such tasks, but graphs of such scale (thousands to millions of nodes) are notoriously difficult to visualize. This pa...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Informatics

دوره 4  شماره 

صفحات  -

تاریخ انتشار 2017