An Experimental Comparison of Pregel-like Graph Processing Systems

Authors

  • Minyang Han
  • Khuzaima Daudjee
  • Khaled Ammar
  • M. Tamer Özsu
  • Xingfang Wang
  • Tianqi Jin
Abstract

The introduction of Google’s Pregel generated much interest in the field of large-scale graph data processing, inspiring the development of Pregel-like systems such as Apache Giraph, GPS, Mizan, and GraphLab, all of which have appeared in the past two years. To gain an understanding of how Pregel-like systems perform, we conduct a study to experimentally compare Giraph, GPS, Mizan, and GraphLab on equal ground by considering graph- and algorithm-agnostic optimizations and by using several metrics. The systems are compared with four different algorithms (PageRank, single-source shortest path, weakly connected components, and distributed minimum spanning tree) on up to 128 Amazon EC2 machines. We find that the system optimizations present in Giraph and GraphLab allow them to perform well. Our evaluation also shows Giraph 1.0.0’s considerable improvement since Giraph 0.1 and identifies areas of improvement for all systems.
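
For readers unfamiliar with the Pregel model that the compared systems share, the following is a minimal single-process sketch of superstep-based, vertex-centric single-source shortest path, one of the four algorithms named in the abstract. It is an illustration only, not the paper's implementation and not the API of Giraph, GPS, Mizan, or GraphLab. The function name `pregel_sssp`, the adjacency-dict input format, and the example graph are all hypothetical, and the sketch assumes non-negative edge weights and a source vertex that is present in the graph.

```python
import math
from collections import defaultdict


def pregel_sssp(edges, source):
    """Simulate Pregel-style supersteps for single-source shortest path.

    edges:  dict mapping each vertex to a list of (neighbor, weight) out-edges
            (hypothetical input format chosen for this sketch).
    source: the start vertex; assumed to appear in the graph.
    Returns a dict mapping every vertex to its distance from source.
    Assumes non-negative weights, otherwise the loop may not terminate.
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(edges)
    for out_edges in edges.values():
        vertices.update(neighbor for neighbor, _ in out_edges)

    dist = {v: math.inf for v in vertices}

    # Before the first superstep, only the source has a pending message.
    inbox = {source: [0]}

    # Each loop iteration is one superstep; stop when no messages remain.
    while inbox:
        outbox = defaultdict(list)
        for vertex, messages in inbox.items():
            # "Think like a vertex": look only at this vertex's own messages.
            best = min(messages)
            if best < dist[vertex]:
                dist[vertex] = best
                # Propagate the improved distance along out-edges.
                for neighbor, weight in edges.get(vertex, []):
                    outbox[neighbor].append(best + weight)
        # Acts as the synchronization barrier: messages sent in this
        # superstep become visible only in the next one.
        inbox = outbox

    return dist


if __name__ == "__main__":
    graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
    print(pregel_sssp(graph, "a"))
```

In a real distributed system, the vertices would be partitioned across machines and the barrier would be a cluster-wide synchronization; here both are collapsed into a single loop to keep the control flow visible.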

Related articles

From "Think Like a Vertex" to "Think Like a Graph"

To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions, and employ a “think like a vertex” programming model to support iterative graph computation. This vertex-centric model is easy to program and h...

Lightweight Fault Tolerance in Large-Scale Distributed Graph Processing

The success of Google’s Pregel framework in distributed graph processing has inspired a surging interest in developing Pregel-like platforms featuring a user-friendly “think like a vertex” programming model. Existing Pregel-like systems support a fault tolerance mechanism called checkpointing, which periodically saves computation states as checkpoints to HDFS, so that when a failure happens, co...

Optimizing Graph Algorithms on Pregel-like Systems

We study the problem of implementing graph algorithms efficiently on Pregel-like systems, which can be surprisingly challenging. Standard graph algorithms in this setting can incur unnecessary inefficiencies such as slow convergence or high communication or computation cost, typically due to structural properties of the input graphs such as large diameters or skew in component sizes. We describ...

Tech Report: Compiling Green-Marl into GPS

The massive size of the data in large graph processing requires distributed processing. However, conventional frameworks for distributed graph processing, such as Pregel, use programming models that are well-suited for scalability but inconvenient for programming graph algorithms. In this paper, we use Green-Marl, a Domain-Specific Language for graph analysis, to describe graph algorithms intui...

A General-Purpose Query-Centric Framework for Querying Big Graphs

Pioneered by Google’s Pregel, many distributed systems have been developed for large-scale graph analytics. These systems employ a user-friendly “think like a vertex” programming model, and exhibit good scalability for tasks where the majority of graph vertices participate in computation. However, the design of these systems can seriously under-utilize the resources in a cluster for processing ...

Journal:
  • PVLDB

Volume 7

Publication date: 2014