Flat Datacenter Storage

Authors

  • Edmund B. Nightingale
  • Jeremy Elson
  • Jinliang Fan
  • Owen S. Hofmann
  • Jon Howell
  • Yutaka Suzue
Abstract

Flat Datacenter Storage (FDS) is a high-performance, fault-tolerant, large-scale, locality-oblivious blob store. Using a novel combination of full bisection bandwidth networks, data and metadata striping, and flow control, FDS multiplexes an application’s large-scale I/O across the available throughput and latency budget of every disk in a cluster. FDS therefore makes many optimizations around data locality unnecessary. Disks also communicate with each other at their full bandwidth, making recovery from disk failures extremely fast. FDS is designed for datacenter scale, fully distributing metadata operations that might otherwise become a bottleneck. FDS applications achieve single-process read and write performance of more than 2 GB/s. We measure recovery of 92 GB of data lost to disk failure in 6.2 s and recovery from a total machine failure with 655 GB of data in 33.7 s. Application performance is also high: we describe our FDS-based sort application, which set the 2012 world record for disk-to-disk sorting.

Similar resources

TIMELY: RTT-based congestion control for the datacenter – Public Review

The context is datacenter congestion control. Traditional TCP transport stacks fare poorly in this environment, which has led to considerable interest in recent years in developing specialized transports that aim to deliver high bandwidth utilization at extremely low, microsecond-level packet latency. This is important for demanding datacenter applications such as cloud storage and near-realtim...

Query processing for datacenter-scale computers

Quickly exploring massive datasets for insights requires an efficient data processing platform. Parallel database management systems were originally designed to scale only to a handful of nodes, where each node keeps recent (“hot”) data in memory and has directly attached hard disk storage for infrequently accessed (“cold”) data. To keep pace with the growing data volumes, the research focus has...

Dell EMC and Toshiba Power Nutanix to Deliver High-Performance and High-Scalability in Enterprise Workloads

Hyper-converged Infrastructure (HCI) is becoming a significant services platform in the datacenter, and with good reason. Abstracting processing, storage and network resources to create a fully software-defined datacenter is the logical progression of the decade-long virtualization trend. Companies like Nutanix are leading the way, enabling IT departments to create and expand application enviro...

Datacenter Storage Architecture for MapReduce Applications

Data-intensive computing systems running MapReduce-style applications are currently architected with storage local to computation in the same physical box. This poster argues that upcoming advances in converged datacenter networks will allow MapReduce applications to utilize and benefit from network-attached storage. This is made possible by properties of all MapReduce-style applications, such a...

GRIN: Utilizing the Empty Half of Full Bisection Networks

Various full bisection designs have been proposed for datacenter networks. They are provisioned for the worst case in which every server wishes to send flat out and there is no congestion anywhere in the network. However, these topologies are prone to considerable underutilization in the average case encountered in practice. To utilize spare bandwidth we propose GRIN, a simple, cheap and easily...

Publication date: 2012