Flat Datacenter Storage
Abstract
Flat Datacenter Storage (FDS) is a high-performance, fault-tolerant, large-scale, locality-oblivious blob store. Using a novel combination of full bisection bandwidth networks, data and metadata striping, and flow control, FDS multiplexes an application’s large-scale I/O across the available throughput and latency budget of every disk in a cluster. FDS therefore makes many optimizations around data locality unnecessary. Disks also communicate with each other at their full bandwidth, making recovery from disk failures extremely fast. FDS is designed for datacenter scale, fully distributing metadata operations that might otherwise become a bottleneck. FDS applications achieve single-process read and write performance of more than 2GB/s. We measure recovery of 92GB data lost to disk failure in 6.2s and recovery from a total machine failure with 655GB of data in 33.7s. Application performance is also high: we describe our FDS-based sort application which set the 2012 world record for disk-to-disk sorting.
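The two recovery measurements above imply an average aggregate recovery rate, which the abstract does not state directly. A minimal sketch of that arithmetic follows; it assumes the quoted sizes and times cover the whole recovery, and the function and variable names are illustrative only.

```python
# Figures quoted in the abstract: (data recovered in GB, recovery time in seconds).
recoveries = {
    "single disk failure": (92.0, 6.2),
    "whole machine failure": (655.0, 33.7),
}


def aggregate_rate_gb_per_s(gigabytes, seconds):
    """Average recovery rate over the whole recovery, in GB/s (derived, not reported)."""
    return gigabytes / seconds


for name, (gb, secs) in recoveries.items():
    print(f"{name}: {aggregate_rate_gb_per_s(gb, secs):.1f} GB/s")
```

Both derived rates are well above what a single disk could sustain, which is consistent with the abstract's point that recovery is spread across many disks communicating at full bandwidth.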
Similar resources
TIMELY: RTT-based congestion control for the datacenter – Public Review
The context is datacenter congestion control. Traditional TCP transport stacks fare poorly in this environment, which has led to considerable interest in recent years in developing specialized transports that aim to deliver high bandwidth utilization at extremely low, microsecond-level packet latency. This is important for demanding datacenter applications such as cloud storage and near-realtim...
Query processing for datacenter-scale computers
Quickly exploring massive datasets for insights requires an efficient data processing platform. Parallel database management systems were originally designed to scale only to a handful of nodes, where each node keeps recent (“hot”) data in memory and has directly-attached hard disk storage for infrequently accessed (“cold”) data. To keep pace with the growing data volumes, the research focus has...
Dell EMC and Toshiba Power Nutanix to Deliver High-Performance and High-Scalability in Enterprise Workloads
Hyper-converged Infrastructure (HCI) is becoming a significant services platform in the datacenter, and with good reason. Abstracting processing, storage and network resources to create a fully software-defined datacenter is the logical progression of the decade-long virtualization trend. Companies like Nutanix are leading the way, enabling IT departments to create and expand application enviro...
Datacenter Storage Architecture for MapReduce Applications
Data-intensive computing systems running MapReduce-style applications are currently architected with storage local to computation in the same physical box. This poster argues that upcoming advances in converged datacenter networks will allow MapReduce applications to utilize and benefit from network-attached storage. This is made possible by properties of all MapReduce-style applications, such a...
GRIN: Utilizing the Empty Half of Full Bisection Networks
Various full bisection designs have been proposed for datacenter networks. They are provisioned for the worst case in which every server wishes to send flat out and there is no congestion anywhere in the network. However, these topologies are prone to considerable underutilization in the average case encountered in practice. To utilize spare bandwidth we propose GRIN, a simple, cheap and easily...