Distributed Anemone: Transparent Low-Latency Access to Remote Memory
Abstract
Performance of large-memory applications degrades rapidly once the system hits the physical memory limit and starts paging to local disk. We present the design, implementation, and evaluation of Distributed Anemone (Adaptive Network Memory Engine), a lightweight, distributed system that pools the collective memory resources of multiple machines across a gigabit Ethernet LAN. Anemone treats remote memory as another level in the memory hierarchy, between very fast local memory and very slow local disk, and enables applications to access potentially "unlimited" network memory without any application or operating system modifications. Our kernel-level prototype features fully distributed resource management, low-latency paging, resource discovery, load balancing, soft-state refresh, and support for "jumbo" Ethernet frames. Compared against disk-based paging, Anemone achieves low average page-fault latencies of 160 μs and application speedups of up to 4x for a single process and up to 14x for multiple concurrent processes.
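The abstract describes clients paging evicted frames to the spare memory of other machines, with distributed load balancing across servers. The following is a minimal conceptual sketch of that idea in user-space Python; it is hypothetical and not Anemone's kernel implementation, and the `MemoryServer`/`Client` names, the dict-based page store, and the least-loaded server-selection policy are illustrative assumptions, not the paper's actual protocol.

```python
PAGE_SIZE = 4096  # typical x86 page size, as assumed by paging systems like Anemone

class MemoryServer:
    """Donates a bounded pool of spare RAM as a page store, keyed by (client, page)."""
    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.store = {}  # (client_id, page_no) -> page bytes

    def page_out(self, client_id, page_no, data):
        # Refuse new pages when full; rewriting an existing page is always allowed.
        if (client_id, page_no) not in self.store and len(self.store) >= self.capacity:
            return False
        self.store[(client_id, page_no)] = data
        return True

    def page_in(self, client_id, page_no):
        return self.store.get((client_id, page_no))

class Client:
    """Pages evicted frames to remote servers instead of local disk."""
    def __init__(self, client_id, servers):
        self.client_id = client_id
        self.servers = servers
        self.location = {}  # page_no -> server currently holding that page

    def evict(self, page_no, data):
        # Prefer the least-loaded server, a stand-in for Anemone's
        # distributed load balancing over discovered memory servers.
        for srv in sorted(self.servers, key=lambda s: len(s.store)):
            if srv.page_out(self.client_id, page_no, data):
                self.location[page_no] = srv
                return True
        return False  # all servers full: a real system falls back to disk

    def fault(self, page_no):
        # On a page fault, fetch the frame back from whichever server holds it.
        srv = self.location.get(page_no)
        return srv.page_in(self.client_id, page_no) if srv else None
```

In the real system this request/response exchange happens in the kernel, directly over Ethernet frames, which is what keeps fault latencies in the range of hundreds of microseconds rather than milliseconds.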
Similar Papers
Fast Transparent Cluster-Wide Paging
In a cluster with a very low-latency interconnect, the remote memory of nodes can serve as a storage that is faster than local disk but slower than local memory. In this paper, we address the problem of transparently utilizing this cluster-wide pool of unused memory as a low-latency paging device. Such a transparent remote memory paging system can enable large-memory applications to benefit fro...
Anemone: Transparently Harnessing Cluster-Wide Memory
There is a constant battle to break even between continuing improvements in DRAM capacities and the growing memory demands of large-memory high-performance applications. Performance of such applications degrades quickly once the system hits the physical memory limit and starts swapping to the local disk. We present the design, implementation and evaluation of Anemone – an Adaptive Network Memor...
Implementation Experiences in Transparently Harnessing Cluster-Wide Memory
There is a constant battle to break even between continuing improvements in DRAM capacities and the growing memory demands of large-memory high-performance applications. Performance of such applications degrades quickly once the system hits the physical memory limit and starts swapping to the local disk. In this paper, we investigate the benefits and tradeoffs in pooling together the collective...
A HyperTransport Network Interface Controller For Ultra-low Latency Message Transfers
This white paper presents the implementation of a high-performance HyperTransport-enabled Network Interface Controller (NIC), named Virtualized Engine for Low Overhead (VELO). The direct connect architecture and efficiency of HyperTransport produce an NIC capable of sub-microsecond latency. The prototype implemented on a Field Programmable Gate Array (FPGA) delivers a communication latency of 9...
Enabling Transparent Data Sharing in Component Models, by Gabriel Antoniu, Hinde Lilia Bouziane, Landry
The fast growth of high-bandwidth wide-area networks has encouraged the development of computational grids. To deal with the increasing complexity of grid applications, the software component technology seems very appealing since it emphasizes software composition and re-use. However, current software component models only support explicit data transfers between components through remote proced...