DynRBLA: A High-Performance and Energy-Efficient Row Buffer Locality-Aware Caching Policy for Hybrid Memories
Abstract
Phase change memory (PCM) is a promising memory technology that can offer higher memory capacity than DRAM. Unfortunately, PCM's access latencies and energies are higher than DRAM's, and its endurance is lower. DRAM-PCM hybrid memory systems use DRAM as a cache for PCM to achieve the low access latencies and energies and the high endurance of DRAM while taking advantage of PCM's large capacity. A key question is what data to cache in DRAM to best exploit the advantages of each technology while avoiding their disadvantages as much as possible. We propose DynRBLA, a fundamentally new caching policy that improves hybrid memory performance and energy efficiency. Our observation is that both DRAM and PCM contain row buffers, which cache the most recently accessed row. Row buffer hits incur the same latency in DRAM and PCM, whereas row buffer misses incur longer latencies in PCM. To exploit this, we devise a policy that tracks the accesses and row buffer misses of a subset of recently used rows in PCM, and caches in DRAM only those rows that are likely to miss in the row buffer and be reused. Compared to a cache management technique that takes into account only the frequency of accesses to data, our row buffer locality-aware scheme improves performance by 15% and energy efficiency by 10%. DynRBLA improves system performance by 17% over an all-PCM memory and comes within 21% of the performance of an unlimited-size all-DRAM memory system.
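The caching decision described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `RowStatsTracker`, the single-bank row buffer model, the table capacity, and the fixed `access_threshold` and `miss_threshold` values are all assumptions made for illustration, and the abstract does not say how the real policy sets or adapts its thresholds.

```python
from collections import OrderedDict


class RowStatsTracker:
    """Hypothetical tracker for a bounded set of recently used PCM rows."""

    def __init__(self, capacity=4, access_threshold=2, miss_threshold=2):
        # All three parameters are illustrative assumptions, not values from the paper.
        self.capacity = capacity
        self.access_threshold = access_threshold
        self.miss_threshold = miss_threshold
        self.stats = OrderedDict()  # row id -> [access count, row buffer miss count]
        self.open_row = None        # row currently held in the (single-bank) row buffer

    def access(self, row):
        """Record one PCM access; return True if the row should be cached in DRAM."""
        miss = row != self.open_row  # a different open row means a row buffer miss
        self.open_row = row

        # Re-inserting the entry moves it to the most recently used position.
        entry = self.stats.pop(row, [0, 0])
        entry[0] += 1
        if miss:
            entry[1] += 1
        self.stats[row] = entry

        # Keep only a subset of recently used rows by evicting the least recent one.
        if len(self.stats) > self.capacity:
            self.stats.popitem(last=False)

        # Cache only rows that are both reused and prone to row buffer misses.
        if entry[0] >= self.access_threshold and entry[1] >= self.miss_threshold:
            del self.stats[row]
            return True
        return False
```

Under this sketch, a row that is accessed repeatedly while it stays open in the row buffer accumulates accesses but only one miss, so it stays in PCM. A row whose accesses are interleaved with accesses to other rows accumulates misses and becomes a candidate for DRAM.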
Similar resources
Row Buffer Locality-Aware Data Placement in Hybrid Memories
Phase change memory (PCM) is a promising alternative to DRAM, though its high latency and energy costs prohibit its adoption as a drop-in DRAM replacement. Hybrid memory systems comprising DRAM and PCM attempt to achieve the low access latencies of DRAM at the large capacities of PCM. However, known solutions neglect to assess the utility of data placed in DRAM, and hence fail to achieve high p...
Evaluating Row Buffer Locality in Future Non-Volatile Main Memories
DRAM-based main memories have read operations that destroy the read data, and as a result, must buffer large amounts of data on each array access to keep chip costs low. Unfortunately, system-level trends such as increased memory contention in multi-core architectures and data mapping schemes that improve memory parallelism may cause only a small amount of the buffered data to be accessed. This...
DRAM-Aware Last-Level Cache Replacement
The cost of last-level cache misses and evictions depends significantly on three major performance-related characteristics of DRAM-based main memory systems: bank-level parallelism, row buffer locality, and write-caused interference. Bank-level parallelism and row buffer locality introduce different latency costs for the processor to service misses: parallel or serial, fast or slow. Write-caused...
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communications for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
Application-aware Adaptive DRAM Bank Partitioning in CMP
Main memory is a shared resource among the cores in a chip, and the speed gap between cores and main memory limits total system performance. Thus, main memory should be accessed effectively by each core. Exploiting both the parallelism and the locality of main memory is the key to efficient memory access. The parallelism between memory banks can hide the latency by pipelining memory accesses...