Optimizing MapReduce for Multicore Architectures
Authors
Abstract
MapReduce is a programming model for data-parallel programs originally intended for data centers. MapReduce simplifies parallel programming, hiding synchronization and task management. These properties make it a promising programming model for future processors with many cores, and existing MapReduce libraries such as Phoenix have demonstrated that applications written with MapReduce perform competitively with those written with Pthreads [11]. This paper explores the design of the MapReduce data structures for grouping intermediate key/value pairs, which is often a performance bottleneck on multicore processors. The paper finds the best choice depends on workload characteristics, such as the number of keys used by the application, the degree of repetition of keys, etc. This paper also introduces a new MapReduce library, Metis, with a compromise data structure designed to perform well for most workloads. Experiments with the Phoenix benchmarks on a 16-core AMD-based server show that Metis’ data structure performs better than simpler alternatives, including Phoenix.
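The grouping step the abstract refers to can be illustrated with a minimal sketch. This is not the data structure used by Metis or Phoenix. It is a plain hash-table grouping for a word-count job, run sequentially, and the function names (`map_words`, `group_pairs`, `map_reduce`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_words(chunk):
    """Map phase: emit one (word, 1) pair per word in the chunk."""
    for word in chunk.split():
        yield word, 1


def group_pairs(pairs):
    """Group intermediate pairs by key in a hash table (dict of lists)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def map_reduce(chunks):
    """Run map, group, and reduce over all chunks (sequentially, for clarity)."""
    merged = defaultdict(list)
    for chunk in chunks:
        # Each chunk stands in for the input handled by one map worker.
        for key, values in group_pairs(map_words(chunk)).items():
            merged[key].extend(values)
    # Reduce phase: sum the values collected for each key, in sorted key order.
    return {key: sum(values) for key, values in sorted(merged.items())}


counts = map_reduce(["a b a", "b c"])
```

In this sketch the cost of grouping depends on how many distinct keys exist and how often each one repeats, which are the workload characteristics the abstract names as decisive for the choice of data structure.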
Similar resources
Analyzing and Accelerating Runtime Systems on Multicore Architecture
TIWARI, DEVESH. Analyzing and Accelerating Runtime Systems on Multicore Architecture. (Under the direction of Yan Solihin.) Technology scaling has made multicore architectures commercially prevalent. However, exploiting multicore parallelism for performance remains challenging for programmers, because of side-effects of parallel programming such as concurrency management, data-races, deadlocks ...
Design of a novel congestion-aware communication mechanism for wireless NoC architecture in multicore systems
Hybrid Wireless Network-on-Chip (WNoC) architecture has emerged as a scalable communication structure to mitigate the deficits of traditional NoC architecture for future multi-core systems. The hybrid WNoC architecture provides energy-efficient, high-data-rate, and flexible communications for NoC architectures. In these architectures, each wireless router is shared by a set of processing core...
Optimizing the use of the Hard Disk in MapReduce Frameworks for Multi-core Architectures*
MapReduce simplifies parallel programming, abstracting away responsibilities of the programmer such as synchronization and task management. The paradigm allows the programmer to write sequential code that is automatically parallelized. The MapReduce frameworks developed for multi-core architectures provide large processing keys which consequently g...
Methods for Optimizing OpenCL Applications on Heterogeneous Multicore Architectures
Heterogeneous multicore architectures with CPU and add-on GPUs or streaming processors are now widely used in computer systems. These GPUs provide substantially more computation capability and memory bandwidth compared to traditional multi-cores. Also, because they are highly programmable, they provide the computational performance needed for realistic graphics rendering. Applications with gene...
Transaction / Regular Paper Title
Current high-throughput algorithms for constructing inverted files all follow the MapReduce framework, which presents a high-level programming model that hides the complexities of parallel programming. In this paper, we take an alternative approach and develop a novel strategy that exploits the current and emerging architectures of multicore processors. Our algorithm is based on a high-throughp...