Quantifying Locality Effect in Data Access Delay: Memory logP
Abstract
The application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter-node communication in a shared memory cluster. However, inclusion in the model of message characteristics and complex memory hierarchies may result in impractical models. Nonetheless, the growing gap between memory and CPU performance combined with the trend toward large scale clustered shared memory platforms implies an increased need to consider the impact of local memory communication on parallel processing in distributed systems. We present a simple and useful model of point-to-point memory communication to predict and analyze the latency of memory copy, pack and unpack. We use the model to isolate contributions of hardware, middleware, and software to data transfers on Intel- and MIPS-based platforms.
Similar resources
Machine Abstractions and Locality Issues in Studying Parallel Systems
We define a set of overhead functions that capture the salient artifacts representing the interaction between parallel application characteristics and architectural features. An execution-driven simulation testbed is used to separate these overheads in a parallel system. Using this testbed and a set of applications, we address two important issues. The first concerns the use of machine abstract...
Scheduling algorithm based on prefetching in MapReduce clusters
Due to cluster resource competition and task scheduling policy, some map tasks are assigned to nodes without input data, which causes significant data access delay. Data locality is becoming one of the most critical factors to affect performance of MapReduce clusters. As machines in MapReduce clusters have large memory capacities, which are often underutilized, in-memory prefetching input data ...
Further Results with Algorithmic Skeletons for the CLUMPS Model of Parallel Computation
The CLUMPS (Campbell's Lenient, Unified Model of Parallel Systems) model of parallel computation is composed of an architectural model with an associated cost model. The architectural model employs a multi-level memory hierarchy, and so requires general locality of communication (communication between close processors). The multi-level memory hierarchy is reflected in the cost model which is based on...
The Memory logP Model of Local Communication
Data movement across a memory hierarchy can severely impact application execution time. For example, on the fast interconnect of the Origin 2000, three- and four-fold increases in communication cost for small message transmissions (~1K) stored non-contiguously are not uncommon. Simple, accurate predictions of communication time in hierarchical memories will identify bottlenecks in comm...
Quantifying and Resolving Remote Memory Access Contention on Hardware DSM Multiprocessors
This paper makes the following contributions: It proposes a new methodology for quantifying remote memory access contention on hardware DSM multiprocessors. The most valuable aspect of this methodology is that it assesses the impact of contention on real parallel programs running on real hardware. The methodology uses as input the number of accesses from each DSM node to each page in memory. A ...