Modeling Parallel Computers as Memory Hierarchies
Authors
Abstract
A parameterized generic model that captures the features of diverse computer architectures would facilitate the development of portable programs. Specific models appropriate to particular computers are obtained by specifying parameters of the generic model. A generic model should be simple, and for each machine that it is intended to represent, it should have a reasonably accurate specific model. The Parallel Memory Hierarchy (PMH) model of computation uses a single mechanism to model the costs of both interprocessor communication and memory hierarchy traffic. A computer is modeled as a tree of memory modules with processors at the leaves. All data movement takes the form of block transfers between children and their parents. This paper assesses the strengths and weaknesses of the PMH model as a generic model.
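The tree structure described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's formal definition: the `Module` class, the `block_size` and `transfer_time` parameters, the helper names, and the example machine are all assumptions chosen for illustration. The cost function simply sums block transfers along one root-to-leaf path and ignores any overlap between levels or between processors.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import List


@dataclass
class Module:
    """One memory module in the tree (hypothetical parameter names)."""
    name: str
    block_size: int       # assumed: bytes per block exchanged with the parent
    transfer_time: int    # assumed: cost of one block transfer to or from the parent
    children: List["Module"] = field(default_factory=list)

    def is_processor(self) -> bool:
        # In this sketch, leaves of the tree stand for processors.
        return not self.children


def path_to(root: Module, target: str) -> List[Module]:
    """Return the modules from root down to the named module, or [] if absent."""
    if root.name == target:
        return [root]
    for child in root.children:
        sub = path_to(child, target)
        if sub:
            return [root] + sub
    return []


def transfer_cost(root: Module, leaf: str, nbytes: int) -> int:
    """Serial cost of moving nbytes from the root down to a leaf.

    Every child-parent link on the path moves the data in blocks of the
    child's block size; costs are summed without modeling any overlap.
    """
    path = path_to(root, leaf)
    if not path:
        raise ValueError(f"no module named {leaf!r}")
    return sum(ceil(nbytes / m.block_size) * m.transfer_time for m in path[1:])


# Hypothetical machine: one main memory, two caches, two processors per cache.
def make_cache(name: str, procs: List[str]) -> Module:
    leaves = [Module(p, block_size=8, transfer_time=1) for p in procs]
    return Module(name, block_size=64, transfer_time=100, children=leaves)


root = Module("main", block_size=4096, transfer_time=0, children=[
    make_cache("cache0", ["p0", "p1"]),
    make_cache("cache1", ["p2", "p3"]),
])

# 256 bytes to p0: 4 cache blocks at cost 100 each, then 32 blocks at cost 1 each.
print(transfer_cost(root, "p0", 256))  # 432
```

Because communication between processors and traffic between memory levels both appear as transfers on tree edges, a single cost function of this kind covers both, which is the single-mechanism property the abstract highlights.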
Similar resources
Extending the BSP model for multi-core and out-of-core computing: MBSP
We present an extension of the bulk-synchronous parallel (BSP) model to abstract and model parallelism in the presence of multiple memory hierarchies and multiple cores. We call the new model MBSP for multi-memory BSP. The BSP model has been used to model internal memory parallel computers; MBSP retains the properties of BSP and in addition can abstract not only traditional external memory-supp...
Explicit Management of Memory Hierarchy
All scalable parallel computers feature a memory hierarchy, in which some locations are “closer” to a particular processor than others. The hardware in a particular system may support a shared memory or message passing programming model, but these factors affect only the relative costs of local and remote accesses, not the system’s fundamental Non-Uniform Memory Access (NUMA) characteristics. Y...
Automatic Tuning of Whole Applications:
For many years, retargeting of applications for new architectures has been a major headache for high performance computation. As new architectures have emerged at dizzying speed, we have moved from uniprocessors, to vector machines, symmetric multiprocessors, synchronous parallel arrays, distributed-memory parallel computers, and scalable clusters. Over the past year, clusters based on multicor...
A Cache-Aware Parallel Implementation of the Push-Relabel Network Flow Algorithm and Experimental Evaluation of the Gap Relabeling Heuristic
The maximum flow problem is a combinatorial problem of significant importance in a wide variety of research and commercial applications. It has been extensively studied and implemented over the past 40 years. The push-relabel method has been shown to be superior to other methods, both in theoretical bounds and in experimental implementations. Our study discusses the implementation of the push-re...
Automatic mapping of parallel applications on multicore architectures using the Servet benchmark suite
Servet is a suite of benchmarks focused on detecting a set of parameters with high influence on the overall performance of multicore systems. These parameters can be used for autotuning codes to increase their performance on multicore clusters. Although Servet has been proved to detect accurately cache hierarchies, bandwidths and bottlenecks in memory accesses, as well as the communication over...