Programming, Compilation, and Runtime Support for Processing-In-Memory Based Parallel Architectures
Authors
Abstract
Processing-In-Memory (PIM) systems avoid the von Neumann bottleneck of conventional machines by integrating high-density DRAM and CMOS logic on the same chip. Parallel systems based on this technology are expected to provide higher scalability, adaptability, robustness, and fault tolerance, and lower power consumption than current MPPs or commodity clusters. Most current PIM-related research deals with hardware issues. In this paper, we outline the main ideas of a project that aims to demonstrate that a high-level language approach can be applied successfully to massively parallel PIM-based architectures. We define a generalized abstract PIM architecture reflecting the characteristics of most current approaches to PIM, and develop new compilation and runtime technology to support optimizing translation from a Fortran 90 language extension to PIM assembly code.
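The abstract stops at the level of goals, but the translation idea can be illustrated concretely. As a minimal C sketch (not the project's actual compiler output), a Fortran 90 whole-array assignment such as C = A + B could be lowered under an owner-computes rule into one loop per PIM node, each touching only the array slice held in that node's on-chip DRAM; the names pim_node_t, NODES, and node_kernel are invented for the example.

```c
#include <stddef.h>

#define NODES 4            /* hypothetical number of PIM nodes     */
#define N     1024         /* global array length                  */
#define CHUNK (N / NODES)  /* contiguous slice owned by each node  */

/* Memory local to one PIM node (DRAM and logic on the same chip). */
typedef struct {
    double a[CHUNK];
    double b[CHUNK];
    double c[CHUNK];
} pim_node_t;

static pim_node_t node[NODES];

/* Per-node kernel a compiler might emit for the Fortran 90 array
 * assignment C = A + B under an owner-computes rule: every node
 * updates only the elements stored in its own memory.             */
static void node_kernel(pim_node_t *n)
{
    for (size_t i = 0; i < CHUNK; i++)
        n->c[i] = n->a[i] + n->b[i];
}

int main(void)
{
    /* On real hardware the kernels would run concurrently, one per
     * node; here they are simply invoked in sequence.              */
    for (int p = 0; p < NODES; p++)
        node_kernel(&node[p]);
    return 0;
}
```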
Similar Papers
Shared memory multiprocessor support for functional array processing in SAC
Classical application domains of parallel computing are dominated by processing large arrays of numerical data. Whereas most functional languages focus on lists and trees rather than on arrays, SaC is tailor-made in design and in implementation for efficient high-level array processing. Advanced compiler optimizations yield performance levels that are often competitive with low-level imperative...
Towards Automatic Support of Parallel Sparse
In this paper, we present a generic matrix class in Java and a runtime environment with continuous compilation, aiming to support automatic parallelization of sparse computations in distributed environments. Our package comes with a collection of matrix classes, including operators for dense, sparse, and parallel matrices on distributed-memory environments. In our environment, a progr...
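The excerpt describes the package in prose only. As a hedged, language-neutral sketch of the underlying idea (not the Java package's actual API), a generic matrix abstraction can hide whether storage is dense or sparse and dispatch the matching kernel at run time; matrix_t, MAT_DENSE, MAT_CSR, and matvec below are invented names.

```c
#include <stdio.h>

/* Storage kinds a generic matrix wrapper might hide from its user. */
typedef enum { MAT_DENSE, MAT_CSR } mat_kind_t;

typedef struct {
    mat_kind_t    kind;
    int           rows, cols;
    const double *dense;   /* dense: rows*cols values, row-major       */
    const double *val;     /* CSR: nonzero values ...                  */
    const int    *col;     /* ... their column indices ...             */
    const int    *rowptr;  /* ... and the start of each row in val/col */
} matrix_t;

/* y = M * x, dispatching on the stored representation. */
static void matvec(const matrix_t *m, const double *x, double *y)
{
    for (int i = 0; i < m->rows; i++) {
        double s = 0.0;
        if (m->kind == MAT_DENSE) {
            for (int j = 0; j < m->cols; j++)
                s += m->dense[i * m->cols + j] * x[j];
        } else {                       /* MAT_CSR */
            for (int k = m->rowptr[i]; k < m->rowptr[i + 1]; k++)
                s += m->val[k] * x[m->col[k]];
        }
        y[i] = s;
    }
}

int main(void)
{
    /* The same 2x2 matrix [[1,0],[2,3]] in both representations. */
    const double d[]  = { 1, 0, 2, 3 };
    const double v[]  = { 1, 2, 3 };
    const int    c[]  = { 0, 0, 1 };
    const int    rp[] = { 0, 1, 3 };
    matrix_t dense = { MAT_DENSE, 2, 2, d, 0, 0, 0 };
    matrix_t csr   = { MAT_CSR,   2, 2, 0, v, c, rp };

    const double x[] = { 1, 1 };
    double y[2];
    matvec(&dense, x, y); printf("dense: %g %g\n", y[0], y[1]);
    matvec(&csr,   x, y); printf("csr:   %g %g\n", y[0], y[1]);
    return 0;
}
```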
Towards Efficient OpenMP Strategies for Non-Uniform Architectures
Memory Access (NUMA) based processor architectures. In these architectures, analyzing and accounting for the non-uniformity is important for improving system scalability. In this paper, we analyze and develop a NUMA-based approach for the OpenMP parallel programming model. Our technique applies a smart thread-allocation method and an advanced task-scheduling strategy for reducing r...
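The excerpt does not give the details of the thread-allocation and task-scheduling strategy. As a generic illustration of NUMA awareness in standard OpenMP (not the paper's specific technique), thread binding via the proc_bind clause combined with first-touch initialization keeps each thread working on pages placed on its own NUMA node:

```c
#include <stdlib.h>
#include <omp.h>

#define N (1L << 20)

int main(void)
{
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);

    /* First-touch initialization: each thread touches the pages it will
     * later use, so the OS places them on that thread's NUMA node.
     * proc_bind(spread) keeps threads pinned and spread across places
     * (e.g. with OMP_PLACES=sockets set in the environment).           */
    #pragma omp parallel for proc_bind(spread) schedule(static)
    for (long i = 0; i < N; i++) {
        a[i] = 1.0;
        b[i] = 2.0;
    }

    /* Same static schedule: each thread works on the pages it owns. */
    #pragma omp parallel for proc_bind(spread) schedule(static)
    for (long i = 0; i < N; i++)
        a[i] += b[i];

    free(a);
    free(b);
    return 0;
}
```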
Runtime Support for Task Migration on Distributed Memory Architectures
The use of the task migration paradigm has been shown to allow efficient execution of unstructured codes on distributed-memory parallel architectures. With this model, the data distributed across the parallel processors are never moved. When access to a non-local variable is necessary, the current computation is suspended and then resumed on the processor in charge of that variable. Our implementa...
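A small sequential C sketch can make this mechanism concrete (hypothetical names, no real message passing): data stay with their owning processor, and a computation that needs a non-local element is re-dispatched to that owner instead of fetching the value; owner, run_on, and increment are invented for the illustration.

```c
#include <stdio.h>

#define PROCS 4
#define N     16

/* Each element of x lives permanently on processor owner(i); the data
 * are never moved between processors.                                  */
static double x[N];
static int owner(int i) { return i % PROCS; }

/* Simulate "resume the computation on the processor that owns x[i]":
 * instead of pulling x[i] to the caller, the update runs at the owner. */
static void run_on(int proc, void (*task)(int), int i)
{
    /* A real runtime would ship the suspended continuation to `proc`;
     * here it is just a local call tagged with the target processor.   */
    printf("task for x[%d] executes on processor %d\n", i, proc);
    task(i);
}

static void increment(int i) { x[i] += 1.0; }

int main(void)
{
    /* A computation touches every element; each access to a non-local
     * element migrates the work to that element's owner.               */
    for (int i = 0; i < N; i++)
        run_on(owner(i), increment, i);
    return 0;
}
```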
A Compile-Time OpenMP Cost Model
OpenMP is a de facto API for parallel programming in C/C++ and Fortran on shared-memory and distributed shared-memory platforms. It is also increasingly used together with MPI to form a hybrid programming model, and it is expected to be a promising candidate for exploiting emerging multicore architectures. An OpenMP cost model is an analytical model that reflects the characteristics of OpenMP applications...
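The excerpt is cut off before the model itself. As a toy illustration of what an analytical OpenMP cost model can look like (not the paper's actual model), the time of a parallel loop can be predicted as fork/join overhead plus per-thread scheduling cost plus the loop body divided across threads; the function predict_us and its constants below are made up for the example.

```c
#include <stdio.h>

/* A toy analytical cost model for an OpenMP parallel loop:
 * predicted time = fork/join overhead (grows with thread count)
 *                + a fixed per-thread scheduling cost
 *                + the loop body work divided across threads.
 * The constants are illustrative, not measured values.           */
static double predict_us(int threads, long iterations, double us_per_iter)
{
    const double fork_join_us = 2.0;   /* overhead per extra thread  */
    const double sched_us     = 0.5;   /* per-thread scheduling cost */

    double overhead = fork_join_us * (threads - 1) + sched_us * threads;
    double body     = (iterations * us_per_iter) / threads;
    return overhead + body;
}

int main(void)
{
    /* A compiler could compare such predictions across thread counts
     * to decide whether parallelizing a given loop is worthwhile.    */
    for (int t = 1; t <= 8; t *= 2)
        printf("%d threads: %.1f us\n", t, predict_us(t, 10000, 0.01));
    return 0;
}
```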