An introduction to processor-time-optimal systolic arrays
Authors
Abstract
We consider computations suitable for systolic arrays, often called regular array computations or systems of uniform recurrence relations. In such computations, the tasks to be computed are viewed as the nodes of a directed acyclic graph (dag), where the data dependencies are represented as arcs. A processor-time-minimal schedule measures the minimum number of processors needed to extract the maximum parallelism from the dag. We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of an algorithm represented as a dag. The application of this technique is illustrated with a tensor product computation. We then consider the free schedule of algorithms for matrix product, Gaussian elimination, and transitive closure. For each problem, we provide a time-minimal processor schedule that meets the computed processor lower bounds, including the one for tensor product.
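To make the dag model concrete, here is a minimal Python sketch of a free schedule and a simple processor lower bound for matrix product. Everything in it is an assumption for illustration rather than the paper's own construction: the dag is taken to be the n × n × n index cube with unit dependence arcs along each axis, the free schedule places each node at the earliest step its predecessors allow, and the bound is just the largest number of nodes that share one step. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def matrix_product_dag(n):
    """Assumed dag: nodes (i, j, k) in [1, n]^3 with arcs from
    (i-1, j, k), (i, j-1, k) and (i, j, k-1) into (i, j, k)."""
    # Lexicographic order is a topological order for these arcs.
    nodes = list(product(range(1, n + 1), repeat=3))

    def preds(v):
        i, j, k = v
        candidates = [(i - 1, j, k), (i, j - 1, k), (i, j, k - 1)]
        return [u for u in candidates if min(u) >= 1]

    return nodes, preds


def free_schedule(nodes, preds):
    """Earliest step for each node; nodes must arrive in topological order."""
    step = {}
    for v in nodes:
        step[v] = 1 + max((step[u] for u in preds(v)), default=0)
    return step


def processor_lower_bound(step):
    """Nodes sharing a step need distinct processors, so the busiest
    step bounds the processor count of this schedule from below."""
    return max(Counter(step.values()).values())


if __name__ == "__main__":
    nodes, preds = matrix_product_dag(4)
    step = free_schedule(nodes, preds)
    print("schedule length:", max(step.values()))
    print("processor lower bound:", processor_lower_bound(step))
```

In this sketch a node (i, j, k) lands on step i + j + k - 2, so the schedule length equals the longest path in the dag, and the bound counts the lattice points on the most populated plane i + j + k = constant.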
Similar resources
Bounded Broadcast in Systolic Arrays
Much work has been done on the problem of synthesizing a processor array from a system of recurrence equations. Some researchers limit communication to nearest neighbors in the array; others use broadcast. In many cases, neither of the above approaches results in an optimal execution time. In this paper a technique called bounded broadcast is explored whereby an element of a processor array can ...
An Efficient Allocation Strategy for Mapping Affine Recurrences into
This paper addresses the problem of efficient mappings of nested loops, and more generally of systems of affine recurrence equations, into regular arrays. The presented technique is based on the transformation of an initial systolic mapping. By studying the processor element (PE) activity, a nearly space-optimal mapping is designed by serializing the computations of several initial PEs into a single...
Processor-time-optimal systolic arrays
Minimizing the amount of time and number of processors needed to perform an application reduces the application's fabrication cost and operation costs. A directed acyclic graph (dag) model of algorithms is used to define a time-minimal schedule and a processor-time-minimal schedule. We present a technique for finding a lower bound on the number of processors needed to achieve a given schedule of a...
Systematic Methodology of Mapping Signal Processing Algorithms into Arrays of Processors
High-speed signal processing has become the only alternative in modern communication systems, given rapidly growing microelectronics technology. High-speed, real-time signal processing depends critically on both parallel algorithms and parallel processor technology. Special-purpose array processor structures have become a real possibility for high speed signal pro...
Reducing the Number of Processors Elements in Systolic Arrays for Matrix Multiplication
The author discusses the problem of determining the parameters of systolic arrays suitable for implementing regular 3-nested loop algorithms. The author shows that the best results are obtained when the characteristics of so-called adaptable algorithms to the projection direction are used. These characteristics can be space (number of processor elements, chip area, input-output elements, ...), time (flow period of...