Efficient Inter-Task Communication for Nested Loop Programs on a Multiprocessor System

Authors

  • Tjerk Bijlsma
  • Marco Bekooij
  • Gerard Smit
  • Pierre Jansen
Abstract

In modern multiprocessor systems, inter-task communication can stall a processor when it reads from a remote buffer. This paper presents a solution for inter-task communication that has minimal impact on system performance and hides the communication latency without requiring additional hardware. The solution applies to jobs represented as task graphs in which the tasks are nested loop programs. Buffers are allocated in the scratch-pad memories of the consuming tasks to provide low-latency read access. For the nested loop programs, minimal buffer sizes can be determined that cover all possible communication patterns. The added computational complexity is low, as the solution adds only a few operations to the nested loop programs.

Keywords: Nested Loop Program, Scratch-Pad Memory, Circular Buffer.
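The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of a circular buffer that lives on the consumer's side, so that reads are local and only writes cross between tasks. The class `ConsumerSideCircularBuffer`, the function `run_pipeline`, and the two-counter full/empty test are all assumptions made for illustration, not the paper's method. The sketch also handles only in-order access, whereas the paper addresses more general communication patterns.

```python
class ConsumerSideCircularBuffer:
    """Single-producer, single-consumer circular buffer (hypothetical sketch).

    The list `slots` stands in for the consumer's scratch-pad memory:
    the consumer reads it locally, the producer writes into it remotely.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity
        self.write_count = 0  # total values written by the producer
        self.read_count = 0   # total values consumed by the consumer

    def write(self, value):
        """Producer side: store a value unless the buffer is full."""
        if self.write_count - self.read_count >= self.capacity:
            return False  # full: the producer must retry later
        self.slots[self.write_count % self.capacity] = value
        self.write_count += 1
        return True

    def read(self):
        """Consumer side: local read, or None if nothing is available."""
        if self.write_count == self.read_count:
            return None  # empty: the consumer must wait
        value = self.slots[self.read_count % self.capacity]
        self.read_count += 1
        return value


def run_pipeline(rows, cols, capacity):
    """Stream the values of a rows x cols nested loop through the buffer."""
    buf = ConsumerSideCircularBuffer(capacity)
    # Producer task: a two-level nested loop producing one value per iteration.
    to_send = [i * cols + j for i in range(rows) for j in range(cols)]
    sent = 0
    received = []
    while len(received) < len(to_send):
        # Producer fills the buffer until it is full or the data runs out.
        while sent < len(to_send) and buf.write(to_send[sent]):
            sent += 1
        # Consumer performs one local read per step.
        received.append(buf.read())
    return received


if __name__ == "__main__":
    print(run_pipeline(rows=2, cols=3, capacity=2))
```

The only extra work per access is one comparison, one modulo, and one counter increment, which fits the abstract's point that the approach adds just a few operations to each nested loop program.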

Similar resources

Free Scheduling of General Nested Loops For Distributed Memory Architectures

In terms of execution time, the most expensive part of a program is its nested loops. Loop parallelization involves two steps: first, the time partitioning of the index space to achieve the minimum makespan, and second, the efficient assignment of the concurrent partitions to the target parallel architecture. If distributed memory multiprocessor systems are used, overall performance is decline...

Parallelization of While-Loops in Nested Loop Programs for Real-time Multiprocessor Systems

Many applications with stream processing behavior contain one or more loops with an unknown number of iterations. These loops have to be parallelized in order to utilize the maximum capacity of an embedded multiprocessor platform and thus increase the total throughput. This thesis presents a method to automatically extract a parallel task graph based on function level parallelism from a sequent...

Pre-scheduling and Scheduling of Task Graph on Homogeneous Multiprocessor Systems

Task graph scheduling is a multi-objective, NP-hard optimization problem. In this paper, a new algorithm for homogeneous multiprocessor systems is proposed. Scheduling algorithms generally aim to balance two parameters: time and energy consumption. Up to a certain limit, these two parameters conflict with each other, and improvement of one causes reduction in the othe...

Cache Optimization for Coarse Grain Task Parallel Processing Using Inter-Array Padding

The wide use of multiprocessor systems has made automatic parallelizing compilers more important. To further improve multiprocessor performance through the compiler, multigrain parallelization is important. In multigrain parallelization, coarse grain task parallelism among loops and subroutines and near fine grain parallelism among statements are used in addition to the traditional loop...

Publication date: 2007