Global Trade-off between Code Size and Performance for Loop Unrolling on VLIW Architectures

Authors

  • K. Heydemann
  • F. Bodin
Abstract

Many media processors [28, 7, 14, 8, 18, 27], used for compute-intensive embedded applications, are VLIW architectures that rely on the compiler to exploit instruction-level parallelism. Loop unrolling is generally used to expose instruction parallelism, but computing the unrolling factor is very difficult, as instruction cache misses and spill code can cancel the expected benefit of the transformation. Moreover, increasing the code size directly impacts the cost of the embedded system. In this paper, we propose a method called UFC (Unrolling Factor Computation under Constraints) to compute the unrolling factors of a set of loops while taking into account code size, a major issue for embedded systems.
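As a minimal illustration of the transformation the abstract refers to, the sketch below shows a summation loop before and after unrolling by hand. It is not the UFC method; the function names and the unrolling factor of 4 are assumptions chosen only for this example. The unrolled body exposes four independent additions per iteration, at the cost of a larger loop body, a remainder loop, and more live registers.

```c
#include <stddef.h>

/* Baseline loop: one addition and one loop branch per element. */
long sum_rolled(const int *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same computation unrolled by 4 (an arbitrary factor for illustration).
 * Four independent accumulators give the scheduler parallel work,
 * but the code is larger and uses more registers. */
long sum_unrolled4(const int *a, size_t n) {
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;

    /* Main unrolled body: handles elements in groups of four. */
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    long s = s0 + s1 + s2 + s3;

    /* Remainder loop: handles the last n % 4 elements. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

Both functions return the same result for any input; only code size, register use, and the parallelism available to the scheduler differ, which is the trade-off the abstract describes.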

Similar resources

UFC: a Global Trade-off Strategy for Loop Unrolling for VLIW Architecture

In order to minimize code size overhead on VLIW architectures, compilers for embedded processors have to pay more attention to code optimization than to compilation time. Thus, the first demand on a compiler for embedded processors consists in spending instruction memory space for optimization only if the associated performance improvement justifies it. In this paper, we propose a novel method ba...

A Study of Loop Unrolling for VLIW-based DSP Processors

With the growing popularity of DSPs and their associated applications, cost-effective software development has become a major issue. High-level language compilers are becoming more commonplace in the DSP world. While these compilers can generate correct code for DSP architectures, there remains considerable room for performance improvements. This paper addresses issues related to DSP compilatio...

The Effectiveness of Loop Unrolling for Modulo Scheduling in Clustered VLIW Architectures

Clustered organizations are becoming a common trend in the design of VLIW architectures. In this work we propose a novel modulo scheduling approach for such architectures. The proposed technique performs the cluster assignment and the instruction scheduling in a single pass, which is shown to be more effective than doing first the assignment and later the scheduling. We also show that loop unro...

Code Size Aware Compilation for Real-Time Applications

Statically constructed plan of execution (POE) and aggressive instruction level parallelism (ILP) exploitation make EPIC/VLIW processors appropriate for high performance real-time systems. On the one hand, the compiler controlled POE makes the worst-case execution-time (WCET) analysis more accurate as run-time variations are minimized. On the other hand, the compiler can leverage ILP optimizati...

Self-Evaluating Compilation Applied to Loop Unrolling

Well-engineered compilers use a carefully selected set of optimizations, heuristic optimization policies, and a phase ordering to produce good machine code. Designing a compiler with one heuristic per optimization that works well with other optimization phases is a challenging task. Although compiler designers evaluate the optimization heuristics and phase ordering before deployment, compilers ...

Publication date: 2007