Efficacy of Code Optimizations on Cache-based Processors

Author

  • Rob F. Van der Wijngaart

Related articles

Code Size Efficiency in Global Scheduling for VLIW/EPIC Style Embedded Processors

In embedded computing, code size is very important for system cost and performance. In global scheduling for VLIW/EPIC style embedded processors, region-enlarging optimizations, especially tail duplication, are commonly used to exploit instruction level parallelism (ILP) to boost the performance. The code size increase due to such optimizations, however, raises serious concerns about the affect...

Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching

Cache optimizations use code transformations to increase the locality of memory accesses and use prefetching techniques to hide latency. For best performance, hardware prefetching units of processors should be complemented with software prefetch instructions. A cache simulation enhanced with a hardware prefetcher is presented to run code for a 3D multigrid solver. Thus, cache misses not predict...

Inside the Intel® 10.1 Compilers: New Threadizer and New Vectorizer for Intel® Core™2 Processors

The fast introduction of the Intel Core™2 Duo and Quad processors to the mass market has drawn attention to threadization (a.k.a. parallelization) and vectorization of the existing code in many application domains. In fact, multi-core processor vendors are eager to enable their users to exploit various levels of parallelism in order to harness the additional compute resources of multi-core pro...

Multi-Core Software

The fast introduction of the Intel Core™2 Duo and Quad processors to the mass market has drawn attention to threadization (a.k.a. parallelization) and vectorization of the existing code in many application domains. In fact, multi-core processor vendors are eager to enable their users to exploit various levels of parallelism in order to harness the additional compute resources of multi-core pro...

Platform-Independent Cache Optimization by Pinpointing Low-Locality Reuse

For many applications, cache misses are the primary performance bottleneck. Even though much research has been performed on automatically optimizing cache behavior at the hardware and the compiler level, many program executions remain dominated by cache misses. Therefore, we propose to let the programmer optimize, who has a better high-level program overview, needed to resolve many cache proble...

Publication date: 1997