Performance Effects of a Cache Miss Handling Architecture in a Multi-core Processor

Authors

  • Magnus Jahre
  • Lasse Natvig
Abstract

Multi-core processors, also called chip multiprocessors (CMPs), have recently been proposed to counter several of the problems associated with modern superscalar microprocessors: limited instruction level parallelism (ILP), high power consumption and large design complexity. However, the performance gap between a processor core and main memory is large and growing. Consequently, multi-core architectures must invest in techniques to hide the large memory latency. One way of doing this is to use non-blocking, or lockup-free, caches. The key idea is that a cache can continue to service requests while one or more misses are being processed at a lower level of the memory hierarchy. This technique was first proposed by Kroft [16]. The main contribution of this paper is the observation that a non-blocking cache must fulfill two functions. Firstly, it should provide sufficient miss parallelism to speed up the applications. Secondly, the number of parallel misses should not be so large that it creates congestion in the on-chip interconnect or the off-chip memory bus. While the first function is well known, the second arises from the possibility of contention when multiple processors are placed on a single chip. A compromise miss handling architecture (MHA) evaluated in this work, which handles 16 parallel misses in the L1 cache, achieves an average speed-up of 47% over a blocking cache at a hardware cost of 9% of the L1 cache area.
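The bounded miss parallelism described in the abstract can be illustrated with a toy model. The sketch below is not the authors' design: the class name, method names and the max_outstanding parameter are hypothetical, and merging repeated misses to the same block into one entry is an assumption commonly made for non-blocking caches rather than something the abstract states. It only shows the two behaviours at stake: the cache keeps accepting misses while earlier ones are outstanding, and it stalls once the limit on parallel misses is reached.

```python
from dataclasses import dataclass, field


@dataclass
class MissHandlingArchitecture:
    """Toy model (hypothetical): one entry per outstanding missed block."""

    max_outstanding: int  # assumed limit on parallel misses
    pending: dict = field(default_factory=dict)  # block address -> waiting request ids

    def on_miss(self, block_addr: int, request_id: int) -> str:
        if block_addr in self.pending:
            # The block is already being fetched: attach to the existing entry.
            self.pending[block_addr].append(request_id)
            return "merged"
        if len(self.pending) >= self.max_outstanding:
            # No free entry: the cache must block until a fill returns.
            return "stall"
        # Free entry available: record the miss and keep servicing requests.
        self.pending[block_addr] = [request_id]
        return "allocated"

    def on_fill(self, block_addr: int) -> list:
        """The lower level returned the block; release every waiting request."""
        return self.pending.pop(block_addr, [])


# Small limit chosen only to make the stall visible in a short trace.
mha = MissHandlingArchitecture(max_outstanding=2)
results = [
    mha.on_miss(0x100, 1),  # first miss
    mha.on_miss(0x140, 2),  # second miss, handled in parallel
    mha.on_miss(0x100, 3),  # same block as request 1
    mha.on_miss(0x180, 4),  # third distinct block, limit reached
]
woken = mha.on_fill(0x100)      # block 0x100 arrives from the lower level
retry = mha.on_miss(0x180, 4)   # the stalled request can now proceed
print(results, woken, retry)
```

In this model, setting max_outstanding to 16 would correspond to the number of parallel L1 misses in the compromise configuration mentioned in the abstract. A larger limit exposes more miss parallelism but also allows more simultaneous requests onto the interconnect and memory bus, which is the trade-off the paper highlights.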


Similar resources

Performance of Cache Memory Subsystems for Multicore Architectures

Advancements in multi-core have created interest among many research groups in finding ways to harness the true power of processor cores. Recent research suggests that on-board components such as cache memory play a crucial role in deciding the performance of multi-core systems. In this paper, performance of cache memory is evaluated through parameters such as cache access time, miss ra...


A High Performance Adaptive Miss Handling Architecture for Chip Multiprocessors

Chip Multiprocessors (CMPs) mainly base their performance gains on exploiting thread-level parallelism. Consequently, powerful memory systems are needed to support an increasing number of concurrent threads. Conventional CMP memory systems do not account for thread interference which can result in reduced overall system performance. Therefore, conventional high bandwidth Miss Handling Architect...


Measuring Performance Degradation in Multi-core Processors due to Shared resources

Resource sharing in multicore processors can lead to many side effects, most of which are undesirable. Cross-core interference in particular is a major performance bottleneck. It is important that chip multiprocessors (CMPs) incorporate methods that minimise this interference. To do so, an accurate measure of cross-core interference needs to be devised. This paper studies the re...


Architecture Aware Programming on Multi-Core Systems

To improve processor performance, the industry's response has been to increase the number of cores on the die. One salient feature of multi-core architectures is that they have a varying degree of sharing of caches at different levels. With the advent of multi-core architectures, we are facing a problem that is new to parallel computing, namely, the management of hierarchica...


Effects of Main Memory Latencies on the Performance of Nonblocking Caches

Lockup-free caches in conjunction with non-blocking processor loads have been proposed to hide miss latencies in high performance processors. One problem with current approaches is the increased complexity of the processor and of the cache controller due to non-blocking operation. In this paper, we introduce a simple mechanism to support non-blocking loads and a lockup-free cache. A modified SPARC architec...




Publication date: 2007