Accurately modeling speculative instruction fetching in trace-driven simulation

Authors

  • R. Bhargava
  • L. K. John
  • F. Matus
Abstract

Performance evaluation of modern, highly speculative, out-of-order microprocessors and the corresponding production of detailed, valid, accurate results have become serious challenges. A popular evaluation methodology is trace-driven simulation, which provides the advantage of a highly portable simulator that is independent of the constraints of the trace generation system. While developing and maintaining a trace-driven simulator is relatively easy compared with the alternatives, a primary drawback is the inability to accurately simulate speculative instruction fetching and subsequent execution. Fetching from an incorrect path occurs often in a speculative processor; however, it is difficult to capture this information in a trace. This paper investigates a scheme to accurately model instruction fetching within a trace-driven framework. This is accomplished by recreating an approximate copy of the object code segment, which we call resurrected code, using a preliminary pass through the trace. We discuss a fast and memory-efficient method for implementing this resurrected code. In addition, we characterize UltraSPARC traces of C, C++, and Fortran programs generated using Shade to determine the potential of this method. Using these traces and a modest branch prediction scheme, we find that in 14 of 16 cases more than 99% of all branches will find their target instruction in the resurrected code. Furthermore, on these occasions, a large number of consecutive instructions are available along the mispredicted path. These results indicate that the inaccuracies associated with speculative fetching in trace-driven simulation can be significantly reduced using this resurrected code.

L. John is supported by the National Science Foundation under Grants CCR-9796098 (CAREER Award) and EIA9807112, and by a grant from the Texas Advanced Technology Program. F. Matus is also with Advanced Micro Devices.
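The resurrected-code idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract mentions a fast, memory-efficient structure but does not describe it, so a plain Python dictionary is swapped in here. The names `TraceRecord`, `build_resurrected_code`, and `fetch_wrong_path` are hypothetical, as is the record layout. Fixed 4-byte instructions are assumed, and delay slots are ignored for simplicity.

```python
from typing import Dict, Iterable, List, NamedTuple


class TraceRecord(NamedTuple):
    """One committed instruction from the trace (hypothetical layout)."""
    pc: int      # instruction address
    opcode: str  # mnemonic or raw encoding, whatever the trace carries


INSTR_BYTES = 4  # assumption: fixed-width 4-byte instructions


def build_resurrected_code(trace: Iterable[TraceRecord]) -> Dict[int, str]:
    """Preliminary pass: record every instruction the trace ever touches.

    The result approximates the object code segment. Addresses that the
    traced run never executed are simply absent.
    """
    code: Dict[int, str] = {}
    for rec in trace:
        code.setdefault(rec.pc, rec.opcode)
    return code


def fetch_wrong_path(code: Dict[int, str], target: int,
                     max_instrs: int) -> List[TraceRecord]:
    """Fetch consecutive instructions starting at a mispredicted target.

    Fetching stops at the first address missing from the resurrected
    code, or after max_instrs instructions.
    """
    fetched: List[TraceRecord] = []
    pc = target
    while len(fetched) < max_instrs and pc in code:
        fetched.append(TraceRecord(pc, code[pc]))
        pc += INSTR_BYTES
    return fetched


# Toy trace: the branch at 0x1004 is taken on the first pass and falls
# through on the second, so both paths end up in the resurrected code.
demo_trace = [
    TraceRecord(0x1000, "cmp"), TraceRecord(0x1004, "bne"),
    TraceRecord(0x1010, "add"), TraceRecord(0x1014, "ba"),
    TraceRecord(0x1000, "cmp"), TraceRecord(0x1004, "bne"),
    TraceRecord(0x1008, "ld"), TraceRecord(0x100C, "st"),
    TraceRecord(0x1010, "add"), TraceRecord(0x1014, "ba"),
]
demo_code = build_resurrected_code(demo_trace)
for rec in fetch_wrong_path(demo_code, 0x1008, max_instrs=8):
    print(hex(rec.pc), rec.opcode)
```

In this toy trace, a simulated misprediction toward the fall-through address `0x1008` can be serviced from the resurrected code even while the trace itself follows the taken path. A target that the traced run never executed yields an empty list, which corresponds to the cases where the wrong-path target is not found.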

Similar resources

Out-of-Order Instruction Fetch Using Multiple Sequencers

Conventional instruction fetch mechanisms fetch contiguous blocks of instructions in each cycle. They are difficult to scale since taken branches make it hard to increase the size of these blocks beyond eight instructions. Trace caches have been proposed as a solution to this problem, but they use cache space inefficiently. We show that fetching large blocks of contiguous instructions, or wide ...

Pre-execution via Speculative Data-driven Multithreading

This dissertation introduces pre-execution, a novel technique for accelerating sequential programs. Pre-execution directly attacks the instructions that cause performance problems: mis-predicted branches and cache-missing loads. In pre-execution, future branch outcomes and load addresses are computed on the side and the results are fed to the main program. In doing so, the main program is spared ...

Can Trace-Driven Simulators Accurately Predict Superscalar Performance?

There are four crucial issues associated with performance simulators: simulator retargetability, simulator validation, simulation speed and simulation accuracy. This paper documents our experiences in developing performance simulators and our recent findings in using these simulators. We are concerned with all four of the crucial issues. Our first-generation tool, VMW, focused on achieving reta...

Modeled and Measured Instruction Fetching Performance for Superscalar Microprocessors

Instruction fetching is critical to the performance of a superscalar microprocessor. We develop a mathematical model for three different cache techniques and evaluate its performance both in theory and in simulation using the SPEC95 suite of benchmarks. In all the techniques, the fetching performance is dramatically lower than ideal expectations. To help remedy the situation, we also evaluate i...

Future Branches - Beyond Speculative Execution

The performance and hardware complexity of superscalar architectures are hindered by conditional branch instructions. When conditional branches are encountered in a program, the instruction fetch unit must rapidly predict the branch predicate and begin speculatively fetching instructions with no loss of instruction throughput. Speculative execution increases hardware cost, since speculative inst...

Publication date: 1999