Dual-thread Weld: A Technique for Latency Tolerance in Horizontal Architectures

Authors

  • Emre Özer
  • Mark C. Toburen
  • Thomas M. Conte
Abstract

This paper presents the dual-thread Weld architecture for VLIW/EPIC processors. The dual-thread Weld model supports one main thread and one speculative thread running simultaneously in a VLIW/EPIC processor, with a register file and a fetch unit per thread. The paper analyzes the cost-performance impact of the dual-thread Weld model, including the effect of migrating the disambiguation hardware for speculative memory operations to the compiler and the sensitivity of the model to variation in branch misprediction and second-level cache miss penalties. A speedup of up to 35% can be gained using dual-thread Weld compared to a single-threaded VLIW/EPIC processor.

Similar resources

Weld for Itanium Processor

Sharma, Saurabh. Weld for Itanium Processor (under the direction of Dr. Thomas M. Conte). This dissertation extends WELD to Itanium processors. Emre Özer presented the WELD architecture in his Ph.D. thesis. WELD integrates multithreading support into an Itanium processor to hide run-time latency effects that cannot be determined by the compiler. Also, it proposes a hardware technique called operat...

Weld: A Multithreading Technique Towards Latency-Tolerant VLIW Processors

This paper presents a new architecture model, named Weld, for VLIW processors. Weld integrates multithreading support into a VLIW processor to hide run-time latency effects that cannot be determined by the compiler. It does this through a novel hardware technique called operation welding that merges operations from different threads to utilize the hardware resources more efficiently. Hardware c...

Latency Tolerance: A Metric for Performance Analysis of Multithreaded Architectures

Multithreaded multiprocessor systems (MMS) have been proposed to tolerate long latencies for communication. This paper provides an analytical framework based on closed queueing networks to quantify and analyze the latency tolerance of multithreaded systems. We introduce a new metric, called the tolerance index, which quantifies the closeness of performance of the system to that of an ideal syst...

Performance Modeling of Multithreaded Distributed Memory Architectures

In multithreaded distributed memory architectures, long-latency memory operations and synchronization delays are tolerated by suspending the current thread and switching to another thread, which is executed concurrently with the long-latency operation of the suspended thread. Timed Petri nets are used to model several multithreaded architectures at the instruction and thread levels. Model eval...

Thread Pitch Variant in Orthodontic Mini-screws: A 3-D Finite Element Analysis

Orthodontic miniscrews are widely used as temporary anchorage devices to facilitate orthodontic movements. Miniscrew loosening is a common problem, which usually occurs during the first two weeks of treatment. Macrodesign can affect the stability of a miniscrew by changing its diameter, length, thread pitch, thread shape, tapering angle and so on. In this study, a 3-D finite element analysis wa...

Publication date: 2003