Thread Data Sharing in Cache

Authors

Abstract


Similar Articles

[Figure text: a second-level instruction cache shared by multiple thread processing units, each with its own first-level instruction cache and execution unit]

This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture [11, 12], allows concurrent execution of l...

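The pipelining idea summarized above can be illustrated with a minimal sketch: a loop whose iterations each begin with a short ordered (cross-iteration dependent) stage followed by independent work, so iteration i+1 may start as soon as iteration i finishes its ordered stage. The function and variable names below are invented for illustration, and the speculation and run-time dependence checking of the superthreaded model are omitted.

```python
import threading

N_ITER = 8
results = [None] * N_ITER
# done[i] is set once iteration i has finished its dependent (ordered) stage,
# allowing iteration i+1 to begin -- the essence of thread pipelining.
done = [threading.Event() for _ in range(N_ITER)]

shared_state = {"acc": 0}

def iteration(i):
    # Wait until the previous iteration has completed its ordered stage.
    if i > 0:
        done[i - 1].wait()
    # Dependent stage: must run in loop order (cross-iteration dependence).
    shared_state["acc"] += i
    snapshot = shared_state["acc"]
    done[i].set()          # release the next iteration
    # Independent stage: may overlap with later iterations' dependent stages.
    results[i] = snapshot * snapshot

threads = [threading.Thread(target=iteration, args=(i,)) for i in range(N_ITER)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```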

Improving Inter-thread Data Sharing with GPU Caches

The massive amount of fine-grained parallelism exposed by a GPU program makes it difficult to exploit shared cache benefits even when there is good program locality. The non-deterministic nature of thread execution in the bulk-synchronous parallel (BSP) model makes the situation even worse. Most prior work in exploiting GPU cache sharing focuses on regular applications that have linear memory acces...


Parallel Data Sharing in Cache: Theory, Measurement and Analysis

Cache sharing on a multicore processor is usually competitive. In multi-threaded code, however, different threads may access the same data and have a cooperative effect in cache. This report describes a new metric called shared footprint and a new locality theory to measure and analyze parallel data sharing in cache. Shared footprint is machine-independent, i.e. data sharing in all cache sizes...

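The abstract above names the shared footprint metric without defining it (the definition is in the full text), so the following is only an illustrative sketch of measuring data sharing from a merged multi-thread access trace: for each sharer count k, it counts how many distinct cache blocks were touched by exactly k threads. The trace format, `block_bytes` granularity, and function name are assumptions, not the paper's metric.

```python
from collections import defaultdict

def sharers_histogram(trace, block_bytes=64):
    """trace: iterable of (thread_id, address) pairs.
    Returns {k: number of distinct cache blocks accessed by exactly k threads}."""
    sharers = defaultdict(set)              # cache block -> set of thread ids
    for tid, addr in trace:
        sharers[addr // block_bytes].add(tid)
    hist = defaultdict(int)
    for block, tids in sharers.items():
        hist[len(tids)] += 1
    return dict(hist)

# Example: two threads touching overlapping blocks.
trace = [(0, 0x1000), (0, 0x1040), (1, 0x1040), (1, 0x2000), (0, 0x2000)]
print(sharers_histogram(trace))   # {1: 1, 2: 2}
```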

Cache-Fair Thread Scheduling for Multicore Processors

We present a new operating system scheduling algorithm for multicore processors. Our algorithm reduces the effects of unequal CPU cache sharing that occur on these processors and cause unfair CPU sharing, priority inversion, and inadequate CPU accounting. We describe the implementation of our algorithm in the Solaris operating system and demonstrate that it produces fairer schedules enabling be...

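The abstract above states the goal of cache-fair scheduling but not the mechanism. As a rough illustration of the compensation idea (grant a thread more CPU time when co-runners inflate its cycles per instruction beyond an estimated fair value, and less when it runs faster than fair), here is a toy adjustment step; the names and formula are mine, not the Solaris implementation described in the paper.

```python
def adjust_quantum(base_quantum_ms, measured_cpi, fair_cpi):
    """Toy cache-fair compensation: scale a thread's CPU quantum by how much
    co-runners slowed it down relative to its estimated 'fair' CPI."""
    return base_quantum_ms * (measured_cpi / fair_cpi)

# A thread slowed from an estimated fair CPI of 1.2 to a measured 1.8
# would receive a 50% longer quantum to keep its CPU accounting fair.
print(adjust_quantum(10.0, measured_cpi=1.8, fair_cpi=1.2))   # 15.0
```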

Sharing and Contention in Coherent-cache Parallel

Parallel graph reduction is a model for parallel program execution in which shared memory is used under a strict access regime with single assignment and blocking reads. We present the design of an efficient and accurate multiprocessor simulation scheme and the results of a simulation study of the pattern of access of a suite of benchmark programs.

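The "strict access regime with single assignment and blocking reads" mentioned above corresponds to a write-once cell whose readers block until the value arrives. A minimal sketch of such a cell follows; the class and method names are my own, not from the paper.

```python
import threading

class WriteOnceCell:
    """Single-assignment cell: at most one write; reads block until it happens."""
    def __init__(self):
        self._cond = threading.Condition()
        self._filled = False
        self._value = None

    def write(self, value):
        with self._cond:
            if self._filled:
                raise RuntimeError("cell already assigned")
            self._value = value
            self._filled = True
            self._cond.notify_all()     # wake any blocked readers

    def read(self):
        with self._cond:
            while not self._filled:     # blocking read
                self._cond.wait()
            return self._value

# A reader blocks until another thread supplies the value.
cell = WriteOnceCell()
reader = threading.Thread(target=lambda: print("read:", cell.read()))
reader.start()
cell.write(42)
reader.join()
```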


Journal

Journal title: ACM SIGPLAN Notices

Year: 2017

ISSN: 0362-1340, 1558-1160

DOI: 10.1145/3155284.3018759