[Figure: node block diagram, repeated for three nodes, with the labels Directory, Memory, CPU, Cache, and Network Interface]

Authors

  • Ramesh R. Lakshmi
  • R. Govindarajan
Abstract

Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

The Distributed Shared Memory (DSM) approach provides the illusion of a global shared address space by implementing a layer of shared memory abstraction on a physically distributed memory system. In this paper, we present DSM-SP2, a software distributed shared memory system built on the IBM SP2, a distributed memory machine. DSM-SP2 is implemented completely in software as a set of user-level library routines on top of the AIX operating system, without requiring any modifications to the operating system or any additional compiler support. The salient features of DSM-SP2 are: (i) it implements a lazy release consistency model with a hybrid coherence protocol to reduce communication overheads; (ii) it allows multiple concurrent writers to minimize the effects of false sharing; (iii) to reduce the DSM overheads and the idling time of processes, the implementation allows multiple processes per node; and (iv) it implements a new synchronization primitive, called conditional lock acquire/release, for effective and simple producer-consumer style synchronization. Detailed performance measurements are reported for three benchmark programs, namely Water, Jacobi, and Tomcatv.
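As an illustration of feature (ii), a common way for software DSM systems to support multiple concurrent writers is the twin-and-diff technique: each writer keeps a pristine copy (a "twin") of the page, and at synchronization time only the bytes it changed are merged. The abstract does not say whether DSM-SP2 works this way, so the sketch below is an assumption for illustration only, and the names `make_twin`, `compute_diff`, `apply_diff`, and `demo` are hypothetical.

```python
# Illustrative sketch only: twin-and-diff merging for multiple writers.
# Not taken from DSM-SP2; all names here are hypothetical.

def make_twin(page: bytearray) -> bytes:
    """Snapshot the page before the first local write."""
    return bytes(page)

def compute_diff(twin: bytes, page: bytearray) -> dict[int, int]:
    """Return {offset: new_byte} for every byte the writer changed."""
    return {i: new for i, (old, new) in enumerate(zip(twin, page)) if old != new}

def apply_diff(page: bytearray, diff: dict[int, int]) -> None:
    """Merge a writer's changes into another copy of the page."""
    for offset, value in diff.items():
        page[offset] = value

def demo() -> bytearray:
    home = bytearray(8)                      # master copy of the page
    copy_a, copy_b = bytearray(home), bytearray(home)
    twin_a, twin_b = make_twin(copy_a), make_twin(copy_b)

    # Two writers touch disjoint parts of the same page (false sharing).
    copy_a[0], copy_a[1] = 1, 2
    copy_b[6], copy_b[7] = 7, 8

    # At synchronization, only the changed bytes are merged.
    for twin, copy in ((twin_a, copy_a), (twin_b, copy_b)):
        apply_diff(home, compute_diff(twin, copy))
    return home
```

Because each writer ships only the bytes it modified, two processes updating disjoint parts of one page do not overwrite each other's changes, which is the effect of false sharing that a multiple-writer scheme is meant to reduce.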


Similar resources

A Novel Approach to Reduce L2 Miss Latency in Shared-Memory Multiprocessors

Recent technology improvements allow multiprocessor designers to put some key components inside the processor chip, such as the memory controller, the coherence hardware and the network interface/router. In this work we exploit such integration scale, presenting a novel node architecture aimed at reducing the long L2 miss latencies and the memory overhead of using directories that characterize ...


The RDT Router Chip: A versatile router for supporting a distributed shared memory

JUMP-1 is currently under development by seven Japanese universities to establish techniques for building an efficient distributed shared memory on a massively parallel processor. It provides a coherent cache with reduced hierarchical bit-map directory scheme to achieve cost-effective and high-performance management. Messages for coherent cache are transferred through a fat tree on the RDT(Recursi...


A CC-NUMA Prototype Card for SCI-Based PC Clustering

It is extremely important to minimize network access time in constructing a high-performance PC cluster system. For an SCI-based PC cluster, it is possible to reduce the network access time by maintaining a network cache in each cluster node. This paper presents a CC-NUMA card that utilizes a network cache for SCI-based PC clustering. The CC-NUMA card is directly plugged into the PCI slot of each no...


Array Memory CPU & Caches Address & Data Interface Processor Off-chip Cache I

The progressive integration of processor and memory has unexpected implications for the design of DSM systems. To exploit this integration best, we claim that we need to redesign the nodes of DSM systems and then reorganize the whole machine. In this paper, we propose a new DSM organization where processor nodes have their on-chip memories configured as caches and their directory controllers hav...


Hierarchical Bit-Map Directory Schemes on the RDT Interconnection Network for a Massively Parallel Processor JUMP-1

JUMP-1 is currently under development by seven Japanese universities to establish techniques of an efficient distributed shared memory on a massively parallel processor. It provides a memory coherency control scheme called the hierarchical bit-map directory to achieve cost-effective and high-performance management of the cache memory. Messages for maintaining cache coherency are transferred throug...




Publication date: 1997