Search results for: suitable locality of processing unit

Number of results: 21213500

2014

Our method approximates exact texture filtering for arbitrary scales and translations of an image while taking into account the performance characteristics of modern GPUs. Our algorithm is fast because it accesses textures with a high degree of spatial locality. Using bilinear samples guarantees that the texels we read are in a regular pattern and that we use a hardware accelerated path. We con...

2008
Xipeng Shen Jonathan Shaw

As memory hierarchy becomes deeper and shared by more processors, locality increasingly determines system performance. As a rigorous and precise locality model, reuse distance has been used in program optimizations, performance prediction, memory disambiguation, and locality phase prediction. However, the high cost of measurement has been severely impeding its uses in scenarios requiring high e...

1992
Josep Torrellas Monica S. Lam John L. Hennessy

The performance of the data cache in shared-memory multiprocessors has been shown to be different from that in uniprocessors. In particular, cache miss rates in multiprocessors do not show the sharp drop typical of uniprocessors when the size of the cache block increases. The resulting high cache miss rate is a cause of concern, since it can significantly limit the performance of multiprocessors....

2003

The crossbar is the fastest switching architecture available but is the most expensive in terms of hardware cost. The hardware complexity of a crossbar is Θ(nw), where n is the number of processors and w is the width of the data path. The reduced crossbar is a new type of switching architecture, which reduces the hardware complexity of a traditional crossbar by a factor of k, where k is the re...

2013
Jungha Lee JongBeom Lim Daeyong Jung KwangSik Chung JoonMin Gil

Hadoop, an open source implementation of the MapReduce framework, has been widely used for processing massive-scale data in parallel. Since Hadoop uses a distributed file system, called HDFS, the data locality problem often happens (i.e., a data block should be copied to the processing node when a processing node does not possess the data block in its local storage), and this problem leads to t...

2003
Cédric Bastoul Paul Feautrier

Cache memories were invented to decouple fast processors from slow memories. However, this decoupling is only partial, and many researchers have attempted to improve cache use by program optimization. Potential benefits are significant since both energy dissipation and performance highly depend on the traffic between memory levels. But modeling the traffic is difficult; this observation has led...

1997
Govindan Ravindran Michael Stumm

This paper compares the performance of hierarchical ring- and mesh-connected wormhole routed shared memory multiprocessor networks in a simulation study. Hierarchical rings are interesting alternatives to meshes since i) they can be clocked at faster rates, ii) they can have wider data paths and hence shorter message sizes, iii) they allow addition and removal of processing nodes at arbitrary lo...

Journal: Parallel Computing 2014
Yuki Sugimoto Fumihiko Ino Kenichi Hagihara

We present a cache-aware method for accelerating texture-based volume rendering on a graphics processing unit (GPU). Because a GPU has hierarchical architecture in terms of processing and memory units, cache optimization is important to maximize performance for memory-intensive applications. Our method localizes texture memory reference according to the location of the viewpoint and dynamically...

Journal: Iranian Journal of Animal Biosystematics 0
M. Rajabizadeh N. Rastegar-Pouyani A. Khosravani H. Barani-Beiranvand

This report presents a new record of Iranolacerta brandtii brandtii from 30 km south of Tekab city, West Azarbaijan Province and 130 km south of the previously known distribution range of the subspecies; a new record of Iranolacerta zagrosica in Kaljonun mountain peak, Lorestan Province, about 70 km northwest of the type locality; a new record of Apathya cappadocica urmiana in the Manesht prote...

Journal: Fundam. Inform. 2005
Dobieslaw Wróblewski

We consider finite connected undirected graphs as a model for anonymous computer networks. In this framework we show a general purpose distributed election protocol, which uses forward links over the standard communication channels between processors. The forward links are represented in the form of structured labels, so the algorithm is in fact a graph relabelling system. However, its transfor...
