Session S3A THE UCSC KESTREL HIGH PERFORMANCE SIMD PROCESSOR: PRESENT AND FUTURE
نویسندگان
چکیده
The UCSC Kestrel parallel processor is a single-board linear array processor with 512 8-bit processing elements. In the process of building the machine, we have touched nearly all aspects of computer engineering, from VLSI layout to board design and debugging, and from device drivers to new algorithm development. The programmable array is primarily designed for several core algorithms from computational biology, on which Kestrel can outperform a workstation by a factor of 20. We have also considered a variety of other algorithms, including graph coloring, computational chemistry, and neural network evaluation.
منابع مشابه
Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis
We present preliminary results of the Roofline Toolkit for multicore, manycore, and accelerated architectures. This paper focuses on the processor architecture characterization engine, a collection of portable instrumented micro benchmarks implemented with Message Passing Interface (MPI), and OpenMP used to express thread-level parallelism. These benchmarks are specialized to quantify the behav...
متن کاملAutovectorization in GCC
Vectorization is an optimization technique that has traditionally targeted vector processors. The importance of this optimization has increased in recent years with the introduction of SIMD (single instruction multiple data) extensions to general purpose processors, and with the growing significance of applications that can benefit from this functionality. With the adoption of the new Tree SSA ...
متن کاملMulti-Core Software
The fast introduction of the Intel CoreTM2 Duo and Quad processors to the mass market has drawn attention to threadization (a.k.a. parallelization) and vectorization of the existing code in many application domains. In fact, multi-core processor vendors are eager to enable their users to exploit various levels of parallelism in order to harness the additional compute resources of multi-core pro...
متن کاملInside the Intel® 10.1 Compilers: New Threadizer and New Vectorizer for Intel® CoreTM2 Processors
The fast introduction of the Intel CoreTM2 Duo and Quad processors to the mass market has drawn attention to threadization (a.k.a. parallelization) and vectorization of the existing code in many application domains. In fact, multi-core processor vendors are eager to enable their users to exploit various levels of parallelism in order to harness the additional compute resources of multi-core pro...
متن کاملToward a Toolchain for Pipeline Parallel Programming on CMPs
Today’s processors exploit the fine grain data parallelism that exists in many applications via ILP design, vector processing, and SIMD instructions. Thus, future gains must come from chipmultiprocessors, which present developers with previously unimaginable computing resources. Programmers can use these resources for coarse-grain data-parallel computation or task parallelism. Given the extensi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001