Massively Parallel Inner-Product Array Processor
Abstract
We present a hardware architecture for parallel inner-product array computation in very high dimensional feature spaces, towards a general-purpose kernel-based classifier and function approximator. The architecture is internally analog with fully digital interface. On-chip analog fine-grain parallel processing yields real-time throughput levels for high-dimensional (over 1,000 per chip) classification tasks. The architecture contains an array of computational cells with integrated digital storage and a parallel bank of analog-to-digital converters (ADC). A three-transistor unit cell combines a single-bit dynamic random-access memory (DRAM) and a charge injection device (CID) binary multiplier and analog accumulator. Digital multiplication with enhanced resolution is obtained with bit-serial input vectors and bit-parallel storage of weights, by combining quantized outputs from multiple rows of binary unit cells over time. A prototype 128 × 512 inner-product array processor on a single 3 mm × 3 mm chip fabricated in standard CMOS 0.5 µm technology achieves 8-bit effective resolution, consumes 3.3 mW of power and offers 2 × 10^12 binary MACS (multiply accumulates per second) per Watt of power. This corresponds to a factor of at least 1,000 increase in computational efficiency compared to modern desktop workstations. Based on the inner-product array processor, an efficient real-time massively-parallel hardware architecture of a Support Vector Machine classifier is presented.
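The bit-serial input and bit-parallel weight scheme described in the abstract can be illustrated in software. The sketch below is not the chip's implementation: it assumes unsigned integer operands, 4-bit word lengths, and an ideal (lossless) ADC, none of which are stated in the abstract, and the function names are hypothetical. It only shows how a multi-bit inner product can be rebuilt from binary-binary partial sums weighted by powers of two.

```python
def bit_planes(value, bits):
    """Return the binary digits of a non-negative integer, least significant first."""
    return [(value >> k) & 1 for k in range(bits)]


def binary_row_sum(weight_bits, input_bits):
    """One row of binary cells: count positions where both bits are 1.

    Assumption: this stands in for analog accumulation followed by an ideal ADC.
    """
    return sum(w & x for w, x in zip(weight_bits, input_bits))


def inner_product_bitwise(weights, inputs, weight_bits=4, input_bits=4):
    """Inner product of two unsigned integer vectors built from binary partial sums."""
    # Bit-parallel storage: one row of binary cells per weight bit.
    w_rows = [[bit_planes(w, weight_bits)[i] for w in weights]
              for i in range(weight_bits)]
    total = 0
    # Bit-serial input: one input bit plane is presented per time step.
    for j in range(input_bits):
        x_plane = [bit_planes(x, input_bits)[j] for x in inputs]
        for i, w_row in enumerate(w_rows):
            # Each binary partial sum carries the weight 2^(i + j).
            total += (1 << (i + j)) * binary_row_sum(w_row, x_plane)
    return total


weights = [3, 7, 12, 5]   # illustrative values, each fits in 4 bits
inputs = [9, 2, 15, 1]
reference = sum(w * x for w, x in zip(weights, inputs))
result = inner_product_bitwise(weights, inputs)
assert result == reference
```

In this idealized form the reconstruction is exact. In hardware, each partial sum would be quantized before combination, which is what limits the effective resolution.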
Similar resources
Analog Array Processor with Digital Resolution Enhancement and Offset Compensation
Abstract — A mixed-mode inner-product vector processor is presented. It performs high-dimensional matrix-vector multiplication on a fine-grain analog array and has a purely-digital interface. The array incorporates charge-mode analog computational cells and row-parallel analog-to-digital converters (ADC). Each of the cells includes a dynamic storage element and a charge injection device computi...
A Formal Methodology for Hierarchical Partitioning of Piecewise Linear Algorithms
processor arrays can be used as accelerators for plenty of data flow-dominant applications. The explosive growth in research and development of massively parallel processor array architectures has led to demand for mapping tools to realize the full potential of these architectures. Such architectures are characterized by hierarchies of parallelism and memory structures, i.e. processor array ...
Determination of an Optimal Processor Allocation in the Design of Massively Parallel Processor Arrays
In this paper we consider the determination of allocation functions as a part of the design of massively parallel processor arrays for algorithms which can be represented as systems of uniform recurrence equations. The objective is to find allocation functions minimizing the necessary chip area for a hardware implementation of the processor array. We propose an algorithm approximately minimizing ...
Cellular automata and non-static image processing for embodied robot systems on a massively parallel processor array
A massively parallel processor array which combines image sensing and processing is utilized for the implementation of a simple Cellular Automaton. This automaton is an essential part of an image processing task supporting object detection in real-time for an autonomous robot system. Experiments are presented, which demonstrate that objects will be detected only if they move below a specific veloc...
Massively parallel Image Compression for multimedia applications
Abstract: High quality, real-time digital image compression (required in such applications as video distribution, video conference, entertainment, tele-education, etc.) can be considered a low level image processing task and is thus well suited for implementation on massively parallel computers. This paper describes two new digital image compression algorithms able to compress and decompress images in...