Single-Port SRAM-Based Transpose Memory With Diagonal Data Mapping for Large Size 2-D DCT/IDCT
Abstract
This brief describes a new method to implement a single-port SRAM-based transpose memory for the large-size discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) used in the latest video coding standards, such as High Efficiency Video Coding (HEVC). Instead of a shift-register array or multiport SRAM, only single-port SRAM is used in the proposed design. A new diagonal data mapping scheme is proposed to reduce the number of SRAM banks needed to implement the transpose memory. The design can be flexibly extended to support DCT/IDCT of different transform sizes and different data throughput rates. To support a larger DCT/IDCT, only the depth of the SRAM needs to be increased. To support a different data throughput rate, multiple SRAM banks are organized according to the required throughput. Both row access and column access are fully supported using single-port SRAM. The equivalent gate count per bit (EGC) of the proposed approach is less than two, which is much more efficient than previous methods. The design is suitable for real-time processing of video at resolutions up to 1080p HD or higher.
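The idea of a diagonal mapping can be illustrated with a small software model. The sketch below is not the paper's exact scheme: it assumes the textbook skewed layout in which element (row, col) of an N×N block is stored in bank (row + col) mod N at address row, with N banks of depth N. The function names `bank_and_address`, `write_block`, `read_row`, `read_column`, and `conflict_free` are hypothetical and introduced only for illustration.

```python
def bank_and_address(row, col, n_banks):
    """Assumed diagonal (skewed) mapping: bank = (row + col) mod N, address = row."""
    return (row + col) % n_banks, row


def write_block(block):
    """Store an N x N block into N single-port banks, each of depth N."""
    n = len(block)
    banks = [[None] * n for _ in range(n)]  # banks[bank][address]
    for r in range(n):
        for c in range(n):
            b, a = bank_and_address(r, c, n)
            banks[b][a] = block[r][c]
    return banks


def read_row(banks, row):
    """Read one row; each element comes from a different bank."""
    n = len(banks)
    out = []
    for c in range(n):
        b, a = bank_and_address(row, c, n)
        out.append(banks[b][a])
    return out


def read_column(banks, col):
    """Read one column; each element again comes from a different bank."""
    n = len(banks)
    out = []
    for r in range(n):
        b, a = bank_and_address(r, col, n)
        out.append(banks[b][a])
    return out


def conflict_free(n):
    """Check that every row access and every column access hits each bank once."""
    for k in range(n):
        row_banks = {bank_and_address(k, c, n)[0] for c in range(n)}
        col_banks = {bank_and_address(r, k, n)[0] for r in range(n)}
        if len(row_banks) != n or len(col_banks) != n:
            return False
    return True
```

Under this assumed mapping, any full row or full column touches every bank exactly once, so a single port per bank is enough to serve either access pattern in one cycle. The paper's contribution lies in reducing the bank count and scaling depth and throughput, which this simplified model does not attempt to reproduce.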
Related articles
A Low-Cost VLSI Architecture of Multiple-Size IDCT for H.265/HEVC
In this paper, we present an area-efficient 4/8/16/32-point inverse discrete cosine transform (IDCT) architecture for a HEVC decoder. Compared with previous work, this work reduces the hardware cost from two aspects. First, we reduce the logical costs of 1D IDCT by proposing a reordered parallel-in serial-out (RPISO) scheme. By using the RPISO scheme, we can reduce the required calculations for...
Memory-Efficient and High-Performance 2-D DCT and IDCT Processors Based on CORDIC Rotation
Abstract: Two-dimensional discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) have been widely used in many image processing systems. In this paper, efficient architectures with parallel and pipelined structures are proposed to implement 8×8 DCT and IDCT processors, in which only one bank of SRAM (64 words) and a coefficient ROM (6 words) are utilized for saving the memo...
A High-Throughput and Memory-Efficiency 2-D DCT Architecture Based on CORDIC Rotation
The 2-D discrete cosine transform (DCT) is applied in image data compression and reduces memory requirements. In this paper, we use a fast DCT algorithm and propose a parallel-pipelined architecture to implement an 8×8 DCT/IDCT processor. This architecture involves two 8-point DCT processors, a dual-bank SRAM (128 words), the coefficient ROM, three multiplexers, a timing controller, and a 7-bit counter. The ker...
M.F.A.S.T.: A Highly Parallel Single Chip DSP with a 2D IDCT Example
IBM Mwave™ engineers have developed a radically new Digital Signal Processor (DSP) for real-time video and graphics applications. A scalable array of processing elements (PEs) is configured as a “folded array” for effective execution of matrix, transpose, and signal processing operations. The single chip Mwave Folded Array Signal Transform processor (M.F.A.S.T.) is a parallel DSP that provides ...
High Throughput Parallel-Pipeline 2-D DCT/IDCT Processor Chip
This paper presents a 2-D DCT/IDCT processor chip for high data rate image processing and video coding. It uses a fully pipelined row–column decomposition method based on two 1-D DCT processors and a transpose buffer based on D-type flip-flops with a double serial input/output data-flow. The proposed architecture allows the main processing elements and arithmetic units to operate in parallel at...