Comparative Study of Cache Utilization for Matrix Multiplication Algorithms
Authors
Abstract
In this work, the performance of the basic and Strassen's matrix multiplication algorithms is compared in terms of memory hierarchy utilization. The problem considered here is matrix multiplication (basic and Strassen's). Strassen's matrix multiplication algorithm has a time complexity of O(n^2.807), compared with O(n^3) for the basic multiplication algorithm. This reduction in time makes Strassen's algorithm appear faster, but the additional temporary storage it introduces makes it less efficient from a space point of view. Access patterns of the two multiplication algorithms are generated, and cache replacement algorithms (namely LRU and FIFO) are then applied to count the cache misses. With the number of misses in hand, and the time taken to process one miss taken from Hou Fang's research, we calculate the overall time consumed in processing misses for both matrix multiplication algorithms. It is found that basic matrix multiplication is far better than Strassen's matrix multiplication algorithm because of the latter's higher memory usage. This analysis is important because memory plays a vital role in deciding the efficiency of an algorithm. The additional temporary storage increases both the number of memory locations and the number of memory accesses in the case of Strassen's algorithm.
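The methodology described above can be sketched in a few lines: generate the element-level access pattern of the basic triple-loop multiplication, then replay it against a cache simulator under LRU or FIFO replacement to count misses. This is a minimal illustration, not the authors' actual simulator; the fully associative cache, the block size of 4 elements, and the back-to-back row-major layout of A, B and C are assumptions made here for concreteness.

```python
from collections import OrderedDict

def basic_matmul_accesses(n):
    """Yield the element addresses touched by the triple loop
    C[i][j] += A[i][k] * B[k][j], assuming three row-major n*n
    arrays laid out back to back in memory (an assumption)."""
    base_a, base_b, base_c = 0, n * n, 2 * n * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                yield base_a + i * n + k  # read A[i][k]
                yield base_b + k * n + j  # read B[k][j]
                yield base_c + i * n + j  # update C[i][j]

def count_misses(accesses, capacity, policy="LRU", block=4):
    """Count misses for a fully associative cache of `capacity`
    blocks, `block` matrix elements per cache block."""
    cache = OrderedDict()  # insertion order doubles as eviction order
    misses = 0
    for addr in accesses:
        line = addr // block
        if line in cache:
            if policy == "LRU":
                cache.move_to_end(line)  # refresh recency; FIFO ignores hits
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the oldest line
            cache[line] = True
    return misses

if __name__ == "__main__":
    trace = list(basic_matmul_accesses(8))
    print("LRU misses:", count_misses(trace, capacity=8, policy="LRU"))
    print("FIFO misses:", count_misses(trace, capacity=8, policy="FIFO"))
```

The same `count_misses` routine can be fed the access trace of Strassen's algorithm; because Strassen's recursion allocates temporary submatrices, its trace spans more distinct addresses, which is what drives the higher miss counts reported in the abstract.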
Similar resources
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication that could run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication cannot run on a Fibonacci Hypercube structure; therefore, a method that can run on all structures, especially the Fibonacci Hypercube, is necessary for parallel matr...
An Algebraic Approach to Cache Memory Characterization for Block Recursive Algorithms
Multiprocessor systems usually have cache or local memory in the memory hierarchy. Obtaining good performance on these systems requires that a program utilizes the cache efficiently. In this paper, we address the issue of generating efficient cache based algorithms from tensor product formulas. Tensor product formulas have been used for expressing block recursive algorithms like Strassen's matrix ...
A New Format for the Sparse Matrix-vector Multiplication
Algorithms for the sparse matrix-vector multiplication (shortly SpMV) are important building blocks in solvers of sparse systems of linear equations. Due to matrix sparsity, the memory access patterns are irregular and the utilization of a cache suffers from low spatial and temporal locality. To reduce this effect, the register blocking formats were designed. This paper introduces a new combine...
Optimal Configuration of GPU Cache Memory to Maximize the Performance
GPU devices offer great performance when dealing with algorithms that require intense computational resources. A developer can configure the L1 cache memory of the latest GPU Kepler architecture with different cache size and cache set associativity, per Streaming Multiprocessors (SM). The performance of the computation intensive algorithms can be affected by these cache parameters. In this pape...
A New Diagonal Blocking Format and Model of Cache Behavior for Sparse Matrices
Algorithms for the sparse matrix-vector multiplication (shortly SpM×V ) are important building blocks in solvers of sparse systems of linear equations. Due to matrix sparsity, the memory access patterns are irregular and the utilization of a cache suffers from low spatial and temporal locality. To reduce this effect, the diagonal register blocking format was designed. This paper introduces a ne...