Optimizing Matrix Transpose on Torus Interconnects
Abstract
Matrix transpose is a fundamental operation that arises in many scientific and engineering applications. On most multiprocessor systems, communication is the main bottleneck in performing it. In this paper, we focus on torus interconnection networks and propose application-level routing techniques that improve load balancing and thereby performance. Our basic idea is to route the data via carefully selected intermediate nodes. Applying this technique directly, however, may worsen congestion. We overcome this issue by applying the routing only to a selected set of communicating pairs. We implement our optimizations on the Blue Gene/P supercomputer and demonstrate up to 35% improvement in performance.
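To make the load-balancing concern concrete, the following is a minimal sketch, not the paper's method. It assumes a 2D n x n torus, one message per node pair in the transpose pattern (i, j) -> (j, i), and X-then-Y dimension-order routing along the shorter ring direction. The function names and the `via` parameter are hypothetical, introduced only for illustration; the abstract does not say how the intermediate nodes or the rerouted pairs are selected.

```python
from collections import Counter

def ring_steps(src, dst, n):
    """Signed hop count along a ring of size n, taking the shorter direction."""
    fwd = (dst - src) % n
    return fwd if fwd <= n - fwd else fwd - n  # negative means the "-" direction

def dor_path(src, dst, n):
    """Directed links used by X-then-Y dimension-order routing (an assumption)."""
    x, y = src
    links = []
    for dim in (0, 1):
        steps = ring_steps(src[dim], dst[dim], n)
        sign = 1 if steps >= 0 else -1
        for _ in range(abs(steps)):
            nxt = ((x + sign) % n, y) if dim == 0 else (x, (y + sign) % n)
            links.append(((x, y), nxt))
            x, y = nxt
    return links

def link_loads(n, via=None):
    """Messages per directed link for the transpose pattern (i, j) -> (j, i).

    via: optional dict mapping a source node to a hypothetical intermediate
    node; only the listed sources are rerouted, all others go direct.
    """
    via = via or {}
    loads = Counter()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal nodes keep their data
            src, dst = (i, j), (j, i)
            hops = [src, via[src], dst] if src in via else [src, dst]
            for a, b in zip(hops, hops[1:]):
                loads.update(dor_path(a, b, n))
    return loads
```

Calling `max(link_loads(n).values())` gives the load on the busiest link, and passing a `via` dictionary for a subset of sources lets one compare that figure against direct routing for the chosen pairs.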
Similar resources
An accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations
In this paper, an accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations is proposed. The convergence analysis of the algorithm is investigated. We show that the proposed algorithm converges to the exact solution for any initial value under certain assumptions. Finally, some numerical examples are given to demons...
Matrix Multiplication on Multidimensional Torus Networks
Blocked matrix multiplication algorithms such as Cannon’s algorithm and SUMMA have a 2-dimensional communication structure. We introduce a generalized ’Split-Dimensional’ version of Cannon’s algorithm (SD-Cannon) with higher-dimensional and bidirectional communication structure. This algorithm is useful for higher-dimensional torus interconnects that can achieve more injection bandwidth than si...
Matrix-free numerical torus bifurcation of periodic orbits
We consider systems φ̇ = f(φ, λ) where f : R×R → R. Such systems often arise from space discretizations of parabolic PDEs. We are interested in branches (with respect to λ) of periodic solutions of such systems. In the present paper we describe a numerical continuation method for tracing such branches. Our methods are matrix-free, i.e., Jacobians are only implemented as actions, this enables us ...
Some Modifications to Calculate Regression Coefficients in Multiple Linear Regression
In a multiple linear regression model, there are instances where one has to update the regression parameters. In such models as new data become available, by adding one row to the design matrix, the least-squares estimates for the parameters must be updated to reflect the impact of the new data. We will modify two existing methods of calculating regression coefficients in multiple linear regres...