Optimizing Matrix Transpose on Torus Interconnects

Authors

  • Venkatesan T. Chakaravarthy
  • Nikhil Jain
  • Yogish Sabharwal
Abstract

Matrix transpose is a fundamental matrix operation that arises in many scientific and engineering applications. Communication is the main bottleneck in performing matrix transpose on most multiprocessor systems. In this paper, we focus on torus interconnection networks and propose application-level routing techniques that improve load balancing, resulting in better performance. Our basic idea is to route the data via carefully selected intermediate nodes. However, employing this technique directly may worsen congestion. We overcome this issue by employing the routing only for a selected set of communicating pairs. We implement our optimizations on the Blue Gene/P supercomputer and demonstrate up to 35% improvement in performance.
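
The idea of rerouting only some communicating pairs through intermediate nodes can be illustrated with a toy model. The sketch below is not the authors' scheme: it assumes a 2D n x n torus, dimension-ordered (X then Y) shortest-path routing, one unit of data per pair, and a hypothetical selection rule (pairs with even i + j are sent via the intermediate node (i, i)). All function names are illustrative. It counts how many messages cross each directed link when node (i, j) sends to node (j, i).

```python
from collections import Counter

def ring_steps(a, b, n):
    """Yield (position, direction) for each hop on the shortest way from a to b on an n-node ring."""
    d = (b - a) % n
    # Ties (d == n/2) are broken toward the positive direction.
    step, hops = (1, d) if d <= n // 2 else (-1, n - d)
    pos = a
    for _ in range(hops):
        yield pos, step
        pos = (pos + step) % n

def route_xy(src, dst, n, load):
    """Dimension-ordered routing (X first, then Y); increments the load of every directed link used."""
    (x, y), (dx, dy) = src, dst
    for pos, step in ring_steps(x, dx, n):
        load[("x", pos, y, step)] += 1
    for pos, step in ring_steps(y, dy, n):
        load[("y", dx, pos, step)] += 1

def transpose_load(n, use_intermediate):
    """Link loads for the transpose pattern (i, j) -> (j, i) on an n x n torus.

    use_intermediate(i, j) is a hypothetical selection rule: when it returns
    True, the pair is sent in two phases through the intermediate node (i, i).
    """
    load = Counter()
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # diagonal nodes keep their own data
            src, dst = (i, j), (j, i)
            if use_intermediate(i, j):
                mid = (i, i)  # assumed intermediate, chosen for illustration only
                route_xy(src, mid, n, load)
                route_xy(mid, dst, n, load)
            else:
                route_xy(src, dst, n, load)
    return load

n = 8
direct = transpose_load(n, lambda i, j: False)
mixed = transpose_load(n, lambda i, j: (i + j) % 2 == 0)
print("max link load, direct routing:", max(direct.values()))
print("max link load, selected pairs rerouted:", max(mixed.values()))
```

In this model the rerouted pairs travel the same number of hops as before but use links in the opposite direction around the diagonal, so the busiest link carries fewer messages. Rerouting every pair would only mirror the hot spots, which is consistent with the abstract's remark that the technique helps only when applied to a selected set of pairs.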

Similar resources

An accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations

In this paper, an accelerated gradient based iterative algorithm for solving systems of coupled generalized Sylvester-transpose matrix equations is proposed. The convergence analysis of the algorithm is investigated. We show that the proposed algorithm converges to the exact solution for any initial value under certain assumptions. Finally, some numerical examples are given to demons...

Matrix Multiplication on Multidimensional Torus Networks

Blocked matrix multiplication algorithms such as Cannon's algorithm and SUMMA have a 2-dimensional communication structure. We introduce a generalized 'Split-Dimensional' version of Cannon's algorithm (SD-Cannon) with higher-dimensional and bidirectional communication structure. This algorithm is useful for higher-dimensional torus interconnects that can achieve more injection bandwidth than si...

Matrix-free numerical torus bifurcation of periodic orbits

We consider systems φ̇ = f(φ, λ) where f : R×R → R. Such systems often arise from space discretizations of parabolic PDEs. We are interested in branches (with respect to λ) of periodic solutions of such systems. In the present paper we describe a numerical continuation method for tracing such branches. Our methods are matrix-free, i.e., Jacobians are only implemented as actions, this enables us ...

Some Modifications to Calculate Regression Coefficients in Multiple Linear Regression

In a multiple linear regression model, there are instances where one has to update the regression parameters. In such models as new data become available, by adding one row to the design matrix, the least-squares estimates for the parameters must be updated to reflect the impact of the new data. We will modify two existing methods of calculating regression coefficients in multiple linear regres...

Publication date: 2010