A Highly Scalable Parallel Algorithm for Sparse Matrix Factorization

نویسندگان

  • Anshul Gupta
  • George Karypis
  • Vipin Kumar
چکیده

In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm substantially improves the state of the art in parallel direct solution of sparse linear systems—both in terms of scalability and overall performance. It is a well known fact that dense matrix factorization scales well and can be implemented efficiently on parallel computers. In this paper, we present the first algorithm to factor a wide class of sparse matrices (including those arising from twoand three-dimensional finite element problems) that is asymptotically as scalable as dense matrix factorization algorithms on a variety of parallel architectures. Our algorithm incurs less communication overhead and is more scalable than any previously known parallel formulation of sparse matrix factorization. Although, in this paper, we discuss Cholesky factorization of symmetric positive definite matrices, the algorithms can be adapted for solving sparse linear least squares problems and for Gaussian elimination of diagonally dominant matrices that are almost symmetric in structure. An implementation of our sparse Cholesky factorization algorithm delivers up to 20 GFlops on a Cray T3D for medium-size structural engineering and linear programming problems. To the best of our knowledge, this is the highest performance ever obtained for sparse Cholesky factorization on any supercomputer. This work was supported by IST/BMDO through Army Research Office contract DA/DAAH04-93-G-0080, NSF grant NSG/1RI9216941, and by Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAH04-95-2-0003/contract number DAAH04-95-C-0008, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred. Access to computing facilities were provided by Minnesota Supercomputer Institute, Cray Research Inc. and by the Pittsburgh Supercomputing Center.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Scalable Parallel Algorithms for Solving Sparse Systems of Linear Equations∗

We have developed a highly parallel sparse Cholesky factorization algorithm that substantially improves the state of the art in parallel direct solution of sparse linear systems—both in terms of scalability and overall performance. It is a well known fact that dense matrix factorization scales well and can be implemented efficiently on parallel computers. However, it had been a challenge to dev...

متن کامل

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

متن کامل

Highly Scalable Parallel Algorithms for Sparse Matrix Factorization

In this paper, we describe scalable parallel algorithms for sparse matrix factorization, analyze their performance and scalability, and present experimental results for up to 1024 processors on a Cray T3D parallel computer. Through our analysis and experimental results, we demonstrate that our algorithms substantially improve the state of the art in parallel direct solution of sparse linear sys...

متن کامل

A Scalable Parallel Algorithm for Sparse MatrixFactorization

In this paper, we describe a scalable parallel algorithm for sparse matrix factorization, analyze its performance and scalability, and present experimental results of its implementation on a 1024-processor nCUBE2 parallel computer. Through our analysis and experimental results, we demonstrate that our algorithm improves the state of the art in parallel direct solution of sparse linear systems b...

متن کامل

A high performance two dimensional scalable parallel algorithm for solving sparse triangular systems

Solving a system of equations of the form Tx = y, where T is a sparse triangular matrix, is required after the factorization phase in the direct methods of solving systems of linear equations. A few parallel formulations have been proposed recently. The common belief in parallelizing this problem is that the parallel formulation utilizing a two dimensional distribution of T is unscalable. In th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1994