Performance Evaluation of Some Inverse Iteration Algorithms on PowerXCell 8i Processor

Authors

  • Masami Takata
  • Hiroyuki Ishigami
  • Kinji Kimura
  • Yoshimasa Nakamura
Abstract

In this paper, we compare several inverse iteration algorithms on the PowerXCell 8i processor, which is known as a heterogeneous environment. When some of the eigenvalues are close together or form clusters, all the eigenvectors associated with those eigenvalues must be reorthogonalized. Reorthogonalization is computationally expensive. The classical Gram-Schmidt (CGS) algorithm, the modified Gram-Schmidt (MGS) algorithm, and the Householder orthogonalization algorithm in terms of the compact WY representation are known reorthogonalization algorithms. These algorithms can be implemented using BLAS level-1 and level-2 routines. Because the synergistic processor elements of the PowerXCell 8i processor achieve high performance for BLAS level-2 and level-3 operations, the orthogonalization algorithms other than MGS can be computed at high speed on parallel computers.
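The difference between CGS and MGS in BLAS terms can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the dense test matrix, the shifts, and the choice to reorthogonalize against every earlier vector (rather than only those in the same cluster) are all hypothetical simplifications, and the Householder compact WY variant is not shown.

```python
import numpy as np

def cgs_orthogonalize(Q, v):
    """Classical Gram-Schmidt: every projection uses the same input v,
    so the work is two matrix-vector products (BLAS level-2 style)."""
    coeffs = Q.T @ v          # all inner products at once
    w = v - Q @ coeffs        # subtract all projections at once
    return w / np.linalg.norm(w)

def mgs_orthogonalize(Q, v):
    """Modified Gram-Schmidt: projections are removed one column at a time,
    so each step is a dot product plus an axpy (BLAS level-1 style)."""
    w = v.astype(float).copy()
    for j in range(Q.shape[1]):
        q = Q[:, j]
        w -= (q @ w) * q      # uses the already-updated w
    return w / np.linalg.norm(w)

def inverse_iteration(A, shifts, orthogonalize, iters=3, seed=0):
    """Compute one vector per shift, reorthogonalizing against all earlier
    vectors after each solve (a simplification chosen for this sketch)."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    Q = np.zeros((n, 0))
    for shift in shifts:
        x = rng.standard_normal(n)
        M = A - shift * np.eye(n)
        for _ in range(iters):
            x = np.linalg.solve(M, x)   # one inverse iteration step
            x = orthogonalize(Q, x)     # reorthogonalize and normalize
        Q = np.column_stack([Q, x])
    return Q

# Hypothetical test problem: a symmetric matrix with two eigenvalues
# clustered near 1 (they differ by 1e-8).
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigs = np.array([1.0, 1.0 + 1e-8, 2.0, 3.0, 4.0, 5.0])
A = U @ np.diag(eigs) @ U.T

# Both shifts sit just below the cluster, so without reorthogonalization
# the two computed vectors would be nearly parallel.
shifts = [1.0 - 1e-4, 1.0 - 1e-4]
Q_cgs = inverse_iteration(A, shifts, cgs_orthogonalize)
Q_mgs = inverse_iteration(A, shifts, mgs_orthogonalize)
```

In this sketch the CGS step is expressed as matrix-vector products, which is the form that maps onto BLAS level-2, whereas the MGS loop has a data dependency between columns that keeps it at level-1.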

Related resources

Status of the QPACE Project

We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. We present a performance analy...

An MPI Performance Monitoring Interface for Cell Based Compute Nodes

In this paper, we present a methodology for profiling parallel applications executing on the family of architectures commonly referred as the “Cell” processor. Specifically, we examine Cell-centric MPI programs on hybrid clusters containing multiple Opteron and IBM PowerXCell 8i processors per node such as those used in the petascale Roadrunner system. We analyze the performance of our approach...

Lattice QCD Applications on QPACE

QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPACE node is based on the IBM PowerXCell 8i processor. The nodes are interconnected by a custom 3-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applica...

Self-Organizing Maps on the Cell Broadband Engine Architecture

We present and evaluate novel parallel implementations of Self-Organizing Maps for the Cell Broadband Engine Architecture. Motivated by the interactive nature of the data mining process, we evaluate the scalability of the implementations on two clusters using different network characteristics and incarnations (PS3 console and PowerXCell 8i) of the architecture. Our implementations use varying com...

Implementing 3D SPHARM Surfaces Registration on Cell Processor

Spherical harmonics (SPHARM) description is a highly promising surface-based morphometry (SBM) method and has been widely used in neuroimaging applications to model the surface of arbitrarily shaped but simply connected 3D objects. This paper focuses on SHREC, a recently developed general-purpose surface-matching method for 3D SPHARM registration. We implemented SHREC in MATLAB, then optimized a...

Publication date: 2012