High Performance Algorithms for Toeplitz and Block Toeplitz Matrices
North-Holland
Authors: K. A. Gallivan et al.
Abstract
In this paper, we present several high performance variants of the classical Schur algorithm to factor various Toeplitz matrices. For positive definite block Toeplitz matrices, we show how hyperbolic Householder transformations may be blocked to yield a block Schur algorithm. This algorithm uses BLAS3 primitives and makes efficient use of a memory hierarchy. We present three algorithms for indefinite Toeplitz matrices. Two of these are based on look-ahead strategies and produce an exact factorization of the Toeplitz matrix. The third produces an inexact factorization via perturbations of singular principal minors. We also present an analysis of the numerical behavior of the third algorithm and derive a bound for the number of iterations to improve the accuracy of the solution. For rank-deficient Toeplitz least-squares problems, we present a variant of the generalized Schur algorithm that avoids breakdown due to an exact rank-deficiency. In the presence of a near rank-deficiency, an approximate rank factorization of the Toeplitz matrix is produced. Finally, we suggest an algorithm to solve the normal equations resulting from a real Toeplitz least-squares problem based on transforming to Cauchy-like matrices. This algorithm exploits both realness and symmetry in the normal equations.

K. A. Gallivan et al., Linear Algebra and its Applications 241-243:343-388 (1996). © Elsevier Science Inc., 1996. 0024-3795/96/$15.00. 655 Avenue of the Americas, New York, NY 10010. SSDI 0024-3795(95)00649-4.

1. INTRODUCTION
Algorithms to solve Toeplitz matrices can be broadly classified into two categories, namely, the Levinson type and the Schur type. The Levinson type algorithms produce factorizations of the inverse of the Toeplitz matrix such as $T^{-1} = LDL^T$ and $T^{-1} = QR$, while the Schur type algorithms produce factorizations of the Toeplitz matrix itself such as $T = LDL^T$ and $T = QR$.
In addition, the two approaches differ in the kinds of computational primitives used during the factorization.

In [30] Schur derived a fast recursive algorithm to check if a power series is analytic and bounded in the unit disc. Interestingly, the recursions proposed in this algorithm provide a fast factorization of matrices with displacement rank 2. It is well known that Toeplitz matrices have a displacement rank of 2 [23]. More generally, block Toeplitz matrices with a block size of $m$ have a displacement rank of $2m$.

In this paper we discuss several high performance variants of the classical Schur algorithms to factor symmetric block Toeplitz matrices. Specifically, we discuss routines to factor symmetric positive definite, positive semidefinite, and indefinite matrices. Algorithms to obtain the QR factorization of exactly and nearly rank deficient Toeplitz matrices are also discussed.

In this paper the classical Schur algorithm for obtaining the Cholesky factorization of symmetric positive definite block Toeplitz matrices [8, 9] is generalized to the block Toeplitz matrix case using a block generalization of the hyperbolic Householder reflectors. The block generalization of the Schur algorithm and various blocking schemes differing in the amount of storage and computational primitives used are described in Section 2. Blocking the hyperbolic Householder transformations allows us to apply these transformations using BLAS 3 primitives rather than the BLAS 2 primitives that are required for plain hyperbolic Householder transformations. On machines with a memory hierarchy this provides us with a faster algorithm.

For symmetric indefinite block Toeplitz matrices the Schur algorithm breaks down if the matrix has singular principal minors. A scheme to modify the block Schur algorithm by perturbing the generators and obtaining an approximate factorization of the matrix is described in Section 3. The approximate solution is then improved through iterative refinement.
The numerical behavior of this method to circumvent the singularities is studied. If an exact factorization of the indefinite block Toeplitz matrix is desired, then one would have to look ahead over the singular or nearly singular principal minors. Look-ahead algorithms based on the Levinson algorithm have appeared in the literature [4, 12] but suffer from the same reduced parallelism relative to the Schur algorithm mentioned above and are limited to point Toeplitz matrices. Look-ahead Schur algorithms based on orthogonal polynomials exist [18] but are limited to point Toeplitz matrices. In Section 3 we present two look-ahead Schur algorithms for point and block Toeplitz matrices and compare the two from a computational viewpoint.

The classical Schur algorithm can be generalized to obtain the QR factorization of block Toeplitz matrices [5]. If the Toeplitz matrix is rank deficient, then we present a modification of the generalized Schur algorithm in Section 4 to obtain the QR factorization by pruning the generators of the Toeplitz matrix. If the matrix is nearly rank deficient, then this method produces a low-rank approximation of the Toeplitz matrix.

Finally, we discuss algorithms to factor Toeplitz matrices by converting them to Cauchy type matrices. Toeplitz matrices can be converted using the discrete Fourier transform into Cauchy type matrices that allow pivoting during the factorization [15, 21]. These algorithms also have the same complexity, $O(n^2)$, as the Schur algorithm. The problem with this method is that any real-valued Toeplitz matrix is converted to a complex Cauchy type matrix and the entire factorization algorithm proceeds in complex arithmetic. This is computationally expensive. Similarly, any symmetry in the Toeplitz matrix is ignored in this algorithm. In Section 5 we present a modification to this algorithm that allows us to work in real arithmetic and also exploit the symmetric structure of the matrix.
This yields a rank-revealing algorithm for the factorization of a semidefinite block Toeplitz matrix that is computationally less expensive than the algorithm presented in [15, 21].

2. SYMMETRIC POSITIVE DEFINITE BLOCK TOEPLITZ MATRICES
In this section we present a block generalization of the classical Schur algorithm [8, 9] using block hyperbolic Householder reflectors. Block hyperbolic Householder transformations can be applied at the BLAS 3 rate, whereas plain Householder transformations are applied at the BLAS 2 rate. On machines with a memory hierarchy this provides us with a significant improvement in performance. Various blocking strategies that differ in the computational primitives required during the construction are presented. The cost of applying these transformations is also discussed.

2.1. The Classical Schur Algorithm
Let $T$ be an $mp \times mp$ symmetric positive definite block Toeplitz matrix with a block size of $m \times m$ whose first block row is given by $[T_1 \;\; T_2 \;\; \cdots \;\; T_{p-1} \;\; T_p]$. Let $Z$ be a block right shift matrix. The Schur algorithm is based on the fact that the displacement of a block Toeplitz matrix $T$, defined as $T - Z^T T Z$, has a rank of at most $2m$ [23]. The derivation of the Schur algorithm to compute the Cholesky factorization of a symmetric positive definite block Toeplitz matrix is outlined below.

Since $T_1$ is a symmetric positive definite matrix, we can find its Cholesky factorization $T_1 = L_1 L_1^T$, where $L_1$ is an $m \times m$ lower triangular matrix. Let $\hat{T}_j = L_1^{-1} T_j$. It is easy to see that $\hat{T}_1 = L_1^T$. We now define two matrices $G_1(T)$ and $G_2(T)$ as follows [6, 23]:

$$G_1(T) = [\,\hat{T}_1 \;\; \hat{T}_2 \;\; \hat{T}_3 \;\; \cdots \;\; \hat{T}_p\,], \qquad G_2(T) = [\,0 \;\; \hat{T}_2 \;\; \hat{T}_3 \;\; \cdots \;\; \hat{T}_p\,].$$
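The definitions in Section 2.1 can be checked numerically. The sketch below is an illustration, not the authors' implementation. It assumes that the hatted blocks are $\hat{T}_j = L_1^{-1} T_j$, that $G_1(T) = [\hat{T}_1 \; \hat{T}_2 \; \cdots \; \hat{T}_p]$ and $G_2(T) = [0 \; \hat{T}_2 \; \cdots \; \hat{T}_p]$, and that $Z$ carries identity blocks on its first block superdiagonal. It then tests the standard generator relation $T - Z^T T Z = G_1(T)^T G_1(T) - G_2(T)^T G_2(T)$, which is not stated in the excerpt above. The block size, block count, and random test data are arbitrary choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p = 2, 4  # block size and number of blocks (illustrative values)

# First block row [T_1, ..., T_p]. T_1 is symmetric and dominates the other
# blocks, so T is strictly diagonally dominant and hence positive definite.
A = rng.uniform(-1.0, 1.0, (m, m))
blocks = [10.0 * np.eye(m) + (A + A.T) / 2]
blocks += [0.5 * rng.uniform(-1.0, 1.0, (m, m)) for _ in range(p - 1)]

# Symmetric block Toeplitz matrix: block (i, j) is T_{j-i+1} on and above
# the block diagonal, and the transpose of T_{i-j+1} below it.
T = np.block([[blocks[j - i] if j >= i else blocks[i - j].T
               for j in range(p)] for i in range(p)])

# Block right shift matrix Z (assumed form): identity blocks on the first
# block superdiagonal, so Z^T T Z shifts T down and right by one block.
Z = np.kron(np.eye(p, k=1), np.eye(m))
D = T - Z.T @ T @ Z  # displacement of T

# Generators, assuming That_j = L_1^{-1} T_j with T_1 = L_1 L_1^T.
L1 = np.linalg.cholesky(blocks[0])               # lower triangular factor
That = [np.linalg.solve(L1, B) for B in blocks]  # L_1^{-1} T_j for each j
G1 = np.hstack(That)
G2 = np.hstack([np.zeros((m, m))] + That[1:])

rank_D = np.linalg.matrix_rank(D, tol=1e-10)
print("displacement rank:", rank_D, "with bound 2m =", 2 * m)
print("That_1 equals L_1^T:", np.allclose(That[0], L1.T))
print("D equals G1^T G1 - G2^T G2:", np.allclose(D, G1.T @ G1 - G2.T @ G2))
```

Only the first block row and first block column of the displacement are nonzero, which is why its rank cannot exceed $2m$. The two generators store the same information in $2m$ rows instead of $mp$.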