Twisted Factorization

Robert van de Geijn and Margaret Myers
December 4, 2020
This document provides a number of exercises that help us review ideas that we have encountered in our course "Advanced Linear Algebra: Foundations to Frontiers," extending traditional matrix factorizations. The insights are related to techniques that underlie the fastest algorithm for finding eigenvectors of a tridiagonal matrix: the Method of Multiple Relatively Robust Representations (MRRR).
1 MRRR, from 35,000 Feet
The Method of Multiple Relatively Robust Representations (MRRR) [1] is an algorithm that, given a tridiagonal matrix and its eigenvalues, computes the eigenvectors of that matrix in $O(n)$ time per eigenvector. This means it computes all eigenvectors in $O(n^2)$ time, which is much faster than the tridiagonal QR algorithm (which requires $O(n^3)$ computation for all eigenvectors). Notice that highly accurate eigenvalues of a tridiagonal matrix can themselves be computed in $O(n^2)$ time, so it is legitimate to start by assuming that we have these highly accurate eigenvalues. Importantly, the MRRR algorithm can efficiently compute eigenvectors corresponding to a subset of the eigenvalues.
The benefit when computing all eigenvalues and eigenvectors of a dense matrix is considerably smaller, because transforming the eigenvectors of the tridiagonal matrix back into the eigenvectors of the dense matrix requires an additional $O(n^3)$ computation, as discussed in [2].
Notice that the devil is in the details of the MRRR algorithm. Making the method totally robust has taken decades of hard work, since the method does not rely exclusively on unitary similarity transformations. It is particularly for matrices that have clusters of (nearly equal) eigenvalues that things get tricky. Those details are well beyond the scope of this note.
2 This document
A fundamental idea behind MRRR is that of a twisted factorization of the tridiagonal matrix. In this document, we play around with such factorizations: how to compute them and how to compute an eigenvector with them. We start by reminding the reader of what the Cholesky factorization of a symmetric positive definite (SPD) matrix is. Then we discuss the Cholesky factorization of a tridiagonal SPD matrix. This leads to the $LDL^T$ factorization of a symmetric indefinite matrix and of a symmetric indefinite tridiagonal matrix. Next follows a discussion of the $UDU^T$ factorization of a symmetric indefinite matrix. Combining insights from the $LDL^T$ and $UDU^T$ factorizations finally yields the twisted factorization. When the matrix is nearly singular, which happens when the matrix is shifted by an eigenvalue of interest, an approximation of the twisted factorization can then be used to compute an approximate eigenvector.

3 Cholesky Factorization, Again
We have discussed in class the Cholesky factorization, $A = LL^T$, which requires a matrix to be symmetric positive definite. (We will restrict our discussion to real matrices.) The following steps compute the Cholesky factorization:
• Partition $A \rightarrow \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix}$.
• Update $\alpha_{11} := \sqrt{\alpha_{11}}$.
• Update $a_{21} := a_{21}/\alpha_{11}$.
• Update $A_{22} := A_{22} - a_{21} a_{21}^T$ (updating only the lower triangular part).
• Continue to compute the Cholesky factorization of the updated $A_{22}$.
The resulting algorithm is given in Figure 1.
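To make these steps concrete, here is a minimal NumPy sketch of the unblocked algorithm; the function name `chol` and the full-matrix (float array) storage are our own illustrative choices, not code from the course materials.

```python
import numpy as np

def chol(A):
    """Overwrite the lower triangle of the SPD matrix A with its Cholesky factor L.

    Mirrors the partitioned steps above; the strictly upper triangle is ignored.
    A must be a float array.
    """
    n = A.shape[0]
    for i in range(n):
        A[i, i] = np.sqrt(A[i, i])      # alpha11 := sqrt(alpha11)
        A[i+1:, i] /= A[i, i]           # a21 := a21 / alpha11
        # A22 := A22 - a21 a21^T, updating only the lower triangular part
        A[i+1:, i+1:] -= np.tril(np.outer(A[i+1:, i], A[i+1:, i]))
    return A
```

For example, applied to an array whose lower triangle stores $\begin{pmatrix} 4 & \\ 2 & 2 \end{pmatrix}$, this yields $L = \begin{pmatrix} 2 & \\ 1 & 1 \end{pmatrix}$, and indeed $LL^T$ reproduces the original symmetric matrix.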
In the special case where $A$ is tridiagonal and SPD, the algorithm needs to be modified so that it can take advantage of the zero elements:
• Partition
$$
A \rightarrow \begin{pmatrix}
\alpha_{11} & \star & \star \\
\alpha_{21} & \alpha_{22} & \star \\
0 & \alpha_{32} e_F & A_{33}
\end{pmatrix},
$$
where $\star$ indicates the symmetric part that is not stored and $e_F$ indicates the standard basis vector with a "1" as its first element.
• Update $\alpha_{11} := \sqrt{\alpha_{11}}$.
• Update $\alpha_{21} := \alpha_{21}/\alpha_{11}$.
• Update $\alpha_{22} := \alpha_{22} - \alpha_{21}^2$.
• Continue to compute the Cholesky factorization of $\begin{pmatrix} \alpha_{22} & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix}$.
The resulting algorithm is given in Figure 2. In that figure, it helps to interpret F, M, and L as First, Middle, and Last. Notice that the Cholesky factor of a tridiagonal matrix is a lower bidiagonal matrix that overwrites the lower triangular part of the tridiagonal matrix A.
Naturally, the whole matrix need not be stored. But that detail is not important for our discussion.
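This tridiagonal variant admits a simple sketch as well, assuming the matrix is stored as two arrays, the diagonal `d` and the subdiagonal `e` (a storage choice of ours, not prescribed by the note):

```python
import numpy as np

def chol_tri(d, e):
    """Cholesky factorization of an SPD tridiagonal matrix.

    d: diagonal entries (length n); e: subdiagonal entries (length n - 1).
    Overwrites d and e with the diagonal and subdiagonal of the lower
    bidiagonal Cholesky factor L.
    """
    n = len(d)
    for i in range(n - 1):
        d[i] = np.sqrt(d[i])       # alpha11 := sqrt(alpha11)
        e[i] /= d[i]               # alpha21 := alpha21 / alpha11
        d[i + 1] -= e[i] ** 2      # alpha22 := alpha22 - alpha21^2
    d[n - 1] = np.sqrt(d[n - 1])   # last diagonal element
    return d, e
```

Only the two arrays are stored, and the cost is clearly $O(n)$, in contrast with the $O(n^3)$ cost of the dense algorithm.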
4 The $LDL^T$ Factorization
Now, one can alternatively compute $A = LDL^T$, where $L$ is unit lower triangular and $D$ is diagonal. We will first look at how this is done and will then note that this factorization can be computed for any indefinite (nonsingular) symmetric matrix. (Notice: the $L$ so computed is not the same as the $L$ computed by the Cholesky factorization.)

[Figure 1: Algorithm A := CHOL(A) for computing the Cholesky factorization. Updates to $A_{22}$ affect only the lower triangular part.]

[Figure 2: Algorithm A := CHOL_TRI(A) for computing the Cholesky factorization of a tridiagonal matrix.]
Partition
$$
A \rightarrow \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix}, \quad
L \rightarrow \begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix}, \quad \mbox{and} \quad
D \rightarrow \begin{pmatrix} \delta_{11} & 0 \\ 0 & D_{22} \end{pmatrix}.
$$
Then
$$
\begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix}
= \begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix}
\begin{pmatrix} \delta_{11} & 0 \\ 0 & D_{22} \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ l_{21} & L_{22} \end{pmatrix}^T
= \begin{pmatrix} \delta_{11} & 0 \\ \delta_{11} l_{21} & L_{22} D_{22} \end{pmatrix}
\begin{pmatrix} 1 & l_{21}^T \\ 0 & L_{22}^T \end{pmatrix}
= \begin{pmatrix} \delta_{11} & \star \\ \delta_{11} l_{21} & \delta_{11} l_{21} l_{21}^T + L_{22} D_{22} L_{22}^T \end{pmatrix}.
$$
This suggests the following algorithm for overwriting the strictly lower triangular part of A with the strictly lower triangular part of L and the diagonal of A with D:
• Partition $A \rightarrow \begin{pmatrix} \alpha_{11} & a_{21}^T \\ a_{21} & A_{22} \end{pmatrix}$.
• $\alpha_{11} := \delta_{11} = \alpha_{11}$ (no-op).
• Compute $l_{21} := a_{21}/\alpha_{11}$.
• Update $A_{22} := A_{22} - l_{21} a_{21}^T$ (updating only the lower triangular part).
• $a_{21} := l_{21}$.
• Continue with computing $A_{22} \rightarrow L_{22} D_{22} L_{22}^T$.
This algorithm will complete as long as $\delta_{11} \neq 0$ in every iteration; this condition is met if and only if all leading principal submatrices of $A$ are nonsingular. Recall that $A$ is nonsingular if and only if $A$ does not have zero as an eigenvalue. A nonsingular symmetric matrix is also called a symmetric indefinite matrix.
Exercise 4.1 Modify the algorithm in Figure 1 so that it computes the $LDL^T$ factorization. (Fill in Figure 3.)
See Figure 4. End of Answer
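As a concrete illustration, here is a minimal NumPy sketch following Option 1 of Figure 4; as before, the function name `ldlt` and the full-matrix storage are our own choices.

```python
import numpy as np

def ldlt(A):
    """LDL^T factorization of a symmetric matrix, without pivoting.

    Overwrites the diagonal of A with D and the strictly lower triangular
    part of A with that of the unit lower triangular factor L. Breaks down
    if a zero delta11 is encountered along the way.
    """
    n = A.shape[0]
    for i in range(n):
        a21 = A[i+1:, i].copy()    # save a21
        l21 = a21 / A[i, i]        # l21 := a21 / alpha11
        # A22 := A22 - l21 a21^T, updating only the lower triangular part
        A[i+1:, i+1:] -= np.tril(np.outer(l21, a21))
        A[i+1:, i] = l21           # a21 := l21
    return A
```

Note that no square roots are taken, which is what allows the factorization to proceed for indefinite matrices.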

[Figure 3: Skeleton for an algorithm for computing $A \rightarrow LDL^T$, overwriting the indefinite matrix $A$.]

[Figure 4: Three correct answers, $A := \mathrm{LDLT}(A)$. Option 1: $l_{21} := a_{21}/\alpha_{11}$; $A_{22} := A_{22} - l_{21} a_{21}^T$ (updating the lower triangle); $a_{21} := l_{21}$. Option 2: $a_{21} := a_{21}/\alpha_{11}$; $A_{22} := A_{22} - \alpha_{11} a_{21} a_{21}^T$ (updating the lower triangle). Option 3: $A_{22} := A_{22} - \frac{1}{\alpha_{11}} a_{21} a_{21}^T$ (updating the lower triangle); $a_{21} := a_{21}/\alpha_{11}$.]

What if our matrix $A$ is symmetric and tridiagonal, so that it has the form
$$
A = \begin{pmatrix}
\times & \times &        &        \\
\times & \times & \times &        \\
       & \times & \times & \times \\
       &        & \times & \times
\end{pmatrix}?
$$
As illustrated in Figure 5, at a typical point in the iteration the matrix $A$ is partitioned as
$$
A \rightarrow \begin{pmatrix}
A_{FF} & \star & \star \\
\alpha_{MF} e_L^T & \alpha_{MM} & \star \\
0 & \alpha_{LM} e_F & A_{LL}
\end{pmatrix},
$$
where $e_F$ and $e_L$ are standard basis vectors with a "1" in the first and last element, respectively:
$$
\alpha_{MF} e_L^T = \alpha_{MF} \begin{pmatrix} 0 & \cdots & 0 & 1 \end{pmatrix}
= \begin{pmatrix} 0 & \cdots & 0 & \alpha_{MF} \end{pmatrix}
\quad \mbox{and} \quad
\alpha_{LM} e_F = \alpha_{LM} \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}
= \begin{pmatrix} \alpha_{LM} \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
$$
This is then repartitioned to expose all the different parts that are needed for computation in the current iteration:
$$
\begin{pmatrix}
A_{FF} & \star & \star \\
\alpha_{MF} e_L^T & \alpha_{MM} & \star \\
0 & \alpha_{LM} e_F & A_{LL}
\end{pmatrix}
\rightarrow
\begin{pmatrix}
A_{00} & \star & \star & \star \\
\alpha_{10} e_L^T & \alpha_{11} & \star & \star \\
0 & \alpha_{21} & \alpha_{22} & \star \\
0 & 0 & \alpha_{32} e_F & A_{33}
\end{pmatrix}.
$$

Exercise 4.2 Modify the algorithm in Figure 2 so that it computes the $LDL^T$ factorization of a tridiagonal matrix. (Fill in Figure 5.) What is the approximate cost, in floating point operations, of computing the $LDL^T$ factorization of a tridiagonal matrix? Count a divide, multiply, and add/subtract each as one floating point operation. Show how you came up with the algorithm, similar to how we derived the algorithm for the tridiagonal Cholesky factorization.

See Figure 6. The key insight is to recognize that, relative to the algorithm in Figure 4, $a_{21} = \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix}$, so that, for example, $a_{21} := a_{21}/\alpha_{11}$ becomes
$$
\begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix} := \begin{pmatrix} \alpha_{21}/\alpha_{11} \\ 0 \end{pmatrix}.
$$
Then, an update like $A_{22} := A_{22} - \alpha_{11} a_{21} a_{21}^T$ becomes
$$
\begin{pmatrix} \alpha_{22} & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix} :=
\begin{pmatrix} \alpha_{22} & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix}
- \alpha_{11} \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix} \begin{pmatrix} \alpha_{21} \\ 0 \end{pmatrix}^T
= \begin{pmatrix} \alpha_{22} - \alpha_{11} \alpha_{21}^2 & \star \\ \alpha_{32} e_F & A_{33} \end{pmatrix}.
$$
The cost is one divide, two multiplies, and one subtract per iteration, for a total of approximately $n$ divides, $2n$ multiplies, and $n$ subtracts if $A$ is $n \times n$. End of Answer

[Figure 5: Skeleton for algorithm A := LDLT_TRI(A) for computing the $LDL^T$ factorization of a tridiagonal matrix.]

[Figure 6: Algorithm for computing the $LDL^T$ factorization of a tridiagonal matrix.]

5 The $UDU^T$ Factorization

The fact that the LU, Cholesky, and $LDL^T$ factorizations start in the top-left corner of the matrix and work towards the bottom-right is an accident (might we say twist) of history. Gaussian elimination works in that direction, hence so does LU factorization, and the rest kind of follow.
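By symmetry with the $LDL^T$ derivation above, a $UDU^T$ factorization, with $U$ unit upper triangular and $D$ diagonal, can be computed by starting in the bottom-right corner and working towards the top-left: partitioning $A \rightarrow \begin{pmatrix} A_{00} & a_{01} \\ a_{01}^T & \alpha_{11} \end{pmatrix}$ gives $\delta_{11} = \alpha_{11}$, $u_{01} = a_{01}/\delta_{11}$, and the update $A_{00} := A_{00} - u_{01} a_{01}^T$. The following minimal sketch renders that idea in code; the function name `udut` and the storage choices are ours, for illustration only.

```python
import numpy as np

def udut(A):
    """UDU^T factorization of a symmetric matrix, without pivoting.

    Works from the bottom-right corner towards the top-left. Overwrites the
    diagonal of A with D and the strictly upper triangular part of A with
    that of the unit upper triangular factor U.
    """
    n = A.shape[0]
    for i in range(n - 1, -1, -1):
        a01 = A[:i, i].copy()      # save a01 (above the diagonal in column i)
        u01 = a01 / A[i, i]        # u01 := a01 / delta11
        # A00 := A00 - u01 a01^T, updating only the upper triangular part
        A[:i, :i] -= np.triu(np.outer(u01, a01))
        A[:i, i] = u01             # a01 := u01
    return A
```

Combining an $LDL^T$ factorization computed from the top-left with a $UDU^T$ factorization computed from the bottom-right is exactly what leads to the twisted factorization.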