CS 6210: Matrix Computations

Gaussian elimination

David Bindel

2025-09-15

Introduction

Introduction

Goal (3 lectures): LU factorization and company

  • AKA Gaussian elimination
  • Workhorse for linear systems (not the only way)
  • Not just for linear systems!
  • More subtle than you think…

Points to remember

  • Nonzero matrices can still be singular
  • Some matrices are rectangular
  • Can still factor singular and rectangular matrices!

Julia notes

  • Will want to use LinearAlgebra throughout
  • A\b and c'/A compute \(A^{-1} b\) and \(c^T A^{-1}\)
  • Can also use forward/backslash with factorization objects
  • We can explicitly call inv, but almost never should

Linear systems in Julia

let
    # Set up test problem with known solution
    A    = [1.0 2.0; 3.0 4.0]
    xref = [5.0; 6.0]
    b    = A*xref

    x1 = inv(A)*b  # AVOID EXPLICIT INVERSES!
    x2 = A\b       # Good for one solve

    F = lu(A)      # Save factorization object
    x3 = F\b       # Good for multiple solves

    x1 ≈ xref, x2 ≈ xref, x3 ≈ xref
end
(true, true, true)

Julia views

\[\begin{bmatrix} 1 & 4 & 7 \\ 2 & {\color{red}5} & {\color{red}8} \\ 3 & {\color{red}6} & {\color{red}9} \end{bmatrix}\]

let
    A = [1 4 7; 2 5 8; 3 6 9]  # Create matrix, bind to name A
    B = @view A[2:3,2:3]       # Create view, bind to name B
    B[1,1] = 10
    A
end
3×3 Matrix{Int64}:
 1   4  7
 2  10  8
 3   6  9

Julia views

\[\begin{bmatrix} 1 & 4 & 7 \\ 2 & {\color{red}5} & {\color{red}8} \\ 3 & {\color{red}6} & {\color{red}9} \end{bmatrix}\]

  • Matrix object owns storage
  • A submatrix view references a piece of that storage
  • Ex: B[1,1] = 10 also changes A[2,2] to 10 (same data!)

Julia views

In general:

  • Views provide interpretation of storage
  • That storage is “owned” by someone else

In particular: strided views have

  • Fixed offset between rows (row stride — unit is best)
  • Fixed offset between cols (col stride)
  • Often relatively fast arithmetic routines

Julia views

Views don’t have to just reference submatrices!

let
    A = [1 4 7; 2 5 8; 3 6 9]
    L = UnitLowerTriangular(A)  # This is a view
    U = UpperTriangular(A)      # This is, too!
    A[3,3] = 10
    display(L)
    display(U)
end
3×3 UnitLowerTriangular{Int64, Matrix{Int64}}:
 1  ⋅  ⋅
 2  1  ⋅
 3  6  1
3×3 UpperTriangular{Int64, Matrix{Int64}}:
 1  4   7
 ⋅  5   8
 ⋅  ⋅  10

Gaussian elimination

Gauss transformations

Example: \(M = \begin{bmatrix} 1 & 0 \\ -0.5 & 1 \end{bmatrix}\)

Gauss transformations

  • Geometrically: a type of simple shear (preserves volume!)
  • Write as \(M^{(j)} = I - \tau_j e_j^T\), \((\tau_j)_k = 0\) for \(k \leq j\)

Gauss transformations

Write as \(M^{(j)} = I - \tau_j e_j^T\). Then for \((\tau_j)_k = v_k/v_j\):

\[(I - \tau_j e_j^T) v = \begin{bmatrix} v_1 \\ \vdots \\ v_j \\ 0 \\ \vdots \\ 0 \end{bmatrix}\]

Operation “zeros out” entries after index \(j\).

Gauss transformations

Write as \(M^{(j)} = I - \tau_j e_j^T\).

  • Interpretation of \(M^{(j)} v\): subtract \((\tau_{j})_k v_j\) from \(v_k\), \(k > j\).
  • Interpretation of \((M^{(j)})^{-1} v\): add \((\tau_{j})_k v_j\) to \(v_k\), \(k > j\)
  • Note: \((M^{(j)})^{-1} = I + \tau_j e_j^T\)

Q: Why is \((I + \tau_j e_j^T) (I - \tau_j e_j^T) = I\)?
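
Expanding the product, the cross terms cancel and the remaining term vanishes because \(e_j^T \tau_j = (\tau_j)_j = 0\):

\[ (I + \tau_j e_j^T)(I - \tau_j e_j^T) = I + \tau_j e_j^T - \tau_j e_j^T - \tau_j (e_j^T \tau_j) e_j^T = I. \]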

Gaussian elimination

  • Apply Gauss transforms to \(A\) to “zero out” subdiagonals
  • Transformed matrix is upper triangular (\(U\))
  • (Inverse) transforms into unit lower triangular \(L\)
  • Result: \(A = LU\)

Example

\[ A = A^{(0)} = \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} \]

Example

\[ \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{-2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{-3} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{blue}{1} & \color{blue}{4} & \color{blue}{7} \\ \color{black}{2} & \color{black}{5} & \color{black}{8} \\ \color{black}{3} & \color{black}{6} & \color{black}{10} \end{bmatrix} = \begin{bmatrix} \color{blue}{1} & \color{blue}{4} & \color{blue}{7} \\ \color{lightgray}{0} & \color{purple}{-3} & \color{purple}{-6} \\ \color{lightgray}{0} & \color{purple}{-6} & \color{purple}{-11} \end{bmatrix} \] Subtract multiples of row 1 to introduce zeros in column 1: \[ (I-\tau_1 e_1^T) A^{(0)} = A^{(1)}, \quad\tau_1 = \begin{bmatrix} \color{lightgray}{0} \\ \color{red}{2} \\ \color{red}{3} \end{bmatrix} ,\quad e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]

Example

\[ \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{black}{1} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{red}{-2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{black}{0} & \color{blue}{-3} & \color{blue}{-6} \\ \color{black}{0} & \color{black}{-6} & \color{black}{-11} \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{blue}{-3} & \color{blue}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{purple}{1} \end{bmatrix} \] Subtract multiples of row 2 to introduce zeros in column 2: \[ (I-\tau_2 e_2^T) A^{(1)} = A^{(2)}, \quad\tau_2 = \begin{bmatrix} \color{lightgray}{0} \\ \color{lightgray}{0} \\ \color{red}{2} \end{bmatrix} ,\quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]

Example

\[ \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{black}{1} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{red}{-2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{-2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{-3} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \]

But note: \((I-\tau_j e_j^T)^{-1} = (I + \tau_j e_j^T)\) – so

\[ \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{black}{1} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{red}{2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \]

Example

Now note: \((I+\tau_1 e_1^T) (I + \tau_2 e_2^T) = I + \tau_1 e_1^T + \tau_2 e_2^T\)

\[ \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{black}{1} & \color{lightgray}{0} \\ \color{lightgray}{0} & \color{red}{2} & \color{black}{1} \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \] Therefore \(A = LU\):

\[ \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \]

Example

\[ \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \] Solve \(Ax = b\) by:

  • Forward substitution \(Ly = b\) and
  • Back substitution \(Ux = y\):
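
The two triangular solves can be written out directly; a minimal sketch for this example (in practice one uses `\` or the factorization object):

```julia
L = [1.0 0.0 0.0; 2.0 1.0 0.0; 3.0 2.0 1.0]
U = [1.0 4.0 7.0; 0.0 -3.0 -6.0; 0.0 0.0 1.0]
b = [12.0; 15.0; 19.0]
n = 3

# Forward substitution: solve L*y = b top to bottom (L has unit diagonal)
y = copy(b)
for i = 2:n, j = 1:i-1
    y[i] -= L[i,j]*y[j]
end

# Back substitution: solve U*x = y bottom to top
x = copy(y)
for i = n:-1:1
    for j = i+1:n
        x[i] -= U[i,j]*x[j]
    end
    x[i] /= U[i,i]
end
```

This reproduces \(y = (12, -9, 1)\) and \(x = (1, 1, 1)\) from the slides.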

Example

Forward substitution: \[ \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} 12 \\ 15 \\ 19 \end{bmatrix} \]

\[ y = \begin{bmatrix} 12 \\ -9 \\ 1 \end{bmatrix} \]

Example

Back substitution: \[ \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 12 \\ -9 \\ 1 \end{bmatrix} \]

\[ x = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \]

Example

\[ \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 10 \end{bmatrix} = \begin{bmatrix} \color{black}{1} & \color{lightgray}{0} & \color{lightgray}{0} \\ \color{red}{2} & \color{black}{1} & \color{lightgray}{0} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{lightgray}{0} & \color{black}{-3} & \color{black}{-6} \\ \color{lightgray}{0} & \color{lightgray}{0} & \color{black}{1} \end{bmatrix} \] And we can pack both factors into one matrix: \[ \begin{bmatrix} \color{black}{1} & \color{black}{4} & \color{black}{7} \\ \color{red}{2} & \color{black}{-3} & \color{black}{-6} \\ \color{red}{3} & \color{red}{2} & \color{black}{1} \end{bmatrix} \]

Code: Separate factors

function mylu_v1(A)
    n = size(A)[1]
    L = Matrix(1.0I, n, n)
    U = copy(A)
    for j = 1:n-1
        for i = j+1:n
            # Multiple of row j to subtract from row i
            L[i,j] = U[i,j]/U[j,j]
            U[i,j] = 0.0
            for k = j+1:n
                # Subtract the appropriate multiple of row j from row i
                U[i,k] -= L[i,j]*U[j,k]
            end
        end
    end
    L, U
end
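
A quick sanity check of `mylu_v1` on the running example (the function is repeated so the snippet stands alone):

```julia
using LinearAlgebra

function mylu_v1(A)
    n = size(A)[1]
    L = Matrix(1.0I, n, n)
    U = copy(A)
    for j = 1:n-1
        for i = j+1:n
            L[i,j] = U[i,j]/U[j,j]   # Multiple of row j to subtract from row i
            U[i,j] = 0.0
            for k = j+1:n
                U[i,k] -= L[i,j]*U[j,k]
            end
        end
    end
    L, U
end

A = [1.0 4.0 7.0; 2.0 5.0 8.0; 3.0 6.0 10.0]
L, U = mylu_v1(A)
```

Here `L*U` reproduces `A`, with `L` unit lower triangular and `U` upper triangular.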

Code: Packed storage

function mylu_v2(A)
    n = size(A)[1]
    A = copy(A)
    for j = 1:n-1
        for i = j+1:n
            # Multiple of row j to subtract from row i
            A[i,j] = A[i,j]/A[j,j]
            for k = j+1:n
                # Subtract the appropriate multiple of row j from row i
                A[i,k] -= A[i,j]*A[j,k]
            end
        end
    end
    # Return views of (unit) lower and upper triangles
    UnitLowerTriangular(A), UpperTriangular(A)
end

Code: Packed storage

function mylu_v2b(A)
    n = size(A)[1]
    A = copy(A)
    for j = 1:n-1
        # Multiples of row j to subtract from rows j+1:n
        A[j+1:n,j] ./= A[j,j]
        # Subtract multiples of row j (this version forms an outer product)
        A[j+1:n,j+1:n] .-= A[j+1:n,j]*A[j,j+1:n]'
    end
    # Return views of (unit) lower and upper triangles
    UnitLowerTriangular(A), UpperTriangular(A)
end

Code: Packed storage

function mylu_v2c(A)
    n = size(A)[1]
    A = copy(A)
    for j = 1:n-1
        # Use views of pieces of A
        lj = @view A[j+1:n,j]
        uj = @view A[j,j+1:n]
        # Multiples of row j to subtract from rows j+1:n
        lj ./= A[j,j]
        # Subtract multiples of row j via a BLAS rank-1 update (more efficient)
        BLAS.ger!(-1.0, lj, uj, view(A,j+1:n,j+1:n))
    end
    # Return views of (unit) lower and upper triangles
    UnitLowerTriangular(A), UpperTriangular(A)
end

Block elimination

Block 2-by-2 LU

Block perspective on \(A = LU\): \[ \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} L_{11} & \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \\ & U_{22} \end{bmatrix} \] Can think of this as four matrix equations \[\begin{aligned} A_{11} &= L_{11} U_{11} \\ A_{12} &= L_{11} U_{12} \\ A_{21} &= L_{21} U_{11} \\ A_{22} &= L_{21} U_{12} + L_{22} U_{22} \end{aligned}\]

Block LU computation

Matrix equations \(\implies\) computational steps

\[\begin{aligned} A_{11} &= L_{11} U_{11} \\ A_{12} &= L_{11} U_{12} \\ A_{21} &= L_{21} U_{11} \\ A_{22} &= L_{21} U_{12} + L_{22} U_{22} \end{aligned}\]

\[\begin{aligned} L_{11} U_{11} &= A_{11} \\ U_{12} &= L_{11}^{-1} A_{12} \\ L_{21} &= A_{21} U_{11}^{-1} \\ L_{22} U_{22} &= A_{22} - L_{21} U_{12} \end{aligned}\]

Trailing submatrix in last step is a Schur complement: \[\begin{aligned} S &= A_{22} - L_{21} U_{12} \\ &= A_{22} - A_{21} A_{11}^{-1} A_{12} \end{aligned}\]
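
These identities are easy to check numerically; a small sketch on an arbitrary 4-by-4 example partitioned 2+2, using Julia's unpivoted `lu` (the `NoPivot()` option assumes Julia 1.7+) so that the blocks line up:

```julia
using LinearAlgebra

A = [4.0 1.0 2.0 2.0;
     1.0 5.0 1.0 0.0;
     2.0 1.0 6.0 1.0;
     2.0 0.0 1.0 7.0]
A11, A12 = A[1:2,1:2], A[1:2,3:4]
A21, A22 = A[3:4,1:2], A[3:4,3:4]

S = A22 - A21*(A11\A12)        # Schur complement of A11 in A

F = lu(A, NoPivot())           # Unpivoted LU, so A = L*U blockwise
L21, U12 = F.L[3:4,1:2], F.U[1:2,3:4]
L22, U22 = F.L[3:4,3:4], F.U[3:4,3:4]
```

Both `A22 - L21*U12` and `L22*U22` reproduce `S`.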

Alternate: block elimination

Can do blocked view of LU or eliminate blockwise: \[ \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} = \begin{bmatrix} I & 0 \\ A_{21} A_{11}^{-1} & I \end{bmatrix} \begin{bmatrix} A_{11} & A_{12} \\ 0 & S \end{bmatrix} \] where \(S = A_{22} - A_{21} A_{11}^{-1} A_{12}\) as before!

Why does it matter?

Suppose we have a fast solver \(g(b)\) for \(Ax = b\), and consider the bordered system

\[\begin{bmatrix} A & v \\ v^T & 0 \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix}\]

Schur complement system gives:

\[\lambda = \frac{v^T g(b)}{v^T g(v)}\]

Back substitute to get \(x = g(b) - \lambda g(v)\). Only two \(g\) calls!
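
A sketch of the bordered solve, with backslash standing in for the fast solver \(g\) (the matrix and vectors here are just arbitrary examples):

```julia
using LinearAlgebra

A = [4.0 1.0 0.0; 1.0 4.0 1.0; 0.0 1.0 4.0]
v = [1.0; 2.0; 3.0]
b = [1.0; 1.0; 1.0]

g(y) = A\y                  # Stand-in for the fast solver

gb, gv = g(b), g(v)         # The only two calls to g
λ = (v'*gb)/(v'*gv)         # Solve the (scalar) Schur complement system
x = gb - λ*gv               # Back substitute
```

The computed `x` and `λ` then satisfy the bordered system.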

Schur complement meaning

Let \(B = A^{-1}\). Then \(AB = LUB = I\), so the second block column of \(B\) satisfies

\[ \begin{bmatrix} L_{11} & \\ L_{21} & L_{22} \end{bmatrix} \begin{bmatrix} U_{11} & U_{12} \\ & U_{22} \end{bmatrix} \begin{bmatrix} B_{12} \\ B_{22} \end{bmatrix} = \begin{bmatrix} 0 \\ I \end{bmatrix}. \]

Block forward and back substitution gives:

\[ \begin{bmatrix} B_{12} \\ B_{22} \end{bmatrix} = \begin{bmatrix} -U_{11}^{-1} U_{12} S^{-1} \\ S^{-1} \end{bmatrix} \]

where the Schur complement is \(S = L_{22} U_{22}\).

Schur complement meaning

Conclusion: \[S = L_{22} U_{22} = B_{22}^{-1} = ((A^{-1})_{22})^{-1}\]

Schur complement is an

  • Inverse
  • Of a submatrix
  • Of an inverse

Can make sense of each of these in an abstract setting!
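
A minimal numerical check of that statement, on a 2+2 partition of an arbitrary 4-by-4 matrix (`inv` appears here for demonstration only):

```julia
using LinearAlgebra

A = [4.0 1.0 2.0 2.0;
     1.0 5.0 1.0 0.0;
     2.0 1.0 6.0 1.0;
     2.0 0.0 1.0 7.0]

# Schur complement of the leading 2x2 block
S = A[3:4,3:4] - A[3:4,1:2]*(A[1:2,1:2]\A[1:2,3:4])

# Inverse of the (2,2) submatrix of the inverse
B = inv(A)
T = inv(B[3:4,3:4])
```

Here `S ≈ T`, as claimed.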

Block perspective gains

  • Fast codes rich in level-3 BLAS
  • Approach to fast solves for bordered systems
  • Interesting interpretation of Schur complement

Backward error analysis

Backward error

Idea: write \(A = LU\) as: \[ u_{jk} = a_{jk} - \sum_{i=1}^{j-1} l_{ji} u_{ik} \] Computed version has \(|\delta_i| \leq (j-1) \epsilon_{\mathrm{mach}}\) s.t. \[ \hat{u}_{jk} = a_{jk}(1+\delta_0) - \sum_{i=1}^{j-1} \hat{l}_{ji} \hat{u}_{ik} (1+\delta_i) + O(\epsilon_{\mathrm{mach}}^2) \] This holds independent of the order of accumulation.

Turn it around

\[\begin{aligned} a_{jk} & = \frac{1}{1+\delta_0} \left( \hat{l}_{jj} \hat{u}_{jk} + \sum_{i=1}^{j-1} \hat{l}_{ji} \hat{u}_{ik} (1 + \delta_i) \right) + O(\epsilon_{\mathrm{mach}}^2) \\ & = \hat{l}_{jj} \hat{u}_{jk} (1 - \delta_0) + \sum_{i=1}^{j-1} \hat{l}_{ji} \hat{u}_{ik} (1 + \delta_i - \delta_0) + O(\epsilon_{\mathrm{mach}}^2) \\ & = \left( \hat{L} \hat{U} \right)_{jk} + e_{jk} \\ e_{jk} &= -\hat{l}_{jj} \hat{u}_{jk} \delta_0 + \sum_{i=1}^{j-1} \hat{l}_{ji} \hat{u}_{ik} (\delta_i - \delta_0) + O(\epsilon_{\mathrm{mach}}^2) \end{aligned}\]

A little more work, and…

\[ A = \hat{L} \hat{U} + E, \mbox{ where } |E| \leq n \epsilon_{\mathrm{mach}}|\hat{L}| |\hat{U}| + O(\epsilon_{\mathrm{mach}}^2). \] Similar backward error bounds hold for the forward and back substitution steps.

  • But wait: what if \(|L| |U| \gg |A|\)?
  • Control via growth factor
  • This really just gives the problem a name…
  • (Partial) solution: (partial) pivoting

Pivoting

The problem

  • What if \(|L| |U| \gg |A|\)?
  • Extreme case: divide by zero
  • Still bad: divide by tiny

Fixing the problem?

Bad: \[ A = \begin{bmatrix} \delta & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \delta^{-1} & 1 \end{bmatrix} \begin{bmatrix} \delta & 1 \\ 0 & 1-\delta^{-1} \end{bmatrix}. \]

Fine: \[ PA = \begin{bmatrix} 1 & 1 \\ \delta & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \delta & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1-\delta \end{bmatrix}. \] How to choose permutation \(P\) more generally?
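
The two factorizations above can be compared in floating point (a small sketch with \(\delta = 10^{-16}\) and the factors entered by hand):

```julia
δ = 1.0e-16
A = [δ 1.0; 1.0 1.0]

# No pivoting: the multiplier 1/δ is huge, and the 1.0 in
# 1.0 - 1/δ is absorbed entirely by rounding
L1 = [1.0 0.0; 1/δ 1.0]
U1 = [δ 1.0; 0.0 1.0 - 1/δ]

# Partial pivoting: swap rows first; multipliers stay bounded by 1
P  = [0.0 1.0; 1.0 0.0]
L2 = [1.0 0.0; δ 1.0]
U2 = [1.0 1.0; 0.0 1.0 - δ]
```

In Float64, `L1*U1` no longer reproduces `A` (the (2,2) entry is lost, an \(O(1)\) backward error), while `L2*U2` matches `P*A` to machine precision.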

Partial pivoting

Goal: \(PA = LU\) where \(L\) is controlled.
Idea: At step \(j\), eliminate \(x_j\) from remaining equations

  • Can use any of equations \(j\) through \(n\)
  • Choose equation \(i\) with largest \(|S^{(j)}_{ij}|\)
  • Swap rows \(i\) and \(j\) and eliminate
  • Result: \(|l_{ij}| \leq 1\)

Example

\[ \begin{bmatrix} \color{black}{0.69} & \color{black}{0.39} & \color{black}{0.32} & \color{black}{0.41} \\ \color{black}{0.86} & \color{black}{0.71} & \color{black}{0.01} & \color{black}{0.22} \\ \color{black}{0.40} & \color{black}{0.51} & \color{black}{0.75} & \color{black}{1.00} \\ \color{black}{0.53} & \color{black}{0.42} & \color{black}{0.58} & \color{black}{0.12} \end{bmatrix} \quad \begin{bmatrix} \color{black}{1} \\ \color{black}{2} \\ \color{black}{3} \\ \color{black}{4} \end{bmatrix} \] Initial matrix and permutation \(\pi\)

Example

\[ \begin{bmatrix} \color{brown}{0.69} & \color{brown}{0.39} & \color{brown}{0.32} & \color{brown}{0.41} \\ \color{brown}{0.86} & \color{brown}{0.71} & \color{brown}{0.01} & \color{brown}{0.22} \\ \color{black}{0.40} & \color{black}{0.51} & \color{black}{0.75} & \color{black}{1.00} \\ \color{black}{0.53} & \color{black}{0.42} & \color{black}{0.58} & \color{black}{0.12} \end{bmatrix} \quad \begin{bmatrix} \color{brown}{1} \\ \color{brown}{2} \\ \color{black}{3} \\ \color{black}{4} \end{bmatrix} \] Identify pivot (column 1)

Example

\[ \begin{bmatrix} \color{brown}{0.86} & \color{brown}{0.71} & \color{brown}{0.01} & \color{brown}{0.22} \\ \color{brown}{0.69} & \color{brown}{0.39} & \color{brown}{0.32} & \color{brown}{0.41} \\ \color{black}{0.40} & \color{black}{0.51} & \color{black}{0.75} & \color{black}{1.00} \\ \color{black}{0.53} & \color{black}{0.42} & \color{black}{0.58} & \color{black}{0.12} \end{bmatrix} \quad \begin{bmatrix} \color{brown}{2} \\ \color{brown}{1} \\ \color{black}{3} \\ \color{black}{4} \end{bmatrix} \] Apply swap (1 with 2)

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.80} & \color{black}{0.39} & \color{black}{0.32} & \color{black}{0.41} \\ \color{red}{0.46} & \color{black}{0.51} & \color{black}{0.75} & \color{black}{1.00} \\ \color{red}{0.62} & \color{black}{0.42} & \color{black}{0.58} & \color{black}{0.12} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{1} \\ \color{black}{3} \\ \color{black}{4} \end{bmatrix} \] Compute multipliers in column 1

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.80} & \color{purple}{-0.17} & \color{purple}{0.31} & \color{purple}{0.24} \\ \color{red}{0.46} & \color{purple}{0.18} & \color{purple}{0.74} & \color{purple}{0.90} \\ \color{red}{0.62} & \color{purple}{-0.02} & \color{purple}{0.57} & \color{purple}{-0.01} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{1} \\ \color{black}{3} \\ \color{black}{4} \end{bmatrix} \] Schur complement

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{brown}{0.80} & \color{brown}{-0.17} & \color{brown}{0.31} & \color{brown}{0.24} \\ \color{brown}{0.46} & \color{brown}{0.18} & \color{brown}{0.74} & \color{brown}{0.90} \\ \color{red}{0.62} & \color{black}{-0.02} & \color{black}{0.57} & \color{black}{-0.01} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{brown}{1} \\ \color{brown}{3} \\ \color{black}{4} \end{bmatrix} \] Identify pivot (column 2)

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{brown}{0.46} & \color{brown}{0.18} & \color{brown}{0.74} & \color{brown}{0.90} \\ \color{brown}{0.80} & \color{brown}{-0.17} & \color{brown}{0.31} & \color{brown}{0.24} \\ \color{red}{0.62} & \color{black}{-0.02} & \color{black}{0.57} & \color{black}{-0.01} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{brown}{3} \\ \color{brown}{1} \\ \color{black}{4} \end{bmatrix} \] Apply swap (2 with 3)

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{red}{0.80} & \color{red}{-0.93} & \color{black}{0.31} & \color{black}{0.24} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{black}{0.57} & \color{black}{-0.01} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{black}{1} \\ \color{black}{4} \end{bmatrix} \] Compute multipliers in column 2

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{red}{0.80} & \color{red}{-0.93} & \color{purple}{0.99} & \color{purple}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{purple}{0.64} & \color{purple}{0.06} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{black}{1} \\ \color{black}{4} \end{bmatrix} \] Schur complement

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{brown}{0.80} & \color{brown}{-0.93} & \color{brown}{0.99} & \color{brown}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{black}{0.64} & \color{black}{0.06} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{brown}{1} \\ \color{black}{4} \end{bmatrix} \] Identify pivot (column 3)

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{brown}{0.80} & \color{brown}{-0.93} & \color{brown}{0.99} & \color{brown}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{black}{0.64} & \color{black}{0.06} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{brown}{1} \\ \color{black}{4} \end{bmatrix} \] Apply swap (3 with 3)

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{red}{0.80} & \color{red}{-0.93} & \color{blue}{0.99} & \color{blue}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{red}{0.64} & \color{black}{0.06} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{black}{1} \\ \color{black}{4} \end{bmatrix} \] Compute multipliers in column 3

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{red}{0.80} & \color{red}{-0.93} & \color{blue}{0.99} & \color{blue}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{red}{0.64} & \color{purple}{-0.63} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{black}{1} \\ \color{black}{4} \end{bmatrix} \] Schur complement

Example

\[ \begin{bmatrix} \color{blue}{0.86} & \color{blue}{0.71} & \color{blue}{0.01} & \color{blue}{0.22} \\ \color{red}{0.46} & \color{blue}{0.18} & \color{blue}{0.74} & \color{blue}{0.90} \\ \color{red}{0.80} & \color{red}{-0.93} & \color{blue}{0.99} & \color{blue}{1.07} \\ \color{red}{0.62} & \color{red}{-0.08} & \color{red}{0.64} & \color{blue}{-0.63} \end{bmatrix} \quad \begin{bmatrix} \color{black}{2} \\ \color{black}{3} \\ \color{black}{1} \\ \color{black}{4} \end{bmatrix} \] Final factorization \({\color{red}L}{\color{blue}U}\) and \(\pi\)

Partial pivoting code

function mypivlu(A)
    n = size(A)[1]
    A = copy(A)
    p = Vector(1:n)
    for j = 1:n-1
        jj = findmax(abs.(A[j:n,j]))[2] + j-1    # Find pivot
        p[jj], p[j] = p[j], p[jj]                # Record it
        for k = 1:n                              # Apply swap
            A[jj,k], A[j,k] = A[j,k], A[jj,k]
        end
        A[j+1:n,j] /= A[j,j]                     # Multipliers
        A[j+1:n,j+1:n] -= A[j+1:n,j]*A[j,j+1:n]' # Schur complement
    end
    p, UnitLowerTriangular(A), UpperTriangular(A)
end
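
A sanity check on the slides' 4-by-4 example (repeating `mypivlu` so the snippet stands alone):

```julia
using LinearAlgebra

function mypivlu(A)
    n = size(A)[1]
    A = copy(A)
    p = Vector(1:n)
    for j = 1:n-1
        jj = findmax(abs.(A[j:n,j]))[2] + j-1    # Find pivot
        p[jj], p[j] = p[j], p[jj]                # Record it
        for k = 1:n                              # Apply swap
            A[jj,k], A[j,k] = A[j,k], A[jj,k]
        end
        A[j+1:n,j] /= A[j,j]                     # Multipliers
        A[j+1:n,j+1:n] -= A[j+1:n,j]*A[j,j+1:n]' # Schur complement
    end
    p, UnitLowerTriangular(A), UpperTriangular(A)
end

A = [0.69 0.39 0.32 0.41;
     0.86 0.71 0.01 0.22;
     0.40 0.51 0.75 1.00;
     0.53 0.42 0.58 0.12]
p, L, U = mypivlu(A)
```

Then `A[p,:] ≈ L*U`, and every multiplier satisfies \(|l_{ij}| \leq 1\).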

Deferred updating

What about blocking? Do standard LU on a few columns \[ \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \rightarrow \begin{bmatrix} (L/U)_{11} & A_{12} \\ L_{21} & A_{22} \end{bmatrix} \] Then apply permutations in a batch \[ \begin{bmatrix} (L/U)_{11} & (PA)_{12} \\ L_{21} & (PA)_{22} \end{bmatrix} \] Then \(U_{12} = L_{11}^{-1} (PA)_{12}\) (level 3 BLAS)
Then Schur complement (level 3 BLAS)
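
The blocked scheme can be sketched as follows, ignoring pivoting for clarity (so this is the blocked analogue of `mylu_v2`, not of GEPP; `NoPivot` assumes Julia 1.7+, and `myblocklu!` is our own name for the sketch):

```julia
using LinearAlgebra

# Blocked right-looking LU (no pivoting), block size b
function myblocklu!(A, b=2)
    n = size(A,1)
    for j = 1:b:n
        J = j:min(j+b-1,n)           # Current block columns
        K = J[end]+1:n               # Trailing index range
        lu!(view(A,J,J), NoPivot())  # Unblocked LU of diagonal block (packed L11/U11)
        ldiv!(UnitLowerTriangular(view(A,J,J)), view(A,J,K))     # U12 = L11 \ A12
        rdiv!(view(A,K,J), UpperTriangular(view(A,J,J)))         # L21 = A21 / U11
        mul!(view(A,K,K), view(A,K,J), view(A,J,K), -1.0, 1.0)   # Schur update (GEMM)
    end
    UnitLowerTriangular(A), UpperTriangular(A)
end

A = [4.0 1.0 2.0 2.0; 1.0 5.0 1.0 0.0; 2.0 1.0 6.0 1.0; 2.0 0.0 1.0 7.0]
L, U = myblocklu!(copy(A))
```

The triangular solves and the Schur complement update are all level-3 BLAS operations.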

Problems with partial pivoting

Gaussian elimination with Partial Pivoting (GEPP) is standard

  • But it is not uniformly backward stable
    • \(L\) remains controlled, but \(U\) entries can (rarely) grow
    • Studied since Wilkinson (1961), and still an active topic today!
  • And pivoting increases memory movement
    • Deferred updating helps with this
    • But still not great in parallel

Beyond partial pivoting

  • For better stability, permute rows and columns
    • Opens broader search for big elements
    • Complete pivoting: Search entire Schur complement
    • Rook pivoting: Search column \(j\) and row \(j\)
  • More recent: tournament pivoting
    • Choose \(b\) pivot rows in one go
    • Can improve both stability and performance!
  • Can also just use GEPP (or less) and clean up later…

Summary

LU factorization

  • Gaussian elimination \(\equiv PA = LU\)
  • Block perspective is good for performance
    • Also good for algorithmic variants!
  • Schur complements are interesting on their own

Error and pivoting

  • \(\hat{L} \hat{U} = A + E\) where \(|E| \leq n \epsilon_{\mathrm{mach}}|\hat{L}| |\hat{U}|\)
  • Could have \(|L| |U| \gg |A|\)!
  • Partial solution from partial pivoting: \(PA = LU\)
  • Options beyond partial pivoting with different tradeoffs