Prof David Bindel

Please click the play button below.

Permute unknowns for better SpMV or

- Stability of Gauss elimination,
- Fill reduction in Gaussian elimination,
- Improved performance of preconditioners...

Want to partition sparse graphs so that

- Subgraphs are same size (load balance)
- Cut size is minimal (minimize communication)

Matrices that are “almost” diagonal are good?

Reverse Cuthill-McKee

- Select “peripheral” vertex \(v\)
- Order according to breadth first search from \(v\)
- Reverse ordering

- RCM ordering is great for SpMV
- But isn’t narrow banding good for solvers, too?
- LU takes \(O(nb^2)\) where \(b\) is bandwidth.
- Great if there’s an ordering where \(b\) is small!

*Profile*solvers generalize band solvers- Skyline storage for lower triangle: for each row \(i\),
- Start and end of storage for nonzeros in row.
*Contiguous*nonzero list up to main diagonal.

- In each column, first nonzero defines a profile.
- All fill-in confined to profile.
- RCM is again a good ordering.

- Minimum bandwidth for 2D model problem? 3D?
- Skyline only gets us so much farther

But more general solvers have similar structure

- Ordering (minimize fill)
- Symbolic factorization (where will fill be?)
- Numerical factorization (pivoting?)
- ... and triangular solves

One step of Gaussian elimination *completely* fills this matrix!

Full Gaussian elimination generates *no* fill in this matrix!

Consider first steps of GE

```
A(2:end,1) = A(2:end,1)/A(1,1);
A(2:end,2:end) = A(2:end,2:end)-...
A(2:end,1)*A(1,2:end);
```

Nonzero in the outer product at \((i,j)\) if A(i,1) and A(j,1) both nonzero — that is, if \(i\) and \(j\) are both connected to 1.

General: Eliminate variable, connect remaining neighbors.

Order leaves to root \(\implies\)

on eliminating \(i\), parent of \(i\) is only remaining neighbor.

- Idea: Think of
*block*tree structures. - Eliminate block trees from bottom up.
- Can recursively partition at leaves.

- Rough cost estimate: how much just to factor dense Schur complements associated with separators?
- Notice graph partitioning appears again!
- And again we want small separators!

Model problem: Laplacian with 5 point stencil (for 2D)

- ND gives optimal complexity in exact arithmetic

(George 73, Hoffman/Martin/Rose) - 2D: \(O(N \log N)\) memory, \(O(N^{3/2})\) flops
- 3D: \(O(N^{4/3})\) memory, \(O(N^2)\) flops

- Locally greedy strategy
- Want to minimize upper bound on fill-in
- Fill \(\leq\) (degree in remaining graph)\(^2\)

- At each step
- Eliminate vertex with smallest degree
- Update degrees of neighbors

- Problem: Expensive to implement!
- But better varients via
*quotient graphs* - Variants often used in practice

- But better varients via

- Variables (columns) are nodes in trees
- \(j\) a descendant of \(k\) if eliminating \(j\) updates \(k\)
- Can eliminate disjoint subtrees in parallel!

Basic idea: exploit “supernodal” (dense) structures in factor

- e.g. arising from elimination of separator Schur complements in ND
- Other alternatives exist (multifrontal solvers)

Pivoting is painful, particularly in distributed memory!

- Cholesky — no need to pivot!
- Threshold pivoting — pivot when things look dangerous
- Static pivoting — try to decide up front

What if things go wrong with threshold/static pivoting?

Common theme: Clean up sloppy solves with good residuals

Can improve solution by *iterative refinement*: \[\begin{aligned}
PAQ &\approx LU \\
x_0 &\approx Q U^{-1} L^{-1} Pb \\
r_0 &= b-Ax_0 \\
x_1 &\approx x_0 + Q U^{-1} L^{-1} P r_0
\end{aligned}
\] Looks like approximate Newton on \(F(x) = Ax-b = 0\).

This is just a stationary iterative method!

Nonstationary methods work, too.

If we’re willing to sacrifice some on factorization,

- Single precision factor + double precision refinement?
- Sloppy factorizations (marginal stability) + refinement?
- Modify \(m\) small pivots as they’re encountered (low rank updates), fix with \(m\) steps of a Krylov solver?

- Sparse direct for 2D problems
- Gets more expensive for 3D problems
- Approximate direct solves make good preconditioners