# CS 5220
## Parallelism and locality in simulation
### Distributed parameter systems
## 17 Sep 2015
Distributed parameter problems
Mostly PDEs:
| Type |
Example |
Time? |
Space dependence? |
| Elliptic |
electrostatics |
steady |
global |
| Hyperbolic |
sound waves |
yes |
local |
| Parabolic |
diffusion |
yes |
global |
Different types involve different communication:
- Global dependence $\implies$ lots of communication
- Finite wave speed $\implies$ local dependence
Example: 1D heat equation
Consider flow (e.g. of heat) in a uniform rod
- Heat ($Q$) $\propto$ temperature ($u$) $\times$ mass ($\rho h$)
- Heat flow $\propto$ temperature gradient (Fourier’s law)
Example: 1D heat equation
$$\begin{aligned}
\frac{\partial Q}{\partial t} \propto
h \frac{\partial u}{\partial t} &\approx
C \left[ \left( \frac{u(x-h)-u(x)}{h} \right) +
\left( \frac{u(x)-u(x+h)}{h} \right) \right] \\
\frac{\partial u}{\partial t} &\approx
C \left[ \frac{u(x-h)-2u(x)+u(x+h)}{h^2} \right] \rightarrow
C \frac{\partial^2 u}{\partial x^2}
\end{aligned}$$
Spatial discretization
Heat equation with $u(0) = u(1) = 0$
$$\frac{\partial u}{\partial t} = C \frac{\partial^2 u}{\partial x^2}$$
Spatial semi-discretization:
$$\frac{\partial^2 u}{\partial x^2} \approx \frac{u(x-h)-2u(x)+u(x+h)}{h^2}$$
Spatial discretization
Yields a system of ODEs
$$
\begin{align}
\frac{du}{dt} & = C h^{-2} (-T) u \\
&= -C h^{-2}
\begin{bmatrix}
2 & -1 & & & & \\
-1 & 2 & -1 & & & \\
& \ddots & \ddots & \ddots & \\
& & -1 & 2 & -1 \\
& & & -1 & 2
\end{bmatrix}
\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_{n-2} \\ u_{n-1} \end{bmatrix}
\end{align}$$
Explicit time stepping
Approximate PDE by ODE system (“method of lines”):
$$\frac{du}{dt} = C h^{-2} T u$$
Now need a time-stepping scheme for the ODE:
- Simplest scheme is Euler:
$$u(t+\delta) \approx u(t) + u'(t) \delta =
\left( I - C \frac{\delta}{h^2} T \right) u(t)$$
- Taking a time step $\equiv$ sparse matvec with
$\left( I - C \frac{\delta}{h^2} T \right)$
- This may not end well...
Explicit time stepping data dependence

Nearest neighbor interactions per step $\implies$
finite rate of numerical information propagation
Explicit time stepping in parallel

for t = 1 to N
communicate boundary data ("ghost cell")
take time steps locally
end
Overlap communication + computation

for t = 1 to N
start boundary data sendrecv
compute new interior values
finish sendrecv
compute new boundary values
end
Batching time steps

for t = 1 to N by B
start boundary data sendrecv (B values)
compute new interior values
finish sendrecv (B values)
compute new boundary values
end
Explicit pain

Unstable for $\delta > O(h^2)$!
### Implicit time stepping
- Backward Euler uses backward difference for $d/dt$
$$u(t+\delta) \approx u(t) + u'(t + \delta t) \delta$$
- Taking a time step $\equiv$ sparse matvec with
$\left( I + C \frac{\delta}{h^2} T \right)^{-1}$
- No time step restriction for stability (good!)
- But each step involves linear solve (not so good!)
- Good if you like numerical linear algebra?
Explicit and implicit
- Propagates information at finite rate
- Steps look like sparse matvec (in linear case)
- Stable step determined by fastest time scale
- Works fine for hyperbolic PDEs
Explicit and implicit
- No need to resolve fastest time scales
- Steps can be long... but expensive
- Linear/nonlinear solves at each step
- Often these solves involve sparse matvecs
- Critical for parabolic PDEs
Poisson problems
Consider 2D Poisson
$$-\nabla^2 u =
\frac{\partial^2 u}{\partial x^2} +
\frac{\partial^2 u}{\partial y^2} = f
$$
- Prototypical elliptic problem (steady state)
- Similar to a backward Euler step on heat equation
Poisson problem discretization

$$
u_{i,j} = h^{-2} \left( 4u_{i,j}-u_{i-1,j}-u_{i+1,j}-u_{i,j-1}-u_{i,j+1} \right)
$$
Poisson problem discretization
$$L =
\left[
\begin{array}{ccc|ccc|ccc}
4 & -1 & & -1 & & & & & \\
-1 & 4 & -1 & & -1 & & & & \\
& -1 & 4 & & & -1 & & & \\ \hline
-1 & & & 4 & -1 & & -1 & & \\
& -1 & & -1 & 4 & -1 & & -1 & \\
& & -1 & & -1 & 4 & & & -1 \\ \hline
& & & -1 & & & 4 & -1 & \\
& & & & -1 & & -1 & 4 & -1 \\
& & & & & -1 & & -1 & 4
\end{array}
\right]$$
Poisson solvers in 2D
Demmel, Applied Numerical Linear Algebra.
| Method |
Time |
Space |
| Dense LU |
$N^3$ |
$N^2$ |
| Band LU |
$N^2$ |
$N^{3/2}$ |
| Jacobi |
$N^2$ |
$N$ |
| Explicit inv |
$N^2$ |
$N^2$ |
| CG |
$N^{3/2}$ |
$N$ |
| Red-black SOR |
$N^{3/2}$ |
$N$ |
| Sparse LU |
$N^{3/2}$ |
$N \log N$ |
| FFT |
$N \log N$ |
$N$ |
| Multigrid |
$N$ |
$N$ |
Remember: best MFlop/s $\neq$ fastest solution!
### General implicit picture
- Implicit solves or steady state $\implies$ solving systems
- Nonlinear solvers generally linearize
- Linear solvers can be
- Direct (hard to scale)
- Iterative (often problem-specific)
- Iterative solves boil down to matvec!
### PDE solver summary
Can be implicit or explicit (as with ODEs)
- Explicit (sparse matvec) - fast, but short steps?
- works fine for hyperbolic PDEs
- Implicit (sparse solve)
- Direct solvers are hard!
- Sparse solvers turn into matvec again
### PDE solver summary
- Differential operators turn into local mesh stencils
- Matrix connectivity looks like mesh connectivity
- Can partition into subdomains that communicate only through
boundary data
- More on graph partitioning later
- Not all nearest neighbor ops are equally efficient!
- Depends on mesh structure
- Also depends on flops/point