# CS 5220
## Parallelism and locality in simulation
### Lumped parameter systems
## 17 Sep 2015
### Lumped parameter simulations
- SPICE-level circuit simulation
    - nodal voltages vs. voltage distributions
- Structural simulation
    - beam end displacements vs. continuum field
- Chemical concentrations in stirred tank reactor
    - mean concentrations vs. spatially varying
Typically involves ordinary differential equations (ODEs),
sometimes with algebraic constraints (differential-algebraic equations, or DAEs).
Often (not always) *sparse*.
### Sparsity

Consider a system of ODEs $x' = f(x)$ (special case: $f(x) = Ax$)
- Dependency graph has edge $(i,j)$ if $f_j$ depends on $x_i$
- Sparsity means each $f_j$ depends on only a few $x_i$
    - Often arises from physical or logical locality
    - Corresponds to $A$ being a sparse matrix (mostly zeros)
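A small sketch of how locality produces sparsity (this example is illustrative, not from the slides): in a 1D chain where node $i$ interacts only with its neighbors, each $f_i$ depends on just $x_{i-1}, x_i, x_{i+1}$, so the Jacobian is tridiagonal.

```python
def chain_jacobian(n):
    """Dense Jacobian of f_i(x) = x[i-1] - 2*x[i] + x[i+1] (ends truncated)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = -2.0
        if i > 0:
            A[i][i - 1] = 1.0       # coupling to left neighbor
        if i < n - 1:
            A[i][i + 1] = 1.0       # coupling to right neighbor
    return A

n = 100
A = chain_jacobian(n)
nnz = sum(1 for row in A for a in row if a != 0.0)
print(nnz, n * n)   # 298 nonzeros out of 10000 entries
```

Only $3n-2$ of the $n^2$ entries are nonzero, which is exactly why sparse storage and sparse algorithms pay off.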
### Sparsity and partitioning
Want to partition sparse graphs so that
- Subgraphs are the same size (load balance)
- Cut size is minimal (minimize communication)
We’ll talk more about this later.
### Types of analysis
Consider $x' = f(x)$ (special case: $f(x) = Ax + b$).
- Static analysis ($f(x_*) = 0$)
    - Boils down to $Ax = b$ (e.g. for Newton-like steps)
    - Can solve directly or iteratively
    - Sparsity matters a lot!
- Dynamic analysis (compute $x(t)$ for many values of $t$)
    - Involves time stepping (explicit or implicit)
    - Implicit methods involve linear/nonlinear solves
    - Need to understand stiffness and stability issues
- Modal analysis (compute eigenvalues of $A$ or $f'(x_*)$)
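To make the "Newton-like steps" concrete, here is a minimal sketch (test problem and names are illustrative, not from the slides): each Newton iteration for $f(x_*) = 0$ solves a linear system $J(x)\,\delta x = -f(x)$, which is the $Ax = b$ the static-analysis bullet refers to.

```python
def newton2(f, J, x, iters=20):
    """Newton iteration in 2D: each step solves the 2x2 system J(x) dx = -f(x)."""
    for _ in range(iters):
        f1, f2 = f(x)
        (a, b), (c, d) = J(x)
        det = a * d - b * c
        dx1 = (-f1 * d + f2 * b) / det   # Cramer's rule for the 2x2 solve
        dx2 = (-a * f2 + c * f1) / det
        x = (x[0] + dx1, x[1] + dx2)
    return x

# Toy equilibrium problem: intersect the unit circle with the line x = y.
f = lambda p: (p[0] ** 2 + p[1] ** 2 - 1.0, p[0] - p[1])
J = lambda p: ((2.0 * p[0], 2.0 * p[1]), (1.0, -1.0))
root = newton2(f, J, (1.0, 0.5))   # converges to (sqrt(2)/2, sqrt(2)/2)
```

For a large lumped model the $2 \times 2$ solve becomes a big *sparse* solve, done directly or iteratively as the slide notes.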
### Explicit time stepping
- Example: forward Euler
- Next step depends only on earlier steps
- Simple algorithms
- May have stability/stiffness issues
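A minimal forward Euler sketch (the decay test problem is illustrative): the next state depends only on the current one, so each step is just a function evaluation and an update, with no solve.

```python
def forward_euler(f, x0, h, nsteps):
    """Explicit time stepping: x_{n+1} = x_n + h * f(x_n)."""
    x = x0
    for _ in range(nsteps):
        x = x + h * f(x)   # no solve needed
    return x

# Decay problem x' = -x, x(0) = 1; exact answer at t = 1 is exp(-1) ~ 0.3679.
x1 = forward_euler(lambda x: -x, 1.0, 0.001, 1000)
```

The stability caveat in the last bullet shows up here: for $x' = \lambda x$, forward Euler only damps the solution when $|1 + h\lambda| \le 1$, so stiff problems (large negative $\lambda$) force tiny steps.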
### Implicit time stepping
- Example: backward Euler
- Next step depends on itself and on earlier steps
- Algorithms involve solves — complication, communication!
- Can take larger time steps, but each step costs more
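A backward Euler sketch for the scalar linear case (illustrative, not from the slides): each step solves $(1 - ha)\,x_{n+1} = x_n$. For a scalar the "solve" is one division; for a system it becomes the sparse linear solve the bullets above warn about.

```python
def backward_euler_linear(a, x0, h, nsteps):
    """Implicit time stepping for x' = a*x: solve (1 - h*a) x_{n+1} = x_n."""
    x = x0
    for _ in range(nsteps):
        x = x / (1.0 - h * a)   # the (here trivial) implicit solve
    return x

# Stiff decay x' = -1000*x: stable even with h = 0.1, far beyond the
# forward Euler stability limit of h < 0.002 for this problem.
x1 = backward_euler_linear(-1000.0, 1.0, 0.1, 10)
```

This is the trade-off on the slide: the implicit step tolerates much larger $h$, but each step now costs a solve.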
### A common kernel
In all these analyses, spend lots of time in sparse matvec:
- Iterative linear solvers: repeated sparse matvec
- Iterative eigensolvers: repeated sparse matvec
- Explicit time marching: matvecs at each step
- Implicit time marching: iterative solves (involving matvecs)
We need to figure out how to make matvec fast!
### An aside on sparse matrix storage
- Sparse matrix $\implies$ mostly zero entries
- Can also have “data sparseness” — representation with less than
  $O(n^2)$ storage, even if most entries nonzero
    - Could be implicit (e.g. directional differencing)
- Sometimes explicit representation is useful
- Easy to get lots of indirect indexing!
- Compressed sparse storage schemes help
### Example: Compressed sparse row storage
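A small sketch of the CSR layout and the matvec kernel it supports (array names follow common convention; the example matrix is illustrative): three arrays hold the row pointers, column indices, and values of the nonzeros, and the matvec walks them with the indirect indexing mentioned above.

```python
def dense_to_csr(A):
    """Pack the nonzeros of a dense matrix into CSR arrays (ptr, ind, val)."""
    ptr, ind, val = [0], [], []
    for row in A:
        for j, a in enumerate(row):
            if a != 0.0:
                ind.append(j)   # column index of this nonzero
                val.append(a)   # its value
        ptr.append(len(ind))    # row i's nonzeros live in [ptr[i], ptr[i+1])
    return ptr, ind, val

def csr_matvec(ptr, ind, val, x):
    """y = A*x using CSR storage; note the indirect access x[ind[k]]."""
    y = [0.0] * (len(ptr) - 1)
    for i in range(len(y)):
        for k in range(ptr[i], ptr[i + 1]):
            y[i] += val[k] * x[ind[k]]
    return y

A = [[4.0, 0.0, 1.0],
     [0.0, 3.0, 0.0],
     [1.0, 0.0, 2.0]]
ptr, ind, val = dense_to_csr(A)
y = csr_matvec(ptr, ind, val, [1.0, 1.0, 1.0])   # [5.0, 3.0, 3.0]
```

Storage drops from $n^2$ entries to one pointer per row plus an index and a value per nonzero.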
The CSR format can be made even more compact:
- Could organize by blocks (block CSR)
- Could compress column index data (16-bit vs 64-bit)
- Various other optimizations — see OSKI