# CS 5220

## Parallelism and locality in simulation

### Lumped parameter systems

## 17 Sep 2015
### Lumped parameter simulations

Examples include:

- SPICE-level circuit simulation
  - nodal voltages vs. voltage distributions
- Structural simulation
  - beam end displacements vs. continuum field
- Chemical concentrations in stirred tank reactor
  - mean concentrations vs. spatially varying concentrations

Typically involves ordinary differential equations (ODEs),
or ODEs with constraints (differential-algebraic equations, or DAEs).
Often (not always) *sparse*.

### Sparsity

Consider a system of ODEs $x' = f(x)$ (special case: $f(x) = Ax$)

• Dependency graph has edge $(i,j)$ if $f_j$ depends on $x_i$
• Sparsity means each $f_j$ depends on only a few $x_i$
• Often arises from physical or logical locality
• Corresponds to $A$ being a sparse matrix (mostly zeros)

### Sparsity and partitioning

Want to partition sparse graphs so that

• Subgraphs are same size (load balance)
• Cut size is minimal (minimize communication)

### Types of analysis

Consider $x' = f(x)$ (special case: $f(x) = Ax + b$).

- Static analysis ($f(x_*) = 0$)
  - Boils down to $Ax = b$ (e.g. for Newton-like steps)
  - Can solve directly or iteratively
  - Sparsity matters a lot!
- Dynamic analysis (compute $x(t)$ for many values of $t$)
  - Involves time stepping (explicit or implicit)
  - Implicit methods involve linear/nonlinear solves
  - Need to understand stiffness and stability issues
- Modal analysis (compute eigenvalues of $A$ or $f'(x_*)$)
### Explicit time stepping

- Example: forward Euler
- Next step depends only on earlier steps
- Simple algorithms
- May have stability/stiffness issues
### Implicit time stepping

- Example: backward Euler
- Next step depends on itself and on earlier steps
- Algorithms involve solves — complication, communication!
- Larger time steps, each step costs more
### A common kernel

In all these analyses, we spend lots of time in sparse matvec:

- Iterative linear solvers: repeated sparse matvec
- Iterative eigensolvers: repeated sparse matvec
- Explicit time marching: matvecs at each step
- Implicit time marching: iterative solves (involving matvecs)

We need to figure out how to make matvec fast!
### An aside on sparse matrix storage

- Sparse matrix $\implies$ mostly zero entries
- Can also have “data sparseness” — representation with less than $O(n^2)$ storage, even if most entries nonzero
- Could be implicit (e.g. directional differencing)
- Sometimes explicit representation is useful
- Easy to get lots of indirect indexing!
- Compressed sparse storage schemes help

### Example: Compressed sparse row storage

The CSR representation can be made even more compact:

• Could organize by blocks (block CSR)
• Could compress column index data (16-bit vs 64-bit)
• Various other optimizations — see OSKI