Prof David Bindel

Please click the play button below.

- Billiards, electrons, galaxies, ...
- Ants, cars, agents, ...?

Particles move via Newton (\(F = ma\)), with

- External forces: ambient gravity, currents, etc.
- Local forces: collisions, Van der Waals (\(1/r^6\)), etc.
- Far-field forces: gravity and electrostatics (\(1/r^2\)), etc.
- Simple approximations often apply (Saint-Venant)

Example force: \[f_i = \sum_j Gm_i m_j \frac{(x_j-x_i)}{r_{ij}^3} \left(1 - \left(\frac{a}{r_{ij}}\right)^{4} \right), \qquad r_{ij} = \|x_i-x_j\|\]

- Long-range attractive force (\(r^{-2}\))
- Short-range repulsive force (\(r^{-6}\))
- Go from attraction to repulsion at radius \(a\)

In Matlab, we can write

```
npts = 100;
t = linspace(0, tfinal, npts);
[tout, xyv] = ode113(@fnbody, t, [x; v], [], m, g);
xout = xyv(:,1:length(x))';
```

... but I can’t call ode113 in C in parallel (or can I?)

Maybe a fixed step leapfrog will do?

```
npts = 100;
steps_per_pt = 10;
dt = tfinal/(steps_per_pt*(npts-1));
xout = zeros(2*n, npts);
xout(:,1) = x;
for i = 1:npts-1
for ii = 1:steps_per_pt
x = x + v*dt;
a = fnbody(x, m, g);
v = v + a*dt;
end
xout(:,i+1) = x;
end
```

- Where do particles
live

(distributed mem)?- Decompose in space? By particle number?
- What about clumping?

- How are long-range force computations organized?
- How are short-range force computations organized?
- How is force computation load balanced?
- What are the boundary conditions?
- How are potential singularities handled?
- What integrator is used? What step control?

Simplest case: no particle interactions.

- Embarrassingly parallel (like Monte Carlo)!
- Could just split particles evenly across processors
- Is it that easy?
- Maybe some trajectories need short time steps?
- Even with MC, load balance may not be trivial!

- Simplest all-pairs check is \(O(n^2)\) (expensive)
- Or only check close pairs (via binning, quadtrees?)
- Communication required for pairs checked
- Usual model: domain decomposition

Minimize communication:

- Send particles that might affect a neighbor “soon”
- Trade extra computation against communication
- Want low surface area-to-volume ratios on domains

- Are particles evenly distributed?
- Do particles remain evenly distributed?
- Can divide space unevenly (e.g. quadtree/octtree)

- Every particle affects every other particle
- All-to-all communication required
- Overlap communication with computation
- Poor memory scaling if everyone keeps everything!

- Idea: pass particles in a round-robin manner

copy local particles to current buf for phase = 1:p send current buf to rank+1 (mod p) recv next buf from rank-1 (mod p) interact local particles with current buf swap current buf with next buf end

Suppose \(n = N/p\) particles in buffer. At each phase \[\begin{aligned} t_{\mathrm{comm}} & \approx \alpha + \beta n \\ t_{\mathrm{comp}} & \approx \gamma n^2 \end{aligned}\] So we can mask communication with computation if \[n \geq \frac{1}{2\gamma} \left( \beta + \sqrt{\beta^2 + 4 \alpha \gamma} \right) > \frac{\beta}{\gamma}\]

More efficient serial code

\(\implies\) larger \(n\) needed to mask communication!

\(\implies\) worse speed-up as \(p\) gets larger (fixed \(N\))

but scaled speed-up (\(n\) fixed) remains unchanged.

Consider \(r^{-2}\) electrostatic potential interaction

- Enough charges looks like a continuum!
- Poisson maps charge distribution to potential
- Fast Poisson for regular grids (FFT, multigrid)
- Approx depends on mesh and particle density
- Can clean up leading part of approximation error

- Map particles to mesh points (multiple strategies)
- Solve potential PDE on mesh
- Interpolate potential to particles
- Add correction term – acts like local force

- Distance simplifies things
- Andromeda looks like a point mass from here?

- Build a tree, approx descendants at each node
- Variants: Barnes-Hut, FMM, Anderson’s method
- More on this later in the semester

- Model: Continuous motion of particles
- Could be electrons, cars, whatever...

- Step through discretized time

- Local interactions
- Relatively cheap
- Load balance a pain

- All-pairs interactions
- Obvious algorithm is expensive (\(O(n^2)\))
- Particle-mesh and tree-based algorithms help

An important special case of lumped/ODE models.