CS 5220: Applications of Parallel Computers
Architecture basics
01 Sep 2015
Just for fun
Is this a fair portrayal of your CPU?
The idealized machine

- Address space of named words
- Basic operations are register read/write, logic, arithmetic
- Everything runs in program order
- High-level language translates into "obvious" machine code
- All operations take about the same amount of time
The real world
- Memory operations not all the same!
  - Registers/caches lead to variable access speeds
  - Memory layout affects performance (a lot)
- Instructions are non-obvious!
  - Pipelining allows instructions to overlap
  - Functional units run in parallel (and out of order)
  - Instructions take different amounts of time
  - Different costs for different orders/instruction mixes
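To make the memory layout point concrete, here is a minimal sketch in C. It assumes an n-by-n matrix of doubles stored in row-major order in one flat array; the function names are hypothetical and chosen only for illustration. Both functions compute the same sum, but the first touches adjacent addresses while the second jumps n doubles between accesses, so on a cached machine the second is typically slower once the matrix no longer fits in cache.

```c
#include <stddef.h>

/* Sum an n-by-n row-major matrix, walking along rows.
 * Consecutive iterations touch adjacent addresses (unit stride). */
double sum_row_order(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            s += a[i * n + j];
    return s;
}

/* Same entries, walking down columns.
 * Consecutive iterations are n doubles apart (stride n). */
double sum_col_order(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t j = 0; j < n; ++j)
        for (size_t i = 0; i < n; ++i)
            s += a[i * n + j];
    return s;
}
```

How large the gap is depends on the cache sizes, the compiler, and n, so it is worth timing both versions on the machine you care about rather than trusting a rule of thumb.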
Our goal: enough understanding to help the compiler.
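One way to see how instruction order changes cost, and what "helping the compiler" can look like, is a sum written two ways. This is a sketch with hypothetical function names, not a tuned kernel. In the first version every add waits on the previous one; in the second, four independent chains of adds can overlap in the pipeline. Compilers generally will not make this change to floating-point code on their own without flags that relax floating-point semantics, because it changes the order of additions and therefore the rounding.

```c
#include <stddef.h>

/* One accumulator: each add depends on the result of the previous add. */
double sum_one_chain(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; ++i)
        s += x[i];
    return s;
}

/* Four accumulators: four independent dependency chains.
 * Note that the additions happen in a different order than above,
 * so the rounded result may differ slightly for general inputs. */
double sum_four_chains(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    for (; i < n; ++i)   /* leftover elements when n is not a multiple of 4 */
        s0 += x[i];
    return (s0 + s1) + (s2 + s3);
}
```

Whether the second version is actually faster depends on the processor and on what the compiler already does with the first; measure before keeping the more complicated code.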
Prelude
Self-evident:
- One should not sacrifice correctness for speed
- One should not re-invent the wheel
- Your time matters more than computer time
Less obvious, but still true:
- Most time goes to a few bottlenecks
- Bottlenecks are hard to find without measuring
- Communication is expensive (common bottleneck)
- A little good hygiene will save your sanity
  - Automate testing, time carefully, use version control
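As one possible reading of "time carefully", the sketch below runs a workload several times and reports the fastest run, which is usually less disturbed by other activity on the machine than a single measurement. The names `now_seconds`, `time_min`, and `example_work` are hypothetical. The clock is C11 `timespec_get`, which is portable but reports calendar time and is not guaranteed to be monotonic; where it is available, a monotonic clock such as POSIX `clock_gettime` with `CLOCK_MONOTONIC` is likely a better choice.

```c
#include <time.h>

/* Wall-clock time in seconds, using C11 timespec_get (an assumption:
 * swap in a monotonic clock if your platform provides one). */
static double now_seconds(void)
{
    struct timespec t;
    timespec_get(&t, TIME_UTC);
    return (double) t.tv_sec + 1e-9 * (double) t.tv_nsec;
}

/* Run work(arg) ntrials times; return the fastest time in seconds,
 * or -1.0 if ntrials <= 0. */
double time_min(void (*work)(void *), void *arg, int ntrials)
{
    double best = -1.0;
    for (int t = 0; t < ntrials; ++t) {
        double t0 = now_seconds();
        work(arg);
        double elapsed = now_seconds() - t0;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Hypothetical workload for illustration: add up 0..n-1.
 * The volatile sink keeps the compiler from deleting the loop. */
void example_work(void *arg)
{
    long n = *(long *) arg;
    volatile long sink = 0;
    for (long i = 0; i < n; ++i)
        sink += i;
}
```

A call such as `time_min(example_work, &n, 5)` with `long n = 1000000;` returns the best of five runs. For very short workloads, the work should be repeated inside `work` so that each trial runs much longer than the clock resolution.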