This lecture introduces the concept of scalable and autonomic mechanisms that might support complex large-scale distributed systems.
To motivate the topic, we look at some of the ambitious large-scale
technologies people are hoping to build today.
The core of the first part of the lecture is an example from the US
Air Force: a complex, modular, componentized system that would be distributed
over huge networks and would link many kinds of information sources to support
tactical decision-making, both in command centers and in the field. The purpose is to give students a glimpse
of a large-scale system that isn’t at all like the Akamai
We then look at how hard it can be for a system like this to coordinate its reactions to faults. Drilling down, we run into problems with fault detection and fault handling even in a trivial two-process scenario!
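
One common way to see the two-process difficulty is with a timeout-based failure detector. The sketch below is an illustration under our own assumptions and is not necessarily the scenario used in the lecture: the heartbeat schedule, the delays, the timeout value, and the helper names `observed_heartbeats` and `suspects` are all hypothetical. It shows that an observer watching heartbeats from a single peer sees exactly the same thing whether the peer has crashed or its messages are merely delayed.

```python
from typing import List, Optional


def observed_heartbeats(send_times: List[float],
                        delays: List[float],
                        crash_at: Optional[float] = None) -> List[float]:
    """Arrival times at the observer for heartbeats sent by the peer.

    A heartbeat is sent only if the peer has not crashed before its send time.
    """
    return [t + d for t, d in zip(send_times, delays)
            if crash_at is None or t < crash_at]


def suspects(arrivals: List[float], now: float, timeout: float) -> bool:
    """True if no heartbeat has arrived within the last `timeout` seconds."""
    seen = [a for a in arrivals if a <= now]
    last = max(seen) if seen else 0.0
    return now - last > timeout


# Hypothetical schedule: the peer sends one heartbeat per second.
SEND_TIMES = [1.0, 2.0, 3.0, 4.0, 5.0]
TIMEOUT = 2.0

# Scenario A: the peer crashes at t = 2.5, so only two heartbeats are sent.
crashed = observed_heartbeats(SEND_TIMES, [0.1] * 5, crash_at=2.5)

# Scenario B: the peer stays alive, but the network delays later messages.
slow = observed_heartbeats(SEND_TIMES, [0.1, 0.1, 10.0, 10.0, 10.0])

# At t = 5.0 the observer has seen identical arrivals in both scenarios,
# so any decision it makes must be the same in both.
suspect_crashed = suspects(crashed, now=5.0, timeout=TIMEOUT)
suspect_slow = suspects(slow, now=5.0, timeout=TIMEOUT)

# Later, the delayed heartbeats arrive and the suspicion of the slow peer
# turns out to have been a false alarm.
suspect_slow_later = suspects(slow, now=13.5, timeout=TIMEOUT)
```

Raising the timeout reduces false alarms but slows down detection of real crashes; no fixed value removes the ambiguity, which is one reason coordinated fault handling is hard even with only two processes.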