
Let's repeat the top-to-bottom thinking now and ask how big data centers come to have the structure illustrated here.

Last time we talked about one aspect of this structure: the need to load-balance and offer a degree of affinity. This leads to a notion Jim Gray calls RAPS of RACS: a service is split into partitions (RAPS, a reliable array of partitioned services), and each partition is itself a cluster of clones that replicate its data to share the load (RACS, a reliable array of cloned services).

As with the discovery example, Web Services and CORBA simply don't tackle the many issues seen in such systems! And there are a lot of them that we would need to cope with.

Today we'll focus on performance. Suppose you create a new venture, eStuff.com, and it becomes more and more of a success. How will you scale up the key services, assuming that for now all we care about is performance? Note that the book has little on this topic.

Probably you'll start by just building and using a first cut at the solution. It will get overloaded and you'll multithread it. But now you hit new issues:

- With lots of threads it thrashes and dies.

- By recoding you can switch to an event style of programming (the book mentions this, but only briefly).

In fact the outcome of such thinking is to use a SEDA style of program structure (not discussed in the book... maybe in the next edition). In this lecture I review the ideas: pipelined multi-stage programs, event queues, thread "pools". See the two SOSP papers cited in the slide set for details.
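To make the three ingredients concrete (pipelined stages, event queues, thread pools), here is a minimal sketch, assuming Python's standard `queue` and `threading` modules. The names `Stage`, `run_pipeline`, `PRICES`, `parse`, and `lookup` are hypothetical and chosen only for illustration; they do not come from the SOSP papers or the book. It also leaves out things a real SEDA system would need, such as error handling and controllers that resize the pools under load.

```python
import queue
import threading

class Stage:
    """One stage: a bounded event queue drained by a small, fixed thread pool."""

    def __init__(self, name, handler, workers=2, capacity=100, downstream=None):
        self.name = name
        self.handler = handler          # function applied to each event
        self.downstream = downstream    # next Stage, or None for the last stage
        self.events = queue.Queue(maxsize=capacity)
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, event):
        # Blocks when the queue is full, which pushes back on the upstream stage
        # instead of letting work pile up without bound.
        self.events.put(event)

    def _run(self):
        while True:
            event = self.events.get()
            try:
                result = self.handler(event)
                if self.downstream is not None:
                    self.downstream.submit(result)
            finally:
                self.events.task_done()

    def drain(self):
        # Returns once every event submitted so far has been handled.
        self.events.join()

PRICES = {"widget": 5, "gadget": 12}    # hypothetical eStuff.com catalog

def parse(raw):
    return raw.strip().lower()

def lookup(item):
    return (item, PRICES.get(item))

def run_pipeline(raw_requests):
    replies = queue.Queue()
    respond = Stage("respond", replies.put, workers=1)
    price = Stage("lookup", lookup, workers=4, downstream=respond)
    front = Stage("parse", parse, workers=2, downstream=price)
    for raw in raw_requests:
        front.submit(raw)
    # Drain in pipeline order so each stage has forwarded everything it received.
    for stage in (front, price, respond):
        stage.drain()
    # Stages run concurrently, so arrival order is not guaranteed; sort for display.
    return sorted(replies.get() for _ in range(len(raw_requests)))
```

The point of the design is that the number of threads is fixed per stage, not per request: a burst of load lengthens the queues rather than multiplying threads, which is what avoids the thrashing seen in the thread-per-request version.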

With this approach you manage to scale up, but suppose load keeps growing. Eventually we are using our single machine to the max. What next?

This leads to notions of clustering and load-balancing, and to issues of affinity and caching. Here I mention Jim Gray's RAPS and RACS concept (a paper can be found if you need to read about this, but in fact that paper goes in a direction different from ours). Our point: the Web Services architecture, and CORBA, offer surprisingly little help for the programmer trying to follow this path!
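The routing idea behind partitions of clones can be sketched in a few lines. This is only an illustration under stated assumptions: the class `PartitionedService`, its methods, and the server names are hypothetical, and the choice of hashing the key to pick a partition and rotating round-robin among its clones is one simple policy, not something prescribed by Gray's paper or by Web Services or CORBA.

```python
import hashlib
import itertools

class PartitionedService:
    """Route a request key to a partition, then to one clone within it."""

    def __init__(self, partitions):
        # partitions: list of lists; each inner list names the clones
        # (replicas) that all hold the same partition of the data.
        self.partitions = partitions
        self._next_clone = [itertools.cycle(clones) for clones in partitions]

    def partition_for(self, key):
        # A stable hash gives affinity: the same key always maps to the same
        # partition, so that partition's caches stay warm for it.
        # (Python's built-in hash() is randomized per process for strings,
        # so it would not be stable across restarts.)
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def route(self, key):
        # Within the partition, spread load across the clones round-robin.
        return next(self._next_clone[self.partition_for(key)])

service = PartitionedService([
    ["a1", "a2"],   # partition 0 and its two clones (hypothetical names)
    ["b1", "b2"],   # partition 1
    ["c1", "c2"],   # partition 2
])
first = service.route("customer-42")
second = service.route("customer-42")
```

Here `first` and `second` name different clones of the same partition: affinity decides the partition, load-balancing decides the clone. Everything this sketch ignores (keeping the clones' data consistent, noticing that a clone has failed, repartitioning as the cluster grows) is exactly the part the standard architectures leave to the programmer.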

This leads to the recognition that these fantastic new architectures really omit a whole lot of stuff we would ideally wish to have found in them, like cluster management, fault-tolerance, replication mechanisms, etc. In CS514 we want to solve such problems and then find ways to elegantly integrate the solutions back into Web Services, CORBA, or other SOA environments.