CS514 Spring 2008
Assignment #3
Due: May 9, 2008
Objective
In your future career, you will inevitably discover that not just building,
but even designing all details of any sufficiently large and complex application
is far beyond the capabilities of a single person, and furthermore, that any
large system has to constantly evolve to adapt to changing requirements. For
these reasons, a component-oriented approach, modularity, and separation of
concerns are the key to success. They are the key factors that enable a
technical architect to divide work in a 10-person team in a way that leaves each
developer a manageable and reasonably isolated piece of work to focus on. They
allow team members to specialize and gain expertise in their areas, thereby
freeing the technical lead from the need to be an expert on everything
project-related, a situation that usually ends in complete failure. They also
allow the team to re-implement selected parts of a system to adapt to
changing requirements without breaking the whole, and to save development time
by reusing existing general-purpose software components implemented by others.
Designing distributed systems in a modular manner is particularly difficult
because we lack tools and language support. For desktop applications, modularity
involves a heavy use of things like object-oriented programming, which allows us
to hide implementation details of a part of the application behind a generic
interface, and replace the implementation with another one that matches the same
interface without breaking inter-component dependencies. But distributed systems
don't even have a true object-oriented language to begin with. The closest we
get to having one is through the use of web services. But web services are only
useful for a limited range of applications where scalability and performance
aren't essential. For example, they are an awfully poor match for implementing
an Internet-scale video streaming service or a massively multiplayer game.
Besides being relatively slow and heavy-weight due to XML serialization
overhead, among other factors, web services are also a client-server technology
in the first place, and as such, they are a poor match for the modern Internet,
which is becoming predominantly peer-to-peer. In a system with hundreds of
thousands of casual users spread across a large area, any technology that places
too much overhead on a centralized client-server infrastructure is doomed to be
expensive and difficult to maintain. It will suffer from bottlenecks and from
competition with substantially cheaper and more efficient services that can
leverage the potential for direct peer-to-peer interactions.
In real large-scale systems, the closest we get to modular design is through
protocol layers. Higher-level protocols that deal with more abstract tasks such
as replication or content delivery are designed to leverage lower-level
protocols that might deal with tasks such as delivering packets from A to B, or
locating A and B in the first place. For example, the Internet infrastructure
itself involves a routing infrastructure, on top of which we run DNS. We have a
layer that allows machines on the internet to establish point-to-point
connections, on top of which one may build an overlay, which is then used as a
basis for a content-distribution service. Fault-tolerant systems often involve
layers that deal with detecting failures, organizing nodes across the internet
into groups, which run various multicast or replication protocols that are
further used as a basis for replicated data structures, replicated services, a
means of making consistent configuration changes etc. Implementation of each of
the layers can often change without affecting the implementation of the layers
below or above it. For example, your browser can fetch content from this website
through a direct or a VPN connection. In one case, it will talk to the server
directly, whereas in the other, it may involve additional layers for encryption
and tunneling of the traffic, but the protocol used by the browser doesn't need
to deal with any of these aspects. Similarly, you can run an overlay network or
a BitTorrent client on top of either type of connection without changing it at
all.
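The layering idea above can be illustrated with a minimal Python sketch. All
names here (Transport, DirectTransport, TunnelTransport, Fetcher, echo_server,
tunnel_endpoint) are hypothetical and are not part of any framework used in this
course, and the XOR "tunnel" is only a toy stand-in for real VPN encryption. The
point is that the upper layer (Fetcher) is unchanged whichever transport sits
underneath it.

```python
from abc import ABC, abstractmethod
from typing import Callable


class Transport(ABC):
    """Lower layer: carries raw bytes to a server and returns its reply."""

    @abstractmethod
    def exchange(self, payload: bytes) -> bytes:
        ...


class DirectTransport(Transport):
    """Hands the bytes straight to the (simulated) server."""

    def __init__(self, server: Callable[[bytes], bytes]) -> None:
        self._server = server

    def exchange(self, payload: bytes) -> bytes:
        return self._server(payload)


class TunnelTransport(Transport):
    """Adds an extra layer that obfuscates traffic in both directions.

    XOR with a fixed key is a toy placeholder, NOT real encryption.
    """

    def __init__(self, inner: Transport, key: int = 0x5A) -> None:
        self._inner = inner
        self._key = key

    def _xor(self, data: bytes) -> bytes:
        return bytes(b ^ self._key for b in data)

    def exchange(self, payload: bytes) -> bytes:
        return self._xor(self._inner.exchange(self._xor(payload)))


def tunnel_endpoint(server: Callable[[bytes], bytes], key: int = 0x5A):
    """Server-side end of the toy tunnel: unwrap, serve, wrap the reply."""

    def handle(wire: bytes) -> bytes:
        request = bytes(b ^ key for b in wire)
        return bytes(b ^ key for b in server(request))

    return handle


def echo_server(request: bytes) -> bytes:
    """Simulated server that echoes the request back with a status prefix."""
    return b"200 OK: " + request


class Fetcher:
    """Upper layer: knows the request format, not how the bytes travel."""

    def __init__(self, transport: Transport) -> None:
        self._transport = transport

    def fetch(self, path: str) -> str:
        return self._transport.exchange(f"GET {path}".encode()).decode()
```

A Fetcher built on DirectTransport(echo_server) and one built on
TunnelTransport(DirectTransport(tunnel_endpoint(echo_server))) return the same
reply for the same path, even though the bytes on the simulated wire differ.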
By now, having worked on the previous assignment, you will understand that
working with protocol layers as a means of structuring your application is not
easy. While a desktop application developer may simply create a new object and
declare a new variable to represent a component, a distributed system developer
often needs to maintain state in communication channels. This is not always
necessary. For parts of the application that don't involve heavy traffic, such
as a component that allows users to log in, fetch an update, register an
address with a centralized repository, or look up a service, web services are
perfectly adequate. However, in any part of the system that involves a heavy
volume of traffic, such as transactions, video streams, events from sensors, or
simultaneous updates to thousands of documents accessed by thousands of users,
the use of peer-to-peer technologies that carry data and events directly
between producers and consumers, without the bottleneck of going through a
central server, is a must if the system is meant to scale.
This assignment is meant to be an exercise in modular distributed system
design. We are asking you to use the QuickSilver Live Objects framework to build a
complete distributed application that consists of multiple interconnected
distributed components. Some of these components will be simple agents that live
in a single location, but many will be distributed protocols, such as multicast,
and some will be composed of or built upon one or more other components, much in
the way your shared document in the previous assignment was built upon an
underlying multicast channel.
There are two key aspects of this project, both of which must be present in
your solution and will count towards the grade.
- Your system should not exclusively involve web services. While it is
perfectly ok to use web services for many parts of it, you should also
leverage distributed protocols. Your system should make significant use of
multicast, and may also use other protocols you yourself develop, either
from scratch or by wrapping existing tools or libraries to fit into the live
objects framework. The less your system relies on web services for
tasks such as event or content delivery, storage, and others that involve
heavy throughput or large volume of data, the better. This being said, there
are tasks for which, as mentioned earlier, web services are perfectly
adequate, and you should use them when appropriate. Also, you only have a
limited amount of time for this assignment, so being extreme in this aspect
and trying to use multicast everywhere you can might not be the best idea.
If unsure, you should consult with us sometime early in the planning stage.
- Your system should be modular, and the more modular it is, the better.
Building even the most fascinating monolithic application that will break
apart if any piece of it is touched will defeat the purpose of this
assignment. Your system should involve multiple distributed components, and
ideally multiple different types of distributed components. Sensors,
multicast channels, mash-ups, storage objects, directories, containers,
logic objects etc. are all good examples of what we have in mind. The
components do not need to be complex and do not necessarily need
sophisticated logic. Indeed, since you have limited time, most components
should be rather simple, and it is perfectly acceptable to leverage existing
code. The best applications would be composed of reusable components that
someone might take in the present form and use elsewhere, possibly for an
entirely different purpose. If your system can function not just as an
application, but as some kind of a toolkit for building a certain class of
applications, that's even better.
Teams
You may work alone, or in a team of as many as 3 co-workers. We strongly
encourage you to work as a team.
Optional for M. Eng. Credit
In addition to implementing the above, we expect you to discuss the
reliability and fault-tolerance aspects of your application (what you can
guarantee about your implementation and why, what you know you can't
guarantee, what the vulnerabilities are, and how one might avoid them). You
don't need to implement anything. We also ask you to evaluate the performance
and scalability of the system and identify the bottlenecks and apparent
vulnerabilities. How many sensors, how many users, what data rates can the
system support? You don't need to be comprehensive, but you should have some
argument backed by data that you might extrapolate. For example, by running the
system with 1, 2, 4, and 8 users, you might show that overhead grows or
performance decreases in a certain way, and speculate when it might collapse or
drop below some threshold. It doesn't matter for this project if the
implementation is scalable, only that you can understand its limitations.
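One way to turn a handful of measurements into the kind of extrapolation
described above is to fit a simple curve and solve for the point where it
crosses a threshold. The sketch below assumes, purely for illustration, that the
measured quantity grows as a power law in the number of users; your data may
well follow a different shape, in which case a different model is needed. The
function names and the numbers in the example run are hypothetical placeholders,
not measurements of any real system.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(users: Sequence[float],
                  values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values ~ a * users**b, done in log-log space.

    Requires positive inputs and at least two distinct user counts.
    """
    xs = [math.log(u) for u in users]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = exponent of the power law
    a = math.exp(mean_y - b * mean_x)  # intercept = scale factor
    return a, b


def users_at_threshold(a: float, b: float, threshold: float) -> float:
    """User count at which the fitted curve a * n**b reaches the threshold.

    Assumes b is non-zero, i.e. the fitted curve is not flat.
    """
    return (threshold / a) ** (1.0 / b)


if __name__ == "__main__":
    # PLACEHOLDER numbers for illustration only; replace with your own data.
    user_counts = [1, 2, 4, 8]
    latency_ms = [12.0, 19.0, 33.0, 61.0]
    a, b = fit_power_law(user_counts, latency_ms)
    limit = users_at_threshold(a, b, threshold=500.0)
    print(f"fit: latency ~ {a:.2f} * n^{b:.2f}; reaches 500 ms near n = {limit:.0f}")
```

Four data points support only a rough estimate, so it is worth stating the
assumed model explicitly and explaining why you believe it holds beyond the
range you measured.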
Details
We suggest that you build one of the following three example applications.
Customizations are possible, and we are open to other ideas, but we would like
to keep the list reasonably short, and if you want to deviate from what we
propose, you should consult with us first.
- A monitoring application for a data center.
Think of a large enterprise network with thousands of machines
scattered across multiple office buildings. Companies invest a lot of
resources into maintaining and monitoring their network and computing
infrastructure, and the lack of customizable tools that can be easily
adapted to support proprietary hardware is a headache. Your task will be to
design a system that can collect information from a variety of customizable
distributed agents, present it in the form of mash-ups that multiple users
can access, and possibly modify, and perhaps allow the user to perform
certain actions, such as running a script, changing a configuration setting,
or deploying a file on one or more machines in the data center.
Internally, your application will involve several types of components:
- Agents or sensors that tap into the local resources on a machine on
which they run, and either pump information they collect into a
multicast channel, or invoke certain local actions based on requests
received from multicast channels. For example, agents might leverage the
Windows Management Instrumentation (WMI) interface and performance
counters to read information such as processor usage, average network
throughput, the number of transmission failures in the last minute,
the temperature on a fan, or recent errors from the system event log, or
to start a service, replace a library, or modify the local registry. You
might also create agents that tap into databases or other local
applications. Agents should be customizable: the system should not
assume that it works only with a fixed set of hardcoded agent types;
users should be able to create and deploy new agents.
Ideally, each agent would be a small live object.
- Multicast channels that carry information from agents to users.
- Objects that visualize the information obtained from agents, perhaps
after processing it a bit. For example, one object might simply display
a number, either in a numerical format or as some widget, while another
might display a history of values over a period of time or a histogram
etc. Ideally, this would include mash-up objects that can present
information from multiple agents on what looks like a small webpage. For
example, a page could show a graph showing database requests per second
over the last hour alongside a list of events that need urgent
attention, machines that are overheating or that have some key services
down etc. The users should be able to modify existing mash-ups and save
them for others to use. Every piece of data from agents and every
mash-up should be viewable by multiple users at a time, and if it is
customizable, it should correctly deal with situations where multiple
users try to access and modify it.
- Means by which agent code and mash-ups created by users are stored.
At the very minimum, this could be some central repository, but we would
encourage you to leverage the work you did for the previous assignment,
store the mash-ups in channels, and allow the users to edit them
concurrently. Note that ideally, you would need to ensure that mash-ups
that are not being viewed by anyone are somehow "saved" before the
last user disconnects from the channel and the data is lost. While
implementing this is not required for this assignment, we'll value it
extra if you do. To keep things simple, you can leverage web services to
keep track of the clients who have a given document open and have a
dedicated server connect to the channel where the document is stored
before the last client closes the document. Other solutions are possible
(and, as mentioned above, the lack of a solution is also acceptable).
You can assume that the clients are generally well behaved and will not
crash unexpectedly.
- Means by which users can discover what viewable information is out
there and access it, and by which new mash-ups created by the user can
be published and discovered by other users.
Although the "wow" factor counts in this project, we do not require fancy
graphics, and there's no need to implement every possible sensor or deal
with every possible problem that you might encounter when building the
system. What matters the most is that the application has a good
architecture and offers a degree of customizability (adding new types of
sensors to a running system without recompiling the entire project, changing
settings on existing sensors, editing mash-ups). This being said, if your
system implements a virtual systems lab where the user can walk between
virtual servers and virtual displays as in a role-playing game and look at
the virtual sensors that represent information collected from the agents, it
will count extra.
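The agent and channel roles described for this application can be sketched in a
few lines of Python. This is only an illustration under stated assumptions: the
Channel class is an in-process stand-in for a real multicast channel, and
AgentRegistry, free_disk_bytes, and the message fields are hypothetical names,
not part of the Live Objects framework. The sketch shows how new agent types can
be registered at run time by name instead of being hardcoded.

```python
import shutil
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


class Channel:
    """In-process stand-in for a multicast channel.

    Every subscriber receives every published message.
    """

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Message], None]] = []

    def subscribe(self, callback: Callable[[Message], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: Message) -> None:
        for callback in list(self._subscribers):
            callback(message)


class AgentRegistry:
    """Maps an agent type name to a function that takes one reading."""

    def __init__(self) -> None:
        self._readers: Dict[str, Callable[[], Any]] = {}

    def register(self, kind: str, reader: Callable[[], Any]) -> None:
        """Add (or replace) an agent type without touching existing code."""
        self._readers[kind] = reader

    def kinds(self) -> List[str]:
        return sorted(self._readers)

    def poll(self, kind: str, host: str, channel: Channel) -> None:
        """Take one reading and pump it into the channel."""
        value = self._readers[kind]()
        channel.publish({"host": host, "kind": kind, "value": value})


def free_disk_bytes() -> int:
    """Example agent: free space on the disk holding the working directory."""
    return shutil.disk_usage(".").free
```

A viewer would subscribe a callback to the channel and render whatever readings
arrive. In a real deployment the registry could load agent code from the shared
repository, which is what would allow new sensors to be added to a running
system.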
- A distributed multiplayer role-playing game.
This version is similar to last year's assignment, but since you will have
more time, better tools, and experience using multicast, we expect you to
implement something more
sophisticated. The system should maintain multiple rooms, multiple avatars,
and multiple objects, and it should allow for new rooms, users, and objects
to be created. Also, it should not be limited to just a few predefined types
of objects, but allow new types of objects to be created that can be used in
a running game without the need to recompile it. For example, if a panel
displaying a webcam video is not supported, one should be able to implement
it, publish it somehow, and let users place it in one of the rooms.
Your application might internally involve the following types of
objects:
- Avatars that represent individual users. The state of the avatar
might include its current position, appearance, the objects it is
carrying, the direction it is walking in, etc., and should be stored in
a multicast channel, accessed by all users watching the avatar. A user
controlling the avatar should tunnel all actions taken by the avatar
through the channel, much in the way edits to a shared document were
tunneled through it.
- Objects. Some of these might be static, but some would ideally have
state, such as color, text displayed, or perhaps even be connected to a
video stream.
- Rooms. These should be like mash-ups, in that they would contain
links to avatars and objects that are in them, as well as links to other
rooms. When in a room, the user's avatar would thus connect to channels
corresponding to other avatars and objects that reside in the room in
order to display them. You might also want to publish on the room
channel what the users have just said. If you want to be fancy, you
might even publish and play back audio from the users' microphones,
although we don't expect you to work this hard.
- Means by which information about rooms, users, and objects is
stored. Same guidelines apply as in the previous example.
- Means by which one can discover, access, modify, and publish
information related to the rooms, users, and objects in the game.
As you can see, as far as the internal architecture is concerned, this
project is actually not very different from the previous one, and the same
design guidelines apply. The main difference here is that you would spend
more time dealing with graphics, and less with APIs for accessing databases
or system resources. Still, we expect that your application provides a
degree of customizability and allows users to modify the virtual world
they're in.
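The idea of tunneling every avatar action through a channel, so that all
watchers end up with the same avatar state, can be sketched as follows. The
sketch assumes a channel that delivers every action to every replica reliably
and in the same order; AvatarChannel is an in-process stand-in for such a
channel, and the action format and all names are hypothetical. It does not
handle viewers that join late and need the current state transferred to them.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

Action = Dict[str, Any]


@dataclass
class AvatarState:
    x: int = 0
    y: int = 0
    carrying: List[str] = field(default_factory=list)


def apply_action(state: AvatarState, action: Action) -> None:
    """Deterministically update the state; every replica runs this code."""
    kind = action["type"]
    if kind == "move":
        state.x += action["dx"]
        state.y += action["dy"]
    elif kind == "pick_up":
        state.carrying.append(action["item"])
    elif kind == "drop":
        if action["item"] in state.carrying:
            state.carrying.remove(action["item"])
    else:
        raise ValueError(f"unknown action type: {kind!r}")


class AvatarChannel:
    """Stand-in for an ordered, reliable multicast channel for one avatar."""

    def __init__(self) -> None:
        self._replicas: List[AvatarState] = []

    def join(self) -> AvatarState:
        """A viewer joins and gets its own local copy of the avatar state."""
        replica = AvatarState()
        self._replicas.append(replica)
        return replica

    def send(self, action: Action) -> None:
        """Deliver the action to every replica in the same order."""
        for replica in self._replicas:
            apply_action(replica, action)
```

Because the controlling user never modifies a local copy directly and only
sends actions, all replicas that joined before the first action stay identical.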
- A collaboratively-administered news portal.
This is basically a twist on the first project. Instead of sensors that read
data from the system or databases, you would have "sensors" that might pull data
from RSS feeds or other Internet sources, publish video frames from webcams,
media files, or video streams obtained from elsewhere, and pump these into
multicast channels. Everything else would be the same: users could create
mash-ups that could resemble articles, or just collections of annotated
content collected in one place, and publish those for others to edit. We
would want a way to create new types of "sensors" that suck information from
the Internet and publish it on the channels, and a way for users to find
channels with new content and add them to their mash-ups, without
recompiling the entire application.
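As a rough illustration of an RSS "sensor", the Python sketch below parses the
text of an RSS 2.0 feed and hands each item it has not seen before to a publish
callback, which in a real system would write to a multicast channel. The
function names and the message format are hypothetical, and downloading the
feed is deliberately left out.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List, Set

Item = Dict[str, str]


def parse_rss_items(xml_text: str) -> List[Item]:
    """Extract the title and link of every <item> in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    items: List[Item] = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default="").strip(),
        })
    return items


def publish_new_items(xml_text: str,
                      seen_links: Set[str],
                      publish: Callable[[Item], None]) -> int:
    """Publish items whose link is not in seen_links; return how many."""
    count = 0
    for item in parse_rss_items(xml_text):
        if item["link"] in seen_links:
            continue  # already published in an earlier poll
        seen_links.add(item["link"])
        publish(item)
        count += 1
    return count
```

Calling publish_new_items repeatedly with the same seen_links set means that a
feed polled many times publishes each article only once.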
What to hand in
As in the previous assignment.
How to turn in your assignment
As in the previous assignment.
How we'll grade the assignment
General guidelines are as in the previous assignment, but this assignment
counts for more, because the
project is more ambitious: 50% of the homework grade points will be based on
assignment 3, with 25% each from assignments 1 and 2. Also, unlike in the
previous assignments, every group will need to run a demo of its system in the
CSUG. Professor Birman and the TAs will show up and will want to see your stuff
in action. Be prepared to blow us away (good practice for impressing venture
capital investors in June, once you’ve graduated from Cornell and are ready to
start your company).
Hints
Nothing for now; check again later.