Experiences with the Amoeba Distributed Operating System
Notes by Kevin Walsh, April 1998
Overview
Most traditional operating system services are distributed. Local
operations are limited to RPC primitives and a few others (threading, hardware interface,
maybe?). The distribution is meant to be user and process transparent, which is not
so tricky since nearly everything is an RPC call. Most useful services are
implemented as user-space servers running in the distributed pool of processors.
This includes the file system, process management/creation, wide area networking,
etc. The entire pool of processors is meant to look like one unified system, and
this goal was met quite successfully.
Architecture
Some processors are dedicated servers (provide services/gateways), some are in a
pool of workers available to clients, and some are terminals.
Tasks are run in the processor pool, which allocates entire processors to each job. Binary
program files specify processor types on which they can execute. Multiple languages
are supported. Tasks are multi-threaded (cooperatively, for now), at user-level.
There is no virtual memory.
Objects and Capabilities
Capabilities have server id, object id, rights bitmap, and an encrypted checksum.
Rights bits may be masked off at user level (only by owner of object?). This is
very nice.
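To make the user-level masking concrete, here is a toy sketch. It assumes the check field is built from a per-object secret and a family of commutative one-way functions, one per rights bit. Modular exponentiation stands in for those functions; the modulus, exponents, and all names (Capability, restrict, Server, allows) are made up for illustration and are neither secure nor taken from Amoeba's actual implementation.

```python
import secrets
from dataclasses import dataclass

# Toy parameters, assumed for illustration only (not secure, not Amoeba's).
MODULUS = 2**61 - 1                          # a Mersenne prime
EXPONENTS = (3, 5, 7, 11, 13, 17, 19, 23)    # one exponent per rights bit
READ, WRITE, DELETE = 1 << 0, 1 << 1, 1 << 2
ALL_RIGHTS = 0xFF


@dataclass(frozen=True)
class Capability:
    port: int     # identifies the server
    obj: int      # object number within that server
    rights: int   # 8-bit rights bitmap
    check: int    # check field derived from the server's per-object secret


def _strip(check, removed_bits):
    """Apply the one-way function for every rights bit in `removed_bits`."""
    for bit in range(8):
        if removed_bits & (1 << bit):
            check = pow(check, EXPONENTS[bit], MODULUS)
    return check


def restrict(cap, keep):
    """Client side: drop every right not in `keep`, without asking the server."""
    removed = cap.rights & ~keep
    return Capability(cap.port, cap.obj, cap.rights & keep,
                      _strip(cap.check, removed))


class Server:
    def __init__(self, port):
        self.port = port
        self._secrets = {}   # object number -> random per-object secret

    def create(self):
        """Make a new object and return its all-rights (owner) capability."""
        obj = len(self._secrets)
        self._secrets[obj] = secrets.randbelow(MODULUS - 3) + 2  # in [2, MODULUS - 2]
        return Capability(self.port, obj, ALL_RIGHTS, self._secrets[obj])

    def allows(self, cap, needed):
        """Recompute the check from the secret and compare before granting."""
        secret = self._secrets.get(cap.obj)
        if cap.port != self.port or secret is None:
            return False
        expected = _strip(secret, ALL_RIGHTS & ~cap.rights)
        return cap.check == expected and (cap.rights & needed) == needed
```

Because the functions commute, a holder can drop rights in any order and the server still recomputes the same check. Turning a dropped bit back on would require inverting the function, which the real scheme is meant to make infeasible.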
RPC System
Primitives are blocking get_request, put_reply (server), and do_operation
(client). Stubs are generated by hand, or using a specialized interface language
(AIL).
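A rough sketch of how the three blocking primitives fit together, using an in-process queue in place of the network. The Port class, the reply_box argument, and the serve_forever/start_server helpers are assumptions for illustration; real Amoeba stubs would marshal parameters into messages and locate the server by its port.

```python
import queue
import threading


class Port:
    """In-process stand-in for a service port (hypothetical helper)."""

    def __init__(self):
        self._requests = queue.Queue()

    # Server side
    def get_request(self):
        """Block until a client request arrives; returns (request, reply_box)."""
        return self._requests.get()

    def put_reply(self, reply_box, reply):
        """Send the reply for the request that arrived with `reply_box`."""
        reply_box.put(reply)

    # Client side
    def do_operation(self, request):
        """Send a request and block until the matching reply comes back."""
        reply_box = queue.Queue(maxsize=1)
        self._requests.put((request, reply_box))
        return reply_box.get()


def serve_forever(port, handler):
    """Typical server loop: one request at a time, each followed by its reply."""
    while True:
        request, reply_box = port.get_request()
        port.put_reply(reply_box, handler(request))


def start_server(port, handler):
    """Run the server loop in a background thread so a client can call it."""
    thread = threading.Thread(target=serve_forever, args=(port, handler),
                              daemon=True)
    thread.start()
    return thread
```

A client-side stub is then just a function that packs its arguments, calls do_operation, and unpacks the reply. The sketch does no error handling: a handler that raises would leave its client blocked.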
Part of the justification for a transparent distributed file system is that
traditional network file systems are too cumbersome for the user, who must explicitly mount
and manage the connections. This is not so: think about the department Unix
file system -- cumbersome for the administrators, but not for users.
Servers
Services advertise a port on which they listen for connections (using get_request).
All service communication is through RPC.
Memory and Process Server provides process creation, making distribution of tasks
transparent to processes. Fork/exec are emulated, almost.
Bullet Server is the file server. Files are stored contiguously, in one
piece, both in memory and on disk. This might still work on disk, but in memory?
Caching is at the file level. All files are immutable and are accessed
through capabilities (no naming system).
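A minimal sketch of what an immutable, whole-file interface implies for clients. The BulletStore class, its method names, and the random-token capabilities are hypothetical stand-ins, not the Bullet Server's real interface.

```python
import secrets


class BulletStore:
    """Toy whole-file store: files are created once and never modified."""

    def __init__(self):
        self._files = {}   # capability token -> file contents (one bytes object)

    def create(self, data):
        """Store a complete file and return a capability for it."""
        cap = secrets.token_hex(8)       # stand-in for a real capability
        self._files[cap] = bytes(data)   # held as one contiguous, immutable value
        return cap

    def read(self, cap):
        """Return the whole file; there is no partial update operation."""
        return self._files[cap]

    def delete(self, cap):
        del self._files[cap]


def append_line(store, cap, line):
    """'Editing' means: read the old file, build new contents, create a new file."""
    return store.create(store.read(cap) + line)
```

Because a capability always refers to the same bytes, whole-file caching needs no invalidation. The cost is that every change produces a new file, and some naming service must later be pointed at the new capability.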
Directory Server provides a repository for mapping names to capabilities, mostly
for the file system, but possibly used for any capabilities (process handles, etc).
Provides a limited mechanism for access-control lists (different capabilities for same
name, depending on the client's capabilities).
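One way to picture the name-to-capability mapping and its limited access control is a table with one row per name and one column per class of client. The Directory class, the column names, and the string capabilities below are illustrative assumptions. In a real system the column would presumably be selected by the directory capability the client presents; here it is passed explicitly to keep the sketch short.

```python
class Directory:
    """Toy directory: each name maps to one capability per column."""

    def __init__(self, columns):
        self.columns = tuple(columns)   # e.g. ("owner", "other")
        self._rows = {}                 # name -> {column: capability or None}

    def bind(self, name, *caps):
        """Enter (or replace) a name, giving one capability per column."""
        if len(caps) != len(self.columns):
            raise ValueError("need one capability (or None) per column")
        self._rows[name] = dict(zip(self.columns, caps))

    def lookup(self, name, column):
        """Return the capability this class of client is allowed to see."""
        cap = self._rows[name][column]
        if cap is None:
            raise PermissionError(f"{name!r} is not visible in column {column!r}")
        return cap
```

Used like this, the owner column can hold a read-write capability while the other column holds a read-only one for the same object, so access control comes from which capability is handed out rather than from checks in the file server.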
Wide Area Communication
Done through a gateway server (via RPC on the Amoeba side, other protocols on the
outside). Remote services are implemented by having a service stub (agent) on the
client-side gateway and a client stub (agent) on the service-side gateway. This makes
remote operations transparent to everything except the gateways.
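The agent arrangement can be sketched with plain function calls standing in for both local RPC and the wide-area protocol. The Site class, the link function, and the wan_log list are hypothetical names used only for this illustration.

```python
class Site:
    """Toy Amoeba site: a table of locally reachable services."""

    def __init__(self, name):
        self.name = name
        self._services = {}   # port name -> handler

    def register(self, port, handler):
        self._services[port] = handler

    def do_operation(self, port, request):
        """Local RPC: the caller cannot tell a real server from an agent."""
        return self._services[port](request)


def link(local, remote, port, wan_log):
    """Make the remote site's `port` service reachable from the local site."""

    def client_agent(message):
        # Runs on the remote (service-side) gateway: acts as an ordinary client.
        return remote.do_operation(port, message)

    def server_agent(request):
        # Runs on the local (client-side) gateway: looks like the real server.
        wan_log.append((local.name, remote.name, port))  # stands in for the WAN hop
        return client_agent(request)

    local.register(port, server_agent)
```

Clients on the local site call do_operation exactly as they would for a local service. Only the two agents know that a wide-area hop is involved.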
Wish-list and Bad Decisions
Object/Capability model worked well. How does it work on a large system?
What about containment issues?
RPC is OK. What about other communication models (e.g., streams, pipes, etc)?
Multi-cast and a better network protocol are needed.
Threading is really bad. No virtual memory and no migration are still open questions.
File System is wonderful. Separating naming from storage is very good. Immutability is good.
Transparent distribution is good. Scaling a problem? Garbage collection
a problem?
Internetworking is tricky, but good in theory.
Unix emulation is OK but problematic (not perfect emulation).
Performance is great.
Security needs work.
Questions
What did everyone think of the performance?
Is their distributed file system too distributed? Is there a way to make
files stay close to the terminal that uses them?
Is there any method of interacting with non-Amoeba servers (using streams/connections
instead of RPC)?
Is there a way to specify where a process should execute?
Servers seem to be single-tasking, why?
How to return large data from RPC calls?
Is the decision to use no virtual memory still OK?
So, why are we not all using Amoeba?