
Proposal

 

Security and Fault-tolerance of Mobile Code

Fred B. Schneider, Professor; Robbert van Renesse, Senior Research Associate; Greg Morrisett, Assistant Professor, Computer Science Department

Today, a single model is dominant in systems research: client and server processes that communicate using shared variables and/or message passing. We are investigating an alternative that is based on mobile processes or agents that migrate through a network. By structuring a system in terms of agents, applications can be constructed in which scarce resources (e.g., network bandwidth or local disk space) are conserved. This is because an agent can filter or otherwise reduce the data it reads, carrying only relevant information with it as it roams the network; there is rarely a need to transmit large amounts of raw data from one site to another. In contrast, when an application is built using only clients and servers, large quantities of raw data may have to be sent from one site to another if, for example, the client obtains its computing cycles from a different site than it obtains its data. In addition, the agent paradigm seems ideal for environments in which connectivity between sites is intermittent and network partitions are frequent (e.g., mobile clients).

The TACOMA system provides operating-system support for agents written in a variety of languages. To date, a series of prototype TACOMA agent-support environments has been built and used for problems such as gathering and visualizing Arctic weather data, providing matching between service providers and potential clients, interacting with users (e.g., active documents), and managing software installation in a network. Unlike agents written in Java, TACOMA agents can leave data behind at the sites they visit, as well as transport data from site to site.
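The bandwidth argument above can be made concrete with a small sketch. The code below is illustrative only, not the TACOMA API: a hypothetical weather-gathering agent "visits" each site in its itinerary, filters the raw readings locally, and carries onward only the readings that matter, so no site ever ships its full data set across the network.

```python
# Illustrative sketch of the agent paradigm (not TACOMA code):
# filtering happens at each site, and only the reduced result
# travels with the agent.

RAW_DATA = {  # hypothetical per-site weather readings
    "tromso":   [("1997-01-01", -12.5), ("1997-01-02", -3.1)],
    "svalbard": [("1997-01-01", -21.0), ("1997-01-02", -19.4)],
}

class WeatherAgent:
    """Carries only readings below a threshold from site to site."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.briefcase = []   # the only data the agent transports

    def visit(self, site):
        # Reduction happens *at* the site; raw data stays behind.
        for date, temp in RAW_DATA[site]:
            if temp < self.threshold:
                self.briefcase.append((site, date, temp))

agent = WeatherAgent(threshold=-10.0)
for site in ["tromso", "svalbard"]:   # the agent's itinerary
    agent.visit(site)

print(agent.briefcase)
```

A client/server version of the same task would instead pull both sites' complete reading lists back to the client before filtering.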

The benefits of easily implemented computations that span multiple hosts must be tempered by the realization that (i) computations must be protected from faulty or malicious hosts (the agent integrity problem), and (ii) hosts must be protected from faulty or malicious agents (the host integrity problem). A form of the agent integrity problem is present in today's distributed systems, since they too must tolerate faulty hosts and links. However, the number of hosts involved in a typical distributed computation today is modest compared to what we envisage when widespread support for agents becomes available. Fault-tolerance, on a much wider scale than we can now provide, is critical to the success of this new paradigm, so this is one of the first problems we are attacking.

We are currently investigating a new approach to the host integrity problem, one that combines the best aspects of previous approaches and, therefore, supports both integrity and high performance. The fundamental idea is to break compilation into a sequence of stages that operate on both code and proofs of type-correctness. Each compilation stage takes as input a program in one language and produces an equivalent program in a (possibly) lower-level language. Furthermore, each stage produces a proof in some logic that the stage's output is type-correct. In practice, this proof will appear as a set of annotations on the output code. Subsequent stages can then verify and use the proofs from their predecessors in producing their output code and proofs. Only when a stage is unable to prove that its input is type-correct will it insert the kinds of dynamic type checks that are currently used to ensure integrity.
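The staged scheme can be sketched in miniature. This is a toy model with invented names, not the system described above: a front end emits a simple annotation standing in for the type-correctness proof, and a back end verifies that annotation, emitting check-free code when it holds and falling back to dynamic type checks when it does not.

```python
# Toy model of proof-carrying staged compilation (names illustrative).

def front_end(values):
    """Stage 1: type-check the input; the 'proof' is an annotation."""
    ok = all(isinstance(v, int) for v in values)
    return {"values": values, "proof": "all-int" if ok else None}

def back_end(unit):
    """Stage 2: verify the predecessor's annotation before relying on it.
    Only when verification fails are dynamic checks inserted."""
    if unit["proof"] == "all-int":
        return lambda: sum(unit["values"])        # check-free code path
    def checked():
        # No usable proof: fall back to runtime type checks.
        assert all(isinstance(v, int) for v in unit["values"]), "type error"
        return sum(unit["values"])
    return checked

fast = back_end(front_end([1, 2, 3]))   # proof verified, no runtime checks
print(fast())
```

A real pipeline would, of course, carry proofs in a formal logic through several lowering stages rather than a single string tag.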

Ensemble Groupware System

Ken Birman, Professor; Robbert van Renesse, Senior Research Associate, Computer Science Department

Cornell's Horus research project is developing a new generation of groupware communication tools. This effort seeks to introduce guarantees such as reliability, high availability, fault-tolerance, consistency, security and real-time responsiveness into applications that run on modern networks. Our approach is to focus on what we call process group communication structures. These arise in many settings involving cluster-style computing, scalable servers, groupware and conferencing, distributed systems management, fault-tolerance and other execution guarantees, replicated data or computing, distributed coordination or control, and so forth. Group computing has been employed successfully in a great variety of real-world settings. Our emphasis is on the underlying communication systems support for this model, on simplifying and standardizing the interfaces within our support environment, and on making the model as transparent (and hence as easy to use) as possible.

Horus/C, Electra and the HOT object-oriented tools

This project has developed two related software systems, both solving the group communication problem but differing in focus. The first, called Horus/C, was developed over a five-year period that started in 1991. In building this system, our intention was to reimplement the mechanisms first used in the much earlier Isis Toolkit, but in a manner that would demonstrate greater flexibility and performance than was possible with Isis. Horus/C achieved these goals through an architecture that supports group communication and the virtual synchrony runtime model, but without imposing the model on applications that need something weaker. The architecture is layered, and the layers supporting a particular group communication application are selected at runtime. By hiding the system behind standard interfaces, we have demonstrated a good degree of transparency: this tactic allows us to slip groupware mechanisms into applications that were not designed with group communication as an explicit goal. For example, Horus supports an interface to the CORBA architecture called Electra, which makes Horus a highly modular extension to a widely accepted industry standard. Electra in turn is implemented over another object-oriented interface to Horus called HOT.
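The runtime layering idea can be illustrated with a toy sketch. This is not the Horus API; layer names and behaviors below are invented stand-ins. Each microprotocol layer wraps the one beneath it, and the stack serving a group is assembled at runtime from a list of requested properties.

```python
# Toy sketch of Horus-style microprotocol layering (illustrative only).

class Layer:
    """Base of the stack: passes messages through unchanged."""
    def __init__(self, below=None):
        self.below = below
    def send(self, msg):
        return self.below.send(msg) if self.below else msg

class Fifo(Layer):
    seq = 0
    def send(self, msg):
        # Tag messages with a sequence number (stand-in for FIFO ordering).
        Fifo.seq += 1
        return super().send(f"fifo[{Fifo.seq}]:{msg}")

class Crypto(Layer):
    def send(self, msg):
        # Reverse the payload (stand-in for real encryption).
        return super().send(msg[::-1])

LAYERS = {"fifo": Fifo, "crypto": Crypto}

def build_stack(names):
    """Compose the requested layers, top-down, at runtime."""
    stack = Layer()
    for name in reversed(names):
        stack = LAYERS[name](below=stack)
    return stack

stack = build_stack(["fifo", "crypto"])   # chosen per application
result = stack.send("hello")
print(result)
```

Because composition happens at runtime, an application needing only weak guarantees can omit the layers (and their cost) that enforce stronger ones.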

One expected application of Horus/C is a cluster-style server, possibly with real-time and other demanding performance requirements. For example, recent work has demonstrated how to build a telephone switch coprocessor using Horus and an SP2 or a similar cluster multicomputer. Horus orchestrates the management issues that arise as nodes are swapped on and off line, and the overall coprocessor is able to sustain 22,000 SS7 telephone-call routing requests per second, without disruption even when failures or recoveries occur. Such performance is good enough to impress the telecommunications community. Further work on this problem requires demonstrating scalability of coprocessor memory resources and computing performance on a large number of nodes, and in other server settings, such as Web and file-system servers.

Ensemble System

As Horus/C matured, we encountered issues that recently led to a complete reimplementation of the system using a subset of the ML programming language. To avoid confusion, we have begun to call this version of our system Ensemble. Although ML normally executes more slowly than C, the language brings powerful tools for formal verification, which have assisted us in ensuring that our protocols are correct. To minimize the impact of ML on performance, the critical path of Ensemble avoids features of the language that are difficult to compile to efficient code, such as exceptions and garbage collection. Moreover, the use of ML has facilitated semi-automatic optimization of our protocols.

Developing Safe Systems for Network Appliances

Thorsten von Eicken, Assistant Professor, Computer Science Department

The impact of computer science on society has broadened dramatically in the past few years with the growing penetration of information technology into corporations of all sizes, as well as into the home. At the computer systems level, this "information revolution" is stimulating the emergence of new computing platforms: in the not-too-distant future we will begin to see network appliances that connect to the information infrastructure and find use in homes, offices, and information booths. The term network appliance is used here to describe a large class of devices that interact with the real world, that can communicate over a digital network, and that can be programmed to perform custom functions.

A primary focus of our research is the infrastructure required for large-scale collaborative information systems in which users connect to the network with different appliances and in which devices from telephones to cameras in meeting rooms are integrated into the system. Our research in this area centers on the Safe Language Kernel (SLK) operating system, which we are currently developing. SLK relies on the properties of type-safe languages to enforce protection boundaries between applications and the OS itself, which means that all code can run in a single address space and at a single hardware privilege level. The first version of SLK is heavily Java-based, but a significant part of our research effort lies in understanding how to host multiple languages. For example, we plan to integrate ML into the family of languages supported by SLK, thereby leveraging work in both the Ensemble and TACOMA projects.
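The protection idea can be illustrated with a hedged sketch (Python here stands in for a type-safe language; SLK itself obtains these guarantees from Java and ML type safety, not from code like this). In a single address space, an application's reachable state is bounded by the object references it has been handed, because a safe language provides no way to forge a pointer to anything else.

```python
# Illustration of language-based protection via references-as-capabilities
# (names and structure invented for this sketch).

class FileHandle:
    """A capability granting read access to one named resource."""
    def __init__(self, name, contents):
        self._name = name
        self._contents = contents
    def read(self):
        return self._contents

def untrusted_app(handle):
    # The app shares the kernel's address space, yet it can only
    # reach the objects it was explicitly given.
    return handle.read()

kernel_files = {"log": FileHandle("log", "boot ok")}  # kernel-private table
result = untrusted_app(kernel_files["log"])           # one capability granted
print(result)
```

Hardware page tables enforce isolation by address; type safety enforces it by reachability, which is what allows SLK to drop the hardware boundary.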

The expected benefits of our approach are higher resource efficiency, seamless system extensibility, and flexibility in the form of fine-grained sharing across protection domains. We are focusing on a small number of complex issues, in particular: the security issues involved in executing foreign code, the operating system structures needed to support the secure execution of type-safe languages without hardware protection facilities, and the efficient integration of communication and computation. We are developing stand-alone systems that interface cameras and microphones to the network for use in classrooms and conference rooms to record events automatically. We are also developing a version of SLK that will run as a real-time thread under Windows NT, so that we can develop application-specific gateways, which will be used to process data coming from the cameras.

Tools for Creating Quality Multimedia Content

Brian Smith, Assistant Professor, Computer Science Department

Despite enormous commercial pressure, relatively few Web sites have incorporated audio and video data. Widespread use of this technology has been hampered by: 1) inadequate infrastructure and 2) the labor needed to create quality audio/video content. For example, consider the problem of putting video materials, such as class lectures, up on the Web. Although occasional browsing of videos is possible using the current networking infrastructure, the same infrastructure would buckle under the load generated by a class full of students accessing the video in the four hours before an assignment is due. One could use techniques such as joint source channel coding or video transcoding to better match the video bandwidth to the available network bandwidth, but then the bottleneck becomes the computing cycles needed for the associated processing. We are considering distributed solutions that place portions of the video around the network, where they are most needed.

Even with an improved infrastructure, the problem of creating quality video content remains. What non-expert programmers need is a high-level language and optimizing compiler that allows them to write programs that efficiently manipulate multimedia data. The primitives in the language should include algorithms from computer vision, graphics, multimedia, signal processing, and compression. The compiler should apply the tricks of the expert programmer. We have developed such a language and compiler, called Rivl. Rivl differs from previous approaches, like Khoros and NetPbm, because programs written in Rivl specify processing, not implementation. Rivl can be likened to a database management system's (DBMS) query language and optimizer. In a DBMS, the query language specifies the data to retrieve, and the optimizer generates the plan that executes the query efficiently. Rivl programs are also resolution independent. Resolution independence allows a program to be developed and debugged on low-resolution data, and later executed off-line on high-resolution data. It also allows Rivl to efficiently exploit multi-resolution image formats, such as Photo-CD and FlashPix.
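The "specify processing, not implementation" distinction can be sketched in a few lines. This toy is not Rivl syntax: operations merely record an expression plan, and a stand-in optimizer reorders a crop ahead of a scale, much as a DBMS pushes selections down, so that less data flows through the expensive operation (a real optimizer would also adjust the crop coordinates, which this sketch ignores).

```python
# Toy sketch of a declarative media pipeline with a plan optimizer
# (illustrative only; not Rivl).

class Video:
    """Operations build a plan; nothing executes until the plan runs."""
    def __init__(self, plan=None):
        self.plan = plan or []
    def scale(self, factor):
        return Video(self.plan + [("scale", factor)])
    def crop(self, region):
        return Video(self.plan + [("crop", region)])

def optimize(plan):
    """Push crops ahead of scales, as a DBMS pushes selections down."""
    crops = [op for op in plan if op[0] == "crop"]
    rest = [op for op in plan if op[0] != "crop"]
    return crops + rest

clip = Video().scale(0.5).crop("top-left")
print(optimize(clip.plan))   # crop now runs before scale
```

Because the program only states *what* to compute, the same plan can also be bound to low-resolution data during debugging and to full-resolution data off-line, which is the resolution independence described above.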

Rivl programs are agents that can be migrated; that is, Rivl programs can be moved and executed close to the data. We are using this feature to develop a Web-based video editor. A standard Web client provides a user interface and creates Rivl scripts that are then executed remotely on a server. The resulting video and images are then sent back to the client, where the user sees the results of the edits. With this technology, PCs connected over modems can edit hour-long movies.

 


Last modified on: 07/30/99