CS262B Proposal
Team Eta
Hakim Weatherspoon
Dennis Geels
Yan Chen
Replication has been widely used and studied on wide-area data storage, distributed database management system and Internet web caching. There have been numerous work on the tradeoff between consistency and availability (ACID v.s. BASE semantics), some recent work on dynamic optimization of number and location of replicas and some recent work for untrusted environments. But none of the previous approaches works well for an oceanic data utility such as OceanStore.
OceanStore assumes a distributed, untrusted model in which peers have limited to no trust for each other. OceanStore replicates data objects and distributes these replicas throughout the network to improve performance, availability, and durability. Unfortunately, a higher degree of replication increases the resources consumed by any consistency protocol among the replicas.
We propose an introspective replication algorithm so that both the number and location of the replicas will adapt to usage patterns in order to minimize both the replica-user communication latency and the inter-replica consistency protocol overhead. Our algorithm will make only local decisions and requires minimum trust between servers in the OceanStore, making it scalable and suitable for the untrusted environment.
We will compare our algorithm to a centralized decision process and/or to a static placement method. We will analyze the extent of the "common case", in which our algorithm functions properly without trust among servers. We will examine the overhead of checks in the system to detect and correct the "uncommon cases" in which our system cannot handle malicious or compromised servers.
Thus, we are going to design, implement and test our algorithms for "introspective replication under untrusted environment". Introspection is the intelligent observation and response to changing patterns of usage. Preliminary versions of data collection and analysis for intelligent cache management has been explored in the Rumor project at UCLA [GRR+98]. Our consistecy protocols can range from epidemic two-tier weak-consistency protocol as used in Bayou [TTP+95] to Byzantine three-phase fault-tolerant consistency protocol O(n^2) [CL99]. It is similiar to the study of Adaptive Data Replication (ADR) algorithm, which claimed that their replication can be used for both strong consistency and weak consistency [WJH97]. We will basically focus on the wide-area introspective computing and how to handle corrupted server for uncommon case.
To make intelligent decision on the actions for replication center, we need the global knowledge of all the replication centers for the requested object. The knowledge includes the number, location and state of the replication centers as well as the network distance. Although the central management is a much simpler and cleaner approach compared to distributed one, the bottleneck of the responsible center makes it unsuitable for global scale data storage. Thus, we are going to use the distributed management scheme which requires every replication center be aware of the rest of peers.
To evaluate our algorithm and implementation we will use trace based simulation. Through oceanstore we have access to both web and NFS traces. To simulate a wide-area network we will use a publicly available network simulator such as ns, jns, or pdns. Possible metrics include total network bandwidth consumed, average request overhead bandwidth, and client-perceived latency: both total request latency and tentative commit latency, if the consistency model differentiates between the two.
Numerous related work has been completed on the tradeoff between consistency and availability (ACID v.s. BASE semantics), dynamic optimization of number and location of replicas and some recent work for untrusted environments.
Much work has involved relaxing the consistency model among replicas in order to increase the availability of the system as a whole [ABKW98, BK97, CKF99, GHOS96, TTP+95]. Bayou [TTP+95] and Deno [CKF99] proposed solutions for mobile and weakly connected environments. They do not consider the dynamic optimization of replicas or any form of corrupted server.
Recently, there are some research on dynamic object replication and migration. Radar project from AT&T proposed a protocol suite that can dynamically decide on the number and location of object replicas and distribute request among currently available replicas evenly [RRRA99]. However, their system is limited on read-only objects that do not change as the result of user access because most of the web access is to static objects. Objects that do change on user accesses are limited to migration only in their protocol and they use primary copy approach for consistency, which has very bad scalability and availability. Similarly, [PRR97] designed a simple randomized algorithm for accessing shared objects that tends to satisfy each access request with a nearby copy. There are quite a few work on dynamic transactional client-server caching[CFLS91, FC94, WJH97], however, most of their work are on ACID consistency and none of them consider untrusted environment.
Almost all previous replication and conflict resolution approaches are based on the assumption of trusted replication centers. If the replication center maliciously denies the services, or pretend it has more/less currencies than it actually holds, they will be in trouble. Castro and Liskov [CL99] proposed a new replication mechanism that is able to tolerate Byzantine (i.e. arbitrary) faults. However, their algorithm has O(n^2) communication overhead for three-phrase commit and does not have the dynamic location and number optimization to reduce the cost.