From eyh5@ee.cornell.edu Wed Dec 5 18:15:57 2001 Received: from memphis.ece.cornell.edu (memphis.ece.cornell.edu [128.84.81.8]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.7) with ESMTP id fB5NFu610009 for ; Wed, 5 Dec 2001 18:15:56 -0500 (EST) Received: from photon.ece.cornell.edu (photon.ece.cornell.edu [128.84.81.138]) by memphis.ece.cornell.edu (8.11.6/8.11.2) with ESMTP id fB5NCRM26864 for ; Wed, 5 Dec 2001 18:12:27 -0500 Date: Wed, 5 Dec 2001 18:12:27 -0500 (EST) From: Edward Hua X-X-Sender: To: Subject: 615 Paper # 70 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII A Scalable Content-Addressable Network Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker This paper proposes an algorithm designed to improve the indexing scheme used to map file names to their locations in a scalable peer-to-peer file distribution system. The Content-Addressable Network, or CAN, acts as a hash table, whose main function is name resolution, in a decentralized fashion. It performs insertion, lookup, and deletion of pairs in the network. CAN does not limit its application to peer-to-peer networks; it may also be implemented in wide-area name resolution services that decouple the naming scheme from the name resolution process. The namespace of CAN is built in a d-dimensional Cartesian coordinate space on a d-torus. The design partitions this space into n zones, corresponding to the n hosts in the network. Each host maintains a neighbor map. The nodes send out periodic update messages to their neighbors reporting the zone coordinates and neighbor lists. These periodic updates manifest their importance when node joins and departures take place in the coordinate space. The joining of a node to the network involves identifying a node currently in the network and acquiring half of the zone it oversees. Likewise, the departure of the node involves giving its zone to one of its neighbors. In either case, the neighbor nodes will learn the change in the topology by the exchange of periodic updates. The researchers of CAN also propose some design improvements on top of the base operations it carries out. The first one is to have multiple dimensions (beyond the conventional 3-dimensionality) in constructing the coordinate space. The second is to have multiple coordinate spaces, or realities so that each node occupies a different zone in these different spaces. Both techniques achieve a smaller path length, although for the same number of neighbors, increasing the dimensions of the space yields short path lengths than increasing the number of realities. However, the trade-off for the advantage of multi-dimensionality over multi-reality is the increase in the average per-node neighbor state. Further design improvements are also proposed to enhance performance of the CAN. First, the routing metric, which was purely based on the distance from a client to a destination, is now weighted with round-trip time (RTT). Also, Instead of having one node occupy a zone in the coordinate space, multiple nodes can occupy the same zone, forming a peer-ship amongst themselves. This zone overloading technique does not increase the amount of neighbor information an individual node must carry, but does require to hold additional states for its own peers. Other techniques, such as using multiple hash functions, topologically-sensitive construction of CAN overlay network, uniform portioning of coordinate zones, and caching and replication for hotspot management are being explored. Some of the prices to pay for enhanced functionality in CAN are listed below. First, there is a constant need for all nodes in the CAN to periodically send update messages to their neighbors, even though if the topology is relatively constant over a long period of time, as is often the case in wired networks. This incurs a substantial consumption of the network bandwidth. Second, the use of multiple realities inevitably demands an increase in the disk space in the nodes. Third, it seems that the scheme is rather computationally intensive, demanding a lot of CPU power on each of the nodes. While experiencing reduction in routing delay with the design enhancements, some of that reduction may be offset by the internal computing power (e.g., the power to send periodic updates with multiple realities). Therefore, more accurate measurements of the amount of delay saved are needed to strike a balance between the overall system complexity and reduced routing delay. From avneesh@csl.cornell.edu Thu Dec 6 13:11:47 2001 Received: from capricorn.ds.csl.cornell.edu (capricorn.csl.cornell.edu [132.236.71.92]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.7) with ESMTP id fB6IBk612442 for ; Thu, 6 Dec 2001 13:11:46 -0500 (EST) content-class: urn:content-classes:message Subject: 615 Paper 70 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Date: Thu, 6 Dec 2001 13:15:11 -0500 X-MimeOLE: Produced By Microsoft Exchange V6.0.4712.0 Message-ID: <97C142C1212ED545B0023A177F5349C40A0A34@capricorn.ds.csl.cornell.edu> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 615 Paper 70 Thread-Index: AcF+gfGy8NK2YS5/Qie3hYaFftZKPw== From: "Avneesh Bhatnagar" To: Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by sundial.cs.cornell.edu id fB6IBk612442 A scalable content addressable network This paper presents the design of a scalable indexing mechanism, which unlike Chord and PAstry, visualizes the routing space as a d dimensional torus. A hashing mechanism is used here to store a key, value pair where the key is deterministically mapped onto a point P using the hashing mechanism. Routing is done by using greedy forwarding; each node has a set of neighbors whose coordinate spans overlap in d-1 dimensions and which abut in one dimension. This gives the average path length as d/4(n^1/d), while the max number of neighbors would be 2d. Each can node keeps a chunk (zone) of the hashtable containing the key value pairs. In order to join the CAN network, a node must query an existing CAN node, which then attempts to find a neighbor node whose hash zone needs to be split. This is chosen randomly in the simple version of the algorithm, but can be chosen as a function of how much load a particular zone has (size). Larger zone would be a good candidate for splitting. The node then joins the routing mechanism, by informing its new neighbors of its existence and updating its neighbor state on the basis of soft state updates. Node departure is followed by a timed TakeOver message, which a neighbor that detects lack of response from a node, sends to the failed node's neighbors informing them of a takeover of the zone. The authors then discuss and improve the basic algorithm, and also evaluate the impaacts of the improvements: a. Multidimensional co-ordinate spaces: Increasing dimensions of CAN coordinate space yields shorter path lengths. b. Realities: Each node can maintain multiple coordinate spaces, with contents of the hash table being replicated at each of these. This 'replication' yields better data availability and lower hop count. c. Better routing metrics such as measuring the latency between nodes, to avoid long hops. d. Having overloaded co-ordinate zones, i.e. instead of one node owning a zone, multiple nodes do so, and this improves fault tolerance along with reduced per hop latency/path length. e. Multiple hash functions: k different hash functions that map a single key to k different points would also increase availability. f. Topologically sensitive routing. This feature is well handled in e.g. Pastry which takes into account leaf sets while routing. Here this would be a problem, which the authors attempt to solve by setting up 'landmarks' upon which geographically close routing can be leveraged. g. Caching and replication of commonly used data can alleviate strain on hot spots. The authors then evaluate CAN with and without these factors and also the effect that each of these has on the overall performance. I think Table 3 on page 169 gives a very good idea of how these changes affect performance. I think that this paper was quite well written and brings some important issues into light. The idea of zones is quite interesting. The only issue that I can see here is again the inherent assumption that all nodes are equal in capabilities. Furthermore, there seems to be quite a lot of state that would have to be transferred periodically if we were to look at the best case performance as shown on page 168 (30 nodes as neighbors). Furthermore, it would have been interesting if the authors had evaluated this with the Pastry and Chord designs to trade off the zone idea with a simple circular ring.