From sc329@cornell.edu Thu Oct 31 03:01:23 2002
Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9V81Nc06609 for ; Thu, 31 Oct 2002 03:01:23 -0500 (EST)
Received: from sangeeth.cornell.edu (syr-24-58-36-135.twcny.rr.com [24.58.36.135]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id DAA18086 for ; Thu, 31 Oct 2002 03:01:20 -0500 (EST)
Message-Id: <5.1.0.14.2.20021031025320.034e8b90@postoffice2.mail.cornell.edu>
X-Sender: sc329@postoffice2.mail.cornell.edu (Unverified)
X-Mailer: QUALCOMM Windows Eudora Version 5.1
Date: Thu, 31 Oct 2002 03:01:22 -0500
To: egs@CS.Cornell.EDU
From: Sangeeth Chandrakumar
Subject: 615 PAPER 61
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Submitted by Sangeeth Chandrakumar

Chord is a distributed lookup protocol for highly dynamic peer-to-peer systems in which nodes continually join and leave. Given a key, the protocol maps it onto a node. It differs from some of the other systems we saw earlier, like Freenet, in that the lookup operation runs in predictable time and always results in success or definitive failure. One of the main features of Chord is that it is simple and handles concurrent node joins and failures well. The cost of a Chord lookup grows as the log of the number of nodes, so scalability is not a problem. Also, there is no restriction on the structure of the keys.

Chord uses consistent hashing to map keys to their responsible nodes. The nodes in the system are arranged in a ring structure, with each node having a successor and a predecessor. Each node maintains a routing table with at most log N entries, called a finger table. Each finger entry points to a successor progressively more distant from the current node. Every node knows more about nodes closely following it on the identifier circle than about nodes farther away. When a node wants to locate a key, it searches its finger table for the most immediate predecessor of that key and asks that node for further directions. This process is repeated until the node responsible for the key is found. Whenever a node joins or leaves the network, some local state must be updated: the table for the new node is initialized, and the fingers and predecessors of existing nodes are modified to reflect the addition of the new node. The responsibility for some of the keys is also transferred to the new node from its successor. Chord has a lookup time of O(log N).

Pastry is a scalable, distributed object location and routing substrate for wide-area peer-to-peer applications. When presented with a key, a Pastry node efficiently routes the message to the node that is numerically closest to the key among all live Pastry nodes. In Pastry, each node is represented by a 128-bit id in a circular nodeId space. Each node maintains a routing table, a neighborhood set and a leaf set. The neighborhood set contains the nodeIds and IP addresses of the nodes closest to the local node according to a proximity metric. The routing table contains log N rows, each row containing nodes which share a certain length of prefix with the current node. The leaf set contains the nodes which are numerically closest to the current node. Given a message, the node first checks to see if the key falls within the range of the leaf set. If so, the message is directly forwarded to the closest leaf node. Otherwise, it forwards the message to a node in the routing table which shares a longer prefix with the message key than the current node does. This simple routing algorithm always converges, because each step takes the message to a node that either 1) shares a longer prefix with the key than the local node, or 2) shares as long a prefix as the local node but is numerically closer to the key than the local node.
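To make this routing step concrete, here is a minimal sketch in Python; the hex-string ids, the dictionary layout of the routing table, and the helper names are illustrative assumptions rather than the paper's actual structures:

    # Schematic Pastry forwarding decision: try the leaf set first, then
    # the routing-table entry that extends the shared prefix by one digit.
    # Ids are fixed-length lowercase hex strings purely for illustration.
    def shared_prefix_len(a, b):
        """Number of leading hex digits two ids have in common."""
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    def pastry_next_hop(node_id, key, leaf_set, routing_table):
        # 1. Key within the leaf-set range: deliver to the numerically
        #    closest node (possibly ourselves).
        candidates = set(leaf_set) | {node_id}
        if min(candidates) <= key <= max(candidates):
            return min(candidates,
                       key=lambda n: abs(int(n, 16) - int(key, 16)))
        # 2. Otherwise use the routing-table entry (row p, digit key[p])
        #    that shares one more digit with the key than this node does.
        p = shared_prefix_len(node_id, key)
        entry = routing_table.get((p, key[p]))
        if entry is not None:
            return entry
        # 3. Rare case: forward to any known node that is numerically
        #    closer to the key than the current node.
        known = candidates | set(routing_table.values())
        return min(known, key=lambda n: abs(int(n, 16) - int(key, 16)))

Step 2 is what lets each hop resolve roughly b more bits of the key, which is where the log_(2^b) N hop bound comes from.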
Whenever a new node arrives, it sends a message to a node it already knows, which then routes a special join message toward the new node's nodeId. Table information is exchanged with the new node, and all the tables eventually stabilize to reflect the new node.

From ashieh@CS.Cornell.EDU Thu Oct 31 04:11:58 2002
Received: from zinger.cs.cornell.edu (zinger.cs.cornell.edu [128.84.96.55]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9V9Bwc20342 for ; Thu, 31 Oct 2002 04:11:58 -0500 (EST)
Received: from localhost (ashieh@localhost) by zinger.cs.cornell.edu (8.11.3/8.11.3/C-3.2) with ESMTP id g9V9BvT08843 for ; Thu, 31 Oct 2002 04:11:57 -0500 (EST)
Date: Thu, 31 Oct 2002 04:11:57 -0500 (EST)
From: Alan Shieh
To:
Subject: 615 PAPER 61
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

** Lookup time

- Pastry
O(log_(2^b)(N)) hops in overlay network. Attempts to reduce hop cost. O(2^b * log_(2^b)(N)) routing table entries, plus auxiliary leaf and proximity tables.

- Chord
O(log_2(N)) hops in overlay network. Does not address hop cost. O(log_2(N)) routing entries per node.

- Tapestry
O(log_(2^b)(N)) hops in overlay network to access an object's root; a similar number of hops to access the actual object. Attempts to reduce hop cost. O(2^b * log_(2^b)(N)) routing table entries; redundant links; auxiliary location information.

- Kademlia
O(log_2(N)) hops in overlay network. Does not attempt to reduce hop cost. O(log_2(N)) k-buckets.

** Failure resilience

- Pastry
Leaf table is a fallback for repair/routing. Many sequential nodes in the key space must fail for disconnection to occur.

- Chord
Requires distributed repair process to regenerate ring, which appears to be a fragile structure. Only a single query path is possible.

- Tapestry
Multiple redundant links per routing entry. Uses soft state as a backup; all operations are checked. However, global invariants (e.g. the hole property) may invalidate over time. Replication of objects and root nodes.

- Kademlia
Probabilistic guarantees on running time and connectivity. Unsure as to relative likelihood of network partition. Simple response to network anomalies and node arrival or departure. Able to augment routing table using any received packet. Multiquery support.

** Simplicity

- Pastry
Data structure difficulty somewhat lower than Kademlia, Tapestry. Node join algorithm much simpler than Tapestry's.

- Chord
Lookup algorithm simple; distributed data structure is also simple. Update, concurrency, repair code is moderately complex and necessary for correctness.

- Tapestry
Complex system. Especially challenging: node arrival, due to the necessity of maintaining global invariants. Neighbor map optimizations somewhat non-trivial as well.

- Kademlia
Simple join, routing functions, and data structures.

** Tamper resistance

- Pastry
Nodes can insert themselves into the network at any point; a federation of nodes may be able to cause partitions of the network. The leaf table provides some degree of redundancy vs the routing table, which may provide some tamper-resistance.
- Chord
Very vulnerable due to its one-dimensional arrangement.

- Tapestry
Nodes can insert themselves into the network at any point; a federation of nodes may be able to cause partitions of the network. The hole invariant is particularly vulnerable to attack. The optimization algorithm may make it more difficult for attackers to subvert the system, as would the redundant links. Multi-plane root nodes also make it slightly more difficult for an attacker to partition the network. If an attacker were capable of detecting holes in the Tapestry network, it would be possible to trigger cascading changes of root nodes, which would drastically impact performance.

- Kademlia
Nodes can insert themselves into the network at any point; a federation of nodes may be able to cause partitions of the network. As long as good nodes stay up, and arrive first, Kademlia will continue routing through them. Kademlia's multi-path queries also provide some degree of redundancy.

**** Pastry

Pastry uses a prefix-based routing algorithm, in which each succeeding node in the routing path shares one more digit in common with the destination node. Each successive row in the routing table stores routing entries for nodes that share successively more prefix with the router. The combined effect of prefix routing is to reduce the expected number of possible destinations by a factor of 2^b at each step. As in Tapestry and Kademlia, a hash function is used to achieve a uniform distribution over the keyspace.

Each node maintains a routing table, leaf table, and neighborhood table. The routing table is used to support prefix routing; each row has an increasing number of shared digits with the address of the node. The leaf table is used to shortcut routing, provide resilience against link failures, and provide lateral routing capabilities.

Pastry has O(log N) routing length in the overlay network. However, the path length may be extraordinarily long in the underlying network, despite attempts at minimizing path length. This results from the reliance on a greedy path selection algorithm at each node; the resulting paths are forced away from the source, but they may spiral outward on their way to the destination.

** Shortcomings

A major shortcoming of Pastry is the amount of network stretch, for which there do not exist very strong bounds.

** Possible extensions

**** Chord

Chord arranges keys in a ring structure, and uses "finger tables", which are forward pointers whose skip distance increases geometrically. There are log(N) such finger table entries, which allows for log(N) lookup. A lookup follows the finger pointers of log(N) separate nodes. Chord requires load balancing and fixing of finger pointers when nodes enter and leave the network.

** Shortcomings

The repair function does not address all possible failure modes. Not resilient against rapid node entry/exit. Also, the topology is static once initialized; there is no balancing for load, or redistribution to optimize hop count in the underlying network.

** Possible extensions

**** Tapestry

Tapestry provides suffix-based routing to hash keys, and a location system for finding objects. The suffix routing scheme works in much the same way as Pastry's, except for additional redundancy (multiple secondary entries) and a more extensive optimization algorithm when nodes enter the network.
Because of this expensive optimization procedure, Tapestry uses a "second-chance" algorithm to prevent short-term fluctuations in network connectivity from triggering a cascade of topology updates in the overlay network.

Tapestry supports both routing and object location algorithms. As a routing algorithm, it allows multiple routes to be selected, and allows replicas to be squashed by API hooks at later nodes. The routing table optimization allows Tapestry to provide better performance than Pastry in its routing layer. Tapestry also separates values from keys, thus allowing for much larger objects (such as disk blocks), and allows keys to refer to multiple objects.

A key feature of Tapestry is its deterministic root election algorithm. This algorithm allows any node in the network to independently select the same root node to track an object. Most of Tapestry's routing and location information is stored in soft state. To improve performance, a notification system is used to explicitly version and delete stale state, such as location information when objects are relocated.

** Shortcomings

Tapestry requires global state to maintain correctness of its distributed root node selection algorithm. It is not possible to change the root node mapping without kicking out the existing root server. Heavy-weight optimizations preclude use in a highly dynamic environment.

** Extensions

- Construct load balancing schemes for routing, not just node access.
- Come up with a way to relax the hole-fill property, or perhaps evaluate whether this extra complexity/fragility is a worthwhile tradeoff.

**** Kademlia

The key ideas behind Kademlia are the use of the XOR metric and an implicit routing table update scheme. Kademlia routing tables, which map exponentially increasing ranges of keys (sort of a combination of Chord and Pastry ideas) to sets of nodes, rely on density properties of the hash function, namely that one would expect to find nodes in most of the ranges. This allows for a logarithmic hop count in the overlay network. Kademlia uses caching and some soft state (check before purging from cache) to construct its routing tables. The use of a highly symmetric routing metric (any RPC may implicitly contribute routing information) allows for simplified network maintenance.

The use of the XOR metric does not suffer from the problem of missing object replicas through prefix routing. If an object hashes to a key whose prefix (or suffix) is close to the min or max value, many of the replicas will be placed in nodes with a different prefix.

** Shortcomings

It's not immediately clear whether not having the replica problem of prefix routing is that much of an issue, especially since systems like Tapestry use a separate mechanism for tracking replicas. An evaluation would have been very useful, especially since I was left with an air of uncertainty about the probabilistic properties.

** Extensions

- Add some deterministic guarantees to Kademlia, i.e. have a backbone of known reliable nodes in addition to the randomly determined ones.
- Combine the XOR metric with a deterministic route construction algorithm.
- Evaluate the performance delta with a prefix-routing algorithm when both take underlying network latencies into account.
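A minimal illustration of the two XOR-metric properties leaned on above, in Python; the 4-bit toy ids are purely for demonstration (the paper uses 160-bit ids):

    # Toy demonstration of symmetry and unidirectionality of the XOR
    # metric. 4-bit ids keep the search space small enough to enumerate.
    def xor_distance(a, b):
        return a ^ b

    a, b = 0b1011, 0b1101

    # Symmetric: d(a, b) == d(b, a), so every incoming RPC carries
    # usable routing information about the sender.
    assert xor_distance(a, b) == xor_distance(b, a)

    # Unidirectional: for a fixed point a and distance d, exactly one id
    # sits at that distance, so lookups for a given key converge along
    # the same path and cached entries along that path are reusable.
    d = 6
    matches = [x for x in range(16) if xor_distance(a, x) == d]
    assert matches == [a ^ d]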
From tmroeder@CS.Cornell.EDU Thu Oct 31 08:44:35 2002
Received: from dhcp98-88.cs.cornell.edu (dhcp98-88.cs.cornell.edu [128.84.98.88]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VDiZc15833 for ; Thu, 31 Oct 2002 08:44:35 -0500 (EST)
Received: (from tmroeder@localhost) by dhcp98-88.cs.cornell.edu (8.11.6/8.11.6) id g9VDgm019400; Thu, 31 Oct 2002 08:42:48 -0500
From: Thomas Roeder
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15809.13143.999357.958676@dhcp98-88.cs.cornell.edu>
Date: Thu, 31 Oct 2002 08:42:47 -0500
To: Emin Gun Sirer
Subject: 615 PAPER #61
X-Mailer: VM 7.07 under Emacs 21.2.1

In these papers, as we did last time, we evaluate performance along the metrics of lookup time, failure resilience, simplicity, and tamper resistance. I will treat each protocol separately below.

Pastry is a routing and object location protocol which has an expected lookup time of O(log n), where n is the number of nodes in the entire network. As in Plaxton's scheme, it routes by prefix matching, making monotonic progress towards the complete match of the key we are seeking. It achieves this progress with routing tables of size O(log n). One of the greatest flaws of Pastry is that it pushes the responsibility for replication and security onto the client application, disavowing any responsibility for it. Thus if a node fails, and the client is not replicating the state, the key/value pairs that were on the node are lost. Its failure resilience amounts to route resilience: being able to choose among multiple paths rather than a single path. Its protocol is certainly simpler than most of the other protocols we have seen in this space, but, as noted above, it wins this simplicity by ignoring serious issues that it could and should handle. Its tamper resistance is low and basically dependent on the application above it.

Chord uses the idea of consistent hashing to store key/value pairs by their key on the successor node of their key hashed into a circular identifier space. Nodes, as they join, are assigned a random value in this space, and add themselves into the ring. The consistent hashing is used to prove that lookups take an expected O(log n) messages to discover a key. Furthermore, a join or a leave is shown to take only O(log^2 n) messages, and the fraction of keys moved is at most O(1/n). Like Pastry, Chord pushes the burden of replication onto the application using it, and only offers multipath routing as a partial solution to failure conditions. It thus has low failure resilience, especially in terms of recovering lost data. It is a relatively simple protocol, but also buys that simplicity at the expense of other important features, as noted above. It implicitly trusts the other Chord nodes to follow the algorithm outlined in the paper, and so is very vulnerable to tampering without assistance from the application.

Tapestry is another of the Plaxton-based protocols which use digit-by-digit routing to place key/value pairs in an overlay network of nodes. Its main contribution is a system which is quite failure resilient, at least in terms of routing. It too relies on the layer above to replicate data. They give the example of Silverback, from the OceanStore project, which places fragments on randomly chosen servers. The protocol is not at all simple, although it has reasonable lookup times. A major problem here, as in the protocols above, is that the system ignores tampering as an issue.
It is, for example, entirely vulnerable to grey hole attacks.

Kademlia proposes a new method of placing objects in an overlay network. The structure of the node space is the same as in Chord, but it uses XOR as the distance metric. Having done so, it can then use queries as routing information, saving some of the administrative overhead that Chord incurs, although adding the cost of 'hello' messages being sent more frequently. The protocol is indeed a simple one, probably the simplest of the four papers. They nonetheless sketch a proof that most operations are O(log n), for n the size of the network. Thus lookup time is acceptable, as it has been with all of the other structured P2P protocols that we have examined. If a node has failed, this is expected to be discovered at query time, and the node replaced immediately. In any case, they keep enough information to generate multiple paths for a given request, in the average case. Tamper resistance is non-existent, since any node can make claims about the data it contains, and there is no mechanism to check its claims against reality.

From shafat@CS.Cornell.EDU Thu Oct 31 08:47:06 2002
Received: from exchange.cs.cornell.edu (exchange.cs.cornell.edu [128.84.97.8]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VDl5c16724 for ; Thu, 31 Oct 2002 08:47:05 -0500 (EST)
content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Subject: 615 PAPER 61
X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0
Date: Thu, 31 Oct 2002 08:47:05 -0500
Message-ID: <47BCBC2A65D1D5478176F5615EA7976D134FC2@opus.cs.cornell.edu>
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: 615 PAPER 61
Thread-Index: AcKA4/+N1+Md1+dFQhK9cVf8I2gsXg==
From: "Syed Shafat Zaman"
To: "Gun Sirer"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by sundial.cs.cornell.edu id g9VDl5c16724

Pastry:
-------
Lookup time in Pastry is first analytically predicted to be upper-bounded at log_(2^b) N, which is later shown experimentally (on simulators) to hold. But this bound is only applicable if the routing tables are up to date and there have been no recent failures. Fault tolerance in Pastry is achieved in a number of ways. Replication: Pastry replicates to the k numerically closest nodeIds in the nodeId space. Another way to guard against arbitrarily failing nodes is to use randomized routing, picking a random nodeId that matches the same length of prefix as the one in the routing table. If there is a network partition, nodes can periodically do an expanding-ring IP search to look for other nodes in the vicinity.

Chord:
------
The number of messages Chord has to exchange for a lookup in the steady state is given as O(log N), where N is the number of nodes in the system. This value remains relatively low even when N is quite large, thus exhibiting a good lookup time. As for its fault tolerance mechanisms, whenever a node decides to leave the network or simply dies, Chord uses a "successor-list" at each node to replace the missing node with the next successor. The finger tables are then eventually fixed when the stabilization routine runs at each node. This is one of the ways Chord tries to keep its node state current and avoid failed lookups. Another method that can be deployed if the necessity arises is replication of keys on the k succeeding nodes. This also provides tolerance of up to k-1 failures.
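To make the finger-table walk behind that O(log N) figure concrete, here is a simplified Python sketch; it follows the paper's find_successor/closest-preceding-finger idea, but the class layout and names are assumptions for illustration:

    # Simplified sketch of Chord lookup on an m-bit identifier ring.
    # Assumes a populated, stabilized ring.
    M = 6                       # identifier bits; positions 0..2**M - 1

    def in_range(x, a, b):
        """True if x lies in the half-open ring interval (a, b]."""
        if a < b:
            return a < x <= b
        return x > a or x <= b  # interval wraps around 0

    class Node:
        def __init__(self, ident):
            self.id = ident
            self.successor = None    # immediate clockwise neighbor
            self.fingers = []        # fingers[i] ~ successor(id + 2**i)

        def closest_preceding_finger(self, key):
            # Farthest finger that still precedes the key; each such hop
            # at least halves the remaining ring distance, which is what
            # yields the O(log N) hop count.
            for f in reversed(self.fingers):
                if in_range(f.id, self.id, key) and f.id != key:
                    return f
            return self.successor

        def find_successor(self, key):
            node = self
            while not in_range(key, node.id, node.successor.id):
                node = node.closest_preceding_finger(key)
            return node.successor   # the node responsible for key

The successor-list repair described above would slot in wherever self.successor is read: if the immediate successor has died, fall back to the next live entry in the list and let stabilization fix the fingers later.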
Tapestry:
---------
Tapestry's location and routing mechanisms are based on the Plaxton scheme, with a number of modifications to fine-tune performance, performance stability and fault handling. The lookup time of Tapestry is simulated in the paper in the context of Silverback, an archival utility. The results reflect the improved distributed algorithms added in Tapestry and the reduced latency they yield. Tapestry, due to the way its neighbor maps are set up, also places an upper bound of log(n) logical hops on reaching any unique node in the system. This has a significant impact on routing performance. One of the other key goals of Tapestry is to make the Plaxton scheme more flexible and fault tolerant. It achieves flexibility by not relying on global knowledge during network construction. Earlier, each object also had only one root node, making it a single point of failure for that object. Tapestry improves upon that by introducing multiple root nodes per object, hashed with "salt". Each Tapestry node makes routing more reliable by periodically sending a beacon to each neighbor and maintaining backup or alternate neighbors in case of failures.

Kademlia:
---------
Kademlia is yet another P2P system that attempts to combine a number of desirable P2P features into one coherent system. A lookup for a node or a value is carried out using parallel, asynchronous RPCs to a constant (alpha) number of nodes in a recursive fashion. Nodes that communicate quickly are explored further by sending requests to the nodes in their k-buckets. Lookup time is further improved by maintaining a least-recently-used list of nodes at each node. This avoids sending a request along a dead-end path and saves time. Kademlia does not explicitly address failure tolerance, although it uses various mechanisms that indirectly provide some guard against failure. As mentioned earlier, the LRU policy of maintaining nodes ensures that no dead node remains under consideration for an extended period of time. An up-to-date list of (key, value) pairs stored at the nodes is also maintained by periodic republishing by the nodes. One side effect of implementing the LRU policy of caching nodes in the k-buckets is that it provides resistance to certain DoS attacks. Nodes in the k-buckets are not automatically replaced unless they have been inactive for a period of time. And, before a node is evicted from the bucket, it is PINGed one last time, ensuring that inauthentic addresses cannot end up occupying its space.

From nbs24@cornell.edu Thu Oct 31 10:54:08 2002
Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VFs8c17385 for ; Thu, 31 Oct 2002 10:54:08 -0500 (EST)
Received: by travelers.mail.cornell.edu (8.9.3/8.9.3) id KAA21550; Thu, 31 Oct 2002 10:54:06 -0500 (EST)
Date: Thu, 31 Oct 2002 10:54:06 -0500 (EST)
From: nbs24@cornell.edu
Message-Id: <200210311554.KAA21550@travelers.mail.cornell.edu>
To: egs@CS.Cornell.EDU
Errors-To: nbs24@cornell.edu
Reply-To: nbs24@cornell.edu
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: IMP/PHP3 Imap webMail Program 2.0.9
Sender: nbs24@cornell.edu
X-Originating-IP: 64.185.145.94
Subject: 615 PAPER 61

Pastry
The paper presents the design and evaluation of Pastry. Pastry performs application-level routing and object location in a potentially large overlay network of nodes connected via the Internet.
Each node in the network has a randomly assigned unique 128-bit id, keeps track of nodes within its immediate neighborhood (space) and notifies applications about topology changes. This is achieved by having each node maintain a routing table, neighborhood set and a leaf set, and having each node route towards nodes that share successively longer address prefixes with the destination. This avoids routing loops. If the key of a message falls within a receiving node's leaf set, the message is sent directly to the destination; otherwise, the routing table (whose entries are chosen to provide good locality properties) is used. The authors show the expected logarithmic lookup cost in their experiments. Pastry is designed to guarantee an answer to a query in a bounded number of network hops, but this can fail if several nodes with adjacent nodeIds fail simultaneously. It is also reasonably simple; it retains the scalability of Freenet and the self-organizing properties of both Freenet and Gnutella. It is not tamper-resistant, as a malicious node can accept messages and not correctly forward them.

Chord
Chord is similar to Pastry but forwards messages based on numerical difference with the destination address. It seeks to address the problems of load balance, decentralization, scalability, availability and flexible naming. However, it makes no explicit effort to achieve network locality. It is designed to efficiently locate the node that stores a particular data item, given a key. It achieves this by using a variant of consistent hashing to assign keys to nodes, with each node maintaining information about O(log n) other nodes in an n-node network. The authors show that its lookup runs in predictable time, O(log n), and always results in success or definitive failure, but provides weaker worst-case guarantees. It is also very simple in its design. However, it is also not tamper-resistant, since a malicious set of Chord nodes could present an incorrect view of the Chord ring by inserting an id immediately following the data's key and having that node return errors when asked for said data.

Tapestry
Like Pastry, Tapestry's routing is also based on address prefixes (matched from the opposite end of the id); it strives to achieve network locality and supports replication. While Tapestry places references to object locations on hops between the server and the root, Pastry assumes that clients use an id to route directly to the vicinity where replicas of the object are located. However, Tapestry appears to be more complex than Pastry. Its goals are adaptivity, self-management and fault resilience, while maintaining the good properties (simplicity, scalability, locality, decent lookup time) of the previously proposed Plaxton scheme. Fault adaptivity is addressed by using soft state to maintain cached content (which is updated periodically) for fault recovery.

Kademlia
Kademlia uses a novel XOR metric for distance between points in a key space. This allows a query to be sent to any node within an interval, allowing routes to be selected based on latency. Kademlia uses only one algorithm to locate nodes near a given id, eliminating the need for secondary routing tables. Each node stores information (IP address, UDP port, node id) about neighbor nodes. This information is updated whenever a message is received from the corresponding node. A least-recently-seen eviction policy is used to maintain these lists, but live nodes are never evicted.
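That eviction policy is easy to state in code. A minimal Python sketch (k = 20 follows the paper; ping() is a stand-in for the real PING RPC, and the deque layout is an assumption):

    # Kademlia k-bucket update on hearing from a node: the most recently
    # seen entries live at the tail, and live nodes are never evicted.
    from collections import deque

    K = 20  # bucket capacity, as in the paper

    def update_bucket(bucket, sender, ping):
        if sender in bucket:
            bucket.remove(sender)
            bucket.append(sender)        # refresh: move to the tail
        elif len(bucket) < K:
            bucket.append(sender)        # room left: just admit
        else:
            oldest = bucket[0]
            if ping(oldest):             # oldest still alive: keep it,
                bucket.remove(oldest)    # refresh it, drop the newcomer
                bucket.append(oldest)
            else:                        # oldest dead: evict and admit
                bucket.popleft()
                bucket.append(sender)

    # Usage sketch: bucket = deque(); update_bucket(bucket, node_id, ping)

Because established live nodes can never be displaced, an attacker flooding the network with fresh ids cannot take over a full bucket, which is the DoS resistance noted above.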
Nana B. Sam

From kwalsh@CS.Cornell.EDU Thu Oct 31 11:07:39 2002
Received: from kwalsh4158.cs.cornell.edu (dhcp98-75.cs.cornell.edu [128.84.98.75]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VG7cc22144 for ; Thu, 31 Oct 2002 11:07:38 -0500 (EST)
Message-Id: <5.1.1.6.0.20021031102743.00a85ea8@popsrv.cs.cornell.edu>
X-Sender: kwalsh@popsrv.cs.cornell.edu
X-Mailer: QUALCOMM Windows Eudora Version 5.1.1
Date: Thu, 31 Oct 2002 11:07:38 -0500
To: egs@CS.Cornell.EDU
From: Kevin Walsh
Subject: 615 PAPER 61
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems.
Chord: A Peer-to-Peer Lookup Service for Internet Applications.
Tapestry: An Infrastructure for Wide-area Fault-tolerant Location and Routing.
Kademlia: A peer-to-peer information system based on the XOR metric.

These papers each present distributed hash table mechanisms, all more or less equivalent to each other (and to CAN). The differences can be found mostly in the various tricks, optimizations, and extra tactics that are layered on top of the same fundamental design. These tricks are meant to improve resiliency against failure, simplify the routing protocol, or reduce node routing state. Only Kademlia addresses resistance of the protocol to tampering.

Chord is perhaps the most similar to CAN (equivalent, in fact, if one considers a CAN with dimensionality d = log2(N) and one also runs Chord in both clockwise and counter-clockwise directions). Chord adds similar optimizations, using multiple realities for optimizing routes, and replication of objects among k successive nodes for failure resistance.

Pastry/Tapestry is based on Plaxton trees, essentially a generalization of the Chord/CAN base-2 routing mechanism to a base-b routing mechanism. Taking b = 2 reduces Plaxton to the Chord/CAN strategy of successively halving the distance to the target. One slight difference is the ability to choose the long-distance links according to arbitrary metrics, and to also maintain several backup links. This is in contrast to the Chord/CAN model, where finger[i] must point to exactly the right node. This comes at the cost of slightly longer expected routes on average. It also introduces slight complexity in determining the precise home for an object. To resolve this problem, nodes shift to an alternate routing strategy as the request approaches the neighborhood of the object's home, essentially doing a linear search for the closest node. Plaxton also uses the multiple-reality trick ("planes") for replication.

Finally, Kademlia presents one interesting trick which has several advantages over the Chord/Pastry/Tapestry one-way binary (b-ary) routing mechanism. Kademlia proposes using the XOR of keys as the distance metric. The benefits of this are not immediately obvious. First, it is one-way (as in Chord's clockwise routing), allowing intermediate nodes along a path to cache objects (since all paths to a destination converge). Second, incoming messages give useful information about other paths, since XOR is symmetric (that is, if A routes a message to B for some key, then B might also want to route messages to A for some other keys). This feature is missing in Chord/Pastry/Tapestry, but is present in CAN, and could easily be added to Chord (by running Chord in both directions). Kademlia also discusses somewhat the issues of tampering and failure-resistance strategies (other than just replication).
By preferring old neighbors in the routing tables over new ones, it gains some resistance to an attacker trying to subvert the network by adding many new nodes. Also, old neighbors are more likely to stay up than new ones, providing some avoidance of failure-prone nodes. One last trick of Kademlia is to do round-trip RPC-style routing, rather than forwarding-style routing.

From bd39@cornell.edu Thu Oct 31 11:23:59 2002
Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VGNxc25544 for ; Thu, 31 Oct 2002 11:23:59 -0500 (EST)
Received: from boweilaptop.cornell.edu (dhcp-128-84-93-87-wifi.ece.cornell.edu [128.84.93.87]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id LAA22467 for ; Thu, 31 Oct 2002 11:23:57 -0500 (EST)
Message-Id: <5.1.0.14.2.20021031112201.00b752a8@postoffice2.mail.cornell.edu>
X-Sender: bd39@postoffice2.mail.cornell.edu (Unverified)
X-Mailer: QUALCOMM Windows Eudora Version 5.1
Date: Thu, 31 Oct 2002 11:22:09 -0500
To: egs@CS.Cornell.EDU
From: Bowei Du
Subject: 615 PAPER 61
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

* CHORD *

Chord is a distributed hashing scheme which serves much the same purpose as CAN did in the earlier reading. Chord's functionality is simply to map a hash value to a single node in the participating network. As before, the goal is to form a scalable, decentralized, load-balanced system that adapts very well to node joins and failures.

Chord maps keys to nodes using a consistent hashing technique. Keys are mapped by their hashes h to the first node whose hash equals or follows h in a ring of numbers modulo 2^m, where m is the bit width of the hash. Theoretical work in consistent hashing has shown that, with high probability, nodes divide the key space evenly, and each node is responsible for at most (1 + O(log N)) * K/N keys, where K is the total number of keys.

Routing in Chord is accomplished by maintaining a finger table, which contains the addresses of nodes that are 2^(i-1) further down the ring of numbers, where i designates the ith finger entry. These fingers enable a Chord node to at least halve the remaining distance in the key space at each hop, which results in an efficient O(log N) routing time. In the worst case, routing could be done by following neighbor links (predecessor and successor links) in the ring of nodes. They do not demonstrate it in this paper, but initialization of a new node can be bounded by O(log N) as well.

In the evaluations, Chord performs very well in the face of node failures and joins. The consistent hashing function also provides load balancing in terms of spreading keys out evenly among the nodes. It would be interesting to see what role different geographic distributions of nodes would have on the performance of the system.

* PASTRY *

Pastry's goals and organization are very similar to Chord's. Pastry distributes its nodeIds in a 2^128 space of numbers. Each node maintains a table of nodes it knows to share prefixes of various lengths with the node's own nodeId. Routing is performed by forwarding to nodes which share as much of a prefix with the key as possible. Eventually, the message will be routed to a node designated in the leaf set of the routing table, and these nodes are the repository for the key/value pairs. Pastry seems to be a generalization of Chord, with minor changes in implementation. Pastry attempts to address locality in routing as well.
Pastry argues that because new nodes initialize their neighborhood tables from nearby nodes on a path towards the node with the closest key, their routing tables will contain entries which have the locality property.

* TAPESTRY *

Tapestry also routes in a prefix-matching manner (matching digits from the end of the key). The digits of the hash of an object are used to determine which node to route to next. Tapestry also combines some features of Freenet: references to objects are placed along the path from the publisher to the node whose ID is closest to the object hash. Neighborhood discovery is done with an expanding ring search.

* KADEMLIA *

Kademlia presents a variant of the prefix-routing metric. Instead of using prefixes (as in the previous schemes), it uses the XOR of the identifiers as the measure of distance between a node and a key. Also, keys are replicated throughout the network by proactively caching them whenever a better candidate node is known. Caches of keys are managed with LRU, which leads to the problem of stale keys expiring. Kademlia solves this by republishing keys every hour, but this seems like a nasty hack which could have scalability issues.

* Question *

How do these distributed hash tables differ? Evaluate along lookup time, failure resilience, simplicity, tamper-resistance.

Because of the prefix routing, Chord, Pastry and Tapestry have an expected O(log N) routing time. The matching of a digit in the hash corresponds to the elimination of a portion of the network, as in a binary search. Kademlia claims to perform better by allowing more potential matches for a particular hash ID - instead of matching only the prefix, any selected set of digits in the ID can be matched.

Chord mitigates failures by repeatedly refreshing routing information at a constant interval. Pastry deals with bad nodes by performing randomized selection of the next-hop forwarding node. Kademlia provides fault tolerance by replication (similar in some ways to Freenet) on the nodes which have similar keys. All schemes keep some secondary table of backup nodes (e.g., in Chord, the successors) to which to send data in case a route cannot be found. This, however, degenerates routing length to O(n) in the network.

In terms of simplicity, Chord seems to be the most elegant (or at least the most elegantly described) scheme.

When hashed IP addresses are used, it is more difficult for malicious users to take over portions of the keyspace. For nodes that simply randomly choose a hash value from the key space, it is possible for nodes to fill their routing tables with false entries that cover the key space and redirect traffic to themselves.

From adam@graphics.cornell.edu Thu Oct 31 11:54:15 2002
Received: from bach.graphics.cornell.edu (bach.graphics.cornell.edu [128.84.247.50]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VGsEc02603 for ; Thu, 31 Oct 2002 11:54:14 -0500 (EST)
Received: from envy.graphics.cornell.edu (envy.graphics.cornell.edu [128.84.247.206]) by bach.graphics.cornell.edu (8.12.1/8.12.1) with ESMTP id g9VGs90k062967 for ; Thu, 31 Oct 2002 11:54:09 -0500 (EST)
Date: Thu, 31 Oct 2002 11:53:45 -0500 (EST)
From: Adam Kravetz
To: egs@CS.Cornell.EDU
Subject: 615 paper 61
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

Chord:

Chord is simply a distributed hash function, spreading keys evenly over nodes; the key hash corresponds to a position on a ring. Find the node which sits closest on this ring and you can find the file.
You can do so by "jumping" across the ring (doing a binary search), having to jump across only O(log n) nodes; this, however, requires information about O(log n) nodes in the network. Chord claims to be scalable, with a log bound on lookup; it is fully distributed, has automatic fault tolerance, and has flexible naming. The use of a "successor-list" to find the next successor in a Chord ring when a node dies is part of the fault tolerance of the system.

Pastry:

Another self-organizing overlay network, with randomly assigned nodeIDs which uniformly distribute themselves over a 128-bit space. Again, given a certain amount of info, like Chord it can find information housed on the numerically closest node in fewer than log_(2^b) N steps. Routing is simple: if the key is close by, pass it down the chain; if not, pass it to a node (across the ring) with a corresponding node prefix; finally, if no match is found, forward to something numerically closer than this node. Joins are simply specialized lookups which involve knowing one node in Pastry and asking it to send a special lookup message (a "join"). All nodes that this touches respond with their table info, helping the newly joined node to establish itself. Fault tolerance is handled by the nodes that had the failed node in their leaf sets (those local to the failed node). A node that notices the failure tries to find a node just outside its own leaf set, on the "far side" of the failed node relative to it. A new node is chosen to insert into L (the leaf set with a failed node) to replace the failed one. Requesting nodes will then lazily find out about the change in the network.

Tapestry:

Tapestry has the same goals as Chord and Pastry, at least roughly. Using a similar routing idea (a ring of nodes distributed over a hash space), it extends it by allowing hashed names to lie close to their desired source. Fault tolerance is achieved (or attempted) by using backup nodes in the tables.

Kademlia:

The first paper that does fault tolerance by replication of data (yeah!), which makes me happy. I think that redundancy is a good guard against the spotty user (one that gets on the network for a short time, gets the files they want, then leaves with some files on their system which people might want... especially if they are distributed across the network, it would make it hard to determine what has been lost and would need to be replicated from the original source). Kademlia's XOR metric is a cool trick which is the main difference in routing over the other three papers. Chord, Pastry and Tapestry are very much alike and have a bounded routing/lookup time of O(log N). Kademlia and its XOR system claim better, based on better matches (aka matching any subset of the key against any nodeID).

From vrg3@cornell.edu Thu Oct 31 11:55:16 2002
Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VGtGc02682 for ; Thu, 31 Oct 2002 11:55:16 -0500 (EST)
Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by travelers.mail.cornell.edu (8.9.3/8.9.3) with SMTP id LAA08558; Thu, 31 Oct 2002 11:55:13 -0500 (EST)
Date: Thu, 31 Oct 2002 11:55:12 -0500 (EST)
From: vrg3@cornell.edu
X-Sender: vrg3@travelers.mail.cornell.edu
To: egs@CS.Cornell.EDU
Subject: 615 PAPER 61
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

Pastry is a distributed system for locating objects. It functions by routing (message, key) pairs to the node whose ID is closest to the key.
Each node maintains a routing table whose size is logarithmic with respect to the size of the network, and can perform a lookup in logarithmic time (actually log_(2^b) N). Pastry achieves its failure resilience by adding some randomization; each routing step can choose randomly between several nodes which would each bring the query closer to the goal. Pastry does not have an approach to dealing with tampering other than this randomized routing, which would make it possible to avoid some nodes. Pastry is very simple because it pushes much of the responsibility for making use of the routing, for providing robustness and redundancy, and for providing security up to the application layer above. This simplicity is somewhat marred by the presence of three parameters, b, |L|, and |M|, which must be chosen.

Chord provides the service of mapping a given key onto a node in the network, with a ring-type topology. As in Pastry, nodes in a Chord network keep routing tables logarithmic in size so that they can perform lookups in logarithmic time by doing a kind of binary search. Pastry's lookups ordinarily have a larger logarithmic base, however. Because of the one-dimensional network topology, any node failure must be met with a response repairing the structure. This means that a node failure results in a logarithmic amount of work for each of the failed node's neighbors. There do not appear to be any security measures against adversarial nodes in Chord. Network maintenance is a cooperative process which involves implicit trust. Like Pastry, Chord pushes the requirements for redundancy and security to the application layer above. Because of this, Chord is fairly simple and elegant.

Tapestry provides lookups by using a technique similar to Pastry's prefix searching, except with suffix searching. This gives it a logarithmic lookup time. Failure resilience was a primary goal of the Tapestry design; it is achieved by including replication in the protocol, unlike Chord and Pastry. Each object is cached at multiple nodes, and multiple nodes behave as roots for each object. Tamper-resistance is not well addressed by Tapestry. Tapestry is quite complex, but handles failure well.

Kademlia introduces a novel routing metric, based on the XOR of keys. Nodes in a Kademlia network keep lists of k nodes for each range of distances (the k-buckets), and cache keys passing through them. Lookup time is logarithmic because the lookup process involves recursive parallel searching down a tree-like structure represented by the nodes' k-bucket lists. Failure resilience is achieved by the LRU nature of the k-bucket lists; this means that only nodes that were recently known to be active are used. Lookups also take multiple paths, which provides some redundancy. There is no direct approach to preventing tampering with data. Kademlia overall is quite simple, although it does have some parameters whose proper selection can affect the efficiency of the network.
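That recursive parallel search can be sketched compactly. A minimal Python rendering, where query() stands in for the FIND_NODE RPC and alpha = 3, k = 20 follow the paper; everything else is an assumption:

    # Iterative Kademlia-style node lookup with alpha-way parallelism.
    ALPHA, K = 3, 20

    def node_lookup(start_contacts, key, query):
        """Converge on the k ids closest to key under the XOR metric.
        query(n, key) asks node n for the closest contacts it knows."""
        shortlist = sorted(start_contacts, key=lambda n: n ^ key)[:K]
        queried = set()
        while True:
            # Ask the alpha closest contacts not yet queried; the paper
            # issues these RPCs in parallel and asynchronously.
            pending = [n for n in shortlist if n not in queried][:ALPHA]
            if not pending:
                return shortlist        # no new contacts: lookup done
            for n in pending:
                queried.add(n)
                shortlist.extend(query(n, key))
            shortlist = sorted(set(shortlist), key=lambda n: n ^ key)[:K]

Because the shortlist only ever keeps the k closest contacts seen so far, dead or slow nodes are simply displaced by responsive ones, which is where the multiple-path redundancy noted above comes from.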
From mvp9@cornell.edu Thu Oct 31 12:02:45 2002
Received: from cornell.edu (cornell.edu [132.236.56.6]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VH2jc04594 for ; Thu, 31 Oct 2002 12:02:45 -0500 (EST)
Received: from zoopark.cornell.edu (syr-24-58-46-186.twcny.rr.com [24.58.46.186]) by cornell.edu (8.9.3/8.9.3) with ESMTP id MAA07514 for ; Thu, 31 Oct 2002 12:02:45 -0500 (EST)
Message-Id: <5.1.0.14.2.20021031120206.00ac4f30@postoffice.mail.cornell.edu>
X-Sender: mvp9@postoffice.mail.cornell.edu (Unverified)
X-Mailer: QUALCOMM Windows Eudora Version 5.1
Date: Thu, 31 Oct 2002 12:02:44 -0500
To: egs@CS.Cornell.EDU
From: mike polyakov
Subject: 615 PAPER 61
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

The four systems presented today all aim to quickly find the locations of objects in overlay networks.

Pastry is simply a substrate for various distributed P2P services. Node ids are picked randomly, allowing some measure of load balancing. Through a direct relationship between node ids and keys, messages can find their way in log N time, in the number of nodes. When new nodes are added, updates to routing tables must be sent around, the resulting tables also being of size log N. Although it performs replication along the way, Pastry puts most of the responsibility for reliability and security on the application that is running on top. Although it is resilient to single failures, simultaneous failures of neighboring nodes can prevent query completion. Overall it is a fairly simple protocol.

Chord focuses more on providing scalable data location in a dynamic environment. Its sole function - the interface it presents to the application - is to map keys to nodes. The mapping is done by a consistent hash function which ensures load balancing (with some assumptions about the nature of the hash function). Chord also maintains log N information, but can serve queries with considerably less (allowing failures), though with slower than the normal log N running time. Similarly to the Plaxton scheme, it maintains finger tables, such that the size of the hop increases on each hop. While it has flexible naming, there is no concept of locality, and demand by location cannot be addressed as it was in Freenet. It has provable correctness and performance, and the basic scheme is fairly simple, though maintaining synchronization can become complicated. In the evaluation it scales quite well, with log N and a small constant, path lengths averaging 6 when there are 20,000 nodes in the system.

Tapestry is another Plaxton-based system. It does keep track of locality, providing access to the closest copy of the object. It employs randomness for both load distribution and locality; that is, objects are randomly assigned to nodes. Like in Plaxton, routing is done digit by digit (of the key id), in their case from end to front, so path lengths end up roughly log N; and each object in the network has a root node and a tree based at that node. It is scalable, and the fault handling and routing are quite simple - there are multiple redundant routing links per entry. However, maintaining the soft state and running all of the optimizations required for good behavior is quite complicated.

Kademlia also uses prefix routing, as do the above systems. One of its benefits is that configuration information is spread by key lookups, eliminating much of the traffic necessary for other algorithms. This is accomplished by using XOR as the distance metric between keys.
Since the metric is symmetric, an incoming request tells a node a lot about the topology around it. Keys and node ids are in the same space; thus a node holds the keys that are most like its id. Keys are also cached as part of the system's operation, improving performance. This is probably the simplest protocol reviewed, which allows the authors to give proofs of its functionality and speed; once again, queries are log N. The system of k-buckets increases performance and provides some defense against DoS attacks. Unfortunately there is no empirical evaluation of the protocol in the paper to bear out the results of the proofs.

From mr228@cornell.edu Thu Oct 31 12:07:26 2002
Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VH7Qc05763 for ; Thu, 31 Oct 2002 12:07:26 -0500 (EST)
Received: by travelers.mail.cornell.edu (8.9.3/8.9.3) id MAA18397; Thu, 31 Oct 2002 12:07:23 -0500 (EST)
Date: Thu, 31 Oct 2002 12:07:23 -0500 (EST)
From: mr228@cornell.edu
Message-Id: <200210311707.MAA18397@travelers.mail.cornell.edu>
To: egs@CS.Cornell.EDU
Errors-To: mr228@cornell.edu
Reply-To: mr228@cornell.edu
Cc: mr228@cornell.edu
MIME-Version: 1.0
Content-Type: text/plain
Content-Transfer-Encoding: 7bit
X-Mailer: IMP/PHP3 Imap webMail Program 2.0.9
Sender: mr228@cornell.edu
X-Originating-IP: 128.84.223.176
Subject: 615 PAPER 61

The papers we read for today all discuss overlay networks aimed at allowing distributed storage with varying degrees of fault-tolerance and efficiency.

Chord is a distributed hashtable. The goal here is efficiency of lookups. Keys are mapped to nodes based on a hash. The node that stores a given key is the one whose key most closely matches the file's key. The basic Chord protocol works like this: all nodes compute or are assigned their unique identifier (chosen uniformly at random from the keyspace). There then exists a function that maps content to a key within this space. All nodes arrange themselves into a virtual circle in increasing order of node id. Nodes have both forward and backward pointers to their neighbors. Additionally, each node keeps log(# hosts) pointers to other locations within the ring. That is, each node has pointers to the nodes that are 2, 4, 8, 16, 32, ... (# hosts / 2) away from it when counting clockwise around the circle. This allows for the fundamental contribution: O(log N) lookups. With this scheme it is provable that, starting at any node, one can reach the node containing any piece of data in log(N) steps. The paper further describes the mechanisms for hosts joining and leaving and for dealing with node failure. It also presents a simulation and empirical results.

In the second paper, Pastry, we see a system that is very similar to Chord. At their core, both systems give similar bounds on performance and fault-tolerance. In Pastry each node keeps track of a finite neighborhood of nodes, and routing is done by lookups in this table. If a node receives a request that it cannot service itself, it checks its leaf set to see if one of those nodes can handle it. If so, the request is delivered. In addition to this leaf set, the nodes keep a routing table that is prefix-based. That is, the key being requested is routed to the node in the routing table whose prefix it most closely matches. The paper discusses how these sets are created and maintained and, given a reasonable set of assumptions, bounds the lookup time.
The key to bounding these costs relies on the keys being approximately uniformly distributed. The paper goes on to describe node failures and other maintenance provisions. A key benefit of this protocol is that it takes into account the underlying network and can do things such as encourage nodes that are "close" in the underlying network's topology to be close in the overlay network.

Like Pastry, Tapestry is also a system based on digit-by-digit matching (suffixes rather than prefixes) and clever assignment of keys. One key feature of Tapestry is replication: more than one copy of a file/resource may be present in the network. Like the other protocols, Tapestry's state is soft and self-repairing. Routing is done via a neighborhood map that has a dynamic size and is discovered with a modified expanding ring search.

Kademlia is an interesting approach that makes use of some of the properties of XOR. Keys are again 160-bit identifiers. The distance between two keys is the XOR of them (interpreted as an integer). XOR is unidirectional; therefore, all lookups will converge along the same path, and the key benefit is that caching of these lookups can alleviate hotspots. The topology that evolves is reinforced as queries are presented. This paper lacks an implementation and instead contains only a sketchy proof and discussion in support of its claims.

From hs247@cornell.edu Thu Oct 31 12:12:22 2002
Received: from mailout5-0.nyroc.rr.com (mailout5-1.nyroc.rr.com [24.92.226.169]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VHCMc07160 for ; Thu, 31 Oct 2002 12:12:22 -0500 (EST)
Received: from hubby.cornell.edu (syr-24-58-42-130.twcny.rr.com [24.58.42.130]) by mailout5-0.nyroc.rr.com (8.11.6/RoadRunner 1.20) with ESMTP id g9VHCJc14172; Thu, 31 Oct 2002 12:12:19 -0500 (EST)
Message-Id: <5.1.0.14.2.20021031121047.00b433e0@postoffice2.mail.cornell.edu>
X-Sender: hs247@postoffice2.mail.cornell.edu (Unverified)
X-Mailer: QUALCOMM Windows Eudora Version 5.1
Date: Thu, 31 Oct 2002 12:12:48 -0500
To: egs@CS.Cornell.EDU, cornell.edu@cornell.edu
From: Hubert Sun
Subject: 615 Paper 61
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"; format=flowed
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by sundial.cs.cornell.edu id g9VHCMc07160

Pastry stores routing information about nodes which share the same first n digits of its identifier. This allows requests to be routed towards the correct node. It also has leaves whose ids are directly greater and less than its own key. Pastry also has a neighbourhood map which contains the addresses closest in geography to it (this gives Pastry some local information). To process a request, if the node is not the owner, it looks first at the leaves, then at the routing table. On average, a lookup then takes log N steps, where N is the number of nodes in the network. This is guaranteed because, from the tables, each request moves closer towards the target.

Chord has a routing table that contains nodes that are 1, 2, 2^2, ... positions away around the ring. Therefore each node only keeps log N entries, which is more space-efficient than Pastry. If it is not the owner, the node forwards the request to the farthest 2^i entry that still precedes the key in the ring. A request is therefore guaranteed not to go around the ring more than once, and the entries allow the search to be log N. The algorithm and data structures required are simpler than Pastry's.

Tapestry has the nice property that its requests are looked up in log N time, where N is the length of the key. Each key is routed from right to left.
So if a message reaches node Y on the ith hop, Y knows that the i rightmost digits match its key. All it has to do is pass it on to the next node, which is responsible for the (i+1)th digit. However, this also means the table can grow as the key size becomes larger. Tapestry seems to be the most complex of the 4 networks.

Kademlia uses an XOR metric to determine the distance between nodes. Though it isn't made totally clear, they claim the average operation time to be log N. The advantage of Kademlia is that its algorithm is very simple.

As far as fault tolerance goes, Tapestry offers it by having multiple routes for each routing entry. Because Chord is one-way, it seems to be less fault tolerant than Pastry. When a node fails, its immediate predecessor has to take over, though the paper claims Chord will only fail if all log N entries in the table fail. In Pastry, data is replicated to nearby nodes, and a lookup fails completely only if all the leaf entries fail. All the papers discussed seem to be susceptible to tampering attacks, as any node can join a network and mount attacks such as diverting routes and providing wrong information to other nodes.

From smw17@cornell.edu Thu Oct 31 12:13:51 2002
Received: from cornell.edu (cornell.edu [132.236.56.6]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VHDpc07258 for ; Thu, 31 Oct 2002 12:13:51 -0500 (EST)
Received: from cornell.edu ([128.84.84.84]) by cornell.edu (8.9.3/8.9.3) with ESMTP id MAA15525 for ; Thu, 31 Oct 2002 12:13:49 -0500 (EST)
Message-ID: <3DC164F2.7090308@cornell.edu>
Date: Thu, 31 Oct 2002 12:14:26 -0500
From: Sean Welch
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Emin Gun Sirer
Subject: 615 PAPER 61
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Chord/Pastry

Chord is a structured P2P network based on a variant of consistent hashing. The primary goal of Chord is to take a key and map that key onto a node. In addition, Chord attempts to maintain reasonable scalability by reducing the local knowledge requirement and the route length to O(log N). Chord creates and maintains an identifier circle within the keyspace, with each node responsible for a portion of the keyspace based on its position within the circle. Minimal routing may be achieved by simply forwarding to your successor node in the ring, but this incurs linear growth with respect to the number of nodes. To reduce this, Chord seeks to improve routing performance by maintaining a table of successor pointers in a series of k-buckets. Each k-bucket represents a bit of difference between the current node and the true successor, and hence represents an approximate binary search/traversal of the identifier ring.

Pastry is a very similar protocol, which also configures an identifier ring as described above. The major difference between Chord and Pastry is that in Pastry there is an inherent two-tier configuration, with nodes maintaining both a general routing table (with multiple entries in each bucket) as well as a list of leaf nodes that are within the local bucket size. This permits two major changes/optimizations over Chord. First, the presence of the leaf nodes implies inherent redundant information in the network for nodes close to the desired node.
This both improves performance somewhat, by allowing a notion of 'close enough' during routing, and provides some degree of enhanced fault tolerance independent of the network reorganization mechanism for node addition/removal. The second improvement is the ability to select nodes based on IP-level performance, attempting to choose short, fast hops wherever possible. This is an interesting idea, though it by no means yields an optimal network path, especially when physically long distances need to be traversed. Still, the availability of multiple leaves and multiple entries in the routing table allows the freedom to optimize based on network- or application-derived metrics.

Tapestry

Tapestry is somewhat different in presentation, aiming more at creating a wide-area location and routing system. Tapestry is a modified Plaxton scheme, intended to permit operation in dynamic environments with only local knowledge. Tapestry accomplishes this through surrogate routing, where a node is presented with some object key k. The node then looks in its neighbor table to find the neighbor that matches the largest number of trailing digits, and forwards the data to that node. While less efficient than the original Plaxton scheme, this surrogate routing scheme allows the extension to decentralized dynamic systems. Fault tolerance is provided by maintaining multiple neighbors for each link, permitting a fast switch to an alternate on link failure. The previous link is not simply discarded, but instead is maintained for a second-chance interval. By not discarding the link information, nodes may avoid the removal/addition overhead required for a link that suffers only short transient failures. Tapestry also provides support for mobile data objects, allowing a moved data object to redirect the routing path only as necessary to accommodate the change. Essentially, the new node that owns the object propagates a message up the tree until it reaches a node that already knows about its previous route; once this node is adjusted, the route is properly reconstructed. Overall, Tapestry makes a credible attempt to create a useful wide-area system. Implementation details are sparse in many cases, providing more of a general outline than a complete functional system description.

Kademlia

Kademlia is an interesting little system based on an XOR metric. The underlying system is conceptually similar to those outlined above, with some differences in implementation (e.g., eviction policies). The main advantage is that the XOR metric is both symmetric and unidirectional; there is no need for secondary routing tables such as those in Pastry. Kademlia also includes inherently parallel distribution to nodes close to the key in the identifier space, allowing a reliability/bandwidth tradeoff in the system. Keys are distributed to the k nodes nearest the key, and any node holding the appropriate key can respond to (and terminate) a query.
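As a concrete illustration of that last point, here is a minimal sketch (not from the paper; IDs shortened to 8 bits and helper names invented) of selecting the k nodes nearest a key under the XOR metric:

    # Toy illustration of Kademlia's placement rule: a key is stored on
    # the k nodes whose IDs are closest to it, where distance is the XOR
    # of the two IDs interpreted as an integer.

    def xor_distance(a: int, b: int) -> int:
        return a ^ b

    def k_closest(node_ids, key, k):
        # Sort known node IDs by XOR distance to the key; keep the k nearest.
        return sorted(node_ids, key=lambda n: xor_distance(n, key))[:k]

    # Small 8-bit IDs stand in for Kademlia's 160-bit identifiers.
    nodes = [0b00010011, 0b10110100, 0b00010110, 0b11100001]
    print([bin(n) for n in k_closest(nodes, key=0b00010001, k=2)])

Any of those k nodes can answer (and terminate) a query for the key, which is where the redundancy comes from.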
From ag75@cornell.edu Thu Oct 31 12:30:06 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VHU5c10570 for ; Thu, 31 Oct 2002 12:30:05 -0500 (EST) Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by travelers.mail.cornell.edu (8.9.3/8.9.3) with SMTP id MAA05468 for ; Thu, 31 Oct 2002 12:30:02 -0500 (EST) Date: Thu, 31 Oct 2002 12:30:02 -0500 (EST) From: ag75@cornell.edu X-Sender: ag75@travelers.mail.cornell.edu To: egs@CS.Cornell.EDU Subject: 615 PAPER 61 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

This week we are presented with four papers in which the authors describe their object location and routing systems. These systems differ in how they are implemented, but their goals are quite similar.

Pastry is a distributed object location and routing substrate for wide-area P2P applications. The idea behind Pastry is to make it scalable, fault-resilient, completely decentralized and self-organizing. The lookup time for Pastry is very good: in a large network of 100,000 nodes, the number of hops needed for one node to reach any other node never exceeds 5. It is quite resilient to failures, and its performance degrades gracefully with the number of recent node failures. It is not a simple protocol, but it is scalable and has good performance. Pastry has no way to prevent tampering with data and no protection against DoS attacks.

Chord is a protocol for lookup in a dynamic P2P system with frequent node arrivals and departures. The idea behind Chord is to make it scalable, fully distributed and simple. The lookup time for Chord is similar to that of Pastry, but a bit worse: in a network of 10,000 nodes, the number of hops needed was approximately 10 in the worst case. Like Pastry, Chord is quite resilient to failures, and its performance degrades gracefully with the number of node failures. One of the advantages advocated by the authors is that Chord is simple. Chord can deal with some types of attacks, but it is still quite vulnerable to tampering.

Tapestry is an overlay location and routing infrastructure that provides location-independent routing of messages directly to the closest copy of an object or service without centralized resources. The idea behind Tapestry is to make it self-administering, fault-tolerant and scalable. Tapestry provides logarithmic logical hop limits similar to Chord's, but there is not enough information to determine which of the two performs better. Tapestry is quite resilient to failures, but it can't deal with networks that are constantly changing, such as sensor networks. It is a complex protocol and, like the previous two algorithms, susceptible to tampering.

Kademlia is a P2P system that has provable consistency and performance in a fault-prone environment. The idea behind Kademlia is to route queries and locate nodes using an XOR-based metric topology. The authors claim that Kademlia can route with lower latency than Chord, but there are no measurements to justify this claim. It deals with failed nodes by sending parallel, asynchronous queries, but again there are no measurements to indicate how well the system works. Kademlia is simple and resists certain basic DoS attacks.
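Those parallel, asynchronous queries are worth making concrete. A rough sketch of a single lookup round, assuming a concurrency of 3 and a placeholder find_node RPC (neither name is from the paper's code):

    import asyncio

    ALPHA = 3  # Kademlia's concurrency parameter (alpha in the paper)

    async def find_node(peer, target):
        # Stand-in for a real FIND_NODE RPC; a real implementation would
        # send a message and await the peer's k closest known contacts.
        await asyncio.sleep(0.01)
        return []  # the peer's answer would go here

    async def query_round(peers, target):
        # Fire ALPHA queries at once; one dead or slow peer cannot stall
        # the round, because the others complete independently.
        tasks = [find_node(p, target) for p in peers[:ALPHA]]
        replies = await asyncio.gather(*tasks, return_exceptions=True)
        return [r for r in replies if not isinstance(r, Exception)]

    asyncio.run(query_round(["peerA", "peerB", "peerC"], target=0b1010))

This is how failed nodes are tolerated without per-hop timeouts: a lookup simply proceeds with whichever replies arrive first.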
From pj39@CS.Cornell.EDU Thu Oct 31 13:13:51 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VIDpc20166 for ; Thu, 31 Oct 2002 13:13:51 -0500 (EST) Received: from pj39.cornell.edu (syr-24-59-67-50.twcny.rr.com [24.59.67.50]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id NAA00432 for ; Thu, 31 Oct 2002 13:13:49 -0500 (EST) Message-Id: <5.1.0.14.2.20021031130824.01a7d638@postoffice2.mail.cornell.edu> X-Sender: pj39@postoffice2.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Thu, 31 Oct 2002 13:13:52 -0500 To: egs@CS.Cornell.EDU From: Piyoosh Jalan Subject: 615 PAPER 61 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed

Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems
Chord: A Peer-to-Peer Lookup Service for Internet Applications
Tapestry: An Infrastructure for Wide-area Fault-tolerant Location and Routing
Kademlia: A peer-to-peer information system based on the XOR metric

The four papers discussed today differ considerably in their lookup time, failure resilience and simplicity.

Pastry, a decentralized object location and routing substrate for large-scale p2p systems, assigns each node a unique 128-bit nodeID, which may be a cryptographic hash of its public key or IP address. The nodeID indicates the position of the node in the circular nodeID space, which ranges from 0 to 2^128 - 1. Each Pastry node maintains a routing table, a neighborhood set and a leaf set. The routing table contains log_(2^b) N rows, each row having 2^b - 1 entries. The neighborhood set M contains the nodeIds and IP addresses of the |M| nodes that are closest, according to the proximity metric, to the local node. The leaf set L consists of the |L|/2 nodes with numerically closest smaller nodeIds and the |L|/2 nodes with numerically closest larger nodeIds relative to the present node. In an N-node network a Pastry lookup thus takes log_(2^b) N hops, assuming the routing table has no stale entries. Pastry achieves fault tolerance by using randomized routing and replication. As far as simplicity is concerned, Pastry is not as simple as some other systems, such as Kademlia.

Chord is a peer-to-peer lookup service whose basic functionality is to map keys to the nodes responsible for them. It uses consistent hashing, which has several good properties: the hash function balances load, and when a node joins (or leaves) an N-node network, only an O(1/N) fraction of the keys move to a different location. Each Chord node maintains a finger table with log N entries giving successors at increasing distances around the ring. In an N-node network Chord thus maintains information about O(log N) other nodes, and lookup is of order O(log N). Chord is most suitable for dynamic environments where nodes frequently join and leave the network. Nodes in Chord form a ring. Chord also performs adequately when nodes join the system concurrently and when nodes fail or leave concurrently. Whenever a node in the ring leaves the network or fails, Chord uses a successor list of the r nearest successors on the ring, which helps replace the missing node with the next live successor.

Tapestry is an overlay location and routing infrastructure that provides location-independent routing directly to the closest copy of data using p2p links.
The main difference between Tapestry and the other systems discussed today is that it is fault-tolerant and resilient under load. It uses a soft-state announce/listen approach: caches are updated and purged by periodic refresh messages that carry routing information to neighbors. To detect link failures, each node sends a periodic beacon to all of its neighbors. Each node also maintains two backup neighbors per entry for fault tolerance. There are multiple root nodes in Tapestry, so there is no single point of failure. Tapestry is thus quite complex to implement, but has a fairly good lookup time of O(log N).

Kademlia is a peer-to-peer system that locates nodes based on an XOR metric. It combines features of other P2P systems such as Chord and Pastry into a coherent whole. Lookup in Kademlia is based on parallel, asynchronous Remote Procedure Calls (RPCs); nodes that respond quickly are explored further by sending additional requests to contacts drawn from their k-buckets. Fault tolerance in Kademlia rests on each node keeping its contacts ordered by time last seen, thus avoiding sending requests along likely-dead paths. I believe Kademlia is by far the simplest of the systems discussed today.

From xz56@cornell.edu Thu Oct 31 14:25:44 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VJPhc05871 for ; Thu, 31 Oct 2002 14:25:44 -0500 (EST) Received: from Xin (dhcp-ece-171.ece.cornell.edu [132.236.232.171]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with SMTP id OAA04871 for ; Thu, 31 Oct 2002 14:25:42 -0500 (EST) Message-ID: <01e301c28113$4500fe90$5c5d5480@Xin> From: "Xin Zhang" To: "Emin Gun Sirer" Subject: 615 PAPER 61 Date: Thu, 31 Oct 2002 14:25:27 -0500 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2720.3000 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2600.0000

All four papers deal with the problem of application-level routing and object location.

Pastry aims at being a completely decentralized, fault-resilient, scalable, and reliable substrate for internet applications (such as PAST, which is built on top of it). It achieves good route locality by assigning each node a unique identifier (nodeId) indicating the node's position, and routing each message to the node numerically closest to the key. The state each node maintains consists of a routing table, a neighborhood set and a leaf set. Routing is done in three steps: check the leaf set; check the routing table for nodes sharing a common prefix; check the other associated nodes. With high probability, routing completes in a logarithmic number of hops. A new node joins by asking a known node to route a join request, copying state tables from the responses, and making the appropriate modifications to its own tables. Nodes periodically check the liveness of other nodes and update their tables. The simulation shows good scalability (up to 100,000 nodes), a high percentage of lookups finding the closest node, and quick recovery from node failures.

Chord is a distributed lookup protocol: given a key, it maps the key onto a node. It uses consistent hashing to do this, which also helps balance load. Each node stores information about only a small number of other nodes, mostly nodes nearby in the identifier space. One lookup takes O(log N) time, and routing information can be updated in O(log^2 N) time when nodes join or fail.
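A small sketch of the finger-table step behind that O(log N) bound (a toy ring with 8-bit IDs and invented helper names, not the paper's code):

    M = 8  # identifier bits; Chord itself uses 160

    def between(x, a, b):
        # True if x lies on the ring strictly between a and b (clockwise).
        return (a < x < b) if a < b else (x > a or x < b)

    class ChordNode:
        def __init__(self, nid, fingers):
            self.id = nid
            self.fingers = fingers  # fingers[i] ~ successor(id + 2^i)

        def closest_preceding_node(self, key):
            # Scan fingers from farthest to nearest and return the last
            # known node that still precedes the key; each such hop at
            # least halves the remaining ring distance, which is where
            # the O(log N) hop count comes from.
            for f in reversed(self.fingers):
                if between(f.id, self.id, key):
                    return f
            return self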
Tapestry provides location-independent routing using only point-to-point links, even in the presence of heavy load and network and node faults. The main idea behind this fault-tolerant scheme is "stability through statistics". Tapestry is mainly based on Plaxton's scheme and uses neighbor maps (local routing maps) to resolve the ID digit by digit. Each entry in the neighbor map maintains two backup neighbors, which increases fault tolerance. Tapestry further improves over Plaxton by operating dynamically and in a decentralized manner.

The main contribution of Kademlia is the use of the XOR metric for distance between points in the key space. Each node maintains lists corresponding to distances calculated using XOR. Each list is a k-bucket with nodes sorted by time last seen. The protocol consists of four operations, PING, STORE, FIND_NODE and FIND_VALUE, each of which takes (log N) + c time, which is provable using the XOR metric.

From ks238@cornell.edu Thu Oct 31 14:44:41 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VJifc10501 for ; Thu, 31 Oct 2002 14:44:41 -0500 (EST) Received: from ks238.cornell.edu (syr-24-24-18-11.twcny.rr.com [24.24.18.11]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id OAA25037 for ; Thu, 31 Oct 2002 14:44:38 -0500 (EST) Message-Id: <5.1.0.14.2.20021031144346.01b4cd20@postoffice2.mail.cornell.edu> X-Sender: ks238@postoffice2.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Thu, 31 Oct 2002 14:44:36 -0500 To: egs@CS.Cornell.EDU From: Karan Suri Subject: 615 Paper #61 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed

The focus of these papers is decentralized search in a P2P domain. The four protocols that facilitate decentralized searches are Pastry, Chord, Tapestry and Kademlia.

As presented by Rowstron and Druschel in their paper, Pastry is a routing protocol for peer-to-peer systems that is used to discover objects within a decentralized network. Nodes are denoted by a nodeID, which is simply the hashed value of a node's public key or IP address. Pastry uses routing tables, neighborhood sets and leaf sets in order to facilitate route discovery efficiently. A routing table, which is of size log_(2^b)(N) x (2^b - 1), is used to store those nodes that are in close proximity to the source node. Proximity is gauged by every node's prefix: similar nodeID prefixes translate to close proximity, and disparities in prefixes translate to more distance between nodes. The paper includes a simulation of up to 100,000 nodes, which demonstrates that the protocol is quite powerful and flexible. One of the reasons the protocol scales so well is the efficiency with which it stores routing tables. An area of concern may be scaling this protocol in extremely resource-starved networks; a solution might be to use only the neighborhood set instead of the full routing table, to reduce resource consumption even further.

The next protocol evaluated is Chord, a distributed lookup service specifically intended for decentralized networks. The Chord protocol is used to locate nodes that store specific data items. In Chord, each node is designated by a specific key value (key values are distributed evenly across the nodes of the network), which serves as its primary identity in the network.
The authors discuss methods by which data can also be associated with a key, so that when searching for a certain data item, the proximity of the target key value determines how close the search is to arriving at the target node containing that data. Chord routes through only O(log N) other nodes in order to arrive at its destination, which makes it an extremely scalable and efficient protocol. Different design issues are addressed in the paper, including the robustness of the protocol: given sudden topological changes in the network, Chord adjusts by following a node's successor pointers.

The third protocol is Tapestry, which aims at creating a scalable, fault-resistant and adaptive overlay infrastructure designed to enable easy application creation. Tapestry's primary differentiating factor is the semantic flexibility it provides by allowing operators to select objects from the entire network (rather than being limited or assigned to one object). Furthermore, Tapestry is fault-resilient, since it uses soft-state nodes that cache data to make recovery efficient. The paper also discusses a series of algorithms that facilitate dynamic adjustment in decentralized networks: for instance, dynamic node insertion; mechanisms to keep the soft-state implementation from incurring excessive overhead; and introspective optimizations that increase the robustness of the protocol (i.e., adapting to particular network changes). Tapestry is an ideal protocol for heavily loaded networks and for mobile networks that constantly demand load balancing and adaptation to node failures.

The final P2P protocol that we read about is Kademlia. This protocol uses a unique XOR-based metric topology to facilitate query and node discovery, using messages to transport additional pertinent data that help monitor the network; in other words, nodes that need to conduct key lookups can also use these messages to send configuration information. The protocol identifies nodes with 160-bit node IDs, and keys are also 160-bit identifiers. The distance between two nodes is the bitwise exclusive-or of their 160-bit nodeIDs. Lookups are unidirectional, so all lookups for the same key converge along the same path, which eliminates unnecessary duplicate paths. The protocol consists of four key RPCs (PING, STORE, FIND_NODE, FIND_VALUE), which are used for node and route discovery in the network. The paper also proposes a concurrency parameter, alpha, which addresses the tradeoff between lowest-latency hop selection and bandwidth exhaustion.

From linga@CS.Cornell.EDU Thu Oct 31 14:45:08 2002 Received: from snoball.cs.cornell.edu (snoball.cs.cornell.edu [128.84.96.54]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VJj7c10540 for ; Thu, 31 Oct 2002 14:45:07 -0500 (EST) Received: from localhost (linga@localhost) by snoball.cs.cornell.edu (8.11.3/8.11.3/C-3.2) with ESMTP id g9VJj7203905 for ; Thu, 31 Oct 2002 14:45:07 -0500 (EST) Date: Thu, 31 Oct 2002 14:45:07 -0500 (EST) From: Prakash Linga To: Emin Gun Sirer Subject: 615 PAPER 61 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII

Pastry
------
This is a scalable distributed object location and routing substrate for P2P systems. Each node has a nodeId that identifies it, obtained by applying a hash function to the node's address (say, its IP address).
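A one-line illustration of such an assignment, assuming SHA-1 as the hash function (the papers variously use 128- and 160-bit identifiers; the exact hash is an implementation choice, and the address below is made up):

    import hashlib

    def node_id(address: str) -> int:
        # SHA-1 of the node's address yields a 160-bit identifier;
        # Pastry uses 128-bit nodeIds, but the derivation idea is the same.
        return int.from_bytes(hashlib.sha1(address.encode()).digest(), "big")

    print(hex(node_id("10.0.0.1")))

Because the hash scatters addresses uniformly over the identifier space, nearby nodeIds almost certainly belong to unrelated machines, which is what the load-balancing and failure-independence arguments below rely on.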
Pastry provides a routing mechanism to efficiently route a message to the node with nodeId numerically closest to the key (it supports route(message, key)). The nodeId is viewed as a number in base 2^b. The (i, j)th entry of the routing table is a node that agrees with the local nodeId in the first i digits and whose (i+1)th digit is j. This can be used to route any message in at most a logarithmic number of hops, given that all the data structures are correct (which is the case provided there have been no insertions/deletions for a significant period of time). In addition to the routing table, the 2^b immediate left and right neighbors (according to nodeId) are stored in the leaf set. There is also a neighborhood set that stores the 2*2^b nodes closest to this node according to the proximity metric (round-trip time is one such metric). When a node X wants to join the system, it first finds (say, using expanding ring IP multicast) some node A that is close to X and already a member of the system. A then issues a join request on X's behalf, which eventually reaches Z, the node in the system with nodeId closest to X's. X obtains its leaf set from Z and its neighborhood set from A, and builds its routing table from candidates supplied by the nodes on the path from A to Z. Finally, X sends a copy of its state to all the nodes it knows of; the recipient nodes in turn update their state to reflect the fact that X is now in the system. Node departures are detected by adjacent nodes (neighbors in nodeId space). When a node is detected to be dead, the extreme node in the same half of the leaf set is contacted and a replacement is chosen. Nodes holding the failed node as a routing table entry detect the failure when trying to route through it; such a node then finds an alternative for the entry by contacting some other node in the same row. Locality can be considered when constructing the routing table, and the authors argue that the route chosen for a message is likely to be good with respect to the proximity metric.

Pros & Cons: Decentralized, scalable, takes locality into account. Logarithmic search time when the system is stable; logarithmic storage. Can tolerate at most 2*2^b - 1 failures of nodes nearby in nodeId space (the probability of this happening is very low because of the use of a hash function). Simple protocol. Security and privacy have not been investigated, and neither has the behavior of the system in a highly dynamic environment (a large number of nodes joining and leaving the system).

TAPESTRY is very similar to PASTRY.

CHORD
-----
This is another distributed object location and routing substrate for P2P systems. Each node has a nodeId determined by hashing the node's address (say, its IP address). The nodes are arranged in a virtual ring in the nodeId space. The routing table at each node consists of pointers to nodes that are 2^i positions away in the identifier space, for i = 0 to log(N), so storage is logarithmic. To route a message, a node forwards it to the farthest node it knows of that still precedes the target identifier. Search time is logarithmic in the worst case, assuming all the data structures are correct; note that search will still succeed when the data structures are inaccurate, but the worst-case cost is then no longer logarithmic. When a node wants to join the system, it contacts some node that is already a member, which forwards the join message to the node with nodeId numerically closest to the joining node's. A simple stabilization protocol is then used to ensure that the ring is maintained consistently.
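A rough sketch of that stabilization step, following the spirit of the Chord paper's pseudocode (names simplified; this is not the paper's code):

    def between(x, a, b):
        # True if x lies on the identifier ring strictly between a and b.
        return (a < x < b) if a < b else (x > a or x < b)

    class Node:
        def __init__(self, nid):
            self.id = nid
            self.successor = self
            self.predecessor = None

        def stabilize(self):
            # Periodically ask the successor for its predecessor; if a
            # newly joined node has slipped in between, adopt it as the
            # new successor, then let the successor learn about us.
            x = self.successor.predecessor
            if x is not None and between(x.id, self.id, self.successor.id):
                self.successor = x
            self.successor.notify(self)

        def notify(self, n):
            # Adopt n as predecessor if it is closer than the current one.
            if self.predecessor is None or between(n.id, self.predecessor.id, self.id):
                self.predecessor = n

Running stabilize periodically at every node is enough to repair successor pointers after concurrent joins, which is all that correctness requires; the fingers are only an efficiency optimization.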
The paper then presents an O(log^2 N) algorithm to find the nodes affected by the insertion and to update their finger tables. A tuple with key k is stored at the node whose nodeId is the least identifier currently in the system such that k <= nodeId. Given this, when a node joins the system it takes responsibility for some of the tuples previously stored at its successor. Each node runs the stabilization protocol periodically to fix its finger table entries (to handle concurrent joins). When a node fails, its neighbors (in nodeId space) detect its departure and fix the lowest level (using a successor list); the stabilization process then fixes the other fingers appropriately. Experiments show that the approach is effective and scalable.

Pros and Cons: Similar to Pastry, except that this is a simpler protocol and locality issues are not addressed.

From liuhz@CS.Cornell.EDU Thu Oct 31 18:11:44 2002 Received: from exchange.cs.cornell.edu (exchange.cs.cornell.edu [128.84.97.8]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id g9VNBic09816 for ; Thu, 31 Oct 2002 18:11:44 -0500 (EST) content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Subject: 615 PAPER 61 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 Date: Thu, 31 Oct 2002 18:11:43 -0500 Message-ID: <706871B20764CD449DB0E8E3D81C4D4302CEE6BA@opus.cs.cornell.edu> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 615 PAPER 61 Thread-Index: AcKBMuDJz1z+CNdDSO6NRhjC7kF8VQ== From: "Hongzhou Liu" To: "Gun Sirer" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by sundial.cs.cornell.edu id g9VNBic09816

These four papers introduce Pastry, Chord, Tapestry, and Kademlia. The systems are similar in many ways: all claim to finish the lookup of a key in O(log N) hops while maintaining only O(log N) routing entries per node. Their routing protocols also share a common idea, namely routing the key to the node whose ID is closest to the key under some metric; the metric, however, differs for each system.

Pastry: Each node in the Pastry network has a unique identifier (nodeId, a 128-bit number viewed in base 2^b) and maintains a routing table consisting of 128/b levels, each with 2^b - 1 entries. Entries at level n each refer to a node whose nodeId shares the first n digits with the present node's nodeId, but whose (n+1)th digit takes one of the 2^b - 1 possible values other than the (n+1)th digit of the present node's id. Each node also maintains a leaf set containing the |L| nodes numerically closest to it, and a neighbor set containing the |M| nodes closest to it according to some proximity metric. When presented with a key, a node searches its routing table, leaf set and neighbor set, and picks a node that either shares a longer prefix with the key than the local node does, or shares an equally long prefix but is numerically closer to the key. Pastry is self-organizing: when a node joins the network, it constructs its routing table from the information it obtains along the path between itself and the node whose nodeId is numerically closest, and then notifies all nodes that need to be aware of its arrival. Nodes in the Pastry network may fail or depart without warning, but their neighbors can detect this and repair their leaf sets and routing tables accordingly. Pastry is thus failure-resilient to a degree.
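A minimal sketch of the forwarding rule just described, treating nodeIds as fixed-length hex strings (toy IDs and helper names, not Pastry's code):

    def shared_prefix_len(a: str, b: str) -> int:
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        return n

    def next_hop(local: str, key: str, known_nodes):
        # Forward to a node that shares a strictly longer prefix with
        # the key than the local node does, or shares an equally long
        # prefix but is numerically closer to the key.
        lp = shared_prefix_len(local, key)
        dist = lambda nid: abs(int(nid, 16) - int(key, 16))
        candidates = [n for n in known_nodes
                      if shared_prefix_len(n, key) > lp
                      or (shared_prefix_len(n, key) == lp and dist(n) < dist(local))]
        # Prefer the longest prefix match, breaking ties by numeric closeness.
        return min(candidates,
                   key=lambda n: (-shared_prefix_len(n, key), dist(n)),
                   default=None)  # None: the local node is the destination

    print(next_hop("a3f1", "a3c2", ["a3c0", "b111", "a3f0"]))  # -> a3c0

Because every hop either lengthens the matched prefix or shrinks the numeric distance, the routing always makes progress and cannot loop.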
Objects are duplicated throughout the network and become unavailable only when all replicas fail. Randomness is used to choose among multiple qualified next hops, which alleviates the effect of link failures.

Tapestry: Tapestry is very similar to Pastry. It uses the length of the common suffix (instead of prefix, as in Pastry) as the metric measuring closeness between a key and a node ID. A key lookup request is likewise forwarded to the node whose ID is numerically closest to the key. The main difference between them is how they deal with replication. Objects in PAST (an application built on top of Pastry) are replicated without control by the owner: upon "publication" of an object, replicas are placed on several nodes whose nodeIds are closest in the namespace to the object's objectId. While Tapestry places references to the object's location on hops between the server and the root, Pastry assumes that clients use the objectId to route directly to the vicinity where replicas of the object are kept. Another difference concerns simplicity: Pastry appears to be less complex than Tapestry. Tapestry uses "soft state" as well as explicit re-publishing to make itself fault-tolerant. Each node in a Tapestry network stores some secondary neighbors to which it can forward requests when the link to the primary neighbor fails. To avoid a single point of failure, the root node is replicated, with each replica differentiated by a "salt" value concatenated with the root's node ID. Tapestry uses explicit republishing to propagate node failure and mobility information faster than the soft-state scheme alone would; thus Tapestry performs better than Pastry with respect to fault tolerance. Both Tapestry and Pastry exploit routing locality, although they do so in different ways.

Chord: Chord provides support for just one operation: given a key, it maps the key onto a node. Chord has three features that distinguish it from many other peer-to-peer lookup protocols, namely simplicity, provable correctness and provable performance (unlike Pastry and Tapestry). In Chord, keys and node IDs share a namespace, which is imagined as a circle. Each key is stored on the node whose ID is the closest successor of the key. A Chord node stores O(log N) fingers, the ith of which points to the first node at least 2^(i-1) away from the local node in identifier space. Not all of them are required for correct routing, however: one entry per node is enough to guarantee correct (though very slow) routing of queries, and the additional entries serve only to improve routing efficiency. Chord is self-adaptive: it uses the "stabilization" protocol to keep nodes' successor pointers up to date, thus guaranteeing the correctness of lookups. Stabilization guarantees lookup success in most situations; occasional failures still occur and are left to the higher layer to deal with, which can either retry the lookup or cancel it. Chord uses a list of successors to cope with node failure: if the first successor fails, a node refers to the second successor on the list to forward queries. This scheme is very similar to Tapestry's secondary-neighbor scheme. Unlike Tapestry and Pastry, Chord makes no effort to achieve good network locality.

Kademlia: The routing algorithm of Kademlia is similar to the above systems, except that it uses the XOR of the node ID and the key as the metric measuring closeness between the key and the node ID.
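Two properties of this metric are worth making concrete: it is symmetric, so a node learns useful routing information from every query it receives, and it is unidirectional, so all lookups for a key converge along the same path. A tiny demonstration with 4-bit toy IDs:

    a, b = 0b1011, 0b0110

    # Symmetric: d(a, b) == d(b, a), unlike Chord's clockwise distance.
    assert (a ^ b) == (b ^ a)

    # Unidirectional: for a given origin a and distance d there is
    # exactly one point b with a ^ b == d, so lookups for the same key
    # follow the same path and cached results along it can be reused.
    d = a ^ b
    assert a ^ d == b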
Each Kademlia node maintains O(log N) k-buckets. Each k-bucket is managed in a way similar to an LRU list and stores k nodes that the local node has seen recently and that are still alive; only nodes that were last seen long ago and are currently dead are removed from a bucket. This provides resistance to some simple DoS attacks: an attacker cannot flush nodes' routing state by flooding the system with new nodes, because Kademlia nodes insert new contacts into their k-buckets only when old ones die. When looking up a key, a Kademlia node sends n concurrent, asynchronous queries to n nodes chosen from the k nodes closest to the key. This is unique among these four systems: the concurrent queries allow the freedom to select the lowest-latency next hops and support recovery from failures without added delay. Finally, all of these systems seem vulnerable to tampering, since none of them spends explicit effort on tamper-resistance.
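To close, a sketch of the k-bucket update policy described above (the bucket size is the paper's k = 20, but the liveness check is a placeholder; a real node pings the least-recently-seen contact before deciding):

    from collections import OrderedDict

    K = 20  # bucket capacity (k in the paper)

    def saw_node(bucket: OrderedDict, node_id, ping):
        # Buckets are kept in least-recently-seen-first order. A contact
        # we just heard from moves to the tail; a new contact is admitted
        # only if there is room or the oldest entry fails a ping.
        if node_id in bucket:
            bucket.move_to_end(node_id)
        elif len(bucket) < K:
            bucket[node_id] = True
        else:
            oldest = next(iter(bucket))
            if ping(oldest):
                # Old contact is alive: keep it and drop the newcomer.
                # Favoring long-lived nodes is what resists flooding the
                # table with freshly generated identifiers.
                bucket.move_to_end(oldest)
            else:
                del bucket[oldest]
                bucket[node_id] = True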