From xthemage@mac.com Mon Mar 6 17:13:30 2006
From: Oliver Kennedy <xthemage@mac.com>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Mon, 6 Mar 2006 17:13:31 -0500

We've already seen how to create distributed hash tables, where nodes in the system can store and retrieve small bits of data in a distributed manner. Two major drawbacks of these systems were that their load-balancing assumptions began to break down in the face of varying data sizes, and that persistence was guaranteed only by the data's source continually re-publishing the data; i.e., if a node failed, all data stored on that node would be lost. As data objects were usually small, this re-publishing was simple. This week's papers explore how to expand the DHT concept to store whole files.

First, we have PAST. PAST is based on the popular Pastry DHT framework and takes a very carefree approach to storage: the entire file is stored as if it were the value being stored in the DHT. The k nodes nearest to the host node are used for replication. As nodeId locality is uncorrelated with node spatial locality, data loss becomes unlikely (as long as k is sufficiently large). However, since the whole file must be stored at this node (and its k nearest neighbors), a region of the node space may be clobbered in the unlucky case that several large files are stored with overlapping replication regions. PAST allows a limited amount of indirection in this replica storage; if a node is unable to store the data, it will try to forward it to one of its nearby nodes. Even here, however, the overlapping-regions case will cause difficulty.

CFS takes a more general approach, modeling itself on a more traditional filesystem. Files are broken up into blocks, and these blocks are stored in the distributed store. A block is identified by a hash of its contents, much like in a previous project by a related group called FFS. Inodes store these hashes as pointers to the blocks and are in turn treated as normal blocks; directories store inode hashes as pointers. This chain continues all the way up to the root block, which is signed by a PKI-style key (whose hash is the root block's key). As there is a direct correspondence between a block's contents and its key, the filesystem's consistency may be maintained. Every PKI entity using the system may keep its own filesystem in this way, and only the owner entity may change its contents.
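A minimal sketch in Python of the content-hash naming just described (the helper names and the two-chunk example are mine, not CFS's actual API): data and inode blocks are keyed by the SHA-1 hash of their contents, while the root block is located via the hash of the owner's public key and signed with the matching private key.

    import hashlib

    def block_key(data: bytes) -> str:
        # Data blocks and inode blocks are named by the SHA-1 hash of their contents.
        return hashlib.sha1(data).hexdigest()

    def root_key(public_key: bytes) -> str:
        # The root block is located via the hash of the owner's public key;
        # its contents are signed with the corresponding private key.
        return hashlib.sha1(public_key).hexdigest()

    # Hypothetical example: an inode is itself a block listing the keys of its data blocks.
    chunks = [b"first 8 KB of the file", b"second 8 KB of the file"]
    inode = "\n".join(block_key(c) for c in chunks).encode()
    print(block_key(inode))  # the key under which the inode block would be stored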
This changing process is one of the most serious weaknesses of CFS, as even the smallest change requires updates to every block representing a directory, all the way from the root block down to the file itself. Similarly, file reads require an extraordinary amount of system traffic. The authors use an 8 megabyte file with an 8 KB block size as an example; such a file would take on the order of 1,000 separate lookups (8 MB / 8 KB = 1,024 blocks) to obtain the file contents, as opposed to just one in PAST.

One thing both papers gloss over is the security of the underlying system. Both Chord and Pastry are subject to a large number of attacks. PAST attempts to avoid this issue via an implicit trust in nodes that have access to a smart card. While this does remove an attacker's ability to supply false data, it does not close off the many possible denial-of-service attacks. Furthermore, it somewhat defeats the distributed nature of the system: if only authority-signed data is implicitly trusted, then why bother re-inventing the wheel? Lastly, PAST's smart card is supposed to keep track of quotas. The way it does this is somewhat mysterious, as it cannot be assured that space revocation requests (when a node informs the card that it no longer needs space it has been allocated) will be transmitted to the network. Timeouts may be used, but they are messy in a number of ways. CFS ensures file validity by using a file tree and ensuring that the root block is always signed by its owner. Assuming the public key of an entity may be trusted, everything in the filesystem reachable from a root signed by that key should also be trusted. The potential for denial-of-service attacks, both in the realm of false negative responses and in the realm of overwhelming the network with data or requests, is not addressed either.

- Oliver Kennedy

There cannot be a crisis next week. My schedule is already full. -- Henry Kissinger
Computer science isn't about computers any more than astronomy is about telescopes. -- Dijkstra

From pjk25@cornell.edu Mon Mar 6 19:34:42 2006
From: Philip Kuryloski <pjk25@cornell.edu>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Mon, 6 Mar 2006 19:37:39 -0500

SAMSARA

A challenge of p2p network design is to ensure that users in the system contribute to its welfare appropriately, based on the benefit they receive, so that freeriding is curbed. Experience in the Gnutella and Napster networks has shown that users will often freeride given the opportunity. The basic principle behind Samsara is the use of incompressible storage claims, essentially manufactured bits of data used in an "I'll store your data if you'll store mine" scheme.
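A minimal sketch in Python of how such a claim might be manufactured, following the construction described below (SHA-1 over a private passphrase and an on-disk location); the function name and the counter loop are my own illustrative details, not taken from the paper.

    import hashlib

    def make_claim(passphrase: bytes, disk_location: int, size: int) -> bytes:
        # Incompressible filler that only its creator can regenerate cheaply:
        # repeated SHA-1 output seeded by a private passphrase and an on-disk location.
        out = b""
        counter = 0
        while len(out) < size:
            out += hashlib.sha1(passphrase
                                + disk_location.to_bytes(8, "big")
                                + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:size]

    # Node A asks B to store 1 KB of data, so A sets aside a 1 KB claim for B in return.
    claim_for_b = make_claim(b"a-private-passphrase", 4096, 1024)
    print(len(claim_for_b))  # 1024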
Enhancements of the scheme allow forwarding of claims to balance storage requirements across the network. However, a node remains responsible for claims it forwards, so forwarding is used as a last resort. The authors implement the system on top of a Pastry network for testing purposes. Claims are generated using the SHA-1 hash of a passphrase known only to the host node, concatenated with an on-disk location. Nodes periodically query the nodes for which they are storing data for their claims; if a response is invalid, the querying node is free to drop the stored data. A probabilistic punishment scheme is used to forgive nodes their transient failures. A 5,000-node network is simulated, showing good performance from the system. However, there are some limitations. The first is that without claim forwarding, the global storage requirement for the system is doubled. Also, the scheme is only really appropriate for a distributed file storage/replication scheme, i.e. pushing data; an adaptation for pull scenarios may be possible, but is not immediately evident.

PAST

PAST is a p2p distributed filesystem using Pastry routing for the underlying overlay network. As a result, files are immutable and are retrieved via a fileId which is generated upon insertion into the network. PAST provides the insert, lookup, and reclaim primitives, leveraging the underlying Pastry network. When a file is inserted into the network, the desired number of replicas is specified, and the file is replicated at the nodes with nodeIds closest to the fileId in the Pastry identifier space. This is a lower bound on the number of replicas of that file maintained by the system; additional cached copies may be automatically created and destroyed by the load-balancing mechanism. When objects are inserted into PAST, a file certificate containing file metadata is created and signed with the client's private key. Nodes that store the file also generate signed receipts which they return to the inserting client. Storage quotas are enforced using smartcards with associated private/public key pairs, which are signed by the smartcard issuer.

Load balancing is addressed in PAST by bending the rules of file and replica placement. Replicas are moved to other members of a Pastry leaf set or forwarded to other nodes using pointers, and files are moved to other nodes by changing the salt used in creating fileIds. Nodes accept forwarded objects if the object does not occupy a significantly large portion of the node's remaining free space. Reed-Solomon encoding of data is suggested to decrease storage requirements. Nodes in PAST use their unused disk space for file caching; the GreedyDual-Size policy is used to determine which objects should remain cached. Simulation shows the system to be effective at high storage utilizations of 95% and above. The system benefits from a large leaf set degree, as there is more opportunity for local load balancing.

PAST has a number of limitations, one of which is the requirement of smartcards issued to each node, to some extent introducing a central authority into the PAST scheme. Another is that if the quantity of available network storage decreases, the system will fail to maintain the requested number of copies per file. This is of course unavoidable; however, no mechanism is given to achieve it gracefully (e.g. if total storage decreases by 20%, replicating each file 20% less).
It also un-structures the underlying Pastry network in the name of load balancing, making the network more susceptible to Sybil and Eclipse attacks. There is also no mechanism to ensure that nodes actually maintain the file replicas they claim to store.

CFS

The Cooperative File System (CFS) is similar to PAST in the service it provides: a p2p distributed filesystem for immutable files. It differs from PAST primarily in that CFS distributes blocks across nodes in the network, whereas PAST distributes whole files. The creators of CFS also assume that storage space is abundant and do not expect the system to run at nearly 100% of the global storage capacity. CFS makes use of DHash and Chord as underlying components. The root block of a file system is identified by a public key and signed by the corresponding private key; other blocks are identified by hashes of their contents. This allows a client to verify the integrity of a retrieved file. All files in the system have a finite lifetime after which they are automatically discarded. The publisher of a file can, however, ask for lifetime extensions. CFS, like PAST, uses disk quotas to prevent nodes from using an unfair amount of resources. As in PAST, replicas are stored on the k successors of a Chord node, and caching of blocks occurs on the paths of incoming file requests. Virtual servers are used on machines with very large resources to improve load balancing. Simulation of the system showed that when multiple blocks are requested in parallel, and proximity-based server selection is used, file transfer speeds are on par with FTP transfers between nodes that are part of the CFS network.

Unfortunately, the authors of CFS provide little information with regard to the appropriate block size for a deployed system. 8 KB is used for their experiments; however, the number of nodes in their testbed was small, so it is unclear whether this block size is generally applicable. CFS's block-level storage results in much more natural load balancing than PAST's. However, given the resulting increase in overhead when retrieving a file, it is difficult to compare this overhead with that generated by the additional load-balancing measures implemented in PAST. The block-level granularity does, however, make it more difficult for an attacker to destroy a whole file.

From sh366@cornell.edu Mon Mar 6 23:28:14 2006
From: Huang Shiang-Jia <sh366@cornell.edu>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Mon, 6 Mar 2006 23:28:12 -0500 (EST)

* Load balance can't be guaranteed in PAST.
Although a caching scheme is proposed to balance the query load for popular files, (1) it is infeasible to cache large files, and (2) cached copies are evicted to make room for replicas as storage utilization increases, so the cache hit ratio drops.

* The overhead of data migration is quite large if nodes join and leave the system frequently (that is, if the k numerically closest nodes change frequently). According to the migration scheme given in the 'Maintaining replicas' section, when a node joins and becomes one of the k numerically closest nodes, a pointer to it is installed and the file is then transferred to it in the background. Imagine that it soon leaves the system: the file has to be transferred to yet another node. If nodes join and leave frequently, the cost of transferring files among nodes is high.

* While allowing parallel block access to improve performance, CFS requires a lot of messages to locate a file, especially when the file is very large. The block size granularity should be designed to fit typical file sizes.

* Server selection (concerning latency) is supported in CFS by extending the find_predecessor() function of the Chord protocol. As the latencies are measured at the predecessor, one may doubt whether they exactly reflect the latencies from the client to those nodes.

* PAST is a peer-to-peer storage utility layered on top of Pastry. This paper presents and evaluates (1) storage management and (2) caching in PAST.

* Three operations are provided by PAST: Insert, Lookup, and Reclaim. Each file, along with its file certificate, is stored at k nodes; this is confirmed by the store receipts returned by these nodes, and the required storage is debited against the file owner's quota. When a lookup request is issued, it is routed until the closest node storing that file responds. The file owner can also issue a reclaim certificate, and the storage is credited back against the quota upon receipt of the reclaim receipts.

* The primary goal of storage management is to allow high global storage utilization and graceful degradation as the system approaches its maximal utilization. Generally, a file is stored at the k numerically closest nodes, but load imbalance may occur due to the differing sizes of files and the differing storage capacities of nodes.

* 'Replica diversion' is therefore proposed to balance the remaining free storage space among nodes in the system: if one of the k nodes can't accommodate the file, it chooses a node from its leaf set to store the file instead and keeps a pointer to that node. Replica diversion is also necessary to maintain the invariant that each file is stored at k nodes. Whether to accept a replica or divert it depends on policies based on the metric SD/FN, where SD is the size of the file and FN is the free storage space of the node.

* File diversion: if the root node for the file can't accommodate it, a negative acknowledgement is returned to the client. The client then generates another fileId (with another salt). If the same situation occurs three times, the owner should consider splitting the file into smaller pieces.

* The primary goal of caching is to minimize client access latencies, maximize query throughput, and balance query load in the system. Nodes may use the unused portion of their storage space to cache files. Cached copies may be discarded when necessary (to store a new replica, for example). The GreedyDual-Size policy is used here for cache replacement.

* CFS is a peer-to-peer read-only block-oriented storage system.
It is structured as a collection of servers that provide block-level storage, and file system semantics can be layered on top of this block store.

* CFS consists of two layers: the Chord layer maintains the routing tables used to locate blocks, and the DHash layer is responsible for storing keyed blocks and maintaining their replication and caching.

* CFS stores data for an agreed-upon interval; servers may discard data whose guaranteed period has expired. This prevents, to some degree, the malicious behavior of continually inserting data, since the data is automatically deleted when attackers stop refreshing it.

* DHash places a block's replicas at the k servers immediately following the block's successor. Blocks are cached along the lookup path and replaced in least-recently-used order.

* Multiple virtual servers may be run on a physical server, for load balancing, in proportion to the server's storage and network bandwidth. Virtual servers can take short-cuts through each other's routing tables to compensate for the increased number of lookup hops (resulting from the increased number of servers).

* The experimental results show that the pre-fetch scheme improves performance, depending on the pre-fetch window size, in different cases. Deciding the right window size is left as future research. In addition, a keyword search engine is under development for CFS.

From niranjan.sivakumar@gmail.com Mon Mar 6 23:58:13 2006
From: Niranjan Sivakumar <niranjan.sivakumar@gmail.com>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Mon, 6 Mar 2006 23:58:11 -0500

Niranjan Sivakumar

Storage Management and Caching in PAST, a Large-Scale Persistent Peer-to-Peer Storage Utility
Wide-Area Cooperative Storage with CFS

PAST is an implementation of a peer-to-peer file storage system that runs on top of Pastry. PAST offers relatively simple functionality, with three basic operations: insert, lookup, and reclaim. PAST allows users to request a certain amount of replication in the system, and will replicate the object on the nodes whose nodeIDs are closest to the 128 most significant bits of the fileID. The system relies on the natural variation expected among nodes that are adjacent to one another in Pastry to ensure some robustness in its replication.
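A minimal sketch in Python of this placement rule (the helper names are mine, and plain numeric distance stands in for Pastry's circular identifier space): the fileID is a SHA-1 hash over the file name, the owner's public key, and a random salt, and replicas go to the k nodes with numerically closest nodeIDs.

    import hashlib, os

    def file_id(name: str, owner_pubkey: bytes, salt: bytes) -> int:
        # fileID = SHA-1 over the file name, the owner's public key, and a random salt.
        return int.from_bytes(hashlib.sha1(name.encode() + owner_pubkey + salt).digest(), "big")

    def replica_set(fid: int, node_ids: list, k: int) -> list:
        # The k nodes whose nodeIDs are numerically closest to the fileID hold the replicas.
        # (Simplified: absolute distance, ignoring wraparound on the real 2^128 ring.)
        return sorted(node_ids, key=lambda n: abs(n - fid))[:k]

    # Toy example with random 160-bit IDs and k = 4 replicas.
    nodes = [int.from_bytes(os.urandom(20), "big") for _ in range(50)]
    fid = file_id("paper12.pdf", b"owner-public-key", os.urandom(20))
    print(replica_set(fid, nodes, 4))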
The system also allows for some further, informal caching (not guaranteed) to further alleviate load imbalances. The authors present a basic security mechanism based on public-key cryptography facilitated by smartcards that are signed by a trusted authority. There is also a replica diversion mechanism that lets burdened nodes pass requests on to nodes in their leaf sets.

CFS is another file storage system running on a distributed hash table, in this case DHash, which is layered over Chord. The system is similar to PAST in many ways, but as the authors point out, the main difference between the two systems is that PAST treats files as single objects whereas CFS splits files up into blocks, somewhat like a conventional file system. CFS has a replication scheme similar to PAST's, placing block replicas on some specified number of servers in the locality where the block's ID falls. CFS uses DHash's caching system to provide further replication by caching blocks at nodes that are traversed en route to the desired data. Also, CFS refers to the security system outlined in PAST as something that could be leveraged to implement a quota system.

While both of these implementations are somewhat novel approaches to the application of p2p networks, neither seems to provide a compelling reason to use the system. Both of the papers seem to indicate that the systems could be useful if future research into adding more features to their respective DHT layers is fruitful, but as it stands, they are quite simple. The security system that is proposed is centralized, and as seen with other systems, this seems to be somewhat against the distributed nature of the systems themselves. The PAST paper appears to almost discount the possibility of malicious nodes causing routing failures that would result in denial of service. In CFS, the idea of breaking files up into blocks is interesting, but it seems that finding the correct block size could be challenging, and the cost of splitting a file into many pieces and performing many lookups on a Chord-based network is not negligible.

From asr32@cornell.edu Tue Mar 7 00:30:28 2006
From: Ari Rabkin <asr32@cornell.edu>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Tue, 07 Mar 2006 00:28:24 -0500

Samsara:

Samsara is a system for fair disk space exchange. The basic idea is that machine A is willing to store files for machine B, provided that machine B stores a "claim" on behalf of A. This claim can later be replaced by real data, so that B is storing files on behalf of A, in effect.
If users are unavailable, their stored data is removed, on a probabilistic basis designed to differentiate short outages from permanent cheating. The system is aimed at stopping greedy users, not malicious ones. This is not necessarily a weakness, depending on the target user base.

This scheme, to the extent that it works, works only for storage (persistent or RAM). It has a number of shortcomings. Much of the time, one might be willing to store more data on behalf of others than one asks them to store in return--some users are generous, particularly if the protocol is being used on a corporate or academic intranet. Samsara does nothing to support such flexible policies.

PAST:

PAST is a system for storing immutable files on top of a peer-to-peer substrate such as Pastry. Each file is replicated at the k nodes with IDs nearest the ID of the file. The file's ID is based on a hash of its name, the hash of its inserter's public key, and a salt. In the event that these nodes (the "replica set") are unable to store the file, they first attempt to find a nearby node that can, and if that fails, alert the file owner, who will pick a different part of node-ID space. As a result, the system is able to robustly store files even in conditions of high utilization.

PAST stores immutable files, and therefore if a file is altered, it must be deleted and then reinserted. If a file is deleted and a replica didn't get the reclaim request, there is no provision for it to find out later that the file can be removed. As a result, the PAST system may become clogged with files that should have been garbage-collected, with corresponding harm to users' quotas. A similar problem arises with nodes that leave the system permanently; there is no way for nodes with replicas to learn that they can remove those files. PAST relies on trusted hardware and software; if users have broken clients, the system's guarantees do not apply.

CFS:

CFS takes a quite different approach from PAST. Layered over Chord, CFS allows nodes to insert content into the network. CFS uses a filesystem paradigm, where data consists of fixed-size blocks, some of which are metadata specifying which blocks correspond to which files. This data is visible to other nodes as a read-only filesystem. Data is deleted if not refreshed periodically, preventing storage from being consumed over time.

CFS is a clean and elegant system, and the concept of building a block-structured filesystem atop a DHT is a good one. However, removing data blocks only through timeouts seems like a weakness: if the timeout period is too small, the data's owner will be constantly sending out refresh requests; if the timeout is too large, then old data will pile up. CFS inherits all the weaknesses of the underlying DHT; in particular, malicious nodes can drop queries or data.

Ari Rabkin asr32@cornell.edu
Risley Hall 454 3-2842
The resources of civilization are not yet exhausted.
--William Gladstone

From lackhand@gmail.com Tue Mar 7 00:58:13 2006
From: Andrew Cunningham <lackhand@gmail.com>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Tue, 7 Mar 2006 00:58:12 -0500

Andrew Cunningham
arc39

_Storage_Management_and_Caching_in_PAST,_a_Large-Scale,_Persistent_Peer-to-Peer_Storage_Utility_
Antony Rowstron, Peter Druschel

PAST, a large-scale peer-to-peer system, allows persistent storage on a self-organizing, Internet-based overlay network of storage nodes cooperatively routing file queries, storing multiple replicas of files, and caching additional copies of popular files. The paper relies on Pastry for many of its qualities, and thus concentrates mostly on attempting to even the workload across the network by diverting replicas and files to locations of the address space removed from the standard key of the message. The trick used is that files may be stored, rather than at the location pointed to by the hash of the file name, at any node in the leaf set of the responsible node; to permit widening of horizons, the 'hack' of allowing the two edges of the leaf set to extend the query was permitted. While steps are taken to avoid malicious users, the strict limit imposed on who may host replicas seems out of place -- since we only allow two levels of indirection, and fix the size of the leaf set, this will not scale well. However, it places a strict limit on the number of additional hops that a message query will require, and is necessary for various other reasons. Moreover, the statement that if storage is unavailable close by, it is globally unavailable is valid under the security assumptions that this paper makes -- i.e., protocols designed to avoid simple byzantine failure -- but nevertheless easy to defeat by conspiring foes, who simply refuse to share "enough" resources.
While IP addresses can be "defended" so that it is difficult to partition a legitimate user from the network, it remains the case that local free space need not necessarily be indicative of global free space, simply that it is similar with high probability and inefficient to make any stronger discovery.

_Wide-Area_Cooperative_Storage_With_CFS_
Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica

Providing a similar, though more limited, service than PAST, CFS is built on top of Chord to provide a distributed hash table mapping keys to blocks. This allows a decentralized system whose clients may view the values as files. Due to the use of cryptographic techniques to ensure the legitimacy of data, it is impossible for clients to alter data, though updates may be performed with access to the secret key used to insert the data. The system is bizarrely POSIX compliant, in that it simply stores directory nodes which point to inodes which point to blocks, but rather than referring to disk areas, they refer to locations in the underlying peer-to-peer network. Caches and copies of these data are replicated throughout the network, and the aforementioned cryptography allows the data to be verified. A difference between standard hard disks and CFS is that CFS allows blocks of data to time out, and thus be removed from hosting. They simultaneously strengthen and weaken Chord, by permitting optimization of finger tables, which permits tunneling attacks, but also specifying secure nodeID generation, which tends to slow down many attacks, making them less profitable. Perhaps most importantly, however, this system does not assume that storage space will be a scarce resource, and thus punts the problem of distributing the data (to the previous paper), assuming that it is not a problem if overworked nodes simply request that someone else do their work. This is perhaps the biggest problem, and one which PAST addresses; in many ways CFS is not a file system, despite how easily it can be made to serve as one.
From ryanp@cs.cornell.edu Tue Mar 7 05:27:24 2006
From: "Ryan S. Peterson" <ryanp@cs.cornell.edu>
To: Gun Sirer
Subject: PAPER 12
Date: Tue, 7 Mar 2006 05:27:22 -0500

The first paper, by Rowstron and Druschel, presents PAST, a peer-to-peer storage management system. PAST provides users with an interface that enables them to insert, remove, and look up files in a persistent distributed file system. The system is built on top of Pastry, which handles all the routing between nodes. The nodes are assumed to be normal desktop machines, probably owned by clients of the system. Consequently, nodes cannot be trusted in general, since their owners could potentially remove or corrupt files that they hold; the authors brush this issue aside. The main contribution of PAST is its replica management. Users specify a parameter k when inserting an object, which is the number of machines the file should be replicated on. After Pastry routes the file to the appropriate home node, the node contacts k - 1 neighbors from its leaf set and sends the file to them for backup storage.
As part of the replication protocol, if a node does not have enough free space to store a file, it selects a node in its leaf set, or possibly a node in a different part of the Pastry ring, to store the file. The original node then stores a pointer to the file on the other machine. As an added feature, nodes cache files seen during lookups if they have sufficient free space.

The second paper, by Dabek et al., introduces another wide-area storage system, CFS. CFS, built on top of Chord, focuses primarily on robustness, providing experimental results that show that files are preserved even when as many as half of the nodes fail. Unlike PAST, CFS is read-only from the perspective of ordinary users, meaning they cannot insert objects. Instead, it follows something more like a publish-subscribe paradigm, whereby content providers insert objects and clients download data from the system. Like PAST, much of CFS's complexity is in the load-balancing techniques it employs.

Ryan

From nsg7@cornell.edu Tue Mar 7 10:20:21 2006
From: "Nicholas S Gerner" <nsg7@cornell.edu>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Tue, 7 Mar 2006 10:20:19 -0500 (EST)

CFS provides a filesystem-like abstraction built upon a Chord ring. Specifically, CFS provides insert and lookup mechanisms to support block-level storage and retrieval of files distributed over nodes on the Chord ring. Files must be fragmented into small blocks, which are identified by the hash of their contents; the root block is signed with the owner's private key. The block-level storage seeks to distribute both storage and query load by ensuring that all storage units are of approximately equal size and by distributing logically related blocks uniformly throughout the ring to support parallel, distributed retrieval. The fact that blocks are identified by the hash of their content ensures that a named block cannot be replaced by faulty data. Deletes are not explicitly supported; instead, CFS guarantees storage for blocks only for a fixed period of time. After this period expires, nodes are free to discard the associated block; to ensure persistence, owners must request an extension of the storage. Additionally, CFS implements passive LRU caching along query paths. This caching is distinguished from the guaranteed replication of each block on the k nodes succeeding the original storage location (the node with nodeid closest to the blockid in question).
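A minimal sketch in Python of that placement rule on a simplified Chord ring (the helper names and toy IDs are mine, not the paper's): a block lives at the successor of its content hash, the next k-1 successors hold replicas, and nodes along the lookup path may cache it as a side effect.

    import hashlib

    def block_id(data: bytes) -> int:
        # A CFS block is named by the SHA-1 hash of its contents.
        return int.from_bytes(hashlib.sha1(data).digest(), "big")

    def successor_list(bid: int, node_ids: list, k: int) -> list:
        # Simplified Chord successor list: the first k nodes at or after the block ID,
        # wrapping around the identifier circle.
        ring = sorted(node_ids)
        start = next((i for i, n in enumerate(ring) if n >= bid), 0)
        return [ring[(start + i) % len(ring)] for i in range(k)]

    # Toy ring (IDs far smaller than real 160-bit IDs, hence the modulo below).
    nodes = list(range(0, 2**20, 9973))
    print(successor_list(block_id(b"an 8 KB block") % 2**20, nodes, 3))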
CFS also supports IP-identified quotas, where each node limits the amount of local storage that can be consumed by a given IP address to a fraction of some estimate of the total storage in the network.

PAST also provides a filesystem-like abstraction, built upon a Pastry ring, and also provides insert/lookup for storage units. However, PAST places no restrictions on the sizes of storage units (instead, this choice could be layered on top of PAST to achieve the probabilistic load-balancing properties which CFS hopes to achieve). A reclaim operation allows storing nodes to remove the associated file replicas (subject to verification of the reclaim request through cryptographic signatures), but does not guarantee that such replicas are no longer available. Also, PAST storage units are identified not by a hash of their content, but by a hash of a potentially unrelated filename, a salt, and the owner's public key. A content hash is additionally included and signed with the owner's private key to ensure a malicious node can't forge a copy with alternate content. PAST guarantees that k replicas are persistently stored in the network (at the k nodes with nodeids closest to the fileid), constrained by available system space. In the face of varied node capacities and file sizes, PAST introduces replica diversion and file diversion. Replica diversion allows one of the k nodes which should store a replica, A, to store instead a pointer to an alternate storage location at one of the nodes in A's leaf set. If replica diversion is not possible (because the chosen leaf node also cannot store the file), file diversion suggests that the file owner create a new fileid (by choosing a new salt value) so that the file may be stored at an entirely different location in the node ring. These two methods provide some measure of load balance (empirically verified). PAST also maintains the k replicas where possible (constrained by system-wide storage capacity) in the face of node additions. When a new node joins, it must logically take responsibility for the replicas for which it is now one of the k closest nodes; this is done by first having the new node store pointers to all relevant copies and lazily copying the necessary data, similar to replica diversion. Node failures are handled as part of the Pastry node-failure process.

Both these systems provide filesystem-like abstractions and replication without providing all the services provided by a filesystem. It seems as if there may be fundamental properties of some of these services (such as delete) which cannot be expressed in such p2p systems; however, the characterization of these properties is not explored in either paper. Additionally, some important and simple features of a distributed file system are provided in one system but not explored in the other (such as parallel retrieval), and the implications of such features are not explored in depth. Both systems also feature passive caching and vague arguments that such caching should improve common access behaviors (such as system-wide or local non-uniform access distributions), but such arguments are not formalized and no model of such behaviors is provided.

Samsara doesn't explicitly provide a filesystem abstraction, but supports symmetric storage exchange through claims, which are incompressible sequences of data created and validated by a storage node and stored by a storing node. Periodically, a storage node B storing data for node A can validate that A still stores a claim for B.
This claim may later be replaced by actual data from B, as long as B still stores data for A. Samsara extends this notion to dependency chains where three or more nodes store data in a chain (A stores data on B, who stores data on C), and there is only one claim (storage overhead) per chain, at the root of the chain (so A stores C's claim for B). If a claim fails a validation query, the node performing the query probabilistically ejects a storage block belonging to the node failing the query: if A fails a validation query on B's claim, B might respond by ejecting A's storage block. While a simple system, there are several weaknesses. First, if chains are avoided, the storage overhead for the system equals the amount of data stored in it. Second, dependency chains are only as strong as their weakest link, and a malicious or failed node in a dependency chain can trigger a cascading failure along the chain (in one direction). While the paper points out that dependency cycles are much more resilient to failures, in fact cycles are only resilient to one failure, after which they become a dependency chain.

From gp72@cornell.edu Tue Mar 7 11:00:43 2006
From: Gopal Parameswaran <gp72@cornell.edu>
To: egs+summary@cs.cornell.edu
Subject: PAPER 12
Date: Tue, 7 Mar 2006 11:00:40 -0500 (EST)

PAST: This paper presents a method for storage management and caching of files, using Pastry for client request routing and for adding and deleting nodes. Its primary functionality is to ensure that files are replicated k times in the network, with a focus on caching of popular files. Since it relies on the underlying Pastry, queries take on the order of log N hops. The authors claim that their system provides high availability, scalability, and security, though most of this is based on Pastry's implementation. PAST also uses hashes for file identification and retrieval. PAST supports three primary operations: Insert, which adds a new file; Lookup; and Reclaim, which frees up the resources allocated to a stored file. When an insert request is made, the SHA-1 hash of the file's name, the client's public key, and a random salt is computed, and the file is routed to the nearest node as per Pastry and then forwarded onward until k nodes have a copy of the file. PAST also tracks a client storage quota, debiting the storage used for all k copies against the client's quota.
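As a concrete worked example of this accounting (the numbers are hypothetical, not from the paper): inserting a 2 MB file with k = 5 replicas debits 5 x 2 MB = 10 MB against the owner's quota, and a successful Reclaim of that file later credits the same 10 MB back.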
The client receives an acknowledgement, called a store receipt, from each of the k nodes that hold a copy. For a lookup, the client sends the SHA-1 hash for the file, and Pastry performs the lookup and returns a copy of the file, with a high probability of returning a copy near the client due to the locality properties of the Pastry routing scheme. Deleting a file, or reclaiming its resources, follows the same process as adding, but instead frees up the resources allocated to the copies of the file and returns a reclaim receipt.

PAST introduces the concept of smartcards to identify users and secure the peer-to-peer network. This identification of users makes the concept of a paid peer-to-peer service more realizable, especially for storage, where clients are charged for the service of k-way fault-tolerant storage. PAST distributes storage among its nodes based on a storage capacity that is declared when a node joins the network and is used as an admission criterion. However, PAST does not focus on an important issue in file storage and retrieval: the latency of the system. For a file system, storage and retrieval are common tasks, and low-latency requirements could be critical. The paper does not distinguish between performance tuning in terms of optimizing storage and allocating storage capacity based on connection speeds. This plays an important role, since a bad client might offer a large storage capacity over a slow network and thus slow down all requests that come to it.

CFS: In this paper the authors present a peer-to-peer storage network that uses Chord as its underlying lookup system and tries to handle robustness, load balancing, and effective distribution of files. CFS is composed of two primary layers, viz. the DHash layer and the underlying Chord layer. The DHash layer performs block fetches and distributes a file's blocks into the network, using the Chord layer as a distributed lookup system. CFS differs from the PAST implementation above with regard to the distribution of files in the network. PAST stores the contents of a file at one location, and hence a large file may not fit on the associated nodes even though the nodes collectively could store it; instead, the file is offloaded to servers that have sufficient free space. CFS, on the other hand, breaks a file up into blocks and stores the data as blocks, and is thus more granular in its approach. Files are therefore no longer restricted by the available storage capacity at a particular node, but only by the collective capacity of the network. Also, since files are broken down into smaller pieces, the reliability of CFS should be higher than PAST's: if two whole-file copies are corrupted, or the nodes containing them fail, in different places, PAST cannot use either copy and both become useless, whereas in CFS only the blocks corresponding to the faulty sections are discarded and the rest can still be used. CFS also differs from PAST in not supporting a delete operation, instead relying on periodic refreshes by the client to maintain the data. However, this approach means that if a client wants to reuse the storage space allocated to it, it has to wait until the data's timeout period expires.
This particular scheme also has another serious drawback, in that if a client goes down due to some network issues and cannot come back up in time

From kash@cs.cornell.edu Tue Mar 7 11:07:24 2006
From: Ian Kash <kash@cs.cornell.edu>
To: Emin Gun Sirer
Subject: PAPER 12
Date: Tue, 7 Mar 2006 11:07:22 -0500

Tried sending this to egs+summary@cs.cornell.edu, but it was bouncing.

 - Ian

PAST is a storage system built on top of Pastry. To ensure availability of files, it stores k copies of each at the k nodes nearest to the file's location in the ring. It uses two mechanisms to store copies when one of those k nodes is unable to store one. Replica diversion allows a node to choose another node in its leaf set to store the copy and just keep a pointer to that node (this mechanism is also leveraged to minimize copying on node joins). File diversion uses a new salt to select a new location in the ring to store the file, in the event the original region lacks room to store the file using replica diversion. PAST also uses space not currently allocated to file storage to cache files for faster lookups. One concern is that neither form of diversion seems to work well for larger files, because of the likelihood that many or most clients would be unwilling to store the entire file and would have to pay the entire cost if the file were popular. While this could be handled by a higher-level application, it seems like something PAST should handle.

CFS is a storage system built on top of Chord. While PAST stores the entire file, CFS breaks the file into blocks, which helps with large files and load balancing. A possible downside of this approach is that if any block is unavailable, the entire file is effectively unavailable. This means that even if the probability of a single block being unavailable is quite small, the probability of a large file being unavailable may be quite large. Like PAST, they reject Reed-Solomon or other encodings to achieve redundancy because of the cost of fetching multiple shares per block.
However, they do not consider using Reed-Solomon rather than blocking. This would maintain the advantages of blocking, while removing the reliance on having every block available in order to have the file available.
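A quick worked example of that availability concern, with hypothetical numbers (not from either paper): if each block is independently unavailable with probability p, a whole n-block file is fully available only with probability (1 - p)^n.

    p = 0.001              # assumed per-block unavailability
    n = 1024               # e.g. an 8 MB file split into 8 KB blocks
    print(1 - (1 - p) ** n)  # about 0.64: the whole file is unusable well over half the time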

From kelvinso@gmail.com Tue Mar 7 11:18:39 2006
From: "Chiu Wah Kelvin So"
To: egs+summary@cs.cornell.edu
Subject: Paper 12
Date: Tue, 7 Mar 2006 11:18:38 -0500

The first paper, "Storage management and caching in PAST…," presents a distributed, self-organizing peer-to-peer storage infrastructure. PAST supports three simple file operations: insert, lookup, and reclaim (delete). However, it only supports immutable files. To insert a file, the client first computes the fileId as the SHA-1 hash of the file name, the owner's public key, and a random salt. PAST then replicates the file on the k nodes whose nodeIds are numerically closest to the fileId, to maintain high availability even when nodes fail. Since file sizes and the disk capacities of peers vary by orders of magnitude, a file may not fit on one of the numerically closest nodes, so a mechanism is needed to handle this case. PAST uses two techniques to divert replicas to other nodes when a file does not fit. First, PAST allows a node that is not one of the k numerically closest nodes to the fileId to store the file instead, provided it is in the leaf set of one of those k nodes; this is called replica diversion. If neither those nodes nor the members of their leaf sets can fit the file, file diversion is used: the client recomputes the fileId with a different salt value and retries the insert. There is also caching along the lookup path to improve performance for popular files. This paper essentially puts a storage management layer on top of Pastry, and storage management only matters when storage is scarce; most machines have plenty of storage. The paper also targets a rather narrow problem, a file system of immutable files; it would be far more useful if files could be updated.

The second paper, "Wide-area cooperative storage with CFS," presents an alternative read-only distributed storage system built on top of Chord. CFS consists of two layers: DHash and Chord.
The DHash layer is responsible for storing keyed blocks, maintaining proper levels of replication, and caching popular blocks; the Chord layer is used to locate blocks efficiently. Instead of storing a whole file at a node as PAST does, CFS stores blocks of files at nodes (each file has a root block, and the root block points to its data blocks). This eliminates the need to manage storage at each node, since blocks are fixed-size and are naturally load balanced by the DHT, but it requires multiple lookups for a single file. In CFS, a file can be updated by its publisher, who holds the private key used to sign the file's root block, while PAST only allows write-once files. To maintain replication, CFS stores a block's replicas at the k servers immediately following the block's successor on the Chord ring. To accommodate different server storage capacities, a server with a large capacity can join the system as several virtual servers to balance the load. A limitation of both distributed storage systems is that they do not allow clients to update files, which would be a very useful feature. They also require significant bandwidth to maintain k replicas when there is churn in the system.

From tc99@cornell.edu Tue Mar 7 11:26:11 2006
From: "Theodore Ming Shiuan Chao"
To: egs@cs.cornell.edu
Subject: paper 12
Date: Tue, 7 Mar 2006 11:26:10 -0500 (EST)

PAST and CFS are highly similar in many ways. Both are layered on top of a ring topology (Pastry and Chord, respectively), replicate files on the k closest or succeeding nodes, and perform reactive path caching. The differences between them are slight, and each could easily have been extended to include the other's features: CFS automatically splits files into blocks that are stored separately while PAST leaves that up to the supplier, and PAST supports replica and file diversion through pointers and different random salts. PAST also sets criteria by which nodes can choose to reject pushed replica or primary storage requests based on the size of the file and the space left on the node.

PAST and CFS have complementary problems with regard to file size. In CFS, a large file is fragmented and stored as numerous blocks that can end up on many different nodes. This causes a linear increase in the number of lookups and hops required to reassemble the file.
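To put a rough number on that linear increase: the 8 KB block size below matches CFS, while the 8 MB file and the network size are assumed values used only for illustration.

    # Rough lookup-cost comparison for fetching one file (assumed parameters).
    from math import ceil, log2

    file_size  = 8 * 2**20   # 8 MB file (assumed)
    block_size = 8 * 2**10   # 8 KB blocks, as in CFS
    num_nodes  = 10_000      # assumed network size

    num_blocks = ceil(file_size / block_size)   # separate block lookups in CFS
    hops_each  = log2(num_nodes)                # ~O(log N) Chord/Pastry hops per lookup

    print(f"CFS:  {num_blocks} block lookups, ~{num_blocks * hops_each:.0f} hops total")
    print(f"PAST: 1 file lookup, ~{hops_each:.0f} hops")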
In PAST, finding a node that would accept such a large file would be the issue, though even then the supplier could opt to fragment the file and store arbitrarily smaller blocks independently. PAST could, however, require more hops to find a suitable block size, since it waits for insert rejections before retrying at smaller sizes. Neither PAST nor CFS really presents anything on how such storage systems are utilized in the real world, so it is hard to judge the effect of this. PAST does run an experiment on a web proxy trace, but the utilization of web proxies and of an NFS-like filesystem would likely differ greatly in usage pattern and average file size.

From asg46@cornell.edu Tue Mar 7 11:38:50 2006
From: "Abhishek Santosh Gupta"
To: egs+summary@cs.cornell.edu
Subject: paper 12 - storage systems
Date: Tue, 7 Mar 2006 11:38:48 -0500 (EST)

PAST

PAST exploits the multitude and diversity (in geography, ownership, administration, and jurisdiction) of nodes in the Internet to achieve strong persistence and high availability. PAST is layered on top of Pastry, which is used for routing. Files stored in PAST are associated with a quasi-unique fileId that is generated at the time of their insertion into PAST. The fileId is computed as the secure hash of the file name and the owner's public key, along with a randomly chosen salt. Each file is generally replicated on the k numerically closest nodes. To accommodate differences in storage capacity and utilization of nodes within a leaf set, PAST allows a node that is not one of the k numerically closest nodes to store the file instead; this is termed replica diversion. The authors offer an upper bound on k. Further, k replicas are generated irrespective of a file's popularity (k would have to be proportional to popularity to relax storage requirements). They suggest caching to address popularity, but they also note that a node may discard its cache at any time, which renders the caching scheme unreliable since there are no incentives, especially for malicious nodes. To control the distribution of per-node storage capacities, PAST allows a node with larger storage capacity to split and join under multiple nodeIds. Joining as multiple nodes with different nodeIds seems a bad idea: it provides lower availability if the large node fails, since other nodes may hold multiple pointers to what is really the same machine under its different ids.
Moreover, large malicious nodes would be able to control larger portions of other nodes' routing tables.

Policies: diverting one large file is better than diverting multiple small ones, since it reduces insertion overhead and minimizes the impact of replica diversion on lookups. When nodes fail (in the case of replica diversion), background operations have to be carried out to migrate the affected files. A node chosen for replica diversion must have more remaining space than the average node.

The CFS paper points out that the system as a whole may have sufficient space to store a file while individual nodes do not, resulting in the file being rejected. It also suffers from load imbalance due to differences in file popularity.

CFS

CFS stores blocks rather than files and spreads blocks evenly over the available servers; this achieves better space efficiency, but the lookup time per object increases. CFS also achieves better load balancing than PAST for the same reason. CFS stores data only for a finite amount of time, so publishers must periodically request extensions. This approach limits denial-of-service attacks that try to insert garbage data into the system. Whenever a file needs to be updated, the publisher must sign the new root block, and the system checks the signature to authenticate the publisher (external references need not change on an update). However, if the node storing the external references fails (or that space gets corrupted), availability suffers permanently (disk space would be reclaimed if no extension were requested, but availability would still suffer). CFS has a quota system that limits the amount of data any particular IP address can insert into the system, but this can also serve as a mechanism for denial-of-service attacks by spoofing IPs (assuming different public key pairs can be generated). CFS uses Chord for routing. Virtual servers (having different nodeIds) create vulnerabilities in the system.

From km266@cornell.edu Tue Mar 7 11:40:36 2006
From: "Kevin"
Subject: PAPER 12
Date: Tue, 7 Mar 2006 11:40:49 -0500

PAST is a persistent storage system built upon Pastry. PAST works by associating an immutable fileID with each immutable file. The fileID is a hash of the file name, the owner's public key, and a salt. The file is then stored at the Pastry node whose nodeID is closest to the fileID.
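The fileID construction and the salt-based file diversion described in these summaries can be sketched roughly as follows. This is a minimal illustration, not PAST's actual code: make_file_id, insert_with_diversion, the route_insert callback, and the retry count are all hypothetical stand-ins.

    # Sketch of PAST-style fileId generation with file diversion on failure.
    # route_insert() is a hypothetical stand-in for routing the insert via Pastry.
    import hashlib, os

    def make_file_id(file_name: bytes, owner_pubkey: bytes, salt: bytes) -> bytes:
        # fileId = SHA-1 over the file name, the owner's public key, and a random salt.
        return hashlib.sha1(file_name + owner_pubkey + salt).digest()

    def insert_with_diversion(file_name, owner_pubkey, data, route_insert, max_retries=3):
        for _ in range(max_retries):
            salt = os.urandom(20)
            file_id = make_file_id(file_name, owner_pubkey, salt)
            if route_insert(file_id, data):   # accepted by the k closest nodes
                return file_id, salt
            # Otherwise: file diversion -- retry with a fresh salt, which maps the
            # file to a different region of the ring.
        raise RuntimeError("insert failed after file diversion retries")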
In order to maintain the file, the keeper of the file sends it out to k of its leaf nodes and tells them to store the file as well. If they do not have enough space to store the file, they can store a pointer to the file and hand the actual file off to another node. Once the file is in the system, you have to know its fileID in order to look it up. You can also reclaim a file knowing the fileID and the owner's credentials; reclaiming is much like deleting, except that the operation is not guaranteed to succeed. In addition to the k extra nodes storing the file, a caching scheme allows the file to be replicated at other points in the system for quicker retrieval and lookup. One of the interesting designs in this paper concerns security: the authors recommend smartcards which store public/private key pairs. This seems to imply that a central authority, or at the very least a large amount of cooperation between peers, exists. Because PAST stores whole files (unlike CFS, which stores blocks), some nodes can carry a large storage overhead. The authors' remedy is to add indirection, and therefore more latency, while weakening the structure of the underlying network.

CFS is a distributed filesystem for, once again, immutable files. It reminds me a lot of the Linux filesystem, with inodes and pointers pointing to other files, directories, or inodes. Files are no longer stored in one lump but are broken up into blocks. These blocks are stored at a pre-determined number of successors (DHash built on Chord in this paper, rather than Pastry) in order to maintain the file. The blocks are then distributed, and the pointers to them are just hashes of the blocks' contents, so each inode only needs to keep track of hashes. Each filesystem has a root block which is signed (and therefore more trustworthy). Unlike PAST, each file has a lifetime (which the owner can ask to extend), so storage is not necessarily persistent. This system seems more fragile than the first. Assume k = 5, each node goes down independently with probability 1/2, and in CFS the file is broken up into 1000 blocks. The chance of losing the file in PAST is (1/2)^5 = 1/32. In CFS we expect to lose 1/32 of our file, which is roughly 31 blocks of data. We expect to lose the same amount in PAST, but it is the deviation that matters: it is a tradeoff between losing large amounts of data infrequently and losing a small amount of data more frequently. Furthermore, CFS seems to have a large amount of lookup overhead, needing many lookups to find the file and then to download it.
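The loss arithmetic above can be checked in a few lines; independence of node failures and the exact numbers (k = 5, failure probability 1/2, 1000 blocks) are assumptions carried over from the paragraph, and the last line goes one small step further to the chance that at least one block, and hence the whole file, is lost.

    # Expected data loss: k = 5 replicas, each node down with probability 1/2.
    k, p_down, blocks = 5, 0.5, 1000

    p_lose_object = p_down**k                  # all k replicas of one object down
    print(p_lose_object)                       # 1/32 = 0.03125

    # PAST: the whole file is one object -> lost with probability 1/32.
    # CFS: each of the 1000 blocks is an object -> expect ~31 blocks lost,
    # and the file as a whole is unavailable with overwhelming probability.
    print(blocks * p_lose_object)              # expected blocks lost ~= 31.25
    print(1 - (1 - p_lose_object)**blocks)     # P(at least one block lost) ~= 1.0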
From tudorm@cs.cornell.edu Tue Mar 7 11:52:47 2006
From: Tudor Marian
To: egs@cs.cornell.edu
Subject: PAPER 12
Date: Tue, 07 Mar 2006 11:52:46 -0500

PAST is a large-scale p2p persistent storage system based on the Pastry overlay. The interface permits insertion, lookup, and reclaiming at the granularity of a file, so the semantics are those of a store for immutable files. Since Pastry performs the routing, the paper focuses on replica management: each file is replicated on k distinct nodes (which may be virtual). Initially, an insert operation routes the file to the home node, which in turn sends it to k-1 of the neighbors in its leaf set. PAST uses replica diversion and file diversion to counter storage load imbalance. Replica diversion means that if a node cannot accommodate a copy of an object locally, it chooses a node in its leaf set that does not yet hold a copy and delegates the storage responsibility to it, while keeping a pointer. File diversion happens when an object could not be replicated on k nodes; in that scenario a negative acknowledgment is returned to the client, which can then generate a new fileID based on a different salt and retry the insert, targeting a different part of the ring. The benefits of caching are explored as well. Security is quite an issue in such a system, yet the paper defers it to a "forthcoming" paper.

CFS is a completely decentralized p2p read-only file system based on the Chord overlay. The system provides different use cases based on the role of the actors: users and content publishers. Users see the file system as a read-only POSIX-like block file system, while publishers are able to create and modify entire filesystems (there is a 1:1 mapping between a publisher's public key and a filesystem). CFS is layered into the file system interface, DHash, and Chord. Chord performs routing at the granularity of blocks; the blocks may be metadata blocks (such as directory or inode blocks) or data content blocks. The DHash layer handles block replica placement and caches popular blocks, while the FS layer interprets the blocks at the DHash level as a file system. Load balancing is an easier problem to solve in CFS since the blocks have the same size, yet the multiple indirections (even with prefetching), each of which may take up to log N steps, cripple the performance of the system.

Tudor
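A minimal sketch of the content-hash and signed-root-block scheme these summaries describe, assuming a plain in-memory dictionary in place of the real DHash layer and a caller-supplied verify() helper; it illustrates the idea, not CFS's actual interface.

    # Sketch of CFS-style content addressing: data blocks are keyed by the hash
    # of their contents; the root block is keyed by the publisher's public key
    # and must carry a valid signature. verify() is an assumed helper.
    import hashlib

    store = {}  # stand-in for the distributed DHash layer

    def put_data_block(data: bytes) -> bytes:
        key = hashlib.sha1(data).digest()      # the content hash is the block's key
        store[key] = data
        return key

    def get_data_block(key: bytes) -> bytes:
        data = store[key]
        # Any fetcher can check integrity by rehashing the contents.
        assert hashlib.sha1(data).digest() == key, "block failed integrity check"
        return data

    def put_root_block(pubkey: bytes, root: bytes, signature: bytes, verify) -> bytes:
        # Only the publisher (holder of the private key) can produce a valid
        # signature, so only the publisher can update its filesystem's root.
        if not verify(pubkey, root, signature):
            raise ValueError("bad signature on root block")
        key = hashlib.sha1(pubkey).digest()    # root block keyed by H(public key)
        store[key] = (root, signature)
        return key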