From tc99@cornell.edu Mon Mar 13 22:12:42 2006
From: Theodore Ming Shiuan Chao <tc99@cornell.edu>
Subject: paper 14

GNP and Vivaldi both embed the network nodes into a finite virtual coordinate space so that distances can be calculated without additional explicit measurements. They use previously made measurements to other nodes - specific landmark nodes for GNP and a random subset of nodes for Vivaldi - to place the nodes relative to each other.

An issue with virtual coordinate-based location schemes is their ability to deal with constantly changing networks. Can GNP and Vivaldi deal with networks with high churn or a propensity for flash loads at nodes? The GNP paper (or slides) does not address that at all, and the Vivaldi paper only offers one experiment on a flash load at a single node. They found that the network stabilized after 20 seconds, but they didn't investigate the effects of churn (where a node entering has no information and a node leaving takes information about network locality with it), nor the effects of short-lived flash loads that last less than those 20 seconds. There is also a heavy reliance on cooperation and on the accuracy of other nodes. In Vivaldi, a node relies on its neighbors correctly reporting their positions and also correctly estimating and reporting their accuracy. A single misconfigured or malicious node might significantly undermine the accuracy of the scheme.

Meridian offers a different approach that is somewhat similar to how Pastry handles locality-informed routing. Each node maintains information about k nodes in each ring, with each ring representing a range of measured distances. When locating the node nearest to a given target, a node looks for the member of the appropriate ring closest to the target and forwards the request to it. Assuming a good distribution within each ring, the request should get exponentially closer to the target with each hop. However, since locating nodes requires sending requests out, the uses of this seem more limited than virtual coordinates, where node location can be done without spending bandwidth on measurements once the geography is set up. It seems that Meridian sacrifices generality (where a node can easily determine the distance between itself and any other node in the network) for scalability and accuracy.
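A minimal sketch in Python of the greedy forwarding described above, under simplifying assumptions: the toy latency model (points in a plane) and the node names are made up, real Meridian measures RTTs with direct probes and draws candidates from its rings, and the real protocol forwards only while the distance shrinks by an acceptance-threshold factor rather than stopping at the first non-improving hop.

import math

# Toy latency model: nodes sit at 2-D points and "RTT" is Euclidean distance.
# In the real system every distance below would be an actual network probe.
def rtt(a, b):
    return math.dist(a, b)

def closest_node_search(nodes, neighbors, start, target):
    # Greedy search: each hop probes its known peers against the target and
    # forwards the query to the closest one, stopping when nobody improves.
    current = start
    while True:
        best = min(neighbors[current], key=lambda n: rtt(nodes[n], target))
        if rtt(nodes[best], target) >= rtt(nodes[current], target):
            return current       # no neighbor is closer to the target
        current = best           # forward the query one hop

# Five nodes, each knowing only two peers (a stand-in for ring members).
nodes = {"a": (0, 0), "b": (5, 0), "c": (9, 1), "d": (9, 5), "e": (2, 7)}
neighbors = {"a": ["b", "e"], "b": ["a", "c"], "c": ["b", "d"],
             "d": ["c", "e"], "e": ["a", "d"]}
print(closest_node_search(nodes, neighbors, "a", target=(10, 4)))   # -> d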
From ns253@cornell.edu Mon Mar 13 23:57:21 2006
From: Niranjan Sivakumar <ns253@cornell.edu>
Subject: PAPER 14

Niranjan Sivakumar

Predicting Internet Network Distance with Coordinates-Based Approaches
Vivaldi: A Decentralized Network Coordinate System
Meridian: A Lightweight Network Location Service without Virtual Coordinates

Global Network Positioning (GNP) is a system that models the Internet as a geometric space in order to allow distance calculations without actually having to measure RTT between any given pair of nodes. The core of the system is a small set of landmark nodes that are used by other nodes in the system to approximate their position coordinates in the network space. Vivaldi is another geometric modeling of the Internet space, but without the centralized landmark servers that are present in GNP. Vivaldi models the network as a spring system. Nodes are "pushed" by other nodes as the network changes, and nodes communicate with each other with the goal of eventually stabilizing at coordinates that are closely correlated with real RTT. Nodes in Vivaldi can perform their initial RTT measurements against any node in the system and do not have to contact anything like a landmark node. Meridian does not use a coordinate system for discovering network location, but rather relies on a system of multi-resolution rings. A few neighbor nodes are put into rings based on their measured distance. Thus, nodes become experts on their locality and can forward queries to other experts at another, more distant, locality. The system uses gossiping to discover nodes and contacts them to measure actual latency. Periodic gossiping keeps latency information up to date.

One of the biggest weaknesses clearly seen with GNP is the requirement for centralized landmark servers. These can be a target for attack and detract from the other benefits provided by the rest of the decentralized network. Vivaldi deals with the issue of centralized servers, but as the authors admit, it is not robust in the face of malicious nodes. It seems that the coordinate scheme could be manipulated by attackers, and particularly by attackers that work together to exploit the system. Meridian seems to put a higher burden on nodes than the coordinate systems. While the load is not excessive, it's certainly more complex than some of the coordinate systems. Also, as noted in the paper, coordinate-based systems may be more general than Meridian.
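A rough sketch of the two GNP steps summarized above, with invented RTT values and a 2-D space for illustration; GNP itself uses a Simplex Downhill solver, which is what scipy's Nelder-Mead method implements.

import numpy as np
from scipy.optimize import minimize

# Landmark coordinates, assumed already computed from the landmark-to-landmark
# RTTs (the first GNP step); units are arbitrary "milliseconds".
landmarks = np.array([[0.0, 0.0], [80.0, 0.0], [40.0, 70.0]])

# RTTs this ordinary host measured to each landmark (made-up numbers).
measured = np.array([50.0, 45.0, 30.0])

def embedding_error(coords):
    # Sum of squared differences between predicted (Euclidean) and measured RTTs.
    predicted = np.linalg.norm(landmarks - coords, axis=1)
    return np.sum((predicted - measured) ** 2)

# The second GNP step: the host picks coordinates that minimize the error.
result = minimize(embedding_error, x0=np.zeros(2), method="Nelder-Mead")
host = result.x
print("host coordinates:", host)

# Once two hosts both hold coordinates, distance is predicted with no probe.
other = np.array([60.0, 20.0])
print("predicted RTT:", np.linalg.norm(host - other))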
From victoria@cs.hmc.edu Tue Mar 14 00:56:14 2006
From: Victoria Krafft <victoria@cs.hmc.edu>
Subject: PAPER 14

For many network applications, both peer-to-peer and more general client-server, it's useful to determine which of a set of servers is closest to a client. The simple solution is to have the client ping all the servers and take the one with the shortest round-trip time. In systems with a large number of clients or servers, this approach generates a lot of traffic, and it results in higher latencies the first time a client contacts a server. The papers for this week attempt to eliminate these problems by providing schemes for estimating the distance between two nodes based on other information in the network.

The first scheme is GNP, which is presented using a set of slides rather than a paper, which makes it slightly more difficult to understand. As far as I can tell, it works by modeling the Internet as a coordinate space. First, it selects a set of landmark servers, determines the RTT between them, and uses these measurements to assign the landmarks coordinates in the space. Then, when a new node joins, it pings each landmark and figures out its own location in the coordinate space. They also propose Triangulated Heuristic Coordinates, which I think create coordinates for an object by selecting some base servers, finding the distance from the object to the base servers, and using those distances as coordinates.

Vivaldi takes the same basic approach, with two key modifications. First, it models the nodes in the network as a system of physical springs; the system attempts to find the minimum-energy configuration, which corresponds to the minimum-error configuration for the coordinate system. Second, it uses a two-dimensional space and adds a height, a directionless value, to each object. This accounts for the lag due to access link latency. The performance data presented shows that Vivaldi outperforms GNP in most cases.

Meridian uses a different approach to find nodes which are nearby. It forms a loosely structured overlay network. Each node keeps track of O(log N) peers, which are organized into a series of concentric rings. The node also tracks some additional peers which it can add to the rings as needed to fill in gaps. Information about peers is spread using a gossip protocol. Meridian can be used to find the closest node to a target, giving it the ability to search like Chord or Pastry. Meridian also makes it easy to find a central leader for a group, and sets of nodes which satisfy network geography constraints. Experiments show that by eliminating embedding errors, Meridian significantly reduces error in the distances it calculates.

GNP and Vivaldi both require fairly expensive calculations to create their embedded coordinate system.
This could prove to be too costly in networks with high churn. In Meridian, fairly high churn could be tolerated, provided that the gossip protocol spreads information fast enough to offer replacements for failed nodes. However, Meridian doesn't provide a very accurate way of determining which server will be closest to a client without sending additional packets through the network. If these calculations are used to determine whom a node contacts, then these schemes might run into problems, as people intentionally delay responses so that their node gets less traffic. A couple of misbehaving nodes, especially among GNP's landmark servers, could also cause a lot of chaos.

-- Victoria Krafft

From lackhand@gmail.com Tue Mar 14 01:07:38 2006
From: Andrew Cunningham <lackhand@gmail.com>
Subject: PAPER 14

Andrew Cunningham
arc39

_Predicting_Internet_Network_Distance_with_Coordinates-Based_Approaches_
T.S. Eugene Ng and Hui Zhang

It's hard to respond to this paper -- I'm operating under the assumption that you didn't want us to, since it was difficult to comprehend the telegraphic speech present in the PowerPoint slides. Nevertheless, the central concept of the slides is interesting and fairly easy. It is relatively expensive to do a ping exchange with each other node we wish to establish a distance to; if we model the Internet as a Euclidean space, we can measure distances to central points and use those as coordinates in the Internet-space so generated; this lets us simply measure distance (in terms of hop count, or ping time, etc.) to obtain an upper bound on actual ping-time distance. The scheme is fairly good, though somewhat expensive for the central servers, albeit at a cost proportional to the carrying capacity of the network, not the number of distance-related transactions on the network.
Also, outages in the well-known central entities cannot easily be repaired -- the entire system must remeasure against any new landmark that is put into the system, which is an individually small cost, but it means that there are definite points of this system which do not scale and cannot be easily repaired or replaced.

_Vivaldi:_A_Decentralized_Network_Coordinate_System_
Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris

Vivaldi attempts to measure Internet distances via a spring-lattice structure, allowing disparate peer nodes to communicate information about physical location -- as measured in milliseconds -- to each other, for route selection, etc. Alternate measures are proposed, the most feasible being to consider the system not just as a two-dimensional grid but as the surface of a sphere, so that the effects of wrap-around are internalized to the model. This is rejected because the Internet does not, in fact, tend to wrap around; America is at the "center" of the world-wide web, at least for now. They model the Internet as a lattice of springs, in that between each pair of hosts there is a spring with rest length equal to the RTT between the two nodes. They then minimize the squared error function, which is equivalent to spring energy -- in other words, allowing the springs to achieve a natural equilibrium. They divide time into a series of steps and permit the springs to act upon those nodes that are out of alignment, thus converging in a strictly local sense to the correct answer, using only localized information. In addition, the system models the fact that many nodes must be routed through a third party as a height vector, derived from the energy stored in the springs, which means that Euclidean distance is better approximated.

This is a very tight algorithm, though as pointed out, it is prone to find locally minimal (energy) coordinate states rather than globally minimal ones, which means that it is non-optimal; moreover, it is nigh impossible to do simulated annealing or other such random-restart processes without introducing wild shifts, making it difficult to recover from these states. Unrelatedly, the system seems logically incomplete: though they model distance from the Internet core as distance above the Euclidean plane, there are two cases which are not easily distinguished in this model and not well handled: sites that are near each other yet distant from the Internet core, and sites that are equidistant from the Internet core and removed from each other. In the first case, the system behaves perfectly, taking the distance between two "tall" points while essentially ignoring the height (if equal). In the latter, however, they will be at the same "height" (if equidistant from the core) and thus this extra travel time will not be factored in correctly, and must instead be modeled entirely in Euclidean distance in the plane. This is a degenerate case of their algorithm (with all heights equal to 0) and thus throws suspicion on the height.

_Meridian:_A_Lightweight_Network_Location_Service_without_Virtual_Coordinates_
Bernard Wong, Aleksandrs Slivkins, Emin Gun Sirer

This version may be described as a series of concentric rings; each node maintains some neighbor set organized as rings. These rings span exponentially growing radii, and membership is maximized based on local measurements (in a local n-space, the paper maximizes the volume of the object formed by the various neighbors).
This is also prone to local optima which are not globally maximal, or even necessarily very good, but the system may take wider swings and thus is not as prone to stultifying as other systems. It is a decentralized system with excellent (logarithmic) performance measures. The system is interesting in that it doesn't have the same sense of direction as others with more absolute senses; rather than providing absolute coordinates and thus Euclidean direction, there are strict distances. The fact that state may be maintained locally per-node for wide-ranging maxima is almost icing on the cake.
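A minimal sketch of the per-sample spring update described in the Vivaldi summary above, in plain 2-D Euclidean coordinates; the paper's height vector and its adaptive, confidence-weighted timestep are omitted, and the constant timestep and toy RTTs below are invented for illustration.

import random

def vivaldi_update(xi, xj, rtt_sample, delta=0.05):
    # One step: treat the link as a spring whose rest length is the measured
    # RTT and move xi a small, fixed fraction along the resulting force.
    dx, dy = xi[0] - xj[0], xi[1] - xj[1]
    dist = (dx * dx + dy * dy) ** 0.5 or 1e-9       # avoid division by zero
    error = rtt_sample - dist                       # spring displacement
    ux, uy = dx / dist, dy / dist                   # unit vector away from xj
    # RTT larger than predicted pushes the pair apart; smaller pulls them together.
    return (xi[0] + delta * error * ux, xi[1] + delta * error * uy)

# Toy run: three nodes whose "true" RTTs come from hidden positions.
truth = {"a": (0.0, 0.0), "b": (100.0, 0.0), "c": (50.0, 80.0)}
def true_rtt(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
coords = {n: (random.uniform(0, 10), random.uniform(0, 10)) for n in truth}

for _ in range(5000):
    i, j = random.sample(list(truth), 2)
    coords[i] = vivaldi_update(coords[i], coords[j], true_rtt(truth[i], truth[j]))

# Predicted vs. true RTT after relaxation; they should roughly agree.
print(round(true_rtt(coords["a"], coords["b"])), round(true_rtt(truth["a"], truth["b"])))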
From asr32@cornell.edu Tue Mar 14 01:12:19 2006
From: Ari Rabkin <asr32@cornell.edu>
Subject: PAPER 14

In building distributed systems, it's often useful to have an estimate of the round-trip time to a given host. This can generally be obtained by pinging, but it would commonly be desirable to measure it without first pinging; in particular, the network load from pinging rises in proportion to the number of hosts. Much better to infer this distance. An easy way to do this, used by GNP, is to measure distances to a small number of landmarks and then apply Euclidean distance metrics. The GNP analysis shows that this method is inexact, but still gives usable results. GNP is a centralized algorithm, with all the scalability issues that implies. Nodes do not always respond consistently, and link RTT depends on traffic as well as distance, so measurements will vary over time; GNP does not take this into account. Lastly, Internet routes do not obey the triangle inequality, so attempts to map network distance cleanly to Euclidean space are ultimately impossible.

Vivaldi attempts to minimize the global deviation from Euclidean space in a fully distributed way. The Vivaldi designers use two important tricks. First, they add a misleadingly named 'height', which is in fact an extra distance added to every node; this simulates the processing time for packets at the host and on the local link. They also have nodes adjust their position based on spring forces from nearby nodes. The Vivaldi system works fairly well, but the authors seem not to have fully used the spring analogy. Mass-spring meshes are vulnerable to oscillation, particularly given that Vivaldi essentially uses Euler's method. Using the physics model more consistently -- for instance, having meaningful 'masses', dependent perhaps on time connected to the system -- might have made the system more effective. It would be nice to test the system on a 'live' network, rather than static data.

Meridian attempts to solve a somewhat different problem. Rather than giving nodes a well-defined 'address', Meridian is a peer-to-peer substrate designed for network-location queries.
Each node keeps track of a number of concentric rings, defined by network distance. Each ring consists of a number of other nodes, chosen to be spread out (not near each other). Gossip and periodic polling are used to keep the rings up to date. Meridian is more accurate than Vivaldi or GNP, but it supplies less data -- it might be useful to know the true network distance to a given node without measuring it, and Meridian does not allow this. The actual distance distribution of nodes in the Internet is nontrivial; there is no reason to assume a priori that equal-sized rings are optimal.

Ari Rabkin

From sh366@cornell.edu Tue Mar 14 04:04:51 2006
From: Huang Shiang-Jia <sh366@cornell.edu>
Subject: PAPER 14

Network distance (round-trip latency) predictions are beneficial to large-scale distributed systems (including peer-to-peer systems, structured or unstructured) when selecting nodes. The three papers to be discussed today present different, but all scalable, approaches to selecting nodes based on estimated network distances. GNP and Vivaldi are coordinate-based distance prediction systems; Meridian performs node selection without virtual coordinates, using direct measurements instead. Unlike GNP, which involves centralized components (landmarks) in determining coordinates, Vivaldi computes coordinates on the basis of a simulation of springs that requires no fixed infrastructure.

* Constantly adjusting the coordinates to adapt to underlying topology changes in Vivaldi implies significant overhead.
* In frameworks such as GNP, where distances are estimated from two nodes' coordinates, it is possible that nodes lie about their coordinates (so as to avoid being selected by other nodes in peer-to-peer systems).
* The first paper proposes a Global Network Positioning (GNP) mechanism and compares it with two previous works, the triangulated heuristic and IDMaps.
* GNP models the Internet as a geometric space and characterizes the position of any node by a point in this space. A set of servers, the Landmarks, is deployed; they compute their coordinates in a chosen geometric space, serve as the reference, and are disseminated to nodes that want to compute their own coordinates. The network distance between any two nodes is predicted by the geometric distance between them.
* Vivaldi computes coordinates for all nodes by minimizing the squared error between round-trip latency and coordinate distance. Nodes periodically re-compute their coordinates to adapt to changing network conditions.
* Height-Vector Model: Unlike previous works that use Euclidean or spherical models, Vivaldi proposes a height-vector model: each node has a positive height element in its coordinates. A node that finds itself too close to (or too far away from) nodes on all sides scales its height up (or down), away from the Euclidean plane. A packet transmitted from one node to another has to travel the source node's height, the Euclidean space, and the destination node's height. The experimental results show that this performs better than the above-mentioned models.
* The Meridian framework consists of three mechanisms and can be used to address three location-related problems in distributed systems. (1) Each node keeps track of a fixed number of other nodes and organizes them into concentric, non-overlapping rings of exponentially increasing radii. (2) Each node periodically reassesses ring membership and replaces ring members with alternatives that provide greater geographic diversity (quantified by the hyper-volume of the k-polytope formed by the nodes according to their coordinates). (3) Each node discovers and maintains a small set of pointers to a diverse set of nodes via a gossip protocol. (a) On top of its loosely structured overlay, Meridian locates the closest node by performing a multi-hop search, very similar to Chord or Pastry, where each hop reduces the distance to the targeted reference point exponentially. (b) Extending the previous protocol, Meridian finds a node that offers minimal latencies to a given set of nodes by using the average latency computed by each node. (c) To find a set of nodes in a region whose boundaries are defined by latency constraints, Meridian performs a multi-constraint query and includes the shared area of each elected node in the solution space.

From nsg7@cornell.edu Tue Mar 14 11:18:27 2006
From: Nicholas S Gerner <nsg7@cornell.edu>
Subject: PAPER 14

Ng, T. and H. Zhang present "Predicting Internet Network Distance...", which examines predicting Internet network "distance" via several approaches. Generally, "distance" measures latency in the network. First, IDMaps is briefly examined, a client-server architecture in which a server maintains a topological model of the network and can make latency predictions for clients. However, this model carries the overhead of communicating with a potentially heavily loaded server. Next, Global Network Positioning is presented, which uses landmarks in the network.
These landmarks measure their latencies to each other and embed themselves in a Euclidean space by minimizing the overall difference between computed distance in the Euclidean space and the actual measured distances. New nodes measure their latencies to the landmarks and embed themselves in the space similarly. Finally, a triangulation heuristic is presented in which nodes measure the number of hops to various "bases" in the network; this provides an upper bound, as the minimum of the sum of the number of hops to all bases from each node. An evaluation of these systems is presented using relative and directional errors, which suggests that GNP outperforms both other methods (90% of estimates have relative error of .97) and that these systems typically overestimate the latency (and do so with large deviations).

Dabek et al. present "Vivaldi", which models the coordinate embedding problem as a spring potential-energy minimization problem. Here, nodes are connected via springs with the rest length of the springs equal to the measured RTTs between nodes and the actual length of the springs equal to the distance between nodes in some coordinate space (several are considered). The optimization then iteratively examines the forces on each node, sums them, and applies the resulting force for some timestep delta. This algorithm can be distributed such that each node takes a sample with a peer and applies that new "force" to the existing samples it already has. This leads to an adjustment in the coordinate space and yields a local search problem. Several coordinate systems are examined, and finally a 2-d Euclidean space with "height vectors" is proposed, which is shown to model Internet RTTs well (with some intuitive arguments about height vectors modelling access links and the plane modelling the Internet core). However, the evaluation doesn't consider changes in RTTs between nodes: all RTTs are preset and fixed for the lifetime of the simulation. It's not clear how the local search would fare in such a dynamic environment, and even so it's not clear that the optimization will converge in every case (although it seems to have in the presented simulation). Furthermore, the model presented seems to have a good intuitive basis and is mathematically well defined; however, no analysis of the model is performed to explore the parameters of the system.

Wong, Slivkins, and Sirer present "Meridian", which is another network distance solution; however, Meridian doesn't embed nodes in a global coordinate system. Instead, each node maintains a set of concentric, exponentially expanding rings. Each ring contains a fixed number of nodes at the given ring distance (actually a range of distances). In this way each node accurately models the distance to its neighbors. This solution is less general than absolute coordinate embedding, because each node only has a local model of positioning; no global model is available for general use. However, solutions to three common problems using Meridian are presented: closest node discovery (find the node closest to some target), central node discovery (find the node central to a set of nodes), and latency region intersection selection (find the set of nodes all of which have latency less than some bound to all nodes in another diverse set of nodes). Additionally, analysis is presented showing that under some important metrics Meridian will provide epsilon-nice rings (such that "good" closest neighbors will be selected with high probability) for small per-node state.
Empirical results (supporting assumptions made in the analysis) suggest that Meridian significantly outperforms the above two embedding solutions.

From kelvinso@gmail.com Tue Mar 14 11:24:58 2006
From: Chiu Wah Kelvin So <kelvinso@gmail.com>
Subject: Paper 14

The first paper, "Predicting Internet Network Distance with Coordinates-Based Approaches," presents an architecture, Global Network Positioning, that predicts the network distance between nodes by using a network embedding. The system includes two sets of hosts (a two-tier system): Landmarks and ordinary hosts. Landmarks are a small set of hosts that compute their own coordinates, which are then used as a reference for computing the coordinates of ordinary nodes. To compute the Landmarks' coordinates, the Landmarks first collect all pairwise Landmark RTTs and then minimize the overall error between the predicted distances and the measured RTTs. Using the coordinates of the Landmarks, an ordinary node pings each Landmark to measure the RTT and minimizes the overall error between the predicted distances from the node to the Landmarks and the measured RTTs. This system exchanges N^2 messages to compute the coordinates of the Landmarks, where N is the number of Landmarks, and each additional node requires N messages to compute its own coordinates. The paper compares GNP with IDMaps and the triangulated heuristic to show that GNP predicts network distance more accurately than the existing approaches. However, GNP requires the Landmarks to be up all the time, and the choice of Landmarks affects GNP's prediction of RTT.

The second paper, "Vivaldi: A Decentralized Network Coordinate System," presents a similar but decentralized approach. Vivaldi simulates a system in which a spring is placed between each pair of nodes with its rest length set to the measured RTT. The potential energy of such a spring is proportional to its displacement from the rest length, and we want to minimize the energy over all springs (the error function).
Each node simulates its own movement and computes its own coordinates in the system. Whenever a node communicates with another node, it measures the RTT and learns that node's current coordinates and their estimated accuracy. Based on the information it receives, the node updates its own coordinates. They show that the accuracy of Vivaldi is as good as that of GNP while Vivaldi is completely decentralized. They also evaluate several interesting coordinate spaces for Vivaldi. However, this system assumes that all nodes are cooperative in reporting their information.

The third paper, "Meridian: A Lightweight Network Location Service without Virtual Coordinates," presents an architecture to select the closest node to a target, a node with minimal latencies to a given set of nodes, and a set of nodes satisfying latency constraints to a target. Instead of using a network embedding, the authors narrow the network coordinate positioning problem to a node selection problem. Meridian uses a loosely structured overlay network to maintain multi-resolution rings whose radii grow exponentially. Nodes measure the distance to other nodes and place O(log N) nodes into ring i where r_i < distance < r_(i+1). Each node periodically measures the other nodes in the same ring, and Meridian keeps only the k most diverse nodes in each ring. Meridian nodes locate the closest node to a target by greedily forwarding to a node that is closer to the target at each step. Since there are exponentially more candidate nodes to pick from as the query gets closer to the target, this produces a zoom-in effect. The experiments show that this selects the closest node to a target more accurately than the previous network-embedding approaches.

From asg46@cornell.edu Tue Mar 14 11:42:23 2006
From: Abhishek Santosh Gupta <asg46@cornell.edu>
Subject: paper 14 - RTT estimation

GNP

In this scheme, hosts maintain their co-ordinates computed according to a set of landmarks. These co-ordinates allow nodes to compute inter-host distances as soon as they discover each other. Each host measures its distance to the landmarks and takes the minimum of several probes as the distance. The number of landmarks must be greater than the dimensionality of the co-ordinates in order to compute unique co-ordinates.
Once the landmark co-ordinates are computed, they are disseminated, along with the identifier for the geometric space used and the corresponding distance function, to any host that wants to participate in GNP. Experimental results show high prediction accuracy in the case of random placement of i-nodes. The ideas in this domain assume that hosts do not lie about their co-ordinates, although they have a huge incentive for doing so (it reduces the amount of load on a node). The authors themselves state that the prediction mechanisms fail in the case that Internet paths are not stable.

Vivaldi

Vivaldi is a decentralized algorithm that assigns synthetic co-ordinates to hosts such that the distance between the co-ordinates accurately predicts the communication latency between the hosts. The ability to predict RTT without measuring it directly is important when the number of servers is large or when the amount of data is small. Any prediction scheme will have a certain amount of error, and minimizing this error is a key goal. Any low-dimensional scheme cannot be exact, due to distorted latencies and violations of the triangle inequality.

BASIC ALGORITHM: The error computation is analogous to the displacement in a physical mass-spring system: minimizing the energy in the springs is equivalent to minimizing the squared-error function. The central idea is that in each period we compute the change in co-ordinates on the basis of the net force in the system (reaching convergence after a certain amount of time). The period may be constant or variable - a design issue. A large time step adjusts co-ordinates in large steps, which could result in oscillations; a small time step leads to better convergence but greater communication overhead. Oscillations in co-ordinates can be avoided by including another factor that indicates how certain a node is about its co-ordinates. Additionally, when communicating with other nodes, nodes must also be aware of the certainties of the co-ordinates of other nodes (the converse of which is termed the remote error).

MODEL SELECTION: The authors experimentally show that adding dimensions beyond 3 to the co-ordinate system does not make a significant improvement in prediction error. They instead add a height dimension to the 2-d co-ordinate system, improving the prediction (forces that cancel out in 2-d may not cancel once the height is added).

MERIDIAN

Meridian consists of an overlay structure built around multi-resolution rings, query routing with direct measurements, and gossip protocols for dissemination. Each Meridian node keeps track of m*log N peers (m is the number of nodes in an annulus) and organizes them into concentric rings of exponentially increasing radii. Each node keeps track of a finite number of rings; the total number of rings in the system is clamped to a constant - all rings i > j (where j is a system-wide constant) are collapsed into a single outermost ring. A query is matched against relevant nodes in the ring and optionally forwarded to a node's peers. Meridian achieves geographic diversity by periodically reassessing ring membership decisions and replacing ring members with those that provide greater diversity (based on selecting the subset of nodes whose polytope has the largest hypervolume). Nodes are discovered using a gossip protocol, which has a re-pinging mechanism that ensures that stale information is updated.
APPLICATIONS:
1) Closest node discovery: an acceptance threshold governs how much the distance to the target must shrink at each hop (choosing this threshold is another tradeoff issue).
2) Central leader election: extends closest node discovery.
3) Multi-constraint system: can be modeled as a node selection problem.

TRADEOFFS:
1) The number of nodes per ring is denoted by k. A large k helps in better node selection but at the same time requires more communication between the nodes.
2) Acceptance interval: responsible for the tradeoff between query latency and accuracy (although accuracy is not sensitive to thresholds > 0.5).
3) Gossip rate: selection quality vs. communication overhead.

From gp72@cornell.edu Tue Mar 14 11:45:27 2006
From: Gopal Parameswaran <gp72@cornell.edu>
Subject: PAPER 14

Predicting Internet Network Distances

In this paper the authors discuss coordinate-based mechanisms for predicting network distances in a peer-to-peer architecture, and present a mechanism for mapping nodes called Global Network Positioning, which is based on absolute co-ordinates computed by treating the Internet as a geometric space. As soon as end hosts discover each other they compute the distances between them by using a small set of distributed hosts called landmarks, which serve as a frame of reference and are used to map and position the nodes in the network. The number of landmarks needs to be at least the number of dimensions + 1. Once the landmark co-ordinates are known, any node can be mapped onto the network map based on its orientation relative to the landmarks.

Vivaldi:

This paper discusses a simple, lightweight algorithm that assigns synthetic co-ordinates to hosts such that the distances between the co-ordinates of two hosts accurately predict the latency between the two hosts. The use of synthetic co-ordinates is based on the idea that RTT reflects the physical distance between two hosts. However, because transmission time and router electronics introduce distortions, the distorted latencies make a plain two-dimensional co-ordinate system unsuitable; to correct this, the authors suggest augmenting the coordinates with a height dimension, which improves the prediction accuracy of the system based on measurements from the Internet. The Vivaldi algorithm works by assigning synthetic co-ordinates in a co-ordinate space such that the distance in the co-ordinate space accurately maps to the RTT between the two hosts.
The algorithm does not compute the synthetic coordinates exactly but instead uses an error reduction procedure to reduce the error and improve the fit. The error function is the squared difference between the true distance, represented by the RTT, and the assumed distance given in the co-ordinate space. This error function is then reduced to generate a close mapping. As the authors assert, this algorithm suffers from the disadvantage that it can get caught in a local minimum. A suitable extension to this paper would be to increase the number of dimensions by transforming the data into a higher-dimensional co-ordinate space, which could then be converted into an approximate system by collapsing the dimensions one by one. Also, as the error margin is reduced in multiple steps, the system could show oscillatory behavior, but in general it should converge to well-defined RTT maps.

Meridian:

This paper discusses a framework for an overlay network structured around multi-resolution rings that supports query routing with direct measurements and uses a gossip protocol for dissemination. It addresses three node location problems, starting with closest node discovery, and tries to satisfy latency constraints in large-scale distributed systems without computing absolute co-ordinates. A general technique for obtaining a distance metric is to map high-dimensional network measurements into a smaller Euclidean space. Though this works for smaller networks, it is not suitable for a large-scale network and is neither accurate nor complete, since the measurements are non-trivial, resulting in a sub-optimal solution. Meridian uses a lightweight protocol that is scalable and more accurate, performing node selection based on direct measurement instead of a network embedding. Meridian is a loose routing system based on multi-resolution rings at each node, where ring i covers a distance interval determined by a constant Alpha and a multiplication factor s. This ring structure has each node maintain at most k nodes in each ring and drop peers from overpopulated rings. Meridian periodically reassesses ring members using its proximity metrics and replaces nodes with those that provide greater diversity; the objective of this reassessment is to increase the diversity within the rings. A list of secondary nodes is maintained for every ring, and when a node drops out it is replaced with one of the secondary nodes at random. The distance to a target in Meridian is resolved by making multiple hops toward the destination and measuring physical latencies at each hop; the number of hops traversed trades off the error in the result against the increased hop count.
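A small sketch of the ring bookkeeping described above, with invented values for Alpha, s and k: ring i holds peers whose measured RTT falls in (Alpha*s^(i-1), Alpha*s^i]. Real Meridian evicts the member that contributes least to geographic diversity (the hypervolume criterion); dropping the newest entry below is only a stand-in for that policy.

import math

ALPHA, S, K = 1.0, 2.0, 3      # ring constants and per-ring capacity (all invented)

def ring_index(rtt_ms):
    # Ring i covers RTTs in (ALPHA * S**(i-1), ALPHA * S**i].
    if rtt_ms <= ALPHA:
        return 0
    return math.ceil(math.log(rtt_ms / ALPHA, S))

rings = {}                     # ring index -> list of (peer, measured RTT)

def insert_peer(peer, rtt_ms):
    members = rings.setdefault(ring_index(rtt_ms), [])
    members.append((peer, rtt_ms))
    if len(members) > K:
        # Stand-in eviction policy; Meridian keeps the k most diverse members.
        members.pop()

for peer, rtt_ms in [("p1", 3), ("p2", 12), ("p3", 14), ("p4", 90),
                     ("p5", 13), ("p6", 10)]:
    insert_peer(peer, rtt_ms)
# p6 is evicted because ring 4 (RTTs in (8, 16]) is already full.
print(rings)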
From kash@cs.cornell.edu Tue Mar 14 11:59:48 2006
From: Ian Kash <kash@cs.cornell.edu>
Subject: PAPER 14

Still bouncing to egs+summary

Vivaldi is an algorithm to compute the location of nodes in a synthetic coordinate system, with the goal that distances should approximately reflect RTTs. In order to do this, the algorithm is designed to minimize the squared error between the predicted RTTs (i.e. the distances) and the actual RTTs. The basic intuition of Vivaldi is that of a system of springs; the algorithm treats errors as creating spring forces and uses a gradient descent technique to move the system towards equilibrium. To fit Internet latencies, Vivaldi uses a 2d Euclidean metric augmented with a height vector. This captures the intuition that traffic from a node needs some amount of time to get into the "core" of the network, then is routed through the core, and finally needs some time to travel out of the core to the destination.

The choice of minimizing squared error is not very well motivated, other than that they have a nice procedure for doing it. Nothing about the Internet innately seems to suggest this metric over other error metrics. Furthermore, it is not clear that optimizing for any metric is really the right goal. The real goal is to help applications make good decisions; any metric used is valuable only as a proxy for that. The same complaint can be made about the height metric. In this case there is a plausible story behind why it is a good metric, but it is far from definitive (especially given the observation that Internet latencies do not satisfy the triangle inequality).

Meridian takes an approach that is more application-driven in spirit. Rather than attempting to use a synthetic metric to evaluate queries, Meridian keeps a logarithmic amount of state at each node that allows it to rapidly pass queries to nodes capable of giving accurate answers to them.
The basic organization is a set of rings around the node with exponentially increasing radii (where the radius measures the maximum RTT allowable in that ring). By maintaining a small, fixed number of peers in each ring (logarithmic state overall), nodes know quite a bit about their local area and progressively less about areas farther away. However, they know enough that, with high probability, they can still get queries answered. To minimize overhead, maintenance of the rings is done through gossip. Meridian is not able to entirely free itself of assumptions about an underlying metric: to make provable guarantees, some sort of low-dimensionality assumption is needed. However, Meridian is able to get by with somewhat weaker, non-geometric assumptions (i.e. growth-constrained or doubling metrics).
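A minimal sketch of the spring-style coordinate update discussed in the summaries above, using 2-d coordinates plus a height term; the constant timestep and the simplified height handling are assumptions for illustration, not the paper's exact adaptive, error-weighted rule:

    import math

    DELTA = 0.25  # constant timestep; the paper actually uses an adaptive, error-weighted delta

    def dist(a, b):
        """Distance under the height model: planar separation plus both heights."""
        return math.hypot(a["x"] - b["x"], a["y"] - b["y"]) + a["h"] + b["h"]

    def vivaldi_update(local, remote, measured_rtt):
        """Nudge the local coordinate along the 'spring' connecting the two nodes."""
        error = measured_rtt - dist(local, remote)          # >0 means the estimate is too short
        dx, dy = local["x"] - remote["x"], local["y"] - remote["y"]
        plane = math.hypot(dx, dy) or 1e-9                  # avoid dividing by zero for coincident points
        local["x"] += DELTA * error * (dx / plane)          # move away if underestimating, toward if over
        local["y"] += DELTA * error * (dy / plane)
        local["h"] = max(1e-6, local["h"] + DELTA * error)  # heights must stay positive

    a = {"x": 0.0, "y": 0.0, "h": 0.01}
    b = {"x": 10.0, "y": 0.0, "h": 0.01}
    vivaldi_update(a, b, measured_rtt=35.0)                 # one sample pushes a away from b
    print(a)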

From km266@cornell.edu Tue Mar 14 12:17:50 2006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on sundial.cs.cornell.edu X-Spam-Status: No, score=-1.4 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.0 X-Spam-Level: Received: from authusersmtp.mail.cornell.edu (granite1.mail.cornell.edu [128.253.83.141]) by sundial.cs.cornell.edu (8.11.7-20031020/8.11.7/M-3.22) with ESMTP id k2EHHot09834 for ; Tue, 14 Mar 2006 12:17:50 -0500 (EST) Received: from KEVSTOY (cpe-69-207-37-68.twcny.res.rr.com [69.207.37.68]) (authenticated bits=0) by authusersmtp.mail.cornell.edu (8.13.1/8.12.10) with ESMTP id k2EHHneS020350 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NOT) for ; Tue, 14 Mar 2006 12:17:49 -0500 (EST) Message-ID: <001301c6478b$417d6140$4425cf45@KEVSTOY> Reply-To: "Kevin" From: "Kevin" To: Subject: PAPER 14 Date: Tue, 14 Mar 2006 12:18:04 -0500 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.2527 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2527

GNP proposes a Global Network Positioning system in order to reduce latency. The goal of a positioning system is to get a good estimate of round-trip time to a peer without having to actually measure the latency (by pinging the peer). They accomplish this by setting up a system of servers that act as landmarks. By pinging these central servers, a node can find its position in the coordinate system. It works similarly to wireless triangulation: if you know the signal strength to several pre-determined wireless routers, you can determine your approximate location. The issue with GNP is its central-server philosophy: it has a single point of failure, and running these servers is unlikely to gain the operators anything. If a server goes down, every node in the system will have to re-ping the new landmark server to re-tune its position in the system. Vivaldi improves upon GNP in several ways. First of all, it is decentralized. There are no longer single points of failure in the system, as the landmark servers are no longer servers: they are peers. Vivaldi also explores several other coordinate systems and determines that a Euclidean coordinate system with a height vector added to it works best. The height vector adds more distance for the packet to travel, i.e., it has to travel the sender's height, then the actual distance on the plane, then the height of the receiving node. Their experiments showed that this worked better than other systems (which is interesting, since it seems like there are several degenerate cases where it wouldn't help). Vivaldi also shows a centralized algorithm that finds the optimal coordinates by solving a spring equation; it then decentralizes the algorithm and shows good results. In the end, it seems as if Vivaldi has some good characteristics, but it would not perform amazingly under periods of high churn. They show that the system can adapt decently well, but it would be interesting to see how well it does in practice. Meridian is another system for routing location-aware packets. Instead of putting every node in the system into a coordinate space, Meridian has each node keep track of concentric rings of exponentially increasing radius. A node can therefore keep track of the distance to any of the neighbors in these rings.
Using a Pastry- or Chord-like mechanism, Meridian can find the closest node in O(log n) steps. Each node periodically searches for a set of closer nodes to make sure it still has good information. It also keeps pointers to many other nodes that it discovers through the gossip protocol. In the end, Meridian outperforms both other systems; the only downside seems to be the extra overhead.

From ryanp@cs.cornell.edu Tue Mar 14 13:15:19 2006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on sundial.cs.cornell.edu X-Spam-Status: No, score=-2.2 required=5.0 tests=AWL,BAYES_00 autolearn=ham version=3.1.0 X-Spam-Level: Received: from exchfe2.cs.cornell.edu (exchfenlb-2.cs.cornell.edu [128.84.97.34]) by sundial.cs.cornell.edu (8.11.7-20031020/8.11.7/M-3.22) with ESMTP id k2EIFIt26754 for ; Tue, 14 Mar 2006 13:15:19 -0500 (EST) Received: from exchfe2.cs.cornell.edu ([128.84.97.28]) by exchfe2.cs.cornell.edu with Microsoft SMTPSVC(6.0.3790.1830); Tue, 14 Mar 2006 13:15:18 -0500 Received: from [128.253.211.203] ([128.253.211.203]) by exchfe2.cs.cornell.edu over TLS secured channel with Microsoft SMTPSVC(6.0.3790.1830); Tue, 14 Mar 2006 13:15:18 -0500 Mime-Version: 1.0 (Apple Message framework v746.2) Content-Transfer-Encoding: 7bit Message-Id: Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed To: Gun Sirer From: "Ryan S. Peterson" Subject: PAPER 14 Date: Tue, 14 Mar 2006 13:15:20 -0500 X-Mailer: Apple Mail (2.746.2) X-OriginalArrivalTime: 14 Mar 2006 18:15:18.0289 (UTC) FILETIME=[3FCE5810:01C64793]

The topic of this lecture is network positioning: determining the location, by some metric, of Internet objects to improve performance in several applications, such as finding the closest node to a client. Traditional methods for accomplishing this generally assign each node a set of virtual coordinates that ideally represent the node's position relative to other nodes in the system. Such coordinates are often computed by collecting a large number of metrics for each node and compressing them into a two-dimensional space. Projecting all of the data down to such a low dimension loses a lot of information, and researchers have invented other techniques for positioning nodes in a distributed system. The first paper, by Dabek, et al., presents a system called Vivaldi, which attempts to remedy the problems presented above. While Vivaldi also uses virtual coordinates to maintain relative node distances, it does not rely on an inaccurate dimension compression. Instead, it uses an insight from physics to continually shift nodes around by adjusting and readjusting their coordinates based on current network conditions. In physics, Hooke's Law describes the behavior of perfectly elastic springs and their interactions with each other. Vivaldi conceptually inserts springs into the network between nodes, which push and pull on each other based on measured statistics between the nodes. Because nodes have spring relations with multiple neighboring nodes, adjusting the distance between a pair of nodes will cause other springs connected to those nodes to adjust as well, propagating network measurements throughout the network to maintain the relative distances. One problem with virtual coordinates is that they often require O(N) time to update, and even when they are relatively accurate, they lead to suboptimal results.
Furthermore, because Internet links do not abide by the triangle inequality, it is impossible for virtual coordinates to accurately model node distances even in the best case, placing an inherent restriction on the accuracy of the system. Wong, et al., introduce a network location system called Meridian that does not rely on virtual coordinates. Each Meridian node maintains a constant number of concentric rings of exponentially increasing radii. For each ring, the node maintains a constant number of nodes whose distances fall within the range represented by that ring. Thus, each node keeps track of more close nodes than far-away nodes. When a node receives a query for its closest node, it forwards the request to the nodes in the appropriate ring, based on the distance of the client machine. Thus, each forwarding hop gets exponentially closer to the target closest node, akin to an O(log N) lookup in Chord or Pastry. As the network changes, Meridian nodes swap nodes in and out of the rings, replacing old nodes either with ones from a backup set of nodes for each ring or with new nodes from the network. Each node only keeps track of a constant number of nodes, so system maintenance time is reduced without sacrificing accuracy. Ryan

From lackhand@gmail.com Wed Mar 15 22:54:09 2006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on sundial.cs.cornell.edu X-Spam-Status: No, score=-1.2 required=5.0 tests=AWL,BAYES_00,HTML_00_10, HTML_MESSAGE autolearn=no version=3.1.0 X-Spam-Level: Received: from pproxy.gmail.com (pproxy.gmail.com [64.233.166.183] (may be forged)) by sundial.cs.cornell.edu (8.11.7-20031020/8.11.7/M-3.22) with ESMTP id k2G3s8t09800 for ; Wed, 15 Mar 2006 22:54:09 -0500 (EST) Received: by pproxy.gmail.com with SMTP id o67so347274pye for ; Wed, 15 Mar 2006 19:54:08 -0800 (PST) Received: by 10.35.54.20 with SMTP id g20mr501434pyk; Wed, 15 Mar 2006 19:54:08 -0800 (PST) Received: by 10.35.61.2 with HTTP; Wed, 15 Mar 2006 19:54:08 -0800 (PST) Message-ID: <9aa7a97d0603151954ga64a9c6gd8816959c90d712b@mail.gmail.com> Date: Wed, 15 Mar 2006 22:54:08 -0500 From: "Andrew Cunningham" To: egs+summary@cs.cornell.edu Subject: PAPER 14 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1

Andrew Cunningham
arc39

Crowds
Offering enhanced anonymity on the internet, Crowds operates by creating a network of cooperating peers who bounce any given web request through several local redirections before allowing it to leave; thus, upon receipt of a request, the remote server cannot know who issued the request it is responding to (as long as the issuer did not include any identifying details).
Individual entrants -- jondos, in the parlance of the paper -- rely on a server to introduce them to the crowd, but there is no reason that this should not be a well-known node in the system. An odd statement is made regarding mixes and how Crowds does not implement a mix: the paper claims that mixes do not provide sender anonymity. This need not be true, and is an odd thing to state, since there is no reason that it should be so; certainly, I did not learn about mixes that did not do this! Also, an attempt is made to claim that mixes' reliance on public key cryptography is a weakness, while a similar mechanism is even found in Crowds (though implied to be pairwise symmetric keys). However, the behavioral properties of Crowds are better than those of a system of mixes, because each element of a Crowds system serves as a mix for each other element. Complete knowledge of active jondos is assumed, which is not scalable (though traffic impacting any one node is relatively stable). Precapturing embedded content is extraordinarily clever, but does not completely preclude timing attacks -- page redirections might still trigger this (or are they usually javascript?). The biggest drawback to the system is the amount of trust which must be placed in fellow jondos -- SSL links are not supported, and due to the structure of the internet, it is extraordinarily difficult to ensure that a man-in-the-middle attack has not been executed in delivering web content. This could be done with end-to-end encryption, but for obvious reasons this is difficult, and so the limitations must be lived with; for certain media, multiple paths could be used to request the site, which (while lowering anonymity) would raise surety of content. The biggest problem is that joins are batched, and the possibility of linking requestors by spanning routes of requests (i.e., routes pre- and post-join batches) implies that browsing must be done quickly and targeted, which is not the standard use model for the internet these days!

P^5
P^5 performs structured anonymous broadcast via a broadcast tree, where a node joins several groups at various points in the tree, and distance from the root corresponds to message efficiency. The overhead in the system can become staggering, even with the assumed lateral links, though the system is guaranteed to be "secretive enough", i.e. rigorous proofs are presented that a node will not inadvertently expose more information than it chooses to. It assumes a public key infrastructure through the curious method of not assuming a global public key infrastructure while assuming that two parties may ascertain each other's public keys through OOB mechanisms. P^5's main shortcomings are that, due to the nature of its broadcast trees, each message sent results in a flood of (unreliable) messages throughout the network, hitting especially hard the closer one comes to the root. Also, P^5 is vulnerable to asynchronous analysis; it is noted that a user should not change the set of channels it is part of, or intersection attacks become feasible. However, this is essentially what happens each time a user logs off, which means that repeated use of P^5 leaks information. Also, though we can have both sender and receiver anonymity, this paper suffers from a very simple problem: there is no reason to trust any given message, since it cannot rely on any information about the sender to vouch for its relevance.
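Returning to the Crowds redirection described at the start of this summary, a minimal sketch of the probabilistic forwarding decision; the forwarding probability and the jondo list are illustrative assumptions, and real jondos also encrypt links pairwise and strip identifying headers:

    import random

    FORWARD_PROB = 0.75  # chance of another local redirection; the paper calls this p_f (value assumed)

    def crowd_path(jondos):
        """Pick the chain of jondos a request passes through before one submits it to the server."""
        path = [random.choice(jondos)]         # the initiator hands the request to a random jondo
        while random.random() < FORWARD_PROB:  # with probability p_f, bounce to another random jondo
            path.append(random.choice(jondos))
        return path                            # the last jondo on the path contacts the end server

    jondos = ["jondo-%d" % i for i in range(5)]
    print(" -> ".join(crowd_path(jondos)))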
Dining Cryptographers
The cryptography used here is easy, though with many "side effects". By relying on a broadcast medium, and some structure where each edge in the graph has a random bit shared between the two nodes it links, a bit of information may be communicated by each non-speaker broadcasting the cumulative xor of each incident edge's bit; the speaker also xors in the bit that the speaker wishes to communicate. After each step of speaking, the spoken bit may be determined by taking the xor of the communicated bits -- thus, there is a perfect model of who-knows-what. The scheme suffers from some drawbacks, but far fewer than other systems for anonymity -- partitions are dangerous, allowing information to leak (in the extreme case, where the broadcaster is surrounded by colluding nodes, their information is completely compromised), but with even one non-colluding node, the speaker is again "anonymous". The base version of the system can be incredibly expensive, as for each bit of message traffic, essentially N^2 information must be exchanged -- if a node has connectivity less than N, then it trusts some portion of the users, because it is now that much easier to partition that node, determining whether or not it is the speaker. Also, it's very slow, as each bit sent must be sent throughout the entire network; though they point out that this can be done en masse, by sending an entire message of bits, the point remains that the activities of every remote peer, not just a subset of the remote peers, determine the propagation of the message. Finally, the entire system relies on an efficient broadcast mechanism which, while not vulnerable to analysis (we are still essentially using one-time pads), is certainly not reasonable to assume: each node ends up having to have N^2 connectivity if only to disseminate information; there are more efficient schemes for this that could be used, but the paper doesn't reference them. As a postscript, denial of service and noise injection are trivial in this system, and devastating.

Herbivore
Built upon the strength of DC-nets and assumptions about the strength of cryptographic techniques, Herbivore builds another anonymous system that is more scalable and efficient than plain DC-nets. It is built over Pastry, with crypto-puzzles to protect against Sybil attacks; the insight is that we can have clusters of high connectivity to conceal information, which preserve many of the extraordinarily good security properties of DC-nets. The paper leaves unclear precisely how slots are reserved without revealing the intended author for any given period -- usually one would try to inductively use the same system for this, but this is clearly unallowable, as the entire issue is collision detection. It is implied that this is a well-known solution, but if this problem may be solved anonymously, then the entire question is easily solved by simply using the scheduling algorithm "larger". Also, if the individual to whom a given slot has been assigned becomes known, then since the message is known (in encrypted form at minimum!), the anonymity has failed, and thus this reservation decision is of utmost importance! Also, the system uses "local proxies" to explore other cliques on the ring; we can only hope that they are elected through this protocol (otherwise, we essentially rely on onion routing). Performance is enhanced under Herbivore, but is clearly still not optimal; this is apparently a necessary condition, and unfortunate.
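A minimal sketch of one dining-cryptographers bit round as described above, assuming for simplicity that every pair of nodes shares a secret bit (a complete key graph):

    import itertools
    import random

    def dc_round(n, speaker, message_bit):
        """One DC-net round: every pair shares a secret bit; only the speaker folds in the message."""
        shared = {pair: random.randint(0, 1) for pair in itertools.combinations(range(n), 2)}
        announcements = []
        for node in range(n):
            bit = 0
            for pair, key in shared.items():
                if node in pair:
                    bit ^= key                  # XOR of the keys on all edges incident to this node
            if node == speaker:
                bit ^= message_bit              # the speaker additionally XORs in the message bit
            announcements.append(bit)
        recovered = 0
        for bit in announcements:
            recovered ^= bit                    # every shared key cancels, leaving the message bit
        return announcements, recovered

    announcements, recovered = dc_round(n=5, speaker=2, message_bit=1)
    print(announcements, "->", recovered)       # recovered equals the message bit; the speaker stays hidden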
From lackhand@gmail.com Mon Mar 27 19:06:06 2006 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on sundial.cs.cornell.edu X-Spam-Status: No, score=-0.5 required=5.0 tests=AWL,BAYES_00,HTML_00_10, HTML_MESSAGE,RCVD_IN_BL_SPAMCOP_NET autolearn=no version=3.1.0 X-Spam-Level: Received: from pproxy.gmail.com (pproxy.gmail.com [64.233.166.183]) by sundial.cs.cornell.edu (8.11.7-20031020/8.11.7/M-3.22) with ESMTP id k2S066Y00917 for ; Mon, 27 Mar 2006 19:06:06 -0500 (EST) Received: by pproxy.gmail.com with SMTP id s49so1584943pyc for ; Mon, 27 Mar 2006 16:06:05 -0800 (PST) Received: by 10.35.49.4 with SMTP id b4mr1829417pyk; Mon, 27 Mar 2006 16:06:05 -0800 (PST) Received: by 10.35.61.2 with HTTP; Mon, 27 Mar 2006 16:06:05 -0800 (PST) Message-ID: <9aa7a97d0603271606x59513eeci60c27ca54d41ba92@mail.gmail.com> Date: Mon, 27 Mar 2006 19:06:05 -0500 From: "Andrew Cunningham" To: egs+summary@cs.cornell.edu Subject: PAPER 14 MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1

Andrew Cunningham
arc39

EigenTrust
Hector Garcia-Molina
Attempting to deduce the trust values to place in entities and objects removed from oneself in the network, EigenTrust uses a system of "entities rating entities" to discover not just the trust that A places in B (in order to aggregate and so determine whether B should be trusted), but the aggregate trust placed in A by the network as a whole, via the observation that by asking one's friends, and one's friends of friends, and so forth, one creates a completely connected segment which should be equal to the entire network.
This turns out to be a problem equivalent to solving for the principal eigenvector of a matrix whose entries represent localized trust value measurements. Key insights include the rarity of individual client-client interactions, leading to a sparse matrix representation and relatively low coordination message overhead, and that piecewise computation of this matrix function converges relatively quickly, with each individual node calculating its own trust value (conceptually only, due to a security flaw; in actuality, M deterministically selected trust-manager peers calculate this value on behalf of each node). As the paper points out, the difficulty with this algorithm is that its end result is universally viewed, which means that there is no information asymmetry in the network which can be leveraged; this will almost universally result in the best possible peer being isolated and used, leading to an accrual of trust, which will cause the problem to escalate. The probabilistic algorithm will fix this, but it is something of an altruistic solution, since it relies on each individual accepting the choice of a suboptimal path to patch a "black hole" in the system. Their attacker model is sufficiently malicious, but their experiments seem to leave something to be desired; the subtext is that the effect is visible at the small scale, but that the data in the large-scale experiments are too large to fully study. However, these data are not presented; presumably the effect is cleaner in a small network.

Robust Reputation System for P2P and Mobile Ad-hoc Networks
Jean-Yves Le Boudec
Another attempt at validating entities in a distributed fashion, this paper uses another reputation scheme, attempting to strike the happy medium between considering the reports of others and relying entirely on self-generated interactions. It does this by pushing first-hand information out into the network, based on a modified Bayesian approach and a linear model-merging heuristic, leaving reputation and trust ratings private. It does this in the absence of primary trusted entities (the seed values that EigenTrust relies on). It uses reputation fading to keep ratings "current", and is exceedingly decentralized.
Not much space is given in the paper to using others' deduced trust values, just first-hand operations. This means that while the system can be quite scalable, as it relies only on local data about any entity who is to be interacted with, it gives the impression of being lower overhead than EigenTrust without actually being so; if one wishes to later interact with a remote entity, looking up its rating requires just as much work, and the calculation is not particularly easier here, merely a different metric.
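A minimal sketch of the principal-eigenvector computation described above, run here as centralized power iteration over a row-normalized local-trust matrix; the pre-trusted-peer weighting and the distributed trust-manager protocol are omitted, and the matrix values are made up for illustration:

    def eigentrust(local_trust, iterations=50):
        """Power iteration t <- C^T t, where C is the row-normalized local trust matrix."""
        n = len(local_trust)
        c = []
        for row in local_trust:
            total = sum(row)
            # Peers with no opinions are treated as trusting everyone uniformly.
            c.append([v / total for v in row] if total > 0 else [1.0 / n] * n)
        t = [1.0 / n] * n                       # start from the uniform trust distribution
        for _ in range(iterations):
            t = [sum(c[j][i] * t[j] for j in range(n)) for i in range(n)]
        return t

    # Hypothetical local trust values: peer 2 is well regarded, peer 3 is a stranger.
    local_trust = [
        [0, 3, 5, 0],
        [2, 0, 4, 0],
        [1, 2, 0, 0],
        [0, 0, 1, 0],
    ]
    print([round(v, 3) for v in eigentrust(local_trust)])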