From mp98@cornell.edu Tue Dec 3 00:00:37 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB350b903293 for ; Tue, 3 Dec 2002 00:00:37 -0500 (EST) Received: from cornell.edu (r109493.resnet.cornell.edu [128.253.240.252]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id AAA01241 for ; Tue, 3 Dec 2002 00:00:35 -0500 (EST) Date: Tue, 3 Dec 2002 00:00:35 -0500 Mime-Version: 1.0 (Apple Message framework v546) Content-Type: text/plain; charset=US-ASCII; format=flowed Subject: 615 Paper 74 From: Milo Polte To: egs@CS.Cornell.EDU Content-Transfer-Encoding: 7bit Message-Id: <289ECE66-067C-11D7-BCAD-003065EE5F0A@cornell.edu> X-Mailer: Apple Mail (2.546) This paper is motivated by a need to increase the real-time reliability of packets over a network. Traditional approachs (i.e. TCP's timeout and retransmission) are prohibitively slow for some applications. The author's approach uses multiple paths (thus sacrificing efficiency in terms of bandwidth) in order to achieve a lower rate of retransmission. The use of XML for this protocol is incidental and not necessary to the innovation at hand. Rather it provides an easy platform for implementation (XML routers) and a standard expressive format for data. Their system is divided into Root Routers (who generate messages--A somewhat odd concept), internal routers (who disseminate the information to the clients) and the clients themselves who consume the XML streams. Clients join to n internal routers and specify what kind of XML stream they desire. It is up to the internal routers then to parse the streams from the root servers into the desired format. This may include merging streams from different root routers (routers performing this task are 'combining routers'). The XML router core is responsible for this merging and processing and is treated somewhat as a black box as the particulars of its operation will vary from implementation to implementation (though the authors suggest the XQuery language). The connections between clients and routers and between the routers and their parents are maintained through the Diversity Control Protocol. In DCP, packets are content identified rather then source identified. This way, all the parents of a node may send the same packet and the client can identify them as clones of the same data. Thus, a client can simply take the first good packet it sees and ignore the rest. Note that packet ids must be assigned at the root routers; it is unclear how this works with combining routers. Dynamic networks are easily configured by a simple (perhaps too simple) joining protocol. The same algorithm is used to find replacements for failed nodes--If this fails, the node must rejoin. From tmroeder@CS.Cornell.EDU Tue Dec 3 01:02:13 2002 Received: from dhcp98-13.cs.cornell.edu (dhcp98-13.cs.cornell.edu [128.84.98.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB362D915650 for ; Tue, 3 Dec 2002 01:02:13 -0500 (EST) Received: (from tmroeder@localhost) by dhcp98-13.cs.cornell.edu (8.11.6/8.11.6) id gB362Dm01263; Tue, 3 Dec 2002 01:02:13 -0500 From: Thomas Roeder MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15852.18661.398566.916791@dhcp98-13.cs.cornell.edu> Date: Tue, 3 Dec 2002 01:02:13 -0500 To: Emin Gun Sirer Subject: 615 PAPER #74 X-Mailer: VM 7.07 under Emacs 21.2.1 Today's paper describes a method of creating a mesh of routers to distribute XML-encoded content at high rates. The main assumption here is that we are willing to spend large amounts of bandwidth to have published data reliably reach subscribers. In the context of local area networks or Internet2 hosts, which have the bandwidth to share, this is sensible, but I am not convinced that it is cost effective in the case of regular clients on the Internet, which tend to have much lower rates, and whose connections are already saturated with the streaming data that is currently available. They describe a transmission protocol called DCP (Diversity Control Protocol), which is effectively a multi-source TCP without the congestion control. It also requires some justification with respect to the end-to-end argument, since it requires acknowledgment at each hop (although for fairly clear reasons, since it is meant that the receiver not have any need to know from which of the sources it received the packets.). The lack of congestion control is worrisome, given the difficulty that the Internet has had in the past in sustaining such protocols. Again, in certain contexts, it may not be an issue, but the issue is not resolved in the paper. Further presented is a method for maintaining fault-tolerance of the sources of the packets. They seek only to protect against fail-stop of the links, and do not consider Byzantine faults, which indeed could mix up the order of the packets and cause great confusion at the recipients end. For TCP and other single-source protocols, trust is required that the data is what it claims to be. With n sources, however, it would be preferable that the malicious nature of one not destroy the work of the others. This is, I imagine, future work, since the assumption of fail-stop is the obvious place to start. In their tests, the implementations seem to perform reasonably well, and get very high rates and low loss. I would be interested to see the effects of a more widespread distribution of this technology, and whether or not the local results generalize well. From shafat@CS.Cornell.EDU Tue Dec 3 01:52:34 2002 Received: from exchange.cs.cornell.edu (exchange.cs.cornell.edu [128.84.97.8]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB36qY925782 for ; Tue, 3 Dec 2002 01:52:34 -0500 (EST) content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Subject: 615 PAPER 74 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 Date: Tue, 3 Dec 2002 01:52:33 -0500 Message-ID: <47BCBC2A65D1D5478176F5615EA7976D135005@opus.cs.cornell.edu> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 615 PAPER 74 Thread-Index: AcKZ3lWHOyZdfZBESh6hZC/GOMF1ZA== From: "S. Shafat Zaman" To: "Gun Sirer" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by sundial.cs.cornell.edu id gB36qY925782 This paper presents a new, highly reliable mechanism to distribute "time-critical data" over a heterogenous, mesh-based overlay network using XML packets. It highlights the usage of such mechanisms in applications (e.g. air-traffic control) where reliability and extremely low latency are of utmost importance. The network is set up in a mesh structure with each node connected to n parents, making it (n-1) resilient. The most important contribution of this paper however, is the use of XML routing in filtering contents of the packets and route only the desired portions on separate links. Every XML packet reaching a node is therefore decompressed, relevant information is extracted based on a client's query, and is recompressed for individual clients. Clients can use a query language like XQuery to place their queries. Every XML router implements three main algorithms and protocols: XML router core - receives and forwards packets based on queries in an efficient manner. Diversity Control Protocol (DCP) - allows a receiver to reconstruct a packet stream from diverse senders and thus reducing latency and improving reliability. Lastly, a set of mesh intialization and maintenance algorithms that handles joining and departing nodes, and takes of link failures. The paper also evaluates two implementations of this XML based routing scheme. The results match the expectations in that the delivery rates of packets do not fall even at high failure rates. Latency, on the other hand, improves as the depth of the mesh increases. From vrg3@cornell.edu Tue Dec 3 08:55:47 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Dtk912908 for ; Tue, 3 Dec 2002 08:55:46 -0500 (EST) Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by travelers.mail.cornell.edu (8.9.3/8.9.3) with SMTP id IAA13740 for ; Tue, 3 Dec 2002 08:55:45 -0500 (EST) Date: Tue, 3 Dec 2002 08:55:45 -0500 (EST) From: vrg3@cornell.edu X-Sender: vrg3@travelers.mail.cornell.edu To: egs@CS.Cornell.EDU Subject: 615 PAPER 74 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII This paper presents a technique for multicasting data from root routers out to clients extremely reliably and with very low latency. The way the authors try to achieve these properties is by focusing on minimizing packet loss, while allowing bandwidth usage to increase. The basic idea is to introduce redundancy into the network so as to eliminate the need for TCP's retransmission-based packet loss recovery. The system consists of an overlay network, with root nodes creating content, and XML routers processing the content and then forwarding it to clients. Each client is connected to many routers, thus providing redundancy. The content is all encoded in XML so that a client can explicitly tell the routers what portion of the content it is interested in, and the routers can easily parse the content to extract that portion. This also allows routers to accept data from more than one root node to be aggregated. With the redundant network routing scheme in place, and with the aforementioned goals in mind, the authors eschew TCP in favor of a new protocol, DCP, the Diversity Control Protocol. DCP is designed such that packets are assigned sequence numbers at the root node, and if multiple nodes produce the same data packet, they all assign it the same sequence number. Since some packets never get forwarded, since the client isn't interested in all packets, each packet that is forwarded contains the sequence number of the previously forwarded packet. The result is that DCP packets are identified by their content rather than by their sender. This way, a client can receive each packet of the stream from a different router and still stitch them together. DCP includes a retransmission protocol, which should be very rarely used because of the redundancy in the network. The contribution of a well-defined way to trade bandwidth for reliability is an interesting one. In their tests, the system seems to work well, providing good reliability. It seems best targeted at relatively small private networks where it isn't impolite to use large amounts of bandwidth for your private goal. However, if it is to be deployed at a large scale, security becomes an issue, and having a distributed routing scheme might allow for some interesting ways of implementing security. From adam@graphics.cornell.edu Tue Dec 3 09:34:30 2002 Received: from bach.graphics.cornell.edu (bach.graphics.cornell.edu [128.84.247.50]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3EYT919959 for ; Tue, 3 Dec 2002 09:34:30 -0500 (EST) Received: from envy.graphics.cornell.edu (envy.graphics.cornell.edu [128.84.247.206]) by bach.graphics.cornell.edu (8.12.1/8.12.1) with ESMTP id gB3EYNDc064774 for ; Tue, 3 Dec 2002 09:34:23 -0500 (EST) Date: Tue, 3 Dec 2002 09:33:37 -0500 (EST) From: Adam Kravetz To: egs@CS.Cornell.EDU Subject: 615 paper 74 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Mesh-Based Routing w/ XML: This paper explores multicast, reliable routing of information in an overlay network using XML. It's goals are low latency and good error recovery, in an effort to have low loss at the client level. Increased bandwidth usage (redundancy) is allowed in order to achieve these goals. The overlay network is constructed out of three components: Root routers, intermediate routers and clients. A root router is a source of information is sending out XML packets to intermediate routers. Each intermediate router can take multiple XML streams and merge them together to provide for redundant streams. Each client will receive from multiple intermediate routers for redundancy. The XML part of this overlay is a big feature, that is that any subset of the stream data is possible to any client. Each client "subscribes" to the XML content it wants and only gets forwarded that content from the intermediate routers. There are many synchronization, subscription and error control issues that go along w/ multiple streams, mixing and converging, but there are delt w/ well in this paper. The novel idea of Multiple stream routing for redundancy and low error rate is very good. In a system that is designed solely for reliable data (where modest bandwidth increase isn't an issue) having multiple streams eliminates the path style failures which are so common in networks. The idea of using subscription based content dispersion (you tell a router what part of the stream you want and it sends only that to you) is also really good since it can support very lightweight appliances at endnodes. The disadvantages to this however are not to be ignored. First the idea that routers are doing more than routing is moving strictly away from the model of the internet (end to end), but in a network designed specifically for this purpose it could be very useful. Secondly the root router problem often represents a single point of failure. It is often that we want data collected from a sensor (such as temp) that isn't redundant. For a multiple root router system there would need to be a manyfold increase in sensors or servers (like 2 weather stations in ithaca instead of just one for redundant root routing about ithaca) in some situations. I am also concerned that this system might not scale very well. What happens when there are 10,000 sources of data and 10,000,000 people that all want to receive some subset of it (such as meterologists receiving weather data from weather stations). Routers might have to keep an incredible amount of state around to decide who gets what, causing major overhead issues as each packet is filtered for the correct data. I expect the XML parsing/filtering/subsetting to become a bottleneck on large scale networks. One suggestion that I would have to improve this system is to build a hierarchical system of first-mile and last-mile routing. To make scalability better the first-mile routers could forward everything and last-mile (those w/in a few hops of the target could filter the XML). From kwalsh@CS.Cornell.EDU Tue Dec 3 10:03:19 2002 Received: from kwalsh4158.cs.cornell.edu (dhcp99-39.cs.cornell.edu [128.84.99.39]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3F3J925365 for ; Tue, 3 Dec 2002 10:03:19 -0500 (EST) Message-Id: <5.1.1.6.0.20021203100248.00a9a658@popsrv.cs.cornell.edu> X-Sender: kwalsh@popsrv.cs.cornell.edu X-Mailer: QUALCOMM Windows Eudora Version 5.1.1 Date: Tue, 03 Dec 2002 10:03:19 -0500 To: egs@CS.Cornell.EDU From: Kevin Walsh Subject: 615 PAPER 74 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Mesh-Based Content Routing using XML The authors propose an XML-based publish/subscribe routing mechanism. Several points are key to the proposal. First, that multiple paths may increase reliability of end-to-end data streams. This approach avoids (in the common case) the need for retransmission at the cost of increased traffic. The routing takes place in an overlay constructed as a k-connected acyclic mesh. Failures (up to k-1) in the mesh do not immediately lead to data loss, because of the redundant links. The mesh is restored gradually back to the k-connected state. Subscriptions are XML "queries". Publishers send streams of self-contained XML objects, which are matched against subscription queries. Since XML is structured, router nodes (supposedly) can aggregate all down-stream queries into a single query. The upstream router then passes all incoming packets that match the query down the link. The authors attempt to tackle some of the issues that come up, but the solutions presented seem fairly simplistic. First, multiple senders can publish to a logical "stream", but receivers would like to totally order packets on a stream (excepting packets in which they are not interested). If all packets have a totally ordered sequence number, then a receiver can just accept the first well-formed packet for each sequence number, dropping later copies. To allow for holes (due to pruning of uninteresting packets), routers stamp each packet with the current and last sequence number. Retransmission is done hop-by-hop. The problem of generating sequence numbers is not much discussed: it is assumed that "root" routers for each stream will generate the correct totally ordered stamps. For mesh construction, they rely on a set of root publishers for every stream. The roots must be independent (for resistance to failure) but synchronized (for generating packet stamps). When a node (either an "internal XML router" or "client") wants to join the mesh, it begins with the root nodes and walks down the tree looking for n parents. While simple, this algorithm does not seem to ensure independent paths, and does not take the underlying network topology into account. From smw17@cornell.edu Tue Dec 3 10:10:15 2002 Received: from cornell.edu (cornell.edu [132.236.56.6]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3FAF926560 for ; Tue, 3 Dec 2002 10:10:15 -0500 (EST) Received: from cornell.edu (syr-24-161-107-202.twcny.rr.com [24.161.107.202]) by cornell.edu (8.9.3/8.9.3) with ESMTP id KAA02836 for ; Tue, 3 Dec 2002 10:10:14 -0500 (EST) Message-ID: <3DECC7EA.3000604@cornell.edu> Date: Tue, 03 Dec 2002 10:04:10 -0500 From: Sean Welch User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Emin Gun Sirer Subject: 615 PAPER 74 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit The basic goal of this paper is to design a resilient XML distribution network to provide a degree of failure resilience. In order to do so, the authors created a mesh overlay structure, preferring to use redundant network paths to achieve the desired fault tolerance. Each node in the mesh connects itself to n parent nodes, with each parent node sending an independent data stream to the child. Data packets are identified based on content regardless of source, permitting the child to make use of the first data item it recieves from any of the n parents. This structure creates what the authors term an n-1 resilient network, in that the routing is preserved in the face of any n-1 nodes failing. This is based on the fact that the mesh at any given point has a minimum cut of n links, and is acyclic. The router core at each node performs two major functions. The first is compression and decompression of the data stream. This permits the network to reduce the size overhead of XML (relative to simpler but less extensible formats). The second function is to route XML packets based on the interests of the children. Each child, upon joining the network, presents a set of queries that identifies the XML packets that that particular node is interested in. Parent nodes keep track of these interests, and only forward packets to nodes that have expressed an interest in those packets. Each packet is identified by an identifier AN that is based solely on the packet contents. These identifiers work together with the Diversity Control Protocol (DCP) to permit a node to accept a data packet from any of its parent nodes to forward a data packet through the XML router. Only in the case where none of the parents forward a data packet (or all drop one particular packet) does the child produce a NACK, requesting retransmission of a particular data item. While I found the idea an interesting approach to content distribution, I had a number of problems with the specific approach. First of all, it seems to me that the assumption that all links are independent is far from a good assumption. Given the structure of the internet and the implementation of an overlay network, it is highly likely that many nodes will tend to pass through a relatively small number of physical links. The impact of non-independent links is not considered here. Secondly, the authors completely sidestep the issue of latency by running the physical links through a localhost redirection, and attempting to compensate by imposing a large retransmission latency. I would like to see a more realistic network model implemented to give a better idea of actual latencies in the system. From jsy6@postoffice2.mail.cornell.edu Tue Dec 3 10:24:53 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3FOq929597 for ; Tue, 3 Dec 2002 10:24:52 -0500 (EST) Received: from Janet.cornell.edu (syr-24-59-78-146.twcny.rr.com [24.59.78.146]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id KAA23053 for ; Tue, 3 Dec 2002 10:24:49 -0500 (EST) Message-Id: <5.1.0.14.2.20021203102349.00b59818@postoffice2.mail.cornell.edu> X-Sender: jsy6@postoffice2.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Tue, 03 Dec 2002 10:24:33 -0500 To: egs@CS.Cornell.EDU From: Janet Suzie Yoon Subject: 615 PAPER 74 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed The main goal of this paper is to provide a protocol that allows reliable multicasting of time-critical data to heterogeneous clients. This protocol is built ontop of a mesh overlay network. The reason for having a mesh network (each node is connected to n parents) is to tradeoff transport costs due to duplications for latency and reliability (retransmission avoided in face of loss due to duplicate data). The network consists of root rooters (origins of data) that link to internal routes that in turn link to the clients. Due to this network structure, a child's resilience is also dependent on each of its n parent's resilience. The root routers can be seen as a "central" point of failure in terms of resilience. The failure of a core rooter will reduce the resilience level of all clients that connect to the failed node. From besema@yahoo.com Tue Dec 3 10:38:56 2002 Received: from web13703.mail.yahoo.com (web13703.mail.yahoo.com [216.136.175.136]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with SMTP id gB3Fct902308 for ; Tue, 3 Dec 2002 10:38:56 -0500 (EST) Message-ID: <20021203153854.48896.qmail@web13703.mail.yahoo.com> Received: from [132.236.71.82] by web13703.mail.yahoo.com via HTTP; Tue, 03 Dec 2002 07:38:54 PST Date: Tue, 3 Dec 2002 07:38:54 -0800 (PST) From: Nana B Sam Subject: 615 PAPER 74 To: egs@CS.Cornell.EDU MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Mesh-Based Content Routing Using XML This paper makes 3 contributions: - XML routing: first packet-based network XML routing - Mesh-based overlay networks: first overlay network to use multiple, redundant paths for simultaneous transport of multicast streams - Diversity control protocol: novel protocol that uses source-independent sequence numbers to reliably reconstruct a sequenced packet stream from multiple sources. DCP reduces latency and improves reliability when compared to conventional single-sender approaches. Motivated by an interest in building an infrastructure for distributing and processing real-time air traffic control data, the authors take advantage of the ease of modification and deployment of overlay networks and build one that transports XML streams. Their design involves constructing a content distribution mesh where each node is connected to n parents, receiving duplicate packet streams from each of its parents. By maintaining an acyclic mesh, this approach claims to guarantee that the minimum cut of the mesh is n nodes or independent links. Thus a mesh is resilient to (n-1) node or (n-1) independent link failures without repair. If a mesh failure occurs, recovery algorithms restore the mesh to (n-1) resilience in a few seconds. Clients wishing to join an (n-1) mesh: - compose an XML query with description of desired data - contact n existing routers (each of which contains a query table and implements an XML router core, the DCP, mesh initialization and maintenance) that can service the query - send the XML query to these n routers - receives the XML stream described by the query. The content carried by routers in a mesh can be statically or dynamically configured. A static configuration gives clients a wide choice of routers for servicing requests but requires a fixed bandwidth capacity throughput. A dynamic configuration allows a router to carry only relevant packet stream leading to significant bandwidth savings. Their approach may not be scalable since there is an enormous overhead in making each XML packet a fully formatted document. __________________________________________________ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com From ashieh@CS.Cornell.EDU Tue Dec 3 10:42:42 2002 Received: from zinger.cs.cornell.edu (zinger.cs.cornell.edu [128.84.96.55]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Fgg903289 for ; Tue, 3 Dec 2002 10:42:42 -0500 (EST) Received: from localhost (ashieh@localhost) by zinger.cs.cornell.edu (8.11.3/8.11.3/C-3.2) with ESMTP id gB3FggM19192 for ; Tue, 3 Dec 2002 10:42:42 -0500 (EST) Date: Tue, 3 Dec 2002 10:42:42 -0500 (EST) From: Alan Shieh To: Subject: 615 PAPER 74 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Mesh-based content routing using XML This paper contributes a multicast scheme in an overlay network with novel techniques for content-based routing and redundant transport. In this system, all data is distributed in XML, and clients express the information that they wish to receive using XPath and other XML query languages. These queries are used to filter the information that is sent to each client. If a client is also a router, and hence has its own children, then the disjunction of the queries of each children is sent to the router's parent. The purpose of this XML-based content routing is to conserve bandwidth in a content-based manner. The content distribution mesh dynamically generates a DAG radiating outward from the data sources (root nodes). The graph construction algorithm proceeds iteratively from the root nodes towards the edges of the network, and a node is attached at the highest level nodes that are fastest to respond (a router need not respond if it has insufficient bandwidth, or does not have the necessary query stream). Each node uses this procedure to find n parents. Each of a node's parents sends a content stream that contains at least the information that the node requests. DCP (Diversity Control Protocol) is used to reliably reconstruct the original stream, and uses sequence numbers to detect missing packets. Unlike TCP, there may be multiple independent sources for information: there are multiple root nodes, and each intermediate router uses DCP to reconstruct the content stream before forwarding. Hence, ANs (appliication serial numbers, the sequence numbers used in DCP) are embedded in or generated from the data itself, that is independent of the source or routing path. Since XML queries are used to filter the content streams and thus may introduce gaps in the ANs, the DCP header contains chaining information. If an element is still missing from the chain after a timeout period, then DCP sends a NACK to all of its parents. Parents may also request acknowledgements from a client; this is used for congestion control. The results show that diversity in the distribution mesh allows for an exponential decrease in packet loss and more consistent latency as loss increases. For the same loss rate and bandwidth, DCP provides better throughput than both TCP and erasure code streams. ** Shortcomings The comparison between TCP and DCP was not constructed properly. It is also possible to use diversity routing with TCP if the root is configured with multilink Ethernet, and the routers configured appropriately. The client would send SACKs back along each incoming link. The tree construction and maintenance algorithms are somewhat rudimentary. ** Future work - Since the distribution mesh is overlaid on another network, it lacks complete knowledge about the underlying topology. However, burning extra bandwidth for redundant routing requires that each such path be diverse, and different sets of paths would have different values from a redundancy point of view. Hence, it would be nice to have a mechanism for detecting the overlap of a particular set of paths. - Content-based filtering trades off CPU time for bandwidth savings. This requires an admission control mechanism for throttling the amount of CPU time that may be consumed by queries. This dichotomy also suggests that it might be nice to have rules for determining whether a router should burn CPU, or send an unfiltered stream to each client. Such a decision would hinge on the bandwidth costs and constraints within the system. For instance, if two routers happen to be in the same AS, with all internal links (which tend to be free), then bandwidth conservation is not so much of a concern. - Multilevel filter optimization. Conceivably, if multiple routers are in the same AS, then bandwidth is really cheap, and so filtering may be performed hierarchically. However, this is a difficult problem, as reducing filtering in an upstream node would require a downstream node to perform more filtering. Moreover, cycles in upstream nodes are worth more than cycles in the downstream nodes that would otherwise have to independently perform filtering if not performed upstream. Hence, any such optimization would have to account for the both the aggregate bandwiidth and aggregate CPU consumption of deferring filtering, as well as QoS considerations for a CPU-bottlenecked server. From ag75@cornell.edu Tue Dec 3 11:24:54 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3GOs912848 for ; Tue, 3 Dec 2002 11:24:54 -0500 (EST) Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by travelers.mail.cornell.edu (8.9.3/8.9.3) with SMTP id LAA11977 for ; Tue, 3 Dec 2002 11:24:51 -0500 (EST) Date: Tue, 3 Dec 2002 11:24:51 -0500 (EST) From: ag75@cornell.edu X-Sender: ag75@travelers.mail.cornell.edu To: egs@CS.Cornell.EDU Subject: 615 PAPER 74 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII In this week's paper we are presented with a new approach for reliably multicasting time-critical data to heterogeneous clients over mesh-based overlay networks. The idea is to have a very reliable data distribution system that can deliver the information to the clients with low latency in the presense of both node and link failures. In order to do this, the authors use a distributed mesh, where every node is connected to n parents, receiving duplicate packet streams from each of its parents. Such a mesh is resilient to n-1 node or independent link failures without repair. To facilitate intelligent content pruning, data streams are comprised of a sequence of XML packets and forwarded by application-level XML routers. The routers use Diversity Control Protocol for router-to-router and router-to-client communication. DCP reassembles a stream of packets from multiple senders using the first copy of a packet to arrive from any sender. Experimental results show that such a mesh can provide acceptable delivery rates over even extremely lossy channels - a node with 4 parents can expect a loss rate of less than 5% when each parent experiences a loss rate of up to 45%. Average latency is also greatly improved with higher levels of redundancy. Of course, all these benefits come at a price - high bandwidth consumption, and the more reliable and low latency you want the mesh to be, the more bandwidth you need. From mvp9@cornell.edu Tue Dec 3 11:40:52 2002 Received: from cornell.edu (cornell.edu [132.236.56.6]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Geq916535 for ; Tue, 3 Dec 2002 11:40:52 -0500 (EST) Received: from zoopark.cornell.edu (syr-24-59-74-132.twcny.rr.com [24.59.74.132]) by cornell.edu (8.9.3/8.9.3) with ESMTP id LAA17291 for ; Tue, 3 Dec 2002 11:40:51 -0500 (EST) Message-Id: <5.1.0.14.2.20021203113934.01a69e40@postoffice.mail.cornell.edu> X-Sender: mvp9@postoffice.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Tue, 03 Dec 2002 11:40:48 -0500 To: egs@CS.Cornell.EDU From: mike polyakov Subject: 615 PAPER 74 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed This paper attacks the challenge of reliable broadcasts, particularly for latency-sensitive applications. The basic premise of the work is that bandwidth can be traded for reliability, but instead of using redundancy of data packets, they instead use redundancy of paths. There are three types of elements in their networks: root routers, which provide the actual content schemes, intermediate routers and clients. The main innovation of the paper is the DCP protocol, which splits a file into chunks in a standardized manner and names packets [regardless of source], thus allowing any receiver to combine a given stream coming from multiple sources back into one. When a client subscribes to a stream, he will receive it from multiple routers (root or intermediate) and failure of a subset of them or their links will not be harmful; the intermediate router performs similarly. Their second main contribution is the XML routing. This allows complicated subscription and routing rules, and provides great flexibility for in-the-network stream processing. The essential element of this is the full encapsulation of each packet with its description, thus XML encoding can be replaced with any other. This may be advantageous, since the XML is extremely bulky, and the compression that they suggest to shrink it appears to add considerable processing cost for the routers. Indeed, although their results do show high resistance to failure, the throughput of their routers is underwhelming, especially considering the high bandwidth many of the applications will require. Another benefit of their approach is its overlay nature. The networks over which it is running do not need to be modified in any way and can even be unaware of it altogether. Indeed, the brain is no longer on the end nodes, as even the transmissions of DCP are performed hop-to-hop. But there were some issues that were poorly addressed. It appears difficult to create a standard sequencing of the AN's when the stream originates in several independent root routers. There is a question of security since the stream is combined from multiple sources, if even one of those malicious, the whole stream can be corrupted for the receivers; the multiplicity of sources complicate security. Also, there is no mechanism to address load balancing, which can be a critical issue in large networks. Any of these, as well as further experimentation, provide good topics for further research. From hs247@cornell.edu Tue Dec 3 11:53:07 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Gr6920082 for ; Tue, 3 Dec 2002 11:53:06 -0500 (EST) Received: by travelers.mail.cornell.edu (8.9.3/8.9.3) id LAA05142; Tue, 3 Dec 2002 11:53:03 -0500 (EST) Date: Tue, 3 Dec 2002 11:53:03 -0500 (EST) From: hs247@cornell.edu Message-Id: <200212031653.LAA05142@travelers.mail.cornell.edu> To: egs@CS.Cornell.EDU Errors-To: hs247@cornell.edu Reply-To: hs247@cornell.edu MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: IMP/PHP3 Imap webMail Program 2.0.9 Sender: hs247@cornell.edu X-Originating-IP: 128.84.97.226 Subject: 615 Paper 74 Meshed-Based content Routing using XML is an overlay network that routes packets based on the content of packets (XML Data). The authors identify a problem in which there is time-sensitive data that needs to reach end users with very low latency. To solve this problem, Meshed-based routing trades bandwidth for latency. In this network, XML documents are transferred from source to clients. There are three types of nodes: clients (sink), root routers (source), and internal routers (packet forwarders). The network is organized in an acyclic “mesh” with roots at the top, and clients at the bottom. The overall concept is that latency will be reduced by sending redundant packets. Using XML, clients can register them selves to any number of nodes. Each document from the root is forwarded to all its children, and the children only needs to forward packets/information enough to satisfy its children. XML documents can be used by clients to specify what kind of packets it wants. In this network, a node can receive redundant packets. The paper introduces a Diversity Control Protocol which takes care of communication between nodes. This protocol handles everything from packet merging information/ filtering redundant packets. The idea that the clients can receive packets from any number of sources may be a cause for security concern. If a piece of data was really time critical, it’s probably really important also. Do you really want to trust any number of sources from transferring your data? A schema like this can potentially be very susceptible to malicious attackers. From mr228@cornell.edu Tue Dec 3 11:55:45 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Gtj922291 for ; Tue, 3 Dec 2002 11:55:45 -0500 (EST) Received: by travelers.mail.cornell.edu (8.9.3/8.9.3) id LAA07240; Tue, 3 Dec 2002 11:55:41 -0500 (EST) Date: Tue, 3 Dec 2002 11:55:41 -0500 (EST) From: mr228@cornell.edu Message-Id: <200212031655.LAA07240@travelers.mail.cornell.edu> To: egs@CS.Cornell.EDU Errors-To: mr228@cornell.edu Reply-To: mr228@cornell.edu Cc: mr228@cornell.edu MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: IMP/PHP3 Imap webMail Program 2.0.9 Sender: mr228@cornell.edu X-Originating-IP: 128.84.223.176 Subject: 615 PAPER 74 This paper describes what, in essence, is little more than a publish-subscribe system with content-aware routers. The data is XML and via a querying mechanism a node can express an interest in certain types of data. There is a high degree of resolution: Apparently one can specify interests to just certain tags (and by extension hierarchies). These routers than transmit information downstream only on an as needed basis. They call their transport-like protocol DCP: Diversity Control Protocol. DCP is very similar to other transports with the big exception being a hop-by-hop ack. This violates the end-to-end argument and may produce unneccesary overhead, although according to their experiments this is not the case. They also attempt to provide fault tolerance at this layer, but I'm not convinced they are as robust as they claim. It seems their system would be best suited for very light-weight, content-driven applications. Future work should look at the interactions between this layer and applications to determine if any behave funny and which are best suited for this type of system. They present two implementations and the results they give look promising. It would be nice to see how this might scale to larger systems and systems that are more spread out. From pj39@cornell.edu Tue Dec 3 12:30:25 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3HUP900894 for ; Tue, 3 Dec 2002 12:30:25 -0500 (EST) Received: from pj39.cornell.edu (syr-24-59-96-161.twcny.rr.com [24.59.96.161]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id MAA17072 for ; Tue, 3 Dec 2002 12:29:56 -0500 (EST) Message-Id: <5.1.0.14.2.20021203122903.02584a88@postoffice2.mail.cornell.edu> X-Sender: pj39@postoffice2.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Tue, 03 Dec 2002 12:29:56 -0500 To: egs@CS.Cornell.EDU From: Piyoosh Jalan Subject: 615 PAPER 74 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed Mesh-Based Content Routing using XML This main goal of this paper is to introduce a mechanism that provides high reliability and low latency in the presence of both node and link failures. Soma target environments where the authors perceive requirements of such system are real time trading systems and air traffic control systems. This paper proposes a scheme where time critical data is reliably multicasted to heterogeneous clients over mesh based overlay networks. In the mesh structure each node has n parents (n is configurable) and hence making the network n-1 resilient. It uses XML encapsulated packets for content pruning. Every client publishes there interest in the form of XQuery and the routers based on the query prunes the packets it receives from parents and sends a subset of the packets on the downlink to the client hosts. XML routers use dynamic control protocol (DCP) for router to router and router to client communication. The choice of XML stems from the fact that it permits the network to interpret data in terms of well defined xml queries, secondly xml packet suggests what logical units of data should be processed by clients together and thirdly there are many tools and standards that exist for XML making it easy to build more robust applications. The disadvantage of using xml is that it increases the number of bytes in the data packet, but the authors suggest that compression and decompression can be used by the routers to offset this. An XML router thus implements three key algorithms i.e. xml router core the engine that receives and forwards packets according to client queries, the diversity control protocol (DCP) which implements resilient mesh communication by allowing a receiver to reassemble a packet stream from diverse sources and a mesh initialization and maintenance which is a set of algorithms that automatically organizes routers and clients into a mesh and repairs the mesh when fault occurs. For the purpose of evaluation the authors used a fully featured multi threaded Java implementation that uses DCP for router - router and router - client communication. The results of the simulation yield that xml routers provide low latency and high delivery rates even in the case of high failure rates of nodes. From ks238@cornell.edu Tue Dec 3 13:53:31 2002 Received: from travelers.mail.cornell.edu (travelers.mail.cornell.edu [132.236.56.13]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3IrV921894 for ; Tue, 3 Dec 2002 13:53:31 -0500 (EST) Received: by travelers.mail.cornell.edu (8.9.3/8.9.3) id NAA09738; Tue, 3 Dec 2002 13:53:27 -0500 (EST) Date: Tue, 3 Dec 2002 13:53:27 -0500 (EST) From: ks238@cornell.edu Message-Id: <200212031853.NAA09738@travelers.mail.cornell.edu> To: egs@CS.Cornell.EDU Errors-To: ks238@cornell.edu Reply-To: ks238@cornell.edu MIME-Version: 1.0 Content-Type: text/plain Content-Transfer-Encoding: 7bit X-Mailer: IMP/PHP3 Imap webMail Program 2.0.9 Sender: ks238@cornell.edu X-Originating-IP: 128.84.99.155 Subject: 615 Paper 74 The motivation for this paper, which proposes the use of mesh based content routing, is to adopt a protocol which addresses the problem of time critical data transfers with low latency in a network with node and link failures. The paper uses a content distribution mesh in which multiple paths exist between two nodes. These multiple paths can be leveraged in order to allow the reception of duplicate packet streams while the consumption of extra bandwidth is almost ignored since the paths already exist. In addition, data is in the form of XML packets and XML streams (which are multiple packets). Finally, XML routers are responsible for forwarding packets based on subscriptions (i.e. by certain nodes in the network) that are described by queries. The protocol supports a publish-subscribe system which allows nodes to specify (through detail queries) the certain portion of a data stream that they would like to receive. In such a scheme, we see that routers will only have to route the data that is identified by its children’s queries which will thereby reduce the amount of excess bandwidth that is consumed. The three contributions that are indicated in the paper are the proposal of the first XML based routing protocol which argues that XML based routing is more efficient and accurate than byte stream multicast systems. Second, the paper proposes a mesh-based overlay network in which redundant paths exist and redundant messages are transferred over these paths. Finally, we see that Diversity Control Protocol enables the reconstruction of packet streams through its use of a source independent sequencing scheme, which allows for duplicate data transmissions in order to reduce latency and ensure that time critical data is received as described above. Each router will have to implement three different algorithms or protocols. First we have the XML router core which is responsible for forwarding packets based on its children’s queries and the use of an input component, a switch, and obviously the resulting output component. Next, the Diversity Control Protocol (as discussed above) is responsible for the sequencing of packets when reconstructing a packet stream at the receiving node. Finally, the paper discusses the formation of the mesh overlay network and its maintenance in light of node and link failures. Also, in terms of mesh repair, the protocol uses a leveling scheme in which a parent which fails will be replaced by another parent which is at least one level above the child node (as was the original parent). The key idea is that node failures can be addressed without reconfiguring the system or disturbing data flow. From sc329@cornell.edu Tue Dec 3 13:56:40 2002 Received: from postoffice2.mail.cornell.edu (postoffice2.mail.cornell.edu [132.236.56.10]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3Iue922822 for ; Tue, 3 Dec 2002 13:56:40 -0500 (EST) Received: from sangeeth.cornell.edu (syr-24-58-5-174.twcny.rr.com [24.58.5.174]) by postoffice2.mail.cornell.edu (8.9.3/8.9.3) with ESMTP id NAA09424 for ; Tue, 3 Dec 2002 13:56:38 -0500 (EST) Message-Id: <5.1.0.14.2.20021203135359.0213b630@postoffice2.mail.cornell.edu> X-Sender: sc329@postoffice2.mail.cornell.edu (Unverified) X-Mailer: QUALCOMM Windows Eudora Version 5.1 Date: Tue, 03 Dec 2002 13:56:17 -0500 To: egs@CS.Cornell.EDU From: Sangeeth Chandrakumar Subject: 615 PAPER 74 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed This paper presents an approach to reliably multicasting time-critical data to clients over a mesh-based overlay networks. The system tries to achieve low latency in presense of both node and link failures. The basic approach is to construct a content distribution mesh, where every node is connected to n parents, receiving duplicate packet streams from each of its parents. The architecture is based on an overlay network that transports XML streams. The use of XML allows applications and databases to push part of their processing to the network fabric.THe many tools which exist for XML makes it easy for both the data originator and receiver to build robust applications. The three main components of the system are XML routing, mesh-based overlay networks and diversity control protocol. The typical overlay network for XML streams contain one or two root routers, a variable number of internal routers and a variable number of edge clients. Root routers are the origin of data and are assumed to have independant means of generating their XML stream. The contents carried by routers can be dynamically configured to filter only those packets which are of interest downstream. A client wishing to join the network, connects to n existing routers and send them the XML query that describes the data desired. Each router includes a query table that describes the portion of the XML stream each of its children wishes to receive. The router can also merge multiple XML streams for routing. The diversity control protocol give the system the capability to reassemble a packet stream from diverse senders. Also it performs ordering of packets based on a source-independant packet identifier. Retransmissions are done on a hop by hop basis. A retransmission is done on receiving a negative acknowledgment from a downstream node. The system also employs a mesh repair scheme when one of the parent node fails. Different level number are assigned to nodes to prevents formation of cycles while mesh formation Such a mesh based content addressing system could improve the functionality and performance of tightly-constrained applications making it a lot more reliable and giving lower latencies. From vivi@CS.Cornell.EDU Tue Dec 3 17:31:12 2002 Received: from exchange.cs.cornell.edu (exchange.cs.cornell.edu [128.84.97.8]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB3MVC918858 for ; Tue, 3 Dec 2002 17:31:12 -0500 (EST) content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Subject: 615paper74 X-MimeOLE: Produced By Microsoft Exchange V6.0.6249.0 Date: Tue, 3 Dec 2002 17:31:12 -0500 Message-ID: <47BCBC2A65D1D5478176F5615EA7976D11AFBD@opus.cs.cornell.edu> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: 615paper74 Thread-Index: AcKbG673KfBqFtviQwmmxeidxwXD1A== From: "Vivek Vishnumurthy" To: "Gun Sirer" Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by sundial.cs.cornell.edu id gB3MVC918858 The paper presents a method to reliably multicast XML packets (normal data converted to XML format) to clients using a mesh- based overlay network. The overlay is organized as an acyclic mesh, with the root routers (the originators of data) at one end, the clients at the other end, and inner XML routers in between. Each router keeps track of what data each of its children needs, and forwards the appropriate packets to the children, after recreating the reordered packet stream based on the input packets it has seen. The XML packets are compressed before transmission, because of the increased size of the XML packets as compared to the data in the original format. This setup achieves reliability due to the fact that each component obtains inputs from a number of (say N) other nodes, and can recreate the required data stream even in the face of up to (N-1) failures among its parent nodes. Any packet similarly is successfully received at a node even if (N-1) other copies of the packet are lost. Weaknesses: (*) The effects of compression and decompression on latency are not evaluated. (The paper says that compared to compression overhead, the markup overheads in each XML packet is negligible) (*) The paper claims that having more layers of inner routers leads to increased benefits in terms of better latency and throughput. But having more layers could add to the probability of packet errors/failures at each layer, thus the errors could accumulate as we go from the root to the client. (The results shown in the paper correspond to a setup with only one intermediate layer - same as the earlier setup) Possible improvements: Rigorously verify the benefits of multiple layers for different numbers of inner layers. From linga@CS.Cornell.EDU Thu Dec 5 14:14:35 2002 Received: from snoball.cs.cornell.edu (snoball.cs.cornell.edu [128.84.96.54]) by sundial.cs.cornell.edu (8.11.3/8.11.3/M-3.10) with ESMTP id gB5JEZ904493 for ; Thu, 5 Dec 2002 14:14:35 -0500 (EST) Received: from localhost (linga@localhost) by snoball.cs.cornell.edu (8.11.3/8.11.3/C-3.2) with ESMTP id gB5JEYL19232 for ; Thu, 5 Dec 2002 14:14:34 -0500 (EST) Date: Thu, 5 Dec 2002 14:14:34 -0500 (EST) From: Prakash Linga To: Emin Gun Sirer Subject: 615 PAPER 74 Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Mesh-Based Content Routing using XML This papers proposes using a mesh-based overlay network for reliable and quick transfer of data. Mesh construction is done using a simple parent selection algorithm which is run at each node. This algorithm can be used to construct an (n-1)-resilient mesh for any given n. Each router maintains a level number which is one greater than the maximum of its parents. For mesh recovery router joins parents with level number greater than its own level number. If this is not possible it quits and rejoins the mesh. This works for all non-root routers as long as there are atmost n-1 failures. Content distribution is viewed in the publish subscribe framework. Publishers publish the content in XML form and the subsribers query this XML packet stream. XML routers perform content routing and pruning based on requests from subscribers downstream. XML router has three main components: input component (that acquires streams for presentation to the XML switch); XML switch (compares received packets against link queries and forwarding matching packets on the corresponding link); output component (forwards packets on output links using DCP). DCP (Diversity control protocol) is used reliably construct a sequenced packet stream from multiple sources. This is done by assigning an application serial number (AN) to every DCP packet when the packet is created at the root. This protocol helps reduce latency and improve reliability when compared to conventional single-sender approaches. The latency and loss-rate plots in the experimental section show that their scheme works well. Pros,Cons: Authors describe a new and interesting approach for reliably multicasting time critical data to heterogenous clients over a mesh-based overlay network. Experimentals results show that their scheme works good. Possible improvements are: Better mesh building and maintenance algorithms; integrating multiple XML streams as a single stream; XML routers currently handle only one-way communication (i.e pubs to subs), this could be extended to duplex communication.