arc39

_Predicting_Internet_Network_Distance_with_Coordinates-Based_Approaches_

T.S. Eugene Ng and Hui Zhang

It's hard to respond to this paper -- I'm operating under the assumption that you didn't want us to, since it was difficult to comprehend the telegraphic speech of the PowerPoint slides. Nevertheless, the central concept of the slides is interesting and fairly simple. It is relatively expensive to do a ping exchange with every node we wish to establish a distance to; if we instead model the internet as a Euclidean space, we can measure distances to a few central landmark points and use those as coordinates in the internet-space so generated. This lets us simply compute distances (in terms of hop count, ping time, etc.) to obtain an upper bound on actual ping-time distance. The scheme is fairly good, though somewhat expensive for the central servers, albeit at a cost proportional to the carrying capacity of the network rather than the number of distance-related transactions on it. Also, outages of the well-known central entities cannot easily be repaired -- the entire system must remeasure against any new landmark introduced, which is an individually small cost, but it means there are definite points of this system which do not scale and cannot be easily repaired or replaced.
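The landmark idea can be sketched in a few lines. Note this is a simplification: the paper actually solves an optimization to embed hosts in a low-dimensional space, whereas this sketch just uses the raw vector of landmark RTTs as the coordinate; the host names and RTT values are made up.

```python
import math

# Hypothetical RTTs (ms) from each host to three well-known landmarks.
# In the real scheme these come from ping exchanges; the values are invented.
rtt_to_landmarks = {
    "hostA": (12.0, 48.0, 95.0),
    "hostB": (15.0, 44.0, 90.0),
    "hostC": (80.0, 20.0, 30.0),
}

def estimated_distance(a, b):
    """Estimate host-to-host latency as the Euclidean distance between
    the hosts' landmark-RTT coordinate vectors -- no direct a<->b ping."""
    ca, cb = rtt_to_landmarks[a], rtt_to_landmarks[b]
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))

print(estimated_distance("hostA", "hostB"))  # nearby hosts: small estimate
print(estimated_distance("hostA", "hostC"))  # distant hosts: large estimate
```

The point of the exercise is visible even in this toy: once every host has measured the same few landmarks, any pair can estimate its distance without a new measurement.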

_Vivaldi:_A_Decentralized_Network_Coordinate_System_

Frank Dabek, Russ Cox, Frans Kaashoek, Robert Morris

Vivaldi attempts to measure internet distances via a spring-lattice structure, allowing disparate peer nodes to communicate information about physical location -- as measured in milliseconds -- to each other, for route selection, etc. Alternate measures are proposed, the most plausible being to consider the system not as a two-dimensional grid but as the surface of a sphere, so that the effects of wrap-around are internalized to the model. This is rejected because the internet does not, in fact, tend to wrap around; America is at the "center" of the world-wide-web, at least for now. They model the internet as a lattice of springs, in that between each pair of hosts there is a spring with rest length equal to the RTT between the two nodes. They then minimize the squared error function, which is equivalent to spring energy -- in other words, allowing the springs to achieve a natural equilibrium. They divide time into a series of steps, and permit the springs to act upon those nodes that are out of alignment, thus converging in a strictly local sense to the correct answer, using only localized information. In addition, the system models the fact that many nodes must be routed through a third party as a height vector, derived from the energy stored in the springs, which means that Euclidean distance better approximates actual latency.
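The per-sample spring relaxation step can be sketched as follows, assuming plain 2-d coordinates and a fixed damping factor delta (the paper additionally adapts the step size by each node's confidence, and adds the height vector; both are omitted here):

```python
import math

def vivaldi_update(xi, xj, rtt, delta=0.25):
    """One Vivaldi step: move our coordinate xi along the spring between
    xi and the peer's coordinate xj.  rtt is the measured latency (the
    spring's rest length); delta damps the movement."""
    dx = [a - b for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in dx)) or 1e-9  # avoid zero division
    error = rtt - dist                                # spring displacement
    unit = [d / dist for d in dx]                     # direction away from peer
    return [a + delta * error * u for a, u in zip(xi, unit)]

# Two nodes with a measured RTT of 10 ms relax toward rest length 10.
x = [0.0, 0.0]
y = [3.0, 4.0]            # current estimated distance: 5 ms, too short
for _ in range(100):
    x = vivaldi_update(x, y, rtt=10.0)
    y = vivaldi_update(y, x, rtt=10.0)
print(math.dist(x, y))    # converges near 10
```

Because each update uses only one measured RTT and the peer's current coordinate, this is exactly the "strictly local" convergence the review describes.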

This is a very tight algorithm, though as pointed out, it is prone to finding locally minimal (energy) coordinate states rather than globally minimal ones, which means that it is nonoptimal; moreover, it is nigh impossible to do simulated annealing or other such random-restart processes without introducing wild shifts, making it difficult to recover from these states. Unrelatedly, the system seems logically incomplete: though they model distance from the internet core as distance above the Euclidean plane, there are two cases which are not easily distinguished in this model and not well handled: sites that are near each other yet distant from the internet core, and sites that are equidistant from the internet core but removed from each other. In the first case, the system behaves perfectly, taking the distance between two "tall" points while essentially ignoring the height (if equal). In the latter, however, the sites will be at the same "height" (being equidistant from the core), and thus the extra travel time will not be factored in correctly; it must instead be modeled entirely as Euclidean distance in the plane. This is a degenerate case of their algorithm (with all heights effectively equal) and thus throws suspicion on the height vector.

_Meridian:_A_Lightweight_Network_Location_Service_without_Virtual_Coordinates_

Bernard Wong, Aleksandrs Slivkins, Emin Gun Sirer

This system may be described as a series of concentric rings: each node maintains a neighbor set organized as rings. These rings span exponentially increasing radii, and ring membership is chosen based on local measurements (in a local n-space, the paper maximizes the volume of the polytope formed by the chosen neighbors). This too is prone to local optima which are not global maxima, or even necessarily very good, but the system may take wider swings and thus is not as prone to stultifying as other systems. It is a decentralized system with excellent (logarithmic) performance measures.
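The ring structure itself is simple to sketch. The constants below (innermost radius alpha = 1 ms, growth factor s = 2) are illustrative assumptions, as are the peer names and latencies:

```python
import math

def ring_index(latency_ms, alpha=1.0, s=2.0):
    """Which Meridian-style ring a peer falls into:
    ring 0 covers latencies up to alpha, and ring i covers
    latencies in (alpha * s**(i-1), alpha * s**i]."""
    if latency_ms <= alpha:
        return 0
    return math.ceil(math.log(latency_ms / alpha, s))

rings = {}
for peer, lat in [("a", 0.8), ("b", 3.0), ("c", 3.9), ("d", 45.0)]:
    rings.setdefault(ring_index(lat), []).append(peer)
print(rings)  # peers b and c share a ring; d lands far out
```

Exponential radii are what give the "much detail nearby, little detail far away" property: doubling the latency only moves a peer one ring outward, so a logarithmic number of rings covers the whole network.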

The system is interesting in that it doesn't have the same sense of direction as others with more absolute senses; rather than providing absolute coordinates and thus Euclidean direction, there are strict distances. The fact that state may be maintained locally per-node for wide-ranging maxima is almost icing on the cake.

From asr32@cornell.edu Tue Mar 14 01:12:19 2006

Still bouncing to egs+summary

Vivaldi is an algorithm to compute the location of nodes in a synthetic coordinate system, with the goal that distances should approximately reflect RTTs. In order to do this the algorithm is designed to minimize the squared error between the predicted RTTs (i.e. the distances) and the actual RTTs. The basic intuition of Vivaldi is that of a system of springs; the algorithm treats errors as creating spring forces and uses a gradient descent technique to move the system towards equilibrium. To fit internet latencies, Vivaldi uses a 2d Euclidean metric augmented with a height vector. This captures the intuition that traffic from a node needs some amount of time to get into the "core" of the network, then it is routed through the core, and finally needs some time to travel out of the core to the destination.

The choice of minimizing squared error is not very well motivated, other than that they have a nice procedure for doing it. Nothing about the internet innately seems to suggest this metric over other error metrics. Furthermore, it is not clear that optimizing for any metric is really the right goal. The real goal is to help applications make good decisions; any metric used is valuable only as a proxy for that. The same complaint can be made about the height metric. In this case there is a plausible story behind why it is a good metric, but it is far from definitive (especially given the observation that internet latencies do not satisfy the triangle inequality).

Rather than attempting to use a synthetic metric to evaluate queries, Meridian tries to rapidly pass queries to nodes capable of giving accurate answers to them. The basic organization is a set of rings around the node with exponentially increasing radii (where the radius measures the maximum RTT allowable in that ring). By maintaining a logarithmic number of peers in each ring, nodes know quite a bit about their local area and progressively less about areas farther away. However they know enough that, with high probability, they can still get queries answered. To minimize overhead, maintenance of the rings is done through gossip.

Meridian makes relatively weak assumptions about an underlying metric. To make provable guarantees, some sort of low-dimensionality assumption is needed; however, these can be non-geometric assumptions (i.e. growth-constrained or doubling metrics).


Crowds

Offering enhanced anonymity on the internet, Crowds operates by creating a network of colluding peers who bounce any given web request through several local redirections before allowing it to leave; thus, upon receipt of a request, the remote server cannot know who issued the request it is responding to (as long as the issuer did not include any identifying details). Individual entrants -- jondos, in the parlance of the paper -- rely on a server to introduce them to the crowd, but there is no reason that this should not be a well-known node in the system.
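The redirection rule is a single biased coin flip per hop, which makes path lengths geometrically distributed. A minimal sketch (the forwarding probability and jondo names are assumptions for illustration):

```python
import random

def route_request(jondos, p_forward=0.75, rng=random):
    """Build a Crowds path: the initiator always forwards to a random
    jondo (possibly itself); each subsequent jondo flips a biased coin,
    forwarding to another random jondo with probability p_forward and
    otherwise submitting the request to the web server.  Returns the
    chain of jondos traversed."""
    path = [rng.choice(jondos)]
    while rng.random() < p_forward:
        path.append(rng.choice(jondos))
    return path  # the last jondo contacts the server directly

rng = random.Random(1)
crowd = [f"jondo{i}" for i in range(8)]
lengths = [len(route_request(crowd, rng=rng)) for _ in range(10000)]
print(sum(lengths) / len(lengths))  # averages near 1 / (1 - 0.75) = 4
```

The geometric tail is what buys plausible deniability: any jondo that receives a request may be the originator or just another hop, and the server only ever sees the final jondo.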

An odd statement is made regarding mixes and how Crowds does not implement a mix: the paper claims that mixes do not provide sender anonymity. This need not be true, and is an odd thing to state, since there is no reason that it should be so; certainly, I did not learn about mixes that lacked this property! Also, an attempt is made to claim that mixes' reliance on public key cryptography is a weakness, while a similar reliance is found in Crowds (though implied to use pairwise symmetric keys). However, the behavioral properties of Crowds are better than those of a system of mixes, due to the fact that each element of a Crowds system serves as a mix for every other element. Complete knowledge of active jondos is assumed, which is not scalable (though traffic impacting any one node is relatively stable). Prefetching embedded content is extraordinarily clever, but does not completely preclude timing attacks -- page redirections might still trigger them (or are those usually javascript?).

The biggest drawback to the system is the amount of trust which must be placed in fellow jondos -- SSL links are not supported, and due to the structure of the internet, it is extraordinarily difficult to ensure that a man-in-the-middle attack has not been executed in delivering web content. This could be fixed with end-to-end encryption, but for obvious reasons this is difficult, and so the limitation must be lived with; for certain media, multiple paths could be used to request the site, which (while lowering anonymity) would raise surety of content. Another problem is that joins are batched, and the possibility of linking requestors by comparing routes across join batches (i.e., routes pre- and post-batch) implies that browsing must be done quickly and in a targeted fashion, which is not the standard use-model for the internet these days!

P^5

P^5 performs structured anonymous broadcast via a broadcast tree, in which a node joins several groups at various points in the tree; distance from the root corresponds to message efficiency. The overhead in the system can become staggering, even with the assumed lateral links, though the system is guaranteed to be "secretive enough", i.e. rigorous proofs are presented that a node will not inadvertently expose more information than it chooses to. It assumes a public key infrastructure through the curious method of not assuming a global public key infrastructure while assuming that two parties may ascertain each other's public keys through out-of-band mechanisms.

P^5's main shortcoming is that, due to the nature of its broadcast trees, each message sent results in a flood of (unreliable) messages throughout the network, hitting hardest the closer one comes to the root. Also, P^5 is vulnerable to asynchronous analysis; it is noted that a user should not change the set of channels it is part of, or intersection attacks become feasible. However, such a change is essentially what happens each time a user logs off, which means that repeated use of P^5 leaks information. Finally, since we have both sender and receiver anonymity, this paper suffers from a very simple problem: there is no reason to trust any given message, since the receiver cannot rely on any information about the sender to vouch for its relevance.

Dining Cryptographers

The cryptography used here is simple, though with many "side effects". By relying on a broadcast medium and some structure where each edge in the graph has a random bit shared between the two nodes it links, a bit of information may be communicated: each non-speaker broadcasts the cumulative XOR of each incident edge's bit, while the speaker also XORs in the bit that it wishes to communicate. After each step of speaking, the spoken bit may be determined by taking the XOR of the communicated bits -- thus, there is a perfect model of who knows what. The scheme suffers from some drawbacks, but far fewer than other systems for anonymity -- partitions are dangerous, allowing information to leak (in the extreme case, where the broadcaster is surrounded by colluding nodes, its information is completely compromised), but with even one non-colluding node, the speaker is again "anonymous".
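One round of the protocol is short enough to write out; this sketch assumes the fully connected case, where every pair of parties shares a coin:

```python
import random

def dc_net_round(n_parties, speaker, message_bit, rng=random):
    """One bit of a dining-cryptographers round on a fully connected
    graph: every pair of parties shares a secret random coin; each
    party announces the XOR of all its shared coins, and the speaker
    additionally XORs in the message bit.  Since every coin appears
    in exactly two announcements, XORing all announcements cancels
    the coins and recovers the message bit."""
    announce = [0] * n_parties
    for i in range(n_parties):
        for j in range(i + 1, n_parties):
            coin = rng.randrange(2)  # secret shared by parties i and j
            announce[i] ^= coin
            announce[j] ^= coin
    announce[speaker] ^= message_bit
    result = 0
    for a in announce:
        result ^= a
    return result

rng = random.Random(7)
for b in (0, 1, 1, 0):
    assert dc_net_round(5, speaker=2, message_bit=b, rng=rng) == b
print("recovered bits match")
```

Note that the recovered bit carries no trace of which index was the speaker: each announcement is individually uniform random to anyone who doesn't hold the adjacent coins, which is exactly the anonymity property the paper proves.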

The base version of the system can be incredibly expensive, as for each bit of message traffic essentially N^2 information must be exchanged -- and if a node has connectivity less than N, then it trusts some portion of the users, because it is that much easier to partition that node and determine whether or not it is the speaker. Also, it's very slow, as each bit sent must be sent throughout the entire network; though they point out that this can be done en masse, by sending an entire message of bits, the point remains that the activities of every remote peer, not just a subset of them, determine the propagation of the message. Finally, the entire system relies on an efficient broadcast mechanism which, while not vulnerable to analysis (we are still essentially using one-time pads), is certainly not reasonable to assume: each node ends up needing n^2 connectivity just to disseminate information. There are more efficient schemes for broadcast that could be used, but the paper doesn't reference them. As a postscript, denial of service and noise injection are trivial in this system, and devastating.

Herbivore

Built upon the strength of DC-nets and assumptions about the strength of cryptographic techniques, Herbivore builds another anonymous system that is more scalable and efficient than plain DC-nets. It is built over Pastry, with crypto-puzzles to protect against Sybil attacks; the insight is that clusters of high connectivity can conceal information, preserving many of the extraordinarily good security properties of DC-nets.

The paper leaves unclear precisely how slots are reserved without revealing the intended author for any given period -- usually one would try to inductively use the same system for this, but that is clearly unallowable, as the entire issue is collision detection. Implied is that this is a well-known solution, but if this problem may be solved anonymously, then the entire question is easily solved by simply using the scheduling algorithm "larger". Also, if the individual to whom a given timeslot has been assigned becomes known, then since the message is known (in encrypted form at minimum!) the anonymity has failed, and thus this reservation decision is of utmost importance! Also, the system uses "local proxies" to explore other cliques on the ring; we can only hope that they are elected through this protocol (otherwise, we essentially rely on onion routing). Performance is enhanced under Herbivore, but is clearly still not optimal; this is a necessary condition, apparently, and unfortunate.

From lackhand@gmail.com Mon Mar 27 19:06:06 2006


EigenTrust

Sepandar Kamvar, Mario Schlosser, Hector Garcia-Molina

Attempting to deduce the trust values to place in entities and objects removed from oneself in the network, EigenTrust uses a system of "entities rating entities" to discover not just the trust that A places in B (in order to aggregate and so determine whether B should be trusted), but the aggregate trust placed in A by the network as a whole, via the observation that by asking one's friends, and one's friends' friends, and so forth, one reaches a connected segment which should equal the entire network. This turns out to be a problem equivalent to solving for the principal eigenvector of a matrix whose entries represent localized trust value measurements. Key insights include the rarity of individual client-client interactions, leading to sparse matrix representations and relatively low coordination message overhead, and the fact that iterative computation of this eigenvector converges relatively quickly, with each individual node calculating its own trust value (conceptually only, due to a security flaw: in actuality, M deterministically selected trust-manager peers calculate this value on behalf of each node).
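The eigenvector computation is just power iteration on the normalized local-trust matrix. A minimal centralized sketch (the toy matrix values are invented; the real system distributes this loop across peers as described above):

```python
def eigentrust(C, n_iter=50):
    """Power iteration t_{k+1} = C^T t_k on the normalized local-trust
    matrix C, where C[i][j] is peer i's normalized trust in peer j
    (each row sums to 1).  The fixed point is the principal left
    eigenvector: the global trust vector."""
    n = len(C)
    t = [1.0 / n] * n  # start from uniform trust
    for _ in range(n_iter):
        t = [sum(C[i][j] * t[i] for i in range(n)) for j in range(n)]
    return t

# Toy crowd of three peers: peer 2 is rated highly by everyone.
C = [
    [0.0, 0.2, 0.8],
    [0.1, 0.0, 0.9],
    [0.3, 0.7, 0.0],
]
trust = eigentrust(C)
print(trust)  # peer 2 ends up with the largest global trust
```

Because each row of C sums to 1, the trust vector stays a probability distribution across iterations, and the sparse-matrix observation in the paper follows directly: most C[i][j] entries are zero in practice, so each iteration touches only peers that actually interacted.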

As the paper points out, the difficulty with this algorithm is that its end result is universally viewed, which means that there is no information asymmetry in the network which can be leveraged; this will almost universally result in the best possible peer being isolated and used, leading to an accrual of trust, which will cause the problem to escalate. The probabilistic algorithm will fix this, but it is something of an altruistic solution, since it relies on each individual accepting the choice of a suboptimal path to patch a "black hole" in the system. Their attacker model is sufficiently malicious, but their experiments seem to leave something to be desired; the subtext is that the effect is visible in the small scale, but that the data in the large-scale experiments are too voluminous to study fully. However, these data are not presented; presumably the effect is cleaner in a small network.

Robust Reputation System for P2P and Mobile Ad-hoc Networks

Sonja Buchegger, Jean-Yves Le Boudec

Another attempt at validating entities in a distributed fashion, this paper uses another reputation scheme, attempting to strike a happy medium between considering the reports of others and relying entirely on self-generated interactions. It does this by pushing first-hand information out into the network, based on a modified Bayesian approach and a linear model-merging heuristic, while leaving reputation and trust ratings private. It does this in the absence of primary trusted entities (the seed values that EigenTrust relies on). It uses reputation fading to keep ratings "current", and is exceedingly decentralized.
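The Bayesian flavor of the scheme can be sketched with a Beta-distribution rating plus fading. This is a guess at the general technique rather than the paper's exact equations; the fade constant and the boolean observation encoding are assumptions:

```python
def update_rating(alpha, beta, misbehaved, fade=0.9):
    """One Bayesian reputation update with fading, modeling a peer's
    misbehaviour probability as Beta(alpha, beta): old evidence is
    discounted by `fade` so ratings stay current, then the new
    first-hand observation is added as one pseudo-count."""
    alpha = fade * alpha + (1 if misbehaved else 0)
    beta = fade * beta + (0 if misbehaved else 1)
    return alpha, beta

a, b = 1.0, 1.0  # uninformative Beta(1, 1) prior
for obs in [False, False, True, False, False]:
    a, b = update_rating(a, b, obs)
print(a / (a + b))  # expected misbehaviour probability, well below 0.5
```

Fading is what keeps the rating "current": an old transgression is geometrically discounted, so a peer cannot coast forever on ancient good behavior, nor be condemned forever for one slip.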

Not much space is given in the paper to using others' deduced trust values, as opposed to first-hand observations. This means that while the system can be quite scalable, as it relies only on local data about any entity who is to be interacted with, it gives the impression of being lower overhead than EigenTrust without actually being so; if one wishes to later interact with a remote entity, looking up its rating requires just as much work, and the calculation is not particularly easier here, merely a different metric.