Monday, March 14, 2005
4:15 pm
5130 Upson Hall

Computer Science
Colloquium
Spring 2005


Prasanna Ganesan
Stanford University

Data Management in Peer-to-Peer Systems

The peer-to-peer (P2P) model of distributed computation is characterized by three laws: (i) all peers are born equal; (ii) the set of participant peers is dynamic and may be arbitrarily large; (iii) peers are distributed over a wide-area network. These three laws pose significant challenges to the development of efficient data-management solutions in the P2P model. This talk will consider one piece of the data-management puzzle -- enabling complex queries in P2P systems -- and discuss how to solve the problem through efficient schemes for data storage, query and data routing, as well as load balancing. In particular, it will focus on the following two questions:

  • how to exploit hierarchical network structure in enabling efficient routing and data caching (tackling the problems induced by the second and third laws), while not sacrificing the functional homogeneity required by the first law.

  • how to enable range queries via dynamic range partitioning, while guaranteeing compliance with the first law through online load balancing.

We note that the latter question is of independent interest in the context of parallel databases.