Speaker: Farnam Jahanian
Affiliation: Department of EECS, University of Michigan
Time & Place: 4:15 PM, 101 Phillips Hall
Host: Ken Birman
Title: Experimental Study of Internet Stability and Wide-Area Backbone Failures
The Internet has seen explosive growth in both size and topological complexity, particularly since the end of the NSFNet backbone in 1995. This growth has placed severe strain on the commercial Internet infrastructure. This talk highlights our findings on the stability and availability of the Internet routing infrastructure. The study was conducted as part of a joint project between the University of Michigan and Merit Network, aimed at studying the growth and scalability of the Internet backbone routing infrastructure. Over the last four years, the project has developed and deployed tools for measurement, analysis, and visualization of network performance and routing statistics.
Our findings are based on instrumentation of key portions of the Internet infrastructure during a four-year study, which included both passive data collection and active fault injection at major Internet public exchange points. Our study has shown that the volume of routing updates was initially several orders of magnitude larger than expected and that the majority of this routing information was redundant or pathological. Furthermore, our analysis has revealed several unexpected trends and ill-behaved systematic properties in Internet routing. Our study of network reachability has also found an unexpectedly high level of path fluctuation and a low aggregate mean time between failures for individual Internet paths. Unlike switches in the public telephony network, which exhibit failover on the order of milliseconds, we show that inter-domain routers in the packet-switched Internet may take several minutes to reach a consistent view of the network topology after a fault. During these periods of delayed convergence, end-to-end Internet paths experience intermittent loss of connectivity, as well as increased packet loss and latency. Our findings on the convergence properties of Internet routing are based on instrumentation of key portions of the Internet infrastructure, including both passive data collection and fault-injection machines at major Internet exchange points. Finally, we posit a number of explanations for these anomalies and evaluate their potential impact on the Internet infrastructure.
Farnam Jahanian is an Associate Professor of Electrical Engineering and Computer Science and the Director of the Software Systems Lab at the University of Michigan. Prior to joining the faculty at Michigan in 1993, he was a Research Staff Member at the IBM T.J. Watson Research Center. He received his M.S. and Ph.D. degrees in Computer Science from the University of Texas at Austin in 1987 and 1989, respectively. His research interests include fault-tolerant distributed computing and network protocols and architectures. (http://www.eecs.umich.edu/~farnam)