Thursday, February 17, 2005
4:15 pm
B17 Upson Hall

Computer Science
Colloquium
Spring 2005


Brian Bershad
University of Washington

The Data Turbine

Today, a large amount of compelling digital content, including audio, video, and news, can be found on hundreds of thousands of unscheduled continuous data streams on the Internet.  The lack of a schedule, coupled with the sheer number of streams, makes it extremely difficult for users to find the streaming content they want.

In this talk, I'll describe a new technique called the Data Turbine that quickly and efficiently locates digital content floating within a large number of Internet streams, including audio and RSS.  To explore the feasibility of the Data Turbine, we have designed, simulated, and implemented Radio Turbine, a software system that allows users to find desired audio content, such as music, within any one of the tens of thousands of easily discovered Internet radio streams.  With Radio Turbine, a user can find the majority of the songs they choose in a short period of time.  For example, using a playlist of 100 titles available from a major Internet music store, Radio Turbine was able to find over half within the first two hours, and nearly 80% within the first twelve.  Moreover, Radio Turbine can find content more quickly and more completely than a popular peer-to-peer music sharing system.