| Tuesday, April 5, 2005 | |
| Query Processing for Large-Scale Message Brokering | |
| Emerging distributed information systems such as Web services, personalized content delivery, and event monitoring require increasingly flexible and adaptive infrastructures. Recently, the publish/subscribe model has gained acceptance as a solution for the loose coupling of systems at the communication level. Meanwhile, at the content level, XML (Extensible Markup Language) is becoming a de facto standard for online data exchange. I propose an approach that integrates publish/subscribe and XML and, in particular, exploits declarative XML queries to offer high flexibility and functionality in distributed systems. This approach is based on building XML message brokers, which I define as middleware components that perform three main functions: filtering, transformation, and routing of XML messages based on client-specified queries. In this talk, I present YFilter, an XML message brokering system that provides these three functions for large numbers of queries over high volumes of messages. The key innovation has been to identify commonalities among queries and share their processing. I will summarize YFilter’s shared filtering techniques, focus on its shared transformation techniques, and give an overview of its third component, routing. I will also report the results of a thorough performance study, showing that YFilter can provide efficient message processing for tens of thousands of queries while preserving the flexibility to support a wide variety of message types and query workloads.
Bio: Yanlei Diao is a Ph.D. candidate at the University of California, Berkeley. Her research interests are in information architectures and data management systems, with a focus on message-based distributed systems, XML query processing, stream processing, and learning-based data processing. She received her B.S. in Computer Science from Fudan University in China in 1998, and her M.S. in Computer Science from Hong Kong University of Science and Technology in 2000. During her graduate years, she also worked as a research intern at the IBM Almaden Research Center and at BEA Systems. | |