This semester's database seminar will deal, as the title suggests, with approximate query processing over data streams. We will cover some introductory material as well as advanced topics, with the goal of identifying interesting research problems in the area.
The seminar will meet Wednesdays and (most) Fridays from 2:30 to 3:30 PM, in Upson 215.
Al Demers and Johannes Gehrke will give the initial lectures; after the first three or four weeks participants in the course will give the remaining lectures.
There will be more discussion of this requirement during the first meeting.
Date | Topic | Lecturer |
January 30 |
IntroductionS. Babu and J. Widom, Continuous Queries over Data Streams. [BW01] |
A. Demers |
February 1 |
J. S. Vitter, Random Sampling With a Reservoir. [V85] |
A. Demers |
February 6 |
P. Flajolet and G. Martin, Probabilistic Counting Algorithms for Database Applications. [FM85] |
A. Demers |
February 8 |
Kyu-Young Whang, Brad Vander Zanden and Howard Taylor, A Linear Time Probabilistic Counting Algorithm for Database Applications. [WVT90] |
A. Demers |
March 20-22 | Spring Break | |
March 29 |
N. Alon, Y. Matias and M. Szegedy, The Space Complexity of Approximating the Frequency Moments. [AMS96] |
Martin Pal |
April 3 | N. Alon, P. B. Gibbons, Y. Matias and M. Szegedy, Tracking Join and Self-Join Sizes in Limited Storage. [AGM99] | Martin Pal |
April 5 |
W. C. Hou, Ozsoyoglu, and B. K. Taneja, Statistical Estimators for Relational Algebra Expressions. [HOT88] S. Ganguly, P. B. Gibbons, Y. Matias and A. Silberschatz Bifocal sampling for skew-resistant join size estimation. [GGM96] |
Sasha |
April 10 |
J. Shanmugasundaram, K. Tufte, D. DeWitt, J. Naughton, D. Maier, " Architecting a Network Query Engine for Producing Partial Results ", Lecture Notes in Computer Science, Vol. 1997, Springer-Verlag Publishers, 2001. J. Chen, D. DeWitt, F. Tian and Y. Wang, NiagaraCQ: A Scalable Continuous Query System for Internet Databases. [CDTW00] |
Jai |
April 12 |
P. B. Gibbons and Y. Matias, New Sampling-Based Summary Statistics for Improving Approximate Query Answers. [GM98] S. Acharya, P. Gibbons, V. Poosala and S. Ramaswamy, Join Synopses for Fast Approximate Query Answering. [AGP99] |
Alin |
April 19 |
S. Chaudhuri, R. Motwani and N. Narasayya, On Random Sampling over Joins. [CMN99] V. Ganti, M. L. Lee and R. Ramakrishnan, ICICLES: Self-Tuning Samples for Approximate Query Answering. [GLR00] |
Adina Costea |
April 24 |
J. Chen, D. DeWitt and J. Naughton, Design and Evaluation of Alternative Selection Placement Strategies in Optimizing Continuous Queries. [CDN02] S. Madden, M. Shah, and Joe Hellerstein. Continuously Adaptive Continuous Queries over Streams . SIGMOD 2002 [PDF] |
Abhinandan |
Monday April 29 |
Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala: Congressional Samples for Approximate Answering of Group-By Queries. SIGMOD Conference 2000: 487-498 P. B. Gibbons, Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports. [Gib01] |
Cristi |
May 1 |
P. Seshadri, M. Livny and R. Ramakrishnan, Sequence Query Processing. [SLR94] P. Seshadri, M. Livny and R. Ramakrishnan, SEQ: A Model for sequence Databases. [SLR95] |
Rohit |
May 3 |
S. Chaudhuri, G. Das, M. Datar, R. Motwani and V. Narasayya, Overcoming Limitations of Sampling for Aggregation Queries. [CDD+01] S. Chaudhuri, G. Das and V. Narasayya, A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries. [CDN01] |
Dan |