Cornell Database Group Publications
2008
Christoph Koch. Approximating Predicates and Expressive Queries on Probabilistic Databases. Proc. PODS 2008.
Christoph Koch, Stefanie Scherzinger, Michael Schmidt: XML Prefiltering as a String Matching Problem. ICDE 2008: 626-635.
Lyublena Antova, Thomas Jansen, Christoph Koch, Dan Olteanu: Fast and Simple Relational Processing of Uncertain Data. ICDE 2008: 983-992.
David J. Martin, Johannes Gehrke, Joseph Y. Halpern: Toward Expressive and Scalable Sponsored Search Auctions. ICDE 2008: 237-246.
Ashwin Machanavajjhala, Daniel Kifer, John M. Abowd, Johannes Gehrke, Lars Vilhuber: Privacy: Theory meets Practice on the Map. ICDE 2008: 277-286.
2007
Biswanath Panda, Mirek Riedewald, Johannes Gehrke, Stephen B. Pope: High-Speed Function Approximation. ICDM 2007: 613-618.
Alan J. Demers, Johannes Gehrke, Biswanath Panda, Mirek Riedewald, Varun Sharma, Walker M. White: Cayuga: A General Purpose Event Monitoring System. CIDR 2007: 412-422.
David J. Martin, Daniel Kifer, Ashwin Machanavajjhala, Johannes Gehrke, Joseph Y. Halpern: Worst-Case Background Knowledge for Privacy-Preserving Data Publishing. ICDE 2007: 126-135.
Lyublena Antova, Christoph Koch, Dan Olteanu: MayBMS: Managing Incomplete Information with Probabilistic World-Set Decompositions. ICDE 2007: 1479-1480.
Michael Schmidt, Stefanie Scherzinger, Christoph Koch: Combined Static and Dynamic Analysis for Effective Buffer Minimization in Streaming XQuery Evaluation. ICDE 2007: 236-245.
Lyublena Antova, Christoph Koch, Dan Olteanu: 10^(10^6) Worlds and Beyond: Efficient Representation and Processing of Incomplete Information. ICDE 2007: 606-615.
Lyublena Antova, Christoph Koch, Dan Olteanu: World-Set Decompositions: Expressiveness and Efficient Algorithms. ICDT 2007: 194-208.
Lucja Kot and Walker White: Characterization of the Interaction of XML Functional Dependencies with DTDs. ICDT 2007: 119-133.
Walker M. White, Mirek Riedewald, Johannes Gehrke, Alan J. Demers: What is "next" in event processing? PODS 2007: 263-272.
Lars Brenna, Alan J. Demers, Johannes Gehrke, Mingsheng Hong, Joel Ossher, Biswanath Panda, Mirek Riedewald, Mohit Thatte, Walker M. White: Cayuga: a high-performance event processing engine. SIGMOD Conference 2007: 1100-1102.
Nitin Gupta, Fan Yang, Alan J. Demers, Johannes Gehrke, Jayavel Shanmugasundaram: User-centric personalized extensibility for data-driven web applications. SIGMOD Conference 2007: 1125-1127.
Adina Crainiceanu, Prakash Linga, Ashwin Machanavajjhala, Johannes Gehrke, Jayavel Shanmugasundaram: P-ring: an efficient and robust P2P range index structure. SIGMOD Conference 2007: 223-234.
Walker M. White, Alan J. Demers, Christoph Koch, Johannes Gehrke, Rajmohan Rajagopalan: Scaling games to epic proportion. SIGMOD Conference 2007: 31-42.
Lyublena Antova, Christoph Koch, Dan Olteanu: From complete to incomplete information and back. SIGMOD Conference 2007: 713-724.
Mingsheng Hong, Alan J. Demers, Johannes Gehrke, Christoph Koch, Mirek Riedewald, Walker M. White: Massively multi-query join processing in publish/subscribe systems. SIGMOD Conference 2007: 761-772.
Christoph Koch, Stefanie Scherzinger, Michael Schmidt: The GCX System: Dynamic Buffer Minimization in Streaming XQuery Evaluation. VLDB 2007: 1378-1381.
Lyublena Antova, Christoph Koch, Dan Olteanu: Query language support for incomplete information in the MayBMS system. VLDB 2007: 1422-1425.
Fan Yang, Nitin Gupta, Nicholas Gerner, Xin Qi, Alan J. Demers, Johannes Gehrke, Jayavel Shanmugasundaram: A unified platform for data driven web applications with automatic client-server partitioning. WWW 2007: 341-350.
Zhiyuan Chen, Johannes Gehrke, Flip Korn, Nick Koudas, Jayavel Shanmugasundaram, Divesh Srivastava: Index structures for matching XML twigs using relational query processors. Data Knowl. Eng. 60(2): 283-302 (2007).
Martin Grohe, Christoph Koch, Nicole Schweikardt: Tight lower bounds for query processing on streaming and external memory data. Theor. Comput. Sci. 380(1-2): 199-217 (2007).
Niki Trigoni, Yong Yao, Alan J. Demers, Johannes Gehrke, Rajmohan Rajaraman: Wave scheduling and routing in sensor networks. TOSN 3(1): 2 (2007).
Michaela Goetz, Christoph Koch, Wim Martens: Efficient Algorithms for the Tree Homeomorphism Problem. DBPL 2007: 17-31.
Christoph Koch, Stefanie Scherzinger: Attribute grammars for scalable query processing on XML streams. VLDB J. 16(3): 317-342 (2007).
Walker M. White, Christoph Koch, Nitin Gupta, Johannes Gehrke, Alan J. Demers: Database research opportunities in computer games. SIGMOD Record 36(3): 7-13 (2007).
Ashwin Machanavajjhala, Daniel Kifer, Johannes Gehrke, Muthuramakrishnan Venkitasubramaniam: L-diversity: Privacy beyond k-anonymity. TKDD 1(1): (2007).
2006
Ashwin Kumar V Machanavajjhala and Johannes Gehrke. On the Efficiency of Checking Perfect Privacy. In Proceedings of the 25th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 2006).
Daniel Kifer and J. E. Gehrke. Injecting Utility into Anonymized Datasets. In Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD 2006).
William Y. Arms, Selcuk Aya, Manuel Calimlim, Jim Cordes, Julia Deneva, Pavel Dmitriev, Johannes Gehrke, Lawrence Gibbons, Christopher D. Jones, Valentin Kuznetsov, Dave Lifka, Mirek Riedewald, Dan Riley, Anders Ryd, and Gregory J. Sharp. Three Case Studies of Large-Scale Data Flows. In Proceedings of the IEEE Workshop on Workflow and Data Flow for Scientific Applications (SciFlow 2006). Atlanta, Georgia, April 2006.
Ashwin Machanavajjhala, Johannes Gehrke, Daniel Kifer, and Muthuramakrishnan Venkitasubramaniam. l-Diversity: Privacy Beyond k-Anonymity. In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE 2006). Atlanta, Georgia, April 2006.
Jayavel Shanmugasundaram, Fan Yang, Mirek Riedewald, Johannes Gehrke, and Alan Demers. Hilda: A High-Level Language for Data-Driven Web Applications. In Proceedings of the 22nd IEEE International Conference on Data Engineering (ICDE 2006), Atlanta, Georgia, April 2006.
Alan Demers, Johannes Gehrke, Mingsheng Hong, Mirek Riedewald, and Walker White. Towards Expressive Publish/Subscribe Systems. In Proceedings of the 10th International Conference on Extending Database Technology (EDBT 2006), Munich, Germany, March 2006.
Chavdar Botev, Sihem Amer-Yahia, and Jayavel Shanmugasundaram. Expressiveness and Performance of Full-Text Search Languages. In Proceedings of the 10th International Conference on Extending Database Technology (EDBT 2006), Munich, Germany, March 2006.
2005
Lin Guo, Jayavel Shanmugasundaram, Kevin Beyer, Eugene Shekita, "Efficient Inverted Lists and Query Algorithms for Structured Value Ranking in Update-Intensive Relational Databases", In Proceedings of the IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April 2005.
Feng Shao, Antal Novak, Jayavel Shanmugasundaram, "Triggers over XML Views of Relational Data", In Proceedings of the IEEE International Conference on Data Engineering (ICDE) (poster), Tokyo, Japan, April 2005.
2004
Abhinandan Das, Johannes Gehrke, and Mirek Riedewald. Approximation Techniques for Spatial Data. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data (SIGMOD 2004). Paris, France, June 2004.
Sihem Amer-Yahia, Chavdar Botev, and Jayavel Shanmugasundaram. TeXQuery: A Full-Text Search Extension to XQuery.
2003
Lin Guo, Feng Shao, Chavdar Botev, and Jayavel Shanmugasundaram. XRANK: Ranked Keyword Search over XML Documents. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD 2003). San Diego, CA, June 2003.
Abhinandan Das, J. E. Gehrke, and Mirek Riedewald. Approximate Join Processing Over Data Streams. In Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (SIGMOD 2003). San Diego, CA, June 2003.
Daniel Kifer, J. E. Gehrke, Cristian Bucila, and Walker White. How to Quickly Find a Witness. In Proceedings of the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2003). San Diego, CA, June 2003.
Alexandre Evfimievski, J. E. Gehrke, and Ramakrishnan Srikant. Limiting Privacy Breaches in Privacy Preserving Data Mining. In Proceedings of the 22nd ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2003). San Diego, CA, June 2003.
2002
Cristian Bucila, J. E. Gehrke, Daniel Kifer, and Walker White. DualMiner: A Dual-Pruning Algorithm for Itemsets with Constraints. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
Alexandre Evfimievski, Ramakrishnan Srikant, Rakesh Agrawal, and J. E. Gehrke. Privacy Preserving Mining of Association Rules. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
Shai Ben-David, J. E. Gehrke, and Reba Schuller. A Theoretical Framework for Learning from a Pool of Disparate Data Sources. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
Jay Ayres, J. E. Gehrke, Tomi Yiu, and Jason Flannick. Sequential Pattern Mining Using Bitmaps. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
Alin Dobra and Johannes Gehrke. SECRET: A Scalable Linear Regression Tree Algorithm. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July 2002.
A. Dobra, M. Garofalakis, J. E. Gehrke, and R. Rastogi. Processing Complex Aggregate Queries over Data Streams. In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, Madison, Wisconsin, June 2002.
I. Tatarinov, E. Viglas, K. Beyer, J. Shanmugasundaram, E. Shekita, "Storing and Querying Ordered XML Using a Relational Database System", In Proceedings of the 2002 ACM SIGMOD International Conference on Management of Data, Madison, Wisconsin, June 2002.
F. Chu, J. Halpern, and J. E. Gehrke. Least Expected Cost Query Optimization: What Can We Expect? In Proceedings of the 21st ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS 2002). Madison, Wisconsin, June 2002.
Anton Faradjian, J. E. Gehrke, and Philippe Bonnet. GADT: A Probability Space ADT For Representing and Querying the Physical World. In Proceedings of the 18th International Conference on Data Engineering (ICDE 2002), San Jose, California, February 2002.
2001
J. Shanmugasundaram, E. Shekita, R. Barr, M. Carey, B. Lindsay, H. Pirahesh, B. Reinwald, "Efficiently Publishing Relational Data as XML Documents", VLDB Journal. An earlier version appeared in the VLDB 2000 conference.
J. Shanmugasundaram, J. Kiernan, E. Shekita, C. Fan, J. Funderburk, "Querying XML Views of Relational Data", In Proceedings of the VLDB Conference, Rome, Italy, September 2001.
J. Shanmugasundaram, E. Shekita, J. Kiernan, R. Krishnamurthy, E. Viglas, J. Naughton, I. Tatarinov, "A General Technique for Querying XML Documents using a Relational Database System," SIGMOD Record, September 2001.
Alin Dobra and J. E. Gehrke. Bias Correction in Classification Tree Construction. In Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2001), Williams College, Massachusetts, June 2001.
Zhiyuan Chen, J. E. Gehrke, and Flip Korn. Query Optimization In Compressed Database Systems. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, California, May 2001.
J. E. Gehrke, Flip Korn, and Divesh Srivastava. On Computing Correlated Aggregates Over Continual Data Streams. In Proceedings of the 2001 ACM SIGMOD International Conference on Management of Data, Santa Barbara, California, May 2001.
Doug Burdick, Manuel Calimlim, and J. E. Gehrke. MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. In Proceedings of the 17th International Conference on Data Engineering, Heidelberg, Germany, April 2001.
J. Shanmugasundaram, K. Tufte, D. DeWitt, J. Naughton, D. Maier, "Architecting a Network Query Engine for Producing Partial Results", Lecture Notes in Computer Science, Vol. 1997, Springer-Verlag Publishers, 2001. An earlier version appeared in the WebDB 2000 workshop.
Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri. Towards Sensor Database Systems. In Proceedings of the Second International Conference on Mobile Data Management. Hong Kong, January 2001.
2000
Philippe Bonnet, J. E. Gehrke, and Praveen Seshadri. Querying the Physical World. IEEE Personal Communications, Vol. 7, No. 5, October 2000, pages 10-15. Special Issue on Smart Spaces and Environments.
J. Shanmugasundaram, E. Shekita, R. Barr, M. Carey, B. Lindsay, H. Pirahesh, B. Reinwald, Efficiently Publishing Relational Data as XML Documents, In Proceedings of the VLDB Conference, Cairo, Egypt, September 2000.
J. E. Gehrke, Raghu Ramakrishnan, and Venkatesh Ganti. RAINFOREST - A Framework for Fast Decision Tree Construction of Large Datasets. In Data Mining and Knowledge Discovery, Volume 4, Issue 2/3, July 2000, pages 127-162.
Venkatesh Ganti, J. E. Gehrke, and Raghu Ramakrishnan. DEMON: Mining and Monitoring Evolving Data. In Proceedings of the 16th International Conference on Data Engineering, San Diego, California, February 2000. Best student paper award.
Zhiyuan Chen and Praveen Seshadri: An Algebraic Compression Framework for Query Results. In Proceedings of the 16th International Conference on Data Engineering, San Diego, California, February 2000, pages 177-188.
Philippe Bonnet, Praveen Seshadri: Device Database Systems. In Proceedings of the 16th International Conference on Data Engineering, San Diego, California, February 2000.
1999
J. Shanmugasundaram, K. Tufte, G. He, C. Zhang, D. DeWitt, J. Naughton, "Relational Databases for Querying XML Documents: Limitations and Opportunities," In Proceedings of the VLDB Conference, Edinburgh, Scotland, September 1999.
Venkatesh Ganti, J. E. Gehrke, and Raghu Ramakrishnan. Mining very large databases. IEEE Computer, Vol. 32, No. 9, August 1999, pages 38-45.
J. Shanmugasundaram, U. Fayyad, P. Bradley, "Compressed Data Cubes for OLAP Aggregate Query Approximation on Continuous Dimensions", In Proceedings of the 1999 SIGKDD Conference, San Diego, California, August 1999.
Venkatesh Ganti, J. E. Gehrke, and Raghu Ramakrishnan. CACTUS--Clustering Categorical Data Using Summaries. In Proceedings of the 1999 SIGKDD Conference, San Diego, California, August 1999.
J. Shanmugasundaram, A. Nithrakashyap, R. Sivasankaran, K. Ramamritham, "Efficient Concurrency Control for Broadcast Environments", In Proceedings of the 1999 SIGMOD Conference, Philadelphia, Pennsylvania, June 1999.
J. E. Gehrke, Venkatesh Ganti, Raghu Ramakrishnan, and Wei-Yin Loh. BOAT -- Optimistic Decision Tree Construction. In Proceedings of the 1999 SIGMOD Conference, Philadelphia, Pennsylvania, June 1999.
Tobias Mayr and Praveen Seshadri: Client-Site Query Extensions. In Proceedings of the 1999 SIGMOD Conference, Philadelphia, Pennsylvania, June 1999, pages 347-358.
Philippe Bonnet, Kyle Buza, Zhiyuan Chen, Victor Cheng, Randolph Chung, Takako M. Hickey, Ryan Kennedy, Daniel Mahashin, Tobias Mayr, Ivan Oprencak, Praveen Seshadri and Hubert Siu: The Cornell Jaguar System: Adding Mobility to PREDATOR. In Proceedings of the 1999 SIGMOD Conference, Philadelphia, Pennsylvania, June 1999, pages 580-581.
Venkatesh Ganti, J. E. Gehrke, Raghu Ramakrishnan, and Wei-Yin Loh. A Framework for Measuring Changes in Data Characteristics. In Proceedings of the Eighteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Philadelphia, Pennsylvania, May 1999. (Invited to Journal of Computer and System Sciences (JCSS).)
Francis Chu, Joseph Y. Halpern, and Praveen Seshadri: Least Expected Cost Query Optimization: An Exercise in Utility. In Proceedings of the Eighteenth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Philadelphia, Pennsylvania, May 1999, pages 138-147.
Venkatesh Ganti, Raghu Ramakrishnan, J. E. Gehrke, Allison L. Powell, and James French. Clustering Large Datasets in Arbitrary Metric Spaces. In Proceedings of the Fifteenth International Conference on Data Engineering, Sydney, Australia, 1999.
1998
N. Gehani, K. Ramamritham, J. Shanmugasundaram, O. Shmueli, "Accessing Extra-Database Information: Concurrency Control and Correctness", Information Systems: An International Journal, 23(7), pp. 439-462, 1998.
Praveen Seshadri: Enhanced Abstract Data Types in Object-Relational Databases. VLDB Journal 7 (3): 130-140 (1998).
J. E. Gehrke, Raghu Ramakrishnan, and Venkatesh Ganti. RAINFOREST - A Framework for Fast Decision Tree Construction of Large Datasets. In Proceedings of the Twenty-fourth International Conference on Very Large Data Bases, New York, New York, 1998.
Rakesh Agrawal, J. E. Gehrke, Dimitrios Gunopulos, and Prabhakar Raghavan. Automatic Subspace Clustering of High Dimensional Data for Data Mining Applications. In Proceedings of the 1998 SIGMOD Conference, Seattle, Washington, June 1998.
Michael Godfrey, Tobias Mayr, Praveen Seshadri, and Thorsten von Eicken: Secure and Portable Database Extensibility. In Proceedings of the 1998 SIGMOD Conference, Seattle, Washington, June 1998, pages 390-401.
Praveen Seshadri: Predator: A Resource for Database Research. SIGMOD Record 27 (1): 16-20 (1998).
1997
Michael J. Carey, David J. DeWitt, Jeffrey F. Naughton, Mohammad Asgarian, J. E. Gehrke, and Dhaval N. Shah. The BUCKY Object-Relational Benchmark. In Proceedings of the 1997 SIGMOD Conference, Tucson, Arizona, May 1997. More material, including the data generator used in the benchmark.