Systems for Large Data

CS 6322
Department of Computer Science
Cornell University
Fall 2008


Instructor: Johannes Gehrke
Time: Tuesdays and Thursdays, 2:45-4:00pm.
Location: Thurston 202
Course Management System


Course Overview

The last decade has been a turning point in database research. The number of communities working on big data has grown significantly and now includes not only the traditional database vendors but also fields such as digital entertainment, social network analysis, e-science, advertising, and search. As application scenarios have expanded, the way data is managed has changed as well. The rise of cloud computing requires fundamental changes in the architecture of data-driven systems; Moore's law now delivers more processor cores rather than higher clock speeds; systems with huge main memories and large middle tiers of solid-state disks are emerging; and power consumption has become a major concern for large systems.

This course covers recent research on the design and implementation of scalable data-centric systems. Topics include infrastructure for cloud computing, novel database architectures such as column stores and main-memory data management, the convergence of search over unstructured data and querying of structured data, power-aware data management, and data management for computer games and virtual worlds.

The course prerequisite is basic undergraduate knowledge of database systems, as covered in the "cow book" (Ramakrishnan and Gehrke, Database Management Systems).

Course Work

Course Outline

Note that the course schedule is still under construction.

Data Services in the Cloud

Thursday, September 4, 2008 (Presenter: Johannes)

Tuesday, September 9, 2008 (Presenter: Johannes)

Thursday, September 11, 2008 (Presenter: Johannes)

Tuesday, September 16, 2008 (Presenter: Johannes)

Thursday, September 18, 2008 (Presenter: Ymir Vigfusson)

Background reading:

Parallel Database Systems

Tuesday, September 23, 2008 (Presenter: Johannes)

Background reading:

Class cancelled on Thursday, September 25.

Concurrency Control and Recovery

Tuesday, September 30, 2008 (Presenter: Johannes)

Background reading:

Thursday, October 2, 2008 (Presenter: Alan Demers)

Tuesday, October 7, 2008 (Presenter: Christoph Koch)

Background reading:

No class on Thursday, October 9 due to Yom Kippur.

No class on Tuesday, October 14 due to Fall Break.

Thursday, October 16, 2008 (Class meets at 10:10am in Upson 111 together with CS 6410)

Tuesday, October 21, 2008 (Presenter: Robbert van Renesse)

Background reading:

Thursday, October 23, 2008

Tuesday, October 28, 2008 (Presenter: Christoph Koch)

Thursday, October 30, 2008 (Presenter: Hussam Abu-Libdeh)

Background reading:

Column Stores

Tuesday, November 4, 2008 (Presenter: Johannes)

Thursday, November 6, 2008 (Presenter: Lyublena Antova)

Tuesday, November 11, 2008 (Presenter: Michaela Goetz)

Thursday, November 13, 2008

Tuesday, November 18, 2008 (Presenter: Guozhang Wang)

Thursday, November 20, 2008 (Presenter: Ben Sowell)

Tuesday, November 25, 2008 (Presenter: Haoyuan Li)

No class on Thursday, November 27 due to Thanksgiving break.


Tuesday, December 2, 2008 (Presenter: Raluca Tanase)

Thursday, December 4, 2008 (Presenter: Johannes)