CS4320/CS5320: Big Data (Fall 2012)


·       Lecture: MWF, 2:30-3:20pm; Phillips Hall 101

·       Sections:

o   M, 3:35-4:25pm, Upson 315

o   M, 7:30-8:20pm, Upson 315

o   M, 8:35-9:25pm, Upson 315

o   T, 3:35-4:25pm, Upson 315

·       Instructor: Johannes Gehrke

o   Office hours: Fridays, 1:15-2:15pm, 4105B Upson Hall or by appointment.

·       TAs:

o   Gabriel Bender, Daniel Murphy, Sudip Roy, Shihui Sweet Song, Guozhang Wang

o   Office hours (Upson 328B, Bay D):

§  Wednesdays 4:30-5:30pm

§  Thursdays 11:00am-noon

·       Course Management System. The homework assignments, grades, course schedule, and lecture notes are available in the CMS.

·       Textbook: Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems, third edition, 2002.





·       (August 29) We will only finish relational algebra today. For Wednesday, please install MySQL on your laptop and bring your laptop to class. To install it, go to http://www.mysql.com/, open Downloads (GA), and download the MySQL Installer for Windows if you are running Windows; for the Mac version, see http://dev.mysql.com/doc/refman/5.6/en/macosx-installation.html. We will use MySQL Workbench in class starting on Wednesday.

·       (August 27) The slides for today’s lecture are here. Please catch up in the textbook through Chapter 4.

·       (August 23) The academic integrity form can be found here.

·       (August 23) Note that the CMS for the course is still in the process of being set up.

·       (August 23) On Friday, August 24, we will cover material from Chapter 3 in the book; the slides for Friday are here.

·       (August 23) The slides for the first lecture are now online. Please read Chapters 1 and 2 in the book to catch up. The full syllabus and more information will be available soon.

·       (August 22) There are no sections the first week of classes or the week of August 27. The sections will start the week of September 3.


Course Description


CS4320/CS5320 gives an introduction to relational database systems, NoSQL systems, and Big Data cloud infrastructure. Topics covered include the relational model, SQL, transactions, database design, NoSQL, large data processing, cloud data management, and concepts and algorithms for building Big Data systems. Students are encouraged to concurrently enroll in CS4321/CS5321 (Practicum in Database Systems) as well. The textbook is required, but the contents of the book do not constitute the syllabus for the course: the classroom lectures define the course content, and the textbook is a reference.




CS4320/CS5320 assumes knowledge of the material covered in CS2110 (Object-Oriented Programming and Data Structures) and CS3110 (Data Structures and Functional Programming).





The grades for CS4320/CS5320 will be determined based on four homework assignments (50%), two exams (49%), and participation in the course evaluation (1%).

·       Four Homework Assignments (50% of your grade, 12.5% each). Details about the homework assignments can be found in CMS once it becomes available. Tentative assignment due dates are September 26, October 15, November 8, and November 27.

·       Exams (49% of your grade)

o   Prelim: Thursday, October 18, 7:30pm; three different rooms: Upson 109, Upson 111, and Upson B17. (21% of your grade)

o   Final exam: Thursday, December 13, 9:00am; Olin 155. (28% of your grade)

·       Participate in the course evaluation at the end of the course. (1% of your grade)
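As an illustration of how the weights above combine, here is a minimal Python sketch. It assumes each component is scored as a fraction between 0 and 1; the names `WEIGHTS` and `course_score` and the sample scores are hypothetical and not part of any course tooling. It computes only the weighted percentage and says nothing about how letter grades are assigned.

```python
# Weights in percentage points, taken from the grading breakdown above.
WEIGHTS = {
    "hw1": 12.5,
    "hw2": 12.5,
    "hw3": 12.5,
    "hw4": 12.5,
    "prelim": 21.0,
    "final": 28.0,
    "evaluation": 1.0,
}


def course_score(scores):
    """Return the weighted course percentage (0 to 100).

    `scores` maps a component name to the fraction of points earned
    (0.0 to 1.0). Missing components count as 0 (an assumption of
    this sketch).
    """
    return sum(weight * scores.get(name, 0.0) for name, weight in WEIGHTS.items())


# Hypothetical student: 80% on every homework, 70% on the prelim,
# 90% on the final, and the course evaluation completed.
example = {
    "hw1": 0.8, "hw2": 0.8, "hw3": 0.8, "hw4": 0.8,
    "prelim": 0.7, "final": 0.9, "evaluation": 1.0,
}
print(round(course_score(example), 1))
```

For the sample scores, the homework contributes 40 points, the prelim 14.7, the final 25.2, and the evaluation 1, for a total of about 80.9.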




The course has associated discussion sections led by the TAs where the material from class will be reviewed through exercises. In particular, the sections focus on doing exercises from the book, hands-on exercises with a real database system, and answering questions about the homework assignments. The sections start the week of September 3.


Late Homework Submissions Policy 


All homework assignments have to be submitted via CMS in electronic format. You may submit scanned PDF files of your homework assignments, but there is a limit on upload size, and it is your responsibility to verify well before the deadline that the upload of scanned files succeeds. If there is a problem, submitting your homework via email or on paper is not an option. We suggest composing the homework assignments using a text editor or LaTeX and creating a PDF file for submission.


The assignments have strict deadlines at 11:59pm on the day they are due. If you submit the assignment up to 24 hours late, there will be a 15% penalty. If you submit your assignment between 24 and 48 hours late, there will be a 30% penalty. No homework submissions are accepted more than two days late. This may sound strict, but we want to be fair and have the same rules for everyone. We will try to provide the best help possible to help you succeed with the assignments, but you will have to allocate sufficient time to finish your homework assignments and submit them before the deadline. Late submissions will have to be emailed to the course instructor, Johannes Gehrke.
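The penalty tiers above can be written out as a small Python sketch. The function name `late_penalty` is hypothetical, and two boundary choices are assumptions of this sketch rather than stated policy: a submission exactly at the deadline counts as on time, and a submission exactly 24 (or 48) hours late falls into the lower tier. The function returns only the penalty rate; how that rate is applied to a score is not specified here.

```python
from datetime import datetime, timedelta
from typing import Optional


def late_penalty(deadline: datetime, submitted: datetime) -> Optional[float]:
    """Return the penalty rate for a submission, or None if it is not accepted.

    0.0  -> on time
    0.15 -> up to 24 hours late
    0.30 -> between 24 and 48 hours late
    None -> more than 48 hours late (not accepted)
    """
    lateness = submitted - deadline
    if lateness <= timedelta(0):
        return 0.0
    if lateness <= timedelta(hours=24):
        return 0.15
    if lateness <= timedelta(hours=48):
        return 0.30
    return None


# Example with the first tentative due date, September 26 at 11:59pm.
deadline = datetime(2012, 9, 26, 23, 59)
print(late_penalty(deadline, datetime(2012, 9, 27, 10, 0)))  # about 10 hours late
```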


Academic Integrity


Students at Cornell are expected to follow a strict Code of Academic Integrity, which is taken very seriously in the Department of Computer Science and in the course. If you are taking CS4320 or CS4321, please print and sign the Academic Integrity Form. You need to hand in your signed form to us in class.


Job Interviews


Many students who take CS4320/CS5320 also have job interviews in the fall. Please make sure not to schedule any interviews that would prevent you from attending the exams. We cannot move exams because of interviews, and we also cannot provide makeup exams. Makeup exams can only be scheduled for serious medical reasons, not because of job interviews. We know that some companies behave inflexibly regarding interview dates, but in this case you will have to state firmly that it is a Cornell rule that exams have priority. It is important not to tell recruiters that there is a possibility of moving the exam, because it is not true and they will take your word for it. Please keep in mind that traveling takes time: you should not schedule interviews a day before or after an exam, particularly if the interview is on the West Coast. Job interviews are no excuse for late submission of homework assignments.