Data Mining Seminar  

Updated Schedule (last update 10/13/2000)

 

Logistics

Time: Tuesdays, 2:30-3:30pm
Instructor: Johannes Gehrke, johannes@cs.cornell.edu, http://www.cs.cornell.edu/johannes
Place: We now have a room for the remainder of the semester: 5130 Upson Hall.
Seminar Homepage: http://www.cs.cornell.edu/Courses/cs732/2000fa  
The CID for registration for the seminar is 959-346, and this information is now also in the online roster.

Seminar Description

Data mining is one of today's most active areas of information technology. Algorithms that construct data mining models from large databases have been the focus of much recent research.

We will primarily read research papers from the literature. The homework for each meeting is to read the assigned papers and prepare to discuss them. Papers in electronic format are posted on the seminar homepage; other papers will be made available in hardcopy. The seminar will involve a mixture of lectures by the instructor and by volunteers from the audience, discussion of the readings, and possibly some guest speakers. There is a small take-home final exam for students who want to receive credit for the seminar.

This seminar is targeted at PhD students; no background knowledge in data mining is required. Some undergraduate-level knowledge, such as elementary data structures, basic sorting and searching, basic graph terminology, asymptotic order-of-growth notation, and basic recurrence relations for analyzing algorithms, will be assumed throughout the seminar. It will also be helpful to have a basic understanding of probability and statistics (e.g., random variables and their moments, elementary distributions).

We will roughly cover the following topics: