Time: Tuesdays, 2:30-3:30pm
Instructor: Johannes Gehrke, johannes@cs.cornell.edu,
http://www.cs.cornell.edu/johannes
Place: We now have a room for the remainder of
the semester: 5130 Upson Hall.
Seminar Homepage: http://www.cs.cornell.edu/Courses/cs732/2000fa
The CID for registering for the seminar is 959-346;
it is now also listed in the online roster.
Data mining is one
of the hot information technologies today. Algorithms that construct data mining
models from large databases have been the focus of much research recently.
We will primarily read
research papers from the literature. The homework for each meeting is
to read the papers and prepare to discuss them. Papers in electronic format are
posted on the seminar homepage; other papers will be made available in hardcopy.
The seminar will involve a mixture of lectures by the instructor and volunteers
from the audience, discussion of the readings, and possibly some guest speakers.
There is a small take-home final exam for students who want to receive credit for
the seminar.
This seminar is targeted at PhD students; no background knowledge in data
mining is required. Some undergraduate-level knowledge, such as elementary data
structures, basic sorting and searching, basic graph terminology, asymptotic
order-of-growth notation, and basic recurrence relations for analyzing
algorithms, will be assumed throughout the seminar. It will also be helpful to
have a basic understanding of probability and statistics (e.g., random variables
and their moments, elementary distributions).
We will cover roughly
the following topics: