CS478
Machine Learning
Computer Science Department
Cornell University
Spring 2000
Time and Place
- Tuesday, Thursday: 11:15-12:05
- Thurston 203
Personnel for CS478
Instructor
Asst. Professor Claire Cardie
cardie@cs.cornell.edu
Office Hours: Weds 10-11; Thurs 1-2.
Office Address: Upson 4142.
Phone: x5-9206.
Teaching Assistant
Alin Dobra
dobra@cs.cornell.edu
Office Hours: Monday, 2-5.
Office Address: Upson 5152.
Phone: x5-3495.
Teaching Assistant
Kiri Wagstaff
wkiri@cs.cornell.edu
Office Hours: Tues 2:30-3:30, Weds 11-12.
Office Address: Upson 4156.
Phone: x5-5033.
(Tentative) Course Syllabus (.ps, .pdf) (last modified 1/24)
General Course Information (.ps, .pdf) (last modified 1/24)
Introduction [Mitchell Ch1] (.ps, .pdf)
Version spaces [Mitchell Ch2] (.ps, .pdf) Warning: one slide per
page is necessary (rather than 4 per page) because of the figures.
Inductive bias (.ps, .pdf)
Decision trees [Mitchell Ch3] (part 1) (.ps, .pdf) 2 per page
Decision trees (part 2) (.ps, .pdf) 2 per page
Decision stumps (.ps, .pdf) 2 per page
Evaluation [Mitchell 5.1, 5.2, 5.5, 5.6.0 (i.e. up to but not
including 5.6.1)] (.ps, .pdf) 2 per page
Rule learning [Mitchell 10.1-10.5] (.ps, .pdf) 2 per page
Instance-based learning [Mitchell 8.1-8.3, 8.6-8.7]; feature selection
and weighting (.ps, .pdf) *1* per page
Genetic algorithms and genetic programming [Mitchell Ch9] (.ps, .pdf) 2 per page
Artificial neural nets [Mitchell 4.1-4.3,4.5-4.9] (.ps, .pdf) 2 per page
Bayesian learning [Mitchell 6.1-6.3,6.7-6.10] (.ps, .pdf) 2 per page
Transformation-based learning (.ps, .pdf) 2 per page
COLT: Mistake-bounds framework, Winnow and Weighted-Majority
algorithms [Mitchell 7.1-7.2,7.5]. Bagging. (.ps, .pdf) 2 per page
Association rules. (.ps, .pdf) 2 per page [format fixed 5/5]
Clustering. (.ps, .pdf) 2 per page
Support Vector Machines (.ps, .pdf).
Critiques
Critique Guidelines. (Important note: All critiques must be
typewritten.)
- Paper Critique 1. Due Tuesday, Feb 8, at the beginning
of class.
Fayyad, Piatetsky-Shapiro, and Smyth, 1996.
Knowledge Discovery and Data Mining: Towards a Unifying
Framework. Proceedings of the Second International Conference on
Knowledge Discovery and Data Mining (KDD-96), AAAI Press. (postscript)
- Paper Critique 2. Due Tuesday, Feb 29, at the beginning
of class. Please single-space the critiques. Maximum length: one
page.
Concept Learning and the Problem of Small Disjuncts. Robert C. Holte,
L. Acker, and B. Porter (1989). Proceedings of the Eleventh
International Joint Conference on Artificial Intelligence
(IJCAI-89), pp. 813-818. (postscript)
- Paper Critique 3. Due Tuesday, Mar 28, at the beginning
of class.
Discovering Structure in Multiple Learning Tasks: The TC
Algorithm. Sebastian Thrun and Joseph O'Sullivan (1996). Proceedings
of the Thirteenth International Conference on Machine Learning
(ICML-96). (postscript)
Homeworks
To submit your homework, go to the submission page and follow the
instructions.
REGRADES: Regrade requests must be submitted within two weeks of
receiving the grade. E-mail a description of what you believe the
problem to be to me and to the TAs (in a single message). Enclose any
supporting code, documents, etc. Remember that any assignment
submitted for a regrade may also be checked for errors for which we
forgot to deduct points the first time.
Homework 1. Due Feb 22, 11 a.m. Version spaces and decision
trees. (.ps, .pdf).
Here are the implementation details.
- solutions to the programming portion of the assignment (the
  decision trees generated for each data set)
- solutions (pdf) for the written part (.ps and .pdf)
- information on the grading
The average score was 66.9 (out of 91), the min was 9, and the max was
90.
Homework 2. ***EXTENSION***. Now due Friday, March 10,
10 a.m. Decision trees. (.ps, .pdf).
See here for more details.
The average score was 75 (out of 83), the min was 9, and the max was
95 (std dev 15 pts).
Homework 3. Due electronically by Mar 30, 11 a.m. Rule learning and
instance-based learning. (.ps, .pdf).
Solutions (ps) (pdf) for homework 3 are available.
Homework 4. Due by April 11, 11 a.m. Instance-based
learning, GAs, neural networks. The programming portion should be
turned in electronically; the problem set portion can be turned in on
paper (preferred) or electronically. (.ps, .pdf).
- Checkers
  - You can register here and play against Chinook, the Checkers
    program mentioned in lecture on 1/27.
  - You can read more about Chinook (article from AI Magazine).
  - Arthur Samuel was the first to craft a Checkers program based on
    a learning algorithm (he used reinforcement learning), in 1959
    (see here for some discussion).
- Backgammon
  - Gerald Tesauro wrote TD-Gammon. It is not available online to
    play against, but you can read about the details.
- Prof. Schneider's page in the annual report
- ML Software
  - Weka - Java implementation of a lot of ML algorithms.
  - MLC++ - C++ implementation of ML algorithms.
- ML Data
- Other Resources