CS478
Machine Learning

Computer Science Department
Cornell University
Spring 2000

Time and Place

Tuesday, Thursday: 11:15-12:05
Thurston 203

CS478 Handouts
CS478 Lecture Notes
CS478 Assignments
CS478 Final Project
Related Links
Final Exam: Monday, May 15, 12-2:30 in B17 Upson
Academic integrity policy

Personnel for CS478

Instructor

Asst. Professor Claire Cardie

cardie@cs.cornell.edu

Office Hours: Weds 10-11; Thurs 1-2.
Office Address: Upson 4142.
Phone: x5-9206.

Teaching Assistant

Alin Dobra

dobra@cs.cornell.edu

Office Hours: Monday, 2-5.
Office Address: Upson 5152.
Phone: x5-3495.

Teaching Assistant

Kiri Wagstaff

wkiri@cs.cornell.edu

Office Hours: Tues 2:30-3:30, Weds 11-12.
Office Address: Upson 4156.
Phone: x5-5033.

CS478 Handouts

(Tentative) Course Syllabus (.ps, .pdf) (last modified 1/24)
General Course Information (.ps, .pdf) (last modified 1/24)

CS478 Lecture Notes

Introduction [Mitchell Ch1] (.ps, .pdf)

Version spaces [Mitchell Ch2] (.ps, .pdf) Warning: one slide per page necessary (rather than 4 per page) because of figures.

Inductive bias (.ps, .pdf)

Decision trees [Mitchell Ch3] (part 1) (.ps, .pdf) 2 per page

Decision trees (part 2) (.ps, .pdf) 2 per page

Decision stumps (.ps, .pdf) 2 per page

Evaluation [Mitchell 5.1, 5.2, 5.5, 5.6.0 (i.e. up to but not including 5.6.1)] (.ps, .pdf) 2 per page

Rule learning [Mitchell 10.1-10.5] (.ps, .pdf) 2 per page

Instance-based learning [Mitchell 8.1-8.3, 8.6-8.7]; feature selection and weighting (.ps, .pdf) *1* per page

Genetic algorithms and genetic programming [Mitchell Ch9] (.ps, .pdf) 2 per page

Artificial neural nets [Mitchell 4.1-4.3,4.5-4.9] (.ps, .pdf) 2 per page

Bayesian learning [Mitchell 6.1-6.3,6.7-6.10] (.ps, .pdf) 2 per page

Transformation-based learning (.ps, .pdf) 2 per page

COLT: Mistake-bounds framework, Winnow and Weighted-Majority algorithms [Mitchell 7.1-7.2,7.5]. Bagging. (.ps, .pdf) 2 per page

Association rules. (.ps, .pdf) 2 per page [format fixed 5/5]

Clustering. (.ps, .pdf) 2 per page

Support Vector Machines (.ps, .pdf).

CS478 Assignments

Critiques

Critique Guidelines. (Important note: All critiques must be typewritten.)

Paper Critique 1. Due Tuesday, Feb 8, at the beginning of class.
Fayyad, Piatetsky-Shapiro, and Smyth, 1996. Knowledge Discovery and Data Mining: Towards a Unifying Framework. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96), AAAI Press. (postscript)
- Grading guide
Paper Critique 2. Due Tuesday, Feb 29, at the beginning of class. Please single-space the critiques. One page maximum in length.
Concept Learning and the Problem of Small Disjuncts. Robert C. Holte, L. Acker, and B. Porter (1989). Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (IJCAI-89), pp. 813-818. (postscript)
Paper Critique 3. Due Tuesday, Mar 28, at the beginning of class.
Discovering Structure in Multiple Learning Tasks: The TC Algorithm. Sebastian Thrun and Joseph O'Sullivan (1996). Proceedings of the Thirteenth International Conference on Machine Learning (ICML-96). (postscript)

Homeworks

To submit your homework go to the submission page and follow the instructions.

REGRADES: Regrade requests must be submitted within two weeks of receiving the grade. E-mail a description of what you believe the problem to be to me and to the TAs (in a single message). Enclose any supporting code, documents, etc. Remember that any assignment submitted for a regrade may be checked for errors for which we forgot to deduct points the first time.

Homework 1. Due Feb 22, 11 a.m. Version spaces and decision trees. (.ps, .pdf)
Here are the implementation details.

solutions to the programming portion of the assignment (the decision trees generated for each data set)
solutions (pdf) for the written part (.ps and .pdf)
information on the grading

The average score was 66.9 (out of 91), the min was 9, and the max was 90.

Homework 2. ***EXTENSION***. Now due Friday, March 10, 10 a.m. Decision trees. (.ps, .pdf)
See here for more details.

Solutions (pdf) for the written part (.ps and .pdf)
comments and advice related to the programming problems

The average score was 75 (out of 83), the min was 9, and the max was 95 (std dev 15 pts).

Homework 3. Due electronically by Mar 30, 11 a.m. Rule learning and instance-based learning. (.ps, .pdf)
Solutions (.ps, .pdf) for homework 3 are available.

Homework 4. Due by April 11, 11 a.m. Instance-based learning, GAs, neural networks. The programming portion should be turned in electronically; the problem set portion can be turned in on paper (preferred) or electronically. (.ps, .pdf)

Solutions (pdf) for written part (.ps and .pdf)
Solutions for the programming part.

Final Projects

Project Proposal. Due on, or preferably well before, Tuesday, March 28, at the beginning of class.
Proposal format and project ideas are here.
Final Report. Due electronically, Friday, May 5.
Information on what should be in the final report is here. Submit the paper along with any supporting code via the usual electronic submission mechanism. Only submit the code that you wrote. The report and/or the README file should clearly indicate what code is yours and what code was obtained from another source.
