Machine Learning

CS478 - Spring 2004
Cornell University
Department of Computer Science

 
Time and Place
First lecture: January 27, 2004
Last lecture: May 6, 2004
  • Tuesday, 11:40am - 12:55pm in Thurston 203
  • Thursday, 11:40am - 12:55pm in Thurston 203

Exam: April 13, 2004 (in class)

Instructor
Thorsten Joachims, tj@cs.cornell.edu, 4153 Upson Hall.
Office hours: Thursdays at 2:15pm - 3:00pm
Teaching Assistants
Filip Radliński, filip@cs.cornell.edu, 4154 Upson Hall.
Office hours: Wednesdays at 3:00pm - 4:00pm
Niranjan Nagarajan, niranjan@cs.cornell.edu, 4154 Upson Hall.
Office hours: Mondays at 11:00am - 12:00pm
Syllabus
Machine learning is concerned with the question of how to make computers learn from experience. This course will introduce the fundamental techniques and algorithms that constitute machine learning today, ranging from classification methods such as decision trees and support vector machines, through sequence models such as hidden Markov models, to unsupervised learning and clustering. The course will not only discuss algorithms and methods, but also provide an introduction to the theory of machine learning. In particular, the course will cover the following topics:
  • Concept Learning : Version space, generalization ordering
  • Decision Trees : TDIDT, Representation bias vs. search bias
  • Hypothesis Tests : Confidence intervals, resampling estimates
  • Linear Rules : Perceptron, Winnow
  • Support Vector Machines : Optimal hyperplane, Kernels
  • Generative Models : Bayes Rule, Naïve Bayes, MAP and Bayesian learning
  • Hidden Markov Models : Viterbi, Expectation-Maximization
  • Nearest Neighbor : K-NN, asymptotics
  • Learning Theory : PAC learning, No-Free-Lunch
  • Clustering : HAC, k-means, latent semantic indexing
  • Reinforcement Learning : Q-Learning, Temporal difference learning

Readings
  • 01/27: Mitchell, Chapter 1
  • 01/29: Mitchell, Chapter 2
  • 02/05: Mitchell, Chapter 3
  • 02/12: Mitchell, Chapter 5
  • 02/19: Mitchell, Sections 4.4 - 4.4.2
  • 02/24: Cristianini/Shawe-Taylor, Section 2.1
  • 02/26: Joachims, Chapter 3 (supplementary: Schoelkopf, "Statistical Learning and Kernel Methods")
  • 03/09: Mitchell, Chapter 6 (except 6.4 - 6.6)
  • 03/16: Manning/Schuetze, Chapter 9 (not 9.2.1, 9.3.1, 9.3.3, 9.4)
  • 03/18: Mitchell, Chapter 6, Sections 6.4 
  • 03/30: Mitchell, Chapter 8 (not 8.3, 8.4, 8.5)
  • 04/01: Mitchell, Chapter 7 (not 7.4.4, 7.5.3)
  • 04/15: Duda/Hart/Stork, Sections 10.1 - 10.9
Homework Assignments
Reference Material
The main textbook for the class is

Tom Mitchell, "Machine Learning", McGraw Hill, 1997.

In addition, we will provide hand-outs for topics not covered in the book. For further reading beyond the scope of the course, we recommend the following books:

  • Duda, Hart, Stork, "Pattern Classification", Wiley, 2000.
  • Hastie, Tibshirani, Friedman, "The Elements of Statistical Learning", Springer, 2001.
  • Cristianini, Shawe-Taylor, "An Introduction to Support Vector Machines", Cambridge University Press, 2000.
  • Joachims, "Learning to Classify Text using Support Vector Machines", Kluwer, 2002.
  • Devroye, Gyoerfi, Lugosi, "A Probabilistic Theory of Pattern Recognition", Springer, 1997.
  • Schoelkopf, Smola, "Learning with Kernels", MIT Press, 2001.
  • Vapnik, "Statistical Learning Theory", Wiley, 1998.
Prerequisites
Programming skills (e.g. COM S 211 or COM S 312), and basic knowledge of linear algebra and probability theory (e.g. COM S 280).
Grading
This is a 4-credit course. Grades will be determined based on a written mid-term exam, a final project, homework assignments, and class participation.
  • 25%: Mid-Term Exam
  • 25%: Final Project
  • 40%: Homework (5 homeworks, best 4 count towards grade)
  • 10%: Class Participation

All assignments are due at the beginning of class on the due date. Late assignments lose 10 points for each 24-hour period they are late. In addition, no assignments will be accepted after the solutions have been made available.
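
As an illustration of the late policy, the sketch below computes the deduction for a late assignment. It rests on assumptions the policy does not spell out: every started 24-hour period counts as a full period, scores are on a 100-point scale, and a score cannot fall below zero. The function names are hypothetical.

```python
import math


def late_penalty(hours_late: float, points_per_period: int = 10) -> int:
    """Points deducted for an assignment handed in `hours_late` hours late."""
    if hours_late <= 0:
        return 0
    # Assumption: a partially elapsed 24-hour period counts as a full period.
    return points_per_period * math.ceil(hours_late / 24)


def penalized_score(raw_score: float, hours_late: float) -> float:
    """Raw score minus the late penalty, floored at zero (an assumption)."""
    return max(0.0, raw_score - late_penalty(hours_late))
```

Under these assumptions, an assignment handed in 30 hours late has entered its second 24-hour period and loses 20 points.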

Roughly: A=90-100; B=80-90; C=70-80; D=60-70; F=below 60
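
The grading scheme can be worked through with a short sketch. It assumes that every component is scored out of 100, that a missing homework is entered as 0, and that a score exactly on a boundary (e.g. 90) receives the higher letter; the cutoffs above are only rough, so the letter mapping is illustrative. All names are hypothetical.

```python
# Component weights taken from the grading breakdown.
WEIGHTS = {
    "midterm": 0.25,
    "project": 0.25,
    "homework": 0.40,
    "participation": 0.10,
}


def homework_average(scores):
    """Average of the best 4 of the 5 homework scores."""
    if len(scores) != 5:
        raise ValueError("expected 5 homework scores (enter 0 for a missing one)")
    best_four = sorted(scores, reverse=True)[:4]
    return sum(best_four) / 4


def course_score(midterm, project, homeworks, participation):
    """Weighted course score on a 0-100 scale."""
    return (
        WEIGHTS["midterm"] * midterm
        + WEIGHTS["project"] * project
        + WEIGHTS["homework"] * homework_average(homeworks)
        + WEIGHTS["participation"] * participation
    )


def letter_grade(score):
    """Map a score to a letter using the rough cutoffs (boundary goes up)."""
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return letter
    return "F"
```

For example, homework scores of 100, 90, 80, 70 and 0 average to 85 once the lowest is dropped, so skipping a single homework does not by itself lower the homework component.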

Academic Integrity
This course follows the Cornell University Code of Academic Integrity, and each student in this course is expected to abide by it. Any work submitted by a student in this course for academic credit will be the student's own work. Violations of the rules (e.g. cheating, copying, non-approved collaborations) will not be tolerated.