CS 4780 - Machine Learning
Machine Learning

CS 4780/5780 - Fall 2011
Cornell University
Department of Computer Science

 
Time and Place
First lecture: August 25, 2011
Last lecture: December 1, 2011
  • Tuesday, 1:25pm - 2:40pm in Hollister B14
  • Thursday, 1:25pm - 2:40pm in Hollister B14

First Prelim Exam: October 13
Second Prelim Exam: November 22
Project Due Date: December 15

Instructor
Thorsten Joachims, tj@cs.cornell.edu, 4153 Upson Hall.
Online Resources
  • [Piazza Forum]  We greatly prefer that you use this Piazza forum for questions and discussions. The forum is monitored by all the TAs and the professor, so you will get the best response time, and all the TAs will know the questions you asked and the answers you received.
  • [CMS] Assignments, the prereq exam, and all grades are posted on CMS.
  • [Videonote] Recordings of the lectures with annotations are available through Videonote.
Teaching Assistants and Consultants
Karthik Raman, TA, 4126 Upson Hall
Chenhao Tan, TA, 4121 Upson Hall
Adith Swaminathan, TA, 5132 Upson Hall
Igor Labutov, Consultant
Mevlana Gemici, Consultant
Anthony Chang, Consultant
Nic Williamson, Consultant
Heran Yang, Consultant
Boiar Qin, Consultant
Office Hours
Monday, 9:30am - 10:30am Adith Swaminathan Upson 328B, Bay B
Monday, 4:00pm - 5:00pm Karthik Raman Upson 328B, Bay B
Monday, 6:00pm - 7:00pm Chenhao Tan Upson 4121
Tuesday, 4:30pm - 5:20pm Thorsten Joachims Upson 4153
Tuesday, 6:00pm - 7:00pm Igor Labutov Upson 328B
Wednesday, 6:00pm - 7:00pm Chenhao Tan Upson 4121
Thursday, 6:00pm - 7:00pm Adith Swaminathan Upson 328B, Bay B
Friday, 5:00pm - 6:00pm Karthik Raman Upson 328B, Bay B
Saturday, 11:00am - 12:00pm Anthony Chang Upson 328B
Sunday, 12:30pm - 1:30pm Mevlana Gemici Upson 328B, Bay B
Sunday, 4:00pm - 5:00pm Boiar Qin Upson 328B, Bay A
Syllabus
Machine learning is concerned with the question of how to make computers learn from experience. The ability to learn is not only central to most aspects of intelligent behavior; machine learning techniques have also become key components of many software systems. For example, machine learning techniques are used to create spam filters, to analyze customer purchase data, and to detect fraudulent credit card transactions.

This course will introduce the fundamental set of techniques and algorithms that constitute machine learning today, ranging from classification methods like decision trees and support vector machines, through structured models like hidden Markov models and context-free grammars, to unsupervised learning and clustering. The course will not only discuss individual algorithms and methods, but also tie principles and approaches together from a theoretical perspective. In particular, the course will cover the following topics:

  • Concept Learning : Hypothesis space, version space
  • Instance-based Learning : K-Nearest Neighbors, collaborative filtering
  • Decision Trees : TDIDT, attribute selection, pruning and overfitting
  • ML Experimentation : Hypothesis tests, resampling estimates
  • Linear Rules : Perceptron, duality, mistake bound
  • Support Vector Machines : Optimal hyperplane, kernels, stability
  • Generative Models : Naïve Bayes, linear discriminant analysis
  • Hidden Markov Models : Probabilistic model, estimation, Viterbi
  • Structured Output Prediction : Predicting sequences, rankings, etc.
  • Learning Theory : PAC learning, mistake bounds, VC dimension
  • Clustering : HAC, k-means, mixture of Gaussians

 

Slides and Handouts
08/25: Introduction (PDF)
08/29: Instance-Based Learning (PDF)
09/01: Decision-Tree Learning (PDF)
09/13: Assessing Learning Results (PDF)
09/20: Linear Classifiers and Perceptrons (PDF)
09/27: Support Vector Machines: Optimal Hyperplanes (PDF)
09/29: Support Vector Machines: Duality and Leave-One-Out Error (PDF)
10/04: Support Vector Machines: Kernels (PDF)
10/14: Learning to Rank (PDF)
10/18: Generative Models, Naive Bayes, and Linear Discriminant (PDF)
11/01: Sequences and Hidden Markov Models (PDF)
11/08: Statistical Learning Theory (PDF)
11/15: Clustering (PDF)
11/15: Structured Prediction and Structural SVMs (PDF)
Reference Material
The main textbooks for the class are
  • Tom Mitchell, "Machine Learning", McGraw Hill, 1997.
  • Cristianini, Shawe-Taylor, "Introduction to Support Vector Machines", Cambridge University Press, 2000. (online via Cornell Library)
  • Schoelkopf, Smola, "Learning with Kernels", MIT Press, 2001. (online)

An additional textbook that can serve as a brief secondary reference on many topics in this class is

Ethem Alpaydin, "Introduction to Machine Learning", MIT Press, 2004.

There will also be additional readings for topics not covered in the main textbooks. For further reading beyond the scope of the course, we recommend the following books:

  • Bishop, "Pattern Recognition and Machine Learning", Springer, 2006.
  • Devroye, Gyoerfi, Lugosi, "A Probabilistic Theory of Pattern Recognition", Springer, 1997.
  • Duda, Hart, Stork, "Pattern Classification", Wiley, 2000.
  • Hastie, Tibshirani, Friedman, "The Elements of Statistical Learning", Springer, 2001.
  • Joachims, "Learning to Classify Text using Support Vector Machines", Kluwer, 2002.
  • Leeds Tutorial on HMMs (online)
  • Manning, Schuetze, "Foundations of Statistical Natural Language Processing", MIT Press, 1999. (online via Cornell Library)
  • Manning, Raghavan, Schuetze, "Introduction to Information Retrieval", Cambridge, 2008. (online)
  • Vapnik, "Statistical Learning Theory", Wiley, 1998.
Prerequisites
Programming skills (e.g. CS 2110 or CS 3110), and basic knowledge of linear algebra and probability theory (e.g. CS 2800).
Grading
This is a 4-credit course. Grades will be determined based on two written exams, a final project, homework assignments, in-class quizzes, a prereq exam, and class participation.
  • 40%: 2 Prelim Exams
  • 15%: Final Project
  • 35%: Homework (~5 assignments)
  • 5%: Quizzes (in class)
  • 2%: Prereq Exam
  • 3%: Class Participation

To eliminate outlier grades for homeworks and quizzes, the lowest grade is replaced by the second lowest grade in the final grade computation.
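As a rough illustration of how the weights and the outlier rule could combine, here is a minimal Python sketch. It is not the official grade computation: the function names are hypothetical, and it assumes that every score is on a 0-100 scale, that items within a category count equally, and that the outlier rule is applied separately to homeworks and to quizzes.

```python
# Weights taken from the grading breakdown above (percent of the final grade).
WEIGHTS = {
    "prelims": 40,
    "project": 15,
    "homework": 35,
    "quizzes": 5,
    "prereq": 2,
    "participation": 3,
}


def replace_lowest(scores):
    """Return a copy of scores with the lowest value replaced by the second lowest."""
    if len(scores) < 2:
        return list(scores)
    ordered = sorted(scores)
    result = list(scores)
    result[result.index(ordered[0])] = ordered[1]
    return result


def mean(scores):
    """Plain average; equal weighting within a category is an assumption."""
    return sum(scores) / len(scores)


def final_score(prelims, project, homework, quizzes, prereq, participation):
    """Weighted final score on a 0-100 scale (hypothetical helper)."""
    category = {
        "prelims": mean(prelims),
        "project": project,
        # Outlier rule: lowest homework/quiz grade replaced by the second lowest.
        "homework": mean(replace_lowest(homework)),
        "quizzes": mean(replace_lowest(quizzes)),
        "prereq": prereq,
        "participation": participation,
    }
    return sum(WEIGHTS[name] * category[name] for name in WEIGHTS) / 100
```

For instance, under these assumptions homework scores of 100, 60, 80, 90, and 70 are averaged as if the 60 were a 70.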

All assignments are due at the beginning of class on the due date. Assignments turned in late will be charged a late penalty of 5 points for each period of 24 hours for which the assignment is late. However, every student has a budget of 4 late days (i.e. 24 hour periods after the time the assignment was due) throughout the semester for which there is no late penalty. No assignment will be accepted after the solution has been made public, which is typically 5 days after the time it was due. You can submit late assignments in class, in office hours, or to the office of a TA.
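The late policy can be sketched as a small calculation. This is only an illustration under stated assumptions, not the course's actual bookkeeping: the function name is hypothetical, each started 24-hour period is assumed to count as a full late day, remaining free late days are assumed to be used up before any penalty applies, and the typical 5-day solution release is treated as a hard cutoff.

```python
import math

PENALTY_PER_DAY = 5   # points per late 24-hour period (from the policy above)
LATE_DAY_BUDGET = 4   # penalty-free late days per semester (from the policy above)
CUTOFF_DAYS = 5       # typical solution release; treated as a hard cutoff (assumption)


def late_penalty(hours_late, late_days_left=LATE_DAY_BUDGET):
    """Return (penalty_points, late_days_left_after), or None if not accepted.

    Assumptions: every started 24-hour period counts as one late day, and
    free late days are consumed before any points are deducted.
    """
    if hours_late <= 0:
        return 0, late_days_left
    if hours_late > CUTOFF_DAYS * 24:
        return None  # solution already public, assignment no longer accepted
    days_late = math.ceil(hours_late / 24)
    free_days = min(days_late, late_days_left)
    penalty = (days_late - free_days) * PENALTY_PER_DAY
    return penalty, late_days_left - free_days
```

Under these assumptions, an assignment handed in 30 hours late by a student with a full budget uses two late days and incurs no penalty, while the same submission with only one late day left costs 5 points.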

Graded homework assignments and prelims can be picked up in Upson 360 (opening hours Monday - Thursday 12noon - 4:00pm, Friday: 1:30pm - 4:00pm). Regrade requests can be submitted within 7 days after the grades have been made available on CMS. Regrade requests have to be submitted in writing and in hardcopy using this form (or similar). They can be submitted in class, in office hours, or to the office of a TA.

We always appreciate interesting homework solutions that go beyond the minimum. To reward homework solutions that are particularly nice, we will give you "Bonus Points". Bonus points are collected in a special category on CMS. Bonus points are not real points and are not summed up for the final grade, but they can nudge somebody who is right on the boundary to a higher grade.

All assignment, exam, and final grades are roughly on the following scale: A=92-100; B=82-88; C=72-78; D=60-68; F=below 60

Academic Integrity
This course follows the Cornell University Code of Academic Integrity, and each student in this course is expected to abide by it. Any work submitted by a student in this course for academic credit must be the student's own work. Violations of the rules (e.g. cheating, copying, non-approved collaborations) will not be tolerated.

We run automatic cheating detection to detect violations of the collaboration rules.