Machine Learning
COM S 478  Spring 2007 

Time and Place  
First lecture: January 23, 2007. Last lecture: May 3, 2007.
First Prelim Exam: Tuesday, March 13, in Thurston 203

Instructor  
Thorsten Joachims, tj@cs.cornell.edu, 4153 Upson Hall.  
Mailing List and Newsgroup  
[cs478staffl@lists.cs.cornell.edu] Please contact us through this mailing list. It reaches all the TAs and Prof. Joachims, so you will get the best response time, and every TA will see both your question and the answers you receive. This makes the job easier for you and for us.  
[cornell.class.cs478] We will post announcements to this newsgroup, and students can use it to communicate with each other. You can find instructions for accessing the newsgroup at http://www.cit.cornell.edu/services/netnews/  
Teaching Assistants  
ChunNam Yu, cnyu@cs.cornell.edu, 5138 Upson Hall.  
Evan Herbst  
Gary Soedarsono  
Office Hours  
Mondays, 2:30pm - 3:30pm, ChunNam Yu, 5138 Upson  
Tuesdays, 1:30pm - 2:30pm, Thorsten Joachims, 4153 Upson  
Wednesdays, 5:00pm - 6:00pm, Gary Soedarsono, Upson 328X (X varying)  
Fridays, 3:30pm - 4:30pm, Evan Herbst, Upson 328X (X varying)  
Syllabus  
Machine learning is concerned with the question of how to make computers learn from experience. The ability to learn is central to most aspects of intelligent behavior, and machine learning techniques have also become key components of many software systems. For example, machine learning techniques are used to create spam filters, to analyze customer purchase data, and to detect fraudulent credit card transactions.
This course will introduce the fundamental set of techniques and algorithms that constitute machine learning as of today, ranging from classification methods like decision trees and support vector machines, through structured models like hidden Markov models and context-free grammars, to unsupervised learning and reinforcement learning. The course will not only discuss individual algorithms and methods, but also tie principles and approaches together from a theoretical perspective. In particular, the course will cover the following topics:


Reference Material  
The main textbook for the class is
A good additional textbook as a secondary reference is
In addition, we will provide handouts for topics not covered in the book. For further reading beyond the scope of the course, we recommend the following books:


Prerequisites  
Programming skills (e.g. COM S 211 or COM S 312), and basic knowledge of linear algebra and probability theory (e.g. COM S 280).  
Grading  
This is a 4-credit course. Grades will be determined based on two written exams, a final project, homework assignments, and class participation.
All assignments are due at the beginning of class on the due date. Late assignments lose 5 points for each 24-hour period they are late. In addition, no assignments will be accepted after the solutions have been made available. Rough grade ranges: A=92-100; B=82-88; C=72-78; D=60-68; F=below 60 
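As an illustration of the late policy, the deduction can be sketched as a short calculation. This is a minimal sketch and not an official grading tool: the function name `late_penalty` is hypothetical, and it assumes that a partial 24-hour period counts as a full period, which the policy text does not state explicitly.

```python
import math


def late_penalty(hours_late: float, points_per_period: int = 5) -> int:
    """Return the points deducted for an assignment handed in `hours_late` hours late."""
    if hours_late <= 0:
        # On time (or early): no deduction.
        return 0
    # Assumption: any started 24-hour period counts as a full period.
    periods = math.ceil(hours_late / 24)
    return points_per_period * periods
```

Under that assumption, an assignment handed in 25 hours after the start of class would lose 10 points. The sketch does not model the separate rule that nothing is accepted once solutions have been made available.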

Academic Integrity  
This course follows the Cornell University Code of Academic Integrity, and each student in this course is expected to abide by it. Any work submitted by a student in this course for academic credit must be the student's own work. Violations of the rules (e.g. cheating, copying, non-approved collaborations) will not be tolerated. 