CS7792 - Counterfactual Machine Learning
Special Topics in Machine Learning
Fall 2018 

Time and Place
First meeting: August 24, 2018

Course Description
How many clicks will a new ad-placement system get? Will a different news-ranking algorithm increase the dwell times of the users? What ranking function will minimize abandonment in my search engine? Answering such evaluation and learning questions is at the core of improving many of the online systems we use every day. This seminar addresses the problem of using past human-interaction data (e.g. click logs) to learn to improve the performance of the system. This requires integrating causal inference models into the design of the learning algorithm, since we need to make predictions about the system's performance after an intervention (e.g. fielding a new ranking function). This seminar discusses the emerging research area of counterfactual machine learning at the intersection of machine learning, causal inference, economics, and information retrieval. Topics include causal inference in the counterfactual model, observational vs. experimental data, full-information vs. partial-information data, batch learning from bandit feedback, handling selection bias in data, and policy learning vs. reward prediction. Concepts will be illustrated with applications in search engines, recommender systems, and computational advertising. The prerequisites for the class are: knowledge of machine learning algorithms and their theory, basic probability, basic statistics, and general mathematical maturity. Enrollment is limited to PhD students.

Syllabus


Contact
Please use the CS7792 Piazza Forum for questions and discussions. Otherwise, contact Thorsten Joachims [Office hours: Fridays, 11:10am-12:10pm (Gates 418)]. For peer feedback, we are using a CMT instance for this course. For grades, we are using CMS.

Grading
This is a 1-credit seminar. S/U only (no letter grade, no audit). Grades will be determined based on quizzes, paper presentations, peer reviewing, and class participation. For the paper presentations, we will use peer review. This means that you will comment on other students' presentations, giving constructive feedback. The quality of your reviewing also becomes a component of your own grade. To eliminate outlier grades for quizzes and peer reviews, the lowest grade is replaced by the second-lowest grade when grades are cumulated at the end of the semester. So, missing one week is no big deal. To pass the course, you need to get at least half of the cumulative quiz points, half of the presentation points, half of the peer reviewing points, and half of the class participation points.
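As an illustration of the outlier rule and the pass criterion, here is a minimal Python sketch. The function names, the component keys, and the point values are hypothetical and chosen only for the example; the actual point scales are not specified here, and the sketch assumes the rule is applied once per component (quizzes, peer reviews).

```python
def cumulate_with_outlier_rule(scores):
    """Sum scores after replacing the single lowest score with the second lowest.

    Assumption: the rule is applied to the full list of scores for one
    component at the end of the semester.
    """
    if len(scores) < 2:
        return sum(scores)       # nothing to replace with
    ordered = sorted(scores)
    ordered[0] = ordered[1]      # lowest grade takes the value of the second lowest
    return sum(ordered)


def passes(earned, possible):
    """True if at least half of the possible points were earned in every component."""
    return all(earned[name] >= 0.5 * possible[name] for name in possible)


# Hypothetical quiz scores: one missed week (0 points) out of four quizzes.
quiz_total = cumulate_with_outlier_rule([0, 8, 9, 10])   # 0 is replaced by 8 -> 35

# Hypothetical cumulative points per component.
earned = {"quizzes": quiz_total, "presentation": 6, "peer_review": 12, "participation": 5}
possible = {"quizzes": 40, "presentation": 10, "peer_review": 20, "participation": 10}
print(quiz_total, passes(earned, possible))
```

With these made-up numbers, the missed quiz counts as 8 points instead of 0, and every component reaches at least half of its possible points.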

Reference Material
We will mostly read original research papers, but the following books provide entry points for the main topics of the class:
Other sources for general background on machine learning are:


Academic Integrity
This course follows the Cornell University Code of Academic Integrity, and each student in this course is expected to abide by it. Any work submitted by a student in this course for academic credit will be the student's own work. Collaborations are allowed only if explicitly permitted. Violations of the rules (e.g. cheating, copying, non-approved collaborations) will not be tolerated.