6781, Spring 2020

Foundations of Modern Machine Learning

Overview

Key Information

Lectures: Tuesday and Thursday, 8:40am - 9:55am, Bard Hall 140

Teaching assistants: Wilson Yoo, Abhishek Shetty

Grading: 50% homeworks, 20% exam, 25% project and proposal, 5% scribe/participation

This course covers fundamental topics in the theory of machine learning for modern use, including statistical, computational, and social considerations. We start with the basic statistical and computational toolset required for understanding machine learning. We then explore a number of modern perspectives on machine learning, including connections between game theory and machine learning, robustness of machine learning to adversaries, a beyond-the-worst-case analysis perspective on learning, and ethics in machine learning. In addressing these, the course makes connections to statistics, algorithms, complexity theory, optimization, game theory, and more.

The list of topics is subject to change but will likely include the following:

  • Offline and online learning, including VC theory, online learning, mistake bounds, etc.

  • Computational complexity of learning.

  • Boosting.

  • Connections between game theory and learning theory.

  • Generative Adversarial Networks.

  • Beyond the worst-case analysis of machine learning.

  • Learning with statistical queries.

  • Differential privacy.

  • Fairness.

Prerequisites

There are no formal prerequisites for graduate students. Undergraduate students need to have taken CS 4820. Students need mathematical maturity and familiarity with writing and understanding theorems and proofs. Familiarity with probability theory and the basics of algorithms is required. No programming skills are required. Please see the instructor if you are unsure whether your background is suitable for the course.

Reference Material

There is no required textbook for this course. The following resources will be helpful for additional reading.

Office Hours

| Name | Email | Hours | Location |
|---|---|---|---|
| Nika Haghtalab | nika@cs.cornell.edu | Fridays 10am-11am | Gates 315 |
| Wilson Yoo | sy536@cornell.edu | Mondays 5:30pm-6:30pm | Rhodes 402 |
| Abhishek Shetty | avs88@cornell.edu | Thursdays 4:30pm-5:30pm | Rhodes 412 |

You may also reach out for discussion/questions on our course Piazza page.

Schedule (Subject to change)

The schedule of upcoming classes will be posted as the term progresses.
| Date | Topic | Slides/Notes | Readings |
|---|---|---|---|
| 01/21 | Introductions and Logistics | Slides | Chapter 1, UML |
| 01/23 | Consistency Model and Intro to PAC | Notes | Chapter 2, UML |
| 01/28 | Sample Complexity (finite hypothesis classes) | Notes | Chapter 2.2-3.1, UML |
| 01/30 | Combinatorial dimensions for learning | Notes | Chapter 6, UML |
| 02/04 | Sample Complexity (infinite hypothesis classes) | Notes | Chapter 6, UML |
| 02/06 | Sample complexity lower bounds | Notes | None |
| 02/11 | Finish PAC lower bound, intro to agnostic learning | Notes | Chapter 3.2, 4 |
| 02/13 | Agnostic Learning upper and lower bound | Notes | Chapter 28.2, UML |
| 02/18 | Introduction to hardness of learning | Notes | 6820 Notes on NP-Completeness |
| 02/20 | Representation Independent hardness | Notes | Daniely'16 |
| 02/25 | No Class -- Feb. Break | | |
| 02/27 | Mistake Bound and Weighted Majority | Notes | Chapter 21-21.2, UML |
| 03/03 | Learning from Experts | Notes | Blum and Mansour chapter |
| 03/05 | Online optimization I | Notes | |
| 03/10 | Follow The Regularized/Perturbed Leader | Notes | |
| 03/12 | Sequential Experimentation (Bobby Kleinberg) | Bobby's Slides, Notes | |
| 03/17-04/06 | Classes Suspended due to COVID-19 | | |
| 04/07 | FTPL Recap and Partial Information | Merged with 03/10 and 04/09 | |
| 04/09 | Multi-Armed Bandits | Notes | |
| 04/14 | Introduction to Boosting | Notes | |
| 04/16 | AdaBoost Error Analysis | Notes | |
| 04/21 | Boosting, Online Learning, and Games | Notes | |
| 04/23 | Oracle Efficient Online Learning I | Notes | |
| 04/28 | Generalized FTPL | Notes | |
| 04/30 | Learning in Presence of Noise | Notes | |
| 05/05 | Statistical Queries and Noise | Notes | |
| 05/07 | Differential Privacy | Notes | |
| 05/12 | Fairness in ML | Notes | |

Scribing and Class Participation

Every student will be responsible for scribing 1-2 lectures, depending on the number of students enrolled in the class. Scribing is worth 5% of your final grade.

During the add/drop period, i.e., until and including Feb 4, no scribing is needed. A form will be posted to sign up students for scribing after that period.

A lecture can be jointly scribed by two students. We highly encourage non-Ph.D. students to team up with a CS Ph.D. student for scribing. Please use the template and style file posted on the Piazza resources page.

The first draft of the scribed notes is due 2 work days after the corresponding lecture, i.e., Tuesday lecture notes are due on Thursday and Thursday lecture notes are due on the following Monday. Within these two days, the student also has to schedule a short (15-30 min) meeting with one of the TAs to go over the draft of the scribed notes and receive feedback. The final scribed notes should incorporate the TA's feedback and are due within 2 work days after the initial draft.
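As an illustration only, the scribing deadlines can be worked out with a small date calculation. The sketch below assumes that "work days" means Monday through Friday and ignores holidays and breaks; the helper names `add_work_days` and `scribe_deadlines` are hypothetical and not part of any course tooling.

```python
from datetime import date, timedelta
from typing import Tuple


def add_work_days(start: date, work_days: int) -> date:
    """Return the date that falls `work_days` weekdays (Mon-Fri) after `start`.

    Assumption: holidays and breaks are not taken into account.
    """
    current = start
    remaining = work_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def scribe_deadlines(lecture: date) -> Tuple[date, date]:
    """Return (first draft due date, final notes due date) for a lecture."""
    first_draft = add_work_days(lecture, 2)
    final_notes = add_work_days(first_draft, 2)
    return first_draft, final_notes


# Thursday 02/06 lecture: first draft due Monday 02/10, final notes due Wednesday 02/12.
print(scribe_deadlines(date(2020, 2, 6)))
```

Under these assumptions, a Tuesday lecture yields a first draft due on Thursday, which matches the rule stated above.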

Homeworks and Projects

The project and homework schedules have been altered due to the COVID-19 class suspension. HW3 has been cancelled and its weight is spread across the other homeworks. The due dates for HW4, HW5, and the final project have been adjusted.

There will be five written homeworks, one project proposal, and one final project. Written homeworks will involve deriving and proving mathematical results and critically analyzing the material presented in class.

Please submit your assignments on CMS. We highly encourage you to typeset your submissions. You can use the provided TeX source as a basis for your submission. Any part of the submitted work that is not readable by the TAs will be ignored.

Solutions will be released about 3-5 days after the deadline via Box.

Due dates (Subject to change)

| Homework | File | Posted Date | Due Date |
|---|---|---|---|
| Homework 1 | Piazza Link, TeX Template, solutions on CMS | 01/28 | 02/06 |
| Homework 2 | Piazza Link, TeX Template | 02/11 | 02/20 |
| Project Proposal | Piazza Link | 03/03 | 03/12 (last day to submit: 04/06) |
| Homework 3 | Cancelled due to COVID-19 | 03/17 | 03/26 |
| Homework 4 | Piazza Link | 04/14 | 04/27 |
| Homework 5 | Piazza Link | 04/30 | 05/12 |
| Final Project | | | 05/12 (5 days free extension) |

Homework Policies

  • Homework is due on CMS by the posted deadline.

  • Each homework is worth 10% of the final grade.

  • You have a budget of 5 late days (i.e., 24-hour periods after the time the assignment was due) throughout the semester for which there is no late penalty. Beyond this 5-day budget, assignments turned in late will incur a 1 percentage point reduction of the cumulative final homework grade for each 24-hour period for which the assignment is late. No assignment will be accepted after the solution is made public, which is typically 3-5 days after the time it was due.

  • Regrade requests can be submitted on CMS up to 7 days after the solutions are released.

  • You can discuss the homework with other students, but all final submitted work must be done entirely on your own, without looking at any notes or pictures from the work you did during group discussions. Be sure to mention your collaborators' names and netIDs in your writeup.
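The late-day arithmetic above can be sketched as follows. This is an illustrative reading of the policy, not an official grading script: it assumes that any started 24-hour period counts as a full late day (the policy does not say how partial periods are counted), and the function names are hypothetical.

```python
import math

LATE_DAY_BUDGET = 5        # penalty-free late days for the whole semester
PENALTY_PER_DAY = 1.0      # percentage points off the cumulative homework grade


def late_days_used(hours_late: float) -> int:
    """Count late days for one assignment.

    Assumption: a started 24-hour period counts as a full late day.
    """
    return max(0, math.ceil(hours_late / 24))


def homework_penalty(hours_late_per_assignment) -> float:
    """Total percentage-point reduction after the late-day budget is used up."""
    total_late_days = sum(late_days_used(h) for h in hours_late_per_assignment)
    penalized_days = max(0, total_late_days - LATE_DAY_BUDGET)
    return penalized_days * PENALTY_PER_DAY


# 30 hours late (2 days) plus 50 hours late (3 days) uses exactly the 5-day budget.
print(homework_penalty([0, 30, 50, 0]))   # 0.0
# One more assignment 10 hours late (1 day) exceeds the budget by one day.
print(homework_penalty([30, 50, 10]))     # 1.0
```

The sketch does not model the separate rule that no assignment is accepted once its solution has been released.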

Project Policies

  • The project proposal and final report are due on CMS by the posted deadline.

  • The project proposal is worth 5% and the final report is worth 20% of the final grade.

  • You can do the project in teams of at most 2 students.

  • The final report will be in the style of a conference submission, with abstract, introduction, main body, and conclusions. The project needs to be typeset with LaTeX, using 11pt font and 1-inch margins on letter-size paper. The main body of the paper can be at most 8 pages, not counting references and appendices.

  • We will have a poster session as well.

Exams

The class does not have a midterm. The class has a take-home final exam that is to be done individually by the students with no outside help. Details of the exam will be announced at a later date.

Academic Integrity

Academic integrity is strictly enforced. You are allowed to discuss the homeworks with other students. However, do not take any notes, pictures, recordings, etc. from your discussions. Your submission must be entirely your own work. You are allowed to consult online and textbook resources to achieve a deeper understanding of the topic, but do not look up answers to homework problems and exams. Cite all resources, including online sources, on your submissions. Acknowledge the names of those you have discussed the problems with on your submissions.

Be careful of what you share on Piazza. Do not share your answers or provide hints on Piazza. If your questions may reveal part of the answer to a posted problem, then post your questions privately.

The final exam is to be done individually by each student with no help from others. You may not give or receive any assistance from anyone during the exam. You may consult the resources linked on this page, but you may not use any other material during the exam.

Additional academic integrity resources: