Syllabus for CS4/5780

Introduction to Machine Learning — Spring 2022

Term: Spring 2022
Instructor: Christopher De Sa
Course website: www.cs.cornell.edu/courses/cs4780/2022sp/
E-mail: [email hidden]
Schedule: MW 7:30-8:45PM
Office hours: Wednesdays 2PM
Room: Statler 185
Office: Gates 450

Description: CS4/5780 provides an introduction to machine learning, focusing on supervised learning and its theoretical foundations. Topics include regularized linear models, boosting, kernels, deep networks, generative models, online learning, and ethical questions arising in ML applications.

Prerequisites: probability theory (e.g. BTRY 3080, ECON 3130, MATH 4710, ENGRD 2700), linear algebra (e.g. MATH 2940), calculus (e.g. MATH 1920), and programming proficiency (e.g. CS 2110). You must also pass the prelim exam.

Format: In accordance with university guidelines, the first two weeks of the course will be taught entirely online. Both lectures and office hours will be conducted over Zoom. In-person lectures are expected to resume on February 7.

Material: The course is based on books, papers, and other texts in machine learning, scalable optimization, and systems. Texts will be provided ahead of time on the website on a per-lecture basis. You are not necessarily expected to read the texts, but they will provide useful background for the material we are discussing.

Logistics: For enrolled students, the companion Canvas page serves as a hub for access to the lecture Zoom links, TA office hour Zoom links, the TA office hour schedule, Ed Discussions (the course forum), Vocareum (for course projects), Gradescope (for HWs), and quizzes (for the placement exam and paper comprehension quizzes). If you are enrolled in the course, you should automatically have access to the site. Please let us know if you are unable to access it.

Late Days: All assignments have two (2) slip days associated with them. This means that you can submit everything (except exams) up to two days after the posted deadline without penalty. Further extensions warranted by extreme circumstances will be handled as needed on a case-by-case basis (inquire on Ed).

Grading: For CS4780, your grade in this course is composed of three components: homework, exams, and projects. CS5780 has an additional paper comprehension component. Please also read through the given references in concert with the lectures.

CS 4780 grade breakdown. For CS 4780, paper comprehension is not a required component, but you may choose to complete it if you want. Your final grade is the maximum over the following four schemes:

Scheme 1: 60% Exams, 40% Projects
Scheme 2: 50% Exams, 10% Problem Sets, 40% Projects
Scheme 3: 55% Exams, 35% Projects, 10% Paper Comprehension
Scheme 4: 45% Exams, 10% Problem Sets, 35% Projects, 10% Paper Comprehension

CS 5780 grade breakdown. For CS 5780, paper comprehension is a required component, and your final grade is the maximum over the following two schemes:

Scheme 1: 55% Exams, 35% Projects, 10% Paper Comprehension
Scheme 2: 45% Exams, 10% Problem Sets, 35% Projects, 10% Paper Comprehension

Regardless of which grading scheme you are targeting (or which ends up being the maximizer), homework must be completed. Homework will be graded for correctness, and these scores will be used to compute your overall homework grade. Provided you make a good-faith effort (as specified in class) on the homework, it does not factor into your final grade under the first scheme above. However, failure to make a good-faith effort on any homework assignment will result in a 5% penalty per missing assignment.
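As a rough illustration of the "maximum of" rule in the grade breakdowns above, the following sketch computes each scheme's weighted average and keeps the best one. The weights are copied from the breakdowns; the function name `final_grade`, the component keys, and the example scores are hypothetical, and the 5% penalty for missing homework is deliberately not modeled because the syllabus does not say exactly how it is applied.

```python
# Weights for each grading scheme, copied from the breakdowns above.
CS4780_SCHEMES = [
    {"exams": 0.60, "projects": 0.40},
    {"exams": 0.50, "problem_sets": 0.10, "projects": 0.40},
    {"exams": 0.55, "projects": 0.35, "paper_comprehension": 0.10},
    {"exams": 0.45, "problem_sets": 0.10, "projects": 0.35,
     "paper_comprehension": 0.10},
]
CS5780_SCHEMES = [
    {"exams": 0.55, "projects": 0.35, "paper_comprehension": 0.10},
    {"exams": 0.45, "problem_sets": 0.10, "projects": 0.35,
     "paper_comprehension": 0.10},
]


def final_grade(scores, schemes):
    """Return the best weighted average over all schemes.

    `scores` maps component name -> percentage in [0, 100].
    A component missing from `scores` counts as 0 (an assumption).
    """
    return max(
        sum(weight * scores.get(component, 0.0)
            for component, weight in scheme.items())
        for scheme in schemes
    )


# Hypothetical student, for illustration only.
scores = {
    "exams": 80.0,
    "problem_sets": 95.0,
    "projects": 90.0,
    "paper_comprehension": 70.0,
}
print("CS 4780:", final_grade(scores, CS4780_SCHEMES))
print("CS 5780:", final_grade(scores, CS5780_SCHEMES))
```

For this hypothetical student, the CS 4780 maximum comes from the second scheme, since the strong problem-set score outweighs the slightly lower exam weight.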

Inclusiveness: You should expect and demand to be treated by your classmates and the course staff with respect. You belong here, and we are here to help you learn—and enjoy—this course. If any incident occurs that challenges this commitment to a supportive and inclusive environment, please let the instructor know so that we can address the issue. We are personally committed to this, and subscribe to the Computer Science Department's Values of Inclusion.

COVID-19 considerations. We understand that the ongoing global health pandemic impacts all of you in varied and profound ways. Therefore, flexibility is important as we continue to navigate the current state of affairs. While many aspects of this course are built with flexibility in mind, if situations arise that may require additional accommodations please reach out to the instructors to discuss potential arrangements.

Mental health resources. Cornell University provides a comprehensive set of mental health resources, and the student group Body Positive Cornell has put together a flyer outlining the resources available.

Participation. You are encouraged to actively participate in class. This can take the form of asking questions in class, responding to questions posed to the class, and actively asking/answering questions on the online discussion board.

Collaboration policy. Students are free to share code and ideas within their stated project/homework group for a given assignment, but should not discuss details about an assignment with individuals outside their group. The midterm and final exam are individual assignments and must be completed by yourself.

Academic integrity. The Cornell Code of Academic Integrity applies to this course.

Accommodations. In compliance with Cornell University policy and equal access laws, we are available to discuss appropriate academic accommodations that may be required for students with disabilities. Requests for academic accommodations are to be made during the first three weeks of the semester, except for unusual circumstances, so arrangements can be made. Students are encouraged to register with Student Disability Services to verify their eligibility for appropriate accommodations.


Course references

Main texts. While this course does not explicitly follow a specific textbook, there are several that are very useful references to supplement the course.

Additional references.

Background references.

Software. Software we will use in CS4/5780 includes Python, NumPy, and PyTorch.


Course calendar may be subject to change.

Course Calendar

Monday, January 24
Lecture 1. Introduction. [Slides]

Prelim Exam Assigned.

Programming Assignment -1 assigned.

Reading material: MLAPP: 1.1, 1.2, and 1.3; ESL: Ch. 1; and PPA: Ch. 1.

Wednesday, January 26
Lecture 2. ML Basics. [Slides] [Notes] [Handwritten Notes]

Reading material: ESL: 2.1 and 2.2.

Monday, January 31
Lecture 3. k-nearest neighbors and the curse of dimensionality. [Slides] [Notebook] [Notes]

Homework 1 assigned.

Programming Assignment 0 assigned.

Reading material: MLAPP: 1.1, 1.2, 1.4.2, 1.4.3, and 1.4.9.

Wednesday, February 2
Lecture 4. K-means clustering. [Light Slides] [Dark Slides] [Notes]

Programming Assignment 1 assigned.

Prelim Exam Due.

Reading material: ESL: 14.3.6 and 14.3.7, and MLAPP: 11.4.1 and 11.4.2.

Monday, February 7
Lecture 5. Principal component analysis. [Notes] [Printable Notes PDF] [Document Camera Notes]

Reading material: ESL: 14.5.1 and 14.5.2.

Wednesday, February 9
Lecture 6. The Perceptron. [Notes] [Printable Notes PDF] [Document Camera Notes]

Paper Reading 1 assigned: Cover and Hart 1967.

Programming Assignment 0 due.

Reading material: Perceptron Wiki.

Monday, February 14
Lecture 7. MLE and MAP. [Notes] [Printable Notes PDF]

Homework 1 due.

Homework 2 assigned.

Reading material: Nice Youtube video for MLE and MAP. Ben Taskar's lecture notes. ESL: 8.2.2-8.3.

Wednesday, February 16
Lecture 8. Naive Bayes. [Notes] [Printable Notes PDF] [Document Camera Notes]

Programming Assignment 1 due.

Programming Assignment 2 assigned.

Reading material: ESL: 6.6.3; Tom Mitchell’s book chapter.

Monday, February 21
Lecture 9. Naive Bayes II. [Notes Same As Naive Bayes I] [Printable Notes PDF] [Document Camera Notes (with Lecture 10)]

Reading material: ESL: 6.6.3; Tom Mitchell’s book chapter.

Wednesday, February 23
Lecture 10. Logistic regression. [Notes] [Printable Notes PDF] [Document Camera Notes (with Lecture 9)]

Homework 2 due.

Paper Reading 1 due.

Homework 3 assigned.

Reading material: MLaPP: 8.2, 8.3, and 8.4; ESL 4.4.

Monday, February 28
February Break. No Classes. No lecture.
Wednesday, March 2
Lecture 11. Gradient descent and Newton's method. [Notes] [Printable Notes PDF] [Document Camera Notes]

Programming Assignment 2 due.

Programming Assignment 3 assigned.

Reading material: MLaPP: 8.2, 8.3, and 8.4.

Monday, March 7
Lecture 12. Linear Regression. [Notes] [Printable Notes PDF]

Reading material: MLaPP 8, 8.1, 8.2, 8.3.1, 8.3.2, 8.3.4, 8.6; Tom Mitchell’s book chapter on Naive Bayes and Linear Regression; ISL: 3.1; and ESL 3.2.

Wednesday, March 9
Lecture 13. Support vector machines (SVM). [Notes] [Printable Notes PDF]

Homework 3 due.

Homework 4 assigned.

Monday, March 14
Lecture 14. SVM + Empirical risk minimization (ERM). [Notes] [Printable Notes PDF]

Programming Assignment 3 due.

Programming Assignment 4 released.

Reading material: MLaPP 6.5-6.5.3.2.

Wednesday, March 16
Midterm Review.
Thursday, March 17
Prelim Exam. 7:30PM. BKL200, BKL219, BKL335.
Monday, March 21
Lecture 15. Kernels. [Notes] [Printable Notes PDF]

Homework 4 due.

Homework 5 assigned.

Reading material: MLaPP Ch. 14.

Wednesday, March 23
Lecture 16. Kernels II. [Notes] [Printable Notes PDF] [All Classifiers Demo]

Programming Assignment 4 due.

Programming Assignment 5 assigned.

Reading material: MLaPP Ch. 14.

Monday, March 28
Lecture 17. Bias-variance tradeoff. [Notes] [Printable Notes PDF] [Under/Overfitting Demo]
Wednesday, March 30
Lecture 18. ERM Returns + Model Selection. [Notes] [Printable Notes PDF] [Gradient Descent Under/Overfitting Demo]

Homework 5 due.

Monday, April 4
Spring Break. No Classes. No lecture.
Wednesday, April 6
Spring Break. No Classes. No lecture.
Monday, April 11
Lecture 19. Classification and regression trees. [Notes] [Printable Notes PDF] [KD-Tree Demo]

Programming Assignment 6 assigned.

Wednesday, April 13
Lecture 20. Classification and regression trees II. [Notes] [Printable Notes PDF]

Programming Assignment 5 due April 14.

Homework 6 assigned.

Monday, April 18
Lecture 21. Bagging. [Notes] [Printable Notes PDF] [Document Camera Notes]

Reading material: ISL 8.2.

Wednesday, April 20
Lecture 22. Boosting. [Notes] [Printable Notes PDF] [Multi-Lecture Document Camera Notes]

Programming Assignment 6 due.

Programming Assignment 7 assigned.

Reading material: ISL 8.2.

Monday, April 25
Lecture 23. Neural Networks. [Notes] [Printable Notes PDF] [Multi-Lecture Document Camera Notes]

Homework 7 assigned.

Reading material: ESL 11.3, 11.4, 11.5. For more in-depth coverage of the modern practice of neural networks, see the Deep Learning Book Part II.

Wednesday, April 27
Lecture 24. Neural Networks II. [Notes (same as previous)] [Multi-Lecture Document Camera Notes]

Homework 6 due.

Programming Assignment 8 assigned.

Monday, May 2
Lecture 25. Neural Networks III. [Notes (same as previous)] [Multi-Lecture Document Camera Notes]
Wednesday, May 4
Lecture 26. Neural Networks IV. [Notes (same as previous)] [Document Camera Notes]

Programming Assignment 7 due.

Monday, May 9
Lecture 27. Final Exam Review.

Homework 7 due.

Programming Assignment 8 due May 10.

Wednesday, May 18
Final Exam. 2:00PM. Barton Hall 100 West.