Term  Fall 2022
Instructor  Christopher De Sa
Course website  www.cs.cornell.edu/courses/cs4787/2022fa/
[email hidden]
Schedule  MW 7:30-8:45PM
Office hours  Wednesdays 2PM
Room  Kimball Hall B11
Office  Gates 426
[Canvas] [Discussion] [CMS]
Description: CS4787 explores the principles behind scalable machine learning systems. The course will cover the algorithmic and implementation principles that power the current generation of machine learning on big data. We will cover training and inference for both traditional ML algorithms, such as linear and logistic regression, and deep models. Topics will include: estimating statistics of data quickly with subsampling, stochastic gradient descent and other scalable optimization methods, minibatch training, accelerated methods, adaptive learning rates, methods for scalable deep learning, hyperparameter optimization, parallel and distributed training, and quantization and model compression.
Prerequisites: CS4780 or equivalent, CS2110 or equivalent
Format: Lectures during the scheduled lecture period will cover the course content. Problem sets will be used to encourage familiarity with the content and develop competence with the more mathematical aspects of the course. Programming assignments will help build intuition and familiarity with how machine learning algorithms run. There will be one midterm exam and one final exam, each of which will test both theoretical knowledge and programming implementation of concepts.
Material: The course is based on books, papers, and other texts in machine learning, scalable optimization, and systems. Texts will be provided ahead of time on the website on a per-lecture basis. You aren't necessarily expected to read the texts, but they will provide useful background for the material we are discussing.
Grading: Students taking CS4787 will be evaluated on the following basis.
20%  Problem sets 
40%  Programming assignments 
15%  Prelim Exam 
25%  Final Exam 
CS5777 has an additional paper-reading component, and students taking CS5777 will be evaluated as follows.
15%  Problem sets 
35%  Programming assignments 
10%  Paper reading 
15%  Prelim Exam 
25%  Final Exam 
Inclusiveness: You should expect and demand to be treated by your classmates and the course staff with respect. You belong here, and we are here to help you learn—and enjoy—this course. If any incident occurs that challenges this commitment to a supportive and inclusive environment, please let the instructor know so that we can address the issue. We are personally committed to this, and subscribe to the Computer Science Department's Values of Inclusion.
Course calendar may be subject to change.
Monday, August 22
Lecture 1. Introduction and course overview. [Notes PDF]
Problem Set 1 Released. [Notebook] [HTML]

Wednesday, August 24
Lecture 2. Linear algebra done efficiently: Mapping mathematics to numpy. ML via efficient kernels linked together in python. [Notebook] [HTML]
Background reading material:

Monday, August 29
Hybrid Zoom/In Person — Lecture 3. Software for learning with gradients. Numerical differentiation, symbolic differentiation, and automatic differentiation. [Notebook] [HTML]

Wednesday, August 31
Background reading material:
Programming Assignment 1 Released. [Instructions] [Starter Code]
Paper Reading 1 Released. [Instructions]

Monday, September 5
Labor Day. No Lecture.

Wednesday, September 7
Background reading material:
Problem Set 1 Due.
Problem Set 2 Released. [PDF]

Monday, September 12
Lecture 6. Scaling to complex models by learning with optimization algorithms. Learning in the underparameterized regime. Gradient descent, convex optimization and conditioning. [Notebook] [HTML] [Notes PDF]
Background reading material:

Wednesday, September 14
Background reading material:
Programming Assignment 1 Due.
Paper Reading 1 Due.

Monday, September 19
Lecture 8. Stochastic gradient descent continued. Scaling to huge datasets with subsampling. [Notebook] [HTML]
Background reading material:
Programming Assignment 2 Released. [Instructions] [Starter Code]

Wednesday, September 21
Lecture 9. Adapting algorithms to hardware. Minibatching and the effect of the learning rate. Our first hyperparameters. [Notebook] [HTML] [Demo Notebook] [Demo HTML]
Background reading material:

Monday, September 26
Lecture 10. Optimization techniques for efficient ML. Accelerating SGD with momentum. [Notebook] [HTML] [Demo Notebook] [Demo HTML] [Notes PDF]
Background reading material:
Problem Set 2 Due.
Problem Set 3 Released. [PDF]
Paper Reading 2 Released. [Instructions]

Wednesday, September 28
Lecture 11. Optimization techniques for efficient ML, continued. Accelerating SGD with preconditioning and adaptive learning rates. [Notebook] [HTML] [Notes PDF]
Background reading material:

Monday, October 3
Lecture 12. Optimization techniques for efficient ML, continued. Accelerating SGD with variance reduction and averaging. [Notebook] [HTML] [Notes PDF]
Background reading material:
Programming Assignment 2 Due.
Programming Assignment 3 Released. [Instructions] [Starter Code]

Wednesday, October 5
Lecture 13. Sparsity and dimension reduction. [Notebook] [HTML] [Demo Notebook] [Demo HTML] [Notes PDF]
Background reading material:

Monday, October 10
Indigenous Peoples' Day. No Lecture.

Wednesday, October 12
Lecture 14. Deep neural networks review. The overparameterized regime and how it affects optimization. Matrix multiply as computational core of learning. [Notes PDF]
Background reading material:
Problem Set 3 Due.
Paper Reading 2 Due.
Problem Set 4 Released. [PDF]
Paper Reading 3 Released. [Instructions]

Monday, October 17
Lecture 15. Methods to accelerate DNN training. Early stopping. Batch normalization. [Demo Notebook] [Demo HTML] [Notes PDF]
Background reading material:
Programming Assignment 3 Due.

Wednesday, October 19
Lecture 16. Beyond supervised learning. Semi-supervised learning. Transfer learning. Self-supervised learning. [Notes PDF]
Background reading material:
Programming Assignment 4 Released. [Instructions] [Starter Code]

Monday, October 24
Lecture 17. Foundation models. Attention. Transformers. [Notes PDF]
Problem Set 4 Due.

Wednesday, October 26
Background reading material:

Thursday, October 27
Prelim Exam. 7:30PM, OLH155, OLH165.

Monday, October 31
Background reading material:
Paper Reading 3 Due.

Wednesday, November 2
Lecture 20. Hyperparameter Optimization Recap. (Lecture Spill Over From Monday; Same Notes/Slides as Monday)
Background reading material:
Programming Assignment 4 Due.
Problem Set 5 Released. [PDF]
Paper Reading 4 Released. [Instructions]

Monday, November 7
Lecture 21. Gaussian Processes and Bayesian Optimization. [Notes PDF]
Background reading material:
Programming Assignment 5 Released. [Instructions] [Starter Code]

Wednesday, November 9
Background reading material:

Monday, November 14
Problem Set 6 Released. [PDF]

Wednesday, November 16
Lecture 24. Floating-point arithmetic. Quantized, low-precision machine learning. [Notes PDF]
Background reading material:
Problem Set 5 Due.
Programming Assignment 6 Released. [Instructions] [Starter Code]
Paper Reading 5 Released. [Instructions]

Monday, November 21
Lecture 25. Distributed learning and the parameter server. [Notes PDF]
Background reading material:
Paper Reading 4 Due.
Programming Assignment 5 Due.

Wednesday, November 23
Thanksgiving Break. No Lecture.

Monday, November 28
Lecture 26. Machine learning on GPUs. ML Accelerators. [Notes PDF]
Background reading material:

Wednesday, November 30
Lecture 27. Deployment and low-latency inference. Real-time learning. Deep neural network compression and pruning. [Notes PDF]
Background reading material:
Problem Set 6 Due.

Monday, December 5
Lecture 28. Recap. Online learning. Scaling: the future of machine learning? [Notes PDF]
Programming Assignment 6 Due.
Paper Reading 5 Due.
