| Term | Fall 2023 | Instructor | Christopher De Sa | 
| Course website | www.cs.cornell.edu/courses/cs4787/2023fa/ | [email hidden] | |
| Schedule | MW 7:30-8:45PM | Office hours | Wednesdays 2PM | 
| Room | Phillips Hall 101 | Office | Gates 426 | 
[Canvas] [Discussion]
Description: CS4787 explores the principles behind scalable machine learning systems. The course covers the algorithmic and implementation principles that power the current generation of machine learning on big data. We will cover training and inference for both traditional ML algorithms, such as linear and logistic regression, and deep models. Topics will include: estimating statistics of data quickly with subsampling, stochastic gradient descent and other scalable optimization methods, mini-batch training, accelerated methods, adaptive learning rates, methods for scalable deep learning, hyperparameter optimization, parallel and distributed training, and quantization and model compression.
Prerequisites: CS4780 or equivalent, and CS2110 or equivalent.
Format: Lectures during the scheduled lecture period will cover the course content. Problem sets will be used to encourage familiarity with the content and develop competence with the more mathematical aspects of the course. Programming assignments will help build intuition and familiarity with how machine learning algorithms run. There will be one midterm exam and one final exam, each of which will test both theoretical knowledge and programming implementation of concepts.
Material: The course is based on books, papers, and other texts in machine learning, scalable optimization, and systems. Texts will be provided ahead of time on the website on a per-lecture basis. You aren't necessarily expected to read the texts, but they will provide useful background for the material we are discussing.
Grading: Students taking CS4787 will be evaluated on the following basis.
| 20% | Problem sets | 
| 40% | Programming assignments | 
| 15% | Prelim Exam | 
| 25% | Final Exam | 
CS5777 has an additional paper-reading component, and students taking CS5777 will be evaluated as follows.
| 15% | Problem sets | 
| 35% | Programming assignments | 
| 10% | Paper reading | 
| 15% | Prelim Exam | 
| 25% | Final Exam | 
Inclusiveness: You should expect and demand to be treated by your classmates and the course staff with respect. You belong here, and we are here to help you learn—and enjoy—this course. If any incident occurs that challenges this commitment to a supportive and inclusive environment, please let the instructor know so that we can address the issue. We are personally committed to this, and subscribe to the Computer Science Department's Values of Inclusion.
The course calendar is subject to change.
| Monday, August 21 | 
           
            Lecture 1. Introduction and course overview. [Notes PDF]
           
            Problem Set 1 Released. [Notebook] [HTML]
           | 
| Wednesday, August 23 | 
           
            Lecture 2. Linear algebra done efficiently: Mapping mathematics to numpy. ML via efficient kernels linked together in python. [Notebook] [HTML]
           
            Background reading material:
             
 | 
| Monday, August 28 | 
           
            Lecture 3. Software for learning with gradients. Numerical differentiation, symbolic differentiation, and automatic differentiation. [Notebook] [HTML]
           | 
| Wednesday, August 30 | 
           
            Background reading material:
             
 
            Programming Assignment 1 Released. [Instructions] [Starter Code]
           | 
| Monday, September 4 | 
           
            Labor Day. No Lecture.
           | 
| Wednesday, September 6 | 
           
            Background reading material:
             
 
            Problem Set 1 Due.
           | 
| Monday, September 11 | 
           
            Lecture 6. Scaling to complex models by learning with optimization algorithms. Learning in the underparameterized regime. Gradient descent, convex optimization and conditioning. [Notebook] [HTML] [Notes PDF]
           
            Background reading material:
             
 
            Problem Set 2 Released. [PDF]
           | 
| Wednesday, September 13 | 
           
            Background reading material:
             
 
            Programming Assignment 1 Due.
           | 
| Monday, September 18 | 
           
            Lecture 8. Adapting algorithms to hardware. Minibatching and the effect of the learning rate. Our first hyperparameters. [Notebook] [HTML]
           
            Background reading material:
             
 | 
| Wednesday, September 20 | 
           
            Lecture 9. Optimization techniques for efficient ML. Accelerating SGD with momentum. [Notebook] [HTML] [Notes PDF]
           
            Background reading material:
             
 
            Programming Assignment 2 Released. [Instructions] [Starter Code]
           
            Paper Reading 1 Released. [Instructions]
           | 
| Monday, September 25 | 
           
            Lecture 10. Optimization techniques for efficient ML, continued. Accelerating SGD with preconditioning and adaptive learning rates. [Notebook] [HTML] [Notes PDF]
           
            Background reading material:
             
 | 
| Wednesday, September 27 | 
           
            Lecture 11. Optimization techniques for efficient ML, continued. Accelerating SGD with variance reduction and averaging. [Notebook] [HTML] [Notes PDF]
           
            Background reading material:
             
 
            Problem Set 2 Due.
           | 
| Monday, October 2 | 
           
            Lecture 12. Sparsity and dimension reduction. [Notebook] [HTML] [Demo Notebook] [Demo HTML] [Notes PDF]
           
            Background reading material:
             
 
            Problem Set 3 Released. [PDF]
           
            Paper Reading 2 Released. [Instructions]
           | 
| Wednesday, October 4 | 
           
            Lecture 13. Deep neural networks review. The overparameterized regime and how it affects optimization. Matrix multiply as computational core of learning. [Demo Notebook] [Demo PDF] [Notes PDF]
           
            Background reading material:
             
 
            Programming Assignment 2 Due.
           
            Paper Reading 1 Due.
           | 
| Monday, October 9 | 
           
            Indigenous Peoples' Day. No Lecture.
           | 
| Wednesday, October 11 | 
           
            Lecture 14. Deep neural networks review continued. Transformers and sequence models. [Notes PDF]
           
            Background reading material:
             
 | 
| Monday, October 16 | 
           
            Lecture 15. Methods to Accelerate DNN training. Early stopping. Batch/layer normalization. [Notes PDF]
           
            Background reading material:
             
 | 
| Wednesday, October 18 | 
           
            Background reading material:
             
 
            Problem Set 3 Due.
           
            Paper Reading 2 Due.
           | 
| Monday, October 23 | 
           
            Background reading material:
             
 
            Programming Assignment 3 Released. [Instructions] [Starter Code]
           | 
| Wednesday, October 25 | 
           
            Lecture 18. Gaussian Processes and Bayesian Optimization. [Notes PDF]
           
            Background reading material:
             
 | 
| Monday, October 30 | 
           
            Lecture 19. Bayesian optimization continued, and prelim review.
           | 
| Tuesday, October 31 | 
           
            Prelim Exam. 7:30PM, OLH155.
           | 
| Wednesday, November 1 | 
           
            Lecture 20. Bayesian optimization continued; possibly start parallelism.
           | 
| Monday, November 6 | 
           
            Remote over Zoom: Lecture 21. Parallelism. [Notebook] [HTML] [Demo Notebook] [Demo HTML] [Notes PDF]
           
            Background reading material:
             
 
            Programming Assignment 3 Due.
           | 
| Wednesday, November 8 | 
           | 
| Monday, November 13 | 
           
            Lecture 23. Floating-point arithmetic. Quantized, low-precision machine learning. [Notes PDF] [Slides PDF]
           
            Background reading material:
             
 
            Programming Assignment 4 Released. [Instructions] [Starter Code]
           
            Paper Reading 3 Released. [Instructions]
           
            Problem Set 4 Released. [PDF]
           | 
| Wednesday, November 15 | 
           
            Background reading material:
             
 | 
| Monday, November 20 | 
           
            Background reading material:
             
 
            Programming Assignment 5 Released. [Instructions] [Starter Code]
           | 
| Wednesday, November 22 | 
           
            Thanksgiving Break. No Lecture.
           | 
| Monday, November 27 | 
           
            Lecture 26. Deployment and low-latency inference. Real-time learning. Deep neural network compression and pruning. [Notes PDF] [Slides PDF]
           
            Background reading material:
             
 
            Paper Reading 3 Due.
           | 
| Wednesday, November 29 | 
           
            Lecture 27. Online learning. Foundation Models. Transfer Learning. In-context learning. Fine-tuning. [Notes PDF]
           
            Programming Assignment 4 Due.
           | 
| Monday, December 4 | 
           
            Lecture 28. The future of machine learning. Competitors to the transformer. [Notes PDF]
           
            Problem Set 4 Due.
           
            Programming Assignment 5 Due.
           |