Syllabus for CS4787

Principles of Large-Scale Machine Learning — Spring 2021

Term: Spring 2021
Instructor: Christopher De Sa
Course website: www.cs.cornell.edu/courses/cs4787/2021sp/
E-mail: [email hidden]
Schedule: MW 7:30-8:45PM
Office hours: Wednesdays 2PM
Room: Zoom
Office: Zoom

Description: CS4787 explores the principles behind scalable machine learning systems. The course will cover both the algorithmic and the implementation principles that power the current generation of machine learning on big data. We will cover training and inference for traditional ML algorithms, such as linear and logistic regression, as well as for deep models. Topics will include: estimating statistics of data quickly with subsampling, stochastic gradient descent and other scalable optimization methods, mini-batch training, accelerated methods, adaptive learning rates, methods for scalable deep learning, hyperparameter optimization, parallel and distributed training, and quantization and model compression.

Prerequisites: CS4780 or equivalent, and CS 2110 or equivalent

Format: Lectures during the scheduled lecture period will cover the course content. Problem sets will be used to encourage familiarity with the content and develop competence with the more mathematical aspects of the course. Programming assignments will help build intuition and familiarity with how machine learning algorithms run. There will be one midterm exam and one final exam, each of which will test both theoretical knowledge and programming implementation of concepts.

Material: The course is based on books, papers, and other texts in machine learning, scalable optimization, and systems. Texts will be provided ahead of time on the website on a per-lecture basis. You are not expected to read every text, but they provide useful background for the material we are discussing.

Grading: Students will be evaluated on the following basis.

20%  Problem sets
40%  Programming assignments
15%  Prelim exam
25%  Final exam

Inclusiveness: You should expect and demand to be treated by your classmates and the course staff with respect. You belong here, and we are here to help you learn—and enjoy—this course. If any incident occurs that challenges this commitment to a supportive and inclusive environment, please let the instructor know so that we can address the issue. We are personally committed to this, and subscribe to the Computer Science Department's Values of Inclusion.

TA Office Hours



The course calendar is subject to change.

Course Calendar

Monday, February 8
Lecture 1. Introduction and course overview. [Notes]

Problem Set 1 Released.

Wednesday, February 10
Lecture 2. Linear algebra done efficiently: Mapping mathematics to numpy. [Slides Notebook] [Slides HTML]
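
To give a flavor of this lecture, here is a minimal sketch (illustrative names, not course-provided code) of mapping a mathematical operation to vectorized numpy:

    import numpy as np

    # Vectorized predictions X @ w versus an explicit Python loop.
    n, d = 10000, 100
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, d))
    w = rng.standard_normal(d)

    # Loop version: one dot product per example (slow in pure Python).
    preds_loop = np.array([X[i].dot(w) for i in range(n)])

    # Vectorized version: a single matrix-vector multiply.
    preds_vec = X @ w

    assert np.allclose(preds_loop, preds_vec)
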
Monday, February 15
Lecture 3. Scaling to complex models by learning with optimization algorithms. Gradient descent, convex optimization and conditioning. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
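
A minimal sketch of gradient descent on a least-squares objective (all names illustrative, not course-provided code):

    import numpy as np

    # Gradient descent on f(w) = (1/2n) * ||X w - y||^2.
    def gradient_descent(X, y, alpha=0.1, num_iters=100):
        n, d = X.shape
        w = np.zeros(d)
        for _ in range(num_iters):
            grad = X.T @ (X @ w - y) / n   # gradient of the loss at w
            w -= alpha * grad              # step against the gradient
        return w

    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 5))
    y = X @ np.ones(5)
    print(gradient_descent(X, y))          # approaches the all-ones solution
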

Programming Assignment 1 Released.

Background reading material:

Wednesday, February 17
Lecture 4. Gradient descent continued. Stochastic gradient descent. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
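
A minimal sketch of stochastic gradient descent for the same least-squares objective as above (illustrative, not course-provided code): each step uses the gradient at a single randomly sampled example rather than the full dataset.

    import numpy as np

    def sgd(X, y, alpha=0.01, num_iters=5000, seed=0):
        n, d = X.shape
        rng = np.random.default_rng(seed)
        w = np.zeros(d)
        for _ in range(num_iters):
            i = rng.integers(n)                  # sample one example
            grad_i = (X[i] @ w - y[i]) * X[i]    # stochastic gradient
            w -= alpha * grad_i
        return w
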

Background reading material:

Monday, February 22
Lecture 5. Stochastic gradient descent continued. Scaling to huge datasets with subsampling. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]

Problem Set 1 Due.

Background reading material:

Wednesday, February 24
Lecture 6. Adapting algorithms to hardware. Minibatching and the effect of the learning rate. Our first hyperparameters. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
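
A minimal sketch of minibatch SGD (illustrative names, not course-provided code): each step averages stochastic gradients over a batch of B examples, which maps well to vectorized hardware. The batch size B and step size alpha are the hyperparameters the lecture title refers to.

    import numpy as np

    def minibatch_sgd(X, y, alpha=0.05, B=32, num_iters=1000, seed=0):
        n, d = X.shape
        rng = np.random.default_rng(seed)
        w = np.zeros(d)
        for _ in range(num_iters):
            idx = rng.integers(n, size=B)        # sample a minibatch
            Xb, yb = X[idx], y[idx]
            grad = Xb.T @ (Xb @ w - yb) / B      # averaged gradient, one matmul
            w -= alpha * grad
        return w
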

Problem Set 2 Released. Note that this is a half-length problem set designed to be completed in one week rather than two, so that it can be finished before the prelim exam.

Background reading material:

Monday, March 1
Lecture 7. The mathematical hammers behind subsampling. Estimating large sums with samples, e.g. the empirical risk. Concentration inequalities. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
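
A minimal sketch of the idea (illustrative data, not course-provided code): estimate a huge average by subsampling, with Hoeffding's inequality bounding the error of the sample mean for values in [0, 1].

    import numpy as np

    # For i.i.d. samples bounded in [0, 1], Hoeffding's inequality gives
    # P(|mean_est - mean| >= t) <= 2 exp(-2 n t^2) for n samples.
    rng = np.random.default_rng(0)
    losses = rng.uniform(0, 1, size=10_000_000)  # stand-in for per-example losses

    n = 10_000
    sample = rng.choice(losses, size=n)          # sample with replacement
    print("true mean:", losses.mean())
    print("estimate: ", sample.mean())
    print("Hoeffding bound at t=0.01:", 2 * np.exp(-2 * n * 0.01**2))
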

Programming Assignment 1 Due.

Background reading material:

Wednesday, March 3
Lecture 8. Optimization techniques for efficient ML. Accelerating SGD with momentum. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
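
A minimal sketch of SGD with momentum (illustrative names, not course-provided code): the update direction is a running average of past gradients, controlled by the momentum parameter beta.

    import numpy as np

    def sgd_momentum(X, y, alpha=0.01, beta=0.9, num_iters=5000, seed=0):
        n, d = X.shape
        rng = np.random.default_rng(seed)
        w, v = np.zeros(d), np.zeros(d)
        for _ in range(num_iters):
            i = rng.integers(n)
            grad = (X[i] @ w - y[i]) * X[i]
            v = beta * v + grad      # accumulate velocity
            w -= alpha * v           # step along the velocity
        return w
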

Problem Set 2 Due. So that late days do not conflict with the prelim, this problem set may be submitted late, until Monday, March 8, with no penalty.

Background reading material:

Thursday, March 4
Prelim Exam. 8:30PM. Exam released on Gradescope and on Canvas. The exam may cover topics up to Lecture 8, including scalability in ML, gradient descent, stochastic gradient descent, convexity and strong convexity, the computational cost of learning algorithms, concentration inequalities, momentum, and writing learning algorithms in numpy.
Monday, March 8
Lecture 9. Optimization techniques for efficient ML, continued. Accelerating SGD with preconditioning and adaptive learning rates. [Notes] [Slides Notebook] [Slides HTML]
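
A minimal sketch of AdaGrad-style diagonal preconditioning (illustrative names, not course-provided code): each coordinate gets its own learning rate, scaled by the accumulated squared gradients.

    import numpy as np

    def adagrad(X, y, alpha=0.1, eps=1e-8, num_iters=5000, seed=0):
        n, d = X.shape
        rng = np.random.default_rng(seed)
        w, r = np.zeros(d), np.zeros(d)
        for _ in range(num_iters):
            i = rng.integers(n)
            grad = (X[i] @ w - y[i]) * X[i]
            r += grad ** 2                          # per-coordinate accumulator
            w -= alpha * grad / (np.sqrt(r) + eps)  # adaptive per-coordinate step
        return w
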

Programming Assignment 2 Released.

Background reading material:

Wednesday, March 10
Wellness Day. No classes; no lecture.
Monday, March 15
Lecture 10. Optimization techniques for efficient ML, continued. Accelerating SGD with variance reduction and averaging. [Notes] [Slides Notebook] [Slides HTML]
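
A minimal sketch of SVRG-style variance reduction (illustrative names, not course-provided code): periodically compute a full gradient at a snapshot point, then use it to correct each stochastic gradient.

    import numpy as np

    def svrg(X, y, alpha=0.05, num_epochs=20, seed=0):
        n, d = X.shape
        rng = np.random.default_rng(seed)
        w = np.zeros(d)
        for _ in range(num_epochs):
            w_snap = w.copy()
            full_grad = X.T @ (X @ w_snap - y) / n       # anchor gradient
            for _ in range(n):
                i = rng.integers(n)
                g_w = (X[i] @ w - y[i]) * X[i]
                g_snap = (X[i] @ w_snap - y[i]) * X[i]
                w -= alpha * (g_w - g_snap + full_grad)  # variance-reduced step
        return w
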

Problem Set 3 Released.

Background reading material:

Wednesday, March 17
Lecture 11. Dimensionality reduction and sparsity. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
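
A minimal sketch of dimensionality reduction by random projection, in the Johnson-Lindenstrauss style (illustrative sizes, not course-provided code): project d-dimensional data down to k dimensions while approximately preserving distances.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 1000, 10_000, 256
    X = rng.standard_normal((n, d))
    P = rng.standard_normal((d, k)) / np.sqrt(k)  # scaled Gaussian projection
    X_low = X @ P                                 # n x k, much smaller

    # Pairwise distances are roughly preserved.
    print(np.linalg.norm(X[0] - X[1]), np.linalg.norm(X_low[0] - X_low[1]))
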

Background reading material:

Monday, March 22
Lecture 12. Deep neural networks. Matrix multiply as computational core of learning. [Notes] [Demo Notebook] [Demo HTML]
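
A minimal sketch of the point (illustrative sizes, not course-provided code): the forward pass of a two-layer fully connected network is dominated by matrix multiplies.

    import numpy as np

    def mlp_forward(X, W1, b1, W2, b2):
        h = np.maximum(X @ W1 + b1, 0)   # hidden layer: matmul + ReLU
        return h @ W2 + b2               # output layer: another matmul

    rng = np.random.default_rng(0)
    X = rng.standard_normal((64, 784))               # a minibatch of inputs
    W1, b1 = rng.standard_normal((784, 256)), np.zeros(256)
    W2, b2 = rng.standard_normal((256, 10)), np.zeros(10)
    print(mlp_forward(X, W1, b1, W2, b2).shape)      # (64, 10)
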

Programming Assignment 2 Due.

Programming Assignment 3 Released.

Background reading material:

Wednesday, March 24
Lecture 13. Automatic differentiation and ML frameworks. [Notes] [Demo Notebook] [Demo HTML]
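
A toy scalar reverse-mode automatic differentiation, in the spirit of what ML frameworks do on tensors (a minimal sketch, not any framework's actual API):

    class Var:
        def __init__(self, value, parents=()):
            self.value, self.parents, self.grad = value, parents, 0.0

        def __mul__(self, other):
            # d(xy)/dx = y, d(xy)/dy = x
            return Var(self.value * other.value,
                       [(self, other.value), (other, self.value)])

        def __add__(self, other):
            return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

        def backward(self, seed=1.0):
            # Propagate the chain rule back through the recorded graph.
            self.grad += seed
            for parent, local in self.parents:
                parent.backward(seed * local)

    x, y = Var(2.0), Var(3.0)
    z = x * y + x          # z = x*y + x
    z.backward()
    print(x.grad, y.grad)  # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0
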

Background reading material:

Monday, March 29
Lecture 14. Accelerating DNN training: early stopping and batch normalization. [Notes] [Demo Notebook] [Demo HTML]
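
A minimal sketch of the batch normalization forward pass (illustrative names, not course-provided code): normalize each feature over the minibatch, then rescale with learned parameters gamma and beta.

    import numpy as np

    def batch_norm(Xb, gamma, beta, eps=1e-5):
        mu = Xb.mean(axis=0)                    # per-feature batch mean
        var = Xb.var(axis=0)                    # per-feature batch variance
        X_hat = (Xb - mu) / np.sqrt(var + eps)  # normalized activations
        return gamma * X_hat + beta

    rng = np.random.default_rng(0)
    Xb = 5.0 * rng.standard_normal((32, 16)) + 3.0
    out = batch_norm(Xb, gamma=np.ones(16), beta=np.zeros(16))
    print(out.mean(axis=0)[:3], out.std(axis=0)[:3])  # roughly 0 and 1
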

Problem Set 3 Due.

Problem Set 4 Released.

Background reading material:

Wednesday, March 31
Lecture 15. Hyperparameter optimization. Grid search. Random search. [Notes] [Slides Notebook] [Slides HTML]
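
A minimal sketch comparing grid search and random search (the validation score here is a made-up stand-in for actually training a model; all names are illustrative):

    import numpy as np

    def val_score(alpha, B):
        # Hypothetical score peaked at alpha = 1e-2, B = 32.
        return -(np.log10(alpha) + 2) ** 2 - 0.1 * (np.log2(B) - 5) ** 2

    rng = np.random.default_rng(0)

    # Grid search: 4 x 4 = 16 fixed configurations.
    grid = [(a, B) for a in [1e-4, 1e-3, 1e-2, 1e-1] for B in [8, 32, 128, 512]]
    print("grid best:  ", max(grid, key=lambda p: val_score(*p)))

    # Random search: 16 configurations sampled log-uniformly.
    rand = [(10 ** rng.uniform(-4, -1), 2 ** rng.integers(3, 10))
            for _ in range(16)]
    print("random best:", max(rand, key=lambda p: val_score(*p)))
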

Background reading material:

Monday, April 5
Lecture 16. Kernels and kernel feature extraction. [Notes] [Slides Notebook] [Slides HTML]
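
A minimal sketch of kernel feature extraction with random Fourier features (illustrative names, not course-provided code): map inputs into a k-dimensional feature space whose inner products approximate the RBF kernel k(x, z) = exp(-||x - z||^2 / 2), so plain linear methods can approximate kernel methods.

    import numpy as np

    def rff(X, k=500, seed=0):
        rng = np.random.default_rng(seed)
        d = X.shape[1]
        W = rng.standard_normal((d, k))       # frequencies ~ N(0, I)
        b = rng.uniform(0, 2 * np.pi, k)      # random phases
        return np.sqrt(2.0 / k) * np.cos(X @ W + b)

    rng = np.random.default_rng(1)
    x, z = rng.standard_normal(5), rng.standard_normal(5)
    exact = np.exp(-np.linalg.norm(x - z) ** 2 / 2)
    Phi = rff(np.vstack([x, z]), k=5000)
    print(exact, Phi[0] @ Phi[1])             # the two values should be close
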

Programming Assignment 3 Due.

Programming Assignment 4 Released.

Background reading material:

Wednesday, April 7
Lecture 17. Bayesian optimization 1. [Notes]

Background reading material:

Monday, April 12
Lecture 18. Bayesian optimization 2. [Notes]

Problem Set 4 Due.

Problem Set 5 Released.

Background reading material: same as Bayesian optimization 1.

Wednesday, April 14
Lecture 19. Parallelism. [Notes] [Slides Notebook] [Slides HTML] [Demo Notebook] [Demo HTML]
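
A minimal sketch of data-parallel gradient computation using Python's multiprocessing (illustrative names, not course-provided code): split the dataset across worker processes, compute partial gradients in parallel, then average them.

    import numpy as np
    from multiprocessing import Pool

    def partial_grad(args):
        # Least-squares gradient on one chunk of the data.
        Xc, yc, w = args
        return Xc.T @ (Xc @ w - yc) / len(yc)

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        X, y = rng.standard_normal((8000, 20)), rng.standard_normal(8000)
        w = np.zeros(20)
        chunks = [(Xc, yc, w) for Xc, yc in
                  zip(np.array_split(X, 4), np.array_split(y, 4))]
        with Pool(4) as pool:
            grads = pool.map(partial_grad, chunks)  # one chunk per worker
        grad = np.mean(grads, axis=0)               # average the partials
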

Background reading material:

  • Good resource on parallel programming, particularly on GPUs: Chapter 1 of Programming Massively Parallel Processors: A Hands-On Approach, Second Edition (by David B. Kirk and Wen-mei W. Hwu). This book is available through the Cornell library.
  • Classical work providing background on parallelism in computer architecture: Chapters 3, 4, and 5 of Computer Architecture: A Quantitative Approach. This book is available through the Cornell library.
Monday, April 19
Lecture 20. Memory locality and memory bandwidth. [Notes] [Slides Notebook] [Slides HTML]
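
A minimal sketch of the memory-locality effect (illustrative, not course-provided code): traversing a row-major numpy array in row order hits contiguous memory, while column order strides across cache lines.

    import numpy as np
    import time

    A = np.zeros((2000, 2000))   # row-major (C-order) layout
    n = A.shape[0]

    t0 = time.perf_counter()
    s = 0.0
    for i in range(n):           # row order: contiguous access
        s += A[i, :].sum()
    t1 = time.perf_counter()
    for j in range(n):           # column order: strided access
        s += A[:, j].sum()
    t2 = time.perf_counter()
    print(f"row order: {t1 - t0:.3f}s  column order: {t2 - t1:.3f}s")
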

Programming Assignment 4 Due.

Programming Assignment 5 Released.

Background reading material: same as Lecture 19 (Parallelism).

Wednesday, April 21
Lecture 21. Machine learning on GPUs; matrix multiply returns. [Notes] [Slides Notebook] [Slides HTML]

Background reading material:

  • Parallel programming on GPUs: Chapters 2-5 of Programming Massively Parallel Processors: A Hands-On Approach, Second Edition (by David B. Kirk and Wen-mei W. Hwu). This book is available through the Cornell library.
Monday, April 26
Wellness Day. No classes; no lecture.
Wednesday, April 28
Lecture 22. Quantized, low-precision machine learning. [Notes] [Slides] [Demo Notebook] [Demo HTML]
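
A minimal sketch of uniform quantization of weights to 8-bit integers and back (an illustration of the idea, not a production scheme or course-provided code):

    import numpy as np

    def quantize_int8(w):
        scale = np.abs(w).max() / 127.0      # map the weight range onto int8
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
    q, scale = quantize_int8(w)
    w_hat = dequantize(q, scale)
    print("memory: 4 bytes -> 1 byte per weight;",
          "max error:", np.abs(w - w_hat).max())
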

Problem Set 5 Due.

Problem Set 6 Released.

Background reading material:

  • An example of a blog post illustrating the use of low-precision arithmetic for deep learning.
Monday, May 3
Lecture 23. Distributed learning and the parameter server. [Notes] [Slides]

Programming Assignment 5 Due.

Programming Assignment 6 Released.

Background reading material:

Wednesday, May 5
Lecture 24. Deployment and low-latency inference. Deep neural network compression and pruning. [Notes] [Slides] [Demo Notebook] [Demo HTML]
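
A minimal sketch of magnitude pruning (illustrative names, not course-provided code): zero out the fraction p of weights with the smallest absolute values, producing a sparse model to compress and accelerate.

    import numpy as np

    def magnitude_prune(W, p=0.9):
        threshold = np.quantile(np.abs(W), p)   # cutoff below which weights go
        mask = np.abs(W) >= threshold
        return W * mask, mask

    W = np.random.default_rng(0).standard_normal((256, 256))
    W_pruned, mask = magnitude_prune(W, p=0.9)
    print("fraction of weights kept:", mask.mean())  # about 0.1
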

Background reading material:

Monday, May 10
Lecture 25. Online Learning and Real-Time Learning. [Notes] [Slides Notebook] [Slides HTML]

Background reading material:

Wednesday, May 12
Lecture 26. Machine learning accelerators, and Course Summary. [Notes] [Slides Notebook] [Slides HTML]

Problem Set 6 Due.

Background reading material:

Friday, May 14
(No lecture.)

Programming Assignment 6 Due.

Tuesday, May 18
Final Exam. 9:30AM. Exam released on Gradescope and on Canvas. The exam may include any topics covered in the course.