Schedule

Lecture Date Topic Materials Assignments
Lec. 1 Tue, Jan. 20 Introduction
Course policies
What is generative modeling?
Lec. 2 Thu, Jan. 22 Maximum likelihood
Gaussian distribution
Maximum likelihood
ps1 (simple generative models) out
Lec. 3 Tue, Jan 27 Gaussian mixture models
Stochastic gradient descent
Gaussian mixture models
Variational inference
Lec. 4 Thu, Jan 29 Neural network review (recorded lecture)
Convolutional networks
Image translation networks
Backpropagation
Lec. 5 Tue, Feb 3 Variational autoencoders
Autoencoders
Variational inference for VAEs
Reparameterization trick
Lec. 6 Thu, Feb 5 Normalizing flows
Change of variables formula
Coupling layers
Inverse autoregressive flows
Lec. 7 Tue, Feb 10 Generative adversarial networks
Minimax games
Mode collapse
Optimization
ps1 due (extended to 2/12)
ps2 out (latent variable models)
Lec. 8 Thu, Feb 12 GANs for image synthesis
Conditional GANs
Cycle consistency
VQ-GANs
Tue, Feb 17 No class
Lec. 9 Thu, Feb 19 Energy-based models
Langevin dynamics
The partition function
Score matching perspective
Lec. 10 Tue, Feb 24 Diffusion models 1
Diffusion models for image synthesis
DDPM
Connection to VAEs
project proposal info out
Lec. 11 Thu, Feb. 26 Diffusion models 2
Lec. 12 Tue, Mar. 3 Image manipulation with diffusion models
Conditional diffusion models
Classifier-free guidance
Inpainting and SDEdit
Lec. 13 Thu, Mar. 5 Flow matching
Flow formulation
Rectified flow
Continuous normalizing flows
ps2 due
ps3 out (diffusion models)
Lec. 14 Tue, Mar 10 Diffusion architectures
Transformer-based diffusion models
Latent diffusion models
Few-step generation
Thu, Mar. 12 Autoregressive models
Lec. 15 Tue, Mar. 17 Language models
GPT
Tokenization
Parallel decoding
Lec. 16 Thu, Mar. 19 Discrete diffusion models
Masked language modeling
Diffusion in latent spaces
project proposal due
Lec. 17 Tue, Mar. 24 Applying generative models to downstream tasks
Representation learning
Zero-shot learning
ps3 due on Wednesday
ps4 out (language models)
Lec. 18 Thu, Mar. 26 Scaling
Scaling laws
Systems issues involved
Tue, Mar. 31 No class
Thu, Apr. 2 No class
Lec. 19 Tue, Apr. 7 Midterm review
Lec. 20 Thu, Apr. 9 Midterm exam
Lec. 21 Tue, Apr. 14 Post-training
Instruction tuning (for both images and language)
RLHF
RL-based reasoning models
Lec. 22 Thu, Apr 16 Evaluating generative models
project guidelines
Fri, Apr. 17 Tentative midterm time, 2-5pm
Lec. 23 Tue, Apr 21 Generated media provenance
Detecting generated images/text
Artist attribution
Watermarking
Lec. 24 Thu, Apr 23 Model interpretability
Feature visualization
Influence functions
Lec. 25 Tue, Apr. 28
ps4 due on Wednesday
Lec. 26 Thu, Apr. 30 Final project presentations
Lec. 27 Tue, May 5 Final project presentations


                                            Staff & Office Hours



                                            Office Hours

                                            Day        Time               Name                         Location
                                            Tuesday    11:30am - 12:00pm  Andrew Owens                 Bloomberg 368
                                            Wednesday  11:30am - 12:30pm  Zhaolin Gao, Yiming Dou      Studio
                                            Thursday   1:00pm - 2:00pm    Jeongsoo Park, Yen-Yu Chang  Studio

                                            The Master's Studio is on the first floor of the Bloomberg building. Office hours take place in the huddle space. Please refer to the whiteboard within the studio to help locate this huddle.


                                            Course information

                                            An in-depth introduction to deep generative models. This course covers the mathematical foundations of generative models and their implementation as deep neural networks. Topics include diffusion models, variational autoencoders, autoregressive models, generative adversarial networks, and network architectures for generation. These topics will be discussed in the context of applications in computer vision and natural language processing.

                                            Lectures: Lectures will take place Tuesday and Thursday, 10:10AM - 11:25AM in Bloomberg Center 131. Attendance will not be required, but it is highly encouraged. There are multiple ways to participate:

                                            • In person in Bloomberg Center 131
                                            • Online, by watching the lecture recordings that we'll post here.

                                            Prerequisites:

                                            • This class requires familiarity with deep learning, i.e., CS 5787: Deep Learning, CS 4780/5780: Introduction to Machine Learning, or equivalent.
                                            • This course puts a strong emphasis on mathematical methods. We'll cover a wide range of techniques in a short amount of time. Background in linear algebra is required. For a refresher, please see here. This material should mostly look familiar to you.
                                            • This class will require a significant amount of programming. All programming will be completed in Python, using numerical libraries such as numpy, scipy, and PyTorch. In some assignments, we'll give you starter code; in others, we'll ask you to write a large amount of code from scratch.

                                            Google Colab: The problem sets will be completed using Jupyter notebooks, generally using Google Colab. While this service is free, it comes with GPU usage limits: you may only use the GPUs on a given Google account for a certain number of hours per day. These limits exist because GPUs are very expensive. Since none of the problem sets require training large models, you may never encounter these limits. However, we have provided a few suggestions for avoiding them:

                                            1. Reduce your GPU usage by initially debugging your code on the CPU. For example, after confirming that you can successfully complete a single training iteration without error on the CPU, you can switch to the GPU. You can then switch back to the CPU if you need to debug further errors.
                                            2. Consider using Google Colab Pro. It may be free for students (see here), subject to availability. For students who would like to use this (optional) service but are unable to afford it, we may be able to obtain funding for you. Please email the course staff privately if you would like to learn more about this option.

                                            Q&A: This course has a Q&A forum on Ed Discussion, where you can ask public questions. If you cannot make your post public (e.g., due to revealing problem set solutions), please mark your post private, or come to office hours. Please note, however, that the course staff cannot provide help debugging code, and there is no guarantee that they'll be able to answer all questions — especially last-minute questions about the homework. We also greatly appreciate it when you respond to questions from other students! If you have an important question that you would prefer to discuss over email, you may email the course staff (cs5788-staff-2026sp-L@cornell.edu), or you can contact the instructor by email directly.

                                            Homework: There will be 4 homework assignments. All programming assignments are to be completed in Python, using the starter code that we provide. Assignments will always be due at midnight (11:59pm) on the due date. Written problems will usually be submitted to Gradescope. You may be asked to annotate your PDF (e.g., by selecting your solution to each problem).

                                            Midterm exam: There will be a midterm exam (date TBD).

                                            Textbook: There are no required textbooks to purchase. However, much of the class will closely follow:

                                            The following textbooks may be useful as references and are available for free online:

                                            • Murphy. Probabilistic Machine Learning: Advanced Topics. 2023.
                                            • Goodfellow, Bengio, Courville. Deep Learning. 2016.
                                            • MacKay. Information Theory, Inference and Learning Algorithms. 2003.

                                            Acknowledgements: This course uses material from MIT's 6.869: Advances in Computer Vision, which is associated with the optional textbook Foundations of Computer Vision. It also includes lecture slides from other researchers, including Svetlana Lazebnik, Alexei Efros, David Fouhey, and Noah Snavely (please see the acknowledgements in the lecture slides).

                                            Late policy: You'll have 120 late hours (enough hours for 5 late days) to use over the course of the semester. Each time you use a late hour, you may submit a homework assignment one hour late without penalty. You may distribute these any way you'd like. For example, you can use all of your days at once to turn in one assignment 5 days late, or you can turn each assignment in a few hours late. You do not need to notify us when you use a late hour; we'll deduct it automatically. If you run out of late hours and still submit late, your assignment will be penalized at a rate of 1% per hour. If you edit your assignment after the deadline, this will count as a late submission, and we'll use the revision time to compute late hours (rounded up per assignment).
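As a rough illustration of the late-hour arithmetic described above, here is a minimal Python sketch. The function name `apply_late_policy` is hypothetical, and two details are assumptions rather than stated policy: that a partial hour rounds up to a whole hour once per assignment, and that the 1%-per-hour penalty is a flat deduction capped at 100%. The course staff's actual bookkeeping may differ.

```python
import math

LATE_HOUR_BUDGET = 120        # from the policy: 120 late hours per semester
PENALTY_PERCENT_PER_HOUR = 1  # from the policy: 1% per hour once the budget runs out


def apply_late_policy(hours_late, budget_remaining):
    """Return (budget_after, penalty_percent) for a single submission.

    hours_late: how late the final revision was, in hours (may be fractional).
    budget_remaining: late hours still available before this submission.
    """
    # Assumption: lateness is rounded up to whole hours, once per assignment.
    hours_needed = math.ceil(hours_late)
    # Spend free late hours first.
    hours_covered = min(hours_needed, budget_remaining)
    # Any hours beyond the remaining budget are penalized.
    hours_penalized = hours_needed - hours_covered
    # Assumption: the penalty is a flat deduction, capped at 100%.
    penalty_percent = min(100, hours_penalized * PENALTY_PERCENT_PER_HOUR)
    return budget_remaining - hours_covered, penalty_percent


if __name__ == "__main__":
    budget = LATE_HOUR_BUDGET
    # First assignment submitted 30.5 hours late: uses 31 late hours, no penalty.
    budget, penalty = apply_late_policy(30.5, budget)
    print(budget, penalty)
    # A later assignment submitted 100 hours late: the remaining budget
    # covers part of it, and the rest is penalized at 1% per hour.
    budget, penalty = apply_late_policy(100, budget)
    print(budget, penalty)
```

Under these assumptions, a student who has already used 31 late hours and then submits 100 hours late would exhaust the remaining 89 hours and lose 11% on that assignment.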

                                            We will not provide additional late time, except under exceptional circumstances, and for these cases we'll require documentation (such as a doctor's note). Please note that the late hours are provided to help you deal with minor setbacks, such as routine illness or injury, paper deadlines, interviews, and computer (or Google Colab) problems; these do not generally qualify for an additional extension.

                                            Please note that, due to the number of late days available, there will be a long (2+ week) lag between the time of submission and the time that grades are released. We'll need to wait for the late submissions to arrive before we can complete the grading.

                                            Regrade requests: If you think that there was a grading error, you'll have 9 days to submit a regrade request, using Gradescope. This will be a strict deadline, even for significant mistakes such as missing grades, so please look carefully over your graded assignments.

                                            AI/LLM tools: The use of language models will be set on a per-assignment basis. For most assignments, we will not permit their use at all. For others, we will allow them to be used as a way of learning how to use programming languages and libraries (e.g., as a substitute for reading documentation). There may also be more open-ended assignments where we explicitly permit their full use. Please ask the course staff if additional questions arise on what is or is not permitted.

                                            Grading:

                                            Grades will be computed as follows, with all homeworks equally weighted:
                                            Homework 40%
                                            Midterm exam 30%
                                            Final project 30%
                                            We'll use these approximate grade thresholds:
                                            A+ TBD
                                            A 92%
                                            A- 90%
                                            B+ 88%
                                            B 82%
                                            B- 80%
                                            C+ 78%
                                            C 72%
                                            C- 70%
                                            These are lower bounds on letter grades. For example, if you get an 81%, you will get a B- or better. We may gently curve the class up, in a way that would only improve your letter grade: e.g., after the curve, an 81% might round up to a B, but it would not round down to a C+. To ensure consistency in grading, we will not round (e.g., 87.99% is a B), and we will not consider regrade requests outside of the usual time window.
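The weighting and thresholds above can be sketched in a few lines of Python. This is only an illustration: the function names `course_score` and `minimum_letter_grade` are hypothetical, all scores are assumed to be percentages from 0 to 100, the A+ cutoff is omitted because it is listed as TBD, and any curve (which could only raise a grade) is not modeled.

```python
# Weights in percent, taken from the grading breakdown.
HOMEWORK_WEIGHT = 40
MIDTERM_WEIGHT = 30
PROJECT_WEIGHT = 30

# Approximate lower bounds from the table, highest first (A+ is TBD and omitted).
THRESHOLDS = [
    ("A", 92), ("A-", 90),
    ("B+", 88), ("B", 82), ("B-", 80),
    ("C+", 78), ("C", 72), ("C-", 70),
]


def course_score(homework_scores, midterm, project):
    """Weighted course percentage, with all homeworks equally weighted."""
    homework_average = sum(homework_scores) / len(homework_scores)
    weighted = (HOMEWORK_WEIGHT * homework_average
                + MIDTERM_WEIGHT * midterm
                + PROJECT_WEIGHT * project)
    return weighted / 100


def minimum_letter_grade(score):
    """Lowest letter grade guaranteed by the thresholds, with no rounding."""
    for letter, cutoff in THRESHOLDS:
        if score >= cutoff:
            return letter
    # The listed thresholds stop at C-, so nothing is guaranteed below 70%.
    return None


if __name__ == "__main__":
    score = course_score([90, 80, 100, 70], midterm=80, project=90)
    print(score, minimum_letter_grade(score))
    print(minimum_letter_grade(87.99))  # no rounding: still below the B+ cutoff
```

Because the thresholds are lower bounds, the returned letter is the minimum a student would receive; a curve could only move it up.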

                                            Academic integrity: While you are encouraged to discuss homework assignments with other students, your programming work must be completed individually. All students should abide by the Cornell University Code of Academic Integrity, and all writing submitted should be one’s own writing. While discussing course concepts with other students is highly encouraged, plagiarism will result in zero credit and/or a referral to the Office of Student & Academic Affairs.

                                            Students with disabilities: Your access in this course is important to us. Please give us your Student Disability Services (SDS) accommodation letter early in the semester so that we have adequate time to arrange your approved academic accommodations. If you need immediate accommodations for equal access, please speak with us after class or send an email message to us and/or SDS at sds_cu@cornell.edu. If the need arises for additional accommodations during the semester, please contact SDS. You may also feel free to speak with the Student & Academic Affairs team at Cornell Tech who will connect you with the university SDS office. If you have, or think you may have a disability, please contact Student Disability Services for a confidential discussion. You must request your SDS accommodation letter no later than 3 weeks prior to needing it.

                                            Support: There are services and resources at Cornell designed specifically to bolster student mental health and well-being. This link provides a list of resources for Cornell Tech students. You can also contact studentwellness@tech.cornell.edu with concerns.