Schedule

Lecture Date Topic Materials Assignments
Lec. 1 Mon, Aug. 25 Introduction
About the course
Neighborhood filtering
Blurring
Gradient filters
Lec. 2 Wed, Aug. 27 Filtering
Convolution and cross-correlation
Edge detection
Nonlinear filtering
ps1 out (image processing)
Tutorial Fri, Aug. 29 Programming review (optional)
3:00pm - 3:45pm in Masters Studio
Jupyter notebooks
Numerical computing with numpy
Lec. 3 Mon, Sep 1 No class - Labor Day
Lec. 4 Wed, Sep 3 Image pyramids
More filtering
Gaussian pyramid
Laplacian pyramid
Image statistics (optional)
Lec. 5 Mon, Sep 8 Frequency
Image bases
Fourier transform
Filtering in frequency space
Lec. 6 Wed, Sep 10 Machine learning
Nearest neighbor
Linear regression
Training models
Lec. 7 Mon, Sep 15 Linear classifiers (Zoom lecture)
Logistic regression
Stochastic gradient descent
ps1 due
ps2 out (machine learning)
Lec. 8 Wed, Sep 17 Neural networks (Guest lecture by Bharath Raj)
Nonlinearities
Network structure
PyTorch tutorial
Lec. 9 Mon, Sep 22 Convolutional networks
Convolution layers
Pooling
Normalization
Lec. 10 Wed, Sep 24 Object detection
Sliding window
Region-based CNNs
Instance segmentation
Lec. 11 Mon, Sep 29 Transformers for vision
Transformers
ViTs
ps2 due
ps3 out (transformers)
Lec. 12 Wed, Oct. 1 Autoregressive generation and GANs
Autoregressive models
GANs
VQ-GANs
Lec. 13 Mon, Oct. 6 Diffusion models
Diffusion models for image synthesis
DDPM
Latent diffusion models
Applications
Lec. 14 Wed, Oct. 8 Diffusion for image manipulation
Conditional diffusion models
Classifier-free guidance
SDEdit for image editing
Visual Anagrams
Mon, Oct 13 No class - Fall Break
Lec. 15 Wed, Oct. 15 Vision and language
Semantic segmentation
Contrastive learning
Captioning
ps3 due on Fri., Oct. 17
ps4 out (image generation)
Lec. 16 Mon, Oct. 20 Take home exam
Lec. 17 Wed, Oct. 22 Motion and video (Guest lecture)
3D CNNs
ViTs for video
Optical flow
Point tracking
Lec. 18 Mon, Oct. 27 Image formation
Camera models
Projection
Plenoptic function
Lec. 19 Wed, Oct. 29 Two-view geometry
Lec. 20 Mon, Nov. 3 Fitting geometric models
Finding correspondences
Fitting a homography
RANSAC
ps4 due
ps5 out (panorama stitching)
Lec. 21 Wed, Nov. 5 Structure from motion
Structure from motion
Multi-view stereo
Neural radiance fields
Lec. 22 Mon, Nov. 10 Depth estimation
Triangulation
Multi-view stereo
Monocular depth estimation
Wed, Nov. 12 No class
Lec. 23 Mon, Nov. 17 Inverse graphics
Neural radiance fields
Gaussian splatting
Lec. 24 Wed, Nov 19 Recent advances in video and 3D
ps5 due
ps6 out (neural radiance fields)
Lec. 25 Mon, Nov 24 Fairness + image forensics
Fake images
Supervised detection
Datasets
Algorithmic fairness
Lec. 26 Wed, Nov 26 Embodied vision
Behavior cloning
Policy gradient
Mon, Dec. 1 No class - Thanksgiving
Lec. 27 Wed, Dec. 3 Self-supervised learning
Masked autoencoders
Sound and touch
Lec. 28 Mon, Dec 8 Final exam review
ps6 due
Fri, Dec 12 Final exam (9am-12pm)


                                  Staff & Office Hours



                                  Office Hours

                                  Day       Time               Name           Location
                                  Monday    2:00pm - 3:00pm    Andrew Owens   Bloomberg 368
                                  Tuesday   1:30pm - 2:30pm    Xuanchen Lu    Studio
                                  Thursday  1:00pm - 2:00pm    Bharath Raj    Studio
                                  Friday    10:00am - 11:00am  Yen-Yu Chang   Studio

                                  The Master's Studio is on the first floor of the Bloomberg building. Office hours take place in the huddle space. Please refer to the whiteboard within the studio to help locate this huddle.


                                  Course information

                                  CS 5670 is an introductory computer vision class. Class topics include low-level vision, object recognition, motion, 3D reconstruction, basic signal processing, and deep learning. We'll also touch on very recent advances, including image generation, self-supervised learning, and embodied perception.

                                  Lectures: Lectures will take place Mondays and Wednesdays 7:30–8:45pm in Bloomberg Center 161. Attendance is not required, but it is highly encouraged. We will give short, in-class quizzes. However, these are not worth much of your total grade (2% total), and you may miss several of them without penalty. There are multiple ways to participate:

                                  • In person in Bloomberg Center 161
                                  • Online, via the lecture recordings that we'll post.

                                  Prerequisites:

                                  • This course puts a strong emphasis on mathematical methods. We'll cover a wide range of techniques in a short amount of time. Background in linear algebra is required. For a refresher, please see here. This material should mostly look familiar to you.
                                  • This class will require a significant amount of programming. All programming will be completed in Python, using numerical libraries such as numpy, scipy, and PyTorch. In some assignments, we'll give you starter code; in others, we'll ask you to write a large amount of code from scratch.

                                  Google Colab: The problem sets will be completed using Jupyter notebooks, generally using Google Colab. While this service is free, it is important to note that it comes with GPU usage limits. You may only use the GPUs on a given Google account for a certain number of hours per day. These limits are due to the fact that GPUs are very expensive. Since none of the problem sets require training large models, you may never encounter these limits. However, we have provided a few suggestions for avoiding them:

                                  1. Reduce your GPU usage by initially debugging your code on the CPU. For example, after confirming that you can successfully complete a single training iteration without error on the CPU, you can switch to the GPU. You can then switch back to the CPU if you need to debug further errors.
                                  2. Consider purchasing Google Colab Pro ($10/month) during the portion of the class where GPUs are required (PS5 and onward; approximately 2 months). For students who would like to use this (optional) service, but are unable to afford it, we may be able to obtain funding for you. Please send the course staff a private message by email if you would like to learn more about this option.

                                  Q&A: This course has a Q&A forum on Canvas, where you can ask public questions. If you cannot make your post public (e.g., because it would reveal problem set solutions), please mark your post private, or come to office hours. Please note, however, that the course staff cannot provide help debugging code, and there is no guarantee that they'll be able to answer all questions, especially last-minute questions about the homework. We also greatly appreciate it when you respond to questions from other students! If you have an important question that you would prefer to discuss over email, you may email the course staff (cs5670-staff-2025fa-L@cornell.edu), or you can contact the instructor by email directly.

                                  Homework: There will be homework assignments approximately every two weeks. All programming assignments are to be completed in Python, using the starter code that we provide. Assignments will always be due just before midnight (11:59pm) on the due date. They will all be assigned. Written problems will usually be submitted to Gradescope. You may be asked to annotate your PDF (e.g., by selecting your solution to each problem).

                                  Midterm: There will be a take-home midterm exam. It will take place approximately halfway through the semester (see schedule above). You will have 24 hours to complete it.

                                  Final exam: There will be a final exam during Cornell's exam period (see schedule above).

                                  In-class quizzes: We will include quiz questions in some (not all) lectures to encourage an interactive lecture. That is, there will be 0 to 3 quiz questions per lecture. We will use Poll Everywhere to administer these quizzes. Quiz grading will mostly be based on effort. You will get 2 points for answering, or 3 points for a correct answer (i.e., you'll get most of the points just for showing up). These in-class quiz questions will be worth only 2% of your final grade. We will drop your 5 lowest quiz grades, so you may miss at least 5 classes without penalty. Given this drop policy and the very small overall weight of the quizzes in the final grade, we will not grant excused absences for lectures/quizzes.
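As a rough illustration of the scoring and drop policy above, the sketch below computes a quiz percentage after dropping the five lowest scores. The function name `quiz_percentage` is hypothetical, and the sketch assumes that each list entry is one quiz worth 3 points, that a missed quiz scores 0, and that an empty remainder counts as full credit; the staff's actual computation may differ.

```python
def quiz_percentage(scores, num_dropped=5, max_points=3):
    """Return the quiz grade (0-100) after dropping the lowest scores.

    scores: points per quiz (assumed: 0 = missed, 2 = answered, 3 = correct).
    """
    # Sort ascending and discard the `num_dropped` lowest scores.
    kept = sorted(scores)[num_dropped:]
    if not kept:
        # Assumption: if every quiz is dropped, there is no penalty.
        return 100.0
    return 100.0 * sum(kept) / (max_points * len(kept))


# Twelve quizzes, five of them missed: the five zeros are dropped.
example_scores = [3, 2, 0, 0, 3, 2, 0, 0, 0, 3, 2, 3]
print(round(quiz_percentage(example_scores), 1))
```

Under these assumptions, the five missed quizzes in the example do not lower the result; only the remaining seven scores count.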

                                  Textbook: There are no required textbooks to purchase. However, much of the class will closely follow:

                                  Torralba, Isola, Freeman. Foundations of Computer Vision.
                                  It is available for free online, and also in print.

                                  The following textbooks may also be useful as references:

                                  • Szeliski. Computer Vision: Algorithms and Applications, 2nd edition draft (available for free online)
                                  • Goodfellow, Bengio, Courville. Deep Learning. (available for free online)
                                  • Hartley and Zisserman. Multiple View Geometry in Computer Vision.
                                  • Forsyth and Ponce. Computer Vision: A Modern Approach.

                                  Acknowledgements: This course uses material from MIT's 6.869: Advances in Computer Vision, which is associated with the optional textbook Foundations of Computer Vision. It also includes lecture slides from other researchers, including Svetlana Lazebnik, Alexei Efros, David Fouhey, and Noah Snavely (please see acknowledgments in the lecture slides).

                                  Late policy: You'll have 120 late hours (enough hours for 5 late days) to use over the course of the semester. Each time you use a late hour, you may submit a homework assignment one hour late without penalty. You may distribute these any way you'd like. For example, you can use all of your days at once to turn in one assignment 5 days late, or you can turn each assignment in a few hours late. You do not need to notify us when you use a late hour; we'll deduct it automatically. If you run out of late hours and still submit late, your assignment will be penalized at a rate of 1% per hour. If you edit your assignment after the deadline, this will count as a late submission, and we'll use the revision time to compute late hours (rounded up per assignment).
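The late-hour arithmetic above can be sketched as follows. This is only an illustration: the function name `apply_late_policy` is hypothetical, and it assumes that late hours are consumed in submission order and that the 1% per hour penalty is expressed in percentage points with no cap. The actual bookkeeping used by the course staff may differ.

```python
import math

LATE_HOUR_BUDGET = 120   # from the policy: 120 late hours per semester
PENALTY_PER_HOUR = 1.0   # percent per hour once the budget is exhausted


def apply_late_policy(hours_late_per_assignment, budget=LATE_HOUR_BUDGET):
    """Return (penalties, remaining_budget).

    hours_late_per_assignment: lateness of each assignment in hours,
    listed in submission order (0 for on-time submissions).
    penalties: percent penalty per assignment (assumed uncapped).
    """
    remaining = budget
    penalties = []
    for hours_late in hours_late_per_assignment:
        # Late time is rounded up to whole hours, per assignment.
        used = math.ceil(max(hours_late, 0))
        # Free late hours cover as much of the lateness as the budget allows.
        covered = min(used, remaining)
        remaining -= covered
        # Any hours beyond the budget are penalized at 1% per hour.
        penalties.append((used - covered) * PENALTY_PER_HOUR)
    return penalties, remaining


# 30 minutes late counts as 1 hour; the third assignment exhausts the budget.
print(apply_late_policy([0.5, 30, 100, 2]))
```

In this example the first two assignments use 31 late hours, the third uses the remaining 89 and is penalized for 11 uncovered hours, and the fourth is penalized for both of its late hours.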

                                  We will not provide additional late time, except under exceptional circumstances, and for these we'll require documentation (such as a doctor's note). Please note that the late hours are provided to help you deal with minor setbacks, such as routine illness or injury, paper deadlines, interviews, and computer (or Google Colab) problems; these do not generally qualify for an additional extension.

                                  Please note that, due to the number of late days available, there will be a long (2+ week) lag between the time of submission and the time that grades are released. We'll need to wait for the late submissions to arrive before we can complete the grading.

                                  Regrade requests: If you think that there was a grading error, you'll have 9 days to submit a regrade request, using Gradescope. This will be a strict deadline, even for significant mistakes such as missing grades, so please look carefully over your graded assignments.

                                  AI/LLM tools: The use of language models will be set on a per-assignment basis. For most assignments, we will not permit their use at all. For others, we will allow them to be used as a way of learning how to use programming languages and libraries (e.g., as a substitute for reading documentation). There may also be more open-ended assignments where we explicitly permit their full use. Please ask the course staff if additional questions arise on what is or is not permitted.

                                  Grading:

                                  Grades will be computed as follows, with all homeworks equally weighted:
                                  Homework 60%
                                  Midterm exam 15%
                                  Final exam 23%
                                  In-class quizzes 2%
                                  We'll use these approximate grade thresholds:
                                  A+ TBD
                                  A 92%
                                  A- 90%
                                  B+ 88%
                                  B 82%
                                  B- 80%
                                  C+ 78%
                                  C 72%
                                  C- 70%
                                  These are lower bounds on letter grades. For example, if you get an 81%, you will get a B- or better. We may gently curve the class up, in a way that would only improve your letter grade: e.g., after the curve, an 81% might become a B, but it would not drop to a C+. To ensure consistency in grading, we will not round scores (e.g., 87.99% is a B), and we will not consider regrade requests outside of the usual time window.
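To make the weighting and thresholds concrete, here is a minimal sketch that combines component scores using the weights listed above and looks up the guaranteed minimum letter grade. The names `course_percentage` and `minimum_letter_grade` are hypothetical, each component score is assumed to be a percentage from 0 to 100, and the A+ cutoff (TBD) and any curve are left out. Scores below the C- threshold return `None`, since the table does not list lower grades.

```python
# Weights from the grading table (they sum to 100%).
WEIGHTS = {"homework": 0.60, "midterm": 0.15, "final": 0.23, "quizzes": 0.02}

# Lower bounds from the threshold table, highest first (A+ is TBD, so omitted).
THRESHOLDS = [
    ("A", 92), ("A-", 90), ("B+", 88), ("B", 82),
    ("B-", 80), ("C+", 78), ("C", 72), ("C-", 70),
]


def course_percentage(scores):
    """Weighted course percentage from per-component percentages (0-100)."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def minimum_letter_grade(percentage):
    """Return the guaranteed minimum letter grade, without rounding."""
    for letter, cutoff in THRESHOLDS:
        if percentage >= cutoff:
            return letter
    return None  # below C-: not specified in the table


example = {"homework": 90, "midterm": 80, "final": 85, "quizzes": 100}
total = course_percentage(example)
print(round(total, 2), minimum_letter_grade(total))
```

Because no rounding is applied, `minimum_letter_grade(87.99)` gives a B, matching the example in the text.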

                                  Academic integrity: While you are encouraged to discuss homework assignments with other students, your programming work must be completed individually. All students should abide by the Cornell University Code of Academic Integrity, and all writing submitted should be one’s own writing. While discussing course concepts with other students is highly encouraged, plagiarism will result in zero credit and/or a referral to the Office of Student & Academic Affairs.

                                  Students with disabilities: Your access in this course is important to us. Please give us your Student Disability Services (SDS) accommodation letter early in the semester so that we have adequate time to arrange your approved academic accommodations. If you need immediate accommodations for equal access, please speak with us after class or send an email message to us and/or SDS at sds_cu@cornell.edu. If the need arises for additional accommodations during the semester, please contact SDS. You may also feel free to speak with the Student & Academic Affairs team at Cornell Tech who will connect you with the university SDS office. If you have, or think you may have a disability, please contact Student Disability Services for a confidential discussion. You must request your SDS accommodation letter no later than 3 weeks prior to needing it.

                                  Support: There are services and resources at Cornell designed specifically to bolster student mental health and well-being. This link provides a list of resources for Cornell Tech students. You can also contact studentwellness@tech.cornell.edu with concerns.