CS6670 - Computer Vision

Picture credit: Magritte and some computer vision researchers

Quick info

Instructor: Bharath Hariharan
Lecture time: Tues. and Thurs. 1:25pm - 2:40pm
Lecture venue: Phillips Hall 219
TA: Davis Wertheimer

Office Hours:
Bharath: Wed and Fri 3 pm (at 311 Gates Hall)
Davis: Tues 11 am (G17 Gates Hall)

Objective:

This course will serve as an introduction to computer vision for anyone who wants to do research in this area. It will cover the fundamentals that underlie classic techniques, the problems the community is currently working on, and the modern techniques being used to solve these problems. A tentative list of topics is below:

Prerequisites:

This course is intended for graduate students starting out in computer vision research. As such, it will assume that students are mathematically mature and are comfortable with:

In addition, familiarity with basic machine learning will be useful but is not required.

Guidelines for project proposal:

The project proposal should be approximately one page in length. As a rule of thumb, it should spend about one or two paragraphs each on:

Lectures / Notes:

Reference (for the first part of the course): Rick Szeliski's book. This is not a textbook for the course, in that it covers a lot more material in a lot more detail, but it can be used for additional reading.

Below is the (tentative) list of classes, with possible additional readings. These may change as the semester progresses.
Date Topic (with linked notes / slides) Additional reading
Aug 22 Introduction
Aug 24 Image Formation - Geometry
Aug 29 All about rotations | Image formation - color
Aug 31 Reconstruction - I
Sep 5 Reconstruction - II (Epipolar Geometry)
Sep 7 The correspondence problem
Sep 12 Optical flow
Szeliski 8.4
Sep 14 Grouping
Contour detection
Graph-based segmentation (Szeliski 5.4, 5.5)
Segmentation for object proposals (Selective search)
Sep 19 Introduction to machine learning
Example case: logistic regression
Empirical risk minimization
Classical (pre-convnet) recognition
Bag-of-words, Spatial pyramids
Sep 21 Non-linear classifiers and Neural networks
Convolutional networks
Deformable part models
MNIST (Sections I, II and III. Also read the rest and contemplate cyclical nature of research)
Sep 26 Backpropagation and computation graphs
Image classification
ImageNet
Sep 28 Transfer learning
Convolutional network architectures
Transfer learning (Many examples)
VGG16, VGG19, 3x3 convolutions
Batch normalization
Highway networks
Residual networks
Oct 3 Object detection
Datasets and metrics
R-CNN
Fast R-CNN
Faster R-CNN
SSD
Oct 5 Semantic segmentation
Datasets and metrics
FCN, skip connections
Dilated convolutions, CRFs
Oct 12 Instance segmentation
Pose estimation
Datasets and metrics
Dataset, metrics, segmentation as region classification
Hypercolumns / skip connections, segmentation as detection refinement
Instance segmentation using FCNs

Heatmap representations, graphical model based refinement
Sequential prediction, autocontext and inference machines
Hourglass architectures
Oct 17 Learning for 3D
Datasets and metrics
Rigid body pose estimation
Deep stereo
Learning to correspond for stereo
Depth estimation from a single image
Normal estimation from a single image
Oct 19 Learning correspondence
Learning optical flow from simulated data
Learning from hallucinated data
Learning from constraints
Oct 31 Detour: Writing
Video recognition
Datasets and metrics
Video classification as frame+flow classification
CNN+LSTM
3D convolution
I3D
Nov 2 Vision and language
Captioning
Visual question answering
Attention-based systems
Problems with VQA
Nov 7 Reducing supervision
One- and Few-shot learning
Classic unsupervised learning (See Chapter 2)
Self-supervised learning
Learning from noisy labels
Nov 9 Vision and action
Active perception
Learning from ego-motion
Learning tasks in robotics
Nov 14 GANs
Generative Adversarial Networks
CycleGANs
Nov 16 Adversarial examples and interpreting convnets
Adversarial examples