CS 5670: Computer Vision, Spring 2023
Project 5:  Neural Radiance Fields

Brief

  • Assigned: Thursday, April 20, 2023 (fork from Github Classroom)
  • Due:  Wednesday, May 3, 2023 (8:00 pm) (turn in via Github)
  • Teams: This assignment should be done in teams of 2 students.

Synopsis

In this project, you will learn:

  1. Basic use of the PyTorch deep learning library.
  2. How to understand and build neural network models in PyTorch.
  3. How to build Neural Radiance Fields (NeRF) [1] with a set of images.
  4. How to synthesize novel views from NeRF.

[1] Mildenhall et al, "NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis", ECCV 2020

The assignment is contained in an IPython Notebook on Colab; see below.

Colab

Google Colab or Colaboratory is a useful tool for machine learning, that runs entirely on cloud and thus is easy to setup. It is very similar to the Jupyter notebook environment. The notebooks are stored on users Google Drive and just like docs/sheets/files on drive, it can be shared among users for collaboration (you'll need to share your notebooks as you'll be doing this in a team of 2).

TODO

There are several pieces we ask you to implement in this assignment:

  1. Fit a single image with positional encoding (TODO 1);
  2. Compute the origin and direction of rays through all pixels of an image (TODO 2);
  3. Sample the 3D points on the camera rays (TODO 3);
  4. Compute compositing weight of samples on camera ray (TODO 4);
  5. Render an image with NeRF (TODO 5).

Tests: To help you verify the correctness of your solutions, we provide tests at the end of each TODO block, as well as qualitative and quantitative evaluation at the end of the notebook.

The instructions about individual TODOs are present in detail in the notebook.

Setup

  1. Check out the code base from your Github Classroom repo.
  2. Upload CS5670_Project5_NeRF.ipynb notebook to your Google drive.
  3. Double clicking on the notebook on Google drive should give you an option for opening it in Colab.
  4. Alternatively, you can directly open Colab and upload notebook following this:
    File -> Upload notebook...
  5. If you haven't used Colab or Jupyter Notebooks before, first read the Colaboratory welcome guide.
  6. You will find, rest of the instructions in the notebook. As already stated, Colab required almost no setup, so there is no need to install PyTorch locally.

What to hand in?

  1. Execute your completed notebook file, download it as ipynb containing figures and results from visualization and test case cells;
    File -> Download .ipynb
  2. Download your completed notebook file as python file, serving as a backup version in case ipynb file doesn't save output;
    File -> Download .py
  3. Upload the completed notebook and python file to your Github Classroom repo;
  4. Download your 360 video output and upload to your Github Classroom repo;
  5. Extra credit 1: If you are doing the extra credit of training NeRF on your own data, submit that notebook and the rendered video as well via GitHub.
  6. Extra credit 2: If you are doing the MNIST Challenge submit that notebook as well via GitHub.

Common issues

  1. Problem: Colab GPU limits: Cannot run with GPU due to Colab GPU limits.
    Solution: Only training NeRF on 360 scene requires GPU (running on CPU will otherwise take hours). You can wait for several hours for Colab to recover or you can work on other TODO blocks first.
  2. Problem: CUDA out of memory.
    Solution: Restart and rerun the notebook.

What should my images look like?

This section contains images to illustrate what kinds of qualitative results we expect.

Fit a single image with positional encoding: We expect that the model tends to produce an oversmoothed output without positional encoding.

  1. Result with no positional encoding:
  2. Result with positional encoding at frequency=3:
  3. Result with positional encoding at frequency=6:

Train NeRF:

The optimization for this scene takes around 1000 to 3000 iterations to converge on a single GPU (in 10 to 30 minutes). We use peak signal-to-noise ratio (PSNR) to measure structual similarity between the prediction and target image, and we expect a converged model to reach a PSNR score higher than 20, and produce a reasonable depth map.

Render 360 video with NeRF:

A 360 video created with the trained NeRF by rendering a set of images around the object.

Extra credit

Extra credit 1: Train NeRF on your own data

For extra credit, teams may train NeRF on their own data. Follow LLFF to capture images for forward facing scene. Use Colab version of COLMAP or follow the LLFF tutorial to get the camera poses. Train the NeRF with your own data and then render a video of the scene.

Capture your own data:

Example input:

Example output (render a set of images in spiral camera motion):

Extra credit 2: MNIST Challenge

For extra credit, teams may design and train their own neural network for the MNIST dataset. You will be given an example network and your aim should be to improve the accuracy while being under the specified parameter limit. Look at the MNIST notebook for skeleton code and more instructions.

Last updated 15 April 2023