In this project, we will visualize and manipulate AlexNet:
Some parts of this assignment were adapted from, and inspired by, a Stanford cs231n assignment. The similar parts have been heavily modified and ported to Caffe.
The assignment is contained in an IPython Notebook; see below.
The coding part will be completed in teams of 2.
There are many pieces to the assignment, but each piece is only a few lines of code; you should expect to write fewer than 10 lines of code for each TODO.
Unit tests: to help verify the correctness of your solutions, you can run pytest in a shell (same directory as the notebook):
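For example, the workflow looks like the following (the test file name here is hypothetical; use whatever test files ship with the assignment):

```shell
# Minimal demonstration of the pytest workflow.
# Create a trivial test file (stand-in for the assignment's real tests):
cat > test_demo.py <<'EOF'
def test_example():
    assert 1 + 1 == 2
EOF

# Run it from the same directory as the notebook:
python -m pytest test_demo.py -q
```

Running `python -m pytest` with no arguments discovers and runs every `test_*.py` file in the current directory.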
Running out of memory: by default, the VM should be able to hold exactly one AlexNet in memory, which is enough to complete the assignment. If you run the unit tests while the notebook is open, you will need to either shut down the notebook server or give the VM more memory (2 GB or more).
This section contains images to illustrate what kinds of qualitative results we expect.
Saliency: we expect that pixels related to the class have a higher value. Left: Input image. Right: saliency.
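In the assignment you obtain the input gradient from the network's backward pass; the standard post-processing step (absolute value, then max over channels) can be sketched with NumPy alone. The gradient array below is a random stand-in for the real `data` gradient, not actual Caffe output:

```python
import numpy as np

def saliency_from_gradient(grad):
    """Collapse a (C, H, W) input-gradient into an (H, W) saliency map.

    Take the absolute value of the gradient and the max over the channel
    axis, so pixels whose change most affects the class score light up.
    """
    return np.abs(grad).max(axis=0)

# Toy stand-in for the backward pass's data gradient (3-channel, 4x4 image):
rng = np.random.default_rng(0)
grad = rng.standard_normal((3, 4, 4))

sal = saliency_from_gradient(grad)
print(sal.shape)         # (4, 4)
print((sal >= 0).all())  # True: saliency values are non-negative
```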
Fooling image: these images look nearly identical, and yet AlexNet will classify the image in the middle as "snail". If you look closely you can spot some tiny visual differences. The right image shows the difference magnified 5x (with 0 re-centered at gray).
These images are classified as 100% belonging to different classes by AlexNet. If you run these for longer or adjust the hyperparameters, you may see a more salient result.
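The core idea is gradient ascent on the target class's score with respect to the input pixels. A minimal sketch with a toy linear classifier in place of AlexNet (in the assignment, the gradient would come from the net's backward pass instead):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((5, 12)) * 0.1  # toy linear classifier: 5 classes, 12-dim "image"
x = rng.standard_normal(12)             # the original "image"
target = 3                              # class we want the model to predict

def scores(v):
    return W @ v

# Ascend the target class score. For a linear model the gradient of
# scores(v)[target] w.r.t. v is simply W[target]; with a real net you
# would iterate until the predicted class flips to the target.
x_fool = x.copy()
for _ in range(100):
    x_fool += 0.5 * W[target]

print(scores(x_fool)[target] > scores(x)[target])  # True: target score increased
```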
Many classes don't give very good results; here we show some of the better classes.
Feature inversion (Extra Credit; Optional)
Note that we could probably obtain higher quality reconstructions if we ran the optimization for longer, or added a better regularizer. To keep things simple, your images only need to be mostly converged.
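The optimization itself is just regularized gradient descent on a feature-matching loss. A NumPy sketch with a toy linear "feature extractor" standing in for a network layer (the matrix `A`, weight `lam`, and step size are all illustrative choices, not the assignment's values):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((8, 16))   # toy feature extractor: phi(x) = A @ x
x_true = rng.standard_normal(16)
phi_target = A @ x_true            # features we want to invert

lam = 1e-3                         # small L2 regularizer weight
x = np.zeros(16)
for _ in range(2000):
    resid = A @ x - phi_target
    # Gradient of 0.5*||A x - phi||^2 + 0.5*lam*||x||^2 w.r.t. x:
    grad = A.T @ resid + lam * x
    x -= 0.01 * grad

# After enough steps the reconstruction's features nearly match the target.
final_err = np.linalg.norm(A @ x - phi_target)
print(final_err < 0.5)  # True: mostly converged
```

A stronger regularizer (e.g. total variation) or more iterations would trade off feature fidelity against image smoothness, which is why longer runs tend to look better.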