
CS4620 (5620) Introduction to Computer Graphics

CS4621 (5621) Computer Graphics Practicum

Cornell University

MWF 1:25pm, Gates G01

M 12:20pm, Hollister B14 [4621 only]

Instructor: Steve Marschner (office hours: Tues. 4–5, Fri. 2:30–3:30)

Staff

Graduate TAs

Zekun Hao (CS4620 head TA)
Rundong Wu (CS4621 head TA)
Zechen Zhang
Gregory Yauney
Tomasz Chmielewski

Ugrad TAs

Jason Liu
Sandy Fang
Serge-Olivier Amega
Meredith Young-Ng
Lily Lin
Sitian Chen

See below for the office hours calendar.

Schedule

date topic reading assignments
24Aug Introduction slides    
27Aug Triangle meshes slides    
29Aug Triangle meshes | .obj files demo meshes    
31Aug Triangle meshes 2 slides Sec. 12.1  
3Sep —Labor Day—    
5Sep Ray tracing intersection slides   Mesh pa due
7Sep Perspective slides    
10Sep Ray tracing shading slides Chap. 4, Sec. 2.7  
12Sep Ray tracing interpolation slides Sec 11.1–2 except 11.2.4  
14Sep Texture mapping slides   Ray 1 hw due
17Sep Transformations slides Chap. 6, Review 5.1–5.3  
19Sep Transformations    
21Sep Viewing: Orthographic demo slides Sec. 7.0–7.1 Ray 1 pa due
24Sep Viewing: Perspective Sec. 7.2–7.5  
26Sep Rasterization slides Sec. 12.2  
28Sep Rasterization demo Sec. 8–8.1 Manip hw due
1Oct Graphics Pipeline demos slides Chap. 8  
3Oct OpenGL and GLSL exhibits slides Chap. 17 Manip pa due
5Oct Games with textures slides Chap. 11  
8Oct —Fall Break—    
10Oct Games with textures    
12Oct Images and Displays slides Chap. 3 Shaders hw due
15Oct Splines slides Sec. 15.0–15.3  
17Oct Splines Sec. 15.4–15.6 Shaders pa due
19Oct Splines    
22Oct Subdivision surfaces slides    
23Oct Midterm: 7:30pm, Olin Hall 155    
24Oct Subdivision surfaces    
26Oct Hierarchies and scene graphs slides    
29Oct Animation slides Sec. 16.0–2 Splines hw due
31Oct Animation and 3D Rotations notes Sec. 16.3–7 (just for interest)  
2Nov Animation and 3D Rotations notes   Splines pa due
5Nov Surface reflection slides    
7Nov Monte Carlo illumination slides    
9Nov More Monte Carlo notebook   Animation hw due
12Nov Generating random samples    
14Nov Advanced ray tracing slides   Animation pa due
16Nov Ray Tracing acceleration slides    
19Nov Antialiasing slides Chap. 3, Sec. 11.4  
21Nov —Thanksgiving Break—    
23Nov —Thanksgiving Break—    
26Nov Compositing slides    
28Nov Final exam review   Ray 2 hw due
30Nov Color science slides Chap. 19  
3Dec Color science | Wrap-up   Ray 2 pa due

Projects

There will be 7 projects and 7 homeworks during the semester.

The CS4620 late policy is outlined in lecture 2. Briefly, you have 7 slip days that you can use to turn assignments in late: turn one assignment in 7 days late or turn 7 assignments in one day late. After you run out of slip days there are late penalties, and assignments can't ever be turned in more than 7 days late.

Due: Wednesday September 5 2018 (11:59pm)

This particular assignment only has a programming exercise, and no written exercise.

Do this project alone or in groups of two, as you prefer.

Introduction

In this assignment you will learn about the most widely used way to represent surfaces for graphics: triangle meshes. A triangle mesh is just a collection of triangles in 3D, but the key thing that makes it a mesh rather than just a bag of triangles is that the triangles are connected to one another to form a seamless surface. The textbook and lecture slides discuss the data structures used for storing and manipulating triangle meshes, and in this assignment we will work with the simplest kind of structure: an indexed triangle mesh.

Your job in this assignment is to write a simple mesh generation and processing utility that is capable of building triangle meshes to approximate some simple curved surfaces, and also can add some information to existing meshes. It reads and writes meshes stored in the popular OBJ file format, a text format with one line for each vertex and face in the mesh.

Building a mesh

Suppose we wish to generate a mesh for a loaded die as shown below:

4 different views of our loaded die, represented as a triangle mesh.

The cube extends from -1 to 1 in each of the x, y, and z directions. Each face has the following texture applied:

The texture to apply to each face of the cube. In uv-space, the lower left corner of the image is the origin, and the top right corner of the image is the point (1,1). The u direction extends horizontally across the image, the v direction extends vertically.

We could represent this mesh by listing out each triangle. Each triangle would be specified by 3 points in 3D space (i.e., the locations of each of its vertices), 3 points in 2D space (i.e., the uv-space texture coordinates of each of its vertices), and 3 unit-length 3D vectors (i.e., the normals of the surface at each vertex). This is sufficient to represent this mesh, but you may notice that we are repeating some information; for instance, the position of each corner of the cube is shared by at least 3 different triangles. Additionally, each triangle represents a flat surface, and thus the normals at each of its vertices are all equivalent. (Later, we will use triangles to approximate curved surfaces where this is not the case.) We can reduce this repetition by introducing an indexing scheme.

The OBJ file format is one such indexing scheme, and is the one we'll be using in this assignment. In this format, the positions of all vertices are listed first, and then triangles are specified by providing 3 integers that index into this list of positions. Texture coordinates and normals are indexed similarly; thus, each triangle is specified by 3 position indices, 3 texture coordinate indices, and 3 normal indices.

A vertex position is specified on a single line by the letter "v" followed by 3 floating point numbers that represent the x, y, and z coordinates of the point. A texture coordinate is specified by the letters "vt" followed by 2 floating point numbers that represent the u and v coordinates respectively. A vertex normal is specified by the letters "vn" followed by 3 floating point numbers that represent the normal vector. (The OBJ file format does not require that these be unit length, but we will require it for this assignment.) Triangles are specified with the letter "f" followed by 3 groups of indices. A group of indices can be a single integer (indexing the vertex position), two integers separated by a "/" (indexing the vertex position and texture coordinate respectively), two integers separated by "//" (indexing the vertex position and vertex normal respectively), or three integers each separated by a "/" (indexing the position, texture coordinates, and normals). Please note all indices in the OBJ file format are 1-based! Vertices should be specified in counter-clockwise order, assuming you are looking down at the outer surface of the triangle (i.e., the triangle's normal is pointed towards you).

Given all of this information, we can specify the loaded die above with the following OBJ file:

v 1.0 -1.0 -1.0
v 1.0 -1.0 1.0
v -1.0 -1.0 1.0
v -1.0 -1.0 -1.0
v 1.0 1.0 -1.0
v 1.0 1.0 1.0
v -1.0 1.0 1.0
v -1.0 1.0 -1.0
vt 1.0 1.0
vt 0.0 1.0
vt 0.0 0.0
vt 1.0 0.0
vn 1.0 0.0 0.0
vn -1.0 0.0 0.0
vn 0.0 1.0 0.0
vn 0.0 -1.0 0.0
vn 0.0 0.0 1.0
vn 0.0 0.0 -1.0
f 1/1/4 2/2/4 3/3/4
f 1/1/4 3/3/4 4/4/4
f 1/4/1 5/1/1 6/2/1
f 1/4/1 6/2/1 2/3/1
f 2/4/5 6/1/5 7/2/5
f 2/4/5 7/2/5 3/3/5
f 3/2/2 7/3/2 8/4/2
f 3/2/2 8/4/2 4/1/2
f 4/2/6 8/3/6 5/4/6
f 4/2/6 5/4/6 1/1/6
f 5/1/3 8/2/3 7/3/3
f 5/1/3 7/3/3 6/4/3

Note that even though there are 12 total triangles, there are only 8 unique locations for each triangle vertex, so there is no point in listing them multiple times. Similarly, each vertex has only 1 of 4 total texture coordinates, and each vertex has 1 of 6 total normals. This scheme allows us to eliminate the redundancy of listing these values once for each triangle. Also note that many vertices can share a position, but each can have different texture coordinates and normals; for instance, each corner of the mesh is shared by at least 3 triangles, but each may face a different direction and therefore may have its own normal.

Specification

For this assignment, you will create a command line utility called MeshGen. Its usage is:

  (1) java MeshGen -g <sphere|cylinder> [-n <divisionsU>] [-m <divisionsV>] -o <outfile.obj>
  (2) java MeshGen -i <infile.obj> -o <outfile.obj>

For usage (1), the first required argument is the geometry specifier, and the second is the output filename. If the geometry specifier is one of the fixed strings sphere or cylinder, a triangle mesh is generated that approximates that shape, where the number of triangles generated is controlled by the optional -n and -m options (details below), and written to the output OBJ file.

A triangle mesh that is an approximation of a smooth surface should have normal vectors stored at the vertices that indicate the direction normal to the exact surface at that point. When generating the predefined shapes, you generate points that are on the surface, and for each you should also calculate the normal vector that is perpendicular to the surface at that point. Additionally, the generated meshes will have texture coordinates. Note that you should take advantage of the indexed storage scheme where possible; for full credit, you should make sure that if two triangles meet at a position in space, that position is only specified once in the resulting OBJ file. Similarly, if two vertices have both the same location and texture coordinates, those texture coordinates should only be specified once; if they share the same location and normal, then the normal should only be specified once. (Duplicate normals and texture coordinates are allowed, but only as long as they are used at different places on the mesh.)

For usage (2), the user provides an input OBJ mesh file, which the program reads in. The mesh is assumed to have no normals (if normals are included in the input file, they are ignored). The program then generates approximate normals at each vertex as described below, and writes the resulting mesh to the user-provided output file.

Details of predefined geometries

Cylinder

The cylinder has radius 1 and height 2 and is centered at the origin; its longitudinal axis is aligned with the y-axis. It is tessellated with n divisions arranged radially around the outer surface. The two ends of the cylinder are closed by disc-shaped caps parallel to the xz-plane. At the rims of the cylinder, a single position is shared by vertices with 2 or more different normals and texture coordinates. Each cap consists of n vertices arranged in a circle as well as a single point where the cap intersects the y-axis. This point is incorporated into each triangle that makes up the cap.

Along the cylinder's shell (i.e., excluding its caps), texture coordinates in the u dimension run from 0 to 1 in a counterclockwise direction as viewed from the +y direction. There is a texture seam (where u=0 meets u=1) along vertices that have a z-coordinate of 1. Multiple texture coordinates occur at the same position in space to allow this discontinuity. Coordinates run from 0 to 0.5 in the v dimension, increasing in the +y direction. The texture coordinates for the two caps are circles inscribed in the upper-left (for the -y cap) and upper-right (for the +y cap) quadrants of the unit square in the uv-plane, with the +u direction corresponding to the +x direction in 3D space, and the +v direction corresponding to the -z direction for the top cap, and the +z direction for the bottom cap. The -m flag is ignored for the cylinder.

Sphere

The sphere has radius 1 and is centered at the origin in 3D coordinates. It is tessellated in latitude-longitude fashion, with n divisions around the equator and m divisions from pole to pole along each line of longitude. Note that m divisions requires m+1 vertices from pole to pole. The North pole is at (0,1,0), the South pole at (0,-1,0), and points on the Greenwich meridian have coordinates (0,y,z) with z>0. The mesh is generated with vertex normals that are normal to the exact sphere, and with texture coordinates (u,v) where u depends only on longitude, with u=0 at longitude 180 degrees West and u=1 at 180 degrees East, and where v depends only on latitude, with v=0 at the South Pole and v=1 at the North pole. Each quadrilateral formed by two adjacent longitude lines and two adjacent latitude lines is divided on the diagonal to form two triangles. The texture coordinates along the 180th meridian are duplicated: one texture coordinate has u=0 and the other has u=1, to enable the discontinuity required for correct wrapping of an image texture across the seam. Each pole has n different texture coordinates, to enable nearly-appropriate texture in the row of triangles adjacent to the pole. Every other triangle around each pole becomes degenerate, i.e., collapses into a line; these degenerate triangles should be omitted.

Specs illustration for the cylinder and sphere (for the case of n = 32 and m = 16)

The default value of n should be 32, and the default value of m should be 16. You may assume that both arguments will always be greater than 2.
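
To make the parameterization concrete, here is a minimal, standalone sketch (not the required implementation) that evaluates the sphere's position, normal, and texture coordinate at each grid point and prints them in OBJ syntax. The East/West sign convention and the interleaved output ordering are assumptions to check against the reference mesh, and a real MeshGen must build an indexed mesh (sharing positions along the seam) and emit "f" lines as well; the same looping pattern applies to the cylinder.

  // Hypothetical sketch of the lat-long sphere parameterization.
  public class SphereSketch {
      public static void main(String[] args) {
          int n = 32, m = 16;                       // default divisions from the spec
          for (int j = 0; j <= m; j++) {            // pole to pole: m divisions, m+1 rows of vertices
              double v = (double) j / m;            // v = 0 at the South pole, v = 1 at the North pole
              double phi = Math.PI * (v - 0.5);     // latitude, -pi/2 .. pi/2
              for (int i = 0; i <= n; i++) {        // i = 0 and i = n both land on the seam at longitude 180
                  double u = (double) i / n;        // u = 0 at 180 W, u = 1 at 180 E
                  double theta = 2.0 * Math.PI * (u - 0.5); // longitude 0 (Greenwich) maps to +z
                  double x = Math.cos(phi) * Math.sin(theta); // East/West sign is an assumption
                  double y = Math.sin(phi);
                  double z = Math.cos(phi) * Math.cos(theta);
                  System.out.printf("v %.6f %.6f %.6f%n", x, y, z);
                  System.out.printf("vn %.6f %.6f %.6f%n", x, y, z); // unit sphere: normal equals position
                  System.out.printf("vt %.6f %.6f%n", u, v);
              }
          }
      }
  }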

Computing vertex normals

For a mesh that has vertex positions and triangles, but no vertex normals, one often wants to compute vertex normals so that the mesh can appear smooth. But since the original surface which the mesh approximates is forgotten (if there even was one), we need some way to make up plausible normals. There are a number of ways to do this, and we'll use a simple one for this assignment: the normal at a vertex is the average of the geometric normals of the triangles that share this vertex.

Your first thought might be to do this as a loop over vertices, with an inner loop over the triangles that share that vertex:

   for each vertex v
      normal[v] = (0,0,0)
      for each triangle t around v
         normal[v] += normal of triangle t
      normal[v].normalize()

With the appropriate data structures, this is possible, but in our case there's no efficient way to do the inner loop: our data structure tells us what vertices belong to a triangle, but the only way to find triangles that belong to a vertex is to search through the whole list of triangles. This is possible but would be quadratic in the mesh size, which is bad news for large meshes.

However, it's simple to do it with the loops interchanged:

   for each vertex v
      normal[v] = (0,0,0)
   for each triangle t
      for each vertex v around t
         normal[v] += normal of triangle t
   for each vertex v
      normal[v].normalize()

This way the inner loop can efficiently visit just the necessary vertices. Nifty!
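
As a concrete illustration of the second version, here is a hedged helper you could adapt for your own MeshGen; the flat arrays used here (one xyz triple per vertex, three vertex indices per triangle) are stand-ins for whatever data structure your program actually uses, not the framework's classes.

  // Hypothetical sketch: average per-triangle unit normals into per-vertex normals.
  static double[][] vertexNormals(double[][] positions, int[][] tris) {
      double[][] normals = new double[positions.length][3];  // starts at (0,0,0)
      for (int[] t : tris) {
          double[] a = positions[t[0]], b = positions[t[1]], c = positions[t[2]];
          double[] e1 = { b[0]-a[0], b[1]-a[1], b[2]-a[2] };  // edge b - a
          double[] e2 = { c[0]-a[0], c[1]-a[1], c[2]-a[2] };  // edge c - a
          double[] tn = { e1[1]*e2[2] - e1[2]*e2[1],          // triangle normal = e1 x e2
                          e1[2]*e2[0] - e1[0]*e2[2],
                          e1[0]*e2[1] - e1[1]*e2[0] };
          double len = Math.sqrt(tn[0]*tn[0] + tn[1]*tn[1] + tn[2]*tn[2]);
          if (len == 0) continue;                             // skip degenerate triangles
          for (int vi : t)                                    // add the unit normal to each vertex of t
              for (int k = 0; k < 3; k++) normals[vi][k] += tn[k] / len;
      }
      for (double[] nrm : normals) {                          // normalize the per-vertex sums
          double s = Math.sqrt(nrm[0]*nrm[0] + nrm[1]*nrm[1] + nrm[2]*nrm[2]);
          if (s > 0) for (int k = 0; k < 3; k++) nrm[k] /= s;
      }
      return normals;
  }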

Framework

We have provided a framework to help you get started. The framework is available on a public Git repository. If you have not used Git before, you can follow the tutorial here, or ask a TA for assistance. We encourage you to use version control, though we request that you refrain from publicly sharing your class repository. Future assignments will be distributed by pushing new commits to the repository linked above, so we recommend creating a separate branch before you begin working on each assignment.

For this first assignment, we've provided a small library to help you build your utility. It contains data structures for dealing with vectors and meshes stored in the OBJ format.

The math package contains two classes, Vector2 and Vector3, for dealing with 2D and 3D vectors respectively. They also contain many methods such as vector addition and subtraction, dot and cross products, and component-wise operators. If you need to modify a vector in some way, chances are there is already a method that will do what you want—there is rarely a need to extract the individual components of a vector.

Please be aware that these classes contain in-place methods, even if they return a vector! This means that it is easy to make a mistake like the following:

  Vector3 a = new Vector3(...), b = new Vector3(...);
  Vector3 c = a.add(b); // Don't do this!

In the snippet above, the add() method modifies a in place and returns a reference to it, so c ends up referring to the same (modified) object as a. This is almost certainly not what you want. Instead, consider the following code:

  Vector3 c = a.clone().add(b);

This creates a copy of a and modifies the copy instead, leaving the original a unchanged. This may seem awkward, but it allows for very concise code when performing many successive operations on vectors:

  x.add(y).cross(z).normalize();
When in doubt, please consult the Javadocs. They're there for your benefit!

The meshgen package contains data structures for operating on OBJ-style meshes. They are very similar in structure to the OBJ file itself; vertex positions are stored as Vector3s in an ArrayList, and similarly for vertex texture coordinates and vertex normals. Faces of the mesh are represented via the OBJFace class, which defines a polygon on the mesh by listing integers for each vertex which index into the position, texture coordinate, and normal arrays. This structure allows you to reuse repeated positions, texture coordinates, and normals, which will come in handy when generating the geometries mentioned above.

The OBJMesh class provides methods for reading and writing OBJ files; thus, to create a mesh, you simply create a new instance of the OBJMesh class, fill in the appropriate class members, and call writeOBJ(). The class supports both 0- and 1-based indexing, allowing you to either conform to the Java standard or OBJ standard as you set the mesh data. You can switch between indexing schemes via the indexBase static variable, which by default is 0. Either way, meshes are automatically converted to 1-based indexing before being written to file.

Some useful methods include the setVertex() method in OBJFace, which allows you to set the indices for the position, texture coordinate, and normal for a particular vertex simultaneously. Also of note are OBJMesh's getPosition(), getUV(), and getNormal() methods, which retrieve the appropriate vectors from the mesh data, taking the indexing issues mentioned above into account.
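
To make the workflow concrete, here is a hedged sketch that builds a single textured triangle and writes it out. The names OBJMesh, OBJFace, setVertex(), and writeOBJ() come from the description above, but the list field names, the OBJFace constructor arguments, the setVertex() argument order, and the writeOBJ() signature are assumptions for illustration; consult the Javadocs for the real API.

  // Hypothetical sketch: assemble one triangle with the meshgen classes (names partly assumed).
  OBJMesh mesh = new OBJMesh();
  mesh.positions.add(new Vector3(0, 0, 0));   // assumed field name for the position list
  mesh.positions.add(new Vector3(1, 0, 0));
  mesh.positions.add(new Vector3(0, 1, 0));
  mesh.uvs.add(new Vector2(0, 0));            // assumed field name for the texture coordinate list
  mesh.uvs.add(new Vector2(1, 0));
  mesh.uvs.add(new Vector2(0, 1));
  mesh.normals.add(new Vector3(0, 0, 1));     // one shared normal for the flat triangle (assumed field name)

  OBJFace face = new OBJFace(3, true, true);  // 3 vertices, with UVs and normals (assumed constructor)
  face.setVertex(0, 0, 0, 0);                 // vertex slot, position index, uv index, normal index (assumed order)
  face.setVertex(1, 1, 1, 0);                 // indices are 0-based since OBJMesh.indexBase defaults to 0
  face.setVertex(2, 2, 2, 0);
  mesh.faces.add(face);                       // assumed field name for the face list

  mesh.writeOBJ("triangle.obj");              // converted to 1-based indexing on output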

Testing your program

Since your program just writes a file full of inscrutable numbers, you need some way to look at the results. We have provided a mesh visualization tool that works in your browser, available here. To view a mesh, click "Select OBJ File" (or just drag an OBJ file onto the window). If the mesh has texture coordinates, you can upload an image texture by clicking "Select Texture File" (or, again, just drag the image file onto the window). The viewer optionally shows the wireframe structure and vertex normals of the mesh. Coordinate axes may also be displayed; +x is red, +y is green, and +z is blue. If triangle faces are wound in the wrong direction, they appear yellow and will not be textured.

There are a number of additional programs to visualize meshes, for example Blender, ObjViewer, MeshLab, or p3d.in. Be careful, though! Some of these programs add normals to your meshes if they don't already exist, or if they're malformed.

In addition, the OBJMesh class contains the isValid() method to check that your mesh conforms to the OBJ standard, and the compare() method to check if two different meshes are equivalent (ignoring the order in which faces are defined). Thus, another way to verify your code is correct is to compare your output to reference meshes, which are included in the data directory. Reference meshes are given for the cylinder and sphere with default arguments. To receive full credit, your output must be equivalent to ours, and generate no warnings when used with the verbose option. (We will have other reference meshes to test your code against in addition to the ones provided.) In addition, we have provided two meshes without normals (a bunny and a horse), as well as examples of what the meshes should look like when normals have been estimated. Be patient with the horse example, as it contains many triangles and may take a while to process.

    A final version of a textured sphere and cylinder

Extensions

There are many opportunities to expand your implementation and receive extra credit. Some ideas are listed below.

  • A torus is a doughnut-shaped surface defined by a major radius, affecting the size of the hole, and a minor radius, affecting the thickness of the ring. Add torus to the list of geometries generated by your program. Your code should create a torus with major radius 1 and minor radius r (controlled by an additional -r flag with a default of 0.25). Its u coordinates are like the sphere's, and the v coordinate runs from 0 to 1 around the inside of the torus, with the direction arranged so that the texture is right-reading from the outside (i.e., the texture is not flipped when mapped to the surface). Like the sphere, it has a seam on the -z half of the yz-plane, and it has a similar seam around the inner surface of the doughnut hole; vertices along each seam share a pair of texture coordinates, and a single vertex, at the position (0,0,r-1) where the seams meet, shares 4 texture coordinates.

    Specs illustration for the torus (for the case of n = 32, m = 16, and r = 0.25)

  • You may have noticed that for the sphere, the triangles get compressed near the poles. Provide an alternative tessellation approach to create a more even mesh; see icosphere as an example. (Note that this feature should be provided in addition to the sphere specification described above.)
  • The compare() method provided in the OBJMesh class compares each face in one mesh with each face in the other until it finds a match. This naive, brute force algorithm is very slow when given large meshes as input. Provide an alternative method with better performance. (Can you reduce the complexity from \(O(n^2)\) to \(O(n \log n)\)?)
  • Per the OBJ specification, the OBJFace class supports faces that are arbitrary polygons, not just triangles. However, in many cases, triangles are much nicer to work with. Extend your MeshGen utility to generate a triangle mesh from an input mesh with polygonal faces. That is, take all faces that have 4 or more vertices and split them into triangles.
  • As mentioned above, the OBJ data format allows the reuse of positions, texture coordinates, and normals through indexing. However, this reuse is not enforced. Extend your MeshGen utility to take in an uncompressed OBJ file and remove duplicate positions, texture coordinates, and normals, adjusting vertex indices as necessary.
  • Many algorithms that operate on triangle meshes assume that the mesh is well-behaved; that is, it has some structure that is more predictable than just a soup of arbitrary triangles. One class of well-behaved mesh is known as a manifold mesh. Manifold meshes satisfy the following properties:
    • Every edge in the mesh borders either 1 or 2 triangles.
    • Every vertex in the mesh belongs to a connected set of triangles; that is, if you consider all triangles that share one particular vertex, you can draw a curve from any triangle to any other triangle without leaving the surface of the mesh, and without crossing over the vertex. "Bowtie" configurations are not allowed.
    Extend your MeshGen utility to check if an input triangle mesh is manifold or not. As a hint, it may be useful to create a new data structure which, given the index of a triangle, allows you to quickly find that triangle's neighboring faces.

If you implement any of the above options, please provide sample input and/or output with your submission that demonstrates that your code is functioning correctly.

Command-line arguments in Java

The command-line utility MeshGen is supposed to handle its command-line arguments as a combination of options and parameters. If you've never used command-line arguments in Java, take a look at this brief documentation first.

The command-line arguments are stored as an array of strings (String[] args) passed to the main method. If the input arguments are

  -g sphere -n 32 -m 64 -o outfile.obj
then
  args = {"-g", "sphere", "-n", "32", "-m", "64", "-o", "outfile.obj"};
Here is a simple example of parsing the command-line arguments above:

class MeshGen
{
  public static void main(String[] args) {

    if (args.length > 0 && args[0].equals("-g")) {
      // generate a mesh if the first option is "-g"
      String geometry = null, outFile = null;
      int n = 32, m = 16;  // defaults from the specification

      for (int i = 0; i + 1 < args.length; i += 2)
      {
        if (args[i].equals("-g"))
        {
          geometry = args[i+1];  // "sphere" or "cylinder"
        }
        else if (args[i].equals("-n"))
        {
          // use Integer.parseInt() for parsing an integer parameter
          n = Integer.parseInt(args[i+1]);  // divisions around the equator
        }
        else if (args[i].equals("-m"))
        {
          m = Integer.parseInt(args[i+1]);  // divisions from pole to pole (ignored for the cylinder)
        }
        else if (args[i].equals("-o"))
        {
          outFile = args[i+1];  // output OBJ file name
        }
      }
      // ... generate the requested geometry and write it to outFile ...
    }
    // usage (2), "-i <infile.obj> -o <outfile.obj>", can be handled similarly
  }
}

Handing in

For ease of grading, please make sure your solution, named MeshGen.java, is in the meshgen package. When handing in your solution, just submit this file to CMS. Please also submit a README file that contains:

  • Your and your partner's names and NetIDs.
  • Explanations of any extensions you implemented and incantations to run them via the command line.
  • Anything else you want us to know.

If there is anything else you would like to put in your solution (e.g., meshes you generated for extra credit), please place it in a ZIP file and submit it separately.

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.

Written Part

Due: Friday September 14th 2018 (11:59pm)

Q1.

A group of astronauts has just left on their journey to Mars when ground control informs them that there will be a lunar eclipse back home. The astronauts are asked to take a picture of this rare event from their location in space and report back. The situation at the moment they take the picture is as follows:

Note that we use Earth-centered coordinates that match the ones for the sphere in the mesh assignment. So 1 Earth radius is the unit of measure.

  • The sun is positioned 23,679 units away from the Earth (from the center, if you must ask), exactly above the point at latitude and longitude zero, at (0, 0, 23679).
  • The moon is positioned at (0, 1, -10), relative to the center of the Earth. It has radius 0.15.
  • The spacecraft is at (15, 0, 20) and its camera is looking in the direction (-3/5, 0, -4/5) (which is the ray through the very center of the image).
1). Ray Intersection and Shadow Ray
  1. In the astronauts' photo, what (x, y, z) point on the Earth's surface appears exactly at the center of the image? Check your work by showing that the point is both on the ray and on the surface of the Earth (plug the point into the ray and sphere equations). See Sec. 4.4.1 of the textbook.
  2. Is the point (0, .91, -9.88) on the moon in the shadow of the Earth? You can approximate the direction from the point to the sun as (0, 0, 1).
2). Lambertian Reflectance
  1. Let's assume for this question that colors only have one component (reflectance) instead of three RGB components. Let the point on the Earth at the center of the image be P (calculated above). If the intensity of the sun is 100,000,000 units and we assume only diffuse reflectance from the Earth, with diffuse reflectance R = 0.5 at P, what is the light reflected from P toward the camera? Again, you can approximate the light direction as (0, 0, 1).

Q2.

Normal Interpolation

Suppose we are approximating a sphere by a regular tetrahedron of edge length 2, centered at the origin. (Of course this is a pretty coarse approximation to a sphere, but it keeps the numbers simple.) The coordinates of the 4 vertices are A \((0,-1,1/\sqrt{2})\), B \((1,0,-1/\sqrt{2})\), C \((-1, 0, -1/\sqrt{2})\), D \((0, 1, 1/\sqrt{2})\). Since the mesh is meant to approximate a sphere, the vertex normals all point directly away from the origin. Linear (barycentric) interpolation (See textbook sec 2.7) of vertex normals is used for shading.

Suppose we are shading a ray that hit face ABD at a point E. If the (un-normalized) interpolated normal at E is \((\frac{\sqrt{6}}{18}, \; \frac{\sqrt{2}}{2}, \; \frac{2}{3})\), what is the ratio of the distances from E to edges AB, DA, and BD; what are the barycentric coordinates of E in triangle ABD; and what is the 3D position of E?

Programming Part

Due: Friday September 21st 2018 (11:59pm)

Introduction

Ray tracing is a simple and powerful algorithm for rendering images. Within the accuracy of the scene and shading models and with enough computing time, the images produced by a ray tracer can be physically accurate and can appear indistinguishable from real images. Cornell pioneered research in accurate rendering. See http://www.graphics.cornell.edu/online/box/compare.html for the famous Cornell box, which exists in Rhodes Hall.

The basic idea behind ray tracing is to model the movement of photons as they travel along their path from a light source, to a surface, and finally to a viewer's eye. Tracing forward along this path is difficult, since it can be unclear which photons will make it to the eye, especially when photons can bounce between objects several times along their path (and in unpredictable ways). To simplify the system, these paths are computed in reverse; that is, rays originate from the position of the eye. Color information, which depends on the scene's light sources, is computed when these rays intersect an object, and is sent back towards the eye to determine the color of any particular pixel.

In this assignment, you will complete an implementation of a ray tracer. It will not be able to produce physically accurate images, but later in the semester you will extend it to produce nice looking images with many interesting effects.

We have provided framework code to save you from spending time on class hierarchy design and input/output code, so that you can focus on implementing the core ray tracing components. However, you also have the freedom to redesign the system as long as your ray tracer meets our requirements and produces the same images for the given scenes.

Requirement Overview

The programming portion of this assignment is described below.

A ray tracer computes images in image order, meaning that the outer loop visits pixels one at a time. (In our framework the outer loop is found in the class RayTracer). For each pixel it performs the following operations:

  • Ray generation, in which the viewing ray corresponding to the current pixel is found. The calculations depend on the characteristics of the camera: where it is located, which way it is looking, etc. In our framework, ray generation happens in the subclasses of Camera.
  • Ray intersection, which determines the visible surface at the current pixel by finding the closest intersection between the ray and a surface in the scene (in our framework, ray intersection is handled by the Scene class).
  • Shading, in which the color of the object, as seen from the camera, is determined. Shading accounts for the effects of illumination and shadows (in our framework, shading happens in the subclasses of Shader).

In this assignment, your ray tracer will have support for:

  • Spheres and triangle meshes
  • Lambertian and Phong shading
  • Mirror reflections
  • Point lights with shadows
  • Parallel and perspective cameras

Your ray tracer will read files in an XML file format and output images in the OpenEXR high-dynamic-range (HDR) image format. We have split ray tracing in CS4620 into two parts. For now, you will implement basic ray tracing. Later in the semester, after you have learned many more graphics techniques, you will implement other features, such as advanced shading and illumination, faster rendering, and transformations on groups of objects. In this document, we focus on what you need to implement the basic elements of a ray tracer.

Precision in the programming assignment

All the input in this project is parsed in single precision and the scene is stored in single precision as well. This is a good practice since scene data in real systems can occupy a lot of memory. But accurate ray intersections and shadows are difficult to achieve with single precision. So you should carry out all the geometric calculations, and store all related intermediate results, in double precision.

Getting Started

A new commit has been pushed to the class Github page in the master branch. We recommend switching to your master branch, pulling this commit, creating a new branch (e.g. A2 solution), and committing your work there. This new commit contains all the framework changes and additions necessary for this project.

The ray1 framework is set up to output high-dynamic range (HDR) images in the OpenEXR format. We use the openexr C++ library to produce HDR output, via a JNI library called openexrjni. In order for the Java runtime to find the native library that is appropriate to your platform, you need to configure your native library location. To do this in Eclipse, go to Project → Properties → Java Build Path → Libraries tab → openexrjni-3.0.0.jar dropdown menu → Modify Native Library Location (or open openexrjni-3.0.0.jar, select "Native library location", and click the Edit button on the right, then the Workspace button in the dialog). Modify this setting so that it matches your OS; it should read a2/deps/native/<your_OS>. You may also have to set it in one more place: in Project → Properties → Java Build Path → Source tab → a2/src, set the native library location in the same way.

If you are on Linux, you will additionally need to add deps/native/linux to LD_LIBRARY_PATH. One way to do so is to launch Eclipse as follows in a terminal, where the actual path to deps/native/linux is substituted in for <path/to/deps/native/linux>:

	$ LD_LIBRARY_PATH=<path/to/deps/native/linux>:$LD_LIBRARY_PATH eclipse

The HDR output file will have a .exr extension. HDR images are becoming fairly widely supported; handy tools you can use to view them include:

  • HDRView, a minimal but convenient viewer that is cross-platform but comes precompiled only for MacOS.
  • IrfanView, a mature and full featured image manipulation program that's free for commercial use but is Windows-only.
  • The qt4image app, part of HDRITools package from which our JNI library also comes, which is cross-platform but very minimal.
  • The Preview app that comes with MacOS, which is handy (you can adjust exposure in the Adjust Color dialog) but has a habit of overwriting your files without asking.
  • Photoshop, the widely used image manipulation program, if you already happen to own a license.
The writeHDR variable in RayTracer can be set to false to output PNG images, but pixel values above 1.0 will be clipped.

Specification and Implementation

We have marked all the functions or parts of the functions you need to complete with TODO#A2 in the source code. To see all these TODO's in Eclipse, select Search menu, then File Search and type "TODO#A2". There are quite a few separate code snippets to complete in this assignment. To check your progress as you go along, we recommend implementation in the following order.

Cameras

An instance of the Camera class represents a camera in the scene. It must implement the following methods:

  public void init()
This method performs one-time setup after the scene (including the camera) is read and before rendering begins. You will want to use it to set up an orthonormal basis for the camera based on the view and up directions, so that you don't waste time doing this computation for every ray.
  public void getRay(Ray outRay, double inU, double inV)
This method generates a new ray starting at the camera through a given image point and stores it in outRay. The image point is given in UV coordinates, which must first be converted into view coordinates. Recall that an orthographic camera produces rays with varying origin and constant direction, whereas a perspective camera produces rays with constant origin and varying direction.

You will implement these two methods for both the OrthographicCamera and the PerspectiveCamera. All of the properties you will need are stored in the Camera class.

You can then test your getRay() method for both of these classes by running the main method in Camera.java. You should get "Test successful" messages if your implementation is correct.
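
If it helps to see the textbook construction in code, here is a standalone sketch of perspective ray generation; it deliberately uses plain double arrays and hypothetical parameter names (eye, viewDir, viewUp, width, height, projDistance) rather than the framework's Camera fields, which you should look up in the class itself.

  // Hypothetical, framework-free sketch of perspective ray generation.
  public class PerspectiveRaySketch {
      static double[] cross(double[] a, double[] b) {
          return new double[] { a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0] };
      }
      static double[] normalize(double[] a) {
          double len = Math.sqrt(a[0]*a[0] + a[1]*a[1] + a[2]*a[2]);
          return new double[] { a[0]/len, a[1]/len, a[2]/len };
      }
      /** Returns {origin, direction} for the image point (inU, inV) in [0,1]^2. */
      static double[][] getRay(double[] eye, double[] viewDir, double[] viewUp,
                               double width, double height, double projDistance,
                               double inU, double inV) {
          // One-time setup (belongs in init()): right-handed orthonormal camera basis.
          double[] w = normalize(new double[] { -viewDir[0], -viewDir[1], -viewDir[2] }); // back
          double[] u = normalize(cross(viewUp, w));                                       // right
          double[] v = cross(w, u);                                                       // true up
          // Map image coordinates to the view plane, projDistance in front of the eye.
          double su = (inU - 0.5) * width, sv = (inV - 0.5) * height;
          double[] dir = normalize(new double[] {
              -projDistance*w[0] + su*u[0] + sv*v[0],
              -projDistance*w[1] + su*u[1] + sv*v[1],
              -projDistance*w[2] + su*u[2] + sv*v[2] });
          // Orthographic instead: origin = eye + su*u + sv*v, direction = -w (constant).
          return new double[][] { eye.clone(), dir };
      }
  }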

Surfaces

An instance of the Surface class represents a piece of geometry in the scene. It has to implement the following method:

  boolean intersect(IntersectionRecord outRecord, Ray ray)

This method intersects the given ray with the surface. Return true if the ray intersects the geometry and write relevant information to the IntersectionRecord structure. Relevant information includes the position of the hit point, the surface normal at the hit point, the value t of the ray at the hit point, and the surface being hit itself. See the IntersectionRecord class for details.

We ask that you implement the intersection method for the following surfaces:

  • Sphere: The class contains a 3D point center and a float radius. You will need to solve the sphere/ray intersection equations. Be sure to calculate the exact normal at the point of intersection and store it in the IntersectionRecord. Additionally, please compute and store UV-coordinates in the same orientation as Assignment 1 (i.e. the poles should lie on the y-axis, and the seam should intersect the negative z-axis if the sphere is located at the origin). See section 4.4.1 of the book; a brief sketch of the quadratic setup appears after this list.
  • Triangle: In our system, each triangle belongs to a Mesh, which keeps a reference to a mesh object. A triangle stores an OBJFace object which corresponds to the face formed by the triangle. See section 4.4.2 of the book. If the owning face does not have a per-vertex normal, the intersect method should return the triangle normal stored in Triangle. If it does, it should interpolate the normals of the three vertices using the barycentric coordinate of the hit point. Texture coordinates, if available, should be interpolated using barycentric coordinates as well.
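
For reference, the sphere test of Sec. 4.4.1 reduces to a quadratic in the ray parameter \(t\): substituting the ray \(\mathbf{p}(t) = \mathbf{o} + t\mathbf{d}\) into the implicit equation of a sphere with center \(\mathbf{c}\) and radius \(R\) gives \[ (\mathbf{d}\cdot\mathbf{d})\,t^2 \;+\; 2\,\mathbf{d}\cdot(\mathbf{o}-\mathbf{c})\,t \;+\; (\mathbf{o}-\mathbf{c})\cdot(\mathbf{o}-\mathbf{c}) - R^2 \;=\; 0. \] If the discriminant is negative there is no hit; otherwise take the smallest root that lies within the ray's valid \(t\) range, and the unit normal at the hit point \(\mathbf{p}\) is \((\mathbf{p}-\mathbf{c})/R\).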
Scene

The Scene class stores information relating to the composition, i.e. the camera, a list of surfaces, a list of lights, etc. It also contains a scene-file-specified exposure field, which can be used to adjust the brightness of the resulting image. You need to implement the following method:

  boolean intersect(IntersectionRecord outRecord, Ray rayIn, boolean anyIntersection)

The method should loop through all surfaces in the scene (stored in the Surfaces field), looking for intersections along rayIn. This is done by calling the intersect method on each surface and it is used when a ray is first cast out into the scene. If anyIntersection is false, it should only record the first intersection that happens along rayIn, i.e. the intersection with the lowest value of t. If there is such an intersection, it is recorded in outRecord and true is returned; otherwise, the method returns false, and outRecord is not modified. If anyIntersection is true, the method may return true immediately upon finding any intersection with the scene's geometry, even without setting outRecord (this version is used to make shadow calculations faster).
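
A hedged sketch of the closest-hit loop is below; the surfaces field name, the t field of IntersectionRecord, and the set() copy method are assumptions standing in for the framework's actual members.

  // Hypothetical sketch of Scene.intersect (member names partly assumed).
  boolean intersect(IntersectionRecord outRecord, Ray rayIn, boolean anyIntersection) {
      boolean hit = false;
      double bestT = Double.POSITIVE_INFINITY;
      IntersectionRecord tmp = new IntersectionRecord();
      for (Surface surface : surfaces) {              // the scene's list of surfaces
          if (surface.intersect(tmp, rayIn)) {
              if (anyIntersection) return true;       // shadow query: any hit suffices
              if (tmp.t < bestT) {                    // keep only the closest intersection so far
                  bestT = tmp.t;
                  outRecord.set(tmp);                 // copy the record (assumed helper); copy fields otherwise
              }
              hit = true;
          }
      }
      return hit;
  }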

Ray Tracing

You then need to complete the shadeRay method of the RayTracer class. This method contains the main logic of the ray tracer. It takes a ray generated by the camera, intersects it with the scene, and returns a color based on what the ray hits. Fortunately, most of the details are implemented in other methods; all you need to do is call those methods, and this should only take a few lines. For instance, the Scene class contains a method for finding the first object a given ray hits. Details are in the code's comments.

We have provided a normal shader that determines color based on the normal of the surface, as well as some sample scenes that use this shader. Once you have completed the previous sections, you should be able to render the scenes one-sphere-rgb-normals.xml and two-boxes-rgb-normals.xml correctly. See Appendix A for details on rendering scenes.

Shaders

We ask you to implement the Lambertian and Phong shading models, with two Microfacet models as extra credit. (Don't worry if you don't get to these; they will come back later in the semester.)

A shader is represented by an instance of the abstract Shader class. The key method in the Shader hierarchy is

  public abstract void shade(Colord outRadiance, Scene scene, Ray ray, IntersectionRecord record, int depth)

which sets outRadiance (the resulting color) to the light that arrives from the incoming direction \(\omega_i\) (i.e. the direction towards the light) and is scattered out in direction \(\omega_o\) (i.e., towards the viewer).

The framework comes with a shader called RGBNormals, which is helpful for debugging your intersection code, but all the "interesting" shaders are subclasses of ReflectionShader. To avoid repetitive code, the shade method is implemented here, with the differences between different shading models encapsulated in the BRDF class. You can see that the Lambertian, Phong, and two Microfacet classes are already implemented, but all they do is construct instances of corresponding BRDF subclasses LambertianBRDF, PhongBRDF, etc.

The idea is to compute shading using a common formula that works for all the models: \[ L_o = f_r(\mathbf{x}, \omega_i, \omega_o) \max(\mathbf{n} \cdot \omega_i, 0) \frac{I}{r^2} \] where \(\omega_i\) is a unit vector towards the light, \(\omega_o\) is a unit vector toward the viewer, \(\mathbf{x}\) is a location on the surface, and \(\mathbf{n}\) is the surface normal at \(\mathbf{x}\). \(I\) is the intensity of the light source and \(r\) is the distance to it. The function \(f_r\) is the BRDF of the surface, and it is different for each material. The BRDF is represented by the BRDF class, with its key method

  public abstract void evalBRDF(
      Vector3d incoming, Vector3d outgoing, Vector3d surfaceNormal, 
      Vector2 texCoords, Colorf BRDFValue);

We ask you to implement the Lambertian and Phong shading models by implementing the single shading method ReflectionShader.shade and the evaluation methods LambertianBRDF.evalBRDF and PhongBRDF.evalBRDF (with the Microfacet BRDFs as extra credit: implementing them will automatically enable microfacet shading).

  • ReflectionShader: In the shade method you will basically evaluate the expression above for \(L_o\) (which corresponds to the output parameter outRadiance), using the provided parameters.

    Notice that these values should be computed once for each light in the scene, and the contributions from each light should be summed, unless something is blocking the path from the light source to the object. Check whether your shading point is shadowed by constructing a ray segment (meaning a Ray that has both a starting and ending \(t\) value) that starts at your shading point and ends at the light source, then calling scene.getAnyIntersection to check whether it intersects any surfaces. Don't forget to offset the start of the ray; Ray.makeOffsetSegment is the ideal tool for making shadow rays.

     reset outRadiance to (0, 0, 0)
       for each light in the scene
         if light is not shadowed
           compute color contribution from this light
           add this contribution to outRadiance
         end if 
       end for 
    
  • LambertianBRDF: This class implements the perfectly diffuse BRDF. Its BRDF value is constant, corresponding to the reflectance \(R_d\) specified by the diffuseColor field: \[ f_r(\mathbf{x}, \omega_i, \omega_o) = R_d(\mathbf{x}) / \pi \] See below under Textures for the dependence of \(R_d\) on \(\mathbf{x}\). Section 4.5.1 of the book is relevant, but note that there are several differences and the course slides are more consistent with this assignment.
  • After ReflectionShader and LambertianBRDF are implemented, you should be able to render two-boxes.xml correctly.

  • PhongBRDF: This class implements the Modified Blinn-Phong lighting model. It has a diffuse component identical to the Lambertian model, as well as a specular component controlled by the specularColor, \(k_s\), and exponent, \(p\), fields, where the exponent determines the "shininess". The evalBRDF method should evaluate this model: \[ f_r(\mathbf{x}, \omega_i, \omega_o) = R_d(\mathbf{x}) / \pi + k_s \, \max(\mathbf{n} \cdot \mathbf{h}, 0)^p \] Here, \(\mathbf{h} = \frac{\omega_i \,+\, \omega_o}{|\omega_i \,+\, \omega_o|}\). See section 4.5.2 with the same caution as above.
  • To avoid some weird artifacts related to the difference between shading normals and geometric normals, the evalBRDF function should return zero if either \(n \cdot \omega_i\) or \(n \cdot \omega_o\) is negative. This way, if you manage to get a view direction that actually points into the surface rather than away from it, you will just return zero.

Mirror reflection

Once you have shading based on illumination from light sources working, there is one more kind of reflection to include: reflections of objects in one another's surfaces. Doing this accurately and consistently with shading from BRDFs is a topic for later in the semester, but for this assignment we simply add a completely separate mirror reflection component to the shading. It is computed independently of the shading computed using light sources and BRDFs, and it's controlled by the parameter mirrorCoefficient, which is defined in ReflectionShader and therefore available for all shaders that are subclasses.

Mirror reflection is computed by recursively calling RayTracer.shadeRay to compute the radiance seen by a ray starting at the surface and proceeding in the direction that is the mirror reflection of the view direction. Compute the reflection direction as discussed in the book (Section 4.8) and in the slides, find the radiance seen in that direction, scale it by mirrorCoefficient, and add it into outRadiance.
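
Concretely, if \(\mathbf{d}\) is the unit direction of the incoming view ray and \(\mathbf{n}\) is the unit surface normal, the standard reflection formula is \[ \mathbf{r} \;=\; \mathbf{d} \;-\; 2\,(\mathbf{d}\cdot\mathbf{n})\,\mathbf{n}, \] and the recursive ray starts at the (offset) hit point and proceeds in direction \(\mathbf{r}\).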

Mirror reflection is not enabled for most scenes because it slows things down a bit; to keep from wasting time when it is turned off, check whether mirrorCoefficient is nonzero before tracing the recursive ray. Once mirror reflection is working you will be able to render the four-spheres scene and see lots of interreflections, and will see the reflections added to the teapot scenes.

Textures

The Shader classes above have a texture field that is by default set to null. If the scene XML file specifies an image texture for the shader, a Texture object will be referenced by this field. In this case, you should use the texture to determine the diffuse color in the shade method, ignoring the diffuseColor field.

You will see that all the ReflectionShader subclasses set up the BRDFs with a reference to the texture also. Thus, whenever you are computing diffuse shading, you should get the reflectance from the texture if available, and otherwise the diffuse color. This logic is encapsulated in the helper method BRDF.getDiffuseReflectance which is available to all the BRDF methods you're implementing.

The Texture class and its subclasses are responsible for using the UV-coordinates of a particular point on the Surface to find this color on a given image (located in the Texture class). You will be implementing the getTexColor(Vector2d texCoord) method for two very similar subclasses of Texture. Both classes convert UV-coordinates into the nearest pixel coordinates corresponding to the image. The difference is in the way they handle UV-coordinates outside of the [0.0, 1.0] range:

  • ClampTexture: This class uses the nearest pixel for UV-coordinates that are out of range. As a result, the colors of the boundary pixels of a texture are extended outwards in UV space.
  • RepeatTexture: This class repeats the texture for out-of-range UV-coordinates. In this case, copies of the texture are tiled across UV space.

The details of how to implement these are provided in the code. Once these are completed, all scenes in subdirectory textured_scenes should render correctly.
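
As a rough illustration of the difference between the two wrapping modes, the sketch below maps a u coordinate to a pixel column index for an image of width w; the nearest-pixel convention and the method names are assumptions for illustration, and the comments in the framework code take precedence.

  // Hypothetical sketch: u-to-column conversion (the same idea applies to v and the height).
  static int clampColumn(double u, int w) {
      int x = (int) Math.floor(u * w);          // nearest-pixel mapping (assumed convention)
      return Math.min(Math.max(x, 0), w - 1);   // ClampTexture: stick to the boundary pixel
  }

  static int repeatColumn(double u, int w) {
      int x = (int) Math.floor(u * w);
      return ((x % w) + w) % w;                 // RepeatTexture: tile across UV space (handles negative u)
  }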

Appendix A: Testing

The data directory contains scene files and reference images for the purpose of testing your code. Right click on the RayTracer file in Package Explorer, and choose Run As->Java Application. These scene files are organized into several subdirectories:

  • fast_scenes: a collection of simple scenes that should render very quickly, under a second each, so that you can quickly tell what is working. You should re-render these constantly as you work, so that you learn right away if you have broken something that used to work.
  • mesh_scenes: scenes that test the ability to handle larger triangle meshes. Since your ray tracer takes time linear in the number of triangles, these will be slower to render. When you are iterating on one of these results, don't hesitate to edit the input file to make the output image temporarily much smaller!
  • textured_scenes: scenes that test the texture mapping features. These won't render correctly until you have implemented the texture mapping methods.

With just a directory name passed as a command line argument, the ray tracer will attempt to render all scenes in the corresponding scene directory (i.e. data/scenes/ray1/dirname). If you want to render specific scenes, you can list the file names individually on the command line.

You can set up command line arguments from the Run dialog (Run Configurations) in Eclipse. Choose the Arguments tab, and provide your command line arguments in Program arguments. Alternatively, you can run the program from the command line, which is sometimes more convenient for rapid iteration. You will need to set both the class path and the library path; a command line like the following, executed in the a2/ project directory, will do it:

java -cp bin/:deps/lib/openexrjni-3.0.0.jar -Djava.library.path=deps/native/macosx/ ray1.RayTracer fast_scenes/
with an appropriate substitution to the library path for Windows or Linux.

Note that RayTracer prepends data/scenes/ray1 to its arguments. This means that you only need to provide a scene file name to render it. For example, if you give the argument fast_scenes/one-sphere.xml to the RayTracer it will load the scene file data/scenes/ray1/fast_scenes/one-sphere.xml and produce an output image in the same directory as the scene file.

If you would like to change the default scene location, there is a -p command line option that allows you to specify a different default path instead of using data/scenes/ray1.

Compare the images you have with the reference images; if there are visually significant differences, consider checking for bugs in your code.

Appendix B: Extra Credit

Cylinder surface

    We've included a Cylinder subclass of Surface, similar to the Triangle and Sphere classes. The class contains a 3D point center and two real numbers, radius and height. Assume that the cylinder is capped, i.e., the top and bottom have caps (it is not a hollow tube). Also, assume it is aligned with the z-axis, and is centered around center;
    i.e., if center is \((c_x, c_y, c_z)\) and height is \(h\), the cylinder's z-extent is from \(c_z-\frac{h}{2}\) to \(c_z+\frac{h}{2}\).

    Your task is to implement the intersect method of a mathematical cylinder (NOT a meshed cylinder). Add an XML scene that includes a cylinder to test your implementation (See Appendix C for details on how this should be done--you shouldn't need to change any parser code).
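
    For the axis-aligned case, a hedged outline of the math: substituting the ray \(\mathbf{p}(t) = \mathbf{o} + t\mathbf{d}\) into the infinite cylinder \((x - c_x)^2 + (y - c_y)^2 = r^2\) gives the quadratic \[ (d_x^2 + d_y^2)\,t^2 \;+\; 2\big(d_x(o_x - c_x) + d_y(o_y - c_y)\big)\,t \;+\; (o_x - c_x)^2 + (o_y - c_y)^2 - r^2 \;=\; 0, \] whose roots are valid hits only if the corresponding \(z\) lies in \([c_z - \frac{h}{2}, \, c_z + \frac{h}{2}]\). The two caps can be handled separately as ray-plane intersections at \(z = c_z \pm \frac{h}{2}\), accepting hits within distance \(r\) of the axis, and the reported intersection is the smallest valid \(t\).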

    Once this is done, try implementing a cylinder that is arbitrarily oriented. You will need to add two parameters, rotX and rotY, to specify the cylinder's rotation about the x and y axes. Note that this will require changing the coordinate system of the incoming ray within the intersect method to account for these rotations. After the coordinate system change, the math used to intersect the ray with the cylinder should be the same as its axis-aligned variant. Finally, build a test scene with a cylinder object.

Microfacet BRDF with Beckmann distribution
  • MicrofacetBRDF: This and its two subclasses implement two variants of the Microfacet BRDF model. Its BRDF function is the product of several factors:
      \[ f_r(\omega_i,\omega_o,n) \; = \; \frac{F(\omega_i,h)G(\omega_i,\omega_o,h)D(h)}{4|\omega_i \cdot n| |\omega_o \cdot n|} \]

    \(\omega_i\): Direction unit vector along incoming ray.

    \(\omega_o\): Direction unit vector along outgoing ray.

    \(n\): unit normal vector of surface at the shaded point.

    \(h\): The half vector of \(\omega_i\) and \(\omega_o\), normalized to unit length:

      \[ h(\omega_i, \omega_o) \; = \; \frac{\omega_i + \omega_o}{|\omega_i + \omega_o|} \]

    To avoid some weird artifacts related to the difference between shading normals and geometric normals, the evalBRDF function should return zero if either \(n \cdot \omega_i\) or \(n \cdot \omega_o\) is negative. This way, if you manage to get a view direction that actually points into the surface rather than away from it, you will just return zero.

    More details on each factor of the microfacet BRDF function follow.

    \(F(\omega_i,h)\): This is the Fresnel term for unpolarized light; only reflection is considered in this assignment.

      \[ F(\omega_i, h) \; = \; \frac{1}{2}\frac{(g-c)^2}{(g+c)^2}\left(1 \; + \; \frac{(c(g+c)-1)^2}{(c(g-c)+1)^2}\right)\]
    where \(g \; = \; \sqrt{\frac{n_t^2}{n_i^2}-1+c^2}\) and \(c \; = \; |\omega_i \cdot h|\). Here \(n_t\) and \(n_i\) are the refractive indices of the material and of air, respectively, with \(n_i\;=\;1.0\). In the microfacet shader, the variable nt parsed from the XML file corresponds to \(n_t\). Note that the value of \(g\) can be imaginary for some materials; set \(F\;=\;1\) in this case.

    The \(G\) and \(D\) functions differ for the two variants.

  • BeckmannBRDF: this class implements the Beckmann microfacet distribution.

    \(D(h)\): This factor comes from the microfacet distribution function. For the Beckmann model it is: \[ D(h) \; = \; \frac{\chi^{+}(h \cdot n)}{\pi\alpha_b^2\cos^2\theta_h} \; e^{\frac{-\tan^2\theta_h}{\alpha_b^2}} \] where \(\theta_h\) denotes the angle between \(h\) and \(n\); \(\chi^{+}(a)\) is the positive characteristic function (one if \(a>0\) and zero if \(a \leq 0\)). \(\alpha_b\) is the width parameter of the Beckmann distribution, corresponding to the parsed variable alpha in the microfacet shader.

    \(G(\omega_i,\omega_o,h)\): This is the Smith shadowing-masking term; it depends on the choice of microfacet distribution function \(D(h)\). \(G(\omega_i, \omega_o,h)\) can be separated as the product of two parts: \[ G(\omega_i, \omega_o, h) = G_1(\omega_i,h)G_1(\omega_o,h) \] For the Beckmann distribution, \[ G_1(v,h) \; = \; \chi^{+}\left(\frac{v\cdot h}{v\cdot n}\right) \begin{cases} \frac{3.535a\;+\;2.181a^2}{1\;+\;2.276a\;+\;2.577a^2}, & \text{if}\ a < 1.6 \\ 1 & \text{otherwise} \end{cases} \] where \(a=(\alpha_b\tan\theta_v)^{-1}\) and \(\theta_v\) is the angle between \(v\) and \(n\).

  • GGXBRDF: The microfacet-based shader implemented above assumes the normal distribution of the microfacet surface is Beckmann. Another class of normal distribution that is widely used is GGX. In order to implement microfacet shader with GGX distribution, both the Smith shadowing-masking factor \(G(\omega_i,\omega_o,h)\) and the microfacet distribution factor \(D(h)\) should be modified.

    The GGX distribution with width parameter \(\alpha_g\) is \[ D(h) \; = \; \frac{\alpha_g^2\;\chi^{+}(h \cdot n)}{\pi\cos^4\theta_h\;(\alpha_g^2\;+\;\tan^2\theta_h)^2} \] The corresponding shadowing-masking term is \[ G(\omega_i, \omega_o, h) = G_1(\omega_i,h)G_1(\omega_o,h) \] And for the GGX distribution, \[ G_1(v,h) \; = \; \chi^{+}\left(\frac{v\cdot h}{v\cdot n}\right) \; \frac{2}{1\;+\;\sqrt{1\;+\;\alpha_g^2\tan^2\theta_v}} \] The variable notation and all other factors in the BRDF function remain the same as above.

    Appendix C: Framework

    This section describes the framework code that comes with the assignment. You do not need to spend time trying to understand it and it is not essential to read this to get started on the assignment. Instead, you can reference it as needed.

    The framework for this assignment includes a simple main program, some utility classes for vector math, a parser for the input file format, and stubs for the classes that are required by the parser.

    RayTracer

    This class holds the entry point for the program. The main method is provided, so that your program will have a command-line interface compatible with ours. It treats each command line argument as the name of an input file, which it parses, renders an image, and writes the image to a PNG file. The method RayTracer.renderImage is called to do the actual rendering.

    Image

    This class contains an array of pixel colors (stored as doubles) and the requisite code to get and set pixels and to output the image to a PNG file.

    egl.math package

    This package contains classes to represent 2D and 3D vectors, as well as RGB colors. They support all the standard vector arithmetic operations you're likely to need, including dot and cross products for vectors and addition and scaling for colors.

    Parser

    The Parser class contains a simple parser based on Java's built-in XML parsing. The parser simply reads an XML document and instantiates an object for each XML entity, adding it to its containing element by calling set... or add... methods on the containing object.

    For instance, the input:

    <scene>
        <surface type="Sphere">
            <shader type="Lambertian">
                <diffuseColor>0 0 1</diffuseColor>
            </shader>
            <center>1 2 3</center>
            <radius>4</radius>
        </surface>
    </scene>
    
    results in the following construction sequence:
    • Create the scene.
    • Create an object of class Sphere and add it to the scene by calling Scene.addSurface. This is OK because Sphere extends the Surface class.
    • Create an object of class Lambertian and add it to the sphere using Sphere.setShader. This is OK because Lambertian extends the Shader class.
    • Call setDiffuseColor(new Colord(0, 0, 1)) on the shader.
    • Call setCenter(new Vector3d(1, 2, 3)) on the sphere.
    • Call setRadius(4) on the sphere.

    Which elements are allowed where in the file is determined by which classes contain appropriate methods, and the types of those methods' parameters determine how the tag's contents are parsed (as a number, a vector, etc.). There is more detail for the curious in the header comment of the Parser class.

    The practical result of all this is that your ray tracer is handed an object of class Scene that contains objects that are in one-to-one correspondence with the elements in the input file. You shouldn't need to change the parser in any way.

    What to Submit

    Submit your solution to the written part on CMS as a single PDF file containing Q1 and Q2.

    As for the code part, submit a zip file containing your solution organized the same way as the code on Github, along with any additional files. Include a readme in your zip that contains:

    • Your and your partner's names and NetIDs.
    • Any problems with your solution.
    • Anything else you want us to know.

    Upload here (CMS)

    Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.

    Written Part

    Due: Friday September 28th 2018 (11:59pm)

    Note: You should feel free to use a symbolic algebra package, such as Wolfram Alpha (easiest to get started with, not so easy to automate), Wolfram Cloud (access to the underlying language for Alpha, offered in freemium model), or SymPy (open source system with many of the capabilities of Wolfram Cloud), and/or numerical math tools like a scientific calculator or Matlab/NumPy/Julia etc., to help with these problems.

    Q1. Transformation matrices

    1. Write the matrix for a CCW rotation by 30 degrees around the origin. Verify it by transforming the points (0,0), (1,0), and (0,1) to ensure they go to the right places.
    2. Write the matrix for a CW rotation by 30 degrees around the point (1,0). Verify it by transforming the points (1,0), (2,0), and (1,1) to ensure they go to the right places.
    3. Compute the composition of these two rotations in the two possible orders. Describe the resulting transformations in words (e.g. as a rotation about a point, a translation, etc.).

    Q2. Perspective Projection

    The orthographic projection matrix given in lecture and in the book is: \[M_o = \begin{bmatrix}2/(r-l) & 0 & 0 & -(r+l)/(r-l) \\ 0 & 2/(t-b) & 0 & -(t+b)/(t-b) \\ 0 & 0 & 2/(n-f) & -(n+f)/(n-f) \\ 0 & 0 & 0 & 1\end{bmatrix}\] where \(l\), \(r\), \(b\), \(t\), \(n\), and \(f\) specify the left, right, bottom, top, near, and far bounds of the view. The inverse of this matrix is: \[ M_o^{-1} = \begin{bmatrix}(r-l)/2 & 0 & 0 & (l+r)/2 \\ 0 & (t-b)/2 & 0 & (b+t)/2 \\ 0 & 0 & (n-f)/2 & (f+n)/2 \\ 0 & 0 & 0 & 1 \end{bmatrix}\] The perspective matrix is: \[ P = \begin{bmatrix}n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n+f & -fn \\ 0 & 0 & 1 & 0 \end{bmatrix} \] And its inverse transformation is: \[ P^\text{inv} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 0 & fn \\ 0 & 0 & -1 & f+n \end{bmatrix}\] (Note that this is not precisely the inverse of the matrix \(P\); it has been re-scaled to make it neater without changing the projective transformation that it defines.)

    a. Multiply these matrices to compute the perspective projection matrix, \(M_p\), and a matrix \(M_p^\text{inv}\) for the inverse of the perspective projection. Your answer will still be in terms of the view parameters \(l\), \(r\), \(b\), \(t\), \(n\), and \(f\). For a perspective view, recall that the near and far coordinates are both negative (since they are the coordinates of planes that are in front of the camera), and the left, right, bottom, and top coordinates are the bounds of the view volume as measured on the near plane (\(z = n\)).

    Just to recall some terminology: each of these projection transformations (represented by the matrices \(M_o\) and \(M_p\)) takes points in eye space (aka. camera space) and maps them to normalized device coordinates (aka NDC).

    b. Verify your answer to (a):
      i. Compute the projection and inverse projection matrices for \(l = -2\), \(r = +2\), \(b = -1\), \(t = +1\), \(n = -2\), \(f = -4\).
      ii. Transform the 3D eye-space point \((2, 1, -3)\), which is inside the view volume, to a 3D point in NDC, and confirm it lands inside the canonical view volume.
      iii. Transform the point from (ii) back into eye space and confirm it lands in the same place where it started.

    In NDC space, the eight corners of the canonical view volume are \(c_\text{lbf}, c_\text{rbf}, \ldots, c_\text{rtn} = (\pm 1, \pm 1, \pm 1)\). The view volume in eye space consists of all the points that map into the canonical view volume under the projection transformation.

    c. Use the inverse projection matrices to compute the eight 3D points that are the corners of the eye-space view volumes for the perspective and orthographic cases. Your answer will be in terms of the view parameters \(l\), \(r\), \(b\), \(t\), \(n\), and \(f\). Check your answer by verifying that the 3D corner points you found for each volume transform to the correct corners of the canonical view volume. [If you have not managed to automate the computations, just hand in 3 of the 8 corners for each case, 2 on the near plane and 1 on the far plane.]

    In homogeneous coordinates, in which the 3D point \((x,y,z)\) is written as the 4D vector \(p = (x,y,z,1)\), the plane \(\{(x, y, z) \,|\, ax + by + cz + d = 0\}\) can be written compactly using the 4D vector \(q = (a,b,c,d)\). The plane equation is then \(\{p \,|\, p \cdot q = 0\}\), which makes it easy to check whether a point \(p\) is on a plane \(q\). Just as the 4D vector describing a 3D point is only unique up to a scale factor, the 4D vector describing a plane is only unique up to a scale factor.
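
    As a quick illustration of this representation (not part of the question): the plane \(z = 5\), i.e. \(0x + 0y + 1z - 5 = 0\), corresponds to \(q = (0,0,1,-5)\), and the point \((1,2,5)\) lies on it since \(p \cdot q = (1,2,5,1)\cdot(0,0,1,-5) = 0 + 0 + 5 - 5 = 0\).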

    d. Write the 4D vectors corresponding to the six planes \(P_l, P_r, P_b, P_t, P_f, P_n\) that contain the left, right, bottom, top, far, and near faces of the canonical view volume, in NDC.
    e. Find the 4D vectors of the 6 planes that bound the eye-space perspective view volume. You can do this by inspection, from the corners of the view volume that you worked out earlier, or if you want a niftier way, by transforming the six planes from the previous part through the inverse projection matrix (planes in homogeneous coordinates transform in exactly the same way as normal vectors do in ordinary 3D transformations). Remember these vectors are only defined up to a scale factor, so you may like to scale them to the neatest, simplest, most understandable form. [If you have not managed to automate the computations, only hand in the front, left, and top planes.]
    f. Verify that each of the 8 corners of the volume is indeed on each of the 3 planes it should be, and that the left, right, bottom, and top planes all also contain the origin (aka. the eye position). If you want, see if you can do this part using a single matrix multiplication. [If you have not managed to automate the computations, only hand in the checks for the front, left, and top planes and three corner points of your choice.]

    Q3. Manipulator Warmup

    Manipulators are 3D user interface elements that allow the user to edit the transformation of an object. In this question, we will consider a translation manipulator. A translation manipulator can be thought of as a ray in three-dimensional space that corresponds to an object-space axis (either x, y, or z) for a given object. One challenge with implementing manipulators is mapping the user's action (in the 2D image plane) to a 3D transformation. We are given the coordinates and direction of the manipulator and two click locations on screen, and the question is: what translation should be applied to the object?

    Consider a manipulator as a ray with origin \(\mathbf{o}\) and direction \(\mathbf{d}\). Any point \(\mathbf{p}\) on that ray can be described by a parameter t: \(\mathbf{p} = \mathbf{o} + t\mathbf{d}\). For a translation manipulator, the goal is to find the positions on the ray that the user clicked (\(t_1\) being the first click point, and \(t_2\) being the second). Once we've determined the best \(t_1\) and \(t_2\), we can then apply a translation transformation on the object in the object's frame along the manipulator's axis by the amount \(t_2-t_1\).

    Parameters
    • The camera is at the origin pointing in the negative \(z\) direction, the \(y\) axis is up (as per usual)
    • The image plane is a square in the \(z=-4\) plane with side length 5 centered at \((0,0,-4)\).
    • The translation manipulator's origin is located at \((2,2,-6)\)
    • The translation manipulator is pointing in the direction of \((-1,-1,-1)\). Note that you should normalize this vector such that your final parameter values match ours
    • The translation manipulator corresponds to the object's object-space x-axis.
    • The user has clicked on world-space coordinates \(\mathbf{c}_1 = (1.5,2,-4)\) and \(\mathbf{c}_2 = (-.5,0,-4)\) [note that these are on the image plane]
    Here are some reference images that might be useful for you to visualize the process of determining \(t_1\) and \(t_2\).
    [Reference figures: Ray Generation; Point Projection]
    Questions
    1. Our first job is to describe the rays outgoing from the eye intersecting the image plane in the places where the user clicked. What are the equations of the two rays in question?
    2. Next, we need to find the world coordinates of the points where these rays intersect the plane defined by the translation manipulator's origin and direction. A point and a direction, however, aren't enough to determine a plane -- we need an additional constraint. We want the plane to contain...
      1. the manipulator's origin
      2. the manipulator's direction
      3. a ray perpendicular to the manipulator's ray and parallel to the image plane
      What are the coordinates of the two points in question?
    3. Now we need to find the points on the manipulator ray that are closest to these points. Keeping in mind that points on our ray can be completely described by their associated \(t\) parameters, what are \(t_1\) and \(t_2\)?
    4. Finally, what is the matrix we would apply to translate the object appropriately in its own coordinate space?

    What to Submit

    Submit a PDF file containing Q1, Q2, and Q3 together.

    Programming Part

    Due: Wednesday October 3rd 2018 (11:59pm)

    For the programming part of this assignment, you will implement a 3D user interface of a type commonly provided in 3D modeling tools, which we call manipulators. A 3D UI element is one that appears in a 3D scene, and like 2D UI elements such as scroll bars and sliders, it responds to a user's mouse input to adjust a property of the scene the user wants to control. In particular, manipulators are 3D UI elements that are used to edit the transformations that position objects in a scene.

    Manipulators are designed to follow the UI design principle of direct manipulation; if you want to translate or rotate an object, you do so by grabbing it with the mouse and pulling in the direction you want it to go. Because transformations have a lot of degrees of freedom (ours has 9), manipulators let the user edit one part of the transformation at a time, constraining the motion so that 2D mouse input suffices to define how the object moves. Our manipulators are all single-axis manipulators, editing translation or scale along a single axis, or a single fixed-axis rotation, at a time.

    In our framework, the user chooses a manipulation mode using the keyboard: T for translation, R for rotation, S for scaling, and the appropriate manipulator—or more precisely, a trio of manipulators, one for X, one for Y, and one for Z—appears. The user uses a manipulator by clicking on it, to indicate that he or she wishes to edit the object's transformation on that particular axis. We modify the transformations by applying axis-aligned rotations, translations, or scales to them, and the goal of a manipulator is (a) to give a visual indication of what will happen when you apply one of these transforms and (b) to give a direct "handle" to pull on to indicate the direction and size of the transformation.

    In this framework, an object to be manipulated (that is, a mesh specified by a user) has several \(4\times 4\) matrices that compose together to specify the transformation from a point on the object in that object's local space to the same point in world space. These matrices are separated into a translation matrix \(T\), 3 rotation matrices \(R_x\), \(R_y\), and \(R_z\) (one for rotations around each axis), and a scale matrix \(S\). The order in which these matrices are multiplied is important (remember that in general, transformation matrices do not commute!). For this assignment, the order is \(TR_xR_yR_zS\); that is, when computing the world space location of a point on the mesh, scaling is applied first, then rotations (\(Z\), then \(Y\), then \(X\)), and then translations are applied last. These matrices will be multiplied together to form the model matrix, but keeping the components separated like this allows them to be modified more easily.
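
    Written out, this convention means a point \(p_\text{object}\) in the object's local space maps to world space as \[ p_\text{world} = T\,R_x\,R_y\,R_z\,S\;p_\text{object} \] so the scale (rightmost) is applied first and the translation (leftmost) last.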

    Setup

    A new commit has been pushed to the cs4620-18fa-frameworks repository. The framework for this assignment is under the a3 directory.

    This assignment uses OpenGL to display and update 3D graphics in real-time. In order to take advantage of OpenGL from within Java, we use the LWJGL library, which provides bindings to OpenGL. It also provides bindings to GLFW, which is a framework for managing application windows. These libraries are included in the lib directory under the a3 directory; you'll need to link to them to run the program. If you're using Eclipse, select Project > Properties. In the menu on the left, click "Java Build Path", and then click "Add JARs...". Navigate to the lib directory within the project, and select lwjgl.jar, lwjgl-opengl.jar, and lwjgl-glfw.jar. In addition, select any JAR files that contain the word "natives" followed by the name of your operating system.

    IMPORTANT NOTE FOR MAC USERS: In order to use GLFW in Java, you need to tell Java to run your program on the first thread that is spawned when you run your program. To do this, select Run > Run Configurations. In the Arguments tab, under VM arguments, add the option “-XstartOnFirstThread”.

    Specification

    The code in the framework handles displaying objects, drawing the manipulators, and detecting when the user has clicked on a manipulator. Manipulators are specified by extending the abstract Manipulator class; concrete child classes must implement the method called applyTransformation. Each manipulator has a member variable called axis (an enum that is either X, Y, or Z), and a member variable called reference that keeps track of the object being manipulated. When the user clicks on a manipulator and drags, the framework code calls applyTransformation to compute and apply the transformation on the referenced object's matrix stack. For instance, in translation mode, if the user clicks on the red (X axis) arrow and drags the mouse, the method applyTransformation will be called on the TranslationManipulator with axis==X. This method will then update reference.translation, which is a Matrix4 that contains the translation portion of the object's current transformation. The input to this function includes the previous and current mouse positions, as well as the camera's current view projection matrix. Your task is to implement applyTransformation for TranslationManipulator, ScaleManipulator, and RotationManipulator.

    The first step in implementing manipulators is to get the correct transformations, not worrying yet about the size of the transformations. For this, you just need to make a transformation of the right type, along the given axis, with a magnitude that is computed from the mouse delta in some easy way: for instance, translate by the difference in mouse x coordinate times 0.01 (the details don't matter since this is just temporary). Once this works, you will see the object moving in the direction of the manipulator arrow you clicked on, but it will generally be confusing to figure out which way to move the mouse to get the desired result.

    The second and final step is to make the manipulations "follow the mouse". This means that if you click on the manipulator and move the mouse, the point you clicked on should remain exactly under the mouse pointer as you drag. The strategy we recommend is to express the ray corresponding to the user's click and the line corresponding to the manipulation axis both in world space, then use the warmup problem to compute the \(t\) values along the axis that correspond to the user's mouse movement. Once you have these values you know how much to translate: it is by the difference of the two \(t\) values. Once this works, if you click on points that are on the manipulation axis and drag exactly along the axis, the object will exactly follow the mouse.
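
    Concretely, if \(\mathbf{o}\) and \(\mathbf{d}\) are the manipulation axis's world-space origin and unit direction, and \(\mathbf{q}\) is a world-space point derived from a mouse position (for example, by intersecting the click ray with a suitable plane, as in the warmup problem), then the closest point on the axis to \(\mathbf{q}\) has parameter \[ t = \mathbf{d} \cdot (\mathbf{q} - \mathbf{o}) \] Evaluating this for the previous and current mouse positions gives \(t_1\) and \(t_2\), and the translation amount is \(t_2 - t_1\).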

    Translation Manipulator

    The translation manipulator, which you considered already in one of the written questions, displays three arrows that represent the \(X\), \(Y\), and \(Z\) axes in global coordinates. If the user clicks and drags along an axis, the resulting translation should exactly follow the mouse (even if other transformations are applied first!). When the drag is not parallel to the selected axis, the translation should follow the mouse as much as possible (hence the projection scheme you worked with in the written questions).

    The translation manipulator is the most complicated manipulator of the three, particularly if it is the first one you're implementing, but you've already had some experience dealing with it in the written questions. As before, the idea is that the matrix you construct must be a translation of \(t_2-t_1\) along the specified axis.

    Note that when the mouse ray and manipulator axis are nearly parallel, you may observe some strange behavior. Small movements of the mouse may result in significant transformations. This is expected and unavoidable. In real-world applications, manipulators are typically disabled if the axis is close to parallel with the mouse ray.

    Scale Manipulator

    The scaling manipulator shows the three scaling axes by drawing three lines with small boxes at the ends. This manipulator is very, very similar to the translation manipulator (in fact, until you construct your final matrix, finding the appropriate \(t_1\) and \(t_2\) is nearly identical, so you might find it useful to break this functionality out into its own method that is available to both classes). The magnitude of your scale, however, should be based on a ratio of \(t_1\) and \(t_2\), rather than a difference. Here, you should scale by \(t_2/t_1\) in the appropriate direction. Furthermore, scaling happens before any other transformation in the matrix stack, meaning all scale operations happen in the object's local coordinate frame.

    Rotation Manipulator

    The rotation manipulator displays three circles on a sphere surrounding the origin of the object's coordinates. Each circle is perpendicular to one of the rotation axes, and clicking on that circle and dragging performs a rotation around that axis. Again, you'll want the rotation to follow the mouse as it drags across the screen.

    The rotation manipulator is slightly different than the previous two; rather than modifying the referenced object along a specific axis, each rotation manipulator rotates the object in the plane which is perpendicular to the given axis. For instance, the \(X\)-axis rotation manipulator rotates the object in the \(YZ\)-plane.

    To implement this manipulator, trace mouse rays into the scene for the current and last mouse positions, and intersect them with the plane defined by the manipulator's origin and axis (transformed according to their position in the matrix stack). If one or both of the rays is parallel to the plane, then don't apply any transformation to the object. Otherwise, calculate the vectors from the manipulator origin to each intersection point. Find the signed angle from the previous mouse position to the current mouse position along the plane, and rotate the referenced object around the appropriate axis by that angle. Note that you can use the plane's normal (i.e., the manipulator axis) to find out if the angle is positive or negative.
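
    For reference, here is one way to write those two steps, with \(\mathbf{e}\) the eye position, \(\mathbf{d}\) the direction of a mouse ray, \(\mathbf{c}\) the manipulator origin, and \(\mathbf{a}\) the unit rotation axis (the plane normal): the ray hits the plane at \[ t = \frac{(\mathbf{c} - \mathbf{e}) \cdot \mathbf{a}}{\mathbf{d} \cdot \mathbf{a}}, \qquad \mathbf{x} = \mathbf{e} + t\,\mathbf{d}, \qquad \mathbf{v} = \mathbf{x} - \mathbf{c} \] (the denominator is zero exactly in the parallel case mentioned above). With \(\mathbf{v}_1\) and \(\mathbf{v}_2\) computed from the previous and current mouse rays, the signed angle is \[ \theta = \operatorname{atan2}\big((\mathbf{v}_1 \times \mathbf{v}_2) \cdot \mathbf{a}, \; \mathbf{v}_1 \cdot \mathbf{v}_2\big) \]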

    These manipulators behave like a set of gimbals; that is, rotating with one manipulator updates the world-space axis of other rotation manipulators further down the stack. For example, rotating the \(X\) rotation manipulator also rotates the \(Y\) and \(Z\) rotation manipulators; rotating the \(Y\) rotation manipulator updates the \(Z\) manipulator, but NOT the \(X\) manipulator. The scaling manipulators will be affected by all three rotations.


    The Complete Program

    The program's main method is found in Main.java, within the main package. The program has a single command line argument, which is the path to an OBJ mesh (try data/meshes/teapot.obj).

    Control Guide
    • Left click and drag to orbit the camera.
    • Right click and drag to move the camera in the view plane.
    • Scroll to zoom the camera.
    • Press 'T' to display the translation manipulators.
    • Press 'S' to display the scale manipulators.
    • Press 'R' to display the rotation manipulators.
    • Press 'esc' to exit.

    What to Submit

    Submit a zip file containing your solution organized the same way as the code on Github. Include a README in your zip that contains:

    • Your and your partner's names and NetIDs.
    • Any problems with your solution.
    • Anything else you want us to know.

    Upload here (CMS)

    Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.

    Written Part

    Due: Friday October 12th 2018 (11:59pm)

    1. Displacement Mapping

    Displacement mapping involves computing new normals for the displaced surface, and the standard approach (which we use in the programming part) involves an approximation. This problem explores when this approximation is accurate, looking at a 2D analog where the math is simpler to write down.

    Suppose we have a 2D curve defined by a function:

    \[p: [0,1] \rightarrow R^2\]

    and a displacement function defined:

    \[h: [0,1] \rightarrow R\]

    The (unnormalized) tangent vector to this curve is its derivative with respect to \(u\):

    \[t(u) = p'(u)\]

    and the (unnormalized) normal vector is:

    \[n(u) = R \: t(u)\]

    where R is a rotation by 90 degrees clockwise (this obeys the convention that "inside" is to the left as you walk along the curve, and the normal points towards the outside).

    The unit tangent is:

    \[\hat{t}(u) = \frac{t(u)}{\lVert t(u)\rVert}\]

    and the unit normal is:

    \[\hat{n}(u) = \frac{n(u)}{\lVert n(u) \rVert} = \frac{n(u)}{\lVert t(u) \rVert}\]

    We define the curve "p displaced along the normal by h" as:

    \[p_d(u) = p(u) + h(u) \hat{n}(u)\]

    The (unnormalized) tangent vector to this curve is:

    \[p'_d(u) = p'(u) + h'(u) \hat{n}(u) + h(u) \hat{n}'(u)\]

    This last term is commonly discarded in practice, on the argument that we are usually applying small displacements to smooth curves, so \(h\) is small and \(\hat{n}^\prime\) (aka. curvature) is also small. The approximate tangent is:

    \[p'_{d}(u) \approx p'(u) + h'(u) \hat{n}(u)\]

    Let us see how this plays out in practice.

    1. Use the exact and approximate formulas to write two expressions for the unit normal to the displaced curve for the following particular curve and displacement function: \[p(u) = (\cos(2\pi u), \sin(2\pi u))\] \[h(u) = A \cos(40\pi u) + B\]
    2. Make pictures of the results by plotting the curve over the interval \(u \in [0.2, 0.3]\), and plotting the surface normals as short lines at regular intervals. Compare the pictures of both the exact and approximate normals for \(A = 1/20, B = 0\) and for \(A = 1/20, B = 5\). For which of these cases is the approximation a good one, and why? We recommend using MATLAB or Matplotlib to generate the images.

    2. Reflection and Cube Maps

    Imagine that we would like to render a sphere under environment lighting. The sphere is centered at (0, 0, 0) and its radius is 1. The shader that does this will look up into an environment texture, using the direction of the reflected ray to compute the texture coordinates. Suppose we have decided to use a cube map to represent the environment map, and we are using the common convention of packing the 6 faces of that map into a single texture that has the cube “unfolded” into a cross shape with width 300 texels and height 400 texels.

    A perspective camera is positioned at (0, 0, 2), looking at (0, 0, 0). The following figure (not to scale) is a slice of the scene through the y-z plane, showing several viewing rays.

    Which parts of the cube map could be sampled by reflections from the sphere? Draw the region on the cross and mark your sketch with enough positions and dimensions (in texel units) so that it’s clear exactly what region is shaded. You can round your answer to the nearest integer texel unit. We are providing the following cube map diagram in PDF format--mark it up either electronically or by printing and drawing on it.

    What to Submit

    Submit a PDF file containing Q1 and Q2 together.

    Programming Part

    Due: Wednesday October 17th 2018 (11:59pm)

    Overview

    In this assignment, you will implement some shaders in GLSL, a graphics-specific language that lets you write programs that run on the graphics processor (GPU). There are four main components to the assignment:

    1. Implement a microfacet shader in microfacet.html.
    2. Implement specular (mirror-like) reflection under environment lighting in environment.html.
    3. Implement a displacement mapping shader (without environment mapping) in displacement-bump.html.
    4. Implement a bump mapping shader (also without environment mapping) in displacement-bump.html and compare results with displacement mapping.

    Each part is contained in its own webpage, which includes a framework based on the mesh viewer you saw in Assignment 1 (Mesh). This will allow you to load any meshes and textures you might like to test with. All of the work outside of the shaders themselves has been done for you, but be sure to understand what uniforms and attributes are being passed to your GLSL program. Some of the uniforms may be controlled using a slider on the webpage.

    The shaders are embedded in the HTML documents themselves. You will edit and submit these directly. We have marked all the shaders you need to write with // TODO#A4 in the source code.

    A few test meshes and textures have been provided in the data directory.

    Example: Phong Shader

    We have provided an example in phong.html. You do not need to modify this file, but it may be useful as a reference. Note the built-in uniforms and attributes provided by Three.js as well as those we have provided via JavaScript. Uniforms include the model, view, and projection matrices. Attributes include vertex positions and normals (in local object space). See also how the "Roughness" slider changes the roughness uniform with a corresponding change in the rendering. (This is the inverse of the Phong exponent; we have chosen to display it this way so that moving the slider to the right has a similar effect to doing the same with the microfacet shader.)

    Also note how the lights are handled. There is one uniform array for the positions and another for the colors. The constant NUM_LIGHTS gives the number of lights.

    Note that this example performs the bulk of its computations in eye space, as this is the most convenient for the provided set of built-in uniforms. We recommend you do the same. Similarly, while we have provided a toggle between fixing light positions to world space and fixing light positions to eye space, in the shader the light position uniforms will always be provided in eye space.

    Problem 1: Microfacet Shader

    The Blinn-Phong shading implemented in the example shader gives OK results -- but it's far from the reality of light reflection. For more realistic appearance, we should use more sophisticated models based on the microfacet framework (like the one used in the Ray 1 extra credit).

    You will implement a microfacet-based lighting model that includes diffuse reflection and also models light that reflects specularly from the top surface of the material. The microfacet model you're required to implement uses the Beckmann distribution to model the roughness of the surface. You are supposed to implement the overall fragment shader, but we are providing implementations of the \(F\), \(D\), and \(G\) functions. This model is defined in terms of its bidirectional reflectance distribution function (BRDF), and the shading equation for a single light source is as follows:

      \[\frac{I_L}{r^2} \;\left(\frac{R_d}{\pi} + R_m\;f_r(\omega_i, \omega_o, n)\right) \max ( n \cdot \omega_i , 0 ) \]
    where \(r\) is the distance from the shaded point to the light. These computations should be repeated for each visible light in the scene and the contributions summed.

    The term in parentheses is the full BRDF of the surface, including the Lambertian BRDF (which is a constant, since diffuse reflection sends light equally in all directions) and the specular BRDF \(f_r\). You will need to look up the diffuse reflectance \(R_d\) in uniform sampler2D diffuseTexture, which you can do using the texture2D function that is built into GLSL. The specular reflectance \(R_m\) represents the fraction of light reflected by the specular component in each color channel; it is set to 1 in this case. The value of the function \(f_r(\omega_i, \omega_o, n)\) is a scalar. \(I_L\) is the color of the light in the lightColors array. The specific microfacet model we use comes from this paper and is the product of several factors:
      \[ f_r(\omega_i,\omega_o,n) \; = \; \frac{F(\omega_i,h)G(\omega_i,\omega_o,h)D(h)}{4|\omega_i \cdot n| |\omega_o \cdot n|} \]

    \(\omega_i\): unit vector in the direction of the incoming ray.

    \(\omega_o\): unit vector in the direction of the outgoing ray.

    \(n\): unit normal vector of the surface at the shaded point.

    \(h\): normalized half vector of \(\omega_i\) and \(\omega_o\):

      \[ h(\omega_i, \omega_o) \; = \; \frac{\omega_i + \omega_o}{|\omega_i + \omega_o|} \]

    The three terms \(F\), \(D\), and \(G\) are the Fresnel factor, the slope distribution, and the shadowing/masking factor, and their particulars are given below. We have already implemented the three functions for you; the following is for reference.

    • \(F(\omega_i,h)\): This is the Fresnel term for unpolarized light, and only reflection (not refraction) is considered in this assignment.
        \[ F(\omega_i, h) \; = \; \frac{1}{2}\frac{(g-c)^2}{(g+c)^2}\left(1 \; + \; \frac{(c(g+c)-1)^2}{(c(g-c)+1)^2}\right)\]
      where \(g \; = \; \sqrt{\frac{n_t^2}{n_i^2}-1+c^2}\) and \(c \; = \; |\omega_i \cdot h|\). \(n_t\) and \(n_i\) are the refractive indices of the material and air, respectively; \(n_i\;=\;1.0\). Note that when computing the value of \(g\), it is possible to encounter a negative number under the square root in some edge cases; in this case, set \(F\;=\;1\).
    • \(D(h)\): This factor comes from the microfacet distribution function. We use the Beckmann distribution.
        \[ D(h) \; = \; \frac{\chi^{+}(h \cdot n)}{\pi\alpha_b^2\cos^4\theta_h} \; \exp\left(\frac{-\tan^2\theta_h}{\alpha_b^2}\right) \]
      where \(\theta_h\) denotes the angle between \(h\) and \(n\); \(\chi^{+}(a)\) is the positive characteristic function (one if \(a>0\) and zero if \(a \leq 0\) ). \(\alpha_b\) is the width parameter of the Beckmann distribution and can be controlled by the slider at the top.
    • \(G(\omega_i,\omega_o,h)\): This is the Smith shadowing-masking term; its form depends on the choice of microfacet distribution function \(D(h)\). For the Beckmann model, we use a rational approximation calculated by Walter et al. \(G(\omega_i, \omega_o,h)\) can be written as the product of two parts:
        \[ G(\omega_i, \omega_o, h) = G_1(\omega_i,h)G_1(\omega_o,h) \]
      For the Beckmann distribution,
        \[ G_1(v,h) \; = \; \chi^{+}\left(\frac{v\cdot h}{v\cdot n}\right) \begin{cases} \frac{3.535a\;+\;2.181a^2}{1\;+\;2.276a\;+\;2.577a^2}, & \text{if}\ a < 1.6 \\ 1 & \text{otherwise} \end{cases} \]
      where \(a=(\alpha_b\tan\theta_v)^{-1}\) and \(\theta_v\) is the angle between \(v\) and \(n\).

    Implement the shading equation in fragmentShader in microfacet.html. Use vertexShader to pass any data you will need to the fragment shader.
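
    To give a sense of how the pieces fit together, here is a rough sketch of such a fragment shader. The uniform, varying, and helper-function names below (other than diffuseTexture, lightColors, and NUM_LIGHTS, which are named above) are assumptions for illustration; use the actual names and signatures provided in microfacet.html, in particular the provided F, D, and G implementations.

    // Sketch only: names other than diffuseTexture, lightColors, and NUM_LIGHTS are assumptions.
    // Assumes eye-space varyings and that the framework supplies the F, D, G helpers
    // (prototypes below; actual signatures may differ) and a precision declaration.
    uniform sampler2D diffuseTexture;
    uniform vec3 lightPositions[NUM_LIGHTS];   // light positions in eye space (assumed name)
    uniform vec3 lightColors[NUM_LIGHTS];

    varying vec3 vNormal;     // eye-space surface normal
    varying vec3 vPosition;   // eye-space fragment position
    varying vec2 vUv;         // texture coordinates

    const float PI = 3.14159265358979;

    float F(vec3 wi, vec3 h);                  // Fresnel (provided by the framework)
    float D(vec3 h, vec3 n);                   // Beckmann distribution (provided)
    float G(vec3 wi, vec3 wo, vec3 h, vec3 n); // shadowing-masking (provided)

    void main() {
        vec3 n  = normalize(vNormal);
        vec3 wo = normalize(-vPosition);       // toward the eye (camera at the eye-space origin)
        vec3 Rd = texture2D(diffuseTexture, vUv).rgb;
        float Rm = 1.0;                        // specular reflectance, set to 1 here

        vec3 result = vec3(0.0);
        for (int i = 0; i < NUM_LIGHTS; i++) {
            vec3 toLight = lightPositions[i] - vPosition;
            float r2 = dot(toLight, toLight);  // squared distance to the light
            vec3 wi = normalize(toLight);
            vec3 h  = normalize(wi + wo);      // half vector

            // Specular BRDF: F * G * D / (4 |wi . n| |wo . n|), guarded against division by zero.
            float fr = F(wi, h) * G(wi, wo, h, n) * D(h, n)
                     / max(4.0 * abs(dot(wi, n)) * abs(dot(wo, n)), 1e-4);

            result += (lightColors[i] / r2) * (Rd / PI + Rm * fr) * max(dot(n, wi), 0.0);
        }
        gl_FragColor = vec4(result, 1.0);
    }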

    Problem 2: Environment Mapping

    You used point lights in the Raytracer and Scene assignments. Real-world scenes usually have much more complex lighting conditions than that, and environment mapping is a clever technique for capturing complex lighting in a texture. The idea of an environment map is that the illumination from the environment depends on which direction you look, and an environment map is simply a texture you look up by direction to answer the question "how much light do we see if we look in that direction?". Your task here is to implement specular reflection under a given environment map.

    In OpenGL, there is a special kind of texture called a cube map, which stores images of the distant environment. A cube map consists of six faces, and each of them has a 2D texture that shows the environment viewed in perspective through one face of the cube. You can sample a cube map with a 3D vector in order to get the environment light intensity in the direction that vector is pointing. See section 11.6 of the book.

    In this assignment we have already loaded a cube map for you as uniform samplerCube environmentMap. Similarly to a 2D texture, you can look up into a cube map using the textureCube function that is built into GLSL.

    Your task is to implement specular reflection under environment lighting. Given a viewing direction and a normal direction, we can follow the written part (i.e., Mirror Reflection) to compute the reflection direction, and then use it to sample the cube map. You may assume the environment map is fixed in eye space; i.e. the mouse rotates the scene relative to the environment map, but the environment map remains fixed relative to the camera. Implement the vertexShader and fragmentShader in environment.html.
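
    Here is a minimal sketch of the fragment-shader side. reflect and textureCube are GLSL built-ins and environmentMap is the uniform named above; the varying names are assumptions and should match whatever your vertex shader passes.

    // Sketch only: specular environment lookup, working in eye space.
    uniform samplerCube environmentMap;

    varying vec3 vNormal;     // eye-space normal (assumed varying name)
    varying vec3 vPosition;   // eye-space position (assumed varying name)

    void main() {
        vec3 n = normalize(vNormal);
        vec3 v = normalize(vPosition);          // from the eye (at the origin) toward the fragment
        vec3 r = reflect(v, n);                 // mirror-reflection direction
        gl_FragColor = textureCube(environmentMap, r);
    }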

    Given the way this problem loads textures, you will likely need to run a server due to security restrictions. An easy way to do this is to install Python if you do not already have it and run from your project directory the following command that corresponds to your Python version:

    • Python 2: python -m SimpleHTTPServer 8000
    • Python 3: python -m http.server 8000
    Once the server is running, visit http://localhost:8000/environment.html.

    Problem 3: Displacement mapping

    Building complex shapes in 3D is hard work, but one easy-to-use modeling tool that is good for adding detail to a smooth surface is displacement mapping. The idea is to use a height function \(h(u,v)\) whose graph is a height field that shows the details you'd like, store it in an image, use texture coordinates to establish \((u,v)\) coordinates on the surface, and then displace each point of the surface along the surface normal at that point according to the height function.

    In equations, given a parametric surface defined by a function

    \[p : \mathbb{R}^2 \rightarrow \mathbb{R}^3\]

    we can define the displaced surface by, for any given \((u,v)\), moving the point \(p(u,v)\) along the normal according to \(h(u,v)\):

    \[p_d(u,v) = p(u,v) + \alpha \, h(u,v) \, \hat{n}(u,v)\]

    where \(\hat{n}\) is the unit normal vector to the surface at \(p(u,v)\) and \(\alpha\) is a scale factor used to control how strong a displacement to apply. In this part of the assignment we will implement a vertex shader that displaces a surface defined by a triangle mesh according to this scheme.

    In a vertex shader it is simple to compute the displaced position of a vertex, since \(p(u,v)\) and \(\hat{n}(u,v)\) are given as vertex attributes, and \(h(u,v)\) can be obtained by a texture lookup. However, as you'll gather from the first written problem in this assignment, displacing the surface also changes the normal to the surface, and this has a very important effect on shading.

    The derivation of the normal to a displaced surface in 3D is much like that for the curve in 2D from that earlier problem; it just has two partial derivatives instead of one. To compute the normal to the surface defined by \(p\), we can compute two tangent vectors:

    \[\frac{\partial p}{\partial u} (u,v) \qquad \text{and} \qquad \frac{\partial p}{\partial v} (u,v)\]

    These are both perpendicular to the surface normal, so as long as they are not parallel (which would imply our surface is not really a surface) their cross product points along the surface normal. And if we can work out the tangents to the displaced surface, we can also compute the normal to the displaced surface.

    The derivatives of the surface defined by \(p_d\) are:

    \[t_u(u,v) = \frac{\partial p_d}{\partial u} = \frac{\partial p}{\partial u} + \alpha \frac{\partial h}{\partial u} \hat{n} + \alpha h \frac{\partial \hat{n}}{\partial u}\] \[t_v(u,v) = \frac{\partial p_d}{\partial v} = \frac{\partial p}{\partial v}+ \alpha \frac{\partial h}{\partial v} \hat{n} + \alpha h \frac{\partial \hat{n}}{\partial v}\]

    The derivative of \(\hat{n}\) (a quantity related to the curvature of the surface) is actually pretty complex to compute, but that's OK, because we'll assume that \(h\) is small and/or curvature is low as an excuse for discarding that last term, leaving us with: \[t_u(u,v) = \frac{\partial p}{\partial u} + \alpha \frac{\partial h}{\partial u} \hat{n}\] \[t_v(u,v) = \frac{\partial p}{\partial v} + \alpha \frac{\partial h}{\partial v} \hat{n}\]

    and we can compute the (unnormalized) displaced normal as

    \[n_d(u,v) = t_u(u,v) \times t_v(u,v)\]

    which you should normalize to obtain the unit normal \(\hat{n}_d\).

    In order to do this computation in a vertex shader, we need to have the (non-unit) tangents \(\frac{\partial p}{\partial u}\) and \(\frac{\partial p}{\partial v}\) available. You can't compute them in isolation at a vertex from the usual point, normal, and texture coordinates, because these derivatives are about how the position and texture coordinates change across the surface. For this reason, these tangents need to be computed from the mesh and passed as vertex attributes, alongside the position, normal, and texture coordinates.

    In this assignment, the computation of the attributes derivU and derivV is handled by the framework (the code is in js/BufferGeometryUtils.js) by computing these tangents per-triangle and averaging them at vertices using the algorithm described here.

    To calculate \(\frac{\partial h}{\partial u}\) and \(\frac{\partial h}{\partial v}\), you will need to approximate the partial derivatives of a texture. You can use a finite difference approximation to do so, which for the partial derivative with respect to \(u\) is:

    \[\frac{\partial h(u,v)}{\partial u} = \frac{h(u+ \Delta u, v) - h(u - \Delta u, v)}{2 \Delta u}\]

    where \(\Delta u\) is a small change in \(u\). It is important to choose a \(\Delta u\) that is small enough to capture local differences but large enough to sample the height at neighboring texels. A value of 0.001 for both \(\Delta u\) and \(\Delta v\) will work for the examples in this assignment.
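
    As a small sketch of this computation (the sampler name displacementMap is an assumption; use whatever uniform the framework provides for the height map, and whichever channel holds the height):

    // Central finite differences of the height map (sketch; to be used from the vertex shader).
    uniform sampler2D displacementMap;   // assumed uniform name for the height map

    const float DELTA = 0.001;           // step in texture coordinates, as suggested above

    float height(vec2 uv) {
        return texture2D(displacementMap, uv).r;   // height stored in the red channel (assumed)
    }

    vec2 heightDerivatives(vec2 uv) {
        float dhdu = (height(uv + vec2(DELTA, 0.0)) - height(uv - vec2(DELTA, 0.0))) / (2.0 * DELTA);
        float dhdv = (height(uv + vec2(0.0, DELTA)) - height(uv - vec2(0.0, DELTA))) / (2.0 * DELTA);
        return vec2(dhdu, dhdv);
    }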

    For this problem you'll find all the work happens in the vertex shader; the fragment shader can remain entirely unchanged from the basic Phong shading implementation. Implement the shaders marked vertexShader and fragmentShader in displacement-bump.html. In the vertex shader, compute the displaced vertex position and store it in gl_Position. Also compute the displaced unit normal to the displaced surface, and store it in vNormal.
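
    Putting the pieces together, a vertex-shader sketch might look like the following. The attributes position, normal, and uv and the matrices are the Three.js built-ins mentioned earlier, and derivU/derivV are the tangent attributes described above; the uniform name alpha for the displacement scale and the height/heightDerivatives helpers from the previous sketch are assumptions.

    // Sketch of the displacement vertex shader (object-space displacement, eye-space normal output).
    attribute vec3 derivU;    // dp/du, computed by the framework
    attribute vec3 derivV;    // dp/dv

    uniform float alpha;      // displacement scale factor (assumed name)

    varying vec3 vNormal;     // displaced unit normal, passed on to the fragment shader
    varying vec2 vUv;

    void main() {
        vec3 nHat = normalize(normal);         // unit surface normal (built-in attribute)
        float h   = height(uv);                // height-map lookup (helper from the sketch above)
        vec2  dh  = heightDerivatives(uv);     // (dh/du, dh/dv)

        // Displaced position: p_d = p + alpha * h * n_hat
        vec3 pd = position + alpha * h * nHat;

        // Approximate displaced tangents (curvature term dropped, as discussed above).
        vec3 tu = derivU + alpha * dh.x * nHat;
        vec3 tv = derivV + alpha * dh.y * nHat;

        // Displaced normal: normalized cross product, transformed to eye space for Phong-style shading.
        vNormal = normalize(normalMatrix * cross(tu, tv));
        vUv = uv;

        gl_Position = projectionMatrix * modelViewMatrix * vec4(pd, 1.0);
    }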

    You can test your implementation on these models and height maps:

    • Higher-resolution map of the Earth: mesh: data/meshes/sphere_highres.obj; texture: data/textures/earthmap1k.jpg; displacement map: data/textures/earthbump1k.jpg
    • Sine wave: mesh: data/meshes/square.obj; texture: none; displacement map: data/textures/sinbump1k.jpg

    Problem 4: Bump mapping

    Displacement mapping is great, but it requires a dense mesh to work; sometimes this means too many triangles, leading to poor performance. But of the two things our displacement shader changes--the vertex position and the vertex normal--it's the normal that has the biggest effect on surface appearance when displacements are small. This leads to the idea of not bothering with displacement, but still using the normal to the displaced surface for shading.

    This technique is known as bump mapping. The idea is to use a low resolution surface, and carry out the displacement normal computation at the fragment stage, so that the normal used for shading varies from fragment to fragment across a triangle.

    For this last problem, take the normal-vector computation you implemented for displacement mapping and move it to the fragment shader. The vertex shader doesn't have to do much; just transform the position, normal, and tangents to world space and pass them as varyings. Implement the shaders marked bumpVertexShader and bumpFragmentShader in displacement-bump.html. You can toggle between displacement mapping and bump mapping using the control toward the top of the screen.
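
    A sketch of the fragment-side computation (the varying and uniform names are assumptions; the height/heightDerivatives helpers are the same idea as in the displacement sketch, now evaluated per fragment):

    // Bump-mapping sketch: same normal computation as before, but per fragment.
    // Assumes the vertex shader passed the (world- or eye-space) normal and tangents as varyings,
    // and that the height/heightDerivatives helpers are also defined in this fragment shader.
    varying vec3 vGeomNormal;   // interpolated geometric normal (assumed name)
    varying vec3 vDerivU;       // interpolated dp/du
    varying vec3 vDerivV;       // interpolated dp/dv
    varying vec2 vUv;

    uniform float alpha;        // displacement scale factor (assumed name)

    void main() {
        vec3 nHat = normalize(vGeomNormal);
        vec2 dh   = heightDerivatives(vUv);

        vec3 tu = vDerivU + alpha * dh.x * nHat;
        vec3 tv = vDerivV + alpha * dh.y * nHat;
        vec3 nShade = normalize(cross(tu, tv));   // normal used for shading

        // ... shade as in the Phong example, using nShade in place of the interpolated normal ...
        gl_FragColor = vec4(0.5 * nShade + 0.5, 1.0);   // placeholder: visualize the normal
    }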

    You can test your implementation on these sets of models and height maps:

    • Higher-resolution map of the Earth: mesh: data/meshes/sphere_highres.obj; texture: data/textures/earthmap1k.jpg; displacement map: data/textures/earthbump1k.jpg
    • Lower-resolution map of the Earth: mesh: data/meshes/sphere.obj; texture: data/textures/earthmap1k.jpg; displacement map: data/textures/earthbump1k.jpg

    Exams

    There will be an evening midterm and a final exam.

    The exams are closed book, but you're allowed to bring one letter-sized piece of paper with writing on both sides, to avoid the need to memorize things.

    Old Exams

    Here are links to exams from previous versions of this class. They can be useful for studying, because they give an indication of the style and scope of problems that may be expected. Note that they are not a useful reference for topics to expect, because the content of the course has changed from year to year.

    About CS4620

    Questions, help, discussion: The instructors are available to answer questions, advise on projects, or just to discuss interesting topics related to the class at office hours and by appointment as needed. For electronic communication we are using Piazza (handy link at the top of this page).

    Practicum: In the optional practicum course, CS4621/5621, you will get a more in-depth exposure to the course material and complete an additional project. Students taking the practicum will also be required to attend lectures most weeks during the Monday meeting time for CS4621/5621.

    Academic integrity: We assume the work you hand in is your own, and the results you hand in are generated by your program. You are welcome to read whatever you want to learn what you need to do the work, but we do expect you to build your own implementations of the methods we are studying. If you are ever in doubt, just include a citation in your code or report indicating clearly and specifically where some particular idea or implementation came from, whether it be a classmate, a web site, another piece of software, or anything—this always maintains your honesty, whether the source was used in a good way or not.

    The principle is that an assignment is an academic document, like a journal article. When you turn it in, you are claiming that everything in it is your original idea (or is original to you and your partner, if you are handing in as a pair) unless you cite a source for it.

    School can be stressful, and your coursework and other factors can put you under a lot of pressure, but that is never a reason for dishonesty. If you feel you can't complete the work on your own, come talk to the professor or the TAs, or your advisor, and we can help you figure out what to do. Think before you hand in!

    Clear-cut cases of dishonesty will result in failing the course.

    For more information see Cornell's Code of Academic Integrity.

    Collaboration: You are welcome (encouraged, even) to discuss projects among yourselves in general terms. But when it comes to writing up the homeworks or implementing the projects, you need to be working alone (or only with your partner if you are doing a project as a pair). In particular, it is never OK for you to see another student's homework writeup or another team's program code, and certainly never OK to copy parts of one person's or team's writeup, code, or results into another's, even if the general solution was worked out together.

    Books

    Textbook:


    Marschner & Shirley,
    Fundamentals of Computer Graphics
    fourth edition
    (Available online in Cornell library two ways: 1 2)

    Supplemental Books and Materials:


    Gortler,
    Foundations of 3D Computer Graphics
    first edition