CS4620 Introduction to Computer Graphics

CS4621 Computer Graphics Practicum

Cornell University

MWF 2:30pm, Hollister B14

F 3:35pm, Philips 203 [4621 only]

Instructor: Steve Marschner


Eston Schweickart (Ph.D. TA, ers@cs.cornell.edu)
Rundong Wu (Ph.D. TA, rw489@cornell.edu)
Balazs Kovacs (Ph.D. TA, bk472@cornell.edu)
Nicolas Savva (M.S. TA, nsavva@graphics.cornell.edu)
Deedy Das (M.Eng. TA, dd367@cornell.edu)
Jack Hessel (Ph.D. TA, jmh563@cornell.edu)
Cristian Zaloj (Ugrad TA, cz68@cornell.edu)
Victoria Dye (Ugrad TA, ved8@cornell.edu)
Stephen McDowell (Ugrad TA, sjm324@cornell.edu)
Eric Gao (Ugrad TA, emg222@cornell.edu)
Alex Spitzer (Ugrad TA, aes368@cornell.edu)
Andrew Wolfers (Ugrad TA, asw225@cornell.edu)
Ryan Hall (Ugrad TA, rth45@cornell.edu)
David Lee (Ugrad TA, dkl53@cornell.edu)

Office Hours


date topic reading assignments
27Aug intro slides Ch. 1, Ch. 2  
29Aug triangle meshes | no 4621 slides Ch. 12: up to 12.1.4 Mesh out
1Sep —Labor Day—    
3Sep Blender Mesh Tutorial slides    
5Sep History of CG | no 4621   Ray1 out
8Sep ray tracing intersection slides Ch. 4: up to 4.4 Mesh due
10Sep ray tracing shading slides Ch. 4: 4.5 to end  
12Sep shading and textures slides | 4621 intro starter ideas slides    
15Sep interpolation slides | transformations slides    
17Sep 2D transformations slides Ch. 5, Sec. 6.1  
19Sep hierarchies slides Sec. 12.2  
22Sep 3D transformations slides Ch. 6: 6.2 to end Ray1 due
24Sep perspective slides   Scene out
26Sep viewing slides Ch. 7  
29Sep viewing (cont'd.)    
1Oct —no class (Gates dedication)—    
3Oct rasterization slides Ch. 8: up to 8.1  
6Oct graphics pipeline slides Ch. 8: 8.2 to end  
8Oct images slides Ch. 3: up to 3.3  
10Oct GLSL slides slides slides   Scene due, Shaders out
13Oct —Fall Break—    
15Oct OpenGL slides    
17Oct games with textures slides  
20Oct images | antialiasing slides    
21Oct Midterm: 7:30pm, Phillips 101    
22Oct compositing slides Sec 3.4  
24Oct spline curves slides Ch. 15: up to 15.5 Splines out
27Oct spline curves Ch. 15: up to 15.6.2 Shaders due
29Oct spline curves    
31Oct surfaces slides    
3Nov surfaces    
5Nov subdivision slides    
7Nov subdivision   Animation out
10Nov animation slides   Splines due
12Nov animation notes    
14Nov animation    
17Nov animation | ray tracing    
19Nov ray tracing acceleration slides    
21Nov reflection & illumination slides   Ray2 out
24Nov advanced ray tracing slides   Animation due
26Nov —Thanksgiving Break—    
28Nov —Thanksgiving Break—    
1Dec Prof. Greenberg: TBA    
3Dec Prof. James: TBA    
5Dec Prof. Bala: TBA   Ray2 due
11Dec Final exam, 7pm, Barton Hall (west section)    
15Dec Final presentations [4621] (11:00-14:00 - Gates 114)    


There will be 7 projects during the semester, due approximately every two weeks.

Due: Monday 8th September 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.


In this assignment you will learn about the most widely used way to represent surfaces for graphics: triangle meshes. A triangle mesh is basically just a collection of triangles in 3D, but the key thing that makes it a mesh rather than just a bag of triangles is that the triangles are connected to one another to form a seamless surface. The textbook and lecture slides discuss the data structures used for storing and manipulating triangle meshes, and in this assignment we will work with the simplest kind of structure: a shared-vertex triangle mesh, also known as an indexed triangle mesh.

Your job in this assignment is to write a simple mesh generation and processing utility that is capable of building triangle meshes to approximate some simple curved surfaces, and also can add some information to existing meshes. It reads and writes meshes stored in the popular OBJ file format, a text format with one line for each vertex and face in the mesh.

Building a pyramid

The written part of this assignment is a warmup for generating more complex meshes. Suppose we wish to generate a mesh (with just vertex positions, no normals or texture coordinates) for a square pyramid (like the one at Giza). The base is a square in the \(x-z\) plane, going from \(-1\) to \(1\) on each axis and the apex is at \((0,1,0)\). Write out:

  • The coordinates of the 5 vertices
  • The indices of the 6 triangles (you will need two for the base).

The answer consists of 15 floating point numbers (3D coordinates of the vertices) and 18 integers (indices into the list of vertices, 3 for each triangle).

To test out your solution, type it into text file with the vertices, one per line, and then the triangles, also one per line. Precede each vertex with "v" and each triangle with "f". Note that the indices for the vertices, making up a particular triangle face, should be listed in a counter-clockwise order as viewed from the outside of the given mesh. And one more thing: add one to all the vertex indices, so that the count starting from 1. (Caution! this is not the same as what we do in the data structures inside the program!) For example, a single triangle in the \(x-z\) plane, facing towards \(+y\) is:

v 0 0 0
v 0 0 1
v 1 0 0
f 1 2 3

This text file is now an OBJ file! Read it into one of the mesh viewers described below, take a couple of screen shots to show that it works, and include them with your writeup.

The OBJ files written by the framework are a little more complicated because they can contain additional data at the vertices: normals and texture coordinates.


The utility "meshgen" is a mesh-generation program that runs at the command line. Its usage is:

  meshgen [-n] [-m] [-r] sphere|cylinder|torus|cube -o <outfile>.obj
  meshgen [-nv] [-opt] -f <infile>.obj -o <outfile>.obj

The first required argument is the mesh specifier, and the second is the output filename. If the mesh specifier is one of the fixed strings "sphere", "cylinder", "cube", or "torus", a triangle mesh is generated that approximates that shape, with the number of triangles generated is controlled by the optional -n and -m options (details below), and written to the output file.

A triangle mesh that is an approximation of a smooth surface should have normal vectors stored at the vertices that indicate the direction normal to the exact surface at that point. When generating the predefined shapes, you generate points that are on the surface, and for each you should also calculate the normal vector that is perpendicular to the surface at that point.

If the mesh specifier is -f followed by the name of an OBJ file, the mesh in that file is read. If the -nv flag is given, any vertex normals are discarded and new vertex normals appropriate for a smooth surface are computed from the triangle geometry as described below. Finally, the mesh is writen to the output file.

Details of predefined shapes


The cylinder has radius 1 and height 2 and is centered at the origin; its axis is aligned with the \(y\) axis. It is tesselated with \(n\) divisions around the outer surface. The two ends of the cylinder are closed by disc-shaped caps. The vertices around the rims of the cylinder are duplicated to allow discontinuities in the normals and texture coordinates. Texture coordinates on the outer surface are like the sphere in \(u\), running from 0 to 1 in a counterclockwise direction as viewed from the \(+y\) direction, and run from 0 to 0.5 in \(v\), increasing in the \(+y\) direction. The texture coordinates for the two caps are circles inscribed in the upper-left (for the \(-y\) cap) and upper-right (for the \(+y\) cap) quadrants of the unit square, with the \(+u\) direction corresponding to the \(+x\) direction in 3D space, and the \(v\) direction arranged to keep the texture right-reading. We have provided a partial implementation of the cylinder.


The sphere has radius 1 and is centered at the origin in 3D coordinates. It is tesselated in latitude-longitude fashion, with \(n\) divisions around the equator and \(m\) divisions from pole to pole along each line of longitude. The North pole is at \((0,1,0)\), the South pole at \((0,-1,0)\), and points on the Greenwich meridian have coordinates \((0,y,z)\) with \(z > 0\). The mesh is generated with vertex normals that are normal to the exact sphere, and with texture coordinates \((u,v)\) where \(u\) depends only on longitude, with \(u=0\) at longitude \(180^\circ\) West and \(u=1\) at \(180^\circ\) East, and \(v\) depends only on latitude, with \(v=0\) at the South Pole and \(v=1\) at the North pole. Each quadrilateral formed by two adjacent longitude lines and two adjacent latitude lines is divided on the diagonal to form two triangles. The vertices along the 180th meridian are duplicated: one vertex has texture coordinate \(u=0\) and the other has texture coordinate \(u=1\), to enable correct wrapping of a tileable texture image across the seam. The vertices at the poles are duplicated \(n+1\) times, to enable nearly-appropriate texture in the row of triangles adjacent to the pole. You may have noticed that the triangles get compressed near the poles. An alternative tesselation approach can be used to create an isotropic mesh (See icosphere for more details. The implementation is left as an extra credit exercise for those who wish to explore this further).

Specs illustration for the cylinder and sphere (for the case of n = 32 and m = 16)
Torus [Extra credit]

The torus has major radius 1 and minor radius \(r\) (controlled by the -r flag with a default of 0.25). Its \(u\) coordinates are like the sphere, and the \(v\) coordinate runs from 0 to 1 with a seam on the inside of the torus, with the direction arranged so that the texture is right-reading from the outside. Like the sphere it has a seam on the \(-z\) half of the \(y\)-\(z\) plane, and it has a similar seam around the inside of the tube; vertices along each seam are duplicated twofold and a single vertex, at the position \((0, 0, -1+r)\), is duplicated fourfold.

    Wireframe view of the shapes (Notice that the default up direction in Blender, as seen here, is the \(+z\) axis whereas the convention used in our description and Meshlab specifies the \(+y\) axis as upwards. You should be aware of this fact and not be perplexed by such a discrepancy).

Computing vertex normals

For a mesh that has vertex positions and triangles, but no vertex normals, one often wants to compute vertex normals so that the mesh can appear smooth. But since the original surface, if there even was one, is forgotten, we need some way to make up plausible normals. There are a number of ways to do this, and we'll use a simple one for this assignment: the normal at a vertex is the average of the geometric normals of the triangles that share this vertex.

Your first thought might be to do this as a loop over vertices, with an inner loop over the triangles that share that vertex:

   for each vertex v
      normal[v] = (0,0,0)
      for each triangle t around v
         normal[v] += normal of triangle t

With the appropriate data structures, this is possible, but in our case there's no efficient way to do the inner loop: our data structure tells us what vertices belong to a triangle, but the only way to find triangles that belong to a vertex is to search through the whole list of triangles. This is possible but would be quadratic in the mesh size, which is bad news for large meshes.

However, it's simple to do it with the loops interchanged:

   for each vertex v
      normal[v] = (0,0,0)
   for each triangle t
      for each vertex v around t
         normal[v] += normal of triangle t
   for each vertex v

This way the inner loop can efficiently visit just the necessary vertices. Nifty!

You should compute normals whenever the -nv flag is turned on; the logic to do this is already in the framework.


The framework is available on a public Git repository: https://github.com/CornellCS4620/Framework.git

If you have not used Git before, you can follow the tutorial here, or ask a TA for assistance. Once you have the code set up on your local machine, we recommend that you create a new branch (named, for example, my-solution), and commit all changes you make to that branch. When releasing future assignments, we will ask you to pull from our updated repository onto your master branch; creating your own solution branch will avoid the headaches of merge conflicts. We encourage you to use version control though we request that you refrain from publicly sharing your class repository.

Setting Up Your Workspace For Eclipse

  • After grabbing the repo and creating your own branch, you may fire up Eclipse to get started with coding.
  • Navigate to and press "File->Import"
  • Choose the option "General->Existing Projects Into Workspace"
  • Press "Next"
  • For the root directory, browse and select the folder CS4620, which may be found directly in the repo's root folder.
  • Press "Finish". You have set up a coding environment, but you're not done yet.
  • Right-click on the CS4620 project in Package Explorer and press "Build Path->Configure Build Path"
  • Navigate to the "Libraries" tab
  • Expand the lwjgl.jar file and select "Native library location"
  • Press "Edit" and enter in your specific path, which will be "CS4620/deps/native/<OS>".

We have marked all the functions or parts of the functions you need to complete in with TODO#A1 in the source code. To see all these TODO#A1’s in Eclipse, select Search menu, then File Search and type “TODO#A1” TODO's in Eclipse can be viewed through the Task List as well. You should only need to modify those the sections marked with TODO unless you want to add other cool mesh generators for extra credit (in which case look at cs4620.util.MeshGen's switch block). All other functionality has been implemented.

Testing your program

  • The entry point of this assignment is cs4620.util.MeshGen
  • Select that Java class in the Package Explorer and press the play button.
  • Unless you have already put in arguments to the program, you should get an output that no arguments have been provided
  • Navigate to and press "Run->Run Configurations..."
  • Pick Java Application
  • Select the tab "Arguments"
  • Type in your arguments into "Program arguments": (eg. "cube -o cube.obj")
  • Run it

Since your program just writes a file full of inscrutable numbers, you need some way to look at the results. For this you can use MeshLab. Please refer to the notes from lecture on Sept 3rd for more information about how to get MeshLab and how to use it to load an OBJ mesh and find out what it contains. (Here are a few more programs that might be worth checking out: Blender, ObjViewer, p3d.in ). Warning: Blender discards the normals stored in the OBJ file and as such is not a good tool to check whether your normals are correct. Caution: Blender by default will rotate your mesh by 90 degrees around the \(x\) axis so that it displays right way up in Blender's \(z\)-is-up world.

Download and unpack the following archive: MeshesTestData.zip

To see if your meshes are generated properly, first load them into MeshLab and examine them with the default flat shading. The surface should be complete without any holes or odd artifacts. Note that as you debug, you can click on a triangle to find out its index and the indices and locations of its three vertices (use the Get Info "i" button in the toolbar).

Viewing triangle information

If you wish to test the normals, select "smooth shading" (the smooth-looking cylinder icon in the toolbar). The surface should look much smoother; it should be smooth but still a bit lumpy looking for coarse tessellation (e.g. -n 24 -m 12) and smooth and perfect looking once you use enough triangles (e.g. -n 64 -m 32).

When testing the texture coordinates, put one of the PNG images from UVTestImages into the same directory as your mesh, and in MeshLab use the "Filter -> Texture -> Set Texture" command and enter the name of the texture file. (If it shows a gray grid that means it didn't find the image.) The full texture should appear on the surface; the sphere with the Earth texture should look appropriate. On the cylinder, the number texture should read right (not mirrored) on all sides and appear undistorted on the top and bottom faces (see images below).

    A textured sphere
    A textured cylinder

Handing in

Submit a zip file containing your solution using CMS

You should include a readme in your zip that contains:

  • You and your partner's names and NetIDs.
  • The solution for the paper and pencil parts of the homework.
  • Anything else you want us to know.
Due: Monday 22nd September 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.


Ray tracing is a simple and powerful algorithm for rendering images. Within the accuracy of the scene and shading models and with enough computing time, the images produced by a ray tracer can be physically accurate and can appear indistinguishable from real images. Cornell pioneered research in accurate rendering. See http://www.graphics.cornell.edu/online/box/compare.html for the famous Cornell box, which exists in Rhodes Hall.

The basic idea behind ray tracing is to model the movement of photons as they travel along their path from a light source, to a surface, and finally to a viewer's eye. Tracing forward along this path is difficult, since it can be unclear which photons will make it to the eye, especially when photons can bounce between objects several times along their path (and in unpredictable ways). To simplify the system, these paths are computed in reverse; that is, rays originate from the position of the eye. Color information, which depends on the scene's light sources, is computed when these rays intersect an object, and is sent back towards the eye to determine the color of any particular pixel.

In this assignment, you will complete an implementation of a ray tracer. It will not be able to produce physically accurate images, but later in the semester you will extend it to produce nice looking images with many interesting effects.

We have provided framework code to save you from spending time on class hierarchy design and input/output code, and rather to have you focus on implementing the core ray tracing components. However, you also have the freedom to redesign the system as long as your ray tracer meets our requirements and produces the same images for the given scenes.

Requirement Overview

This assignment contains a written question portion as well as a programming portion. The programming portion is described below.

A ray tracer computes images in image order, meaning that the outer loop visits pixels one at a time. (In our framework the outer loop is found in the class RayTracer). For each pixel it performs the following operations:

  • Ray generation, in which the viewing ray corresponding to the current pixel is found. The calculations depend on the characteristics of the camera: where it is located, which way it is looking, etc. (in our framework, ray generation happens in the subclasses of Camera. This is already done for you).
  • Ray intersection, which determines the visible surface at the current pixel by finding the closest intersection between the ray and a surface in the scene (in our framework, ray intersection is handled by the Scene class).
  • Shading, in which the color of the object, as seen from the camera, is determined. Shading accounts for the effects of illumination and shadows (in our framework, shading happens in the subclasses of Shader).
In this assignment, your ray tracer will have support for:
  • Spheres and triangle meshes
  • Lambertian and Phong shading with textures
  • Point lights with shadows

Your ray tracer will read files in an XML file format and output PNG images. We have split ray tracing in 4620 into two parts. For now, you will implement basic ray tracing. Later in the semester, after you have learned many more graphics techniques, you will implement other features, such as advanced shaders, faster rendering, and transformations on groups of objects. In this document, we focus on what you need to implement the basic elements of a ray tracer.

Warmup exercise: Eclipse?!

A group of astronauts has just left on their journey to Mars when ground control informs them that there will be a solar eclipse back home, exactly at Noon GMT on the day of the vernal equinox. The astronauts are asked to take a picture of this rare event from their location in space and report back. The situation at the moment they take the picture is as follows:

Note that we use Earth-centered coordinates that match the ones for the sphere in the mesh assignment. So 1 Earth radius is the unit of measure.

  • The sun is positioned 23,679 units away from the Earth (from the center, if you must ask), exactly above the point at latitude and longitude zero. (The intensity of the Sun is 100 000 000 units)
  • The moon is positioned at (0, 1, 56.54) units away from the center of the earth. It has radius 0.27.
  • The spacecraft is at (450, 100.5, 300) and its camera is looking in the direction (-9/13, -2/13, -6/13) (which is the ray through the very center of the image)
  • The onboard camera has the following parameters: up-direction is (0, 1, 0), projectionDistance is 25 units, viewWidth and viewHeight are 5 units. The resolution of the camera is 2048 x 2048 pixels.
Q1. In the astronauts' photo,
  1. What (x, y, z) point on the Earth's surface appears exactly at the center of the image? What distance is that point from the spacecraft? Show that the point is both on the ray and on the surface of the Earth.
  2. What are the latitude and longitude at that point?
  3. Is that point in the shadow of the moon?
Q2. For a person standing at that point,
  1. what's the angle of the sun, measured down from the vertical? (meaning 0 degrees is straight overhead)
  2. If the intensity of the sun is 100 000 000 units, the reflectance of the ground at that point is 0.3, and the camera's exposure was set so that a perfectly white object at the spacecraft's position, facing the sun, has a pixel value of 1.0, then what value appears at the center of the astronauts' photo? (Ignore atmospheric effects and anything else that would complicate matters beyond the capabilities of your Ray 1 ray tracer. Also assume the image size is odd so that there's a pixel exactly at the center.)
Q3. Write an xml scene file that models the above scenario.

Getting Started

A new commit has been pushed to the class Github page in the master branch. We recommend switching to your master branch, pulling this commit, and creating a new branch (e.g. A2 solution) and committing your work there. This new commit contains all the framework changes and additions necessary for this project.

Specification and Implementation

We have marked all the functions or parts of the functions you need to complete with TODO#A2 in the source code. To see all these TODO’s in Eclipse, select Search menu, then File Search and type “TODO#A2”. There are quite a few separate code snippets to complete in this assignment. In order to check your progress as you go along, we recommend implementations in the following order.


An instance of the Surface class represents a piece of geometry in the scene. It has to implement the following method:

  boolean intersect(IntersectionRecord outRecord, Ray ray)

This method intersects the given ray with the surface. Return true if the ray intersects the geometry and write relevant information to the IntersectionRecord structure. Relevant information includes the position of the hit point, the surface normal at the hit point, the texture coordinates at the hit point, the time t of the ray at the hit point, and the surface being hit itself. See the IntersectionRecord class for details.

We ask that you implement the intersection method for the following surfaces:

  • Sphere: The class contains a 3D point center and a double radius. You will need to solve the sphere/ray intersection equations. Be sure to calculate the exact normal at the point of intersection and store it in the IntersectionRecord. Additionally, please compute and store UV-coordinates in the same orientation as Assignment 1 (i.e. the poles should lie on the y-axis, and the seam should intersect the negative z-axis if the sphere is located at the origin). See section 4.4.1 of the book.
  • Triangle: In our system, each triangle belongs to a Mesh, which keeps a reference to a MeshData (the same data structure you used in the previous programming assignment). A triangle stores a vector index of 3 integers containing the indices of its three vertices. If the mesh does not have a per-vertex normal, the triangle stores the normal of the triangle in the norm field. See section 4.4.2 of the book. If the owning Mesh does not have a per-vertex normal, the intersect method should return the triangle normal stored in Triangle. If it does, it should interpolate the normals of the three vertices using the barycentric coordinate of the hit point. Texture coordinates, if available, should be interpolated using barycentric coordinates as well.

The Scene class stores information relating to the composition, i.e. the camera, a list of surfaces, a list of lights, etc. It also contains a scene-file-specified exposure field, which can be used to adjust the brightness of the resulting image. You need to implement the following method:

  boolean intersect(IntersectionRecord outRecord, Ray rayIn, boolean anyIntersection)

The method should loop through all surfaces in the scene (stored in the Surfaces field), looking for intersections along rayIn. This is done by calling the intersect method on each surface and it is used when a ray is first cast out into the scene. If anyIntersection is false, it should only record the first intersection that happens along rayIn, i.e. the intersection with the lowest value of t. If there is such an intersection, it is recorded in outRecord and true is returned; otherwise, the method returns false, and outRecord is not modified. If anyIntersection is true, the method may return true immediately upon finding any intersection with the scene's geometry, even without setting outRecord (this version is used to make shadow calculations faster).

Ray Tracing

You then need to complete the shadeRay method of the RayTracer class. This method contains the main logic of the ray tracer. It takes a ray generated by the camera, intersects it with the scene, and returns a color based on what the ray hits. Fortunately, most of the details are implemented in other methods; all you need to do is call those methods, and this should only take a few lines. For instance, the Scene class contains a method for finding the first object a given ray hits. Details are in the code's comments.

We have provided a normal shader that determines color based on the normal of the surface, as well as some sample scenes that use this shader. Once you have completed the previous sections, you should be able to render the scenes one-sphere-normals.xml and two-boxes-normals.xml correctly. See Appendix A for details on rendering scenes.


A shader is represented by an instance of the Shader class. You need to implement the following method:

  public void shade(Colord outIntensity, Scene scene, Ray ray, IntersectionRecord record)

which sets outIntensity (the resulting color) to the fraction of the light that comes from the incoming direction \(\omega_i\)(i.e. the direction towards the light) and scatters out in direction \(\omega_o\) (i.e. towards the viewer).

We ask you to implement two shaders:

  • Lambertian: This class implements the perfectly diffuse shader. Its diffuse color \(k_d\) is specified by the shader's texture, or by the diffuseColor field if no texture is assigned. The shade function should set outIntensity to:
    • \(k_d \max ( n \cdot \omega_i , 0 ) \frac{k_\ell}{r^2}\)
    where \(n\) is the normal at the intersection point, \(k_\ell\) is the intensity of the light source, and \(r\) is the distance to the light source. See section 4.5.1, though notice the book's definition does not include the physically accurate \(\frac{1}{r^2}\) falloff.
  • Phong: This class implements the Blinn-Phong lighting model. It has a diffuse color component as defined above \(k_d\), as well as specularColor \(k_s\), and exponent \(p\) fields, where the exponent is the "shininess". The shade function should set outIntensity to:
      \((k_d \max(n \cdot \omega_i, 0) + k_s \max(n \cdot h, 0)^p) \frac{k_\ell}{r^2}\)
    only if \(n \cdot \omega_i \geq 0 \), and sets it to \(0\) otherwise. Here, \(h = (\omega_i + \omega_o)/\| \omega_i + \omega_o \|\). See section 4.5.2, but same warning as above.

Notice that these values should be computed once for each light in the scene, and the contributions from each light should be summed, unless something is blocking the path from the light source to the object. Shader.java contains a useful isShadowed() method to check this. Overall, the code for each shade method should look something like:

 reset outIntensity to (0, 0, 0)
   for each light in the scene
     if !isShadowed(...)
       compute color contribution from this light
       add this contribution to outIntensity
     end if 
   end for 

After the shaders are completed, you should be able to render one-sphere.xml, two-boxes.xml, wire-box.xml, and the bunny scenes correctly.


The Shader classes above have a texture field that is by default set to null. If the scene XML file specifies an image texture for the shader, a Texture object will be initialized in this field. If the texture has been loaded, you should use it to determine the diffuse color in the shade method rather than the diffuseColor field.

The Texture class and its subclasses are responsible for using the UV-coordinates of a particular point on the Surface to find this color on a given image (located in the Texture class). You will be implementing the getTexColor(Vector2d texCoord) method for two very similar subclasses of Texture. Both classes convert UV-coordinates into the nearest pixel coordinates corresponding to the image. The difference is in the way they handle UV-coordinates outside of the [0.0, 1.0] range:

  • ClampTexture: This class uses the nearest pixel for UV-coordinates that are out of range. As a result, the colors of the boundary pixels of a texture are extended outwards in UV space.
  • RepeatTexture: This class repeats the texture for out-of-range UV-coordinates. In this case, copies of the texture are tiled across UV space.

The details of how to implement these are provided in the code. Once these are completed, all scenes should render correctly.

Appendix A: Testing

The data directory contains scene files and reference images for the purpose of testing your code. Right click on the RayTracer file in Package Explorer, and choose Run As->Java Application. Without any command line arguments, the ray tracer will attempt to render all scenes in the default scene directory (i.e. data/scenes/ray1). If you want to render specific scenes, you can list the file names individually on the command line. This can be done from the Run dialog (Run Configurations). Choose the Arguments tab, and provide your command line arguments in Program arguments. Additionally, passing in a directory name will render all scenes in that directory.

Note that RayTracer prepends data/scenes/ray1 to its arguments. This means that you only need to provide a scene file name to render it. For example, if you give the argument one-sphere.xml to the RayTracer it will load the scene file data/scenes/ray1/one-sphere.xml and produce an output image in the same directory as the scene file.

If you would like to change the default scene location, there is a -p command line option that allows you to specify a different default path instead of using data/scenes/ray1.

Compare the images you have with the reference images, if there are visually significant differences, consider checking for bugs in your code.

Appendix B: Extra Credit

We've included a Cylinder subclass of Surface, similar to the Triangle and Sphere classes. The class contains center, a 3D point, and radius, height, both real numbers. Assume that the cylinder is capped, i.e., the top and bottom have caps (it is not a hollow tube). Also, assume it is aligned with the z-axis, and is centered around center;
i.e., if center is (\(center_x, center_y, center_z\)), the cylinder's z-extent is from \(center_z-\frac{height}{2}\) to \(center_z+\frac{height}{2}\).

Your task is to implement the intersect method of a mathematical cylinder (NOT a meshed cylinder). Add an XML scene that includes a cylinder to test your implementation (See Appendix C for details on how this should be done--you shouldn't need to change any parser code).

Once this is done, try implementing a cylinder that is arbitrarily oriented. You will need to add two parameters, rotX and rotY, to specify the cylinder's rotation about the \(x\) and \(y\) axes. Note that this will require changing the coordinate system of the incoming ray within the intersect method to account for these rotations. After the coordinate system change, the math used to intersect the ray with the cylinder should be the same as its axis-aligned variant. Finally, build a test scene with a cylinder object.

Appendix C: Framework

This section describes the framework code that comes with the assignment. You do not need to spend time trying to understand it and it is not essential to read this to get started on the assignment. Instead, you can reference it as needed.

The framework for this assignment includes a simple main program, some utility classes for vector math, a parser for the input file format, and stubs for the classes that are required by the parser.


This class holds the entry point for the program. The main method is provided, so that your program will have a command-line interface compatible with ours. It treats each command line argument as the name of an input file, which it parses, renders an image, and writes the image to a PNG file. The method RayTracer.renderImage is called to do the actual rendering.


This class contains an array of pixel colors (stored as doubles) and the requisite code to get and set pixels and to output the image to a PNG file.

egl.math package

This package contains classes to represent 2D and 3D vectors, as well as RGB colors. They support all the standard vector arithmetic operations you're likely to need, including dot and cross products for vectors and addition and scaling for colors.


The Parser class contains a simple parser based on Java's built-in XML parsing. The parser simply reads a XML document and instantiates an object for each XML entity, adding it to its containing element by calling set... or add... methods on the containing object.

For instance, the input:

    <surface type="Sphere">
        <shader type="Lambertian">
            <diffuseColor>0 0 1</diffuseColor>
        <center>1 2 3</center>
results in the following construction sequence:
  • Create the scene.
  • Create an object of class Sphere and add it to the scene by calling Scene.addSurface. This is OK because Sphere extends the Surface class.
  • Create an object of class Lambertian and add it to the sphere using Sphere.setShader. This is OK because Lambertian extends the Shader class.
  • Call setDiffuseColor(new Colord(0, 0, 1)) on the shader.
  • Call setCenter(new Vector3d(1, 2, 3)) on the sphere.
  • Call setRadius(4) on the sphere.

Which elements are allowed where in the file is determined by which classes contain appropriate methods, and the types of those methods' parameters determine how the tag's contents are parsed (as a number, a vector, etc.). There is more detail for the curious in the header comment of the Parser class.

The practical result of all this is that your ray tracer is handed an object of class Scene that contains objects that are in one-to-one correspondence with the elements in the input file. You shouldn't need to change the parser in any way.

What to Submit

Submit a zip file containing your solution organized the same way as the code on Github, along with any additional files for the written questions. Include a readme in your zip that contains:

  • You and your partner's names and NetIDs.
  • Any problems with your solution.
  • Anything else you want us to know.

Upload here (CMS)

Due: Friday 10th October 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.


We were able to make some really neat images with our ray tracer, but, unfortunately, those images took a long time to compute. You might be wondering how your favorite video game or CAD program is able to create 30-120 images every second. In general, ray tracing is not used for these applications because beautiful looking ray-traced images are too expensive to compute in real time.

Instead, we create 2-D raster images of 3-D scenes by using a graphics pipeline that takes advantage of specialized hardware and linear algebra to create images quickly. In this class, we will use a very popular and mature API called OpenGL to help us make 2-D raster images of 3-D scenes. The purposes of this assignment are to introduce you to transformation matricies in a graphics pipeline, start you thinking about hirearchical modeling, and get you some practice with OpenGL.

In this assignment, you will complete some warm-up exercises, implement a traversal of a scene tree, and implement three types of real-time manipulators. Again, we will provide framework code, so your focus is on implementing the more "interesting," graphics-related components. However, you also have some freedom to redesign our system as you see fit.

Requirement Overview

This assignment contains a written question portion as well as a programming portion.

Written exercises: Matricies, Viewing, Scene Graphs, and Manipulators

Q1. 2-D transformations (with 3x3 matricies!)
Consider the following matricies:
    \(A = \left( \begin{array}{ccc} .5 & -0.866 & 0 \\ 0.866 & .5 & 0 \\ 0 & 0 & 1 \end{array} \right) \quad B = \left( \begin{array}{ccc} 1 & 0 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{array} \right) \quad C = \left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 1 \end{array} \right)\)
  1. What do the matricies A, B, and C represent, in English? (i.e. X represents a rotation by Y degrees, etc.)
  2. Consider transforming a 2-dimensional square defined by corner coordinates <(0,0),(0,1),(1,0),(1,1)>. What are the resulting corner coordinates when you transform it by matrix ABC? How about matrix CBA?
  3. Why are these different?
Q2. Big Matrix Stack
In graphics pipelines, we often apply a series of transformations to coordinates in world space to achieve accurate screen coordinates. In this exercise, the position of a single point in world space is given (in a real graphics pipeline, this might correspond to one of three points that defines a triangle) and it's your job to apply the appropriate matricies to answer the following questions.
The parameters of our world/camera/scene are specified in accordance with the notation/vocabulary established in Chapter 7 of the textbook. All points are specified using cartesian coordinates in world space.
  • Bounding planes for the orthographic view volume: [l,r] x [b,t] x [f,n] = [-5,5] x [-5,5] x [-30,-5]
  • Screen size: nx = 1000 pixels, ny = 500 pixels
  • Camera: e = (1,1,0), g = (0,-1,1), up = (1,1,0)
  • Point of interest: p = (5,-20,15)
  • We will use a perspective transformation.
Here, we'll walk through how a point goes from world space into screen space.
  • In your own words, what is camera space (1-2 sentences)? What matrix would you use to transform coordinates from world space to camera space, in this example? What are the homogeneous coordinates of p in camera space?
  • In your own words, what is the perspective transformation? Viewing frustum? How do these concepts relate to the orthographic view volume (2-4 sentences, total)? What are the coordinates of p in camera space after applying the perspective matrix? Is p in the orthographic view volume?
  • In your own words, what is the canonical view volume, and why do we care about it (1-2 sentences)? Finally, what are our (x,y) pixel coordinates? How about our canonical z-coordinate?
Q3. Manipulator Warmup
In this assignment, you're going to be implementing "manipulators." Manipulators allow users to apply transformations to objects in scenes visually. In this question, we will consider a translation manipulator. A manipulator can be thought of as a ray in three-space that corresponds to an object space axis (either x, y, or z) for a given object. One challenge with implementing manipulators is how one maps the actions of users (in a 2-D image plane) to 3-D transformations. We will think of a simplified case -- we are given the coordinates and direction of the manipulator and two click locations on screen. The question we will investigate is: given all of this information, what translation should be applied to the object in question?

Consider a manipulator as a ray with a given origin and direction. Any point on that ray can be described by a parameter t in the usual p = o + td fashion. For a translation manipulator, the goal is to find the positions on the ray that the user clicked (t1 being the first click point, and t2 being the second). Once we've determined the best t1 and t2, we can then apply a translation transformation on the object in the object's frame along the manipulator's axis in the amount (t2-t1).

  • The camera is at the origin pointing in the negative z direction, the y-axis is up (as per usual)
  • The image plane is a square in the z=-3 plane with side length 5 centered at (0,0,-3).
  • The translation manipulator's origin is located at (1,1,-6)
  • The translation manipulator is pointing in the direction of (-1,-1,-1) EDIT: though you should normalize this vector such that your final parameter values match ours
  • The translation manipulator corresponds to the object's object-space x-axis.
  • The user has clicked on world-space coordinates c1 = (1.5,1,-3) and c2 = (-.5,0,-3) [note that these are on the image plane]
Here are some reference images that might be useful for you to visualize the process of determining t1 and t2.

Ray Generation

Point Projection

  • Our first job is to describe the rays outgoing from the eye intersecting the image plane in the places where the user clicked. What are the equations of the two rays in question?
  • Next, we need to find the real world coordinates of the points where these rays intersect the plane defined by the translation manipulator's origin and direction. A point and a direction, however, aren't enough to determine a plane -- we need an additional constraint.

    The correct plane is the one that contains...

    • The manipulator's ray
    • The manipulator's origin
    • A ray perpendicular to the manipulator's ray that has the same origin as the manipulator, and is parallel to the viewing plane

    What are the coordinates of the two points in question?

  • Now we need to find the points on the manipulator ray that are closest to these points. Keeping in mind that points on our ray can be completely described by their associated t parameters, what are t1 and t2?
  • Finally, what is the matrix we would apply to translate the object appropriately in its own coordinate space?

Programming Exercises

Getting Started

A new commit has been pushed to the class Github page in the master branch. We recommend switching to your master branch, pulling this commit, and creating a new branch (e.g. A3 solution) and committing your work there. This new commit contains all the framework changes and additions necessary for this project. For this project, we use lwjgl, so you may need to update some settings of your build to have this external library work properly. We have included the correct jar files, but you may need to configure your native library location. To do this in Eclipse, go to Project -> Properties -> Java Build Path -> Select Libraries -> Select the lwjgl Drop Down Menu -> Modify Native Library Location. Modify this setting so that it matches your OS.

Specification and Implementation

We have marked all the functions or parts of the functions you need to complete with TODO#A3 in the source code. To see all these TODO’s in Eclipse, select Search menu, then File Search and type “TODO#A3”. There are quite a few separate code snippets to complete in this assignment.

This assignment is best broken down into four parts. We highly recommend completing the parts in order.

Part 1: Traversing the Render Tree

Your goal is to apply the appropriate transformation matrices to each object in your scene tree to produce the correct world coordinates for each object, which can then be appropriately transformed into screen coordinates (see written question 2). Our scene graph (more complex ones are possible!) associates a RenderObject at each node in the tree. A RenderObject can be a triangle mesh, a light, or a camera, depending on its associated SceneObject. For the purposes of this part of the assignment, however, the distinction between these different types of RenderObjects is abstracted away.

Switching focus back to the tree, if a node has children, it acts as a group node, and its transformation applies to all members of its group. For instance, if you wanted to rotate a group of objects, instead of rotating each object individually, you'd be smart to apply a rotation matrix to that group's group node, which automatically propagates to all of the children.

As discussed in class, the transformation matrix you'd like to apply to each object is the transformation matrix you multiplicatively "accumulate" by traversing your scene tree from the root node to a given object's node. A naive way of interpreting a scene tree might be to loop over each object, find its corresponding transformation matrix by finding a path from the root to its associated node, and work from there. However, trees are scene data structures that (purposefully) lend themselves to cleaner methods.

Before continuing, it's worth pointing out that there are two matrix multiplication functions in the Matrix4 class, mulBefore and mulAfter. mulBefore adds a matrix that transforms a point before the old matrix, whereas mulAfter adds a matrix that transforms a point after the old matrix. More specifically A.mulBefore(B) does an in-place modification of A, replacing it with AB. A.mulAfter(B) does an in-place modification of A, replace it with BA. The distinction between these two functions can be interpreted as "which frame is the transformation B specified in?" If the transformation corresponding to B is specified in A's frame, then mulBefore is appropriate. If B is specified in A's parent's frame, then mulAfter is appropriate.

Tree Traversal

Your first job is to implement a recursive function that traverses a given scene tree, and propagates group transformations to child nodes. The function you'll need to implement to accomplish this task is rippleTransformations in cs4620.gl.RenderTreeBuilder.java. Your input render tree is stored in a RenderEnvironment object. A render tree has a single node with no parent called the "root".

rippleTransformations should be recursive (or utilize a recursive helper function). First, for a given RenderObject that is not the root node, we need to set mWorldTransform and mWorldTransformIT within that render object. mWorldTransform should be set to the composition of two matrices, AB, where A is the render object's parent's world transform, and B is the render object's local transform (stored in its corresponding SceneObject). mWorldTransformIT is a bit more complex. Recall from class that surface normals are not preserved when applying a transformation matrix, i.e. if you were to apply the same transformation to an object and its corresponding surface normals, the resulting surface would not be normal. Therefore, mWorldTransformIT needs to be set to the matrix you use to transform transform normal vectors to preserve normalcy.

After you've correctly set the two necessary matrices at a given node, you need to traverse through the tree by making a recursive call on each of the RenderObject's children.

Part 2: Updating the Camera's Matrix

Here, we will be working with cs4620.gl.RenderCamera's updateCameraMatrix function, which is called to recompute the viewing transformations whenever the camera moves or one of its parameters is changed. In addition, we will need to know the view size and near and far distances, as well as whether or not this camera utilized a perspective transformation; all this information can be found in the fields of the corresponding SceneCamera. For rendering, we need the viewing transformation (aka. camera transformation), which maps points in world space to points in camera space, and the projection transformation, which maps from camera space to the canonical view volume. (These transformations are discussed in the book and lecture slides.) Actually, in this program we only need the product of the two, so your job is to compute the matrix mViewProjection in the RenderCamera object, which is the product of these two transformations.

The viewing transformation can be computed from the camera's transformation matrix, and there are functions in Matrix4 that will help with construction of the projection matrix. If you're feeling lost, the viewing section of the textbook might be of use.

We recommend that you implement the above before considering the function's parameter, viewportSize, which simply tells you the size in pixels of the image being rendered. You might notice, after implementing the basics of this function, that your view looks rather strange and objects appear stretched out of shape. This is because the aspect ratio (the ratio of width to height) differs between the viewport and the camera, so the image get stretched to fit the viewport. To fix the strange ratios you're seeing, you'll have to adjust the width and height of the view from the one's given in the camera, considering both the size of the viewport (viewportSize) and the image size (iSize).

Part 3: Controlling the Camera

CameraController is the class responsible for controlling the camera in response to user input. There are two kinds of controls: the ones that translate the camera (in its local frame, so that the controls move left/right, front/back, up/down relative to the user's current view), and the ones that rotate the camera (around axes that are aligned to the left/right, up/down, and front/back directions in the camera frame).

The one additional question with rotation is what center point to rotate around. If you are navigating the camera around a scene (walking or flying around), it makes sense to rotate around the viewpoint: you rotate the camera to look around from a stationary viewpoint, or to change direction as you move forward. We call this "flying mode" (selected using 'F'). On the other hand, when you are inspecting some object, it makes sense to rotate around that object, so that if you are looking at it you stay looking at it but see it from a different view. We call this "orbit mode" (selected using 'O').

We have written the code that polls for keyboard and mouse input for you. The way it works is to collect the user's requests for translation along and rotation around the three axes into two 3-vectors, called "motion" and "rotation". For instance, pressing "D" requests a rightward motion of the camera, which is in the positive x direction in the camera's local frame, so this action results in a positive value in motion.x. Similarly, pressing the right arrow, or clicking and dragging to the right, requests a rotation pointing the camera rightwards, which is achieved by a negative rotation around the y axis, so both these actions will result in a negative value in rotation.y. Note that these rotations are measured in degrees. Since the user could conceivably operate many of the controls at once, we just add them all up into these vectors.

Here, we will be working with cs4620.gl.CameraController's rotate and translate functions. Each function has roughly the same inputs and outputs.

  • Matrix4 parentWorld, this camera's parent's world transformation
  • Matrix4 transformation, the matrix that acts as your function's output. Each of your functions will modify this matrix by applying a new transformation to it. Transformations are specified in this camera's frame
  • Vector3 (rotation) or (motion), a vector specifying (the X, Y, and Z axis rotation amounts) or (the X, Y, and Z displacement)
In these functions, the input is in the object's frame -- this distinction should help you decide whether or not to use mulBefore or mulAfter.
Camera Translation
The camera translation function is the easier of the two to implement. This function simply applies a newly created translation to the output matrix, transformation, in the direction/amount specified by the motion input vector.
Camera Rotation
Camera rotation is a bit more tricky, given that we have two distinct camera modes, fly and orbit, that handle rotations differently. We recommend that you get fly working first.

Once fly is working, you should move on to orbit. Here, all rotations are about the origin in world coordinates. You'll need to compute the coordinates of the world's origin in camera space, and then construct your rotation around that point. When everything is working properly, the earth in Earth.xml should appear stationary when the camera is rotated, as it is orbiting around the origin.

Note that there's a bit of ambiguity with respect to rotation ordering here. If the user requests rotations around multiple axes at once (for example, using a diagonal drag of the mouse) it is not clear exactly what we should do. Should we rotate left and then down, or rotate down and then left? It makes a difference because rotations do not commute, though it's a small difference because the rotations for each frame are small in magnitude. We don't particularly care how you deal with this issue, just do something reasonable.

Part 4: Manipulators

The basic idea of manipulators is to give the user a reasonably intuitive tool to adjust any of the many transformations in the scene. The user selects an object by clicking on it, to indicate that he or she wishes to edit that object's transformation. We modify the transformations by applying axis-aligned rotations, translations, or scales to them, and the goal of a manipulator is (a) to give a visual indication of what will happen when you apply one of these transforms and (b) to give a direct "handle" to pull on to indicate the direction and size of the transformation.

There are two ways to modify the transformation. Suppose we want to apply a translation T. We could compose T with the object's current transformation M on the right (M = M * T, where T is the adjustment), which corresponds to translating along an axis in the object's coordinate system, or on the left (M = T * M), which corresponds to translating it along an axis in the parent's coordinate system. When M contains any rotation, these axes will be different. Each is equally easy to do, so we want to provide both options to the user, and the Scene application does this by providing the 'P' key, which toggles "parent mode" on and off. When parent mode is on, adjustments to an object's transformation happen along the axes of the parent space; when it is off ("local mode"), adjustments happen in the object's space.

The code in the framework handles selecting objects, drawing the manipulators, and detecting when the user has clicked on a part of a manipulator (they each have three parts, one for each axis). When the user clicks on a manipulator and drags, the framework code calls your code to compute and apply the transformation. For instance, if an object is selected in translation mode, and the user clicks on the red (x axis) arrow and drags the mouse, the method applyTranslation will be called with axis==Manipulator.Axis.X, with the currently selected camera and object, and the previous and current mouse position. Specifically, the parameters are...

  • int axis, an integer identifying which axis the manipulator corresponds to
  • Matrix4 mViewProjection, which takes coordinates in world space and transforms them into the canonical view volume.
  • RenderObject object, the object that's being directly manipulated (it's children would be as well, if it has them)
  • Vector2 lastMousePos, the starting position of the mouse
  • Vector2 curMousePos, the current position of the mouse
  • (implicitly -- it's a member of the class that you need to use) boolean parentSpace, a boolean indicating whether the transformation should occur in the object space of the parent, or the child

The first step in implementing manipulators is to get the correct transformations, not worrying yet about the size of the transformations. For this, you just need to make a transformation of the right type, along the given axis, with a magnitude that is computed from the mouse delta in some easy way: for instance, translate by the difference in mouse x coordinate times 0.01 (the details don't matter since this is just temporary). Once this works, you will see the object moving in the direction of the manipulator arrow you clicked on, but it will generally be confusing to figure out which way to move the mouse to get the desired result.

The last and final step is to make translation and scaling manipulations "follow the mouse". This means that if you click on the axis and move the mouse along the axis, the point you clicked on should remain exactly under the mouse pointer as you drag. The strategy we recommend is to express the ray corresponding to the user's click and the line corresponding to the manipulation axis both in world space, then use the warmup problem to compute the t values along the axis that correspond to the user's mouse movement. Once you have these values you know how much to translate: it is by the difference of the two values. Once this works, if you click on points that are on the manipulation axis and drag exactly along the axis, the object will exactly follow the mouse.

Translation Manipulator
The translation manipulator, which you considered already in one of the written questions, displays three arrows that represent the x, y, or z axes in the coordinates of the selected transform. If the user clicks and drags along an axis, the resulting translation should exactly follow the mouse. When the drag is not parallel to the selected axis, the translation should follow the mouse as much as possible (hence your projection scheme you worked with in the written questions).

The translation manipulator is the most complicated manipulator of the three, particularly if it is the first one you're implementing, but you've already had some experience dealing with it in the written questions. As before, the idea is that the matrix you construct must be a translation of (t2-t1) along the specified axis.

Scale Manipulator
The scaling manipulator shows the three scaling axes by drawing three lines with small boxes at the ends. This manipulator is very, very similar to the translation manipulator (in fact, until you construct your final matrix, finding the appropriate t1 and t2 are identical, and as such you might find it useful to break up this functionality into its own method). The magnitude of your scale, however, should be based on a ratio of t1 and t2, rather than a difference. Here, you should scale by t1/t2 in the appropriate direction.
Rotation Manipulator
The rotation manipulator displays three circles on a sphere surrounding the origin of the object's coordinates. Each circle is perpendicular to one of the rotation axes, and clicking on that circle and dragging performs a rotation around that axis. Because getting the transformation to follow the mouse is complicated for this manipulator, it’s fine if you just map vertical mouse motion directly to the rotation angle.

The rotation manipulator should be the easiest to implement of the three because we map all specification to simply the change in the mouse position's y coordinate. Then, you should multiply this quantity by some constant (pick something reasonable) to determine the change in the angle of rotation about the selected axis.

Appendix: More Framework Information

Control Guide
  • Arrow Keys + EQ: Camera Rotation
  • WASD + Left Shift + Space: Camera Movement
  • O/F: Orbit/Fly Mode Toggles
  • R/T/Y: Manipulator toggles (rotation, translate, scale)
  • P: Parent space toggle
  • G: Toggle grid on and off
Test Scene Data
Download here!

Due: Monday 10th November 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.


We've had a lot of fun on past assignments creating straight lines and spheres. However, the question remains: how are we supposed to model spaghetti? Or snakes? Curvy letters such as "s" in typography? The answer: splines! In the most general sense, splines are piecewise polynomial functions that specify potentially complex curves mathematically. In this assignment, you'll get a chance to play around with some splines, and ultimately create some really neat objects to add to your scenes.

In this assignment, you will explore representing a B-Spline with several cubic Bézier curves by completing some warm-up exercises, implementing De Casteljau's algorithm, and sweeping an arbitrary 2D spline along another 2D spline through 3D space to create a really neat looking surface. Again, we will provide framework code, so your focus is on implementing the more "interesting," graphics-related components. However, you also have some freedom to redesign our system as you see fit.

Warm Up Exercises

Extending Modified Béziers with Quintic Polynomials

In class, we discussed control points and how they can be used to describe curved geometry. One particular curve type that we were interested in was the cubic Bézier. This type of curve is commonly used in graphics because it is intuitive to control and cubics are generally sufficient for our our main customer, the human visual system. That is to say, humans can hardly distinguish between C2 continuity and higher levels of continuity. In this question, however, we'll assume that we would like some additional control. We will extend and slightly modify the basic cubic Bézier spline so that the curve has some properties that a Bézier cannot.
  1. If we are representing a curve segment as a polynomial of degree n, how many control points do we need to specify? We are going to modify our Bézier to be qunitic (degree 5). How many control points did we need before? How about now?
  2. Here, we would like to modify our spline so that, in addition to having control over \(f(0)\), \(f(1)\), \(f'(0)\), and \(f'(1)\), we'd like control over \(f''(0)\) and \(f(.5)\). To do this, assume we specify two additional control points \(p_4\) and \(p_5\). Define \(p_4\) such that \(f''(0)\) is proportional to \((p_4-p_0)\) in the same way that \((p_1-p_0)\) is proportional to \(f'(0)\) in our unmodified construction. In the original derivation of Béziers, we assumed a constant factor of 3 relating our original proportionalities because it helps things cancel. Similarly, let the constant factor associated with \((p_4-p_0)\) be 2 -- that is that is, \(f''(0) = 2(p_4 - p_0)\). Seperately, let \(p_5\) = \(f(.5)\).

    Give a matrix A such that this new type of curve can be expressed as... \[\textbf{f}(u) = \textbf{u} A P\]

    where u is a vector of \([1, u, u^2, u^3, u^4, u^5]\) and P is a 6 by 2 matrix where rows correspond to control points.

  3. Plot the blending functions of this curve from u=0 to u=1 and include your plot in your response.

B-Spline to Bézier

As we discussed in lecture, the B-spline curve is a series of segments, each defined by a cubic polynomial in the form... \[\textbf{f}_{Bsp}(u) = \left[u^3,u^2,u, 1\right]M_{Bsp}P_{Bsp}\] where... \[M_{Bsp} = \frac{1}{6}\begin{bmatrix} -1 & 3 & -3 & 1 \\ 3 & -6 & 3 & 0 \\ -3 & 0 & 3 & 0 \\ 1 & 4 & 1 & 0 \end{bmatrix}\]

On the other hand, a Bézier spline segment has the form \[\textbf{f}_{Bez}(u) = \left[u^3,u^2,u, 1\right]M_{Bez}P_{Bez}\] where \[M_{Bez} = \begin{bmatrix} -1 & 3 & -3 & 1 \\ 3 & -6 & 3 & 0 \\ -3 & 3 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix}\]

Converting the B-spline control points (which are the ones the user adjusts in the editor, in the programming section) to the Bézier control points (which you will use to approximate the curve) amounts to solving for the matrix M such that if \[P_{Bez} = M P_{Bsp} \] then \[\textbf{f}_{Bez}(u) = \textbf{f}_{Bsp}(u)\] for all u.

You will find that this matrix has nice numbers and it is simple, once you gather the four B-spline control points for a particular segment, to neatly compute the Bézier control points. This will be useful in the programming section. As the final part of this problem, convert B-Spline control points \(p_0, p_1, p_2, p_3 = \langle 0, 0 \rangle, \langle 1, 1 \rangle, \langle 2, 3 \rangle, \langle 6, -2 \rangle\) into their corresponding Bézier control points, and include your plot with your response.

Programming Exercises

Getting Started

A new commit has been pushed to the class Github page in the master branch. We recommend switching to your master branch, pulling this commit, and creating a new branch (e.g. A3 solution) and committing your work there. This new commit contains all the framework changes and additions necessary for this project. For this project, we use lwjgl, so you may need to update some settings of your build to have this external library work properly. We have included the correct jar files, but you may need to configure your native library location. To do this in Eclipse, go to Project -> Properties -> Java Build Path -> Select Libraries -> Select the lwjgl Drop Down Menu -> Modify Native Library Location. Modify this setting so that it matches your OS.

Specification and Implementation

We have marked all the functions or parts of the functions you need to complete with TODO A5 in the source code. To see all these TODOs in Eclipse, select Search menu, then File Search and type TODO A5. There aren't too many of these because we've made all the programming you need to do happen in just two files! These two files are BSpline.java and CubicBezier.java.

This assignment is best broken down into the following parts. We highly recommend completing the parts in order. The overall goal is to create code that lets you sweep a B-Spline along another B-Spline in two dimensions in 3-space. To see exactly what we mean by this, here are some screenshots of the user interface using our solution code.

In general, a "spline sweep" is a function that maps two 2-D splines to a 3-D form. You may think of it as a generalization of a cylinder. In the case of a cylinder, the cross section is a perfect circle, and the spline you sweep along is a straight line. We generalize this idea here. First, think of sweeping that circle along some arbitrary path to generate a different type of shape. Next, think of changing your circle to an arbitrary curve, and sweeping it along the same generalized curve. The resulting shape is a "spline sweep."

Here's an overview of the big pieces you'll need to implement:

  • Tessellation of bezier cubic functions
  • Composition of bezier cubic segements to form a B-Spline
  • The code to generate a triangle mesh for a spline sweep (similar to the mesh assignment)

We've provided you with some basic functionality. The short version: we provide the UI to edit the control points of your spline and to draw the spline sweep surface. The long version: In the interface, there are three basic panels. The leftmost panel edits the cross sectional spline. The center panel edits the sweep spline. Each of these panels also displays normal vectors (blue) and tangent vectors (green). The rightmost panel displays your results. Triangles that face you (i.e. you see the points specified in a CCW order) are red, whereas triangles that don't face you (i.e. you see the points specified in a CW order) are gray. The controls are very similar to the scene assignment's controls. In the additional window, you are given additional functionality, including saving/loading and some display options.

Dealing with splines is a bit tricky in terms of inside-outness. For instance, if I specify a circular spline, who's to say whether or not the normals are facing inward or outward? For this assignment, we will adopt the common convention -- a closed spline that makes a single CCW loop has the enclosed region of the plane on the inside, and the rest on the outside. Its normals point away from the center of the loop. On the other hand, a CW loop has the enclosed region on the "outside". In both cases, as you walk along the curve, the normals point to your right. Similarly, for open splines, the tangents should align with the direction of parametric increase, with the normals pointing to the right.

Cubic Bézier Constructor

Your first job is to complete the unfinished parts of CubicBézier's constructor. This amounts to using your matrix from the write up to convert the input BSpline control points to Bézier control points, and setting p0-p4 appropriately. This part should be little more than typing your answer in, and is a good place to get your feet wet.

Tesselation and De Casteljau

Your next job is to fill in your Bézier curve, given your newly properly converted control points. The motivation for converting the segments to Bézier form is the availability of a simple method for splitting Bézier segments in two. Using de Casteljau's algorithm, one can compute the set of points shown here, given starting points. This image uses u=.5, but the method supports u in the normal range, [0,1].

The pattern here is a simple recurrence in which points at each level are linear combinations of points at the previous level: \[\textbf{p}_{k,i} = (1 - u)\textbf{p}_{k-1,i} + u\textbf{p}_{k-1,i+1}\] \[\textbf{p}_{0,i} = \textbf{p}_i\] where the \(p_i\) with a single index are the control points.

This construction not only gives you a point on the spline for a given u; it also gives you the Bézier control points of the two halves, which can also be thought of as cubic Béziers: they are \(p_0\), \(p_{1,0}\), \(p_{2,0}\), and \(p_{3,0}\) for the left half and \(p_{3,0}\), \(p_{2,1}\), \(p_{1,2}\), and \(p_3\) for the right half.

The simplest way to get the spline on the screen is to use this recurrence to compute f(u) (which is the point \(p_{3,0}\) for a sequence of regularly spaced u values. We encourage you to implement this first, to make sure you have the algorithm right.

A better way, which you should implement for your final handin, is to recursively split the segment, terminating when the control polygon is straight enough. The idea is as follows: Consider an additional parameter epsilon that specifies curve smoothness. Specifically, to approximate a Bézier segment with lines, you can use a single segment if the angles between the edges of your control polygon are both less than half the angle tolerance (given by epsilon) because the tangent to the curve cannot swing more than that much over the segment. If the angles are too big, subdivide and apply the idea recursively. The angles in the two halves will be smaller (less sharp) so ultimately the recursion will terminate for well-behaved curves. But you should also enforce a maximum on the number of levels of recursion (10 is a good choice) to prevent stack overflow in degenerate cases.

Your recursive algorithm should only emit points that are exactly on the spline (so don't add other points you may have computed that aren't on the curve).

The function you need to implement, tesselate(), has three jobs -- it needs to set this.curvePoints, this.curveNormals, and this.curveTangents correctly. As such, you should come up with a way to calculate the tangents to the spline for each point computed by your recursive algorithm. There are a few easy ways to do this, just pick one that is correct and makes sense to you, and explain it briefly in a code comment. From these tangents, compute 2D normal vectors that are perpendicular to the curve.

Your cubic Bézier code should work independently of all other code -- we highly reccomend adding a main() into this class for testing and debugging purposes.

From Béziers to B-Splines: Setting Up Your Béziers

Your first job in moving from Bézier segements to full BSplines is to implement setBéziers() in BSpline. This method constructs an ArrayList of Béziers, approximationCurves, that (unsurprisingly) approximate the input BSpline. Note that the term "approximation" is a bit tricky here. The Bézier segements you specify with control points exactly describe the input B-Spline. The "approximation" here refers to our curve being approximated by line segments.

The spline editor hands you a list of control points, numbered from 0 to N-1, which are provided in the ArrayList controlPoints. Finding which four control points influence a segment of the B-spline is mostly simple: segment i of the curve is influenced by control points i-1, i, i+1, and i+2. Using this definition we can generate N-3 segments from the N control points without falling off the ends of the sequence, which is exactly what we'd like to do in the case of an open curve. However, you may have noticed a boolean called isClosed in your BSpline. (Unsurprisingly) This means you'll need to handle the case of a closed BSpline as well.

To handle closed BSpline segements, you'll need to tack on a few additional BSpline curves. We'll let you figure out exactly how to handle this, but know that you'll need to "wrap around" to the start of your control point list.

Here is also where you may want to handle making all your tangents/normals point in consistent directions across your whole B-Spline.

The Grand Finale, Sweeping Cross Sections Along Curves

Here, you will sweep your cross sectional spline along your sweep spline to produce a triangle mesh.

Your job is to implement build3DSweep which takes in a cross-sectional B-Spline, a sweep B-Spline, a MeshData object to output to, and a scale parameter which controls how big the cross section is with respect to the sweep spline. This function should support arbitrary closed/open splines, i.e. either spline can be opened or closed and it does the "right thing." The function itself acts as you think it might -- it sweeps a scaled version of the cross section spline along the sweep spline, building a triangle mesh along the way. A smart way to specify triangles would be in a similar manner to your sphere mesh from assignment one. In the case of closed curves, creating an extra vertex along the seam (just as you did with the sphere mesh) would be a good idea.

Here are a few specific specification details. Assume that up in your leftmost spline edit pane is +y, that right in the leftmost spline edit pane is +x, and that out of the screen towards you is +z. Call the tangent direction to the sweep spline +t, the normal direction +n, and define +b to be the cross product \(t \times n\). When sweeping, our goal is to map +z to -t, +y to +n and +x to +b. Thinking in terms of coordinate bases like this might be useful when coding up your solution, and it provides a more exact specification of what we're looking for.

For reference in the 3D space, we have provided colored x, y, and z-axis line segments. The camera defaults to looking down the +y direction, and we request that you build your splines in the x-z plane. Red corresponds to the +x direction, blue corresponds to the +z direction, and green corresponds to the +y direction.

The scale parameter corresponds to how big the sweep spline is in comparison to the cross section -- both splines are originally specified in the editing windows whose range is \(\langle -1, 1\rangle \times \langle -1, 1 \rangle\). Use the scale parameter to scale down your cross section spline by that factor.

Extra Credit


What to turn in

  1. Your written solutions
  2. Your BSpline.java and CubicBezier.java

Due: Monday 24th November 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.

Written and programming assignment (PDF) | Lecture notes (PDF)

Due: Friday 5th December 2014 (11:59pm)

Do this project alone or in groups of two, as you prefer. You can use Piazza to help pair yourselves up.

Written assignment (PDF) | Programming assignment (PDF)


There will be an evening midterm and a final exam:

The exams are closed book, but you're allowed to bring one letter-sized piece of paper with writing on both sides, to avoid the need to memorize things.

Old Exams

Here are links to exams from previous versions of this class. They can be useful for studying, because they give an indication of the style and scope of problems that may be expected. Note that they are not a useful reference for topics to expect, because the content of the course has changed from year to year.

About CS4620

Questions, help, discussion: The instructors are available to answer questions, advise on projects, or just to discuss interesting topics related to the class at office hours and by appointment as needed. For electronic communication we are using Piazza (handy link also at the top of this page). CS 4621 and CS 5621 students: sign up for the Piazza page, and add yourself to the cs4621 group in the "Groups" tab of the course page. Please post practicum discussions to the group to help keep the page organized.

Practicum: In the optional practicum course, CS4621/5621, you will get a more in-depth exposure to the course material and complete an additional project. Students taking the practicum will also be required to attend intermittent lectures during the Friday meeting time for CS4621/5621 (to be announced in advance).

Academic integrity: We assume the work you hand in is your own, and the results you hand in are generated by your program. You're welcome to read whatever you want to learn what you need to do the work, but we do expect you to build your own implementations of the methods we are studying. If you're ever in doubt, just include a citation in your code or report indicating where some idea came from, whether it be a classmate, a web site, another piece of software, or anything—this always maintains your honesty, whether the source was used in a good way or not. The principle is that an assignment is an academic document, like a journal article. When you turn it in, you are claiming that everything in it is your original idea (or is original to you and your partner, if you're handing in as a pair) unless you cite a source for it.

School can be stressful, and your coursework and other factors can put you under a lot of pressure, but that is never a reason for dishonesty. If you feel you can't complete the work on your own, come talk to the professor or the TAs, or your advisor, and we can help you figure out what to do. Think before you hand in!

Clear-cut cases of dishonesty will result in failing the course.

For more information see Cornell's Code of Academic Integrity.

Collaboration: You are welcome (encouraged, even) to discuss projects among yourselves in general terms. But when it comes to writing up the homeworks or implementing the projects, you need to be working alone (or only with your partner if you are doing a project as a pair). In particular, it's never OK for you to see another student's homework writeup or another team's program code, and certainly never OK to copy parts of one person's or team's writeup, code, or results into another's, even if the general solution was worked out together. The paper and pencil parts of the projects can be discussed in general terms, but must be worked on alone.



Shirley & Marschner,
Fundamentals of Computer Graphics
third edition

Supplemental Books and Materials:

Foundations of 3D Computer Graphics
first edition