Assignment 3. Animation
The Animation assignment doesn’t have any checkpoints. Work by yourself or with a partner, as always.
In this assignment you will implement mesh animation with linear blend skinning. For input data we have a few simple test scenes originating from the glTF sample models collection, a few animations we built in Blender (where you can make more!) and for more complex and realistic motion, some data from a cool research project on human motion modeling [Loper et al., SIGGRAPH Asia 2015] (The paper video is a nice overview and worth a look.) The authors provide this model in the form of files in the widely used FBX animation format, which we read into Blender to convert to glTF files. For this assignment we will focus on the mesh skinning aspect of the data, so that we can display animations of the average body shape, but it is a relatively straightforward extension to add the ability to change body shape and to use their pose-dependent shape corrections.
There isn’t really any new framework for this assignment, but pull from the course Git repository to get some input scenes that have suitable animations.
1 Load some animated scenes
Start with your Pipeline solution; you might want to add an option to render using a simple forward shading pipeline for performance (with animation you will find you suddenly care about the difference between 30 and 60 frames per second).
There are a couple of changes needed to the Assimp importer
configuration. Some of the glTF test animations have many bones
influencing some vertices, but we need the number to be limited to 4.
Assimp will drop the lowest weighted bones and renormalize if you pass
the aiProcess_LimitBoneWeights
flag. We also want to keep the options to triangulate (many of these scenes have character meshes built from quadrilaterals) and to sort by primitive type. So your input-reading call might look like this:
const aiScene* input_scene = importer.ReadFile(scenePath,
    aiProcess_LimitBoneWeights |
    aiProcess_Triangulate |
    aiProcess_SortByPType);
The test scenes in the repo include:
- BoxAnimated.glb: a glTF test animation with no skeletons, just rigid motions.
- CesiumMan.glb and RiggedFigure.glb: these are glTF test animations of human characters with simple animation cycles.
- mosh_cmu_*.glb: these are SMPL animations using motion capture data from the CMU Motion Capture Database.
With the Assimp options above you should be able to load these and display them frozen in their bind poses (example for Cesium Man). The bind poses for these models are in z-up coordinates but the animated poses will be right side up. These scenes don’t all include cameras or lights so you will want to be sure your program provides sensible defaults.
2 Apply the node animations
A standard skinned character animation has two parts: animation
controls that move a hierarchy of nodes and a deformer that deforms a
mesh to follow this motion. You can get these two halves working one at
a time starting with the node animations. For this the scene
BoxAnimated.glb
is a good one to work with because it has a
simple animation with just rigid motions.
The information needed to animate the node hierarchy is stored in the
array aiScene::mAnimations
; each animation contains an
array mChannels
. Each channel points (by name matching) to
one node in the scene, and contains lists of keyframes for translation,
scaling, and rotation. Extend your scene reading code to convert this
information and store it in some convenient data structures of your
devising; I found std::map
was an ideal tool for storing
keyframes since it is ordered and supports looking up by keys that might
fall in between the keys stored in the map. I also found that I needed a
map to look up nodes by name, so that I could connect these animation
channels to nodes in my already-built scene. Scenes are allowed to have
multiple animations (and often do, e.g. for game characters with several
behaviors) but for this assignment we can assume the first one is always
the one of interest.
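For concreteness, here is a minimal sketch of such a structure (Channel and Node are names of my own devising, not framework types):

#include <map>
#include <glm/glm.hpp>
#include <glm/gtc/quaternion.hpp>

// One animation channel: each keyframe map is ordered by time (in ticks),
// so std::map::lower_bound can find the keys bracketing any query time.
struct Channel {
    Node* target;  // looked up by name in the already-built scene graph
    std::map<double, glm::vec3> translations;
    std::map<double, glm::quat> rotations;
    std::map<double, glm::vec3> scales;
};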
Once you have this info available, you just need to write a function
you will call before drawing each frame to update the node
transformations for the current time. To do this, you loop over all the
channels, and for each one you interpolate in the timeline to compute
the current translation, rotation, and scale. You can just use linear
interpolation for translation and scale and spherical linear
interpolation on quaternions for rotations. In the GLM library, there is
support for working with quaternions in the header
<glm/gtc/quaternion.hpp>
, and support for
constructing transformations in
<glm/gtc/matrix_transform.hpp>
. The class
glm::quat
is useful for representing quaternions, and the
function glm::mix
does spherical linear interpolation when
given quaternion arguments. Multiply these transformations together in
the order T, R, S (with T toward the root and S toward the leaf) and
assign that product to the transformation of the node referenced by the
channel. This is all quite analogous to what you did back in the CS4620
animation assignment. That’s really all there is to it!
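A sketch of that update, assuming the Channel structure above (interpKeys and the node's localToParent member are hypothetical names):

#include <iterator>
#include <glm/gtc/matrix_transform.hpp>

// Interpolate an ordered keyframe map at time t (in ticks); assumes at
// least one key. glm::mix lerps vec3s and slerps quaternions.
template <typename V>
V interpKeys(const std::map<double, V>& keys, double t) {
    auto hi = keys.lower_bound(t);
    if (hi == keys.end()) return std::prev(hi)->second;  // past the last key
    if (hi == keys.begin()) return hi->second;           // before the first key
    auto lo = std::prev(hi);
    float a = float((t - lo->first) / (hi->first - lo->first));
    return glm::mix(lo->second, hi->second, a);
}

void updateChannel(Channel& c, double t) {
    glm::mat4 T = glm::translate(glm::mat4(1.0f), interpKeys(c.translations, t));
    glm::mat4 R = glm::mat4_cast(interpKeys(c.rotations, t));
    glm::mat4 S = glm::scale(glm::mat4(1.0f), interpKeys(c.scales, t));
    c.target->localToParent = T * R * S;  // T toward the root, S toward the leaf
}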
Once you get this working, you should be able to get poses for
different times in the animation. You will likely want to implement
play/pause and frame forward/backward controls in your application (I
just used a few one-liner keyboard handlers). For realtime playback, the
functions glfwGetTime
and glfwSetTime
are
convenient: you can start the timer running when the user hits “play”
and then just use the actual current time on the timer to fetch each
frame to draw.
The aiAnimation
class carries two other bits of
information that are useful in getting animations to play back sensibly;
the times of keyframes are measured in “ticks” and the conversion factor
to seconds is called mTicksPerSecond
; also, the duration of
the animation (in ticks) is in mDuration
. Many animations are designed to loop, so it makes sense to arrange your playback logic to wrap times beyond the duration back into the range
between 0 and the duration. You might like to include an additional
adjustable scale factor for the playback speed in case you feel like the
default speed of some animations is off.
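Putting the timing pieces together, the per-frame logic might look like this (a sketch; speed is the optional scale factor just mentioned, and std::fmod comes from <cmath>):

// Convert wall-clock seconds to ticks, wrapping so the animation loops.
// Assimp reports mTicksPerSecond as 0 when the file doesn't specify it.
const aiAnimation* anim = input_scene->mAnimations[0];
double tps = (anim->mTicksPerSecond != 0) ? anim->mTicksPerSecond : 25.0;
double speed = 1.0;
double t = std::fmod(glfwGetTime() * speed * tps, anim->mDuration);
// ... call updateChannel(c, t) for every channel, then draw the frame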
Here is what the box animation looks like for me.
3 Animate the mesh
The other half of a mesh animation is the skinning weights and transformations that bind the mesh to the node hierarchy. The mesh comes with a collection of “bones,” each of which refers to a node in the animated hierarchy. Each bone comes with two pieces of information: a list of weights (one for every vertex in the mesh), and a matrix that is the transformation from the coordinates of the skeleton root (the node where the mesh lives) to the local coordinates of the bone in the bind pose. This is known as the inverse bind pose matrix. This collection of bones is often called a skeleton.
In Assimp, you will find skeletons stored with meshes. Along with
vertex attributes like positions and normals, a mesh can contain bones,
in the array aiMesh::mBones
. The entries in this array
point to aiBone
objects, each containing a node name, a
list of (vertex index, weight) pairs, and the matrix
mOffsetMatrix
, which is the inverse bind pose matrix.
Extend the code for reading meshes into your scene so that it also reads
bones. I stored the bone information in two parts: the bones themselves
go in a skeleton, which is basically just a list of (node, inverse bind
pose matrix) pairs and a reference to the mesh it operates on; and the
weights need to be converted into mesh attributes to hand them off to a
vertex shader.
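For example, the bone-reading pass over one aiMesh might look like this (a sketch; Skeleton, nodesByName, convertMatrix, and addBoneInfluence are my own names, and addBoneInfluence is sketched below, after the attribute layout discussion):

// Convert Assimp's per-bone data into a skeleton plus per-vertex weights.
Skeleton skel;
for (unsigned int b = 0; b < mesh->mNumBones; b++) {
    const aiBone* bone = mesh->mBones[b];
    Skeleton::Bone sb;
    sb.node = nodesByName[bone->mName.C_Str()];  // same name matching as channels
    // aiMatrix4x4 is row-major; transpose when converting to a GLM matrix
    sb.inverseBindPose = convertMatrix(bone->mOffsetMatrix);
    skel.bones.push_back(sb);
    // Scatter this bone's (vertex index, weight) pairs into per-vertex arrays
    for (unsigned int w = 0; w < bone->mNumWeights; w++)
        addBoneInfluence(bone->mWeights[w].mVertexId, (int)b, bone->mWeights[w].mWeight);
}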
The vertex shader for skinning operates on one vertex at a time, and
it needs to have the weights and transformations for all bones that
influence the vertex it is processing. The way we’ll do this is (1)
limit the number of bones influencing each vertex to 4 (we already did
this by asking Assimp to take care of it); (2) place the weights for
those 4 bones into a vec4
-valued vertex attribute and the
corresponding bone indices into a ivec4
-valued vertex
attribute; and (3) put the bone transformations into a uniform array of
mat4
s. With this information the vertex shader code can
very simply evaluate the linear blend skinning equation from
lecture:

$$v' = \sum_i w_i \, M_i \, v$$

because in this sum, only 4 terms are nonzero for any particular vertex. The four values of $i$ for the current vertex are found in the bone-index attribute, the four values of $w_i$ are found in the weights attribute, and the matrices $M_i$ are in the uniform array.
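In the shader this comes out to just a few lines. A minimal GLSL sketch (the uniform names, attribute locations, and bone-count limit are placeholders; as discussed further below, the matrices already take the mesh from bind-pose space to world space, so only view and projection are applied afterward):

#version 330 core
const int MAX_BONES = 64;     // any limit that covers your largest skeleton
uniform mat4 boneTransformations[MAX_BONES];
uniform mat4 viewProjection;  // skinning output is already in world space

layout (location = 0) in vec3 position;
layout (location = 2) in ivec4 boneIds;
layout (location = 3) in vec4 boneWts;

void main() {
    // Blend only the spatial part, then set w = 1 explicitly (see the
    // pitfall about blended homogeneous coordinates at the end).
    vec3 p = vec3(0.0);
    for (int k = 0; k < 4; k++)
        p += boneWts[k] * (boneTransformations[boneIds[k]] * vec4(position, 1.0)).xyz;
    gl_Position = viewProjection * vec4(p, 1.0);
}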
To set up the two additional vertex attribute arrays, you need to traverse Assimp's weights-per-bone arrays and organize the data into 4-by-$n$ index and weight matrices (where $n$ is the number of vertices), similar to the ones you use for the position and normal attributes. Then set these attributes in your mesh, and they will be available in a vertex shader that has in declarations with matching attribute indices. For instance, you might use indices 2 and 3, then use declarations
layout (location = 2) in ivec4 boneIds;
layout (location = 3) in vec4 boneWts;
Note that the meshes from the SMPL project always have exactly 4 bones affecting each vertex, but the other meshes do not, so it’s important to have that Assimp importer option active to limit the number of bones per vertex, and also to be able to tolerate fewer than 4 bones affecting a vertex (typically this just requires ensuring the unused weights are zero and the unused indices are not out of range).
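The packing itself is a small amount of bookkeeping. A sketch of the addBoneInfluence helper used earlier, which handles the fewer-than-4 case by leaving unused weights at zero and unused indices at a harmless 0:

#include <vector>

// One column of ids/weights per vertex, alongside positions and normals.
std::vector<glm::ivec4> boneIds;  // resized to the vertex count, filled with 0
std::vector<glm::vec4>  boneWts;  // resized to the vertex count, filled with 0.0f
std::vector<int> influenceCount;  // influences recorded so far per vertex

void addBoneInfluence(unsigned int v, int bone, float weight) {
    int k = influenceCount[v]++;  // < 4, guaranteed by aiProcess_LimitBoneWeights
    boneIds[v][k] = bone;
    boneWts[v][k] = weight;
}
// When uploading, remember boneIds is an integer attribute: use
// glVertexAttribIPointer (note the I), not glVertexAttribPointer.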
Once you have added the vertex attribute arrays and written the
shader that uses them, the only last thing you need is to compute the
bone transformations and upload them to the uniform array. This is where
you need the skeleton you stored: it is a list of references to nodes
with an inverse bind pose matrix for each one. In lecture we talked
about how to construct a transformation from the bind-pose space (the
coordinates in which the mesh vertices are stored) to the bone’s local
space (this is the inverse bind pose transform provided by Assimp), then
back to world space using the pose of that bone at the current frame
(this is the node-to-world transformation of the node referenced by the
bone). The transformations describing how each bone moves from bind pose
to the current frame go into the uniform matrix array. (The simple way
to upload to an array of uniform matrices is to treat it as a collection
of separate variables with names like
boneTransformations[3]
, though it’s also possible to upload
them as a single block of data using a uniform buffer.)
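Per frame, then, each bone's matrix is the current node-to-world transformation times its inverse bind pose. A sketch of computing and uploading them (nodeToWorld is a hypothetical helper that composes transformations up the hierarchy; glm::value_ptr comes from <glm/gtc/type_ptr.hpp>):

for (size_t i = 0; i < skel.bones.size(); i++) {
    // Bind-pose mesh space -> bone space -> world space at the current frame
    glm::mat4 M = nodeToWorld(skel.bones[i].node) * skel.bones[i].inverseBindPose;
    std::string name = "boneTransformations[" + std::to_string(i) + "]";
    glUniformMatrix4fv(glGetUniformLocation(program, name.c_str()),
                       1, GL_FALSE, glm::value_ptr(M));
}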
To test whether your transformations are working, before the weights
are correct, you can temporarily have your vertex shader transform all
vertices by just one bone transformation, and see whether the resulting
motion plausibly follows that body part. (Reference results for mosh_cmu_7516 for bone 1 (Pelvis) and bone 8 (R_Ankle): the former tracks the overall body motion, and the latter rotates a lot during the kicking part of the motion.)
Once your weights are correct you will see the complete animation!
One minor pitfall: due to rounding, if you blend homogeneous vectors or matrices you can end up with vectors whose final component is not exactly 1, which will cause them to transform slightly wrongly under the projection to NDC, producing weird view-dependent distortions in the mesh. The fix is to blend only the spatial components and explicitly set the final coordinate to 1.
4 Handing in
Hand in using the same process as previous assignments.