Direct3D D3DX Animation API Introduction

If you use Direct3D for computer graphics on Windows, Microsoft helpfully provides the D3DX library, which implements a number of useful utilities that aren't traditionally part of a rendering API as such. This library can be compared to the GLU library in some sense, although D3DX goes well beyond the functionality found in GLU: it provides 2D, 3D and 4D linear algebra support, triangle mesh processing and rendering, texture loading, processing and saving in various file formats, and other utilities that greatly simplify the "getting up and running" phase of writing graphics applications.

Part of the support is the Microsoft.DirectX.Direct3D.Mesh class and its derivatives (called ID3DXMesh when using the C++ API). This class not only allows you to store all vertices, indices, and material information for a specific rendered model in your program, but also allows you to animate that model, either using matrix blended skinning, or using rigid jointed animations (H-models).

Microsoft also supplies exporters for popular tools such as 3dsMax, Maya, and others, to their DirectX-specific ".X" mesh file format. The Microsoft exporter has traditionally been of dubious capabilities (I even had to make it support the ISkin interface for 3dsMax myself), but there exists another exporter created by PandaSoft, which comes with somewhat higher recommendations. Whichever exporter you choose, it's great to be able to build a mesh, animate it, and export it without having to write any code along that path yourself!

Mind Reading

If only the documentation were as rich as the code! In fact, the documentation for D3DX is somewhat useful in places, but most of the time it resorts to documenting the function "AnimationController.GetPriorityBlend()" as "returning the current priority blend value for the animation controller."
Gee, thanks! And why should I care?

For those of us who can't mind-read Microsoft engineers, it may be quite confusing to come from another world, and try to make sense of the Microsoft mesh animation API. I've gone through the process once, and am documenting the various steps and confusions I found along the way. Hopefully, this will help someone else in a similar situation.

What's in a name?

Some of the confusion comes because there is no standard terminology for various concepts, or where there is, Microsoft chose to use it another way. In addition, the description of some of the concepts, at the start of the D3DX documentation for the ID3DXMesh section, serves more to befuddle the reader than to illuminate.
To start off, here are some definitions of terminology, as used in D3DX:

Mesh: A single vertex buffer and index buffer, containing all the vertices for a specific renderable instance (when loaded using Mesh.FromFile). Multiple materials, or multiple transforms within a mesh, are grouped into Subsets. Each (material, transform) pair makes up a single Attribute. Each face in a mesh (three indices) has a single Attribute, which is a single DWORD (integer).

Subset: A range of indices and vertices within a Mesh. When a mesh is attribute sorted (see Mesh.Optimize()), each subset will contain a contiguous range of faces (triangles) with a specific attribute value.

Attribute: An Attribute in a Mesh is a single DWORD that is assigned to each face (triangle) within the mesh. The closest analog to a face Attribute is the face Material ID in Max.
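
To make the relationship between per-face attributes and subsets a bit more concrete, here is a minimal sketch in plain C++ that does not call D3DX at all. The Face and AttributeRange structs and the SortFacesByAttribute function are names invented for this illustration; the assumption is only that attribute sorting conceptually groups faces with equal attribute values into contiguous runs, not that D3DX implements it this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical face: three vertex indices plus one attribute value.
struct Face {
    std::uint32_t indices[3];
    std::uint32_t attrib;
};

// Hypothetical description of one subset: a contiguous run of faces
// that all share the same attribute value.
struct AttributeRange {
    std::uint32_t attribId;
    std::size_t faceStart;
    std::size_t faceCount;
};

// Reorders the faces so equal attributes are adjacent (keeping the
// original order within each attribute), then returns one range per
// distinct attribute value.
std::vector<AttributeRange> SortFacesByAttribute(std::vector<Face>& faces) {
    std::stable_sort(faces.begin(), faces.end(),
                     [](const Face& a, const Face& b) {
                         return a.attrib < b.attrib;
                     });

    std::vector<AttributeRange> table;
    for (std::size_t i = 0; i < faces.size(); ++i) {
        if (table.empty() || table.back().attribId != faces[i].attrib) {
            // A new attribute value starts a new subset at face i.
            table.push_back({faces[i].attrib, i, 0});
        }
        ++table.back().faceCount;
    }
    return table;
}
```

With a table like this, drawing "subset N" amounts to drawing faceCount triangles starting at faceStart, with the material and transform that the attribute value stands for.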

AnimationSet: What I've previously heard referred to simply as an "Animation" -- all the data that makes a hierarchy of movable things "go" over time. Examples are an idle animation for a humanoid, or a reload animation for a cannon -- multiple parts move, synchronized in time. An AnimationSet consists of multiple Animations.

Animation: Any one thing that gets animated -- this can be a single bone in a character skeleton, a single knob on a weapon, etc. All animated properties (position, rotation, scale, etc.) of a single thing, moving over time, make up an Animation. Animations are grouped into AnimationSet instances to make up units that you actually want to use.
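
As a rough mental model of how these two terms nest, here is a small sketch in plain C++ (no D3DX calls). The Key, Animation and AnimationSet types are hypothetical stand-ins invented for this example: it animates a single float per target with linear interpolation, which is an assumption made for brevity rather than a description of what D3DX stores or how it interpolates.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical key: a value (one axis of position, say) at a point in time.
struct Key {
    double time;
    float value;
};

// One animated "thing" (a bone, a knob): its keys over time.
struct Animation {
    std::vector<Key> keys;  // must be non-empty and sorted by time

    // Linearly interpolates between the two keys surrounding t,
    // clamping to the first/last key outside the keyed range.
    float Sample(double t) const {
        assert(!keys.empty());
        if (t <= keys.front().time) return keys.front().value;
        if (t >= keys.back().time) return keys.back().value;
        for (std::size_t i = 1; i < keys.size(); ++i) {
            if (t <= keys[i].time) {
                const Key& a = keys[i - 1];
                const Key& b = keys[i];
                const double u = (t - a.time) / (b.time - a.time);
                return static_cast<float>(a.value + (b.value - a.value) * u);
            }
        }
        return keys.back().value;
    }
};

// A usable unit such as "idle" or "reload": one Animation per target,
// keyed here by the name of the thing being animated.
struct AnimationSet {
    std::string name;
    std::map<std::string, Animation> animations;
};
```

The point of the sketch is only the nesting: the set is what you ask to play, while each Animation inside it drives exactly one named target.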

AnimationController: People who have worked a lot within 3dsMax and other similar applications may think of a "controller" as a single set of animated parameters, which is what the D3DX API calls an "Animation". However, an AnimationController in D3DX is something that can control multiple AnimationSet instances over time, blend between them, call back the user at pre-determined times, and provide other such high-level functionality. This is the interface you will call to actually make your animations play.

Frame: A Frame is a single transform within a Frame Hierarchy. When loading an animated mesh, each Frame will be loaded separately, and zero or more Meshes will be attached to these Frames using MeshContainers. For a skinned character, each Frame is a bone, but the mesh is only attached at the root of the hierarchy.

Frame Hierarchy: Most people are probably used to thinking of this as a "node set" or "skeleton". To confuse matters, the functions that "load a hierarchy" actually also load the mesh data that goes with the hierarchy.
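
For readers who think better in code, here is a minimal sketch of walking such a hierarchy in plain C++. The Frame struct below is a hypothetical stand-in that borrows only the sibling/first-child linking style of D3DXFRAME; the Mat4 helpers and the "combined" member are additions invented for this example, and the row-vector convention (child = local * parent) is an assumption matching how Direct3D matrices are usually composed.

```cpp
#include <array>
#include <cassert>
#include <string>

// Row-major 4x4 matrix, row-vector convention (translation in the last row).
using Mat4 = std::array<float, 16>;

Mat4 Identity() {
    Mat4 m{};
    m[0] = m[5] = m[10] = m[15] = 1.0f;
    return m;
}

Mat4 Translation(float x, float y, float z) {
    Mat4 m = Identity();
    m[12] = x; m[13] = y; m[14] = z;
    return m;
}

Mat4 Multiply(const Mat4& a, const Mat4& b) {
    Mat4 r{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

// Hypothetical frame: one transform, linked to its next sibling and
// its first child.
struct Frame {
    std::string name;
    Mat4 local = Identity();     // transform relative to the parent
    Mat4 combined = Identity();  // transform relative to the root's parent
    Frame* sibling = nullptr;
    Frame* firstChild = nullptr;
};

// Walks a frame and all its siblings, concatenating each local
// transform with the parent's combined transform, then recurses
// into the children.
void UpdateCombined(Frame* frame, const Mat4& parent) {
    for (; frame != nullptr; frame = frame->sibling) {
        frame->combined = Multiply(frame->local, parent);
        UpdateCombined(frame->firstChild, frame->combined);
    }
}
```

Calling UpdateCombined(&root, Identity()) after the local transforms have been animated leaves every frame's combined member holding its full transform.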

MeshContainer: When loading an animated mesh (using the LoadHierarchyFromFile function), one or more MeshContainer instances may be attached to each Frame in the hierarchy. The framework will call you back to allocate Frames and MeshContainers, but will do the actual attaching itself. You can, however, use this abstraction to share mesh instances between multiple hierarchy instances (as you would want to do when loading multiple instances of the same NPC mesh that you want to animate independently).
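
The sharing idea can be sketched without any D3DX types. In the plain C++ below, MeshData, MeshContainer and SharingAllocator are all hypothetical names invented for this example, and the CreateMeshContainer signature is made up rather than copied from the real allocation callback; the only assumption is that the loader hands you mesh data along with a name, and you get to decide what the container actually points at.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical immutable geometry that is safe to share between instances.
struct MeshData {
    std::vector<float> vertices;
};

// Hypothetical container: per-hierarchy object pointing at shared geometry.
struct MeshContainer {
    std::string name;
    std::shared_ptr<const MeshData> mesh;
};

// Hypothetical allocator the loader would call back into. It keeps one
// MeshData per mesh name and hands out containers that all reference it.
class SharingAllocator {
public:
    MeshContainer CreateMeshContainer(const std::string& name,
                                      const std::vector<float>& vertices) {
        std::shared_ptr<const MeshData>& slot = cache_[name];
        if (!slot) {
            // First time this mesh is seen: keep a single copy of the data.
            slot = std::make_shared<MeshData>(MeshData{vertices});
        }
        return MeshContainer{name, slot};
    }

private:
    std::map<std::string, std::shared_ptr<const MeshData>> cache_;
};
```

Loading the same file twice through one allocator then yields two independent containers (and, by extension, two hierarchies) that reference a single copy of the geometry.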

Period: The duration of a single pass through an AnimationSet is its Period. I guess they didn't want to call it Duration because the "duration" of a looping animation is conceptually infinite.
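
One way to picture why "Period" is the natural word: an ever-increasing playback position has to be folded back into the range the keys actually cover. The plain C++ sketch below shows that folding for three playback styles. The Playback enum and the PeriodicPosition function are invented for this illustration and are only assumed to resemble what an animation set does internally.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical playback styles for an animation set.
enum class Playback { Loop, Once, PingPong };

// Maps an ever-increasing position (seconds) into [0, period],
// which is the range a single pass of the animation covers.
double PeriodicPosition(double position, double period, Playback mode) {
    assert(period > 0.0 && position >= 0.0);
    switch (mode) {
    case Playback::Once:
        // Play through once, then hold the last pose.
        return position < period ? position : period;
    case Playback::PingPong: {
        // Forward for one period, backward for the next.
        const double t = std::fmod(position, 2.0 * period);
        return t <= period ? t : 2.0 * period - t;
    }
    case Playback::Loop:
    default:
        // Wrap around to the start after each period.
        return std::fmod(position, period);
    }
}
```

For example, with a period of 2 seconds, a looping set asked for position 2.5 samples its keys at 0.5.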

What about part 2?

It's late, and I need to do other things tonight. Perhaps I'll write up a second, more hands-on article about how to use the D3DX animation API some time. Meanwhile, you can look at the SimpleAnimation sample for Managed DirectX, and the MultiAnimation sample for more advanced usage in C++. You can port the concepts from C++ to C# without too much trouble (and, in fact, I did in my code).

Multiple Instances

To support multiple mesh instances that animate differently, you have to pull a number of tricks. First, if you're using software skinning (ID3DXSkinInfo::UpdateSkinnedMesh()) you have to create vertex buffers for each output mesh, unless you can live with only being able to render an instance once. If you need to render into shadow maps, or dynamic cube maps, or have transparent parts of your animated meshes, you want to keep the generated mesh data around, and thus need a vertex buffer per instance. The easiest way to create this is to clone the input mesh (or input meshes).

The second thing you have to do is to clone the animation controller using ID3DXAnimationController::CloneAnimationController(). This allows you to play different animations (animation sets) on the different instances. However, there's one big caveat: when you call AdvanceTime(), the animation controller will write the bone data to some pre-defined output matrices. The way to set up these pre-defined matrices is to use the D3DXFrameRegisterNamedMatrices() function, which takes a pointer to a root D3DXFRAME structure (or your subclass thereof). The problem with this is that you then have to clone the frame hierarchy for each instance, just for the single purpose of passing it into this one function to be able to get the output of the animation controller.
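
The reason each instance needs its own frames becomes clearer with a toy model. The plain C++ below is a sketch only: Frame, Controller and RegisterNamedMatrices are hypothetical stand-ins (the real D3DX types and signatures differ), the frames live in a flat vector rather than a tree, and the "animation" just writes elapsed time into one matrix element. What it does preserve is the pattern described above: the controller remembers a raw pointer to an output matrix per name and writes into it when time advances.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical frame with the matrix stored inline, as in the real structure.
struct Frame {
    std::string name;
    float matrix[16] = {};
};

// Hypothetical controller: stores one output pointer per animated name.
class Controller {
public:
    void RegisterOutput(const std::string& name, float* matrix) {
        outputs_[name] = matrix;
    }

    // Stand-in for real key interpolation: writes the elapsed time into
    // the X translation slot of every registered output matrix.
    void Advance(double dt) {
        time_ += dt;
        for (auto& entry : outputs_) {
            entry.second[12] = static_cast<float>(time_);
        }
    }

private:
    std::map<std::string, float*> outputs_;
    double time_ = 0.0;
};

// Registers every frame's inline matrix under the frame's name.
// The frames must not be moved or reallocated afterwards, because
// the controller keeps raw pointers into them.
void RegisterNamedMatrices(std::vector<Frame>& frames, Controller& controller) {
    for (Frame& frame : frames) {
        controller.RegisterOutput(frame.name, frame.matrix);
    }
}
```

Because the outputs are pointers into a specific set of frames, two controllers registered against the same frames would overwrite each other's results; giving each instance its own copy of the frames is what keeps their poses apart.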

What's extra delicious about this situation is the fact that the D3DXFRAME contains the transformation matrix as an inline data member. If you want to push the matrix into a scene graph, or want to collect all the matrices into a contiguous array (as needed when calling UpdateSkinnedMesh() or setting constants for a skinning vertex shader), you have to manually copy the data out once you're done. This is a pretty abysmal waste of precious cache lines. If you want a high-performance animation system (say, for hundreds of RTS characters), you probably want to write it yourself.
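
The manual copy looks roughly like the following plain C++ sketch. The Frame struct and GatherPalette function are hypothetical names invented for this example; the only thing taken from the text is the layout problem itself, namely that each matrix sits inside a larger per-frame structure and has to be copied into one packed array before it can be handed to skinning code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical frame: the matrix is embedded among other members, so
// consecutive bones' matrices are not adjacent in memory.
struct Frame {
    std::string name;
    float combined[16] = {};
    Frame* sibling = nullptr;
    Frame* firstChild = nullptr;
};

// Copies each bone's matrix into one contiguous array of 16 floats per
// bone, in the order given by `bones` (the order the skinning code expects).
std::vector<float> GatherPalette(const std::vector<const Frame*>& bones) {
    std::vector<float> palette(bones.size() * 16);
    for (std::size_t i = 0; i < bones.size(); ++i) {
        assert(bones[i] != nullptr);
        std::copy(bones[i]->combined, bones[i]->combined + 16,
                  palette.data() + i * 16);
    }
    return palette;
}
```

A hand-written system would more likely have the animation code write straight into the packed array and skip this pass entirely, which is the cache-friendliness argument made above.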