May 2014

Volume 29 Number 5

DirectX Factor : Manipulating Triangles in 3D Space

Charles Petzold

The Dutch graphic artist M. C. Escher has always been a particular favorite among programmers, engineers and other techies. His witty drawings of impossible structures play with the mind’s need to impose order on visual information, while his use of mathematically inspired meshed patterns seems to suggest a familiarity with software recursion techniques.

I’m a particular fan of Escher’s mix of two-dimensional and three-dimensional imagery, such as “Drawing Hands” from 1948, in which a pair of 3D hands arise from a 2D drawing that is itself being sketched by the 3D hands (see Figure 1). But this juxtaposition of the 2D and 3D images emphasizes how the 3D hands only appear to have depth as a result of detail and shading. Obviously, everything in the drawing is rendered on flat paper.

Figure 1 M.C. Escher’s “Drawing Hands”

I want to do something similar in this article: I want to make 2D graphical objects seem to acquire depth and body as they arise from the screen and float in 3D space, and then retreat back into the flat screen.

These graphical objects won’t be portrayals of human hands, however. Instead, I’ll stick with perhaps the simplest 3D objects—the five Platonic solids. The Platonic solids are the only possible convex polyhedra whose faces are identical regular convex polygons, with the same number of faces meeting at each vertex. They are the tetrahedron (four triangles), octahedron (eight triangles), icosahedron (20 triangles), cube (six squares) and dodecahedron (12 pentagons).

Platonic solids are popular in rudimentary 3D graphics programming because they’re generally easy to define and assemble. Formulas for the vertices can be found in Wikipedia, for example.

To make this exercise as pedagogically gentle as possible, I’ll be using Direct2D rather than Direct3D. However, you’ll need to become familiar with some concepts, data types, functions and structures often used in connection with Direct3D.

My strategy is to define these solid objects using triangles in 3D space, and then apply 3D transforms to rotate them. The transformed triangle coordinates are then flattened into 2D space by ignoring the Z coordinate. The resulting 2D triangles are used to create ID2D1Mesh objects, which are rendered using the FillMesh method of the ID2D1DeviceContext object.

As you’ll see, it’s not enough to simply define coordinates for 3D objects. Only when shading is applied to mimic the reflection of light do the objects seem to escape the flatness of the screen.

3D Points and Transforms

This exercise requires that 3D matrix transforms be applied to 3D points to rotate objects in space. What are the best data types and functions for this job?

Interestingly, Direct2D has a D2D1_MATRIX_4X4_F structure and a Matrix4x4F class in the D2D1 namespace that are suitable for representing 3D transform matrices. However, these data types are designed only for use with the DrawBitmap methods defined by ID2D1DeviceContext, as I demonstrated in the April installment of this column. In particular, Matrix4x4F doesn’t even have a method named Transform that can apply the transform to a 3D point. You’d need to implement that matrix multiplication with your own code.

A better place to look for 3D data types is the DirectX Math library, which is used by Direct3D programs, as well. This library defines more than 500 functions—all of which begin with the letters XM—and several data types. These are all declared in the DirectXMath.h header file and associated with a namespace of DirectX.

Every single function in the DirectX Math library involves the use of a data type named XMVECTOR, which is a collection of four numbers. XMVECTOR is suitable for representing 2D or 3D points (with or without a W coordinate) or a color (with or without an alpha channel). Here’s how you’d define an object of type XMVECTOR:

XMVECTOR vector;

Notice I said XMVECTOR is a collection of “four numbers” rather than “four floating-point values” or “four integers.” I can’t be more specific because the actual format of the four numbers in an XMVECTOR object is hardware-dependent.

XMVECTOR is not a normal data type! It’s actually a proxy for a 128-bit hardware register on the processor chip that holds four values, specifically one of the single instruction multiple data (SIMD) registers used with streaming SIMD extensions (SSE) that implement parallel processing. On x86 hardware with SSE, the four values are single-precision floating-point numbers. On ARM processors (found in Windows RT devices), the library maps XMVECTOR to the equivalent NEON vector type instead.

For this reason, you shouldn’t attempt to access the fields of an XMVECTOR object directly (unless you know what you’re doing). Instead, the DirectX Math library includes numerous functions to set the fields from integer or floating-point values. Here’s a common one:

XMVECTOR vector = XMVectorSet(x, y, z, w);

Functions also exist to obtain the individual field values:

float x = XMVectorGetX(vector);

Because this data type is a proxy for hardware registers, certain restrictions govern its use. Read the online “DirectXMath Programming Guide” (bit.ly/1d4L7Gk) for details on defining structure members of type XMVECTOR and passing XMVECTOR arguments to functions.

In general, however, you’ll probably use XMVECTOR mostly in code that’s local to a method. For the general-purpose storage of 3D points and vectors, the DirectX Math library defines other data types that are simple normal structures, such as XMFLOAT3 (which has three data members of type float named x, y and z) and XMFLOAT4 (which has four data members to include w). In particular, you’ll want to use XMFLOAT3 or XMFLOAT4 for storing arrays of points.

It’s easy to transfer between XMVECTOR and XMFLOAT3 or XMFLOAT4. Suppose you use XMFLOAT3 to store a 3D point:

XMFLOAT3 point;

When you need to use one of the DirectX Math functions that require an XMVECTOR, you can load the value into an XMVECTOR using the XMLoadFloat3 function:

XMVECTOR vector = XMLoadFloat3(&point);

The w value in the XMVECTOR is initialized to 0. You can then use the XMVECTOR object in various DirectX Math functions. To store the XMVECTOR value back in the XMFLOAT3 object, call:

XMStoreFloat3(&point, vector);

Similarly, XMLoadFloat4 and XMStoreFloat4 transfer values between XMVECTOR objects and XMFLOAT4 objects, and these are often preferred if the W coordinate is important.

In the general case, you’ll be working with several XMVECTOR objects in the same block of code, some of which correspond to underlying XMFLOAT3 or XMFLOAT4 objects, and some of which are just transient. You’ll see examples shortly.

I said earlier that every function in the DirectX Math library involves XMVECTOR. If you’ve explored the library, you might find some functions that actually don’t require an XMVECTOR but do involve an object of type XMMATRIX.

The XMMATRIX data type is a 4×4 matrix suitable for 3D transforms, but it’s actually four XMVECTOR objects, one for each row:

struct XMMATRIX
{
  XMVECTOR r[4];
};

So what I said was correct because all the DirectX Math functions that require XMMATRIX objects really do involve XMVECTOR objects, as well, and XMMATRIX has the same restrictions as XMVECTOR.

Just as XMFLOAT4 is a normal structure you can use to transfer values to and from an XMVECTOR object, you can use a normal structure named XMFLOAT4X4 to store a 4×4 matrix, and transfer that to and from an XMMATRIX using the XMLoadFloat4x4 and XMStoreFloat4x4 functions.

If you’ve loaded a 3D point into an XMVECTOR object (named vector, for example), and you’ve loaded a transform matrix into an XMMATRIX object named matrix, you can apply that transform to the point using:

XMVECTOR result = XMVector3Transform(vector, matrix);

Or, you can use:

XMVECTOR result = XMVector4Transform(vector, matrix);

The only difference is that XMVector4Transform uses the actual w value of the XMVECTOR while XMVector3Transform assumes that it’s 1, which is correct for implementing 3D translation.
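The effect of the w value can be sketched in plain C++ without the library. The Vec4 and Mat4 types and the transform and translation helpers below are hypothetical stand-ins, not DirectX Math types or functions. The sketch assumes the row-vector convention (point times matrix) with the translation stored in the last row of the matrix.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not DirectX Math types.
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };   // row-major, translation in the last row

// Row vector times matrix: out[c] is the sum over r of in[r] * m[r][c].
Vec4 transform(const Vec4& v, const Mat4& mat)
{
    const float in[4] = { v.x, v.y, v.z, v.w };
    float out[4] = { 0, 0, 0, 0 };
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 4; ++r)
            out[c] += in[r] * mat.m[r][c];
    return Vec4{ out[0], out[1], out[2], out[3] };
}

// Builds a pure translation matrix.
Mat4 translation(float tx, float ty, float tz)
{
    return Mat4{ { { 1, 0, 0, 0 },
                   { 0, 1, 0, 0 },
                   { 0, 0, 1, 0 },
                   { tx, ty, tz, 1 } } };
}

// Usage: translation moves a point (w = 1) but leaves a direction (w = 0) alone.
void checkTranslationExample()
{
    const Mat4 move = translation(10, 20, 30);
    const Vec4 point = transform(Vec4{ 1, 2, 3, 1 }, move);
    const Vec4 direction = transform(Vec4{ 1, 2, 3, 0 }, move);
    assert(point.x == 11 && point.y == 22 && point.z == 33);
    assert(direction.x == 1 && direction.y == 2 && direction.z == 3);
}
```

With w equal to 1, the last row of the matrix contributes the translation to the result. With w equal to 0, that row is multiplied away, which is why a function that assumes w is 1 suits points that must be translated.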

However, if you have an array of XMFLOAT3 or XMFLOAT4 values and you want to apply the transform to the entire array, there’s a much better solution: The XMVector3TransformStream and XMVector4TransformStream functions apply the XMMATRIX to an array of values and store the results in an array of XMFLOAT4 values (regardless of the input type).

The bonus: Because XMMATRIX is actually in the SIMD registers on a CPU that implements SSE, the CPU can use parallel processing to apply that transform to the array of points, and accelerate one of the biggest bottlenecks in 3D rendering.
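To make the idea of a stream transform concrete, here is a scalar sketch of what such a function does conceptually: it walks an input array with a byte stride, transforms each four-component vector, and writes the result with its own byte stride. The transformStream function and the Float4 and Mat4 types are hypothetical names for this illustration. The real library functions use SIMD instructions rather than the simple loops shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for illustration; these are not DirectX Math types.
struct Float4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };   // row-major, used with row vectors

// Applies mat to `count` four-component vectors. Strides are in bytes,
// so the vectors may be embedded in larger records.
void transformStream(void* output, std::size_t outputStride,
                     const void* input, std::size_t inputStride,
                     std::size_t count, const Mat4& mat)
{
    assert(outputStride >= sizeof(Float4) && inputStride >= sizeof(Float4));
    auto* out = static_cast<unsigned char*>(output);
    auto* in = static_cast<const unsigned char*>(input);
    for (std::size_t i = 0; i < count; ++i)
    {
        // Copy the vector out first so that in-place use is safe.
        Float4 v;
        std::memcpy(&v, in + i * inputStride, sizeof v);
        const float src[4] = { v.x, v.y, v.z, v.w };
        float dst[4] = { 0, 0, 0, 0 };
        for (int c = 0; c < 4; ++c)
            for (int r = 0; r < 4; ++r)
                dst[c] += src[r] * mat.m[r][c];
        const Float4 result{ dst[0], dst[1], dst[2], dst[3] };
        std::memcpy(out + i * outputStride, &result, sizeof result);
    }
}
```

When the vectors are tightly packed, both strides are simply the size of one vector, and the count is the number of vectors rather than the number of bytes.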

Defining Platonic Solids

The downloadable code for this column is a single Windows 8.1 project named PlatonicSolids. The program uses Direct2D to render 3D images of the five Platonic solids.

Like all 3D figures, these solids can be described as a collection of triangles in 3D space. I knew I’d want to use XMVector3TransformStream or XMVector4TransformStream to transform an array of 3D triangles, and I knew the output array of these two functions is always an array of XMFLOAT4 objects. I therefore decided to use XMFLOAT4 for the input array as well, and that’s how I defined my 3D triangle structure:

struct Triangle3D
{
  DirectX::XMFLOAT4 point1;
  DirectX::XMFLOAT4 point2;
  DirectX::XMFLOAT4 point3;
};

Figure 2 shows some additional private data structures defined in PlatonicSolidsRenderer.h that store information necessary to describe and render a 3D figure. Each of the five Platonic solids is an object of type FigureInfo. The srcTriangles and dstTriangles collections store the original “source” triangles and the “destination” triangles after scaling and rotation transforms have been applied. Both collections have a size equal to the product of faceCount and trianglesPerFace. Notice that srcTriangles.data() and dstTriangles.data() are effectively pointers to XMFLOAT4 structures and can therefore be arguments to the XMVector4TransformStream function. As you’ll see, this happens during the Update method in the PlatonicSolidsRenderer class.
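Treating a collection of triangles as an array of four-component points relies on each triangle being exactly three tightly packed points with no padding. A compile-time check along the following lines can document that assumption. The Float4 and Triangle types are hypothetical stand-ins for XMFLOAT4 and Triangle3D, and vertexCount is a helper invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for XMFLOAT4 and Triangle3D.
struct Float4 { float x, y, z, w; };
struct Triangle { Float4 point1, point2, point3; };

// A triangle must be three contiguous Float4 values for the
// "array of points" view of the collection to make sense.
static_assert(std::is_standard_layout<Triangle>::value, "unexpected layout");
static_assert(sizeof(Triangle) == 3 * sizeof(Float4), "unexpected padding");
static_assert(offsetof(Triangle, point2) == sizeof(Float4), "point2 misplaced");
static_assert(offsetof(Triangle, point3) == 2 * sizeof(Float4), "point3 misplaced");

// Number of points to hand to a stream transform for a triangle collection.
std::size_t vertexCount(const std::vector<Triangle>& triangles)
{
    const std::size_t count = 3 * triangles.size();
    assert(count / 3 == triangles.size());   // guards against overflow
    return count;
}
```

For the tetrahedron, with four faces of one triangle each, the count passed to the transform would be 12.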

Figure 2 The Data Structures Used for Storing 3D Figures

struct RenderInfo
{
  Microsoft::WRL::ComPtr<ID2D1Mesh> mesh;
  Microsoft::WRL::ComPtr<ID2D1SolidColorBrush> brush;
};
struct FigureInfo
{
  // Constructor
  FigureInfo()
  {
  }
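  // Move constructor (the scalar members must be copied as well,
  // or they are left uninitialized in the new object)
  FigureInfo(FigureInfo && other) :
    faceCount(other.faceCount),
    trianglesPerFace(other.trianglesPerFace),
    srcTriangles(std::move(other.srcTriangles)),
    dstTriangles(std::move(other.dstTriangles)),
    color(other.color),
    renderInfo(std::move(other.renderInfo))
  {
  }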
  int faceCount;
  int trianglesPerFace;
  std::vector<Triangle3D> srcTriangles;
  std::vector<Triangle3D> dstTriangles;
  D2D1_COLOR_F color;
  std::vector<RenderInfo> renderInfo;
};
std::vector<FigureInfo> m_figureInfos;

The renderInfo field is a collection of RenderInfo objects, one for each face of the figure. The two members of this structure are also determined during the Update method, and they’re simply passed to the FillMesh method of the ID2D1DeviceContext object during the Render method.

The constructor of the PlatonicSolidsRenderer class initializes each of the five FigureInfo objects. Figure 3 shows the process for the simplest of the five, the tetrahedron.

Figure 3 Defining the Tetrahedron

FigureInfo tetrahedron;
tetrahedron.faceCount = 4;
tetrahedron.trianglesPerFace = 1;
tetrahedron.srcTriangles =
{
  Triangle3D { XMFLOAT4(-1,  1, -1, 1),
               XMFLOAT4(-1, -1,  1, 1),
               XMFLOAT4( 1,  1,  1, 1) },
  Triangle3D { XMFLOAT4( 1, -1, -1, 1),
               XMFLOAT4( 1,  1,  1, 1),
               XMFLOAT4(-1, -1,  1, 1) },
  Triangle3D { XMFLOAT4( 1,  1,  1, 1),
               XMFLOAT4( 1, -1, -1, 1),
               XMFLOAT4(-1,  1, -1, 1) },
  Triangle3D { XMFLOAT4(-1, -1,  1, 1),
               XMFLOAT4(-1,  1, -1, 1),
               XMFLOAT4( 1, -1, -1, 1) }
};
tetrahedron.srcTriangles.shrink_to_fit();
tetrahedron.dstTriangles.resize(tetrahedron.srcTriangles.size());
tetrahedron.color = ColorF(ColorF::Magenta);
tetrahedron.renderInfo.resize(tetrahedron.faceCount);
m_figureInfos.at(0) = tetrahedron;

The initialization of the octahedron and icosahedron is similar. In all three cases, each face consists of just one triangle. In terms of pixels, the coordinates are very small, but code later in the program scales them to a proper size.

The cube and dodecahedron are different, however. The cube has six faces, each of which is a square, and the dodecahedron is 12 pentagons. For these two figures, I used a different data structure to store the vertices of each face, and a common method that converted each face into triangles—two triangles for each face of the cube and three triangles for each face of the dodecahedron.
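The article doesn’t show the conversion method itself, so the following is only one plausible way to do it: a triangle fan that shares the first vertex of the face. The triangulateFace function and the Float4 and Triangle types are hypothetical names for this sketch. It assumes the face is convex and that its vertices are listed in order around the perimeter.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for XMFLOAT4 and Triangle3D.
struct Float4 { float x, y, z, w; };
struct Triangle { Float4 point1, point2, point3; };

// Splits a convex polygon into a fan of triangles that all share the
// first vertex. N vertices yield N - 2 triangles, and each triangle
// keeps the winding order of the polygon.
std::vector<Triangle> triangulateFace(const std::vector<Float4>& face)
{
    assert(face.size() >= 3);
    std::vector<Triangle> triangles;
    triangles.reserve(face.size() - 2);
    for (std::size_t i = 1; i + 1 < face.size(); ++i)
        triangles.push_back(Triangle{ face[0], face[i], face[i + 1] });
    return triangles;
}
```

A square produces two triangles and a pentagon produces three, which matches the counts given for the cube and the dodecahedron. Preserving the winding order matters later, when the order of the points determines the direction of the surface normal.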

For ease in converting the 3D coordinates into 2D coordinates, I’ve based these figures on a coordinate system in which positive X coordinates increase to the right and positive Y coordinates increase going down. (It’s more common in 3D programming for positive Y coordinates to increase going up.) I’ve also assumed that positive Z coordinates come out of the screen. Therefore, this is a left-hand coordinate system. If you point the forefinger of your left hand in the direction of positive X, and the middle finger in the direction of positive Y, your thumb points to positive Z.

The viewer of the computer screen is assumed to be located at a point on the positive Z axis looking toward the origin.

Rotations in 3D

The Update method in PlatonicSolidsRenderer performs an animation that consists of several sections. When the program begins running, the five Platonic solids are displayed, but they appear to be flat, as shown in Figure 4.

Figure 4 The PlatonicSolids Program As It Begins Running

These are obviously not recognizable as 3D objects!

After 2.5 seconds, the objects begin rotating. The Update method calculates rotation angles and a scaling factor based on the size of the screen, and then makes use of DirectX Math functions. Functions such as XMMatrixRotationX compute an XMMATRIX object representing rotation around the X axis. XMMATRIX also defines matrix multiplication operators so the results of these functions can be multiplied together.
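What these matrix functions compute can be sketched in plain C++. The Mat4 type and the rotationZ, scaling and multiply helpers are hypothetical stand-ins, not the library functions. The sketch assumes the row-vector convention, in which the leftmost matrix in a product is applied to a point first; the element layout shown is an assumption for this illustration rather than something the article spells out.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 4x4 transform matrix (row-major).
struct Mat4 { float m[4][4]; };

// Rotation around the Z axis by `angle` radians, laid out for
// row vectors (point * matrix).
Mat4 rotationZ(float angle)
{
    const float c = std::cos(angle);
    const float s = std::sin(angle);
    return Mat4{ { {  c, s, 0, 0 },
                   { -s, c, 0, 0 },
                   {  0, 0, 1, 0 },
                   {  0, 0, 0, 1 } } };
}

// Scaling along the three axes.
Mat4 scaling(float sx, float sy, float sz)
{
    return Mat4{ { { sx, 0, 0, 0 },
                   { 0, sy, 0, 0 },
                   { 0, 0, sz, 0 },
                   { 0, 0, 0, 1 } } };
}

// Matrix product a * b. With row vectors, `a` is applied first.
Mat4 multiply(const Mat4& a, const Mat4& b)
{
    Mat4 result{};   // all zeros
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            for (int k = 0; k < 4; ++k)
                result.m[r][c] += a.m[r][k] * b.m[k][c];
    return result;
}

// Usage: scale by 2, then rotate a quarter turn around Z.
void checkRotationExample()
{
    const float quarterTurn = 1.57079632679f;
    const Mat4 total = multiply(scaling(2, 2, 2), rotationZ(quarterTurn));
    // The first row is where the X axis lands: length 2, turned onto Y.
    assert(std::fabs(total.m[0][0]) < 1e-5f);
    assert(std::fabs(total.m[0][1] - 2.0f) < 1e-5f);
}
```

Because the product is built once and then applied to every point, the cost of combining scaling and several rotations is paid only one time per frame rather than once per vertex.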

Figure 5 shows how a total matrix transform is calculated and applied to the array of Triangle3D objects in each figure.

Figure 5 Rotating the Figures

// Calculate total matrix
XMMATRIX matrix = XMMatrixScaling(scale, scale, scale) *
                  XMMatrixRotationX(xAngle) *
                  XMMatrixRotationY(yAngle) *
                  XMMatrixRotationZ(zAngle);
// Transform source triangles to destination triangles
for (FigureInfo& figureInfo : m_figureInfos)
{
  XMVector4TransformStream(
    (XMFLOAT4 *) figureInfo.dstTriangles.data(),
    sizeof(XMFLOAT4),
    (XMFLOAT4 *) figureInfo.srcTriangles.data(),
    sizeof(XMFLOAT4),
    3 * figureInfo.srcTriangles.size(),
    matrix);
}

Once the figures begin rotating, however, they still appear to be flat polygons, even though they’re changing shape.

Occlusion and Hidden Surfaces

One of the crucial aspects of 3D graphics programming is making sure objects closer to the viewer’s eye obscure (or occlude) objects further away. In complex scenes, this isn’t a trivial problem, and, generally, this must be performed in graphics hardware on a pixel-by-pixel basis.

With convex polyhedra, however, it’s relatively simple. Consider a cube. As the cube is rotating in space, mostly you see three faces, and sometimes just one or two. Never do you see four, five or all six faces.

As the cube rotates, how can you determine which faces are visible and which are hidden? Think about vectors (often visualized as arrows with a particular direction) perpendicular to each face of the cube, and pointing to the outside of the cube. These are referred to as “surface normal” vectors.

Only if a surface normal vector has a positive Z component will that surface be visible to a viewer observing the object from the positive Z axis.

Mathematically, computing a surface normal for a triangle is straightforward: The three vertices of the triangle define two vectors (V1 and V2), two vectors in 3D space define a plane, and a perpendicular to that plane is obtained from the vector cross product, as shown in Figure 6.

Figure 6 The Vector Cross Product

The actual direction of this vector depends on the handedness of the coordinate system. For a right-hand coordinate system, for example, you can determine the direction of the V1×V2 cross product by curving the fingers of your right hand from V1 to V2. The thumb points in the direction of the cross product. For a left-hand coordinate system, use your left hand.

For any particular triangle that makes up these figures, the first step is to load the three vertices into XMVECTOR objects:

XMVECTOR point1 = XMLoadFloat4(&triangle3D.point1);
XMVECTOR point2 = XMLoadFloat4(&triangle3D.point2);
XMVECTOR point3 = XMLoadFloat4(&triangle3D.point3);

Then, two vectors representing two sides of the triangle can be calculated by subtracting point1 from point2 and from point3 using convenient DirectX Math functions:

XMVECTOR v1 = XMVectorSubtract(point2, point1);
XMVECTOR v2 = XMVectorSubtract(point3, point1);

All the Platonic solids in this program are defined with triangles whose three points are arranged clockwise from point1 to point2 to point3 when the triangle is viewed from outside the figure. A surface normal pointing to outside the figure can be calculated using a DirectX Math function that obtains the cross product:

XMVECTOR normal = XMVector3Cross(v1, v2);

A program displaying these figures could simply choose to not display any triangle with a surface normal that has a 0 or negative Z component. The PlatonicSolids program instead continues to display those triangles but with a transparent color.
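The normal calculation and the visibility test can be tried out in plain C++ on the tetrahedron coordinates from Figure 3. The Vec3 type and the subtract, cross, surfaceNormal and facesViewer helpers are hypothetical stand-ins for the DirectX Math calls. For simplicity the sketch uses the untransformed source coordinates, whereas a program would apply the test to the triangles after rotation.

```cpp
#include <cassert>

// Hypothetical stand-in for a 3D vector; not a DirectX Math type.
struct Vec3 { float x, y, z; };

Vec3 subtract(const Vec3& a, const Vec3& b)
{
    return Vec3{ a.x - b.x, a.y - b.y, a.z - b.z };
}

Vec3 cross(const Vec3& a, const Vec3& b)
{
    return Vec3{ a.y * b.z - a.z * b.y,
                 a.z * b.x - a.x * b.z,
                 a.x * b.y - a.y * b.x };
}

// Normal of the triangle (p1, p2, p3), computed from the two edge
// vectors that start at p1.
Vec3 surfaceNormal(const Vec3& p1, const Vec3& p2, const Vec3& p3)
{
    return cross(subtract(p2, p1), subtract(p3, p1));
}

// A face is treated as visible only when its normal has a positive
// Z component, because the viewer is on the positive Z axis.
bool facesViewer(const Vec3& normal)
{
    return normal.z > 0.0f;
}

// Usage with the first and last tetrahedron triangles from Figure 3.
void checkTetrahedronFaces()
{
    const Vec3 first = surfaceNormal(Vec3{ -1, 1, -1 }, Vec3{ -1, -1, 1 }, Vec3{ 1, 1, 1 });
    const Vec3 last = surfaceNormal(Vec3{ -1, -1, 1 }, Vec3{ -1, 1, -1 }, Vec3{ 1, -1, -1 });
    assert(facesViewer(first));    // normal is (-4, 4, 4)
    assert(!facesViewer(last));    // normal is (-4, -4, -4)
}
```

Both normals point in the same general direction as the center of their triangle as seen from the origin, which confirms that they point outward for this vertex ordering.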

It’s All About the Shading

You see objects in the real world because they reflect light. Without light, nothing is visible. In many real-world environments, light comes from many different directions because it bounces off other surfaces and diffuses in the air.

In 3D graphics programming, this is known as “ambient” light, and it’s not quite adequate. If a cube is floating in 3D space and the same ambient light strikes all six faces, all six faces would be colored the same and it wouldn’t look like a 3D cube at all.

Scenes in 3D, therefore, usually require some directional light—light coming from one or more directions. One common approach for simple 3D scenes is to define a directional light source as a vector that seems to come from behind the viewer’s left shoulder:

XMVECTOR lightVector = XMVectorSet(2, 3, -1, 0);

From the viewer’s perspective, this is one of many vectors that points to the right and down, and away from the viewer in the direction of the negative Z axis.

In preparation for the next job, I want to normalize both the surface normal vector and the light vector:

normal = XMVector3Normalize(normal);
lightVector = XMVector3Normalize(lightVector);

The XMVector3Normalize function calculates the magnitude of the vector using the 3D form of the Pythagorean Theorem, and then divides the three coordinates by that magnitude. The resultant vector has a magnitude of 1.

If the normal vector happens to be equal to the negative of the lightVector, that means the light is striking the triangle perpendicular to its surface, and that’s the maximum illumination that directional light can provide. If the directional light isn’t quite perpendicular to the triangle surface, the illumination will be less.

Mathematically, the illumination of a surface from a directional light source is equal to the cosine of the angle between the light vector and the negative surface normal. If these two vectors have a magnitude of 1, then that crucial number is provided by the dot product of the two vectors:

XMVECTOR dot = XMVector3Dot(normal, -lightVector);

The dot product is a scalar (one number) rather than a vector, so all the fields of the XMVECTOR object returned from this function hold the same value.

To make it seem as if the rotating Platonic solids magically assume 3D depth as they arise from the flat screen, the PlatonicSolids program animates a value called lightIntensity from 0 to 1 and then back to 0. The 0 value is no directional light shading and no 3D effect, while the 1 value is maximum 3D. This lightIntensity value is used in conjunction with the dot product to calculate a total light factor:

float totalLight = 0.5f +
  lightIntensity * 0.5f * XMVectorGetX(dot);

The first 0.5 in this formula refers to ambient light, and the second 0.5 allows totalLight to range from 0 to 1 depending on the value of the dot product. (Theoretically, this isn’t quite correct. Negative values of the dot product should be set to 0 because they result in total light that is less than ambient light.)
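The whole lighting calculation, including the clamped variant mentioned in the parenthetical note, can be sketched in plain C++. The Vec3 type and the normalize, dot, totalLight and totalLightClamped helpers are hypothetical stand-ins for the DirectX Math calls, and the clamped version is an illustration of the theoretical correction rather than what the PlatonicSolids program does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a 3D vector; not a DirectX Math type.
struct Vec3 { float x, y, z; };

// Scales a vector to a magnitude of 1.
Vec3 normalize(const Vec3& v)
{
    const float length = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(length > 0.0f);
    return Vec3{ v.x / length, v.y / length, v.z / length };
}

float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Cosine of the angle between the surface normal and the negated light.
float lightCosine(const Vec3& normal, const Vec3& light)
{
    const Vec3 n = normalize(normal);
    const Vec3 l = normalize(light);
    return dot(n, Vec3{ -l.x, -l.y, -l.z });
}

// The formula from the text: half ambient light plus or minus up to
// half directional light, scaled by lightIntensity.
float totalLight(const Vec3& normal, const Vec3& light, float lightIntensity)
{
    return 0.5f + lightIntensity * 0.5f * lightCosine(normal, light);
}

// Variant with negative cosines set to 0, so a face turned away from
// the light never drops below the ambient level.
float totalLightClamped(const Vec3& normal, const Vec3& light, float lightIntensity)
{
    return 0.5f + lightIntensity * 0.5f * std::max(0.0f, lightCosine(normal, light));
}
```

With lightIntensity at 0, both functions return the ambient value of 0.5 for every face, which is why the figures look flat at the start of the animation. The two versions differ only for faces whose normals point away from the light.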

This totalLight is then used to calculate a color and brush for each face:

renderColor = ColorF(totalLight * baseColor.r,
                     totalLight * baseColor.g,
                     totalLight * baseColor.b);

The result with maximum 3D-ishness is shown in Figure 7.

Figure 7 The PlatonicSolids Program with Maximum 3D


Charles Petzold is a longtime contributor to MSDN Magazine and the author of “Programming Windows, 6th Edition” (Microsoft Press, 2013), a book about writing applications for Windows 8. His Web site is charlespetzold.com.

Thanks to the following Microsoft technical expert for reviewing this article: Doug Erickson