February 2014

Volume 29 Number 2

DirectX Factor: A 2D Portal into a 3D World

Charles Petzold

If you’re well-versed in 2D graphics, you might assume that 3D is similar except for the extra dimension. Not quite! Anyone who’s dabbled in 3D graphics programming knows how difficult it is. 3D graphics programming requires you to master new and exotic concepts beyond anything encountered in the conventional 2D world. A lot of preliminaries are required to get just a little 3D on the screen, and even then a slight miscalculation can render it invisible. Consequently, the visual feedback so important to learning graphics programming is delayed until all the programming pieces are in place and working in harmony.

DirectX acknowledges the profound difference between 2D and 3D graphics programming with the division between Direct2D and Direct3D. Although you can mix 2D and 3D content on the same output device, these are very distinct and different programming interfaces, and there’s no middle ground. DirectX doesn’t allow you to be a little bit country, a little bit rock-and-roll.

Or does it?

Interestingly, Direct2D includes some concepts and facilities that originated in the 3D programming universe. Through features such as geometry tessellation (the decomposition of complex geometries into triangles) and 2D effects using shaders (which consist of special code that runs on the graphics processing unit, or GPU), it’s possible to exploit some powerful 3D concepts while still remaining within the context of Direct2D.

Moreover, these 3D concepts can be encountered and explored gradually, and you get the satisfaction of actually seeing the results on the screen. You can get your 3D feet wet in Direct2D so a later plunge into Direct3D programming is a little less shocking.

I guess it shouldn’t be all that surprising that Direct2D incorporates some 3D features. Architecturally, Direct2D is built on top of Direct3D, which allows Direct2D to also take advantage of the hardware acceleration of the GPU. This relationship between Direct2D and Direct3D becomes more apparent as you begin exploring the nether regions of Direct2D.

I’ll commence this exploration with a review of 3D coordinates and coordinate systems.

The Big Leap Outward

If you’ve been following this column in recent months, you know it’s possible to call the GetGlyphRunOutline method of an object that implements the IDWriteFontFace interface to obtain an ID2D1PathGeometry instance that describes the outlines of text characters in terms of straight lines and Bézier curves. You can then manipulate the coordinates of these lines and curves to distort the text characters in various ways.
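If you haven’t used GetGlyphRunOutline before, the call pattern looks roughly like the following sketch. (This is only an illustration of the sequence, not code from the downloadable project; d2dFactory, fontFace and a std::vector<UINT16> named glyphIndices are assumed to exist already, and the em size is arbitrary.)

Microsoft::WRL::ComPtr<ID2D1PathGeometry> glyphGeometry;
DX::ThrowIfFailed(d2dFactory->CreatePathGeometry(&glyphGeometry));
Microsoft::WRL::ComPtr<ID2D1GeometrySink> glyphSink;
DX::ThrowIfFailed(glyphGeometry->Open(&glyphSink));
// Write the character outlines into the geometry sink
DX::ThrowIfFailed(fontFace->GetGlyphRunOutline(
  96.0f,                                     // em size in DIPs
  glyphIndices.data(),                       // glyph indices for the text
  nullptr,                                   // default glyph advances
  nullptr,                                   // no glyph offsets
  static_cast<UINT32>(glyphIndices.size()),
  FALSE,                                     // not sideways
  FALSE,                                     // left to right
  glyphSink.Get()));
DX::ThrowIfFailed(glyphSink->Close());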

It’s also possible to convert the 2D coordinates of a path geometry into 3D coordinates, and then manipulate these 3D coordinates before converting them back into 2D to display the path geometry normally. Does that sound like fun?

Coordinates in two-dimensional space are expressed as number pairs (X, Y), which correspond to a location on the screen; 3D coordinates are in the form (X, Y, Z) and, conceptually, the Z axis is orthogonal to the screen. Unless you’re dealing with a holographic display or a 3D printer, these Z coordinates aren’t nearly as real as X and Y coordinates.

There are other differences between 2D and 3D coordinate systems. Conventionally the 2D origin—the point (0, 0)—is the upper-left corner of the display device. The X coordinates increase to the right and Y coordinates increase going down. In 3D, very often the origin is in the center of the screen, and it’s more akin to a standard Cartesian coordinate system: The X coordinates still increase going to the right, but the Y coordinates increase going up, and there are negative coordinates as well. (Of course, the origin, scale, and orientation of these axes can be altered with matrix transforms, and usually are.)

Conceptually, the positive Z axis can either point out of the screen or point into the screen. These two conventions are known as “right-hand” and “left-hand” coordinate systems, referring to a technique to distinguish them: With a right-hand coordinate system, if you point the index finger of your right hand in the direction of the positive X axis, and the middle finger in the direction of positive Y, your thumb points to positive Z. Also, if you curve the fingers of your right hand from the positive X axis to the positive Y axis, your thumb points to positive Z. With a left-hand coordinate system, it’s the same except using the left hand.
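If the hand mnemonics feel slippery, a scalar triple product makes the distinction concrete: express the three axis directions in any reference right-handed frame and compute X · (Y × Z); the result is positive for a right-hand system and negative for a left-hand one. Here’s a throwaway check in plain C++ (nothing DirectX-specific), using screen-right, screen-up and out-of-the-screen as the reference directions:

#include <cstdio>

struct Vec3 { float x, y, z; };

static Vec3 Cross(Vec3 a, Vec3 b)
{
  return Vec3 { a.y * b.z - a.z * b.y,
                a.z * b.x - a.x * b.z,
                a.x * b.y - a.y * b.x };
}

static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

int main()
{
  // The axes used for the programs in this article,
  // expressed in a right-handed (right, up, out-of-screen) reference frame
  Vec3 X = { 1,  0, 0 };   // X increases to the right
  Vec3 Y = { 0, -1, 0 };   // Y increases going down the screen
  Vec3 Z = { 0,  0, 1 };   // Z comes out of the screen

  // Scalar triple product: +1 means right-handed, -1 means left-handed
  printf("%+.0f\n", Dot(X, Cross(Y, Z)));   // prints -1
  return 0;
}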

My goal here is to obtain a 2D path geometry of a short text string, and then twist it around the origin into a 3D ring so the beginning meets the end, similar to the illustration shown in Figure 1. Because I’ll be converting 2D coordinates to 3D coordinates and then back to 2D, I’ve chosen to use a 3D coordinate system with Y coordinates increasing going down, just like in 2D. The positive Z axis comes out of the screen, but it’s really a left-hand coordinate system.

Figure 1 The Coordinate System Used for the Programs in This Article

To make this whole job as easy as possible, I’ve used a font file stored as a program resource, and created an IDWriteFontFile object for obtaining the IDWriteFontFace object. Alternatively, you could obtain an IDWriteFontFace through a more roundabout method from the system font collection.
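That code isn’t shown here, but the gist of the direct route is just two calls. (This is a sketch under the assumption that the font ships as a loose .ttf file in the app package rather than a linked resource; dwriteFactory is an IDWriteFactory created elsewhere, and fontFilePath is an invented std::wstring holding the full path to the file in the installed package.)

Microsoft::WRL::ComPtr<IDWriteFontFile> fontFile;
DX::ThrowIfFailed(dwriteFactory->CreateFontFileReference(
  fontFilePath.c_str(),   // full path to the .ttf file
  nullptr,                // use the file's current timestamp
  &fontFile));
IDWriteFontFile* fontFiles[] = { fontFile.Get() };
Microsoft::WRL::ComPtr<IDWriteFontFace> fontFace;
DX::ThrowIfFailed(dwriteFactory->CreateFontFace(
  DWRITE_FONT_FACE_TYPE_TRUETYPE,
  1,                              // one file in the array
  fontFiles,
  0,                              // face index within the file
  DWRITE_FONT_SIMULATIONS_NONE,
  &fontFace));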

The ID2D1PathGeometry object generated from the GetGlyphRunOutline method is then passed through the Simplify method using the D2D1_GEOMETRY_SIMPLIFICATION_OPTION_LINES argument to flatten all Bézier curves into sequences of short lines. That simplified geometry is passed into a custom ID2D1GeometrySink implementation named FlattenedGeometrySink to further decompose all the straight lines into much shorter straight lines. The result is a completely malleable geometry consisting only of lines.
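In case you want to replicate that pipeline yourself, here’s a rough sketch of the flattening stage. The function and parameter names here are mine, not necessarily those in the downloadable project; glyphGeometry is the path geometry produced by GetGlyphRunOutline, and pFlattenedSink points to an instance of the custom sink.

// Replace all Bezier curves with line segments, feeding the custom sink
void FlattenGlyphGeometry(ID2D1PathGeometry* glyphGeometry,
                          ID2D1GeometrySink* pFlattenedSink)
{
  DX::ThrowIfFailed(glyphGeometry->Simplify(
    D2D1_GEOMETRY_SIMPLIFICATION_OPTION_LINES,  // Beziers become lines
    nullptr,                                    // no additional transform
    D2D1_DEFAULT_FLATTENING_TOLERANCE,
    pFlattenedSink));
}

// Inside such a sink, each incoming line segment can then be chopped into
// shorter pieces, for example by simple linear interpolation:
void SubdivideLine(D2D1_POINT_2F p0, D2D1_POINT_2F p1, int count,
                   std::vector<D2D1_POINT_2F>& out)
{
  for (int i = 1; i <= count; i++)
  {
    float t = float(i) / count;
    out.push_back(D2D1::Point2F(p0.x + t * (p1.x - p0.x),
                                p0.y + t * (p1.y - p0.y)));
  }
}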

To ease the manipulation of these coordinates, FlattenedGeometrySink generates a collection of Polygon objects. Figure 2 shows the definition of the Polygon structure. It’s basically just a collection of connected 2D points. Each Polygon object corresponds to a closed figure in the path geometry. Not all figures in path geometries are closed, but those in text glyphs are always closed, so this structure is fine for that purpose. Some characters (such as C, E and X) are described by just one Polygon; some (A, D and O) consist of two Polygon objects for the inside and outside; some (B, for example) consist of three; and some symbol characters may have many more.

Figure 2 The Polygon Class for Storing Closed Path Figures

struct Polygon
{
  // Constructors
  Polygon()
  {
  }
  Polygon(size_t pointCount)
  {
    Points = std::vector<D2D1_POINT_2F>(pointCount);
  }
  // Move constructor
  Polygon(Polygon && other) : Points(std::move(other.Points))
  {
  }
  std::vector<D2D1_POINT_2F> Points;
  static HRESULT CreateGeometry(ID2D1Factory* factory,
                                const std::vector<Polygon>& polygons,
                                ID2D1PathGeometry** pathGeometry);
};
HRESULT Polygon::CreateGeometry(ID2D1Factory* factory,
                                const std::vector<Polygon>& polygons,
                                ID2D1PathGeometry** pathGeometry)
{
  HRESULT hr;
  if (FAILED(hr = factory->CreatePathGeometry(pathGeometry)))
    return hr;
  Microsoft::WRL::ComPtr<ID2D1GeometrySink> geometrySink;
  if (FAILED(hr = (*pathGeometry)->Open(&geometrySink)))
    return hr;
  for (const Polygon& polygon : polygons)
  {
    if (polygon.Points.size() > 0)
    {
      geometrySink->BeginFigure(polygon.Points[0],
                                D2D1_FIGURE_BEGIN_FILLED);
      if (polygon.Points.size() > 1)
      {
        geometrySink->AddLines(polygon.Points.data() + 1,
                               polygon.Points.size() - 1);
      }
      geometrySink->EndFigure(D2D1_FIGURE_END_CLOSED);
    }
  }
  return geometrySink->Close();
}

Among the downloadable code for this column is a Windows Store program named CircularText that creates a collection of Polygon objects based on the text “Text in an Infinite Circle of,” where the end is intended to connect back to the beginning in a circle. The text string is actually specified in the program as “ext in an Infinite Circle of T” to avoid a space at the beginning or end that would disappear when a path geometry is generated from the glyphs.

The CircularTextRenderer class in the CircularText project contains two std::vector objects of type Polygon called m_srcPolygons (the original Polygon objects generated from the path geometry) and m_dstPolygons (the Polygon objects used to generate the rendered path geometry). Figure 3 shows the method CreateWindowSizeDependentResources that converts the source polygons to the destination polygons based on the size of the screen.

Figure 3 From 2D to 3D to 2D in the CircularText Program

void CircularTextRenderer::CreateWindowSizeDependentResources()
{
  // Get window size and geometry size
  Windows::Foundation::Size logicalSize = m_deviceResources->GetLogicalSize();
  float geometryWidth = m_geometryBounds.right - m_geometryBounds.left;
  float geometryHeight = m_geometryBounds.bottom - m_geometryBounds.top;
  // Calculate a few factors for converting 2D to 3D
  float radius = logicalSize.Width / 2 - 50;
  float circumference = 2 * 3.14159f * radius;
  float scale = circumference / geometryWidth;
  float height = scale * geometryHeight;
  for (size_t polygonIndex = 0; polygonIndex < m_srcPolygons.size(); polygonIndex++)
  {
    const Polygon& srcPolygon = m_srcPolygons.at(polygonIndex);
    Polygon& dstPolygon = m_dstPolygons.at(polygonIndex);
    for (size_t pointIndex = 0; pointIndex < srcPolygon.Points.size(); pointIndex++)
    {
      const D2D1_POINT_2F pt = srcPolygon.Points.at(pointIndex);
      float radians = 2 * 3.14159f * (pt.x - m_geometryBounds.left) / geometryWidth;
      float x = radius * sin(radians);
      float z = radius * cos(radians);
      float y = height * ((pt.y - m_geometryBounds.top) / geometryHeight - 0.5f);
      dstPolygon.Points.at(pointIndex) = Point2F(x, y);
    }
  }
  // Create path geometry from Polygon collection
  DX::ThrowIfFailed(
    Polygon::CreateGeometry(m_deviceResources->GetD2DFactory(),
                            m_dstPolygons,
                            &m_pathGeometry)
    );
}

In the inner loop, you’ll see x, y and z values calculated. This is a 3D coordinate but it’s not even saved. Instead, it’s immediately collapsed back into 2D by simply ignoring the z value. To calculate those 3D coordinates, the code first converts a horizontal position on the original path geometry to an angle in radians from 0 to 2π. The sin and cos functions calculate a position on a unit circle on the XZ plane. The y value is a more direct conversion from the vertical coordinates of the original path geometry.

The CreateWindowSizeDependentResources method concludes by obtaining a new ID2D1PathGeometry object from the destination Polygon collection. The Render method then sets a matrix transform to put the origin in the center of the screen, and both fills and outlines this path geometry, with the result shown in Figure 4.
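In outline, that Render logic is nothing more than a translation transform followed by FillGeometry and DrawGeometry calls. Here’s a sketch that assumes the DeviceResources helpers from the Visual Studio DirectX App template (which the project already uses); the brush member names are illustrative:

ID2D1DeviceContext* context = m_deviceResources->GetD2DDeviceContext();
Windows::Foundation::Size logicalSize = m_deviceResources->GetLogicalSize();
// Move the origin to the center of the screen
context->SetTransform(
  D2D1::Matrix3x2F::Translation(logicalSize.Width / 2, logicalSize.Height / 2) *
  m_deviceResources->GetOrientationTransform2D());
context->FillGeometry(m_pathGeometry.Get(), m_fillBrush.Get());
context->DrawGeometry(m_pathGeometry.Get(), m_outlineBrush.Get(), 1.0f);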

Figure 4 The CircularText Display

Is the program working? It’s hard to tell! Look closely and you can see some wide characters in the center and narrower characters at the left and right. But the big problem is that I started out with a path geometry with no intersecting lines, and now the geometry is displayed folded back over itself, with the result that overlapping areas are not filled. This effect is characteristic of self-intersecting geometries, and it happens regardless of whether the path geometry created by the Polygon structure has a fill mode of alternate or winding.

Getting Some Perspective

Three-dimensional graphics programming is not just about coordinate points. Visual cues are necessary for the viewer to interpret an image on a 2D screen as representing an object in 3D space. In the real world, you rarely view objects from a constant vantage point. You’d get a better view of the 3D text in Figure 4 if you could tilt it somewhat so it looks more like the ring in Figure 1.

To get some perspective on the three-dimensional text, the coordinates need to be rotated in space. As you know, Direct2D supports a matrix transform structure named D2D1_MATRIX_3X2_F that you can use to define 2D transforms, and which you apply to your 2D graphics output by calling the SetTransform method of ID2D1RenderTarget before drawing.

Most commonly, you would use a class named Matrix3x2F from the D2D1 namespace for this purpose. This class derives from D2D1_MATRIX_3X2_F and provides static methods for creating the standard transforms: translation, scaling, rotation and skew.

The Matrix3x2F class also defines a method named TransformPoint that allows you to apply the transform “manually” to individual D2D1_POINT_2F objects. This is useful for manipulating points before they’re rendered.
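To make the two usages concrete, here’s a tiny fragment; renderTarget stands in for whatever render target or device context you happen to be drawing with:

using namespace D2D1;
// A composite transform: scale first, then translate
Matrix3x2F transform = Matrix3x2F::Scale(2.0f, 2.0f) *
                       Matrix3x2F::Translation(100.0f, 50.0f);
// Applied to everything the render target draws...
renderTarget->SetTransform(transform);
// ...or applied "manually" to a single point
D2D1_POINT_2F pt = transform.TransformPoint(Point2F(10.0f, 20.0f));
// pt is now (120, 90)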

You may think I need a 3D rotation matrix to tilt the displayed text. I’ll certainly be exploring 3D matrix transforms in future columns, but for now I can make do with 2D rotation. Imagine yourself situated somewhere on the negative X axis of Figure 1, looking toward the origin. The positive Z and Y axes are situated just like the X and Y axes in a conventional 2D coordinate system, so it seems plausible that by applying a 2D rotation matrix to the Z and Y values, I can rotate all the coordinates around the three-dimensional X axis.

You can experiment with this in the CircularText program. Create a 2D rotation matrix in the CreateWindowSizeDependentResources method sometime before the Polygon coordinates are manipulated:

Matrix3x2F tiltMatrix = Matrix3x2F::Rotation(-8);

That’s a rotation of -8 degrees, and the negative sign indicates a counterclockwise rotation. In the inner loop, after x, y, and z have been calculated, apply that transform to the z and y values as if they were x and y values:

D2D1_POINT_2F tiltedPoint =
  tiltMatrix.TransformPoint(Point2F(z, y));
z = tiltedPoint.x;
y = tiltedPoint.y;

Figure 5 shows what you’ll see.

Figure 5 The Tilted CircularText Display

This is much better, but it still has issues. Ugly things happen when the geometry overlaps, and there’s nothing to suggest which part of the geometry is nearer to you and which is farther away. Stare at it, and you might experience some perspective shift.

Still, the ability to apply 3D transforms to this object suggests that it might also be easy to rotate the object around the Y axis—and it is. If you imagine viewing the origin from the positive Y axis, you’ll see that the X and Z axes are oriented the same way as the X and Y axes in a 2D coordinate system.

The SpinningCircularText project implements two rotation transforms to spin the text and tilt it. All the computational logic previously in CreateWindowSizeDependentResources has been moved into the Update method. The 3D points are rotated twice: once around the Y axis based on elapsed time, and then around the X axis based on the user sweeping a finger up and down the screen. This Update method is shown in Figure 6.

Figure 6 The Update Method of SpinningCircularText

void SpinningCircularTextRenderer::Update(DX::StepTimer const& timer)
{
  // Get window size and geometry size
  Windows::Foundation::Size logicalSize = m_deviceResources->GetLogicalSize();
  float geometryWidth = m_geometryBounds.right - m_geometryBounds.left;
  float geometryHeight = m_geometryBounds.bottom - m_geometryBounds.top;
  // Calculate a few factors for converting 2D to 3D
  float radius = logicalSize.Width / 2 - 50;
  float circumference = 2 * 3.14159f * radius;
  float scale = circumference / geometryWidth;
  float height = scale * geometryHeight;
  // Calculate rotation matrix
  float rotateAngle = -360 * float(fmod(timer.GetTotalSeconds(), 10)) / 10;
  Matrix3x2F rotateMatrix = Matrix3x2F::Rotation(rotateAngle);
  // Calculate tilt matrix
  Matrix3x2F tiltMatrix = Matrix3x2F::Rotation(m_tiltAngle);
  for (size_t polygonIndex = 0; polygonIndex < m_srcPolygons.size(); polygonIndex++)
  {
    const Polygon& srcPolygon = m_srcPolygons.at(polygonIndex);
    Polygon& dstPolygon = m_dstPolygons.at(polygonIndex);
    for (size_t pointIndex = 0; pointIndex < srcPolygon.Points.size(); pointIndex++)
    {
      const D2D1_POINT_2F pt = srcPolygon.Points.at(pointIndex);
      float radians = 2 * 3.14159f * (pt.x - m_geometryBounds.left) / geometryWidth;
      float x = radius * sin(radians);
      float z = radius * cos(radians);
      float y = height * ((pt.y - m_geometryBounds.top) / geometryHeight - 0.5f);
      // Apply rotation to X and Z
      D2D1_POINT_2F rotatedPoint = rotateMatrix.TransformPoint(Point2F(x, z));
      x = rotatedPoint.x;
      z = rotatedPoint.y;
      // Apply tilt to Y and Z
      D2D1_POINT_2F tiltedPoint = tiltMatrix.TransformPoint(Point2F(y, z));
      y = tiltedPoint.x;
      z = tiltedPoint.y;
      dstPolygon.Points.at(pointIndex) = Point2F(x, y);
    }
  }
  // Create path geometry from Polygon collection
  DX::ThrowIfFailed(
    Polygon::CreateGeometry(m_deviceResources->GetD2DFactory(),
                            m_dstPolygons,
                            &m_pathGeometry)
    );
  // Update FPS display text
  uint32 fps = timer.GetFramesPerSecond();
  m_text = (fps > 0) ? std::to_wstring(fps) + L" FPS" : L" - FPS";
}

It’s well-known that composite matrix transforms are equivalent to matrix multiplications, and because matrix multiplications aren’t commutative, neither are composite transforms. Try switching around the application of the tilt and rotate transforms for a different effect (which you might actually prefer).
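Concretely, that just means swapping the two TransformPoint calls in the inner loop of Figure 6 so the tilt is applied before the spin:

// Tilt around the X axis first...
D2D1_POINT_2F tiltedPoint = tiltMatrix.TransformPoint(Point2F(y, z));
y = tiltedPoint.x;
z = tiltedPoint.y;
// ...then spin around the Y axis
D2D1_POINT_2F rotatedPoint = rotateMatrix.TransformPoint(Point2F(x, z));
x = rotatedPoint.x;
z = rotatedPoint.y;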

When creating the SpinningCircularText program, I adapted the SampleFpsTextRenderer class created by the Visual Studio template to create the SpinningCircularTextRenderer class, but I left in the display of the rendering rate. This allows you to see how bad the performance is. On my Surface Pro, I see a frames per second (FPS) figure of 25 in Debug mode, which indicates the code is not keeping up with the refresh rate of the video display.

If you don’t like that performance, I’m afraid I have some bad news: I’m going to make it even worse.

Separating Foreground from Background

The biggest problem with the path geometry approach to 3D is the effect of overlapping areas. Is it possible to avoid those overlaps? The image this program is attempting to draw is not all that complex. At any time, there’s a front view of part of the text and a back view of the rest of the text, and the front view should always be displayed on top of the back view. If it were possible to separate the path geometry into two path geometries—one for the background and one for the foreground—you could render those path geometries with separate FillGeometry calls so the foreground would be on top of the background. These two path geometries could even be rendered with different brushes.
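This is just the painter’s algorithm in two calls. Here’s a sketch of the idea; the geometry and brush member names are illustrative, not necessarily those used in the project:

// Draw the back half first, then the front half on top of it
context->FillGeometry(m_bgPathGeometry.Get(), m_bgBrush.Get());
context->FillGeometry(m_fgPathGeometry.Get(), m_fgBrush.Get());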

Consider the original path geometry created by the GetGlyphRunOutline method. That’s just a flat 2D path geometry occupying a rectangular area. Eventually, half of that geometry is displayed in the foreground, and the other half is displayed in the background. But by the time the Polygon objects are obtained, it’s too late to make that split with anything like computational ease.

Instead, the original path geometry needs to be broken in half before the Polygon objects are obtained. This break is dependent on the rotation angle, which means that much more logic must be moved into the Update method.

The original path geometry can be split in half with two calls to the CombineWithGeometry method. This method combines two geometries in various ways to make a third geometry. The two geometries that are combined are the original path geometry that describes the text outlines and a rectangle geometry that defines a subset of the path geometry. This subset appears in either the foreground or background, depending on the rotation angle.

For example, if the rotation angle is 0, the rectangle geometry must cover the central half of the path geometry of the text outlines. This is the part of the original geometry that appears in the foreground. Calling CombineWithGeometry with the D2D1_COMBINE_MODE_INTERSECT mode returns a path geometry consisting only of that central area, while calling CombineWithGeometry with D2D1_COMBINE_MODE_EXCLUDE obtains a path geometry of the remainder—the parts on the left and right. These two path geometries can then be converted to Polygon objects separately for manipulation of the coordinates, followed by a conversion back to separate path geometries for rendering.
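Here’s a sketch of that split. The variable names and the rectangle are illustrative: srcGeometry is the flat text-outline geometry, factory is the ID2D1Factory, and foregroundRect is assumed to have been computed from the rotation angle.

Microsoft::WRL::ComPtr<ID2D1RectangleGeometry> rectGeometry;
DX::ThrowIfFailed(factory->CreateRectangleGeometry(foregroundRect, &rectGeometry));

// Foreground: the part of the text inside the rectangle
Microsoft::WRL::ComPtr<ID2D1PathGeometry> fgGeometry;
DX::ThrowIfFailed(factory->CreatePathGeometry(&fgGeometry));
Microsoft::WRL::ComPtr<ID2D1GeometrySink> fgSink;
DX::ThrowIfFailed(fgGeometry->Open(&fgSink));
DX::ThrowIfFailed(srcGeometry->CombineWithGeometry(
  rectGeometry.Get(), D2D1_COMBINE_MODE_INTERSECT,
  nullptr, D2D1_DEFAULT_FLATTENING_TOLERANCE, fgSink.Get()));
DX::ThrowIfFailed(fgSink->Close());

// Background: everything outside the rectangle
Microsoft::WRL::ComPtr<ID2D1PathGeometry> bgGeometry;
DX::ThrowIfFailed(factory->CreatePathGeometry(&bgGeometry));
Microsoft::WRL::ComPtr<ID2D1GeometrySink> bgSink;
DX::ThrowIfFailed(bgGeometry->Open(&bgSink));
DX::ThrowIfFailed(srcGeometry->CombineWithGeometry(
  rectGeometry.Get(), D2D1_COMBINE_MODE_EXCLUDE,
  nullptr, D2D1_DEFAULT_FLATTENING_TOLERANCE, bgSink.Get()));
DX::ThrowIfFailed(bgSink->Close());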

This logic is part of a project named OccludedCircularText, which implements the Render method by filling the two geometries with different brushes, as shown in Figure 7.

Figure 7 The OccludedCircularText Display

Now it’s much more obvious what’s in the foreground and what’s in the background. Yet, so much computation has been moved to the Update method that performance is very poor.

In a conventional 2D programming environment, I would’ve exhausted all the 2D programming tools at my disposal and now be stuck with this terrible performance. Direct2D, however, offers an alternative approach to rendering the geometry that simplifies the logic and improves performance immensely. This solution makes use of the most basic 2D polygon—which is a polygon that also plays a major role in 3D programming.

I speak, of course, of the humble triangle.


Charles Petzold is a longtime contributor to MSDN Magazine and the author of “Programming Windows, 6th edition” (Microsoft Press, 2013), a book about writing applications for Windows 8. His Web site is charlespetzold.com.

Thanks to the following technical expert for reviewing this article: Jim Galasyn (Microsoft)