This topic describes techniques for improving Direct2D applications to achieve fast and high-quality rendering results. It contains the following parts.
- Resource Usage
- Restrict the Use of Flush
- General Guidelines for Rendering Static Content
- Bitmaps
- Use Tiled Bitmap over Dashing
- Drawing Text with Direct2D
- Know Your Pixel Format
- Geometry Rendering
- Make Clips Your First Choice
- GDI Interoperability: Avoid Frequent Switches
- DXGI Interoperability: Avoid Frequent Switches
- Avoid a Large Rectangle Size for the BindDC Function
- Driver Recommendation
- Scene Complexity
- Conclusion
Resource Usage
Before understanding how resources can be used efficiently, you must understand what is meant by a resource. A resource refers to an allocation of some kind, either in video or system memory. Bitmaps and brushes are examples of resources.
In Direct2D, resources can be created both in software and hardware. Resource creation and deletion on hardware are expensive operations because they require lots of overhead for communicating with the video card. Let's see how Direct2D renders content to a target.
In Direct2D, all the rendering commands are enclosed between a call to BeginDraw and a call to EndDraw. These calls indicate that a RenderTarget is being used by the Direct2D API. The BeginDraw method must be called before rendering operations can be called. After BeginDraw is called, a context typically builds up a batch of rendering commands, but delays processing of these commands until one of the following statements is true:
- EndDraw is encountered: When EndDraw is called, it causes any batched drawing operations to complete and returns the status of the operation.
- An explicit call to Flush is made: The Flush method causes the batch to be processed and all pending commands to be issued.
- The buffer holding the rendering commands is full: The buffer that holds the rendering commands is finite. If this buffer becomes full before the previous two conditions are fulfilled, the rendering commands will be flushed out.
Until the primitives are flushed, they will keep internal references to corresponding resources like bitmaps and brushes.
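The three flush conditions above can be illustrated with a small toy model. This is only a sketch of the batching behavior described here, not Direct2D's implementation: the CommandBatch class, its capacity, and its counters are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of deferred command batching. NOT Direct2D's implementation;
// the class name, capacity, and counters are hypothetical.
class CommandBatch {
public:
    explicit CommandBatch(std::size_t capacity) : capacity_(capacity) {}

    // Models BeginDraw: start collecting commands.
    void beginDraw() { drawing_ = true; }

    // Queues one command. When the buffer becomes full, the pending
    // commands are processed without any explicit request.
    void draw(const std::string& command) {
        assert(drawing_ && "draw() must be called between beginDraw() and endDraw()");
        pending_.push_back(command);
        if (pending_.size() >= capacity_) process();
    }

    // Models Flush: process the batch now and keep drawing.
    void flush() { process(); }

    // Models EndDraw: process whatever is left and stop drawing.
    void endDraw() {
        process();
        drawing_ = false;
    }

    std::size_t pendingCount() const { return pending_.size(); }
    std::size_t processedCount() const { return processed_; }
    std::size_t batchCount() const { return batches_; }

private:
    void process() {
        if (pending_.empty()) return;
        processed_ += pending_.size();
        ++batches_;
        pending_.clear();  // references held by queued commands end here
    }

    std::size_t capacity_;
    std::vector<std::string> pending_;
    std::size_t processed_ = 0;
    std::size_t batches_ = 0;
    bool drawing_ = false;
};
```

In this model, five draw calls against a capacity of four produce one automatic batch when the buffer fills and a second batch at endDraw. Every extra call to flush adds another batch, which is why explicit flushing tends to cost performance.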
Reuse Resources
As already mentioned, resource creation and deletion is expensive on hardware. So you should reuse resources when possible. Take the example of bitmap creation in game development. Usually, bitmaps that make up a scene in a game are all created at the same time with all the different variations that are required for later frame-to-frame rendering. At the time of actual scene rendering and re-rendering, these bitmaps are reused instead of re-created.
Note This rule does not apply to the window resize operation. When a window is resized, some scale-dependent resources such as compatible render targets and possibly some layer resources must be re-created because the window content has to be redrawn. This can be important for maintaining the overall quality of the rendered scene.
Restrict the Use of Flush
Because the Flush method also causes the batched rendering commands to be processed, you should typically avoid using it. For most common scenarios, resource management should be left to Direct2D.
General Guidelines for Rendering Static Content
Generally, for static content that does not change from frame to frame, use compatible render targets, A8 targets, and meshes. For rendering to occur, an application must issue commands to a Direct2D RenderTarget. RenderTargets receive drawing operations, and store the result of these operations. For example, to draw to an HWND, you create an ID2D1HwndRenderTarget.
Let's examine each kind of render target in more detail:
- Compatible render target: Direct2D can target different kinds of off-screen surfaces. One of them is for creating internal Direct2D intermediate textures. This kind of target can be used for intermediate off-screen drawing. When rendering static content, instead of writing directly into the destination render target, use a compatible render target. This helps reduce the cost of performing a blend with the background.
- A8 target: A8 is a kind of pixel format which represents an alpha channel with 8 bits. A8 targets are useful for drawing geometry as a mask. For scenarios where the opacity of static content must be manipulated, instead of manipulating the content itself, the opacity of the mask can be translated, rotated, skewed, or scaled.
- Meshes: An ID2D1Mesh is a set of vertices represented as (x,y) coordinates that form a list of triangles. In scenarios where you would want to draw aliased geometries, it is faster to convert the geometry into a mesh and render the mesh instead of drawing the geometry itself. Meshes supply the vertex data which is directly consumable by the GPU without additional processing. For geometry, you have to evaluate the geometry and retrieve the vertices before rendering.
Bitmaps
As mentioned earlier, resource creation and deletion are expensive operations in hardware. Bitmaps are among the most frequently used resources and are costly to create on the video card, so reusing them can make the application faster.
Create Large Bitmaps
Video cards typically have a minimum memory allocation size. If an allocation is requested that is smaller than this, a resource of this minimum size is allocated and the surplus memory is wasted and unavailable for other things. If many small bitmaps are required, a better technique is to allocate one large bitmap and store all the small bitmap contents in this large bitmap. Then subareas of the larger bitmap can be read where the smaller bitmaps are needed. This is also known as an atlas, and it has the benefit of reducing bitmap creation overhead and the memory waste of small bitmap allocations. We recommend that you try to keep most bitmaps to at least 64 KB and limit the number of bitmaps that are smaller than 4 KB.
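The memory cost of many small allocations can be estimated with simple arithmetic. The sketch below assumes a minimum allocation size of 64 KB purely for illustration; the real value depends on the video card and driver. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Memory actually consumed by one allocation when the hardware never
// hands out less than minimumSize bytes (an assumed, hardware-specific value).
std::size_t allocatedSize(std::size_t requested, std::size_t minimumSize) {
    assert(minimumSize > 0);
    return std::max(requested, minimumSize);
}

// Total consumed when every bitmap gets its own allocation.
std::size_t separateAllocations(const std::vector<std::size_t>& sizes,
                                std::size_t minimumSize) {
    std::size_t total = 0;
    for (std::size_t s : sizes) total += allocatedSize(s, minimumSize);
    return total;
}

// Total consumed when all bitmaps share a single large allocation.
// Padding between entries is ignored in this sketch.
std::size_t atlasAllocation(const std::vector<std::size_t>& sizes,
                            std::size_t minimumSize) {
    std::size_t sum = 0;
    for (std::size_t s : sizes) sum += s;
    return allocatedSize(sum, minimumSize);
}
```

Under that assumption, one hundred 32 x 32 icons at 4 bytes per pixel (4,096 bytes each) would consume 6,553,600 bytes as separate allocations, but only 409,600 bytes when stored in one large bitmap.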
Create an Atlas of Bitmaps
There are some common scenarios for which a bitmap atlas can serve very well. Small bitmaps can be stored inside a large bitmap. When you need one of these small bitmaps, you pull it out of the larger bitmap by specifying the source rectangle that it occupies. For example, an application has to draw multiple icons. All the bitmaps associated with the icons can be loaded into a large bitmap up front, and at rendering time each icon can be retrieved from the large bitmap.
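Locating an entry in an atlas is a matter of computing its rectangle. The sketch below assumes a simple grid atlas in which every tile has the same size and tiles are stored row by row. RectF is a stand-in with the same field order as D2D1_RECT_F, and atlasSourceRect is a hypothetical helper. The resulting rectangle is what an application would pass as the source rectangle when drawing from the atlas with ID2D1RenderTarget::DrawBitmap.

```cpp
#include <cassert>

// Stand-in with the same field order as D2D1_RECT_F.
struct RectF {
    float left;
    float top;
    float right;
    float bottom;
};

// Returns the region occupied by tile number `index` in a grid atlas
// whose tiles are laid out row by row, `columns` tiles per row.
RectF atlasSourceRect(unsigned index, unsigned columns,
                      float tileWidth, float tileHeight) {
    assert(columns > 0);
    const unsigned col = index % columns;
    const unsigned row = index / columns;
    const float left = static_cast<float>(col) * tileWidth;
    const float top = static_cast<float>(row) * tileHeight;
    return RectF{left, top, left + tileWidth, top + tileHeight};
}
```

For an atlas with four 32 x 32 tiles per row, tile 5 sits in the second row and second column, so its rectangle spans from (32, 32) to (64, 64).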
Note A Direct2D bitmap created in video memory is limited to the maximum bitmap size supported by the adapter on which it is stored. Creating a bitmap larger than the size supported by the underlying hardware might result in an error.
Create Shared Bitmaps
Creating shared bitmaps enables advanced callers to create Direct2D bitmap objects that are backed directly by an existing object, already compatible with the render target. This avoids creating multiple surfaces and helps in reducing performance overhead.
Note Shared bitmaps are usually limited to software targets or to targets interoperable with DXGI.
Copying Bitmaps
Creating a DXGI surface is an expensive operation so you should reuse existing surfaces when you can. Even in software, if a bitmap is mostly in the desired form except for a small portion, it is better to update that portion than to throw the whole bitmap away and re-create everything. Although you can use CreateCompatibleRenderTarget to achieve the same results, rendering is generally a much more expensive operation than copying. This is because, to improve cache locality, the hardware does not actually store a bitmap in the same memory order that the bitmap is addressed. Instead, it is "swizzled." The swizzling is hidden from the CPU either by the driver (which is slow and used only on lower-end parts), or by the memory manager on the GPU. Because of constraints on how data is written into a render target when rendering, render targets are typically either not swizzled, or swizzled in a way that is less optimal than can be achieved if you know that you never have to render to the surface. Therefore, the CopyFrom* methods are provided for copying rectangles from a source to the Direct2D bitmap.
CopyFrom can be used in any of its three forms:
- CopyFromBitmap
- CopyFromRenderTarget
- CopyFromMemory
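Updating only the changed portion of a bitmap amounts to a pitch-aware rectangle copy. The sketch below models that operation on a plain memory buffer, in the spirit of ID2D1Bitmap::CopyFromMemory, which takes a destination rectangle, a pointer to the source data, and the source pitch. It is not the Direct2D implementation; RectU is a stand-in with the same field order as D2D1_RECT_U, and copyRectFromMemory is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in with the same field order as D2D1_RECT_U.
struct RectU {
    std::uint32_t left;
    std::uint32_t top;
    std::uint32_t right;
    std::uint32_t bottom;
};

// Copies source rows into the dstRect region of a destination pixel
// buffer, leaving every pixel outside dstRect untouched.
// Both pitches are row strides in bytes.
void copyRectFromMemory(std::vector<std::uint8_t>& dst, std::size_t dstPitch,
                        const RectU& dstRect,
                        const std::uint8_t* src, std::size_t srcPitch,
                        std::size_t bytesPerPixel) {
    assert(src != nullptr);
    assert(dstRect.left <= dstRect.right && dstRect.top <= dstRect.bottom);
    const std::size_t rowBytes = (dstRect.right - dstRect.left) * bytesPerPixel;
    assert(rowBytes <= srcPitch);
    assert(dstRect.right * bytesPerPixel <= dstPitch);
    assert(dstRect.bottom * dstPitch <= dst.size());

    for (std::uint32_t y = dstRect.top; y < dstRect.bottom; ++y) {
        std::uint8_t* dstRow = dst.data() + y * dstPitch + dstRect.left * bytesPerPixel;
        const std::uint8_t* srcRow = src + (y - dstRect.top) * srcPitch;
        std::memcpy(dstRow, srcRow, rowBytes);
    }
}
```

Only the rows and columns inside the destination rectangle are written, so the cost scales with the size of the changed region rather than with the size of the whole bitmap.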
Use Tiled Bitmap over Dashing
Rendering a dashed line is a very expensive operation because of the high quality and accuracy of the underlying algorithm. For most of the cases not involving rectilinear geometries, the same effect can be generated faster by using tiled bitmaps.
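One way to prepare such a tile is to generate a single dash-plus-gap period as pixel data. The sketch below builds that period as 8-bit alpha values. The makeDashTile helper is hypothetical, and the assumption is that the application would then copy the data into a bitmap and tile it with a bitmap brush whose extend mode wraps along the line direction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Builds one period of a horizontal dash pattern as 8-bit alpha values,
// stored row by row. Opaque pixels (255) form the dash and transparent
// pixels (0) form the gap. The tile is (dashLength + gapLength) pixels
// wide and `thickness` pixels tall.
std::vector<std::uint8_t> makeDashTile(unsigned dashLength, unsigned gapLength,
                                       unsigned thickness) {
    assert(dashLength + gapLength > 0);
    assert(thickness > 0);
    const std::size_t width = static_cast<std::size_t>(dashLength) + gapLength;
    std::vector<std::uint8_t> tile(width * thickness, 0);
    for (unsigned y = 0; y < thickness; ++y) {
        for (unsigned x = 0; x < dashLength; ++x) {
            tile[static_cast<std::size_t>(y) * width + x] = 255;
        }
    }
    return tile;
}
```

Because the tile repeats exactly, this approach suits straight, axis-aligned strokes best. Curved or rotated paths would still need a real dashed stroke to look correct.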
Drawing Text with Direct2D
Direct2D text rendering functionality is offered in two parts. The first part, exposed as the ID2D1RenderTarget::DrawText and ID2D1RenderTarget::DrawTextLayout methods, enables a caller to pass either a string and formatting parameters or a DirectWrite text layout object for multiple formats. This should be suitable for most callers. The second part, exposed as the ID2D1RenderTarget::DrawGlyphRun method, provides rasterization for callers who already know the position of the glyphs they want to render. The following two general rules can help improve text performance when drawing in Direct2D.
DrawTextLayout Vs. DrawText
Both DrawText and DrawTextLayout enable an application to easily render text that is formatted by the DirectWrite API. DrawTextLayout draws an existing IDWriteTextLayout object to the render target, and DrawText constructs a DirectWrite layout for the caller, based on the parameters that are passed in. If the same text has to be rendered multiple times, use DrawTextLayout instead of DrawText, because DrawText creates a layout every time that it is called.
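The saving comes from building the layout once and reusing it across frames. The sketch below models that pattern with stand-in types: TextLayout represents an expensive-to-build layout object (an IDWriteTextLayout in a real application), and CachedLabel is a hypothetical helper that rebuilds the layout only when the text or the available width changes.

```cpp
#include <cassert>
#include <string>

// Stand-in for a layout object that is expensive to build.
// A real application would hold an IDWriteTextLayout here.
struct TextLayout {
    std::wstring text;
    float maxWidth;
};

// Hypothetical helper: keeps one layout and rebuilds it only when needed.
class CachedLabel {
public:
    // Returns the cached layout, rebuilding it only if the text or the
    // maximum width differs from the previous call.
    const TextLayout& layout(const std::wstring& text, float maxWidth) {
        assert(maxWidth >= 0.0f);
        if (!valid_ || cached_.text != text || cached_.maxWidth != maxWidth) {
            cached_ = TextLayout{text, maxWidth};  // the expensive step
            valid_ = true;
            ++buildCount_;
        }
        return cached_;
    }

    // Number of times the layout was actually rebuilt.
    unsigned buildCount() const { return buildCount_; }

private:
    TextLayout cached_{};
    bool valid_ = false;
    unsigned buildCount_ = 0;
};
```

Drawing the same label for 60 frames builds the layout once in this model, whereas constructing a layout on every call would build it 60 times.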
Aliased Vs. AntiAliased Text
Generally, use aliased text over antialiased if text quality is not the top priority. For example, if the text is very small and is animating back and forth constantly, it is difficult to differentiate between aliased and antialiased text.
Know Your Pixel Format
When you create a render target, you can use the D2D1_PIXEL_FORMAT structure to specify the pixel format and alpha mode used by the render target. An alpha channel is part of the pixel format that specifies the coverage value or opacity information. If a render target does not use the alpha channel, it should be created by using the D2D1_ALPHA_MODE_IGNORE alpha mode. This spares the time that is spent on rendering an alpha channel that is not needed.
For more information about pixel formats and alpha modes, see Supported Pixel Formats and Alpha Modes.
Geometry Rendering
Use Specific DrawPrimitive over DrawGeometry
Use more specific DrawPrimitive calls like DrawRectangle over generic DrawGeometry calls. This is because with DrawRectangle, the geometry is already known so rendering is faster.
Make Clips Your First Choice
Let's examine a very common application scenario: A region of the render target has to be clipped. The region to be clipped is aligned to the axis of the render target. This case is suited for using a clip rectangle instead of a layer. The performance gain is more for aliased geometry than antialiased geometry. Generally, you should not use layers unless the effect cannot be achieved otherwise. Each layer is associated with a backing surface that stores the results produced between PushLayer/PopLayer functions. This backing surface creation is like any other surface creation and is expensive.
GDI Interoperability: Avoid Frequent Switches
Direct2D can interoperate seamlessly with GDI. This is very useful for extending existing GDI applications by using new Direct2D rendering and for adding existing GDI code to newer Direct2D rendered applications.
Although the Direct2D API is designed in a way that makes interleaving GDI and Direct2D code very easy, be careful about switching between GDI and Direct2D rendering frequently, because it can decrease performance significantly.
DXGI Interoperability: Avoid Frequent Switches
Direct2D can interoperate seamlessly with Direct3D 10.1 surfaces. This is very useful for creating applications that render a mixture of 2D and 3D content. But each switch between drawing Direct2D and Direct3D content affects performance.
When rendering to a DXGI surface, Direct2D saves the state of the Direct3D devices while rendering and restores it when rendering is completed. Every time that a batch of Direct2D rendering is completed, the cost of this save and restore and the cost of flushing all the 2D operations are paid, and yet, the Direct3D device is not flushed. Therefore, to increase performance, limit the number of rendering switches between Direct2D and Direct3D.
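A frame's draw list can be inspected for the number of switches it causes. The sketch below counts those switches and shows one possible reordering. All names are hypothetical, and grouping is only valid under the assumption that reordering does not change the final image, for example when all 2D content is an overlay drawn on top of the 3D scene.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class Api { Direct2D, Direct3D };

// One entry in a frame's draw list (hypothetical structure).
struct DrawItem {
    Api api;
    int id;
};

// Counts how often consecutive items use a different API.
std::size_t countSwitches(const std::vector<DrawItem>& items) {
    std::size_t switches = 0;
    for (std::size_t i = 1; i < items.size(); ++i) {
        if (items[i].api != items[i - 1].api) ++switches;
    }
    return switches;
}

// Moves all Direct3D items ahead of all Direct2D items while keeping the
// relative order inside each group. ASSUMPTION: the 2D content may be
// drawn after all 3D content without changing the result.
std::vector<DrawItem> groupByApi(std::vector<DrawItem> items) {
    std::stable_partition(items.begin(), items.end(),
                          [](const DrawItem& d) { return d.api == Api::Direct3D; });
    assert(countSwitches(items) <= 1);
    return items;
}
```

A list that alternates between the two APIs five times incurs four switches. After grouping, the same work incurs one.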
Avoid a Large Rectangle Size for the BindDC Function
To bind a GDI device context to a Direct2D render target, use the ID2D1DCRenderTarget::BindDC method. It takes a rectangle subregion as one of its parameters. For the best performance, use the smallest possible rectangle for this binding.
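One way to keep the bound rectangle small is to track the regions that actually changed and bind only their bounding rectangle. The sketch below computes that bounding rectangle. Rect is a stand-in with the same field order as the Win32 RECT structure, and boundingRect is a hypothetical helper; whether dirty-region tracking is practical depends on the application.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in with the same field order as the Win32 RECT structure.
struct Rect {
    long left;
    long top;
    long right;
    long bottom;
};

// Returns the smallest rectangle that contains every dirty region.
Rect boundingRect(const std::vector<Rect>& dirty) {
    assert(!dirty.empty());
    Rect result = dirty.front();
    for (const Rect& d : dirty) {
        result.left = std::min(result.left, d.left);
        result.top = std::min(result.top, d.top);
        result.right = std::max(result.right, d.right);
        result.bottom = std::max(result.bottom, d.bottom);
    }
    return result;
}
```

If the changed regions are far apart, the bounding rectangle can still be large. In that case it may be worth comparing its area against the full surface before deciding how to bind.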
Driver Recommendation
For the best performance when rendering in hardware, use a WDDM 1.1 driver for Windows 7.
To know whether WDDM 1.1 drivers are being used, run DxDiag from the Run dialog box on the Start menu. If WDDM 1.1 drivers are present, the number will be displayed explicitly. Otherwise, nothing will be displayed.
The following screen shot shows where you can find the driver information.

Warning If the WDDM 1.1 driver is missing on Windows Vista with Service Pack 2 (SP2), the performance of Direct2D/GDI interop degrades.
Scene Complexity
When you analyze performance hot spots in a scene that will be rendered, knowing whether the scene is fill-rate bound or vertex bound can provide useful information.
- Fill Rate: Fill rate refers to the number of pixels that a graphics card can render and write to video memory per second.
- Vertex Bound: A scene is vertex bound when it contains lots of complex geometry.
Understand Scene Complexity
You can analyze your scene complexity by altering the size of the render target. If performance gains are visible for a proportional reduction in size of the render target, then the application is fill-rate bound. Otherwise, the scene complexity is the performance bottleneck.
When a scene is fill-rate bound, reducing the size of the render target can improve the performance. This is because the number of pixels to be rendered will be reduced proportionally with the size of the render target.
When a scene is vertex bound, reduce the complexity of the geometry. But remember that this is done at the expense of image quality. Therefore, a careful tradeoff decision should be made between the desired quality and the performance that is required.
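The resizing experiment can be turned into a simple check on measured frame times. The sketch below compares the time saved at the reduced size with the saving that a perfectly fill-rate-bound scene would show. The function name and the 0.5 threshold are assumptions for illustration, and real measurements should be averaged over many frames.

```cpp
#include <cassert>

enum class Bottleneck { FillRate, SceneComplexity };

// fullMs:    frame time at the original render target size.
// reducedMs: frame time after shrinking the render target.
// areaScale: reduced pixel count divided by original pixel count (0..1).
// threshold: fraction of the ideal saving that must be observed to call
//            the scene fill-rate bound (0.5 is an arbitrary assumption).
Bottleneck classifyBottleneck(double fullMs, double reducedMs,
                              double areaScale, double threshold = 0.5) {
    assert(fullMs > 0.0);
    assert(areaScale > 0.0 && areaScale < 1.0);
    // A purely fill-rate-bound frame would take fullMs * areaScale.
    const double idealSaving = fullMs * (1.0 - areaScale);
    const double actualSaving = fullMs - reducedMs;
    return (actualSaving / idealSaving >= threshold)
               ? Bottleneck::FillRate
               : Bottleneck::SceneComplexity;
}
```

For example, if halving both dimensions (a quarter of the pixels) drops a 16 ms frame to 4.5 ms, nearly all of the ideal 12 ms saving is realized and the scene is classified as fill-rate bound. If the frame only drops to 15 ms, geometry complexity is the more likely bottleneck.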
Conclusion
Although Direct2D is hardware accelerated and is meant for high performance, it is very important to know how to use the features correctly to maximize throughput. The techniques that are discussed here are derived from studying common scenarios and might not apply to all application scenarios. Therefore, careful understanding of application behavior and performance goals can help achieve the results that you want.
Build date: 3/7/2012