{ End Bracket }

Improving Managed DirectX Performance

Tom Miller

It seems that at least twice a week, I am asked about poor performance in Managed DirectX®. This frequency is actually a big improvement over the 5-10 times a week I was asked a few years ago when the technology first came out. Many of the people who are posing the questions begin with the basic assumption that managed code is slow, and therefore assume that Managed DirectX must also be slow.

Many different types of developers hold this misconception, from the hardcore game developer to programmers within my own DirectX team. In one case, a co-worker was writing a "proof of concept" prototype using Managed DirectX. He was having performance problems, and had concluded it was simply because managed code was slow. I offered to look at it more closely.

I noticed that it indeed ran very slowly, so much so, in fact, that I first thought it was written that way intentionally. After a review of the code, though, it became clear that this wasn't the case. This situation spawned a design review of the Managed DirectX libraries, resulting in changes to the API. This was my first look at an actual customer scenario, at how a real customer might use the API, and it was quite an eye opener.

The biggest performance problem was the complete lack of resource buffers. These buffers are used to store vertex and index data and are optimized to send data to the graphics card efficiently. This developer's application was not using buffers at all, but was instead passing in the entire set of data each and every time the application wanted to draw it. The developer was unaware that this would cause the same data to be copied to the graphics card on each call, even if it hadn't changed. Fixing this class of issues resolved a large part of the application's performance problems.
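To put rough numbers on that difference, here is a minimal, language-neutral sketch in Python. It is not Managed DirectX code: FakeGraphicsCard and all of its methods are hypothetical stand-ins that only count how many bytes would cross to the card, and the mesh size and frame count are made up for illustration.

```python
class FakeGraphicsCard:
    """Hypothetical stand-in for a graphics card; it only counts bytes copied to it."""

    def __init__(self):
        self.bytes_uploaded = 0
        self.draw_calls = 0
        self._buffers = {}
        self._next_handle = 0

    def draw_from_system_memory(self, vertex_bytes):
        # No resource buffer: the whole data set is copied on every draw call.
        self.bytes_uploaded += len(vertex_bytes)
        self.draw_calls += 1

    def create_buffer(self, vertex_bytes):
        # Resource buffer: the data is copied once, when the buffer is filled.
        self.bytes_uploaded += len(vertex_bytes)
        handle = self._next_handle
        self._next_handle += 1
        self._buffers[handle] = bytes(vertex_bytes)
        return handle

    def draw_from_buffer(self, handle):
        # Drawing reuses data already on the card; nothing is copied.
        _ = self._buffers[handle]  # raises KeyError for an unknown handle
        self.draw_calls += 1


def simulate(frames, use_buffer):
    """Draw the same unchanged mesh once per frame and return total bytes uploaded."""
    card = FakeGraphicsCard()
    mesh = bytes(36 * 1000)  # made-up mesh: 1,000 vertices of 36 bytes each
    if use_buffer:
        handle = card.create_buffer(mesh)
        for _ in range(frames):
            card.draw_from_buffer(handle)
    else:
        for _ in range(frames):
            card.draw_from_system_memory(mesh)
    return card.bytes_uploaded
```

In this toy model the unbuffered path copies the mesh once per frame, so the upload cost grows with the frame count, while the buffered path pays for the copy only once, no matter how many frames are drawn.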

These draw calls were also not batched, and the application was using essentially one draw call per object. Having thousands upon thousands of draw calls every frame can really hurt your performance. Given that this prototype wasn't intended to perform in real time, I decided to leave the number of draw calls alone.

Profiling the application after these fixes revealed an extraordinary number of garbage collections while the application was running. Generation 1 and 2 collections were happening as well, and that's something you never want to see in a high-performance 3D application. In looking through the code, I noticed that in virtually every case where an object was needed (such as a device), rather than using a cached version of the object, the developer had simply used the property on a child object to retrieve its parent (for example, VertexBuffer.Device). The problem here was twofold. First, this property was actually returning an entirely new object every time it was accessed by the application, and second, because of the event-hooking mechanism built into Managed DirectX, it was also creating three other miscellaneous objects every time an event was raised.
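The first half of that problem can be illustrated with a small sketch. This is Python rather than Managed DirectX, and every name in it (Device, UncachedVertexBuffer, CachedVertexBuffer, wrappers_created) is hypothetical. It only shows the general difference between a property that builds a new wrapper on each access and one that builds it once and hands back the same instance afterward.

```python
class Device:
    """Hypothetical managed wrapper around a native device handle."""

    instances_created = 0

    def __init__(self, native_handle):
        self.native_handle = native_handle
        Device.instances_created += 1


class UncachedVertexBuffer:
    """Property allocates a brand-new Device wrapper on every access."""

    def __init__(self, native_device_handle):
        self._native_device_handle = native_device_handle

    @property
    def device(self):
        return Device(self._native_device_handle)


class CachedVertexBuffer:
    """Property allocates the wrapper once and returns the same object afterward."""

    def __init__(self, native_device_handle):
        self._native_device_handle = native_device_handle
        self._device = None

    @property
    def device(self):
        if self._device is None:
            self._device = Device(self._native_device_handle)
        return self._device


def wrappers_created(buffer, accesses):
    """Read buffer.device repeatedly and count how many Device wrappers were allocated."""
    before = Device.instances_created
    for _ in range(accesses):
        _ = buffer.device
    return Device.instances_created - before
```

Reading the uncached property 2,000 times in a frame allocates 2,000 short-lived wrappers for the garbage collector to deal with. The cached version allocates one. An application can get the same effect by reading the property once and keeping the result in its own field.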

This type of code pattern was repeated literally everywhere throughout the API. It was causing a minimum of 2,000 extra objects to be created every frame, which increased memory usage and caused more and more garbage collections. And even worse, the vast majority of these objects were never being collected at all, despite the numerous garbage collections. My job now was to find out why.

It turns out the culprit was the fancy event handling and hooking that Managed DirectX utilizes. Every resource a device was creating would hook events on the device itself (at the very least the "disposing" event). These event hooks would create a hard link (a strong reference) from the device to the child object, keeping it alive even after it had gone out of scope. That event hook ensured that the object could never be collected until the device itself was disposed.
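The same lifetime effect can be reproduced in any garbage-collected runtime. The sketch below uses Python instead of .NET, and its Device and Resource classes, the hook_events flag, and the resource_survives helper are all hypothetical. It shows only the general mechanism: a parent's handler list holds a reference to each subscriber, so a subscriber stays reachable for as long as the parent does.

```python
import gc
import weakref


class Device:
    """Hypothetical parent object that exposes a 'disposing' event."""

    def __init__(self):
        # Each entry is a bound method, which strongly references its owner.
        self.disposing_handlers = []


class Resource:
    """Hypothetical child resource that may hook the device's 'disposing' event."""

    def __init__(self, device, hook_events=True):
        if hook_events:
            device.disposing_handlers.append(self.on_device_disposing)

    def on_device_disposing(self):
        pass  # a real resource would release its native data here


def resource_survives(hook_events):
    """Drop the only application reference to a resource and report whether it is still alive."""
    device = Device()
    resource = Resource(device, hook_events)
    probe = weakref.ref(resource)  # observes the resource without keeping it alive
    del resource                   # the application no longer references it
    gc.collect()
    return probe() is not None     # device is still in scope at this point
```

With the hook in place, resource_survives(True) reports that the resource is still alive after the application has let go of it, because the device's handler list still points at it. Without the hook, nothing else references the resource and it is reclaimed.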

Armed with these two new important facts, I commenced a design review of the Managed DirectX API. Properties that returned other objects now had their values cached internally to remove the need for the application itself to maintain this list. The event-hooking mechanism was changed to allow it to be disabled for applications that wanted more control over their objects' lifetimes. With these changes in place, the application now performed well above expectations.

Nowadays the number of times I'm asked about performance issues is steadily diminishing. It seems developers are realizing that you can get both high performance and increased productivity using Managed DirectX. When they see the Visual Studio® 2005 version that will be released soon, I can only imagine more people will be convinced.

Tom Miller is a developer in the DirectX group at Microsoft, and is the designer and developer of the Managed DirectX libraries, as well as the author of two books on the subject. You can view his blog at blogs.msdn.com/tmiller.