.NET Matters

Debugger Visualizations, Garbage Collection

Stephen Toub

Code download available at: NETMatters0408.exe (208 KB)

Q I really enjoyed reading Morgan Skinner's article in the May 2004 issue of MSDN® Magazine covering the new debugger visualizations feature in Visual Studio® 2005 (see Debugging: DataTips, Visualizers and Viewers Make Debugging .NET Code a Breeze). His Hashtable visualizer is also really useful. I know this feature doesn't exist in Visual Studio .NET 2003, but it's just so helpful, I'd love to be able to use something like it now. What are my options?

A Debugger visualizations are an amazing feature for the debugger and will certainly save developers a ton of time. Unfortunately, as you state, they're new to Visual Studio 2005 and don't exist in the current version of the product. But that doesn't mean you're completely at a loss. With a little help from the Command/Immediate window, you can still visualize your data in much the same way as in Visual Studio 2005. As an example, I'll show you how to take Morgan's code and, with only a few very minor changes (and only slightly more work at run time), get it to work in Visual Studio .NET 2003.

The simplest way to get "visualizations" in Visual Studio .NET 2003 is to use the inline "printf style" of debugging, whereby you insert lots of calls inline in the code to help trace its execution. Let's say I wanted to visualize a Hashtable. Instead of using Console.WriteLine, I could write my own method that takes a Hashtable as input and shows a dialog with a rendered view of that Hashtable. Since I like to reuse existing code to make my life easier, I'll use Morgan's code for rendering the table. Of course, his code was written for the March 2004 Technical Preview release of Visual Studio 2005, so I had to make some minor changes to get it to compile in Visual Studio .NET 2003. I started by downgrading the implementation to compile in the current version of C#. This took very little effort: I only had to combine the pieces of the partial classes in use (since partial classes are new to C# 2.0) and remove some properties that are new to the Windows® Forms 2.0 object model.

With this new version of the Hashtable rendering code in place, which is available in the code download on the MSDN Magazine Web site, I can simply make calls to the render method inline in my program. This is a step in the right direction, but one of the great benefits of debugger visualizations in Visual Studio 2005 is that you don't need to mess with your source to use them; rather, they're built into the debugger and can be used by any solution regardless of whether it is aware of the visualization's existence. How can we accomplish this in Visual Studio .NET 2003? Through liberal use of the Command window (also known as the Immediate window), which lets us execute custom code at run time while stopped at a breakpoint. Ideally, I'd like to be able to simply execute the following code in the Immediate window and have it display my custom rendering of the Hashtable:

NetMatters.DebuggingVisualizations.DisplayHashtable(ht)

This exact usage is actually possible, but only if the assembly containing the NetMatters.DebuggingVisualizations class has already been loaded and if DisplayHashtable is a static method on the DebuggingVisualizations class.

The latter condition is easily taken care of, as shown in Figure 1. The former condition will only be satisfied if references to code in that assembly have been JIT compiled, or if the assembly has been loaded explicitly using one of the loading methods on the System.Reflection.Assembly class (Assembly.Load, Assembly.LoadFrom, and so on). If you know that the assembly containing the visualizers has not been loaded, you can easily fix this problem while debugging by loading the assembly from the Command window, as shown in the following:

System.Reflection.Assembly.LoadFrom("C:\DebugVis.dll")

Figure 1 DebuggingVisualizations

using System.Collections;
using System.Diagnostics;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;
using System.Xml;

// HashtableProxy, Hash, and HashtableVisualizerForm come from the
// downgraded version of Morgan's code in the download.
public class DebuggingVisualizations
{
    // Display a Hashtable while debugging
    public static void DisplayHashtable(Hashtable ht)
    {
        HashtableProxy proxy = new HashtableProxy(ht);
        Hash [] hashes = proxy.WalkHashtable();
        using(HashtableVisualizerForm form = new HashtableVisualizerForm())
        {
            form.Hashes = hashes;
            form.ShowDialog();
        }
    }

    // Display an image while debugging
    public static void DisplayImage(Bitmap img)
    {
        string temp = Path.GetTempFileName() + ".jpg";
        img.Save(temp, ImageFormat.Jpeg);
        Process.Start(temp);
    }

    // Display an XmlDocument while debugging
    public static void DisplayXml(XmlDocument doc)
    {
        string temp = Path.GetTempFileName() + ".xml";
        using(StreamWriter writer = new StreamWriter(temp))
        {
            doc.Save(writer);
        }
        Process.Start(temp);
    }
}

I've used Assembly.LoadFrom, specifying the path to the DLL. If you put the DLL into the Global Assembly Cache, which would be useful for debugging lots of different projects, you could instead use Assembly.Load, specifying the full name of the assembly. Since the full name is a fairly long string to type into the Command window, you might also use LoadWithPartialName to make for less typing while debugging, although there are some drawbacks to using this particular method. For more information on the trade-offs, see Suzanne Cook's excellent blog post on the subject, "Avoid Partial Binds." In fact, LoadWithPartialName is being deprecated in the Microsoft .NET Framework 2.0. Regardless, after executing this line of code, the DebugVis assembly will be loaded, a fact you can verify by looking at the Modules window in the debugger. We can then continue by executing the previously mentioned code in the Immediate window to display the Hashtable, as shown in Figure 2. Piece of cake!
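If you would rather check from code than from the Modules window, the list of loaded assemblies is available through AppDomain.CurrentDomain.GetAssemblies. The following is a minimal sketch of such a check; the class name LoadedAssemblyCheck and the method IsLoaded are hypothetical names chosen for illustration and are not part of the code download.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, not part of the article's download.
public class LoadedAssemblyCheck
{
    // Returns true if an assembly with the given simple name (for example,
    // "DebugVis") is already loaded into the current AppDomain.
    public static bool IsLoaded(string simpleName)
    {
        foreach (Assembly asm in AppDomain.CurrentDomain.GetAssemblies())
        {
            // Compare simple names only, ignoring case.
            if (string.Compare(asm.GetName().Name, simpleName, true) == 0)
                return true;
        }
        return false;
    }
}
```

A helper like this is only callable from the Command window if the assembly that contains it is itself already loaded, so it is best placed in a library that your application references anyway.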

Figure 2 Visualizing a Hashtable

There are obviously benefits to having this type of functionality built into the debugger and the IDE, but just because Visual Studio .NET 2003 lacks support for this feature doesn't mean we can't be inventive and create something very similar that can be used today. Figure 1 shows some code for visualizing Bitmaps, XmlDocuments, and Hashtables. You could obviously augment this with visualization routines for other types in the .NET Framework or even for your own custom types. The possibilities are endless. Happy debugging!
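As one possible extension in the same spirit as DisplayImage and DisplayXml, here is a sketch of a routine that dumps any enumerable collection to a temporary text file and opens it. The names MoreVisualizations, DumpToTempFile, and DisplayEnumerable are my own assumptions for illustration; they are not in Figure 1 or the download.

```csharp
using System;
using System.Collections;
using System.Diagnostics;
using System.IO;

// Hypothetical additions, following the pattern of Figure 1.
public class MoreVisualizations
{
    // Writes one line per element to a temporary text file
    // and returns the path of that file.
    public static string DumpToTempFile(IEnumerable items)
    {
        string temp = Path.GetTempFileName() + ".txt";
        using (StreamWriter writer = new StreamWriter(temp))
        {
            foreach (object item in items)
                writer.WriteLine(item == null ? "(null)" : item.ToString());
        }
        return temp;
    }

    // Display any enumerable collection while debugging by opening the
    // dump in whatever application is registered for .txt files.
    public static void DisplayEnumerable(IEnumerable items)
    {
        ProcessStartInfo info = new ProcessStartInfo(DumpToTempFile(items));
        info.UseShellExecute = true; // let the shell pick the viewer
        Process.Start(info);
    }
}
```

Splitting the file writing from the launch keeps the dump logic usable on its own, for example from a logging routine that should not pop up a window.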

Q From within my application, I'd like to be notified when the garbage collector runs in my process. Is this possible? Is there an event raised when a collection takes place with which I could register an event handler? I looked at the System.GC class but couldn't find anything appropriate.

A As far as I know there's no perfect way to do this, though you can certainly construct a few good approximations.

One approach is to take advantage of finalization. As a very quick review (and, as with any quick review, I'm leaving out some important details), managed classes can implement a finalizer. An instance's Finalize method is called after the garbage collector (GC) determines that the instance is unreachable but before the memory for that instance is reclaimed. When an object that implements a finalizer is instantiated, a reference to that object is added to a finalization list, sometimes referred to as the RegisteredForFinalization list. When the GC runs and finds unreachable objects that are also in the finalization list, the references to these objects are moved to another list, the freachable queue, also referred to as the ReadyForFinalization queue. A dedicated and high-priority thread owned by the common language runtime (CLR) scans this queue and runs the Finalize method of all objects found. Jeffrey Richter describes the process in much greater detail in his book Applied Microsoft .NET Framework Programming (Microsoft Press®, 2003).

The net result of this is that we can take advantage of finalization in order to estimate when the GC has run. By instantiating and then losing all references to an object whose Finalize method raises an event, I can be notified when the finalization thread processes my object from the freachable queue, and this will (hopefully) be soon after the GC has collected my object.

Here is my first attempt at creating such a class:

public class GcNotify
{
    public GcNotify(){}

    public static event EventHandler Collection;

    public static void OnCollection()
    {
        EventHandler ev = Collection;
        if (ev != null) ev(null, EventArgs.Empty);
    }

    ~GcNotify() { OnCollection(); }
}

I can test this by implementing a simple command-line application that continually calls GC.Collect in order to force a collection:

static void Main()
{
    GcNotify.Collection += new EventHandler(GcNotify_Collection);
    new GcNotify();
    while(true)
    {
        GC.Collect(0);
        Thread.Sleep(1000);
    }
}

The implementation of GcNotify_Collection simply writes a message with a timestamp out to the console. Running this, I find that I'm only notified of a single garbage collection. Why? Because I only created one object and, once it was finalized and deallocated, its Finalize method was never run again. I might try to fix this by modifying the class's finalizer as follows:

~GcNotify()
{
    OnCollection();
    GC.ReRegisterForFinalize(this);
}

The GC.ReRegisterForFinalize method causes a reference to the object to be put back onto the finalization list. Thus, the next time the GC runs, it will find this instance unreachable and will move it again to the freachable queue for the finalization thread to process. However, when I run this new code I find that I'm still only being notified of a single collection. A reexamination of the test application reveals why: I'm calling GC.Collect and passing in 0 as an argument. This tells the garbage collector to collect only Generation 0, skipping the collection of Generations 1 and 2. Why is that important? Because the instance survives collections and is thus promoted to higher generations. The first time the GC runs after the object has been created (and before the object is finalized), the instance is promoted to Generation 1, and the second time the GC runs the instance is promoted to Generation 2. You can verify this in the object's finalizer by writing out the result of calling GC.GetGeneration(this), which retrieves the current generation of the specified object. Since the garbage collector should collect Generation 0 much more frequently than the other generations, I'm doing myself a huge disservice by forcing my notification object to be promoted.
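The promotion behavior is easy to observe outside of a finalizer as well. The sketch below is a simplification of the scenario above: it uses an ordinary object kept alive by a strong reference rather than a resurrected GcNotify instance, and the class name GenerationDemo is hypothetical. It records GC.GetGeneration for the same object before and after two forced collections.

```csharp
using System;

// Hypothetical demo class, not part of the article's download.
public class GenerationDemo
{
    // Returns the generation of a strongly referenced object before any
    // forced collection, after one collection, and after two collections.
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] generations = new int[3];

        generations[0] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[1] = GC.GetGeneration(survivor);
        GC.Collect();
        generations[2] = GC.GetGeneration(survivor);

        // Keep the reference live until after both collections.
        GC.KeepAlive(survivor);
        return generations;
    }
}
```

Printing the three values from a console application should normally show the object moving up a generation each time it survives, stopping at GC.MaxGeneration. Once it is there, a call to GC.Collect(0) never examines it again.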

Since I want an event to be raised for every collection and not just for those that include Generations 1 and 2, I can change the implementation of the finalizer to instantiate a new instance of GcNotify each time it's run. That way, I have a much better chance of having an instance of GcNotify in Generation 0 (note that I've refrained from saying that I'll always have an instance in Generation 0 for reasons that should become clear in a moment):

~GcNotify()
{
    OnCollection();
    new GcNotify();
}

Running the sample with this new finalizer gives me the desired behavior. There are still some problems with it, however. First, there are a few situations in which objects will be finalized. One of them, of course, is when the GC is run, but finalizers are also executed during AppDomain unloading and during CLR shutdown. In these situations, I don't want to cause problems by creating new objects that will need to be finalized themselves, so I test for this using Environment.HasShutdownStarted and AppDomain.CurrentDomain.IsFinalizingForUnload, only creating a new GcNotify instance if both of these return false. For another example of this, see the GCBeep example from Richter's book.

The second problem is that I haven't implemented any mechanism to prevent the user from creating multiple instances of GcNotify. If the user instantiates more than one instance of GcNotify, the Collection event will be raised erroneously multiple times per collection. To fix this, I can modify GcNotify's constructor to be private and implement a public and static initialization method that shields the user from having to explicitly create an instance of the class. The final version of my class and test harness is shown in Figure 3.

Figure 3 GcNotify and Test Harness

using System;
using System.Threading;

class TestGcNotify
{
    static void Main()
    {
        GcNotify.Collection += new EventHandler(GcNotify_Collection);
        GcNotify.Initialize();
        while(true)
        {
            GC.Collect(0);
            Thread.Sleep(1000);
        }
    }

    private static void GcNotify_Collection(object sender, EventArgs e)
    {
        Console.WriteLine("GC collection: " + DateTime.Now);
    }
}

public class GcNotify
{
    private static bool _initialized = false;

    // Creates the first (and only user-triggered) notification object
    public static void Initialize()
    {
        if (!_initialized)
        {
            _initialized = true;
            new GcNotify();
        }
    }

    private GcNotify(){}

    public static event EventHandler Collection;

    public static void OnCollection()
    {
        EventHandler ev = Collection;
        if (ev != null) ev(null, EventArgs.Empty);
    }

    ~GcNotify()
    {
        // Don't raise the event or create new finalizable objects
        // during CLR shutdown or AppDomain unloading
        if (!Environment.HasShutdownStarted &&
            !AppDomain.CurrentDomain.IsFinalizingForUnload())
        {
            OnCollection();
            new GcNotify();
        }
    }
}
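One thing Figure 3 does not address is two threads calling Initialize at the same moment: both could see _initialized as false and each create an instance. If that matters in your application, a guard along the following lines is one option. This is a sketch under my own assumptions, and the class and method names (InitializeOnce, TryBeginInitialize) are hypothetical; it relies on Interlocked.CompareExchange to let exactly one caller through.

```csharp
using System;
using System.Threading;

// Hypothetical guard, not part of Figure 3.
public class InitializeOnce
{
    private static int _initialized = 0; // 0 = not yet initialized, 1 = initialized

    // Atomically flips the flag from 0 to 1. Returns true only for the
    // single caller that performed the flip; every later caller gets false.
    public static bool TryBeginInitialize()
    {
        return Interlocked.CompareExchange(ref _initialized, 1, 0) == 0;
    }
}
```

With a guard like this, the body of Initialize would reduce to creating the notification object only when TryBeginInitialize returns true.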

As mentioned earlier, while this is a decent approximation, the approach isn't perfect. A major problem is that it relies on there being very few objects that require finalization and on those objects' finalizers running to completion quickly. Since there's only one finalization thread, finalizers are run in serial, and thus a poorly implemented finalizer in another class could delay or even prevent ~GcNotify from running. Consider the following class:

public class StallFinalization
{
    ~StallFinalization()
    {
        while(true) Thread.Sleep(1000);
    }
}

If an instance of this class is finalized, it will stall the finalization thread and no other objects will have their finalizers run. So much for our event notification.

An alternative to the finalization approach would be to implement a polling mechanism that checks to see if the garbage collector has recently run. If it has, an event can be raised. One way to implement this polling is to take advantage of weak references. The WeakReference class has tight integration with the garbage collector; WeakReference wraps an object which it exposes through its Target property, but as far as the garbage collector is concerned, the object is unreachable (assuming there are no other strong references to it) and it can be collected.

When the collector runs, the Target of the WeakReference will be set to null. Thus, by creating a WeakReference to an object and then polling to see when the WeakReference's Target property is null (or when its IsAlive property is false), I can determine whether the garbage collector has recently run. If it is null, the Collection event is raised and the Target is reset to be a new object for which I can continue to poll. The code for this version of GcNotify is shown in Figure 4.

Figure 4 GcNotifyWithWeakReference

using System;
using System.Threading;

public class GcNotifyWithWeakReference
{
    private static bool _initialized = false;
    private static WeakReference _obj;
    private static Timer _timer;

    // pollTime is the polling interval in milliseconds
    public static void Initialize(int pollTime)
    {
        if (!_initialized)
        {
            if (pollTime <= 0)
                throw new ArgumentOutOfRangeException("pollTime");
            _obj = new WeakReference(new object());
            TimerCallback callback = new TimerCallback(CheckWeakReference);
            _timer = new Timer(callback, null, pollTime, pollTime);
            _initialized = true;
        }
    }

    private GcNotifyWithWeakReference() {}

    public static event EventHandler Collection;

    public static void OnCollection()
    {
        EventHandler ev = Collection;
        if (ev != null) ev(null, EventArgs.Empty);
    }

    // Raises the event if the target has been collected since the last poll
    private static void CheckWeakReference(object state)
    {
        if (!_obj.IsAlive)
        {
            OnCollection();
            _obj.Target = new object();
        }
    }
}
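The WeakReference behavior that Figure 4 depends on can be checked in isolation. The following sketch, with the hypothetical class name WeakReferenceDemo, forces a collection and reports whether the target is still alive, both when a strong reference is held and when it is not. The weakly held object is created in a separate non-inlined method, on the assumption that this keeps the JIT compiler from holding it in a local of the calling method.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical demo class, not part of the article's download.
public class WeakReferenceDemo
{
    // Creates a WeakReference whose target has no strong references
    // once this method returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateWeak()
    {
        return new WeakReference(new object());
    }

    // Forces a collection and reports whether the target survived it.
    public static bool SurvivesCollection(bool holdStrongReference)
    {
        object strong = new object();
        WeakReference weak =
            holdStrongReference ? new WeakReference(strong) : CreateWeak();

        GC.Collect();
        bool alive = weak.IsAlive;

        // Keeps 'strong' reachable until after the collection above.
        GC.KeepAlive(strong);
        return alive;
    }
}
```

SurvivesCollection(true) returns true because the target is still strongly reachable. SurvivesCollection(false) should normally return false, which is the condition the polling callback in Figure 4 looks for.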

I can also set up a polling scheme similar to that in Figure 4, but which checks the .NET CLR Memory performance counters, querying to see how many Generation 0 collections have occurred for the current process. The code for this version is shown in Figure 5. Note that it follows the same general format as the WeakReference-based version. To determine whether there has been a recent collection, it stores the number of collections and polls to see when this count changes. In order to access this information and have it be specific to the current application, I need to get the performance counter instance that represents the current process. Since the instances are created and named based on the name of the running application, I can get the appropriate text by accessing the name of the entry assembly. Additionally, with the .NET Framework 1.x, the length of the name of the instance is limited to 15 characters, so I need to ensure that I've appropriately trimmed the string before instantiating the PerformanceCounter.

Figure 5 GcNotifyWithPerfCounter

using System;
using System.Diagnostics;
using System.Reflection;
using System.Threading;

public class GcNotifyWithPerfCounter
{
    private static bool _initialized = false;
    private static PerformanceCounter _pc;
    private static float _lastValue;
    private static Timer _timer;

    // pollTime is the polling interval in milliseconds
    public static void Initialize(int pollTime)
    {
        if (!_initialized)
        {
            if (pollTime <= 0)
                throw new ArgumentOutOfRangeException("pollTime");

            // Instance names are limited to 15 characters in the .NET Framework 1.x
            string instanceName = Assembly.GetEntryAssembly().GetName().Name;
            if (instanceName.Length > 15)
                instanceName = instanceName.Substring(0, 15);

            _pc = new PerformanceCounter(
                ".NET CLR Memory", "# Gen 0 Collections", instanceName, true);
            TimerCallback callback = new TimerCallback(CheckForCollection);
            _lastValue = _pc.NextValue();
            _timer = new Timer(callback, null, pollTime, pollTime);
            _initialized = true;
        }
    }

    private GcNotifyWithPerfCounter(){}

    public static event EventHandler Collection;

    public static void OnCollection()
    {
        EventHandler ev = Collection;
        if (ev != null) ev(null, EventArgs.Empty);
    }

    // Raises the event if the collection count changed since the last poll
    private static void CheckForCollection(object state)
    {
        float newValue = _pc.NextValue();
        if (newValue != _lastValue)
        {
            _lastValue = newValue;
            OnCollection();
        }
    }
}
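The same store-and-compare pattern works with any source for the collection count. As an illustration that is not part of the article's download, the sketch below reads the count from GC.CollectionCount, a method that is available starting with the .NET Framework 2.0 and so is not an option on 1.x. The class name GcCountPoller is hypothetical, and the method is not synchronized, so it assumes a single polling thread.

```csharp
using System;

// Hypothetical poller; requires .NET Framework 2.0 or later
// because it relies on GC.CollectionCount.
public class GcCountPoller
{
    private static int _lastCount = GC.CollectionCount(0);

    // Returns true if at least one Generation 0 collection has occurred
    // since the previous call (or since the class was first used).
    public static bool HasCollectedSinceLastCheck()
    {
        int current = GC.CollectionCount(0);
        if (current != _lastCount)
        {
            _lastCount = current;
            return true;
        }
        return false;
    }
}
```

A timer callback like CheckForCollection in Figure 5 could call this method and raise the Collection event whenever it returns true.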

Of course, this solution still requires explicit polling, just as with the WeakReference-based solution. Using Windows Management Instrumentation (WMI) to access the performance counters, I could instead let WMI handle all of the polling overhead and just register with it to receive notifications of a change (note that WMI will still be polling).

The System.Management namespace supplies the WqlEventQuery class which represents a Windows Management Instrumentation Query Language (WQL) query. I can create an instance of this class to watch for changes in a specific performance counter. To configure the query, I first need to set the type of event for which it's watching, in this case an __InstanceModificationEvent. Second, I need to specify the condition that narrows down what instances I want the query to be concerned with; this often comes in the form "TargetInstance isa 'class'", where class is the type of the WMI class that represents the performance counter class in which I'm interested. One way to find that name is to write a simple VBScript that displays all of the registered classes that derive from Win32_PerfRawData, the base class for all concrete raw performance counter classes:

Set subclasses = GetObject("winmgmts:root/cimv2").SubclassesOf("Win32_PerfRawData")
For Each subclass In subclasses
    WScript.Echo subclass.Path_.Class
Next

Executing this script in a console window will cause the list of Win32_PerfRawData-derived classes to be output to the console. Scanning the list, you should notice the Win32_PerfRawData_NETFramework_NETCLRMemory class, which represents the .NET CLR Memory category. I can use this value to fill in my condition. Finally, I need to tell the query how often to poll for changes. The code to configure the query looks like this:

WqlEventQuery q = new WqlEventQuery();
q.EventClassName = "__InstanceModificationEvent";
q.Condition = "TargetInstance isa " +
    "'Win32_PerfRawData_NETFramework_NETCLRMemory'";
q.WithinInterval = new TimeSpan(0,0,0,0,250);

With the query in place, I need to configure the system to use it to inform me of events. The ManagementEventWatcher class provides this functionality, and I can create one as follows (where OnCollection is the name of the event handler that will be invoked when the GC has recently run):

ManagementEventWatcher watcher = new ManagementEventWatcher(q);
watcher.EventArrived += new EventArrivedEventHandler(OnCollection);
watcher.Start();

Inside my OnCollection event handler, I can extract from EventArrivedEventArgs information about the number of Generation 0 collections that have occurred:

private static void OnCollection(object sender, EventArrivedEventArgs e)
{
    ManagementBaseObject eventArg =
        (ManagementBaseObject)e.NewEvent["TargetInstance"];
    uint numCollections =
        (uint)eventArg.Properties["NumberGen0Collections"].Value;
    Console.WriteLine("Gen 0: " + numCollections);
}

This looks fairly clean, but unfortunately there are a few problems with it. First, the condition specified on the WqlEventQuery watches for all changes to any instance of the specified performance counter class, not just those for my process, and it watches for any change to those instances, not just a change in the property that represents the number of Generation 0 collections. These issues can be fixed by augmenting the condition in order to ensure that the TargetInstance.Name property value equals the name of our performance counter instance and that the TargetInstance.NumberGen0Collections property value does not equal the PreviousInstance.NumberGen0Collections property value.
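To make the augmented condition concrete, here is a sketch of a helper that builds it as a string. The class name GcQueryCondition and the method Build are hypothetical, and the sketch assumes the instance name contains no single quotes or backslashes, since it performs no escaping.

```csharp
using System;

// Hypothetical helper, not part of the article's download.
public class GcQueryCondition
{
    private const string WmiClass =
        "Win32_PerfRawData_NETFramework_NETCLRMemory";

    // Builds a condition that matches only the named performance counter
    // instance, and only when its Generation 0 collection count changed.
    // Assumes instanceName needs no escaping.
    public static string Build(string instanceName)
    {
        return "TargetInstance isa '" + WmiClass + "'"
            + " AND TargetInstance.Name = '" + instanceName + "'"
            + " AND TargetInstance.NumberGen0Collections <> "
            + "PreviousInstance.NumberGen0Collections";
    }
}
```

The resulting string would be assigned to the Condition property of the WqlEventQuery, with the instance name derived the same way as in Figure 5.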

The second and more complicated problem is that there are many environmental issues which affect whether this solution will even work correctly at all. These include access control list (ACL) issues based on which user is attempting to access the counters and as which user the process is running, as well as whether the CLR perfcounter DLL was initialized before or after the managed process was started. Most of these complications have been fixed in the .NET Framework 2.0, so in the meantime I'd suggest using one of the previously mentioned approaches.

All that said, I have to wonder why you even need this information. If it's for monitoring the health of your application, that should really be done externally to it, using performance counters or even a profiler. If you're actually using this information in your application to determine whether to perform some action, I'd highly recommend that you reevaluate. It's a poor idea to base your application's logic on something that should be fairly invisible to it, and a healthy application should really not have to worry about when the garbage collector is running. Rico Mariani, an architect on the CLR team, has tons of great advice on his blog concerning garbage collection and application performance tuning. I suggest you check it out at https://blogs.msdn.com/ricom.

Send your questions and comments to netqa@microsoft.com.

Stephen Toub is the Technical Editor for MSDN Magazine. You can reach him at stoub@microsoft.com.