.NET Matters

Abortable Thread Pool

Stephen Toub

Code download available at: NETMatters0603.exe (118 KB)

Q In your July 2005 column (.NET Matters: StringStream, Methods with Timeouts), you demonstrated how to use Thread.Abort to cancel a long-running operation. I know you warned against this approach for reliability reasons, but I think the risks are acceptable for my situation. I've queued a bunch of work items to be executed in the thread pool. Later on, I may need to cancel one of these operations (in effect, a manual timeout), dequeuing it from the pool if it hasn't already executed, and aborting it if it's currently in the process of executing. The ThreadPool doesn't seem to expose any functionality like this. Is my only option to write my own ThreadPool?

A That's a great question. To answer the first part, the ThreadPool does not expose the ability to dequeue an item from its internal queue, though there are alternatives, as you'll see shortly. The second part of your question comes up frequently in a much more generalized form: how can you determine which thread a queued work item will execute on? The answer, of course, is that you don't know until it's already executing. After all, the whole point of having a pool of threads is to allow any one of them to complete the requested work. This means that the best way to discover which thread a work item is executing on is to have the work item tell you as soon as it starts running. For example, the first thing the work item's method could do is make some sort of callback, passing along information about the current Thread object. In your scenario, with the work item's Thread in hand, you could then easily abort it.

This solution is perilous, however. By the time the notification is received and the thread is aborted, the original work item could have already finished, such that the pool's thread already moved on to another work item. You would be aborting that secondary work item rather than the desired one, opening yourself up to a world of hurt. Plus, this requires you to have the appropriate callback logic in every method you queue to the pool.
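To make the race concrete, here is a minimal sketch of that naive callback pattern. The NaiveAbortSketch class, its WorkerThread field, and the method names are illustrative only, not part of any API discussed here:

```csharp
using System;
using System.Threading;

public class NaiveAbortSketch
{
    // Illustrative field: records the pool thread currently running our work.
    public static Thread WorkerThread;

    public static void DoWork(object state)
    {
        // The work item's first action is to report its thread.
        WorkerThread = Thread.CurrentThread;
        Thread.Sleep(100); // simulate long-running work
    }

    public static void CancelWork()
    {
        // RACE: by the time this runs, DoWork may have finished, and this
        // pool thread may be running an unrelated work item, which the
        // abort would then hit instead of the intended one.
        Thread t = WorkerThread;
        if (t != null) t.Abort();
    }
}
```

Nothing here synchronizes the cancellation with the lifetime of the work item, which is precisely the gap the AbortableThreadPool shown later closes with its bookkeeping and locking.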

Instead, consider a thread pool implementation that handed you back a cookie of sorts for a queued work item. The pool could then also provide a Cancel method that would take one of these cookies and cancel the associated work item, removing it from the queue or aborting the executing work item as appropriate. Implementing a custom thread pool for this task probably isn't the best idea, but there are other alternatives. (Note that it is possible to implement custom thread pools; for an example of doing so, see my February 2005 column at .NET Matters: File Copy Progress, Custom Thread Pools.)

In my October, November, and December 2004 columns, I showed how to implement static wrappers for ThreadPool, each of which accomplished a different task (waiting for queued work to complete, prioritizing certain queued items, and throttling pool usage). That same technique can be used here to create a wrapper for the thread pool that allows work items to be canceled. Figure 1 shows a high-level diagram of the process involved and Figure 2 shows its implementation.

Figure 2 Implementation

public sealed class WorkItem
{
    private WaitCallback _callback;
    private object _state;
    private ExecutionContext _ctx;

    internal WorkItem(WaitCallback wc, object state, ExecutionContext ctx)
    {
        _callback = wc;
        _state = state;
        _ctx = ctx;
    }

    internal WaitCallback Callback { get { return _callback; } }
    internal object State { get { return _state; } }
    internal ExecutionContext Context { get { return _ctx; } }
}

public enum WorkItemStatus { Completed, Queued, Executing, Aborted }

public static class AbortableThreadPool
{
    private static LinkedList<WorkItem> _callbacks =
        new LinkedList<WorkItem>();
    private static Dictionary<WorkItem, Thread> _threads =
        new Dictionary<WorkItem, Thread>();

    public static WorkItem QueueUserWorkItem(
        WaitCallback callback, object state)
    {
        if (callback == null) throw new ArgumentNullException("callback");

        WorkItem item = new WorkItem(
            callback, state, ExecutionContext.Capture());
        lock (_callbacks) _callbacks.AddLast(item);
        ThreadPool.QueueUserWorkItem(new WaitCallback(HandleItem));
        return item;
    }

    private static void HandleItem(object ignored)
    {
        WorkItem item = null;
        try
        {
            lock (_callbacks)
            {
                if (_callbacks.Count > 0)
                {
                    item = _callbacks.First.Value;
                    _callbacks.RemoveFirst();
                }
                if (item == null) return;
                _threads.Add(item, Thread.CurrentThread);
            }
            ExecutionContext.Run(item.Context,
                delegate { item.Callback(item.State); }, null);
        }
        finally
        {
            lock (_callbacks)
            {
                if (item != null) _threads.Remove(item);
            }
        }
    }

    public static WorkItemStatus Cancel(WorkItem item, bool allowAbort)
    {
        if (item == null) throw new ArgumentNullException("item");
        lock (_callbacks)
        {
            LinkedListNode<WorkItem> node = _callbacks.Find(item);
            if (node != null)
            {
                _callbacks.Remove(node);
                return WorkItemStatus.Queued;
            }
            else if (_threads.ContainsKey(item))
            {
                if (allowAbort)
                {
                    _threads[item].Abort();
                    _threads.Remove(item);
                    return WorkItemStatus.Aborted;
                }
                else return WorkItemStatus.Executing;
            }
            else return WorkItemStatus.Completed;
        }
    }
}

Figure 1 High-Level Process


To use this class, you simply substitute AbortableThreadPool anywhere you're using ThreadPool to queue work items you want to be able to cancel. It's fine to still make calls to ThreadPool.QueueUserWorkItem while using AbortableThreadPool, but those work items won't be cancelable. A typical call sequence might look like the following:

WorkItem item = AbortableThreadPool.QueueUserWorkItem(
    new WaitCallback(YourWorkMethod), null);

... // do other stuff here

WorkItemStatus status = AbortableThreadPool.Cancel(item, false);
Console.WriteLine("Status from Cancel: " + status);

As with the previous wrappers, for this to work I need to be able to store the information supplied to the ThreadPool.QueueUserWorkItem method, and for that I've created the WorkItem class (also shown in Figure 2). An instance of this class stores three pieces of information: the two supplied by the user (the WaitCallback to be executed and the object state to be passed as an argument to that callback), and the ExecutionContext captured at the time QueueUserWorkItem is invoked.

ExecutionContext is a most welcome addition to the Microsoft® .NET Framework 2.0, serving as a container for important data about the current logical thread of execution. This includes the thread's security context, including its compressed stack, consisting of code access security permission grants and stack modifiers, and the current Windows® identity; its synchronization context, which allows synchronous and asynchronous operations to behave in accordance with the appropriate synchronization model, for example using Control.Invoke or Control.BeginInvoke in a Windows Forms application; its logical call context; and its host execution context. What makes ExecutionContext shine is that you can capture the ExecutionContext for the current thread, and then later on use that ExecutionContext while running arbitrary code, even if that code isn't running on the thread for which the context was originally captured.

Why is that a good thing? Think about what happens when you queue a work item to the thread pool, or use Control.Invoke to marshal a call to the GUI thread of a Windows Forms application, or call Thread.Start to execute a method on a new thread, or invoke a delegate asynchronously. In essence, one thread is asking another thread to perform work on its behalf. Since the two threads don't share any thread-based state in common, if nothing about the source thread is transferred to the target thread, the disconnect that occurs can lead to a variety of problems. Some of these problems lead to functional bugs, such as data stored in the source's thread-local storage not being available during execution, and others lead to security vulnerabilities, such as the execution happening under a different Windows identity.

By allowing the capture and flow of execution context from one thread to another, the .NET Framework 2.0 solves this disconnect. In fact, in most scenarios you'll never have to deal with ExecutionContext directly as it is automatically flowed across thread transfer points (ThreadPool.QueueUserWorkItem, Thread.Start, delegate asynchronous invocation, and so on.) You only have to deal with it directly when some action you perform causes this automatic flow to break. With that said, the reason why WorkItem stores an ExecutionContext should become apparent shortly.
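As a minimal sketch of this capture-and-run pattern, the following illustrative helper captures the context on one thread and then executes a delegate under it on another; any ambient state that flows with ExecutionContext (such as the logical call context) would accompany that delegate:

```csharp
using System;
using System.Threading;

public class ContextRunSketch
{
    public static string RunUnderCapturedContext()
    {
        // Capture the current thread's execution context.
        ExecutionContext ctx = ExecutionContext.Capture();

        string result = null;
        Thread t = new Thread(delegate()
        {
            // Execute the callback under the context captured above,
            // even though we're now on a different thread.
            ExecutionContext.Run(ctx, delegate(object state)
            {
                result = "ran under captured context on thread " +
                         Thread.CurrentThread.ManagedThreadId;
            }, null);
        });
        t.Start();
        t.Join();
        return result;
    }
}
```

This is exactly the pairing, Capture at queue time and Run at execution time, that AbortableThreadPool relies on.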

The AbortableThreadPool class shown in Figure 2 maintains several collections of information. One collection, a linked list of WorkItem instances, contains work items queued using AbortableThreadPool.QueueUserWorkItem that haven't had a chance to execute yet. The other collection is a dictionary, mapping WorkItem instances to Thread objects; this collection is used to track on which pool thread a work item is currently executing.

AbortableThreadPool.QueueUserWorkItem is the simplest of the methods on this class. It creates a WorkItem that stores the supplied WaitCallback and user state object, as well as the current ExecutionContext. This WorkItem is then enqueued into the linked list of work items waiting to be executed. In addition, a placeholder callback is inserted into the actual .NET ThreadPool. This placeholder contains no information about the actual work to be performed and simply points back to the AbortableThreadPool.HandleItem method. Since one delegate for HandleItem is queued each time the AbortableThreadPool.QueueUserWorkItem method is called, and only then, every work item in the linked list has a corresponding queued work item in the actual thread pool. The WorkItem, which serves as the cookie discussed earlier in this column, is returned to the user.

When HandleItem is called, it removes the next WorkItem from the linked list of work items. This WorkItem is then inserted along with the current Thread instance (as retrieved from Thread.CurrentThread) into the table of running work items (the WorkItem serves as the key into the dictionary). At this point, the ExecutionContext is retrieved from the WorkItem and is used to execute the queued delegate, supplying it with the relevant user state object. When the delegate finishes executing, the WorkItem-to-Thread mapping is removed from the running items table.

This system makes it possible to cancel a work item regardless of its current status in the pool. A user armed with the WorkItem returned from QueueUserWorkItem can pass the instance to the Cancel method. The Cancel method first looks in the linked list of WorkItems; if the WorkItem has not begun execution, it'll be in this list. If the item is found, it's simply removed from the queue. This causes the number of WorkItems in the linked list and the number of queued work items in the managed thread pool to become disconnected, but that's OK. When HandleItem attempts to retrieve the next item from the queue, it returns gracefully if no more items are available.

However, this scenario is one of the reasons why you must store the ExecutionContext in the WorkItem and execute the delegate under that context. By default, execution context flows from the thread that queues a delegate to the pool to the execution of that delegate. But by canceling a work item in this fashion—something the managed thread pool is completely unaware of—you cause the work item and flowed contexts to get out of sync. As a result, one work item will execute under the context flowed for a different work item.

Even if you couldn't cancel items, it would still be necessary to store the execution context with the work item. Each time HandleItem is called from the ThreadPool, it picks off the next WorkItem in our queue. But the fact that the thread pool executes the HandleItem methods in the order you queued them doesn't guarantee anything about the order in which various instructions within those HandleItem calls will execute. It's perfectly possible for one HandleItem call to start before another, but for the second call to remove one of the waiting WorkItems first. If that happens, there will again be a disconnect in the flow of the execution context, so the context must be captured and applied manually.

If the Cancel method does not find the WorkItem returned from QueueUserWorkItem in the linked list, there are only two other possibilities. First, the work item could be executing currently, in which case it would show up as a key in the dictionary, mapping WorkItem instances to Thread instances. Moreover, if it is in that table you can easily retrieve the Thread on which that delegate is executing, thus allowing you to abort it if requested. If the WorkItem is not a key in that dictionary, it must have already finished executing, and thus there is nothing to be canceled. The locks that bookend the execution of the work item and that surround the call to Abort prevent the incorrect thread from being aborted; this would not be preventable if the Thread object for the pool thread were simply returned to the user of the AbortableThreadPool.

As with my method timeouts example in the July 2005 issue, I'll caution you against using Thread.Abort to control the lifetime of a thread, especially if you have little knowledge about the implementation of the code you're aborting. An abort inside of a critical region could spell disaster for an entire AppDomain, and while a polite abort does significantly reduce the window of disaster to very tiny slivers, that one time out of 1,000 could lead to a deadlock or resource leak or worse. Depending on your workload and abort rate, this may or may not snowball over time. For more information on why this is dangerous and for ways to deal with it, see my article on reliability in the October 2005 issue (High Availability: Keep Your Code Running with the Reliability Features of the .NET Framework).

That said, and as with the method timeouts example, there are times when it's nice to have the power of Thread.Abort. For those situations, AbortableThreadPool could come in handy. And since I've implemented the Cancel method to accept a parameter that indicates whether a currently executed work item should be aborted, you can, of course, choose to use this code simply to cancel pending work items rather than also canceling work items that have already started execution.
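For example, a manual timeout that first attempts a polite dequeue and only then escalates to an abort might look like the following sketch. It assumes the types from Figure 2 are compiled into your project and that YourWorkMethod is your own WaitCallback-compatible method:

```csharp
WorkItem item = AbortableThreadPool.QueueUserWorkItem(
    new WaitCallback(YourWorkMethod), null);

// Give the operation a chance to complete on its own.
Thread.Sleep(5000);

// Politely cancel: dequeues the item if it hasn't started yet.
WorkItemStatus status = AbortableThreadPool.Cancel(item, false);
if (status == WorkItemStatus.Executing)
{
    // Still running; escalate to an abort, with all the caveats above.
    status = AbortableThreadPool.Cancel(item, true);
}
Console.WriteLine("Final status: " + status);
```

The polite pass costs nothing beyond a lock acquisition, so escalating only when the item reports Executing keeps the abort path as rare as possible.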

Send your questions and comments to netqa@microsoft.com.

Stephen Toub is the Technical Editor for MSDN Magazine.