Chapter 5 — Improving Managed Code Performance

 

Improving .NET Application Performance and Scalability

J.D. Meier, Srinath Vasireddy, Ashish Babbar, Rico Mariani, and Alex Mackman
Microsoft Corporation

May 2004

Summary: This chapter provides design and coding techniques for optimizing and improving the performance of your managed code. This chapter will help you develop managed code that takes full advantage of the high performance offered by the common language runtime. Key topics covered in this chapter include class design considerations, memory management and garbage collection issues, when and how to pin memory, developing efficient multithreaded code, using effective asynchronous execution, boxing considerations, string manipulation, exception management, arrays, collections, precompiling code with Ngen.exe, and much more.

Contents

Objectives
Overview
How to Use This Chapter
Architecture
Performance and Scalability Issues
Design Considerations
Class Design Considerations
Implementation Considerations
Garbage Collection Explained
Garbage Collection Guidelines
Finalize and Dispose Explained
Dispose Pattern
Finalize and Dispose Guidelines
Pinning
Threading Explained
Threading Guidelines
Asynchronous Calls Explained
Asynchronous Guidelines
Locking and Synchronization Explained
Locking and Synchronization Guidelines
Value Types and Reference Types
Boxing and Unboxing Explained
Boxing and Unboxing Guidelines
Exception Management
Iterating and Looping
String Operations
Arrays
Collections Explained
Collection Guidelines
Collection Types
Reflection and Late Binding
Code Access Security
Working Set Considerations
Ngen.exe Explained
Ngen.exe Guidelines
Summary
Additional Resources

Objectives

  • Optimize assembly and class design.
  • Maximize garbage collection (GC) efficiency in your application.
  • Use Finalize and Dispose properly.
  • Minimize boxing overhead.
  • Evaluate the use of reflection and late binding.
  • Optimize your exception handling code.
  • Make efficient use of iterating and looping constructs.
  • Optimize string concatenation.
  • Evaluate and choose the most appropriate collection type.
  • Avoid common threading mistakes.
  • Make asynchronous calls effectively and efficiently.
  • Develop efficient locking and synchronization strategies.
  • Reduce your application's working set.
  • Apply performance considerations to code access security.

Overview

Considerable effort went into making the common language runtime (CLR) suitable for applications that require high performance. However, the way you write managed code can either take advantage of that capability or it can hinder it. This chapter identifies the core performance-related issues that you need to be aware of to develop optimized managed code. It identifies common mistakes and many ways to improve the performance of your managed code.

The chapter starts by presenting the CLR architecture and provides an overview of the top performance and scalability issues to be aware of when you develop managed code. It then presents a set of design guidelines you should apply to all of your managed code development (such as business logic, data access logic, utility component, or Web page assembly). The chapter then presents a series of sections that highlight the top recommendations for each of the performance critical areas of managed code development. These include memory management and garbage collection, boxing operations, reflection and late binding, use of collections, string handling, threading, concurrency, asynchronous operations, exception management, and more.

How to Use This Chapter

This chapter presents the CLR architecture, top performance and scalability issues, and a set of design guidelines for managed code development. To get the most from this chapter, do the following:

  • Jump to topics or read from beginning to end. The main headings in this chapter help you locate the topics that interest you. Alternatively, you can read the chapter from beginning to end to gain a thorough appreciation of performance and scalability design issues.
  • Know the CLR architecture and components. Understanding managed code execution can help towards writing code optimized for performance.
  • Know the major performance and scalability issues. Read "Performance and Scalability Issues" in this chapter to learn about the major issues that can impact the performance and scalability of managed code. It is important to understand these key issues so you can effectively identify performance and scalability problems and apply the recommendations presented in this chapter.
  • Measure your application performance. Read the "CLR and Managed Code" and ".NET Framework Technologies" sections of Chapter 15, "Measuring .NET Application Performance" to learn about the key metrics that can be used to measure application performance. It is important that you be able to measure application performance so that performance issues can be accurately targeted.
  • Test your application performance. Read Chapter 16, "Testing .NET Application Performance" to learn how to apply performance testing to your application. It is important that you apply a coherent testing process and that you be able to analyze the results.
  • Tune your application performance. Read the "CLR Tuning" section of Chapter 17, "Tuning .NET Application Performance" to learn how to resolve performance issues identified through the use of tuning metrics.
  • Use the accompanying checklist in the "Checklists" section of this guide. Use the "Checklist: Managed Code Performance" checklist to quickly view and evaluate the guidelines presented in this chapter.

Architecture

The CLR consists of a number of components that are responsible for managed code execution. These components are referred to throughout this chapter, so you should be aware of their purpose. Figure 5.1 shows the basic CLR architecture and components.

Figure 5.1: CLR architecture

The way you write managed code significantly impacts the efficiency of the CLR components shown in Figure 5.1. By following the guidelines and techniques presented in this chapter, you can optimize your code, and enable the run-time components to work most efficiently. The purpose of each component is summarized below:

  • JIT compiler. The just-in-time (JIT) compiler converts the Microsoft intermediate language (MSIL) that is contained in an assembly into native machine code at run time. Methods that are never called are not JIT-compiled.
  • Garbage collector. The garbage collector is responsible for allocating, freeing, and compacting memory.
  • Structured exception handling. The runtime supports structured exception handling to allow you to build robust, maintainable code. Use language constructs such as try/catch/finally to take advantage of structured exception handling.
  • Threading. The .NET Framework provides a number of threading and synchronization primitives to allow you to build high performance, multithreaded code. Your choice of threading approach and synchronization mechanism impacts application concurrency; hence, it also impacts scalability and overall performance.
  • Security. The .NET Framework provides code access security to ensure that code has the necessary permissions to perform specific types of operations such as accessing the file system, calling unmanaged code, accessing network resources, and accessing the registry.
  • Loader. The .NET Framework loader is responsible for locating and loading assemblies.
  • Metadata. Assemblies are self-describing. An assembly contains metadata that describes aspects of your program, such as the set of types that it exposes, and the members those types contain. Metadata facilitates JIT compilation and is also used to convey version and security-related information.
  • Interop. The CLR can interoperate with various kinds of unmanaged code, such as Microsoft Visual Basic®, Microsoft Visual C++®, DLLs, or COM components. Interop allows your managed code to call these unmanaged components.
  • Remoting. The .NET remoting infrastructure supports calls across application domains, between processes, and over various network transports.
  • Debugging. The CLR exposes debugging hooks that can be used to debug or profile your assemblies.

Performance and Scalability Issues

This section is designed to give you a high-level overview of the major issues that can impact the performance and scalability of managed code. Subsequent sections in this chapter provide strategies, solutions, and technical recommendations to prevent or resolve these issues. There are several main issues that impact managed code performance and scalability:

  • Memory misuse. If you create too many objects, fail to properly release resources, preallocate memory, or explicitly force garbage collection, you can prevent the CLR from efficiently managing memory. This can lead to increased working set size and reduced performance.
  • Resource cleanup. Implementing finalizers when they are not needed, failing to suppress finalization in the Dispose method, or failing to release unmanaged resources can lead to unnecessary delays in reclaiming resources and can potentially create resource leaks.
  • Improper use of threads. Creating threads on a per-request basis and not sharing threads using thread pools can cause performance and scalability bottlenecks for server applications. The .NET Framework provides a self-tuning thread pool that should be used by server-side applications.
  • Abusing shared resources. Creating resources per request can lead to resource pressure, and failing to properly release shared resources can cause delays in reclaiming them. This quickly leads to scalability issues.
  • Type conversions. Implicit type conversions and mixing value and reference types lead to excessive boxing and unboxing operations, which impacts performance.
  • Misuse of collections. The .NET Framework class library provides an extensive set of collection types. Each collection type is designed to be used with specific storage and access requirements. Choosing the wrong type of collection for specific situations can impact performance.
  • Inefficient loops. Even the slightest coding inefficiency is magnified when that code is located inside a loop. Loops that access an object's properties are a common culprit of performance bottlenecks, particularly if the object is remote or the property getter performs significant work.
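
As an illustration of the loop issue in the last bullet, the following minimal sketch compares reading a property on every iteration with reading it once before the loop. The Sensor class, its Scale property, and the Reads counter are hypothetical names invented for this example; the counter exists only to make the number of getter calls visible.

```csharp
using System;

// Hypothetical type whose property getter does non-trivial work.
public class Sensor
{
    private int reads;

    // Number of times the Scale getter has run (for illustration only).
    public int Reads { get { return reads; } }

    public double Scale
    {
        get
        {
            reads++;
            return Math.Sqrt(2.0); // stands in for a costly computation
        }
    }
}

public static class LoopSketch
{
    // Calls the getter once per element.
    public static double SumScaledSlow(Sensor s, double[] values)
    {
        double sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i] * s.Scale;
        }
        return sum;
    }

    // Reads the property once and reuses the local copy inside the loop.
    public static double SumScaledHoisted(Sensor s, double[] values)
    {
        double scale = s.Scale;
        double sum = 0;
        for (int i = 0; i < values.Length; i++)
        {
            sum += values[i] * scale;
        }
        return sum;
    }
}
```

Hoisting is only valid when the property value cannot change during the loop; whether it is worth doing depends on what the getter actually costs, so measure first.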

Design Considerations

The largest contributing factor to application performance is the application architecture and design. Make sure performance is a functional requirement that your design and testing take into account throughout the application development life cycle. Application development should be an iterative process. Performance testing and measuring should be performed between iterations and should not be left to deployment time.

This section summarizes the major considerations for designing managed code solutions:

  • Design for efficient resource management.
  • Reduce boundary crossings.
  • Prefer single large assemblies rather than multiple smaller assemblies.
  • Factor code by logical layers.
  • Treat threads as a shared resource.
  • Design for efficient exception management.

Design for Efficient Resource Management

Avoid allocating objects and the resources they encapsulate before you need them, and make sure you release them as soon as your code is completely finished with them. This advice applies to all resource types including database connections, data readers, files, streams, network connections, and COM objects. Use finally blocks or Microsoft Visual C#® using statements to ensure that resources are closed or released in a timely fashion, even in the event of an exception. Note that the C# using statement is used only for resources that implement IDisposable; whereas finally blocks can be used for any type of cleanup operations.
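
The two cleanup styles described above can be sketched as follows. This is a minimal example, not part of the original guide; the class name ResourceCleanupSketch and both method names are hypothetical, and a StreamReader stands in for any resource that implements IDisposable.

```csharp
using System;
using System.IO;

public static class ResourceCleanupSketch
{
    // The using statement disposes the reader when the block exits,
    // even if ReadLine throws.
    public static string ReadFirstLineWithUsing(string path)
    {
        using (StreamReader reader = new StreamReader(path))
        {
            return reader.ReadLine();
        }
    }

    // The same cleanup written with an explicit finally block.
    public static string ReadFirstLineWithFinally(string path)
    {
        StreamReader reader = null;
        try
        {
            reader = new StreamReader(path);
            return reader.ReadLine();
        }
        finally
        {
            // Runs on both the normal and the exceptional path.
            if (reader != null)
            {
                reader.Dispose();
            }
        }
    }
}
```

In both versions the reader is acquired as late as possible and released as soon as the line has been read.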

Reduce Boundary Crossings

Aim to reduce the number of method calls that cross remoting boundaries because this introduces marshaling and potentially thread switching overhead. With managed code, there are several boundaries to consider:

  • Cross application domain. This is the most efficient boundary to cross because it is within the context of a single process. Because the cost of the actual call is so low, the overhead is almost completely determined by the number, type, and size of parameters passed on the method call.
  • Cross process. Crossing a process boundary significantly impacts performance. You should do so only when absolutely necessary. For example, you might determine that an Enterprise Services server application is required for security and fault tolerance reasons. Be aware of the relative performance tradeoff.
  • Cross machine. Crossing a machine boundary is the most expensive boundary to cross, due to network latency and marshaling overhead. While marshaling overhead impacts all boundary crossings, its impact can be greater when crossing machine boundaries. For example, the introduction of an HTTP proxy might force you to use SOAP envelopes, which introduces additional overhead. Before introducing a remote server into your design, you need to consider the relative tradeoffs including performance, security, and administration.
  • Unmanaged code. You also need to consider calls to unmanaged code, which introduces marshaling and potentially thread switching overhead. The Platform Invoke (P/Invoke) and COM interop layers of the CLR are very efficient, but performance can vary considerably depending on the type and size of data that needs to be marshaled between the managed and unmanaged code. For more information, see Chapter 7, "Improving Interop Performance."

Prefer Single Large Assemblies Rather Than Multiple Smaller Assemblies

To help reduce your application's working set, you should prefer single larger assemblies rather than multiple smaller assemblies. If you have several assemblies that are always loaded together, you should combine them and create a single assembly.

The overhead associated with having multiple smaller assemblies can be attributed to the following:

  • The cost of loading metadata for smaller assemblies.
  • Touching various memory pages in pre-compiled images in the CLR in order to load the assembly (if it is precompiled with Ngen.exe).
  • JIT compile time.
  • Security checks.

Because you pay for only the memory pages your program accesses, larger assemblies provide the Native Image Generator utility (Ngen.exe) with a greater chance to optimize the native image it produces. Better layout of the image means that necessary data can be laid out more densely, which in turn means fewer overall pages are needed to do the job compared to the same code laid out in multiple assemblies.

Sometimes you cannot avoid splitting assemblies; for example, for versioning and deployment reasons. If you need to ship types separately, you may need separate assemblies.

Factor Code by Logical Layers

Consider your internal class design and how you factor code into separate methods. When code is well factored, it becomes easier to tune for performance, to maintain, and to extend with new functionality. However, there needs to be a balance. While clearly factored code can improve maintainability, you should be wary of over-abstraction and creating too many layers. Simple designs can be effective and efficient.

Treat Threads as a Shared Resource

Do not create threads on a per-request basis because this can severely impact scalability. Creating new threads is also a fairly expensive operation that should be minimized. Treat threads as a shared resource and use the optimized .NET thread pool.
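
A minimal sketch of handing work to the shared thread pool instead of creating a thread per request is shown below. The class and method names are hypothetical, and the counter and event exist only so the caller can tell when every queued item has run; the sketch assumes a compiler that supports lambda expressions.

```csharp
using System;
using System.Threading;

public static class ThreadPoolSketch
{
    // Queues 'count' work items on the shared thread pool, waits for all
    // of them to finish, and returns how many ran.
    public static int RunWorkItems(int count)
    {
        if (count <= 0)
        {
            return 0;
        }

        int completed = 0;
        int remaining = count;

        using (ManualResetEvent allDone = new ManualResetEvent(false))
        {
            for (int i = 0; i < count; i++)
            {
                // No new thread is created here; the pool reuses its own threads.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Interlocked.Increment(ref completed);
                    if (Interlocked.Decrement(ref remaining) == 0)
                    {
                        allDone.Set(); // the last item signals completion
                    }
                });
            }

            allDone.WaitOne();
        }

        return completed;
    }
}
```

A call such as ThreadPoolSketch.RunWorkItems(8) queues eight items and blocks until all of them have completed.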

Design for Efficient Exception Management

The performance cost of throwing an exception is significant. Although structured exception handling is the recommended way of handling error conditions, make sure you use exceptions only in exceptional circumstances when error conditions occur. Do not use exceptions for regular control flow.
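
To illustrate the difference between exception-based and test-based handling of an expected condition, the following sketch parses user input in two ways. The class and method names are hypothetical, and int.TryParse assumes .NET Framework 2.0 or later, which is newer than the release this chapter was written against.

```csharp
using System;

public static class ParseSketch
{
    // Uses an exception to handle an expected condition (malformed input).
    // Each failure pays the cost of a throw and a catch.
    public static int ParseOrDefaultWithException(string text, int fallback)
    {
        try
        {
            return int.Parse(text);
        }
        catch (FormatException)
        {
            return fallback;
        }
        catch (OverflowException)
        {
            return fallback;
        }
    }

    // Tests for the condition instead; no exception is thrown for bad input.
    public static int ParseOrDefault(string text, int fallback)
    {
        int value;
        return int.TryParse(text, out value) ? value : fallback;
    }
}
```

When malformed input is routine rather than exceptional, the second form keeps the common path free of exception overhead.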

Class Design Considerations

Class design choices can affect system performance and scalability. However, analyze your tradeoffs, such as functionality, maintainability, and company coding guidelines. Balance these with performance guidelines.

This section summarizes guidelines for designing your managed classes:

  • Do not make classes thread safe by default.
  • Consider using the sealed keyword.
  • Consider the tradeoffs of virtual members.
  • Consider using overloaded methods.
  • Consider overriding the Equals method for value types.
  • Know the cost of accessing a property.
  • Consider private vs. public member variables.
  • Limit the use of volatile fields.

Do Not Make Classes Thread Safe by Default

Consider carefully whether you need to make an individual class thread safe. Thread safety and synchronization are often required at a higher layer in the software architecture and not at an individual class level. When you design a specific class, you often do not know the proper level of atomicity, especially for lower-level classes.

For example, consider a thread safe collection class. The moment the class needs to be atomically updated with something else, such as another class or a count variable, the built-in thread safety is useless. Thread control is needed at a higher level. There are two problems in this situation. Firstly, the overhead from the thread-safety features that the class offers remains even though you do not require those features. Secondly, the collection class likely had a more complex design in the first place to offer those thread-safety services, which is a price you have to pay whenever you use the class.

In contrast to regular classes, static classes (those with only static methods) should be thread safe by default. Static classes have only global state, and generally offer services to initialize and manage that shared state for a whole process. This requires proper thread safety.
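
The collection example above can be sketched as follows: a plain (not thread-safe) collection and a running total are updated together under one lock owned by the higher-level class. OrderBook and its members are hypothetical names, and the generic List<T> assumes .NET Framework 2.0 or later.

```csharp
using System.Collections.Generic;

public class OrderBook
{
    private readonly object sync = new object();
    private readonly List<decimal> orders = new List<decimal>(); // not thread safe by itself
    private decimal total;

    // The list and the total change together, so one lock at this level
    // covers both; a thread-safe list alone could not guarantee that.
    public void Add(decimal amount)
    {
        lock (sync)
        {
            orders.Add(amount);
            total += amount;
        }
    }

    // Reads both values under the same lock so they are consistent.
    public void GetSnapshot(out int count, out decimal sum)
    {
        lock (sync)
        {
            count = orders.Count;
            sum = total;
        }
    }
}
```

Because the lock lives where the atomicity requirement is known, the collection itself carries no synchronization cost.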

Consider Using the sealed Keyword

You can use the sealed keyword at the class and method level. In Visual Basic .NET, you can use the NotInheritable keyword at the class level or NotOverridable at the method level. If you do not want anybody to extend your base classes, you should mark them with the sealed keyword. Before you use the sealed keyword at the class level, you should carefully evaluate your extensibility requirements.

If you derive from a base class that has virtual members and you do not want anybody to extend the functionality of the derived class, you can consider sealing the virtual members in the derived class. Sealing the virtual methods makes them candidates for inlining and other compiler optimizations.

Consider the following example.

public class MyClass{ 
  protected virtual void SomeMethod() { ... } 
}

You can override and seal the method in a derived class.

public class DerivedClass : MyClass { 
  protected override sealed void SomeMethod () { ... } 
}

This code ends the chain of virtual overrides and makes DerivedClass.SomeMethod a candidate for inlining.

More Information

For more information about inheritance in Visual Basic .NET, see MSDN® Magazine article, "Using Inheritance in the .NET World, Part 2," by Ted Pattison at http://msdn.microsoft.com/en-us/magazine/cc301744.aspx.

Consider the Tradeoffs of Virtual Members

Use virtual members to provide extensibility. If you do not need to extend your class design, avoid virtual members because they are more expensive to call due to a virtual table lookup and they defeat certain run-time performance optimizations. For example, virtual members cannot be inlined by the compiler. Additionally, when you allow subtyping, you actually present a very complex contract to consumers and you inevitably end up with versioning problems when you attempt to upgrade your class in the future.

Consider Using Overloaded Methods

Consider providing overloaded methods for varying parameters instead of a single method that takes a variable number of parameters. A variable-argument method needs special code paths for each possible combination of parameters.

// Method taking a variable number of arguments
void GetCustomers(params object[] filterCriteria);

// Overloaded methods
void GetCustomers(int countryId, int regionId);
void GetCustomers(int countryId, int regionId, int customerType);

Note   If there are COM clients accessing .NET components, using overloaded methods will not work as a strategy. Use methods with different names instead.

Consider Overriding the Equals Method for Value Types

You can override the Equals method for value types to improve performance of the Equals method. The Equals method is provided by System.Object. To use the standard implementation of Equals, your value type must be boxed and passed as an instance of the reference type System.ValueType. The Equals method then uses reflection to perform the comparison. However, the overhead associated with the boxing conversion and reflection can easily be greater than the cost of the actual comparison that needs to be performed. As a result, an Equals method that is specific to your value type can do the required comparison significantly more cheaply.

The following code fragment shows an overridden Equals method implementation that improves performance by avoiding reflection costs.

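public struct Rectangle {
  public double Length;
  public double Breadth;

  // Type-specific override: avoids the reflection-based comparison
  // performed by System.ValueType.Equals.
  public override bool Equals(object ob) {
    if (ob is Rectangle)
      return Equals((Rectangle)ob);
    else
      return false;
  }

  // Field-by-field comparison with no reflection.
  private bool Equals(Rectangle rect) {
    return this.Length == rect.Length && this.Breadth == rect.Breadth;
  }

  // Overridden together with Equals so that equal values produce equal hash codes.
  public override int GetHashCode() {
    return Length.GetHashCode() ^ Breadth.GetHashCode();
  }
}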

Know the Cost of Accessing a Property

A property looks like a field, but it is not, and it can have hidden costs. You can expose class-level member variables by using public fields or public properties. The use of properties represents good object-oriented programming practice because it allows you to encapsulate validation and security checks and to ensure that they are executed when the property is accessed, but their field-like appearance can cause them to be misused.

You need to be aware that if you access a property, additional code, such as validation logic, might be executed. This means that accessing a property might be slower than directly accessing a field. However, the additional code is generally there for good reason; for example, to ensure that only valid data is accepted.

For simple properties that contain no additional code (other than directly setting or getting a private member variable), there is no performance difference compared to accessing a public field because the compiler can inline the property code. However, things can easily become more complicated; for example, virtual properties cannot be inlined.

If your object is designed for remote access, you should use methods with multiple parameters instead of requiring the client to set multiple properties or fields. This reduces round trips.

It is extremely bad form to use properties to hide complex business rules or other costly operations, because there is a strong expectation by callers that properties are inexpensive. Design your classes accordingly.

Consider Private vs. Public Member Variables

In addition to the usual visibility concerns, you should also avoid unnecessary public members to prevent any additional serialization overhead when you use the XmlSerializer class, which serializes all public members by default.

Limit the Use of Volatile Fields

Limit the use of the volatile keyword because volatile fields restrict the way the compiler reads and writes the contents of the field. The compiler generates code that always reads from the field's memory location instead of reading from a register that may have loaded the field's value. This means that accessing volatile fields is slower than accessing nonvolatile ones because the system is forced to use memory addresses rather than registers.

Implementation Considerations

After design is underway, you must consider the technical details of your managed code development. To improve performance, managed code must make effective use of the CLR. Key managed code performance measures include response time, throughput, and resource management.

Response times can be improved by optimizing critical code paths and by writing code that enables the garbage collector to release memory efficiently. Analyzing your application's allocation profile can help you improve garbage collection performance.

Throughput can be improved by making effective use of threads. Minimize thread creation, and ensure you use the thread pool to avoid expensive thread initialization. Performance critical code should avoid reflection and late binding.

Utilization of resources can be improved by effective use of finalization (using the Dispose pattern) to release unmanaged resources, and efficient use of strings, arrays, collections and looping constructs. Locking and synchronization should be used sparingly, and where used, lock duration should be minimized.

The following sections highlight performance considerations when developing managed code.

Garbage Collection Explained

The .NET Framework uses automatic garbage collection to manage memory for all applications. When you use the new operator to create an object, the object's memory is obtained from the managed heap. When the garbage collector decides that sufficient garbage has accumulated that it is efficient to do so, it performs a collection to free some memory. This process is fully automatic, but there are a number of factors that you need to be aware of that can make the process more or less efficient.

To understand the principles of garbage collection, you need to understand the life cycle of a managed object:

  1. Memory for an object is allocated from the managed heap when you call new. The object's constructor is called after the memory is allocated.
  2. The object is used for a period of time.
  3. The object dies due to all its references either being explicitly set to null or else going out of scope.
  4. The object's memory is freed (collected) some time later. After the memory is freed, it is generally available for other objects to use again.

Allocation

The managed heap can be thought of as a block of contiguous memory. When you create a new object, the object's memory is allocated at the next available location on the managed heap. Because the garbage collector does not need to search for space, allocations are extremely fast if there is enough memory available. If there is not enough memory for the new object, the garbage collector attempts to reclaim space for the new object.

Collection

To reclaim space, the garbage collector collects objects that are no longer reachable. An object is no longer reachable when there are no references to it, all references are set to null, or all references to it are from other objects that can be collected as part of the current collection cycle.

When a collection occurs, the reachable objects are traced and marked as the trace proceeds. The garbage collector reclaims space by moving reachable objects into the contiguous space and reclaiming the memory used by the unreachable objects. Any object that survives the collection is promoted to the next generation.

Generations

The garbage collector uses three generations to group objects by their lifetime and volatility:

  • Generation 0 (Gen 0). This consists of newly created objects. Gen 0 is collected frequently to ensure that short-lived objects are quickly cleaned up. Those objects that survive a Gen 0 collection are promoted to Generation 1.
  • Generation 1 (Gen 1). This is collected less frequently than Gen 0 and contains longer-lived objects that were promoted from Gen 0.
  • Generation 2 (Gen 2). This contains objects promoted from Gen 1 (which means it contains the longest-lived objects) and is collected even less frequently. The general strategy of the garbage collector is to collect and move longer-lived objects less frequently.
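
Promotion between generations can be observed with GC.GetGeneration, as in the minimal sketch below. The class and method names are hypothetical. The sketch calls GC.Collect purely to make promotion visible in a demonstration; production code should not force collections. A surviving object often reports a higher generation after each collection, but the runtime does not guarantee specific values.

```csharp
using System;

public static class GenerationSketch
{
    // Records the generation of one object when it is created and after
    // each of two forced collections (demonstration only).
    public static int[] ObserveGenerations()
    {
        object survivor = new object();
        int[] seen = new int[3];

        seen[0] = GC.GetGeneration(survivor); // newly allocated object

        GC.Collect();                          // demo only: do not force collections in real code
        seen[1] = GC.GetGeneration(survivor);

        GC.Collect();
        seen[2] = GC.GetGeneration(survivor);

        // Keeps the object reachable until this point so it survives both collections.
        GC.KeepAlive(survivor);
        return seen;
    }
}
```

The highest generation the runtime supports is available from GC.MaxGeneration.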

Key GC Methods Explained

Table 5.1 shows the key methods of the System.GC class. You can use this class to control the behavior of the garbage collector.

Table 5.1: Key GC Methods

Method | Description
--- | ---
System.GC.Collect | This method forces a garbage collection. You should generally avoid this and let the runtime determine the appropriate time to perform a collection. The main reason that you might be tempted to call this method is that you cannot see memory being freed that you expect to see freed. However, the main reason that this occurs is that you are inadvertently holding on to one or more objects that are no longer needed. In this case, forcing a collection does not help.
System.GC.WaitForPendingFinalizers | This suspends the current thread until the finalization thread has emptied the finalization queue. Generally, this method is called immediately after System.GC.Collect to ensure that the current thread waits until finalizers for all objects are called. However, because you should not call GC.Collect, you should not need to call GC.WaitForPendingFinalizers.
System.GC.KeepAlive | This is used to prevent an object from being prematurely collected by holding a reference to the object. A common scenario is when there are no references to an object in managed code but the object is still in use in unmanaged code.
System.GC.SuppressFinalize | This prevents the finalizer from being called for a specified object. Use this method when you implement the dispose pattern. If you have explicitly released resources because the client has called your object's Dispose method, Dispose should call SuppressFinalize because finalization is no longer required.
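
The SuppressFinalize entry in Table 5.1 can be illustrated with the following minimal sketch. NativeBuffer, IsDisposed, and Release are hypothetical names, and a block of unmanaged memory from Marshal.AllocHGlobal stands in for any unmanaged resource. This is a simplified illustration rather than a complete dispose pattern implementation.

```csharp
using System;
using System.Runtime.InteropServices;

public class NativeBuffer : IDisposable
{
    private IntPtr handle;

    public NativeBuffer(int size)
    {
        handle = Marshal.AllocHGlobal(size); // unmanaged allocation
    }

    public bool IsDisposed
    {
        get { return handle == IntPtr.Zero; }
    }

    // Called explicitly by the client; releases the resource immediately.
    public void Dispose()
    {
        Release();
        // The resource is already freed, so finalization is no longer required.
        GC.SuppressFinalize(this);
    }

    // Safety net that runs only if the client never called Dispose.
    ~NativeBuffer()
    {
        Release();
    }

    // Safe to call more than once.
    private void Release()
    {
        if (handle != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(handle);
            handle = IntPtr.Zero;
        }
    }
}
```

A client would typically wrap the object in a using statement so that Dispose runs and the finalizer is suppressed.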

Server GC vs. Workstation GC

The CLR provides two separate garbage collectors:

  • Workstation GC (Mscorwks.dll). This is designed for use by desktop applications such as Windows Forms applications.
  • Server GC (Mscorsvr.dll). This is designed for use by server applications. ASP.NET loads server GC but only if the server has more than one processor. On single processor servers, it loads workstation GC.

Note   At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") includes both GCs inside Mscorwks.dll, and Mscorsvr.dll no longer exists.

Server GC is optimized for throughput, memory consumption, and multiprocessor scalability, while the workstation GC is tuned for desktop applications. When using the server GC, the managed heap is split into several sections, one per CPU on a multiprocessor computer. When a collection is initiated, the collector has one thread per CPU; all threads collect their own sections simultaneously. The workstation version of the execution engine (Mscorwks.dll) is optimized for lower latency. Workstation GC performs collection in parallel with the CLR threads. Server GC suspends the CLR threads during collection.

You might sometimes need the functionality of the server GC for your custom application when hosting it on a multiprocessor computer. For example, you might need it for a Windows service that uses a .NET remoting host and is deployed on a multiprocessor server. In this scenario, you need to develop a custom host that loads the CLR and the server GC version of the garbage collector. For more information about how to do this, see MSDN Magazine article, "Microsoft .NET: Implement a Custom Common Language Runtime Host for Your Managed App," by Steven Pratschner at http://msdn.microsoft.com/en-us/magazine/cc301479.aspx.

Note   At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") provides a way to switch between server and workstation GC through application configuration.
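
One way to confirm which collector a process has actually loaded is to query it at run time. The following is a minimal sketch that assumes .NET Framework 2.0 or later, where the System.Runtime.GCSettings class is available. The GcModeReporter class name is hypothetical and used only for illustration.

```csharp
using System;
using System.Runtime;

// Hypothetical helper (not part of the original guide).
public static class GcModeReporter
{
    // Returns a short description of the garbage collector mode in use.
    public static string Describe()
    {
        return GCSettings.IsServerGC ? "Server GC" : "Workstation GC";
    }
}
```

Logging the result of GcModeReporter.Describe() at startup makes it easier to verify that a deployment is using the collector you expect.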

Garbage Collection Guidelines

This section summarizes recommendations to help improve garbage collection performance:

  • Identify and analyze your application's allocation profile.
  • Avoid calling GC.Collect.
  • Consider using weak references with cached data.
  • Prevent the promotion of short-lived objects.
  • Set unneeded member variables to null before making long-running calls.
  • Minimize hidden allocations.
  • Avoid or minimize complex object graphs.
  • Avoid preallocating and chunking memory.

Identify and Analyze Your Application's Allocation Profile

Object size, number of objects, and object lifetime are all factors that impact your application's allocation profile. While allocations are quick, the efficiency of garbage collection depends (among other things) on the generation being collected. Collecting small objects from Gen 0 is the most efficient form of garbage collection because Gen 0 is the smallest and typically fits in the CPU cache. In contrast, frequent collection of objects from Gen 2 is expensive. To identify when allocations occur, and which generations they occur in, observe your application's allocation patterns by using an allocation profiler such as the CLR Profiler.

For more information, see "How To: Use CLR Profiler" in the "How To" section of this guide.
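
As a lightweight complement to a profiler, you can also sample the collection counters around a piece of work. The following sketch uses GC.CollectionCount, which is available in .NET Framework 2.0 and later. The AllocationProbe class and CountCollections method are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical helper (not part of the original guide).
public static class AllocationProbe
{
    // Runs the supplied work and returns, for each generation, how many
    // collections of that generation were recorded while it ran.
    public static int[] CountCollections(Action work)
    {
        int generations = GC.MaxGeneration + 1;
        int[] before = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            before[gen] = GC.CollectionCount(gen);

        work();

        int[] delta = new int[generations];
        for (int gen = 0; gen < generations; gen++)
            delta[gen] = GC.CollectionCount(gen) - before[gen];
        return delta;
    }
}
```

A high count for the oldest generation relative to Gen 0 suggests that objects are surviving longer than intended. The counters only indicate how often collections ran; they do not replace the per-object detail that an allocation profiler provides.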

Avoid Calling GC.Collect

The default GC.Collect method causes a full collection of all generations. Full collections are expensive because literally every live object in the system must be visited to ensure complete collection. Needless to say, exhaustively visiting all live objects could, and usually does, take a significant amount of time. The garbage collector's algorithm is tuned so that it does full collections only when it is likely to be worth the expense of doing so. As a result, do not call GC.Collect directly — let the garbage collector determine when it needs to run.

The garbage collector is designed to be self-tuning and it adjusts its operation to meet the needs of your application based on memory pressure. Programmatically forcing collection can hinder tuning and operation of the garbage collector.

If you have a particular niche scenario where you have to call GC.Collect, consider the following:

  • Call GC.WaitForPendingFinalizers after you call GC.Collect. This ensures that the current thread waits until finalizers for all objects are called.
  • After the finalizers run, there are more dead objects (those that were just finalized) that need to be collected. One more call to GC.Collect collects the remaining dead objects.
       System.GC.Collect(); // This gets rid of the dead objects
       System.GC.WaitForPendingFinalizers(); // This waits for any finalizers to finish.
       System.GC.Collect(); // This releases the memory associated with the objects that were just finalized.
    

Consider Using Weak References with Cached Data

Consider using weak references when you work with cached data, so that cached objects can be resurrected easily if needed or released by garbage collection when there is memory pressure. You should use weak references mostly for objects that are not small in size because the weak referencing itself involves some overhead. They are suitable for medium to large-sized objects stored in a collection.

Consider a scenario where you maintain a custom caching solution for the employee information in your application. If you hold your objects through a WeakReference wrapper, they can be collected when memory pressure grows during periods of high stress.

If on a subsequent cache lookup, you cannot find the object, re-create it from the information stored in an authoritative persistent source. In this way, you balance the use of cache and persistent medium. The following code demonstrates how to use a weak reference.

void SomeMethod() 
{
  // Create a collection
  ArrayList arr = new ArrayList(5);
  // Create a custom object
  MyObject mo = new MyObject();
  // Create a WeakReference object from the custom object
  WeakReference wk = new WeakReference(mo);
  // Add the WeakReference object to the collection
  arr.Add(wk);
  // Retrieve the weak reference
  WeakReference weakReference = (WeakReference)arr[0];
  MyObject mob = null;
  if( weakReference.IsAlive ){
    mob = (MyObject)weakReference.Target;
  }
  if(mob==null){
    // Resurrect the object as it has been garbage collected
  }
  //continue because we have the object
}
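
The lookup-then-re-create flow described above can be sketched as a small cache class. This is an illustration under stated assumptions, not a definitive implementation: the WeakCache class, its Put and Get methods, and the loader delegate are hypothetical names, and the sketch is not thread safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical cache that holds its values through weak references.
public class WeakCache
{
    private readonly Dictionary<string, WeakReference> entries =
        new Dictionary<string, WeakReference>();

    public void Put(string key, object value)
    {
        entries[key] = new WeakReference(value);
    }

    // Returns the cached object, or re-creates it through the supplied
    // loader when the entry is missing or has been collected.
    public object Get(string key, Func<string, object> loader)
    {
        WeakReference wr;
        if (entries.TryGetValue(key, out wr))
        {
            // Copy Target into a strong reference before testing it, so the
            // object cannot be collected between the test and its use.
            object cached = wr.Target;
            if (cached != null) return cached;
        }
        object fresh = loader(key);
        entries[key] = new WeakReference(fresh);
        return fresh;
    }
}
```

In a real cache, the loader would read from the authoritative persistent source, such as the employee database in the scenario above.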

Prevent the Promotion of Short-Lived Objects

Objects that are allocated and collected before leaving Gen 0 are referred to as short-lived objects. The following principles help ensure that your short-lived objects are not promoted:

  • Do not reference short-lived objects from long-lived objects. A common example where this occurs is when you assign a local object to a class level object reference.
    class Customer{
      Order _lastOrder;
      void insertOrder (int ID, int quantity, double amount, int productID){
        Order currentOrder = new Order(ID, quantity, amount, productID);
        currentOrder.Insert();
        this._lastOrder = currentOrder;
      }
    }
    

    Avoid this type of code because it increases the likelihood of the object being promoted beyond Gen 0, which delays the reclamation of the object's resources. One possible implementation that avoids this issue follows.

    class Customer{
      int _lastOrderID;
      void ProcessOrder (int ID, int quantity, double amount, int productID){
        . . .
        this._lastOrderID = ID;
        . . .
      }
    }
    

    The specific Order class is brought in by ID as needed.

  • Avoid implementing a Finalize method. The garbage collector must promote finalizable objects to older generations to facilitate finalization, which makes them long-lived objects.
  • Avoid having finalizable objects refer to anything. This can cause the referenced object(s) to become long-lived.

Set Unneeded Member Variables to Null Before Making Long-Running Calls

Before you block on a long-running call, you should explicitly set any unneeded member variables to null before making the call so they can be collected. This is demonstrated in the following code fragment.

class MyClass{
  private string str1;
  private string str2;

  void DoSomeProcessing(…){
    str1= GetResult(…);
    str2= GetOtherResult(…);
  }
  void MakeDBCall(…){
    PrepareForDBCall(str1,str2);
    str1=null;
    str2=null;
    // Make a database (long running) call
  }
}

This advice applies to any objects which are still statically or lexically reachable but are actually not needed:

  • If you no longer need a static variable in your class, or some other class, set it to null.
  • If you can "prune" your state, that is also a good idea. You might be able to eliminate most of a tree before a long-running call, for instance.
  • If there are any objects that could be disposed before the long-running call, set those to null.

Do not set local variables to null (C#) or Nothing (Visual Basic .NET) because the JIT compiler can statically determine that the variable is no longer referenced and there is no need to explicitly set it to null. The following code shows an example using local variables.

void func(…)
{
  String str1;
  str1="abc";
  // Avoid this
  str1=null;
}

Minimize Hidden Allocations

Memory allocation is extremely quick because it involves only a pointer relocation to create space for the new object. However, the memory has to be garbage collected at some point and that can hurt performance, so be aware of apparently simple lines of code that actually result in many allocations. For example, String.Split uses a delimiter to create an array of strings from a source string. In doing so, String.Split allocates a new string object for each string that it has split out, plus one object for the array. As a result, using String.Split in a heavy duty context (such as a sorting routine) can be expensive.

string attendees = "bob,jane,fred,kelly,jim,ann";
// The following single line allocates six substrings, the output
// names array, and the input separators array
string[] names = attendees.Split( new char[] {','});

Also watch out for allocations that occur inside a loop such as string concatenations using the += operator. Finally, hashing methods and comparison methods are particularly bad places to put allocations because they are often called repeatedly in the context of searching and sorting. For more information about how to handle strings efficiently, see "String Operations" later in this chapter.
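
To illustrate the loop concatenation point, the following sketch contrasts the += operator with System.Text.StringBuilder. The NameJoiner class and its method names are hypothetical. Both methods produce the same result, but the first creates a new intermediate string on every iteration.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of the original guide).
public static class NameJoiner
{
    // Allocates a new, longer string on every iteration.
    public static string JoinWithConcat(string[] names)
    {
        string result = "";
        foreach (string name in names)
            result += name + ";";
        return result;
    }

    // Appends into a growable buffer, so far fewer intermediate
    // strings are created.
    public static string JoinWithBuilder(string[] names)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string name in names)
            sb.Append(name).Append(';');
        return sb.ToString();
    }
}
```

For a handful of items the difference is negligible. It becomes significant when the loop runs many times or sits on a frequently executed code path.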

Avoid or Minimize Complex Object Graphs

Try to avoid using complex data structures or objects that contain a lot of references to other objects. These can be expensive to allocate and create additional work for the garbage collector. Simpler graphs have superior locality and less code is needed to maintain them. A common mistake is to make the graphs too general.

Avoid Preallocating and Chunking Memory

C++ programmers often allocate a large block of memory (using malloc) and then use chunks at a time, to save multiple calls to malloc. This is not advisable for managed code for several reasons:

  • Allocation of managed memory is a quick operation and the garbage collector has been optimized for extremely fast allocations. The main reason for preallocating memory in unmanaged code is to speed up the allocation process. This is not an issue for managed code.
  • If you preallocate memory, you cause more allocations than needed; this can trigger unnecessary garbage collections.
  • The garbage collector is unable to reclaim the memory that you manually recycle.
  • Preallocated memory ages and costs more to recycle when it is ultimately released.

Finalize and Dispose Explained

The garbage collector offers an additional, optional service called finalization. Use finalization for objects that need to perform cleanup processing during the collection process and just before the object's memory is reclaimed. Finalization is most often used to release unmanaged resources maintained by an object; any other use should be closely examined. Examples of unmanaged resources include file handles, database connections, and COM object references.

Finalize

Some objects require additional cleanup because they use unmanaged resources, and these need to be released. This is handled by finalization. An object registers for finalization by overriding the Object.Finalize method. In C# and Managed Extensions for C++, implement Finalize by providing a method that looks like a C++ destructor.

Note   The semantics of the Finalize method and a C++ destructor should not be confused. The syntax is the same but the similarity ends there.

An object's Finalize method is called before the object's managed memory is reclaimed. This allows you to release any unmanaged resources that are maintained by the object. If you implement Finalize, you cannot control when this method is called because this is left to the garbage collector. This is commonly referred to as nondeterministic finalization.

The finalization process requires a minimum of two collection cycles to fully release the object's memory. During the first collection pass, the object is marked for finalization. Finalization runs on a specialized thread that is independent from the garbage collector. After finalization occurs, the garbage collector can reclaim the object's memory.

Because of the nondeterministic nature of finalization, there is no guarantee regarding the time or order of object collection. Also, memory resources may be held for a long time before being garbage collected.

In C#, implement Finalize by using destructor syntax.

class yourObject {
  // This is a finalizer implementation
  ~yourObject() {
    // Release your unmanaged resources here
    . . .
  }
}

The preceding syntax causes the compiler to generate the following code.

class yourObject {
  protected override void Finalize() {
    try{
      . . .
    }
    finally {
      base.Finalize();
    }
  }
}

In Visual Basic .NET, you need to override Object.Finalize.

Protected Overrides Sub Finalize()
  ' clean up unmanaged resources
End Sub

Dispose

Provide the Dispose method (using the Dispose pattern, which is discussed later in this chapter) for types that contain references to external resources that need to be explicitly freed by the calling code. You can avoid finalization by implementing the IDisposable interface and by allowing your class's consumers to call Dispose.

You want to avoid finalization because it is performed asynchronously and unmanaged resources might not be freed in a timely fashion. This is especially important for large and expensive unmanaged resources such as bitmaps or database connections. In these cases, the classic style of explicitly releasing your resources is preferred (using the IDisposable interface and providing a Dispose method). With this approach, resources are reclaimed as soon as the consumer calls Dispose and the object need not be queued for finalization. Statistically, what you want to see is that almost all of your finalizable objects are being disposed and not finalized. The finalizer should only be your backup.

With this approach, you release unmanaged resources in the IDisposable.Dispose method. This method can be called explicitly by your class's consumers or implicitly by using the C# using statement.

To prevent the garbage collector from requesting finalization, your Dispose implementation should call GC.SuppressFinalize.

More Information

For more information about the Dispose method, see Microsoft Knowledge Base article 315528, "INFO: Implementing Dispose Method in a Derived Class," at http://support.microsoft.com/default.aspx?scid=kb;en-us;315528.

Close

For certain classes of objects, such as files or database connection objects, a Close method better represents the logical operation that should be performed when the object's consumer is finished with the object. As a result, many objects expose a Close method in addition to a Dispose method. In well written cases, both are functionally equivalent.
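
A minimal sketch of a type that keeps Close and Dispose functionally equivalent follows. The LogChannel class and its IsClosed property are hypothetical and used only for illustration; Close simply forwards to Dispose so that there is a single cleanup path.

```csharp
using System;

// Hypothetical resource wrapper that offers Close as an alias for Dispose.
public class LogChannel : IDisposable
{
    private bool closed;

    public bool IsClosed { get { return closed; } }

    // Close reads more naturally for a channel; it forwards to Dispose.
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        if (closed) return;
        // Release the underlying resource here.
        closed = true;
    }
}
```

Because both methods share one implementation, a consumer can call either one, or both, without any difference in behavior.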

Dispose Pattern

The Dispose pattern defines the way you should implement dispose (and finalizer) functionality on all managed classes that maintain resources that the caller must be allowed to explicitly release. To implement the Dispose pattern, do the following:

  • Create a class that implements IDisposable.
  • Add a private member variable to track whether IDisposable.Dispose has already been called. Clients should be allowed to call the method multiple times without generating an exception. If another method on the class is called after a call to Dispose, you should throw an ObjectDisposedException.
  • Implement a protected virtual void overload of the Dispose method that accepts a single bool parameter. This method contains common cleanup code that is called either when the client explicitly calls IDisposable.Dispose or when the finalizer runs. The bool parameter is used to indicate whether the cleanup is being performed as a result of a client call to IDisposable.Dispose or as a result of finalization.
  • Implement the IDisposable.Dispose method that accepts no parameters. This method is called by clients to explicitly force the release of resources. Check whether Dispose has been called before; if it has not been called, call Dispose(true) and then prevent finalization by calling GC.SuppressFinalize(this). Finalization is no longer needed because the client has explicitly forced a release of resources.
  • Create a finalizer, by using destructor syntax. In the finalizer, call Dispose(false).

C# Example of Dispose

Your code should look like the following.

public class MyClass: IDisposable
{
  // Variable to track if Dispose has been called
  private bool disposed = false;
  // Implement the IDisposable.Dispose() method
  public void Dispose(){
    // Check if Dispose has already been called 
    if (!disposed)
    {
      // Call the overloaded Dispose method that contains common cleanup code
      // Pass true to indicate that it is called from Dispose
      Dispose(true);
      // Prevent subsequent finalization of this object. Finalization is not
      // needed because managed and unmanaged resources have been explicitly released
      GC.SuppressFinalize(this);
    }
  }

  // Implement a finalizer by using destructor style syntax
  ~MyClass() {
    // Call the overloaded Dispose method that contains common cleanup code
    // Pass false to indicate that it is not called from Dispose
    Dispose(false);
  }

  // Implement the overloaded Dispose method that will contain common
  // cleanup functionality
  protected virtual void Dispose(bool disposing){
    if(disposing){
      // Dispose time code
      . . .
    }
    // Finalize time code
    . . .
    // Record that cleanup has run so that later Dispose calls do nothing
    disposed = true;
  }
  …
}

Passing true to the protected Dispose method ensures that dispose specific code is called. Passing false skips the Dispose specific code. The Dispose(bool) method can be called directly by your class or indirectly by the client.

If you reference any static variables or methods in your finalize-time Dispose code, make sure you check the Environment.HasShutdownStarted property. If your object is thread safe, be sure to take whatever locks are necessary for cleanup.

Use the HasShutdownStarted property in an object's Dispose method to determine whether the CLR is shutting down or the application domain is unloading. If that is the case, you cannot reliably access any object that has a finalization method and is referenced by a static field.

protected virtual void Dispose(bool disposing){
  if(disposing){
    // dispose-time code
    . . .
  }
  // finalize-time code
  CloseHandle();

  if(!Environment.HasShutdownStarted) 
  { // Debug.Write and Trace.Write are static methods
    Debug.WriteLine("Finalizer Called");
  }
  disposed = true;
}

Visual Basic .NET Example of Dispose

The Visual Basic .NET version of the Dispose pattern is shown in the following code sample.

'Visual Basic .NET Code snippet
Public Class MyDispose
    Implements IDisposable

    Public Overloads Sub Dispose() Implements IDisposable.Dispose
        Dispose(True)
        GC.SuppressFinalize(Me) ' No need to call the finalizer
    End Sub
 
    Protected Overridable Overloads Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            ' Free managed resources
        End If
        ' Free unmanaged resources
    End Sub
 
    Protected Overrides Sub Finalize()
        Dispose(False)
    End Sub
End Class

Finalize and Dispose Guidelines

This section summarizes Finalize and Dispose recommendations:

  • Call Close or Dispose on classes that support it.
  • Use the using statement in C# and Try/Finally blocks in Visual Basic .NET to ensure Dispose is called.
  • Do not implement Finalize unless required.
  • Implement Finalize only if you hold unmanaged resources across client calls.
  • Move the Finalization burden to the leaves of object graphs.
  • If you implement Finalize, implement IDisposable.
  • If you implement Finalize and Dispose, use the Dispose pattern.
  • Suppress finalization in your Dispose method.
  • Allow Dispose to be called multiple times.
  • Call Dispose on base classes and on IDisposable members.
  • Keep finalizer code simple to prevent blocking.
  • Provide thread safe cleanup code only if your type is thread safe.

Call Close or Dispose on Classes that Support It

If the managed class you use implements Close or Dispose, call one of these methods as soon as you are finished with the object. Do not simply let the resource fall out of scope. If an object implements Close or Dispose, it does so because it holds an expensive, shared, native resource that should be released as soon as possible.

Disposable Resources

Common disposable resources include the following:

  • Database-related classes: SqlConnection, SqlDataReader, and SqlTransaction.
  • File-based classes: FileStream and BinaryWriter.
  • Stream-based classes: StreamReader, TextReader, TextWriter, and BinaryReader.
  • Network-based classes: Socket, UdpClient, and TcpClient.

For a full list of classes that implement IDisposable in the .NET Framework, see "IDisposable Interface" in the .NET Framework Class Library on MSDN at http://msdn.microsoft.com/en-us/library/system.idisposable.aspx.

COM Objects

In server scenarios where you create and destroy COM objects on a per-request basis, you may need to call System.Runtime.InteropServices.Marshal.ReleaseComObject.

The Runtime Callable Wrapper (RCW) has a reference count that is incremented every time a COM interface pointer is mapped to it (this is not the same as the reference count of the IUnknown AddRef/Release methods). The ReleaseComObject method decrements the reference count of the RCW. When the reference count reaches zero, the runtime releases all its references on the unmanaged COM object.

For example, if you create and destroy COM objects from an ASP.NET page, and you can track their lifetime explicitly, you should test calling ReleaseComObject to see if throughput improves.

For more information about RCWs and ReleaseComObject, see Chapter 7, "Improving Interop Performance."

Enterprise Services (COM+)

It is not recommended that you share serviced components or COM or COM+ objects in cases where your objects are created in a nondefault context. An object can end up in a nondefault context either because your component is a serviced component configured in COM+ or because your component is a simple COM component that is placed in a nondefault context by virtue of its client. For example, clients such as ASP.NET pages running in a transaction or running in ASPCOMPAT mode are always located inside a COM+ context. If your client is a serviced component itself, the same rule applies.

The main reason for not sharing serviced components is that crossing a COM+ context boundary is expensive. The cost is even higher if your client-side COM+ context has thread affinity because it is located inside an STA.

In such cases, you should follow acquire, work, release semantics. Activate your component, perform work with it, and then release it immediately. When you use Enterprise Services and classes that derive from System.EnterpriseServices.ServicedComponent, you need to call Dispose on those classes.

If the component you call into is an unmanaged COM+ component, you need to call Marshal.ReleaseComObject. In the case of nonconfigured COM components (components not installed in the COM+ catalog) if your client is inside a COM+ context and your COM component is not agile, it is still recommended that you call Marshal.ReleaseComObject.

For more information about proper cleanup of serviced components, see the "Resource Management" section in Chapter 8, "Improving Enterprise Services Performance."

Use the using Statement in C# and Try/Finally Blocks in Visual Basic .NET to Ensure Dispose Is Called

Call Close or Dispose inside a Finally block in Visual Basic .NET code to ensure that the method is called even when an exception occurs.

Dim myFile As StreamReader
myFile = New StreamReader("C:\ReadMe.Txt")
Try 
  Dim contents As String = myFile.ReadToEnd()
  '... use the contents of the file
Finally 
  myFile.Close()
End Try

The using Statement in C#

For C# developers, the using statement automatically generates a try and finally block at compile time that calls Dispose on the object allocated inside the using block. The following code illustrates this syntax.

using( StreamReader myFile = new StreamReader("C:\\ReadMe.Txt")){
       string contents = myFile.ReadToEnd();
       //... use the contents of the file
 
} // dispose is called and the StreamReader's resources released

During compilation, the preceding code is converted into the following equivalent code.

StreamReader myFile = new StreamReader("C:\\ReadMe.Txt");
try{
  string contents = myFile.ReadToEnd();
  //... use the contents of the file
}
finally{
  myFile.Dispose();
}
Note   The next release of Visual Basic .NET will contain the equivalent of a using statement.

Do Not Implement Finalize Unless Required

Implementing a finalizer on classes that do not require it adds load to the finalizer thread and the garbage collector. Avoid implementing a finalizer or destructor unless finalization is required.

Classes with finalizers require a minimum of two garbage collection cycles to be reclaimed. This prolongs the use of memory and can contribute to memory pressure. When the garbage collector encounters an unused object that requires finalization, it moves it to the "ready-to-be-finalized" list. Cleanup of the object's memory is deferred until after the single specialized finalizer thread can execute the registered finalizer method on the object. After the finalizer runs, the object is removed from the queue and literally dies a second death. At that point, it is collected along with any other objects. If your class does not require finalization, do not implement a Finalize method.

Implement Finalize Only If You Hold Unmanaged Resources across Client Calls

Use a finalizer only on objects that hold unmanaged resources across client calls. For example, if your object has only one method named GetData that opens a connection, fetches data from an unmanaged resource, closes the connection, and returns data, there is no need to implement a finalizer. However, if your object also exposes an Open method in which a connection to an unmanaged resource is made, and then data is fetched using a separate GetData method, it is possible for the connection to be maintained to the unmanaged resource across calls. In this case, you should provide a Finalize method to clean up the connection to the unmanaged resource, and in addition use the Dispose pattern to give the client the ability to explicitly release the resource after it is finished.

Note   You must be holding the unmanaged resource directly. If you use a managed wrapper you do not need your own finalizer, although you might still choose to implement IDisposable so that you can pass along the dispose request to the underlying object.
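
The case described in the preceding note can be sketched as follows: the type owns a managed wrapper (a Stream), so it implements IDisposable to pass the dispose request along, but it declares no finalizer. The ReportWriter class and its members are hypothetical names used for illustration.

```csharp
using System;
using System.IO;

// Hypothetical type that holds a managed wrapper rather than a raw handle.
public class ReportWriter : IDisposable
{
    private readonly Stream output;   // the stream owns any native handle
    private bool disposed;

    public ReportWriter(Stream output)
    {
        if (output == null) throw new ArgumentNullException("output");
        this.output = output;
    }

    public void WriteByte(byte value)
    {
        if (disposed) throw new ObjectDisposedException("ReportWriter");
        output.WriteByte(value);
    }

    // No finalizer: the wrapped stream is responsible for its own
    // unmanaged cleanup. Dispose only forwards the request.
    public void Dispose()
    {
        if (disposed) return;
        output.Dispose();
        disposed = true;
    }
}
```

Because the class has no finalizer, instances are never placed on the finalization queue and can be reclaimed in a single collection.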

Move the Finalization Burden to the Leaves of Object Graphs

If you have an object graph with an object referencing other objects (leaves) that hold unmanaged resources, you should implement the finalizers in the leaf objects instead of in the root object.

There are several reasons for this. First, the object that is being finalized will survive the first collection and be placed on the finalization list. The fact that the object survives means that it could be promoted to an older generation just like any other object, increasing the cost of collecting it in the future. Second, because the object survived, any objects it might be holding will also survive, together with their sub objects, and so on. So the entire object graph below the finalized object ends up living longer than necessary and being collected in a more expensive generation.

Avoid both these problems by making sure that your finalizable objects are always leaves in the object graph. It is recommended that they hold the unmanaged resource they wrap and nothing else.

Moving the finalization burden to leaf objects results in the promotion of only the relevant ones to the finalization queue, which helps optimize the finalization process.
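
The following sketch shows one way to arrange such a graph. The NativeBuffer and ImageDocument classes are hypothetical, and Marshal.AllocHGlobal stands in for whatever unmanaged resource your leaf object wraps. Only the leaf has a finalizer; the root holds managed state and forwards Dispose.

```csharp
using System;
using System.Runtime.InteropServices;

// Leaf: holds only the unmanaged resource, so only this small object
// is ever registered for finalization.
public class NativeBuffer : IDisposable
{
    private IntPtr memory;

    public NativeBuffer(int size)
    {
        memory = Marshal.AllocHGlobal(size);
    }

    public bool IsAllocated { get { return memory != IntPtr.Zero; } }

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (memory != IntPtr.Zero)
        {
            Marshal.FreeHGlobal(memory);
            memory = IntPtr.Zero;
        }
    }
}

// Root: references managed state plus the leaf and has no finalizer.
public class ImageDocument : IDisposable
{
    private readonly string title;
    private readonly NativeBuffer pixels;

    public ImageDocument(string title, int pixelBytes)
    {
        this.title = title;
        this.pixels = new NativeBuffer(pixelBytes);
    }

    public string Title { get { return title; } }
    public NativeBuffer Pixels { get { return pixels; } }

    public void Dispose()
    {
        pixels.Dispose();
    }
}
```

If a client forgets to call Dispose, only the NativeBuffer instance survives the first collection for finalization. The ImageDocument and the rest of its managed state can be reclaimed immediately.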

If You Implement Finalize, Implement IDisposable

You should implement IDisposable if you implement a finalizer. In this way, the calling code has an explicit way to free resources by calling the Dispose method.

You should still implement a finalizer along with Dispose because you cannot assume that the calling code always calls Dispose. Although costly, the finalizer implementation ensures that resources are released.

If You Implement Finalize and Dispose, Use the Dispose Pattern

If you implement Finalize and Dispose, use the Dispose pattern as described earlier.

Suppress Finalization in Your Dispose Method

The purpose of providing a Dispose method is to allow the calling code to release unmanaged resources as soon as possible and to prevent two cycles being taken for the object's cleanup. If the calling code calls Dispose, you do not want the garbage collector to call a finalizer because the unmanaged resources will have already been returned to the operating system. You must prevent the garbage collector from calling the finalizer by calling GC.SuppressFinalize in your Dispose method.

public void Dispose()
{
  // Using the dispose pattern
  Dispose(true); 
  // ... release unmanaged resources here
  GC.SuppressFinalize(this);
}

Allow Dispose to Be Called Multiple Times

Calling code should be able to safely call Dispose multiple times without causing exceptions. After the first call, subsequent calls should do nothing and should not throw an ObjectDisposedException.

You should throw an ObjectDisposedException exception from any other method (other than Dispose) on the class that is called after Dispose has been called.

A common practice is to keep a private variable that denotes whether Dispose has been called.

public class Customer : IDisposable{
  private bool disposed = false;
  . . .
  public void SomeMethod(){
    if(disposed){
      throw new ObjectDisposedException(this.ToString());
    }
    . . .
  }
  public void Dispose(){
    //check before calling your Dispose pattern
    if (!disposed)
    { ... }
  }
  . . .
}

Call Dispose On Base Classes and On IDisposable Members

If your class inherits from a disposable class, then make sure that it calls the base class's Dispose. Also, if you have any member variables that implement IDisposable, call Dispose on them, too.

The following code fragment demonstrates calling Dispose on base classes.

public class BusinessBase : IDisposable{
  public void Dispose() {...}
  protected virtual void Dispose(bool disposing)  {}
  ~BusinessBase() {...}
}

public class Customer : BusinessBase, IDisposable{
  private bool disposed = false;

  protected override void Dispose(bool disposing) {
    // Check before calling your Dispose pattern
    if (!disposed){
      if (disposing) {
        // free managed objects
      }
      // free unmanaged objects
      base.Dispose(disposing);
      disposed = true;
    }
  }
}

Keep Finalizer Code Simple to Prevent Blocking

Finalizer code should be simple and minimal. The finalization happens on a dedicated, single finalizer thread. Apply the following guidelines to your finalizer code:

  • Do not issue calls that could block the calling thread. If the finalizer does block, resources are not freed and the application leaks memory.
  • Do not use thread local storage or any other technique that requires thread affinity because the finalizer method is called by a dedicated thread, separate from your application's main thread.

If multiple threads allocate many finalizable objects, they could allocate more finalizable objects in a specific timeframe than the finalizer thread can clean up. For this reason, Microsoft may choose to implement multiple finalizer threads in a future version of the CLR. As a result, it is recommended that you write your finalizers so they do not depend on shared state. If they do, you should use locks to prevent concurrent access by other instances of the same finalizer method on different object instances. However, you should try to keep finalizer code simple (for example, nothing more complicated than just a CloseHandle call) to avoid these issues.

Provide Thread Safe Cleanup Code Only if Your Type Is Thread Safe

If your type is thread safe, make sure your cleanup code is also thread safe. For example, if your thread safe type provides both Close and Dispose methods to clean up resources, ensure you synchronize threads calling Close and Dispose simultaneously.
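
One possible way to synchronize concurrent Close and Dispose calls is an atomic flag, as in the following sketch. The SharedConnection class and its CleanupCount property are hypothetical; the counter exists only so that the example can show that cleanup runs once. Note that in this sketch a losing caller can return before the winning caller has finished its cleanup. If callers must wait for cleanup to complete, use a lock instead.

```csharp
using System;
using System.Threading;

// Hypothetical thread safe type with thread safe cleanup.
public class SharedConnection : IDisposable
{
    private int disposedFlag;   // 0 = live, 1 = cleanup claimed
    private int cleanupCount;   // how many times cleanup actually ran

    public int CleanupCount { get { return cleanupCount; } }

    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        // Only the first caller observes 0 and performs the cleanup.
        if (Interlocked.Exchange(ref disposedFlag, 1) == 0)
        {
            Interlocked.Increment(ref cleanupCount);
            // Release resources here.
        }
    }
}
```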

Pinning

To safely communicate with unmanaged services, it is sometimes necessary to ask the garbage collector to refrain from relocating a certain object in memory. Such an object is said to be "pinned" and the process is called "pinning". Because the garbage collector is not able to move pinned objects, the managed heap may fragment like a traditional heap and thereby reduce available memory. Pinning can be performed both explicitly and implicitly:

  • Implicit pinning is performed in most P/Invoke and COM interop scenarios when passing certain parameters, such as strings.
  • Explicit pinning can be performed in a number of ways. You can create a GCHandle and pass GCHandleType.Pinned as the argument.
    GCHandle hmem = GCHandle.Alloc((Object) someObj, GCHandleType.Pinned);
    

    You can also use the fixed statement in an unsafe block of code.

    // assume class Circle { public int rad; }
    Circle cr = new Circle ();    // cr is a managed variable, subject to gc.
    fixed ( int* p = &cr.rad ){  // must use fixed to get address of cr.rad
        *p = 1;                //   pin cr in place while we use the pointer
    }
    

If You Need to Pin Buffers, Allocate Them at Startup

Allocating buffers just before a slow I/O operation and then pinning them can result in excessive memory consumption because of heap fragmentation. Because the memory just allocated will most likely be in Gen 0 or perhaps Gen 1, pinning this is problematic because, by design, those generations are the ones that are the most frequently compacted. Each pinned object makes the compaction process that much more expensive and leads to a greater chance of fragmentation. The youngest generations are where you can least afford this cost.

To avoid these problems, you should allocate these buffers during application startup and treat them as a buffer pool for all I/O operations. The sooner the objects are allocated, the sooner they can get into Gen 2. After the objects are in Gen 2, the cost of pinning is greatly reduced due to the lesser frequency of compaction.
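The buffer pool idea can be sketched as follows. PinnedBufferPool and its Rent, Return, and GetAddress members are hypothetical names chosen for illustration; the sketch assumes a fixed number of equally sized buffers that are allocated and pinned once, when the application starts, and released only at shutdown.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical pool: buffers are allocated and pinned once, at startup.
public class PinnedBufferPool : IDisposable
{
    private readonly byte[][] buffers;
    private readonly GCHandle[] handles;
    private readonly bool[] inUse;
    private readonly object poolLock = new object();

    public PinnedBufferPool(int count, int size)
    {
        buffers = new byte[count][];
        handles = new GCHandle[count];
        inUse = new bool[count];
        for (int i = 0; i < count; i++)
        {
            buffers[i] = new byte[size];
            // Pin each buffer once; it stays pinned for the life of the pool.
            handles[i] = GCHandle.Alloc(buffers[i], GCHandleType.Pinned);
        }
    }

    // Returns the index of a free buffer, or -1 if all buffers are in use.
    public int Rent()
    {
        lock (poolLock)
        {
            for (int i = 0; i < inUse.Length; i++)
            {
                if (!inUse[i]) { inUse[i] = true; return i; }
            }
            return -1;
        }
    }

    public void Return(int index)
    {
        lock (poolLock) { inUse[index] = false; }
    }

    public byte[] GetBuffer(int index) { return buffers[index]; }

    // Stable address of the pinned buffer, suitable for passing to unmanaged code.
    public IntPtr GetAddress(int index) { return handles[index].AddrOfPinnedObject(); }

    public void Dispose()
    {
        // Unpin the buffers so the garbage collector can reclaim them.
        for (int i = 0; i < handles.Length; i++)
        {
            if (handles[i].IsAllocated) handles[i].Free();
        }
    }
}
```

An I/O routine would call Rent, pass GetAddress or GetBuffer to the operation, and call Return when the operation completes, so no new buffer is allocated or pinned per request.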

Threading Explained

The .NET Framework exposes various threading and synchronization features. Your use of multiple threads can have a significant impact on application performance and scalability.

Managed Threads and Operating System Threads

The CLR exposes managed threads, which are distinct from Microsoft Win32® threads. The logical thread is the managed representation of a thread, and the physical thread is the Win32 thread that actually executes code. You cannot guarantee that there will be a one-to-one correspondence between a managed thread and a Win32 thread.

If you create a managed thread object and then do not start it by calling its Start method, a new Win32 thread is not created. When a managed thread is terminated or it completes, the underlying Win32 thread is destroyed. The managed representation (the Thread object) is cleaned up only during garbage collection some indeterminate time later.

The .NET Framework class library provides the ProcessThread class as the representation of a Win32 thread and the System.Threading.Thread class as the representation of a managed thread.

Poorly-written multithreaded code can lead to numerous problems including deadlocks, race conditions, thread starvation, and thread affinity. All of these issues can negatively impact application performance, scalability, resilience, and correctness.

Threading Guidelines

This section summarizes guidelines to improve the efficiency of your threading code:

  • Minimize thread creation.
  • Use the thread pool when you need threads.
  • Use a Timer to schedule periodic tasks.
  • Consider parallel vs. synchronous tasks.
  • Do not use Thread.Abort to terminate other threads.
  • Do not use Thread.Suspend and Thread.Resume to pause threads.

Minimize Thread Creation

Threads use both managed and unmanaged resources and are expensive to initialize. If you spawn threads indiscriminately, it can result in increased context switching on the processor. The following code shows a new thread being created and maintained for each request. This may result in the processor spending most of its time performing thread switches; it also places increased pressure on the garbage collector to clean up resources.

private void Page_Load(object sender, System.EventArgs e) 
{
  if (Page.IsPostBack)
  {                     
    // Create and start a thread
    ThreadStart ts = new ThreadStart(CallMyFunc);
    Thread th = new Thread(ts);
    th.Start();   // Start is a method of Thread, not of the ThreadStart delegate
    ...
  }
}

Use the Thread Pool When You Need Threads

Use the CLR thread pool to execute thread-based work to avoid expensive thread initialization. The following code shows a method being executed using a thread from the thread pool.

WaitCallback methodTarget = new WaitCallback( myClass.UpdateCache );
ThreadPool.QueueUserWorkItem( methodTarget );

When QueueUserWorkItem is called, the method is queued for execution and the calling thread returns and continues execution. The ThreadPool class uses a thread from the application's pool to execute the method passed in the callback as soon as a thread is available.
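The following sketch shows one way to pass state to the queued method and to find out when it has finished. CacheUpdater, QueueUpdate, and WaitForCompletion are hypothetical names; the sketch assumes a single outstanding work item and uses a ManualResetEvent as the completion signal.

```csharp
using System;
using System.Threading;

// Hypothetical class; names are illustrative only.
public class CacheUpdater
{
    private readonly ManualResetEvent done = new ManualResetEvent(false);

    // Written by the pool thread, read by the caller after WaitForCompletion.
    public string LastKey;

    // Matches the WaitCallback signature: void (object state).
    public void UpdateCache(object state)
    {
        LastKey = (string)state;
        done.Set();   // signal that the work item has completed
    }

    // Queues the work and returns immediately; a pool thread runs UpdateCache.
    public void QueueUpdate(string key)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(UpdateCache), key);
    }

    // Returns true if the work item completed within the timeout.
    public bool WaitForCompletion(int millisecondsTimeout)
    {
        return done.WaitOne(millisecondsTimeout);
    }
}
```

The caller only blocks if and when it calls WaitForCompletion; in many cases the queued work needs no completion signal at all.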

Use a Timer to Schedule Periodic Tasks

Use the System.Threading.Timer class to schedule periodic tasks. The Timer class allows you to specify the interval at which your code should be executed. The following code shows a method being called every 30 seconds.

...
TimerCallback myCallBack = new TimerCallback( myHouseKeepingTask );
Timer myTimer = new System.Threading.Timer( myCallBack, null, 0, 30000);

static void myHouseKeepingTask(object state)
{
  ...
}

When the timer elapses, a thread from the thread pool is used to execute the code indicated in the TimerCallback. This results in optimum performance because it avoids the thread initialization incurred by creating a new thread.

Consider Parallel vs. Synchronous Tasks

Before implementing asynchronous code, carefully consider the need for performing multiple tasks in parallel. Increasing parallelism can have a significant effect on your performance metrics. Additional threads consume resources such as memory, disk I/O, network bandwidth, and database connections. Also, additional threads may cause significant overhead from contention or context switching. In all cases, it is important to verify that adding threads is helping you to meet your objectives rather than hindering your progress.

The following are examples where performing multiple tasks in parallel might be appropriate:

  • Where one task is not dependent on the results of another, such that it can run without waiting on the other.
  • If work is I/O bound. Any task involving I/O benefits from having its own thread, because the thread sleeps during the I/O operation which allows other threads to execute. However, if the work is CPU bound, parallel execution is likely to have a negative impact on performance.

Do Not Use Thread.Abort to Terminate Other Threads

Avoid using Thread.Abort to terminate other threads. When you call Abort, the CLR throws a ThreadAbortException. Calling Abort does not immediately result in thread termination. It causes an exception on the thread to be terminated. You can use Thread.Join to wait on the thread to make sure that the thread has terminated.
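One alternative to Abort, which this chapter does not prescribe but which is commonly used, is cooperative shutdown: the worker checks a stop signal between units of work and exits on its own, and the caller uses Thread.Join to wait for it. The Worker class below is a hypothetical sketch of that approach; the 10 millisecond wait stands in for whatever pacing the real work needs.

```csharp
using System;
using System.Threading;

// Hypothetical worker that stops cooperatively instead of being aborted.
public class Worker
{
    private readonly ManualResetEvent stopRequested = new ManualResetEvent(false);
    private Thread thread;

    public int Iterations = 0;

    public void Start()
    {
        thread = new Thread(new ThreadStart(Run));
        thread.Start();
    }

    private void Run()
    {
        // WaitOne returns false on timeout and true once Stop has signaled.
        while (!stopRequested.WaitOne(10))
        {
            Iterations++;   // one unit of work
        }
        // The thread reaches this point at a known, safe place in its logic.
    }

    public void Stop()
    {
        stopRequested.Set();   // ask the worker to finish
        thread.Join();         // wait until it has actually terminated
    }

    public bool IsRunning
    {
        get { return thread != null && thread.IsAlive; }
    }
}
```

Because the worker exits at a point it chooses, it never leaves shared state half updated, which is the main risk when an exception is injected by Abort.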

Do Not Use Thread.Suspend and Thread.Resume to Pause Threads

Never call Thread.Suspend and Thread.Resume to synchronize the activities of multiple threads. Do not call Suspend to suspend low priority threads — consider setting the Thread.Priority property instead of controlling the threads intrusively.

Calling Suspend on one thread from the other is a highly intrusive process that can result in serious application deadlocks. For example, you might suspend a thread that is holding onto resources needed by other threads or the thread that called Suspend.

If you need to synchronize the activities of multiple threads, use lock(object), Monitor, Mutex, ManualResetEvent, or AutoResetEvent. Mutex, ManualResetEvent, and AutoResetEvent are derivatives of the WaitHandle class, which allows you to synchronize threads within and across processes. The lock statement and the Monitor class are not based on WaitHandle.

Note   lock(object) is the cheapest operation and will meet most, if not all, of your synchronization needs.
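As a sketch of signaling between threads without Suspend and Resume, the hypothetical Handoff class below has a consumer thread wait on an AutoResetEvent until a producer thread has stored an item. All names are illustrative, and the sketch assumes a single item is handed over.

```csharp
using System;
using System.Threading;

// Hypothetical one-item handoff between two threads.
public class Handoff
{
    private readonly AutoResetEvent itemReady = new AutoResetEvent(false);
    private readonly object sync = new object();
    private string item;

    public string Received;

    // Called by the producing thread.
    public void Produce(string value)
    {
        lock (sync) { item = value; }
        itemReady.Set();   // wake the consumer; the event resets automatically
    }

    // Called by the consuming thread; blocks without using CPU until signaled.
    public void Consume()
    {
        itemReady.WaitOne();
        lock (sync) { Received = item; }
    }
}
```

The consumer is never paused at an arbitrary instruction; it waits only at the point where it has nothing to do, so it cannot be holding a lock that another thread needs.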

Asynchronous Calls Explained

Asynchronous calls provide a mechanism for increasing the concurrency of your application. Asynchronous calls are nonblocking and when you call a method asynchronously, the calling thread returns immediately and continues execution of the current method.

There are a number of ways to make asynchronous method calls:

  • Calling asynchronous components. Certain classes support the .NET Framework asynchronous invocation model by providing BeginInvoke and EndInvoke methods. If the class expects an explicit call to EndInvoke at the end of the unit of work, then call it. This also helps capture failures if there are any in your asynchronous calls.
  • Calling nonasynchronous components. If a class does not support BeginInvoke and EndInvoke, you can use one of the following approaches:
    • Use the .NET thread pool.
    • Explicitly create a thread.
    • Use delegates.
    • Use timers.

Asynchronous Guidelines

This section summarizes guidelines for optimized performance when you are considering asynchronous execution:

  • Consider client-side asynchronous calls for UI responsiveness.
  • Use asynchronous methods on the server for I/O bound operations.
  • Avoid asynchronous calls that do not add parallelism.

Consider Client-Side Asynchronous Calls for UI Responsiveness

You can use asynchronous calls to increase the responsiveness of client applications. However, think about this carefully because asynchronous calls introduce additional programming complexity and require careful synchronization logic to be added to your graphical interface code.

The following code shows an asynchronous call followed by a loop that polls for the asynchronous call's completion. You can add an exit criterion to the while condition in case you need to exit from the function before the call is completed. You can use the callback mechanism or wait for completion if you do not need to update the client.

IAsyncResult CallResult = SlowCall.BeginInvoke(slow,null,null);
while ( CallResult.IsCompleted == false)
{ 
   ... // provide user feedback 
}
SlowCall.EndInvoke(CallResult);

Use Asynchronous Methods on the Server for I/O Bound Operations

You can increase the performance of your application by executing multiple operations at the same time, provided that the operations are not dependent on each other. For example, the following code calls two Web service methods synchronously, one after the other. The duration of the code is the sum of the durations of both calls.

// get a reference to the proxy
EmployeeService employeeProxy = new EmployeeService();

// execute first and block until complete
employeeProxy.CalculateFederalTaxes(employee);
// execute second and block until complete
employeeProxy.CalculateStateTaxes(employee);

You can refactor the code as follows to reduce the total duration of the operation. In the following code, both methods execute simultaneously, which reduces the overall duration of the operation. Note that the following example uses the BeginCalculateFederalTaxes method, an asynchronous version of CalculateFederalTaxes; both of these methods are automatically generated when you reference a Web service from your client application in Visual Studio .NET.

// get a reference to the proxy
EmployeeService employeeProxy = new EmployeeService();

// start async call, BeginCalculateFederalTaxes
// call returns immediately allowing local execution to continue
IAsyncResult ar = employeeProxy.BeginCalculateFederalTaxes(employee, 
null, null);
// execute CalculateStateTaxes synchronously
employeeProxy.CalculateStateTaxes(employee);
// wait for the CalculateFederalTaxes call to finish
employeeProxy.EndCalculateFederalTaxes(ar);

More Information

For more information, see "Asynchronous Web Methods" in Chapter 10, "Improving Web Services Performance."

Avoid Asynchronous Calls That Do Not Add Parallelism

Avoid asynchronous calls that will block multiple threads for the same operation. The following code shows an asynchronous call to a Web service. The calling code blocks while waiting for the Web service call to complete. Notice that the calling code performs no additional work while the asynchronous call is executing.

// get a proxy to the Web service
customerService serviceProxy = new customerService ();
//start async call to CustomerUpdate 
IAsyncResult result = serviceProxy.BeginCustomerUpdate(null,null); 
// Useful work that can be done in parallel should appear here 
// but is absent here
//wait for the asynchronous operation to complete
// Client is blocked until call is done
result.AsyncWaitHandle.WaitOne();
serviceProxy.EndCustomerUpdate(result);

When code like this is executed in a server application such as an ASP.NET application or Web service, it uses two threads to do one task and offers no benefit; in fact, it delays other requests being processed. This practice should be avoided.

Locking and Synchronization Explained

Locking and synchronization provide a mechanism to grant exclusive access to data or code to avoid concurrent execution.

This section summarizes steps to consider to help you approach locking and synchronization correctly:

  • Determine that you need synchronization.
  • Determine the approach.
  • Determine the scope of your approach.

Determine That You Need Synchronization

Before considering synchronization options, you should think about other approaches that avoid the necessity of synchronization, such as loose coupling. In particular, you need to synchronize when multiple users concurrently need to access or update a shared resource, such as static data.

Determine the Approach

The CLR provides the following mechanisms for locking and synchronization. Consider the one that is right for your scenario:

  • lock (C#). The C# compiler converts the lock statement into Monitor.Enter and Monitor.Exit calls around a try/finally block. Use SyncLock in Visual Basic .NET.
  • WaitHandle class. This class provides functionality to wait for exclusive access to multiple objects at the same time. There are three derivatives of WaitHandle:
    • ManualResetEvent. This allows code to wait for a signal that is manually reset.
    • AutoResetEvent. This allows code to wait for a signal that is automatically reset.
    • Mutex. This is a specialized version of WaitHandle that supports cross-process use. The Mutex object can be provided a unique name so that a reference to the Mutex object is not required. Code in different processes can access the same Mutex by name.
  • MethodImplOptions.Synchronized enumeration option. This provides the ability to grant exclusive access to an entire method, which is rarely a good idea.
  • Interlocked class. This provides atomic increment and decrement methods for types. Interlocked can be used with value types. It also supports the ability to replace a value based on a comparison.
  • Monitor object. This provides static methods for synchronizing access to reference types. It also provides overloaded methods (TryEnter) to allow the code to attempt to lock for a specified period. The Monitor class cannot be used with value types. Value types are boxed when used with the Monitor and each attempt to lock generates a new boxed object that is different from the rest; this negates any exclusive access. The C# compiler reports an error if you use the lock statement on a value type.
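Two of the mechanisms listed above can be sketched briefly. The hypothetical Counters class uses Interlocked.Increment for an atomic counter, Interlocked.CompareExchange to replace a value based on a comparison, and Monitor.TryEnter to attempt a lock for a limited time. The class, field, and method names are illustrative only.

```csharp
using System;
using System.Threading;

// Hypothetical class; names are illustrative only.
public class Counters
{
    private int hits = 0;    // updated only through Interlocked
    private int state = 0;   // updated only through Interlocked
    private int total = 0;   // updated only while holding sync
    private readonly object sync = new object();

    // Atomically increments the counter and returns the new value.
    public int Hit()
    {
        return Interlocked.Increment(ref hits);
    }

    // Sets state to newValue only if it currently equals expected.
    // CompareExchange returns the value that was there before the call.
    public bool TrySetState(int expected, int newValue)
    {
        return Interlocked.CompareExchange(ref state, newValue, expected) == expected;
    }

    // Tries to take the lock for up to the given time; gives up if it cannot.
    public bool TryAddToTotal(int amount, int millisecondsTimeout)
    {
        if (!Monitor.TryEnter(sync, millisecondsTimeout))
        {
            return false;
        }
        try
        {
            total += amount;
            return true;
        }
        finally
        {
            Monitor.Exit(sync);
        }
    }

    public int Total
    {
        get { lock (sync) { return total; } }
    }
}
```

Each field has exactly one protection policy (Interlocked or the sync lock); mixing the two on the same field would not provide exclusive access.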

Determine the Scope of Your Approach

You can lock on different objects and at different levels of granularity, ranging from the type to specific lines of code within an individual method. Identify what locks you have and where you acquire and release them. You can implement a policy where you consistently lock on the following to provide a synchronization mechanism:

  • Type. You should avoid locking a type (for example, lock(typeof(type))). Type objects can be shared across application domains. Locking the type locks all the instances of that type across the application domains in a process. Doing so can have very unexpected results, not the least of which is poor performance.
  • "this". You should avoid locking externally visible objects (for example, lock(this)) because you cannot be sure what other code might be acquiring the same lock, and for what purpose or policy. For correctness reasons, "this" is best avoided.
  • Specific object that is a member of a class. This choice is preferred over locking a type, instance of a type, or "this" within the class. Lock on a private static object if you need synchronization at class level. Lock on a private object (that is not static) if you need to synchronize only at the instance level for a type. Implement your locking policy consistently and clearly in each relevant method.

While locking, you should also consider the granularity of your locks. The options are as follows:

  • Method. You can provide synchronized access to a whole method of an instance using the MethodImplOptions.Synchronized enumeration option. You should consider locking at method level only when all the lines of code in the method need synchronized access; otherwise this might result in increased contention. Additionally, this provides no protection against other methods running and using the shared state — it is rarely useful as a policy, because it corresponds to having one lock object per method.
  • Code block in a method. Most of your requirements can be fulfilled by choosing an appropriately scoped object as the lock and by having a policy where you acquire that lock just before entering the code that alters the shared state that the lock protects. By locking objects, you can guarantee that only one of the pieces of code that locks the object will run at a time.
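The scoping choices described above can be illustrated with a hypothetical Account class: a private static object guards class-level (static) state, and a private instance object guards per-instance state. Each lock is held only around the lines that touch the state it protects. All names are illustrative.

```csharp
using System;

// Hypothetical class; names are illustrative only.
public class Account
{
    // Class-level lock: protects static state shared by all instances.
    private static readonly object classLock = new object();
    private static int totalAccounts = 0;

    // Instance-level lock: protects this instance's state only.
    private readonly object instanceLock = new object();
    private decimal balance = 0;

    public Account()
    {
        lock (classLock) { totalAccounts++; }
    }

    public void Deposit(decimal amount)
    {
        lock (instanceLock) { balance += amount; }
    }

    public decimal Balance
    {
        get { lock (instanceLock) { return balance; } }
    }

    public static int TotalAccounts
    {
        get { lock (classLock) { return totalAccounts; } }
    }
}
```

Because both lock objects are private, no code outside the class can acquire them, so the locking policy is fully visible within the class itself. Deposits to different accounts do not contend with each other.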

Locking and Synchronization Guidelines

This section summarizes guidelines to consider when developing multithreaded code that requires locks and synchronization:

  • Acquire locks late and release them early.
  • Avoid locking and synchronization unless required.
  • Use granular locks to reduce contention.
  • Avoid excessive fine-grained locks.
  • Avoid making thread safety the default for your type.
  • Use the fine-grained lock (C#) statement instead of Synchronized.
  • Avoid locking "this".
  • Coordinate multiple readers and single writers by using ReaderWriterLock instead of lock.
  • Do not lock the type of the objects to provide synchronized access.

Acquire Locks Late and Release Them Early

Minimize the duration that you hold and lock resources, because most resources tend to be shared and limited. The faster you release a resource, the earlier it becomes available to other threads.

Acquire a lock on the resource just before you need to access it and release the lock immediately after you are finished with it.
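As a small sketch of this guideline, the hypothetical ReportCache class below performs its expensive work (building a string) before taking the lock, and holds the lock only for the single statement that touches the shared Hashtable. The class and member names are illustrative.

```csharp
using System;
using System.Collections;
using System.Text;

// Hypothetical class; names are illustrative only.
public class ReportCache
{
    private readonly object sync = new object();
    private readonly Hashtable cache = new Hashtable();

    public void Store(string key, int[] values)
    {
        // The formatting touches no shared state, so do it before taking the lock.
        StringBuilder sb = new StringBuilder();
        foreach (int v in values)
        {
            sb.Append(v).Append(';');
        }
        string report = sb.ToString();

        // Acquire late, release early: the lock covers only the shared update.
        lock (sync)
        {
            cache[key] = report;
        }
    }

    public string Get(string key)
    {
        lock (sync)
        {
            return (string)cache[key];
        }
    }
}
```

If the formatting were moved inside the lock, every other caller of Store or Get would wait for it even though it never touches the cache.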

Avoid Locking and Synchronization Unless Required

Synchronization requires extra processing by the CLR to grant exclusive access to resources. If you do not have multithreaded access to data or require thread synchronization, do not implement it. Consider the following options before opting for a design or implementation that requires synchronization:

  • Design code that uses existing synchronization mechanisms; for example, the Cache object used by ASP.NET applications.
  • Design code that avoids concurrent modifications to data. Poor synchronization implementation can negate the effects of concurrency in your application. Identify areas of code in your application that can be rewritten to eliminate the potential for concurrent modifications to data.
  • Consider loose coupling to reduce concurrency issues. For example, consider using the event-delegation model (the producer-consumer pattern) to minimize lock contention.
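The producer-consumer idea mentioned in the last item can be sketched with a small blocking queue. WorkQueue is a hypothetical class built on Monitor.Wait and Monitor.Pulse; producers and consumers share only the queue, and the lock is held just long enough to add or remove one item.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical blocking queue; names are illustrative only.
public class WorkQueue
{
    private readonly Queue items = new Queue();
    private readonly object sync = new object();

    // Called by producers.
    public void Enqueue(object item)
    {
        lock (sync)
        {
            items.Enqueue(item);
            Monitor.Pulse(sync);   // wake one waiting consumer, if any
        }
    }

    // Called by consumers; blocks until an item is available.
    public object Dequeue()
    {
        lock (sync)
        {
            // Wait releases the lock while blocked and reacquires it on wake-up.
            while (items.Count == 0)
            {
                Monitor.Wait(sync);
            }
            return items.Dequeue();
        }
    }
}
```

The work itself is done outside the lock, after Dequeue returns, so threads contend only for the brief queue operations.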

Use Granular Locks to Reduce Contention

When used properly and at the appropriate level of granularity, locks provide greater concurrency by reducing contention. Consider the various options described earlier before deciding on the scope of locking. The most efficient approach is to lock on an object and scope the duration of the lock to the appropriate lines of code that access a shared resource. However, always watch out for deadlock potential.

Avoid Excessive Fine-Grained Locks

Fine-grained locks protect either a small amount of data or a small amount of code. When used properly, they provide greater concurrency by reducing lock contention. Used improperly, they can add complexity and decrease performance and concurrency. Avoid using multiple fine-grained locks within your code. The following code shows an example of multiple lock statements used to control three resources.

Singleton s = new Singleton();

StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = new StringBuilder();

s.IncDoubleWrite(sb1, sb2);

class Singleton
{
   private static Object myLock = new Object();
   private int count;
   public Singleton()
   {
      count = 0;
   }

   public void IncDoubleWrite(StringBuilder sb1, StringBuilder sb2)
   {
      lock (myLock)
      {
         count++;
         sb1.AppendFormat("Foo {0}", count);
         sb2.AppendFormat("Bar {0}", count);
      }
   }
   public void DecDoubleWrite(StringBuilder sb1, StringBuilder sb2)
   {
      lock (myLock)
      {
         count--;
         sb1.AppendFormat("Foo {0}", count);
         sb2.AppendFormat("Bar {0}", count);
      }
   }
}

Note   All methods in all examples require locking for correctness (although Interlocked.Increment could have been used instead).

Identify the smallest block of code that can be locked to avoid the resource expense of taking multiple locks.

Avoid Making Thread Safety the Default for Your Type

Consider the following guidelines when deciding thread safety as an option for your types:

  • Instance state may or may not need to be thread safe. By default, classes should not be thread safe because if they are used in a single threaded or synchronized environment, making them thread safe adds additional overhead. You may need to synchronize access to instance state by using locks but this depends on what thread safety model your code will offer. For example, in the Neutral threading model, instance state does not need to be protected. With the free threading model, it does need to be protected.

    Adding locks to create thread safe code decreases performance and increases lock contention (as well as opening up deadlock bugs). In common application models, only one thread at a time executes user code, which minimizes the need for thread safety. For this reason, most .NET Framework class libraries are not thread safe.

  • Consider thread safety for static data. If you must use static state, consider how to protect it from concurrent access by multiple threads or multiple requests. In common server scenarios, static data is shared across requests, which means multiple threads can execute that code at the same time. For this reason, it is necessary to protect static state from concurrent access.

Use the Fine-Grained lock (C#) Statement Instead of Synchronized

The MethodImplOptions.Synchronized attribute will ensure that only one thread is running anywhere in the attributed method at any time. However, if you have long methods that lock few resources, consider using the lock statement instead of using the Synchronized option, to shorten the duration of your lock and improve concurrency.

[MethodImplAttribute(MethodImplOptions.Synchronized)]
public void MyMethod ()
{
  …
}

//use of lock
public void MyMethod()
{
  …
  lock(mylock)
  {
   // code here may assume it is the only code that has acquired mylock 
   // and use resources accordingly
   …
  }
}

Avoid Locking "this"

Avoid locking "this" in your class for correctness reasons, not for any specific performance gain. To avoid this problem, consider the following workarounds:

  • Provide a private object to lock on.
    public class A {
      …
      lock(this) { … }
      …
    }
    // Change to the code below:
    public class A 
    {
      private Object thisLock = new Object();
      …
      lock(thisLock) { … }
      …
    }
    

    This results in all members being locked, including the ones that do not require synchronization.

  • If you require atomic updates to a particular member variable, use the System.Threading.Interlocked class.

Note   Even though this approach will avoid the correctness problems, a locking policy like this one will result in all members being locked, including the ones that do not require synchronization. Finer-grained locks may be appropriate.

Coordinate Multiple Readers and Single Writers By Using ReaderWriterLock Instead of lock

A monitor or lock that is lightly contested is relatively cheap from a performance perspective, but it becomes more expensive if it is highly contested. The ReaderWriterLock provides a shared locking mechanism. It allows multiple threads to read a resource concurrently but requires a thread to wait for an exclusive lock to write the resource.

You should always try to minimize the duration of reads and writes. Long writes can hurt application throughput because the write lock is exclusive. Long reads can block other threads that are waiting to write.

For more information, see "ReaderWriterLock Class," on MSDN at http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock.aspx.
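A minimal sketch of the pattern follows. Settings is a hypothetical class that guards a Hashtable with a ReaderWriterLock: any number of threads can be inside Read at once, while Write waits for exclusive access. Timeout.Infinite is used here for simplicity; a real application might prefer a finite timeout.

```csharp
using System;
using System.Collections;
using System.Threading;

// Hypothetical class; names are illustrative only.
public class Settings
{
    private readonly ReaderWriterLock rwLock = new ReaderWriterLock();
    private readonly Hashtable values = new Hashtable();

    // Many threads can hold the reader lock at the same time.
    public string Read(string key)
    {
        rwLock.AcquireReaderLock(Timeout.Infinite);
        try
        {
            return (string)values[key];
        }
        finally
        {
            rwLock.ReleaseReaderLock();
        }
    }

    // The writer lock is exclusive: it waits for readers and other writers to leave.
    public void Write(string key, string value)
    {
        rwLock.AcquireWriterLock(Timeout.Infinite);
        try
        {
            values[key] = value;
        }
        finally
        {
            rwLock.ReleaseWriterLock();
        }
    }
}
```

The try/finally blocks keep both lock durations short and guarantee release even if the guarded code throws.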

Do Not Lock the Type of the Objects to Provide Synchronized Access

Type objects are application domain-agile, which means that the same instance can be used in multiple application domains without any marshaling or cloning. If you implement a policy of locking on the type of an object using lock(typeof(type)), you lock all the instances of the objects across application domains within the process.

An example of locking the whole type is as follows.

lock(typeof(MyClass))
{
  //custom code
}

Provide a static object in your type instead. This object can be locked to provide synchronized access.

class MyClass{
  private static Object obj = new Object();
  public void SomeFunc()
  {
    lock(obj)
    {
      //perform some operation
    }
  }
}
Note   A single lock statement does not prevent other code from accessing the protected resource — it is only when a policy of consistently acquiring a certain lock before certain operations is implemented that there is true protection.

You should also avoid locking other application domain-agile types such as strings, assembly instances, or byte arrays, for the same reason.

Value Types and Reference Types

All .NET Framework data types are either value types or reference types. This section introduces you to these two basic categories of data types. Table 5.2 illustrates common value and reference types.

Table 5.2: Value and Reference Types

Value types:

  • Enums
  • Structs
  • Primitive types including Boolean, Date, Char
  • Numeric types such as Decimal
  • Integral types such as Byte, Short, Integer, Long
  • Floating types such as Single and Double

Reference types:

  • Classes
  • Delegates
  • Exceptions
  • Attributes
  • Arrays

Value Types

Memory for a value type that is declared as a local variable is typically allocated on the current thread's stack; a value type that is a field of a reference type is stored inline within that object on the managed heap. A value type's data is maintained completely within this memory allocation. The memory for a stack-allocated value type is maintained only for the lifetime of the stack frame in which it is created. The data in value types can outlive their stack frames when a copy is created by passing the data as a method parameter or by assigning the value type to a reference type. Value types by default are passed by value. If a value type is passed to a parameter of reference type, a wrapper object is created (the value type is boxed), and the value type's data is copied into the wrapper object. For example, passing an integer to a method that expects an object results in a wrapper object being created.

Reference Types

In contrast to value types, the data for reference type objects is always stored on the managed heap. Variables that are reference types consist of only the pointer to that data. The memory for reference types such as classes, delegates, and exceptions is reclaimed by the garbage collector when they are no longer referenced. It is important to know that passing a reference type never copies the object's data. If you pass a reference type by value (the default), a copy of the reference is made and passed, so the caller and the called method refer to the same object. If you pass it by reference, the called method can also change which object the caller's variable refers to.
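The difference in copy behavior can be seen in a short sketch. PointValue, PointRef, and CopyDemo are hypothetical types used only for illustration: the struct is copied when passed to a method, while for the class only the reference is copied.

```csharp
using System;

// Hypothetical types; names are illustrative only.
public struct PointValue { public int X; }
public class PointRef { public int X; }

public static class CopyDemo
{
    // Receives a copy of the struct; the caller's value is not changed.
    public static void Bump(PointValue p) { p.X++; }

    // Receives a copy of the reference; both refer to the caller's object.
    public static void Bump(PointRef p) { p.X++; }

    // Reassigning the parameter changes only the local copy of the reference.
    public static void Replace(PointRef p)
    {
        p = new PointRef();
        p.X = 100;
    }
}
```

After Bump is called on a PointValue whose X is 1, X is still 1 in the caller. After Bump is called on a PointRef whose X is 1, the caller sees X equal to 2, and a subsequent call to Replace leaves the caller's object untouched.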

Boxing and Unboxing Explained

You can convert value types to reference types and back again. When a value type variable needs to be converted to a reference (object) type, an object (a box) is allocated on the managed heap to hold the value and its value is copied into the box. This process is known as boxing. Boxing can be implicit or explicit, as shown in the following code.

int p = 123;
Object box;
box = p;          // Implicit boxing
box = (Object)p;  // Explicit boxing with a cast

Boxing occurs most often when you pass a value type to a method that takes an Object as its parameter. When a value in an object is converted back into a value type, the value is copied out of the box and into the appropriate storage location. This process is known as unboxing.

p = (int)box; // Unboxing

Boxing issues are exacerbated in loops or when dealing with large amounts of data, such as large collections that store value types.
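Two consequences of the copy semantics described above are sketched below; BoxingDemo and its methods are hypothetical names. First, the box holds its own copy of the value, so later changes to the original variable do not affect it. Second, a boxed value can be unboxed only to its exact original type.

```csharp
using System;

// Hypothetical class; names are illustrative only.
public static class BoxingDemo
{
    // Returns the value held in the box after the original variable changes.
    public static int BoxIsACopy()
    {
        int p = 123;
        object box = p;   // boxing copies the value into a new heap object
        p = 456;          // changes only the local variable, not the box
        return (int)box;  // unboxing copies the boxed value back out
    }

    // Returns true if unboxing a boxed int as a long is rejected at run time.
    public static bool UnboxToWrongTypeFails()
    {
        object box = 123;   // boxed System.Int32
        try
        {
            long wrong = (long)box;   // not a numeric conversion: this is an unbox
            return wrong == 0 && false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

BoxIsACopy returns 123, and UnboxToWrongTypeFails returns true; to obtain a long from a boxed int, unbox to int first and then convert.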

Boxing and Unboxing Guidelines

To help ensure that boxing and unboxing does not significantly impact your code's performance, consider the following recommendations:

  • Avoid frequent boxing and unboxing overhead.
  • Measure boxing overhead.
  • Use DirectCast in your Visual Basic .NET code.

Avoid Frequent Boxing and Unboxing Overhead

Boxing causes a heap allocation and a memory copy operation. To avoid boxing, do not treat value types as reference types. Avoid passing value types in method parameters that expect a reference type. Where boxing is unavoidable, to reduce the boxing overhead, box your variable once and keep an object reference to the boxed copy as long as needed, and then unbox it when you need a value type again.

int p = 123;
object box;
box = (object)p;  // Explicit boxing with a cast
//use the box variable instead of p
Note   Boxing in Visual Basic .NET tends to occur more frequently than in C# due to the language's pass-by-value semantics and extra calls to GetObjectValue.

Collections and Boxing

The collection classes in System.Collections, such as ArrayList, store their elements as Object. Passing value types such as integers and floating point numbers to these collections causes boxing. A common scenario is populating collections with data containing int or float types returned from a database. The overhead can be excessive in the case of collections because the boxing is repeated for every element during population and iteration. The problem is illustrated by the following code snippet.

ArrayList al = new ArrayList();
for (int i=0; i<1000;i++)
  al.Add(i); //Implicitly boxed because Add() takes an object
int f = (int)al[0]; // The element is unboxed

To prevent this, consider using an array instead, or creating a custom collection class for your specific value type. You must perform unboxing with an explicit cast operator.
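A custom collection for a specific value type can be quite small. The hypothetical IntList class below stores its elements in an int array that doubles in size when full, so neither Add nor the indexer boxes anything. The initial capacity of 16 and the doubling strategy are arbitrary choices for this sketch.

```csharp
using System;

// Hypothetical typed collection; names are illustrative only.
public class IntList
{
    private int[] items = new int[16];
    private int count = 0;

    // Stores the int directly in the array; no boxing occurs.
    public void Add(int value)
    {
        if (count == items.Length)
        {
            // Grow by doubling so that repeated Adds stay cheap on average.
            int[] bigger = new int[items.Length * 2];
            Array.Copy(items, bigger, count);
            items = bigger;
        }
        items[count++] = value;
    }

    // Returns the int directly; no unboxing or cast is needed.
    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= count)
            {
                throw new ArgumentOutOfRangeException("index");
            }
            return items[index];
        }
    }

    public int Count
    {
        get { return count; }
    }
}
```

Replacing the ArrayList in the preceding snippet with IntList removes the 1,000 box allocations and the cast on retrieval.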

Note   The .NET Framework 2.0, at the time of this writing, introduces generics to the C# language. This will make it possible to write variations of the above code with no boxing.

Measure Boxing Overhead

There are several ways to measure the impact of boxing operations. You can use Performance Monitor to measure the performance impact of boxing overhead on the resource utilization and response times for your application. To do a static analysis of where exactly you are affected by boxing and unboxing in your code, you can analyze MSIL code. Search for box and unbox instructions in MSIL by using the following command line.

Ildasm.exe yourcomponent.dll /text | findstr box
Ildasm.exe yourcomponent.dll /text | findstr unbox

However, you must watch out where exactly you optimize the boxing overhead. The overhead is significant in places where there are frequent iterations such as loops, inserting, and retrieving value types in collections. Instances where boxing occurs only once or twice are not worth optimizing.

Use DirectCast In Your Visual Basic .NET Code

Use the DirectCast operator to cast up and down an inheritance hierarchy instead of using CType. DirectCast offers superior performance because it compiles directly to MSIL. Also, note that DirectCast throws an InvalidCastException if there is no inheritance relationship between two types.

Exception Management

Structured exception handling using try/catch blocks is the recommended way to handle exceptional error conditions in managed code. You should also use finally blocks (or the C# using statement) to ensure that resources are closed even in the event of exceptions.

While exception handling is recommended to create robust, maintainable code, there is an associated performance cost. Throwing and catching exceptions is expensive. For this reason, you should use exceptions only in exceptional circumstances and not to control regular logic flow. A good rule of thumb is that the exceptional path should be taken less than one time in a thousand.

This section summarizes guidelines for you to review to ensure the appropriate use of exception handling:

  • Do not use exceptions to control application flow.
  • Use validation code to avoid unnecessary exceptions.
  • Use the finally block to ensure resources are released.
  • Replace Visual Basic .NET On Error Goto code with exception handling.
  • Do not catch exceptions that you cannot handle.
  • Be aware that rethrowing is expensive.
  • Preserve as much diagnostic information as possible in your exception handlers.
  • Use Performance Monitor to monitor CLR exceptions.

Do Not Use Exceptions to Control Application Flow

Throwing exceptions is expensive. Do not use exceptions to control application flow. If you can reasonably expect a sequence of events to happen in the normal course of running code, you probably should not throw any exceptions in that scenario.

The following code throws an exception inappropriately, when a supplied product is not found.

static void ProductExists( string ProductId)
{
  //... search for Product
  if ( dr.Read(ProductId) ==0 ) // no record found, ask to create
  {
    throw( new Exception("Product Not found"));
  }
}

Because not finding a product is an expected condition, refactor the code to return a value that indicates the result of the method's execution. The following code uses a return value to indicate whether the product was found.

static bool ProductExists( string ProductId)
{
  //... search for Product
  if ( dr.Read(ProductId) ==0 ) // no record found, ask to create
  {
    return false;
  }
  . . .
}

Returning error information using an enumerated type instead of throwing an exception is another commonly used programming technique in performance-critical code paths and methods.
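
The enumerated-type approach might be sketched as follows. The LookupResult enumeration, the ProductCatalog class, and the in-memory Hashtable that stands in for the database are all hypothetical names introduced for illustration.

```csharp
using System.Collections;

public enum LookupResult { Found, NotFound, InvalidId }

public static class ProductCatalog
{
    // Hypothetical in-memory store standing in for the database lookup.
    private static readonly Hashtable products = CreateProducts();

    private static Hashtable CreateProducts()
    {
        Hashtable t = new Hashtable();
        t["P100"] = "Widget";
        return t;
    }

    // Reports the outcome through the return value instead of throwing.
    public static LookupResult FindProduct(string productId, out string name)
    {
        name = null;
        if (productId == null || productId.Length == 0)
            return LookupResult.InvalidId;
        object value = products[productId];
        if (value == null)
            return LookupResult.NotFound;
        name = (string)value;
        return LookupResult.Found;
    }
}
```

The caller can then branch on the returned value, for example with a switch statement, without paying for an exception on the expected "not found" path.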

Use Validation Code to Reduce Unnecessary Exceptions

If you know that a specific avoidable condition can happen, proactively write code to avoid it. For example, adding validation checks such as checking for null before using an item from the cache can significantly increase performance by avoiding exceptions. The following code uses a try/catch block to handle divide by zero.

double result = 0;
try{
  result = numerator/divisor;
}
catch( System.Exception e){
  result = System.Double.NaN;
}

The following rewritten code avoids the exception, and as a result is more efficient.

double result = 0;
if ( divisor != 0 )
  result = numerator/divisor;
else
  result = System.Double.NaN;
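
A minimal sketch of the same idea for integer operands follows. In C#, integer division by zero throws DivideByZeroException, whereas floating-point division by zero yields infinity or NaN without throwing. The SafeMath class and TryDivide method are hypothetical names.

```csharp
public static class SafeMath
{
    // Returns false instead of throwing when the divisor is zero.
    // Note: int.MinValue / -1 can still overflow; that case is not
    // handled in this sketch.
    public static bool TryDivide(int numerator, int divisor, out int result)
    {
        if (divisor == 0)
        {
            result = 0;
            return false;
        }
        result = numerator / divisor;
        return true;
    }
}
```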

Use the finally Block to Ensure Resources Are Released

For both correctness and performance reasons, it is good practice to make sure all expensive resources are released in a suitable finally block. The reason this is a performance issue as well as a correctness issue is that timely release of expensive resources is often critical to meeting your performance objectives.

The following code ensures that the connection is always closed.

SqlConnection conn = new SqlConnection("...");
try
{
  conn.Open();
  // Do some operation that might cause an exception

  // Calling Close as early as possible
  conn.Close();
  // ... other potentially long operations

}
finally
{
  if (conn.State==ConnectionState.Open)
    conn.Close();  // ensure that the connection is closed
}

Notice that Close is called inside the try block and in the finally block. Calling Close twice does not cause an exception. Calling Close inside the try block allows the connection to be released quickly so that the underlying resources can be reused. The finally block ensures that the connection closes if an exception is thrown and the try block fails to complete. The duplicated call to Close is a good idea if there is other significant work in the try block, as in this example.
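
The C# using statement mentioned at the start of this section gives the same guarantee as a finally block for any type that implements IDisposable. The following sketch uses a hypothetical DemoResource class so that it runs without a database; a SqlConnection could be used the same way.

```csharp
using System;

// Hypothetical resource used only for illustration.
public sealed class DemoResource : IDisposable
{
    public bool IsOpen = false;
    public void Open() { IsOpen = true; }
    public void Dispose() { IsOpen = false; }
}

public static class UsingDemo
{
    // Returns the resource so the caller can confirm it was released.
    public static DemoResource UseAndRelease(bool fail)
    {
        DemoResource res = new DemoResource();
        try
        {
            using (res)
            {
                res.Open();
                if (fail)
                    throw new InvalidOperationException("simulated failure");
            }   // Dispose runs here, whether or not an exception was thrown
        }
        catch (InvalidOperationException)
        {
            // Caught here only so that the demo can return the resource.
        }
        return res;
    }
}
```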

Replace Visual Basic .NET On Error Goto Code with Exception Handling

Replace code that uses the Visual Basic .NET On Error/Goto error handling mechanism with exception handling code that uses Try/Catch blocks. On Error Goto code works, but Try/Catch blocks are more efficient and avoid the creation of the error object.

More Information

For more information about why Try/Catch is more efficient, see the "Exception Handling" section of "Performance Optimization in Visual Basic .NET" on MSDN at http://msdn.microsoft.com/en-us/library/aa289513(VS.71).aspx.

Do Not Catch Exceptions That You Cannot Handle

Do not catch exceptions unless you specifically want to record and log the exception details or can retry a failed operation. Do not arbitrarily catch exceptions unless you can add some value. You should let the exception propagate up the call stack to a handler that can perform some appropriate processing.

You should not catch generic exceptions in your code as follows.

catch (Exception e)
{….}

This results in catching all exceptions. Most of these exceptions are rethrown eventually. Catching generic exceptions in your code makes it harder to debug the original source of the exception because the contents of the call stack (such as local variables) are gone.

Explicitly name the exceptions that your code can handle. This allows you to avoid catching and rethrowing exceptions. The following code catches only System.IO.IOException and the exception types derived from it.

catch ( System.IO.IOException e )
{
  // evaluate the exception
}
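
As a fuller, hedged illustration of naming the exception you can handle, the following sketch catches only FileNotFoundException and lets every other exception propagate to a caller that is better placed to deal with it. The ConfigReader class and ReadFirstLine method are hypothetical.

```csharp
using System;
using System.IO;

public static class ConfigReader
{
    // Reads the first line of a file, or returns a default when the file
    // is missing. Other exceptions are deliberately allowed to propagate.
    public static string ReadFirstLine(string path, string defaultValue)
    {
        try
        {
            using (StreamReader reader = new StreamReader(path))
            {
                string line = reader.ReadLine();
                return line != null ? line : defaultValue;
            }
        }
        catch (FileNotFoundException)
        {
            return defaultValue;
        }
    }
}
```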

Be Aware That Rethrowing Is Expensive

The cost of using throw to rethrow an existing exception is approximately the same as throwing a new exception. In the following code, there is no savings from rethrowing the existing exception.

try {
    // do something that may throw an exception…
} catch (Exception e) {
    // do something with e
    throw;
}

You should consider wrapping exceptions and rethrowing them only when you want to provide additional diagnostic information.
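
One way to wrap an exception while keeping the original diagnostic information is to pass it as the inner exception. The following is a sketch under stated assumptions: OrderProcessingException and OrderService are hypothetical types introduced only for this example.

```csharp
using System;

// Hypothetical application-specific exception type.
public class OrderProcessingException : Exception
{
    public OrderProcessingException(string message, Exception inner)
        : base(message, inner) { }
}

public static class OrderService
{
    public static int ParseQuantity(string orderId, string quantityText)
    {
        try
        {
            return int.Parse(quantityText);
        }
        catch (FormatException ex)
        {
            // Add context and keep the original exception as InnerException
            // so that its message and stack trace are not lost.
            throw new OrderProcessingException(
                "Invalid quantity '" + quantityText + "' for order " + orderId, ex);
        }
    }
}
```

The wrapping still incurs the cost of a second throw, so it is worthwhile only when the added context helps diagnosis.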

Preserve as Much Diagnostic Information as Possible in Your Exception Handlers

Do not catch exceptions that you do not know how to handle and then fail to propagate the exception. By doing so, you can easily obscure useful diagnostic information as shown in the following example.

try
{
  // exception generating code
}
catch(Exception e)
{ 
  // Do nothing
}

This might result in obscuring information that can be useful for diagnosing the erroneous code.

Use Performance Monitor to Monitor CLR Exceptions

Use Performance Monitor to identify the exception behavior of your application. Evaluate the following counters for the .NET CLR Exceptions object:

  • # of Exceps Thrown. This counter provides the total number of exceptions thrown.
  • # of Exceps Thrown / sec. This counter provides the frequency of exceptions thrown.
  • # of Finallys / sec. This counter provides the frequency of finally blocks being executed.
  • Throw to Catch Depth / sec. This counter provides the number of stack frames that were traversed from the frame throwing the exception, to the frame handling the exception in the last second.

Identify areas of your application that throw exceptions and look for ways to reduce the number of exceptions to increase your application's performance.

Iterating and Looping

Applications use iterations to execute a set of statements a number of times. Nonoptimized code within the loops can result in exacerbated performance issues, ranging from increased memory consumption to CPU exhaustion.

This section summarizes guidelines that can improve iteration and loop efficiency:

  • Avoid repetitive field or property access.
  • Optimize or avoid expensive operations within loops.
  • Copy frequently called code into the loop.
  • Consider replacing recursion with looping.
  • Use for instead of foreach in performance-critical code paths.

Avoid Repetitive Field or Property Access

If you use data that is static for the duration of the loop, obtain it before the loop instead of repeatedly accessing a field or property. The following code shows a collection of orders being processed for a single customer.

for ( int item = 0; item < Customer.Orders.Count ; item++ ){
  CalculateTax ( Customer.State, Customer.Zip, Customer.Orders[item] );
}

Note that State and Zip are constant for the loop and could be stored in local variables rather than accessed for each pass through the loop as shown in the following code.

string state = Customer.State;
string zip = Customer.Zip;
int count = Customer.Orders.Count;
for ( int item = 0; item < count ; item++ )
{
  CalculateTax (state, zip, Customer.Orders[item] );
}

Note that if these are fields, it may be possible for the compiler to do this optimization automatically. If they are properties, it is much less likely. If the properties are virtual, it cannot be done automatically.

Optimize or Avoid Expensive Operations Within Loops

Identify operations in your loop code that can be optimized. Look for code that causes boxing or allocations as a side effect. The following code causes side effect strings to be created for each pass through the loop.

String str = "";
Array arrOfStrings = GetStrings();
for(int i=0; i<10; i++)
{
  str+= arrOfStrings.GetValue(i); // allocates a new string on each pass
}

The following code avoids extra string allocations on the heap by using StringBuilder.

StringBuilder sb = new StringBuilder();
Array arrOfStrings = GetStrings();
for(int i=0; i<10; i++)
{
  sb.Append(arrOfStrings.GetValue(i));
}

The following guidelines can help you avoid expensive operations in loops:

  • Be aware of the method calls you make inside loops. Watch out for inadvertent method calls and consider using inline code where appropriate.
  • Consider StringBuilder for string concatenation inside a loop. For more information, see "String Operations" later in this chapter.
  • When testing for multiple conditions to exit or continue looping, order your tests so that the one most likely to let you escape the loop is run first.

Copy Frequently Called Code into the Loop

If you repeatedly call methods from inside a loop, consider changing the loop to reduce the number of calls made. The JIT compiler usually inlines any called code if it is simple, but in most complex scenarios it is your responsibility to optimize the code. The costs of the call increase as you cross process or computer boundaries with remoting or Web services. The following code shows a method being called repeatedly inside a loop.

for ( int item = 0 ; item < Circles.Items.Length; item++ ){
  CalculateAndDisplayArea(Circles.Items[item]);
}

Consider the following strategies to reduce the calls incurred:

  • Move the called code into the loop. This reduces the number of calls being made.
  • Move the whole unit of work to the called object. The following code modifies the object being called and passes all the required data so that the whole loop can happen remotely. This is helpful to avoid round trips and offloads the work to local calls for an object which may be hosted remotely.
    // call function to store all items
    OrderProcessing op = new OrderProcessing();
    op.StoreAllOrderItems (Order.Items);
    ...
    class OrderProcessing{
    ...
      public bool StoreAllOrderItems ( Items itemsToInsert )
      {
        SqlConnection conn = new SqlConnection(...
        SqlCommand cmd = new SqlCommand(...
        for ( int item = 0 ; item < itemsToInsert.Length; item++ ){
          // insert order into database
          // set parameters on command object
          cmd.ExecuteNonQuery();
          // insert order item
        }
      }
      . . .
    }
    

Consider Replacing Recursion with Looping

Each recursive call adds data to the stack. Examine your code and see if your recursive calls can be converted to a looping equivalent. The following code makes recursive calls to accomplish a small task of string concatenation.

Array arr = GetArrayOfStrings();
int index = arr.Length-1;
String finalStr= RecurStr(index);
string RecurStr(int ind){
  if (ind<0) // stop after index 0 has been included
    return "";
  else
    return (arr.GetValue(ind)+RecurStr(ind-1));
}

Rewritten, the following code now avoids creating new data on the stack for each successive call and avoids an additional method call to itself.

string ConcString (Array array)
{
  StringBuilder sb = new StringBuilder();
  for (int i= array.Length-1; i>=0; i--) // last valid index is Length-1
  {
    sb.Append(array.GetValue(i));
  }
  return sb.ToString();
}

Use for Instead of foreach in Performance-Critical Code Paths

Use for instead of foreach (C#) to iterate the contents of collections in performance-critical code. foreach in C# and For Each in Visual Basic .NET use an enumerator to provide enhanced navigation through arrays and collections. For more information, see "Enumeration Overhead" in the "Collection Guidelines" section later in this chapter.
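
The difference between the two loop forms can be sketched as follows for an ArrayList. The IterationDemo class and its method names are illustrative only; measure both forms in your own scenario before choosing.

```csharp
using System.Collections;

public static class IterationDemo
{
    // foreach obtains an IEnumerator and calls MoveNext/Current per element.
    public static int SumWithForeach(ArrayList values)
    {
        int sum = 0;
        foreach (object o in values)
            sum += (int)o;
        return sum;
    }

    // for uses the indexer directly and reads Count once, before the loop.
    public static int SumWithFor(ArrayList values)
    {
        int sum = 0;
        int count = values.Count;
        for (int i = 0; i < count; i++)
            sum += (int)values[i];
        return sum;
    }
}
```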

String Operations

The .NET Framework provides the System.String data type to represent a string. Intensive string manipulation can significantly degrade performance due to the immutable nature of the System.String type. This means that every time you perform an operation to change the string data, the original string in memory is discarded for later garbage collection and a new one is created to hold the new string data. Also note that the String type is a reference type, so the contents of the string are stored on the managed heap. As a result, strings must be garbage collected to be cleaned up.

This section summarizes recommendations to consider when working with strings:

  • Avoid inefficient string concatenation.
  • Use + when the number of appends is known.
  • Use StringBuilder when the number of appends is unknown.
  • Treat StringBuilder as an accumulator.
  • Use the overloaded Compare method for case insensitive string comparisons.

Avoid Inefficient String Concatenation

Excessive string concatenation results in many allocation and deallocation operations, because each time you perform an operation to change the string, a new one is created and the old one is subsequently collected by the garbage collector.

  • If you concatenate string literals, the compiler concatenates them at compile time.
    //'Hello' and 'world' are string literals
    String str = "Hello" + "world";
    
  • If you concatenate nonliteral strings, the CLR concatenates them at run time, so using the + operator creates multiple string objects in the managed heap.
  • Use StringBuilder for complex string manipulations and when you need to concatenate strings multiple times.
    // using String and '+' to append
    String str = "Some Text";
    for ( ... loop several times to build the string ...) {
      str = str + " additional text ";
    }
    // using StringBuilder and the Append method to append
    StringBuilder strBuilder = new StringBuilder("Some Text ");
    for ( ... loop several times to build the string ...) {
      strBuilder.Append(" additional text ");
    }
    

Use + When the Number of Appends Is Known

If you know the number of appends to be made and you are concatenating the strings in one shot, prefer the + operator for concatenation.

String str = str1+str2+str3;

If you concatenate the strings in a single expression, only one call to String.Concat needs to be made. It results in no temporary strings (for partial combinations of the strings to be concatenated).

Note   You should not be using + on strings inside a loop or for multiple iterations. Use StringBuilder instead.

Use StringBuilder When the Number of Appends Is Unknown

If you do not know the number of appends to be made, which might be the case when iterating through a loop or building dynamic SQL queries, use the StringBuilder class as shown in the following code sample.

StringBuilder sb = new StringBuilder();
for (int i=0; i< Results.Count; i++)
{
  sb.Append (Results[i]); // Append is an instance method
}

The StringBuilder class starts with a default initial capacity of 16. Strings less than the initial capacity are stored in the StringBuilder object.

The initial capacity of the buffer can be set by using the following overloaded constructor.

public StringBuilder (int capacity);

You can continue to concatenate without additional allocations until you consume the preallocated buffer. As a result, using a StringBuilder object is often more efficient than using String objects for concatenation. If you concatenate further, the StringBuilder class creates a new buffer of the size equal to double the current capacity.

So if you start with a StringBuilder of size 16 and exceed the limit, the StringBuilder allocates a new buffer of size 32 and copies the old string to the new buffer. The old buffer is inaccessible and becomes eligible for garbage collection.

Note   You should always try to set the initial capacity of the StringBuilder to an optimum value to reduce the cost of new allocations. To determine the optimum value for your case, the best way is to track the memory consumption by using the CLR profiler. For more information about how to use CLR profiler, see "How To: Use CLR Profiler" in the "How To" section of this guide.
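
As a minimal sketch of presizing the buffer, the following code passes an estimated final length to the StringBuilder(int capacity) constructor. The CapacityDemo class, the JoinNumbers method, and the estimatedLength parameter are hypothetical; the right estimate depends on your data.

```csharp
using System.Text;

public static class CapacityDemo
{
    // Builds a comma-separated list. estimatedLength is a caller-supplied
    // guess at the final size, used to presize the internal buffer.
    public static string JoinNumbers(int count, int estimatedLength)
    {
        StringBuilder sb = new StringBuilder(estimatedLength);
        for (int i = 0; i < count; i++)
        {
            if (i > 0)
                sb.Append(',');
            sb.Append(i);
        }
        return sb.ToString();
    }
}
```

If the estimate is too small, the code still works; the StringBuilder simply grows its buffer as described above.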

Treat StringBuilder as an Accumulator

You can treat StringBuilder as an accumulator or reusable buffer. This helps avoid the allocations of temporary strings during multiple append iterations. Some of the scenarios where this helps are as follows:

  • Concatenating strings. You should always prefer the following approach to string concatenation when using StringBuilder.
    StringBuilder sb = new StringBuilder();
    sb.Append(str1);
    sb.Append(str2);
    

    Use the preceding code rather than the following.

    sb.Append(str1+str2);
    

    This is because you do not need to make the temporary str1+str2 to append str1 and then str2.

  • Concatenating the strings from various functions. An example of this is shown in the following code sample.
    StringBuilder sb = new StringBuilder();
    sb.Append(f1(…));
    sb.Append(f2(…)); 
    sb.Append(f3(…));
    

    The preceding code snippet results in temporary string allocations for the return values by the functions f1 (...), f2 (…), f3 (…). You can avoid these temporary allocations by using the following pattern.

    void f1( StringBuilder sb,…);
    void f2( StringBuilder sb,…);
    void f3( StringBuilder sb,…);
    

    In this case, the StringBuilder instance is directly passed as an input parameter to the methods. sb.Append is directly called in the function body, which avoids the allocation of temporary strings.
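
A concrete version of this pattern might look like the following sketch. ReportWriter, AppendHeader, and AppendLine are hypothetical names that play the role of f1, f2, and f3.

```csharp
using System.Text;

public static class ReportWriter
{
    // Each helper appends directly to the shared builder instead of
    // returning a temporary string.
    private static void AppendHeader(StringBuilder sb, string title)
    {
        sb.Append("== ").Append(title).Append(" ==\n");
    }

    private static void AppendLine(StringBuilder sb, string name, int value)
    {
        sb.Append(name).Append(": ").Append(value).Append('\n');
    }

    public static string Build(string title, string name, int value)
    {
        StringBuilder sb = new StringBuilder(64);
        AppendHeader(sb, title);
        AppendLine(sb, name, value);
        return sb.ToString();
    }
}
```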

Use the Overloaded Compare Method for Case-Insensitive String Comparisons

Carefully consider how you perform case-insensitive string comparisons. Avoid using ToLower as shown in the following code because you end up creating temporary string objects.

// Bad way for insensitive operations because ToLower creates temporary strings
String str="New York";
String str2 = "New york";
if (str.ToLower()==str2.ToLower())
  // do something

The more efficient way to perform case-insensitive string comparisons is to use the Compare method.

if (String.Compare(str, str2, true) == 0)  // third argument: ignoreCase = true
  // do something
Note   The String.Compare method uses the info in the CultureInfo.CompareInfo property to compare culture-sensitive strings.
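
The following sketch wraps the comparison in helper methods. The CompareDemo class is hypothetical. The second method uses the StringComparison enumeration, which assumes .NET Framework 2.0 or later; it performs an ordinal comparison and so does not depend on the current culture.

```csharp
using System;

public static class CompareDemo
{
    // Culture-sensitive, case-insensitive comparison (ignoreCase = true).
    public static bool EqualsIgnoreCase(string a, string b)
    {
        return String.Compare(a, b, true) == 0;
    }

    // Ordinal, case-insensitive comparison; assumes .NET Framework 2.0 or later.
    public static bool EqualsOrdinalIgnoreCase(string a, string b)
    {
        return String.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }
}
```

Neither method allocates temporary lowercase copies of the strings.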

More Information

For more information on string management performance, see "Improving String Handling Performance in .NET Framework Applications" on MSDN at http://msdn.microsoft.com/en-us/library/aa302329.aspx.

Arrays

Arrays provide basic functionality for grouping types. Every language implements array syntax in its own way, although the following considerations apply regardless of language:

  • Arrays have a static size. The size of the array remains fixed after initial allocation. If you need to extend the size of the array, you must create a new array of the required size and then copy the elements from the old array.
  • Arrays support indexed access. To access an item in an array, you can use its index.
  • Arrays support enumerator access. You can access items in the array by enumerating through the contents using the foreach construct (C#) or For Each (Visual Basic .NET).
  • Memory is contiguous. The CLR arranges arrays in contiguous memory space, which provides fast item access.
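
Because an array cannot grow in place, extending it means allocating a new array and copying the elements, as the first point in the list above describes. A minimal sketch, using hypothetical names (ArrayGrowDemo, Grow):

```csharp
using System;

public static class ArrayGrowDemo
{
    // Returns a new, larger array containing the elements of source.
    // The original array is unchanged because array sizes are fixed.
    public static int[] Grow(int[] source, int newLength)
    {
        if (newLength < source.Length)
            throw new ArgumentException("newLength must not be smaller than the source length");
        int[] larger = new int[newLength];
        Array.Copy(source, larger, source.Length);
        return larger;
    }
}
```

The copy is the cost you pay for each extension, which is why a collection such as ArrayList may be a better fit when the size changes often.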

This section summarizes performance guidelines to consider when using arrays:

  • Prefer arrays to collections unless you need functionality.
  • Use strongly typed arrays.
  • Use jagged arrays instead of multidimensional arrays.

Prefer Arrays to Collections Unless You Need Functionality

Arrays are the fastest of all collections, so unless you need special functionality, such as dynamic extension of the collection, you should consider using arrays rather than collections. Arrays also avoid the boxing and unboxing overhead.

Use Strongly Typed Arrays

Use strongly typed arrays where possible, rather than using object arrays to store types. This avoids type conversion or boxing depending upon the type stored in the array. If you declare an array of objects and then proceed to add a value type such as an integer or float to the array, it involves the boxing overhead as shown in the following code sample.

Object[] arr = new Object[10];
arr[0] = 2+3; //boxing occurs here

To avoid the boxing overhead, declare a strongly typed int array, as follows:
int [] arrIn = new int [10];
arrIn[0] = 2+3;

Storing reference types, such as string or custom classes, in an array of objects involves typecasting overhead when you retrieve them. Therefore, you should also use strongly typed arrays to store your reference types, as shown in the following code sample.

string[] arrStr = new string[10];
arrStr[0] = "abc";

Use Jagged Arrays Instead of Multidimensional Arrays

A jagged array is a single dimensional array of arrays. The elements of a jagged array can be of different dimensions and sizes. Use jagged arrays instead of multidimensional arrays to benefit from MSIL performance optimizations.

MSIL has specific instructions that target single dimensional zero-based arrays (SZArrays) and access to this type of array is optimized. In contrast, multidimensional arrays are accessed using the same generic code for all types, which results in boxing and unboxing for arrays of primitive types.

Note   Avoid nonzero-based arrays because they perform more slowly than SZArrays.

The following example shows the declaration and use of jagged arrays.

string[][] Address = new string[2][];     // A jagged array of strings
Address[0] = new string[1];
Address[1] = new string[2];
Address[0][0] = "Address [0,0]";
Address[1][0] = "Address [1,0]";
Address[1][1] = "Address [1,1]";
for (int i =0; i <=1; i++) {
      for (int j = 0; j < Address[i].Length; j ++)
            MessageBox.Show(Address[i][j]);
}
Note   Jagged arrays are not Common Language Specification (CLS) compliant and may not be used across languages.

You can compare the efficiency of jagged versus multidimensional arrays by studying the MSIL code generated in each case. Notice how the following code that uses a multidimensional array results in a function call.

int [,] secondarr = new int[1, 2];
secondarr[0, 0] = 40;

The preceding code generates the following MSIL. Notice the function call.

IL_0029: ldc.i4.s   40
IL_002b: call instance void int32[0...,0...]::Set(int32,
                                                   int32,
                                                   int32)

The following code shows the MSIL generated for a jagged array. Notice the use of the MSIL stelem instruction. The stelem instruction replaces the array element at a given index with the int32 value on the evaluation stack.

int [][] intarr = new int[1][];
intarr[0] = new int[2];
intarr[0][0] = 10;

The preceding code generates the following MSIL. Note the use of the stelem instruction.

IL_001c:  ldc.i4.s   10
IL_001e:  stelem.i4

Additional Considerations

When using arrays, also consider the following:

  • Sorting. If you retrieve data from a database, see if you can presort it by using an ORDER BY clause in your query. If you need to use the sorted results from the database for additional searching and sorting of a subset of the results, you may need to sort the arrays. You should always measure to find out which approach works better for your scenario: sorting by using SQL queries or sorting by using arrays in the business layer.
  • Avoid returning an Array from a property. Instead, consider using indexing properties.
    EmployeeList l = FillList();
    for (int i = 0; i < l.Length; i++) {
       if (l.All[i] == x){...}
    }
    

    In the preceding code, each time the property All is used, you might be creating and returning an array. If the calling code uses the property in a loop as shown in the preceding code, an array is created on each iteration of the loop.

    In addition, if you return an array from a method, the resulting code is somewhat nonintuitive. A code example follows. In either case, document the details for your API.

    // calling code:
    if (l.GetAll()[i]== x) {...}
    

    If you must return an array from a piece of code, consider returning a copy to prevent synchronization issues between clients.

  • In the following code example, each call to the myObj property creates a copy of the array. As a result, a copy of the array will be created each time the code DoSomething(obj.myObj[i]) is executed.
    for (int i = 0; i < obj.myObj.Count; i++)
          DoSomething(obj.myObj[i]);
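
An indexing property (indexer) avoids these repeated copies. The following is a hypothetical implementation of the EmployeeList type used in the earlier example; the string element type and the constructor are assumptions made for illustration.

```csharp
public class EmployeeList
{
    private readonly string[] names;

    public EmployeeList(string[] initialNames)
    {
        // Copy once at construction so callers cannot alter internal state.
        names = (string[])initialNames.Clone();
    }

    public int Length
    {
        get { return names.Length; }
    }

    // Indexer: returns one element without allocating a copy of the array.
    public string this[int index]
    {
        get { return names[index]; }
    }
}
```

Calling code can then write l[i] inside a loop, and no array is created on each iteration.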
    

Collections Explained

There are two basic types of collections: lists and dictionaries. Lists are index-based. Dictionaries are key-based, which means you store a key with the value. Table 5.3 summarizes the various list and dictionary types provided by the .NET Framework class libraries.

Collections are types that implement the IEnumerable, ICollection, or IList interfaces. If the types implement IDictionary separately or in addition to these three interfaces, they are referred to as Dictionaries. Table 5.3 lists the guidelines for each of these collection types.

Table 5.3: List and Dictionary Collection Types

Type                 Description
ArrayList            This is a dynamically sizable array. It is useful when you do not know the required array size at design time.
Hashtable            This is a collection of key/value pairs that are organized based on the hash code of the key. It is appropriate when you need to search but not sort.
HybridDictionary     This uses a ListDictionary when the collection is small, and switches to Hashtable when the collection gets large.
ListDictionary       This is useful for storing 10 or fewer key/value pairs.
NameValueCollection  This is a sorted collection of associated String keys and String values that can be accessed either with the key or with the index.
Queue                This is a first-in, first-out collection that implements ICollection.
SortedList           This is a collection of key/value pairs that are sorted by the keys and are accessible by key and by index.
Stack                This is a simple last-in, first-out collection of objects.
StringCollection     This is a strongly typed array list for strings.
StringDictionary     This is a hash table with the key strongly typed to be a string rather than an object.

Collection Issues

This section summarizes performance-related issues associated with collections:

  • Boxing issues
  • Thread safety
  • Enumeration overhead

Boxing Issues

If you use a collection such as an ArrayList to store value types such as integer or float, every item is boxed (a reference type is created and the value copied) when it is added to the collection. If you are adding many items to the collection, the overhead can be excessive. The problem is illustrated by the following code snippet.

ArrayList al = new ArrayList();
for (int i=0; i<1000;i++)
  al.Add(i); //Implicitly boxed because Add() takes an object
int f = (int)al[0]; // The element is unboxed

To prevent this problem, consider using an array instead, or creating a custom collection class for your specific value type.

Note   The .NET Framework 2.0, at the time of this writing, introduces generics to the C# language which will avoid the boxing and unboxing overhead.

Thread Safety

Collections are generally not thread safe by default. It is safe for multiple threads to read the collection, but any modification to the collection produces undefined results for all threads that access the collection. To make a collection thread safe, do the following:

  • Create a thread safe wrapper using the Synchronized method, and access the collection exclusively through that wrapper.
    // Creates and initializes a new ArrayList.
    ArrayList myAr = new ArrayList();    
    // add objects to the collection    
    // Creates a synchronized wrapper around the ArrayList.
    ArrayList mySyncdAr = ArrayList.Synchronized( myAr );
    
  • Use the lock statement in C# (or SyncLock in Visual Basic .NET) on the SyncRoot property when accessing the collection.
    ArrayList myCollection = new ArrayList();
    lock( myCollection.SyncRoot ) {
      // Insert your code here.
    }
    

    You can also implement a synchronized version of the collection by deriving from the collection and implementing a synchronized method using the SyncRoot property. See the preceding "Locking and Synchronization Guidelines" section to understand the implications because synchronizing in this way is usually a less effective method.

Enumeration Overhead

The .NET Framework version 1.1 collections provide an enumerator by implementing IEnumerable.GetEnumerator. This turns out to be less than optimal for a number of reasons:

  • The GetEnumerator method is virtual, so the call cannot be inlined.
  • The return value is an IEnumerator interface instead of an exact type; as a result, the exact enumerator cannot be known at compile time.
  • The MoveNext method and Current properties are again virtual and so cannot be inlined.
  • IEnumerator.Current requires a return type of System.Object rather than a more specific data type, which may require boxing and unboxing, depending on the data types stored in the collection.

As a result of these factors, there is both managed heap and virtual function overhead associated with foreach on simple collection types. This can be a significant factor in performance-sensitive regions of your application.

For information about how to minimize the overhead, see "Consider Enumerating Overhead" in the next section.

Collection Guidelines

This section summarizes guidelines that help you to use .NET Framework collection types most efficiently and to avoid common performance mistakes:

  • Analyze your requirements before choosing the collection type.
  • Initialize collections to the right size when you can.
  • Consider enumerating overhead.
  • Prefer to implement IEnumerable with optimistic concurrency.
  • Consider boxing overhead.
  • Consider for instead of foreach.
  • Implement strongly typed collections to prevent casting overhead.
  • Be efficient with data in collections.

Analyze Your Requirements Before Choosing the Collection Type

Do you need to use a collection? Arrays are generally more efficient, particularly if you need to store value types. You should choose a collection based on the size, type of data to be stored, and usage requirements. Use the following evaluation criteria when determining which collection is appropriate:

  • Do you need to sort your collection?
  • Do you need to search your collection?
  • Do you need to access each element by index?
  • Do you need a custom collection?

Do You Need to Sort Your Collection?

If you need to sort your collection, do the following:

  • Use ArrayList to bind the read-only sorted data to a data grid as a data source. This is better than using a SortedList if you only need to bind read-only data using the indexes in the ArrayList (for example, because the data needs to be displayed in a read-only data grid). The data is retrieved in an ArrayList and sorted for displaying.
  • Use SortedList for sorting data that is mostly static and needs to be updated only infrequently.
  • Use NameValueCollection for sorting strings.
  • SortedList presorts the data while constructing the collection. This results in a comparatively expensive creation process for the sorted list, but all updates to the existing data and any small additions to the list are automatically and efficiently resorted as the changes are made. SortedList is suitable for mostly static data with minor updates.

Do You Need to Search Your Collection?

If you need to search your collection, do the following:

  • Use Hashtable if you search your collection randomly based on a key/value pair.
  • Use StringDictionary for random searches on string data.
  • Use ListDictionary for sizes less than 10.

Do You Need to Access Each Element by Index?

If you need to access each element by index, do the following:

  • Use ArrayList and StringCollection for zero-based index access to the data.
  • Use Hashtable, SortedList, ListDictionary, and StringDictionary to access elements by specifying the name of the key.
  • Use NameValueCollection to access elements, either by using a zero-based index or specifying the key of the element.
  • Remember that arrays do this better than any other collection type.

Do You Need a Custom Collection?

Consider developing a custom collection to address the following scenarios:

  • You need to marshal the collection by reference. All standard collections are passed by value. For example, if the collection stores objects that are relevant only on the server, you might want to marshal the collection by reference rather than by value.
  • You need to create a strongly typed collection for your own custom object to avoid the costs of upcasting or downcasting, or both. Note that if you create a strongly typed collection by inheriting CollectionBase or Hashtable, you still end up paying the price of casting, because internally, the elements are stored as objects.
  • You need a read-only collection.
  • You need to have your own custom serializing behavior for your strongly typed collection. For example, if you extend Hashtable and are storing objects that implement IDeserializationCallback, you need to customize serialization to factor for the computation of hash values during the serialization process.
  • You need to reduce the cost of enumeration.

Initialize Collections to the Right Size When You Can

Initialize collections to the right size if you know exactly, or even approximately, how many items you want to store in your collection; most collection types let you specify the size with the constructor, as shown in the following example.

ArrayList ar = new ArrayList (43);

Even if the collection is able to be dynamically resized, it is more efficient to allocate the collection with the correct or approximate initial capacity (based on your tests).
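As a minimal sketch of the effect, the following code fills a presized ArrayList and checks that its capacity never changed, which means the internal array was never reallocated and copied. The class and method names are hypothetical.

```csharp
using System.Collections;

public static class PresizeDemo
{
    // Fills a presized ArrayList and reports whether it avoided growing.
    public static bool FillWithoutGrowing(int expectedCount)
    {
        ArrayList list = new ArrayList(expectedCount);
        int initialCapacity = list.Capacity;
        for (int i = 0; i < expectedCount; i++)
        {
            list.Add("item " + i);
        }
        // An unchanged capacity means no internal array was reallocated.
        return list.Capacity == initialCapacity;
    }
}
```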

Consider Enumerating Overhead

A collection supports enumeration of its elements using the foreach construct by implementing IEnumerable.

To reduce the enumeration overhead in collections, consider implementing the Enumerator pattern as follows:

  • If you implement IEnumerable.GetEnumerator, also implement a nonvirtual GetEnumerator method. Your class's IEnumerable.GetEnumerator method should call this nonvirtual method, which should return a nested public enumerator struct, as shown in the following code sample.
    class MyClass : IEnumerable
    {
      // Nonvirtual implementation for your custom collection
      public MyEnumerator GetEnumerator() {
        return new MyEnumerator(this); // Return nested public struct
      }
      // Explicit IEnumerable implementation
      IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator(); // Call the non-interface method
      }
    }
    

    The foreach language construct calls your class's nonvirtual GetEnumerator if your class explicitly provides this method. Otherwise, it calls IEnumerable.GetEnumerator if your class implements IEnumerable. Calling the nonvirtual method is slightly more efficient than calling the virtual method through the interface.

  • Explicitly implement the IEnumerator.Current property on the enumerator struct. The implementation of .NET collections causes the property to return a System.Object rather than a strongly typed object; this incurs a casting overhead. You can avoid this overhead by returning a strongly typed object or the exact value type rather than System.Object in your Current property. Because you have explicitly implemented a non-virtual GetEnumerator method (not the IEnumerable.GetEnumerator) the runtime can directly call the Enumerator.Current property instead of calling the IEnumerator.Current property, thereby obtaining the desired data directly and avoiding the casting or boxing overhead, eliminating virtual function calls, and enabling inlining.

Your implementation should be similar to the following.

// Custom property in your class
// Call this property to avoid the boxing or casting overhead
public MyValueType Current {
  get {
    MyValueType obj = new MyValueType();
    // the obj fields are populated here
    return obj;
  }
}
// Explicit member implementation
object IEnumerator.Current {
  get { return Current; } // Call the non-interface property to avoid casting
}

Implementing the Enumerator pattern involves having an extra public type (the enumerator) and several extra public methods that are really there only for infrastructure reasons. These types add to the perceived complexity of the API and must be documented, tested, versioned, and so on. As a result, you should adopt this pattern only where performance is paramount.

The following sample code illustrates the pattern.

public class  ItemTypeCollection: IEnumerable 
{
   public struct MyEnumerator : IEnumerator 
   {
      public ItemType Current { get {… } }
      object IEnumerator.Current { get { return Current; } }
      public bool MoveNext() { … }
      …   }
   public MyEnumerator GetEnumerator() { … }
   IEnumerator IEnumerable.GetEnumerator() { … }
   …}

To take advantage of JIT inlining, avoid using virtual members in your collection unless you really need extensibility. Also, limit the code in the Current property to returning the current value to enable inlining, or alternatively, use a field.
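The following is one possible complete version of the pattern, offered as a sketch rather than a reference implementation. IntCollection and its nested Enumerator are hypothetical names, and the growth policy (doubling) is an assumption made to keep the sample short.

```csharp
using System;
using System.Collections;

public class IntCollection : IEnumerable
{
    private int[] items = new int[4];
    private int count;

    public int Count { get { return count; } }

    public void Add(int value)
    {
        if (count == items.Length)
        {
            int[] bigger = new int[items.Length * 2];
            Array.Copy(items, bigger, count);
            items = bigger;
        }
        items[count++] = value;
    }

    // Nested public struct: foreach can use it without a heap allocation.
    public struct Enumerator : IEnumerator
    {
        private readonly IntCollection owner;
        private int index;

        internal Enumerator(IntCollection owner)
        {
            this.owner = owner;
            this.index = -1;
        }

        // Strongly typed and trivial, so no boxing when called directly.
        public int Current { get { return owner.items[index]; } }

        // Explicit interface member: boxes only for callers that use IEnumerator.
        object IEnumerator.Current { get { return Current; } }

        public bool MoveNext() { return ++index < owner.count; }

        public void Reset() { index = -1; }
    }

    // Nonvirtual method that foreach binds to when iterating an IntCollection.
    public Enumerator GetEnumerator() { return new Enumerator(this); }

    // Interface implementation forwards to the nonvirtual method.
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}
```

A foreach loop over a variable typed as IntCollection uses the struct enumerator and the typed Current property. A loop over the same object typed as IEnumerable still works, but goes through the interface and boxes each value.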

Prefer to Implement IEnumerable with Optimistic Concurrency

There are two legitimate ways to implement the IEnumerable interface. With the optimistic concurrency approach, you assume that the collection will not be modified while it is being enumerated. If it is modified, you throw an InvalidOperationException. An alternate pessimistic approach is to take a snapshot of the collection in the enumerator to isolate the enumerator from changes in the underlying collection. In most general cases, the optimistic concurrency model provides better performance.
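A minimal sketch of the optimistic approach follows. It assumes a hypothetical VersionedList class that stamps each enumerator with a version number instead of copying the data; the names and the exception message are illustrative only.

```csharp
using System;
using System.Collections;

public class VersionedList : IEnumerable
{
    private readonly ArrayList items = new ArrayList();
    private int version;   // incremented on every modification

    public void Add(object value)
    {
        items.Add(value);
        version++;
    }

    public IEnumerator GetEnumerator() { return new Enumerator(this); }

    private sealed class Enumerator : IEnumerator
    {
        private readonly VersionedList owner;
        private readonly int expectedVersion;
        private int index = -1;

        public Enumerator(VersionedList owner)
        {
            this.owner = owner;
            this.expectedVersion = owner.version;   // a stamp, not a snapshot
        }

        public object Current { get { return owner.items[index]; } }

        public bool MoveNext()
        {
            CheckVersion();
            return ++index < owner.items.Count;
        }

        public void Reset()
        {
            CheckVersion();
            index = -1;
        }

        // Fails fast if the collection changed after this enumerator was created.
        private void CheckVersion()
        {
            if (expectedVersion != owner.version)
            {
                throw new InvalidOperationException(
                    "Collection was modified during enumeration.");
            }
        }
    }
}
```

The cost per enumerator is one integer comparison per step, whereas a snapshot would copy every element up front.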

Consider Boxing Overhead

When storing value types in a collection, you should consider the overhead involved, because the boxing overhead can be excessive depending on the size of the collection and the rate of updating or accessing the data. If you do not need the functionality provided by collections, consider using arrays to avoid the boxing overhead.
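The following sketch contrasts the two storage choices for integers. The names are hypothetical, and the sample is meant to show where boxing and unboxing occur, not to serve as a benchmark.

```csharp
using System.Collections;

public static class BoxingDemo
{
    public static long SumArrayList(int n)
    {
        ArrayList list = new ArrayList(n);
        for (int i = 0; i < n; i++)
        {
            list.Add(i);               // boxes: one heap object per int
        }
        long total = 0;
        for (int i = 0; i < n; i++)
        {
            total += (int)list[i];     // unboxes on every read
        }
        return total;
    }

    public static long SumArray(int n)
    {
        int[] values = new int[n];     // one allocation, ints stored inline
        for (int i = 0; i < n; i++)
        {
            values[i] = i;
        }
        long total = 0;
        for (int i = 0; i < n; i++)
        {
            total += values[i];        // no boxing or unboxing
        }
        return total;
    }
}
```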

Consider for Instead of foreach

Use for instead of foreach (C#) to iterate the contents of arrays or collections in performance critical code, particularly if you do not need the protections offered by foreach.

Both foreach in C# and For Each in Visual Basic .NET use an enumerator to provide enhanced navigation through arrays and collections. As discussed earlier, typical implementations of enumerators, such as those provided by the .NET Framework, will have managed heap and virtual function overhead associated with their use.

If you can use the for statement to iterate over your collection, consider doing so in performance sensitive code to avoid that overhead.
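As an illustration, the two hypothetical methods below compute the same result over an ArrayList of strings. The for version uses the indexer and so creates no enumerator, but it also gives up the modification check that the enumerator performs.

```csharp
using System.Collections;

public static class LoopDemo
{
    public static int CountLongNamesForeach(ArrayList names, int minLength)
    {
        int count = 0;
        foreach (string name in names)        // goes through IEnumerator
        {
            if (name.Length >= minLength) count++;
        }
        return count;
    }

    public static int CountLongNamesFor(ArrayList names, int minLength)
    {
        int count = 0;
        int n = names.Count;                  // read Count once
        for (int i = 0; i < n; i++)           // indexer access, no enumerator
        {
            string name = (string)names[i];
            if (name.Length >= minLength) count++;
        }
        return count;
    }
}
```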

Implement Strongly Typed Collections to Prevent Casting Overhead

Implement strongly typed collections to prevent upcasting or downcasting overhead. Do so by having their methods accept or return specific types instead of the generic object type. StringCollection and StringDictionary are examples of strongly typed collections for strings.

For more information and a sample implementation, see "Walkthrough: Creating Your Own Collection Class" in Visual Basic and Visual C# Concepts on MSDN at http://msdn.microsoft.com/en-us/library/xth2y6ft(VS.71).aspx.
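As a rough sketch of the idea, the following collection stores its elements in a typed array, so neither its own code nor its callers cast to or from object. Customer and CustomerCollection are hypothetical types invented for this example.

```csharp
using System;

public class Customer
{
    public readonly string Name;
    public Customer(string name) { Name = name; }
}

public class CustomerCollection
{
    private Customer[] items = new Customer[4];   // typed storage, not object[]
    private int count;

    public int Count { get { return count; } }

    public void Add(Customer customer)            // accepts the specific type
    {
        if (count == items.Length)
        {
            Customer[] bigger = new Customer[items.Length * 2];
            Array.Copy(items, bigger, count);
            items = bigger;
        }
        items[count++] = customer;
    }

    public Customer this[int index]               // returns the specific type
    {
        get
        {
            if (index < 0 || index >= count)
            {
                throw new ArgumentOutOfRangeException("index");
            }
            return items[index];
        }
    }
}
```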

Be Efficient with Data in Collections

When dealing with very large numbers of objects, it becomes very important to manage the size of each object. For example, it makes little difference whether you use a short (Int16), int/Integer (Int32), or long (Int64) for a single variable, but it can make a huge difference if you have a million of them in a collection or array. Whether you are dealing with primitive types or complex user-defined objects, make sure you do not allocate more memory than you need if you will be creating a large number of these objects.
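The arithmetic can be sketched as follows. The helper names are hypothetical, and the figures cover only the element data of an array, ignoring the small fixed array header.

```csharp
public static class FootprintDemo
{
    // Approximate payload size, in bytes, of an array holding 'count' elements.
    public static long BytesForShorts(int count) { return (long)count * sizeof(short); }
    public static long BytesForInts(int count)   { return (long)count * sizeof(int); }
    public static long BytesForLongs(int count)  { return (long)count * sizeof(long); }
}
```

For one million elements, these return 2,000,000, 4,000,000, and 8,000,000 bytes respectively, so choosing long where short would do quadruples the memory used for the data.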

Collection Types

This section summarizes the main issues to consider when using the following collection types:

  • ArrayList
  • Hashtable
  • HybridDictionary
  • ListDictionary
  • NameValueCollection
  • Queue
  • SortedList
  • Stack
  • StringCollection
  • StringDictionary

ArrayList

The ArrayList class represents a list that dynamically resizes as new items are added to the list and its current capacity is exceeded. Consider the following recommendations when using an ArrayList:

  • Use ArrayList to store custom object types and particularly when the data changes frequently and you perform frequent insert and delete operations.
  • Use TrimToSize after you achieve a desired size (and there are no further insertions expected) to trim the array list to an exact size. This also optimizes memory use. However, be aware that if your program subsequently needs to insert new elements, the insertion process is now slower because the ArrayList must now dynamically grow; trimming leaves no room for growth.
  • Store presorted data and use ArrayList.BinarySearch for efficient searches. Sorting and linear searches using Contains are expensive. This is essentially for one-off sorting of data, but if you need to perform frequent sorting, a SortedList might be more beneficial because it automatically re-sorts the entire collection after each insertion or update.
  • Avoid ArrayList for storing strings. Use a StringCollection instead.
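The sort-once, search-many recommendation might look like the following sketch. The class name and the sample IDs are hypothetical.

```csharp
using System.Collections;

public static class ArrayListSearchDemo
{
    public static ArrayList BuildSortedIds()
    {
        ArrayList ids = new ArrayList();
        ids.Add(42);
        ids.Add(7);
        ids.Add(19);
        ids.Add(3);
        ids.Sort();          // sort once, up front
        ids.TrimToSize();    // no more inserts expected: release unused capacity
        return ids;
    }

    // Returns the zero-based position of id, or a negative number if it is absent.
    public static int Find(ArrayList sortedIds, int id)
    {
        return sortedIds.BinarySearch(id);   // binary search; Contains would scan linearly
    }
}
```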

Hashtable

Hashtable represents a collection of key/value pairs that are organized based on the hash code of the key. Consider the following recommendations when using Hashtable:

  • Hashtable is suitable for a large number of records and for data that may or may not change frequently. Frequently changing data has an extra overhead of computing the hash value as compared to data which does not change frequently.
  • Use Hashtable for frequently queried data; for example, product catalogues where a product ID is the key.
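A minimal sketch of the product catalog scenario follows. The product IDs, names, and initial capacity are made up for illustration.

```csharp
using System.Collections;

public static class CatalogDemo
{
    public static Hashtable BuildCatalog()
    {
        Hashtable catalog = new Hashtable(3);   // hypothetical expected size
        catalog["P-100"] = "Keyboard";
        catalog["P-200"] = "Mouse";
        catalog["P-300"] = "Monitor";
        return catalog;
    }

    // Returns the product name, or null when the ID is not in the catalog.
    public static string Lookup(Hashtable catalog, string productId)
    {
        return (string)catalog[productId];      // hash lookup by key, then a downcast
    }
}
```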

HybridDictionary

HybridDictionary is implemented internally, using either a ListDictionary when the collection is small or a Hashtable when the collection increases in size. Consider the following recommendations:

  • Use HybridDictionary for storing data when the number of records is expected to be low most of the time, with occasional increases in size. If you are sure that the size of the collection will always be high or always low, choose Hashtable or ListDictionary, respectively. This avoids the extra cost of the HybridDictionary, which acts as a wrapper around both these collections.
  • Use HybridDictionary for frequently queried data.
  • Do not use HybridDictionary to sort data. It is not optimized for sorting.

ListDictionary

Use ListDictionary to store small amounts of data (fewer than 10 items). This class implements the IDictionary interface using a singly linked list. For example, a factory class that implements the Factory pattern might store instantiated objects in a cache using a ListDictionary, so they can be served directly from the cache the next time a creation request is made.

NameValueCollection

This represents a sorted collection of associated string keys and string values that can be accessed either with the key or with the index. For example, you may use a NameValueCollection if you need to display subjects registered by students in a particular class because it can store the data in alphabetical order of student names.

  • Use NameValueCollection to store strings of key/value pairs in a pre-sorted order. Note that you can also have multiple entries with the same key.
  • Use NameValueCollection for frequently changing data where you need to insert and delete items regularly.
  • Use NameValueCollection when you need to cache items and for fast retrieval.
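The following sketch shows access by key and by index, and what happens when the same key is added twice. The student names and subjects are invented for the example.

```csharp
using System.Collections.Specialized;

public static class NameValueDemo
{
    public static NameValueCollection BuildRegistrations()
    {
        NameValueCollection subjects = new NameValueCollection();
        subjects.Add("Alice", "Math");
        subjects.Add("Bob", "History");
        subjects.Add("Alice", "Physics");   // second value under an existing key
        return subjects;
    }
}
```

With this data, subjects["Alice"] returns the combined string "Math,Physics", subjects.GetValues("Alice") returns the two values separately, and subjects.GetKey(1) returns "Bob".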

Queue

Queue represents a first-in, first-out object collection. Consider the following recommendations for using Queue:

  • Use Queue when you need to access data sequentially, in order of arrival. For example, an application might scan the waiting list of plane reservation requests and allocate vacant seats first to the passengers at the beginning of the queue.
  • Use Queue when you need to process items sequentially in a first-in, first-out manner.
  • If you need to access items based on a string identifier, use a NameValueCollection instead.
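The waiting list scenario could be sketched as follows. WaitingListDemo and AllocateSeats are hypothetical names, and the method assumes that seats is not negative.

```csharp
using System.Collections;

public static class WaitingListDemo
{
    // Serves up to 'seats' passengers in arrival order and returns their names.
    public static string[] AllocateSeats(Queue waitingList, int seats)
    {
        ArrayList served = new ArrayList(seats);
        while (served.Count < seats && waitingList.Count > 0)
        {
            served.Add(waitingList.Dequeue());   // removes the oldest request
        }
        return (string[])served.ToArray(typeof(string));
    }
}
```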

SortedList

The SortedList represents a collection of key/value pairs that are sorted by the keys and are accessible by key and by index. New items are added in sorted order and the positions of existing items are adjusted to accommodate the new items. The creation costs associated with a SortedList are relatively high, so you should use it in the following situations:

  • Use SortedList where the data is mostly static and only a few records need to be added or updated over a period of time; for example, a cache of employee information. You can update the cache by adding a new key based on employee number. The SortedList inserts the new key quickly, whereas an ArrayList needs to run the sorting algorithm all over again, so the delta change is faster in a SortedList.
  • Use SortedList for fast object retrieval using an index or key. It is well suited for circumstances where you need to retrieve a set of sorted objects, or for querying for a specific object.
  • Avoid using SortedList for large data changes because the cost of inserting the large amount of data is high. Instead, prefer an ArrayList and sort it by calling the Sort method. The ArrayList uses the QuickSort algorithm by default. The time taken by ArrayList is much less for creating and sorting than the time taken by the SortedList.
  • Avoid using SortedList for storing strings because of the casting overhead. Use a StringCollection instead.
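As a small sketch of the employee cache example, the following code adds entries out of order and relies on SortedList to keep them ordered by key. The employee numbers and names are invented.

```csharp
using System.Collections;

public static class EmployeeCacheDemo
{
    public static SortedList BuildCache()
    {
        SortedList cache = new SortedList();
        cache.Add(1030, "Kim");
        cache.Add(1010, "Lee");
        cache.Add(1020, "Ortiz");   // placed in its sorted position by key
        return cache;
    }
}
```

After these calls, cache[1020] retrieves "Ortiz" by key, while cache.GetKey(0) and cache.GetByIndex(0) return 1010 and "Lee", the entry with the smallest key.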

Stack

This represents a simple last-in, first-out object collection. Consider the following recommendations for using a Stack:

  • Use Stack in scenarios where you need to process items in a last-in, first-out manner. For example, an application might use a Stack to monitor the 10 most recent users visiting a Web site over a period of time.
  • Specify the initial capacity if you know the size.
  • Use Stack where you can discard the items after processing them.
  • Use Stack where you do not need to access arbitrary items in the collection.

StringCollection

This represents a collection of strings and is a strongly typed ArrayList. Consider the following recommendations for using StringCollection:

  • Use StringCollection to store string data that changes frequently and needs to be retrieved in large chunks.
  • Use StringCollection for binding string data to a data grid. This avoids the cost of downcasting it to a string during retrieval.
  • Do not use StringCollection for sorting strings or to store presorted data.

StringDictionary

This is a Hashtable with the key strongly typed as a string, rather than an object. Consider the following recommendations for using StringDictionary:

  • Use StringDictionary when the data does not change frequently because the underlying structure is a Hashtable used for storing strongly typed strings.
  • Use StringDictionary to store static strings that need to be frequently queried.
  • Always prefer StringDictionary over Hashtable for storing string key/value pairs if you want to preserve the string type to ensure type safety.
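The following sketch shows typed access through StringDictionary. The setting names and values are hypothetical. Note that StringDictionary treats keys as case-insensitive, which may or may not suit your data.

```csharp
using System.Collections.Specialized;

public static class StringDictionaryDemo
{
    public static StringDictionary BuildSettings()
    {
        StringDictionary settings = new StringDictionary();
        settings.Add("Theme", "dark");
        settings["Language"] = "en-US";   // indexer accepts and returns string
        return settings;
    }
}
```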

More Information

For more information about .NET collection classes, see the following Microsoft Knowledge Base articles:

Reflection and Late Binding

Reflection provides the ability to examine and compare types, enumerate methods and fields, and dynamically create and execute types at run time. It does this by examining the metadata contained in assemblies. Many reflection APIs need to search and parse the metadata, which requires extra processing that should be avoided in performance-critical code. Even though all reflection costs are high, some reflection operations cost much more than others. The first (comparing types) is the least expensive, while the last (dynamically creating and executing) is the most expensive.

The late binding technique uses reflection internally and is an expensive operation that should be avoided in performance critical code.

This section summarizes recommendations to minimize the performance impact of reflection or late binding code:

  • Prefer early binding and explicit types rather than reflection.
  • Avoid late binding.
  • Avoid using System.Object in performance critical code paths.
  • Enable Option Explicit and Option Strict in Visual Basic .NET.

Prefer Early Binding and Explicit Types Rather Than Reflection

Visual Basic .NET uses reflection implicitly when you declare the type as object. In C#, you use reflection explicitly. You should avoid reflection wherever possible by using early binding and declaring types explicitly.

Some examples where you use reflection explicitly in C# are when you perform any of the following operations:

  • Type comparisons using typeof, GetType, and IsInstanceOfType.
  • Late bound enumeration using Type.GetFields.
  • Late bound execution using Type.InvokeMember.
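The three kinds of operation listed above can be sketched side by side, together with the early bound call that late bound execution replaces. Calculator and ReflectionCostDemo are hypothetical types written for this example.

```csharp
using System;
using System.Reflection;

public class Calculator
{
    public int Offset = 1;
    public int Twice(int value) { return value * 2; }
}

public static class ReflectionCostDemo
{
    // Type comparison: the least expensive of the three.
    public static bool IsCalculator(object candidate)
    {
        return candidate != null && candidate.GetType() == typeof(Calculator);
    }

    // Late bound enumeration: searches the type's metadata for fields.
    public static int CountPublicFields(Type type)
    {
        return type.GetFields().Length;
    }

    // Late bound execution: the member is resolved by name at run time.
    public static int TwiceLateBound(object target, int value)
    {
        object result = target.GetType().InvokeMember(
            "Twice",
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null, target, new object[] { value });
        return (int)result;
    }

    // Early bound equivalent: the call is resolved by the compiler.
    public static int TwiceEarlyBound(Calculator target, int value)
    {
        return target.Twice(value);
    }
}
```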

Avoid Late Binding

Early binding allows the compiler to identify the specific type required and perform optimizations that are used at run time. Late binding defers the type identification process until run time and requires extra processing instructions to allow type identification and initialization. The following code loads a type at run time.

Assembly asm = System.Reflection.Assembly.LoadFrom("C:\\myAssembly.dll");
Type myType = asm.GetType("myAssembly.MyTypeName");
Object myinstance = Activator.CreateInstance(myType);

This is the equivalent of the following.

MyTypeName myinstance = new MyTypeName();

In some cases, you need dynamic execution of types but when performance is critical, avoid late binding.

Avoid Using System.Object in Performance-Critical Code Paths

The System.Object data type can represent any value or reference type but requires late bound calls to execute methods and access properties. Avoid using the Object type when performance of your code is critical.

The Visual Basic .NET compiler implicitly uses reflection if you declare the type as Object.

'VB.NET
Dim obj As Object
obj = New CustomType()
obj.CallSomeMethod()

Note   This is a Visual Basic .NET specific issue. C# has no such problem.

Enable Option Explicit and Option Strict in Visual Basic .NET

By default, Visual Basic .NET allows late bound code. Set the Strict and Explicit properties to true to force Visual Basic .NET to not allow late bound code. In Visual Studio .NET, you can access these properties through the Project Properties dialog box. If you use the command line compiler Vbc.exe to compile your code, use the /optionexplicit and /optionstrict flags.

Code Access Security

The .NET Framework provides code access security to control the ability of code to access various protected resources and operations. An administrator can control which permissions a particular assembly is granted through policy configuration. At run time, access to specific resource types and operations triggers a permission demand that verifies that every caller in the call stack has the appropriate permission to access the resource or perform the restricted operation. If the calling code does not have the relevant permission, a security exception is thrown.

If security is a requirement, you typically cannot trade security for performance. But then, neither can you trade performance for security. If your planning indicates that you do not have the necessary resources to deliver a feature that is both secure and has the necessary performance, it may be time to start making simplifications. Delivering a secure feature that is not actually usable because its performance is so poor is really the same as not delivering at all, and is a whole lot more expensive. That said, there are usually plenty of other areas in your application where you can investigate and tune first. Make sure you use security wisely and account for the overhead.

This section summarizes guidelines to consider only after a careful security review of your application:

  • Consider SuppressUnmanagedCodeSecurity for performance-critical trusted scenarios.
  • Prefer declarative demands rather than imperative demands.
  • Consider using link demands rather than full demands for performance-critical, trusted scenarios.

Consider SuppressUnmanagedCodeSecurity for Performance-Critical Trusted Scenarios

When you use P/Invoke or COM interop, the interop code is subject to permission demands that walk the call stack to ensure that the calling code is authorized to call unmanaged code.

You can use the SuppressUnmanagedCodeSecurity attribute to improve performance by eliminating the stack walk permission demand and replacing it with a link demand that only checks the immediate caller. Before doing so, you should perform a thorough code review and be certain that your code is not susceptible to luring attacks.

The following code shows how to use SuppressUnmanagedCodeSecurity with P/Invoke.

public class NativeMethods
{
  // The use of SuppressUnmanagedCodeSecurity here applies only to FormatMessage
  [DllImport("kernel32.dll"), SuppressUnmanagedCodeSecurity]
  private unsafe static extern int FormatMessage(
                                      int dwFlags,
                                      ref IntPtr lpSource,
                                      int dwMessageId,
                                      int dwLanguageId,
                                      ref String lpBuffer, int nSize,
                                      IntPtr *Arguments);
}

The following example shows how to use SuppressUnmanagedCodeSecurity with COM interop, where this attribute must be used at the interface level.

[SuppressUnmanagedCodeSecurity]
public interface IComInterface
{
}

More Information

For more information, see "Use SuppressUnmanagedCodeSecurity with Caution" in Chapter 8, "Code Access Security in Practice," in Improving Web Application Security: Threats and Countermeasures on MSDN at http://msdn.microsoft.com/en-us/library/ms994921.aspx.

Prefer Declarative Demands Rather Than Imperative Demands

Use declarative demands where possible. Declarative security has a rich syntax and using declarative demands provides the .NET Framework with the maximum ability to optimize code because you are specifying your intent succinctly and directly.

Consider Using Link Demands Rather Than Full Demands for Performance-Critical, Trusted Scenarios

When code accesses a protected resource or performs a privileged operation, code access security demands are used to ensure that the code has the required permissions. Full demands require the runtime to perform a stack walk to ensure that the calling code has the required permissions.

The full stack walk can be avoided by using a link demand instead of a full demand. While performance is improved because the link demand checks only the immediate caller during JIT compilation, you need to balance this performance gain with your security requirements. The link demand significantly increases the chances of your code being subjected to a luring attack, where malicious code calls your code to access a protected resource or perform a privileged operation.

You should consider using link demands only in trusted scenarios where performance is critical, and you should consider it only after you have fully evaluated the security implications.

More Information

For more information about link demands and how to use them appropriately, see "Link Demands" in Chapter 8, "Code Access Security in Practice," in Improving Web Application Security: Threats and Countermeasures on MSDN at http://msdn.microsoft.com/en-us/library/ms994921.aspx.

Working Set Considerations

A smaller working set produces better system performance. The working set of a program is the collection of those pages in the program's virtual address space that have recently been referenced. As the working set size increases, memory demand increases. Factors that govern the working set size include the number of loaded DLLs, the number of application domains in the process, the number of threads, and the amount of memory allocated by your process. When you design your application, review the following points:

  • For better application startup time, load only the assemblies you need.
  • Consider assemblies that are being loaded as side effects of the assemblies you need.
  • Delay application initialization. Touch code and data only when requested by the user (pay for play).
  • Reduce the number of application domains or make assemblies shared (nonshared assemblies are loaded once per application domain), or both.
  • Reduce the number of threads. This is less critical, but it reduces the working set by eliminating each thread's stack, the thread-specific memory allocations, and whatever code is unique to that thread. This can especially be an issue if you expect multiple copies of your application to be running, such as a client application running on a terminal server system.
  • Experiment with NGen and non-NGen to determine which saves the largest number of working set pages. Note that an application that is completely natively compiled does not load Mscorjit.dll, which saves approximately 200 KB or more, depending on the cost of the compilations. Generally, you can expect NGen to improve the shareability of your application (fewer private pages) at the price of slightly less raw speed (< 5% slower). Frequently, the speed gains from being smaller more than offset speed lost from having shareable code. Smaller is often faster.

More Information

You can use the Vadump.exe tool to measure your application's working set size. For more information, see "Vadump.exe: Virtual Address Dump" at http://msdn.microsoft.com/en-us/magazine/dd882521.aspx.

Ngen.exe Explained

The Native Image Generator utility (Ngen.exe) allows you to run the JIT compiler on your assembly's MSIL to generate native machine code that is cached to disk. After a native image is created for an assembly, the runtime automatically uses that native image each time it runs the assembly. Running Ngen.exe on an assembly potentially allows the assembly to load and execute faster, because it restores code and data structures from the native image cache rather than generating them dynamically.

While this can lead to quicker application startup times and smaller working sets, it does so at the expense of runtime optimization. It is important that you measure your code's performance to see whether Ngen.exe actually provides any benefits for your application.

Startup Time

Ngen.exe can improve startup time due to shared pages and reduced working set. Keep the following points about Ngen.exe and startup time in mind:

  • If all modules are precompiled with Ngen.exe, JIT compilation is not required.
  • I/O for startup can be reduced if the precompiled modules are already (partly) resident.
  • I/O can be increased due to preloading more code than the corresponding MSIL.
  • Startup time can be improved due to reduced or eliminated JIT compilation.
  • Startup time can actually be increased due to additional I/O, if some modules are not precompiled with Ngen.exe and require JIT compilation.

Working Set

Ngen.exe can reduce the total memory utilization for applications that use shared assemblies which are loaded into many application domains in different processes. In the .NET Framework version 1.0 and 1.1, Ngen.exe cannot generate images that can be shared across application domains but does generate images that can be shared across processes. The operating system can share one copy of the natively compiled code across all processes; whereas code that is JIT-compiled cannot be shared across processes because of its dynamic nature.

An application that is completely precompiled with Ngen.exe does not load Mscorjit.dll, which reduces your application's working set by approximately 200 KB. Note that native modules do not contain metadata (in .NET Framework 1.0 and 1.1), so in precompiled code cases the CLR must still load both the MSIL version of the assembly and the precompiled image to gain access to the necessary metadata and MSIL. However, the need for MSIL and metadata is minimized when the precompiled image is available, so those sections of the original MSIL image do not contribute nearly as significantly to the working set.

Keep the following points about Ngen.exe and working set in mind:

  • Code that is precompiled with Ngen.exe has the potential to be shared while JIT-compiled code cannot be shared.
  • Shareable pages only help if something actually shares them.
  • Libraries and multi-instance applications can expect some savings due to sharing.
  • Single instance DLLs (those that exist for deployment or factoring reasons) and single instance EXEs will not benefit from improved potential for sharing.

Running Ngen.exe

To run Ngen.exe, use the following command line.

ngen.exe assemblyname

This generates the native code for the specified assembly. The generated native code is stored in the native image cache, alongside the global assembly cache.

You can delete the assembly from the image cache by running the following command.

ngen.exe /delete assemblyname

Ngen.exe Guidelines

This section summarizes recommendations if you are considering using Ngen.exe:

  • Scenarios where startup time is paramount should consider Ngen.exe for their startup path.
  • Scenarios which will benefit from the ability to share assemblies should adopt Ngen.exe.
  • Scenarios with limited or no sharing should not use Ngen.exe.
  • Do not use Ngen.exe for ASP.NET version 1.0 and 1.1.
  • Consider Ngen.exe for ASP.NET version 2.0.
  • Measure performance with and without Ngen.exe.
  • Regenerate your image when you ship new versions.
  • Choose an appropriate base address.

Scenarios Where Startup Time Is Paramount Should Consider Ngen.exe for Their Startup Path

Use Ngen.exe for faster startup. Common examples include client scenarios that need the faster startup to be responsive or where you need to improve startup performance of large applications and system services.

Ngen.exe improves startup time for the following reasons:

  • It defers JIT compilation until less frequently used code paths are taken.
  • It potentially allows sharing of pages in memory.
  • It leverages the disk cache to get code loaded quickly.

Scenarios That Benefit from the Ability to Share Assemblies Should Adopt Ngen.exe

Ngen.exe is appropriate for scenarios that benefit from page sharing and working set reduction. Ngen.exe often helps the following scenarios:

  • A line of business executable running on a terminal server (multiple instance).
  • A shared library used by a series of line of business applications (multiple instance).

Scenarios with Limited or No Sharing Should Not Use Ngen.exe

In general, Ngen.exe is not beneficial for scenarios with limited or no sharing, for the following reasons:

  • A dependency on Ngen.exe creates a servicing burden.
  • Single instance applications or libraries gain little benefit. Although code is shareable, no processes will be sharing it because there is only a single instance.
  • The JIT compiler is itself shareable, so the 200 KB cost of loading the JIT compiler is amortized over the applications using it.

Do Not Use Ngen.exe with ASP.NET Version 1.0 and 1.1

Ngen.exe is not recommended for ASP.NET 1.0 and 1.1 because the assemblies that Ngen.exe produces cannot be shared between application domains. If you use Ngen.exe on a strong-named assembly, ASP.NET 1.0 and 1.1 use the precompiled image for the first application domain that needs it, but all subsequent application domains load and JIT-compile their own images, so you do not get the performance benefit.

More Information

For more information, see Microsoft Knowledge Base article 331979, "INFO: ASP.NET Does Not Support Pre-Just-In-Time (JIT) Compilation Through Native Image Generator (Ngen.exe)," at http://support.microsoft.com/default.aspx?scid=kb;en-us;331979.

Consider Ngen.exe with ASP.NET Version 2.0

At the time of this writing, the .NET Framework 2.0 (code-named "Whidbey") includes a version of Ngen.exe that produces images that can be shared between application domains. Consider using Ngen.exe on assemblies that you share between applications. Make sure you measure performance with and without Ngen.exe.

Measure Performance with and without Ngen.exe

Measure the performance of your application both with and without using Ngen.exe to be sure about the benefits. Make sure that any performance improvements warrant the use of the utility.

Note that Ngen.exe produces code that is optimized for the greatest ability to be shared, sometimes at the expense of raw speed. Ngen.exe can potentially reduce the run-time performance of frequently called procedures because it cannot make some of the optimizations that the JIT compiler can make at run time. It prefers to create code that is shareable; the JIT compiler has no such restriction. You should also consider the extra maintenance involved in regenerating native images whenever they become out of date.
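One simple way to compare the two configurations is to time how long a command takes to run to completion, once before you run ngen.exe on the assembly and once after. The following is a minimal sketch of such a harness, not a definitive benchmark. The helper names time_command and summarize are hypothetical, and the sketch assumes that the command you measure exits on its own (for example, a console application or a test mode that starts up and quits).

```python
import statistics
import subprocess
import sys
import time


def time_command(command, runs=5):
    """Run the command several times and return the wall-clock seconds for each run."""
    timings = []
    for _ in range(runs):
        started = time.perf_counter()
        # check=True raises if the process fails, so a crash is not mistaken for a fast run.
        subprocess.run(command, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - started)
    return timings


def summarize(timings):
    """Reduce a list of timings to a few figures that are easy to compare."""
    return {
        "runs": len(timings),
        "median": statistics.median(timings),
        "min": min(timings),
    }


if __name__ == "__main__":
    # Pass the command to measure on the command line; the fallback simply
    # starts and stops the Python interpreter so the script runs anywhere.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(summarize(time_command(command)))
```

Run the script against the same executable with and without a native image installed and compare the medians. The first run after a reboot is usually slower than later runs because the files are not yet in the disk cache, so compare like with like.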

Regenerate Your Image When You Ship New Versions

Make sure you regenerate your native image when you ship new versions of your assemblies for bug fixes or updates, or when an external dependency changes.

Ngen.exe emits information including the version of the .NET Framework, CPU type, assembly, and operating system on which the native code was generated. The CLR reverts to JIT compilation if the run-time environment does not match the compiled environment.

Note   To avoid having stale assemblies in the native image cache after servicing, you can run Ngen.exe on the assemblies again when you install a service pack, just as the setup program does during the initial installation. Visual Studio .NET provides a straightforward way to implement this behavior by defining custom actions in a Microsoft Installer (MSI) package. More information about general .NET deployment concepts and custom actions in particular can be found at http://msdn.microsoft.com/en-us/library/k2he6h0w(vs.71).aspx.

Choose an Appropriate Base Address

Choose an appropriate base address for optimum performance. You can specify the base address in the Visual Studio .NET integrated development environment (IDE) in the Project Properties dialog box (in the Optimization section of Configuration Properties). You can also specify it using the /baseaddress option of the Csc.exe or Vbc.exe command line compilers.

Try to avoid collisions between assemblies. A good practice is to allocate an address range three times the size of your MSIL assembly file. You should include extra space to accommodate an increase in assembly size due to bug fixes.
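The sizing rule above can be turned into a quick calculation. The following sketch lays out consecutive, non-overlapping address ranges, each three times the size of the corresponding MSIL file. It is an illustration under stated assumptions: the starting address 0x11000000, the assembly names, and the file sizes are all hypothetical, and each range is rounded up to a 64 KB boundary on the assumption that base addresses are aligned to 64 KB.

```python
ALIGNMENT = 0x10000  # 64 KB; assumed alignment for base addresses


def round_up(value, alignment=ALIGNMENT):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def plan_base_addresses(assembly_sizes, start=0x11000000, factor=3):
    """Assign each assembly a base address and a reserved range.

    assembly_sizes is a list of (name, msil_file_size_in_bytes) pairs.
    Each assembly reserves factor times its file size, rounded up to the
    alignment, and the next assembly starts where the previous range ends.
    """
    plan = []
    next_base = round_up(start)
    for name, size in assembly_sizes:
        reserved = round_up(size * factor)
        plan.append((name, next_base, reserved))
        next_base += reserved
    return plan


if __name__ == "__main__":
    # Hypothetical assemblies and MSIL file sizes in bytes.
    sizes = [("Orders.dll", 120000), ("Billing.dll", 400000)]
    for name, base, reserved in plan_base_addresses(sizes):
        print(f"{name}: /baseaddress:0x{base:08X} (reserves {reserved // 1024} KB)")
```

Because each range is three times the current file size, an assembly can grow through several rounds of bug fixes before it reaches the start of the next assembly's range.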

More Information

For more information about how to use Ngen.exe, see "Native Image Generator (Ngen.exe)" in ".NET Framework Tools" on MSDN at http://msdn.microsoft.com/en-us/library/6t9t5wcf(VS.80).aspx.

Summary

The CLR is highly optimized and designed to support high performance applications. However, the specific coding techniques that you use to build .NET assemblies determine the extent to which your code can benefit from that high performance. This chapter presented the main performance-related issues that you need to consider when programming managed code applications.

Additional Resources

For more information about CLR and managed code performance, see the following resources:

For more information about managed code performance, see the following resources:

© 2014 Microsoft