Engineering teams are adopting Agile methodologies with the goal of delivering software early and often. As a result, communication and collaboration between development and operations have become a focal point for process improvement, spawning a trend in software development expressed by the term Development Operations (DevOps).
It’s no longer acceptable for the dev team to spend 18 months working on a product and then abruptly “throw it over the wall” to operations. Companies are instead promoting collaboration between development and operations teams and embracing the concepts of continuous delivery and continuous deployment. In this context, focus shifts from long QA/user acceptance testing (UAT) cycles to rapid identification and resolution of issues in production, and deployment of the fixed application back into production. This rapid identify-fix-deploy loop requires adoption of new tools and processes to be successful.
This article will focus on that first “identify” step in the loop, exploring an important new tooling option in Visual Studio 2012 to handle it. Called PreEmptive Analytics for Team Foundation Server (PA for TFS), this integrated component in TFS 2012 helps teams identify the most important and widespread issues in deployed applications before users even report them.
PA for TFS instruments an application, collects information about incidents in production, then gathers that information and stores it in TFS. This lets the development team use the features of TFS and integrate the stream of incidents into existing workflows, so the team can respond rapidly to what’s happening in production once the software is released.
A Community Edition of PA for TFS is included in the TFS 2012 box. This SKU comes with a few constraints and is essentially a subset of the PA for TFS Professional Edition feature set. The Community Edition is limited to connecting to a single TFS instance and two Team Projects within that instance. Each project is limited to three exception rules and can only track unhandled exceptions. The Professional Edition removes these limitations and includes a number of additional capabilities, including:
PA for TFS is composed of five parts:
Let’s explore each of these components in more detail now.
TFS Project Template Extensions
PA for TFS extends the chosen Team Project template with new assets. It includes a new work item type and work item query, two reports, and an extension for Team Explorer. The new work item type, Incident, is used to track the issues that arise in production. Shown in Figure 1, it keeps track of the application name, the business name, the version, the stack trace of the exception, the number of times the exception has occurred and comments from the end user.
Figure 1 The Incident Work Item Type Keeps Track of Issues
A full stack trace is collected and displayed on the Exception tab, also shown in Figure 1. User-collected information such as e-mail address, comments and loaded components is shown on the Incident Details tab, shown in Figure 2.
Figure 2 The Incident Details Tab
The data in the Metrics group box (shown in both Figures 1 and 2) indicates an uncaught exception that occurred five times. The Accept New Data field is set to No when the incident is closed, to prevent additional data from continuing to be loaded into TFS, but this can be easily overridden (as is shown in Figures 1 and 2) if that’s not the desired behavior. The Aggregator is smart enough to create a new incident if a different exception is uncaught in the same source method.
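The Accept New Data behavior can be pictured with a small sketch. This is only an illustration in Python, not the TFS work item schema or any PreEmptive API: the `Incident` class, its field names and the `record_occurrence` method are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical stand-in for the Incident work item (not the TFS schema)."""
    title: str
    state: str = "New"
    occurrences: int = 0
    accept_new_data: bool = True

    def close(self) -> None:
        # Closing an incident flips Accept New Data to No, as the article describes.
        self.state = "Closed"
        self.accept_new_data = False

    def record_occurrence(self) -> bool:
        """Count one more occurrence; return False if the data was dropped."""
        if not self.accept_new_data:
            return False
        self.occurrences += 1
        return True


incident = Incident("Uncaught exception in SaveOrder")
for _ in range(5):
    incident.record_occurrence()

incident.close()
dropped = not incident.record_occurrence()  # ignored: the incident is closed

incident.accept_new_data = True             # manual override of the default
incident.record_occurrence()                # counted again
```

The point of the flag is that a closed incident stops accumulating data by default, while an explicit override lets a team keep counting occurrences if that suits its workflow better.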
The Team Explorer extension adds a new "PreEmptive Analytics" node under the project to organize the new query and reports. The Queries node is a new projection of a Team Query folder named Analytics, so you can easily add additional queries that represent Open Incidents or other important items. The All Incidents query, shown in Figure 3, does what the name implies: pulls up a list of all incidents.
Figure 3 The All Incidents Query
The Reports node works in a similar fashion, mirroring the Analytics node under Reports. The new report, PreEmptive Analytics Incidents Over Time (shown in Figure 4), provides insight into how incidents are trending by application.
The Team Project extensions can be added to any Team Project template, not just the Microsoft for Agile 5.0 template shown here.
Figure 4 The PreEmptive Analytics Incidents Over Time Report
Exception Endpoint and Aggregator Services
The Exception Endpoint service is a Web service that catches all of the data generated by the instrumented application. The incoming data is staged in durable storage for later pickup by the aggregator service. The Exception Endpoint service must be reachable from wherever the instrumented client is running. Ideally it should be secured with an SSL certificate (supported in the Professional Edition) to protect the exception data that’s being transferred to it.
The Exception Endpoint can be deployed in a hosted environment and will run under Windows Azure as well.
Later this year, PreEmptive Solutions will release a router that sits in front of one or more endpoints and supports a number of scenarios, including:
The data reported by the Exception Endpoint service is gathered by the Exception Aggregator service and piped into TFS. The Exception Aggregator service can run inside a firewall adjacent to the TFS server but, like the Exception Endpoint, it doesn’t have to. When it receives new data, it first attempts to look up an existing incident based on the rules that have been configured and, if it finds one, updates it. If no matching incident exists, a new one is created.
The Exception Aggregator uses the combination of application name, version, business name and exception stack trace as a key. This means that each distinct exception in a specific version of an application will generate its own incident. Incoming exceptions are evaluated according to certain rules. The behavior of the Exception Aggregator is configured using the administration console, shown in Figure 5. There are three main configuration areas: Team Foundation Servers, Exception Sets and Subscriptions.
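The update-or-create lookup described above can be sketched in a few lines. This is a simplified illustration in Python under stated assumptions, not the Aggregator’s actual implementation: `IncidentKey`, `aggregate` and the report dictionary layout are hypothetical, and the real service also applies the configured rules before touching TFS.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class IncidentKey(NamedTuple):
    """The four values the article says are combined into a key."""
    application: str
    version: str
    business: str
    stack_trace: str


def aggregate(reports: Iterable[dict],
              incidents: Optional[Dict[IncidentKey, int]] = None) -> Dict[IncidentKey, int]:
    """Fold exception reports into per-key occurrence counts."""
    incidents = {} if incidents is None else incidents
    for report in reports:
        key = IncidentKey(report["application"], report["version"],
                          report["business"], report["stack_trace"])
        # Existing key: update the count. Unknown key: create a new entry.
        incidents[key] = incidents.get(key, 0) + 1
    return incidents


# Hypothetical sample data: the same exception twice in 1.0, once in 1.1.
base = {"application": "WpfApplicationV1", "business": "Contoso",
        "stack_trace": "at App.Save()"}
reports = [dict(base, version="1.0"), dict(base, version="1.0"),
           dict(base, version="1.1")]
incidents = aggregate(reports)
```

Because the version is part of the key, the same stack trace reported from version 1.1 lands in a separate incident rather than incrementing the 1.0 count.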
The Team Foundation Servers area allows you to specify which TFS Instances the Exception Aggregator should interact with. On each of those TFS Instances you can indicate to which Team Projects the PreEmptive Analytics functionality should be added.
Figure 5 The PreEmptive Analytics for Team Foundation Server Administrative Console
Exception Sets allow you to create groupings of exceptions by type and whether or not they were uncaught, caught, thrown or all of the above. This is an opportunity to, for example, group all custom exceptions from your application together and treat them differently than exception types from the base class library. By default there’s an Exception Set called AllExceptions (shown in Figure 6) that takes all uncaught exceptions and groups them together.
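One way to picture an Exception Set is as a filter over two properties of an incoming exception: its type and how it was encountered. The Python sketch below is purely illustrative; the `ExceptionSet` class, the prefix-matching rule and the `Contoso.` namespace are assumptions made for the example, not how the console stores its configuration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ExceptionSet:
    """Hypothetical model of an Exception Set."""
    name: str
    type_prefixes: Tuple[str, ...]  # empty tuple matches every exception type
    kinds: FrozenSet[str]           # subset of {"uncaught", "caught", "thrown"}

    def matches(self, exception_type: str, kind: str) -> bool:
        if kind not in self.kinds:
            return False
        # str.startswith accepts a tuple of candidate prefixes.
        return not self.type_prefixes or exception_type.startswith(self.type_prefixes)


# Mirrors the default set: every type, uncaught only.
all_exceptions = ExceptionSet("AllExceptions", (), frozenset({"uncaught"}))

# A custom grouping for an application's own exception types (assumed namespace).
custom = ExceptionSet("CustomExceptions", ("Contoso.",),
                      frozenset({"uncaught", "caught", "thrown"}))
```

With sets like these, a caught `Contoso.OrderException` would fall into the custom grouping only, while an uncaught `System.NullReferenceException` would fall into AllExceptions only.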
Figure 6 AllExceptions Groups All Uncaught Exceptions
Subscriptions map instrumented applications to TFS instances and projects. As shown in Figure 7, you would specify the company ID from the Business Attribute and the application ID from the Application Attribute to identify the application. Then you select a TFS instance, Team Project and area to bind an application to a Team Project. At least one rule must be specified that, for a given Exception Set, defines how many times an incident must occur before creating an Incident work item. By default this is 80 occurrences, but you will likely want to decrease this value for uncaught exceptions to get the feedback more immediately.
Figure 7 Mapping Instrumented Applications to Projects
Instrumentation
The Exception Endpoint doesn’t care where the exception “envelopes” it receives originated, and so PA for TFS is built to support cross-platform, heterogeneous collections of applications. Today, you can use Dotfuscator, DashO or API libraries to instrument applications, across various platforms, for PA for TFS.
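The threshold rule amounts to a simple comparison, sketched here in Python. The function name is hypothetical, and whether the work item is created on reaching the threshold or only after exceeding it is an assumption; the article states only the default of 80.

```python
DEFAULT_THRESHOLD = 80  # default occurrence count cited in the article


def should_create_incident(occurrences: int,
                           threshold: int = DEFAULT_THRESHOLD) -> bool:
    """Return True once an exception has occurred often enough to raise a work item.

    Assumption: the work item is created when the count reaches the threshold.
    """
    return occurrences >= threshold


below_default = should_create_incident(79)             # not yet reported
at_default = should_create_incident(80)                # reported
first_crash = should_create_incident(1, threshold=1)   # lowered for uncaught exceptions
```

Lowering the threshold to 1 for uncaught exceptions means the very first crash in production surfaces in TFS, which is the more immediate feedback the article recommends.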
Dotfuscator comes in two flavors, a Community Edition (integrated with Visual Studio Professional and higher) and a Professional Edition. Both can inject the attributes required by PA for TFS into an assembly without having to change any code. This is especially convenient if you’re already using the Dotfuscator product to obfuscate code. With just a few settings changes, Dotfuscator’s obfuscation pass will also instrument the application for PA for TFS.
DashO is essentially the Java sibling to Dotfuscator. The main difference is the lack of a Community Edition equivalent; there’s only a Professional Edition.
Let’s look at the steps required to get PA for TFS running for an application. This walk-through assumes a ClickOnce-deployed Windows Presentation Foundation (WPF) application, and that its development team wants to track any unhandled exceptions that bubble up to the top of the application and would potentially cause it to crash.
Install PA for TFS
The first step is to install PA for TFS. A SQL Server instance is required to store the data. This could be the SQL Server instance used by TFS in a testing scenario; it could also be a SQL Server Express instance. To test this inside a firewall, install the Combined Exception Endpoint and Exception Aggregator role. Point it at the TFS server and specify a TFS account that can create and edit work items. This will install the Exception Endpoint and Exception Aggregator services and create a database to persist the exception data. When the installer is done, it will give you the URL for the Exception Endpoint service. Save this for use later on, when configuring your application instrumentation. This step only needs to be done once in a development organization.
Right before the installer terminates, it launches the PreEmptive Analytics administration console. The console can be used to configure the connection to the TFS server and provision Team Projects with the PA for TFS extensions.
Install Team Explorer Extensions
The second step is to install the Team Explorer extensions. This will add the PreEmptive Analytics node under the Team Project in Team Explorer.
Instrument Your Application
There are two choices when it comes to adding instrumentation to an application:
Option No. 2 doesn’t require any changes to the source code to begin collecting data. We use this method in our example.
You must add, at a minimum, five attributes: ApplicationAttribute, BusinessAttribute, ExceptionTrackAttribute, SetupAttribute and TeardownAttribute.
The ApplicationAttribute, shown in Figure 8, is placed at the assembly level and identifies the specific application you’ll be tracking. It’s keyed with a GUID and includes the Name and Version. Using the GUID Generator that comes with Visual Studio, generate a GUID in Registry Format and then strip off the curly braces after pasting it into the attribute. Keep a record of this GUID, as it will be needed later when configuring the aggregator.
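If the GUID Generator isn’t at hand, the same brace-stripping step can be reproduced with a few lines of Python. This is a sketch only: the helper names are hypothetical, and the uppercase, brace-wrapped form is meant to imitate the tool’s Registry Format output.

```python
import uuid


def registry_format(guid: uuid.UUID) -> str:
    """Render a GUID with surrounding curly braces, like Registry Format."""
    return "{" + str(guid).upper() + "}"


def attribute_value(registry_guid: str) -> str:
    """Strip the curly braces so the value can be pasted into the attribute."""
    return registry_guid.strip("{}")


# A fixed GUID keeps the example reproducible; use uuid.uuid4() for a real one.
sample = uuid.UUID("12345678-1234-5678-1234-567812345678")
pasted = registry_format(sample)
cleaned = attribute_value(pasted)
```

Whichever tool generates the value, the same GUID must later be entered as the Application ID in the aggregator subscription, so it’s worth storing it somewhere the whole team can find it.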
Figure 8 The ApplicationAttribute Identifies the Application Being Tracked
The BusinessAttribute, shown in Figure 9, is also placed at the assembly level and keys the application to the company. Generate the GUID again using the GUID Generator and make sure to note it for later use when configuring the aggregator.
Figure 9 The BusinessAttribute Identifies the Company
The SetupAttribute, shown in Figure 10, is placed on the application entry point and sets up the collection and reporting of data by PA for TFS. It specifies the Exception Endpoint that should be contacted to report the data. Fill in the Exception Endpoint URL that the installer displayed at the end of setup. It will likely also be necessary to scroll down to the bottom of the Attribute Properties grid and set UseSSL to False if a certificate isn’t installed. It’s also possible to drill into the App class and place the SetupAttribute into Main, as shown in Figure 10.
Figure 10 Placing the SetupAttribute into Main
The TeardownAttribute, shown in Figure 11, is placed on the application exit point and wraps up the collection of data, flushes buffers and ensures that PA for TFS has shut down cleanly as the application exits. As with the SetupAttribute, it’s possible to place the TeardownAttribute into the Main method, and Figure 11 illustrates this.
Figure 11 Placing the TeardownAttribute into Main
The final attribute, ExceptionTrackAttribute, shown in Figure 12, can be placed anywhere to track exceptions in the application. In this case, to handle all uncaught exceptions, the attribute is placed at the assembly level and configured to report all unhandled exceptions. To collect comments and contact information from the end user, you can optionally set ReportInfoSourceElement to DefaultAction. This will provide a default UI to collect this data.
Figure 12 Placing the ExceptionTrackAttribute into the Application
Once all the attributes have been applied, Dotfuscator is used to build a new version of the executable. This build process can be automated and done as part of a TFS build once you’ve done the initial configuration.
Configure the Aggregator
Start the Aggregator Administrator Console and add the TFS Instance and Team Project collection. Select the Team Project being used and click the Apply action next to it in the grid. Figure 13 illustrates application of the Team Project extensions to a project called Agile.
Figure 13 Placing Team Project Attributes into the Project
The default Exception Sets are fine to start with, so we’ll move on to the Subscriptions. Add a subscription for the application and give it a name. This example is simply called WpfApplicationV1. Fill in the Company ID and Application ID GUIDs noted when instrumenting the application. Bind it to the TFS instance and Team Project added previously. Optionally select an area path from the Team Project. The default rule should be fine; just be sure to change the threshold from 80 to a value appropriate for your needs. Close the console and select Save when prompted.
Deploy Your Application
You’re done, and the team can now deploy the application. When deploying, make sure to use the instrumented version that Dotfuscator created in the Dotfuscator directory, not the original build output.
Identify Problems
Now that the application is running out in the wild, teams will likely want to run the All Incidents query on a regular basis. Teams might also want to create a new query just for new incidents: one that filters on State = New and orders by created date in descending order. Run this query frequently to identify new issues.
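Such a query could also be expressed in Work Item Query Language (WIQL). The Python helper below only builds the query text and is a sketch under assumptions: the function name is hypothetical, and the work item type name `Incident` and state name `New` are taken from the article, so they should be checked against the provisioned Team Project.

```python
def new_incidents_wiql(project: str) -> str:
    """Build WIQL text for new Incident work items, newest first."""
    safe_project = project.replace("'", "''")  # escape single quotes for WIQL
    return (
        "SELECT [System.Id], [System.Title], [System.CreatedDate] "
        "FROM WorkItems "
        f"WHERE [System.TeamProject] = '{safe_project}' "
        "AND [System.WorkItemType] = 'Incident' "
        "AND [System.State] = 'New' "
        "ORDER BY [System.CreatedDate] DESC"
    )


# "Agile" is the sample Team Project name used earlier in the walk-through.
query = new_incidents_wiql("Agile")
```

The resulting text can be saved as a query under the Analytics folder so that it appears alongside All Incidents in Team Explorer.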
Triage the Incidents
As incidents go into triage, move their state from New to Active. This removes them from your new-incidents query and makes it easier to detect new issues. Assign the incidents to other developers to fix and test. Your team can use the new Incident work item type to understand exactly where the error is occurring and at what frequency. They can then contact the individual customers who have reported the problem and get more information about the error. Once the fix is ready, deploy the new version of the application and close the incident in TFS. As new incidents come in, check to ensure they’re not regressions of previous incidents.
PA for TFS Community Edition is included with Visual Studio 2012; give it a try. Consider upgrading to the Professional Edition for the ability to:
Embrace the new world of DevOps and continuous delivery, while leveraging PA for TFS to help teams connect real-world usage to development practices.
Chris Kinsman is currently working for a startup, EveryMove, in the Seattle area. He was previously Vice President of Development for Vertafore, Inc. and Chief Technology Officer for DevX.com. He has in the past served as a conference chair for VBITS and Visual Studio Live! and is currently a Microsoft Regional Director based in Redmond, Washington.