Measuring Success with Software Factories
by Marcel de Vries
Summary: Software factories and Visual Studio Team System (VSTS) can be used together to improve quality, predictability, and productivity of software projects. Using the VSTS data warehouse and reporting capabilities, the software-factory builder can determine reliably which aspects of product development need improvement and how to modify the software factory to improve them. The author assumes you already know software-factory nomenclature and concepts like viewpoints, views, factory schema, and factory template.
Building software today is hard. Systems get more complex and larger every day. We face rapidly changing technology while trying to keep pace with the demands of business customers who want us to write more software better and faster. Is it really possible to be more productive while producing better quality software? Can greater productivity be sustained across maintenance and upgrades without degraded quality or significant rewriting?
Many of these problems arise because we learn too little from the projects we have done. Few teams regularly reuse solutions or keep track of the things that went well and the things that went wrong. As a result, there is not enough knowledge transfer between projects. Lessons already learned are relearned by new developers. Since most projects fail to deliver on time and within budget, we can see that we also have a predictability problem.
It is possible to build software on time, within budget, and with adequate quality. However, there must be an organizational awareness that the current approach to building software is grossly inefficient. Without awareness of existing problems there will be no drive to improve. To start building software systems predictably, we must make a cultural change. We need to make it easier for practitioners to know what to do, when to do it, why to do it, and how to do it; and we must automate more of the rote and/or menial aspects of their work.
What we are talking about is industrializing software development, applying techniques long proven in other industries to our own industry, in the hope of making things better for our customers and ourselves.
As it turns out, the factory schema provides a useful mechanism for organizing metrics. Since each viewpoint targets a specific aspect of the software-development process, we can use viewpoints to define targeted measures of productivity and quality. Using those measures, we can gather data for specific aspects of the software-development process. By analyzing the data, we can then determine which viewpoints need to improve, how to improve them, and what we can gain by improving them.
To implement this approach, we need a way to express product size, time and budget spent, and product quality to be able to quantify predictability, productivity, and quality for each viewpoint. By measuring each viewpoint, as well as overall factory performance, we can determine how each viewpoint affects overall factory performance, and therefore how much to invest in better supporting a given viewpoint.
For example, we might provide simple guidelines for viewpoints that do not significantly affect overall efficiency, and sophisticated domain-specific language (DSL)–based designers for viewpoints that do. This process helps us get the best return on investment in terms of predictability, productivity, and quality. It helps us compare the results to the goals set initially before we started factory development.
Figure 1. An area definition that reflects viewpoints
One of the aspects of software development we need to improve is productivity. However, to quantify productivity we need a metric that we can use to express productivity in terms of software product volume built in a span of time. When we are able to predict the size of the system and to measure product-size growth during development, we can better predict the time required to complete the project, and we can measure productivity in terms of hours spent per unit of product size. By measuring the growth and size, we are able to identify differences between the actual and planned values and to start analyzing and managing the differences when they become apparent.
At this point, you may be wondering how we can predict product size and growth with enough accuracy to make this kind of measurement and analysis useful. It certainly does not seem possible if we are developing arbitrary applications one project at a time. If we are using a software factory, however, we have two advantages that significantly improve predictability.
First, we are developing a member of a specific family of products with known characteristics, not just an arbitrary application. Because a factory allows us to describe a product family and its salient features—and, more importantly, to refine that description as experience is gained over the course of multiple projects—we know much more about an application being developed using a factory than we do about an arbitrary application.
Second, we are developing the application by applying prescriptive guidance supplied by the factory. By standardizing the way we do some things, a factory tends to remove gratuitous variation from the development process, making it much more likely that product size and growth will follow similar patterns from one application to the next.
If we want a metric that expresses size and productivity, we need an objective quantification, which can be achieved by using a standardized method. One such method is functional size measurement as defined in the ISO 24570 standard. This ISO standard uses function points to express the size of a software system based on its functional specifications. It specifies a method for measuring the functional size of software, gives guidelines for determining the components of functional size, specifies how to calculate functional size from the measurement results, and gives guidelines for applying the method. Function points can be considered a "gross metric" for determining the size of a system and for estimating effort and schedule. During development, this metric can be used to determine whether the project requires more or less work relative to other, similar projects.
Function-point analysis leverages the accumulated knowledge of building database-oriented applications and can be applied whenever we build a system that manipulates data in a database. Function points are calculated from the estimated number of tables our application will have and the number of data-manipulating functions, such as data-retrieval and data-update functions. From these counts we can calculate the number of function points that expresses the size of our product.
Figure 2. Team Foundation Server's data-warehouse architecture
Once we have expressed our estimated product size, we can learn how much time it takes to implement one function point or even use historical data already available to make predictions on how much time it should cost to implement a function point. A software factory can influence the time spent to implement a function point (productivity), the number of defects per function point (quality), and the accuracy of our estimations.
For example, suppose we applied function-point analysis and determined that the system we are going to build has an estimated size of 500 function points. As we build the system, we find that it takes 6,500 hours to complete. From that result, we can express our productivity as 13 hours (h) per function point (fp).
If we also keep track of the defects we found in the product during development, user acceptance test, and production, we can also express that number as a quality metric. Suppose we found 500 bugs during development, 50 during the acceptance test, and 5 after going into production. We could express this calculation as having 1 defect/fp during development, 0.1 defect/fp at acceptance test, and 0.01 defect/fp in production.
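As a minimal sketch, the arithmetic behind these two metrics can be written down directly; the figures are the ones from the example above:

```python
def productivity_hours_per_fp(total_hours: float, size_fp: float) -> float:
    """Productivity metric: hours spent per function point."""
    return total_hours / size_fp

def defect_density(defects: int, size_fp: float) -> float:
    """Quality metric: defects found per function point."""
    return defects / size_fp

size_fp = 500  # estimated product size from function-point analysis

# 6,500 hours spent -> 13 h/fp
print(productivity_hours_per_fp(6500, size_fp))

# Defect counts per phase from the example: 1, 0.1, and 0.01 defects/fp
for phase, defects in [("development", 500), ("acceptance test", 50), ("production", 5)]:
    print(phase, defect_density(defects, size_fp))
```

The same two functions can be evaluated per viewpoint as well as for the factory as a whole, which is what makes per-viewpoint comparison possible later in the article.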
It gets really interesting when many of these defects can be traced back to a specific viewpoint of your factory. From that discovery we learn that the viewpoint has a high contribution to the overall number of defects, and we can focus our attention and analyze what might need improvement within this viewpoint. From this kind of analysis, we can determine which viewpoints to improve and how to improve them to reduce the number of defects the next time the factory is used.
The great thing about quantifying the number of defects against a metric such as function points is that we can now set goals for the improvements we want to achieve with our investments. For example, we might set a goal that the number of defects per function point for the "front-end applications" viewpoint drop by 20 percent. Performing defect and function-point analysis on a per-viewpoint basis gives us a powerful tool for improving our product-development process, because it helps us determine where the bottlenecks lie, and therefore where and how to invest to obtain better results.
Figure 3. The structure of a measurement construct
When we start using function points, we can initially base our first estimations on historical data from similar organizations, as reported in the literature. Historical data is useful because it accounts for organizational influences, both recognized and unrecognized. The same idea applies to the use of historical data within the software factory: individual projects developed using a software factory will share a lot with other projects developed using the same factory. Even if we do not have historical data from past projects, we can collect data from our current project and use it as a basis for estimating the remainder of the project. Our goal should be to switch from organizational or industry-average data to factory and project data as quickly as possible (see Resources).
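That last idea can be sketched as follows; the numbers here are illustrative assumptions, not figures from the article. Once part of the project is complete, the hours-per-function-point rate observed so far can be projected over the remaining size:

```python
def estimate_remaining_hours(completed_fp: float, hours_spent: float,
                             total_fp: float) -> float:
    """Project remaining effort using the hours-per-fp rate observed so far."""
    observed_rate = hours_spent / completed_fp   # h/fp on this project to date
    return (total_fp - completed_fp) * observed_rate

# Hypothetical: 120 of 500 estimated function points done in 1,560 hours,
# so the observed rate is 13 h/fp and 380 fp remain.
print(estimate_remaining_hours(120, 1560, 500))  # 4940.0 hours remaining
```

As the project progresses, the observed rate replaces the industry-average rate used in the original estimate, which is exactly the switch from external data to project data described above.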
Now, consider how to enable our product development team to use the factory to create the required work products. This ability starts with a development environment that supports the whole product life cycle from birth to discontinuation, such as Visual Studio Team System (VSTS). Using VSTS is a key to enabling our product development teams to benefit from the approach described previously.
Currently, VSTS does not understand software factories. However, because VSTS is so configurable and extensible, we can set it up manually to support a software factory by mapping various parts of the factory schema onto various configuration elements or extension points.
Recall that a software factory contains a schema that describes its organization. The factory schema defines a set of interrelated viewpoints, and each viewpoint describes related work products, activities, and assets for users in a specific role. We can use this information to configure VSTS for developing applications.
A viewpoint can be mapped to a concept that VSTS calls an area in one or more iterations. The role associated with a viewpoint can be mapped to one or more VSTS project roles. In practice, multiple viewpoint roles will probably be mapped to a single VSTS project role. The activities defined by a viewpoint can be added as work items in those areas at project creation, and directly assigned to the appropriate role. They can also be documented by customizing the process guidance, and custom work items can be created to track them and to link them to work products.
Content assets, such as guidelines, patterns, and templates can be added to the project portal document libraries. Executable assets, such as tools and class libraries, can be placed in the version control system. To measure and improve the performance of our factory, we can add metrics to the VSTS data warehouse.
The keys to configuring VSTS are the Project Creation wizard and the process template. The Project Creation wizard is a tool for creating projects in Team Foundation Server. It uses a user-selected file called a process template to configure the server for the project. The template contains several sections, each describing how a specific part of the server will be configured. With the process template, for example, we can define work item types, areas, iterations, and roles and assign the appropriate rights to each role; customize version control; set up the project portal; and do many other things to customize the development environment and the development process.
VSTS uses work items to track the work that needs to be done to create a given product. Work items describe the work that needs to be done, identify the party accountable for that work at a given point in time, and can be of different types designed to describe different kinds of work. For example, a bug can be described by a work item of type Defect that contains information pertinent to fixing a bug, such as the description of the bug, reproduction steps, estimated time to analyze or fix the bug, and so on. Work item types are created or modified by changing the XML definitions loaded into the server and used at the time the project is created. They can also be modified after project setup.
Work items can be linked to a so-called area of a project and to an iteration. Areas provide a way to book the work on a specific part of the solution that is of interest when we want to run reports on the data accumulated in the data warehouse. Areas in VSTS closely match the concept of viewpoints in a software factory, as both represent areas of interest or concern.
When we map the areas used to track work items onto our factory viewpoints, the data accumulated for each area provides the productivity and quality measures for the corresponding viewpoint.
One very good starting point in defining viewpoints for a factory is a set of common viewpoints that tends to appear in many factories. Two of those common viewpoints that prove particularly useful in configuring VSTS are System Engineering and Project Engineering. In the System Engineering area we can make a subtree containing the architectural viewpoints that describe salient parts of our system. This description will help us identify which parts of the system have the most significant impact on productivity (time spent) and quality (number of defects). The Project Engineering area is also interesting because it can help us find anomalies in the way activities have been formalized in the project, and it can help us decide whether or not to improve the process definition at certain points. Figure 1 shows an example of areas and iterations that reflects the schema for a simple factory that builds service-oriented administrative applications with multiple front ends.
The area tree can become pretty deep if we try to incorporate every viewpoint defined by our factory. It is very important that we do not explode the tree into too many levels. Keep in mind that it needs to be simple enough for team members to easily identify the areas to which work items should be linked. The more deeply nested the tree, the harder it becomes to find the right area for a given work item. If it becomes too hard, developers will simply book work items near the root of the hierarchy, defeating the purpose of creating a deeply nested tree.
Figure 4. Base and derived measures for software size growth
The Team System data warehouse keeps track of all kinds of information about the development of the solution. One section of the data warehouse holds information about work items, which is interesting from a factory perspective, as described earlier. Other sections hold information about tests, daily builds, and other VSTS features. The data warehouse can be extended in two ways to support measurement.
First, we can change the fields kept in the warehouse for a specific work item type by modifying the work item type definition, adding or changing its fields so that they become new facts or dimensions in the warehouse. When a field is marked as reportable in the work item type definition, it is added dynamically to the data warehouse. Of course, if we want to report on these additional fields, we will also need to create reports for the data and upload them to the reporting server to make them accessible to other team members.
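As a sketch of what this looks like in practice, a field marked as reportable in a work item type definition might resemble the following; the field name and reference name are hypothetical, and the exact schema varies across Team Foundation Server versions:

```xml
<!-- Hypothetical field added to a work item type definition.
     reportable="measure" publishes the field as a fact in the warehouse;
     reportable="dimension" would publish it as a dimension instead. -->
<FIELD name="Function Points" refname="Factory.Size.FunctionPoints"
       type="Integer" reportable="measure">
  <HELPTEXT>Estimated size of this work item in function points</HELPTEXT>
</FIELD>
```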
Figure 5. Graphical representation of a planned-versus-actual indicator for software growth
Second, we can incorporate data generated by custom tools. If our factory provides custom tools that generate data, and we want to use the data in the data warehouse, we can add a custom data-warehouse adapter to the Team Foundation Server (see Figure 2).
For example, to measure the size of each solution in terms of lines of code, we can build a custom tool that counts the lines of code in each file, together with a custom data-warehouse adapter. We also add a step to the daily build that runs the tool over the sources in the current solution and writes the result to a file. The adapter then picks up the information from that file and calls the data-warehouse object model provided by Team System to add it to the data warehouse. The custom data can then be viewed using custom reports.
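A minimal sketch of the line-counting step of that daily build might look like this; the source-tree path, file extensions, and output file name are assumptions, and in the real setup a custom data-warehouse adapter would read the output file and push the value into the warehouse:

```python
import os

def count_source_lines(root: str, extensions=(".cs", ".vb")) -> int:
    """Count non-blank lines in all source files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total += sum(1 for line in f if line.strip())
    return total

if __name__ == "__main__":
    # Run by the daily build; the adapter later picks up loc-result.txt.
    size = count_source_lines("Solution")
    with open("loc-result.txt", "w") as out:
        out.write(str(size))
```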
So far, we have looked at how to define a factory, how to refine a factory using measurement and analysis, and how to configure VSTS to support a software factory. Before we can put all these insights together to build and refine software factories with VSTS, we need to know one more thing: how to collect the right information.
What we need are formal definitions of the relationships between the things we are measuring and the information we need to support refinement. Those definitions are called measurement constructs. Measurement constructs are combinations of base measures, derived measures, and indicators. A measurement construct describes an information need, the relevant entities and attributes, the base and derived measures, the indicators, and the data collection procedure.
A base measure captures information about a single attribute of some software entity using a specified measurement method. A base measure is functionally independent of all other measures. A derived measure is defined as a function of two or more base and/or derived measures. A derived measure captures information about more than one attribute. An indicator is a measure that provides an estimate or evaluation by applying an analysis model to one or more base and/or derived measures to address specified information needs. Indicators are the basis for measurement analysis and decision making. Additional rules, models, and decision criteria may be added to the base measures, the derived measures, and the indicators. Figure 3 illustrates the structures of a measurement construct (see Resources).
Key terms on software measures and measurement methods have been defined in ISO/IEC 15939 on the basis of the ISO international vocabulary of metrology. The terms used in this discussion are derived from ISO 15939 and Practical Software Measurement (see Resources).
Use these steps to define a measurement construct that we can add to our Team Foundation Server data warehouse:
- Define and categorize information needs. To ensure that we measure the information we need, we must clearly understand our information needs and how they relate to what we measure. Experience shows that most information needs in software development fall into one of the seven categories defined by ISO 15939: schedule and progress, resources and cost, product size and stability, product quality, process performance, technology effectiveness, and customer satisfaction. An example of an information need in the product size and stability category might be: "Evaluate the size of a software product to appraise the original budget estimate."
These information needs can be used to measure the properties of a specific viewpoint in a software factory. They must be prioritized to ensure that the measurement program focuses on the needs with the greatest potential impact on the objectives we have defined. As described earlier, our primary objective is usually to identify the viewpoints whose improvement will yield the best return on our investments. Since viewpoints can nest, we can often roll up measurements to higher-level viewpoints. For example, if we had a User Interface viewpoint containing viewpoints like Web Part Development and User Authorization, we might roll up the customer satisfaction measurements from specific Web parts to the User Interface level.
- Define entities and attributes. The entities relevant to the information need, "Evaluate the size of a software product to appraise the original budget estimate," for example, might be a development plan or schedule, and a base-lined set of source files. The attributes might be function points planned for completion each period, source lines of code, and a language expressiveness table for the programming languages used.
- Define base measures and derived measures. Specifying the range and/or type of values that a base measure may take on helps to verify the quality of the data collected. In our example we have two base measures, the estimated size of the software product and the actual size. The scale for both base measures will range from zero to infinity. A derived measure captures information about more than one attribute (see Figure 4).
- Specify the indicators. To use an indicator correctly, its users must understand the relationship between the measure on which it is based and the trends it reveals. The measurement construct should therefore provide three things for each indicator. First, guidelines for analyzing the information; for our example: "An increasing software-size growth ratio indicates increasing risk to achieving cost and schedule budgets." Second, guidelines for making decisions based on the information; for our example: "Investigate when the software size growth ratio has a variance of greater than 20 percent." Third, an illustration of how to interpret the indicator; for our example, we might provide an illustration (see Figure 5) and describe it like this: "The indicator seems to suggest that the project production rate is ahead of schedule. However, on further investigation, it turns out that the actual size of one item was larger than planned because of missing requirements that were not identified until initial testing. Resource allocations, schedules, budgets, and test schedules and plans are all affected by this unexpected growth."
- Define the data-collection procedure. Now that we know how to relate the base measures to the information needs, we must define the data-collection procedure. The data-collection procedure specifies the frequency of data collection, the responsible individual, the phase or activity in which the data will be collected, verification and validation rules, the tools used for data collection, and the repository for the collected data.
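The size-growth measurement construct described in these steps can be sketched as follows: the base measures are planned and actual size, the derived measure is their ratio, and the indicator applies the 20 percent variance guideline quoted above (the function names are illustrative):

```python
def size_growth_ratio(actual_fp: float, planned_fp: float) -> float:
    """Derived measure: actual product size relative to planned size."""
    return actual_fp / planned_fp

def needs_investigation(actual_fp: float, planned_fp: float,
                        threshold: float = 0.20) -> bool:
    """Indicator decision criterion: variance greater than 20 percent."""
    return abs(size_growth_ratio(actual_fp, planned_fp) - 1.0) > threshold

print(needs_investigation(520, 500))   # False: only 4 percent over plan
print(needs_investigation(650, 500))   # True: 30 percent over plan
```

In the VSTS mapping, the two base measures would come from the warehouse (for example, from reportable work item fields or a custom adapter), and the indicator would be rendered as a planned-versus-actual report like the one in Figure 5.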
As described, each measurement construct needs to define at least the information needs, the entities and attributes, the base measures and derived measures, the indicators, and a data-collection procedure. To map this to the Team System data warehouse, we need to determine how to obtain the required information, either by modifying work item type definitions to add fields and to mark them as facts or dimensions, or by building a custom tool and a custom data-warehouse adapter that collects data produced by the tool. We also need to determine how to display the indicators, usually by creating custom SQL Server 2005 report server reports.
When we have mapped our factory onto VSTS, we can start using it to build solutions. It will guide our team in building the solutions, and it will provide us with information based on the measurement constructs we have defined and implemented.
Once we have a baseline in place with initial data, we can run a continuous software-factory development loop that analyzes the performance of each viewpoint, uses that information to determine what to improve, builds the improvements, and then repeats the process. This virtuous cycle can be used to target a variety of measures. A key part of this process is estimating the cost of making a given improvement, estimating the gain in productivity likely to result from it, and deciding whether the expected results justify the investment. After implementing the improvement and incorporating it into the factory, we can measure whether it met the goals we set in terms of the reduction in hours per function point (see Figure 6).
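The investment decision in that loop can be sketched as a simple comparison; the numbers and the expected rate reduction below are hypothetical assumptions, not figures from the article:

```python
def improvement_pays_off(cost_hours: float, current_rate: float,
                         expected_rate: float, fp_next_projects: float) -> bool:
    """True when the projected saving in hours exceeds the cost of building
    the improvement, given current and expected hours-per-fp rates."""
    saving = (current_rate - expected_rate) * fp_next_projects
    return saving > cost_hours

# Hypothetical: spending 400 hours on a DSL designer expected to cut the
# rate from 13 h/fp to 12 h/fp, with 1,000 fp planned in upcoming projects.
print(improvement_pays_off(400, 13, 12, 1000))   # True: saves 1,000 hours
```

After the improvement ships, the measured hours-per-function-point rate from the warehouse tells us whether the expected rate was actually achieved.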
Figure 6. An iteration loop for factory development
The motivation for this discussion is a desire to change the grossly inefficient way we build software today with "one-off," project-at-a-time development. Our customers see that we struggle to deliver projects on time, within budget, and with the expected features. We can help ourselves and our industry as a whole by capturing the knowledge we gain from experience and transferring it to other projects using software factories. We have seen how to define a factory and how to measure its performance in terms of productivity and quality. By quantifying the sizes of the products we build, measuring the time spent building them, and recording the number of defects found, we can describe the performance of our factories.
The mapping from the factory schema to VSTS is done using the customization and extensibility points in VSTS. We can set up VSTS by placing the assets identified by the factory schema in the version control repository or on the Team Foundation Server portal. We can use the portal to provide process guidance for the activities described by the factory schema. We can use the Project Creation wizard to arrange the initial setup of our factory, and we can use feature modeling to create a mapping that defines forms to add to the wizard. A large portion of the initial project setup is done using the process template, and we can modify the template to support our factories.
By implementing measurement constructs in the VSTS data warehouse, we can gather metrics that describe software-factory performance in terms of productivity and quality. Over time we can use these metrics to constantly improve our factories and to gain not only productivity and quality, but also to gain predictability by removing excess or gratuitous variability. The result of implementing software factories with VSTS is more successful projects and greater customer satisfaction.
Marcel de Vries is an IT architect at Info Support in the Netherlands and a Visual Studio Team System MVP. He is the lead architect of the Endeavour software factory, which targets the creation of service-oriented enterprise administrative applications used by many large enterprise customers of Info Support. Marcel is a well-known speaker at local events in the Netherlands, including developer days, and at Tech-Ed Europe. He also works part-time as a trainer for the Info Support knowledge center.
McGarry, John, David Card, Cheryl Jones, and Beth Layman. Practical Software Measurement: Objective Information for Decision Makers. Boston, MA: Addison-Wesley Professional, 2002.
McConnell, Steve. Software Estimation: Demystifying the Black Art. Redmond, WA: Microsoft Press, 2006.
Greenfield, Jack, Keith Short, Steve Cook, and Stuart Kent. Software Factories: Assembling Applications with Patterns, Models, Frameworks, and Tools. Indianapolis, IN: Wiley, 2004.
This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal website.