The Case for Software Factories
Summary: Briefly presents the motivation for Software Factories, a methodology developed at Microsoft. A Software Factory is a development environment configured to support the rapid development of a specific type of application. Software Factories are just a logical next step in the continuing evolution of software development methods and practices. However, they promise to change the character of the software industry by introducing patterns of industrialization.
Scaling Up Software Development
Facing the Changes Ahead, Again
Innovation Curves and Paradigm Shifts
Raising the Level of Abstraction
Industrializing Software Development
Can Software Be Industrialized?
Economies of Scale and Scope
What Will Industrialization Look Like?
Software development, as currently practiced, is slow, expensive and error prone, often yielding products with large numbers of defects, causing serious problems of usability, reliability, performance, security and other qualities of service.
According to the Standish Group [Sta94], businesses in the United States spend around $250 billion on software development each year on approximately 175,000 projects. Only 16 percent of these projects finish on schedule and within budget. Another 31 percent are cancelled, mainly due to quality problems, for losses of about $81 billion. Another 53 percent exceed their budgets by an average of 189 percent, for losses of about $59 billion. Projects reaching completion deliver an average of only 42 percent of the originally planned features.
These numbers confirm objectively what we already know by experience, which is that software development is labor intensive, consuming more human capital per dollar of value produced than we expect from a modern industry.
Of course, despite these shortcomings, the products of software development obviously provide significant value to consumers, as demonstrated by a long-term trend of increasing demand. This does not mean that consumers are perfectly satisfied, either with the software we supply, or with the way we supply it. It merely means that they value software, so much so that they are willing to suffer large risks and losses in order to reap the benefits it provides. While this state of affairs is obviously not optimal, as demonstrated by the growing popularity of outsourcing, it does not seem to be forcing any significant changes in software development methods and practices industry-wide.
Only modest gains in productivity have been made over the last decade, the most important perhaps being byte-coded languages, patterns, and agile methods. Apart from these advances, we still develop software the way we did ten years ago. Our methods and practices have not really changed much, and neither have the associated costs and risks.
This situation is about to change, however. Total global demand for software is projected to increase by an order of magnitude over the next decade, driven by new forces in the global economy, like the emergence of China and the growing role of software in social infrastructure; by new application types, like business integration and medical informatics; and by new platform technologies, like Web services, mobile devices, and smart appliances.
Without comparable increases, total software development capacity seems destined to fall far short of total demand by the end of the decade. Of course, if market forces have free play, this will not actually happen, since the enlightened self-interest of software suppliers will provide the capacity required to satisfy the demand.
What will change, then, to provide the additional capacity? It does not take much analysis to see that software development methods and practices will have to change dramatically.
Since the capacity of the industry depends on the size of the competent developer pool and the productivity of its members, increasing industry capacity requires either more developers using current methods and practices, or a comparable number of developers using different methods and practices.
While the culture of apprenticeship cultivated over the last ten years seems to have successfully increased the number of competent developers and average developer competency, apprenticeship is not likely to equip the industry to satisfy the expected level of demand for at least two reasons:
- We know from experience that there will never be more than a few extreme programmers. The best developers are up to a thousand times more productive than the worst, but the worst outnumber the best by a similar margin [Boe81].
- As noted by Brooks [Bro95], adding people to a project eventually yields diminishing marginal returns. The amount of capacity gained by recruiting and training developers will fall off asymptotically.
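The second point can be made concrete with a rough sketch. The pairwise count n(n-1)/2 is the communication-path formula Brooks uses; the `net_capacity` model, the function names, and the overhead parameter are illustrative assumptions, not measurements.

```python
def communication_paths(team_size: int) -> int:
    """Distinct pairs of people who may need to coordinate: n(n-1)/2."""
    return team_size * (team_size - 1) // 2


def net_capacity(team_size: int,
                 output_per_developer: float = 1.0,
                 overhead_per_path: float = 0.01) -> float:
    """Toy model (assumption): gross output minus a fixed cost per path."""
    gross = team_size * output_per_developer
    return gross - overhead_per_path * communication_paths(team_size)


def marginal_gain(team_size: int) -> float:
    """Capacity added by hiring one more developer into a team of this size."""
    return net_capacity(team_size + 1) - net_capacity(team_size)


if __name__ == "__main__":
    # Print team size, number of paths, and the gain from one more hire.
    for size in (5, 20, 50, 100):
        print(size, communication_paths(size), round(marginal_gain(size), 2))
```

In this toy model each additional hire contributes less than the one before. The exact shape of the curve depends entirely on the assumed overhead, so the sketch illustrates the direction of the effect rather than its size.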
The solution must therefore involve changing our methods and practices. We must find ways to make developers much more productive.
As an industry, we have collectively been here before. The history of software development is an assault against complexity and change, with gains countered by losses, as progress creates increasing demand. While great progress has been made in a mere half century, it has not been steady. Instead, it has followed the well-known pattern of innovation curves, as illustrated in Figure 1 [Chr97].
Figure 1. Innovation Curves
Typically, a discontinuous innovation establishes a foundation for a new generation of technologies. Progress on the new foundation is initially rapid, but then gradually slows down, as the foundation stabilizes and matures. Eventually, the foundation loses its ability to sustain innovation, and a plateau is reached. At that point, another discontinuous innovation establishes another foundation for another generation of new technologies, and the pattern repeats. Kuhn calls these foundations paradigms, and the transitions between them paradigm shifts [Kuh70]. Paradigm shifts occur at junctures where change is required to sustain forward momentum. We are now at such a juncture.
Historically, paradigm shifts have raised the level of abstraction for developers, providing more powerful concepts for capturing and reusing knowledge in platforms and languages. On the platform side, for example, we have progressed from batch processing, through terminal/host, client/server, personal computing, multi-tier systems and enterprise application integration, to asynchronous, loosely coupled services. On the language side, we have progressed from numerical encoding, through assembly, structured, and object-oriented languages, to byte-coded languages and patterns, which can be seen as language-based abstractions. Smith and Stotts summarize this progression eloquently [SS02]:
The history of programming is an exercise in hierarchical abstraction. In each generation, language designers produce constructs for lessons learned in the previous generation, and then architects use them to build more complex and powerful abstractions.
They also point out that new abstractions tend to appear first in platforms, and then migrate to languages. We are now at a point in this progression where language-based abstractions have lagged behind platform-based abstractions for a long time. Or, to put it differently, we are now at a point where tools have lagged behind platforms for a long time. Using the latest generation of platform technology, for example, we can now automate processes spanning multiple businesses located anywhere on the planet using services composed by orchestration, but we still hand-stitch every one of these applications, as if it were the first of its kind. We build large abstract concepts like insurance claims and security trades from small, concrete concepts like loops, strings, and integers. We carefully and laboriously arrange millions of tiny interrelated pieces of source code and resources to form massively complex structures. If the semiconductor industry used a similar approach, it would build the massively complex processors that power these applications by hand-soldering transistors. Instead, it assembles predefined components called Application Specific Integrated Circuits (ASICs) using tools like the ones shown in Figure 2, and then generates the implementations.
Figure 2. ASIC Based Design Tools [7]
Can't we automate software development in a similar way? Of course we can, and in fact we already have. Database management systems, for example, automate data access using SQL, providing benefits like data integration and independence that make data-driven applications easier to build and maintain. Similarly, widget frameworks and WYSIWYG editors make it easier to build and maintain graphical user interfaces, providing benefits like device independence and visual assembly. Looking closely at how this was done, we can see a recurring pattern.
- After developing a number of systems in a given problem domain, we identify a set of reusable abstractions for that domain, and then we document a set of patterns for using those abstractions.
- We then develop a runtime, such as a framework or server, to codify the abstractions and patterns. This lets us build systems in the domain by instantiating, adapting, configuring, and assembling components defined by the runtime.
- We then define a language and build tools that support the language, such as editors, compilers, and debuggers, to automate the assembly process. This helps us respond faster to changing requirements, since part of the implementation is generated, and can be easily changed.
This is the well-known Language Framework pattern described by Roberts and Johnson [RJ96]. A framework can reduce the cost of developing an application by an order of magnitude, but using one can be difficult. A framework defines an archetypical product, such as an application or subsystem, which can be completed or specialized in varying ways to satisfy variations in requirements. Mapping the requirements of each product variant onto the framework is a non-trivial problem that generally requires the expertise of an architect or senior developer. Language-based tools can automate this step by capturing variations in requirements using language expressions, and generating framework completion code.
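A minimal sketch of the three steps may help. Everything in it is hypothetical and chosen only for illustration: the `Form` framework class, the dictionary-based specification standing in for a language, the `InsuranceClaim` example, and the generator functions. It is not the implementation described in [RJ96].

```python
class Form:
    """Framework base class: the archetypical product, completed by subclasses."""

    # Maps field name -> (expected type, required flag); set by completion code.
    fields = {}

    def validate(self, data):
        """Return a list of problems found in `data`; an empty list means valid."""
        errors = []
        for name, (expected_type, required) in self.fields.items():
            if name not in data:
                if required:
                    errors.append(f"missing: {name}")
            elif not isinstance(data[name], expected_type):
                errors.append(f"wrong type: {name}")
        return errors


# The "language": a declarative spec that may use only these type names.
ALLOWED_TYPES = {"str", "int", "float"}


def generate_completion_code(spec):
    """Translate a spec into Python source that completes the framework."""
    if not spec["name"].isidentifier():
        raise ValueError(f"invalid form name: {spec['name']!r}")
    lines = [f"class {spec['name']}(Form):", "    fields = {"]
    for field in spec["fields"]:
        if field["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {field['type']!r}")
        required = bool(field.get("required", True))
        lines.append(f"        {field['name']!r}: ({field['type']}, {required}),")
    lines.append("    }")
    return "\n".join(lines)


def build_form(spec):
    """Execute the generated source and return the new Form subclass."""
    namespace = {"Form": Form}
    exec(generate_completion_code(spec), namespace)
    return namespace[spec["name"]]


# A requirements variation captured as a language expression (hypothetical).
CLAIM_SPEC = {
    "name": "InsuranceClaim",
    "fields": [
        {"name": "policy_id", "type": "str"},
        {"name": "amount", "type": "float"},
        {"name": "notes", "type": "str", "required": False},
    ],
}

if __name__ == "__main__":
    print(generate_completion_code(CLAIM_SPEC))
    claim_form = build_form(CLAIM_SPEC)()
    print(claim_form.validate({"policy_id": "P-1", "amount": 120.0}))
```

A change in requirements, such as a new mandatory field, becomes a one-line change to the specification, and the framework completion code is regenerated rather than rewritten by hand.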
Other industries increased their capacity by moving from craftsmanship, where whole products are created from scratch by individuals or small teams, to manufacturing, where a wide range of product variants is rapidly assembled from reusable components created by multiple suppliers, and where machines automate rote or menial tasks. They standardized processes, designs, and packaging, using product lines to facilitate systematic reuse, and supply chains to distribute cost and risk. Some are now capable of mass customization, where product variants are produced rapidly and inexpensively on demand to satisfy the specific requirements of individual customers.
Analogies between software and physical goods have been hotly debated. Can these patterns of industrialization be applied to the software industry? Aren't we somehow special, or different from other industries because of the nature of our product? Peter Wegner sums up the similarities and contradictions this way [Weg78]:
Software products are in some respects like tangible products of conventional engineering disciplines such as bridges, buildings and computers. But there are also certain important differences that give software development a unique flavor. Because software is logical not physical, its costs are concentrated in development rather than production, and since software does not wear out, its reliability depends on logical qualities like correctness and robustness, rather than physical ones like hardness and malleability.
Some of the discussion has involved an "apples to oranges" comparison between the production of physical goods, on the one hand, and the development of software, on the other. The key to clearing up the confusion is to understand the differences between production and development, and between economies of scale and scope.
In order to provide return on investment, reusable components must be reused enough to more than recover the cost of their development, either directly through cost reductions, or indirectly, through risk reductions, time-to-market reductions, or quality improvements. Reusable components are financial assets from an investment perspective. Since the cost of making a component reusable is generally quite high, profitable levels of reuse are unlikely to be reached by chance. A systematic approach to reuse is therefore required. This generally involves identifying a domain in which multiple systems will be developed, identifying recurring problems in that domain, developing sets of integrated production assets that solve those problems, and then applying them as systems are developed in that domain.
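The break-even reasoning can be sketched with a toy cost model. The function name, the cost parameters, and the figures in the example are assumptions made for illustration; they do not come from the cited sources.

```python
import math


def break_even_uses(one_off_cost, reusable_cost, cost_per_reuse):
    """Smallest number of uses at which building once and reusing is no more
    expensive than building from scratch each time.

    Solves reusable_cost + n * cost_per_reuse <= n * one_off_cost for n.
    Returns None when reuse saves nothing per use and never pays off.
    """
    saving_per_use = one_off_cost - cost_per_reuse
    if saving_per_use <= 0:
        return None
    return math.ceil(reusable_cost / saving_per_use)


if __name__ == "__main__":
    # Hypothetical figures: a component that costs 100 to build once,
    # 300 to build in reusable form, and 25 to integrate on each use.
    print(break_even_uses(one_off_cost=100, reusable_cost=300, cost_per_reuse=25))
```

With these hypothetical figures the investment is recovered at the fourth use, which is why a domain expected to yield only two or three systems may not justify the cost of making a component reusable.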
Systematic reuse can yield economies of both scale and scope. These two effects are well known in other industries. While both reduce time and cost, and improve product quality, by producing multiple products collectively, rather than individually, they differ in the way they produce these benefits.
Economies of scale arise when multiple identical instances of a single design are produced collectively, rather than individually, as illustrated in Figure 3. They arise in the production of things like machine screws, when production assets like machine tools are used to produce multiple identical product instances. A design is created, along with initial instances, called prototypes, by a resource-intensive process, called development, performed by engineers. Many additional instances, called copies, are then produced by another process, called production, performed by machines and/or low-cost labor, in order to satisfy market demand.
Figure 3. Economies of Scale
Economies of scope arise when multiple similar but distinct designs and prototypes are produced collectively, rather than individually, as illustrated in Figure 4. In automobile manufacturing, for example, multiple similar but distinct automobile designs are often developed by composing existing designs for subcomponents, such as the chassis, body, interior, and drive train, and variants or models are often created by varying features, such as engine and trim level, in existing designs. In other words, the same practices, processes, tools, and materials are used to design and prototype multiple similar but distinct products. The same is true in commercial construction, where multiple bridges or skyscrapers rarely share a common design. However, an interesting twist in commercial construction is that usually only one or two instances are produced from every successful design, so economies of scale are rarely, if ever, realized. In automobile manufacturing, where many identical instances are usually produced from successful designs, economies of scope are complemented by economies of scale, as illustrated by the copies of each prototype shown in Figure 4.
Figure 4. Economies of Scope
Of course, there are important differences between software and either automobile manufacturing or commercial construction, but software resembles each of them at times:
- In markets like the consumer desktop, where copies of products like operating systems and productivity applications are mass produced, software exhibits economies of scale, like automobile manufacturing.
- In markets like the enterprise, where business applications developed for competitive advantage are seldom, if ever, mass produced, software exhibits only economies of scope, like commercial construction.
We can now see where apples have been compared with oranges. Production in physical industries has been naively compared with development in software. It makes no sense to look for economies of scale in development of any kind, whether of software or of physical goods. We can, however, expect the industrialization of software development to exploit economies of scope.
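The difference between the two effects can be sketched as a toy cost model. The function names and all figures are illustrative assumptions: economies of scale spread one development cost over many copies, while economies of scope spread the cost of shared assets over many distinct designs.

```python
def unit_cost_with_scale(development_cost, production_cost_per_copy, copies):
    """Cost per copy when one design is developed once and copied many times."""
    return development_cost / copies + production_cost_per_copy


def total_cost_with_scope(shared_asset_cost, cost_per_variant, designs):
    """Cost of several distinct designs built on one set of shared assets."""
    return shared_asset_cost + cost_per_variant * designs


def total_cost_standalone(standalone_cost, designs):
    """Cost of the same designs when each is developed from scratch."""
    return standalone_cost * designs


if __name__ == "__main__":
    # Scale: the development cost per copy shrinks as copies increase.
    for copies in (1, 100, 10_000):
        print(copies, unit_cost_with_scale(1_000, 1, copies))

    # Scope: shared assets pay off only after enough distinct designs.
    for designs in (5, 10, 20):
        print(designs,
              total_cost_with_scope(500, 40, designs),
              total_cost_standalone(100, designs))
```

In the scope example the shared assets cost more than building five designs independently and less than building ten or more, which matches the earlier point that reuse must be systematic to be profitable.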
Assuming that industrialization can occur in the software industry, what will it look like? We cannot know with certainty until it happens, of course. We can, however, make educated guesses based on the way the software industry has evolved, and on what industrialization has looked like in other industries. Clearly, software development will never be reduced to a purely mechanical process tended by drones. On the contrary, the key to meeting global demand is to stop wasting the time of skilled developers on rote and menial tasks. We must find ways to make better use of precious resources than spending them on the manual construction of end products that will require maintenance or even replacement in only a few short months or years, when the next major platform release appears, or when changing market conditions make business requirements change, whichever comes first.
One way to do this is to give developers ways to encapsulate their knowledge as reusable assets that others can apply. Is this far-fetched? Patterns already demonstrate limited but effective knowledge reuse. The next step is to move from documentation to automation, using languages, frameworks, and tools to automate pattern application.
Semiconductor development offers a preview of what software development will look like when industrialization has occurred. This is not to say that software components will be as easy to assemble as ASICs any time soon; ASICs are the highly evolved products of two decades of innovation and standardization in packaging and interface technology. On the other hand, it might take less than 20 years. We have the advantage of dealing only with bits, while the semiconductor industry had the additional burden of engineering the physical materials used for component implementation. At the same time, the ephemeral nature of bits creates challenges like the protection of digital property rights, as seen in the film and music industries.
This article has described the inability of the software industry to meet projected demand using current methods and practices. A great many issues are discussed only briefly here, no doubt leaving the reader wanting evidence or more detailed discussion. Much more detailed discussion is provided in the book Software Factories: Assembling Applications with Patterns, Models, Frameworks and Tools, by Jack Greenfield and Keith Short, from John Wiley and Sons. More information can also be found at Software Factories in the MSDN Library, and at http://www.softwarefactories.com/, including articles that describe the chronic problems preventing a transition from craftsmanship to manufacturing, the critical innovations that will help the industry overcome those problems, and the Software Factories methodology, which integrates the critical innovations.
Copyright © 2004 by Jack Greenfield. Portions copyright © 2003 by Jack Greenfield and Keith Short, and reproduced by permission of Wiley Publishing, Inc. All rights reserved.
1. [Boe81] B. Boehm. Software Engineering Economics. Prentice Hall PTR, 1981.
2. [Bro95] F. Brooks. The Mythical Man-Month. Addison-Wesley, 1995.
3. [Chr97] C. Christensen. The Innovator's Dilemma. Harvard Business School Press, 1997.
4. [Kuh70] T. Kuhn. The Structure of Scientific Revolutions. The University of Chicago Press, 1970.
5. [RJ96] D. Roberts and R. Johnson. Evolving Frameworks: A Pattern Language for Developing Object-Oriented Frameworks. Proceedings of Pattern Languages of Programs, Allerton Park, Illinois, September 1996.
6. [SS02] J. Smith and D. Stotts. Elemental Design Patterns – A Link Between Architecture and Object Semantics. Proceedings of OOPSLA 2002.
7. This illustration featuring Virtuoso® Chip Editor and Virtuoso® XL Layout Editor has been reproduced with the permission of Cadence Design Systems, Inc. © 2003. Cadence Design Systems, Inc. All rights reserved. Cadence and Virtuoso are the registered trademarks of Cadence Design Systems, Inc.
8. [Sta94] The Standish Group. The Chaos Report. http://www.standishgroup.com/sample_research/PDFpages/chaos1994.pdf
9. [Weg78] P. Wegner. Research Directions in Software Technology. Proceedings of the 3rd International Conference on Software Engineering, 1978.
About the author
Jack Greenfield is an Architect for Enterprise Frameworks and Tools at Microsoft. He was previously Chief Architect, Practitioner Desktop Group, at Rational Software Corporation, and Founder and CTO of InLine Software Corporation. At NeXT, he developed the Enterprise Objects Framework, now called Apple Web Objects. A well known speaker and writer, he also contributed to UML, J2EE, and related OMG and JSP specifications. He holds a B.S. in Physics from George Mason University. Jack can be reached at firstname.lastname@example.org.
This article was published in the Architecture Journal, a print and online publication produced by Microsoft. For more articles from this publication, please visit the Architecture Journal website.