Software-Engineering Asset Management
Summary: When users ask for what they need, it's not enough to tell them what you have. People who look for the answer to a problem will (hopefully) understand what the problem is, but probably will not understand the solution yet. Software-engineering assets must be describable by the problems that they solve, not just by the method that they use to solve them or by descriptions of their structure.
Users of a software-engineering asset-management system (SEAMS) are not interested in what an asset looks like or how it is constructed. Instead, they are interested in which assets can solve the problem that they must solve, how to obtain those assets, how to decide which offer the best solution, and how to use those assets to solve that problem.
I worked on a reuse library-management system (RLMS) at Trey Research in the early 1990s; it was part of what we would call in the 21st century a SEAMS. The initial library content consisted of nearly 100 data structures that had been expertly implemented by Trey Research, and it was the core of their business. This collection of generic data types and related functionality was very successful, both technically and financially. The RLMS would build on this success; not only would it be used to manage this collection of data structures, but each customer who purchased the library and its RLMS would also be able to extend the RLMS to manage the reusable assets that the customer's own development organization produced.
During research, design, and prototyping prior to the final design of the RLMS, testers who used the RLMS to catalog their own reusable assets had difficulty finding components that fit their current need. They were having trouble matching their reusable software to the requirement specifications or design features of the system that they were assigned to implement. Free-text search (Internet-search style) was of little use, because the "words" being searched for were mostly programming-reserved words (such as return or cdr), names of variables with system prefixes (FGHMain), or boilerplate header comments that differed little from one file to the next. A free-text search for one of these common words or phrases would always be too broad, producing too many hits to be useful. That was expected, which was why the RLMS concentrated on cataloging assets hierarchically, in a fashion similar to the cataloging of books in a library, although other search and browsing techniques, including free-text search, also were supported.
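To make the free-text problem concrete, here is a minimal sketch in Python. The three-file corpus, the file names, and the helper names (free_text_search, hit_ratio) are invented for illustration; they are not taken from the RLMS. The sketch only shows how a search for a reserved word or a boilerplate header word matches every file in such a corpus.

```python
import re

# Hypothetical mini-corpus: file names and contents are invented for illustration.
SOURCES = {
    "queue.c": "/* module header: all rights reserved */ int FGHQueuePush(int v) { return v; }",
    "stack.c": "/* module header: all rights reserved */ int FGHStackPush(int v) { return v; }",
    "ledger.c": "/* module header: all rights reserved */ int FGHLedgerPost(int v) { return v; }",
}


def free_text_search(term, sources):
    """Return the sorted names of files whose text contains `term` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return sorted(name for name, text in sources.items() if pattern.search(text))


def hit_ratio(term, sources):
    """Fraction of the corpus matched by `term`; 1.0 means the search did not narrow anything."""
    return len(free_text_search(term, sources)) / len(sources)


# A reserved word and a boilerplate header word both match every file.
broad = free_text_search("return", SOURCES)
print(broad, hit_ratio("return", SOURCES))
print(free_text_search("header", SOURCES))
```

In this toy corpus, both searches return all three files, so the result list tells the user nothing about which asset fits the need.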
The initial collection of data structures was cataloged, as you would expect, by using the standard computer-science terms that described the implementation of each data structure. Everyone agreed on how to describe the data structures and on which ones applied to which system need (unsurprisingly, since most had learned the concepts from the same textbooks). Through this common understanding, we knew what each was good for; and, by describing what the data structure we were looking for did, we were implicitly describing each asset by the problems that it solved.
However, when we applied the same approach to the business domain–focused reusable assets that each customer produced, it failed. There was a much broader range of assets to catalog, and there was no common vocabulary predefined for us and understood by everyone, as there had been for the data structures. Even if there had been, chances were low that the descriptions of how the components were constructed would imply the problems that they were good at solving. We rediscovered the same problem that we had before adopting the cataloging-by-classification approach: Users were having difficulty locating the reusable assets that matched their work assignments.
The problem occurred because we described software assets in terms of what we knew about them: what they were built with, and how they were constructed. What we needed to do was describe them in terms of how users would apply them: what they could be used to do, and how to use them. In pursuing a solution to that problem, we discovered another: Developing useful classification catalogues was neither intuitive nor easy. It required a basic understanding of information-science techniques—something understood better by library-science experts than by computer-science experts. Thus, the analysis of the problem exposed two problems:
- Problem 1—The assets had to be cataloged according to what they did—not how they did it. Instead of describing them by how they were constructed, they had to be cataloged according to what they accomplished.
- Problem 2—Building a classification vocabulary was a task that required an understanding of basic information-science concepts, and the RLMS tool would have to either tutor the user in those skills or act as an expert agency to guide them.
We had to adopt an approach for the customer-developed assets that was different from what we used for the data-structure collection. We had to develop a tool to catalog explicitly—and allow RLMS users to search for—assets by describing the needs that the asset was to help satisfy, and not the way in which the solution was implemented. Additionally, we would have to build an RLMS manager's toolset that guided non-library scientists in building efficient classification catalogues. We were seeing the beginning of the evolution of RLMS into SEAMS.
The primary goals of a SEAMS are the same as those of a library automation system (that is, a library of books and journals, not of software functions). Library systems do not expect users to search for content by describing the size, number of pages, and color of the book cover. Instead, users describe the problems that the assets (be they books, magazines, software, or system test suites) can solve. A SEAMS must support the following basic functions:
- Capture of software-engineering assets. Archive them to preserve them; capture metadata about their use, their performance, the problems that they can be used to solve, and the role that they serve in the system; and use that metadata to support the search for assets, in the context of the problems that they can be used to solve.
- Ability of users to search the collection of software-engineering assets according to what problems the assets can be used to solve. Present the metadata about the assets in a way that makes it easy to see the differences between—and commonalities among—assets. Present that data in the context of—and for the purpose of—identifying which assets are most useful for solving a particular problem.
- Representation of relationships among the assets, so that, once a useful asset is located, similar or related assets can then be easily discovered. Allow administrators of the RLMS or SEAMS to define types of relationships, and then define relationships among assets by those types.
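The three functions above can be sketched as a small in-memory catalog. This is an illustrative Python sketch under stated assumptions, not the RLMS design: the Asset and Catalog classes, their field names, and the sample assets and relationship type are all hypothetical. It shows assets captured with problem-oriented metadata, a search keyed on the problem to be solved, and administrator-defined relationship types.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Asset:
    """Hypothetical catalog record; field names are assumptions for illustration."""
    name: str
    problems: set           # problems the asset can be used to solve
    role: str               # role the asset serves in a system
    construction: str = ""  # how it is built; recorded, but not the primary search key


class Catalog:
    def __init__(self):
        self.assets = {}                   # asset name -> Asset
        self.relation_types = set()        # relationship types defined by administrators
        self.relations = defaultdict(set)  # (asset name, relation type) -> related asset names

    def capture(self, asset):
        """Record an asset together with its problem-oriented metadata."""
        self.assets[asset.name] = asset

    def define_relation_type(self, rel_type):
        self.relation_types.add(rel_type)

    def relate(self, source, rel_type, target):
        """Link two captured assets by a previously defined relationship type."""
        if rel_type not in self.relation_types:
            raise ValueError(f"undefined relationship type: {rel_type}")
        if source not in self.assets or target not in self.assets:
            raise KeyError("both assets must be captured before they are related")
        self.relations[(source, rel_type)].add(target)

    def search_by_problem(self, problem):
        """Find assets by what they solve, not by how they are constructed."""
        return sorted(a.name for a in self.assets.values() if problem in a.problems)

    def related(self, name, rel_type):
        return sorted(self.relations.get((name, rel_type), set()))


catalog = Catalog()
catalog.capture(Asset("PriorityQueue", {"schedule jobs by urgency"},
                      "in-memory container", "binary heap"))
catalog.capture(Asset("JobSchedulerTests", {"verify job scheduling"},
                      "system test suite"))
catalog.define_relation_type("tested-by")
catalog.relate("PriorityQueue", "tested-by", "JobSchedulerTests")

hits = catalog.search_by_problem("schedule jobs by urgency")
print(hits, catalog.related("PriorityQueue", "tested-by"))
```

Note that searching for the construction detail ("binary heap") finds nothing here; that is the intended design choice of the sketch, since the search key is the problem, not the implementation.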
What Trey Research needed to do was learn how to help users describe their needs; match those needs to resources for solving problems (in this case, software-development assets); match their needs against existing assets; locate the best of the candidate solutions; retrieve them; find related assets, if they exist; and learn how to use those assets to solve their problems. Then, the RLMS system had to be created in such a way that it could capture and represent all of that, and function as an extensible classification system that could be updated to represent new kinds of problems to be solved, new ways of using other assets to solve problems, and new types of relationships among similar or related assets. Finally, that classification-update function would have to be supported by a combination of information-science tutoring and expert-guidance features that would allow non-information scientists to produce and maintain efficient classification catalogues.
The first thing to do when performing a task that is new to you—or to your organization or your craft—is to see if some other person, organization, or craft has already solved a similar problem. That is just applying good software reuse practices to business operations. In this case, we were in luck. What we wanted to do was already well-understood (at least in terms of method, although not in terms of automation) by a professional organization that had been performing these tasks for centuries: librarians. In fact, there are specialized professionals who take these skills to an even higher level: research librarians, special librarians (such as legal- and medical-library staff), and library-classification specialists.
By applying these "tried-and-true" methods, specializing them for software engineers and architects, and developing automated systems to support those methods, we could develop RLMS as a highly effective tool that achieved our goals by performing the following tasks:
- We developed a basic classification-catalog vocabulary, which included terms that described broadly reusable software assets in the context of what problems they could solve, and the roles that those assets took in the development of a system and/or the architecture of a system.
- We organized those terms into an appropriate knowledge classification or library schema that simplified the tasks for the user in understanding the difference and commonalities among assets.
- We developed basic classification-update tools that guided the RLMS manager in gathering success metrics on RLMS usage and in updating the classification catalog to represent new assets that solved new problems.
- We developed a search-and-browse functionality that allowed users to easily define the problem that they were trying to solve, determine whether assets that were classified by this RLMS classification schema could solve that problem, and locate those that could. It would also allow users to learn how to use those assets, as well as identify similar or related assets that might help to solve the problem.
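As a rough illustration of the tasks above, the following Python sketch models a hierarchical vocabulary of problem-oriented terms, a browse function that includes narrower terms, and a simple usage metric. Everything in it is an assumption made for illustration: the vocabulary terms, the asset names, and in particular the choice of "requests for terms missing from the vocabulary" as the metric that prompts a catalog update. The source does not say which metrics the RLMS gathered.

```python
from collections import Counter

# Hypothetical vocabulary: term -> broader (parent) term, None for a top-level term.
VOCABULARY = {
    "store data": None,
    "store ordered data": "store data",
    "store keyed data": "store data",
    "process payments": None,
}

# Hypothetical classification: asset name -> vocabulary term.
CLASSIFIED = {
    "SortedList": "store ordered data",
    "HashMap": "store keyed data",
    "CardGateway": "process payments",
}

# Assumed usage metric: how often users asked for a term the vocabulary lacks.
unmatched = Counter()


def descendants(term, vocabulary):
    """Return `term` plus every narrower term beneath it in the hierarchy."""
    found = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in vocabulary.items():
            if parent in found and child not in found:
                found.add(child)
                changed = True
    return found


def browse(term):
    """List assets classified under `term` or any narrower term; log unknown terms."""
    if term not in VOCABULARY:
        unmatched[term] += 1
        return []
    terms = descendants(term, VOCABULARY)
    return sorted(name for name, t in CLASSIFIED.items() if t in terms)


def suggested_terms(min_requests=2):
    """Terms requested at least `min_requests` times: candidates for a vocabulary update."""
    return [term for term, count in unmatched.most_common() if count >= min_requests]


print(browse("store data"))
browse("reconcile invoices")
browse("reconcile invoices")
print(suggested_terms())
```

A catalog manager could review the suggested terms and decide whether a new class of problem, and the assets that solve it, should be added to the schema.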
At that time, we usually considered only software code as a candidate for reuse. Over time, we learned that any work product that requires effort to create during the development, operation, and maintenance of a system is a candidate for reuse. Any task that is performed in the construction of a new system consumes resources; and anything that can be reused will reduce the amount of resources that are consumed by the development, operation, and maintenance of another system.
To better represent this broadening of view, from software reuse to software-engineering asset management, we redefined the basic unit of reusable software from the component (implying any part of the software that was used in the deployed solution) to the software-development asset (implying anything of value that was created in the development of a software system). In doing so, we also expanded the expectations of the functionality of such systems.
Also, we learned through experience that SEAMS have value beyond supporting reuse. They can be used to capture not just the result of decisions that are made in the creation of a system, but also the rationale behind those decisions, as well as other relationships among all those assets. This makes SEAMS a very valuable tool for:
- Troubleshooting system problems.
- Planning for and estimating, designing, implementing, and testing enhancements.
- Training new staff in the operation and maintenance of a system.
People who look for the answer to a problem will (hopefully) understand what the problem is, but probably will not understand the solution yet. Software-engineering assets must be describable by the problems that they solve, not just by the method that they use to solve them or by descriptions of their structure.
Developing and maintaining a growing reusable-asset collection requires developing and maintaining a classification mechanism that grows with it. This takes special skills that can be learned from other areas of expertise (such as library science) and adopted through a combination of retraining and capturing expertise in automated tools.
Our recipe is straightforward. We apply lessons from library and information science. We take from them the ability to organize assets; develop a schema to organize those assets, according to the problems that they can solve; and help a user formulate their needs, in terms of that schema. We add from our own craft the ability to capture all of that in a digital information system: automating the workflows of capturing assets; cataloging them, according to what they can do for us; archiving them; searching for and accessing them; presenting additional useful information, such as how to use them; and locating additional potentially useful assets.
- How will developers of new systems describe the needs that they have for reusable software assets?
- How will maintainers of existing systems describe the questions that they have about how a system works, and which components perform which roles and functions in the system?
- How will we define what belongs in our RLMS or SEAMS, and what does not? In other words, how do we define the scope of our collection of software-development assets?
- How do we best present an asset and data about past use—and the integration—of that asset into a new system?
- How do we structure the classification schema of an RLMS or SEAMS to reduce the complexity of understanding and using that system, by representing the commonalities and variability among the assets?
- How do we extend existing tools (such as RDBMS or library-automation tools) to automate our SEAMS tools?
By the author
- "Contextual Classification in the Metadata Object Manager (M.O.M.)," Proceedings of the American Society for Information Science & Technology, 1999 Annual Meeting.
- "Extended Faceted Classification System (EFCS): Representing Software Engineering Domains, Systems, and Components," Proceedings of the Reuse '95 Conference.
- "An Empirical Study of Representation Methods for Reusable Software Components," (with W.B. Frakes), IEEE Transactions on Software Engineering, Volume 20, Issue 8 (August 1994).
(Software-engineering) asset—A thing that was created in the pursuit of creating a software-intensive system and, in some cases, can be used again to create another (in which case, it is a reusable asset).
Business domain—A type of business or other pursuit. A group of businesses that compete with each other by selling the same kind of products and services to the same potential customers is a business domain.
Cataloging—Classifying assets and recording their classification and other useful descriptive information into a catalog that helps people to identify the assets that they want and, then, obtain them.
Classification—Organizing assets into groups of assets that are similar, naming those groups or classes of assets, identifying what the assets in each class have in common, and putting those groups or classes into a structure that helps people to find assets according to the common traits that they share.
Software-engineering asset-management system (SEAMS)—A system that catalogs, manages (for example, change control and access control), and makes available software-engineering assets and information about those assets.
About the author
Thomas Pole has been a computer geek for over 25 years: as a PC hardware tech, programmer, researcher, teacher, and even program manager (without the pointy hair). He currently works for a large system integrator on U.S. Government projects and teaches part-time at Johns-Hopkins University (currently writing new courses in SOA and CBSE), where he did graduate work many moons ago.
This article was published in Skyscrapr, an online resource provided by Microsoft. To learn more about architecture and the architectural perspective, please visit skyscrapr.net.