Modeling Languages for Distributed Applications
Keith Short, Architect, Visual Studio Enterprise Tools
Microsoft® Visual Studio® .NET
Summary: Learn how modeling languages can simplify the development of distributed applications.
Using Abstraction to Reduce Complexity
Designing Connected Web Services
Models Are Development Artifacts
Generating Artifacts from Models
Organizing Domain-Specific Modeling Languages
Web Service Design and Deployment
Today's developers are faced with critical business drivers requiring the development of heterogeneous, connected solutions and the automation of business processes. Meeting business needs has always been a staple of modern information technology, but developers today are faced with revolutionary changes in their application platforms and tools. Indeed, developers must implement solutions using novel distributed application platforms and architectures, such as Service-Oriented Architecture (SOA) and Web services. Together, the evolving needs of businesses and the dramatic changes in platforms and architectures, as shown in Figure 1, are enough to confound even the most seasoned development organizations.
Figure 1. Sources of complexity for developers
Of course, none of these sources of complexity are static. A concerted industry effort is evolving the Web service platform with increased reliability, security, and other enterprise-grade requirements. But even as the platform gains stability and completeness, developers still face the daunting task of translating increasingly complex application requirements into SOA designs, and implementing them using available technology. While the platform experts, programming language gurus, and interoperability standards committees have much work to do, we are interested in the role that modeling languages can play in addressing the complexity facing developers.
In this article, we argue for using modeling languages in conjunction with source code written in general-purpose programming languages to reduce the complexity of software design and implementation, to help ensure the use of best practices that reduce risk, cost and time to market, and to improve product quality. We'll introduce the idea of a family of interrelated but individually specialized modeling languages that the industry is calling domain-specific languages, or DSLs (see note 1). We'll describe how such languages might be used, and the requirements this approach to modeling places on development environments.
Few developers today would argue that they were somehow better off in the days of assembler programming than they are in using modern general-purpose programming languages such as C# and Java. Higher-level languages abstract, or hide, more implementation detail than lower-level ones, giving the developer more coarse-grained constructs that are closer to concepts in the problem domain. They enable developers to express their intent more rapidly and with fewer expressions. Most modern general-purpose programming languages hide enough detail to provide a measure of platform independence, but they still require the developer to render implementations of concepts in the problem domain using large numbers of fine-grained constructs. In other words, while they are much higher-level than assembly language, they are still much lower-level than they could be if the goal is to solve problems in specific domains. This gap between intent and implementation creates complexity.
If we define languages that offer abstractions closer to problem domains, naturally the languages become more specific to those domains and cease to be general purpose. On the flip side, because the abstractions are more specific, the developer requires fewer constructs to describe a solution and the amount of manual labor is reduced.
Let's take a more concrete example. In the early 1990s, developers were faced with the complexity of client-server technology. Building efficient and effective client-server applications proved to be a difficult endeavor, initially accessible only to the top few percent of all developers. However, within a couple of years nearly every developer had become empowered to build efficient, effective client-server applications. What happened?
For Windows-based client-server applications at least, two things changed. First, Microsoft® Visual Basic® defined two key abstractions that significantly simplified the developer's task: the form and the control. Given a tool with which to edit these abstractions, developers could design user interfaces and express relationships between controls and forms. The Form Designer generated code from these abstractions, and developers wrote their business logic for events between controls and forms in clearly separated sections of the file. Second, the Visual Basic runtime was itself extended with framework code that supported the event-based programming model. This worked well because only modest amounts of code had to be generated to complete the framework, keeping synchronization issues to a minimum.
Effectively, the Form Designer and its abstractions, form and control, defined a graphical language in which higher-level concepts could be modeled. The graphical language of forms and controls allowed the developer to focus on one specific aspect of development—client-server UI design—and ignore the complex details of client-server communication protocols. It also made development easier by clearly showing the relationships between UI elements that would otherwise have been difficult to discern from conventional hand-crafted code. By tying these abstractions to the event-handler concepts in the framework, the developer clearly delineated where this domain-specific language or DSL needed to interact with general-purpose languages used to express business logic. As a result, developers enjoyed an exceedingly productive development environment.
Today we are faced with a similar problem of complexity. It takes the top few percent of today's developers to build a distributed application that supports value-chained business processes defined using Web services. What are the few simple abstractions that could in a few years enable the majority of developers to produce efficient, effective distributed applications? What kind of support must be provided for these abstractions in underlying frameworks?
To answer these questions, let's take a look at the concerns of a developer (or an architect, perhaps) whose task is to design the architectural structure—the "big picture" view—of services that support business processes.
- How should activities within a business process map to Web services?
- What messages flow between these services?
- What are the sequences of these messages?
- What are the security requirements for the various services?
- What services already exist that may be reused?
- What are the coarse-grained software components that offer the various services?
- What wire-level protocols should be specified in order to meet the needs of the target data center?
It's possible to analyze these concerns to define suitable abstractions and a modeling language in which they can be expressed. This modeling language will help the developer focus on these concerns while temporarily ignoring others. With a suitable graphical notation—boxes for services, and lines between them summarizing message exchanges, for example—the developer can get a holistic view of the overall structure of his application as a set of interacting Web services. Using a graphical design tool based on this new modeling language, he can experiment with the effect on business protocols of shifting functionality from one service to another. He can define message structures and operations on Web service ports, and so on. Other tasks such as defining the code that acts on messages within the service components, the databases that store and manage the information, and the internal structure of the Web service implementations, can be dealt with separately. This graphical language might well be called a Web Service Interaction modeling language, or a Web Service Interaction DSL—actual artifacts expressed in this language would be known as Web Service Interaction models.
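To make the idea of a Web Service Interaction model more tangible, here is a minimal sketch of the information such a model might hold: services (the boxes), message exchanges (the lines), and an operation that shifts functionality from one service to another. All class, service and message names are our own illustrative assumptions, not part of any shipping designer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    """A named message with a simple (field name, type name) structure."""
    name: str
    fields: tuple


@dataclass
class Service:
    """A box in the diagram: a Web service and the operations on its port."""
    name: str
    operations: set = field(default_factory=set)


@dataclass(frozen=True)
class Exchange:
    """A line in the diagram: `sender` calls `operation` on `receiver`."""
    sender: str
    receiver: str
    operation: str
    message: Message


@dataclass
class InteractionModel:
    services: dict = field(default_factory=dict)
    exchanges: list = field(default_factory=list)

    def add_service(self, name, *operations):
        self.services[name] = Service(name, set(operations))

    def connect(self, sender, receiver, operation, message):
        # Reject lines that point at unknown boxes or unknown operations.
        if sender not in self.services or receiver not in self.services:
            raise ValueError("both endpoints must be services in the model")
        if operation not in self.services[receiver].operations:
            raise ValueError(f"{receiver} does not offer {operation}")
        self.exchanges.append(Exchange(sender, receiver, operation, message))

    def move_operation(self, operation, source, target):
        """Shift functionality between services and re-route affected lines."""
        self.services[source].operations.remove(operation)
        self.services[target].operations.add(operation)
        self.exchanges = [
            Exchange(e.sender, target, e.operation, e.message)
            if e.receiver == source and e.operation == operation else e
            for e in self.exchanges
        ]

    def summary(self):
        """One line per message exchange, in the order they were added."""
        return [f"{e.sender} -> {e.receiver}: {e.operation}({e.message.name})"
                for e in self.exchanges]


# Hypothetical example: experiment with moving CheckCredit to another service.
model = InteractionModel()
model.add_service("Storefront")
model.add_service("Ordering", "SubmitOrder", "CheckCredit")
model.add_service("Billing")
order = Message("Order", (("orderId", "string"), ("total", "decimal")))
model.connect("Storefront", "Ordering", "SubmitOrder", order)
model.connect("Storefront", "Ordering", "CheckCredit", order)
model.move_operation("CheckCredit", "Ordering", "Billing")
print("\n".join(model.summary()))
```

A real designer would add a graphical notation, contracts and protocol choices on top of this structure; the sketch only shows that the holistic view is a small, queryable model rather than something to be reconstructed from source files.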
Figure 2. Multiple artifacts and fragments derived from a model
Narrowly focused, domain-specific modeling languages can be very effective at reducing the complexity inherent in translating domain concepts into implementations. They can offer coarse-grained abstractions that map to code based on the underlying runtime frameworks. But there's another benefit: in hand-crafted coding, domain concepts are scattered across multiple development artifacts based on a variety of languages and formats, forcing the developer to keep those artifacts synchronized as requirements evolve, as in Figure 2. Conversely, the meaning of any piece of the design or implementation is difficult to discern when the developer is obliged to look at large amounts of code written in terms of fine-grained, general-purpose primitives. Graphical languages are often more powerful than textual ones in these cases, since they generally make it easier to see and manipulate relationships among concepts. Interactions between Web services, for example, are much easier to see and manipulate when visualized graphically than when scattered across multiple source code files, schemas and configuration files.
Unlike models used merely for documentation purposes, the models that interest us are first-class development artifacts. They are just as important in the design and implementation of the software as source code files, schemas and configuration files, in which information is expressed using languages like C#, Java or XML. They are also just as important in later parts of the software life cycle as the files used to describe configurations of artifacts, to deploy executables, to generate tests, to track defects, or to manage software execution. For this to work, the artifacts related to a model, as shown in Figure 2, must be continuously synchronized with one another. This is a very important requirement of a design-time environment that supports higher-level modeling languages: wherever information is edited, all derived and derived-from information must be synchronized.
Table 1 shows how much models based on domain-specific languages have in common with source code files based on general-purpose programming languages, such as C# and Java, and how they differ.
Table 1. Domain-specific modeling languages vs. general purpose programming languages
| Characteristic | Domain-Specific Modeling Language | General-Purpose Programming Language |
|---|---|---|
| Notation | Mostly graphical | Mostly textual |
| Specificity | Highly specific to narrow problem domains, such as class interactions or Web service contracts | Highly generic and intended to support broad problem domains, such as general purpose computation |
| Scope | Useful only for solving specific kinds of problems in the target domain | Useful for solving almost any problem |
| Granularity | Offers medium- to coarse-grained abstractions, from classes to components, Web services, assemblies, activities and processes | Offers fine- to medium-grained abstractions, from scalar values to classes |
| Expressiveness | Highly expressive, requiring a small number of expressions to describe or implement a complex concept | Not highly expressive, requiring large numbers of expressions to describe or implement a complex concept |
| Translation | Generally translated either by a generator into development artifacts or pieces of development artifacts, such as source code and XML, or by a compiler into optimized executable instructions as byte code or binary | Generally translated by a generator or compiler into optimized executable instructions as byte code or binary |
| Refactoring | Models can be refactored to improve their structure for maintainability or efficiency | Source code can be refactored to improve its structure for maintainability or efficiency |
| Patterns | Models can be developed using patterns that embody best practices | Source code can be developed using patterns that embody best practices |
| Composition | Models can be composed from cross-cutting aspects that are defined once and applied to multiple model elements, such as deployment policies or security policies in a Web service modeling language | Source code can be composed from cross-cutting aspects that are defined once and applied at multiple source code insertion points, such as error handling, instrumentation code or security policies |
The key difference is the narrow focus of a domain-specific language. DSLs work well when they are designed for subsets of a larger set of requirements (such as personalization), subsets of a larger architecture (such as security) or subsets of a larger development process (such as Web service interconnection). A DSL offers abstractions specific to some domain, and adds value by helping the developer solve problems in that domain in a highly efficient manner.
With this approach, models coexist with source code and other artifacts that are related to them. They are not seen as temporary artifacts used only in analysis and design that can be discarded after generation. Nor can they be seen as the sole source of all derived information. Two scenarios illustrate these points:
- The developer may be required to modify the artifacts in which the generated source code has been placed. When this happens, it may be difficult to accurately reflect those changes in the model, or to avoid overwriting them the next time source code is generated from the model. Poor synchronization usually forces the developer to abandon the model, since the source code must be correct.
- The model may contain information that is not directly manifested in the source code. Examples include information about relationships between classes or about definitions of business processes that cannot be deduced from source code, even by an expert developer. Or, information in the model may be hard to keep synchronized once it has been scattered across derived artifacts. For example, in a Web services model, a particular configuration of security attributes may be defined as a named security policy which is then generated into many different kinds of artifacts once the information has been translated into source code attributes, configuration files, and statements in code, as illustrated in Figure 2. This scattering compounds the synchronization problem.
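The scattering described in the second scenario can be sketched in a few lines. The example below assumes a hypothetical generator that turns one named security policy into a source code attribute and a configuration entry for each service, then detects which generated files have drifted from the model. The `SecurityPolicy` attribute, the `<policy>` element and the file names are invented for illustration and do not correspond to a real framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityPolicy:
    """A named policy, defined once in the (hypothetical) Web services model."""
    name: str
    require_ssl: bool
    allowed_role: str


def generate_artifacts(policy, services):
    """Return {file name: generated fragment} for every affected artifact."""
    artifacts = {}
    for service in services:
        # Fragment 1: a declarative attribute in the service's source file.
        artifacts[f"{service}.cs"] = (
            f'[SecurityPolicy("{policy.name}")]  // generated\n'
            f"public partial class {service} {{ }}\n"
        )
        # Fragment 2: an entry in the service's XML configuration file.
        artifacts[f"{service}.config"] = (
            f'<policy name="{policy.name}" '
            f'requireSsl="{str(policy.require_ssl).lower()}" '
            f'role="{policy.allowed_role}" />\n'
        )
    return artifacts


def stale_artifacts(policy, services, files_on_disk):
    """List files whose content no longer matches what the model would generate."""
    expected = generate_artifacts(policy, services)
    return sorted(name for name, text in expected.items()
                  if files_on_disk.get(name) != text)


policy = SecurityPolicy("PartnerOnly", require_ssl=True, allowed_role="Partner")
services = ["Ordering", "Billing"]
files = generate_artifacts(policy, services)

# Simulate a hand edit to one derived artifact.
files["Billing.config"] = files["Billing.config"].replace(
    'role="Partner"', 'role="Everyone"')
print(stale_artifacts(policy, services, files))
```

One policy in the model fans out into four fragments here, and a single hand edit is enough to put the model and the artifacts out of step. Detecting the drift is the easy part; deciding whether the model or the file should win is the synchronization problem itself.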
Failure to solve these problems effectively led to the demise of early CASE technology. There are many ways to solve them, however. For example:
- Generate code and defend it. Generated code is a good thing since it reduces the amount of code that needs to be written by hand. However, any generated code should be placed in inaccessible hidden regions or separate files. An example of this approach is the Microsoft Visual Studio® Forms Editor, a graphical modeling language for describing forms and controls, that generates code into hidden regions. Few developers bother to look at the generated code. Hand-written code receives control via delegation from the generated code, or inherits from the generated code to provide necessary interaction between them. Or, as in the case of Container Managed Persistence in Enterprise JavaBeans, generated code can receive control via delegation or inheritance from hand-written code. Similarly, ASP.NET uses partial classes to put hand-written and generated code in separate files.
- Provide a run-time framework to support the abstractions defined by the modeling language, and generate framework completion code, or declarative attributes in source code, that extend and control the framework through well-known variability points. An example is ASP.NET, which allows a Web service modeling language to generate common language runtime (CLR) attributes to control and define Web service implementations. Again, the interaction between hand-written code and framework code is managed cleanly through delegation or inheritance in either direction.
- Generate only code that can be managed through an API on the source code—never generate code that would require subsequent parsing of the source code in order to maintain synchronization back to the model. The .NET languages that support the Code Model APIs in Visual Studio offer this capability, enabling a synchronization engine to respond to change events raised through the API at design time.
We use all of these techniques in the design-time technology that implements the domain-specific modeling languages, and in the tools and generators that we will deliver to support them.
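As a rough illustration of the first technique, the sketch below keeps generated and hand-written code in separate classes and connects them through inheritance and delegation. It is an analogy written in Python, not the code any of the tools mentioned above actually emit; the class names and the `on_<control>_<event>` naming convention are assumptions made for the example.

```python
from abc import ABC, abstractmethod


# --- generated from the model; overwritten on every regeneration ------------
class OrderFormGenerated(ABC):
    """Hypothetical generator output for a form with one button and one label."""

    def __init__(self):
        self.controls = {"submit_button": "Submit", "status_label": ""}

    def dispatch(self, control, event):
        # Generated code delegates each event to a hand-written handler,
        # if the developer has supplied one.
        handler = getattr(self, f"on_{control}_{event}", None)
        if handler is None:
            return False
        handler()
        return True

    @abstractmethod
    def on_submit_button_click(self):
        """Variability point: business logic supplied by the developer."""


# --- hand-written; never touched by the generator ---------------------------
class OrderForm(OrderFormGenerated):
    def on_submit_button_click(self):
        self.controls["status_label"] = "Order submitted"


form = OrderForm()
handled = form.dispatch("submit_button", "click")
print(handled, form.controls["status_label"])
```

Because the generator only ever rewrites `OrderFormGenerated`, regenerating from the model cannot overwrite the developer's handler, and the abstract method makes the expected extension point explicit.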
Another alternative for translating a domain-specific modeling language into a less abstract form is to translate into another DSL for which a translator already exists. When modeling languages are related in this way, they may be organized using a grid as shown in Figure 3. The columns of the grid represent concerns, while the rows represent levels of abstraction. Typically, for a given application type (such as a B2C e-commerce application), the shape and contents of the grid represent a set of models, source code files and other artifacts that must be developed to properly address functional and non-functional requirements in the course of building and deploying an application of that type. Similarly, the path through the grid represents the set of activities that must be performed, and the transformations between the artifacts that define the flow of the development process.
Generally, application development is based on progressive refinement, which proceeds from the top-left region of the grid where business requirements are captured, to the bottom-right where bits are finally loaded onto servers in the data center. At the same time, however, it is also based on progressive abstraction, which proceeds from the bottom-right to the top-left. The trick is to provide bidirectional synchronization or at least bidirectional reconciliation across levels of abstraction, so that this naturally top-down and bottom-up iterative process can converge. A path through the grid defines the best way to build a piece of an application, of the type for which the grid was defined. It prescribes the artifacts that must be developed, and the steps that must be performed to capture the requirements, establish the architecture, produce the implementation, and ensure that the application will deploy correctly.
Of course, Figure 3 is highly simplified. Each cell in the grid represents a viewpoint from which we can specify some part of the application software. In practice each cell contains more than just DSLs—it also contains:
- Refactoring patterns that can improve models based on the viewpoint
- Aspect definitions that are applicable to models based on the viewpoint
- Development processes used to produce models based on the viewpoint
- Definitions of constraints supplied by models based on neighboring viewpoints
- Frameworks that support the implementation of models based on the viewpoint
- Mappings that support transformations within and between models based on the viewpoint or neighboring viewpoints
The developer can use a model as a kind of control center for applying patterns expressed in terms of the abstractions supported by the modeling language. For example, when using the Web Service Interaction Designer, he might use a pattern library that offers parameterized contract and message definitions, which can be readily applied to services in the model, or one that offers collaboration patterns, such as a service façade (an aggregate of one or more services that provides managed pass-through of messages to a subset of the capabilities of the underlying services). Models can also be used to specify aspects and to weave together their implementations, much as source code is used to define and weave together aspects in aspect-oriented programming languages. For example, the developer might define a security policy once, and then apply it to many services by group-selecting them in the model. Patterns and aspects add richness to a modeling language, promote the use of best practices, and simplify development tasks.
Figure 3. A layered grid for classifying domain-specific modeling languages
This grid is not in itself new. What is novel is defining domain-specific modeling languages for the cells, and mappings between and within the cells that support fully or partially automatic transformations. As we have seen, we must use well-defined domain-specific modeling languages, not general-purpose modeling languages designed for documentation, in order to provide this kind of automation. Given appropriate domain-specific modeling languages and transformations, we can drive from requirements to executables using framework completion and progressive refinement, keeping related models synchronized, or at least keeping them reconciled. We can also drive from executables to requirements using progressive abstraction, building frameworks, patterns, modeling languages and other reusable abstractions that will prove useful on the way down.
As an example, let's revisit the grid from Figure 3, and zoom in to the bottom-right corner, as in Figure 4, so we can look at the viewpoints and the relationships between them in more detail. In the figure, rectangles represent domain-specific modeling languages, dotted lines represent refinement transformations and solid lines represent constraints.
Figure 4. Part of a set of DSLs for Web service development
The figure illustrates the following:
- The Business Entity DSL defines the business entity abstraction, which describes efficient, message-driven, loosely coupled data services that map onto an object-relational framework. Examples of business entities include Customer and Order.
- The Business Process DSL defines the business activity, role and dependency abstractions, and a taxonomy of process patterns that can be used to compose them, forming business process specifications. An example of a business process is Enter Negotiated Order, which might use three process patterns: one for a User Interface Process to build a shopping cart, one for a sequential process to submit the order and perform credit checks, and one for a rule-driven process to calculate a discount.
- These two DSLs map onto a Web Service DSL that describes collaborating Web services in a service-oriented application. The Web Service DSL is used to describe how the business entities and processes are implemented as Web services, how the messages they exchange are defined and what protocols are used to support their interactions, using abstractions that hide the underlying details of the Web service implementations. Predefined patterns of Web service interactions, such as service façade, service interface, and gateway (see note 2), can be applied to ensure that the architecture of an application uses best practices. Model aspects, such as security policies, can be defined and applied at several points in the model, for example to every operation of a group of selected service ports.
- The Logical Systems Architecture DSL describes data center configurations. It allows a network architect to describe a scale-invariant definition of the data center in terms of logical servers and connections that will be deployment targets for the Web services described using the Web Service DSL, along with the software they have installed, and their configuration settings. Standard configurations of data centers, such as the Microsoft System Architecture patterns available from msdn.microsoft.com, can be applied to models based on this DSL.
- Information from one model can be used to develop another. Examples are the interactions between business entities and processes, and between Web services and logical servers. This last one is particularly interesting because it can be used to design for deployment. Feeding knowledge of the deployment infrastructure into Web service designs constrains those designs to prevent deployment problems. Similarly, working this in reverse, if a design is to be deployed on a given logical server type, then we can validate that the server on which it will be deployed is of the correct type, that it has the right software installed, and that it is configured correctly. We have called this Design for Deployment (see Figure 5) as it addresses some commonly described problems in deployment of distributed applications.
Figure 5. Design for Deployment scenario
- Mappings drive transformations between models at design time. For example, we use transformations to map the model of Web services to multiple implementation artifacts on the target platform, in the form of classes that complete a framework, such as ASP.NET, and various configuration and policy files (see Figure 2).
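The Design for Deployment check described in the list above can be approximated as a constraint between two models. The following sketch assumes simplified, hypothetical stand-ins for a Web service design and a logical server, and reports every mismatch in server type, installed software or configuration before anything is deployed. The field names and the sample requirements are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalServer:
    """A deployment target from a (hypothetical) logical data center model."""
    name: str
    server_type: str
    installed: set = field(default_factory=set)
    settings: dict = field(default_factory=dict)


@dataclass
class ServiceDesign:
    """What a Web service design expects of the server that will host it."""
    name: str
    requires_type: str
    requires_software: set = field(default_factory=set)
    requires_settings: dict = field(default_factory=dict)


def validate_deployment(service, server):
    """Return a list of human-readable problems; empty means deployable."""
    problems = []
    if server.server_type != service.requires_type:
        problems.append(f"{server.name} is a {server.server_type}, "
                        f"but {service.name} needs a {service.requires_type}")
    for item in sorted(service.requires_software - server.installed):
        problems.append(f"{server.name} is missing {item}")
    for key, wanted in sorted(service.requires_settings.items()):
        actual = server.settings.get(key)
        if actual != wanted:
            problems.append(
                f"{server.name} has {key}={actual!r}, expected {wanted!r}")
    return problems


web = LogicalServer("WebTier", "web server", {"ASP.NET"}, {"ssl": True})
ordering = ServiceDesign("Ordering", "web server",
                         {"ASP.NET", "MSMQ"}, {"ssl": True})
problems = validate_deployment(ordering, web)
print(problems)
```

Running the same check in the other direction, while the service is still being designed, is what lets knowledge of the data center constrain the design rather than surface as a failure at deployment time.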
While abstraction can be used to hide platform differences, we find it much more valuable to use abstraction to reduce the complexity of translating requirements into implementations. In our treatment of model-driven development, models become artifacts that are as valuable as source code is today, and are never relegated to being merely inconsistent, out-of-date documentation. The underlying modeling language implementation technology ensures that they will always be consistent with source code and other artifacts where necessary. Microsoft will build tools that implement part of the grid for Web service applications and some of the DSLs described here in the next version of Visual Studio. Partners and competitors will be able to leverage their domain knowledge to implement DSLs using our underlying infrastructure, which we plan to make available through our Visual Studio partner programs.
This tool-building infrastructure will make it easier to build graphical tools to edit and view models, and will ensure that the models produced are always synchronized with source code and other artifacts that are derived from them. Today only a handful of companies define domain-specific modeling languages or build model-driven tools to support them. We predict that software developers and architects at SIs, ISVs and enterprises will see the value in building specialized tools, or extending those already provided, that enshrine their best practices, patterns and pre-defined content. General-purpose modeling languages, which have limited ability to be extended or customized, won't meet the challenge; nor will infrastructure that just focuses on abstract model structure, ignoring the user interface, the visual experience, and model interchange.
Once the metadata inherent within models is available, other development tools can make use of it. Imagine a powerful metadata-enabled source control system (SCS) that could allow a developer to check Web services consisting of multiple files in and out. For example, if a well-formed Web service must include certain files, such as a WSDL contract, a server code file, and an XML-based configuration file with entries that match the interface ports offered by the service, a metadata-enabled SCS could flag a collection of files that don't satisfy these criteria as a malformed configuration. Imagine also how a debugger might be made more powerful once enhanced with metadata. Or imagine how testing tools could become more focused and report their failures more appropriately if metadata were more widely available. How could compilers be made more effective if they could exploit widely available, more abstract metadata?
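The source control scenario can be sketched as a simple well-formedness check run at check-in time. The example assumes that model metadata supplies the list of ports a service offers, and that the configuration file uses a made-up `<port name="..."/>` element; neither the file layout nor the function is taken from an actual source control product.

```python
import xml.etree.ElementTree as ET


def check_web_service(files, ports):
    """Flag a malformed Web service check-in.

    files: {file name: text} for the files being checked in together.
    ports: port names that the model says the service offers.
    """
    problems = []
    if not any(name.endswith(".wsdl") for name in files):
        problems.append("missing WSDL contract")
    if not any(name.endswith((".cs", ".java")) for name in files):
        problems.append("missing server code file")
    configs = [text for name, text in files.items() if name.endswith(".config")]
    if not configs:
        problems.append("missing configuration file")
    else:
        # Hypothetical schema: one <port name="..."/> entry per offered port.
        root = ET.fromstring(configs[0])
        configured = {el.get("name") for el in root.iter("port")}
        for port in sorted(set(ports) - configured):
            problems.append(f"no configuration entry for port {port}")
    return problems


checkin = {
    "Ordering.wsdl": "<definitions/>",
    "Ordering.cs": "// server code",
    "Ordering.config": '<service><port name="SubmitOrder"/></service>',
}
result = check_web_service(checkin, ["SubmitOrder", "CheckCredit"])
print(result)
```

The check is only possible because the tool knows, from the model, which files and ports belong to one service; without that metadata the same files are just an unrelated set of paths.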
We will address these interesting topics in detail in subsequent papers, in which we'll also describe more features of the modeling language framework and tool-building infrastructure, and our overall vision for successive industrialization of application development.
1 In this article, we'll use the term domain in its most general sense. Domain may refer to a broad set of problem areas such as banking applications, or manufacturing applications, or to technologies such as the domain of all ASP.NET applications. But it may also refer to narrow areas of focus such as the security aspects of any application, or the architecture of applications.
2 These latter two are defined in the Enterprise Solution Patterns Using Microsoft .NET, available from Microsoft Press.