Application Design Guidelines: From N-Tier to .NET
David Chappell, Chappell & Associates
Steve Kirk, Microsoft Corporation
Summary: Discusses application design for Microsoft .NET and the changes it requires: examines architectural lessons learned from building N-tier applications with Microsoft Windows DNA, shows how those same lessons apply when building applications with the Microsoft .NET Framework, and offers architectural advice for applications that use XML Web services. (11 printed pages)
N-tier applications have become the norm for building enterprise software today. To most people, an N-tier application is anything that is divided into discrete logical parts. The most common choice is a three-part breakdown—presentation, business logic, and data—although other possibilities exist. N-tier applications first emerged as a way of solving some of the problems associated with traditional client/server applications, but with the arrival of the Web, this architecture has come to dominate new development.
The Microsoft Windows® DNA technology has been a very successful foundation for N-tier applications. The Microsoft .NET Framework also provides a solid platform for building N-tier applications. Yet the changes .NET brings should make architects re-think some of the lessons they've learned in the Windows DNA world about designing N-tier applications. Even more important, the fundamental support for XML Web services built into the .NET Framework allows building new kinds of applications that go beyond the traditional N-tier approach. Understanding how best to architect applications for .NET requires knowing what changes in this new world and how to exploit these changes.
This article takes a look at these issues, beginning with a review of key architectural lessons learned in building N-tier applications using Windows DNA. It next examines these same findings, in the same order, as they apply to building applications using the .NET Framework. The final section provides some advice on architecture for applications that use XML Web services.
Architectural Lessons from Windows DNA

Factoring an application into logical parts is useful. Breaking a large piece of software into smaller pieces can make it easier to build, easier to reuse, and easier to modify. It can also be helpful in accommodating different technologies or different business organizations. At the same time, there are trade-offs to consider. Modularity and reusability are good things, but they can result in applications that are not as secure, not as manageable, and not as fast as they might otherwise be. This section reviews some of the basic architectural lessons that have come out of the widespread experience in building N-tier applications with Windows DNA technologies.
Writing Business Logic
Windows DNA applications commonly implement their business logic using one or more of three implementation options:
- ASP pages
- COM components, perhaps using the extra services provided by COM+
- Stored procedures running in the DBMS
It is generally a bad idea to write much business logic in ASP pages. Only simple languages can be used, such as Microsoft Visual Basic® Scripting Edition (VBScript), and the code is interpreted each time it executes, which hurts performance. Code in ASP pages is also hard to maintain, largely because business logic is commonly intermixed with the presentation code that creates the user interface.
Given this, one recommended approach for writing middle-tier business logic is to implement that logic as COM objects. This approach is a bit more complex than writing a pure ASP application, but because full-featured languages can be used that produce compiled executables, the result is typically faster. Wrapping business logic in COM objects also cleanly separates this code from the presentation code contained in ASP pages, making the application easier to maintain.
It's a short architectural step from COM to COM+. As many Windows DNA architects have learned, however, the core services provided by COM+—transactions, just-in-time (JIT) activation, role-based security, and threading services—shouldn't be used unless they're truly required. Using COM+, or similar services provided by other development platforms, naively can result in applications that are slower and more complex than they otherwise would be. Using COM+ makes sense when:
- Distributed transactions are required that span heterogeneous resource managers, such as Microsoft SQL Server™ and Oracle.
- Role-based security can be effectively employed by the application.
- The constrained threading behavior of Microsoft Visual Basic® 6.0 components can be improved.
- JIT activation will improve performance; this is seldom the case with browser clients, since ASP pages are effectively JIT-activated anyway.
- The configuration benefits of COM+ greatly simplify deploying an application.
The third option for writing business logic is to create some of that code as stored procedures running in the database management system (DBMS). Although a primary reason for using stored procedures is to isolate the details of database schema from business logic to simplify code management and security, having code in such close proximity to data can also help optimize performance. Applications that must be DBMS-independent, such as those created by independent software vendors, typically avoid this option, since it locks the application into a particular database system. Stored procedures can also be harder to write and debug than COM objects and this approach can lower the odds of code reuse, since COM objects are usually easier to reuse than stored procedures. Yet most custom applications remain tied to the DBMS they're originally built on, and the performance benefits of using stored procedures can be enormous. Given this, Windows DNA applications that must perform as well as possible generally use stored procedures for some or all of their business logic.
Building Clients

Windows DNA supports both native Windows clients, written in a language such as Visual Basic, and browser clients. Browser clients are more limited, especially if the browser in use can be either Microsoft Internet Explorer or Netscape. Because of this, applications often have both browser and native Windows clients. The browser client provides a more limited interface but allows easy access across the Internet, while the Windows client provides a full-featured interface. More complex browser interfaces can be created using downloadable Microsoft ActiveX® controls, but only if the browser is guaranteed to be Internet Explorer and the user is willing to trust the application's creator.
Managing State in Browser Applications
ASP applications can use several different mechanisms to maintain information on the server between client requests. One firm rule in Windows DNA, however, is that the ASP Session object should never be used to store per-client state if the application may be load-balanced across two or more machines. The ASP Session object is locked to a single machine, and so it won't work correctly with load-balanced applications.
Both the ASP Session object and the ASP Application object have another limitation, too. Using either one to store an ADO Recordset greatly reduces scalability because it limits the application's ability to exploit multithreading. Because of this, storing Recordsets in either of these objects is nearly always a bad idea.
Distributing the Application

In Windows DNA, choosing how components running on different machines will communicate is easy: DCOM is just about the only option. From a purely architectural point of view, DCOM is a straightforward extension of COM. In practice, however, DCOM has several implications, including the following:
- Because DCOM is, in a very real sense, the native protocol of COM+ objects, communicating with remote COM+ objects over DCOM is straightforward.
- When correctly configured, DCOM is a very secure protocol. Achieving this configuration isn't a simple task, however, so the protocol can be hard to use. Still, DCOM by itself can provide good distributed authentication, data integrity, and data privacy, especially in a Windows 2000 domain.
- Because it requires opening arbitrary ports, DCOM doesn't work well with firewalls. Accordingly, applications that must communicate across the Internet can't typically use DCOM for this purpose.
Accessing Stored Data
The data access architectures that can be built using ADO can be divided into two categories: light touch and heavy touch. Light-touch ADO clients hold database connections as briefly as possible, and they write to the database using stored procedures. Light-touch clients retrieve data in one of three ways:
- By populating Recordsets using read-only, forward-only cursors;
- Through stored procedure output parameters;
- Using streams (in more recent versions of ADO).
Heavy-touch clients, the alternative, hold database connections for longer periods of time. This kind of application relies on open connections and the stateful server-side cursors those connections allow to:
- Give a Recordset direct access to changes made by other users or applications;
- Enable pessimistic locking;
- Reduce network traffic by minimizing the amount of data copied to the ADO client. Unlike light-touch clients, clients using server-side cursors can leave query results in the database until that data is actually needed. This approach also copies less metadata to the Recordset, leaving more in the database itself.
Light-touch applications are the most scalable, since they use database connections—a scarce resource—most effectively. Heavy-touch applications, by contrast, must maintain long-lived database connections because stateful server-side cursors require them. This severely limits the application's scalability, and can be a particularly bad choice for Internet server applications. While heavy-touch applications may be simpler to develop with ADO, they're rarely the best choice.
ADO is also not especially well suited for working with hierarchical data such as XML documents. The ADO features for doing this are complex to use and not well understood. Similarly, ADO offers only limited support for accessing the XML features of SQL Server 2000. As a result, Windows DNA applications commonly avoid using ADO to work with hierarchical data.
Passing Data to Clients
Moving data effectively from the middle tier to clients is a critical aspect of any N-tier application. When communicating with Windows clients using DCOM, Windows DNA applications can use ADO-disconnected Recordsets. This option can also be used for browser clients when the browser is guaranteed to be Internet Explorer. Sending data to arbitrary browsers is more difficult. One choice is to explicitly convert the data to XML, then send it and any necessary script code down to the browser.
Application Architecture and .NET

.NET supports conventional N-tier applications, XML Web services applications, and applications that combine elements of both. This section looks first at how N-tier applications are affected by .NET, and then describes some of the major architectural issues in building Web services applications.
Building N-Tier Applications with .NET
Some of the issues described in the previous section apply equally to Windows DNA applications and applications built using the .NET Framework. It still makes sense, for example, to use COM+ (known as Enterprise Services in the .NET Framework) only when one or more of the conditions listed earlier is met. Similarly, building business logic as stored procedures will lead to better performance in many N-tier applications.
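The stored-procedure guidance carries over directly to managed code. As a sketch, assuming a hypothetical usp_GetOrderCount procedure and an illustrative connection string (a live SQL Server database would be required to actually run this), business logic in a stored procedure might be invoked from a .NET assembly like this:

```csharp
// Sketch: invoking a hypothetical stored procedure from managed code.
// The procedure name and connection string are illustrative only.
using System.Data;
using System.Data.SqlClient;

public class OrderData
{
    const string ConnString =
        "server=(local);database=Sales;integrated security=SSPI";

    public static int GetOrderCount(string customerId)
    {
        using (SqlConnection conn = new SqlConnection(ConnString))
        using (SqlCommand cmd = new SqlCommand("usp_GetOrderCount", conn))
        {
            // CommandType.StoredProcedure runs the logic close to the data.
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.Add(new SqlParameter("@CustomerId", customerId));
            conn.Open();
            return (int)cmd.ExecuteScalar();
        }
    }
}
```

The caller sees only a method on an assembly; where the logic actually executes, in the DBMS or the middle tier, remains an implementation decision.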
Still, the .NET Framework is full of new technologies and new versions of existing technologies. These enhancements bring with them an assortment of changes to the optimal architecture of an N-tier application. This section walks through the categories described earlier, describing how the .NET Framework changes the decisions an architect makes when creating an N-tier application.
Writing Business Logic
Where Windows DNA offers three choices for creating N-tier business logic (ASP pages, COM components, and stored procedures), the .NET Framework really provides only two: assemblies and stored procedures. For browser applications, those assemblies can be created using Microsoft ASP.NET .aspx pages. Unlike with ASP, writing business logic entirely in ASP.NET pages is often a good idea.
One reason for this is the ASP.NET code-behind option. Traditional ASP pages intermix business logic with presentation code, which makes them hard to maintain; using code-behind with .aspx pages cleanly separates these two types of code. Where a Windows DNA application might use both ASP pages and COM objects for maintainability, an application built on the .NET Framework can use just ASP.NET. Also, business logic contained in .aspx pages can be written in any .NET-based language, not just in the simple scripting languages supported by traditional ASP pages. And because ASP.NET compiles pages rather than interpreting them, ASP.NET applications can be very fast. While an application built using Windows DNA might have needed both ASP pages and COM objects to achieve sufficient performance, the performance boost with .NET may allow building the same application using just ASP.NET. Finally, business logic that uses the ASP.NET cache to reduce database access for frequently used data can realize significant performance improvements.
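To make the separation concrete, here is a minimal sketch of the code-behind pattern. The page, class, and control names are hypothetical, and the markup is trimmed to essentials:

```
<%-- OrderPage.aspx: presentation only --%>
<%@ Page Language="C#" Inherits="MyApp.OrderPage" %>
<html><body>
  <form runat="server">
    <asp:Label id="TotalLabel" runat="server" />
    <asp:Button id="Recalc" runat="server" Text="Recalculate" OnClick="Recalc_Click" />
  </form>
</body></html>
```

```csharp
// OrderPage.aspx.cs: the code-behind class holds the business logic.
using System;
using System.Web.UI;
using System.Web.UI.WebControls;

namespace MyApp
{
    public class OrderPage : Page
    {
        protected Label TotalLabel;   // wired to the controls declared in the markup
        protected Button Recalc;

        protected void Recalc_Click(object sender, EventArgs e)
        {
            // A stand-in business rule; real logic would live here,
            // cleanly separated from the presentation markup above.
            decimal total = 100m;
            TotalLabel.Text = total.ToString("C");
        }
    }
}
```

The markup file can be handed to a designer while the class evolves separately, which is exactly the maintainability benefit described above.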
It's worth pointing out, however, that reuse is harder with code wrapped in an .aspx page, even one using code behind, than with a standard assembly. For example, accessing code in an .aspx page from a Windows Forms client is problematic.
Building Clients

With the .NET Framework, the need for a native Windows client diminishes; a browser client might be all that is required. One reason for this is that ASP.NET Web controls allow building or buying reusable browser graphical user interface (GUI) elements, making it easier to create usable browser clients. Also, .NET Framework-based components can be downloaded to Internet Explorer clients and run with partial trust, rather than the all-or-nothing trust required for ActiveX controls, which helps in building better user interfaces.
Managing State in Browser Applications
Because it is bound to a single machine, the ASP Session object isn't as useful as it might be. With the .NET Framework, however, this limitation is removed. Unlike ASP, the ASP.NET Session object can be shared by two or more machines. This allows using the Session object to maintain state in a load-balanced Web server farm, making it much more useful. Also, because the Session object contents can optionally be stored in a SQL Server database, this mechanism can be employed in applications that must maintain per-client state persistently in the event of failures.
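Enabling out-of-process session state is a configuration choice rather than a code change. A sketch of the relevant web.config fragment follows; the mode, connection string, and timeout values are illustrative assumptions, not recommendations:

```xml
<!-- web.config fragment (illustrative values) -->
<configuration>
  <system.web>
    <!-- mode="StateServer" shares session state across a Web farm;
         mode="SQLServer" also lets state survive process and machine failures -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="data source=StateServer;integrated security=SSPI"
                  cookieless="false"
                  timeout="20" />
  </system.web>
</configuration>
```

Application code that reads and writes the Session object is unchanged; only the storage location moves.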
Another important change that impacts the architecture of ASP.NET applications is that, unlike ASP, DataSets can be stored in Session and Application objects with no threading implications. In other words, the firm Windows DNA rule that Recordsets shouldn't be stored in these objects doesn't apply to DataSets in the .NET Framework. This makes storing the results of queries both simpler and more natural.
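As a brief sketch (the key name and the LoadCustomers helper are hypothetical), caching query results per user becomes as simple as:

```csharp
// Inside an ASP.NET page or its code-behind class.
DataSet results = (DataSet)Session["CustomerResults"];
if (results == null)
{
    results = LoadCustomers();             // hypothetical data-access helper
    // Safe in ASP.NET; storing an ADO Recordset this way in classic ASP
    // would have crippled scalability.
    Session["CustomerResults"] = results;
}
```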
Distributing the Application

The .NET Framework provides more choices for communicating between distributed parts of an application than does Windows DNA. The choices include the following:
- .NET Remoting, which provides both a TCP channel and an HTTP channel;
- ASP.NET support for SOAP-callable XML Web services, implemented in .asmx pages;
- DCOM for communicating with remote COM objects.
More options mean more architectural choices; they also imply more factors to consider when making the choice. Architectural issues to be aware of when creating distributed applications using the .NET Framework include the following:
- Communicating directly with remote COM+ objects requires DCOM—.NET Remoting cannot be used. Since DCOM is reasonably complex to set up and use, this kind of communication is worth avoiding whenever possible. It may in some cases be worthwhile to expose existing COM+ objects through managed code, although the COM interoperability this requires will reduce performance.
- The .NET Remoting TCP channel provides no built-in security. Unlike DCOM, it does not offer strong authentication, data integrity, or data privacy services. However, this isn't all bad; the TCP channel is much easier to configure than DCOM.
- Unlike DCOM, which doesn't work well with firewalls, the .NET Remoting HTTP channel is explicitly designed to communicate effectively across the Internet. Also, because it can use SSL, this option can provide a secure path for data. In general, the TCP channel is a better choice for communication on an intranet, while the HTTP channel or the ASP.NET SOAP support is preferable for communication over the Internet.
- Both the .NET Remoting HTTP channel and the ASP.NET support for XML Web services implement SOAP. The two implementations are distinct, however, and each is intended for a specific purpose. .NET Remoting focuses on preserving the exact semantics of the common language runtime, and so it's the best choice when the remote system is also running the .NET Framework. ASP.NET focuses on providing absolutely standard XML Web services, and so it's the best choice when the remote system might be .NET-based or any other platform. ASP.NET is also faster than the .NET Remoting HTTP channel. The HTTP channel also has advantages, however. It allows both passing parameters by reference and true asynchronous callbacks, features that aren't naturally part of SOAP support in ASP.NET.
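For illustration, a minimal ASP.NET XML Web service looks like the following sketch. The service name, namespace URI, and method are all hypothetical; ASP.NET generates the WSDL and handles the SOAP plumbing:

```csharp
// PriceService.asmx.cs: a minimal SOAP-callable XML Web service.
using System.Web.Services;

[WebService(Namespace = "http://example.org/prices/")]
public class PriceService : WebService
{
    // Any SOAP client, on any platform, can call this method.
    [WebMethod]
    public decimal GetPrice(string sku)
    {
        // A fixed value keeps the sketch simple; real logic would look up the SKU.
        return 19.95m;
    }
}
```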
Accessing Stored Data
Unlike ADO, which makes it easy to build heavy-touch clients that don't scale well, ADO.NET is biased toward building light-touch clients. An ADO.NET client uses forward-only, read-only cursors to read data. Stateful server-side cursors aren't supported, so the programming model encourages short connection lifetimes. Clients that read and process data directly can use the ADO.NET DataReader object, which provides no caching for returned data. Alternatively, data can be read into a DataSet object, which acts as a cache for data returned from SQL queries and other sources. Unlike an ADO Recordset, however, a DataSet cannot explicitly maintain an open connection to a database.
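The light-touch pattern this programming model encourages can be sketched as follows. The table, columns, and connection string are assumptions, and a live database would be needed to run it:

```csharp
// Light-touch data access: open the connection late, close it early.
using System.Data;
using System.Data.SqlClient;

public class CustomerData
{
    const string ConnString =
        "server=(local);database=Sales;integrated security=SSPI";

    public static DataSet GetCustomers()
    {
        DataSet ds = new DataSet();
        using (SqlConnection conn = new SqlConnection(ConnString))
        {
            SqlDataAdapter adapter = new SqlDataAdapter(
                "SELECT CustomerId, Name FROM Customers", conn);
            adapter.Fill(ds, "Customers"); // opens and closes the connection itself
        }
        // The connection is gone; the DataSet lives on as a disconnected cache.
        return ds;
    }
}
```

The scarce resource, the database connection, is held only for the duration of the Fill call.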
Still, the heavy-touch approach fostered by ADO has some pluses, as described earlier. Those issues can be addressed in ADO.NET as follows:
- An ADO.NET client that stores data in DataSets and requires access to changes made by other users or applications will need explicit code to check for those changes. That code will also typically need to open a connection to the database for each check it makes.
- Although there is no direct support for pessimistic locking in ADO.NET, a client can achieve the same effect by using ADO.NET transactions or implementing the required functionality in a stored procedure.
- Unlike ADO, ADO.NET doesn't allow leaving some of a query's results in the database, where they can be accessed using a server-side cursor. While ADO.NET does retrieve less metadata than ADO, applications must still be designed with the understanding that the complete results of a query will be transferred from the database to the ADO.NET client.
Another change in ADO.NET that can affect architectural choices is its improved support for working with hierarchical data, especially XML documents. Transforming an ADO.NET DataSet into XML is straightforward, as is accessing the XML features of SQL Server 2000. Accordingly, hierarchical data that in the Windows DNA world might have been force-fit into a relational model can now be accessed in its original form. For more information, see the Related Reading section.
Passing Data to Clients
Effectively passing data to clients is every bit as important in N-tier applications built on the .NET Framework as it is in those built using Windows DNA. One significant change is that ADO.NET DataSets can be automatically serialized into XML, making it simpler to pass data between tiers. While this was possible in the Windows DNA world, .NET makes using XML to exchange information much more straightforward.
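As a self-contained sketch (the table and values are invented for illustration), serializing a DataSet to XML takes a single call:

```csharp
// Build a DataSet in memory, then serialize it to XML with GetXml.
using System;
using System.Data;

class DataSetXmlDemo
{
    static void Main()
    {
        DataTable customers = new DataTable("Customer");
        customers.Columns.Add("Id", typeof(int));
        customers.Columns.Add("Name", typeof(string));
        customers.Rows.Add(new object[] { 1, "Contoso" });

        DataSet ds = new DataSet("Customers");
        ds.Tables.Add(customers);

        // GetXml returns the cached rows as an XML document, ready to hand
        // to another tier or to a non-.NET client.
        string xml = ds.GetXml();
        Console.WriteLine(xml);
    }
}
```

The resulting document nests each row as a Customer element inside a Customers root, so any XML-aware client can consume it.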
XML Web Services Architecture
The technologies of XML Web services—SOAP, Web Services Description Language (WSDL), and others—can be used in many different ways in building distributed applications. Some examples include:
- Connecting to an N-tier application's Web client using SOAP instead of just HTTP. Once this is done, that client can be any device capable of making SOAP calls. The client can then provide more functions for its user, since it now has a straightforward way to invoke methods in remote servers.
- Connecting one N-tier application, perhaps built on the .NET Framework, with another built on a different platform, such as a Java application server.
- Connecting two mainframe applications, or one Enterprise Resource Planning (ERP) system to another, or any other kinds of applications. As these examples show, XML Web services can be used in much broader scenarios than just N-tier applications.
However they're used, XML Web services introduce many new architectural issues. Perhaps the most fundamental difference between XML Web services and the more traditional middleware technologies commonly used by N-tier applications is that XML Web services provide loose coupling. Unfortunately, this phrase means different things to different people. As used here, it refers to communicating applications with the following characteristics:
- The applications are largely independent from one another, and are often controlled by different organizations.
- Reliability is not absolute; no communicating application is guaranteed to be available at all times.
- Their interactions may be synchronous or asynchronous. A Web services client may block waiting for a response to some request, or it may go about its business after making the request, perhaps checking for a response at some later time.
These fundamental characteristics imply a number of architectural guidelines for applications that use XML Web services. While some of these issues are likely to be addressed by future work, such as Microsoft's Global XML Web Services Architecture (GXA) specifications, creators of effective XML Web services applications today must be aware of them. Among these are the following:
- Security is likely to be complex. Planning up front for end-to-end authentication and effective authorization is essential. End-to-end data integrity and data privacy may also be important for some applications. It may be necessary to map between different security mechanisms, although it's good to avoid this if possible. See the Related Reading section for more information.
- Interoperability can be problematic. Because of the relative immaturity of the specification, different vendors' implementations of SOAP don't always work together. See the Related Reading section for more information.
- Modifying existing applications to be accessible via XML Web services can cause problems. Speed, scale, and security are always issues when things that were never meant to work together are connected. Existing applications often weren't built to be servers, so handling many small requests can easily overwhelm them. Making fewer requests, with each one requesting more data, is likely to lead to better application performance. Also, existing applications usually weren't built to handle unpredictable loads, such as those that can result from exposing software to the Internet. If possible, using some kind of queuing mechanism to store requests until they can be serviced may help.
- Accommodating failure is essential. In particular, requests that require exactly-once semantics will typically require extra attention. For example, a request may time out, triggering a retry, yet the original request may have just been delayed for some reason. If executing a remote Web service twice on a single call is a problem, some mechanism must be created to address this issue.
- End-to-end transactions that rely on distributed locks being held across organizational boundaries are unlikely to be available. Most organizations won't allow "foreign" applications to hold locks on data, and so two-phase-commit-style transactions aren't possible. Instead, plan on using compensating transactions for any necessary rollbacks.
- Because the data received may come across application and organizational boundaries, each end of a Web services communication may need to check that data carefully. While the creator of an application might trust the accuracy of data produced by other parts of that same application, the same level of trust ought not be extended to other applications. The received information might even contain hostile code, which is a very good reason to examine it carefully.
- SOAP and the XML-defined data it carries are verbose. Passing too much data on a single call can overwhelm low bandwidth networks. Conversely, passing too little data on each call can overwhelm the application handling these requests. Although it can be challenging, finding the right middle ground is important. See the Related Reading section for more information.
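One common way to address the exactly-once problem described above is to have the client attach a unique request ID to each call and have the service remember the IDs it has already handled, so a retried request replays the stored result instead of executing twice. A minimal in-memory sketch follows; the names are hypothetical, and a real service would persist the table of processed IDs:

```csharp
// Sketch: duplicate-request detection for a Web service operation.
using System;
using System.Collections.Generic;

class OrderService
{
    readonly Dictionary<string, string> processed =
        new Dictionary<string, string>();

    public string PlaceOrder(string requestId, string sku)
    {
        string previous;
        if (processed.TryGetValue(requestId, out previous))
            return previous;              // duplicate: replay the stored result

        string confirmation = "order-" + Guid.NewGuid().ToString("N");
        processed[requestId] = confirmation;  // a real service would persist this
        return confirmation;
    }
}

class Demo
{
    static void Main()
    {
        OrderService svc = new OrderService();
        string first = svc.PlaceOrder("req-42", "SKU-1");
        string retry = svc.PlaceOrder("req-42", "SKU-1"); // timed-out client retries
        Console.WriteLine(first == retry);  // prints True
    }
}
```

The delayed original and the retry now produce one order and one confirmation, which is exactly the semantics a timed-out caller needs.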
Conclusion

Architecture matters. Choosing the right structure for an application, especially one distributed across several systems, is critically important. Bad architectural choices usually can't be fixed during implementation, no matter how good the developers are. Making the wrong decisions leads to lower performance, less security, and fewer options when an application needs to be updated.
Windows DNA offers a solid foundation for N-tier applications, and Windows developers can build on what they know from the DNA world, applying much of it to the new .NET environment. Yet being aware of the changes suggested in this article will help you in creating faster, more secure, and more functional applications. For both N-tier applications and applications that exploit the new technologies of Web services, .NET has a great deal to offer.
Related Reading

- The Windows DNA Environment
- Architecture Decisions for Dynamic Web Applications: Performance, Scalability, and Reliability
- Duwamish Online Application Architecture