The Web Forms model is the Web adaptation of the traditional Visual Basic® form-based interaction model. In this model, user activity produces input for processing modules that can reside either locally or remotely. In Visual Basic-based applications, the input is processed synchronously and with full awareness of state. Such a model of interaction is made possible by COM and DCOM components and their low-level protocols.
Web applications can't work this way because they rely on HTTP, which is a stateless protocol, and because of the latency introduced by the Internet. (The lack of bandwidth is an ongoing problem that only worldwide infrastructure enhancements can ultimately resolve.) However, you could circumvent this problem and wrap your HTTP layer with software that does maintain state. This is just what Microsoft® .NET can provide.
ASP .NET applications are fashioned after the Web Forms model, resulting in form-based, client/server interaction brought to the Web. Figure 1 shows a graphical representation of the Windows® Forms and Web Forms models. The ASP .NET runtime shields you from the structural differences between the two models. It takes care of serializing and deserializing the state of the form. In this way, server-side processing can take place in a properly configured environment that reproduces the client scenario in a state-aware manner.
Figure 1 Windows and Web Forms
Using DHTML would also let you cache a lot of data on the client—an advantage on two different levels. First, you'd save quite a few round-trips. For example, with DHTML, you don't need to go to the server to sort a given data set. Second, you don't need to occupy the Web server's memory to cache state information, and this results in a much more scalable solution.
Admittedly, there is nothing really new with this other than the ADO .NET data container objects (DataSet, DataTable) which can be exploited in DHTML through the use of a COM interface or by using a Windows Forms application.
Unfortunately, more often than not, you have to write Internet applications that must work with any browser. In real-world scenarios, to be scalable and effective, the Web Forms model requires you to make some important design decisions. In this column, I'll examine some design issues connected to ASP .NET pages that access data through ADO .NET and display them through data-bound controls like the DataGrid.
State of the Page
Consider the following relatively common scenario. You have a page with a few textboxes and a button to run a server-side procedure. As a result of pressing the button, the procedure generates a report that you can navigate using links, sort, filter, and more. Whenever you require a function, the Web server takes control and regenerates the page. Where do you get the data necessary to produce the report? This is a key issue for Web applications that need to be extremely scalable.
Web controls are not magic wands, but they are an elegant and effective way to let you generate boilerplate code. ASP .NET does a lot of work, but it can only follow your directives and decisions. At least in this version, ASP .NET is merely a powerful code executor, not a decision maker or a decision support system!
So what's the point? The Web Forms model pushes and supports the development of pages so interactive that you would think you were writing a Visual Basic-based application. But you're on the Web now, where scalability is a serious issue. Nothing about ASP .NET scalability comes for free, even though the framework provides excellent tools to write extremely scalable applications.
Years of real-world experience have shown that quite a few factors affect scalability, that is, the system's ability to maintain or improve its responsiveness as the number of clients and users grows. Queuing theory states that a queue forms when requests arrive faster than the system can serve them. Most of the time, reducing the number of user requests is not a solution, but you can try to reduce the response time. Caching, data persistence, limited memory occupation, and quick use of shared resources are the key aspects to consider.
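To make the queuing argument concrete, here is a minimal sketch in Python. It assumes the simplest textbook model, an M/M/1 queue (one server, random arrivals, random service times). The column does not name a model, so both that choice and the sample rates are assumptions made purely for illustration. In this model the mean time a request spends in the system is 1 / (service rate - arrival rate), which is why shortening the time needed to serve each request pays off so quickly.

```python
def utilization(arrival_rate: float, service_rate: float) -> float:
    """Fraction of time the server is busy (arrival rate / service rate)."""
    return arrival_rate / service_rate


def mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system for an M/M/1 queue: 1 / (service_rate - arrival_rate).

    Rates are in requests per second; the result is in seconds.
    """
    if arrival_rate >= service_rate:
        # Requests arrive at least as fast as they are served,
        # so the queue grows without bound.
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    # Hypothetical load: 90 requests/s against a server that handles 100/s.
    slow = mean_response_time(90, 100)
    # Halving the service time doubles the service rate to 200/s.
    fast = mean_response_time(90, 200)
    print(f"utilization {utilization(90, 100):.0%}: {slow * 1000:.1f} ms per request")
    print(f"utilization {utilization(90, 200):.0%}: {fast * 1000:.1f} ms per request")
```

With these made-up numbers, doubling the service rate cuts the mean response time by roughly a factor of eleven rather than two, because the server moves away from saturation.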
The Recipe for Scalability
As I mentioned earlier, there isn't just one specific way to promote scalability. There are several measures you can take, but their impact on various applications depends strictly on the nature and the structure of the application itself. You can get better scalability by following these guidelines:
- Limit the number of calls to the database management system (DBMS) for getting data
- Delegate the DBMS to pre- and post-process data
- Limit the Web server's memory occupation on a per-user basis
- Limit the number of operations that the Web server has to perform
- Use relatively simple and stateless components
- Write fast, optimized code
It comes as no surprise that a few of these measures clearly contradict others. For example, limiting the number of calls to the DBMS certainly implies that you should not be delegating data processing to it. Limiting the server memory occupation implies that you don't cache data, thus requiring a call to the DBMS whenever data is needed. If you want to limit DBMS access without resorting to data caching, there is no other way than serializing to disk, which means that you don't limit the number of operations the Web server performs with each call.
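The tension between limiting DBMS calls and limiting server memory can be sketched in a few lines of Python. This is an illustrative model only: `CountingDataSource` and `ReadThroughCache` are hypothetical names, the "DBMS" just fabricates three rows per query, and the row count is used as a crude proxy for memory.

```python
from typing import Dict, List


class CountingDataSource:
    """Stand-in for a DBMS that counts how many times it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def query(self, report_id: str) -> List[str]:
        self.calls += 1
        return [f"{report_id}-row-{i}" for i in range(3)]


class ReadThroughCache:
    """Keeps query results in server memory so repeated requests skip the DBMS."""

    def __init__(self, source: CountingDataSource) -> None:
        self._source = source
        self._items: Dict[str, List[str]] = {}

    def get(self, report_id: str) -> List[str]:
        if report_id not in self._items:
            # Cache miss: one round-trip to the DBMS, then keep the result.
            self._items[report_id] = self._source.query(report_id)
        return self._items[report_id]

    def cached_rows(self) -> int:
        """Rough proxy for the memory held on the Web server."""
        return sum(len(rows) for rows in self._items.values())


if __name__ == "__main__":
    uncached = CountingDataSource()
    for _ in range(5):
        uncached.query("sales")          # every page request hits the DBMS

    cached_source = CountingDataSource()
    cache = ReadThroughCache(cached_source)
    for _ in range(5):
        cache.get("sales")               # only the first request hits the DBMS

    print(uncached.calls, cached_source.calls, cache.cached_rows())  # 5 1 3
```

Five page requests cost five DBMS calls without the cache and one with it, but the cached rows now occupy the Web server's memory for as long as they are kept.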
As you know, optimization and speed are always important, but they're particularly important under stress conditions. Code that's fast under optimum conditions and that doesn't degrade badly under the worst conditions is usually sufficient for most Web applications.
As for the physical elements affected by the recommended measures, you can always add more RAM, a faster CPU, or a bigger hard-drive to improve performance. You can also add more machines to run as a Web farm or add more CPUs.
In other words, the recipe for scalability is unique for each application and is often influenced by the background of the team, including developers and architects.
.NET Facilities for Scalability
Once you have the solution on paper, you have to turn it into concrete programming calls and configuration settings. In doing so, you can take advantage of the new features that the .NET runtime and the .NET Framework make available. The measures I've described map to some extent to the use of the following programming techniques and .NET objects: the Session and other global objects, Web Services, the DataSet and DataReader objects, and XML.
What follows is an annotated overview of the various .NET programming objects and techniques. Knowing what a given piece of system software can provide is one of the key factors in implementing effective solutions.
The .NET Session Object
In ASP and ASP .NET, the Session object is a global repository for data and objects that belong to the session. The visibility of the data is limited to the pages invoked within the session. Using the Session object is critical however you look at it. It guarantees quick and prompt access to data and ready-to-use objects. On the downside, though, all the session data is duplicated for each active session and each connected user. From this you can easily deduce that applications based extensively on the Session object cannot support uncontrolled and constant growth in the number of users. These considerations hold true in general and apply to .NET as well as to the previous versions of ASP.
Feel free to use Session when you write, test, and demo code, but be extremely careful when it comes to production code. This does not mean, of course, that Microsoft would have been better off dropping the Session object. The amount of data stored in Session must be kept under strict control, but using this object is still the fastest way to deal with session-specific data.
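A quick back-of-the-envelope calculation shows why the amount of data per session matters. The Python sketch below simply multiplies the number of active sessions by the size of each one. The 50 KB figure and the user counts are hypothetical values chosen for illustration, not measurements.

```python
def session_memory_mb(active_sessions: int, kb_per_session: float) -> float:
    """Total server memory held by session data, in megabytes."""
    return active_sessions * kb_per_session / 1024.0


if __name__ == "__main__":
    # Hypothetical: each session stores about 50 KB of report data.
    for users in (100, 1_000, 10_000):
        print(f"{users:>6} sessions -> {session_memory_mb(users, 50):.1f} MB")
```

Because session data is duplicated per user, memory grows linearly with the number of active sessions: the same 50 KB that is negligible for a demo becomes several hundred megabytes at ten thousand concurrent users.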
In .NET, the Session object plays the same role as in ASP, but it has two significant enhancements. First, all commonly used objects can now be safely stored in a Session slot. Second, the Session object is the programming interface for a module, the Session Manager, that can work in-process, out-of-process, and can even rely on SQL Server™ for data storage. As a result, the Session object can also be used in Web farm architectures.
In ASP, some flavors of COM objects (typically, objects written with Visual Basic) could cause serious scalability issues when stored to Session. This is due to the threading model employed by Visual Basic COM components, which forces a given component to be manipulated only by the thread that created it. Since Internet Information Services (IIS) uses a pool of threads, there is nothing that can guarantee that a request involving a living instance of a Visual Basic COM object is served by the same thread. If this is not the case, the request is delayed until the right thread is available. Don Box explained this topic in the House of COM column in the September 1998 MSJ.
This scalability issue has been solved in .NET because .NET objects have no thread affinity and can be managed by any member of the IIS thread pool. So, as long as the size of the data is not an issue, you can save any live .NET object in a Session slot.
The way in which Session works is determined by the settings in the web.config file. The following is an excerpt of such a file that illustrates all the attributes you can set.
<sessionState
    mode="InProc | StateServer | SqlServer"
    cookieless="true | false"
    timeout="number of minutes"
    connectionString="server name:port number"
    sqlConnectionString="sql connection string" />
Figure 2 describes the various attributes in more detail. By default, the Session Manager works in-process and utilizes cookies. This makes it perfectly aligned with the behavior of the ASP Session object. You can make it work outside the IIS process, on the same machine, or on a different one. This clearly results in slower performance, but it's more reliable. The SqlServer option lets you use a local or remote SQL Server database to store the data. The SqlServer option is out-of-process with respect to IIS. Figure 3 shows how the Session Manager relates to ASP .NET applications.
Figure 3 Session Manager
The out-of-process options (both StateServer and SqlServer) also make the use of Session suitable for Web farms. This feature may have a significant impact on scalability. If you plan to deploy an application on a Web farm, using Session soon becomes a more attractive option. You don't have to worry about the physical distribution of the machines since the Session Manager does that for you. In addition, any alternative to using Session, such as persisting to XML on local disk, requires you to use a shared path or to know the physical address of the machine that is actually processing your request. In other words, you have the same problems that in ASP resulted in a number of alternative techniques to support Session on Web farms. See Marco Tabini's article "Maintaining Session State on Your Web Farm" in the October 1999 issue of Microsoft Internet Developer.
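The difference between in-process and out-of-process session state on a Web farm can be modeled with a short Python sketch. Everything here is a simplification under stated assumptions: `WebServer` is a hypothetical class, and a plain dictionary shared between instances stands in for the separate state server or SQL Server database that a real farm would reach over the network.

```python
from typing import Dict, Optional


class WebServer:
    """One machine in a hypothetical Web farm."""

    def __init__(self, name: str, shared_store: Optional[Dict[str, dict]] = None) -> None:
        self.name = name
        # In-process mode: each server keeps its own private dictionary.
        # Out-of-process mode: every server points at the same store.
        self._sessions: Dict[str, dict] = shared_store if shared_store is not None else {}

    def handle(self, session_id: str, key: str, value: Optional[str] = None) -> Optional[str]:
        """Store `value` when one is given, then return what the session holds for `key`."""
        session = self._sessions.setdefault(session_id, {})
        if value is not None:
            session[key] = value
        return session.get(key)


if __name__ == "__main__":
    # In-process: the follow-up request lands on another machine and finds nothing.
    a, b = WebServer("A"), WebServer("B")
    a.handle("user-1", "filter", "2001")
    print(b.handle("user-1", "filter"))   # None

    # Out-of-process: both machines read and write the same store.
    store: Dict[str, dict] = {}
    c, d = WebServer("C", store), WebServer("D", store)
    c.handle("user-1", "filter", "2001")
    print(d.handle("user-1", "filter"))   # 2001
```

The page code is identical in both cases. Only the location of the store changes, which is the point of letting the Session Manager handle the physical distribution for you.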
Global Caching
The ASP and ASP .NET Application object is another data container that can be even more dangerous than Session. Everything that you store there is kept in memory until the application is closed by the last user connected. Of all the information stored in memory, Application is the most persistent. However, if you have relatively constant information to share across the various sessions, and if you can have a reasonable estimation of the size, using Application turns out to be a good bargain. In general, the size of the data, as well as the number of concurrently connected users, is not an issue as long as you know its magnitude in advance and can choose the hardware accordingly. For example, the NASDAQ Web site is built with more than 40 MB of stock information stored in the Application object. As long as the hardware is properly tuned, using the Application object will not be a cause for concern.
In ASP .NET you'll also find that the Cache object provides similar capabilities. Cache is a thread-safe object and does not require you to lock and unlock before reading from or writing to it. In addition, Cache lets you associate a duration as well as a priority and a decay factor with any of the items you store in it. Any cached item expires after the specified number of seconds, thereby freeing up some of the memory. By setting priority and decay of priority, you can control the memory occupation and help the object to control memory for you. Both Application and Cache are globally visible across sessions but don't work in a Web farm scenario. In this case, if Session does not meet your needs, you might want to resort to a custom module that acts as a data container while being globally visible. Web Services certainly provide a good infrastructure to devise and build such made-to-measure components.
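The idea of a cache whose items carry a duration and a priority can be illustrated with a toy model in Python. This is not the ASP .NET Cache API: `ExpiringCache` and its methods are hypothetical, the capacity limit is an assumed stand-in for memory pressure, and priority decay is left out to keep the sketch short.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Toy cache: items expire after a duration; low-priority items are evicted first."""

    def __init__(self, max_items: int, clock: Callable[[], float] = time.monotonic) -> None:
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._max_items = max_items
        self._clock = clock
        # key -> (value, expiry timestamp, priority)
        self._items: Dict[str, Tuple[Any, float, int]] = {}

    def insert(self, key: str, value: Any, seconds: float, priority: int = 0) -> None:
        self._purge_expired()
        if key not in self._items and len(self._items) >= self._max_items:
            # Simulated memory pressure: drop the stored item with the lowest priority.
            victim = min(self._items, key=lambda k: self._items[k][2])
            del self._items[victim]
        self._items[key] = (value, self._clock() + seconds, priority)

    def get(self, key: str) -> Optional[Any]:
        self._purge_expired()
        entry = self._items.get(key)
        return entry[0] if entry else None

    def _purge_expired(self) -> None:
        now = self._clock()
        expired = [k for k, (_, expires, _p) in self._items.items() if expires <= now]
        for key in expired:
            del self._items[key]


if __name__ == "__main__":
    now = [0.0]                                   # fake clock so the demo is instant
    cache = ExpiringCache(max_items=2, clock=lambda: now[0])
    cache.insert("quotes", [1, 2], seconds=10, priority=5)
    cache.insert("banner", "hi", seconds=60, priority=1)
    cache.insert("menu", "m", seconds=60, priority=3)   # evicts low-priority "banner"
    print(cache.get("banner"), cache.get("quotes"))
    now[0] = 11.0                                 # "quotes" has now expired
    print(cache.get("quotes"), cache.get("menu"))
```

Expiration frees memory on a schedule, while priorities decide what goes first when memory runs short. Together they give you the control over memory occupation that the plain Application object lacks.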
Web Services
Web Services are another option to consider if you are looking at a piece of software that acts as a data container and works outside IIS. The value that a Web Service has over the Session Manager is that its content is easily accessible from other software components running outside IIS. The downside of Web Services is that you have to code them entirely from scratch. Given the fact that the Web Service is, in this case, reasonably accessed over a local network, the latency it may introduce is comparable to that of an out-of-process module like the Session Manager.
Next Step
In Figure 4 you'll find a table that summarizes the main techniques with the scenarios I've just discussed so you can compare your options for improving design and scalability.
Next month I'll cover the features in ADO .NET and XML that will significantly increase your programming power when persisting data locally to the Web server. This looks like the most promising of all the techniques for reducing the involvement of the Web server in building pages. The DataSet object has been designed to provide for more scalable applications. However, simply using the DataSet object does not automatically scale the application up or out. More considerations apply, and you'll see how the integration with XML makes the DataSet particularly suited to some scenarios. With XML persistence, you can build a scalable server-side disk-based cache that reconfigures itself as you pull records out of a data source. I'll also show you how an IDataReader-based object can be more effective than a DataSet if you reload records each time you need them.
Send questions and comments for Dino to email@example.com.