Consider the following relatively common scenario. You have a page with a few textboxes and a button to run a server-side procedure. When the button is pressed, the procedure generates a report that you can navigate using links, sort, filter, and more. Whenever you invoke one of these functions, the Web server takes control and regenerates the page. Where do you get the data necessary to produce the report? This is a key issue for Web applications that need to be extremely scalable.
Web controls are not magic wands, but they are an elegant and effective way to let you generate boilerplate code. ASP .NET does a lot of work, but it can only follow your directives and decisions. At least in this version, ASP .NET is merely a powerful code executor, not a decision maker or a decision support system! So what's the point? The Web Forms model pushes and supports the development of extremely interactive pages, to the point that you might think you were writing a Visual Basic-based application. But you're on the Web now, where scalability is a serious issue. Nothing about ASP .NET scalability comes for free, even though the framework provides excellent tools to write extremely scalable applications.
Years of real-world experience have shown that quite a few factors affect scalability, that is, the system's ability to maintain or improve its responsiveness as the number of clients and users grows. Queuing theory states that a queue forms when requests arrive faster than the server can complete them. Most of the time, reducing the number of user requests is not a solution, but you can try to reduce the response time. Caching, data persistence, limited memory occupation, and quick use of shared resources are the key aspects to consider.
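To make the queuing argument concrete, here is a minimal sketch in Python rather than ASP .NET code. It assumes the simplest textbook model, a single server with random arrivals and random service times (an M/M/1 queue), which is not a description of how any particular Web server behaves. The function names and the request rates are hypothetical and chosen only for illustration.

```python
def utilization(arrival_rate: float, service_time: float) -> float:
    """Fraction of time the server is busy: requests/second times seconds/request."""
    return arrival_rate * service_time


def mean_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean time a request spends in an M/M/1 system: S / (1 - rho).

    Returns infinity when the server is saturated (rho >= 1), because the
    queue then grows without bound.
    """
    rho = utilization(arrival_rate, service_time)
    if rho >= 1.0:
        return float("inf")
    return service_time / (1.0 - rho)


if __name__ == "__main__":
    # Hypothetical load: 8 requests per second against two service times.
    for service_time in (0.100, 0.050):
        w = mean_response_time(8.0, service_time)
        print(f"service={service_time * 1000:.0f} ms -> response={w * 1000:.0f} ms")
```

Under this model, at 8 requests per second, halving the service time from 100 ms to 50 ms cuts the mean response time from roughly 500 ms to roughly 83 ms. That is why working on response time pays off more than the raw numbers suggest.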
As I mentioned earlier, there isn't just one specific way to promote scalability. There are several measures you can take, but their impact on various applications depends strictly on the nature and the structure of the application itself. You can get better scalability by following these guidelines:
It comes as no surprise that a few of these measures clearly contradict others. For example, limiting the number of calls to the DBMS certainly implies that you should not be delegating data processing to it. Limiting the server memory occupation implies that you don't cache data, thus requiring a call to the DBMS whenever data is needed. If you want to limit DBMS access without resorting to data caching, there is no other way than serializing to disk, which means that you don't limit the number of operations the Web server performs with each call.
As you know, optimization and speed are always important, but they're particularly important under stress conditions. Having code that's extremely fast under optimum conditions, but which loses half its speed under the worst conditions, is usually sufficient for most Web applications. As for the physical elements affected by the recommended measures, you can always add more RAM, a faster CPU, or a bigger hard drive to improve performance. You can also add more machines to run as a Web farm or add more CPUs. In other words, the recipe for scalability is unique for each application and is often influenced by the background of the team, including developers and architects.
Once you have the solution on paper, you have to turn it into concrete programming calls and configuration settings. In doing so, you can take advantage of the new features that the .NET runtime and the .NET Framework make available. The measures I've described map to some extent to the use of the following programming techniques and .NET objects: the Session and other global objects, Web Services, the DataSet and DataReader objects, and XML. What follows is an annotated overview of the various .NET programming objects and techniques. Knowing what a given piece of system software can provide is one of the key factors in implementing effective solutions.
In ASP and ASP .NET, the Session object is a global repository for data and objects that belong to the session. The visibility of the data is limited to the pages invoked within the session. Using the Session object is a critical decision however you look at it. It guarantees quick and prompt access to data and ready-to-use objects. On the downside, though, all the session data is duplicated for each active session and each connected user. From this you can easily deduce that applications based extensively on the Session object cannot support uncontrolled and constant growth in the number of users. These considerations hold true in general and apply to .NET as well as to the previous versions of ASP. Feel free to use Session when you write, test, and demo code, but be extremely careful when it comes to production code. This does not mean, of course, that Microsoft would have been better off dropping the Session object. The amount of data stored in Session must be kept under strict control, but using this object is still the fastest way to deal with session-specific data.
In .NET, the Session object plays the same role as in ASP, but it has two significant enhancements. First, all commonly used objects can now be safely stored in a Session slot. Second, the Session object is the programming interface for a module, the Session Manager, that can work in-process or out-of-process, and can even rely on SQL Server™ for data storage. As a result, the Session object can also be used in Web farm architectures.
In ASP, some flavors of COM objects (typically, objects written with Visual Basic) could cause serious scalability issues when stored to Session. This is due to the threading model employed by Visual Basic COM components, which forces a given component to be manipulated only by the thread that created it. Since Internet Information Services (IIS) uses a pool of threads, there is nothing that can guarantee that a request involving a living instance of a Visual Basic COM object is served by the same thread. If this is not the case, the request is delayed until the right thread is available. Don Box explained this topic in the House of COM column in the September 1998 MSJ. This scalability issue has been solved in .NET because .NET objects are not tied to the thread that created them and can be managed by any member of the IIS thread pool. So, as long as the size of the data is not an issue, you can save any live .NET object in a Session slot.
The way in which Session works is determined by the settings in the web.config file. The following is an excerpt of such a file that illustrates all the attributes you can set.
<sessionState mode="Off|Inproc|StateServer|SqlServer"
    cookieless="true|false"
    timeout="number of minutes"
    connectionString="server name:port number"
    sqlConnectionString="sql connection string" />
Figure 2 describes the various attributes in more detail. By default, the Session Manager works in-process and utilizes cookies. This makes it perfectly aligned with the behavior of the ASP Session object. You can make it work outside the IIS process, either on the same machine or on a different one. This clearly results in slower performance, but it's more reliable. The SqlServer option lets you use a local or remote SQL Server database to store the data. The SqlServer option is out-of-process with respect to IIS. Figure 3 shows how the Session Manager relates to ASP .NET applications.
Figure 3 Session Manager
The out-of-process options (both StateServer and SqlServer) also make the use of Session suitable for Web farms. This feature may have a significant impact on scalability. If you plan to deploy an application on a Web farm, using Session soon becomes a more attractive option. You don't have to worry about the physical distribution of the machines since the Session Manager does that for you. In addition, any alternative to using Session, such as persisting to XML on local disk, requires you to use a shared path or to know the physical address of the machine that is actually processing your request. In other words, you have the same problems that in ASP resulted in a number of alternative techniques to support Session on Web farms. See Marco Tabini's article "Maintaining Session State on Your Web Farm" in the October 1999 issue of Microsoft Internet Developer.
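The difference between the in-process and out-of-process modes can be illustrated with a small, language-neutral sketch in Python. It is not the Session Manager's implementation. The class names (InProcessSessionStore, OutOfProcessSessionStore, WebServer) are hypothetical, and a dictionary of serialized blobs stands in for a state server or a SQL Server database.

```python
import pickle
from typing import Any, Dict, Tuple


class InProcessSessionStore:
    """Keeps live objects in the memory of a single Web server process."""

    def __init__(self) -> None:
        self._sessions: Dict[str, Dict[str, Any]] = {}

    def set(self, session_id: str, key: str, value: Any) -> None:
        self._sessions.setdefault(session_id, {})[key] = value

    def get(self, session_id: str, key: str) -> Any:
        return self._sessions.get(session_id, {}).get(key)


class OutOfProcessSessionStore:
    """Stand-in for a state server or database shared by every Web server.

    Values are serialized on write and deserialized on read, which is the
    extra cost an out-of-process mode pays on each access.
    """

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def set(self, session_id: str, key: str, value: Any) -> None:
        self._blobs[f"{session_id}/{key}"] = pickle.dumps(value)

    def get(self, session_id: str, key: str) -> Any:
        blob = self._blobs.get(f"{session_id}/{key}")
        return None if blob is None else pickle.loads(blob)


class WebServer:
    """Hypothetical farm member that serves requests using a session store."""

    def __init__(self, store: Any) -> None:
        self.store = store


def demo() -> Tuple[Any, Any]:
    # In-process: each server owns a private store, so data written while
    # server A handles a request is invisible to server B.
    a = WebServer(InProcessSessionStore())
    b = WebServer(InProcessSessionStore())
    a.store.set("s1", "cart", ["book"])
    seen_in_process = b.store.get("s1", "cart")

    # Out-of-process: both servers talk to the same external store.
    shared = OutOfProcessSessionStore()
    c, d = WebServer(shared), WebServer(shared)
    c.store.set("s1", "cart", ["book"])
    seen_shared = d.store.get("s1", "cart")
    return seen_in_process, seen_shared
```

In this sketch, demo() returns None for the in-process pair and the stored list for the shared store. That is the essence of why only the out-of-process options suit a Web farm, at the price of a serialization round trip on every access.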
The ASP and ASP .NET Application object is another data container that can be even more dangerous than Session. Everything that you store there is kept in memory until the application is closed by the last user connected. Of all the information stored in memory, Application is the most persistent. However, if you have relatively constant information to share across the various sessions, and if you can have a reasonable estimation of the size, using Application turns out to be a good bargain. In general, the size of the data, as well as the number of concurrently connected users, is not an issue as long as you know its magnitude in advance and can choose the hardware accordingly. For example, the NASDAQ Web site is built with more than 40 MB of stock information stored in the Application object. As long as the hardware is properly tuned, using the Application object will not be a cause for concern. In ASP .NET you'll also find that the Cache object provides similar capabilities. Cache is a thread-safe object and does not require you to lock and unlock before reading from or writing to it. In addition, Cache lets you associate a duration as well as a priority and a decay factor with any of the items you store in it. Any cached item expires after the specified number of seconds, thereby freeing up some of the memory. By setting priority and decay of priority, you can control the memory occupation and help the object to control memory for you. Both Application and Cache are globally visible across sessions but don't work in a Web farm scenario. In this case, if Session does not meet your needs, you might want to resort to a custom module that acts as a data container while being globally visible. Web Services certainly provide a good infrastructure to devise and build such made-to-measure components.
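The idea behind an expiring, prioritized cache can be sketched in a few lines of Python. This is an illustration of the concept only, not the ASP .NET Cache object or its API. The ExpiringCache class, its method names, and the eviction rule (drop the lowest-priority item when the cache is full) are assumptions made for the example, and the decay factor mentioned above is not modeled.

```python
import threading
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringCache:
    """Thread-safe cache whose items carry a lifetime and a priority."""

    def __init__(self, max_items: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._max_items = max_items
        self._clock = clock  # injectable so tests can control time
        self._lock = threading.Lock()
        # key -> (value, expires_at, priority)
        self._items: Dict[str, Tuple[Any, float, int]] = {}

    def insert(self, key: str, value: Any, seconds: float,
               priority: int = 0) -> None:
        """Store a value that expires after the given number of seconds."""
        with self._lock:
            now = self._clock()
            self._purge_expired(now)
            if key not in self._items and len(self._items) >= self._max_items:
                # Cache is full: evict the item with the lowest priority.
                victim = min(self._items, key=lambda k: self._items[k][2])
                del self._items[victim]
            self._items[key] = (value, now + seconds, priority)

    def get(self, key: str) -> Any:
        """Return the cached value, or None if it is missing or expired."""
        with self._lock:
            entry = self._items.get(key)
            if entry is None:
                return None
            value, expires_at, _priority = entry
            if self._clock() >= expires_at:
                del self._items[key]  # free the memory as soon as we notice
                return None
            return value

    def _purge_expired(self, now: float) -> None:
        expired = [k for k, (_v, exp, _p) in self._items.items() if now >= exp]
        for k in expired:
            del self._items[k]
```

Callers never lock explicitly because the lock lives inside the cache, and memory is bounded both by time (expiration) and by count (priority-based eviction). A real implementation would also react to memory pressure rather than to a fixed item count.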
Web Services are another option to consider if you are looking at a piece of software that acts as a data container and works outside IIS. The value that a Web Service has over the Session Manager is that its content is easily accessible from other software components running outside IIS. The downside of Web Services is that you have to code them entirely from scratch. Given the fact that the Web Service is, in this case, reasonably accessed over a local network, the latency it may introduce is comparable to that of an out-of-process module like the Session Manager.
In Figure 4 you'll find a table that summarizes the main techniques with the scenarios I've just discussed so you can compare your options for improving design and scalability. Next month I'll cover the features in ADO .NET and XML that will significantly increase your programming power when persisting data locally to the Web server. This looks like the most promising of all the techniques for reducing the involvement of the Web server in building pages. The DataSet object has been designed to provide for more scalable applications. However, simply using the DataSet object does not automatically scale the application up or out. More considerations apply, and you'll see how the integration with XML makes the DataSet particularly well suited to some scenarios. With XML persistence, you can build a scalable server-side disk-based cache that reconfigures itself as you pull records out of a data source. I'll also show you how an IDataReader-based object can be more effective than a DataSet if you reload records each time you need them.
Send questions and comments for Dino to email@example.com.