From the March 2002 issue of MSDN Magazine
New Features Improve Your Web Server's Performance, Reliability, and Scalability
This article assumes you're familiar with IIS
Level of Difficulty 1 2 3
SUMMARY As the Web evolves, so does the role that Internet servers play. The Internet has seen the growth of e-commerce, B2B business, collaboration, streaming and other new media, and these new applications require new features to meet increasingly complex needs. Microsoft Internet Information Services (IIS) has many of the features today's mature Web sites need.
This article outlines the features in the upcoming version 6.0 and discusses how they promote better scalability, reliability, and performance. Features such as Remote administration, caching, and metabase improvements, as well as custom isolation and security enhancements, make IIS 6.0 the Web server of the future.
If you run your Web site using Windows® 2000, you're probably using Microsoft® Internet Information Services (IIS) to do so. Over the last few years, IIS has evolved quite a bit, turning Windows into a sophisticated and robust Web server platform. In addition to gaining a slew of new features, IIS has also become more stable and scalable. The newest version, 6.0, is expected to be released with Windows Server 2003.
In this article I'll take a close look at Microsoft Internet Information Services 6.0, examining the new features that will make this Web server the choice for many enterprises over the coming months and years.
Please note that many of the features and concepts presented are based on the Beta 2 version of IIS 6.0. There are a number of changes in Beta 3 and you can find more information about them at Internet Information Services 6.0 Overview - Beta 3.
The Current Web Climate State-of-the-art Web sites have quickly evolved from glorified electronic brochures to fully interactive customer order-entry sites, online media dispensers, data collection vehicles, and invaluable research tools. And there are many new uses in the pipeline that represent a departure from the traditional Web applications. These include resource sharing and business-to-business communications. This trend can only mean the continued growth of the Web. Some of the upcoming creative uses of the Web include:
B2C (Business to Consumer) Full-featured interactive commerce-oriented Web sites keep getting easier to build using products such as Microsoft Commerce Server.
Web-based intranet access Employees get to the company intranet through the Web using products like Office Web Access and a secure URL (https).
Resource sharing Products like Microsoft SharePoint™ streamline the distributed usage of corporate data and documents.
Programmatic business-to-business communications Products such as Microsoft BizTalk™ are making cross-organizational cooperation and workflow possible.
Media services Vendors can stream audio and video to targeted user-products like Windows Media™ Player.
Each of these usage scenarios is fairly demanding. And these scenarios don't even include full-scale Web Services and the services yet to be invented by developers using Microsoft .NET. While IIS 5.0 has made great strides toward enabling Windows as a powerful Web server platform, IIS 6.0 adds long-awaited features to make Windows Server 2003 even better. In addition, IIS has been rearchitected to meet the requirements necessary to enable the new uses listed earlier.
IIS Then and Now As organizations recognize the value of creating a Web presence and multiple Internet utilization models, IIS is evolving to meet their requirements. Several factors pushing the evolution of IIS include widespread adoption of Active Server Pages (ASP) and increased response demands, as well as various ISP, hosting, dot-com/org, and IT scenarios. Valuable additions to IIS 6.0 are its many new features for Web server management, performance and scalability, availability, reliability, and security.
At the end of the day, the primary job of a Web server is to accept incoming HTTP requests and respond to them appropriately. In the earliest days of the Web, this was simple because the HTTP request was usually for a simple text file (HTML), so the Web server just had to shoot the contents of the file back to the browser. However, it quickly became apparent that the Web plus the browser as a front end would make a really great way to interact with the server on the other end. So, as HTML began to support features like control tags, scripting, and insertable objects, the back end had to adapt as well. Web servers, including IIS, became more programmable and customizable.
As you can see from the table in Figure 1, IIS has been gradually picking up features over the last five years, and IIS 6.0 continues that trend by including support for Passport authentication and native Windows authentication schemes. The IIS metabase, which has been stored in a binary format for the last five years, will be represented by an XML file in IIS 6.0.
I'll start the tour of IIS with its most visible feature, namely, Web site administration.
Changes to the Administration Snap-in As with previous versions of IIS, most IIS 6.0 administration happens through the IIS Snap-in for the Microsoft Management Console (MMC). You can access the MMC Snap-in via Start | Programs | Administrative Tools | Internet Services Manager. The IIS Snap-in displays the traditional tree list views common to most MMC consoles. The set of nodes on the left-hand side shows the FTP sites, Application Pools, Web Sites (and virtual directories associated with the Web Sites), and a node for managing SMTP hosting. Figure 2 shows the IIS 6.0 MMC Snap-in.
Figure 2 The IIS 6.0 MMC Snap-in
Much of the snap-in should look pretty familiar. At the very outset, you may notice a couple of differences between IIS 5.0 and IIS 6.0. First, the FTP sites have been moved into a separate node on the left-hand tree view. Second, there's a separate node for Application Pools. Configuring each section of the Web server involves selecting a node and right-clicking on it to get to the node's property sheet. The snap-in also includes facilities to create new Web sites, FTP sites, and Application Pools. Application Pools are a very important part of IIS 6.0, and I'll be looking at them very closely later in this article.
Distributed Administration Options At one time, it might have been possible to run your entire Web site from one box. Not so these days. Today's Web users are no longer an elite group of individuals who use Mosaic and understand cryptic protocols. Today everybody is a Web user. The increased load on Web servers means that they need to scale well, and the most common way to make that happen is to scale out your infrastructure, that is, to add new boxes to your site. Most modern Web sites are spread across Web farms, groups of servers dedicated to running the organization's Web site.
A major consequence of scaling out and adding boxes to your site is that configuration and administration become increasingly complex. When you have a Web farm consisting of perhaps dozens of boxes, it's no longer practical to scoot your chair around to each box to configure and administer it. That wouldn't be an option at all with rack-mounted systems. IIS 6.0 supports comprehensive remote administration features to accommodate the modern Webmaster.
There are three ways you can support remote installations of IIS. The first, and most common, means of administering IIS over an intranet is the IIS Snap-in hosted in MMC. The second line of support is the Remote Administration Interface that lets you change properties on your site. Figure 3 shows the interface for administering IIS using the Web-based front end. Interestingly, the Web-based front end is now a standard feature with many other Microsoft enterprise products as well, including SharePoint and Microsoft Operations Manager.
Figure 3 Web-based Administration in the Browser
Finally, you can use Terminal Services over a network connection (such as LAN, PPTP, or dial-up) to administer IIS remotely. Terminal Services is an amazing product. In addition to using Terminal Services to administer IIS remotely, I've used it to administer Microsoft Operations Manager installations over the span of thousands of miles. The performance of Terminal Services was so good I could even perform administration over a dial-up line. Using Terminal Services feels just like being there, and neither MMC nor the IIS Snap-in needs to be installed locally.
These three lines of support for remote administration enable you to manage a Web site from virtually anywhere in the world.
Metabase Improvements When Web developers and administrators manage a site, they configure such items as the directory structure used by the application, the executable modules mapped to certain file extensions, the pages shown for different errors, and the way security is managed. When changes like this are made to a Web application, the changes go into the IIS metabase. Earlier versions of IIS stored the metabase as a binary file. While choosing binary storage worked well for the most part, this scheme did have a few major disadvantages. One of the problems was the inability of humans to read the file directly. IIS 6.0 circumvents these limitations by storing the IIS settings in an XML file which is both human-readable and programmable, replacing the original binary files.
Gone is the proprietary binary metabase file METABASE.BIN. In its stead, the IIS metabase is represented by two standard XML files on disk: METABASE.XML (containing the actual configuration values for IIS), and MBSCHEMA.XML (containing the XML schema providing default values of metabase properties). When loading, IIS 6.0 reads the XML files into memory and creates an in-memory representation of the metabase. IIS flushes the metabase to disk periodically.
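To make the load step concrete, here is a minimal Python sketch of reading an XML configuration file into an in-memory dictionary keyed by location. The XML fragment is a simplified, hypothetical stand-in for METABASE.XML (its element and attribute names are illustrative and should not be taken as the real schema), and load_metabase is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified stand-in for METABASE.XML.
SAMPLE_XML = """
<configuration>
  <IIsWebServer Location="/LM/W3SVC/1" ServerComment="Default Web Site" />
  <IIsWebVirtualDir Location="/LM/W3SVC/1/ROOT" Path="c:\\inetpub\\wwwroot" />
</configuration>
"""

def load_metabase(xml_text):
    """Parse the XML and return {location: {"type": tag, "properties": {...}}}."""
    root = ET.fromstring(xml_text)
    metabase = {}
    for node in root:
        props = dict(node.attrib)
        # The Location attribute becomes the dictionary key.
        location = props.pop("Location")
        metabase[location] = {"type": node.tag, "properties": props}
    return metabase

metabase = load_metabase(SAMPLE_XML)
print(metabase["/LM/W3SVC/1"]["properties"]["ServerComment"])
```

Keying the dictionary by location mirrors the idea that each configuration object lives at a path in the metabase, so a lookup by path is a single dictionary access.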
In moving the format of the metabase from a binary file to an XML file, IIS gains several distinct advantages over older versions:
- The XML metabase is 100 percent compatible with the existing public metabase APIs as well as with the Active Directory Service Interfaces (ADSI).
- It's easier to diagnose and repair a corrupted metabase because it's represented in a human-readable format.
- The metabase files can be read and saved directly using standard text editor tools (though you normally won't want to edit it by hand unless you really know what you're doing—just like the Registry in that respect).
- The XML metabase has improved performance and scalability. It has faster read times on Web server startup than the IIS 5.0 binary metabase, and has write performance that is roughly equivalent to the binary metabase.
Figure 4 XML Metabase
While the file metabase.xml is the final serialized format of the metabase, IIS maintains an in-memory representation of the same data. This in-memory database is accessible in a number of ways. Figure 4 illustrates the relationship between the metabase XML files and the IIS human and programmatic interfaces.
The Metabase Storage Layer As mentioned earlier, IIS 6.0 replaces the old binary storage layer with one that is XML-based. The Metabase Storage Layer takes care of managing the XML files and converting the contents of the XML files to the in-memory version of the metabase. From the outside world, there are several ways to talk to the IIS metabase, including the IIS MMC Snap-in, Active Directory Service Interface providers, WMI providers, and COM-based software that uses ADSI or WMI.
Sandwiched between the metabase and the outside world is a layer containing the Admin Base Objects (ABO). The ABOs are COM objects representing the programmatic interface to IIS's configuration values. Access to the metabase occurs through the ABO. Individual IIS Admin Base Objects are mapped to individual metabase key names. Each object has a set of properties reflecting the properties stored in the metabase.
Because the in-memory version of the metabase is volatile, it's periodically written to disk via the Metabase Storage Layer. This layer reads the metabase files into the in-memory metabase and writes the in-memory metabase back out to the metabase files. The storage layer takes care of converting the XML to an in-memory binary representation (that IIS understands) via the Admin Base Objects. The Metabase Storage Layer reads the XML files during start-up, and the storage layer writes the in-memory metabase directly to the metabase files at one of the following events:
- When the IIS service is stopped
- After a predetermined number of changes have been made to the in-memory metabase within five minutes
- When a program or process causes the in-memory metabase to be written to disk programmatically
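The change-count trigger in the list above can be sketched as a small policy object. This is only an illustration of the idea, not how IIS implements it: the class name FlushPolicy, its method names, and the default threshold of 50 changes are assumptions made for the example, and the five-minute window is the only figure taken from the text.

```python
import time

class FlushPolicy:
    """Decides when an in-memory store should be written to disk."""

    def __init__(self, change_threshold=50, window_seconds=300, clock=time.monotonic):
        # change_threshold is a hypothetical value; the article only says
        # "a predetermined number of changes" within five minutes.
        self.change_threshold = change_threshold
        self.window_seconds = window_seconds
        self.clock = clock
        self.change_times = []

    def record_change(self):
        """Note one change; return True if the change count forces a flush."""
        now = self.clock()
        self.change_times.append(now)
        # Keep only the changes that fall inside the time window.
        cutoff = now - self.window_seconds
        self.change_times = [t for t in self.change_times if t > cutoff]
        return len(self.change_times) >= self.change_threshold

    def flushed(self):
        """Reset the counter after any flush (threshold, service stop, or explicit)."""
        self.change_times.clear()
```

The other two triggers (stopping the service and an explicit programmatic flush) need no bookkeeping; a caller would simply write the store to disk and then call flushed().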
Metabase Editing There are two instances in which you may edit metabase.xml directly and have the changes take effect immediately: when EditWhileRunning mode is turned on or when the IIS service is stopped before the metabase.xml file is edited and saved. The EditWhileRunning checkbox appears on the Computer name property page within IIS. In EditWhileRunning mode, you can modify the metabase XML file in a text editor and save it, and IIS will update the in-memory version of the metabase while it is running.
IIS uses the Windows file change notification to determine when the metabase.xml file has been saved. When IIS detects that it's been saved, it goes through an elaborate set of steps to reload the XML-based metabase and ensure its integrity. The entire process is rather complex, but is well-documented within the IIS 6.0 documentation.
Support for Simultaneous Updates IIS also supports simultaneous updates between programmatic interfaces such as ADSI and WMI. However, simultaneous updates to the metabase can be a tricky business and can cause errors under certain circumstances. The fundamental rule of updating the metabase is that the last write is the one that sticks. IIS also allows programmatic updates and administrator-edited updates to the metabase.xml file while in EditWhileRunning mode to happen simultaneously.
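The "last write sticks" rule is easy to picture with a toy store. The sketch below is an illustration under stated assumptions: MetabaseStore, its methods, and the sample key and property names are invented for the example and do not correspond to the ADSI or WMI interfaces.

```python
class MetabaseStore:
    """Toy key/property store in which the most recent write always wins."""

    def __init__(self):
        self.values = {}
        self.last_writer = {}

    def write(self, key, prop, value, writer):
        # No merging or conflict detection: a later write simply replaces
        # whatever an earlier writer stored for the same key and property.
        self.values[(key, prop)] = value
        self.last_writer[(key, prop)] = writer

    def read(self, key, prop):
        return self.values[(key, prop)]

store = MetabaseStore()
store.write("/LM/W3SVC/1", "ServerComment", "Set through ADSI", writer="adsi")
store.write("/LM/W3SVC/1", "ServerComment", "Set through WMI", writer="wmi")
print(store.read("/LM/W3SVC/1", "ServerComment"))  # the value written second
```

Because nothing is merged, two tools that update the same property at nearly the same time can silently overwrite each other, which is why simultaneous updates call for care.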
The Metabase History Feature IIS 6.0 includes a new metabase history feature that keeps track of changes to the metabase when they're written to disk. Whenever the metabase is persisted, IIS marks the new metabase.xml file with a version number and saves a copy of the file in the history folder. (The default location of the history folder is %windir%\system32\inetsrv\history.) Each history file is marked with a unique version number, which is then available whenever the metabase needs to be rolled back or otherwise restored.
IIS uses the following versioning scheme to enumerate the copies of metabase.xml stored in the history folder.
After modifying the metabase a few times, here's a sampling of the history file names IIS produced on my machine:
Metabase Export and Backup IIS 6.0 includes a feature for importing and exporting the metabase. This is helpful for picking up a site and moving or cloning it. You can create an export file containing specifically selected elements of the metabase and read it in on either the same computer or another computer running Windows Server 2003. The metabase export feature does not export the metabase schema. However, the metabase backup feature does export the schema.
The ability to back up the metabase is also extremely important in a modern Web climate. The metabase backup feature picks up the entire metabase configuration and schema and saves it. You can perform the backup in either secure mode (requiring a password) or non-secure mode (not requiring a password).
The metabase export and backup features provide complementary functionality. The metabase backup feature creates a backup of the entire metabase, which may only be restored wholesale. The metabase export feature creates export files containing selected metabase elements.
Changes to the IIS 6.0 Architecture The redesign of IIS 6.0 went much farther than tweaking the metabase and adding more administration support. The entire underpinnings of IIS 6.0 have been revamped to improve the flexibility, scalability, and reliability.
Modern Web sites and application code are growing increasingly complex as Web sites provide more sophisticated content. Modern Web sites are also dynamic in nature. Today's Web users expect content to churn. How often have you been frustrated by lack of new content on a Web site? (Whenever it happens, most surfers quickly move on.)
To keep things fresh, Web designers are constantly updating and rereleasing versions of their Web sites (sometimes every month or two), unlike the historical 12 to 18 month development cycle of desktop software. Because developers are pushing code out the door so quickly, it's not always tested thoroughly. Consequently, the onus of robustness and reliability is transferring from the specific Web software to the system. IIS 6.0 adds robustness by automatically detecting memory leaks, access violations, and other errors, handling them, and continuing to run. (This is also one of the primary thrusts behind the .NET Common Language Runtime: to push the responsibility for mundane, easily overlooked details down to the system.)
IIS also actively recycles (stops and restarts) processes as necessary while continuing to manage requests without interrupting the experience on the client end. To enable this, IIS 6.0 provides a new dedicated application isolation environment with active process management known as worker process isolation mode and kernel-level request queuing.
Worker Process Isolation Mode The idea behind IIS 6.0 worker process isolation mode is to put different Web applications into separate application pools. These application pools define a set of Web applications that share one or more worker processes; each application pool is separated from other pools naturally by standard Win32 process boundaries. The application pools remain independent of each other, and an application in one pool is not affected by other application pools. Application pools effectively serve as namespace groups. Figure 5 illustrates the worker process isolation mode of IIS.
Figure 5 Worker Process Isolation Mode
The worker processes operate independently of each other so they can fail without affecting other worker processes (via the natural Windows process boundary). The pooling of applications protects them from the impact of failing worker processes.
Kernel Mode Queuing If application isolation is the first part of the robustness story, the second part is kernel-level queuing. The IIS 6.0 HTTP service (http.sys) is where all incoming HTTP requests first hit the Web server. The kernel-mode HTTP service is also responsible for overall connection management, bandwidth throttling, and text-based logging. Http.sys implements a URI response cache. By implementing a cache, the service handles cached HTTP responses completely in kernel mode with no transition to user mode, thereby greatly improving performance. The URI namespace mechanism implemented by http.sys is called application pooling (recall the Application Pools node on the IIS Snap-in shown in Figure 2).
Each application pool has its own request queue within http.sys. Http.sys listens for HTTP requests and puts them on the appropriate queue. Because no user-mode code runs within http.sys, it remains unaffected by the type of code that would normally cause the host process to crash. Even if an accident happens within the user mode request processing infrastructure, http.sys continues to accept and queue requests until either there are no queues available, there is no space left on the queues, or the W3SVC has been shut down.
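The queuing behavior described here can be modeled in a few lines of Python. This is a toy model under stated assumptions, not a description of http.sys internals: KernelListener, its methods, the queue limit, and the longest-prefix routing rule are all choices made for the example.

```python
from collections import deque

class KernelListener:
    """Toy model: route each request URL to a bounded per-pool queue."""

    def __init__(self, queue_limit=1000):
        self.queue_limit = queue_limit      # hypothetical default
        self.routes = {}                    # URL prefix -> pool name
        self.queues = {}                    # pool name -> deque of URLs

    def register(self, url_prefix, pool_name):
        self.routes[url_prefix] = pool_name
        self.queues.setdefault(pool_name, deque())

    def accept(self, url):
        """Queue the request; return the pool name, or None if rejected."""
        # Assumption: the longest matching prefix decides which pool owns the URL.
        matches = [p for p in self.routes if url.startswith(p)]
        if not matches:
            return None                     # no queue available for this URL
        pool = self.routes[max(matches, key=len)]
        queue = self.queues[pool]
        if len(queue) >= self.queue_limit:
            return None                     # no space left on the queue
        queue.append(url)
        return pool

listener = KernelListener(queue_limit=2)
listener.register("/", "DefaultAppPool")
listener.register("/shop/", "ShopPool")
print(listener.accept("/shop/cart"))
```

Note that accept never runs any application code; it only files the request, which is the property that lets the listener keep working while a worker process is down.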
Even if a worker process crashes, it's not a big deal because whenever the W3SVC notices a crashed worker process, it starts a new instance of the process. Thus, while there may be a temporary disruption in the ability to process user mode requests, the user doesn't experience the failure because requests continue to be accepted and queued within http.sys until the new instance of the process is ready to handle them.
With the advent of application pooling and kernel mode queuing, IIS 6.0 no longer maintains the notion of in-process versus out-of-process applications. The IIS runtime services (such as ISAPI extension support) are equally available in any application pool. Because pools are separated by worker process boundaries, an application in one pool is not affected by problems caused by applications in other pools.
The Web Administration Service Finally, the component that ties together the application pools and the kernel mode queuing is the Web Administration Service (WAS). WAS and http.sys make up the core of IIS 6.0. Both are isolated from user-mode code by standard Win32 process boundaries outside user mode and thereby remain unaffected by accidents within the Web application code, unlike IIS 5.0 which shared the core Web server process (INETINFO) with application code. As a result, with IIS 5.0 accidents within the application code could affect the core IIS functionality.
Smoother Daily Operations with IIS 6.0 IIS 6.0 worker process isolation mode, kernel-level queuing, and the WAS deliver the following specific improvements over earlier versions of IIS:
Robust performance Your Web site requires less rebooting because isolation protects Web applications from each other and from the Web Service.
Self healing IIS 6.0 will automatically restart failed processes and will also restart them periodically if you request it.
Scalability IIS 6.0 supports Web gardens, allowing more than one worker process to serve the same application pool.
Automated debugging IIS 6.0's debugging feature lets you run an executable process (such as the debugger) if a worker process fails to respond.
In the next sections, I'll explain how these improvements work.
Recycling Processes As you know, humans can go only so far in reducing errors in a program. We can come close, but it takes a tremendous amount of effort to test and make sure a program works as it should. Certain types of programs must work as advertised and stay up and running for extended periods of time, like Windows 2000—you turn on your box and often don't need to reboot until some piece of installation software tells you it wants you to.
On the other end of the spectrum is the class of Web-based content providers. The goal behind most sites delivering content is to ship early and often. Often this means that the testing part of the development process suffers. Sometimes these apps may spring nasty memory leaks or other similarly sneaky bugs that violate the integrity of the process space in which they run. In this case, IIS can detect these inconsistencies and crashes in user mode, and then proactively recycle the affected application pools.
You can configure IIS to periodically restart worker processes within an application pool. By specifying that an application be recycled, you basically tell IIS to shut down the process space and create a new one at various intervals. So if you know that one application has a problem such as a memory leak, you may recycle the application, every hour perhaps. Because HTTP is a disconnected protocol, one instance of an application process space can often handle requests just as well as another. These broken, recycled applications will remain healthy because they regularly get a new lease on life. The option to recycle processes is available in worker process isolation mode. Figure 6 shows the Recycling configuration property sheet.
Figure 6 Recycling Properties
You can restart applications based on elapsed time, number of requests served, scheduled times, memory usage, and even on demand. To recycle a worker process, WAS spins down the faulty worker process while it completes the processing of the remaining requests in the queue. You configure the application to drain the requests at a specific time. While the process is winding down, WAS creates a replacement worker process for the same namespace group and starts the new worker process before the old worker process stops. It's like passing the baton in a relay race. As a result, service interruptions are minimized. Once the old process finishes processing the outstanding requests, it shuts down normally. If the old process takes too long shutting down (perhaps it's hanging), IIS will terminate it directly.
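The baton pass described above can be sketched as follows. It is a single-threaded illustration of the ordering only (start the replacement, drain the old worker, terminate it if it overruns); Worker, AppPool, and the step-based shutdown_limit are hypothetical names and simplifications, not IIS components or settings.

```python
class Worker:
    def __init__(self, worker_id):
        self.worker_id = worker_id
        self.in_flight = []        # requests already handed to this worker
        self.running = True

    def drain(self, max_steps):
        """Finish in-flight requests; return True if all finished in time."""
        steps = 0
        while self.in_flight and steps < max_steps:
            self.in_flight.pop(0)  # pretend to complete one request
            steps += 1
        return not self.in_flight

class AppPool:
    def __init__(self):
        self.log = []
        self.next_id = 1
        self.active = self._start_worker()

    def _start_worker(self):
        worker = Worker(self.next_id)
        self.next_id += 1
        return worker

    def recycle(self, shutdown_limit=100):
        old = self.active
        # Start the replacement first so new requests always have a home.
        self.active = self._start_worker()
        self.log.append(f"started worker {self.active.worker_id}")
        # Let the old worker finish what it already accepted.
        if old.drain(shutdown_limit):
            self.log.append(f"worker {old.worker_id} shut down normally")
        else:
            # The old worker took too long (perhaps it is hanging).
            old.in_flight.clear()
            self.log.append(f"worker {old.worker_id} terminated")
        old.running = False
```

The key design point is the order of operations in recycle: because the new worker exists before the old one stops, there is never a moment when the pool has no worker to receive requests.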
Site Health In addition to recycling processes automatically to maintain the integrity of a Web application, you may also configure IIS 6.0 to detect problems within an application and take certain steps in response. For example, you can configure an application pool to ping the worker process periodically to make sure it's alive. You may also configure an application pool to disable itself after a specific worker process crashes a configured number of times in a defined time limit. Finally, you can put time-outs on the worker process startup and shutdown periods. Figure 7 shows the property page for configuring the application pool health properties.
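The rule "disable the pool after a configured number of crashes in a defined time limit" amounts to a sliding-window counter. The sketch below shows one way to express it; the class name RapidFailProtection and the default limits of 5 crashes in 300 seconds are assumptions for illustration, since the article says only that both values are configurable.

```python
from collections import deque

class RapidFailProtection:
    """Disable a pool after too many worker crashes within a time window."""

    def __init__(self, max_crashes=5, window_seconds=300):
        # Both limits are hypothetical defaults chosen for this example.
        self.max_crashes = max_crashes
        self.window_seconds = window_seconds
        self.crash_times = deque()
        self.disabled = False

    def record_crash(self, now):
        """Register a crash at time `now` (seconds); return True if the pool is disabled."""
        self.crash_times.append(now)
        # Forget crashes that are older than the window.
        while self.crash_times and now - self.crash_times[0] > self.window_seconds:
            self.crash_times.popleft()
        if len(self.crash_times) >= self.max_crashes:
            self.disabled = True
        return self.disabled
```

Crashes that are spread far apart never trip the limit, so an occasional failure leads to a restart while a crash loop takes the pool out of service.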
Figure 7 Application Pool Properties
Site Performance In addition to setting up health and recycling parameters, IIS features several tweakable performance parameters. For example, if a specific application pool remains idle for a while, it makes no sense to keep it running and eating clock cycles. IIS lets you specify an idle time and will shut down an application pool's worker process after it remains idle for a specific period of time. You can also limit the number of requests that may be queued up, and you can configure how often the CPU counters are refreshed. Finally, you may specify the number of worker processes that can run in a Web garden (where several instances of an app are running at once). Figure 8 shows the Performance property sheet for an application.
Figure 8 Performance Properties
IIS 5.0 Isolation Mode There is a particular case in which you cannot use IIS 6.0 worker process isolation mode and must use IIS 5.0 isolation mode instead: when your application makes use of any raw ISAPI filters. If it does, then you should run it under IIS 5.0 isolation mode.
Selecting Execution Mode Choosing an execution mode in IIS 6.0 is a matter of selecting the Web Sites node and marking (or unmarking) the checkbox named "Run Web service in IIS 5.0 isolation mode." This checkbox is off by default, and worker process isolation mode is the default. Running your Web site using worker process isolation mode provides application pooling, automated restarts, and debugging. You'll probably want to use worker process isolation mode unless you find a conflict with an existing application.
Security IIS 6.0 offers several new security features including a selectable Cryptographic Service Provider (CSP), configurable worker process identity, and the ability to disable unknown extensions.
When your Web site requires SSL, you'll get security at the expense of performance because of the number of clock cycles used to encrypt the content. Fortunately, there are some hardware-based accelerator cards that allow you to move some of this processing to the hardware. These accelerators implement their own version of the Crypto API, and IIS 6.0 supports third-party crypto providers.
One place where security attacks can succeed is within components running as LocalSystem. Any opening in such a piece of software (like a buffer overrun) can let the attacker completely take over the machine on which it is running. IIS lets you configure the account under which an application's worker process (or processes) work, thereby controlling access to system resources.
Finally, IIS lets you restrict the extensions of the files you send to users. A metabase property allows you to send out only files with known extensions, while unknown file extensions receive an "access denied" error.
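The extension check can be pictured as a simple allow-list lookup. The following is a minimal sketch, assuming a hard-coded set of known extensions; KNOWN_EXTENSIONS and decision_for are invented for the example and do not represent the metabase property or how IIS evaluates it.

```python
import posixpath

# Hypothetical allow-list; a real server would read this from its configuration.
KNOWN_EXTENSIONS = {".htm", ".html", ".gif", ".jpg", ".css", ".js"}

def decision_for(url_path):
    """Return "serve" for a known extension and "access denied" otherwise."""
    extension = posixpath.splitext(url_path)[1].lower()
    if extension in KNOWN_EXTENSIONS:
        return "serve"
    # Unknown or missing extension: refuse rather than guess a content type.
    return "access denied"

print(decision_for("/docs/index.HTML"))
print(decision_for("/backup/site.bak"))
```

An allow-list fails closed: a file type nobody thought about, such as a stray backup file, is refused by default instead of being handed to a visitor.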
Other Features Finally, there's a potpourri of new features in IIS to make the Web developer's life easier. These include extensions to the FTP facilities, UTF-8 and Unicode ISAPI support, and support for transmitting vector buffers.
IIS 6.0 extends its FTP support in two significant areas. First, it includes an FTP User Isolation feature, which lets you restrict users to their own FTP directory. This keeps one user from viewing and/or overwriting other Web content. Second, IIS now supports multiple character sets for FTP.
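One way to think about user isolation is as a path check that refuses anything resolving outside the user's own directory. The sketch below assumes a layout of one directory per user under a common root; that layout, the function name resolve_for_user, and the sample paths are hypothetical and are not taken from the IIS FTP service.

```python
import posixpath

def resolve_for_user(ftp_root, username, requested_path):
    """Map a client path into the user's own directory, or raise PermissionError.

    Assumes ftp_root is an absolute, normalized path without a trailing slash.
    """
    home = posixpath.join(ftp_root, username)
    # Treat the client's path as relative to the user's home, then collapse "..".
    candidate = posixpath.normpath(posixpath.join(home, requested_path.lstrip("/")))
    if candidate != home and not candidate.startswith(home + "/"):
        raise PermissionError(f"{requested_path!r} is outside {username}'s directory")
    return candidate

print(resolve_for_user("/ftproot", "alice", "docs/report.txt"))
```

Comparing against home plus a trailing slash matters: without it, a directory named alice2 would pass a naive prefix test for the user alice.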
IIS 6.0 includes support for Unicode and UTF-8 for file names and URLs. ASP can now deal with any file name using the Unicode filename string. Incoming UTF-8 URLs are converted to a Unicode representation and presented to ASP.
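The conversion from a UTF-8 encoded URL to a Unicode string can be illustrated with Python's standard library. This sketch shows the general technique only and says nothing about how IIS performs it; decode_url_path and the sample URL are invented for the example.

```python
from urllib.parse import unquote

def decode_url_path(raw_path):
    """Convert a percent-encoded UTF-8 URL path to a Unicode string."""
    # errors="strict" makes malformed UTF-8 raise instead of being silently replaced.
    return unquote(raw_path, encoding="utf-8", errors="strict")

print(decode_url_path("/docs/caf%C3%A9.asp"))
```

Here the two escaped bytes C3 A9 are the UTF-8 encoding of a single character, so the decoded path is one character shorter than a byte-by-byte reading would suggest.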
Finally, through a feature called VectorSend, IIS supports the transmission of ordered lists of buffers and file handles. Http.sys compiles the buffer(s) into one response buffer within the kernel and then sends it. This way, IIS doesn't have to do buffer reconstruction or make multiple write calls to the client.
Conclusion Clearly, the platform of choice for the foreseeable future is the Internet. With so many users on the Internet every day, and with so many new applications on the way, the world's Web servers will certainly experience an increase in demand. IIS 6.0 has been designed to meet this demand. Its far-reaching improvements enhance performance, reliability, and scalability, and secure a spot for IIS 6.0 and the .NET platform as computing platforms for the millennium.
For related articles see:
The Windows Server 2003 Application Environment
http://www.microsoft.com/Windows.NetServer
For background information see:
Internet Information Services Features
George Shepherd is a software consultant and an instructor with DevelopMentor. George is the coauthor of MFC Internals (Addison-Wesley, 1996), Programming Visual C++ (Microsoft Press, 1998), and Applied .NET (Addison-Wesley, 2001). He may be reached at email@example.com.