This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
Using Server-Side XSL for Early Rendering: Generating Frequently Accessed Data-Driven Web Pages in Advance
Paul Enfield
This article assumes you're familiar with HTML, ASP, XML, and XSL.
SUMMARY Dynamic data-driven pages have become the basis of many cutting-edge Web sites. Early render systems can provide better performance and maintainability for data-driven Web sites by generating frequently accessed pages that contain less-volatile information ahead of time.
We'll show you an example of a server-side solution that uses Extensible Stylesheet Language (XSL) to merge data and layout information into HTML that is compatible with just about any modern Web browser. Using these techniques to render Web pages early can reduce the load on your database back end and increase performance for your users.
What Is an Early Render System?
An early render system is a process of building Web content incrementally prior to demand. This means that a given dynamic page might not be rendered completely on the fly as it is in ASP. Instead, parts of the page can be built prior to the first page request. The build might include merging data with static content or conditionally building ASP script sections based on some criteria.
One of the first sites to recognize and utilize this power was discussed in Wayne Berry's article, "Architecting the 15Seconds.com Site" (http://www.microsoft.com/mind/1199/fifteen/fifteen.htm), in the November 1999 issue of Microsoft Internet Developer. Wayne utilized an application known as XBuilder to process his articles and produce the content on an incremental basis on his site (http://www.15Seconds.com). Two benefits Wayne derived from implementing this system were enhanced performance and easy maintenance.
By removing the need to build the entire page at request time, you can offload some of the page-building work. Most sites build data-driven pages on the fly even though they deliver identical pages to a majority of their users, so why not build these pages ahead of time? Instead of building the same page for each request, you can analyze your site to determine which pages or areas can be rendered in advance, then build them incrementally. By removing data access from a page and moving from a page driven by ASP and ActiveX® Data Objects (ADO) to a static HTML page, you can potentially increase your performance 10- to 20-fold.
By building your architecture carefully you can also segregate your content from your data. This would allow you to have content writers devote all their time to content while Web developers could concentrate solely on development.
Understand Your Content
The key to understanding early render systems is in understanding your content. You will need to understand which areas are dynamic, which areas are mostly static, which parts are data-driven, and which are volatile.
To illustrate this point, let me introduce a concept I call data volatility. Data that is used to drive content can be thought of as falling somewhere along a continuum from highly volatile to static. Highly volatile data might be your checking account balance. Such data changes frequently and must be accessible in its current state. Static data does not change. Such data might include your name or birth date.
Highly volatile data is less compatible with early render systems because it is not as easily incorporated into an incremental build process. That is not to say volatile data is entirely incompatible, but it would most likely need to be accessed via conventional means such as ADO. Completely static data is highly compatible with early render systems. Such data can be merged with content at some interval and left in that state as long as it remains static. If it does change, the data can be incorporated into the next build of the content quite easily.
The more volatile the data, the more frequently you need to run your content through an early render system. This can be a very expensive operation or a fairly simple process, depending on how often it's performed. By running a delta on the data, you can narrow down the content that needs to be processed to a subset that can potentially be very small. Working with a reduced data set greatly simplifies your incremental build process and makes it easier to present timely data.
A Potential Implementation
Now that I've laid down the theory, let's see how an early render system can be applied to a real-life scenario. Consider your common business-to-consumer e-commerce site that normally involves a product catalog. Product catalogs are usually stored in database back ends. Product data is also fairly static, making it an ideal target for an early render system.
Many Microsoft-built e-commerce solutions access product data on the fly and build pages through ASP and ADO. An early render system would look at the data on an incremental basis according to the volatility of that data. By comparing the last modified date on each product and the last build time of the site, you can come up with the delta on which products need to have their pages rebuilt.
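The delta comparison described above can be sketched in a few lines. This is only an illustration in Python, not code from the original solution; the field names `sku` and `last_modified` and the sample rows are assumptions, and in practice the rows would come from your product database.

```python
from datetime import datetime

def products_to_rebuild(products, last_build):
    """Return the SKUs whose data changed after the last site build."""
    return [p["sku"] for p in products if p["last_modified"] > last_build]

# Hypothetical catalog rows; real rows would be read from the database.
catalog = [
    {"sku": "1001", "last_modified": datetime(2000, 3, 1, 9, 0)},
    {"sku": "1002", "last_modified": datetime(2000, 3, 14, 16, 30)},
    {"sku": "1003", "last_modified": datetime(2000, 2, 20, 11, 15)},
]
last_build = datetime(2000, 3, 10, 0, 0)

# Only products modified since the last build need their pages regenerated.
delta = products_to_rebuild(catalog, last_build)
print(delta)
```

With this sample data only one of the three products falls into the delta, so only its page would be rebuilt.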
XSL as a Rendering Solution
The Extensible Stylesheet Language (XSL) makes a great solution for merging the data with the layout to provide content. Through the nature of XML, XSL allows you to separate the data from the layout. As long as you can provide the data in XML format, you can use XSL to produce content.
Another benefit of XSL is that it allows you to produce cross-browser-compatible HTML. Because building the pages takes place long before they are delivered, the product of the build process can be cross-browser-compatible HTML. Additionally, because the result of an XML/XSL merge is text, you can use XSL to build both ASP pages and the build script itself.
In this article, XSL is used as a server-side solution. This means that the merging or transformation of the XML/XSL to HTML or ASP occurs on the server. By contrast, you could also build a client-side solution based on the XML Document Object Model (DOM) in Microsoft Internet Explorer 5.0. Using this method, Web site content would be provided separately through XML data, using XSL for layout. For the sake of cross-browser support, however, I will concentrate on the server-side solution.
Since XSL processes well-formed XML, to use XSL your data must be provided in XML format. Take a look at the abbreviated product catalog data that's shown in Figure 1. For ease of understanding, in this sample I will use this data in static form. In a real-world environment you can provide the data using any back end you have at your disposal. Some obvious alternatives are building the XML through ASP, or using the XML ISAPI filter for SQL Server (see http://msdn.microsoft.com/xml/articles/xmlsql/default.asp).
One of the luxuries afforded by an early render approach is flexibility in how the data is provided. Data can be provided live at build time via the methods I've outlined, or it can be provided through static data files. Additionally, the performance of live data is almost a nonissue. Because each data request occurs serially, you don't have to worry about the number of concurrent hits the data provider might incur.
Producing the Content
The build process involves merging a set of XSL stylesheets with the XML data to produce the final content. To achieve this goal, I built on some code written by Matt Oshry for MSDN Online. Matt created an application in Visual Basic that provides an interface to merge XML and XSL files to an output file. I modified his code to accept command-line arguments without a user interface, and named it xml2xml.exe. Source file arguments accept physical file paths and URLs. Using this utility, I can write batch files that feed data and XSL to produce my content. XML2XML uses the following command-line syntax:
XML2XML /sSourceFile /tTemplateFile /oOutputFile
The first step in building content is building the batch file itself. Using the XML catalog data, the SKU numbers contained within, and the XSL template shown in Figure 2, you can output a Windows NT command file.
A master build batch file, Build.cmd, would contain the following:
XML2XML /sProducts.xml /tBuild.xsl /oBuildScript.cmd
Call BuildScript.cmd
The first thing the template does is create the build command for the product list, outputting it as Default.htm. Products.xsl, shown in Figure 3, creates a basic table of the products with hyperlinks to each product description page. This basic product listing page would look like the one shown in Figure 4. Keep in mind that this is a simplified page, but moving to a full page would be quite easy.
Figure 4 A Basic Product Listing
Next, the build script creates build commands for each individual product description page. The output file name is Overviewxx.htm, where xx is the SKU number of the product. This process repeatedly applies the same XSL template to XML data. Because the XSL remains static, the XML data provided must change with each page build. To accomplish this, I used ASP. This is where a technology such as the XML preview for SQL Server would be ideal.
For this example, GetProduct.asp includes a Boolean switch to allow it to work independently of a database back end (see Figure 5). The exact logic for obtaining the data from the database would need to be adapted to your database schema.
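To make the shape of the generated build script concrete, here is a rough Python sketch of what the build template produces. The article does this with an XSL template (Figure 2); Python's ElementTree is swapped in here purely for illustration. The catalog element names, the `Overview.xsl` template name, the `localhost` URL, and the `SKU` query parameter are all assumptions, since the figures are not reproduced in this text.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog shape; the article's Figure 1 may use different names.
CATALOG_XML = """
<Products>
  <Product><SKU>1001</SKU><Name>Widget</Name></Product>
  <Product><SKU>1002</SKU><Name>Gadget</Name></Product>
</Products>
"""

def build_commands(catalog_xml, data_url="http://localhost/GetProduct.asp"):
    """Return XML2XML command lines: one for the listing, one per SKU."""
    root = ET.fromstring(catalog_xml)
    # The product list is built once and written to Default.htm.
    commands = ["XML2XML /sProducts.xml /tProducts.xsl /oDefault.htm"]
    for sku in root.iter("SKU"):
        sku_id = sku.text.strip()
        # Each overview page pulls its own XML from the (assumed) ASP URL.
        commands.append(
            f"XML2XML /s{data_url}?SKU={sku_id} "
            f"/tOverview.xsl /oOverview{sku_id}.htm"
        )
    return commands

script = "\n".join(build_commands(CATALOG_XML))
print(script)
```

The point of the sketch is the pattern: the template stays fixed, while the source argument changes for each SKU so that every page is built from its own data.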
Figure 6 A Sample Product Overview
A sample product overview page is shown in Figure 6. The Add To Basket link jumps to an ASP page that handles the addition of items to a shopping basket. It passes the SKU in the URL to be accessed by the ASP basket handler. The basket-handling ASP can store the SKUs and quantities in a cookie, or place a unique key in the cookie and store the basket items in a database back end. The unique key is then used to retrieve the basket from the database. The nice thing about this solution is its Web farm compatibility. This complements the scalability of a static HTML implementation of a product catalog.
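The second basket option, a unique key in the cookie with the items stored on the server, might look roughly like the following. This is a Python sketch rather than the ASP handler itself; the cookie name `BasketID` and the in-memory `BASKETS` dictionary (a stand-in for a database table) are assumptions made for illustration.

```python
import uuid
from http.cookies import SimpleCookie

# Stand-in for a database table keyed by basket ID (assumption for this sketch).
BASKETS = {}

def add_to_basket(cookie_header, sku, quantity=1):
    """Add a SKU to the visitor's basket; return (basket_id, cookie string)."""
    cookie = SimpleCookie(cookie_header or "")
    if "BasketID" in cookie and cookie["BasketID"].value in BASKETS:
        basket_id = cookie["BasketID"].value
    else:
        # First visit or unknown key: issue a new unique basket key.
        basket_id = uuid.uuid4().hex
        BASKETS[basket_id] = {}
    items = BASKETS[basket_id]
    items[sku] = items.get(sku, 0) + quantity
    out = SimpleCookie()
    out["BasketID"] = basket_id
    return basket_id, out["BasketID"].OutputString()

# The first request has no cookie; later requests send the key back.
basket_id, set_cookie = add_to_basket(None, "1001")
add_to_basket(set_cookie, "1001")
add_to_basket(set_cookie, "1002", 2)
print(BASKETS[basket_id])
```

Because the cookie carries only the key, any server in the farm that can reach the shared store can serve the next request.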
Where to Go from Here
The first necessary step is to obtain a utility with which to perform your XML/XSL merges. By using Matt Oshry's example, you'll have a good algorithm to run regardless of your choice of language. To enhance the project, you can build XML2XML as an object, as shown in Figure 7. Doing so would enable you to instantiate the object once in a Windows Script Host file and reuse it. You can also add better threading support so that multiple scripts might run concurrently, thereby speeding the generation process.
This solution demonstrates how to provide ASP or HTML content. It does not, however, handle the need for dynamic server-side merging of XML and XSL. Internet Explorer 5.0 can handle client-side XML/XSL merges, but server-side merges would need to be implemented in ISAPI or ASP. As mentioned earlier, Wayne Berry's XBuilder can perform server-side merges, and I anticipate that there will be other solutions in the near future as XSL gains momentum. If you want to pursue a more complete XML solution, this is one avenue to explore.
Because an XML/XSL merge can itself produce XML, there is an opportunity to architect a multi-pass generation process in which the output of one merge becomes the input to the next. This would allow you to create site templates, build the content for your pages separately, and separate out components that can be reused without recoding.
Since the generation process is essentially a preprocessing environment, you can define tags and attributes that are in effect preprocessor directives. This allows you to build debug or release versions of your site or target your output for a particular browser. Targeting browsers allows you to take advantage of support for XSL in Internet Explorer 5.0, permitting the browser to perform the final XML/XSL merge, thus reducing server traffic.
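As a rough illustration of the preprocessor idea, the sketch below filters a page source by a directive attribute before the final merge. In the article's approach this filtering would more naturally be done by an XSL template; Python's ElementTree is used here only to keep the example short. The `build` attribute and the element names are hypothetical.

```python
import xml.etree.ElementTree as ET

def preprocess(source_xml, target):
    """Drop elements whose hypothetical 'build' attribute names another target."""
    root = ET.fromstring(source_xml)
    for parent in list(root.iter()):
        for child in list(parent):
            wanted = child.get("build")
            if wanted is not None and wanted != target:
                # Element belongs to a different build; leave it out.
                parent.remove(child)
            elif wanted is not None:
                # Element belongs to this build; strip the directive itself.
                del child.attrib["build"]
    return ET.tostring(root, encoding="unicode")

PAGE = (
    "<page><title>Catalog</title>"
    '<trace build="debug">timings</trace>'
    '<body build="release">Welcome</body></page>'
)

print(preprocess(PAGE, "release"))
```

Running the same source with `"debug"` as the target would keep the `trace` element and drop the `body` element instead, so one source can feed both builds.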
Paul Enfield is currently working as a senior Web developer at vJungle.com, a leading provider of application services for small businesses. He specializes in designing highly scalable Web solutions, with previous experience on Shop.Microsoft.com and other e-commerce solutions. Paul enjoys writing an occasional technical article and can be reached at firstname.lastname@example.org.
From the April 2000 issue of MSDN Magazine.