
Cutting Edge

Creating and Optimizing Performance for XML Document/View Web Applications

Dino Esposito

Code for this article: CUTTING0600.exe (54KB)

In my August 1999 column in Microsoft Internet Developer I discussed how XML and XSL could be used to provide different views of the same data. That column was geared towards desktop applications built with Visual Basic®. Some readers have asked, "Well, what about Web apps?" This month I'll discuss XML document/view Web applications: Web-based apps that provide a true separation of data and presentation and are capable of offering a number of different views of the same data.
      Giving your users a chance to modify your table layout allows each user to walk through data in the way he prefers, especially if you're providing plenty of records or long table reports. One user might want to scroll horizontal lines of fields, while another may prefer a tree view or a more general master/detail schema. To get a clearer understanding of the matter, think of the Windows® shell, and in particular consider the five standard views on a folder: large icons, small icons, list, details, and thumbnails (see Figure 1). You can switch between them at any time and see the changes instantaneously. Regardless of the view selected, the data displayed is always the same.
Figure 1 Available Views in Explorer

      Another clear example of document/view design can be found in the Visual InterDev® user interface. Figure 2 shows the list of views that Visual InterDev supports. If you select the DevStudio view, then you'll see your project organized as it would be in Visual C++®, with the Project Window docked on the left side of the screen. Once again, the data doesn't change, but is simply rearranged to fit into a different page layout. It's up to each user to choose his or her preferred view mode.
Figure 2 Visual InterDev Views

      Both the Visual InterDev designs and my August 1999 sample program work locally, so nothing needs to be downloaded or cached to change the view. Shifting this paradigm to the Web world, you run into a problem: where and how do you get the information you need? It's likely that this information comes from the Web. As a result, you'll need to minimize the amount of data transferred back and forth to the server. This month I'll revisit the August 1999 code, this time discussing how you can obtain optimized Web pages that provide multiple views of the same data. Figure 3 shows the final page I'll build here.
Figure 3 Views for a Web Page

The Document/View Model

      Even though the document/view model is often associated with MFC, it is a much older programming paradigm that was originally introduced with Smalltalk. The mother of today's document/view model was the Model-View-Controller (MVC) pattern. In MFC, the word model was replaced with the word document, while the word controller was dropped since that aspect was merged into the overall user interface SDK for Windows.
      The model (or document) represents the actual data you want to display through an application. In particular, the model depicts the data along with its structure and storage. In MFC jargon, a model maps to the CDocument-derived classes.
      The view represents one possible presentation for the data. A view is a (mostly) graphical way in which you present your data to users. For each document you can have several different and independent views. Speaking in terms of MFC, a view is based on the CView class. For example, a recordset could be rendered through a tabular grid control as well as a master/detail control, such as a customized tree view.
      The controller was originally meant to control the way a view had to be rendered. In other words, the controller had the task of mapping the view's constituent elements onto the underlying UI of the operating system. While this was a real concern with a general-purpose language like Smalltalk, it turned out to be far less important in MFC, a framework running only under Windows. This is why the notion of a controller disappeared and was replaced by the standard controller provided by the Win32® SDK.
      An MFC document/view application usually has a CDocument-derived class that governs the way the data is read and written. In addition, it has one or more CView-derived classes to provide different representations. The MFC implementation of the MVC schema lacks the controller because, as mentioned earlier, under Windows there's just one possible UI: the standard look that's realized through the Windows common controls.
      Whether you love or hate the specific MFC document/view model, the programming paradigm it offers is definitely interesting. This model allows a neat separation between the data your application holds and the possible ways to render it. This paradigm doesn't just make sense with MFC or an object-oriented language like Smalltalk; a clear separation between data and presentation is useful in many scenarios, including Web apps where a smart and feature-rich browser is involved.
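      The separation between a document and its views can be sketched in a few lines. The following is a minimal illustration written in Python rather than MFC, simply because it runs anywhere. The names EmployeeDocument, table_view, and tree_view are hypothetical and are not part of MFC or of this column's sample code.

```python
# Minimal sketch of the document/view idea: one data holder, several renderers.
# All names here are hypothetical and exist only for illustration.

class EmployeeDocument:
    """The 'document': owns the data and knows nothing about presentation."""

    def __init__(self, rows):
        self.rows = list(rows)  # each row is a (name, title) tuple


def table_view(doc):
    """One view: a flat listing with one line per record."""
    return "\n".join(f"{name} | {title}" for name, title in doc.rows)


def tree_view(doc):
    """Another view: the same records grouped under their title."""
    groups = {}
    for name, title in doc.rows:
        groups.setdefault(title, []).append(name)
    lines = []
    for title, names in groups.items():
        lines.append(title)
        lines.extend(f"  - {name}" for name in names)
    return "\n".join(lines)


doc = EmployeeDocument([("Ann", "Manager"), ("Bob", "Developer"), ("Cy", "Developer")])
table_text = table_view(doc)  # flat layout
tree_text = tree_view(doc)    # grouped layout built from the very same rows
```

Neither view modifies the document, so any number of views can be added without touching the class that holds the data.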

A Web Document/View Model

      Sooner or later most Web applications will show some report to their users. Each user may have different preferences, not just for colors and fonts, but also for page size and layout. Smart Web applications already enable users to apply filters and sort report data. In most cases, this UI functionality is obtained through a new query to the server and, hence, through a new HTML page. Clearly this is not the most efficient way to do it, but sometimes it is the only way to go if browser compatibility is a serious issue.
      More advanced browsers, such as Microsoft® Internet Explorer 4.0 and higher, allow you to do ad hoc reporting and manipulate information directly on the client. You can sort, filter, group records, and so on, and on the client you only have to handle the raw data, not its HTML representation. This can be accomplished through data binding as well as with Remote Data Services (RDS) and ActiveX® Data Objects (ADO) recordsets. You can take advantage of the built-in features of recordsets when it comes to sorting and filtering. ADO recordsets are a powerful way to arrange interactive data reports since they make it easy to scroll by pages, sort, group, and filter.
      But what if you want to provide your users with the ability to radically change the layout of the report page on the fly? This is where our old acquaintance ASP comes in handy.

ASP is the Simplest Way

      ASP allows you to format an ad hoc page for not only the browser, but the user as well. At first, you might want to define a few possible page layouts, list them in a combobox, and then use conditional statements in the page body to distinguish which one the user selected. Each time a new layout is requested, script code posts a request for a new page to the server. Two URLs you can use to address pages that provide different views of the same data might be http://expoware/mypage.asp?viewtype=normal and http://expoware/mypage.asp?viewtype=dhtml.
      So far I haven't touched on how you actually prepare the pages for the client. You could employ an if or select case statement and use the value of the viewtype parameter to redirect to different ASP pages or to internal script procedures that retrieve and organize the data properly.
      Incidentally, some of the new features of ASP 3.0 and Internet Information Services (IIS) 5.0 (the Web server built into Windows 2000) could help with this. In particular, the new Server.Execute method could be helpful in writing modular, simpler, and more elegant ASP code.


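<%
' Read the requested view from the query string
viewtype = Request.QueryString("viewtype")

' Run the child page that builds the requested layout
select case viewtype
    case "normal"
        Server.Execute "normal.asp"
    case "dhtml"
        Server.Execute "dhtml.asp"
end select
%>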
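      The same dispatch can also be expressed as a lookup table instead of a conditional statement. The sketch below uses Python rather than ASP, purely to illustrate the logic outside of IIS. The pick_page function and the PAGES table are hypothetical names, and falling back to normal.asp when the parameter is missing or unknown is an assumption of this sketch.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical mapping from the viewtype parameter to the child page to run.
PAGES = {"normal": "normal.asp", "dhtml": "dhtml.asp"}


def pick_page(url, default="normal"):
    """Return the child page for the viewtype named in the URL's query string."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list of values per key; take the first one if present.
    viewtype = query.get("viewtype", [default])[0].lower()
    # Assumption: unknown view types fall back to the default layout.
    return PAGES.get(viewtype, PAGES[default])


page = pick_page("http://expoware/mypage.asp?viewtype=dhtml")
```

With a table like this, supporting a new layout means adding one entry rather than another branch.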
      To minimize page flickering and reduce the number of bytes crossing the network, you might want to employ frames or iframes so that only the portion of the page with the data is redrawn. For example, the page shown in Figure 3 has a fixed graphical infrastructure. It also contains a form with the list of all the supported views. All the data that the views affect has been isolated in an <iframe> tag:


<!-- Page content -->
<iframe id="myView" frameborder="no" class="view"
    src="http://expoware/mind/xmldocview/content.asp">
</iframe>
If you want this to work well with Netscape Navigator or Communicator 4.x, then you might consider using the <layer> tag or the simpler basic frames.
      Overall, implementing a document/view solution through ASP is simple and quite straightforward, but as you'll see in a moment, not necessarily effective. The major drawback is that each time you change the view of the document, the data it contains is downloaded again. This is due to the inherent page-based nature of the Web. As a workaround you could resort to special techniques such as remote scripting, data binding, or direct COM calls through HTTP issued by RDS. In recent installments of this column I've examined a few aspects of direct data exchange over HTTP. For more information, please refer to my column in the January 2000 issue of MIND and to the March and April issues of MSDN Magazine.
      Two important goals in writing Web document/view applications are downloading the data only once and providing different views without too much coupling among the involved modules.
      Using ASP as the primary technology is not the ideal approach since it just creates HTML pages, and HTML can't manipulate data well. For each new page layout you must download a brand new page with the same data arranged in a different way. In addition, you need to add explicit script code to produce this new layout. The task is somewhat simplified by IIS 5.0, which supports Server.Execute and allows you to have distinct child ASP pages for each layout.
      XML is an excellent way to describe the data, and XSL is a useful way to compile that XML code into HTML pages. However, as long as ASP is involved you still end up sending a brand new page back to the browser with the same data hardcoded within the HTML tags. In this scenario, XML is only an alternative to ADO recordsets, and XSL is only an alternative to child ASP pages to divide the task of generating pages with different layouts.
      My aim was to find software with several key features. It had to be capable of downloading the data to the client only once. It also had to be able to store it in a general-purpose format that makes it easy to convert to HTML. XML with XSL sounds like the perfect solution, but without ASP in the middle to govern the page creation and download process.

Using XML Data Islands

       Figure 4 depicts what appears to be the ideal solution. XML is still used to describe the data, but now the XML tags are embedded in the HTML page through an XML data island. In other words, XML is definitely part of the page, but the data is kept in a separate context from the rest of the page with an <XML> tag, and can be read and modified as-is. The data is independent from its graphical representation.
Figure 4 Formatting Data with XSL

      An XML island can be rendered into HTML via XSL. If you use a number of different XSL files, you can create different views of the same data. Only the XSL file is downloaded and applied to the XML data that is cached on the client as part of the page being viewed. With XML and data islands you download the data only once. The only round-trips to the server are made to get the proper XSL; if it's not already present in the Internet Explorer cache, the XSL is downloaded each time you want to change the view. This pattern allows a block of data to have several views. Changing a view is a process that takes place interactively on the client, at least if you can target smart browsers like Microsoft Internet Explorer 4.0 and higher.
      A data island is a standalone piece of XML code embedded in an HTML page. Internet Explorer 4.0 was the first browser to provide some level of built-in XML support, and Internet Explorer 5.0 added direct support for data islands. To HTML browsers a piece of XML code looks like bad HTML code made out of unknown tags. Since most Web browsers are quite lazy and forgiving when it comes to analyzing HTML content, an unknown tag is simply ignored, but the text it contains is displayed anyway. For example, the following HTML page contains some XML code enclosed in the <msdnmag> tag:



<HTML>
<BODY>
<msdnmag>
<msj>www.microsoft.com/msj</msj>
<mind>www.microsoft.com/mind</mind>
</msdnmag>
<HR>
</BODY>
</HTML>
Both Internet Explorer and Netscape Communicator treat it the same way: both browsers just print the text that is between the tags (see Figure 5).
Figure 5 Browsers Ignore Unknown Tags

      If you want to embed invisible XML code you have two choices. You could comment it with <!-- -->. But with Internet Explorer, commented tags aren't accessible through the DOM. Alternatively, you can use the following workaround:



<div id="xml" style="display:none">
<msdnmag>
<msj>www.microsoft.com/msj</msj>
<mind>www.microsoft.com/mind</mind>
</msdnmag>
</div>
      Embedding XML code in a page that will be viewed with one of Netscape's browsers poses a problem if you plan to retrieve it later, since Dynamic HTML (DHTML) support will be different.
      Internet Explorer 5.0 provides not only the <xml> tag, but a robust XML Document Object Model (XMLDOM) that makes data manipulation easier through DHTML. All the text included within the <xml> tag is invisible to users. Moreover, you can use inline code like this


<xml id="myXML">
<msdnmag>
<msj>www.microsoft.com/msj</msj>
<mind>www.microsoft.com/mind</mind>
</msdnmag>
</xml>
and refer to external files:


<xml src="http://expoware/myfile.xml"></xml>
To retrieve and process the XML code later with the DOM, you can assign the <xml> tag an ID. Notice that <xml> is the tag that should wrap all the XML code you want to store in the page. It is not used to indicate the root of the XML document.
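      The point that <xml> is only a wrapper, and not the root of the document, can be checked outside the browser. The following Python sketch pulls the content of an island out of the page text and parses it. The read_island helper is hypothetical, and the regular expression is a shortcut that assumes a simple, well-formed island; it is not how Internet Explorer exposes the data.

```python
import re
import xml.etree.ElementTree as ET

# The inline data island from the text, placed in a small page.
PAGE = """<HTML>
<BODY>
<xml id="myXML">
<msdnmag>
<msj>www.microsoft.com/msj</msj>
<mind>www.microsoft.com/mind</mind>
</msdnmag>
</xml>
</BODY>
</HTML>"""


def read_island(html, island_id):
    """Return the root element of the XML wrapped by <xml id=...> in the page."""
    pattern = r'<xml\s+id="%s"\s*>(.*?)</xml>' % re.escape(island_id)
    match = re.search(pattern, html, re.DOTALL | re.IGNORECASE)
    if match is None:
        raise KeyError(island_id)
    # The <xml> wrapper is not the document root; the root is what it contains.
    return ET.fromstring(match.group(1).strip())


root = read_island(PAGE, "myXML")
links = {child.tag: child.text for child in root}
```

Here the parsed root is msdnmag, while the xml element only marks where the island starts and ends in the page.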
      Taking advantage of data islands allows you to generate DHTML pages that provide the data to process while requiring only one round-trip to the server to obtain the XSL stylesheet necessary to change the layout.
      Figure 6 contains the source code that produces the page shown in Figure 3. The page maintains all the information about the employees in embedded XML code that is generated on the server accessing a database via ADO. The XML code initially refers to an XSL stylesheet for display. My sample application supports two stylesheets: table.xsl provides a tabular view, while tree.xsl offers a hierarchical view.
      When the page is loaded, the XML code is retrieved and converted to HTML via the XMLDOM, which is exposed as an object by the Microsoft.XMLDOM component. The string obtained from this code


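' Read the XML text from the page and point to the XSL stylesheet
xmlData = embeddedXML.innerHTML
strURL = "http://expoware/mystyle.xsl"

' Load the data from a string
set xml = CreateObject("Microsoft.XMLDOM")
xml.async = False
xml.loadXML xmlData

' Load the stylesheet from a URL
set xsl = CreateObject("Microsoft.XMLDOM")
xsl.async = False
xsl.load strURL

' Transform the data to HTML and display it
view.innerHTML = xml.transformNode(xsl.documentElement)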
is then assigned to a <div> tag for display through DHTML. Notice that you can initialize the XMLDOM using either a file name or a string. You need to invoke the load method for a file name; loadXML does the job for a string.
      There's just one problem left to solve now: how to change the stylesheet on the fly to alter the rules that convert XML to HTML.

Applying XSL Stylesheets

      The XML code that represents the content of the page is embedded in the HTML page itself and contains a reference to an XSL file. This reference is mandatory; otherwise it would be impossible to convert the XML data to HTML. The following code links an XSL file to an XML document:


<?xml-stylesheet type="text/xsl"
    href="http://expoware/mystyle.xsl" ?>
This code should be edited at runtime to change the page layout.
      The XMLDOM allows you to edit the content of any loaded XML document. Unfortunately, the line that specifies the stylesheet is not technically part of the XML core data. It is simply a processing directive that instructs the XML parser to call the XSL processor (if supported) to work on the specified XSL file. To change the stylesheet of an XML document dynamically, you need to resort to special methods that the XMLDOM makes available through its processing instructions.
      Once you load the XML data and the XMLDOM is initialized, you can create a processing instruction through a method called createProcessingInstruction. A processing instruction looks like a normal tag. However, it doesn't need to be closed and must be wrapped with question marks. The instruction that links a stylesheet uses xml-stylesheet as its target, and its attributes are passed as a single string.


strPI="type='text/xsl' href='" & strURL & "'"
Set pi=xml.createProcessingInstruction("xml-stylesheet",strPI)
If you specify more of these instructions, only the first one will be taken into account.
      Creating a processing instruction is only the first step. Next, you insert it before the first XML node at the top of the page.


xml.insertBefore pi, xml.childNodes.item(1)
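      The two steps, creating the instruction and inserting it ahead of the data, can be tried with any DOM implementation. The sketch below uses Python's xml.dom.minidom rather than Microsoft.XMLDOM, so it is an analogy and not the column's code. The sample data and URL are placeholders. Unlike the XMLDOM, minidom does not expose the XML declaration as a child node, so the sketch inserts before the root element instead of indexing childNodes.

```python
from xml.dom import minidom

# Hypothetical data and stylesheet URL, used only for this illustration.
XML_DATA = "<msdnmag><msj>www.microsoft.com/msj</msj></msdnmag>"
STYLE_URL = "http://expoware/mystyle.xsl"

doc = minidom.parseString(XML_DATA)

# Build the instruction's data: note the quotes on both sides of the URL.
pi_data = "type='text/xsl' href='%s'" % STYLE_URL
pi = doc.createProcessingInstruction("xml-stylesheet", pi_data)

# Place the instruction ahead of the root element so it is found first.
doc.insertBefore(pi, doc.documentElement)

first = doc.childNodes[0]  # now the stylesheet instruction, not the root
```

After the insertion the instruction is the first child of the document and the root element follows it.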
      To finalize changes to the page layout, you should force the XML parser to invoke the XSL processor again to transform the XML into HTML.


view.innerHTML = xml.transformNode(xsl.documentElement)
This refreshes the entire page.
      The code for dynamically replacing the XSL stylesheet for a client-side data island is shown in Figure 6.
      The page might include a combobox with all the possible layouts, one for each XSL file available on the Web server. When the user changes the selection, the following code would run:


<SCRIPT language="VBScript"
        for="viewtypes" event="onchange">
  set cOptions = viewtypes.tags("option")
  xslFile = cOptions.item(viewtypes.selectedIndex).xsl
  strURL = "http://expoware/docview/" & xslFile & ".xsl"
  DoInit(strURL)
</SCRIPT>
DoInit is the helper routine that replaces the stylesheet and refreshes the page. If no stylesheet is specified, my example uses table.xsl as the default layout.
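      To make the role of such a helper concrete, here is a rough Python sketch of the same idea using xml.dom.minidom. It is not the DoInit routine from the sample code. The stylesheet_url and set_stylesheet functions are hypothetical, and removing any earlier xml-stylesheet instruction before adding the new one is a design choice of this sketch, made so that the document never carries more than one.

```python
from xml.dom import minidom

BASE_URL = "http://expoware/docview/"  # folder used by the handler in the text
DEFAULT_VIEW = "table"                 # table.xsl is the default layout


def stylesheet_url(view_name=None):
    """Build the XSL URL for a view name, falling back to the default layout."""
    return BASE_URL + (view_name or DEFAULT_VIEW) + ".xsl"


def set_stylesheet(doc, url):
    """Replace any xml-stylesheet instruction in doc with one pointing at url."""
    for node in list(doc.childNodes):
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            doc.removeChild(node)
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", "type='text/xsl' href='%s'" % url)
    doc.insertBefore(pi, doc.documentElement)
    return pi


doc = minidom.parseString("<employees/>")
set_stylesheet(doc, stylesheet_url())        # initial load: default view
set_stylesheet(doc, stylesheet_url("tree"))  # the user picks the tree view
instructions = [n.data for n in doc.childNodes
                if n.nodeType == n.PROCESSING_INSTRUCTION_NODE]
```

A real helper would follow this with the transformation step so that the new layout actually appears in the view.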
      The Internet Explorer browsing engine that makes all this possible is a reusable ActiveX control called WebBrowser. Programmers who use Visual Basic are already acquainted with this component, which is also available through an MFC class called CHtmlView. This class is a C++ wrapper for the major functionality of the control, including navigation and XML support.
      If you like the XML document/view model and you use the CHtmlView class (which inherits from CView) with MFC, then you might want to consider creating a new CXmlView class. This custom class could expose methods to hide the dynamic replacement of the XSL stylesheet behind easy-to-use methods and properties.

Wrap-up

      The partnership formed by XML and XSL is an excellent solution for building document/view Web applications. Data islands (especially with Internet Explorer 5.0 and higher) allow you to optimize the bandwidth necessary to refresh the page layout. Data islands could also be easily simulated with Internet Explorer 4.0 and DHTML.
      This approach works well, particularly for intranet solutions or with those Internet projects where you can ensure that your users have a recent version of Internet Explorer. You can use ASP and IIS to create pages that take advantage of the XML document/view model under Internet Explorer yet behave normally when viewed through other browsers.

Dino Esposito is a senior trainer and consultant based in Rome. He has recently written Windows Script Host Programmer's Reference (WROX, 1999). You can reach Dino at desposito@vb2themax.com.

From the June 2000 issue of MSDN Magazine.
