ASP is a Web technology that relies specifically on capabilities of Microsoft® Internet Information Services (IIS). For this reason, very few commercial products have attempted to provide an ASP-to-HTML converter. The problem with such converters is that you must work in conjunction with the Web server to trigger the ASP parser and find the intrinsic objects available. When you double-click on an HTML page from the Explorer shell, you simply ask the browser to retrieve and render the source code of the file. However, when you double-click on an ASP file from Explorer, you cannot ask the browser to translate it into HTML.
Practical Reasons for an ASP Converter
OK, so an ASP-to-HTML converter might not be the tool that thousands of programmers dream of every night. However, I can envision at least a couple of scenarios where such a tool would be very handy. The first scenario was mentioned by Robert Hess in the April 2000 Web Q&A column. Suppose you have several pages that require some interaction with a database on a frequently visited Web site. Writing them as ASP pages looks like the perfect solution. However, if the database is not very volatile and the page output is not highly dependent on the user's input, you could easily resort to plain old HTML for better performance.
For example, a list of suppliers is probably the kind of data that you would update only a few times a year. Why rebuild that list on the fly each time it's requested, when a static HTML page would incur less overhead?
An ASP-to-HTML tool could be used as a kind of batch compiler for ASP pages. You write them as server-side resources, and then when you realize they are not particularly dependent on runtime conditions, you can transform them into static HTML pages with either the .asp or .htm(l) extension.
While I'm on the subject, let me point out a significant improvement in the management of scriptless ASP pages that's available with IIS 5.0. Up through IIS 4.0, all resources with a .asp extension were subject to parsing, whether or not they contained script code. With IIS 5.0 this drawback has been eliminated because IIS checks for <%...%> blocks before loading the ASP parser.
An ASP-to-HTML converter would also be handy when you need to view ASP pages offline. For example, a client recently asked me about the possibility of using a single development environment for building both Web sites and CDs. I first considered using static HTML pages that could be viewed over the Web or in a local browser, but the idea was soon dismissed given the complexity and the amount of content involved. Also, my client could not guarantee any particular software configuration on the user's machine, and the only product that could be supplied with the CDs was Microsoft Internet Explorer or a custom Web browser.
ASP looked like the natural choice for the Web side of the project, but what about the CD? To make ASP work offline without a Web server, you need code that extracts all the <%...%> code blocks from the page and processes them. In addition, this module would have to provide a simulated ASP object model and take care of collecting the portions of plain HTML text. Then it would have to put it all together, combining the static HTML code with the output of the processed scripts.
In this column, I will discuss the architecture of the offline ASP viewer and some implementation details. In particular, I'll show you how to emulate the behavior of the ASP Response object. Next month, I'll finish up the code, covering Request and Server plus some other related topics. This month's code shows the potential of this approach and works with typical ASP pages, though it is not comprehensive. I won't cover other ASP objects such as Session or Application because they are rarely needed in local scenarios.
The Browser's Role
To emulate ASP while working offline, you need a little help from the browser. Basically, the browser must be able to detect whether the page to which it's about to navigate is a URL or a local path name and whether it contains the .asp extension. If the user is calling a URL, the browser does what it would normally do. Otherwise, it calls a custom module to locally parse the content of the ASP file.
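The detection step described above can be sketched in a few lines. This is a language-neutral illustration in Python rather than the Visual Basic used for the browser in this column; the function names and the list of remote schemes are assumptions made for the example, not part of the original code.

```python
from urllib.parse import urlparse

# Schemes treated as "remote" in this sketch (an assumption, not from the article).
REMOTE_SCHEMES = ("http", "https", "ftp")


def is_remote_url(target: str) -> bool:
    """True when the target names a resource served over the network."""
    # A Windows path such as c:\pages\foo.asp parses with a one-letter
    # "scheme" (the drive letter), which is not in REMOTE_SCHEMES.
    return urlparse(target).scheme.lower() in REMOTE_SCHEMES


def needs_local_parser(target: str) -> bool:
    """True when the target is a local path that ends with the .asp extension."""
    if is_remote_url(target):
        return False
    # Ignore any query string before checking the extension.
    path = target.split("?", 1)[0]
    return path.lower().endswith(".asp")
```

A browser built this way would hand the target to the local parser only when needs_local_parser returns True, and fall back to normal navigation in every other case.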
Furthermore, the browser is involved when the ASP page that will be emulated contains forms and hyperlinks. (I'll discuss this further next month.) Given these requirements, to deal with ASP pages offline you need a customized version of the browser. While subclassing Internet Explorer or Netscape Communicator is always possible, I suggest you write a brand new browser from scratch using existing Web browser technology such as the Microsoft WebBrowser control. While I'll use Visual Basic® here, you can also use C++. As a good starting point in C++, you can try the MFCIE or ATLBrowser samples, both of which come with the latest Platform SDK. In Figure 1 you can see the layout of the browser. For illustration, I've divided the client area into three blocks: one for the actual HTML rendering, one for the original ASP text, and one for the expanded HTML text.
Figure 2 shows the code for the browser.
|Figure 1 The Custom ASP Browser |
During the form's initialization, a new CAspParser object is created and set to work properly. Once you've clicked the Go button, the browser detects whether you're calling the ASP page locally or over HTTP, and acts accordingly. All the logic is hidden in the CAspParser class, which exposes three public functions: Initialize, SetScriptControl, and ParseTextToFile. Initialize makes sure the scripting environment is properly initialized and ready to work. Through SetScriptControl, the class receives the working instance of the script environment (more on this later). ParseTextToFile parses the content of the given ASP file and creates an output stream. Basically, the parser reads the whole content of the ASP file into memory and then walks through it. It locates any occurrence of "<%", then copies the text that precedes "<%" to the output buffer, and starts a new search for the closing tag, "%>". The command text is extracted and processed separately. Any output is then appended to the response buffer.
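The scanning loop just described (find "<%", copy the preceding text, find "%>", extract the command) can be sketched as follows. This is a minimal Python illustration of the idea, not the CAspParser code from Figure 4; the function name split_asp and the tuple-based segment format are assumptions made for the example.

```python
def split_asp(text: str):
    """Split ASP source into ("html", text) and ("script", code) segments."""
    segments = []
    pos = 0
    while True:
        start = text.find("<%", pos)
        if start == -1:
            # No more script blocks: the remainder is plain HTML.
            if pos < len(text):
                segments.append(("html", text[pos:]))
            break
        if start > pos:
            # Text that precedes "<%" goes to the output as is.
            segments.append(("html", text[pos:start]))
        end = text.find("%>", start + 2)
        if end == -1:
            raise ValueError("unterminated <% block")
        # The command text is extracted so it can be processed separately.
        segments.append(("script", text[start + 2:end].strip()))
        pos = end + 2
    return segments


page = "<% X=1 %>\nThe value of X is <%= X%>"
segments = split_asp(page)
```

Each "html" segment would be appended to the response buffer directly, while each "script" segment would be handed to the script engine.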
The script code in the body of an ASP page may contain references to the intrinsic objects that form the ASP object model. These well-known objects are listed in Figure 3. IIS is responsible for making these objects (plus two more: ASPError and ObjectContext) available in the script's namespace when the parser is about to process the content of the various code blocks. To obtain an ASP parser that works outside the Web server, you should provide a replacement for these objects, which means building a client-side ASP object model.
A Client-side ASP Object Model
One of the problems with Web applications is the inability to maintain state when working over HTTP. State is the ability to associate variables and objects with a particular user. A tool to store individual settings and resources can solve the problem. This is what the Session and Application objects provide, albeit at different levels. But you don't always need to implement this feature in a client-side ASP object model. In fact, a local ASP page is normally accessed by one user at a time and state management is a far less important issue.
From the perspective of an offline ASP viewer, the key ASP objects are Response and Request because they provide the basic functionality that makes a page interact with the rest of the world. Whether you need to implement all or a part of the standard methods and properties depends on your particular project.
Although ASP is tightly integrated with IIS, Microsoft Transaction Server (MTS), and COM+ environments, this doesn't mean that you cannot use a unified, yet ASP-based approach for the concurrent development of products that deliver content through different media (like the Web and CDs). Offline pages consumed without the intervention of the Web server are normally much simpler and don't need all the features of an online Web application. Based on my personal experience, I suggest you implement a minimal set of features (similar to those I discuss here) and then extend the set when your pages need to support extra ASP features.
I deployed the first version of my project with only Response and Request objects. In particular, I only implemented the Write method of the Response object, and just for the HTML content type. Request only exposed the QueryString collection. In a second step, I added support for Response.End and the Request's Form and ServerVariables collections. Later, I also added some special features such as new environment variables and new offline-only objects, including Scripting.FileSystemObject.
The key questions concern how you simulate the Response or Request object and how you run all the script code that an ASP file contains. To execute script code, you can either take advantage of the Microsoft Script Control, a downloadable component (see http://msdn.microsoft.com/scripting), or use the raw Windows Script COM interfaces. For a primer, look at the Extreme C++ column in the August 1997 issue of Microsoft Internet Developer. Since I'm developing an application in Visual Basic, using the Script Control is the natural choice.
The Script Control
ScriptControl is an ActiveX® control without a user interface that wraps all the Windows Script interfaces needed for dialog with a script language parser. It has a Language property through which you select a language. VBScript and JScript® are the two usual options, but provided you have a compliant parser, any scripting language is fine. Francesco Balena covered the ScriptControl in detail in the July 1999 issue of MIND (see "Exploring the Microsoft Script Control"). Tobias Martinsson's article, "Active Scripting with PerlScript," in the August 1999 issue of MIND, explores the use of Perl with ASP.
When it comes to using the ScriptControl you need to do three things: set up the language, add as many objects as you want to the script namespace, and execute the script code. In my special edition browser, I set the language to VBScript during the form load event. At the same time, I create instances of all the objects I want to be visible to the script engine at runtime. The concept of named items that are visible to the parser at runtime warrants further explanation. The whole set of named items forms the script's namespace.
A Windows Script parser (such as the Microsoft parser for VBScript) receives a vocabulary of known names at startup. This dictionary contains the language's keywords and global resources such as variables, objects, and subroutines. Behind each name (such as MsgBox) there's a programmable entity, whether it is a parser-specific function or the method of a certain in-process COM object. You can add new names to this namespace. Better yet, the interface of the ScriptControl (and thereby the Windows Script programming interface) allows you to do this in a very handy way. Look at the following code snippet:
Set m_objResponse = CreateObject("MyASP.Response")
m_objScriptCtl.AddObject "Response", m_objResponse
Through the AddObject method, the ScriptControl adds a named item called Response to the script namespace. From then on, it is considered a language item. Each call to this element is automatically routed to the COM object you specified as the second argument of AddObject. Those two lines are part of the CAspParser.Initialize method and m_objScriptCtl is the instance of the ScriptControl that is going to be used for script processing.
Once you execute those lines, any script code you run through that instance of the ScriptControl recognizes Response as a keyword and uses MyASP.Response to work with it. It's a very common technique in scripting. Incidentally, this is the same technique that allows IIS to inject the true ASP object model in the scripting context of a server-side ASP page. This workaround also makes it possible for Windows Script Host (WSH) scripts to rely on a system-provided WScript object.
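As a rough analogy only, the idea of injecting a named item into a script's namespace can be reproduced with Python's built-in exec, which accepts a dictionary of global names. Nothing here uses the ScriptControl or COM; the FakeResponse class and the run_with_named_items helper are hypothetical names introduced for this sketch.

```python
class FakeResponse:
    """Hypothetical stand-in for the object registered under the name Response."""

    def __init__(self):
        self.buffer = []

    def Write(self, text):
        # Accumulate output instead of sending it anywhere.
        self.buffer.append(str(text))


def run_with_named_items(code: str, named_items: dict) -> None:
    """Run script code with the given names visible as globals.

    This plays the role that adding named items plays for a script engine:
    the script refers to Response without ever creating or importing it.
    """
    namespace = dict(named_items)
    exec(code, namespace)


response = FakeResponse()
run_with_named_items('Response.Write("Hello, world!")', {"Response": response})
output = "".join(response.buffer)
```

The script text never declares Response, yet the call is routed to the host-supplied object, which is the same effect the column obtains through AddObject.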
Call in Action
When the browser's main form is ready to parse and display the ASP code, it calls the ParseTextToFile method, which takes two file names: the source ASP file and the target HTML file. When the method returns successfully, the form simply navigates to the newly created local HTML page. The full source code of the CAspParser class is shown in Figure 4. Let's see how it works step by step on a very simple ASP page:
<% X=1 %>
<% Response.Write "Hello, world!" %>
The value of X is <%= X%>
The CAspParser class initializes the script control by setting the script language to VBScript (this is not strictly necessary since the ScriptControl already defaults to it), and adding a brand new instance of the MyASP.Response object to the namespace. Control then passes to the ParseTextToFile method. It receives the name of the ASP file, verifies it has an ASP extension, and reads in all of its content. I used the Scripting.FileSystemObject for clarity only (see Figure 4). Using the CreateFile API or another I/O technique could give you better performance.
The string with all the ASP content is then parsed for <%...%> blocks. All the text outside of these markers is written to the Response object. It accumulates the text into an internal string buffer that emulates the stream where the real ASP Response object writes. In this way, the simulated Response object caches all the output, just as the real ASP Response does when buffering is on. Note that under IIS 5.0 buffering is on by default, while it was turned off by default in earlier versions of IIS.
In Figure 4, the Response.Clear method is used to clear any buffered text that you accumulated through repeated calls to Response.Write. This Clear method plays exactly the same role it does in the real ASP object model you're used to on the server.
Now let's have a closer look at the implementation of the simulated ASP Response object. To further illustrate the language neutrality of COM and to avoid the problem of writing objects in Visual Basic with the same method names as some language keywords (such as Write or End), I decided to write the MyASP objects using ATL and Visual C++®. The implementation of the MyASP.Response object is straightforward (see Figure 5). The MyASP objects need to expose methods with signatures that match the way you're using them in your client-side ASP pages. If you're using the client-side ASP engine to work on specific client-only pages, then there's no particular reason for you to use a custom object that mimics the ASP's Response. You are better off writing a completely custom object with the programming interface you prefer. The need to mimic the signatures of ASP intrinsic objects arises when you're writing dual pages to be used on the Web as well as locally on a CD.
When you invoke the Write method on MyASP.Response, the text you pass in is added to an internal member variable that's ready for return to the caller. This behavior mimics exactly what the ASP Response object does internally when buffering is on. The Clear method empties the buffer. MyASP also implements a property called ResponseBuffer that returns the current content of the output buffer. This property works in much the same way as the ASP Flush method. Each time you read it, its contents are cleared. IIS itself manages to send the transformed text to the browser via HTTP. Consequently, there's no need to make the internal buffer available to the scripts in the ASP page. In fact, the ASP Response object doesn't have a method or property (such as ResponseBuffer) that returns the text accumulated in the internal buffer. In this client-side emulation, the browser needs to get the transformed text from the object, and a property is more helpful than a subroutine like Flush.
Finally, the End method sets an internal variable to false. This variable is exposed through the CanContinue property and is used to stop the loop that governs the parsing of the ASP text. As you can see, the programming interface of the MyASP.Response object is similar, but not identical, to the ASP Response object. The logic behind the two objects is shared to some extent, but it clearly differs as the working context of the client and server-side editions of Response requires.
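The behavior described for MyASP.Response (Write, Clear, End, CanContinue, and a ResponseBuffer that empties itself when read) can be modeled compactly. The sketch below is in Python, not the ATL/C++ of Figure 5, and the class name OfflineResponse and the snake_case member names are assumptions made for the example.

```python
class OfflineResponse:
    """Minimal model of the buffered, client-side Response object."""

    def __init__(self):
        self._chunks = []           # accumulated output, as with buffering on
        self._can_continue = True   # set to False by end()

    def write(self, text) -> None:
        """Append text to the internal buffer."""
        self._chunks.append(str(text))

    def clear(self) -> None:
        """Discard anything buffered so far."""
        self._chunks.clear()

    def end(self) -> None:
        """Signal the parsing loop that it must stop."""
        self._can_continue = False

    @property
    def can_continue(self) -> bool:
        return self._can_continue

    @property
    def response_buffer(self) -> str:
        """Return the buffered text and empty the buffer, as described above."""
        text = "".join(self._chunks)
        self._chunks.clear()
        return text
```

Reading response_buffer twice in a row returns the text once and then an empty string, which matches the read-and-clear behavior the column attributes to ResponseBuffer.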
In Figure 6 you can see that both the custom browser and Internet Explorer render my simple page in the same way. If you open Explorer in the folder that contains the specified ASP page and double-click the item, in most cases Visual InterDev® will open because it is the program that is usually registered to edit ASP files. If you want to be able to double-click on ASP files and see their content, you could associate them with a program like Visual InterDev. However, remember that a generic ASP page might be using objects like Session or Application that the client-side parser doesn't support.
|Figure 6 The Custom Browser versus Internet Explorer |
Consider a page like the following, which is nearly identical to the previous one except for a Response.End statement.
<% x=1 %>
<% Response.Write "Hello, world!" %>
<% Response.End %>
The value of X is <%= x%>
Figure 7 shows that the End method correctly stops the processing. If you're confused by the truncated output in the HTML textbox, don't be too concerned. Try viewing the same document through Internet Explorer and HTTP and you'll see that the HTML the browser receives from the Web server is exactly the same.
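To see why the output is truncated, it helps to model the parsing loop together with the End flag. The following Python sketch is a toy: run_block understands only Response.Write with a quoted literal and Response.End, and silently ignores everything else, whereas the real browser hands each block to the script engine. All names here are hypothetical.

```python
import re


class Response:
    def __init__(self):
        self.buffer = []
        self.can_continue = True

    def write(self, text):
        self.buffer.append(text)

    def end(self):
        self.can_continue = False


def run_block(code, response):
    """Toy stand-in for the script engine (handles two statement forms only)."""
    code = code.strip()
    if code.lower() == "response.end":
        response.end()
        return
    match = re.fullmatch(r'Response\.Write\s+"(.*)"', code, flags=re.I)
    if match:
        response.write(match.group(1))
    # Any other statement (for example x=1) is ignored in this sketch.


def render(text):
    """Walk the page; stop as soon as a block has called Response.End."""
    response = Response()
    # With a capturing group, re.split alternates: HTML, script, HTML, ...
    parts = re.split(r"<%(.*?)%>", text, flags=re.S)
    for index, part in enumerate(parts):
        if not response.can_continue:
            break
        if index % 2 == 0:
            response.write(part)
        else:
            run_block(part, response)
    return "".join(response.buffer)


page = ('<% x=1 %>\n<% Response.Write "Hello, world!" %>\n'
        '<% Response.End %>\nThe value of X is <%= x%>')
html = render(page)
```

Everything that follows the block containing Response.End, static text included, never reaches the buffer.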
|Figure 7 Using Response.End |
A possible stumbling block in the conversion process is the meaning of the = sign, which is often used within <%...%> code blocks to denote Response.Write. In fact,
The value of X is <%= x%>
is exactly the same as
The value of X is <% Response.Write x %>
To deal with this particular situation (and other similar circumstances) I've added the ResolveAmbiguity method to the CAspParser class. Each time a script command begins with "=", the method replaces that character with a call to Response.Write.
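The rewriting rule is simple enough to express in a few lines. The Python function below is a hypothetical re-creation of the idea behind ResolveAmbiguity, not the Visual Basic method from Figure 4.

```python
def resolve_ambiguity(command: str) -> str:
    """Rewrite the <%= expr %> shorthand as an explicit Response.Write call."""
    stripped = command.lstrip()
    if stripped.startswith("="):
        # Drop the "=" and emit the equivalent statement.
        return "Response.Write " + stripped[1:].strip()
    # Commands that do not start with "=" are passed through untouched.
    return command
```

The rewritten command can then be executed like any other statement, so the script engine never sees the shorthand form.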
A More Complicated Page
So far I've worked with a very simple ASP page. Let's use the client-side parser to work with a more complex ASP page that involves databases.
Figure 8 shows an ASP page that fills and displays a table with a few records taken from an OLE DB data source. Despite the use of ActiveX Data Objects (ADO), the structure of the page is relatively simple. To make this example more realistic I would need to implement the Request object and the simulation of the POST and the GET HTTP commands. I'll cover those topics next month.
|Figure 9 The Page Rendered in Internet Explorer |
In Figure 9 you can see how Internet Explorer renders this page. Figure 10 shows that the offline parser renders it the same way.
|Figure 10 The Page Rendered in the ASP Browser |
Let's go back to the page in Figure 8, which imports a Cascading Style Sheets (CSS) file:
<link rel="stylesheet" href="tablestyles.css">
The file name is not a URL and does not contain path information. This means that both IIS and the ASP offline browser will look for it in the current folder, where the hosting ASP page resides. You can also specify a fully qualified path name for the CSS and it will work fine as well.
What happens if the page contains hyperlinks? If the anchor tag points to an existing absolute URL, then everything will work normally as the browsing engine simply navigates to the specified location. If the hyperlink refers to a relative URL that does not contain the protocol and Web server name, such as
<a href="seminars.htm">Click here</a>
then two things happen. First, the browser attempts to locate the specified page in the current location. If the page with the link is c:\pages\foo.asp, then seminars.htm is assumed to be in c:\pages. An HTTP 404 error is returned if it isn't there.
Finding the page, however, doesn't mean that the browser knows how to handle it. The browser certainly knows how to handle .css, .htm, .js, or .vbs files. But when the linked page has an ASP extension, the scenario suddenly changes.
<a href="seminars.asp">Click here</a>
The browser completes the name of the referenced file with the current path name. If you're clicking from http://server/pages/foo.asp then the browser attempts to navigate to http://server/pages/seminars.asp. In an offline scenario, though, you're clicking from something like c:\pages\foo.asp, therefore the absolute page will be c:\pages\seminars.asp. If you try to type this path name in the address bar of Internet Explorer, a dialog will promptly ask you whether you want to download the file or open it from its current location. (A similar dialog box also appears with Netscape and other browsers.)
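The way a relative link is completed with the location of the current page can be illustrated with a small helper. This Python sketch uses the standard ntpath module so that Windows path rules apply on any platform; the function name resolve_link is an assumption, and real browsers perform this resolution internally.

```python
import ntpath
from urllib.parse import urljoin, urlparse

WEB_SCHEMES = ("http", "https")


def resolve_link(current_page: str, href: str) -> str:
    """Complete a link relative to the page that contains it."""
    if urlparse(href).scheme.lower() in WEB_SCHEMES:
        # Absolute URL: nothing to complete.
        return href
    if urlparse(current_page).scheme.lower() in WEB_SCHEMES:
        # Online: resolve against the URL of the current page.
        return urljoin(current_page, href)
    # Offline: resolve against the folder of the current local file.
    folder = ntpath.dirname(current_page)
    return ntpath.normpath(ntpath.join(folder, href))


online = resolve_link("http://server/pages/foo.asp", "seminars.asp")
offline = resolve_link(r"c:\pages\foo.asp", "seminars.asp")
```

With the paths used in the text, the online case yields a URL on the same server and the offline case yields a file in the same folder as the page that contains the link.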
The key point is that the browser doesn't know how to cope with an ASP page without the help of an ASP-enabled Web server. No browser includes a client-side runtime engine capable of parsing and expanding ASP pages like the one I'm building here. Figure 11 explains the typical browser's schema for navigation. While the schema has been drawn with Internet Explorer in mind, it is general enough to be extended to all browsers. There are at least three ways a user can ask the browser to move to a URL: from the address bar, through page links, or via scripting. In all cases, the request is queued to an internal module that prepares the actual HTTP request for the Web server or manages to do otherwise for local files. When you ask the browser to navigate to a local ASP page, there's nothing predefined this module can do. Win32®-based browsers usually look in the registry for the application registered to open ASP files. Often this application is Visual InterDev.
|Figure 11 Web Browser Navigation |
To follow a link to an ASP page in a client-side ASP environment, you need a customized browser. The ASP browser utilizes the WebBrowser control to display the page. The component traps all clicks on hyperlink tags and processes them the usual way, through the Internet Explorer standard navigation module. You need to prevent the standard browser's engine from getting involved when the user wants to follow a link to a relative ASP page. The WebBrowser control raises an event, BeforeNavigate2, each time it is about to navigate to another URL. This event passes a Boolean Cancel argument that you can set to True if you want to prevent the default operation from taking place.
Figure 12 shows how to write the code that redirects any link to a local ASP page. LocalNavigate is the same subroutine that gets called when you click on the Go button. Figure 13 shows that the hyperlink works.
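The decision taken inside the event handler can be modeled without any browser at all. In the Python sketch below, on_before_navigate returns the value that would be assigned to the event's Cancel argument; the AspBrowser class, the is_local_asp helper, and the callback wiring are all hypothetical and do not reproduce the code in Figure 12.

```python
from urllib.parse import urlparse


def is_local_asp(url: str) -> bool:
    """True for a non-HTTP target whose name ends in .asp (sketch assumption)."""
    if urlparse(url).scheme.lower() in ("http", "https", "ftp"):
        return False
    return url.split("?", 1)[0].lower().endswith(".asp")


class AspBrowser:
    def __init__(self, local_navigate):
        # local_navigate is the routine that parses the ASP file locally
        # and displays the generated HTML (the role LocalNavigate plays above).
        self.local_navigate = local_navigate

    def on_before_navigate(self, url: str) -> bool:
        """Return True to cancel the default navigation."""
        if is_local_asp(url):
            self.local_navigate(url)
            return True
        return False


handled = []
browser = AspBrowser(handled.append)
cancel_local = browser.on_before_navigate(r"c:\pages\seminars.asp")
cancel_remote = browser.on_before_navigate("http://server/pages/seminars.asp")
```

Note that the generated HTML file is itself a local target, but because it does not carry the .asp extension the handler lets the control navigate to it normally.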
|Figure 13 Redirecting to ASP in Action |
Note that for completeness you should also implement your own history mechanism. The standard one won't work because the browser doesn't know how to move back and forth between local ASP pages. To create a custom history mechanism you can use a special system folder for persistence and a collection to keep information in memory.
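The in-memory half of such a history mechanism might look like the following Python sketch. It covers only the collection that tracks visited locations; persistence to a folder is left out, and the History class and its method names are assumptions made for illustration.

```python
class History:
    """Back/forward list for locations the custom browser has displayed."""

    def __init__(self):
        self._entries = []
        self._index = -1  # position of the current entry, -1 when empty

    def visit(self, location: str) -> None:
        """Record a new location, discarding any forward entries."""
        del self._entries[self._index + 1:]
        self._entries.append(location)
        self._index += 1

    def current(self):
        return self._entries[self._index] if self._entries else None

    def back(self):
        """Move one step back if possible and return the current location."""
        if self._index > 0:
            self._index -= 1
        return self.current()

    def forward(self):
        """Move one step forward if possible and return the current location."""
        if self._index < len(self._entries) - 1:
            self._index += 1
        return self.current()
```

The Back and Forward buttons of the custom browser would call back and forward and then pass the returned location to the same routine used by the Go button.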
Conclusion
You can use the Microsoft ScriptControl to execute any VBScript or JScript code block, and to populate the scripting context with custom objects. This way, you can emulate all the ASP intrinsic objects and add new ones as well. A browser that can handle client-side ASP pages must behave in slightly different ways than regular Web browsers. This is particularly true of hyperlinks, navigation, and forms. In this column, I added client-side support only for the ASP Response object. Next month (October 2000), I'll cover the Request object and form management. Stay tuned.