ISAPI Extensions: Creating a DLL to Enable HTTP-based File Uploads with IIS

MSDN Magazine

Panos Kougiouris
This article assumes you're familiar with HTML, ASP, COM, and C++
Download the code for this article: Upload.exe (70KB)
Browse the code for this article at Code Center: ISAPI Extension DLL
SUMMARY: The MIME-compliant content type, called multipart/form-data, makes writing HTML that uploads files almost trivial. On the server side though, ASP does not have a way to access data in the multipart/form-data format. The most flexible way to access the uploaded file is through a C++ ISAPI Extension DLL. This article describes a reusable ISAPI extension DLL that allows you to upload images and files without writing C++ code. It is coupled with a few COM components that make it readily reusable for ASP development. With .NET, this whole process is much easier, and this article shows preliminary code that uploads files using ASP .NET features.

Uploading files from a browser is often a Web site requirement. For instance, let's say that you have created a Web site for a real estate company. The site contains listings of properties along with a picture of each property. Realtors add new listings through an administration section of the site. Adding text for the new listing is easy using an HTML form. Then one day a realtor tells you that she wants to upload a photo of a property for sale. How would you do that with ASP?
      The W3C specs define a way to upload files from a browser to a server using a MIME-compliant content type called multipart/form-data (see the W3C HTML 4.01 Recommendation). This makes writing HTML that uploads files really trivial. On the server side though, ASP does not have a way to access data in the multipart/form-data format. There is a posting acceptor component, but its programmability is limited and it always stores the uploaded files in the file system. If, for instance, you know in advance that users will always upload small files and you want to process these files in memory, you need a more flexible solution. The most flexible way to access the uploaded file is through a C++ ISAPI Extension DLL. Microsoft Knowledge Base article Q189651 contains sample code that explains how to write such an extension.
      The problem with the solution presented in the Knowledge Base article is that you have to change this DLL for every upload you need to make. Mucking with C++ code every time you add a page that requires an upload is not very productive. In this article, I'll present a reusable ISAPI extension DLL that does the heavy lifting so that you don't have to write any C++ code to upload images and files. The parsing part of the DLL is based on the Knowledge Base example. It is coupled to a few COM components that make it readily reusable for ASP development. The first section of this article explains how to write HTML that uploads a file, and the next section explains my upload sample and how to use the components from ASP. Before I explain how the ISAPI extension DLL works, I'll describe how browsers submit forms to the server. Finally, in the last section, I'll give you a preview of the Microsoft® .NET Framework features that will make file uploading even easier.

Uploading Files

      Figure 1 shows a very simple HTML form that uses multipart/form-data encoding to upload a file. The ACTION attribute of the form element should point to a target component (for example CGI or ISAPI Extension DLL) that knows how to parse the multipart encoding and process the data. The page is rendered as shown in Figure 2. The user presses the Browse button, selects a file from the file system, and then presses the UPLOAD button to send the file. Of course you might want to hire a designer to spice up the page with graphics and other cool JavaScript tricks, but as far as the HTML is concerned it couldn't be any simpler.

Figure 2 The Page in the Browser

      On the other hand, if you use Internet Information Services (IIS) and ASP, the server-side processing couldn't be more difficult. You need to write an ISAPI DLL or CGI program that will process the request. After looking carefully into the code of a couple such DLLs, I noticed there was a pattern. The DLL would first read all the data, parse it, and put it into some sort of table. Then the DLL would use the data in an application-specific way. For instance, the application would store the file in the file system. Or the application would process the file and write some results into a database. The possibilities are endless.
      My idea was to break these two actions into two modules: an application-independent module that parses the multipart encoding and an application-dependent one that does the rest. I packaged the application-independent module into a reusable C++ ISAPI extension that stores the data in a COM object. The application-dependent part can then be written in ASP and use automation and scripting to retrieve and further process the uploaded data. Before I delve into the implementation, I will explain the architecture of my sample and how to install it.

The Upload Sample

      The sample consists of two dynamic-link libraries (DLLs). The first DLL is an ISAPI extension that should be installed in a virtual Web directory with execute permissions set. In IIS 5.0 you set the permissions through the properties of the directory in the IIS MMC snap-in. Make sure that the Execute Permissions field reads "Scripts and Executables." The second DLL is a COM DLL that should be registered on the target machine using the regsvr32 utility (see Figure 3).

Figure 3 File Upload Architecture

      Every time you need to upload files from a page, you should set the ACTION attribute of the FORM element to the path of the extension DLL. Here is an example:

  <form ACTION="bin/swupload.dll?IFERROR=/error.asp"
        METHOD="POST"
        NAME="frmIMAGE"
        ENCTYPE="multipart/form-data">

The IFERROR argument is an optional query string element that I will discuss later.
      Now all you need to do is specify the virtual path of the ASP page that will process the uploaded file. You do this by adding a hidden argument in the form. The name of this element should be REDIRECT. For instance, in my example where the ASP page is /showImg.asp, the line in the HTML file would be:

  <INPUT TYPE="HIDDEN" NAME="REDIRECT" VALUE="/showImg.asp">

      Now you're probably wondering what showImg.asp looks like. In my example, showImg.asp just returns the image and renders it in the browser. This is not very useful, but if you understand this simple example, you will be able to extend it to do more interesting things like writing the file into the server file system or into a database (see the article "Delivering Web Images from SQL Server" by Scott Stanfield to learn how to do this).
      The core of the code in showImg.asp (see Figure 4) is in the getMultiPartDictionary function. The code creates the MSDNUpload.DictMap object. This object is a dictionary of dictionaries. It then uses the DICTMAPID argument that is passed as a query string from the extension DLL to retrieve a dictionary that contains all the elements of the form! The name of each element is the same as the name in the form.
      On certain occasions, an error can occur during upload. The most common error is when a user tries to upload a very big file. The current maximum size is hardcoded in the ISAPI extension.
      Other errors might be caused by malicious users who try to find a security hole. In such a case, the ISAPI extension DLL will redirect the request to the file passed in the IFERROR argument of the query string. A string describing the error is passed to this file in the ERRORMESSAGE query string argument.
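      The error path described above can be sketched in a few lines of C++. This is only an illustration, not the code from the download: the function names, the wording of the message, and the choice to percent-encode the message are all assumptions. The sketch takes the page named by IFERROR and appends an ERRORMESSAGE argument to it when an upload exceeds a size limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Percent-encode text so it can travel as a query string value.
// Letters, digits and "-._~" pass through; every other byte becomes %XX.
std::string percentEncode(const std::string& text) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : text) {
        const bool unreserved =
            (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Build the URL of the error page, carrying the message in ERRORMESSAGE.
// errorPage is the value the form passed in IFERROR (hypothetical helper).
std::string buildErrorUrl(const std::string& errorPage, const std::string& message) {
    assert(!errorPage.empty());
    const char separator = errorPage.find('?') == std::string::npos ? '?' : '&';
    return errorPage + separator + "ERRORMESSAGE=" + percentEncode(message);
}

// Return the redirect target for an oversized upload,
// or an empty string when the size is acceptable.
std::string rejectIfTooLarge(std::size_t contentLength, std::size_t maxBytes,
                             const std::string& errorPage) {
    if (contentLength <= maxBytes) {
        return "";
    }
    return buildErrorUrl(errorPage,
                         "Upload exceeds " + std::to_string(maxBytes) + " bytes");
}
```

      Checking the declared content length before reading the body lets the extension refuse an oversized upload without first buffering it.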

How Browsers Submit HTML Forms

      Before I explain how the ISAPI extension works, let me quickly review the multipart encoding. When a browser sends a request to a server, it always sends an HTTP packet with the data describing the request. The packet always contains the virtual path of the URL. For instance, if you call https://myserver/default.asp, the packet will contain the /default.asp path. In addition, if the request is the result of submitting an HTML form, the request contains the contents of the INPUT tags in the form.
      The next question, of course, concerns how the data is encoded. It turns out that it's done using MIME encoding. The default (and simplest) encoding is application/x-www-form-urlencoded, which is described in the HTML 4.01 specification at the W3C. The application/x-www-form-urlencoded type is simple and perfect for submitting a few text fields.
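      To make the default encoding concrete, here is a minimal C++ sketch of how a server-side component might decode an application/x-www-form-urlencoded body into name/value pairs. The function names are hypothetical, and the sketch assumes the usual rules of the encoding: pairs are separated by "&", names and values by "=", a space is sent as "+", and other special bytes as %XX.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Numeric value of one hexadecimal digit.
static int hexValue(char c) {
    assert(std::isxdigit(static_cast<unsigned char>(c)));
    if (c >= '0' && c <= '9') return c - '0';
    return std::tolower(static_cast<unsigned char>(c)) - 'a' + 10;
}

// Undo form encoding for one token: '+' becomes a space, %XX becomes a byte.
std::string formUrlDecode(const std::string& token) {
    std::string out;
    for (std::size_t i = 0; i < token.size(); ++i) {
        const char c = token[i];
        if (c == '+') {
            out += ' ';
        } else if (c == '%' && i + 2 < token.size() &&
                   std::isxdigit(static_cast<unsigned char>(token[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(token[i + 2]))) {
            out += static_cast<char>(hexValue(token[i + 1]) * 16 + hexValue(token[i + 2]));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += c;  // ordinary characters and malformed escapes pass through
        }
    }
    return out;
}

// Split a body such as "a=1&b=2" into decoded name/value pairs.
std::map<std::string, std::string> parseFormBody(const std::string& body) {
    std::map<std::string, std::string> fields;
    std::size_t pos = 0;
    while (pos < body.size()) {
        std::size_t end = body.find('&', pos);
        if (end == std::string::npos) end = body.size();
        const std::string pair = body.substr(pos, end - pos);
        const std::size_t eq = pair.find('=');
        if (eq != std::string::npos) {
            fields[formUrlDecode(pair.substr(0, eq))] = formUrlDecode(pair.substr(eq + 1));
        }
        pos = end + 1;
    }
    return fields;
}
```

      Because every special byte expands to three characters, this encoding is a poor fit for binary data, which is one reason the specification offers a second encoding for files.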
      For large text files or images, the specification defines another encoding: multipart/form-data. The following snippet shows what the HTTP packet of a form submitted using the multipart/form-data encoding looks like.

  1 ---------------7d01615701d4
  2 Content-Disposition: form-data; name="Name"
  3
  4 Panos
  5 --------------7d01615701d4
  6 Content-Disposition: form-data; name="filedata"; filename="C:\Documents and Settings\panos\My Documents\My Pictures\Sample.jpg"
  7 Content-Type: image/pjpeg
  8
  9 ....

The packet is logically divided into a number of sections, one for every input element in the form. In the previous excerpt you see two sections: one that starts at line 1 and another one that starts at line 5. Every section starts with a signature (the MIME boundary). The signature is browser-specific, but it is unique enough to guarantee clear marking of the different sections (see lines 1 and 5 in the excerpt). Every section contains a header with information about the element corresponding to the section. At a minimum, it contains the name of the element in the form. In the case of a file, the file type and its location in the client file system are also passed (see lines 6 and 7). Finally, the actual data follows a blank line after the header (see lines 4 and 9).
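      The section structure just described can be parsed with a short loop. The following C++ sketch is not the parser from the download. It assumes the request body is already in memory, that lines end in CRLF, that the boundary string from the request's Content-Type header is known, and that each delimiter line is "--" followed by that boundary. FormPart, quotedValue, and parseMultipart are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct FormPart {
    std::string name;      // name="..." from Content-Disposition
    std::string filename;  // filename="...", empty for ordinary fields
    std::string data;      // bytes between the header and the next delimiter
};

// Return the quoted value that follows key=" in a header block, or "".
static std::string quotedValue(const std::string& headers, const std::string& key) {
    const std::string needle = key + "=\"";
    std::size_t pos = headers.find(needle);
    while (pos != std::string::npos) {
        // Skip matches inside a longer attribute, e.g. name= inside filename=.
        if (pos == 0 || headers[pos - 1] == ' ' || headers[pos - 1] == ';') {
            const std::size_t start = pos + needle.size();
            const std::size_t end = headers.find('"', start);
            if (end == std::string::npos) return "";
            return headers.substr(start, end - start);
        }
        pos = headers.find(needle, pos + 1);
    }
    return "";
}

// Split a multipart/form-data body into its sections.
std::vector<FormPart> parseMultipart(const std::string& body, const std::string& boundary) {
    assert(!boundary.empty());
    std::vector<FormPart> parts;
    const std::string delimiter = "--" + boundary;
    std::size_t pos = body.find(delimiter);
    while (pos != std::string::npos) {
        pos += delimiter.size();
        if (body.compare(pos, 2, "--") == 0) break;  // closing delimiter
        const std::size_t lineEnd = body.find("\r\n", pos);
        if (lineEnd == std::string::npos) break;
        const std::size_t headerEnd = body.find("\r\n\r\n", lineEnd);
        if (headerEnd == std::string::npos) break;
        const std::size_t dataStart = headerEnd + 4;
        const std::size_t next = body.find("\r\n" + delimiter, dataStart);
        if (next == std::string::npos) break;  // truncated body
        const std::string headers = body.substr(lineEnd + 2, headerEnd - lineEnd - 2);
        parts.push_back({quotedValue(headers, "name"),
                         quotedValue(headers, "filename"),
                         body.substr(dataStart, next - dataStart)});
        pos = next + 2;  // start of the next delimiter line
    }
    return parts;
}
```

      The sketch matches header attributes case-sensitively and does not handle quotes escaped inside a file name, both of which a production parser would need to consider.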

The ISAPI Extension DLL

      The ISAPI extension DLL parses the input stream and collects all the name/value pairs. Then it creates a new dictionary COM object and stores the name/value pairs in the dictionary. Finally, it generates a new ID and stores the new dictionary in the dictionary of dictionaries using that ID as the key. The ID is then passed to the ASP file through the DICTMAPID argument. Figure 5 highlights the most important code of the ISAPI extension. You can find the remainder of the code in the download linked at the top of this article.
      Although the code is fairly complete, I have to point out some limitations of the solution. On a high-traffic site or on a site where very big files are uploaded, the extension could slow down the server dramatically because it reads the whole file into RAM before passing it to the ASP page. A better implementation in these two cases would involve storing the uploaded files somewhere in the file system of the server and making the file name available to the ASP page. Another limitation is that the maximum file size is hardcoded in the extension. This should have been a configuration parameter, initialized from the registry or kept in a configuration file.
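      The hand-off between the extension and the ASP page (store the parsed form under a new ID, then redirect with that ID in DICTMAPID) can be modeled without COM. The C++ sketch below is an illustration under stated assumptions, not the MSDNUpload.DictMap component itself: it uses std::map in place of COM dictionaries, a simple counter as the ID generator, and hypothetical class and function names. It also omits the locking that a multithreaded server such as IIS would require.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// One uploaded form: element name -> raw value (text or file bytes).
using Dictionary = std::map<std::string, std::string>;

// Hypothetical stand-in for the dictionary of dictionaries.
class DictMap {
public:
    // Store a form and return the newly generated ID that keys it.
    std::string add(Dictionary fields) {
        const std::string id = std::to_string(++lastId_);
        entries_[id] = std::move(fields);
        return id;
    }

    // Copy the form stored under id into out; false if the ID is unknown.
    bool get(const std::string& id, Dictionary& out) const {
        const auto it = entries_.find(id);
        if (it == entries_.end()) return false;
        out = it->second;
        return true;
    }

private:
    unsigned long lastId_ = 0;  // assumption: a counter is enough for a sketch
    std::map<std::string, Dictionary> entries_;
};

// Build the URL of the page named in REDIRECT, carrying the ID in DICTMAPID.
std::string redirectUrl(const std::string& page, const std::string& id) {
    assert(!page.empty() && !id.empty());
    const char separator = page.find('?') == std::string::npos ? '?' : '&';
    return page + separator + "DICTMAPID=" + id;
}
```

      Passing only a short ID in the query string keeps the uploaded bytes on the server. The receiving page looks the form up by that ID, much as showImg.asp does with the DICTMAPID argument.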

Future Directions

      If you would prefer to have all this stuff bundled with your development environment and can wait a little bit, there is good news. In the .NET Framework there are two technologies you can use to dynamically render pages: ASP .NET and ATL Server. In both cases there is a very easy, built-in way to upload files.
      In ASP .NET, the form for the server-side Web control supports uploading of files. Figure 6 shows an ASP .NET page that uploads a file and stores it in the file system. Since the code is reproduced here from the samples of a prerelease version of the .NET SDK, it is subject to change.
      ATL Server uploads files into the TEMP directory. The m_HTTPRequest variable informs you of the existence of the files, and is responsible for moving the files and deleting them from the TEMP directory. For more information about how this works, check Shaun McAravey and Ben Hickman's article, "ATL Server and Visual Studio .NET: Developing High-Performance Web Applications Gets Easier" in the October 2000 issue of MSDN Magazine.

Conclusion

      ASP does not provide a way to receive files uploaded through a browser. In this article I presented the design and implementation of an ISAPI extension and a few COM objects that solve the problem for many of the common cases.
      If you are an ASP developer, downloading and building the code that accompanies this article will give you a simple but effective way to upload small files to your ASP-based Web site, without needing to deal with the complexity of ISAPI and C++. If you are a C++/COM developer, this article demonstrates how to use COM to communicate between an ISAPI extension and ASP. Then again, if you can wait a little bit longer, .NET will take care of all these issues out of the box without needing any extra code.

For related articles see:
ATL Server and Visual Studio .NET: Developing High-Performance Web Applications Gets Easier

For background information see:
ISAPI
Taking the Splash: Diving into ISAPI Programming
Panos Kougiouris is a Fellow at WebMD's Internet Division based in Santa Clara, CA. He can be reached at panos@acm.org.


From the October 2001 issue of MSDN Magazine.
