Linking Office and the Web Via Script
October 16, 2000
Note The code sample download originally included with this article is no longer available.
I get quite a few requests from customers about how best to integrate Microsoft Office documents and the Web. This can mean a multitude of requirements and expectations, but I thought I'd cover some of the ways that you can use script as the glue to integrate Office content within a Web application. I'm going to cover two main areas of Office integration:
- Generating Office content.
- Integrating Office content into a Web application.
To illustrate this, I'm going to use the mythical fabrikam.com Web application (and, of course, that means that no association with any real company, organization, product, domain name, e-mail address, logo, person, places, or events is intended or should be inferred), and I'll build on the simple example I covered in my April article. The fabrikam.com Web site allows script authors to upload useful scripts, which can be shared among other script authors who subscribe to the site. The plan was to integrate a billing mechanism for scripts at some point. Script authors would also be able to advertise their script-writing abilities, and fabrikam.com would link those authors to people who wanted to commission scripts.
For our purposes, imagine that the billing and matching functionalities are being developed—and some behind-the-scenes systems, invisible to the Web customer, must manage the legal requirements for prospective script vendors. In addition, reporting mechanisms are required so that the folks who actually run fabrikam.com can see what's going on.
To sign up a script vendor for the work-for-hire matching capabilities, we need the vendor to sign a contract and return it to fabrikam.com. In the future, this could all be done electronically; until that day arrives, a paper contract with the appropriate signatures is the only documentation that the fabrikam.com lawyers will accept.
Whenever a customer signs up to be a provider on the service, the fabrikam.com staff must cut and paste information from the sign-up Web page into a Microsoft Word document, which can then be sent to the applicant for signatures. The current workload makes this process acceptable, if not efficient, but as the business grows, the administrative staff will need a much more automated system. Automation will allow the administrative staff members to use their time more efficiently, and it will also cut down on the number of mistakes in each contract because the system will no longer rely on a manual, cut-and-paste transfer from one application to another. Since each contract may be changed to meet a customer's requirements, it's vital that each original contract be saved for future reference.
To meet the requirements of the administrative department and keep a copy of the contract, fabrikam.com has built a system using Active Server Pages (ASP) technology, Visual Basic Scripting Edition (VBScript), ActiveX Data Objects (ADO), SQL Server, and Microsoft Word 2000.
The key to building the system is the ability of Word 2000 to save and load documents in HTML. This ability allows the standard script running in an ASP page on the server to build up a Word document just as it would build a normal HTML page. By using the content type property on the Response object, you can ensure that the document will be rendered in Microsoft Word, which provides a number of advantages over plain HTML. In this application, it provides the ability to control the printing and margins (every contract I've seen has the most Byzantine margin settings), and allows you to embed Visual Basic for Applications (VBA) macros that take advantage of the Word object model.
To create the contract based on customer data, the fabrikam.com system uses two ASP pages: one to select the customer and the other to generate the contract and send a cover letter to the customer. The cover letter is based on a template, but the customer-service representative can add to it to meet the needs of the customer. Once the cover letter is complete, the customer-service representative clicks the Print button in Word. I considered storing only the data that is merged with the document, rather than saving the entire document, but because it's a contract, it's important that the exact file sent out to the customer be stored for future reference.
The main user selection page, users.asp, is pretty simple. It uses an ADO connection to query a database for all the customers, and it allows you to search for a specific customer based on customer data. When the customer-services user opens the users.asp page and clicks the Create link next to the customer name, a request is sent to the createcontract.asp page on the fabrikam.com server. The request includes the customerid, which the VBScript code in the ASP page uses to retrieve customer information from the database. Once the information has been received, it needs to be inserted in the relevant areas of the document.
I mocked up both the cover letter and the contract in Word 2000, including a page break between the cover letter and the contract. I then used the editing capabilities of Word to format the document exactly how I wanted it. Once the document was formatted, I saved it as HTML and renamed it as an ASP page.
It was easy to replace the placeholders with the relevant fields from the database. When the page is requested, the ASP page runs the VBScript code and generates the resultant HTML for the browser. So far, this is nothing special; people have been doing this for ages (at least, for Internet ages). To integrate the result of the database query into Office, you must set the content type of the response via the ASP page. If the content type is left to the default (HTML), the HTML will be displayed in the browser just like any other HTML page. Changing the content type to application/msword changes the HTTP response mime type to msword, which will instruct the browser to send the content to Word 2000 rather than to the HTML renderer (this will work in older versions of Word as well).
Depending on your security settings, you may get a security dialog box that asks whether you want to open the content. This is important, because the Word document may include VBA macros that you prefer not to run. If you open the Word document, the default security setting is to run only signed VBA macros, so unless the VBA macro is signed by a source you trust, it won't be allowed to run. Now that you've changed the content type, ASP dynamically creates the Word document.
<% Response.ContentType = "application/msword" %> <html> <% set conIncident = server.CreateObject ("ADODB.Connection") ' Build the SQL string to call the get script stored proc. strSQL = "select * from dbo_users WHERE ((dbo_Users.UserID)=" & request("id") & ")" conIncident.Open "fabrikam" ' Execute the sql. Set result = conIncident.Execute(strSQL) %> <head> <link rel=File-List href="createcontract_files/filelist.xml"> <link rel=Edit-Time-Data href="createcontract_files/editdata.mso"> </head> <body> <p align="right"><%=formatdatetime(now,2)%></p><br><br> <%=result("firstname")%> <%=result("lastname")%><br> <%=result("address")%><br> <%=result("city")%><br> <%=result("state")%> <%=result("PostalCode")%><br> <%=result("country")%><br><br> <p>Dear <%=result("firstname")%><br><br> Thank you for becoming part of fabrikam.com and its exciting work-for-hire process. Please find attached the contract for you to sign and return to us to get things started. It will be sent out today and should be in your mailbox soon. <br> <br> Thanks again, <br> <br> Fabrikam.com Team <br> <br> This letter was created at <% response.write formatdatetime(now) %> </p> <br clear=all style='page-break-before:always'> <--! Contract stuff goes here; check in the linked file for the details. --> </body> </html> <% set conIncident=nothing set result = nothing %>
The obvious question at this point is: Why not just do a mail merge to achieve the same thing? Mail merge has been around for a long time and works pretty well, but the machine performing the merge has to have direct access to the database—a scenario that is becoming increasingly rare. When you process the document before it reaches the client machine, a number of new capabilities are open to you.
- You can lock down access to the database server. This means you put your database server behind the firewall and have all the public access come via front-end Web servers. A setup like this is often called a DMZ (demilitarized zone) and makes securing your data much easier, because the only queries allowed originate from a small (relative to the Internet) number of Web servers.
- You can take advantage of multiple-tier architecture. Because an ASP page is generating the document, you can query the database via a business component that encapsulates both the business logic and the database access. This can significantly improve the performance and scalability of your application, because the business component can provide a more basic view of the data, enabling you to concentrate on writing your business application rather than on the intricacies of database access. For simplicity's sake, this sample uses script in an ASP page, but it could instead use a Windows Script Component or a COM component developed in any language.
- You can control where customization code is run. We've all been affected to some extent by macro viruses, and people are becoming increasingly concerned about downloading code to client machines. When you produce a document on the server, you can decide where it makes sense to run your code, and you can still include custom code in the generated document if it is required.
- You can have programmatic control over what the document will include. Because you are using script to determine what goes into the document, you can take advantage of the script's control flow features (such as if, do while, for each, etc.), rather than being limited to the static mechanisms that mail merge provides.
- You can use standard HTTP requests to access the document content.
The example above uses Microsoft Word 2000, but the content type trick also works for other document types. The fabrikam.com site uses that ability to create documents for its reporting mechanism. At the end of each month, salesreport.asp is run to create a Microsoft Excel spreadsheet that is sent to all the executives, so they can be up to date on how the company is doing. Creating an Excel spreadsheet from a database query is just as simple as creating a Word document. You just change the content type to application/vnd.ms-excel, and the response from the ASP page will be sent into Excel. The sample I've provided even has a chart that automatically updates itself from the content. If you take a look at the source for salesreport.asp, you'll notice that the chart is described in XML, so you could even build different charts based on what the user wants. (You can view the complete code for salesreport.asp; use the link at the top of this article, "View and download the source code for this article.")
The examples so far have used ASP pages to generate the Office content, which is fine for a hyperlink from a Web page, but most Office users won't be looking for .asp files. Instead, they will expect a .doc or .xls file. Luckily, you can achieve exactly the same results as an ASP page but still use the .doc or .xls extensions. The key to this is the way that Internet Information Services (IIS) maps document types on the server. IIS provides a way to specify which ISAPI DLL should handle certain document types. For example, ASP pages are handled by asp.dll. This document mapping is on a per-application basis, so you can have different settings for each folder on your Web site.
To make the fabrikam.com server easier for users to navigate, the reports and contract folders on the Web site have been changed into applications, and the document mapping for each has been configured so that asp.dll handles .doc and .xls files. This configuration isn't a good idea for general-purpose folders, because the binary files will confuse the ASP handler, but for these particular folders, I know that every .doc and .xls file will really be HTML/ASP. It's a small thing, but it allows a user to create a shortcut to a Word document, which is much more natural than a shortcut to an ASP page.
Andrew Clinick is a program manager in the Microsoft Script Technology group, so chances are, if there's script involved, he's probably had something to do with it. He spends most of his spare time trying to get decent rugby coverage on U.S. television and explaining cricket to his American colleagues.