Simon Guest
Microsoft Corporation
August 2005
Applies to:
Microsoft .NET Framework 1.1
Microsoft Visual Studio .NET 2003
Microsoft Office Word 2003
BEA WebLogic Workshop 8.1.4
IBM Rational Application Developer 6.0
Microsoft Windows XP Professional
Red Hat Linux 9.0
Summary: This is the first article in a series to show interoperability with the Microsoft Office XML Reference Schemas (WordProcessingML and SpreadSheetML). Here, we'll be looking at how both BEA WebLogic and IBM WebSphere (running on either Microsoft Windows or Linux) can be used to generate server-side documents that can be read by clients running Microsoft Office Word 2003. (17 printed pages)
This article outlines a sample use case based on the creation of Birth Certificate documents.
Contents
Use Case: Birth Certificate Creation
WordProcessingML
Using WordProcessingML for Contoso
Installing Database Support
Extracting the Sample Code
Starting the Database
Running the Sample on BEA WebLogic 8.1.4
Installing the Sample for IBM WebSphere 6.0
How the Sample Works
Creating the Birth Certificate Record
Searching for a Birth Certificate Record
Generating the Microsoft Word Document
Dealing with Images in the Document
Returning the Microsoft Word Document
Conclusion
Use Case: Birth Certificate Creation
To show interoperability with Office XML on BEA WebLogic and IBM WebSphere, imagine the following scenario:
Contoso Registrations is a private firm, sub-contracted by the government to record the registration of births. Part of their service offering includes a "walk-in" office where applicants can register in person by providing their details and showing identification.
Today the system to record this data is fairly simple: A registrant completes a paper-based form with the details of the birth. Information from the form is typed into a Web application (based on J2EE, the Java 2 Platform, Enterprise Edition) and entered into a database. Once a week, a snapshot of this database is sent to a printing firm. From each record in the database, the printing firm generates a "Commemorative Certificate of Birth," which is then mailed to the registrant.
This system works well today, but Contoso is looking to streamline the process and save some costs. After an analysis of the process they are finding that the off-site printing of the birth certificate documents is becoming expensive—plus, they've had to deal with a number of incidents where the certificates were lost in the mailing process.
To overcome this, Contoso is considering printing the birth certificate document at the time of registration. All registry offices are equipped with printers that can handle the load—and machines that are running Microsoft Windows XP and Microsoft Office 2003. They feel that this would both help save money and offer a better service to their customers: The registry office would be able to hand the certificate to the registrant at the time of registration.
The IT department at Contoso takes up the challenge of investigating how this is possible with today's technology. In short, they need to generate Office 2003 compatible documents from their existing J2EE application—preferably without any additional software or migration of the current application. After reading up on the area, the IT department discovers a supported format in Microsoft Office 2003 called WordProcessingML.
WordProcessingML
Microsoft Word 2003 supports a feature called "Save as XML." This is where you can open any Microsoft Word document, go to the File menu, select Save As... and select XML as the desired file format. As you may expect, this saves the current file as XML, using a format known as WordProcessingML.
WordProcessingML is one of the XML schemas offered through the Office 2003 Reference Schemas. It describes how a document in Word 2003 and associated parts—such as fonts, styles, tables, images, and so on—can be represented in an XML document. Microsoft offers the licenses and documentation for these schemas royalty-free.
<w:wordDocument xmlns:w=
'http://schemas.microsoft.com/office/word/2003/wordml'>
<w:body>
<w:p>
<w:r>
<w:t>Hello, World.</w:t>
</w:r>
</w:p>
</w:body>
</w:wordDocument>
A Word XML file is nothing more than a Word Document saved in XML format, conforming to the WordProcessingML schema. However, unlike some other "Save As" formats available in Microsoft Word (such as .txt or .rtf), when a document is saved in this XML format, it retains all of its formatting, functionality, and editing capabilities. That is, a document saved in Microsoft Word XML has the advantage of being functionally equivalent to a document saved in the native binary .doc format.
This capability leads to a number of interesting scenarios that were not possible with previous versions of Office. For example, XML files conforming to the WordProcessingML schema can now be created on non-Microsoft platforms (providing the platform can generate XML). When this type of XML document is loaded into Office 2003, it is automatically recognized as a Word document and rendered accordingly.
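To make this concrete, the following is a minimal sketch of how a Java program might emit the "Hello, World." document shown earlier. It uses the StAX XMLStreamWriter from the standard library, which is not necessarily what the sample code uses; the class and method names (HelloWordML, build) are hypothetical and chosen only for illustration.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not part of the article's sample code.
public class HelloWordML {
    // Namespace taken from the listing earlier in this section.
    public static final String W_NS =
        "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Returns a one-paragraph WordProcessingML document containing the given text. */
    public static String build(String text) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("w", "wordDocument", W_NS);
        xml.writeNamespace("w", W_NS);            // declares xmlns:w on the root element
        xml.writeStartElement("w", "body", W_NS);
        xml.writeStartElement("w", "p", W_NS);    // paragraph
        xml.writeStartElement("w", "r", W_NS);    // run
        xml.writeStartElement("w", "t", W_NS);    // text
        xml.writeCharacters(text);                // escapes markup characters in the text
        xml.writeEndDocument();                   // closes every element still open
        xml.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(build("Hello, World."));
    }
}
```

Writing the elements through an XML API rather than concatenating strings means that characters such as & and < in the text are escaped for you.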
Using WordProcessingML for Contoso
Going back to our use case, the IT group at Contoso looks at WordProcessingML and sees that it could offer a good solution for them. The solution is illustrated in the diagram below. Contoso modifies their existing J2EE application, adding functionality to generate their commemorative birth certificates as XML files that conform to the WordProcessingML schema. These files are returned to Contoso's existing Office 2003 clients. These documents can then be opened in Office 2003, printed, and immediately handed to the registrant.
Figure 1. Using WordProcessingML
As shown in the previous diagram, the registration details for the birth are entered using the terminal at the registration office and are stored on the J2EE Application Server. The server stores the details in the database, but also returns the generated commemorative certificate in WordProcessingML format. This document is then opened in Word 2003 and sent to the local printer.
Note This use case (and the sample code herein) does not advocate any guidance for creating applications on the Java and/or J2EE platform. Its purpose is to show a sample application that demonstrates the creation of Word 2003 compliant documents on a platform that is based upon non-Microsoft technology. In addition, the birth certificate sample shown here is used for illustrative purposes only.
Installing and Using the Sample
The sample code supplied with this article is designed to run on either BEA WebLogic 8.1.4 or IBM WebSphere 6.0 (using IBM's Rational Application Developer 6.0). The sample supports both BEA WebLogic and IBM WebSphere running on either Windows or Linux operating systems. While it may run on other platforms and operating systems, you should note that these were not tested during the creation of this article.
Before running the sample, follow the installation steps for the database, and then those for the J2EE application server platform that you choose.
Installing Database Support
As described in the overview of the use case, this sample reads and writes the birth certificate information to a database. The sample has been written in such a way that any JDBC-compliant database can be used by modifying the details in the Database class in the com.microsoft.samples.officexml.database package.
For the purposes of this article, however, we will be using IBM's Cloudscape v10.0 database (which was recently donated to the Apache Software Foundation and renamed "Derby"). You are of course free to use any other supported database for this sample—although this will require changes in the source code where appropriate. IBM's Cloudscape was chosen to support the J2EE application server and operating system combinations in this article.
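As a rough illustration of what "modifying the details" might involve, the sketch below keeps the JDBC driver class, URL, and credentials in one place. It is not the sample's Database class. The class name DatabaseSettings, the database name, the credentials, and the Cloudscape driver class and URL shown in cloudscapeDefaults() are all assumptions that you should check against your own installation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical holder for JDBC connection details; not the sample's Database class.
public class DatabaseSettings {
    private final String driverClass;
    private final String url;
    private final String user;
    private final String password;

    public DatabaseSettings(String driverClass, String url, String user, String password) {
        this.driverClass = driverClass;
        this.url = url;
        this.user = user;
        this.password = password;
    }

    /** Assumed values for a local Cloudscape network server; verify before use. */
    public static DatabaseSettings cloudscapeDefaults() {
        return new DatabaseSettings(
            "com.ibm.db2.jcc.DB2Driver",                 // assumed driver class
            "jdbc:derby:net://localhost:1527/BirthCert", // assumed URL and database name
            "app", "app");                               // assumed credentials
    }

    /** Loads the driver and opens a connection; switching database means changing only the settings. */
    public Connection open() throws ClassNotFoundException, SQLException {
        Class.forName(driverClass);
        return DriverManager.getConnection(url, user, password);
    }

    public String getDriverClass() { return driverClass; }
    public String getUrl() { return url; }
}
```

Because only the driver class and URL are database-specific, pointing the application at another JDBC-compliant database should, in principle, be a matter of constructing DatabaseSettings with different values and adding that vendor's driver JAR to the classpath.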
If you do not have Cloudscape installed, you will need to download the Derby database software from IBM as this is not included in the sample code files. To do this, follow these steps:
- Download the "Java Installer" from the Cloudscape install page. Extract the contents to a temporary directory.
- Open a command prompt to this temporary directory and type the following command:
java -jar 10.0-IBM-Cloudscape.jar
This will install the required files. The default installation directory is C:\program files\IBM\Cloudscape_10.0 for Windows and /opt/IBM/Cloudscape_10.0 for Linux, but you are free to choose whatever directory you wish.
Extracting the Sample Code
Download the sample code for your platform from http://workspaces.gotdotnet.com/officexmlinterop. The download is a zip file containing one JAR file for each platform.
For BEA WebLogic, extract the contents of this archive to a directory on the server running BEA WebLogic using the following command:
jar xvf BEABirthCertSample.jar
For IBM WebSphere, use a similar command to extract the IBM specific version of the sample:
jar xvf IBMBirthCertSample.jar
For the purposes of this sample, we assume all contents of the JAR file will be placed in a directory called c:\officexml\BirthCert (for Windows) and /usr/officexml/BirthCert (for Linux). You are of course free to choose an alternate directory, but this will require modification of the sample setup.
Directories extracted from the JAR file include:
/database
This contains the database required for the sample.
/docs
This directory is used to store the XML files that are generated on the server.
/images
This directory is used to store signature images for the Certificates.
/project
This contains the project files and source code for the JSP client and Web Service.
/schema
This contains an XSD schema used to represent calls made to the Web Service.
/template
This directory contains the XML template used for the Birth Certificate.
Starting the Database
To start the database, run the database.bat file (database.sh for Linux) in the database subdirectory (c:\officexml\BirthCert\database or /usr/officexml/BirthCert/database). This will compile and run the required files to provide database support for the sample. If you chose an alternative directory for installation of the Cloudscape database, edit the script file accordingly.
Now that you have installed the database and extracted the sample code, follow the section for your platform.
Running the Sample on BEA WebLogic 8.1.4
This sample code has been tested with BEA WebLogic Workshop 8.1.4 (8.1 SP4) running on Microsoft Windows XP Professional, Microsoft Windows Server 2003, and Red Hat Linux 9.0. Although it may be possible to run the sample code on other versions and operating systems, this has not been tested with the code supplied.
Compiling the Sample Code
To open the required projects in BEA WebLogic Workshop, double-click the BirthCert.work file in the project directory. If you are opening this file for the first time, you are prompted to define a server for the application. Select OK and choose a server home directory. Any of the sample configurations will work for this code, or you can create your own using the configuration wizard.
Figure 2. Viewing BirthCertSample
You will notice that there are three projects displayed in BEA WebLogic Workshop—the BirthCertSchema (a project for the XSD schema that represents a birth certificate), the BirthCertWebClient (a project for the JSP client) and the BirthCertWebService (a project for the Web Service).
Open the BirthCertificate_en_US.properties file in the BirthCertWebService project. This file can be found in the com/microsoft/samples/officexml/properties directory.
Within this file you will see three properties that are used to reference the directories for the sample application.
WordMLXSL=/officexml/BirthCert/template/BirthCertificate.xsl
ImageFolder=/officexml/BirthCert/images/
WordMLDocFolder=/officexml/BirthCert/docs/
Note If you are using Linux to host this sample you will need to add /usr to the start of each of these paths (or modify it based on your own location).
The WordMLXSL property is used for the location of the XSL file. The ImageFolder and WordMLDocFolder represent the location for the output images and documents.
If you have used any other directories apart from the defaults, change these paths accordingly and save the file.
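If you want to confirm that the edited file still contains all three settings, a small check along the following lines may help. This is a sketch using java.util.Properties; the sample itself may well read the file differently (the _en_US suffix suggests a resource bundle), and the class name SampleSettings is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical reader for the three settings listed above; not part of the sample.
public class SampleSettings {
    public final String wordMLXSL;
    public final String imageFolder;
    public final String wordMLDocFolder;

    private SampleSettings(Properties p) {
        wordMLXSL = require(p, "WordMLXSL");
        imageFolder = require(p, "ImageFolder");
        wordMLDocFolder = require(p, "WordMLDocFolder");
    }

    /** Parses key=value lines and fails fast if a required key is absent. */
    public static SampleSettings load(Reader in) throws IOException {
        Properties p = new Properties();
        p.load(in);
        return new SampleSettings(p);
    }

    private static String require(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) throws IOException {
        // Inline copy of the default (Windows) settings, used here instead of a file.
        String text =
              "WordMLXSL=/officexml/BirthCert/template/BirthCertificate.xsl\n"
            + "ImageFolder=/officexml/BirthCert/images/\n"
            + "WordMLDocFolder=/officexml/BirthCert/docs/\n";
        SampleSettings s = load(new StringReader(text));
        System.out.println(s.wordMLXSL);
    }
}
```

Failing at load time with the name of the missing key is usually easier to diagnose than a later error when the transform cannot find its template.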
Adding Required Library Support
In addition, you will need to add three libraries to your project in BEA WebLogic Workshop. Two of these libraries are required for database support. One library is required for uploading files via a JSP page. To do this, right-click the Libraries folder in the WebLogic Workshop IDE and select Add Library.
For the database libraries, navigate to the lib directory of the Cloudscape installation (by default this is under C:\program files\IBM\Cloudscape_10.0 or /opt/IBM/Cloudscape_10.0) and select the db2jcc.jar and db2jcc_license_c.jar files.
For the uploading support you will need a library called commons-fileupload-1.0.jar. This library is part of the commons libraries from the Apache Jakarta project. If you do not have this JAR file locally, it can be obtained from The Jakarta Project Web site.
Starting the Application Server
With the required libraries loaded, select Build Application from the Build menu to compile the application.
Once built, select Start WebLogic Server from the Tools/WebLogic Server menu. This will start the application server.
By default, the application should auto-deploy to the server. In the case that this doesn't work, you can manually deploy the application by right-clicking on the BirthCertSample project and selecting Deployment and Deploy.
Installing the Sample for IBM WebSphere 6.0
This sample code has been tested with IBM WebSphere Application Server 6.0 running on Windows XP Professional, Windows Server 2003, and Red Hat Linux 9.0. Although it may be possible to run the sample code on other versions and operating systems, this has not been tested with the code supplied.
Compiling the Sample Code
To open the required projects in IBM Rational Application Developer 6.0, open the IDE and select the project folder location for the default workspace. If you installed the sample code to the default directory, this will be c:\officexml\BirthCert\project for Windows or /usr/officexml/BirthCert/project for Linux.
Opening the IDE for the first time will create a new workspace. To import the projects into the workspace, select Import... from the File menu. Select Project Interchange and select Next. Choose the projects.zip file in the project directory and select all five available projects. Click the Finish button to import these projects into the workspace.
Figure 3. Importing Project Interchange
After import you will notice that there are five projects displayed in the IDE—the Client and WebService projects are listed under the Dynamic Web Projects folder. Corresponding to each of these is an EAR project (ClientEAR and WebServiceEAR, respectively). Finally, a Schema project is used to generate the Java Beans from an XSD file containing the data types used in the Birth Certificate sample.
Open the BirthCertificate_en_US.properties file in the WebService project. This file can be found in the com/microsoft/samples/officexml/properties directory under the Java Resources / JavaSource directory.
Within this file you will see three properties that are used to reference the directories for the sample application.
WordMLXSL=/officexml/BirthCert/template/BirthCertificate.xsl
ImageFolder=/officexml/BirthCert/images/
WordMLDocFolder=/officexml/BirthCert/docs/
The WordMLXSL property is used for the location of the XSL file. The ImageFolder and WordMLDocFolder represent the location for the output images and documents.
If you have used any other directories apart from the defaults, change these paths accordingly and save the file.
Adding Required Library Support
In addition, you will need to add three libraries to your project in IBM Rational Application Developer. Two of these libraries are required for database support. One library is required for uploading files via a JSP page. Perform the following steps to add this support.
For the database libraries, right-click the WebService project and select Properties. In the properties window, select Java Build Path and click the Add External JARs... button. Navigate to the lib directory of the Cloudscape installation (by default this is under C:\program files\IBM\Cloudscape_10.0 or /opt/IBM/Cloudscape_10.0) and select the db2jcc.jar and db2jcc_license_c.jar files.
Click OK to save the changes. Now navigate to the WebContent -> WEB-INF -> lib directory in the WebService project. Right-click this folder and select Import. Select File System from the import wizard, browse again to the lib directory of the Cloudscape installation, and select the same two JAR files (db2jcc.jar and db2jcc_license_c.jar). Import these JAR files to this lib directory.
For the uploading support you will need a library called commons-fileupload-1.0.jar. This library is part of the commons libraries from the Apache Jakarta project. If you do not have this JAR file locally, it can be obtained from The Jakarta Project Web site.
To add this JAR file to your project, right-click on the Client project and select Properties. In the properties window, select Java Build Path and click on the Add External JARs... button. Navigate to the directory containing the commons-fileupload-1.0.jar file and add it to the build path.
Click OK to save the changes. Now navigate to the WebContent -> WEB-INF -> lib directory in the Client project. Right-click on this folder and select Import. Select File System from the import wizard, again browse to the directory of the commons-fileupload-1.0.jar file and select the file to add it to this lib directory.
Starting the Application Server
Once these imports have been added you should add this project to a pre-configured instance of IBM WebSphere 6.0 Application Server (the test environment works just fine for this). If you have not already created an instance of a server you should consult the IBM documentation to set this up.
To deploy the application, right-click the server and select Add/Remove Projects. Select both the ClientEAR and WebServiceEAR deployments and add them. Performing a publish operation on the server will deploy these two applications.
Running the Sample
With the application deployed and the database running you can now test the sample application.
To do this, open a browser to:
http://servername:7001/BirthCertWebClient (for BEA WebLogic)
http://servername:9080/Client (for IBM WebSphere)
Note Replace servername with the actual name of your server and adjust the port if you have configured this. You can use localhost if running on the same machine.
The following menu screen is displayed:
Figure 4. Creating a Birth Certificate
Click on the Create a Birth Certificate button. This displays an entry screen for the birth certificate details:
Figure 5. Completing Birth Certificate Details
In this form, enter some details for a registration of birth. These details should include a full name, date, and place of birth. Notice also that you have the ability to upload signature files for the Mother and Father. Clicking on the appropriate links will prompt you to browse for the file.
With the birth certificate data entered, click on the Save button. If the information was entered successfully you'll see a message and will be returned to the main menu. If for some reason an error occurred—possibly with the data or the setup—you'll be notified at this point also.
The birth certificate details have now been entered. Back at the main menu, select Print a Birth Certificate. This will take you to a second JSP page:
Figure 6. Searching Birth Certificate
Enter the information required to search for the record that you entered. The minimum amount of information is a first name, middle name (if supplied), last name, and date of birth. (These should be the same details that you entered in the previous form). Click on the Search button to submit the search.
If the search was successful, the Web application will generate the WordProcessingML Birth Certificate. Two windows will be opened within the browser—one with a regular XML representation of the document, the other with the WordProcessingML document.
For the WordProcessingML document, you'll be prompted with a dialog box similar to the one shown. Notice how the dialog box automatically recognizes this as a Word document.
Figure 7. Downloading Birth Certificate to Microsoft Word
To view the Birth Certificate in Word you have two options: If you have Word 2003 installed, you can click on the Open button—this will display the Word document embedded within the browser. Alternatively, you can select the Save option, save the file to a folder, and re-open it from Windows Explorer.
When opening the file you'll be presented with the "Commemorative Certificate of Birth," similar to the one shown below.
Figure 8. Viewing Birth Certificate in Microsoft Word
This is the Word document that was generated by the J2EE application server. Although the application server returns the file as XML, a document type setting in the file allows it to be opened directly within Word.
In our use case, you can imagine how this could then be sent to a local printer and printed on the appropriate card stock.
How the Sample Works
As outlined in the use case, the sample works by using a series of JSPs (JavaServer Pages) to collect the information from the user, to store and retrieve information to and from the database, and to generate the actual WordProcessingML XML document. This section describes how each of the steps works.
Creating the Birth Certificate Record
In the BirthCertWebClient project, under the jsp root, the BirthCertHomePage.jsp is used to present the "create or search" initial menu. When the user creates a new birth certificate, the BirthCertCreateForm.jsp page is used to collect the details. UploadMotherImage.jsp and UploadFatherImage.jsp are used respectively to capture any mother and father signature images for the document.
Upon submitting the information in the form, a call is made to the Birth Certificate Web Service running on the same machine. The Web Service is responsible for both entering the information in the database as well as generating the final WordProcessingML document. From the JSP pages, these calls are made through the RequestHandler Servlet in com/microsoft/samples/office/handler.
For creating the birth certificate, the Web Service (BirthCert.jws) exposes a method called createBirthCertificate. This method accepts a birth certificate type which has been pre-defined in XSD (and can be found in the Schemas folder of the project). When the method is called, an AccessBirthCertData class is used to store the information to the database. When this is complete, the Web Service reports success and the user is notified and returns to the main menu.
Searching for a Birth Certificate Record
To search for a birth certificate, the user is directed to BirthCertSearchForm.jsp from the main menu. Here the user enters the search criteria for the birth certificate. These are then passed to the Web Service, again through the RequestHandler Servlet. To perform this, the Web Service exposes a method called searchBirthCertificate. This method accepts a birth certificate search type pre-defined in XSD (and can also be found in the Schemas folder of the project).
With this data, the Web Service searches the database for a matching record and, if one is found, generates the WordProcessingML document.
Generating the Microsoft Word Document
Generating an XML document that conforms to the WordProcessingML schema can be done in a number of ways.
One option is to build the XML file from scratch. This gives ultimate flexibility over the file, but it may be a long process to hand-write all of the code to do this. A complex document with many fonts and styles could result in many lines of complex code.
Another option is to create a template in Word, save it as XML and mark certain fields with identifiers (for example ###NAMEFIELD###). You can then load the document in an XML parser, search for the appropriate field (###NAMEFIELD###), and replace this with the actual value. This is more effective than the first approach, but can still be prone to errors (for example, if there is text in the document that clashes with the field names).
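For comparison, here is a minimal sketch of this second, marker-based option. It is not used by the sample. To stay short it works on the template as a plain string rather than loading it in an XML parser, and the class name PlaceholderTemplate and its methods are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of marker replacement; the sample uses XSL instead.
public class PlaceholderTemplate {

    /** Escapes the characters that would otherwise break the surrounding XML. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Replaces each ###KEY### marker in the template with the escaped value for KEY. */
    public static String fill(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("###" + e.getKey() + "###", escapeXml(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<String, String>();
        values.put("NAMEFIELD", "John Doe");
        System.out.println(fill("<w:t>###NAMEFIELD###</w:t>", values));
    }
}
```

The sketch also makes the clash problem easy to reproduce: because the replacements run one after another, a value that happens to contain the text of a later marker is itself rewritten on the next pass.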
The approach taken in this sample uses a third option where an XSL (XML Stylesheet) file is used to generate the final WordProcessingML document.
Figure 9. Using an XSL (XML Stylesheet) to generate a WordProcessingML Document
As shown in the above diagram, a template of the Commemorative Birth Certificate is saved as XML. This file is renamed to XSL and the sample data is replaced with XSL tags. For example, the place where the child's name appears on the birth certificate is replaced with a tag as follows:
<xsl:template match="BirthCertificate/ChildName">
You can investigate the XSL file in the template folder to see all of the fields.
When the Web Service retrieves the data from the database it builds an XML document (in memory) that represents the data in the birth certificate. For example, this could look like:
<BirthCertificate>
<ChildName>John Doe</ChildName>
<Sex>Male</Sex>
<BirthDate>12-05-2005</BirthDate>
...etc...
</BirthCertificate>
An XML transform is then applied—taking the XSL template and applying the data in the XML stream. This is done using the XSL libraries that ship with both BEA WebLogic and IBM WebSphere, but any XSLT compliant processor can be used.
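import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
// Create a new instance of the transformer factory
TransformerFactory tf = TransformerFactory.newInstance();
// Compile the XSL template (xslFile) into a Transformer
StreamSource xslStream = new StreamSource(xslFile);
Transformer t = tf.newTransformer(xslStream);
// Create the output file
StreamResult xmlOutputResult = new StreamResult(filename);
// Perform the transform; ds is the Source holding the in-memory birth certificate XML
t.transform(ds, xmlOutputResult);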
The result is the final WordProcessingML document that is returned to the user.
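The fragment above depends on variables defined elsewhere in the sample. As a self-contained sketch of the same technique, the following class builds a small birth certificate document in memory and applies a one-field stylesheet to it. The stylesheet is a hypothetical stand-in for BirthCertificate.xsl, which is far larger, and the class and method names are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical, cut-down version of the transform step.
public class CertificateTransform {
    // One-field stand-in for the real template.
    public static final String XSL =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='http://schemas.microsoft.com/office/word/2003/wordml'>"
        + "<xsl:template match='/BirthCertificate'>"
        + "<w:wordDocument><w:body><w:p><w:r><w:t>"
        + "<xsl:value-of select='ChildName'/>"
        + "</w:t></w:r></w:p></w:body></w:wordDocument>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Builds the in-memory data document: BirthCertificate/ChildName. */
    public static Document buildData(String childName) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().newDocument();
        Element root = doc.createElementNS(null, "BirthCertificate");
        Element name = doc.createElementNS(null, "ChildName");
        name.setTextContent(childName);
        root.appendChild(name);
        doc.appendChild(root);
        return doc;
    }

    /** Applies the stylesheet to the data and returns the generated document as text. */
    public static String transform(Document data) throws Exception {
        Transformer t = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSL)));
        DOMSource ds = new DOMSource(data);
        StringWriter out = new StringWriter();
        t.transform(ds, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(buildData("John Doe")));
    }
}
```

Writing to a StringWriter keeps the sketch free of file paths; the sample writes to a file in the docs folder instead.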
To ensure that the resulting XML file is recognized by the client as a WordProcessingML document, we add the following XML processing instruction to the XSL file:
<xsl:processing-instruction name="mso-application">
progid="Word.Document"</xsl:processing-instruction>
This writes an mso-application processing instruction at the top of the generated XML file, which enables the document to be recognized by the Word application.
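If you want to verify on the server that a generated file actually carries this instruction, a check along the following lines is one option. It is only a sketch: it does not reproduce how Windows or Word detects the file type, and the class name WordPiCheck is hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical server-side check; not part of the sample code.
public class WordPiCheck {

    /** Returns true if the prolog contains an mso-application instruction naming Word.Document. */
    public static boolean declaresWord(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
            .createXMLStreamReader(new StringReader(xml));
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    return false; // reached the root element without seeing the instruction
                }
                if (event == XMLStreamConstants.PROCESSING_INSTRUCTION
                        && "mso-application".equals(reader.getPITarget())) {
                    String data = reader.getPIData();
                    return data != null && data.contains("progid=\"Word.Document\"");
                }
            }
            return false;
        } finally {
            reader.close();
        }
    }
}
```

The check stops at the first element, so it reads only the start of the file regardless of how large the document is.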
Dealing with Images in the Document
As you may have noticed in the sample, the JSP pages also support the uploading of images (for the signatures that are to be attached to the commemorative birth certificate sample).
To support this, the image is uploaded by means of the JSP using a file upload control. When passed to the Web Service, the image is actually stored as a local file (in the images folder in the project directory). The image name corresponds to the ID of the birth certificate (for example, 22_Father.bmp is the father signature file for certificate number 22).
Inserting the image into the returned WordProcessingML document is relatively easy. A Base64 encoder converts the binary image into a Base64 representation, which is added to the document model. The XSL template contains the required transform to take the Base64 data into the Word document:
<xsl:template match="BirthCertificate/FatherSignature">
<ns0:FatherSignature><w:tc><w:tcPr><w:tcW w:type=
"dxa" w:w="4428"/></w:tcPr><w:p><w:pPr><w:jc w:val=
"center"/></w:pPr><w:r><w:t><w:pict>
<w:binData w:name="wordml://03000002.png">
<xsl:apply-templates/>
</w:binData><v:shape id="_x0000_i1025" type=
"#_x0000_t75" style="width:115.6pt;height:144.65pt">
<v:imagedata src="wordml://03000002.png"
o:title="Sample2"/></v:shape></w:pict></w:t></w:r></w:p></w:tc>
</ns0:FatherSignature>
</xsl:template>
As you can see, the Base64 representation of the image is inserted within the w:binData element of the Word document. A temporary filename (wordml://03000002.png) and title are used to add the reference into the document.
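The encoding step itself can be sketched as follows. This uses java.util.Base64, which requires Java 8 or later and is therefore not what the original sample could have used on the platforms listed in this article; treat it as a modern substitute. The class name SignatureEncoder is hypothetical, and the FatherSignature element name is taken from the template above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper showing the encoding step with a modern JDK.
public class SignatureEncoder {

    /** Converts raw image bytes to their Base64 text form. */
    public static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Reads an image file (for example 22_Father.bmp) and encodes its contents. */
    public static String encodeFile(Path imageFile) throws IOException {
        return encode(Files.readAllBytes(imageFile));
    }

    /** Wraps the encoded image in the element that the XSL template matches. */
    public static String fatherSignatureElement(Path imageFile) throws IOException {
        return "<FatherSignature>" + encodeFile(imageFile) + "</FatherSignature>";
    }
}
```

Base64 output contains only letters, digits, +, / and =, so it can be placed in the data document without further XML escaping.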
Returning the Microsoft Word Document
When the document is created by the Web Service, the filename is returned to the calling JSP page. A new JSP page (ViewWordML.jsp) is then called. This reads the file from disk, sets the content type correctly (to text/xml), and returns the contents of the file via the JSP output stream.
This prompts the browser to display the "open/save/cancel" dialog box for the user from where the Word Document can be opened or saved.
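The file-to-response step can be sketched without a servlet container as shown below. The class name DocumentStreamer is hypothetical and this is not the code in ViewWordML.jsp; in a real JSP or servlet you would set the content type on the response and pass the response's output stream as the second argument.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for the file-serving step.
public class DocumentStreamer {
    // Content type named in the article; set it on the response before writing the body.
    public static final String CONTENT_TYPE = "text/xml";

    /** Copies the generated document to the given stream and returns the number of bytes sent. */
    public static long send(Path document, OutputStream out) throws IOException {
        long bytes = Files.copy(document, out);
        out.flush();
        return bytes;
    }
}
```

Copying bytes rather than reading the file as text leaves the document's encoding untouched on its way to the client.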
Conclusion
This article has shown how to use WordProcessingML to generate a Word 2003 compliant document from an application running on either BEA WebLogic or IBM WebSphere. The purpose of this document was to show the flexibility of the Office XML Schema formats and how these types of documents can be created on non-Microsoft platforms.
After reading this article you may be thinking "we can use PDF to do the same thing." For this particular use case, this is true. However, the use of the Office XML schemas can lead to some new and exciting scenarios. For example, when the returned document is edited and saved by the user it is saved as an XML file. This opens up the possibility of sending the document back to the server—providing round trip support and simple workflow for documents.
For example, an application could create a document that prompted the user for some details (for example, filling in an expense report). The user could fill in the requested information, save the file as XML, and then submit the file to the server. The server could then perform an "XML diff" on the submitted file against the original to see the differences and effectively extract what the user had entered. As you explore these possibilities, it's very easy to see the power of dealing with XML files that conform to the WordProcessingML schema.
I'd like to express my thanks to Anurag Katre, Pramod Pawar, Nilesh Jain and Hemlata Jadhwar (Tata Consulting Services) for their assistance in developing the sample code. I'd also like to recognize Jean Paoli, Joe Andreshak, Pascal Stolz (Microsoft Corporation), and Phillip Conrad (University of Delaware) for their support and guidance.
About the author
Simon Guest is a Program Manager on the Architecture Strategy team at Microsoft Corporation, and specializes in interoperability and integration. Simon holds a Masters Degree in IT Security from the University of Westminster, London, and is the author of the Microsoft .NET and J2EE Interoperability Toolkit (Microsoft Press, Sept. 2003).
Simon can be reached via his blog at http://www.simonguest.com.