From the March 2002 issue of MSDN Magazine


.NET Web Services

Web Methods Make it Easy to Publish Your App's Interface over the Internet

Paula Paul
This article assumes you're familiar with C# and XML
Download the code for this article: WebMethods.exe (100 KB)
SUMMARY Web Services are a great way to accept and manage contributions to a public clip art library, digital music catalog, or corporate knowledge base. Since the SOAP interface to a Web method operates over HTTP, contributors can easily publish content any time, from anywhere across the Internet. However, accepting binary content and managing content metadata through SOAP over HTTP presents Web Service developers with some interesting design decisions. This article discusses three ways to enable content publishing using Web methods.

The Internet provides any time, anywhere access to information that has already been published, but how easy is it to publish something? As you know, files can be copied from one place to another by FTP, WebDAV, HTML file upload, or by attaching documents to e-mail messages. These approaches give contributors a way to submit content via the Internet, but you may not want to require users to have software such as an FTP client or e-mail. In addition, managing the process and uploaded files can be a headache. There must be a better way.

Enter Web Services

      Web Services are a great way to provide content publishing via the Internet. In this article I'll show you how to build a Web Service using a few Web methods, which can be called through the Simple Object Access Protocol (SOAP) over HTTP. To publish to the Web using the service I'm about to build, contributors simply post some text (an XML document containing a SOAP envelope, to be exact) to a URL.
      But what if contributors need to post an XML document to the Web Service? How can they submit binary Windows® metafiles in the request? Can binary files be included in an XML SOAP request? Well, if digital files can be transferred by DAV, e-mail, or HTTP file upload, there must be a way to do it with Web Services. To show how it's done, I'll describe a sample content-based service with two Web methods created for publishing. I'll start with an XML publishing method and a simple SOAP client, then go on to explain two approaches for publishing binary content. The source code for the article, available at the link at the top of this article, includes Visual Studio® .NET solution files for the Web Service and a sample SOAP client with SOAP envelopes for calling the Web methods. You'll be able to download a Beta 2 version in addition to an RC3 version, which should work with the RTM.

Publishing XML

      The methods that are in my sample content Web Service enforce some basic business rules for required metadata so that content can be validated when it is submitted for publication. The methods also accept one or more content items, including metadata for each, in a single publishing request in order to save round-trips to the server. The PublishXML Web method shown in Figure 1 performs content validation and publishes one or more XML documents in a single request.
      The Web method accepts an XmlElement array as an argument; this allows a client to submit multiple XML content items (as an array) for publication. Each XML content item must be enclosed in a parent XML tag, but can contain any level of detail and nesting within that parent element. The Web method itself only contains a few lines of code; the work of validating content and saving it to disk is done by the DocumentXML class (see Figure 2).
      One of the best practices I've adopted in developing Web Services is to keep the code in the Web Service class to a minimum. In the sample code for this article you'll see that most of the work is done by two classes: DocumentXML and Document. The Web methods themselves contain very little code. By encapsulating all the business logic for publishing content in C# classes rather than in the body of the Web method, it's easy to test and debug your business logic (for instance, through a Windows Forms application) before trying to exercise it through a Web method interface. This approach also makes it easy to expose the business logic through different kinds of access protocols (in addition to SOAP, which can be used to access Web methods). For instance, the business logic classes could be attributed for COM Interop and made available to COM clients using the Microsoft® .NET Assembly Registration tool, and the same business logic classes could be exposed through .NET TCP remoting in order to support an even broader range of consuming applications.
      The business rules in the DocumentXML class are very simple: XML content submitted for publication must contain a <FileName> tag, and the contents must not be empty. The <FileName> tag contents specify a target file name for the published XML document. DocumentXML could be expanded to enforce additional rules, such as requiring a certain file name format or requiring additional tags like <Title> and <Author>, without changing the PublishXML Web method interface.
      When PublishXML is invoked, it instantiates a new DocumentXML object and passes the candidate XML content to the DocumentXML constructor. Validation occurs in the constructor. Implementing the validation rules is a breeze in .NET, thanks to the XmlTextReader, which locates the required <FileName> tag. If the DocumentXML object is successfully instantiated, the content is valid, and the Web method invokes the object's SaveXML method. SaveXML relies on the .NET XmlTextWriter for writing the content to disk. The DocumentXML constructor and SaveXML method (from \DocumentTypes\DocumentXML.cs) are shown in Figure 2.
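      To make the <FileName> rule concrete, here is a minimal sketch of that kind of check. It is not the DocumentXML code from Figure 2; the FileNameRule class and RequireFileName method are names I've made up for illustration, and the sketch assumes the rule is simply "the first <FileName> element must exist and must not be blank."

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper (not part of the article's download): enforces the
// "must contain a non-empty <FileName> tag" rule with an XmlTextReader.
public static class FileNameRule
{
    // Returns the text of the first <FileName> element,
    // or throws ArgumentException if it is missing or blank.
    public static string RequireFileName(string xml)
    {
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "FileName")
                {
                    // ReadString returns the text content of the current element.
                    string name = reader.ReadString().Trim();
                    if (name.Length == 0)
                    {
                        throw new ArgumentException("<FileName> must not be empty.");
                    }
                    return name;
                }
            }
        }
        throw new ArgumentException("Content must contain a <FileName> element.");
    }

    public static void Main()
    {
        string item = "<testDocument><FileName>test.xml</FileName><Body>Hello</Body></testDocument>";
        Console.WriteLine(RequireFileName(item)); // prints "test.xml"
    }
}
```

      Throwing from a helper like this keeps the rule in one place, so stricter rules (a required <Title>, say) could be added without touching the Web method signature.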
      The SaveXML method determines where to save content files by calling ConfigurationSettings.AppSettings.Get("ContentDirectory"). Application settings such as "ContentDirectory" can be stored in the Web.config file of a Web Service and accessed through the .NET ConfigurationSettings class. A section of the sample Web.config file that defines the ContentDirectory is shown here:

  <configuration>
    <!-- application specific settings -->
    <appSettings>
      <!-- the content directory for the Publish web method -->
      <add key="ContentDirectory"
           value="C:\ContentWebService\PublishedContent\" />
    </appSettings>
    •••

The SaveXML method would be a good spot to include additional logic for indexing new content, for instance, by adding an entry to a database table or other master content indices.

The Little SOAP Client

      Now that I've looked at the basic architecture for the content Web Service, a thin Web method interface that relies on business logic classes to validate and save content, let's look at a simple tool for testing Web methods before tackling binary data.

Figure 3 SOAP Envelope

      The sample code for this article includes the world's smallest SOAP client (that I know of). It is an easy way to test Web methods and a good way to learn more about SOAP. SOAP envelopes are simply XML documents (see Figure 3). A SOAP request to a Web method is just an HTTP POST of an XML document containing the SOAP envelope to a Web Service URL. To see this in action, once you have installed the sample content Web Service, look in TheLittleSOAPClient subdirectory of the sample for a small VBScript file that you can use to post SOAP envelopes to Web Services. Sample SOAP envelopes to exercise the two Web methods from this article can be found in the SOAPEnvelopes subdirectory.
      These six lines of script form the heart of LittleSOAPClient.vbs:

  Set requestHTTP = CreateObject("Microsoft.XMLHTTP")
  requestHTTP.open "POST", WebServiceURL, false
  requestHTTP.setrequestheader "Content-Type", "text/xml"
  requestHTTP.setrequestheader "SOAPAction", WebMethodName
  requestHTTP.Send SoapEnvelope
  MsgBox("Request sent. HTTP request status= " & requestHTTP.status)

Two of the variables, WebServiceURL and WebMethodName, are initialized in a section at the top of the script file so the variables can be easily modified to call a different Web Service or Web method. The script sets the SoapEnvelope variable by opening a text file containing the SOAP envelope and loading the envelope's contents into the SoapEnvelope variable. The Microsoft XMLHTTP object is used to post the contents of the SOAP envelope, including a required SOAPAction header specifying the name of the Web method, to the Web Service URL. That's it—a SOAP client using half a dozen lines of script.
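      If you'd rather drive the same test from C#, the script translates almost line for line. The following is a rough sketch, not part of the article's download: the LittleSoapClient class name is hypothetical, it uses the HttpClient class from System.Net.Http (which arrived in later versions of .NET than the script targets), and the URL, SOAPAction value, and envelope file are whatever you pass on the command line.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;

// Hypothetical C# counterpart to LittleSOAPClient.vbs.
public static class LittleSoapClient
{
    // Builds the same request the script sends: an HTTP POST of the envelope
    // text with a text/xml Content-Type and a SOAPAction header.
    public static HttpRequestMessage BuildRequest(
        string webServiceUrl, string soapAction, string soapEnvelope)
    {
        HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, webServiceUrl);
        request.Content = new StringContent(soapEnvelope, Encoding.UTF8, "text/xml");
        // SOAPAction is not a standard HTTP header, so skip header validation.
        request.Headers.TryAddWithoutValidation("SOAPAction", soapAction);
        return request;
    }

    // Usage: LittleSoapClient <serviceUrl> <soapAction> <envelopeFile>
    public static void Main(string[] args)
    {
        if (args.Length != 3)
        {
            Console.WriteLine("Usage: LittleSoapClient <serviceUrl> <soapAction> <envelopeFile>");
            return;
        }
        string envelope = File.ReadAllText(args[2]);
        using (HttpClient client = new HttpClient())
        using (HttpRequestMessage request = BuildRequest(args[0], args[1], envelope))
        using (HttpResponseMessage response = client.SendAsync(request).Result)
        {
            Console.WriteLine("Request sent. HTTP request status= " + (int)response.StatusCode);
        }
    }
}
```

      Keeping BuildRequest separate from the code that sends the request makes it possible to inspect the headers without a running Web Service.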
      Using Visual Studio .NET, it's easy to create sample SOAP envelopes that you can use to call any Web method using LittleSOAPClient.vbs. Right-click any Web Service in the Visual Studio .NET Solution Explorer, and select the View in Browser popup menu item. Click on the hyperlink to the Web method you are interested in, and you'll see that Visual Studio .NET provides you with a sample SOAP envelope for the method. Save the SOAP envelope to a file (myMethodEnvelope.xml, for example), fill in values for the method arguments as needed, and use LittleSOAPClient.vbs to post the SOAP request to the Web method. It's no substitute for a debugger, but it's a great way to perform a quick smoke test on your Web methods. Figure 4 contains a SOAP envelope for publishing two XML documents using the PublishXML method.
      SOAP envelopes are easy to read. In Figure 4, the envelope contains information to call the PublishXML method, so <PublishXML> is the first tag found in the <soap:Body>. The PublishXML method takes a single argument called contentItems. The argument is represented in the SOAP envelope as an XML tag named, predictably, <contentItems>. This tag contains the argument value, which is an array. One way to encode arrays in a SOAP envelope is to separate members of the array by tags that indicate the data type. For instance, an array of strings named myStringArray could be passed as an argument in a SOAP envelope as follows:

  <myStringArray>
    <String>The first string array element</String>
    <String>The second string array element</String>
    <String>The third string array element</String>
  </myStringArray>

      In PublishXML.xml, the members of the contentItems array are separated by the <Any> tag, which tells the .NET deserializer that the contents of the elements should be passed through to the Web method "as is", without further encoding or deserialization. There are two items (enclosed in <Any>...</Any> tags) in the contentItems array in Figure 4. The XML content for each item is enclosed in a single XML element; <testDocument> is the enclosing element for the first item, and <anotherDocument> is the enclosing element for the second item.
      By knowing a little about how the SOAP envelope is organized, it's easy to see that the PublishXML.xml envelope contains information to call the PublishXML method. This method takes one argument called contentItems, which is an array of type any, having two array members containing XML. Note that the published XML documents will consist of any and all XML found between the <Any> and </Any> tags in the SOAP envelope, so metadata tags like <FileName> will be included in the published content. Document content and metadata can be separated if desired, either by parsing the XML or by designing a more structured argument list, as you'll see in the next Web method.
      Now that I have shown you an XML content-enabled Web Service and a simple SOAP client, let's take on binary content.

Binary Data and HTTP

      The name HTTP provides a clue as to why binary data requires special consideration in SOAP requests. As you know, the first T in HTTP stands for "text"; HTTP began as a text-oriented protocol, and the XML SOAP envelope it carries to a Web method is character data that cannot contain raw binary bytes. Nonetheless, it's common to move binary files around the Internet through HTTP and other protocols. It's been done with e-mail attachments, DAV, and HTML file upload for some time now, and you can use the same techniques to transport binary data in Web method requests. This involves encoding the binary data as text so that it can survive the trip inside the SOAP envelope and through the Web method's underlying transport protocol, HTTP. Once the binary data is encoded in a stream that can be delivered through HTTP, it can be delivered to the Web method.
      Base64 encoding, described in Internet RFC 2045, is commonly used for transporting binary data through e-mail and HTML file upload. To Base64 encode a file, every 24 bits of source data is converted to a four-character sequence (32 bits) taken from a 65-character ASCII subset: 64 characters that each represent a 6-bit value, plus "=" for padding. The Base64 character set has the same representation in all currently standardized character sets, making Base64 a very safe encoding scheme. In addition, current work on XML Schema standards (https://www.w3.org/TR/xmlschema-2/#section-Datatypes-and-Facets) defines a base64Binary type, so bundling binary data within XML documents through Base64 encoding is likely to become standard practice.
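      The 24-bits-in, four-characters-out arithmetic is easy to check for yourself. The short sketch below uses the .NET Convert.ToBase64String method; the Base64Demo class and its EncodedLength helper are names invented for this illustration.

```csharp
using System;
using System.Text;

public static class Base64Demo
{
    // Every 3 source bytes (24 bits) become 4 output characters;
    // a final partial group is padded with '=' to a full 4 characters.
    public static int EncodedLength(int byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }

    public static void Main()
    {
        byte[] threeBytes = Encoding.ASCII.GetBytes("Man");
        Console.WriteLine(Convert.ToBase64String(threeBytes)); // prints "TWFu"

        byte[] fourBytes = Encoding.ASCII.GetBytes("Man!");
        Console.WriteLine(Convert.ToBase64String(fourBytes));  // prints "TWFuIQ=="

        Console.WriteLine(EncodedLength(fourBytes.Length));    // prints "8"
    }
}
```

      The practical consequence is that encoded content is roughly one third larger than the original file, which is worth keeping in mind when contributors publish large items in a single request.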
      My only frustration with this project is that I haven't found a Base64 file-encoding tool that most people already have on their desktop. Given the work in progress on XML Schema data types, perhaps the encoding and decoding capabilities will be added to the MSXML parser. The good news is that Base64 encoding and decoding have already been added to the .NET XmlTextReader and XmlTextWriter classes (more on this subject later), and there is an abundance of sample code and freeware that's readily available for encoding files.
      With Base64 encoding, I'm ready to implement a Web method that can publish one or more binary documents, including related metadata for each item, in a single request. The next steps are to decide where to insert the encoded content in the request, and how to handle decoding in the Web method. The sample code for this article illustrates two approaches for passing encoded content to a Web method. You can either put the encoded content inside of the SOAP envelope or turn the SOAP envelope and each content item into separate parts of a multipart MIME message.

The Publish Web Method

      Publish is the second and final Web method in the sample content Web Service. Similar to the PublishXML method I described previously, Publish is short and simple, as you can see in Figure 5.
      Again, like the PublishXML Web method, all the work is done by classes outside of the Web Service. In this case the Document class (DocumentTypes\Document.cs) defines and validates metadata and knows how to decode and save content to disk. The Publish Web method accepts an array of Document objects as an argument, allowing contributors to publish multiple files (including associated metadata) with a single request.

The Document Class

      Rather than constructing objects from arguments passed to the Web method as in the PublishXML method, the Publish method takes a Document array as an argument. Public member variables of the Document class define metadata fields and storage for Base64-encoded binary data.
      The Document class constructor does not enforce business rules such as the requirement for a non-blank file name. If business rules were enforced in the constructor, the method would return a SOAP fault when passed an invalid Document object. Although the SOAP fault would contain any exception message produced in the constructor, the exception handling in the Web method would never be invoked.
      To let business rule validation exceptions pass through the exception handling of the Web method, the Document.Validate method is called from the body of the Web method. Note that SOAP faults can still occur if the SOAP envelope argument list contains XML that does not represent a Document object. Clients of the Publish Web method are responsible for providing valid XML representations of Document objects as an argument to the Publish method. The source code for the Document class is shown in Figure 6, and Figure 7 shows an example of an XML (or serialized) representation of the Document class. Figure 7 is an example of a SOAP envelope that a Web method client could post to the Publish Web method. The Publish Web method takes an array of Document objects as an argument where Document objects consist of the public members of the Document class.
      The Document class uses the member variable called Data to hold the Base64-encoded content. Since the Base64-encoded content is plain ASCII text, Data is a string variable. The Save method performs content decoding and saves the results in the Content Directory, which is identified using the same approach found in the DocumentXML class. .NET makes the decoding a snap. Since the XmlTextReader class understands Base64 encoding, when the Save method is invoked, Save loads the Base64-encoded Data string into an XML element and decodes it using the XmlTextReader.ReadBase64 method. A binary writer makes short work of saving the file to disk, and hence binary files have been published with associated metadata using a Web method.
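      Here is a simplified sketch of that validate-then-save flow. It is not the Document class from Figure 6. Only the Data member name comes from the article; FileName and Title are assumed metadata fields, the content directory is passed as a parameter rather than read from Web.config, and decoding uses Convert.FromBase64String in place of the XmlTextReader.ReadBase64 approach the sample takes.

```csharp
using System;
using System.IO;

// Simplified stand-in for the article's Document class.
public class Document
{
    public string FileName; // assumed metadata field
    public string Title;    // hypothetical metadata field
    public string Data;     // Base64-encoded content

    // Business rules live here, not in the constructor, so that the
    // caller's own exception handling sees any validation failure.
    public void Validate()
    {
        if (FileName == null || FileName.Trim().Length == 0)
            throw new ArgumentException("FileName is required.");
        if (Data == null || Data.Length == 0)
            throw new ArgumentException("Data is required.");
    }

    // Decodes Data and writes the bytes to contentDirectory.
    public string Save(string contentDirectory)
    {
        byte[] bytes = Convert.FromBase64String(Data);
        // GetFileName drops any directory part supplied by the client.
        string path = Path.Combine(contentDirectory, Path.GetFileName(FileName));
        File.WriteAllBytes(path, bytes);
        return path;
    }
}

public static class PublishSketch
{
    // Mirrors the shape of the Publish Web method, minus the attributes.
    public static string Publish(Document[] contentItems, string contentDirectory)
    {
        string results = contentItems.Length.ToString() +
            " documents published successfully!";
        try
        {
            foreach (Document item in contentItems)
            {
                item.Validate();
                item.Save(contentDirectory);
            }
        }
        catch (Exception e)
        {
            results = "Publish failed: " + e.Message;
        }
        return results;
    }

    public static void Main()
    {
        Document doc = new Document();
        doc.FileName = "sample.bin";
        doc.Data = Convert.ToBase64String(new byte[] { 1, 2, 3 });
        Console.WriteLine(Publish(new Document[] { doc }, Path.GetTempPath()));
    }
}
```

      Because Validate is called inside the try block, a blank file name comes back to the caller as an ordinary result string rather than as a SOAP fault raised during deserialization.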

Stuffing the Envelope

      The SOAP envelope in Figure 7 should look familiar; it's very similar to the envelope used to call PublishXML. Two Document objects are passed as an array in the argument called contentItems. Since the members of the array are of type Document, each array element is enclosed in a <Document> tag. The names of the public member variables in the Document class appear as XML tag names within the <Document> and </Document> tags.
      The Publish method separates metadata from content using the Document class, as opposed to requiring metadata in the document content in PublishXML. The metadata approach in the Publish method could be applied to XML content as well as binary content, for instance, by making the Data member variable in the Document class an XmlElement rather than a string variable. The Document class could even be modified to let a single Publish method handle both XML text and binary data.
      Give the Publish method a spin with this SOAP envelope using LittleSOAPClient.vbs. Don't forget to edit the script to comment out the PublishXML method and SOAP envelope names, and uncomment the Publish method and corresponding SOAP envelope names before executing the script.

Thinking Outside the Envelope

      The Publish Web method is the interface for both of these binary publishing approaches. It can be invoked either by including the encoded binary content within the SOAP envelope, or by packaging the SOAP envelope and the content together in a multipart MIME message.
      When a standard SOAP envelope is submitted to this Web method, the Web method assumes that the binary content is encoded within the envelope, as described in Figure 7. To enable additional processing for multipart MIME messages, the Web method uses a SOAP extension called PublishViaMIME. PublishViaMIME intercepts all incoming requests to the method and handles the case in which the request is enclosed in a multipart MIME message. To enable a SOAP extension for any Web method, decorate the Web method with an attribute that identifies the SOAP extension class, as with the [PublishViaMIME] attribute shown here:

  [PublishViaMIME]
  [WebMethod]
  public string Publish(Document[] contentItems) {
    string results = contentItems.Length.ToString() +
      " documents published successfully!";
    try {
    •••

      What does it mean to put the SOAP envelope and the content items in a multipart MIME message? In the sample shown in Figure 8, the SOAP envelope is located in the root of the message, and contains references to other parts of the message. Additional message parts are used to hold Base64-encoded binary content. This approach is described in "SOAP Messages with Attachments," by John J. Barton, Satish Thatte, and Henrik Frystyk Nielsen, at https://www.w3.org/TR/SOAP-attachments.
      As with SOAP envelopes, multipart MIME messages are easy to read. The headers at the top of the message provide most of the information needed to interpret the message. In particular, the Content-Type header provides three valuable pieces of information:

  • Multipart/Related indicates that the stream contains a multipart MIME message.
  • boundary=xxxx defines the contents of the boundary that separates each section of encoded content.
  • start=xxxx identifies the Content-ID of the root of the message.
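      Those three pieces of information can be pulled out of a Content-Type header with very little code. The following is a minimal sketch rather than the parsing used in the sample SOAP extension; the MimeHeader class and its methods are hypothetical names, and the regular expression handles only simple name=value or name="value" parameters.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for reading parameters from a Content-Type header.
public static class MimeHeader
{
    public static bool IsMultipartRelated(string contentType)
    {
        return contentType.TrimStart()
            .StartsWith("multipart/related", StringComparison.OrdinalIgnoreCase);
    }

    // Returns the value of the named parameter, or null if it is absent.
    // Accepts both quoted and unquoted parameter values.
    public static string GetParameter(string contentType, string name)
    {
        string pattern = @"(?:^|;)\s*" + Regex.Escape(name) +
            @"\s*=\s*(?:""(?<v>[^""]*)""|(?<v>[^;\s]*))";
        Match m = Regex.Match(contentType, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups["v"].Value : null;
    }

    public static void Main()
    {
        string header = "Multipart/Related; boundary=MIME_boundary; " +
            "type=text/xml; start=<soapenvelope.xml@paulsoftware.com>";
        Console.WriteLine(IsMultipartRelated(header));         // prints "True"
        Console.WriteLine(GetParameter(header, "boundary"));   // prints "MIME_boundary"
        // The start value is a Content-ID; strip the angle brackets to compare IDs.
        Console.WriteLine(GetParameter(header, "start").Trim('<', '>'));
    }
}
```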

      Note that the SOAP envelope section in the root of the message is similar but not quite the same as the SOAP envelope previously used to call the Publish method. Rather than including the encoded binary content directly in the Document object, within the <Data>...</Data> element, this approach places a reference attribute on the <Data> element that links to a section of the multipart MIME message that contains the encoded content. An href attribute like the one in the following line links to the Content-ID header of the associated content:

  <Data href="cid:MIMEtestdoc.doc@paulsoftware.com"></Data>
  

 

      The sample SOAP extension described in this article makes some broad assumptions about the MIME message format, based on an example from "SOAP Messages with Attachments":

  • The Content-Type message header must specify start=xxxx and boundary=xxxx.
  • Each MIME part must have a Content-ID header.
  • The SOAP envelope must be located in the start Content-ID or root of the message.
  • The SOAP envelope must be ASCII text.
  • All other attachments must use Base64 Content-Transfer-Encoding for this example.
  • The Content-ID header must immediately precede the encoded content.

      These assumptions were made to simplify the MIME message parsing in the sample code. With any luck, the MIME-oriented classes in later versions of the .NET Framework will make MIME message handling much easier and more robust, but since my goal is to focus on the SOAP request, I'll leave more advanced MIME processing for another day.
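      Under assumptions like these, splitting a message into its parts takes only a few lines. The sketch below is my own simplification, not the parser from the sample code: the MimeSplitter class is a hypothetical name, the whole message is treated as one string, and each part's content is taken to be everything after the first blank line in that part.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, deliberately naive multipart splitter.
public static class MimeSplitter
{
    // Returns a map of Content-ID (without angle brackets) to part content.
    public static Dictionary<string, string> Split(string message, string boundary)
    {
        Dictionary<string, string> parts =
            new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string normalized = message.Replace("\r\n", "\n");
        string[] sections = normalized.Split(
            new string[] { "--" + boundary }, StringSplitOptions.None);

        foreach (string section in sections)
        {
            // Headers and content are separated by the first blank line;
            // the closing "--" delimiter has none and is skipped.
            int blank = section.IndexOf("\n\n", StringComparison.Ordinal);
            if (blank < 0) continue;

            string headers = section.Substring(0, blank);
            string content = section.Substring(blank + 2).Trim();
            foreach (string line in headers.Split('\n'))
            {
                if (line.StartsWith("Content-ID:", StringComparison.OrdinalIgnoreCase))
                {
                    string id = line.Substring("Content-ID:".Length).Trim().Trim('<', '>');
                    parts[id] = content;
                }
            }
        }
        return parts;
    }

    public static void Main()
    {
        string message =
            "--MIME_boundary\r\n" +
            "Content-Type: text/xml\r\n" +
            "Content-ID: <envelope@example.com>\r\n\r\n" +
            "<Envelope/>\r\n" +
            "--MIME_boundary\r\n" +
            "Content-Transfer-Encoding: base64\r\n" +
            "Content-ID: <doc1@example.com>\r\n\r\n" +
            "AQID\r\n" +
            "--MIME_boundary--\r\n";
        Dictionary<string, string> parts = Split(message, "MIME_boundary");
        Console.WriteLine(parts["doc1@example.com"]); // prints "AQID"
    }
}
```

      A production parser would need to cope with boundary strings that appear inside content, folded headers, and other transfer encodings, which is exactly the work the assumptions above let the sample avoid.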
      If the sample multipart MIME message is posted to the Web Service URL and targeted at the Publish Web method, what will happen? Without any special processing the Web Service will return a SOAP fault indicating:

  System.Web.Services.Protocols.SoapException: Server found request
  content type to be 'Multipart/Related; boundary=MIME_boundary;
  type=text/xml; start=&lt;soapenvelope.xml@paulsoftware.com&gt;',
  but expected 'text/xml'.

To handle multipart MIME requests, you'll need to rearrange the incoming HTTP request before it reaches the Web method. Luckily, the .NET Framework provides plenty of opportunities to jump in and rearrange things during the processing of HTTP requests of any type, and particularly during SOAP requests.

The Publish via MIME SOAP Extension

      SOAP extensions are one way to intercept incoming requests and outgoing results during the execution of a Web method request. For instance, the sample SOAP extension PublishViaMIME does several things. First, it examines the HTTP request stream and determines if it is a multipart MIME message. If it is, it finds the root message (the SOAP envelope) and resolves references to content to construct a complete SOAP envelope, including encoded content. Then it replaces the original request stream with the SOAP envelope right before the request stream is deserialized to invoke the Web method.
      If the HTTP request stream does not contain a multipart MIME message, PublishViaMIME will pass the original request stream through for normal SOAP processing. That way, once the Publish method is decorated with an attribute that enables the SOAP extension, Publish can be invoked using a normal SOAP envelope, or by enclosing the SOAP envelope in a multipart MIME message. One method, two ways to invoke it!
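      The reference-resolving step can be sketched independently of the SOAP extension plumbing. In the following example the CidResolver class is a hypothetical name, the MIME parts are assumed to have been collected already into a dictionary keyed by Content-ID (without angle brackets), and XmlDocument does the rewriting. The real PublishViaMIME code may go about this differently.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper: replaces href="cid:..." references in a SOAP
// envelope with the Base64 content of the referenced MIME part.
public static class CidResolver
{
    public static string Resolve(string envelopeXml, IDictionary<string, string> partsByContentId)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(envelopeXml);

        // Copy the matches first so the document is not modified
        // while the XPath result is still being enumerated.
        List<XmlElement> references = new List<XmlElement>();
        foreach (XmlNode node in doc.SelectNodes("//*[starts-with(@href, 'cid:')]"))
        {
            references.Add((XmlElement)node);
        }

        foreach (XmlElement element in references)
        {
            string contentId = element.GetAttribute("href").Substring("cid:".Length);
            string encoded;
            if (!partsByContentId.TryGetValue(contentId, out encoded))
            {
                throw new InvalidOperationException("No MIME part with Content-ID " + contentId);
            }
            element.RemoveAttribute("href");
            element.InnerText = encoded;
        }
        return doc.OuterXml;
    }

    public static void Main()
    {
        Dictionary<string, string> parts = new Dictionary<string, string>();
        parts["doc1@example.com"] = "AQID";
        string envelope =
            "<Document><FileName>a.doc</FileName>" +
            "<Data href=\"cid:doc1@example.com\"></Data></Document>";
        Console.WriteLine(Resolve(envelope, parts));
        // prints "<Document><FileName>a.doc</FileName><Data>AQID</Data></Document>"
    }
}
```

      Once every reference has been resolved, the result has the same shape as an envelope with inline content, so the Web method never needs to know which form the client sent.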
      Creating a SOAP extension involves three steps. First, create a class that inherits from SoapExtension. Second, intercept the incoming request stream and/or outgoing response stream by overriding the ChainStream method. Finally, access or modify the request and/or response streams as needed at various stages of the request processing by overriding the ProcessMessage method.
      SOAP extensions get control at a number of stages during the lifetime of a SOAP request. The flow of control through the SOAP extension is as follows:

  1. The caller posts the HTTP request (with a SOAPAction header specifying the target method name) to the Web Service URL.
  2. The SOAP extension's ChainStream method is then invoked with a reference to the request stream (the stream type is HttpInputStream).
  3. The SOAP extension's ProcessMessage method is invoked at the BeforeDeserialize stage. This is the point before the request is handed off to the .NET deserializer to get information needed from the SOAP envelope in order to invoke the proper Web method.
  4. ProcessMessage gets control again at the AfterDeserialize stage. At this point the framework has deserialized the method arguments and information from the SOAP envelope and is ready to invoke the target method.
  5. The Web method is invoked.
  6. The SOAP extension's ChainStream method is invoked again and is passed a reference to the response stream (the stream type is now SoapExtensionStream).
  7. ProcessMessage gets control again at the BeforeSerialize stage, before the Web method results are serialized (to the resulting SOAP envelope/XML representation) for return to the caller.
  8. Finally, ProcessMessage gets control at the AfterSerialize stage, when the response buffer contains the serialized SOAP envelope that will be returned to the caller.
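      The stages above map onto a fairly small class skeleton. The sketch below is a pass-through extension that only copies streams. It is not the PublishViaMIME class from Figure 9, and the StagePassThrough names are hypothetical. It assumes a reference to the System.Web.Services assembly from the .NET Framework, and it does nothing useful outside an ASP.NET Web Service.

```csharp
using System;
using System.IO;
using System.Web.Services.Protocols;

// Attribute that switches the extension on for a Web method.
[AttributeUsage(AttributeTargets.Method)]
public class StagePassThroughAttribute : SoapExtensionAttribute
{
    private int priority;
    public override Type ExtensionType { get { return typeof(StagePassThroughExtension); } }
    public override int Priority { get { return priority; } set { priority = value; } }
}

public class StagePassThroughExtension : SoapExtension
{
    private Stream wireStream;    // stream handed to ChainStream by the framework
    private Stream workingStream; // buffer the framework reads from or writes to

    // No per-method or per-service configuration is needed in this sketch.
    public override object GetInitializer(LogicalMethodInfo methodInfo,
        SoapExtensionAttribute attribute) { return null; }
    public override object GetInitializer(Type serviceType) { return null; }
    public override void Initialize(object initializer) { }

    // Steps 2 and 6: called once for the request and once for the response.
    public override Stream ChainStream(Stream stream)
    {
        wireStream = stream;
        workingStream = new MemoryStream();
        return workingStream;
    }

    public override void ProcessMessage(SoapMessage message)
    {
        switch (message.Stage)
        {
            case SoapMessageStage.BeforeDeserialize: // step 3
                // A MIME-aware extension would rewrite the request here.
                Copy(wireStream, workingStream);
                workingStream.Position = 0;
                break;
            case SoapMessageStage.AfterDeserialize:  // step 4
            case SoapMessageStage.BeforeSerialize:   // step 7
                break;
            case SoapMessageStage.AfterSerialize:    // step 8
                workingStream.Position = 0;
                Copy(workingStream, wireStream);
                break;
        }
    }

    private static void Copy(Stream from, Stream to)
    {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = from.Read(buffer, 0, buffer.Length)) > 0)
        {
            to.Write(buffer, 0, read);
        }
        to.Flush();
    }
}
```

      The BeforeDeserialize case is the natural place to substitute a different request body, because the framework has not yet read anything from the stream that ChainStream returned.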

      The PublishViaMIME class (in SOAPEx_PublishViaMIME.cs) inherits from SoapExtension. It saves a reference to the original request stream when ChainStream is invoked, and replaces multipart MIME streams with fully resolved SOAP envelopes. The SOAP envelope is resolved by replacing references to different parts of the MIME message with the Base64-encoded content from the referenced parts of the message. This results in a SOAP envelope that looks just like the original envelope format used to call the Publish method. The ChainStream and ProcessMessage overrides from PublishViaMIME are shown in Figure 9.
      Give it a try. In order to post the MIMEMessage.txt file with the right Content-Type headers to my Web Service, use the LittleSOAPViaMIMEClient.vbs client. The client logic is nearly identical to the LittleSOAPClient.vbs client, except that the Content-Type header specifies Multipart/Related and includes the start= and boundary= settings.

Conclusion

      Web methods and the .NET Framework provide developers with a diverse set of tools for publishing all types of content, including XML, multimedia files, and Office documents. The approaches you choose for your own Web methods will depend on the types of content you want to handle and the capabilities of your Web Service clients. SOAP extensions can enable specialized features such as MIME handling, compression, or encryption. I won't go into the details here, but HTTP requests can also be intercepted in a more general fashion using custom HttpHandlers or HTTP request and response filters, for instance, to reroute an HTML file upload request to a Web method. In general, if it can be done via the Internet, it can be done using a Web Service!

For related articles see:
Fun with SOAP Extensions
Web Services Interoperability and SOAP

 

For background information see:
Introducing Microsoft .NET by David S. Platt and Keith Ballinger (Microsoft Press, 2001)
.NET Framework Essentials by Thuan Thai and Hoang Q. Lam (O'Reilly, 2001)

Paula Paul is a Microsoft Certified Solutions Developer with an itch for .NET. Her most recent consulting project involved architecture and development of a knowledge management application using Exchange 2000 and Microsoft .NET Web Services. Contact her at Paul Software (https://www.paulsoftware.com).