Using MSXML with HTML


Adding semantic information to HTML pages is not easy. Historically, various programs have attempted to deal with this problem by using nonstandard tricks, such as hiding data inside HTML comments. However, these comments are awkward and, unlike XML, are not exposed to the object model.

To solve this, the World Wide Web Consortium (W3C) has defined a format for putting XML-based data (XML Data Islands) inside HTML pages. Extending HTML through the use of data islands will allow a wide range of applications to use HTML as the primary document or display format, and also to use XML embedded within these documents to hold data.

An HTML page can therefore include, among other things, specific data about the subject of the page. For example, if the page displayed an advertisement for an author's most recent novel, the page could also contain XML data concerning that book, such as its ISBN, publisher, or suggested retail price. It is not important that this information be displayed, but it is important that this information be accessible and understandable as data.

Warning

XML Data Islands are available only in MSXML 3.0. To achieve the same functionality in MSXML 6.0, you should either transform data by using XSLT or retrieve data from the network by using XMLHttpRequest.
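As a rough illustration of the XMLHttpRequest route mentioned in the warning, the following JScript sketch retrieves an XML document synchronously and returns its parsed DOM. The names fetchXml and createRequest are hypothetical helpers introduced only for this example. The request object is supplied by the caller, so the function does not depend on a particular MSXML version.

```javascript
// Minimal sketch: fetch an XML document with an XMLHttpRequest-style object.
// fetchXml and createRequest are hypothetical names used for illustration only.
function fetchXml(url, createRequest) {
  var req = createRequest();      // caller supplies the request object
  req.open("GET", url, false);    // third argument false = synchronous request
  req.send();
  if (req.status !== 200) {
    throw new Error("Request for " + url + " failed with status " + req.status);
  }
  return req.responseXML;         // parsed XML DOM document
}

// Possible usage under Windows Script Host or Internet Explorer
// (the file name book.xml is an assumption):
//   var doc = fetchXml("book.xml", function () {
//     return new ActiveXObject("Msxml2.XMLHTTP.6.0");
//   });
```

The retrieved document can then be read through the DOM in script, which is one way to replace data that a data island would otherwise have carried inside the page.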

If you decide to use XML data within HTML as part of your Web application, be aware that Microsoft XML Core Services (MSXML) supports two different XML parsing modes.

The XML 1.0 specification defines two basic kinds of XML parsers: nonvalidating and validating. They are summarized as follows:

nonvalidating parsers
Check document syntax and report all violations of well-formedness constraints. Nonvalidating parsers can also add information to the document based on declarations in the DTD. MSXML does read the DTD, including external resources, and acts on that information.

validating parsers
Perform the same functions as nonvalidating parsers, but also compare the structures of documents to rules in the DTD.

The MSXML parser can operate in either validating or nonvalidating mode.

Testing for Well-Formedness and Validity with Internet Explorer and MSXML

When you load a document in Internet Explorer using MSXML, by default MSXML will check for well-formedness. To validate an XML document, however, you must set a switch in the parser (the validateOnParse property) before loading the document.

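The following JScript® code loads and parses the XML document at the specified URL, with validation turned on.

var xmldoc = new ActiveXObject("Msxml2.DOMDocument.6.0");
xmldoc.async = false;           // load synchronously so parsing completes before the next statement
xmldoc.validateOnParse = true;  // validate the document, not just check well-formedness
xmldoc.load(url);               // url is assumed to hold the location of the XML document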
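
Setting the switch only requests validation. The outcome still has to be checked after the load. The following sketch shows one way this might be done, by using the return value of load and the parseError object of the document. The names loadAndValidate and createDocument are hypothetical and are introduced only for this example.

```javascript
// Minimal sketch: load a document with validation on and report any parse error.
// loadAndValidate and createDocument are hypothetical names used for illustration only.
function loadAndValidate(url, createDocument) {
  var doc = createDocument();    // caller supplies the DOM document object
  doc.async = false;             // wait for the load to finish
  doc.validateOnParse = true;    // request validation in addition to the well-formedness check
  var loaded = doc.load(url);    // load returns false if parsing or validation failed
  if (!loaded) {
    var err = doc.parseError;
    return {
      ok: false,
      message: err.reason + " (line " + err.line + ", position " + err.linepos + ")"
    };
  }
  return { ok: true, document: doc };
}

// Possible usage (the file name book.xml is an assumption):
//   var result = loadAndValidate("book.xml", function () {
//     return new ActiveXObject("Msxml2.DOMDocument.6.0");
//   });
//   if (!result.ok) { WScript.Echo(result.message); }
```

Note that MSXML 6.0 prohibits DTD processing by default, so a document that contains a DOCTYPE declaration may also require a call such as doc.setProperty("ProhibitDTD", false) before loading. This should be checked against the MSXML version in use.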

XML Data Islands