Saving and Writing a Document

When you load and save an XmlDocument, the saved document may be different from the original in the following ways:

  • If the PreserveWhitespace property is set to true before the Save method is called, white space in the document is preserved in the output; if the property is false, XmlDocument auto-indents the output.
  • All the white space between attributes is reduced to a single space character.
  • The white space between elements is changed. Significant white space is preserved and insignificant white space is not. However, when the document is saved, it uses the XmlTextWriter indenting mode by default, which prints the output neatly to make it more readable.
  • The quote character used around attribute values is changed to double quote by default. You can use the QuoteChar property on XmlTextWriter to set the quote character to either double quote or single quote.
  • By default, general entities like &abc; are preserved. However, if you construct an XmlValidatingReader, which has a default EntityHandling setting of ExpandEntities, and then call Load, the general entities are expanded and are lost in the saved document.
  • By default, numeric character entities like &#123; are expanded.
  • The byte order mark found in the input document is not preserved. UCS-2 is saved as UTF-8 unless you explicitly create an XML declaration that specifies a different encoding.
  • When you write the XmlDocument out to a file or stream, the output is the same as the content of the document. That is, the XmlDeclaration is written out only if the document contains one, and the encoding used when writing the document is the encoding given in the declaration node.
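
Several of the differences listed above can be observed with a short round trip. The following is a minimal sketch, not a definitive test: the class name SaveDifferencesSketch, the RoundTrip helper, and the sample markup are assumptions made for illustration. It loads markup that uses single quotes, extra spaces between attributes, and a numeric character entity, and then saves it with PreserveWhitespace set to false and to true.

```csharp
using System;
using System.IO;
using System.Xml;

public static class SaveDifferencesSketch
{
    // Hypothetical helper: loads the markup and returns what Save writes.
    // Because the target is a TextWriter, Save also writes an XML declaration.
    public static string RoundTrip(string xml, bool preserveWhitespace)
    {
        XmlDocument doc = new XmlDocument();
        doc.PreserveWhitespace = preserveWhitespace; // set before Load and Save
        doc.LoadXml(xml);

        StringWriter sw = new StringWriter();
        doc.Save(sw);
        return sw.ToString();
    }

    public static void Main()
    {
        // Sample input (an assumption): single quotes, four spaces between
        // attributes, blank lines between elements, and a character entity.
        string xml = "<root>\n\n<item a='1'    b='2'>&#123;</item>\n\n</root>";

        Console.WriteLine(RoundTrip(xml, false)); // auto-indented output
        Console.WriteLine(RoundTrip(xml, true));  // original white space between elements kept
    }
}
```

In both outputs the attributes appear as a="1" b="2", and the character entity appears as the expanded { character.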

Writing an XmlDeclaration

The OuterXml and InnerXml properties and the WriteTo method of XmlDocument and XmlDeclaration, in addition to the Save and WriteContentTo methods of XmlDocument, create an XML declaration.

For the OuterXml and InnerXml properties of XmlDocument, and for its Save(Stream stm), Save(String filename), WriteTo, and WriteContentTo methods, the encoding written out in the XML declaration is taken from the XmlDeclaration node. If there is no XmlDeclaration node, no XML declaration is written out. If the XmlDeclaration node has no encoding, no encoding is written out in the XML declaration.
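
The following sketch illustrates this rule using the OuterXml property. The class name DeclarationFromNodeSketch and the Serialize helper are assumptions made for illustration only.

```csharp
using System;
using System.Xml;

public static class DeclarationFromNodeSketch
{
    // Hypothetical helper: builds a small document, optionally adds an
    // XmlDeclaration node, and returns the markup of the whole document.
    public static string Serialize(bool addDeclaration, string encoding)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root/>");
        if (addDeclaration)
        {
            // A null encoding leaves the encoding out of the declaration node.
            XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", encoding, null);
            doc.InsertBefore(decl, doc.DocumentElement);
        }
        return doc.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(Serialize(false, null));    // no declaration node, no declaration
        Console.WriteLine(Serialize(true, null));     // declaration without an encoding
        Console.WriteLine(Serialize(true, "utf-8"));  // declaration with encoding="utf-8"
    }
}
```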

The Save(TextWriter writer) and Save(XmlWriter writer) methods always write out an XmlDeclaration. These methods take the encoding from the writer that they are writing to. That is, the encoding value on the writer overrides the encoding on the document and in the XmlDeclaration object. For example, the following code does not write an encoding in the XML declaration found in the output file out.xml.

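Dim doc As New XmlDocument()
doc.Load("text.xml")
' Passing Nothing as the encoding omits the encoding attribute from the declaration.
Dim tw As New XmlTextWriter("out.xml", Nothing)
doc.Save(tw)
tw.Close()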
[C#]
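XmlDocument doc = new XmlDocument();
doc.Load("text.xml");
// Passing null as the encoding omits the encoding attribute from the declaration.
XmlTextWriter tw = new XmlTextWriter("out.xml", null);
doc.Save(tw);
tw.Close();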
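
The same override can be seen without touching the file system. The following sketch assumes a StringWriter as the target, whose Encoding property reports UTF-16. The class name WriterEncodingSketch and the SaveToStringWriter helper are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WriterEncodingSketch
{
    // Hypothetical helper: saves the markup through Save(TextWriter).
    public static string SaveToStringWriter(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        StringWriter sw = new StringWriter(); // StringWriter.Encoding is UTF-16
        doc.Save(sw);
        return sw.ToString();
    }

    public static void Main()
    {
        // The declaration node in the document asks for utf-8 ...
        string xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?><root/>";

        // ... but the declaration that is written names the writer's encoding, utf-16.
        Console.WriteLine(SaveToStringWriter(xml));
    }
}
```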

For the Save(XmlWriter writer) method, the XML declaration is written out using the WriteStartDocument method of the XmlWriter class. Therefore, overriding the WriteStartDocument method changes how the start of the document is written.
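
One way to apply this is a writer subclass that suppresses the declaration. The following is a sketch under stated assumptions, not a recommended implementation: NoDeclarationWriter and OverrideStartDocumentSketch are hypothetical names, and both WriteStartDocument overloads are overridden to do nothing.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical writer: an XmlTextWriter that writes no XML declaration.
public class NoDeclarationWriter : XmlTextWriter
{
    public NoDeclarationWriter(TextWriter w) : base(w) { }

    // Both overloads are overridden to write nothing.
    public override void WriteStartDocument() { }
    public override void WriteStartDocument(bool standalone) { }
}

public static class OverrideStartDocumentSketch
{
    // Hypothetical helper: saves the markup through the custom writer.
    public static string SaveWithoutDeclaration(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        StringWriter sw = new StringWriter();
        NoDeclarationWriter writer = new NoDeclarationWriter(sw);
        doc.Save(writer);
        writer.Flush();
        return sw.ToString();
    }

    public static void Main()
    {
        // The output starts with the root element instead of an XML declaration.
        Console.WriteLine(SaveWithoutDeclaration("<root><child/></root>"));
    }
}
```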

For the OuterXml and InnerXml properties and the WriteTo method of XmlDeclaration, if the Encoding property is not set, no encoding is written out. Otherwise, the encoding written out in the XML declaration is the same as the encoding found in the Encoding property.
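
A short sketch of this behavior follows, looking at the XmlDeclaration node alone. The class name DeclarationEncodingSketch and the DeclarationMarkup helper are assumptions made for illustration.

```csharp
using System;
using System.Xml;

public static class DeclarationEncodingSketch
{
    // Hypothetical helper: creates a declaration node, optionally sets its
    // Encoding property, and returns the node's own markup.
    public static string DeclarationMarkup(string encoding)
    {
        XmlDocument doc = new XmlDocument();
        XmlDeclaration decl = doc.CreateXmlDeclaration("1.0", null, null);
        if (encoding != null)
        {
            decl.Encoding = encoding;
        }
        return decl.OuterXml;
    }

    public static void Main()
    {
        Console.WriteLine(DeclarationMarkup(null));    // <?xml version="1.0"?>
        Console.WriteLine(DeclarationMarkup("utf-8")); // <?xml version="1.0" encoding="utf-8"?>
    }
}
```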

Writing Document Content Using the OuterXml Property

The OuterXml property is a Microsoft extension to the W3C Document Object Model standards. It returns the markup representing a given node and all its child nodes, so you can use it to get the markup of the whole XML document or of a single node and its children.

The following code sample shows how to save a document in its entirety as a string.

Dim mydoc As New XmlDocument()
' Perform application needs here, like mydoc.Load("myfile")
' Now save the entire document to a string variable called "xml".
Dim xml As String = mydoc.OuterXml

[C#]

XmlDocument mydoc = new XmlDocument();
// Perform application needs here, like mydoc.Load("myfile");
// Now save the entire document to a string variable called "xml".
string xml = mydoc.OuterXml;

The following code sample shows how to save only the document element.

' For the content of the Document Element only.
Dim xml As String = mydoc.DocumentElement.OuterXml

[C#]

// For the content of the Document Element only.
string xml = mydoc.DocumentElement.OuterXml;

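In contrast, you can use the InnerXml property if you want the markup of the child nodes only, or the InnerText property if you want the concatenated text content of the node and its child nodes.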
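
The difference between these properties is easiest to see side by side. The following sketch uses a small sample document that is an assumption made for illustration, as are the class name NodeMarkupSketch and the LoadSample helper.

```csharp
using System;
using System.Xml;

public static class NodeMarkupSketch
{
    // Hypothetical helper: loads a small sample document and returns its root element.
    public static XmlElement LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<book genre=\"novel\"><title>Emma</title><price>9.95</price></book>");
        return doc.DocumentElement;
    }

    public static void Main()
    {
        XmlElement book = LoadSample();

        // The element itself plus all of its child nodes.
        Console.WriteLine(book.OuterXml);  // <book genre="novel"><title>Emma</title><price>9.95</price></book>

        // The markup of the child nodes only.
        Console.WriteLine(book.InnerXml);  // <title>Emma</title><price>9.95</price>

        // The concatenated text values, with no markup.
        Console.WriteLine(book.InnerText); // Emma9.95
    }
}
```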

One aspect of the OuterXml property affects the output it generates from a document: the handling of default attributes. Suppose an element enters the document through an EntityReference node, and the DTD defines default attributes for that element. The OuterXml property does not write out the default attributes for the element. For example, assume that the following general entity has been declared, with its replacement text stored in a file called 013.ent, shown below:

013.ent

<e/>

For this example, the following XML file, called X_Entity.XML, contains a DTD that defines the element as having default attributes:

X_Entity.XML

<!DOCTYPE doc [

<!ELEMENT doc (e)><!ELEMENT e (#PCDATA)><!ATTLIST e a1 CDATA "a1 default" a2 NMTOKENS "a2 default"><!ENTITY x SYSTEM "013.ent">]><doc>&x;</doc>

When this XML is parsed, the declaration <!ENTITY x SYSTEM "013.ent"> defines where to find the replacement text for the entity x. When the parser encounters the &x; entity reference, it locates the 013.ent file and replaces &x; with the file's contents. With the default attributes from the DTD applied, the document content is equivalent to the XML data shown below:

<doc>

<e a1="a1 default" a2="a2 default" />

</doc>

However, when using the OuterXml property, the actual output is:

Output

<!DOCTYPE doc [

<!ELEMENT doc (e)><!ELEMENT e (#PCDATA)><!ATTLIST e a1 CDATA "a1 default" a2 NMTOKENS "a2 default"><!ENTITY x SYSTEM "013.ent">]><doc><e /></doc>

-------------------------
AttributeCount 2

The following code tests the default attributes. The attribute count of 2 shows that the default attributes defined in the DTD are present on the element, even though the output of OuterXml does not contain them.

Imports System
Imports System.Xml

Namespace TestSimple

    Public Class MyTestApp
      
        Public Shared Sub Main()
            Dim treader As New XmlTextReader("X_Entity.xml")
            Dim vrdr As New XmlValidatingReader(treader)
            Dim xDoc As New XmlDocument()
            xDoc.Load(vrdr)
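            ' Print the markup first, then the attribute count, to match the output shown above.
            Console.WriteLine(xDoc.OuterXml)
            Console.WriteLine("-------------------------")
            ' Use & for concatenation; + would try to convert the string to a number.
            Console.WriteLine("AttributeCount " & CType(xDoc.DocumentElement.ChildNodes(0), XmlElement).Attributes.Count.ToString())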
        End Sub 'Main
    End Class 'MyTestApp
End Namespace 'TestSimple
[C#]
using System;
using System.Xml;

namespace TestSimple
{
    public class MyTestApp
    {
        public static void Main()
        {
            XmlTextReader treader = new XmlTextReader("X_Entity.xml");
            XmlValidatingReader vrdr = new XmlValidatingReader(treader);
            XmlDocument xDoc = new XmlDocument();
            xDoc.Load(vrdr);
            Console.WriteLine(xDoc.OuterXml);
            Console.WriteLine("-------------------------");
            Console.WriteLine("AttributeCount " + ((XmlElement)(xDoc.DocumentElement.ChildNodes[0])).Attributes.Count);
        }
    }
}

See Also

XML Document Object Model (DOM)