2.2.12 [XML] Section 3.3.3, Attribute-Value Normalization

C0015:

The specification states:

 Before the value of an attribute is passed to the application or checked for 
 validity, the XML processor MUST normalize the attribute value by applying the 
 algorithm below, or by using some other method such that the value passed to the 
 application is the same as that produced by the algorithm.
  
 1.  All line breaks MUST have been normalized on input to #xA as described in 2.11 End-of-Line Handling, so the rest of this algorithm operates on text normalized in this way.
 2.  Begin with a normalized value consisting of the empty string.
 3.  For each character, entity reference, or character reference in the unnormalized attribute value, beginning with the first and continuing to the last, do the following:
  
      • For a character reference, append the referenced character to the normalized value.
      • For an entity reference, recursively apply step 3 of this algorithm to the replacement text of the entity.
      • For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.
      • For another character, append the character to the normalized value.
  
 If the attribute type is not CDATA, then the XML processor MUST further process the 
 normalized attribute value by discarding any leading and trailing space (#x20) 
 characters, and by replacing sequences of space (#x20) characters by a single space 
 (#x20) character.
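
The quoted algorithm can be sketched in Python as follows. This is a minimal illustration, not MSXML code: the names `normalize_attribute_value`, `_step3`, and `PREDEFINED` and the `entities` and `is_cdata` parameters are hypothetical. It assumes that line breaks were already normalized to #xA (step 1), that the input is well-formed, and that entity replacement text is supplied as plain strings.

```python
import re

# Matches a hexadecimal or decimal character reference, or an entity reference.
_REF = re.compile(r"&#x([0-9A-Fa-f]+);|&#([0-9]+);|&([^;&\s]+);")

# Replacement text of the predefined entities. "lt" and "amp" expand to
# character references, which step 3 then resolves during the recursion.
PREDEFINED = {"lt": "&#60;", "amp": "&#38;", "gt": ">", "apos": "'", "quot": '"'}


def _step3(text, entities, out):
    """Apply step 3 to `text`, appending normalized characters to `out`."""
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch == "&":
            m = _REF.match(text, pos)
            if m is None:
                raise ValueError("malformed reference at offset %d" % pos)
            hex_ref, dec_ref, name = m.groups()
            if hex_ref is not None:
                # Character reference: append the referenced character as-is.
                out.append(chr(int(hex_ref, 16)))
            elif dec_ref is not None:
                out.append(chr(int(dec_ref)))
            else:
                # Entity reference: recurse into the replacement text.
                # An undeclared name raises KeyError; cycles are not detected.
                _step3(entities[name], entities, out)
            pos = m.end()
        elif ch in " \r\n\t":
            # Literal white space becomes a single space character (#x20).
            out.append(" ")
            pos += 1
        else:
            out.append(ch)
            pos += 1


def normalize_attribute_value(raw, entities=None, is_cdata=True):
    """Return the normalized value of the unnormalized attribute text `raw`."""
    table = dict(PREDEFINED)
    table.update(entities or {})
    out = []  # step 2: start from the empty string
    _step3(raw, table, out)
    value = "".join(out)
    if not is_cdata:
        # Non-CDATA types: trim #x20 at both ends and collapse runs of #x20.
        value = re.sub(" +", " ", value.strip(" "))
    return value
```

For example, `normalize_attribute_value("a\tb")` yields `"a b"`, whereas `normalize_attribute_value("a&#x9;b")` keeps the tab, because a character reference is appended without white-space replacement.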

MSXML3

The following clarifications apply:

  • During attribute value normalization, white-space characters (#x20, #xD, #xA, #x9) are not replaced with space characters (#x20).

  • When the attribute type is NMTOKENS, the attribute value is not normalized by removing leading and trailing space (#x20) characters or by replacing sequences of space characters with one space character.
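
To make the difference concrete, the sketch below contrasts the result required by the specification with a literal reading of the two clarifications above. It is a hypothetical model only and has not been verified against MSXML: the function names are illustrative, references are not handled, the input is assumed to be already end-of-line normalized, and treating non-CDATA types other than NMTOKENS as trimmed and collapsed is an assumption.

```python
import re


def spec_normalize(raw, attr_type):
    """Literal-character part of the XML 1.0 algorithm (no references)."""
    value = re.sub(r"[ \r\n\t]", " ", raw)
    if attr_type != "CDATA":
        value = re.sub(" +", " ", value.strip(" "))
    return value


def clarified_normalize(raw, attr_type):
    """Hypothetical model of the two clarifications, read literally."""
    value = raw  # first clarification: white space is not replaced by #x20
    if attr_type not in ("CDATA", "NMTOKENS"):
        # Second clarification exempts NMTOKENS from trimming and collapsing.
        # Applying it to the other tokenized types is an assumption.
        value = re.sub(" +", " ", value.strip(" "))
    return value


sample = " a\tb  c "
print(repr(spec_normalize(sample, "NMTOKENS")))       # 'a b c'
print(repr(clarified_normalize(sample, "NMTOKENS")))  # ' a\tb  c '
```

Under this reading, an NMTOKENS value reaches the application with its original white space intact, so a caller that needs the spec-defined value would have to normalize it itself.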

MSXML6

When the old parser is in use (that is, when the NewParser property has not been enabled), the following clarifications apply:

  • During attribute value normalization, white-space characters (#x20, #xD, #xA, #x9) are not replaced with space characters (#x20).

  • When the attribute type is NMTOKENS, the attribute value is not normalized by removing leading and trailing space (#x20) characters or by replacing sequences of space characters with one space character.

C0016:

The specification states:

 All attributes for which no declaration has been read SHOULD be treated by a non-
 validating processor as if declared CDATA.

MSXML3 and MSXML6

MSXML3 and MSXML6 are validating parsers, so this recommendation, which addresses non-validating processors, does not apply to them.