Determining Document Content Type for XML Parsing

Last modified: February 10, 2010

Applies to: SharePoint Foundation 2010

For the built-in XML parser to determine a document's content type, and thereby access the content type definition, the document itself must contain the content type as a document property. The parser looks for a special processing instruction in the XML document to identify the document's content type. You can include processing instructions that identify the content type by content type ID, by document template, or both.
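As a rough illustration of what reading such a processing instruction involves, the following Python sketch pulls a pseudo-attribute value out of a processing instruction in an XML document's prolog. It is not the parser's own implementation. The function name read_pi_attribute, the target MicrosoftWindowsSharePointServices, the attribute ContentTypeID, and the sample ID are all assumptions chosen for the example; the real location is whatever the document library's Field element specifies.

```python
import re
from xml.dom import Node, minidom

# Matches pseudo-attributes such as name="value" inside processing instruction data.
_PSEUDO_ATTR = re.compile(r'([\w.-]+)\s*=\s*"([^"]*)"')


def read_pi_attribute(xml_text, target, attribute):
    """Return the value of `attribute` from the first top-level processing
    instruction named `target` that carries it, or None if there is none."""
    document = minidom.parseString(xml_text)
    for node in document.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            attributes = dict(_PSEUDO_ATTR.findall(node.data))
            if attribute in attributes:
                return attributes[attribute]
    return None


# Hypothetical sample document; the target, attribute, and ID are illustrative only.
SAMPLE = (
    '<?xml version="1.0"?>'
    '<?MicrosoftWindowsSharePointServices ContentTypeID="0x010100ABCD"?>'
    '<expenseReport><total>42</total></expenseReport>'
)

content_type_id = read_pi_attribute(
    SAMPLE, "MicrosoftWindowsSharePointServices", "ContentTypeID"
)
```

If the named processing instruction or attribute is missing, the function returns None, which corresponds to the case where the document carries no content type information at that location.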

When a user uploads an XML document to a document library, SharePoint Foundation invokes the built-in XML parser. Before the parser can promote document properties, it must determine the document's content type, if any.

The parser first looks at the Field element in the document library schema that represents the content type ID column on the document library. The parser examines the Field element to find the location in the document where the content type ID should be stored, and then checks whether the content type ID is actually stored in the document at that location. If no content type ID is specified at that location, the parser assigns the default content type to the document. The parser then uploads the document and promotes any document properties accordingly.

If the document does contain a content type ID at the specified location, the parser determines if the content type with that ID is also associated with the document library. If it is, the parser uploads the document and promotes any document properties accordingly.

If the parser doesn't find an exact match, it examines the IDs of the content types on the document library to determine if one or more are children of the document content type. If so, the parser assigns the closest child content type to the document. The parser then uploads the document and promotes any document properties accordingly.

Note

The parser examines the list for content types that are children of the document content type because, in most cases, the document is assigned a site content type, and the matching list content type is a child of the site content type.
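The child lookup described above can be sketched in Python. A content type ID begins with the ID of its parent, so descendants can be found by prefix. The function name closest_child and the sample IDs are hypothetical, and treating the shortest descendant ID as the "closest" child is an assumption of this sketch, not documented parser behavior.

```python
def closest_child(document_ct_id, library_ct_ids):
    """Return the library content type ID that most closely descends from
    document_ct_id, or None if the library has no descendant of it."""
    parent = document_ct_id.lower()
    children = [
        ct_id
        for ct_id in library_ct_ids
        # A descendant's ID starts with the ancestor's ID and is longer than it.
        if ct_id.lower().startswith(parent) and len(ct_id) > len(parent)
    ]
    # Assumption: the shortest descendant ID is the closest child.
    return min(children, key=len) if children else None


# Hypothetical IDs: the document carries a site content type ID, and the
# library holds a list content type derived from it.
match = closest_child("0x0101AA", ["0x0102", "0x0101AA00BB01", "0x0101AA00BB"])
```

Here the document's site content type has no exact match on the library, but the list content type derived from it is selected.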

If the parser finds no content type match at all, it looks at the Field element in the document library schema that represents the document template column on the document library, if the column is present. If the document library does contain a document template column, the parser examines the Field element for the location in the document where the document template should be stored. The parser then determines if the document template is stored in the document at this location.

If the document does contain a document template, the parser compares the template with the document templates specified in each content type on the document library. If the parser finds a content type with the same document template as the document, the parser assigns that content type to the document. If there are multiple content types with the same document template as the document, the parser simply assigns the first such content type it finds. The parser then uploads the document and promotes any document properties accordingly.

Finally, if the parser cannot find a content type match, the parser assigns the default content type to the document. The parser then uploads the document and promotes any document properties accordingly.
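The sequence of checks described above can be summarized in a short Python sketch. It models only the decision order given in this topic. The ContentType class, the resolve_content_type function, its parameters, and the sample IDs and template names are all hypothetical, and "closest child" is again assumed to mean the descendant with the shortest ID.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ContentType:
    ct_id: str
    template: Optional[str] = None


def resolve_content_type(
    doc_ct_id: Optional[str],
    doc_template: Optional[str],
    library_types: Sequence[ContentType],
    default: ContentType,
    has_template_column: bool = True,
) -> ContentType:
    # 1. No content type ID stored in the document: use the default.
    if not doc_ct_id:
        return default
    wanted = doc_ct_id.lower()
    # 2. Exact match with a content type on the document library.
    for ct in library_types:
        if ct.ct_id.lower() == wanted:
            return ct
    # 3. Closest child (assumption: the descendant with the shortest ID).
    children = [ct for ct in library_types if ct.ct_id.lower().startswith(wanted)]
    if children:
        return min(children, key=lambda ct: len(ct.ct_id))
    # 4. Document template match; the first content type found wins.
    if has_template_column and doc_template:
        for ct in library_types:
            if ct.template == doc_template:
                return ct
    # 5. Nothing matched: fall back to the default content type.
    return default


# Hypothetical library with three content types; the first is the default.
LIBRARY = [
    ContentType("0x0101AA00BB", "report.xsn"),
    ContentType("0x010200CC", "invoice.xsn"),
    ContentType("0x010200DD", "invoice.xsn"),
]
chosen = resolve_content_type("0x0109", "invoice.xsn", LIBRARY, default=LIBRARY[0])
```

In this usage example the document's ID matches nothing on the library, so the template check applies and the first content type that uses invoice.xsn is chosen.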

The following flowchart shows the checks the parser performs to determine a document's content type.

[Flowchart: Logic flow of parser process]

For more information about how the parser promotes and demotes specific document properties, see Using Content Types to Specify XML Document Properties.

About Parser Operation

The parser looks to the document library's content type and document template columns to determine where in the XML file to locate those matching document properties. Therefore, for promotion and demotion to work correctly, all content types on a given document library must contain content type and document template column definitions that specify the same location for those document properties as the document library columns. Otherwise, the parser looks in the wrong location within the document for those properties.
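One way to picture this requirement is as a consistency check between the locations declared by the document library columns and those declared by each content type. The following Python sketch is illustrative only. The function name find_location_mismatches, the dictionary layout, the column names, and the location strings are hypothetical and do not reflect how SharePoint Foundation stores these definitions.

```python
def find_location_mismatches(library_columns, content_types):
    """Compare each content type's column locations with the library's.

    library_columns: dict mapping column name -> location in the document.
    content_types: dict mapping content type name -> (dict of column name -> location).
    Returns a list of (content type, column, content type location, library location).
    """
    mismatches = []
    for ct_name, columns in content_types.items():
        for column, expected in library_columns.items():
            actual = columns.get(column)  # None when the column is not defined
            if actual != expected:
                mismatches.append((ct_name, column, actual, expected))
    return mismatches


# Hypothetical locations, written here as "target/attribute" strings.
LIBRARY_COLUMNS = {
    "ContentTypeId": "targetA/ContentTypeID",
    "TemplateUrl": "targetB/href",
}
CONTENT_TYPES = {
    "Expense Report": {
        "ContentTypeId": "targetA/ContentTypeID",
        "TemplateUrl": "targetB/href",
    },
    "Invoice": {
        "ContentTypeId": "targetA/ContentTypeID",
        "TemplateUrl": "targetC/href",
    },
}
problems = find_location_mismatches(LIBRARY_COLUMNS, CONTENT_TYPES)
```

Any entry in the result identifies a content type for which the parser would look in the wrong place in the document.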

For more information about specifying content type by content type ID or document template, see Specifying Document Content Type for XML Parsing.
