XML Task

Applies to: SQL Server, SSIS Integration Runtime in Azure Data Factory

The XML task is used to work with XML data. Using this task, a package can retrieve XML documents, apply operations to the documents by using Extensible Stylesheet Language Transformations (XSLT) style sheets and XPath expressions, merge multiple documents, or validate, compare, and save the updated documents to files and variables.

This task enables an Integration Services package to dynamically modify XML documents at run time. You can use the XML task for the following purposes:

  • Reformat an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XSLT style sheet to customize the document presentation.

  • Select sections of an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XPath expression to select a section of the document. The operation can also get and process values in the document.

  • Merge documents from many sources. For example, the task can download reports from multiple sources and dynamically merge them into one comprehensive XML document.

  • Validate an XML document and optionally get detailed error output. For more info, see Validate XML with the XML Task.

You can include XML data in a data flow by using an XML source to extract values from an XML document. For more information, see XML Source.

XML Operations

The first action the XML task performs is to retrieve a specific XML document. This action is built into the XML task and occurs automatically. The retrieved XML document is used as the source of data for the operation that the XML task performs.

The XML operations Diff, Merge, and Patch require two operands. The first operand specifies the source XML document. The second operand also specifies an XML document, the contents of which depend on the requirements of the operation. For example, the Diff operation compares two documents; therefore, the second operand specifies another, similar XML document to which the source XML document is compared.

The XML task can use a variable or a File connection manager as its source, or include the XML data in a task property.

If the source is a variable, the specified variable contains the path of the XML document.

If the source is a File connection manager, the specified File connection manager provides the source information. The File connection manager is configured separately from the XML task, and is referenced in the XML task. The connection string of the File connection manager specifies the path of the XML file. For more information, see File Connection Manager.

The XML task can be configured to save the result of the operation to a variable or to a file. If saving to a file, the XML task uses a File connection manager to access the file. You can also save the results of the Diffgram generated by the Diff operation to files and variables.

Predefined XML Operations

The XML task includes a predefined set of operations for working with XML documents. The following table describes these operations.

Operation Description
Diff Compares two XML documents. Using the source XML document as the base document, the Diff operation compares it to a second XML document, detects their differences, and writes the differences to an XML Diffgram document. This operation includes properties for customizing the comparison.
Merge Merges two XML documents. Using the source XML document as the base document, the Merge operation adds the content of a second document into the base document. The operation can specify a merge location within the base document.
Patch Applies the output from the Diff operation, called a Diffgram document, to an XML document, to create a new parent document that includes content from the Diffgram document.
Validate Validates the XML document against a Document Type Definition (DTD) or XML Schema definition (XSD) schema.
XPath Performs XPath queries and evaluations.
XSLT Performs XSL transformations on XML documents.

Diff Operation

The Diff operation can be configured to use a different comparison algorithm depending on whether the comparison must be fast or precise. The operation can also be configured to automatically select a fast or precise comparison based on the size of the documents being compared.

The Diff operation includes a set of options that customize the XML comparison. The following table describes the options.

Option Description
IgnoreComments A value that specifies whether comment nodes are compared.
IgnoreNamespaces A value that specifies whether the namespace uniform resource identifier (URI) of an element and its attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace are considered to be identical.
IgnorePrefixes A value that specifies whether prefixes of element and attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace URI and prefix are considered identical.
IgnoreXMLDeclaration A value that specifies whether the XML declarations are compared.
IgnoreOrderOfChildElements A value that specifies whether the order of child elements is compared. If this option is set to true, child elements that differ only in their position in a list of siblings are considered to be identical.
IgnoreWhiteSpaces A value that specifies whether white spaces are compared.
IgnoreProcessingInstructions A value that specifies whether processing instructions are compared.
IgnoreDTD A value that specifies whether the DTD is ignored.
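As an illustration of how options like these change a comparison, the following Python sketch normalizes two documents before comparing them. It is not the XML task's Diff algorithm and it produces no Diffgram; it only reports whether two documents are equal under the chosen options. The function names (`normalize`, `same`) and the keyword arguments are hypothetical and merely mirror IgnoreWhiteSpaces, IgnoreOrderOfChildElements, and IgnoreNamespaces. The standard ElementTree parser drops comments by default, so the sketch always behaves as if IgnoreComments were true.

```python
import xml.etree.ElementTree as ET


def _local(name, ignore_namespaces):
    """Strip the "{uri}" prefix that ElementTree adds to namespaced names."""
    if ignore_namespaces and name.startswith("{"):
        return name.split("}", 1)[1]
    return name


def _text(value, ignore_whitespace):
    """Collapse runs of white space when white space is to be ignored."""
    value = value or ""
    return " ".join(value.split()) if ignore_whitespace else value


def normalize(elem, ignore_whitespace=True, ignore_order=False,
              ignore_namespaces=False):
    """Turn an element into a nested tuple that can be compared with ==."""
    children = [
        normalize(child, ignore_whitespace, ignore_order, ignore_namespaces)
        for child in elem
    ]
    if ignore_order:
        # Siblings that differ only in position compare as equal.
        children.sort(key=repr)
    attributes = sorted(
        (_local(key, ignore_namespaces), value)
        for key, value in elem.attrib.items()
    )
    return (
        _local(elem.tag, ignore_namespaces),
        tuple(attributes),
        _text(elem.text, ignore_whitespace),
        _text(elem.tail, ignore_whitespace),
        tuple(children),
    )


def same(a_xml, b_xml, **options):
    """Return True if both documents are equal under the given options."""
    return (normalize(ET.fromstring(a_xml), **options)
            == normalize(ET.fromstring(b_xml), **options))


doc_a = "<r><x>1</x><y>2</y></r>"
doc_b = "<r>\n  <y>2</y>\n  <x>1</x>\n</r>"

print(same(doc_a, doc_b))                     # child order is compared
print(same(doc_a, doc_b, ignore_order=True))  # child order is ignored
```

The two sample documents differ only in indentation and in the order of the child elements, so the result depends entirely on which options are enabled.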

Merge Operation

When you use an XPath statement to identify the merge location in the source document, this statement is expected to return a single node. If the statement returns multiple nodes, only the first node is used. The contents of the second document are merged under the first node that the XPath query returns.
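The first-node rule can be sketched in Python with the standard ElementTree module. This is an illustration of the behavior described above, not the XML task's implementation. The function name `merge_under_first_match` and the sample documents are hypothetical, and the sketch assumes that the root element of the second document is appended as a child of the selected node.

```python
import xml.etree.ElementTree as ET


def merge_under_first_match(base_xml, second_xml, xpath):
    """Append the second document under the first node that xpath selects."""
    base_root = ET.fromstring(base_xml)
    second_root = ET.fromstring(second_xml)
    # find() returns only the first match, so any further matches are ignored.
    target = base_root.find(xpath)
    if target is None:
        raise ValueError(f"XPath {xpath!r} selected no node in the base document")
    target.append(second_root)
    return ET.tostring(base_root, encoding="unicode")


base = "<reports><region name='east'/><region name='west'/></reports>"
extra = "<sales total='10'/>"

# "./region" matches two nodes; only the first one receives the merged content.
merged = merge_under_first_match(base, extra, "./region")
print(merged)
```

Because the XPath expression in the example matches both region elements, the second region is left unchanged.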

XPath Operation

The XPath operation can be configured to use different types of XPath functionality.

  • Select the Evaluation option to implement XPath functions such as sum().

  • Select the Node list option to return the selected nodes as an XML fragment.

  • Select the Values option to return the inner text value of all the selected nodes, concatenated into a string.
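The difference between the three result types can be illustrated with a small Python sketch. The sample document and variable names are hypothetical. Python's ElementTree supports only a subset of XPath and has no XPath functions, so the Evaluation case is emulated in plain Python rather than by evaluating an expression such as sum(//item/@price).

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<report><item price='2.5'>pen</item><item price='4'>ink</item></report>"
)
items = doc.findall("./item")

# Node list: the selected nodes, serialized one after another as an XML fragment.
node_list = "".join(ET.tostring(node, encoding="unicode") for node in items)

# Values: the inner text of every selected node, concatenated into one string.
values = "".join("".join(node.itertext()) for node in items)

# Evaluation: a single computed result; here, the sum of the price attributes.
evaluation = sum(float(node.get("price")) for node in items)

print(node_list)
print(values)
print(evaluation)
```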

Validation Operation

The Validation operation can be configured to use either a Document Type Definition (DTD) or XML Schema definition (XSD) schema.

Enable ValidationDetails to get detailed error output. For more info, see Validate XML with the XML Task.

XML Document Encoding

The XML task supports merging of Unicode documents only. This means the task can apply the Merge operation only to documents that have a Unicode encoding. Use of other encodings will cause the XML task to fail.

Note

The Diff and Patch operations include an option to ignore the XML declaration in the second-operand XML data, making it possible to use documents that have other encodings in these operations.

To verify that the XML document can be used, review the XML declaration. The declaration must explicitly specify UTF-8, which indicates 8-bit Unicode encoding.

The following XML declaration specifies the Unicode 8-bit encoding.

<?xml version="1.0" encoding="UTF-8"?>
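A check of this kind can be automated before a package runs. The following Python sketch reads the encoding named in the XML declaration; the function names `declared_encoding` and `declares_utf8` are hypothetical, and the sketch assumes the document bytes are ASCII-compatible, so it does not recognize a declaration in a UTF-16 file.

```python
import re

# Matches an XML declaration at the start of the document and captures
# the value of its encoding pseudo-attribute.
_DECLARATION = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)

_UTF8_BOM = b"\xef\xbb\xbf"


def declared_encoding(xml_bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    if xml_bytes.startswith(_UTF8_BOM):
        xml_bytes = xml_bytes[len(_UTF8_BOM):]
    match = _DECLARATION.match(xml_bytes)
    return match.group(1).decode("ascii") if match else None


def declares_utf8(xml_bytes):
    """Return True only if the declaration explicitly names UTF-8."""
    encoding = declared_encoding(xml_bytes)
    return encoding is not None and encoding.upper() == "UTF-8"


print(declares_utf8(b'<?xml version="1.0" encoding="UTF-8"?><a/>'))
print(declares_utf8(b'<?xml version="1.0"?><a/>'))
```

A document with no encoding in its declaration is reported as not declaring UTF-8, which matches the requirement that the encoding be specified explicitly.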

Custom Logging Messages Available on the XML Task

The following table describes the custom log entry for the XML task. For more information, see Integration Services (SSIS) Logging.

Log entry Description
XMLOperation Provides information about the operation that the task performs

Configuration of the XML Task

You can set properties through SSIS Designer or programmatically.

For more information about the properties that you can set in SSIS Designer, see XML Task Editor (General Page) later in this topic.

Programmatic Configuration of the XML Task

For more information about programmatically setting these properties, see Set the Properties of a Task or Container.

XML Task Editor (General Page)

Use the General page of the XML Task Editor dialog box to specify the operation type and configure the operation.

To learn about this task, see Validate XML with the XML Task. For information about working with XML documents and data, see "Employing XML in the .NET Framework" in the MSDN Library.

Static Options

OperationType
Select the operation type from the list. This property has the options listed in the following table.

Value Description
Validate Validates the XML document against a Document Type Definition (DTD) or XML Schema definition (XSD) schema. Selecting this option displays the dynamic options described under OperationType = Validate.
XSLT Performs XSL transformations on XML documents. Selecting this option displays the dynamic options described under OperationType = XSLT.
XPATH Performs XPath queries and evaluations. Selecting this option displays the dynamic options described under OperationType = XPATH.
Merge Merges two XML documents. Selecting this option displays the dynamic options described under OperationType = Merge.
Diff Compares two XML documents. Selecting this option displays the dynamic options described under OperationType = Diff.
Patch Applies the output from the Diff operation to create a new document. Selecting this option displays the dynamic options described under OperationType = Patch.

SourceType
Select the source type of the XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

Source
If SourceType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.

If SourceType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SourceType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

OperationType Dynamic Options

OperationType = Validate

Specify options for the Validate operation.

SaveOperationResult
Specify whether the XML task saves the output of the Validate operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
Select an existing File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

ValidationType
Select the validation type. This property has the options listed in the following table.

Value Description
DTD Use a Document Type Definition (DTD).
XSD Use an XML Schema definition (XSD) schema. Selecting this option displays the dynamic options described under ValidationType = XSD.

FailOnValidationFail
Specify whether the operation fails if the document fails to validate.

ValidationDetails
Provides rich error output when the value of this property is true. For more info, see Validate XML with the XML Task.

ValidationType Dynamic Options

ValidationType = XSD

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

OperationType = XSLT

Specify options for the XSLT operation.

SaveOperationResult
Specify whether the XML task saves the output of the XSLT operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

OperationType = XPATH

Specify options for the XPath operation.

SaveOperationResult
Specify whether the XML task saves the output of the XPath operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

PutResultInOneNode
Specify whether the result is written to a single node.

XPathOperation
Select the XPath result type. This property has the options listed in the following table.

Value Description
Evaluation Returns the results of an XPath function.
Node list Return the selected nodes as an XML fragment.
Values Return the inner text value of all selected nodes, concatenated into a string.

OperationType = Merge

Specify options for the Merge operation.

XPathStringSourceType
Select the source type of the XPath string that identifies the merge location in the source document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

XPathStringSource
If XPathStringSourceType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.

If XPathStringSourceType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If XPathStringSourceType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable

When you use an XPath statement to identify the merge location in the source document, this statement is expected to return a single node. If the statement returns multiple nodes, only the first node is used. The contents of the second document are merged under the first node that the XPath query returns.

SaveOperationResult
Specify whether the XML task saves the output of the Merge operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable

OperationType = Diff

Specify options for the Diff operation.

DiffAlgorithm
Select the Diff algorithm to use when comparing documents. This property has the options listed in the following table.

Value Description
Auto Let the XML task determine whether to use the fast or precise algorithm.
Fast Use a fast, but less precise Diff algorithm.
Precise Use a precise Diff algorithm.

Diff Options
Set the Diff options to apply to the Diff operation. The options are listed in the following table.

Value Description
IgnoreXMLDeclaration Specify whether to compare the XML declaration.
IgnoreDTD Specify whether to ignore the document type definition (DTD).
IgnoreWhiteSpaces Specify whether to ignore differences in the amount of white space when comparing documents.
IgnoreNamespaces Specify whether to compare the namespace uniform resource identifier (URI) of an element and its attribute names. If this option is set to True, two elements that have the same local name but different namespaces are considered identical.
IgnoreProcessingInstructions Specify whether to compare processing instructions.
IgnoreOrderOfChildElements Specify whether to compare the order of child elements. If this option is set to True, child elements that differ only in their position in a list of siblings are considered identical.
IgnoreComments Specify whether to compare comment nodes.
IgnorePrefixes Specify whether to compare prefixes of element and attribute names. If this option is set to True, two elements that have the same local name, but different namespace URIs and prefixes, are considered identical.

FailOnDifference
Specify whether the task fails if the Diff operation fails.

SaveDiffGram
Specify whether to save the comparison result, a DiffGram document.

SaveOperationResult
Specify whether the XML task saves the output of the Diff operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable

OperationType = Patch

Specify options for the Patch operation.

SaveOperationResult
Specify whether the XML task saves the output of the Patch operation.

OverwriteDestination
Specify whether to overwrite the destination file or variable.

Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable.

DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.

Value Description
File connection Select a file that contains the XML document.
Variable Set the destination to a variable that contains the XML document.

SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.

Value Description
Direct input Set the source to an XML document.
File connection Select a file that contains the XML document.
Variable Set the source to a variable that contains the XML document.

SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.

If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.

Related Topics: File Connection Manager, File Connection Manager Editor

If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.

Related Topics: Integration Services (SSIS) Variables, Add Variable