This topic has not yet been rated - Rate this topic

Configure Data Correction in the Data Cleansing Transformation Editor

[This documentation is for preview only, and is subject to change in later releases. Blank topics are included as placeholders.]

To configure a Data Correction transformation, you use the Data Correction Transformation Editor. You use the editor to configure the following settings:

  • On the Data Quality Field Schema tab, you configure the Data Quality Services server to which the transformation will connect and the data quality field schema that contains the correction rules that the transformation will apply.

  • On the Columns tab, you select the input columns to which you want to apply data corrections.

  • On either tab, you can configure the way in which the destination will handle errors.

Configuring the Data Correction transformation is part of the second step in discovering rules about the data that can later be used to correct data. For more information about this data discovery and correction process, see Data Cleansing.

  1. In SQL Server Data Tools (SSDT), open the Integration Services package that has the Data Correction transformation.

  2. On the Data Flow tab, double-click the Data Correction Transformation.

On the Data Quality Field Schema tab, you select a data quality field schema for the connected data source. The schema contains the rules that the Data Correction transformation uses to correct data. The schema is in the Data Quality Knowledge Base on the Data Quality Services server. To access the schema, you specify a Data Quality Services connection manager to connect to the Data Quality Services server, as described in the following procedure.

To connect to the Data Quality Services server and select a data quality field schema

  1. If not already open, open the Data Quality Field Schema tab.

  2. From the Connection manager list, select an existing Data Quality Services connection manager.

    —or—

    Click New to open the Connection Manager dialog box and create a new Data Quality Services connection manager. For more information, see Configure a Connection in the Data Quality Services Connection Manager.

  3. From the Data quality field schema list, select a schema for the data source that you connected to the transformation.

    Note Note

    When you use the Data Discovery destination to create rules for applying data correction to a data source, you also create a schema for the data source. For more information, see Data Discovery Destination.

    The names and data types of the columns that are associated with the selected data quality field schema now appear in the Schema columns box.

On the Columns tab, you select the input columns to which you want to apply data corrections. To correct the data, you map schema columns to the selected input columns, as described in the following procedure.

To select the input columns to which you want to apply corrections

  1. If not already open, open the Columns tab.

  2. From the Name list, use the check boxes to select the input columns to which you want to apply data corrections.

    The Name list shows all available input columns in the data flow from the connected source.

    After you select a column in the Name list, that column appears in the Input Columns list. If you clear the selection on the Name list, the column disappears from the Input Columns list.

    Important note Important

    The system applies data corrections only to selected input columns with the DT_STR and DT_WSTR data types. The other selected input columns are passed through the transformation.

    Input columns with unsupported data types are not available for selection.

  3. In the Data Quality Columns list, select the schema columns that you want to map to the input columns.

    You map a schema column to an input column to correct the data in the input column.

    Note Note

    You cannot map a schema column to more than one input column.

    Note Note

    The data types of the mapped columns must be compatible.

  4. Optionally, modify the column names:

    • In the Output Alias list, type a unique alias for the output that contains the original value from the input column. The default value is the input column name.

    • In the Corrected Output Alias list, type a unique alias for the output that contains the corrected version of the original value. The default value is the input column name with the prefix, Corrected.

    • In the Merged Output Alias list, type a unique alias for the output that contains either the corrected value when the original value is corrected or the original value when the original value is not corrected. The default value is the input column name with the prefix, Merged.

Errors can occur when the transformation corrects data from the connected data source. These errors can occur because of unexpected data values or validation constraints.

To configure the error output

  • On either tab, from the Configure Error Output list, specify how to handle row-level errors that occur when the transformation corrects data from the connected data source:

    • Select Fail Component if you want the transformation to fail when an error occurs. No input data will be inserted into the Data Quality Services database. (This is the default value.)

    • Select Redirect Row if you want the transformation to redirect the input data to the error output when an error occurs.

Did you find this helpful?
(1500 characters remaining)