Functional Dependency Profile Request Options (Data Profiling Task)
Use the Request Properties pane of the Profile Requests page to set the options for the Functional Dependency Profile Request selected in the requests pane. A Functional Dependency profile reports the extent to which the values in one column (the dependent column) depend on the values in another column or set of columns (the determinant column). This profile can also help you identify problems in your data such as invalid values. For example, you profile the dependency between a Zip Code/Postal Code column and a United States state column. In this profile, the same Zip Code should always have the same state, but the profile discovers violations of the dependency.
The options described in this topic appear on the Profile Requests page of the Data Profiling Task Editor. For more information about this page of the editor, see Data Profiling Task Editor (Profile Requests Page).
For more information about how to use the Data Profiling Task, see Setup of the Data Profiling Task. For more information about how to use the Data Profile Viewer to analyze the output of the Data Profiling Task, see Data Profile Viewer.
A Functional Dependency Profile Request computes the degree to which the determinant side column or set of columns (specified in the DeterminantColumns property) determines the value of the dependent side column (specified in the DependentColumn property). For example, a United States state column should be functionally dependent on a United States Zip Code column. That is, if the Zip Code (determinant column) is 98052, the state (dependent column) should always be Washington.
For the determinant side, you can specify a column or a set of columns in the DeterminantColumns property. For example, consider a sample table that contains columns A, B, and C. You make the following selections for the DeterminantColumns property:
When you select the (*) wildcard, the Data Profiling task tests each column as the determinant side of the dependency.
When you select the (*) wildcard and another column or columns, the Data Profiling task tests each combination of columns as the determinant side of the dependency. For example, consider a sample table that contains columns A, B, and C. If you specify (*) and column C as the value of the DeterminantColumns property, the Data Profiling task tests the combinations (A, C) and (B, C) as the determinant side of the dependency.
For the dependent side, you can specify a single column or the (*) wildcard in the DependentColumn property. When you select (*), the Data Profiling task tests the determinant side column or set of columns against each column.
If you select (*), this option might result in a large number of computations and decrease the performance of the task. However, if the task finds a subset that satisfies the threshold for a functional dependency, the task does not analyze additional combinations. For example, in the sample table described above, if the task determines that column C is a determinant column, the task does not continue to analyze the composite candidates.
For a Functional Dependency Profile Request, the Request Properties pane displays the following groups of options:
Data, which includes the DeterminantColumns and DependentColumn options
DeterminantColumns and DependentColumn Options
The following options are presented for each column selected for profiling in DeterminantColumns and in DependentColumn.
For more information, see the section, "Understanding the Selection of Determinant and Dependent Columns," earlier in this topic.