Data Quality Matching in the MDS Add-in for Excel
Topic Status: Some information in this topic is preview and subject to change in future releases. Preview information describes new features or changes to existing features in Microsoft SQL Server 2016 Community Technology Preview 2 (CTP2).
Over time, you will want to add more data to the MDS repository. Before adding data, it can be useful to compare the new data to the data that’s already managed in MDS, to ensure you are not adding duplicate or inaccurate data.
The MDS Add-in for Excel uses the Data Quality Services (DQS) feature of SQL Server to match data that’s similar. When you use the matching functionality in the Add-in, similar records are grouped together and a score that represents the accuracy of the result is displayed. For more information about the matching functionality provided by DQS, see Data Matching.
When using DQS with the MDS Add-in for Excel, use the following workflow:
Retrieve a list of MDS-managed data and combine it with a list that is not managed in MDS. For more information, see Combine Data (MDS Add-in for Excel).
Use DQS knowledge to compare the data in the combined list. For more information, see Match Similar Data (MDS Add-in for Excel).
To view more details about the similarities found by DQS, show the detail columns.
Go through the results and determine which data should be added to the MDS repository and which data is duplicated.
Publish the new and/or updated data to the MDS repository.
The matching results provided in the Add-in are based on a DQS knowledge base.
The default knowledge base (DQS Data) is created when DQS is installed. If you choose to use the default knowledge base (without adding a matching policy to the default knowledge base in the Data Quality Client), you must map columns in the worksheet to domains in the knowledge base, and then assign a weight value to the domains you choose.
You can use the Data Quality Client to create a new knowledge base with a matching policy, or to add a matching policy to the default knowledge base. In this case, the weight values are determined by the matching policy you already created and you need only to map the columns to the domains. For more information, see Create a Matching Policy.
For more information about knowledge bases, see DQS Knowledge Bases and Domains.