Generating Test Data for Databases by Using Data Generators
You can use Visual Studio Premium to create test data that you need to verify the behavior of the database objects in your database project. You can generate data that is appropriate for the schema of the database but unrelated to the production data. This approach can help you protect the privacy or security of the production data.
You define a data generation plan that specifies the details of how you want data to be generated for specific tables and columns. For each column, you specify a data generator that produces data of a particular type.
When you create a data generation plan in a database project, the plan is based on the schema of the project. If you create a data generation plan and then the schema changes, you are prompted to update the plan.
Important |
|---|
If you do not update the data generation plan based on the schema changes, you cannot continue to modify the plan. You can only close and reopen the data generation plan to display the prompt again. |
When you run database unit tests, you can specify a different data generation plan for each test project. Therefore, you can initialize the database to a different state for each group of tests.
Visual Studio Premium includes several built-in data generators for generating different kinds of data. For example, the integer generator provides random integer values, the string generator provides random strings, and the regular expression generator provides strings that match a pattern that you specify. For more information and the complete list of standard data generators, see Standard Data Generator Types and Specify Details of Data Generation for a Column.
If the generators that are included with Visual Studio Premium do not provide data that meets your requirements, you can create custom generators. For example, you might want to create a custom data generator if your database contains a column that has a check constraint that references another column. For more information, see Generate Specialized Test Data with a Custom Data Generator.
You might receive errors if you try to generate data for a table that contains SPARSE columns and a COLUMN_SET. For more information, see Troubleshooting Data Generation Issues.
A data generation plan is an XML file that contains information about the database schema and configuration information that controls the data generation for each column in each table that you specify. A data generation plan for a database that contains approximately 40 tables can be approximately 8 MB and more than 100,000 lines long.
You cannot automatically merge changes to a .dgen file by using Team Foundation version control or another version control system. The process that you must use to manually merge large XML files is difficult and prone to errors. You can minimize problems that relate to your data generation plan by using exclusive checkout on the plan when you must change it.
Common Tasks | Supporting Content |
|---|---|
Get hands-on practice: You can follow an introductory walkthrough to become familiar with how to create and run a simple database unit test. | |
Create a data generation plan: Create a data generation plan for each unit test project or for specific application tests. You can also define a data generation plan that defaults all columns to reuse data from another database that you specify. | |
Upgrade a data generation plan from a previous release: You can use data generation plans that you created with a previous release of Visual Studio Premium. However, you must upgrade the plan immediately after you open it. | |
Specify the tables that you want to generate: You can generate data for all tables or for only those tables that you specify. For each table, you can specify a number of rows to generate. As an alternative, you can generate rows in a ratio that is based on the number of rows that you generate for another table. | |
Configure how data is generated for each column in the tables that you specify: You specify the type of data generator for each column. You can also provide constraints on the range of data values. For more-advanced data generators, you can provide additional configuration information to further refine the data that you generate. | |
Preview the generated data: Before you write data to your database, you can preview the data for any table that is included in your data generation plan. By previewing the data, you can adjust your data generation plan without generating data iteratively. | |
Generate test data: After you have previewed and verified the data that you will generate, you can run the data generation plan to generate the test data. You can also configure your unit test project so that data is generated automatically whenever you run your unit tests. | |
Check in the data generation plans: Just as you must check in your unit tests to version control to share them with your team, you must also check in your data generation plans. | |
Change data generator defaults and options: You can specify default values for the random seed and the number of rows to insert. You can also specify options that control aspects of the data generators, such as the number of rows to display in the preview window and the maximum number of acceptable errors that can be displayed while data is generated for an individual table before the operation fails. | |
Troubleshoot problems: You can learn more about how to troubleshoot common problems with creating and using data generation plans. |
Important