An Overview of Data Generator Extensibility
You can use Visual Studio Premium or Visual Studio Ultimate to generate meaningful data for testing. By using the built-in data generators, you can generate random data, generate data from existing data sources, and control many aspects of data generation. If the functionality of the built-in generators is insufficient, you can create custom data generators. To create custom data generators, you use the classes in the Microsoft.Data.Schema.Tools.DataGenerator namespace.
The extensibility API provides classes from which developers can inherit. In addition to classes, the API includes attributes that you can apply to your derived classes. By applying these attributes, you reduce the amount of code that is required in custom generators for common cases.
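The idea of declaring inputs and outputs on a derived class, rather than wiring them up by hand, can be sketched in a language-neutral way. The Python below is a conceptual model only and is not the Microsoft.Data.Schema.Tools.DataGenerator API: Marker, describe, and IntegerRangeGenerator are hypothetical names that stand in for attributes applied to members and for the reflection that discovers them.

```python
# Conceptual model only. All names here are hypothetical and do not
# belong to the Microsoft.Data.Schema.Tools.DataGenerator namespace.

class Marker:
    """Stand-in for an attribute that tags a member as an input or output."""

    def __init__(self, role, default=None):
        self.role = role
        self.default = default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value


def describe(generator_type, role):
    """Discover tagged members by inspecting the class, as a designer might."""
    return sorted(
        name
        for name, member in vars(generator_type).items()
        if isinstance(member, Marker) and member.role == role
    )


class IntegerRangeGenerator:
    # Declaring the members is all the author has to do.
    minimum = Marker("input", default=0)
    maximum = Marker("input", default=100)
    result = Marker("output")


print(describe(IntegerRangeGenerator, "input"))   # ['maximum', 'minimum']
print(describe(IntegerRangeGenerator, "output"))  # ['result']
```

The generator class contains no code for enumerating its own inputs and outputs; the tags carry that information, which is the kind of saving the attributes are described as providing.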
You can use the extensibility API in the following three ways to create custom data generators:
The built-in integer data generator
Medium. This method is recommended most of the time.
The base extensibility API is the mechanism by which the data generation engine and the designers for data generation plans interact. This API was designed to meet the following goals:
Robustness — To promote a consistent and robust implementation in both the design-time and run-time engines.
Flexibility — To support complex generators such as the data bound generator.
The trade-off is complexity: the base extensibility API is more complex to use than the higher-level declarative extensibility API.
Before you can use your custom data generator, you must register it on your computer. If you distribute your custom data generator to other people, they must also register it on their computers.
You can register custom data generators in the following ways:
You can create custom data generators and custom designers for those generators. You can also create custom distributions for numeric data generators and custom designers for those distributions.
Custom data generators produce random test data according to a set of rules that you specify. You can use the default designer with those generators, or you can create a custom designer for them by inheriting from DefaultGeneratorDesigner. For example, the regular expression data generator is a built-in generator, but it uses a custom designer so that it can perform custom validation of user inputs at design time.
By using a custom generator designer, you can customize how input and output properties are retrieved from the user, set default values, and specify validation behavior.
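The division of labor between a default designer and a custom one can be illustrated with a small model. This is a sketch under stated assumptions, not the real designer API: DefaultDesigner, RegexDesigner, default_inputs, and validate are hypothetical names, and the regular expression check simply uses Python's re module to mimic validating user input at design time.

```python
import re

# Conceptual model only. These class and method names are hypothetical.

class DefaultDesigner:
    """Supplies no defaults and accepts any input values."""

    def default_inputs(self):
        return {}

    def validate(self, inputs):
        return []  # list of error messages; an empty list means valid


class RegexDesigner(DefaultDesigner):
    """Custom designer that sets a default and validates the expression."""

    def default_inputs(self):
        return {"expression": "[a-z]{5}"}

    def validate(self, inputs):
        try:
            re.compile(inputs["expression"])
        except re.error as exc:
            return [f"Invalid regular expression: {exc}"]
        return []


designer = RegexDesigner()
print(designer.validate(designer.default_inputs()))        # []
print(len(designer.validate({"expression": "[a-z"})) > 0)  # True
```

Rejecting a malformed expression in the designer means the problem surfaces while the user is editing the plan, before any data is generated.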
By using a custom distribution, you can control the distribution of numeric values that a data generator generates.
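As an illustration of what controlling the distribution means, the sketch below contrasts a uniform distribution with a custom one that favors small values. It is a conceptual model and does not use the real distribution classes: UniformDistribution, LowBiasedDistribution, sample, and generate are hypothetical names.

```python
import random

# Conceptual model only. These class and function names are hypothetical.

class UniformDistribution:
    """Every value in the range is equally likely."""

    def sample(self, rng, low, high):
        return rng.randint(low, high)


class LowBiasedDistribution:
    """Custom distribution that favors values near the low end."""

    def sample(self, rng, low, high):
        u = rng.random()                # uniform in [0.0, 1.0)
        span = high - low + 1
        return low + int(u * u * span)  # squaring shifts mass toward 0


def generate(distribution, count, low, high, seed):
    """Generate `count` integers in [low, high] using the distribution."""
    rng = random.Random(seed)
    return [distribution.sample(rng, low, high) for _ in range(count)]


values = generate(LowBiasedDistribution(), 1000, 0, 99, seed=42)
print(min(values) >= 0 and max(values) <= 99)  # True
```

The generator decides the range and the data type; the distribution decides only how likely each value in that range is.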
Custom distribution designers control the design-time behavior for a custom distribution. This behavior includes getting the names of the input properties for the distribution, setting the default values of the input properties, and validating the values of the input properties for the distribution.
The data generators that are included with Visual Studio Premium and Visual Studio Ultimate are localized because Visual Studio ships multiple language versions. You probably do not have to localize your custom data generators. If you must create a data generator that will be localized, you should create a custom designer. You can also override the GetInputs method to localize the input property names.
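The approach of overriding the method that returns input descriptors can be modeled as follows. This is a hedged sketch rather than the real GetInputs signature: Designer, LocalizedDesigner, get_inputs, the culture parameter, and the RESOURCES table are all hypothetical, and the table stands in for whatever resource mechanism you actually use.

```python
# Conceptual model only. Names and the resource table are hypothetical.

RESOURCES = {
    "en": {"length": "Length", "seed": "Seed"},
    "de": {"length": "Länge", "seed": "Startwert"},
}


class Designer:
    """Returns input descriptors whose display name is the raw member name."""

    input_names = ["length", "seed"]

    def get_inputs(self, culture="en"):
        return [{"name": n, "display_name": n} for n in self.input_names]


class LocalizedDesigner(Designer):
    """Overrides get_inputs to look up a display name for the culture."""

    def get_inputs(self, culture="en"):
        table = RESOURCES.get(culture, RESOURCES["en"])
        return [
            {"name": n, "display_name": table.get(n, n)}
            for n in self.input_names
        ]


for descriptor in LocalizedDesigner().get_inputs("de"):
    print(descriptor["name"], "->", descriptor["display_name"])
```

Only the display name changes with the culture; the underlying member name stays fixed so that stored plans remain valid across languages.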
Custom data generators can share data. Shared data is scoped to the combination of generator type and database table: every generator type has its own dictionary instance for each database table. For example, a custom data generator for a table named Customers has access to a shared dictionary, and every instance of that generator type for that table is guaranteed to receive the same dictionary instance. You can put any information into the dictionary. For example, you can create a custom data generator that requests the dictionary from GeneratorInit and checks whether it already contains shared information. If it does, the generator can use that information to generate data. The generator can also create the shared information that other instances of your generator can use.
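The scoping rule described above, one dictionary per generator type and table, can be sketched as follows. This is a conceptual model, not the GeneratorInit API: SharedStore, SequenceGenerator, and their methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Conceptual model only. These class and method names are hypothetical.

class SharedStore:
    """Hands out one dictionary per (generator type, table name) pair."""

    def __init__(self):
        self._dicts = defaultdict(dict)

    def get(self, generator_type, table_name):
        return self._dicts[(generator_type, table_name)]


class SequenceGenerator:
    """Instances for the same table draw from one shared counter."""

    def initialize(self, store, table_name):
        shared = store.get(type(self), table_name)
        if "next_id" not in shared:
            shared["next_id"] = 1  # first instance creates the shared data
        self._shared = shared

    def generate_next(self):
        value = self._shared["next_id"]
        self._shared["next_id"] += 1
        return value


store = SharedStore()
a, b, c = SequenceGenerator(), SequenceGenerator(), SequenceGenerator()
a.initialize(store, "Customers")
b.initialize(store, "Customers")
c.initialize(store, "Orders")
print(a.generate_next(), b.generate_next(), c.generate_next())  # 1 2 1
```

The two generators for Customers see each other's updates, while the generator for Orders starts from its own dictionary.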
Generator instancing is an advanced technique. You can use it to create a custom data generator that handles check constraints across columns, such as a check constraint that requires the value in one column to be greater than the value in another column.
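One way to picture the cross-column case is a single generator that produces both column values for a row, so the constraint holds by construction. The sketch below assumes that this is the effect instancing is meant to achieve; that reading, along with the names OrderedPairGenerator, generate_next_values, and get_output_value as used here, is an assumption and not taken from the API reference.

```python
import random

# Conceptual model only. The class is hypothetical, and treating instancing
# as "one generator feeds several columns of a row" is an assumption.

class OrderedPairGenerator:
    """Produces two outputs per row such that high > low always holds."""

    def __init__(self, seed, minimum=0, maximum=100):
        if minimum >= maximum:
            raise ValueError("minimum must be less than maximum")
        self._rng = random.Random(seed)
        self.minimum = minimum
        self.maximum = maximum
        self._values = {}

    def generate_next_values(self):
        # Pick low first, then pick high from the range strictly above it.
        low = self._rng.randint(self.minimum, self.maximum - 1)
        high = self._rng.randint(low + 1, self.maximum)
        self._values = {"low": low, "high": high}

    def get_output_value(self, key):
        return self._values[key]


generator = OrderedPairGenerator(seed=7)
rows = []
for _ in range(5):
    generator.generate_next_values()
    rows.append((generator.get_output_value("low"),
                 generator.get_output_value("high")))
print(all(high > low for low, high in rows))  # True
```

Two independent generators could not guarantee the ordering, because neither would know the value the other produced for the same row.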
Data generation occurs in the following phases:
Determine the designer type
This phase requires the type of the data generator as an input. The engine can then query the GeneratorAttribute to retrieve the designer type. Most of the time, GeneratorAttribute is inherited from the base class, which specifies the default designer.
Instantiate and initialize the designer
The designer is instantiated and then initialized by calling Initialize with the generator type as an argument.
Retrieve the input descriptors
Set the default values
The default values are set.
Get the generator output descriptions
The OutputDescriptor is retrieved from the designer. The default designer uses properties that are marked with OutputAttribute to create the descriptions that appear in the Generator Output column of the Column Details window.
Instantiate the generator
The data generator is instantiated by using the default constructor.
Set the generator inputs
All input values are set in the data generator from the input descriptors that are retrieved from the designer.
Validate the generator
The ValidateInputs method is called. If validation fails, the generator throws an InputValidationException. Any other exception is treated as an unrecoverable error.
Initialize the generator
The Initialize method is called. This step enables the data generator to perform any necessary setup, such as specifying the connection string for the target database or seeding the random number generator. This phase occurs once, before data generation begins.
Run the data generation
During this phase, each new result is generated by calling the GenerateNextValues method and then retrieved by calling the GetOutputValue method. GetOutputValue returns the scalar value that corresponds to the output key that is passed to it. This phase repeats until all the requested results have been generated.
After all data generation is complete, Dispose is called to clean up the generator.
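The run-time phases above, from instantiating the generator through disposing of it, can be sketched as a small driver loop. This is a conceptual model in Python, not the engine's implementation: run, CounterGenerator, and InputValidationError are hypothetical names that mirror the method names used in the phase descriptions, and the design-time phases are omitted.

```python
# Conceptual model only. Names mirror the phases but are hypothetical.

class InputValidationError(Exception):
    """Stand-in for the exception that signals invalid inputs."""


class CounterGenerator:
    def __init__(self):              # default constructor
        self.start = 0
        self.step = 1
        self.disposed = False

    def validate_inputs(self):
        if self.step == 0:
            raise InputValidationError("step must not be zero")

    def initialize(self):            # called once, before generation
        self._next = self.start
        self._current = None

    def generate_next_values(self):
        self._current = self._next
        self._next += self.step

    def get_output_value(self, key):
        if key != "value":
            raise KeyError(key)
        return self._current

    def dispose(self):
        self.disposed = True


def run(generator_type, inputs, row_count, output_key):
    generator = generator_type()             # instantiate the generator
    for name, value in inputs.items():       # set the generator inputs
        setattr(generator, name, value)
    generator.validate_inputs()              # validate the generator
    generator.initialize()                   # initialize the generator
    try:
        rows = []
        for _ in range(row_count):           # run the data generation
            generator.generate_next_values()
            rows.append(generator.get_output_value(output_key))
        return rows
    finally:
        generator.dispose()                  # clean up


print(run(CounterGenerator, {"start": 10, "step": 5}, 3, "value"))
# [10, 15, 20]
```

In this sketch a validation failure propagates to the caller before initialization, so no data is generated from invalid inputs; how the real engine reports the failure is not modeled here.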