Identifier and Property Mapping

Mapping generic concepts to identifier parts and properties is more complicated than mapping types, which is discussed in Type Mapping. In addition to the b:g problem that type mapping also has to solve, there is the added matter of data mapping: data-source-specific data must be converted to the format that the generic concepts expect.

In this and the following sections, we'll first discuss the more difficult matter of data mapping and then apply that discussion to the various flavors of the b:g problem:

  • Data Mapping

  • Identifier and Property Mapping

  • 1:1 Data Mapping

  • 1:g Data Mapping

  • b:1 Data Mapping

  • b:g Data Mapping

Data Mapping

Data mapping presents difficulties of two main types. First, there is the matter of conflicting data types as, for example, when Boolean-type values true and false are stored as strings. Second, there is the problem of different values representing the same data as, for example, when you use the string index values "0", "1" and "2" alongside analogous string values "NONKEY", "UNIQUE" and "PRIMARY".

To provide the greatest flexibility, a data provider can specify a series of conversion steps that, when followed, convert a data-source-specific value to a value consistent with a generic mapped type. Each conversion step may implement one of the following actions:

  • ChangeType
    To perform a type conversion on the current value, call the .NET Framework method Convert.ChangeType. It is up to the data provider to ensure that the current value is of a type that can be converted (that is, that it implements the IConvertible interface), and that it contains data that can be converted successfully.

  • Match
    To parse a string representation of the current value, use a .NET regular expression. This produces a set of current values equal to the values of each matched group (or if no groups are specified, the single matching value). If no match is found, a default set of new current values specified by the provider is chosen. If the provider knows a match will always be found, the default values do not need to be specified.

  • Split
    To parse a string representation of the current value into parts separated by a string that matches the expression, use a .NET regular expression. This produces a set of current values equal to the values of each part of the string.

  • Replace
    To parse a string representation of the current value and then use a replacement pattern to build a new current value based on matched group values, use a .NET regular expression.

  • Format
    To build a string based on current values, use a .NET format string. The set of current values is passed as format parameters to the String.Format function call, which makes it possible to reference the set of current values using {0}, {1}, and so on.

  • Calculate
    To build a new value based on current values, use an ADO.NET expression. The expression string is first passed through String.Format with the set of current values as format parameters, which makes it possible to reference these values using {0}, {1}, and so on. Because of this formatting pass, any literal curly braces in the expression must be escaped by doubling them ({{ and }}).

  • CallMapper
    This element should be the last conversion in the set of conversion steps. It specifies a custom method that is called for converting the value of the mapped member, and also contains the type that implements this method.

For each of the above actions, one or more current values may contain data that the action can reference. When a conversion begins, an initial current value is equal to the data-source-specific value. As the conversion steps proceed, these current values are modified, ending up as a final converted value.

This conversion-step architecture allows data providers to manipulate virtually any data-source-specific data into the proper type and format for the generic mapped type it represents.
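
The way a set of current values flows through a series of conversion steps can be sketched outside of .NET. The following Python sketch is only a rough analogue, not the provider framework itself: the function names (match, split, format_values, change_type, run) are hypothetical, and Python's re module and str.format stand in for .NET regular expressions and String.Format.

```python
import re

def match(values, pattern, defaults=None):
    """Match analogue: search the first current value; the new current values
    are the matched groups (or the whole match if the pattern has no groups).
    If nothing matches, fall back to the provider-supplied defaults."""
    m = re.search(pattern, values[0])
    if m is None:
        return list(defaults or [])
    return list(m.groups()) if m.groups() else [m.group(0)]

def split(values, pattern):
    """Split analogue: the new current values are the parts of the string
    separated by text that matches the pattern."""
    return re.split(pattern, values[0])

def format_values(values, template):
    """Format analogue: current values are referenced as {0}, {1}, ...;
    literal braces in the template are written as {{ and }}."""
    return [template.format(*values)]

def change_type(values, target):
    """ChangeType analogue: convert the single current value to a new type."""
    return [target(values[0])]

def run(initial, steps):
    """Start with the data-source-specific value and apply each step in order."""
    values = [initial]
    for step in steps:
        values = step(values)
    return values

# Hypothetical sample: extract a length from "VARCHAR(50)" and make it an integer.
length = run("VARCHAR(50)", [
    lambda v: match(v, r"\((\d+)\)", defaults=["0"]),
    lambda v: change_type(v, int),
])

# Hypothetical sample: split a two-part name and rebuild it in another format.
quoted = run("dbo.Orders", [
    lambda v: split(v, r"\."),
    lambda v: format_values(v, "[{0}].[{1}]"),
])
```

Each step receives the list of current values and returns a new list, which mirrors the description above: the first list holds only the data-source-specific value, and the last list holds the converted value.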

Identifier and Property Mapping

The b:g problem for identifier parts and properties occurs because you can have data that is built or parsed from other data. For example, a column data type can be composed of a type name, a length, a precision, and a scale. Some, all, or none of these data could be part of a single property. The b:g problem occurs when the generic identifiers or properties are not equivalently packaged.

As an example, consider a SQL Server database that exposes the data type of a column as a single property encompassing the type name, length, precision, and scale. If a generic mapped property DataType expects this information in a particular format and SQL Server happens to use the same format, the mapping is 1:1 and there is no problem (see the 1:1 Data Mapping section below). On the other hand, if there are separate generic mapped properties DataType, Length, Precision, and Scale (which is far more likely), you have a 1:g mapping: a single data-source-specific property (DataType) represents more than one generic property concept (see the 1:g Data Mapping section below).

As another example, consider a database that exposes the data-source-specific properties IsPrimaryKey and IsUniqueKey to describe the type of an index. Suppose, too, that there is a generic property concept, IndexKeyType, that uses a single value to specify (a) whether the index is a key, and (b) if so, which kind of key. (In this scenario the values might be No key=0, Unique key=1, and Primary key=2.) This is a b:1 mapping, in which multiple data-source-specific properties are represented by a single generic property concept (see the b:1 Data Mapping section below).

Finally, consider an example of the most complicated case, in which there are data-source-specific properties DataType and DataTypeNumInfo. The DataType property includes a name and length, and the DataTypeNumInfo property includes precision and scale. Now suppose there are generic properties DataTypeName and DataTypeInfo. The DataTypeName property includes just a name, and the DataTypeInfo property includes length, precision, and scale. This is a b:g mapping, wherein multiple data-source-specific properties are represented by multiple generic properties.

Following are summaries and code examples of each of the identifier and property-mapping scenarios.

1:1 Data Mapping

The one-to-one mapping is the simplest case and requires little work. A data provider has a single data-source-specific identifier part or property that maps to a single generic identifier part or property concept as in the following example:

<Type name="Table" preferredOrdering="Database, Schema, Name">
...
    <Properties>
...
        <Property name="ObjectType" type="System.String" />
    </Properties>
</Type>

<MappedType name="Table" underlyingType="Table">
...
    <Properties>
        <Property name="IsSystemObject" underlyingMember="ObjectType">
            <Conversion>
                <Calculate expr="IIF('{0}'='SYSTEM',true,false)" exprType="System.Boolean" />
            </Conversion>
        </Property>
    </Properties>
</MappedType>

In this example, a data-source-specific property, ObjectType, can contain one of two string values, "USER" or "SYSTEM". This information maps to the generic mapped property IsSystemObject, which is defined as a Boolean value. Thus, a conversion step calculates a Boolean value from the data-source-specific string value.
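
The two stages of the Calculate step (substituting the current value into the expression text, then evaluating the expression) can be illustrated with a small Python sketch. It makes two assumptions: the string literals in the expression are written in single quotes, and the evaluation is imitated with an ordinary Python comparison rather than an ADO.NET expression engine. The names expand_expression and is_system_object are hypothetical.

```python
# Expression text with {0} as the placeholder for the current value.
# String literals are single-quoted here (an assumption of this sketch).
EXPR_TEMPLATE = "IIF('{0}'='SYSTEM',true,false)"

def expand_expression(object_type: str) -> str:
    """Stage 1: substitute the current value into the expression text."""
    return EXPR_TEMPLATE.format(object_type)

def is_system_object(object_type: str) -> bool:
    """Stage 2 (imitated): the Boolean that the expanded expression describes."""
    return object_type == "SYSTEM"

expanded = expand_expression("USER")
```

For the value "USER", the expanded text compares 'USER' with 'SYSTEM', so the mapped property is false.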

1:g Data Mapping

The 1:g case is slightly more complicated. A data provider has a single data-source-specific identifier part or property that maps to multiple generic identifier parts or properties. To succeed, the data provider applies a mapping by specifying multiple generic properties with the same underlyingMember value, as shown in the following code:

<Type name="Column" preferredOrdering="Database, Schema, Table, Id">
...
    <Properties>
...
        <Property name="DataType" type="System.String" />
    </Properties>
</Type>

<MappedType name="TableColumn" underlyingType="Column">
...
    <Properties>
        <Property name="DataType" underlyingMember="DataType">
            <Conversion>
                <Match regex="^[^(]+" />
            </Conversion>
        </Property>
        <Property name="Length" underlyingMember="DataType">
            <Conversion>
                <Match regex="(?&lt;=^CHAR\(|^VARCHAR\()\d+">
                    <Defaults>
                        <Default value="0" />
                    </Defaults>
                </Match>
                <ChangeType type="System.Int32" />
            </Conversion>
        </Property>
        <Property name="Precision" underlyingMember="DataType">
            <Conversion>
                <Match regex="(?&lt;=^NUMERIC\()\d+">
                    <Defaults>
                        <Default value="0" />
                    </Defaults>
                </Match>
                <ChangeType type="System.Int32" />
            </Conversion>
        </Property>
    </Properties>
</MappedType>

In this example, a data-source-specific property, DataType, contains information for three generic properties. It begins with the data type name, followed in parentheses by the length for character types or the precision for numeric types.

To retrieve a value for the generic mapped property DataType, the provider must extract the first part of the data-source-specific value using a simple regular expression that matches everything up to the opening parenthesis.

To retrieve a value for the generic mapped property Length, the provider must extract the first number in parentheses, but only if the type is a character data type. Otherwise, the value is zero. The specified regular expression does this in combination with the indicated default value. A second step, ChangeType, converts the extracted string to the integer type expected by the generic property concept.

To retrieve a value for the generic mapped property Precision, the provider must extract the first number in parentheses, but only if the type is a numeric data type. Otherwise, the value is zero. The specified regular expression does this in combination with the indicated default value. A second step, ChangeType, converts the extracted string to the integer type expected by the generic property concept.
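
The three extractions can be tried on sample values with a short Python sketch. It is an approximation under stated assumptions: Python's re module does not accept lookbehind alternatives of different widths, so the sketch uses capture groups instead of the lookbehind in the mapping, and the function name map_column and the sample type strings are hypothetical.

```python
import re

def map_column(data_type: str) -> dict:
    """Derive the generic DataType, Length, and Precision values from one
    data-source-specific DataType string such as "VARCHAR(50)"."""
    # Everything before the opening parenthesis is the type name.
    name = re.match(r"[^(]+", data_type)
    # First number in parentheses, only for CHAR and VARCHAR.
    length = re.match(r"(?:CHAR|VARCHAR)\((\d+)", data_type)
    # First number in parentheses, only for NUMERIC.
    precision = re.match(r"NUMERIC\((\d+)", data_type)
    return {
        "DataType": name.group(0) if name else "",
        # The fallback 0 plays the role of the Default element, and int()
        # plays the role of the ChangeType step.
        "Length": int(length.group(1)) if length else 0,
        "Precision": int(precision.group(1)) if precision else 0,
    }

varchar = map_column("VARCHAR(50)")
numeric = map_column("NUMERIC(10,2)")
```

All three generic values are computed from the same input string, which is what sharing one underlyingMember value expresses in the mapping.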

b:1 Data Mapping

This case requires a little more work than the 1:g mapping. In this scenario, a data provider has multiple data-source-specific identifier parts or properties that map to a single generic identifier part or property concept. Data providers apply this mapping by specifying a comma-delimited value for the underlyingMember attribute of the Property element containing all the data-source-specific properties that map to the generic property, as illustrated in the following code example:

<Type name="Index" preferredOrdering="Database, Schema, Table, Name">
...
    <Properties>
        <Property name="IsUniqueKey" type="System.Boolean" />
        <Property name="IsPrimaryKey" type="System.Boolean" />
...
    </Properties>
</Type>

<MappedType name="TableIndex" underlyingType="Index">
...
    <Properties>
        <Property name="DataType" underlyingMember="DataType">
            <Conversion>
                <Match regex="^[^(]+" />
            </Conversion>
        </Property>
        <Property name="IndexKeyType" underlyingMember="IsUniqueKey,IsPrimaryKey">
           <Conversion>
               <Calculate expr="IIF({0}=true,
                                IIF({1}=true,'PRIMARY','UNIQUE'),
                                'NONE')"
                          type="System.String" />
           </Conversion>
        </Property>
        <Property name="Precision" underlyingMember="DataType">
            <Conversion>
                <Match regex="(?&lt;=^NUMERIC\()\d+">
                    <Defaults>
                        <Default value="0" />
                    </Defaults>
                </Match>
                <ChangeType type="System.Int32" />
            </Conversion>
        </Property>
    </Properties>
</MappedType>

In this example, data-source-specific properties IsUniqueKey and IsPrimaryKey are true or false based on whether an index is a key or not, and if so, what type. The generic property IndexKeyType is a string that is equal to NONE, UNIQUE, or PRIMARY.

Both of the data-source-specific properties are necessary to provide the complete information for the generic property. Therefore, the provider must merge the contents of multiple properties into one and a conversion must occur.
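
The merge can be written out as a small decision function. This Python sketch mirrors the nested IIF expression in the mapping; the function name index_key_type is hypothetical, and like the expression it checks IsUniqueKey first, so it assumes that a primary key is always reported as unique as well.

```python
def index_key_type(is_unique_key: bool, is_primary_key: bool) -> str:
    """Combine two data-source-specific flags into one generic value.

    {0} in the expression corresponds to is_unique_key and {1} to
    is_primary_key, following the order of the underlyingMember list.
    """
    if not is_unique_key:
        # Outer IIF: not a key at all.
        return "NONE"
    # Inner IIF: a unique key is either the primary key or a plain unique key.
    return "PRIMARY" if is_primary_key else "UNIQUE"

kinds = [index_key_type(u, p) for u, p in [(False, False), (True, False), (True, True)]]
```

Note that an index flagged as primary but not unique would be reported as NONE by this logic, which is why the assumption about the source data matters.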

b:g Data Mapping

This case presents a combination of the 1:g and b:1 mapping. A data provider has multiple data-source-specific identifier parts or properties that map to multiple generic identifier parts or properties. Data providers apply this mapping by specifying multiple data source properties that map to multiple generic properties, as illustrated in the following example:

<Type name="Column" preferredOrdering="Database, Schema, Table, Id">
...
    <Properties>
        <Property name="DataType" type="System.String" />
        <Property name="DataTypeNumInfo" type="System.String" />
...
    </Properties>
</Type>

<MappedType name="TableColumn" underlyingType="Column">
...
    <Properties>
        <Property name="DataTypeName" underlyingMember="DataType">
            <Conversion>
                <Match regex="^[^(]+" />
            </Conversion>
        </Property>
        <Property name="DataTypeInfo" underlyingMember="DataType,DataTypeNumInfo">
            <Conversion>
                <Format string="{0}#{1}" />
                <Replace regex="^[^(]*\((\d+)\)#(\d+),(\d+)$" pattern="$1,$2,$3" />
            </Conversion>
        </Property>
...
    </Properties>
</MappedType>

In this example, the data-source-specific property DataType provides the data type name and length, and DataTypeNumInfo provides the precision and scale. The generic property DataTypeName contains only the name, and DataTypeInfo contains length, precision, and scale information.

To retrieve a value for the generic property DataTypeName, the provider must extract the first part of the data-source-specific property DataType using a simple regular expression that matches everything up to the opening parenthesis.

To retrieve a value for the generic property DataTypeInfo, the data provider must extract the second part of the data-source-specific property DataType and the two parts of the data-source-specific property DataTypeNumInfo, and then paste these together in the correct format. This is done by first concatenating both properties into a single string, and then using a regular expression to match the length, precision, and scale information. Finally, the data provider performs a replace operation to create a new single string with this information in the right format.
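
The concatenate-then-replace sequence can be checked on sample data with a Python sketch. Several things here are assumptions: the sample values "NUMERIC(10)" and "5,2", the function name data_type_info, and the anchored form of the pattern. In a regular expression replace, text outside the match is left in place, so the pattern used in this sketch also consumes the type name in front of the parenthesis.

```python
import re

def data_type_info(data_type: str, num_info: str) -> str:
    """Build "length,precision,scale" from two data-source-specific values."""
    # Format step: join both properties with "#" as a separator.
    combined = "{0}#{1}".format(data_type, num_info)
    # Replace step: capture length, precision, and scale, and rebuild the
    # string from the three groups. The leading [^(]* swallows the type name.
    return re.sub(r"^[^(]*\((\d+)\)#(\d+),(\d+)$", r"\1,\2,\3", combined)

info = data_type_info("NUMERIC(10)", "5,2")

# For comparison: without the anchors and the leading [^(]*, the type name
# is not part of the match and therefore survives the replacement.
unanchored = re.sub(r"\((\d+)\)#(\d+),(\d+)", r"\1,\2,\3", "NUMERIC(10)#5,2")
```

If the combined string does not match the pattern at all, the replace leaves it unchanged, so a provider would need to be sure that both properties always have the expected shape.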

See Also

Concepts

Mapping Object Type Identifiers and Properties to Generic Types

Type Mapping