Table Mapping in ADO.NET

 

Dino Esposito
Wintellect

March 15, 2002

Table mapping is the process that controls how data adapters copy tables and columns of data from a physical data source to ADO.NET in-memory objects. A data adapter object utilizes the Fill method to populate a DataSet or a DataTable object with data retrieved by a SELECT command. Internally, the Fill method makes use of a data reader to get to the data and the metadata that describe the structure and content of the source tables. The data read is then copied into ad hoc memory containers (that is, the DataTable). The table mapping mechanism is the set of rules and parameters that lets you control how the SQL result sets are mapped onto in-memory objects.

The following code shows the typical way to collect data out of a data source using a data adapter:

SqlDataAdapter da;
DataSet ds;
da = new SqlDataAdapter(m_selectCommand, m_connectionString);
ds = new DataSet();
da.Fill(ds); 

Admittedly, this code isn't exactly rocket science and I can venture a guess that you are already familiar with it, and can successfully run it more than once. But what really happens behind the scenes of this code? Believe it or not, there's a little known object running behind the curtain whose nature and behavior heavily affect the final results.

When you run the code shown above, a new DataTable object is added to the (initially empty) DataSet for each result set that the execution of the SELECT statement may have generated. If you pass a non-empty DataSet to the Fill method, the contents of the result sets and the existing DataTable objects are merged as long as a match is found with the name of the DataTable. Similarly, when it comes to copying rows of data from the result set to a given DataTable, the contents of matching columns are merged. By contrast, if no match is found on the column name, then a new DataColumn object is created (with default settings) and added to the in-memory DataTable.

The question is, how does the adapter map the contents of the result sets into the DataSet constituent items? What tells the data adapter which tables and columns names to match? The TableMappings property of data adapters is the object behind the curtain that decides how tables in the result set map to the objects in the DataSet.

The Mapping Mechanism

The mapping mechanism begins to work once the SELECT command is terminated and the data adapter has returned one or more result sets. The adapter gets a reference to an internal data reader object and starts processing the fetched data. By default, the data reader is positioned on the first result set. The following pseudocode describes what's going on:

int Fill(DataSet ds)
{
   // Execute the SELECT command and gets a reader
   IDataReader dr = SelectCommand.ExecuteReader();
   
   // Map the first result set to the DataSet and return the table
   bool bMoreToRead, bMoreResults;
   DataTable dt = MapCurrentResultSet(ds);
   
   // Copy rows from the result set to the specified DataTable
   while(true)
   {
   // Move to the next data row
   bMoreToRead = dr.Read();
   if (!bMoreToRead)
   {
      // No more rows in this result set. More result sets?
      bMoreResults = dr.NextResult()   
      if (!bMoreResults)
         break;
      else
         // Map this new result set and continue the loop
         dt = MapCurrentResultSet(ds);
   }
   else
   AddRowToDataTable(dt)
   }
}

The Fill method maps the first result set to a DataTable object in the given DataSet. Next, it loops through the result set and adds rows of data to the DataTable. When the end of the result set is reached, the method looks for a new result set and repeats the operation.

Mapping a result set to a DataSet is a process that comprises two phases:

  • Table mapping
  • Column mapping

During the table mapping step, the data adapter has to find a name for the DataTable that will contain the rows in the result set being processed.

Each result set is given a default name that you might want to change at will. The default name of the result set depends on the signature of the Fill method that has been used for the call. For example, let's consider the two overloads below:

Fill(ds);
Fill(ds, "MyTable");

In the former case, the name of the first result set defaults to Table. Further result sets are named Table1, Table2, and so on. In the latter case, the first result set is called MyTable and the others are named after it—MyTable1, MyTable2, and so forth.

The adapter looks up its TableMappings collection for an entry that matches the default name of the result set. If a match is found, the adapter attempts to locate a DataTable object with the name specified in the mapping in the DataSet. If no such DataTable object exists, it is created and then filled. If such a DataTable exists in the DataSet, its contents are merged with the contents of the result set.

Figure 1. Mapping a result set onto a DataSet object

In Figure 1, I assume that the query produces at least three result sets. The TableMappings collection contains three default names and the corresponding mapping names. If the SELECT command creates a result set with a default name of Table, then its contents go into a new or existing DataTable object called Employees. How do you control this from a code standpoint? Look at the code snippet below:

SqlDataAdapter da = new SqlDataAdapter(...);
DataSet ds = new DataSet();
DataTableMapping dtm1, dtm2, dtm3;
dtm1 = da.TableMappings.Add("Table", "Employees"); 
dtm2 = da.TableMappings.Add("Table1", "Products");
dtm3 = da.TableMappings.Add("Table2", "Orders");
da.Fill(ds);

Of course, the default names you map onto your own names must coincide with the default names originated by the call to the Fill method. In other words, if you change the last line to da.Fill(ds, "MyTable");, the code won't work any longer because the default names are now MyTable, MyTable1, and MyTable2 for which the above TableMappings collection has no entries.

You can have any number of table mappings that are not necessarily related to the expected number of result sets. For example, you can map only Table1, being the second result set returned by the command. In this case, the destination DataSet will hold three tables named Table, Products, and Table2.

The DataTableMapping object describes a mapped relationship between a result set and a DataTable object in a DataSet. The SourceTable property returns the default result set name, whereas DataSetTable contains the mapping name.

If you use Visual Studio® .NET, you can configure the table mappings in a visual manner by running the Data Adapter Configuration Wizard.

Column Mapping

If table mapping ended here, then it wouldn't be such a big deal. In fact, if your goal is to give a mnemonic name to your DataSet tables, you can use the following code:

SqlDataAdapter da = new SqlDataAdapter(...);
DataSet ds = new DataSet();
da.Fill(ds);
ds.Tables["Table"].TableName = "Employees";
ds.Tables["Table1"].TableName = "Products";

The final effect is exactly the same. The mapping mechanism, though, has another, rather interesting facet—column mapping. The figure below extends the previous diagram and includes details of the column mapping.

Figure 2. Table and column mappings

The DataTableMapping object has a property called ColumnMappings that turns out to be a collection of DataColumnMapping objects. A column mapping represents a mapping between the name of a column in the result set and the name of the corresponding column in the DataTable object. Basically, the ultimate goal of DataColumnMapping object is that it enables you to use column names in a DataTable that are different from those in the data source.

SqlDataAdapter da = new SqlDataAdapter(...);
DataSet ds = new DataSet();
DataTableMapping dtm1;
dtm1 = da.TableMappings.Add("Table", "Employees"); 
dtm1.ColumnMappings.Add("employeeid", "ID");
dtm1.ColumnMappings.Add("firstname", "Name");
dtm1.ColumnMappings.Add("lastname", "Surname");
da.Fill(ds);

In the code above, I assume that the fetched result set has columns called employeeid, firstname, and lastname. These columns have to be copied into an in-memory DataTable child of a DataSet. By default, the target DataColumn will have the same name as the source column. The column mapping mechanism, though, allows you to change the name of the in-memory column. For example, when the column employeeid is copied to memory, it is renamed to ID and placed in a DataTable called Employees.

The name of the column is the only argument you can change at this level. Keep in mind that this entire mapping takes place automatically within the body of the Fill method. When Fill terminates and each column in the source result set has been transformed into a DataColumn object, you can intervene and apply further changes—relationships, constraints, primary key, read-only, auto-increment seed and step, support for null values, and more.

In summary, the Fill method accomplishes two main operations. First off, it maps the source result sets onto in-memory tables. Secondly, it fills the tables with the data fetched out of the physical data source. While accomplishing any of these tasks, the Fill method could raise some special exceptions. Conceptually, an exception is an anomalous situation that needs to be specifically addressed from a code standpoint. When the adapter can't find a table or a column mapping, and when a required DataTable or DataColumn can't be found in the target DataSet, the adapter throws a type of lightweight exception.

Unlike real exceptions that must necessarily be resolved in code, this special breed of adapter exceptions have to be resolved declaratively by choosing an action from a small set of feasible options. Adapters raise two types of lightweight exceptions:

  • Missing mapping
  • Missing schema

Missing Mapping Action

A missing mapping action is required in two circumstances when the adapter is collecting data to fill the DataSet. You need a missing mapping action if a default name is not found in the TableMappings, or if a column name is not available in the table's ColumnMappings collection. You must customize the behavior of the adapter's MissingMappingAction property in order to handle such an exception. Feasible values for the property come from the MissingMappingAction enum type listed in the table below.

Value Description
Error A SystemException is generated whenever a missing column or a table is detected.
Ignore The unmapped column or table is ignored.
Passthrough Default option; add the missing table or column with the default name.

Table 1. The MissingMappingAction enumeration

Unless you explicitly set the MissingMappingAction property prior to filling the adapter, it assumes a default value of Passthrough. As a result, the table or the column is added to the DataSet using the default name. For example, if no table mapping has been specified for the result set called Table, then the target DataTable takes the same name. In fact, the following statements end up adding a new DataTable to the DataSet called Table and MyTable respectively.

da.Fill(ds);
da.Fill(ds, "MyTable");

If you set the MissingMappingAction property to Ignore, then any unmapped table or column is simply ignored. No error is detected, but there will be no content for the incriminating result set (or one of its columns) in the target DataSet.

If the MissingMappingAction property is set to Error, then the adapter is limited to throw a SystemException exception whenever a missing mapping is detected.

Once the adapter is done with the mapping phase, it starts populating the target DataSet with the contents of the selected result sets. Any required DataTable or DataColumn object that is not available in the target DataSet triggers another lightweight exception and requires another declarative action:missing schema action.

Missing Schema Action

A missing schema action is required if the DataSet does not contain a table with the name that has been determined during the table mapping step. Similarly, the same action is required if the DataSet table does not contain a column with the expected mapping name. MissingSchemaAction is the property that you set to indicate the action you want to be taken in case of an insufficient table schema. Feasible values for the property come from the MissingSchemaAction enum type, listed in the table below.

Value Description
Error A SystemException is generated whenever a missing column or a table is detected.
Ignore The unmapped column or table is ignored.
Add Default option; complete the schema by adding any missing column or table.
AddWithKey Adds primary key and constraints.

Table 2. The MissingSchemaAction enumeration

By default, the MissingSchemaAction property is set to Add. As a result, the DataSet is completed by adding any constituent item that is missing—DataTable or DataColumn. Bear in mind, though, that the schema information added in this way is very limited. It only includes name and type. If you want extra information—like primary key, auto-increment, read-only and null settings—use the AddWithKey option instead. Notice that even if you use the AddWithKey option, not all available information about the column is loaded into the DataColumn. For example, AddWithKey marks a column as auto-increment, but does not set the related seed and step properties. Also, the default value for the source column, if any, is not automatically copied. The primary key is imported, but not any extra indexes you may have set.

The other two options, Ignore and Error, work exactly as they did with the MissingMappingAction property.

Impact on the Code

Although I repeatedly talked about the actions in terms of (lightweight) exceptions, the actions you declare to execute in case of missing objects are not as expensive as true exceptions. On the other hand, this doesn't mean that your code is completely unaffected by such actions. More specifically, filling a DataSet that already contains all the needed schema information is a form of code optimization. This is especially true as long as your code is structured in such a way that you repeatedly fill an empty DataSet with a fixed schema. In this case, using a global DataSet object preloaded with schema information helps to prevent all those requests for recovery actions.

How can you fill a DataSet with the schema information that belongs to a group of result sets? Guess what, the data adapter objects have a tailor-made method—FillSchema.

DataTable[] FillSchema(DataSet ds, SchemaType mappingMode);

FillSchema takes a DataSet and adds as many tables to it as needed by the SELECT command associated with the adapter. The method returns the various DataTable objects (only schema, no data) created in an array. The mapping mode parameter can be one of the values defined in the SchemaType enum.

Value Description
Mapped Apply any existing table mappings to the incoming schema. Configure the DataSet with the transformed schema. Preferable option.
Source Ignore any table mappings on the DataAdapter. Configure the DataSet using the incoming schema without applying any transformations.

Table 3. The SchemaType enumeration

The options available are quite self-explanatory. Mapped describes what happens when mappings are defined. Source, instead, deliberately ignores any mappings you may have set. The tables in the DataSet retain their default name and all the columns maintain the original name they were given in the source tables.

Managing User Profiles

To round out this discussion about table mappings, let's review a realistic scenario in which you might want to consider their use. Suppose that you have to manage different user profiles. Each profile requires you to access the same tables, but return a different set of columns. You can tackle this issue in a number of ways, but the ADO.NET table mapping mechanism may be the best.

The idea is that you use always a single query—the one targeted to the most privileged profile—and then map to the resulting DataSet with only the columns specific of the current user profile. Here's some Visual Basic® code that illustrates the point:

Dim da As SqlDataAdapter
da = New SqlDataAdapter(m_selectCommand, m_connectionString)

Dim dtm As DataTableMapping
dtm = da.TableMappings.Add(da.DefaultSourceTableName, "Employees")

If bUserProfileAdmin Then       
   dtm.ColumnMappings.Add("EmployeeID", "ID")
   dtm.ColumnMappings.Add("LastName", "Last Name")
   dtm.ColumnMappings.Add("FirstName", "Name")
   dtm.ColumnMappings.Add("Title", "Position")
   dtm.ColumnMappings.Add("HireDate", "Hired")
Else
   dtm.ColumnMappings.Add("LastName", "Last Name")
   dtm.ColumnMappings.Add("FirstName", "Name")
End If

Dim ds As DataSet = New DataSet()
da.MissingMappingAction = MissingMappingAction.Ignore
da.MissingSchemaAction = MissingSchemaAction.Add
da.Fill(ds)
 

In this simple case, the query returns only one result set that I decided to identify through its default name of Table. Notice that for the sake of generality you should use the DefaultSourceTableName property of the data adapter object, rather than literal name (Table). The table mapping defines different column mappings according to the role of the user. If the user is an administrator, the DataSet includes more columns. Of course, the actual implementation of concepts like roles and privileges is completely up to you. The key statement for all this to work as expected is the value of the MissingMappingAction property that has been set to Ignore. The result is that unmapped columns are just ignored. Finally, remember that case sensitivity is important for column names, and that the name of the column mapping must match the case of the source column name.

Summary

In this article, I reviewed the table mapping mechanism available in ADO.NET. Table mapping is the set of rules and behaviors that govern the passage of rows from the data source to an in-memory DataSet. The mapping consists of two steps—table and column mapping—and is only the first phase of a broader operation that involves filling a DataSet operated by a data adapter object. The second phase begins when the target DataSet is actually populated. Any logical exception in the mapping and filling phases can be controlled by declaring which actions to take when a table or a column is not explicitly bound to a DataSet table or when a needed table or a column is not present in the DataSet.

Dialog Box: Getting @ the Difference

What's the difference between @Register and @Import, and what's the right place for a non-system assembly DLL used by ASP.NET applications?

First and foremost, ASP.NET applications are .NET applications. As such, they need to link to any assemblies whose objects they plan to use. The @Register directive serves just this purpose. Any assembly you register with the page is then passed as a reference to the compiler of choice. The role of the @Import directive is less important as its function is to simplify the coding. @Import lets you import a namespace, not an assembly. An assembly can contain more namespaces. For example, the assembly system.data.dll contains System.Data, System.Data.OleDb, System.Data.SqlClient, and more.

Importing a namespace lets you write simpler code in the sense that you don't need to specify the full path to a given object. Importing System.Data allows you to use a data set through the class DataSet, instead of System.Data.DataSet. To use a DataSet, you can do without the @Import directive, but not without the reference to system.data.dll.

In particular, with ASP.NET applications you don't need to explicitly register any assemblies available in the Global Assembly Cache (GAC). You use @Register only to reference custom assemblies that have been registered with the system GAC.

Where do these assemblies reside? They must be placed in the BIN directory under the application's virtual directory. If this directory does not exist, you should create it. If your ASP.NET application does not use a virtual directory, then it implicitly runs from the Web server's root directory. Therefore, the BIN directory is below the Web server's root. For example, c:\inetpub\wwwroot\bin.

Dino Esposito is Wintellect's ADO.NET expert and a trainer and consultant based in Rome, Italy. Dino is a contributing editor to MSDN Magazine and writes the Cutting Edge column. He also regularly contributes to Developer Network Journal and MSDN News. In addition, he is the author of Building Web Solutions with ASP.NET and ADO.NET from Microsoft Press, and the cofounder of https://www.vb2themax.com/. You can reach Dino at dinoe@wintellect.com.