Saving Data in Datasets

Article
04/28/2015

Saving data is the process of persisting changed data in an application back to the original data store, typically a relational database such as SQL Server.

Because a dataset is effectively a cache — an in-memory copy — of data, the process of writing information to the original data source is separate from the process of modifying the data in the dataset. You can send updated data in datasets back to the database by calling one of the Update methods of a TableAdapter, or by calling one of the TableAdapter's DBDirect methods.

For more information on sending the changes in a dataset back to the database, see How to: Update Data by Using a TableAdapter and How to: Save Dataset Changes to a Database.

Visual Studio provides a TableAdapterManager component that assists in performing saves when you save data in related tables. This component ensures that saves are performed in the proper order based on the foreign-key constraints defined in the database. For more information, see Hierarchical Update Overview.

For information on modifying data in the dataset, see Editing Data in Your Application.

Two-Stage Updates

Updating a data source by means of a dataset is a two-step process. The first step is to update the dataset with new information — new records, changed records, or deleted records. If your application is concerned only with the dataset — for example, after updating the dataset, you send it to another application that will perform further processing of the dataset — then you are finished with the update.

Note

In Windows Forms, the data-binding architecture takes care of sending changes from data-bound controls to the dataset, and you do not have to explicitly update the dataset with your own code. For more information, see Windows Forms Data Binding.

If you are updating a data source (such as a database), the second step is to send the changes from the dataset to the original data source. That is, the process of updating the dataset does not also write the changes through to an underlying data source; you must explicitly perform this second step. You typically accomplish this by calling the Update method of the same TableAdapter (or data adapter) that you used to populate the dataset, although you can also use different adapters, for example, to move data from one data source to another or to update multiple data sources.

Two-stage update process and the role of the DataRowVersion in a successful update

Visual Basic Dataset Updates

Structurally, a dataset makes data available as sets of collections. Datasets contain collections of tables. Tables contain collections of rows. Tables are exposed as a collection of the DataSet object and records are available in the Rows collection of DataTable objects. Making changes to data in a dataset by simply manipulating these collections using base collection methods is possible, but if you intend to update an underlying data source you must use the methods specifically designed for dataset modification.

For example, to remove a record from a data table, you could call the RemoveAt Method of the table's Rows collection, which physically deletes the record from the dataset. If you are using the dataset only as a structured store for data and are not concerned about transmitting change information to another application, manipulating collections this way is an acceptable way of updating a dataset.

However, if you intend to send changes to a data source or another application, you need to maintain change information (that is, metadata) about each update. Later, when you send changes to the data source or application, the process will have the information it needs to locate and update the proper records. For example, if you delete a record in the dataset, information about the deleted record has to be maintained in the dataset. That way, when the DeleteCommand of the TableAdapter is invoked, there is enough historic information to locate the original record in the data source so it can be deleted. For more information, see "Maintaining Information About Changes" below.

Merging Datasets

You can update the contents of a dataset by merging — that is, by copying the contents of one dataset (referred to as the source dataset) into the calling dataset (referred to as the target dataset). When you merge datasets, new records in the source dataset are added to the target dataset. Additionally, extra columns in the source dataset are added to the target dataset. Merging datasets is particularly useful when you have a local dataset and get a second dataset from another application or from a component such as an XML Web service. It is also useful when you need to integrate data from multiple datasets.

When merging datasets, you can also pass an optional Boolean argument (preserveChanges) that tells the Merge method whether to retain existing modifications in the target dataset. Because datasets maintain multiple versions of records, it is important to keep in mind that more than one version of the records is being merged. The following table illustrates a record in two datasets that will be merged:

DataRowVersion	Target dataset	Source dataset
Original	James Wilson	James C. Wilson
Current	Jim Wilson	James C. Wilson

Calling the Merge method on the above table with preserveChanges=false targetDataset.Merge(sourceDataset) results in the following:

DataRowVersion	Target dataset	Source dataset
Original	James C. Wilson	James C. Wilson
Current	James C. Wilson	James C. Wilson

Calling the Merge method with preserveChanges = true targetDataset.Merge(sourceDataset, true) results in the following:

DataRowVersion	Target dataset	Source dataset
Original	James C. Wilson	James C. Wilson
Current	Jim Wilson	James C. Wilson

Warning

In the preserveChanges = true scenario, if the RejectChanges method is called on a record in the target dataset, then it will revert to the original data from the source dataset. This means that if you try to update the original data source with the target dataset, it may not be able to find the original row to update. However, you can prevent a concurrency violation by filling another dataset with the updated records from the data source and then performing a merge prevents a concurrency violation. (A concurrency violation occurs when another user modifies a record in the data source after the dataset has been filled.)

Update Constraints

To make changes to an existing data row, you add or update data in the individual columns. If the dataset contains constraints (such as foreign keys or non-nullable constraints), it is possible that as you update a record — after you have finished updating one column but before you get to the next one — the record can temporarily be in an error state.

To prevent premature constraint violations you can temporarily suspend update constraints. This serves two purposes:

It prevents an error from being thrown when you update one column before you get to another column.
It suspends certain update events from being raised (events that are often used for validation).

After you have completed an update, you can re-enable constraint checking, which also re-enables update events and raises them.

Note

In Windows Forms, the data-binding architecture built into the datagrid suspends constraint checking until focus moves out of a row, and you do not have to explicitly call the BeginEdit, EndEdit or CancelEdit methods.

Constraints are automatically disabled when the Merge method is invoked on a dataset. When the merge is complete, if there are any constraints on the dataset that cannot be enabled, then a ConstraintException is thrown. In this situation, the EnforceConstraints property is set to false and all constraint violations must be resolved before resetting the EnforceConstraints property to true.

After you have completed an update, you can re-enable constraint checking, which also re-enables update events and raises them.

For more information about suspending events, see How to: Turn Off Constraints While Filling a Dataset.

Dataset Update Errors

When you update a record in a dataset, there is the possibility of an error. For example, you might inadvertently write data to a column that is of the wrong data type, or too long, or that has some other integrity problem. In addition, you may have application-specific validation checks that can raise custom errors during any stage of an update event. For more information, see Validating Data in Datasets.

Maintaining Information About Changes

Information about the changes in a dataset is maintained in two ways: by flagging the row that indicates whether it has been changed (RowState), and by keeping multiple copies of a record (DataRowVersion). By using this information, processes can determine what has changed in the dataset and can send appropriate updates to the data source.

RowState Property

The RowState property of a DataRow object is a value that provides information about the status of a particular row of data.

The following table details the possible values of the DataRowState enumeration:

DataRowState Value	Description
Added	The row has been added as an item to a DataRowCollection. (A row in this state does not have a corresponding original version since it did not exist at the time the last AcceptChanges method was called).
Deleted	The row was deleted using the Delete of a DataRow object.
Detached	The row has been created but is not part of any DataRowCollection. A DataRow object is in this state immediately after it has been created and before it is added to a collection, or if it has been removed from a collection.
Modified	A column value in the row has been changed in some way.
Unchanged	The row has not changed since AcceptChanges was last called.

DataRowVersion Enumeration

Datasets maintain multiple versions of records. The DataRowVersion enumeration of a DataRow object is a value that can be used to return a specific version of a DataRow object.

The following table details the possible values of the DataRowVersion enumeration:

DataRowVersion Value	Description
Current	The current version of a record contains all modifications performed on the record since the last time AcceptChanges has been called. If the row has been deleted there is no current version.
Default	The default value of a record, as defined by the dataset schema or data source.
Original	The original version of a record is a copy of the record as it was the last time changes were committed in the dataset. In practical terms, this is typically the version of a record as read from a data source.
Proposed	The proposed version of a record that is available temporarily, while you are in the middle of an update — that is, between the time you called the BeginEdit method and the EndEdit method. You typically access the proposed version of a record in a handler for an event such as RowChanging. Invoking the CancelEdit method reverses the changes and deletes the proposed version of the data row.

The original and current versions are useful when update information is transmitted to a data source. Typically, when an update is sent to the data source, the new information for the database is in the current version of a record. Information from the original version is used to locate the record to update. For example, in a case where the primary key of a record is changed, you must have a way to locate the proper record in the data source, in order to update the changes. If no original version existed, then the record would most likely be appended to the data source, resulting not only in an extra unwanted record, but in one record that is inaccurate and out of date. The two versions are also used in concurrency control; you can compare the original version against a record in the data source to determine if the record has changed since it was loaded into the dataset.

The proposed version is useful when you need to perform validation before actually committing the changes to the dataset.

Even if records have changed, there are not always original or current versions of that row. When you insert a new row into the table, there is no original version, only a current version. Similarly, if you delete a row by calling the table's Delete method, there is an original version, but no current version.

You can test to see if a specific version of a record exists by querying a data row's HasVersion method. You can access either version of a record by passing a DataRowVersion enumeration value as an optional argument when you request the value of a column.

Getting Changed Records

It is common that you do not update every record in a dataset. For example, a user might be working with a Windows Forms DataGridView control that displays many records. However, the user might update only a few records, delete one, and insert a new one. Datasets and data tables provide a method (GetChanges) for returning only the rows that have been modified.

You can create subsets of changed records using the GetChanges method of either the data table (GetChanges) or of the dataset (GetChanges) itself. If you call the method for the data table, it returns a copy of the table with only the changed records. Similarly, if you call the method on the dataset, you get a new dataset with only changed records in it. GetChanges by itself will return all changed records. In contrast, by passing the desired DataRowState as a parameter to the GetChanges method, you can specify what subset of changed records you want: newly added records, records marked for deletion, detached records, or modified records.

Getting a subset of changed records is particularly useful when you want to send records to another component for processing. Instead of sending the entire dataset, you can reduce the overhead of communicating with the other component by getting only the records that the component needs. For more information, see How to: Retrieve Changed Rows.

Committing Changes in the Dataset

As changes are made in the dataset, the RowState property of changed rows is set. The original and current versions of records are established and maintained and made available to you by the RowVersion property. The metadata stored in these properties representing the changes is needed to send the proper updates to the data source.

If the changes reflect the current state of the data source, you no longer need to maintain this information. Typically, there are two times when the dataset and its source are in synch:

Immediately after you have loaded information into the dataset, such as when you read data from the source.
After sending changes from the dataset to the data source, (but not before, because you would lose the change information required to send changes to the database).

You can commit the pending changes to the dataset by calling the AcceptChanges method. Typically, AcceptChanges would be called at the following times in your application.

After you have loaded the dataset. If you load a dataset by calling a TableAdapter's Fill method, then the adapter automatically commits changes for you. However, if you load a dataset by merging another dataset into it, then you have to commit the changes manually.

Note

You can prevent the adapter from automatically committing the changes when calling the Fill method by setting the AcceptChangesDuringFill property of the adapter to false. If it is set to false, then the RowState of each row inserted during the fill is set to Added.
After you have sent dataset changes to another process, such as an XML Web service.

Warning

Committing the change this way erases any change information. Do not commit changes until after you have performed any operations in which your application depends on knowing what changes have been made in the dataset.

This method accomplishes the following:

Writes the Current version of a record into its Original version, overwriting the original version.
Removes any row whose RowState property is set to Deleted.
Sets the RowState property of a record to Unchanged.

The AcceptChanges method is available at three levels. You can call it on a DataRow object, which commits changes for just that row. You can also call it on a DataTable object to commit all rows in a table, or on the DataSet object to commit all pending changes in all records of all tables of the dataset.

The following table describes which changes are committed based on what object the method is called on.

Method	Result
DataRow.AcceptChanges	Changes are committed only on the specific row
DataTable.AcceptChanges	Changes are committed on all rows in the specific table
DataSet.AcceptChanges	Changes are committed on all rows in all tables of the dataset

Note

If you load a dataset by calling a TableAdapter's Fill method, you do not have to explicitly accept changes; by default the Fill method calls the AcceptChanges method when it has finished populating the data table.

A related method, RejectChanges, undoes the effect of changes by copying the Original version back into the Current version of records and setting the RowState of each record back to Unchanged.

Data Validation

In order to verify that the data in your application meets the requirements of the processes that it is passed to, you often have to add validation. This might involve checking that a user's entry in a form is correct, validating data sent to your application by another application, or even checking that information calculated within your component falls within the constraints of your data source and application requirements.

You can validate data in several ways:

In the business layer, by adding code to your application to validate data. The dataset is one place you can do this. The dataset provides some of the advantages of back-end validation — such as the ability to validate changes as column and row values are changing. For more information, see Validating Data in Datasets.
In the presentation layer, by adding validation to forms. For more information, see User Input Validation in Windows Forms.

In the data back end, by sending data to the data source — for example, the database — and allowing it to accept or reject the data. If you are working with a database that has sophisticated facilities for validating data and providing error information, this can be a practical approach because you can validate the data no matter where it comes from. However, it might not accommodate application-specific validation requirements. Additionally, having the data source validate data can result in numerous round trips to the data source, depending on how your application facilitates resolving validation errors raised by the back end.

Security Note
When using data commands with a CommandType property set to Text, carefully check information that is sent from a client before passing it to your database. Malicious users might try to send (inject) modified or additional SQL statements in an effort to gain unauthorized access or damage the database. Before you transfer user input to a database, you should always verify that the information is valid; it is a best practice to always use parameterized queries or stored procedures when possible. For more information, see Script Exploits Overview.

When using data commands with a CommandType property set to Text, carefully check information that is sent from a client before passing it to your database. Malicious users might try to send (inject) modified or additional SQL statements in an effort to gain unauthorized access or damage the database. Before you transfer user input to a database, you should always verify that the information is valid; it is a best practice to always use parameterized queries or stored procedures when possible. For more information, see Script Exploits Overview.

After changes have been made in a dataset, you can transmit the changes to a data source. Most commonly, you do this by calling the Update method of a TableAdapter (or data adapter). The method loops through each record in a data table, determines what type of update is required (update, insert, or delete), if any, and then executes the appropriate command.

How an Update Is Transmitted to the Data Source

As an illustration of how updates are made, suppose your application uses a dataset containing a single data table. The application fetches two rows from the database. After the retrieval, the in-memory data table looks like this:

(RowState)     CustomerID   Name             Status
(Unchanged)    c200         Robert Lyon      Good
(Unchanged)    c400         Nancy Buchanan    Pending

Your application changes Nancy Buchanan's status to "Preferred." As a result of this change, the value of the RowState property for that row changes from Unchanged to Modified. The value of the RowState property for the first row remains Unchanged. The data table now looks like this:

(RowState)     CustomerID   Name             Status
(Unchanged)    c200         Robert Lyon      Good
(Modified)     c400         Nancy Buchanan    Preferred

Your application now calls the Update method to transmit the dataset to the database. The method inspects each row in turn. For the first row, the method transmits no SQL statement to the database, because that row has not changed since it was originally fetched from the database.

For the second row, however, the Update method automatically invokes the proper data command and transmits it to the database. The specific syntax of the SQL statement depends on the dialect of SQL supported by the underlying data store. But the following general traits of the transmitted SQL statement are noteworthy:

The transmitted SQL statement is an UPDATE statement. The adapter knows to use an UPDATE statement because the value of the RowState property is Modified.
The transmitted SQL statement includes a WHERE clause indicating that the target of the UPDATE statement is the row whose CustomerID = 'c400'. This part of the SELECT statement distinguishes the target row from all others because the CustomerID is the primary key of the target table. The information for the WHERE clause is derived from the original version of the record (DataRowVersion.Original), in case values required to identify the row have been changed.
The transmitted SQL statement includes the SET clause, to set the new values of the modified columns.

Note

If the TableAdapter's UpdateCommand property has been set to the name of a stored procedure, the adapter does not construct an SQL statement. Instead, it invokes the stored procedure with the appropriate parameters passed in.

Passing Parameters

Values for records to be updated in the database are usually passed using parameters. When the TableAdapter's Update method executes an UPDATE statement, it needs to fill in the parameter values. It gets these values from the Parameters collection for the appropriate data command — in this case, the UpdateCommand object in the TableAdapter.

If you have used the Visual Studio tools to generate a data adapter, the UpdateCommand object will contain a collection of parameters that correspond to each parameter placeholder in the statement.

The SqlParameter.SourceColumn property of each parameter points to a column in the data table. For example, the SourceColumn property for the au_id and Original_au_id parameters is set to whatever column in the data table contains the author id. When the adapter's Update method runs, it reads the author id column from the record being updated and fills the values into the statement.

In an UPDATE statement, you need to specify both the new values (those that will be written to the record) as well as the old values (so that the record to be updated can be located in the database). There are therefore two parameters for each value: one for the SET clause and a different one for the WHERE clause. Both parameters read data from the record being updated, but they get different versions of the column value based on the parameter's SqlParameter.SourceVersion Property. The parameter for the SET clause gets the current version, and the parameter for the WHERE clause gets the original version.

Note

You can also set values in the Parameters collection yourself in code, which you would typically do in an event handler for the data adapter's RowChanging event.

If your dataset contains multiple tables, you have to update them individually by calling the Update method of each data adapter separately. If the tables have a parent-child relationship, it is likely that you will have to send updates to the database in a particular order. A common scenario is that you have added both parent and related child records to a dataset — for example, a new customer record and one or more related order records. If the database itself is enforcing relational integrity rules, then it will raise errors if you send the new child records to the database before the parent record has been created.

Conversely, if you delete related records in the dataset, then you generally have to send updates in the reverse order: child table first, then parent table. Otherwise, the database is likely to raise an error because referential integrity rules will prevent you from deleting a parent record while related child records still exist.

A general rule for sending updates for related tables is to follow this order:

Child table: delete records.
Parent table: insert, update, and delete records.
Child table: insert and update records.
For more information, see Walkthrough: Saving Data to a Database (Multiple Tables).

Concurrency Control

Because datasets are disconnected from the data source, you do not hold locks on records in the data source. Therefore, if you want to update the database, and if it is important for your application to maintain concurrency control, you must reconcile records in the dataset with those in the database. For example, you might find that records in the database have changed since last filling the dataset. In that case, you must execute application-appropriate logic to specify what should happen with the database record or the changed record you have in your dataset.

Saving Data in Datasets

Two-Stage Updates

Merging Datasets

Update Constraints

Dataset Update Errors

Maintaining Information About Changes

RowState Property

DataRowVersion Enumeration

Getting Changed Records

Committing Changes in the Dataset

Data Validation

How an Update Is Transmitted to the Data Source

Passing Parameters

Concurrency Control

See Also

Tasks

Concepts

Other Resources

Additional resources

Saving Data in Datasets

Two-Stage Updates

Merging Datasets

Update Constraints

Dataset Update Errors

Maintaining Information About Changes

RowState Property

DataRowVersion Enumeration

Getting Changed Records

Committing Changes in the Dataset

Data Validation

How an Update Is Transmitted to the Data Source

Passing Parameters

Updating Related Tables

Concurrency Control

See Also

Tasks

Concepts

Other Resources

Additional resources