Tackle Data Concurrency Exceptions Using the DataSet Object
This article assumes you're familiar with ADO.NET and Visual Basic .NET
Level of Difficulty
ADO.NET provides many techniques for improving the performance of data-intensive applications and for making them easier to build. The DataSet, the hallmark of the ADO.NET object model, serves as a miniature, disconnected facsimile of a data source. While using the DataSet improves performance by reducing expensive trips to the database server, it also introduces the possibility of multiple users attempting to change the same data simultaneously, thereby generating data concurrency exceptions. This article examines the common causes behind data concurrency exceptions and presents techniques for overcoming them.
There are many benefits to upgrading your data access layer to ADO.NET, most of which involve using the intrinsic DataSet object. The DataSet object is basically a disconnected, in-memory replica of a database. DataSet objects contain one or more DataTables, each of which usually corresponds to a table in the underlying database. DataSets provide many benefits, but also present a few challenges. Specifically, you can run into problems related to data concurrency exceptions. I've created a simple Windows® Forms customer service application that illustrates the potential pitfalls of this particular problem. In this article, I'll walk you through the app and show you ways to overcome the data concurrency issues that arose.
The customer service application example shown here is built using Visual Basic® .NET and SQL Server™ 2000; however, since the Microsoft® .NET Framework is language-agnostic, any .NET Framework-compatible language could be used. Likewise, since the DataSet object abstracts the data source, the actual implementation of the data source is not important; data concurrency exceptions will be generated identically whether the underlying data source is SQL Server, local XML files, or derived from data retrieved through a service.
DataSet Pros and Cons
DataSets provide a number of benefits. For example, you gain the ability to enforce rules of integrity in memory rather than at the database level. DataSet objects can define and enforce relationships between tables and constraints upon columns, ensuring that business rules are applied without a round-trip to the database. With database abstraction, you can write a single set of code to access a DataSet object regardless of the origin of the data that fills the DataSet. The underlying data source may be SQL Server, Oracle, or even XML files. Regardless of the underlying data source, your code interacts with the DataSet in the same way. This allows you to change the underlying data source without changing your code.
The most important benefit of using DataSets, however, is improved performance. Since the DataSet is disconnected from the underlying database, your code will make fewer calls to the database, significantly boosting performance. You can add many new rows to multiple DataTables in a DataSet, checking each row for validity and referential integrity as you create it. A DataAdapter object, which connects the DataSet to the underlying database, can then update the underlying data with a single command. All of the new rows in each table are added with that command, assuring that all of the rows being added to the database are valid.
As with most performance optimizations, this one comes with a price. Since the DataSet object is disconnected from the underlying database, there is always a chance that the data is out of date. Since a DataSet doesn't hold live data, but rather a snapshot of live data at the time the DataSet was filled, problems related to data concurrency can occur. Data concurrency issues arise when multiple users have access to the same data and any single user can update the data without the other users' knowledge. It then becomes possible for a user to accidentally update data without knowing that the data they are changing is not the same data they see in their application. Fortunately, the DataSet object comes with built-in support for catching data concurrency issues as they arise so that your application can react accordingly.
The Sample Application
The customer service application I've created is used by a fictional company to create orders on behalf of their customers, as well as update the customers' personal information. There are many customer sales representatives (CSRs) who use the application from their own desktop. These CSRs take orders over the phone, gathering personal and payment information from the customer. The customer record is saved in the database to speed the processing of orders placed by returning customers. The CSR then creates an order and adds individual products to it, specifying the quantity and current price. Once all the information is gathered, the CSR presses the Place Order button, which inserts the customer and order records into the actual database.
CSRs also use the application to fulfill requests sent to the company through electronic or snail mail. These requests are divided evenly among the CSRs and forwarded to them each morning. CSRs fulfill the requests between phone calls. The system is designed to improve the speed at which requests are fulfilled; however, it comes at the cost of having all customers shared among CSRs. Each request made by a single customer, whether by phone or mail, will likely be processed by a different CSR, increasing the chance of data concurrency issues.
In order to improve performance, the application keeps a DataSet object filled with customers and orders in memory. Since many employees use the application simultaneously, there are always many disconnected snapshots of the live data, one on each employee's workstation. All customer maintenance, order entry, and order maintenance use this DataSet object, named dsAllData. The code used to build dsAllData is shown in Figure 1 and is part of a global module, so it's available to all forms in the application.
Figure 1 Filling the DataSet Object
Const connString = "server=localhost;database=northwind;uid=sa;pwd="
Public connCustSvc As SqlClient.SqlConnection
Public daCustomer As SqlClient.SqlDataAdapter
Public cbCustomer As SqlClient.SqlCommandBuilder
Public daOrders As SqlClient.SqlDataAdapter
Public cbOrders As SqlClient.SqlCommandBuilder
Public daOrderDetail As SqlClient.SqlDataAdapter
Public cbOrderDetail As SqlClient.SqlCommandBuilder
Public dsAllData As DataSet
Public Sub Main()
connCustSvc = New SqlClient.SqlConnection(connString)
daCustomer = New SqlClient.SqlDataAdapter("SELECT * FROM
cbCustomer = New SqlClient.SqlCommandBuilder(daCustomer)
daOrders = New SqlClient.SqlDataAdapter("SELECT * FROM Orders",
cbOrders = New SqlClient.SqlCommandBuilder(daOrders)
daOrderDetail = New SqlClient.SqlDataAdapter("SELECT * FROM
cbOrderDetail = New SqlClient.SqlCommandBuilder(daOrderDetail)
dsAllData = New DataSet()
daCustomer.MissingSchemaAction = MissingSchemaAction.AddWithKey
daOrders.MissingSchemaAction = MissingSchemaAction.AddWithKey
dsAllData.Tables("Orders").Columns("Total").DefaultValue = 0
The code that builds dsAllData creates an empty DataSet object, three DataAdapters, and three CommandBuilders. Each DataAdapter does a simple "SELECT *" from the appropriate table, while the CommandBuilder fills in the rest of the information necessary for the DataSet to have full insert, update, and delete capability. The Main routine uses the DataAdapter objects to fill dsAllData with data from all three tables and then begins the application with the Customer Maintenance form.
Figure 2 shows the Customer Maintenance screen, which has a DataGrid control bound to the Customers DataTable in dsAllData. This simple grid allows CSRs to edit any of the basic properties of a customer. Since the grid is bound to the Customers DataTable, any changes made to the values in the grid will automatically be stored in the DataTable. dsAllData will store these values until the CSR explicitly tells the form to update the underlying data source by clicking the Save Changes button.
Figure 2 Customer Maintenance Grid Allows Editing
For order entry, several new rows are created and added to dsAllData using the code in Figure 3. First an Order record is created, and then records are created in the OrderDetail DataTable for each item in the order. When all of the necessary rows have been added to dsAllData, a single call to the Update method of the appropriate DataAdapter will update the underlying data source with the new rows.
Figure 3 New Order with Detail Records
Private Sub CreateOrder()
Dim dr As DataRow
dr = dsAllData.Tables("Orders").NewRow
.Item("DateOrdered") = Now
.Item("CustomerID") = 1
.Item("ShipToAddress") = "123 Main"
.Item("ShipToCity") = "Kansas City"
.Item("ShipToState") = "MO"
.Item("ShipToZip") = "12345"
AddOrderDetail(dr.Item("ID"), 1, 1, 9.99)
AddOrderDetail(dr.Item("ID"), 2, 2, 4.99)
Private Sub AddOrderDetail(ByVal OrderID As Integer, _
ByVal ProductID As Integer, ByVal Quantity As Integer, _
ByVal Price As Single)
Dim dr As DataRow
dr = dsAllData.Tables("OrderDetail").NewRow
.Item("OrderID") = OrderID
.Item("ProductID") = ProductID
.Item("Quantity") = Quantity
.Item("Price") = Price
Since there are many CSRs using the application and updating customer information simultaneously, it is likely that at any given time a CSR is looking at out-of-date information. Planning for this, the architect of the application decided that the data cached in dsAllData should be refreshed every 30 minutes in order to ensure that CSRs are usually looking at correct information. The application is carefully architected to ensure that the data refresh is only performed during idle time so that it doesn't affect performance. DataSet refreshes are therefore slightly more than 30 minutes apart, depending on the activity of the CSR.
Figure 4 Application Data Model
The data model for this application is very simple (see Figure 4). It is stored in SQL Server 2000 and consists of just three tables, one for customers, one for orders, and one for the details of each order. The appropriate primary keys and relationships have been defined to ensure referential integrity. Additionally, a trigger has been defined on the OrderDetail table to update the Total column of the Orders table. Each time an OrderDetail record is inserted, updated, or deleted, the trigger will fire, calculate the total sale value of the order, and update the appropriate row of the Orders table. The code for the trg_UpdateOrderTotal is shown in Figure 5.
Figure 5 Updating the Total Column
CREATE TRIGGER trg_UpdateOrderTotal ON [dbo].[OrderDetail]
FOR INSERT, UPDATE, DELETE
DECLARE @OrderID int
SELECT @OrderID=OrderID FROM Inserted
IF @OrderID IS NULL
SELECT @OrderID=OrderID FROM Deleted
The First Data Concurrency Exception
The customer service application has been running in production with no problems for a couple of months when suddenly the application goes down with an unhandled DBConcurrencyException. In this section, I'll break down the circumstances that led to this new exception.
Joe, the first CSR, opens the application. This forces the initial load of data into the dsAllData DataSet and begins the cycle of refreshing the data every 30 minutes or so. Joe has a stack of paperwork in his inbox containing change requests that customers have faxed, mailed, or e-mailed. He begins working through the change requests, but is frequently interrupted by phone call orders.
Meanwhile, Sally, the second CSR, arrives at the office and opens her own instance of the application. Sally's instance of the app does an initial load of data from SQL Server, including any updates Joe has made so far. Sally receives a phone call from a customer requesting a phone number change. The customer had e-mailed a change of address request previously, but didn't know their new phone number at that time and wants to update their record now. Sally uses the Customer Maintenance screen to update the customer phone number. When Sally changes the phone number in the DataGrid, the new number is stored in dsAllData. When Sally confirms the rest of the customer's information, she realizes that the original address change request has not yet been processed so she updates the information and clicks the Save Changes button, which sends the new data to the SQL Server database.
Joe is in the next cubicle working diligently on his e-mail requests when he comes across the original change request for the same customer. When he pulls up his customer maintenance screen, the application works as designed and pulls the information from the cached DataSet object. Joe's application has not automatically synchronized itself with the database since Sally updated the customer's address, so his Customer Maintenance screen still shows the old address. Joe corrects the information displayed in the DataGrid with the new information provided in the e-mail and clicks the Save Changes button. When he does, however, he receives an error message that reads "Concurrency violation: the UpdateCommand affected 0 records," and the application crashes. When Joe restarts the application, he sees that the address has been updated and believes his change must have been completed before the application crashed. The following is the line of code in question:
Private Sub butSave_Click (ByVal sender As System.Object, _
ByVal e As System.EventArgs)
The actual exception that is created is of type DBConcurrencyException and is the result of specific functionality built into the DataAdapter object (see Figure 6). The DataAdapter is designed to pull data into disconnected objects like the DataSet, so it automatically double-checks data for changes before performing an update. If the underlying data has been changed, the DataAdapter will then raise the DBConcurrencyException instead of performing the requested update.
Figure 6 A DBConcurrency Exception
The implementation of this integrity check is fairly straightforward and leverages the ability of the DataTable object to keep multiple sets of data. When data is first loaded into a DataTable, all DataRows in the DataTable have their DataRowVersion property set to Original. When you modify a column value in a DataRow, a copy of the DataRow is created and marked as Current. The DataRow marked Original still exists in its unchanged form. All subsequent changes to the data will only affect the Current DataRow. When you execute the DataAdapter's Update method for a DataTable (or multiple DataTables in a DataSet), it iterates through all Current rows to determine the Update statements that should be applied to the underlying data source. In addition to the DataRowVersion property, DataRows have a RowState property which identifies the state of data in the row. Possible values include Unchanged, Added, Modified, Deleted, and Detached.
Having now determined which rows of the underlying data require updating, the dsCustomer DataAdapter builds the SQL statements needed to update the SQL Server database. Recall from Figure 1
that I used CommandBuilder objects with my DataSet to build the necessary INSERT, UPDATE, and DELETE statements. The UPDATE statement created by the CommandBuilder object uses the values stored in the copy of the DataRow with a DataRowVersion value of Original to identify and update the appropriate row in the database. That is, instead of simply identifying the correct row using the primary key values, the CommandBuilder creates a SQL statement that looks for a row which exactly matches all of the original values stored in the DataSet. The following code is an example of the UPDATE statement built to update the customer's phone number:
UPDATE Customer SET Phone = @p1
WHERE ((ID = @p2) AND ((FirstName IS NULL AND @p3 IS NULL)
OR (FirstName = @p4))
AND ((LastName IS NULL AND @p5 IS NULL) OR (LastName = @p6))
AND ((Address IS NULL AND @p7 IS NULL) OR (Address = @p8))
AND ((Address2 IS NULL AND @p9 IS NULL) OR (Address2 = @p10))
AND ((City IS NULL AND @p11 IS NULL) OR (City = @p12))
AND ((State IS NULL AND @p13 IS NULL) OR (State = @p14))
AND ((Zip IS NULL AND @p15 IS NULL) OR (Zip = @p16))
AND ((Phone IS NULL AND @p17 IS NULL) OR (Phone = @p18)))
The UPDATE statement uses parameters rather than actual values, but you can see how the value of each and every column in the row is checked.
Having now identified the exact row and original values in the underlying database, the DataAdapter can safely update the row. However, if any of the column values in the database row have changed since the DataTable was filled, the UPDATE statement will fail because there will be no row in the database that matches the criteria in the WHERE clause. It is easy for the DataAdapter to determine whether the UPDATE was successful by simply checking the actual number of rows updated in the database. If no rows are updated, then the underlying data must have been altered or deleted and the data concurrency exception is raised. This explains the somewhat cryptic error message that Joe receives when he tries to update the customer's phone number; the exact error detected by the DataAdapter is not that the underlying data changed, but that no records were updated, indicating that the underlying data must have changed.
There are a couple of ways to deal with the DBConcurrencyException. The first is to ensure that it never occurs. I can eliminate the SqlCommandBuilder objects used in Figure 1 and replace them with custom-built SqlCommand objects for the UpdateCommand properties of my DataAdapter objects. I would create the SQL used in the CommandText property with a WHERE clause that only filters based on the primary key instead of all available columns. This would eliminate all concurrency issues, assuming that the primary key is immutable.
However, this technique introduces several problems. First, there is considerably more code involved since I would also have to create SqlCommand objects for the InsertCommand and DeleteCommand properties of each of my DataAdapters. Additionally, these hard-coded SQL statements would introduce new errors should the underlying database schema be altered. Using the SqlCommandBuilder objects, the application determines the database schema at run time, accepting any changes which may have been introduced, and creates the SQL statements accordingly. This does not solve the concurrency problem, but rather avoids the issue altogether, enabling users to unknowingly overwrite someone else's changes.
A better solution is to catch the exception, as shown in Figure 7. The Try...Catch block catches the DBConcurrencyException and displays a message box identifying the exception to the user and offering them an option, as shown in Figure 8. At this point, I have identified that a concurrency error has occurred and I have two choices: I can retrieve the new underlying data and display it to the user, forcing them to make their changes yet again, or I can simply overwrite the new underlying data with the changes that have been specified by the user. These are the options that are displayed in the message box shown in Figure 8.
Figure 7 Catching the Concurrency Exception
Catch dbcEx As Data.DBConcurrencyException
Dim dResult As DialogResult
dResult = MessageBox.Show(messageString, _
"Data Concurrency Exception Occurred", _
MessageBoxButtons.YesNoCancel, MessageBoxIcon.Error, _
If dResult = DialogResult.Yes Then
'Two options: Fill the whole table, or just refresh the row
ElseIf dResult = DialogResult.No Then
'Save a copy of the new row
Dim drCopy As DataRow, drCurrent As DataRow
drCopy = dsAllData.Tables("Customer").NewRow()
Dim dc As DataColumn
drCurrent = _
For Each dc In drCurrent.Table.Columns
If dc.ReadOnly = False Then _
drCopy.Item(dc.ColumnName) = drCurrent.Item(dc.ColumnName)
'Get the current values from the database
'Now restore the values entered by the user and save again
For Each dc In drCurrent.Table.Columns
If dc.ReadOnly = False Then _
drCurrent.Item(dc.ColumnName) = drCopy.Item(dc.ColumnName)
Figure 8 Handling the Data Concurrency Exception
If the user decides to view the new underlying changes and discard their current changes, you simply need to refresh the data stored in the Customer DataTable. Since the DataGrid is bound to this DataTable, the grid will automatically display the new data. To refresh the data, you have two options. The first is to simply fill the entire DataTable again using the Fill method of the daCustomer DataAdapter. While this technique will work, it is overkill since you are only interested in refreshing the data in a single DataRow. I have created a routine named UpdateRow which only pulls data for the single DataRow in question (see Figure 9). One caveat of using the UpdateRow routine is that I have defined in the parameter collection that the DataRow to be found will have a single, integer key column. If the table has a key of a different data type or has a composite key, you will need to overload UpdateRow to handle the specific key requirements. After the DataRow and/or DataTable have been refreshed with the current data, the DataGrid displays a small Error icon next to the row that caused the concurrency exception (see Figure 10).
Figure 9 UpdateRow Routine Updates Cached DataRow
Private Sub UpdateRow(ByVal TableName As String, ByVal ID As Integer)
'Get a reference to the specified row
Dim dr As DataRow = dsAllData.Tables(TableName).Rows.Find(ID)
'Create a Command update to pull the new underlying data
Dim cmd As New SqlClient.SqlCommand("SELECT * FROM " & TableName & _
" WHERE ID=" & ID.ToString(), connCustSvc)
'Open the connection and create the DataReader
Dim rdr As SqlClient.SqlDataReader = cmd.ExecuteReader()
'Copy the new data from the database to the DataRow
Dim dc As DataColumn
For Each dc In dr.Table.Columns
If dc.ReadOnly = False Then _
dr.Item(dc.ColumnName) = rdr.Item(dc.ColumnName)
'Accept changes in the DataRow
Figure 10 DataGrid Displays Error Icon
The other option is for the user to disregard the changes made to the underlying database and force their changes to take effect. There are a couple of ways this can be accomplished. The first would be to execute a direct SQL statement against the SQL Server database to synchronize it with the data in the DataTable. While this method performs well, it can be problematic in that it would cause you to rewrite the SQL statement with every database change. The SQL you write for this technique may also need to be hard-coded to the specific database version being used, losing the abstraction afforded to you by the DataAdapter and DataSet objects.
A better option uses the UpdateRow routine I wrote previously and allows the DataAdapter to handle the update. The code in Figure 9 first creates a copy of the DataRow with the new changes and then calls the UpdateRow routine to get the new data from the database. It is necessary to call the UpdateRow procedure so that you do not receive another concurrency exception when you try to perform the update again. The code then iterates through all the columns on the DataRow, replacing the newly retrieved values with those supplied by the user. Finally, the DataAdapter is used to update the underlying database with the user's changes.
There are a couple of potential problems with both of these code solutions. First, by default, the DataAdapter's Update method will fail on the first concurrency exception and will not process additional DataRows. This could lead to partial updates of data or several concurrency exceptions, each handled individually by the user. Another potential exception could arise in the code shown in Figure 7, in the branch that forces the user's changes into the underlying database. There is a slim chance that another user may change the underlying database between the call to the UpdateRow routine and executing the Update method on daCustomer. This would generate an unhandled DBConcurrencyException. One possible workaround is to place the Update method call in a Try...Catch block, but your second Catch code would be similar to the first and could potentially generate its own concurrency exception, requiring more Try...Catch blocks, ad infinitum. A more elegant solution would break this code out into a separate routine which could be called recursively in the case of multiple concurrency exceptions.
Using SQL Server Triggers
The developers of the customer service application for my fictional company have now fixed the code to handle the DBConcurrencyException that Joe received when updating a customer, and everything is working fine again—that is, until Sally tries to enter an order later that afternoon.
Sally receives a phone call from a customer wanting to place an order. The customer is already in the database, so Sally takes the basic information for the order, including the shipping address. She then pulls up the OrderDetails screen and adds two items to the order. Each OrderDetail record includes an OrderID, a ProductID, a Quantity, and a Price. When the order is correct, Sally clicks the Place Order button to insert the order into the database. I have simplified the code by adding hard-coded values for the OrderDetail records in the code shown in Figure 3.
The order is successfully placed and inserted into the database, but Sally receives a notification that one of the items is currently out of stock and the order cannot be shipped for two weeks. The customer realizes that he will be out of town when the package is available for delivery and asks if he can change the shipping address for the order. Sally has changed information about an order before by using the Order Maintenance screen, so she says that she'll be happy to help. She pulls up the appropriate screen and changes the shipping address for the customer. However, when she clicks the Save Changes button, she sees an error dialog similar to the one that Joe saw earlier, and the app crashes.
The key to understanding why this DBConcurrencyException is raised lies in the underlying database. Recall from Figure 5 that I placed a trigger on the OrderDetail table that is fired for all INSERTS, UPDATES, and DELETES. The trg_UpdateOrderTotal updates the total cost of the order by calculating the sum of the Quantity times the Price for each OrderDetail row tied to the Order. When Sally created the order with two items, the application first inserted a new row into Orders and then inserted two new rows into OrderDetail. After each of the OrderDetail records is created, the trigger is fired, which updates the Total column of the Order record. However, this value is generated only in the database and is not automatically transferred to the application. In fact, the code in Figure 3 does not specify a Total because a default value was already specified in Figure 1. The default value of 0 is then passed into SQL Server with the new Orders record, which is correct at the time the order is created. The database then updates the total column, but the DataTable in the application still has 0 as the order total. When Sally tries to update the Order record, the DataAdapter recognizes that the Total column value has changed and raises the DBConcurrencyException.
The trick here is that in the past when Sally had updated Order records, by chance, she didn't update a new order that had been created since the last automatic data refresh had taken place. Within 30 minutes of creating each order, the dsAllData DataSet object is refreshed, which will update the Orders DataTable with the correct totals. Any order updates made after the automatic data refresh (or an application restart) will work as planned.
The solution to this issue is similar to the previous problem: refresh the data seen by the user. I can, however, be a bit more proactive here. Instead of waiting for a concurrency exception to occur, I can automatically refresh each new Orders DataRow after OrderDetail records are created. I can modify the code in Figure 3 to include a call to my UpdateRow routine for the specified row in Orders just after calling the Update method of daOrderDetail. SQL Server will normally complete this trigger almost instantaneously, but you can't depend on this, so a prudent developer might add a time delay so the trigger has enough time to complete.
The disconnected nature of the new ADO.NET DataSet object offers significant performance benefits at the cost of possibly introducing new types of data concurrency errors into your application. Understanding how and why these errors occur allow you to catch and handle such errors elegantly, but also adds an additional level of support to your application. This gives your users more options for maintaining the integrity of their underlying database. Handling concurrency issues is sure to become an increasingly larger challenge as application development moves farther into the world of Internet and intranet applications, with hundreds or thousands of simultaneous users. ADO.NET provides you with all the necessary tools to overcome this challenge.
For background information see:
.NET e-Business Architecture
by David Burgett, Matthew Baute, John Pickett, and Eric Brown (SAMS, 2001)Data Points: Establishing Relationships Between Rowsets with ADO.NET
is a Senior Technical Architect with G. A. Sullivan in Kansas City, and the co-owner/developer of ReprintOrders.com. He has contributed to three books on Visual Studio, most recently .NET eBusiness Architecture
(Sams, 2001). David frequently speaks at technical events. You can reach him at firstname.lastname@example.org