
C++ Q&A
Transactional Programming

Jason Masterman

 

The declarative programming model provided by COM+ and Microsoft® Transaction Server (MTS) makes it easier for programmers to build transactional applications. Since these technologies were introduced, however, many developers have taken the plunge before really understanding how to build an efficient transactional system properly. As a result, many have become discouraged after building applications that run sluggishly under MTS. It's not because MTS and COM+ are poorly designed. Rather, many people do not seem to want to design for the transaction. Instead, they would rather stick to what they're comfortable with and continue designing objects around the problem domain instead of around the transaction.
      In this column I will take a look at several key elements of designing and optimizing transactions. I will explain how to design your objects for the transaction, when to use Distributed Transaction Coordinator (DTC) transactions, and when to avoid the cost of them. Then I will look at how to mix your existing transactional stored procedures with MTS and COM+ components and clarify what happens when a transactional stored procedure is called from a transactional MTS and COM+ component. Next I'll explore how MTS and COM+ set the isolation level of a transactional component and how you can change it from within a stored procedure. Finally, I will look at how to handle ActiveX® Data Objects (ADO) errors properly since ADO dramatically differs from ODBC in this area. I'm going to assume you are already familiar with the basics of COM, MTS, Visual Basic®, and Microsoft SQL Server.

Designing Objects for the Problem Domain

      Take a look at Figure 1. The flow of execution in this diagram operates as described in the following steps:

  1. The client calls the MTS and COM+ component to place an order.
  2. The Root object is marked as transactional and calls secondary business layer objects Customer, Product, and Order, which are also transactional.
  3. The Root object calls the Customer object to verify whether the customer exists and to see if they have enough funds in their account to purchase the desired products. Once verified, the customer's account balance is debited.
  4. The Root object calls the Product object to decrement inventory for each product ordered. A validation check is first performed to see if there is enough inventory on hand.
  5. The Root object calls the Order object to create the order record in the orders table.
  6. All secondary objects call through the DataAccess object layer, which in turn uses ADO to communicate with the SQL database.

      How does this look in terms of basic design to you? At quick glance you should see that I have used the middle tier to perform all the necessary business logic of the application. The business logic does not live inside the database in a stored procedure. So I am free to change the database from SQL Server to Oracle or something else if I want to, thus avoiding a lot of problems migrating stored procedures from one database to another.
      This three-tier structure should yield some performance and maintainability benefits. But, have I lost anything in the process? Does this design really score well in the area of performance? How many round-trips to the database does this application make to process a transaction? Moreover, do I really need to employ the DTC in a transaction like this?
      This design is very common in applications today, and you can argue that it has both good and bad points. To understand when this design is good and when it is bad, you have to look at it from a few different perspectives. I'm going to look at this design in terms of object and design purity, maintainability, and performance.
      So how many round-trips does this design require from the middle tier to the database? If each business layer object needs to do one thing with the database, it has to make at least one round-trip for each business object to perform its work. This means at least three round-trips to the database. If you consider round-trips evil in a distributed design, then this would be considered a bad design. Consolidating those three calls into one would be better.
      When you create components that run as configured components in MTS or COM+, you can set the transaction property of your component to signal the runtime whether you want transactional support. In Figure 1, the components are all marked as Transaction Required. This means that when the component creates an object, MTS and COM+ are responsible for creating the transaction and either committing or aborting the transaction based on how the objects vote via calls to SetAbort, SetComplete, EnableCommit, or DisableCommit. To learn more about the semantics of controlling transactions, see "Writing MTS-style Transactions with Visual Basic and SQL Server 7.0," by Ted Pattison, in the October 1999 issue of Microsoft Systems Journal.
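      To make the voting concrete, here is a minimal sketch of what a method on the transactional Root object might look like. The PlaceOrder method, the OrderLib ProgIDs, and the methods on the secondary objects are hypothetical names used only for illustration; GetObjectContext, CreateInstance, SetComplete, and SetAbort are the actual MTS and COM+ calls.

' Hypothetical method on the Root object (component marked Transaction Required).
' The OrderLib ProgIDs and the methods on the secondary objects are illustrative.
Public Function PlaceOrder(ByVal CustomerID As String, _
                           ByVal ProductID As String, _
                           ByVal Quantity As Long) As Boolean
    Dim ctx As ObjectContext
    Dim oCustomer As Object
    Dim oProduct As Object
    Dim oOrder As Object

    On Error GoTo ErrHandler
    Set ctx = GetObjectContext()

    ' CreateInstance makes the secondary objects run inside the same transaction
    Set oCustomer = ctx.CreateInstance("OrderLib.Customer")
    Set oProduct = ctx.CreateInstance("OrderLib.Product")
    Set oOrder = ctx.CreateInstance("OrderLib.Order")

    oCustomer.VerifyAndDebit CustomerID, ProductID, Quantity
    oProduct.DecrementInventory ProductID, Quantity
    oOrder.CreateOrder CustomerID, ProductID, Quantity

    ctx.SetComplete              ' vote to commit
    PlaceOrder = True
    Exit Function

ErrHandler:
    If Not ctx Is Nothing Then ctx.SetAbort   ' vote to abort the transaction
    PlaceOrder = False
End Function
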
      When you create and run a declarative transaction, the MTS and COM+ runtime asks the DTC to begin a transaction. The DTC is a sophisticated service that allows for distributed transactions, as its name indicates. This means the DTC can control transactions that span multiple resources, such as one machine running the MTS and COM+ component, a SQL Server database on a separate machine, and an Oracle database on a third machine. These types of distributed transactions use a two-phase commit protocol. However, using a DTC-style transaction is probably overkill most of the time.
      When you create a transactional MTS and COM+ component that calls out to SQL Server, the connection that is established will run by default at an isolation level of Serializable. This is different than when you connect to SQL Server without an MTS and COM+ transaction, where the isolation level is Read Committed. The difference between Read Committed and Serializable has to do with how SQL Server holds locks on the data. A transaction running with an isolation level of Serializable acquires more locks than one running at Read Committed. It also holds locks longer.
      For example, say you begin a transaction that is running at Read Committed. If you first read some records via a simple Select statement, and then insert a record and update a record, SQL Server will not hold any locks on the records you asked for in the Select statement past the point where the physical read has been completed. In other words, read locks are held for the actual read, but are released before the transaction has ended. If you were running at Serializable, on the other hand, any records you touch either with a Select statement or an update would be locked for the life of the transaction. This is a much stricter form of locking and has the potential for lowering the overall concurrency of the system.
      MTS and COM+ components run transactions at an isolation level of Serializable because it eliminates all problems caused by separate users running transactions concurrently. It's as if there were only one person using the system at a time. So if I have an object that implements multiple interfaces, one for read operations and another for write operations, can I have the read interface not be transactional while the write interface is transactional?
      With MTS and COM+ you can only set the transactional property on the component (at object level), not on the interface. This means if you want part of an object to be transactional and the other part to not be transactional, you have to break the functionality into two separate objects. (There is another way to solve this problem, which I will discuss later.)
      The transaction in this example is really rather simple: I am only working with a single resource, SQL Server. Asking for a DTC-style transaction is an expensive operation if you don't really need it; SQL Server has its own transactional support that will work well in this scenario. So, should you ask for the DTC transaction? The DTC is smart enough to optimize away the two-phase commit protocol if only a single resource is involved, but you still have the cost of creating the DTC-style transaction and reporting back to the DTC when you finish your work.
      Compare Figure 1 with Figure 2, in which a single Order object forwards the call for placing an order to a stored procedure that lives in the database. The Order object is not marked as transactional. Instead, the stored procedure is transactional and controls all of the logic for committing or aborting the transaction. This design clearly reduces the number of round-trips to the database to a single call for the stored procedure. It also reduces the cost of involving the DTC by not marking the object with transactional support. If performance is the goal, this is a faster solution. Also, since you are not running a transactional MTS and COM+ component, your connection from the Order object to the database will not be set to run automatically with an isolation level of Serializable, which will reduce the number of locks held when running at the higher level.
      The problem with this solution is that all of the logic has moved into the database layer. If you decide to move to another database, you end up having to rewrite the stored procedures. Also, take a look back at Figure 1. If a customer has an order with 10 line items (representing 10 products), you would have to make a call to the database for each line item. In Figure 2, you can't pass an array of line items to the stored procedure and you don't want to make 10 round-trips if you don't have to. You could design around this by creating a big string that has several "execute sp_PlaceOrder" commands in it, then calling the database with a single ADODB.Connection.Execute method call. This would work, and it keeps the round-trips to a minimum, but you start to sacrifice something in the process: the maintainability of your code.
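      Here is a minimal sketch of that batching trick, assuming a stored procedure named sp_PlaceOrder that takes a customer ID, a product ID, and a quantity (those parameters, the collection layout, and the connection string are assumptions for illustration). The whole order still costs only one round-trip.

' Minimal sketch: batch several EXEC statements into a single round-trip.
' The sp_PlaceOrder parameters and the connection string are assumptions.
Public Sub PlaceOrderBatch(ByVal CustomerID As String, ByVal Items As Collection)
    Dim cn As ADODB.Connection
    Dim Item As Variant
    Dim sql As String

    For Each Item In Items
        ' Item(1) = product ID, Item(2) = quantity (illustrative layout)
        sql = sql & "exec sp_PlaceOrder '" & CustomerID & "', '" & _
              Item(1) & "', " & Item(2) & vbCrLf
    Next

    Set cn = New ADODB.Connection
    cn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
            "Initial Catalog=Orders;Integrated Security=SSPI"

    ' One Execute call runs every line item in the batch
    cn.Execute sql, , adExecuteNoRecords

    cn.Close
End Sub
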
      Sure, this solution optimizes performance. However, if you design only for performance, you could create a system that's going to be hard to understand and, therefore, harder to maintain. Also, what happens when you have to include some other resource manager, possibly an Oracle database, into your transactional work? You will have a difficult time including any other resource managers in the transaction without the help of a DTC.

Stored Procedures and Components

      Many current systems already have lots of stored procedures in them, and many of these stored procedures have been written with transactions. If your system uses transactional stored procedures and you want to invoke those stored procedures from transactional MTS and COM+ objects, what will happen? For example, say I have an MTS and COM+ component. It is marked as Transaction Required, and it calls a stored procedure that starts a transaction via BEGIN TRAN. The procedure then performs some inserts, updates, or deletes, and then calls ROLLBACK TRAN. Does the MTS and COM+ transaction get aborted? Is calling ROLLBACK TRAN in a stored procedure similar to calling SetAbort in a transactional component?
      The answer is yes, the transaction gets rolled back and all work being done in MTS and COM+ would also get rolled back. So, the ROLLBACK TRAN statement acts like a SetAbort call. However, calls within your stored procedure to BEGIN TRAN and COMMIT TRAN are simply ignored: a call to BEGIN TRAN does not actually start another transaction, and calls to COMMIT TRAN do not actually commit the transaction. The transaction cannot be committed until the MTS and COM+ root object gives the final vote via SetAbort or SetComplete; then the DTC goes about its work to handle the transaction's outcome.
      This is important to know because if you mix your existing transactional logic in stored procedures with transactional MTS and COM+ components, you need to know how they are going to interact. Besides, if you have several hundred or even thousands of stored procedures that are transactional, you may not want to rewrite them just so you can use MTS and COM+. You should test for yourself to make sure all of them are going to work as expected. If you are creating a system from scratch and you intend to use MTS and COM+, you are obviously free to leave the transactional logic in your components instead of putting it in both the stored procedure and your components.

Lowering the Isolation Level

      When I talked about calling from a transactional MTS and COM+ component to a SQL Server database via ADO, I mentioned that MTS and COM+ automatically move the isolation level to Serializable. This causes the strictest level of locking for all read and write operations when you may not need it. I showed you an object that implemented two interfaces, the read interface and the write interface, and suggested that this should actually be implemented as two different objects since MTS and COM+ only allow you to configure the transaction property on the object. So, if you want the read interface to run at an isolation level of Read Committed, you can't do it by just setting a property in your MTS and COM+ component.
      There is, however, another way you can adjust this setting. When you connect to SQL Server via ADO, MTS and COM+ adjust the connection's isolation level to Serializable. You could just as easily change this setting by calling a stored procedure or issuing a SQL statement that sets the transaction isolation level to Read Committed. For that matter, you could take the gloves off completely and set it as Read Uncommitted. You could also adjust the isolation level in a SQL statement by using locking hints in your statements. For example, you could issue a SQL select like this:

 

Select au_lname, au_fname from authors (READCOMMITTED)

This instructs SQL Server to perform the query as if the isolation level was set to Read Committed instead of the Serializable level you might be using.
      Lowering the isolation level through SQL allows you to keep an object that implements multiple interfaces and is marked Transaction Required in MTS and COM+, yet drop the isolation level when you don't need Serializable. This may be preferable to creating two different objects.
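      Here is a minimal sketch of that approach: a read-only method on a component marked Transaction Required that lowers the connection's isolation level before querying. The method name and connection string are assumptions; the SET TRANSACTION ISOLATION LEVEL statement and the ADO calls are standard.

' Minimal sketch: a read-only method on a Transaction Required component.
' The connection string and method name are assumptions for illustration.
Public Sub ReadAuthors()
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset

    Set cn = New ADODB.Connection
    cn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
            "Initial Catalog=pubs;Integrated Security=SSPI"

    ' MTS and COM+ enlisted this connection at Serializable;
    ' drop it to Read Committed for the read-only work in this method.
    cn.Execute "SET TRANSACTION ISOLATION LEVEL READ COMMITTED", , adExecuteNoRecords

    Set rs = cn.Execute("select au_lname, au_fname from authors")
    ' ... work with the recordset here ...

    rs.Close
    cn.Close
    GetObjectContext.SetComplete   ' the read-only work succeeded; vote to commit
End Sub
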
      An interesting question arises when you lower the isolation level of a connection to something other than Serializable. What happens to the isolation level when the connection is released by an object in one transaction and gets put into the connection pool for some other object to use? After some testing, I found that MTS and COM+ always bump the isolation level back up to Serializable whenever they enlist a connection in a new transaction.

Error Handling with ADO and ODBC

      Many of you probably started writing database applications long before ADO came on the scene. You used database APIs like Data Access Objects (DAO), Remote Data Objects (RDO), and ODBC. RDO is an object layer that sits on top of ODBC and allows Visual Basic programmers to call ODBC through a friendly set of objects. Those of you familiar with ODBC are also used to getting errors back from stored procedures in a rather straightforward way; if the stored procedure raises an error, you can easily catch the error in your code.
      When you move to ADO and the native OLE DB provider for SQL Server, SQLOLEDB, the techniques you have used in the past for raising and catching errors are very different from the ODBC mechanism. Of course, you can make ADO work like your old ODBC code by simply employing the ODBC provider for ADO, MSDASQL. But if you use the native SQLOLEDB provider, you will need to watch out. The MSDASQL provider allows you to use the ADO objects instead of RDO while still calling through the ODBC layer. I know this discussion of ADO and ODBC may seem way off-topic from the rest of this column, but it really is important since a lot of you use SQL Server and ADO for data access.
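      For reference, the provider is chosen in the ADO connection string. These strings are only illustrative (your server name and security settings will differ), but they show the switch this discussion hinges on:

Dim sNative As String
Dim sOdbc As String

' Native OLE DB provider for SQL Server (SQLOLEDB)
sNative = "Provider=SQLOLEDB;Data Source=(local);" & _
          "Initial Catalog=pubs;Integrated Security=SSPI"

' OLE DB provider for ODBC (MSDASQL) layered over the SQL Server ODBC driver
sOdbc = "Provider=MSDASQL;Driver={SQL Server};Server=(local);" & _
        "Database=pubs;Trusted_Connection=yes"

' Pick one or the other when opening the connection
Dim cn As ADODB.Connection
Set cn = New ADODB.Connection
cn.Open sNative
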
      If you call the following simple stored procedure (which simply raises an error) from an ADO component that uses the SQLOLEDB or MSDASQL providers, you will always catch the error from the calling component:


CREATE PROCEDURE spRaiseError AS

raiserror ('Some generic error string',16,1)

Figure 3 shows the Ax.dll COM component created with Visual Basic. This component is called from the following test client code:

Private Sub cmdTest_Click()
    ' Create the component from Ax.dll and call the method
    ' that executes the stored procedure
    Dim ref As CSQLTx
    Dim rc As Integer

    Set ref = New CSQLTx
    rc = ref.ExecFailedSPROC()
End Sub

      Now, let's make a few small changes to the stored procedure. This time it will begin a transaction and insert a record into the Authors table of the Pubs database. Then it will roll back the transaction and raise an error back to the client.

CREATE PROCEDURE spRaiseError AS

begin tran

insert into authors(au_id, au_lname,au_fname,contract)
values('111-11-1111','Flinstone','Fred',1)

rollback tran

raiserror ('Some generic error string',16,1)

This is obviously a contrived example to demonstrate the point. You will find that if you use the SQLOLEDB provider, your component will never catch the error the way it is currently written. However, if you use the MSDASQL provider, which means you are using ODBC, you will catch the error. This is a major difference in the way errors get passed back to the component.
      The creators of the OLE DB specification wanted users to be able to find out very granular information about statement executions. They wanted to provide the client application calling a stored procedure or SQL batch the ability to inspect the success or failure of each statement within the procedure. There is a lot of information specified and available from OLE DB, but the implementation of ADO does not always take advantage of this information. This is why you often get back the generic "Errors Occurred" when really a very specific error occurred, but it was just not propagated back through the ADO layer.
      In the modified stored procedure shown earlier, more than a single statement is executed. Each statement will return a recordset to ADO. This means that the error being raised at the bottom of the stored procedure is returned to the calling component, but that component needs to iterate through all the recordsets passed back to trap the error.
      Take a look at the revised code in Figure 4. In this version, I capture the recordset returned by the connection's Execute method, and then iterate through each recordset using the NextRecordset method. When I reach the recordset that has the error in it, the error handling routine is triggered. If this same code were executed against the MSDASQL provider, I would have trapped the error on the original connection object's Execute method and would not have had to iterate through each recordset. This is important because the sample code aborts the transaction and raises an error back to the caller.
      Assuming that the caller is a transactional MTS and COM+ component, I would want to know that my work had failed so I could then call SetAbort and return to the client. If I don't iterate the recordsets, I would never know about the error. As a result, I might continue trying to do other transacted work when my transaction is already doomed.
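      As a sketch of that pattern (not a reproduction of Figure 4), an ExecFailedSPROC-style method written against the SQLOLEDB provider might look like the following; the connection string is an assumption, and the SetAbort call only matters if the component is running in a transaction.

' Minimal sketch of the NextRecordset pattern against the SQLOLEDB provider.
' The connection string is an assumption; the method name matches the test client.
Public Function ExecFailedSPROC() As Integer
    Dim cn As ADODB.Connection
    Dim rs As ADODB.Recordset

    On Error GoTo ErrHandler

    Set cn = New ADODB.Connection
    cn.Open "Provider=SQLOLEDB;Data Source=(local);" & _
            "Initial Catalog=pubs;Integrated Security=SSPI"

    ' With SQLOLEDB the raiserror inside spRaiseError is not surfaced by
    ' Execute itself...
    Set rs = cn.Execute("exec spRaiseError")

    ' ...it only shows up while walking the recordsets returned for each
    ' statement in the procedure.
    Do Until rs Is Nothing
        Set rs = rs.NextRecordset
    Loop

    cn.Close
    ExecFailedSPROC = 0        ' no error found in any recordset
    Exit Function

ErrHandler:
    ' The error raised by the stored procedure lands here. A transactional
    ' component would vote to abort before returning to its caller.
    If Not GetObjectContext Is Nothing Then GetObjectContext.SetAbort
    ExecFailedSPROC = -1
End Function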

Jason Masterman is a computer scientist at DevelopMentor. Jason consults on Visual Basic, MTS, MSMQ, IIS, and Commerce Server development. He recently returned from Japan, where he worked on an intranet and Internet system to sell cars over the Web internationally.

From the June 2000 issue of MSDN Magazine.