Recovery Processing

After any type of failure that disrupts normal transaction processing, KTM and each resource manager must perform recovery operations. Recovery is the process by which transaction participants arrive at a consistent view of each transaction's state.

Resource managers may be in-doubt about a transaction's outcome, meaning that at the time of failure, they had received a TRANSACTION_NOTIFY_PREPARE notification, had prepared to durable storage, but had not received (or received but not logged) a final outcome for the transaction. Similarly, KTM can be in-doubt about a transaction if it had been prepared but had not received (or received but not logged) an outcome. At recovery time, all outcomes that have been sent but not acknowledged must be re-sent. For example, if a resource manager received a TRANSACTION_NOTIFY_COMMIT notification and called the CommitComplete function, the RM may still receive a duplicate TRANSACTION_NOTIFY_COMMIT notification at recovery time.
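Because a commit outcome can be delivered more than once, it helps if the resource manager's commit handler is idempotent. The following is a minimal, portable C sketch of that idea. It is a model, not KTM code: `TxRecord`, `TxState`, and `on_commit_notification` are hypothetical names, and the counter stands in for whatever durable work a real RM performs. It also assumes that acknowledging a duplicate (for example, by calling CommitComplete again) is the appropriate response.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-transaction record kept in the RM's own log. */
typedef enum { TX_PREPARED, TX_COMMITTED } TxState;

typedef struct {
    TxState state;
    int     apply_count;  /* how many times the commit work actually ran */
} TxRecord;

/* Handle a commit notification; safe to call again for a duplicate.
   Returns true when the caller should acknowledge the outcome
   (in a real RM, this is where CommitComplete would be called). */
bool on_commit_notification(TxRecord *tx)
{
    assert(tx != NULL);
    if (tx->state != TX_COMMITTED) {
        tx->apply_count++;         /* stand-in for making changes permanent */
        tx->state = TX_COMMITTED;  /* stand-in for logging the outcome */
    }
    return true;  /* assumption: a duplicate is acknowledged as well */
}
```

Calling the handler twice on the same record applies the commit work once and acknowledges both times.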

For a transaction to properly recover after a resource manager or system failure, each resource manager must do the following each time it is started:

  1. Call the OpenResourceManager function to re-open its resource manager handle using its unique, persistent name. This informs KTM that the resource manager is running again and is available to perform recovery. If no enlistments exist to be recovered, the call to OpenResourceManager can fail; in that case, call CreateResourceManager to re-create the RM object.

  2. Call RecoverResourceManager. The resource manager will receive a TRANSACTION_NOTIFY_RECOVER notification event for each enlistment for which it needs to perform recovery operations, followed by a TRANSACTION_NOTIFY_LAST_RECOVER. The notification event includes a globally unique identifier for both the transaction and the enlistment.

  3. Call the OpenEnlistment function to re-open each enlistment handle for which the resource manager received a TRANSACTION_NOTIFY_RECOVER notification.

  4. For each enlistment opened by OpenEnlistment, call RecoverEnlistment. This causes the TRANSACTION_NOTIFY_COMMIT or TRANSACTION_NOTIFY_INDOUBT notification to be redelivered.

  5. If the RM received TRANSACTION_NOTIFY_COMMIT, the RM can complete the transaction by calling CommitComplete.

    If the RM received TRANSACTION_NOTIFY_INDOUBT, the RM should wait for the outcome notification to arrive.

  6. For any transaction for which the RM previously received a TRANSACTION_NOTIFY_PREPARE notification but does not receive a TRANSACTION_NOTIFY_RECOVER notification, the RM should process the transaction as if it were rolled back.
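The notification handling in steps 2 through 5 can be modeled as a small state machine. The sketch below is portable C and does not call the KTM API. The enum values are local stand-ins for the notifications named above (they are not the Windows SDK constants), small integers replace the enlistment GUIDs, and `RmRecovery`, `rm_recovery_init`, and `rm_handle` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <string.h>

/* Local stand-ins for the notification kinds; not the SDK constants. */
typedef enum { N_RECOVER, N_LAST_RECOVER, N_COMMIT, N_INDOUBT } NotifyKind;

typedef struct {
    NotifyKind kind;
    int        enlistment_id;  /* hypothetical integer in place of a GUID */
} Notification;

typedef enum { E_UNKNOWN = 0, E_REOPENED, E_COMMITTED, E_WAITING } EnlistState;

#define MAX_ENLISTMENTS 8

typedef struct {
    EnlistState state[MAX_ENLISTMENTS];
    int         recover_list_complete;  /* set when N_LAST_RECOVER arrives */
} RmRecovery;

void rm_recovery_init(RmRecovery *rm)
{
    memset(rm, 0, sizeof *rm);  /* every enlistment starts as E_UNKNOWN */
}

/* Apply one notification to the RM's recovery state. */
void rm_handle(RmRecovery *rm, Notification n)
{
    if (n.kind == N_LAST_RECOVER) {
        rm->recover_list_complete = 1;  /* no more recover notifications */
        return;
    }
    assert(n.enlistment_id >= 0 && n.enlistment_id < MAX_ENLISTMENTS);
    switch (n.kind) {
    case N_RECOVER:  /* steps 3-4: re-open the enlistment, request redelivery */
        rm->state[n.enlistment_id] = E_REOPENED;
        break;
    case N_COMMIT:   /* step 5: finish the commit and acknowledge it */
        rm->state[n.enlistment_id] = E_COMMITTED;
        break;
    case N_INDOUBT:  /* step 5: keep waiting for the final outcome */
        rm->state[n.enlistment_id] = E_WAITING;
        break;
    default:
        break;
    }
}
```

Feeding the handler a recover notification for enlistments 1 and 2, then the last-recover marker, then a commit for 1 and an in-doubt for 2 leaves enlistment 1 committed, enlistment 2 waiting, and every other slot untouched.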

Note

Resource managers are allowed to enlist in or create new transactions while in the process of performing recovery.

KTM uses a presumed abort transaction model. The following scenario illustrates this behavior. Assume that KTM and a resource manager exist on the same computer. Suppose KTM issues a prepare notification for a transaction, but the system crashes before KTM logs the prepare notification. Further suppose that the resource manager receives and logs the prepare notification just before the system crashes. After the system is restored, KTM has no knowledge of the transaction, because it never logged the prepare phase. The resource manager has knowledge of the transaction, because it received, processed, and logged the prepare notification. When KTM issues its recovery notifications, it does not include a recovery notification for the transaction in question. With the presumed abort model, the resource manager in this case treats the prepared transaction as aborted, because it does not receive a notification to perform recovery on that transaction.
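The presumed abort rule amounts to comparing the transactions the resource manager logged as prepared against the transactions KTM asked it to recover. The portable C sketch below illustrates that comparison under stated assumptions: integer IDs stand in for transaction GUIDs, the names `Outcome`, `resolve_prepared`, and `id_in_list` are hypothetical, and the check is only meaningful once the full set of recovery notifications has arrived (that is, after TRANSACTION_NOTIFY_LAST_RECOVER).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { OUTCOME_RECOVERING, OUTCOME_PRESUMED_ABORT } Outcome;

/* Linear search; fine for the handful of IDs in this illustration. */
static bool id_in_list(const int *ids, size_t count, int id)
{
    for (size_t i = 0; i < count; i++) {
        if (ids[i] == id) {
            return true;
        }
    }
    return false;
}

/* Decide what to do with one transaction that the RM logged as prepared.
   recover_ids holds the transactions named in recovery notifications. */
Outcome resolve_prepared(int tx_id, const int *recover_ids, size_t recover_count)
{
    assert(recover_ids != NULL || recover_count == 0);
    return id_in_list(recover_ids, recover_count, tx_id)
               ? OUTCOME_RECOVERING       /* KTM knows it: follow recovery */
               : OUTCOME_PRESUMED_ABORT;  /* KTM never logged it: roll back */
}
```

In the scenario above, the resource manager's log contains the prepared transaction but the recovery list does not, so `resolve_prepared` returns `OUTCOME_PRESUMED_ABORT` for it.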