
Handle Receive-Specific Batch Errors
Handle Receive Failures
When an adapter submits an operation (or batch of operations) to BizTalk Server there can be various reasons for failure. The two most significant are:
- The receive pipeline failed.
- A routing failure occurred while publishing a message.
The Messaging Engine automatically tries to suspend the message when it gets a receive pipeline failure, although the suspend operation may not always succeed. If the Messaging Engine hits a routing failure while publishing a message, it does not even try to suspend the message.
Because a message can always fail, the adapter should be prepared to suspend the message itself by explicitly calling the MoveToSuspendQ API. When an adapter tries to suspend a message, one of the following should be true:
- The same message object that the adapter submitted (recommended) should be suspended.
- If the adapter has to create a new message, then it should set the message context of the new message with the pointer to the message context of the message that was originally submitted. This is because the message context of a message has a lot of valuable information about the message and the failure. This information is required to debug the failed message.
Note: If the adapter creates a new message object and suspends it, the adapter should copy the error information from the old message object to the new message object.
Some adapters, such as the HTTP adapter provided with BizTalk Server, do not require that the message be suspended. These adapters can return an error back to their client.
Causes of failure
Simple causes of failure are the errors that can occur as the batch is constructed or when IBTTransportBatch::Done is called.
- Submit failure. The Submit call can fail for a limited number of reasons, and all of them are fatal. These reasons include:
  - Out-of-memory errors occurring in the BizTalk Server process space.
  - The schema assembly has been dropped from the deployment. In this case, Submit fails with a cryptic error. The MQSeries adapter catches the generic failure exception from BizTalk Server and writes an extended error message to the system event log. This message suggests that one possible cause of the error is that the schema assembly has been dropped from the deployment.
  In general, if Submit fails you should try to suspend the message using the same transaction.
- IBTTransportBatch::Done failure. The IBTTransportBatch::Done call can fail for one of several reasons. In general, you should always attempt one suspend operation and end the transaction only if that fails. One of the error codes you might receive when IBTTransportBatch::Done fails indicates that BizTalk Server is trying to shut down. In this case, you should just end the transaction and take no further action, because the Terminate call is probably happening concurrently.
Other scenarios occur when you have successfully constructed the batch and successfully executed IBTTransportBatch::Done. In these cases, the errors are returned in BatchComplete and the adapter must determine what to do with them. The rest of this section deals with this case.
Processing BatchComplete errors
BatchComplete is a callback provided by the adapter that is invoked by BizTalk Server to indicate the completion status of a batch operation.
The most important parameter passed to BatchComplete is the batch status hResult. This indicates success or failure for the batch. If the batch failed, it means that none of the operations in the batch succeeded. The adapter goes through the batch status structure and determines which messages failed (this is known as filtering the batch).
Nontransactional BatchComplete errors
For nontransactional adapters, you must choose your response if a failure occurs for a SubmitMessage/SubmitRequestMessage or SubmitResponseMessage operation. Typically, adapters suspend the message by calling MoveToSuspendQ.
The following operations are always expected to pass: DeleteMessage, MoveToSuspendQ, ResubmitMessage. If these operations fail, it typically means that there is a bug in the adapter. You do not have to write code to handle a failure in these cases. However, if the batch failed because another operation failed, then these operations must be re-executed in a fresh batch.
If the adapter calls MoveToBackupTransport and that fails (because there was no backup transport), then the adapter should call MoveToSuspendQ to suspend the message.
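The rules above can be sketched as a small, self-contained Python model. It does not use the BizTalk Server API: the `Op` record, the `failed` flag (standing in for the per-operation status that BatchComplete reports), and the `plan_retry_batch` function are hypothetical names introduced only for illustration. The sketch also assumes that submit operations that did not fail are simply submitted again, because nothing in a failed batch succeeded.

```python
from dataclasses import dataclass

# Hypothetical model of one batch operation; not a BizTalk Server type.
@dataclass(frozen=True)
class Op:
    kind: str             # for example "SubmitMessage" or "DeleteMessage"
    message_id: str
    failed: bool = False  # stand-in for the per-operation status in BatchComplete

# Operations whose failure the adapter answers by suspending the message.
SUSPEND_ON_FAILURE = {
    "SubmitMessage",
    "SubmitRequestMessage",
    "SubmitResponseMessage",
    "MoveToBackupTransport",
}

def plan_retry_batch(failed_batch):
    """Filter a failed batch and return the operations for a fresh batch."""
    retry = []
    for op in failed_batch:
        if op.failed and op.kind in SUSPEND_ON_FAILURE:
            # The operation that broke the batch: suspend its message instead.
            retry.append(Op("MoveToSuspendQ", op.message_id))
        else:
            # Nothing in a failed batch succeeded, so re-execute the rest as is.
            retry.append(Op(op.kind, op.message_id))
    return retry

failed_batch = [
    Op("SubmitMessage", "m1"),
    Op("SubmitMessage", "m2", failed=True),
    Op("DeleteMessage", "m3"),
]
fresh_batch = plan_retry_batch(failed_batch)
```

In this model, `fresh_batch` resubmits `m1`, suspends `m2`, and repeats the delete of `m3`.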
Transactional BatchComplete errors
When you submit batches to BizTalk Server using a transaction created by the adapter, you should follow one of these two scenarios:
- Use single-message batches. Send a single-message batch to BizTalk Server. If that message fails, you can legally send BizTalk Server a second batch under the same transaction, but you must move the offending message to the Suspended queue rather than resubmitting it. With the failed message removed, the second batch should succeed, and you can commit the transaction once BizTalk Server confirms that the second batch was successful. If the second batch fails, the adapter must abort the transaction or find somewhere else to place that message. In this scenario, you immediately take a significant performance hit due to transaction rollback processing.
  There are some techniques that you can use to improve the performance of the adapter. For example, the MQSeries adapter adjusts its approach dynamically at run time. It normally runs with 100-message batches. If it hits an error, it must end the batch, but it then switches to single-message batches for a short time to get past the bad message before reverting to 100-message batches. If it hits an error again, it slows down again.
- Use preemptive suspension. Construct a multi-message batch in which the erroneous messages are preemptively suspended. The batch contains a mix of Submit and MoveToSuspendQ operations, and is the first and only batch under the transaction. It should succeed because the bad data was preemptively suspended, and the transaction can be committed (after waiting to receive the confirmation from BizTalk Server).
  This might seem to require looking into the future, but this technique has been used in the MSMQ adapter. It depends on having reliably unique message IDs. The adapter constructs a batch of messages. If anything fails, it rolls back the transaction (and therefore the batch) but remembers the ID of the failed message in a temporary data structure. (To prevent this structure from growing indefinitely, items in it are removed after some fixed time delay.) Before each batch is submitted, the adapter checks the list of bad message IDs. If it sees one, it knows that message will fail (because it failed once in the past), and preemptively suspends it rather than trying to submit it.
  Not every adapter has a reliably unique message ID, and a transactional store is less likely to have one. Because of this, many transactional adapters are restricted to sending single-message batches.
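The temporary structure of bad message IDs described for preemptive suspension could look something like the following Python sketch. It is an illustration under stated assumptions, not the MSMQ adapter's actual code: `BadMessageCache`, `build_batch`, the `ttl_seconds` parameter, and the operation-name strings are all hypothetical, and the injectable `clock` exists only to make the expiry behavior easy to exercise.

```python
import time

class BadMessageCache:
    """Remembers the IDs of messages that failed, for a limited time."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._expiry = {}  # message ID -> time at which the entry expires

    def remember(self, message_id):
        """Record a message that caused a batch to fail."""
        self._expiry[message_id] = self._clock() + self._ttl

    def is_bad(self, message_id):
        """Return True if the message failed recently."""
        self._purge()
        return message_id in self._expiry

    def _purge(self):
        # Drop expired entries so the structure cannot grow indefinitely.
        now = self._clock()
        for message_id in [m for m, t in self._expiry.items() if t <= now]:
            del self._expiry[message_id]

def build_batch(message_ids, cache):
    """Return (operation, message ID) pairs, suspending known-bad messages."""
    return [
        ("MoveToSuspendQ" if cache.is_bad(m) else "Submit", m)
        for m in message_ids
    ]

cache = BadMessageCache(ttl_seconds=60)
first_try = build_batch(["a", "b", "c"], cache)   # every message is submitted
cache.remember("b")                               # batch failed on "b"; transaction rolled back
second_try = build_batch(["a", "b", "c"], cache)  # "b" is suspended preemptively
```

The approach only works if the IDs seen on the second read are the same as those seen on the first, which is why it depends on reliably unique message IDs.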
Processing other errors
In all other cases (such as failures in suspending messages), the adapter must end the transaction. Any other outcome results in either duplicate or dropped messages.
Whenever the adapter can, it should abort the transaction if a batch fails. However, there are scenarios where the adapter cannot abort the transaction. In such a scenario, it should suspend the message using the same transaction.
Processing errors on transactional receive
A common transactional processing pattern is to end a transaction when an error occurs. In this case everything returns to its previous state and no data is lost. However, if you are consuming data from a transactional feed (for example, pulling a row at a time from a staging table in a database, or pulling one message at a time from a queuing product like MQSeries or MSMQ), then this might not be enough. If you simply end the transaction and go back and pick up the same data again, the same error is likely to occur and the system becomes stuck in a loop.
The SQL adapter in an earlier version of BizTalk Server shipped with this behavior. However, soon after release the adapter behavior was changed to attempt to suspend a failed message and commit the transaction. Moving a message to the Suspended queue under the same transaction and then committing the transaction prevents the data from being lost and also allows the adapter to get past bad data.
When the receive portion of an adapter is passed an error message in response to a Submit message operation, the adapter should process that error and move the message to the Suspended queue.
In the case of transactional batches in which the adapter has created the transaction object and submits messages under the transaction, the adapter should logically move the message to the Suspended queue under the same transaction when failures occur. The transaction ensures that data is not dropped, and even data that is causing an error should never be dropped.
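The difference between aborting and suspending under the same transaction can be shown with a deliberately simplified Python model. Everything here is an assumption made for illustration: the transactional feed is a plain `deque`, "commit" is modeled as removing the head of the queue, and `drain`, `publish`, and the `suspended` list are hypothetical names rather than BizTalk Server APIs.

```python
from collections import deque

def drain(source, publish, suspended):
    """Consume a simulated transactional feed one message at a time."""
    published = []
    while source:
        message = source[0]  # read the head under a (simulated) transaction
        try:
            publish(message)
            published.append(message)
        except ValueError:
            # Aborting here would leave the message at the head of the feed,
            # and the next read would hit the same error again. Instead, the
            # message is suspended as part of the same transaction.
            suspended.append(message)
        source.popleft()  # commit: the message leaves the feed either way
    return published

def publish(message):
    # Hypothetical publisher that rejects one particular message.
    if message == "bad":
        raise ValueError("publishing failed")

feed = deque(["ok-1", "bad", "ok-2"])
suspended = []
published = drain(feed, publish, suspended)
```

The bad message ends up in `suspended` rather than being dropped, and the messages behind it are still processed.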
Handle Messages without Subscriptions
BizTalk Server does not accept a message to be published in its MessageBox database if there are no subscriptions defined to accept it. Subscriptions are registered by either orchestrations or send ports. Multiple subscriptions can be defined, in which case the message is sent to multiple destinations. If there are no subscriptions, BizTalk Server rejects the message and does not attempt to suspend it. If the adapter does not handle this error and explicitly suspend the message, then the message is dropped and its data is potentially lost. Of course, a transactional adapter may instead end the transaction and return the message to its source.
Support Seek with Your Receive Stream
The receive-side stream must support the Seek method for BizTalk Server to be able to suspend the message on a pipeline failure. If the message stream is not seekable, then BizTalk Server generates an error when it tries to run Seek.
In many cases supporting Seek is not easy. When streaming data from a network, for example, it may be difficult to go back to the network resource and request the data again.
Several adapters that ship with BizTalk Server spool the message data onto a file on disk at the same time as BizTalk Server reads the data. This allows the adapter to use Seek on that file if it encounters an error (in the pipeline processing of the message data, for example). Internally the adapter uses the ReadOnlySeekableStream class that wraps an incoming non-seekable stream and overflows to disk when a configurable size threshold is reached. For messages smaller than the threshold size, the disk is never hit.
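The spooling idea can be illustrated in Python with the standard library's `tempfile.SpooledTemporaryFile`, which keeps data in memory until a size threshold is exceeded and then rolls over to a file on disk. This is only an analogue of the behavior described above, not the BizTalk Server `ReadOnlySeekableStream` class; `SpoolingReader`, `ForwardOnlySource`, and `memory_threshold` are hypothetical names introduced for the sketch.

```python
import io
import tempfile

class SpoolingReader:
    """Read-only, seekable view over a forward-only binary stream."""

    CHUNK = 64 * 1024

    def __init__(self, source, memory_threshold=1024 * 1024):
        self._source = source
        # Cached bytes stay in memory until memory_threshold is exceeded,
        # then the cache rolls over to a temporary file on disk.
        self._cache = tempfile.SpooledTemporaryFile(max_size=memory_threshold)
        self._cached = 0  # bytes copied from the source so far
        self._pos = 0     # read position as seen by the caller

    def _fill(self, upto):
        """Copy from the source until `upto` bytes are cached (None = all)."""
        self._cache.seek(0, io.SEEK_END)
        while upto is None or self._cached < upto:
            want = self.CHUNK if upto is None else min(self.CHUNK, upto - self._cached)
            chunk = self._source.read(want)
            if not chunk:
                break  # source exhausted
            self._cache.write(chunk)
            self._cached += len(chunk)

    def read(self, size=-1):
        read_all = size is None or size < 0
        self._fill(None if read_all else self._pos + size)
        self._cache.seek(self._pos)
        data = self._cache.read() if read_all else self._cache.read(size)
        self._pos += len(data)
        return data

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        elif whence == io.SEEK_END:
            self._fill(None)  # the end is only known once everything is cached
            target = self._cached + offset
        else:
            raise ValueError("invalid whence")
        if target < 0:
            raise ValueError("negative seek position")
        self._pos = target
        return self._pos

    def tell(self):
        return self._pos

    def close(self):
        self._cache.close()

class ForwardOnlySource:
    """Stand-in for a network stream: it can be read but not rewound."""

    def __init__(self, data):
        self._buffer = io.BytesIO(data)

    def read(self, size=-1):
        return self._buffer.read(size)

reader = SpoolingReader(ForwardOnlySource(b"<order>42</order>"), memory_threshold=8)
header = reader.read(7)  # consumed once, for example by a pipeline component
reader.seek(0)           # rewind, as would be needed to suspend the message
whole = reader.read()    # the full payload is still available
reader.close()
```

Because the source is copied into the cache as it is read, a rewind never has to go back to the network resource.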
Consider User-Configurable Error-Handling Options
Sometimes there is no one correct response to an error. In this case, you should consider a user-configurable option to choose between behaviors. The MQSeries adapter does this.
The problem with having the adapter suspend messages when it sees an error is that the Suspended queue in BizTalk Server is something of a "black hole." It is relatively easy to get messages into the queue, but harder to get them out again.
Some users of the adapter might not want anything in the Suspended queue. For example, in the case of the MQSeries adapter, the user is offered a configuration option to do one of the following:
- Set the adapter to end the current transaction and disable itself when it sees an error.
- Suspend the failed message and commit the transaction. The adapter does this even when BizTalk Server has successfully suspended the message. This action meets the requirements of the customer even if it causes the event log to not be strictly correct.
Implement Receive Ordering by Using a Single Thread and Waiting on BatchComplete
The interface to BizTalk Server is designed for performance and the ability to scale out by supporting concurrency. However, if you want a strictly ordered receive of messages (as is sometimes required when receiving messages from a message queue product like MQSeries or MSMQ), then you must do some additional work in the adapter to disable some of that concurrency. This can be done in two steps:
- You must use a single thread for all the data processing in the adapter.
- You must wait for BizTalk Server to completely process each batch. This requirement is important and can be accomplished by using .NET thread synchronization primitives. For example, using an AutoResetEvent, you would:
  - Declare the event object where it can be accessed by both the main worker thread and the BatchComplete callback object.
  - On the main worker thread, submit the messages to the batch as usual, but call AutoResetEvent.Reset on the event object just before calling IBTTransportBatch::Done on the batch.
  - Call AutoResetEvent.WaitOne on the event object from this same thread. This causes the main worker thread to block. In the BatchComplete callback from BizTalk Server, call AutoResetEvent.Set on the same event object to unblock the worker thread so that it is ready to process another message.
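The same reset, wait, and set pattern can be sketched in Python, where `threading.Event` plays the role of the .NET AutoResetEvent (with the difference that it must be cleared explicitly). The `OrderedReceiver` class, its `submit_batch` parameter, and the `fake_engine` function are hypothetical stand-ins for the adapter and BizTalk Server; none of them is a real BizTalk Server API.

```python
import threading

class OrderedReceiver:
    """Submits batches one at a time, waiting for each to complete."""

    def __init__(self, submit_batch):
        # Hypothetical: submit_batch(batch, callback) starts asynchronous
        # processing and later invokes callback(status) on another thread.
        self._submit_batch = submit_batch
        self._batch_done = threading.Event()  # shared with the callback
        self.last_status = None

    def batch_complete(self, status):
        """Completion callback, called by the engine when a batch finishes."""
        self.last_status = status
        self._batch_done.set()  # unblock the worker thread

    def run(self, batches):
        """Worker loop: a single thread that processes batches in order."""
        for batch in batches:
            self._batch_done.clear()                        # reset before submitting
            self._submit_batch(batch, self.batch_complete)  # analogous to Done
            self._batch_done.wait()                         # block until the callback fires

processed = []

def fake_engine(batch, on_complete):
    """Stand-in engine that processes each batch on its own thread."""
    def work():
        processed.append(batch)
        on_complete(0)
    threading.Thread(target=work).start()

receiver = OrderedReceiver(fake_engine)
receiver.run([["m1"], ["m2"], ["m3"]])
```

Because the worker thread blocks until the callback fires, the next batch is never submitted before the previous one has been fully processed, which is also what makes this approach slow.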
It is strongly suggested that receive ordering like this be made configurable because it causes significant performance degradation. Many, if not most, user scenarios do not require ordering of messages. Suspending messages can also break ordering. Exactly what to do in this case is application-dependent, so the best thing for your adapter to do is to offer the user a configuration point.
In ordered scenarios, some customers have stated that they would prefer to stop the processing, that is, disable the adapter, rather than break ordering. The MQSeries adapter, which supports ordered receive, provides this option to the user.