Handling Transient Communication Errors

Updated: March 19, 2014

To improve the reliability of a solution that uses the Windows Azure Service Bus .NET managed brokered messaging API, it is recommended that you adopt a consistent approach to handling the transient faults and intermittent errors that can occur when the solution communicates with the multi-tenant, cloud-based queuing and publish/subscribe messaging infrastructure provided by Service Bus.

Techniques for Handling Transient Communication Errors

When choosing a technique for detecting transient conditions, you can either reuse an existing solution, such as the Transient Fault Handling Framework, or build your own. In either case, make sure that only a specific subset of communication exceptions is treated as transient before you try to recover from the respective faults.

The following table lists the exceptions that can be compensated for by implementing retry logic:

 

Exception Type

Recommendation

ServerBusyException

This exception can be caused by an intermittent fault in the Service Bus messaging infrastructure, which is unable to process a request because of abnormal load conditions at that point in time. The client can retry the operation after a delay. A back-off delay is preferable because it avoids putting additional pressure on the server.

MessagingCommunicationException

This exception signals a communication error that occurs when a connection from the messaging client to the Service Bus infrastructure cannot be established. In most cases, provided that network connectivity exists, this error can be treated as transient, and the client can retry the operation that caused it. It is also recommended that you verify that Domain Name System (DNS) resolution is operational, because this error may indicate that the target host name cannot be resolved.

TimeoutException

This exception indicates that the Service Bus messaging infrastructure did not respond to the requested operation within the specified time, which is controlled by the OperationTimeout setting. The requested operation may still have completed; however, because of network or other infrastructure delays, the response may not have reached the client in time. Compensating for this type of exception must be done with caution: if a message has been delivered to a queue but the response has timed out, resending the original message results in a duplicate.

For more detailed information about different types of exceptions that can be reported by the Service Bus messaging API, see the Messaging Exceptions topic.
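The back-off recommendation given for ServerBusyException can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by the Transient Fault Handling Framework: the BackoffRetry class, its ComputeDelay and Execute methods, and the delay-doubling schedule are hypothetical names and choices made for this example, and the caller supplies the isTransient predicate that decides which exceptions qualify for a retry.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration only. It is not part of the Service Bus API
// or of the Transient Fault Handling Framework.
public static class BackoffRetry
{
    // Returns the delay to wait before retry number 'attempt' (0-based).
    // The delay doubles with each attempt and is capped at 'max'.
    public static TimeSpan ComputeDelay(int attempt, TimeSpan initial, TimeSpan max)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException("attempt");

        double milliseconds = initial.TotalMilliseconds * Math.Pow(2, attempt);
        return TimeSpan.FromMilliseconds(Math.Min(milliseconds, max.TotalMilliseconds));
    }

    // Runs 'operation' and retries it up to 'maxRetries' times, but only when
    // 'isTransient' classifies the failure as transient. Any other exception,
    // or a transient one after the last retry, is rethrown to the caller.
    public static T Execute<T>(Func<T> operation, Func<Exception, bool> isTransient,
                               int maxRetries, TimeSpan initial, TimeSpan max)
    {
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                if (attempt >= maxRetries || !isTransient(ex))
                {
                    throw;
                }

                // Back off before the next attempt to avoid adding pressure to the server.
                // A blocking sleep keeps the sketch short; asynchronous code would use a timer.
                Thread.Sleep(ComputeDelay(attempt, initial, max));
            }
        }
    }
}
```

With an initial delay of 100 milliseconds and a cap of 5 seconds, ComputeDelay(3, ...) yields 800 milliseconds, and later attempts are held at the 5-second cap. Because a timed-out send may already have been delivered, a predicate that treats TimeoutException as retryable is only safe when resending cannot cause harmful duplication.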

Note
When handling transient communication errors, beware of transient exceptions masked by outer exceptions of a different type. For example, a timeout may return to the caller in the form of a communication error that hides the original timeout as an inner exception. It is therefore recommended that you inspect all inner exceptions of a given exception object in a recursive manner to be able to reliably detect transient communication errors. The ServiceBusTransientErrorDetectionStrategy class in the Transient Fault Handling Framework provides an example of how you can do this.
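The recursive inspection described in this note might look like the sketch below. It is not the code of ServiceBusTransientErrorDetectionStrategy: the TransientErrorDetector class and its IsTransient method are hypothetical names introduced for this example. As a further assumption, exceptions are matched by their short type name so that the sample compiles without a reference to the Service Bus assembly; production code would normally test the actual exception types instead, which also covers derived types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only. It is not part of the
// Transient Fault Handling Framework.
public static class TransientErrorDetector
{
    // Assumption: the three exception types listed in the table above, matched by
    // short type name to keep the sample free of Service Bus assembly references.
    private static readonly HashSet<string> TransientTypeNames = new HashSet<string>
    {
        "ServerBusyException",
        "MessagingCommunicationException",
        "TimeoutException"
    };

    // Returns true if 'ex', or any exception nested inside it, is a known transient type.
    public static bool IsTransient(Exception ex)
    {
        if (ex == null)
        {
            return false;
        }

        if (TransientTypeNames.Contains(ex.GetType().Name))
        {
            return true;
        }

        // An AggregateException can carry several inner exceptions; check each of them.
        var aggregate = ex as AggregateException;
        if (aggregate != null)
        {
            foreach (var inner in aggregate.InnerExceptions)
            {
                if (IsTransient(inner))
                {
                    return true;
                }
            }

            return false;
        }

        // Otherwise, walk down the chain of inner exceptions.
        return IsTransient(ex.InnerException);
    }
}
```

A timeout wrapped two levels deep, for example new InvalidOperationException("outer", new Exception("middle", new TimeoutException())), is reported as transient, whereas an exception chain that contains none of the listed types is not.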

The following code snippet demonstrates how to asynchronously send a message to a Service Bus topic while ensuring that all known transient faults are compensated for by a retry. Note that this code sample depends on the Transient Fault Handling Framework.

var credentials = TokenProvider.CreateSharedSecretTokenProvider(issuerName, issuerSecret);
var address = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, String.Empty);
var messagingFactory = MessagingFactory.Create(address, credentials);
var topicClient = messagingFactory.CreateTopicClient(topicPath);
var retryPolicy = new RetryPolicy<ServiceBusTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount);

// Create an instance of the object that represents message payload.
var payload = XDocument.Load("InventoryFile.xml");

// Declare a BrokeredMessage instance outside so that it can be reused across all 3 delegates below.
BrokeredMessage msg = null;

// Use a retry policy to execute the Send action in an asynchronous and reliable fashion.
retryPolicy.ExecuteAction
(
    (cb) =>
    {
        // A new BrokeredMessage instance must be created each time we send it. Reusing the original BrokeredMessage instance may not 
        // work as the state of its BodyStream cannot be guaranteed to be readable from the beginning.
        msg = new BrokeredMessage(payload.Root, new DataContractSerializer(typeof(XElement)));

        // Send the event asynchronously.
        topicClient.BeginSend(msg, cb, null);
    },
    (ar) =>
    {
        try
        {
            // Complete the asynchronous operation. 
            // This may throw an exception that will be handled internally by the retry policy.
            topicClient.EndSend(ar);
        }
        finally
        {
            // Ensure that any resources allocated by a BrokeredMessage instance are released.
            if (msg != null)
            {
                msg.Dispose();
                msg = null;
            }
        }
    },
    (ex) =>
    {
        // Always dispose the BrokeredMessage instance even if the send 
        // operation has completed unsuccessfully.
        if (msg != null)
        {
            msg.Dispose();
            msg = null;
        }

        // Always log exceptions.
        Trace.TraceError(ex.Message);
    }
);

The next code sample shows how to reliably create a new Service Bus topic or retrieve an existing one. This code also depends on the Transient Fault Handling Framework, which automatically retries the corresponding management operation if it cannot be completed because of intermittent connectivity issues or other types of transient conditions:

public TopicDescription GetOrCreateTopic(string issuerName, string issuerSecret, string serviceNamespace, string topicName)
{
    // Must validate all input parameters here. Use Code Contracts or build your own validation.
    var credentials = TokenProvider.CreateSharedSecretTokenProvider(issuerName, issuerSecret);
    var address = ServiceBusEnvironment.CreateServiceUri("sb", serviceNamespace, String.Empty);
    var nsManager = new NamespaceManager(address, credentials);
    var retryPolicy = new RetryPolicy<ServiceBusTransientErrorDetectionStrategy>(RetryPolicy.DefaultClientRetryCount);

    TopicDescription topic = null;
    bool createNew = false;

    try
    {
        // First, let's see if a topic with the specified name already exists.
        topic = retryPolicy.ExecuteAction<TopicDescription>(() => { return nsManager.GetTopic(topicName); });

        createNew = (topic == null);
    }
    catch (MessagingEntityNotFoundException)
    {
        // Looks like the topic does not exist. We should create a new one.
        createNew = true;
    }

    // If a topic with the specified name doesn't exist, create it.
    if (createNew)
    {
        try
        {
            var newTopic = new TopicDescription(topicName);

            topic = retryPolicy.ExecuteAction<TopicDescription>(() => { return nsManager.CreateTopic(newTopic); });
        }
        catch (MessagingEntityAlreadyExistsException)
        {
            // A topic under the same name was already created by someone else, 
            // perhaps by another instance. Let's just use it.
            topic = retryPolicy.ExecuteAction<TopicDescription>(() => { return nsManager.GetTopic(topicName); });
        }
    }

    return topic;
}

In summary, we recommend that you assess the probability of a failure occurring and determine whether adding resilience is feasible. Virtually all messaging operations can be subject to transient conditions. Therefore, when calling into the brokered messaging API, it is recommended that you always take appropriate actions to recover from intermittent issues.

© 2014 Microsoft. All rights reserved.