ACS Retry Guidelines
Published: January 9, 2013
Updated: January 9, 2014
Applies To: Windows Azure
Windows Azure Active Directory Access Control (also known as Access Control Service or ACS) supports a number of different token issuance and management endpoints that clients can query. This article defines guidelines for implementing retry logic for these endpoints to handle an unexpected network or server failure.
Error-Handling Scenarios
Failures that respond to retries typically return an HTTP 500-series error code for a request to an ACS endpoint. In some scenarios, the client is an application or service that makes automated requests to ACS. In other scenarios, such as web-based federation that uses the WS-Federation protocol, the client is a web browser and retries must be performed manually by the end user. This topic covers error-handling scenarios in which the client is an application or service.
These scenarios include:
- Management operations that use the ACS Management Service
- Token requests for Web services using the WS-Trust protocol (see Securing WCF Services with ACS)
- Token requests for Web services using the OAuth WRAP protocol (see How to: Request a Token from ACS via the OAuth WRAP Protocol)
- Token requests for Web services using the OAuth 2.0 protocol (see Code Sample: OAuth 2.0 Certificate Authentication)
Retry Guidelines
The following guidelines explain how to implement retry logic in the error-handling scenarios.
Guideline #1: Implement retry logic based on HTTP 500-series error responses
Retry logic is strongly recommended when ACS returns HTTP 500-series errors. The following list includes examples of typical HTTP 500-series errors.
- HTTP Error 500 - Internal Server Error
- HTTP Error 502 - Bad Gateway
- HTTP Error 503 - Service Unavailable
- HTTP Error 504 - Gateway Timeout
Although individual HTTP codes can be listed explicitly in the retry logic, it is sufficient to invoke retry logic if any HTTP 500-series error is returned.
Typically, retry logic is not recommended when HTTP 400-series error codes are returned. A 400-series HTTP error response code from ACS means the request is invalid and needs to be revised.
Retry logic should be triggered by HTTP error codes, such as HTTP 504 (Gateway Timeout) or any other code in the HTTP 500 series, and not by ACS error codes, such as ACS90005. ACS error codes are informational and subject to change.
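The rule in this guideline can be sketched as a single check on the HTTP status code. This is a minimal illustration, not part of any ACS SDK; the function name should_retry is an assumption made for this example.

```python
def should_retry(status_code: int) -> bool:
    """Return True when the HTTP status code is in the 500 series.

    Hypothetical helper for illustration. It deliberately looks only at
    the HTTP status code and never at an ACS error code in the body.
    """
    # Any 5xx response triggers a retry. A 4xx response means the request
    # itself is invalid and must be revised, so it is not retried.
    return 500 <= status_code <= 599


if __name__ == "__main__":
    for code in (500, 502, 503, 504, 400, 401, 200):
        print(code, should_retry(code))
```

Checking the whole 500-599 range rather than listing individual codes keeps the logic valid if the service returns a 500-series code that is not in the list above.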
Guideline #2: Retries should use a back-off timer for optimal flow control
When a client receives an HTTP 500-series error, the client should wait for a specified period of time before retrying the request. For best results, it is recommended that this period of time increase with each subsequent retry. This approach allows transient errors to be resolved quickly while optimizing the request rate for transient network or server issues that take longer to resolve.
For example, use an exponential back-off timer in which the delay before each retry doubles with every attempt, such as Retry 1: 1 second, Retry 2: 2 seconds, Retry 3: 4 seconds, and so on.
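The delay schedule in this example can be computed as follows. This is a small sketch; the function name backoff_delay and the one-second base delay are assumptions chosen to match the figures above.

```python
def backoff_delay(retry_number: int, base_seconds: float = 1.0) -> float:
    """Return the delay in seconds before the given retry (1-based).

    Hypothetical helper: the delay doubles with each retry, starting
    from base_seconds for the first retry.
    """
    if retry_number < 1:
        raise ValueError("retry_number starts at 1")
    return base_seconds * 2 ** (retry_number - 1)


if __name__ == "__main__":
    delays = [backoff_delay(n) for n in range(1, 6)]
    print(delays)  # [1.0, 2.0, 4.0, 8.0, 16.0]
```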
Adjust the number of retries and the time between each retry based on your user experience requirements. As a starting point, we recommend up to five retries over a period of five minutes. Keep in mind that failures caused by a timeout tend to take longer to resolve.
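One possible shape for a retry loop that follows Guidelines #1 and #2 is sketched below. Everything named here is an assumption made for illustration: send stands for whatever function issues the request to ACS and returns a status code and body, and the defaults (five retries, a one-second base delay, a 300-second wait budget) simply mirror the recommendation above. The sleep function is injectable so that the loop can be exercised without actually waiting.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (HTTP status code, response body)


def send_with_retries(
    send: Callable[[], Response],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_total_wait: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry on 500-series responses with back-off."""
    waited = 0.0
    status, body = send()
    for retry in range(1, max_retries + 1):
        if not 500 <= status <= 599:
            break  # success, or a 400-series error that must be fixed
        delay = base_delay * 2 ** (retry - 1)  # 1, 2, 4, 8, 16 seconds
        if waited + delay > max_total_wait:
            break  # stay within the overall wait budget
        sleep(delay)
        waited += delay
        status, body = send()
    return status, body


if __name__ == "__main__":
    # Simulated endpoint: two transient failures, then success.
    responses = iter([(503, "busy"), (503, "busy"), (200, "token")])
    slept = []
    result = send_with_retries(lambda: next(responses), sleep=slept.append)
    print(result, slept)
```

The caller receives the last response either way, so a persistent 500-series error after the final retry can still be logged or surfaced to the user.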
Guideline #3: Check whether the item exists before attempting to create or delete it
When performing create or delete operations with the ACS Management Service, such as creating a new relying party application or deleting a rule, the retry logic should check whether the item exists before performing the operation. In some circumstances, such as a transient network failure that occurs while delivering the server response, a creation or deletion operation can succeed even when the client gets an error response.
If a create operation is retried without checking for the existence of the item, duplicate items can be created. Also, the system might return an HTTP 400 error if the item must be unique.
If a delete operation is retried without checking for the existence of the item, the system might return an HTTP 400 error when it cannot find the item.
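The check-before-retry pattern can be sketched as follows. This is not the ACS Management Service API: FakeManagementService is a hypothetical in-memory stand-in whose exists, create, and delete methods, and the 201 and 204 success codes, are assumptions made so that the example runs on its own. It simulates the case described above, in which the operation succeeds on the server but the client sees a 500-series error.

```python
class FakeManagementService:
    """In-memory stand-in for a management endpoint (hypothetical)."""

    def __init__(self, lose_responses: int = 0):
        self.items = set()
        self.lose_responses = lose_responses  # responses lost in transit

    def _status(self, ok_status: int) -> int:
        # The operation has already succeeded; simulate a lost response.
        if self.lose_responses > 0:
            self.lose_responses -= 1
            return 500
        return ok_status

    def exists(self, name: str) -> bool:
        return name in self.items

    def create(self, name: str) -> int:
        if name in self.items:
            return 400  # item must be unique
        self.items.add(name)
        return self._status(201)

    def delete(self, name: str) -> int:
        if name not in self.items:
            return 400  # item not found
        self.items.remove(name)
        return self._status(204)


def ensure_created(service, name: str, max_retries: int = 5) -> bool:
    """Create the item, checking for it before every attempt."""
    for _ in range(max_retries + 1):
        if service.exists(name):
            return True  # already there, possibly from an earlier attempt
        status = service.create(name)
        if 200 <= status < 300:
            return True
        if not 500 <= status <= 599:
            return False  # invalid request; do not retry
        # A back-off delay (Guideline #2) would go here.
    return service.exists(name)


def ensure_deleted(service, name: str, max_retries: int = 5) -> bool:
    """Delete the item, checking for it before every attempt."""
    for _ in range(max_retries + 1):
        if not service.exists(name):
            return True  # already gone, possibly from an earlier attempt
        status = service.delete(name)
        if 200 <= status < 300:
            return True
        if not 500 <= status <= 599:
            return False  # invalid request; do not retry
        # A back-off delay (Guideline #2) would go here.
    return not service.exists(name)


if __name__ == "__main__":
    service = FakeManagementService(lose_responses=1)
    print(ensure_created(service, "rp1"), sorted(service.items))
```

In this sketch the existence check itself is assumed to succeed. In real code that query is also a request to the service, so it would need the same 500-series retry handling as the create and delete calls.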
See Also