About Traffic Manager Monitoring

Updated: March 13, 2014

Windows Azure Traffic Manager

Azure Traffic Manager monitors your endpoints, including cloud services and websites, to ensure they are available. In order for monitoring to work correctly, you must set it up the same way for every endpoint that you specify in your Traffic Manager profile. After you configure monitoring, Traffic Manager will display the status for your endpoints and profile in the Management Portal. You can configure monitoring settings in the Management Portal on the Configure page for your Traffic Manager profile.
You can specify the following settings:

  • Protocol – Choose HTTP or HTTPS. Note that HTTPS monitoring does not verify whether your SSL certificate is valid; it only checks that a certificate is present.

  • Port – Choose the port used for the request. Standard HTTP and HTTPS ports are among the choices.

  • Relative path and file name – Give the path and the name of the file that the monitoring system will attempt to access. Note that a forward slash “/” is a valid entry for the relative path and implies that the file is in the root directory (default). For more information about configuring settings, see Configure Traffic Manager Monitoring.
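The three settings above can be pictured as a small configuration record. The following Python sketch is illustrative only: the class name MonitorSettings, the probe_url helper, the validation rules, and the example host name are all assumptions made for this example and are not part of any Traffic Manager API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitorSettings:
    """Hypothetical container for the three monitoring settings."""
    protocol: str = "HTTP"     # "HTTP" or "HTTPS"
    port: int = 80             # port used for the monitoring request
    relative_path: str = "/"   # "/" means the root directory (default)

    def __post_init__(self):
        # Validation rules below are assumptions for this sketch.
        if self.protocol.upper() not in ("HTTP", "HTTPS"):
            raise ValueError("protocol must be HTTP or HTTPS")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")
        if not self.relative_path.startswith("/"):
            raise ValueError("relative path must start with '/'")

    def probe_url(self, host: str) -> str:
        """Build the URL that a monitoring request would target on one endpoint."""
        return f"{self.protocol.lower()}://{host}:{self.port}{self.relative_path}"


# The same settings apply to every endpoint in the profile.
settings = MonitorSettings(protocol="HTTPS", port=443, relative_path="/health/probe.html")
url = settings.probe_url("myservice.example.net")
```

Because one settings record is shared by every endpoint in the profile, each endpoint needs to serve the same path on the same port and protocol.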

Azure Traffic Manager displays profile and endpoint service health in the Management Portal. The status column for both the profile and the endpoint displays the most recent monitor status. You can use this status to understand the health of your profiles according to your Traffic Manager monitoring settings. When your profile is healthy, DNS queries will be distributed to your services based on the load balancing settings for the profile (Round Robin, Performance, or Failover). Once the Traffic Manager monitoring system detects a change in monitor status, it updates the status entry in the Management Portal. It can take up to five minutes for the state change to refresh.

Endpoint Monitor status

The Endpoint Monitor status in the table below is the result of a combination of the endpoint health probe results and your profile and endpoint configurations.

 

| Profile status | Endpoint status | Endpoint Monitor status (API and Portal) | Notes |
|---|---|---|---|
| Disabled | Enabled | Inactive | Disabled profiles are not monitored. However, the endpoint status within disabled profiles can still be managed. |
| <any> | Disabled | Disabled | Disabled profiles are not monitored. However, the endpoint status within disabled profiles can still be managed. |
| Enabled | Enabled | Online | Endpoint is monitored and is healthy. |
| Enabled | Enabled | Degraded | Endpoint is monitored and is unhealthy. |
| Enabled | Enabled | CheckingEndpoint | Endpoint is monitored but the results of the first probe have not yet been received. This state is temporary when you’ve just added a new endpoint to the profile, or have just enabled an endpoint or profile. |
| Enabled | Enabled | Stopped | The underlying cloud service or website is not running. |
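The endpoint table can be read as a decision procedure. The sketch below is one possible encoding in Python; the function name, its parameters, and the order in which the Stopped and CheckingEndpoint cases are tested are assumptions for illustration, not the service's actual logic.

```python
from typing import Optional


def endpoint_monitor_status(profile_enabled: bool,
                            endpoint_enabled: bool,
                            service_running: bool = True,
                            probe_healthy: Optional[bool] = None) -> str:
    """Map configuration and probe results to an Endpoint Monitor status.

    probe_healthy is None while the first probe result has not been received.
    """
    if not endpoint_enabled:
        return "Disabled"          # applies for any profile status
    if not profile_enabled:
        return "Inactive"          # disabled profiles are not monitored
    if not service_running:
        return "Stopped"           # underlying cloud service or website is not running
    if probe_healthy is None:
        return "CheckingEndpoint"  # temporary state before the first probe result
    return "Online" if probe_healthy else "Degraded"
```

For example, an enabled endpoint in a disabled profile yields "Inactive", while a disabled endpoint yields "Disabled" regardless of the profile status.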

Profile Monitor status

The Profile Monitor status in the table below is the result of the combination of the endpoint monitor status and your configured profile status.

 

| Profile status (as configured) | Endpoint Monitor status | Profile Monitor status (API and Portal) | Notes |
|---|---|---|---|
| Disabled | <any> or a profile with no defined endpoints. | Disabled | Endpoints are not monitored. |
| Enabled | The status of at least one endpoint is “Degraded”. | Degraded | This is a flag that customer action is required. |
| Enabled | The status of at least one endpoint is “Online”. No endpoints are “Degraded”. | Online | The service is accepting traffic and customer action is not required. |
| Enabled | The status of at least one endpoint is “CheckingEndpoint”. No endpoints are “Online” or “Degraded”. | CheckingEndpoints | Transition state. This typically occurs when a profile has just been enabled and the endpoint health is being probed. |
| Enabled | The status of all endpoints defined in the profile is either “Disabled” or “Stopped”, or the profile has no defined endpoints. | Inactive | No endpoints are active, but the profile is still enabled. |
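The profile table describes a precedence order: Degraded outranks Online, which outranks CheckingEndpoint. A minimal Python sketch of that roll-up follows; the function name and signature are hypothetical and chosen only for this example.

```python
from typing import Iterable


def profile_monitor_status(profile_enabled: bool,
                           endpoint_statuses: Iterable[str]) -> str:
    """Roll up Endpoint Monitor statuses into a Profile Monitor status."""
    if not profile_enabled:
        return "Disabled"            # endpoints are not monitored
    statuses = set(endpoint_statuses)
    if "Degraded" in statuses:
        return "Degraded"            # at least one endpoint needs attention
    if "Online" in statuses:
        return "Online"              # at least one Online, none Degraded
    if "CheckingEndpoint" in statuses:
        return "CheckingEndpoints"   # transition state, no results yet
    # Only Disabled or Stopped endpoints remain, or there are no endpoints.
    return "Inactive"
```

Note that a single Degraded endpoint marks the whole profile as Degraded even when other endpoints are Online.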

An example timeline illustrating the monitoring process with a single cloud service is shown below. This scenario shows the following:

  • The cloud service is available and receiving traffic via this Traffic Manager profile ONLY.

  • The cloud service becomes unavailable.

  • The cloud service remains unavailable for a time much longer than the DNS Time-to-Live (TTL).

  • The cloud service becomes available again.

  • The cloud service resumes receiving traffic via this Traffic Manager profile ONLY.

Traffic Manager Monitoring Sequence

Figure 1 – Monitoring sequence example. The numbers in the diagram correspond to the numbered explanation below.

  1. GET – The Traffic Manager monitoring system performs a GET on the path and file you specified in the monitoring settings.

  2. 200 OK – The monitoring system expects an HTTP 200 OK message back within 10 seconds. When it receives this response, it assumes that the cloud service is available.

    Note
    Traffic Manager only considers an endpoint to be Online if the return message is a 200 OK. If a non-200 response is received, it will assume the endpoint is not available and will count this as a failed check.

  3. 30 seconds between checks – This check will be performed every 30 seconds.

  4. Cloud service unavailable – The cloud service becomes unavailable. Traffic Manager will not know until the next monitor check.

  5. Attempts to access monitoring file (4 tries) – The monitoring system performs a GET, but does not receive a response in 10 seconds or less. It then performs three more tries at 30-second intervals. This means that it takes at most approximately 1.5 minutes (three 30-second intervals after the first failed try) for the monitoring system to detect that a service has become unavailable. If one of the tries is successful, the number of tries is reset. Although not shown in the diagram, if a 200 OK message comes back more than 10 seconds after the GET, the monitoring system still counts this as a failed check.

  6. Marked degraded – After the fourth failure in a row, the monitoring system will mark the unavailable cloud service as Degraded.

  7. Traffic to cloud service decreases – Traffic may continue to flow to the unavailable cloud service. Clients will experience failures because the service is unavailable. Clients and secondary DNS servers have cached the DNS record for the IP address of the unavailable cloud service. They continue to resolve the DNS name of the company domain to the IP address of the service. In addition, secondary DNS servers may still hand out the DNS information of the unavailable service. As clients and secondary DNS servers are updated, traffic to the IP address of the unavailable service will slow. The monitoring system continues to perform checks at 30 second intervals. In this example, the service does not respond and remains unavailable.

  8. Traffic to cloud service stops – By this time, most DNS servers and clients should be updated and traffic to the unavailable service stops. The maximum amount of time before traffic completely stops depends on the TTL. The default DNS TTL is 300 seconds (5 minutes). Using this value, clients stop using the service after 5 minutes. The monitoring system continues to perform checks at 30-second intervals and the cloud service does not respond.

  9. Cloud service comes back online and receives traffic – The service becomes available, but Traffic Manager does not know until the monitoring system performs a check.

  10. Traffic to service resumes – Traffic Manager sends a GET and receives a 200 OK in under 10 seconds. It then begins to hand out the cloud service’s DNS name to DNS servers as they request updates. As a result, traffic starts to flow to the service once again.
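The failure-counting behavior in the sequence above can be simulated in a few lines of Python. This is a sketch under stated assumptions: the constants come from the values quoted in the text (30-second interval, 10-second timeout, four consecutive failures, 300-second default TTL), while the function names and the choice to start from an Online state are hypothetical.

```python
from typing import Iterable, Iterator, Optional

PROBE_INTERVAL_S = 30          # time between checks
PROBE_TIMEOUT_S = 10           # a response must arrive within this time
FAILURES_BEFORE_DEGRADED = 4   # consecutive failed checks before Degraded
DEFAULT_DNS_TTL_S = 300        # default DNS TTL (5 minutes)


def probe_succeeded(status_code: Optional[int], elapsed_s: float) -> bool:
    """A check passes only for a 200 OK received within the timeout."""
    return status_code == 200 and elapsed_s <= PROBE_TIMEOUT_S


def track_status(probe_results: Iterable[bool],
                 initial: str = "Online") -> Iterator[str]:
    """Yield the monitor status after each check (True means the check passed)."""
    consecutive_failures = 0
    status = initial
    for ok in probe_results:
        if ok:
            consecutive_failures = 0   # one success resets the count
            status = "Online"
        else:
            consecutive_failures += 1
            if consecutive_failures >= FAILURES_BEFORE_DEGRADED:
                status = "Degraded"
        yield status


def detection_window_s() -> int:
    """Seconds from the first failed check to the fourth (three intervals)."""
    return (FAILURES_BEFORE_DEGRADED - 1) * PROBE_INTERVAL_S


history = list(track_status([True, False, False, False, False, True]))
# Rough estimate: first failed check until cached DNS records expire.
drain_estimate_s = detection_window_s() + DEFAULT_DNS_TTL_S
```

In this simulation the endpoint stays Online through three failed checks, becomes Degraded on the fourth, and returns to Online on the next successful check. The drain estimate is a simplification: it ignores the time between the outage and the first failed check and assumes every cache honors the TTL.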

© 2014 Microsoft