Updated: April 15, 2015
The load balancer probe is a customer-defined health probe of UDP endpoints and endpoints in role instances. The LoadBalancerProbe is not a standalone element; it is combined with the web role or worker role in a service definition file. A LoadBalancerProbe can be used by more than one role.
The default extension for the service definition file is .csdef.
The Azure Load Balancer is responsible for routing incoming traffic to your role instances. The load balancer determines which instances can receive traffic by probing each instance regularly to determine its health. The load balancer probes every instance multiple times per minute. There are two options for providing instance health to the load balancer: the default load balancer probe, or a custom load balancer probe, which is implemented by defining the LoadBalancerProbe in the .csdef file.
The default load balancer probe uses the Guest Agent inside the virtual machine, which listens and responds with an HTTP 200 OK response only when the instance is in the Ready state (i.e., the instance is not in the Busy, Recycling, Stopping, or similar states). If the Guest Agent fails to respond with HTTP 200 OK, the Azure Load Balancer marks the instance as unresponsive and stops sending traffic to that instance. The Azure Load Balancer continues to probe the instance, and if the Guest Agent responds with an HTTP 200, the Azure Load Balancer sends traffic to that instance again. When you use a web role, your website code typically runs in w3wp.exe, which is not monitored by the Azure fabric or the Guest Agent. Failures in w3wp.exe (e.g., HTTP 500 responses) are therefore not reported to the Guest Agent, and the load balancer will not know to take that instance out of rotation.
The custom load balancer probe overrides the default guest agent probe and allows you to create your own custom logic to determine the health of the role instance. The load balancer will regularly probe your endpoint (every 15 seconds, by default) and the instance will be considered in rotation if it responds with a TCP ACK or HTTP 200 within the timeout period (default of 31 seconds). This can be useful to implement your own logic to remove instances from load balancer rotation, for example returning a non-200 status if the instance is above 90% CPU. For web roles using w3wp.exe this also means you get automatic monitoring of your website since failures in your website code will return a non-200 status to the load balancer probe. If you do not define a LoadBalancerProbe in the .csdef file then the default load balancer behavior as described above will be used.
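The custom health logic described above can be sketched as a small HTTP endpoint. This is a minimal illustration in Python rather than role code: the names `probe_status`, `make_probe_handler`, and `probe_once`, the `/health` path, and the way CPU usage is supplied are all assumptions made for the example. Only the 90% threshold and the "200 means healthy, anything else means unhealthy" rule come from the text.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

CPU_LIMIT_PERCENT = 90.0  # threshold taken from the example in the text


def probe_status(cpu_percent, limit=CPU_LIMIT_PERCENT):
    """Return 200 while CPU is at or below the limit, otherwise 503."""
    return 200 if cpu_percent <= limit else 503


def make_probe_handler(read_cpu_percent):
    """Build a handler that answers every GET using read_cpu_percent()."""

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            self.send_response(probe_status(read_cpu_percent()))
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep probe traffic out of stderr

    return ProbeHandler


def probe_once(host, port, path="/health", timeout=5.0):
    """Send one HTTP probe, as a load balancer would, and return the status."""
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()


if __name__ == "__main__":
    cpu = {"value": 42.0}  # hypothetical CPU reading, replaced by a real one in practice
    server = HTTPServer(("127.0.0.1", 0), make_probe_handler(lambda: cpu["value"]))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(probe_once("127.0.0.1", server.server_port))
    cpu["value"] = 97.0
    print(probe_once("127.0.0.1", server.server_port))
    server.shutdown()
    server.server_close()
```

Passing the CPU reader in as a callable keeps the health decision separate from how the measurement is taken, so the same handler can serve any other health signal.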
If you use a custom load balancer probe, make sure your logic takes the RoleEntryPoint.OnStop method into account. With the default load balancer probe, the instance is taken out of rotation before OnStop is called, but a custom load balancer probe can continue to return a 200 OK while OnStop runs. If you use OnStop to clean up the cache, stop a service, or otherwise make changes that affect the runtime behavior of your service, ensure that your custom probe logic removes the instance from rotation first.
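One way to handle the stop case is to have the probe start failing before any cleanup begins. The sketch below is a language-neutral illustration in Python, not the Azure runtime API: `ProbeGate` and `stop_instance` are hypothetical names, and waiting one full timeout period (31 seconds, the default from this document) before cleanup is an assumption about how long the load balancer needs to notice the failing probe.

```python
import threading
import time

DEFAULT_TIMEOUT_SECONDS = 31  # default probe timeout stated in this document


class ProbeGate:
    """Tracks whether the probe endpoint should still report healthy."""

    def __init__(self):
        self._stopping = threading.Event()

    def begin_stop(self):
        """Make every later probe fail so the instance leaves rotation."""
        self._stopping.set()

    def status(self, healthy=True):
        """Return the HTTP status the probe endpoint should send."""
        if self._stopping.is_set() or not healthy:
            return 503
        return 200


def stop_instance(gate, cleanup, drain_seconds=DEFAULT_TIMEOUT_SECONDS,
                  wait=time.sleep):
    """Fail the probe, wait for the drain period, then run cleanup."""
    gate.begin_stop()
    wait(drain_seconds)  # assumed: one timeout period is enough to drain
    cleanup()
```

In a role, the body of `stop_instance` would correspond to what runs inside OnStop, with the probe endpoint consulting the same gate on every request.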
The basic format of a service definition file containing a load balancer probe is as follows.
<ServiceDefinition …>
   <LoadBalancerProbes>
      <LoadBalancerProbe name="<load-balancer-probe-name>" protocol="[http|tcp]" path="<uri-for-checking-health-status-of-vm>" port="<port-number>" intervalInSeconds="<interval-in-seconds>" timeoutInSeconds="<timeout-in-seconds>"/>
   </LoadBalancerProbes>
</ServiceDefinition>
The LoadBalancerProbes element describes the collection of load balancer probes. This element is the parent element of the LoadBalancerProbe element.
The LoadBalancerProbe element defines the health probe for a model. You can define multiple load balancer probes.
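To make the element structure concrete, the following sketch reads every LoadBalancerProbe from a service definition and fills in the documented defaults. It is an illustration only: the sample service, the probe names, and the helper `read_probes` are invented for the example, and the tag matching ignores XML namespaces so it works whether or not the file declares one.

```python
import xml.etree.ElementTree as ET

DEFAULT_INTERVAL_SECONDS = 15  # documented default for intervalInSeconds
DEFAULT_TIMEOUT_SECONDS = 31   # documented default for timeoutInSeconds

# Hypothetical service definition with two probes, for illustration only.
SAMPLE_CSDEF = """
<ServiceDefinition name="SampleService">
  <LoadBalancerProbes>
    <LoadBalancerProbe name="web-health" protocol="http" path="/health"
                       port="80" intervalInSeconds="5" timeoutInSeconds="11" />
    <LoadBalancerProbe name="worker-tcp" protocol="tcp" />
  </LoadBalancerProbes>
</ServiceDefinition>
"""


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def read_probes(csdef_text):
    """Return one dict per LoadBalancerProbe element, with defaults applied."""
    root = ET.fromstring(csdef_text)
    probes = []
    for element in root.iter():
        if local_name(element.tag) != "LoadBalancerProbe":
            continue
        port = element.get("port")
        probes.append({
            "name": element.get("name"),
            "protocol": element.get("protocol"),
            "path": element.get("path"),
            # None means the port is inherited from the endpoint.
            "port": int(port) if port is not None else None,
            "interval": int(element.get("intervalInSeconds",
                                        DEFAULT_INTERVAL_SECONDS)),
            "timeout": int(element.get("timeoutInSeconds",
                                       DEFAULT_TIMEOUT_SECONDS)),
        })
    return probes


if __name__ == "__main__":
    for probe in read_probes(SAMPLE_CSDEF):
        print(probe)
```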
The following table describes the attributes of the LoadBalancerProbe element:
| Attribute | Description |
| --- | --- |
| name | Required. The name of the load balancer probe. The name must be unique. |
| protocol | Required. Specifies the protocol of the endpoint. Possible values are http or tcp. If tcp is specified, a received ACK is required for the probe to be successful. If http is specified, a 200 OK response from the specified URI is required for the probe to be successful. |
| path | The URI used for requesting health status from the VM. path is required if protocol is set to http. Otherwise, it is not allowed. There is no default value. |
| port | Optional. The port used for the probe. If it is omitted, the probe uses the same port as the endpoint. You can also configure a different port for probing. Possible values range from 1 to 65535, inclusive. The default value is set by the endpoint. |
| intervalInSeconds | Optional. The interval, in seconds, at which the endpoint is probed for health status. Typically, the interval is slightly less than half the allocated timeout period (in seconds), which allows two full probes before the instance is taken out of rotation. The default value is 15; the minimum value is 5. |
| timeoutInSeconds | Optional. The timeout period, in seconds, applied to the probe. If no response is received within this period, further traffic is no longer delivered to the endpoint. This value allows endpoints to be taken out of rotation faster or slower than the typical times used in Azure (which are the defaults). The default value is 31; the minimum value is 11. |
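The attribute rules above can be restated as a small checker, which also shows why the defaults of 15 and 31 allow two full probes per timeout period. This is a sketch of the documented constraints only, not the validation Azure itself performs. The function names are hypothetical, and uniqueness of probe names is not checked.

```python
def validate_probe(name, protocol, path=None, port=None,
                   interval=15, timeout=31):
    """Return a list of rule violations; an empty list means none were found."""
    errors = []
    if not name:
        errors.append("name is required")
    if protocol not in ("http", "tcp"):
        errors.append("protocol must be http or tcp")
    if protocol == "http" and not path:
        errors.append("path is required when protocol is http")
    if protocol == "tcp" and path is not None:
        errors.append("path is not allowed when protocol is tcp")
    if port is not None and not 1 <= port <= 65535:
        errors.append("port must be between 1 and 65535")
    if interval < 5:
        errors.append("intervalInSeconds must be at least 5")
    if timeout < 11:
        errors.append("timeoutInSeconds must be at least 11")
    return errors


def full_probes_before_timeout(interval, timeout):
    """Count how many whole probe intervals fit inside the timeout period."""
    return timeout // interval


if __name__ == "__main__":
    print(validate_probe("web-health", "http", path="/health", port=80))
    print(full_probes_before_timeout(15, 31))
```

With the defaults, 31 // 15 is 2, and the minimum values of 5 and 11 give the same result, which matches the guidance that the interval should be slightly less than half the timeout.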