01/04/2019

September 2018

Volume 33 Number 9

[Azure]

Managing Event Delivery with Azure Event Grid

Event Grid is a fully managed messaging service in Microsoft Azure that provides an innovative approach for the routing of events in the cloud and beyond. It has unlocked new and unique patterns for how event-driven solutions are designed with a powerful and flexible publish-subscribe model.

I provided an introduction to Azure Event Grid in the February 2018 issue (aka.ms/eventgridarticle) that explores the fundamentals of the service and how it can be used to publish and consume events in various ways. In this new article, I’ll delve into how events are delivered and what the options are for retry policies, invalid events and events that aren’t successfully delivered. Before I dive in, it will be helpful to go over how Event Grid works at a high level.

A Brief Overview of Event Grid

With Azure Event Grid, event sources can originate from a growing list of services in Azure, such as Event Hubs and Media Services, or even from a custom application that’s running on-premises or within another cloud provider or datacenter. These events are ingested and managed by Event Grid, which distributes them to each event subscription. Event handlers take action on incoming events and can be services in Azure. A very popular scenario is one that leverages other serverless technologies such as Functions and Logic Apps. Together, these highly scalable and flexible solutions can be composed very quickly and affordably without the burden of managing any infrastructure.

An event handler can also be a simple WebHook, which means that just like an event source, it can reside anywhere as long as it supports HTTPS and can accept a POST request. This platform- and language-agnostic approach is one of the many options that make Event Grid a very special service that has only just begun to open up new solutions in the cloud. Figure 1 illustrates how Event Grid is used to connect multiple sources and handlers. The list of services in Azure that integrate with Event Grid is constantly growing and only a subset is depicted in the diagram.

Figure 1 Azure Event Grid Overview

Event Delivery and Response Codes

Event Grid treats each event independently. This means that there isn’t a guaranteed order for the events and, in some cases, an event can be delivered more than once. Therefore, it’s the responsibility of the event handler to code defensively and be idempotent. Sending the same event repeatedly should produce the same result. If the ordering of events is a requirement, this should be managed on either the compute side (within the logic of the event handler) or by using another service such as Service Bus or Event Hubs to preserve their order.
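To make the idempotency requirement concrete, here is a minimal shell sketch of a handler that ignores an event ID it has already processed. The handle_event function and the temp-file store are hypothetical stand-ins, not part of Event Grid; a real handler would likely track IDs in durable storage.

```shell
# Hypothetical store of already-processed event IDs (a temp file for this sketch).
seen_file=$(mktemp)

# handle_event ID SUBJECT: act on an event only the first time its ID is seen.
handle_event() {
  if grep -qxF -- "$1" "$seen_file"; then
    echo "skip $1"                       # duplicate delivery: no side effects
  else
    printf '%s\n' "$1" >> "$seen_file"   # remember the ID before acting on it
    echo "process $1 ($2)"
  fi
}

handle_event 1001 genre/blues   # first delivery is processed
handle_event 1001 genre/blues   # repeated delivery is skipped
```

Keying on the event ID means a redelivered event leaves the system in the same state as the first delivery did.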

The HTTP response that’s returned by the event handler to Event Grid will determine how it will proceed with the management of the event. The status codes 200 OK and 202 Accepted are considered to be an acknowledgement of a successfully delivered event.

Any of the following failure codes are indicative of a failed delivery attempt: 400 Bad Request, 401 Unauthorized, 404 Not Found, 408 Request Timeout, 414 URI Too Long, 500 Internal Server Error, 503 Service Unavailable and 504 Gateway Timeout. Depending on the failure code, Event Grid might retry sending the event to the endpoint. I’ll dig into this in just a bit.
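The two groups of status codes listed above can be summarized in a small lookup. This is only an illustrative sketch: classify_status is a hypothetical helper, and any code the article doesn’t mention is reported as "unlisted" rather than guessed at.

```shell
# classify_status CODE: map an HTTP status code to the outcome described in the text.
classify_status() {
  case "$1" in
    200|202) echo "delivered" ;;                        # acknowledged by the handler
    400|401|404|408|414|500|503|504) echo "failed" ;;   # failed delivery attempt
    *) echo "unlisted" ;;                               # not covered by this article
  esac
}

classify_status 202
classify_status 503
```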

Retry Policies

If an acknowledgement isn’t received or an error code is returned, another attempt will be made to send the event to the endpoint. A retry policy that employs an exponential backoff is then put into place to attempt a final delivery of the event before it expires. By default, an event will expire after 24 hours of unsuccessful delivery. After the first attempt, the delivery schedule will back off, using the following timeline: 10 seconds, 30 seconds, 1 minute, 5 minutes, 10 minutes, 30 minutes and 1 hour. After the first one-hour attempt, each subsequent request is made once per hour until the time-to-live (TTL) is reached.

A new feature in Event Grid allows you to configure the retry policy for an event subscription by setting two possible values:

Max delivery attempts is a configurable value that sets the maximum retry attempts for an event subscription. Its default value, and maximum allowed, is 30.

Event TTL is a value that corresponds to the time-to-live setting for an event. Its default value, as well as its maximum setting, is 1,440 minutes (24 hours).

Either of these properties can be used to manage the retry policy at the event subscription level. This can be useful when an event is interesting to you only for a short period of time, or when you’re intentionally returning an error such as 503 to protect yourself during high load.
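To see how the back-off timeline interacts with the two settings, the following sketch counts how many delivery attempts fit before either limit is reached. It rests on two assumptions that the article doesn’t spell out: each value in the timeline is the wait before the next attempt, and an attempt is still made when the elapsed time equals the TTL. The count_attempts function is hypothetical and only models the schedule; it doesn’t call Event Grid.

```shell
# count_attempts TTL_SECONDS MAX_ATTEMPTS: number of delivery attempts under
# the back-off timeline (assumed to be the waits between consecutive attempts).
count_attempts() (
  ttl=$1
  max=$2
  elapsed=0
  attempts=1                             # the first attempt happens immediately
  set -- 10 30 60 300 600 1800 3600      # waits in seconds
  while [ "$attempts" -lt "$max" ]; do
    delay=$1
    if [ $# -gt 1 ]; then shift; fi      # after the list runs out, stay hourly
    elapsed=$((elapsed + delay))
    if [ "$elapsed" -gt "$ttl" ]; then break; fi   # event expired first
    attempts=$((attempts + 1))
  done
  echo "$attempts"
)

count_attempts 86400 30   # default TTL (24 hours) and default attempt cap
count_attempts 60 30      # one-minute TTL: expiry ends the retries early
```

Whichever limit is reached first ends the retries, so a short TTL can stop delivery well before the attempt cap is reached.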

Invalid and Dead Letter Channels

One of the primary responsibilities of a messaging service is to ensure that a message is delivered properly. However, it can’t guarantee that the receiver of that message will handle it successfully.

In some cases, the receiver of an event might reject the message if it doesn’t meet certain expectations. This could come in the form of an invalid data type or payload, or an unauthorized message. When this happens, the message is typically moved to a designated location, often referred to as an invalid message channel.

Another common scenario is a message that’s successfully sent but can’t be processed due to an error on the receiving end. These are typically the 500-level status codes that indicate a service error or service unavailability. Usually, the messaging service will retry sending these messages until a threshold is met and the message is deemed undeliverable. A dead letter channel is used, similar to the invalid message channel, to store these messages along with any relevant metadata.

The desire in both scenarios is that there will be some utility or service that monitors the channels and will know what to do with the messages. This could surface in a scheduled report or perhaps another event-driven service, such as a Logic App, that can pick up the message and notify end users or other systems and services.

Dead Lettering with Event Grid

A highly requested feature for Azure Event Grid shortly after it was released was to provide a mechanism for capturing events that couldn’t be delivered. This feature has recently come to fruition with the ability to configure the retry policies and dead letter location for each event subscription. With this feature, I can now configure the dead letter channel to an Azure Storage Account that will capture the message and relevant details about its delivery. This same endpoint will serve as both a dead letter and invalid message channel. Figure 2 illustrates how dead letter events are now supported in Azure Event Grid.

Figure 2 Dead Letter Events

I want to call out a couple of important details before putting a sample together that uses both the retry policies and dead letter delivery.

Notice that the arrows coming from the Event Grid topic are going in the direction of the handlers. I wanted to point this out to reinforce the notion that Event Grid is a push-push model. Event handlers provide a webhook that’s called by Event Grid when it’s time to send an event. This important design decision emphasizes the event-driven nature of the service. Unlike other messaging services, there’s no longer a need to resort to long-polling or hammer-polling techniques to check if a message is available. Instead, a handler can rely on this model to be notified of new events, which is another reason why serverless technologies such as Functions and Logic Apps for event handlers are appealing. Let’s put this into practice.

Setup

Now I’ll walk you through the steps of setting up and configuring both the retry policies and a dead letter endpoint. In addition, it would be great to also inspect the dead letter events and react to them immediately after they become available.

Everything I’m going to set up in Azure will be done from the Azure Cloud Shell. This will ensure that you can do it from any machine and not be at the mercy of any particular tools or other dependencies. Details about Azure Cloud Shell can be found at bit.ly/2CsFtQB.

At the time of this writing, this feature is in preview, though it most likely will be released before or shortly after the article is published. While it’s in preview, you’ll have to install the Event Grid extension to use it in the command-line interface (CLI):

az extension add --name eventgrid

With that out of the way, I’m going to initialize a few local variables that will be used repeatedly throughout the exercise:

rgname=msdndemo
topicname=<your-unique-grid-topic-name>
storagename=<your-unique-storage-account-name>
containername=deadletterevents

I’m going to create a resource group, storage account and a container that will be used to receive the dead letter events, as shown in Figure 3.

Figure 3 Creating a Resource Group, Storage Account and Container

# create resource group
az group create --name $rgname -l westus2
# create storage account
az storage account create \
  --name $storagename \
  --location westus2 \
  --resource-group $rgname \
  --sku Standard_LRS \
  --kind StorageV2 \
  --access-tier Hot
# create storage container
export AZURE_STORAGE_ACCOUNT=$storagename
export AZURE_STORAGE_ACCESS_KEY="$(az storage account keys list --account-name $storagename \
  --resource-group $rgname --query "[0].value" --output tsv)"
az storage container create --name $containername

Next, I need the storage account’s resource ID. This will be used, along with the container name, to define the dead letter endpoint:

# storage ID
storageid=$(az storage account show --name $storagename \
  --resource-group $rgname --query id --output tsv)
# container ID for dead letter channel
containerid="$storageid/blobservices/default/containers/$containername"

I’ll be sending the events to a custom Event Grid topic. Let’s create the topic and save the endpoint address and access key. These values will be leveraged later when publishing events:

# create custom topic
az eventgrid topic create -g $rgname --name $topicname -l westus2
# save topic endpoint
topicendpoint=$(az eventgrid topic show --name $topicname -g $rgname \
  --query "endpoint" --output tsv)
# save topic key
topickey=$(az eventgrid topic key list --name $topicname -g $rgname \
  --query "key1" --output tsv)

Azure Functions Event Handler

In this scenario, I’ll pretend that users are sending song requests to a fictitious radio station that specializes in Blues music. As requests come in, they’re put onto a playlist for the station. However, if the genre of the song doesn’t match the target music for the radio station, it will be rejected and placed on the dead letter channel.

The event handler is an Azure Function built on the version 2 runtime. It will inspect the Subject field of the event and approve or reject the song request accordingly. The code for the Azure Function is shown in Figure 4.

Figure 4 The Azure Function Event Handler

using System.IO;
using System.Linq;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;
using Microsoft.AspNetCore.Http;
using Microsoft.Azure.EventGrid.Models;
using Microsoft.Azure.WebJobs.Host;
using Newtonsoft.Json;
using Microsoft.Extensions.Logging; 

namespace WeWantTheFunc
{
  public static class SongRequestHandler
  {
    [FunctionName("SongRequestHandler")]
    public static IActionResult Run(
      [HttpTrigger(AuthorizationLevel.Function, "post", Route = null)]
      HttpRequest req, ILogger log)
    {
      // Get the body of the request
      var requestBody = new StreamReader(req.Body).ReadToEnd();
      // Check the header for the event type           
      if (!req.Headers.TryGetValue("Aeg-Event-Type", out var headerValues))
        return new BadRequestObjectResult("Not a valid request");
      var eventTypeHeaderValue = headerValues.FirstOrDefault();
      if (eventTypeHeaderValue == "SubscriptionValidation")
      {
        // Validate the subscription
        var events = JsonConvert.DeserializeObject<EventGridEvent[]>(requestBody);
        dynamic data = events[0].Data;
        var validationCode = data["validationCode"];
        return new JsonResult(new
        {
          validationResponse = validationCode
        });
      }
      else if (eventTypeHeaderValue == "Notification")
      {
        // Handle the song request
        log.LogInformation(requestBody);
        var events = JsonConvert.DeserializeObject<EventGridEvent[]>(requestBody);
        // Reject the request if it does not
        // match the genre for the station.
        if (events[0].Subject != "genre/blues")
          return new BadRequestObjectResult("Sorry, this is a Blues station");
        return new OkObjectResult("");
      }
      return new BadRequestObjectResult("Not a valid request");
    }
  }
}

This function has two parts. The first checks to see if the event coming in is intended for validating the endpoint. If it’s a validation request, the function returns the validation code to prove ownership and acceptance of incoming messages.

The second part of the function is for event notifications from Event Grid. When the subject doesn’t match the station’s genre, the function returns a bad request response (400), which makes an explicit statement to no longer send the event to the handler. This brings me to the most important part of the process: creating an event subscription.
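Before creating the subscription, it can help to exercise the validation branch of the handler by hand. The sketch below builds a validation event in the Event Grid schema and posts it with the same Aeg-Event-Type header the function checks. The functionurl placeholder, the validation_body and send_validation helpers, and the sample code value are all hypothetical.

```shell
# Hypothetical placeholder: replace with the deployed function's URL (including its key).
functionurl="<your-azure-function-url>"

# validation_body CODE: emit a one-event array carrying the validation code.
validation_body() {
  cat <<EOF
[{
  "id": "1",
  "topic": "",
  "subject": "",
  "eventType": "Microsoft.EventGrid.SubscriptionValidationEvent",
  "eventTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "data": { "validationCode": "$1" },
  "dataVersion": "1",
  "metadataVersion": "1"
}]
EOF
}

# send_validation CODE: post the event with the header the handler inspects.
send_validation() {
  curl -s -X POST "$functionurl" \
    -H "Content-Type: application/json" \
    -H "Aeg-Event-Type: SubscriptionValidation" \
    -d "$(validation_body "$1")"
}

validation_body abc123   # print the payload; run send_validation abc123 to post it
```

If the handler from Figure 4 is deployed at that URL, the response should carry the same code back in a validationResponse property.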

Subscribing to Events

With all the pieces in place, it’s finally time to create an event subscription that demonstrates both the retry policies and dead lettering feature. An assumption is that the Azure Function in Figure 4 has been deployed and is currently running in Azure. The command to create the subscription is as follows:

az eventgrid event-subscription create \
  --endpoint <your-azure-function-url> \
  --topic-name $topicname \
  -g $rgname \
  --name song-request-sub \
  --deadletter-endpoint $containerid \
  --max-delivery-attempts 2 \
  --event-ttl 1

Notice that both the max-delivery-attempts and event-ttl settings are included. It’s not a requirement to include both settings, but they can be used together to configure the retry policies. I’m setting the maximum number of delivery attempts to two and the time to live to one minute. Another new argument called deadletter-endpoint is initialized to the storage account container using the variable that was created earlier. It’s time to send some events and see this working end-to-end.

Sending Events

From the CLI, I can copy a sample body that contains a song request. This request will actually contain an invalid music genre (Rock) that will be rejected by the event handler.

# copy the request body
body=$(eval echo "'$(curl https://raw.githubusercontent.com/dbarkol/azure-event-grid-patterns/master/badsongrequest.json)'")
# post the request to the custom topic endpoint
curl -X POST -H "aeg-sas-key: $topickey" -d "$body" $topicendpoint
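The contents of the downloaded request body aren’t shown here, so as a rough sketch, this is how a comparable request could be built inline in the Event Grid event schema. The make_song_request and post_song_request helpers, the songRequested event type and the data fields are hypothetical; only the subject format (genre/<name>) comes from the handler in Figure 4.

```shell
# make_song_request GENRE: emit a one-event array in the Event Grid schema.
make_song_request() {
  cat <<EOF
[{
  "id": "$(date +%s)-$$",
  "eventType": "songRequested",
  "subject": "genre/$1",
  "eventTime": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
  "data": { "artist": "Example Artist", "song": "Example Song" },
  "dataVersion": "1.0"
}]
EOF
}

# post_song_request GENRE: publish to the custom topic created earlier
# (relies on the topickey and topicendpoint variables from the Setup section).
post_song_request() {
  curl -s -X POST -H "aeg-sas-key: $topickey" \
    -H "Content-Type: application/json" \
    -d "$(make_song_request "$1")" "$topicendpoint"
}

make_song_request rock   # print a request the Blues station would reject
```

Calling post_song_request blues instead produces a subject the handler accepts, which makes it easy to compare the two outcomes.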

The expectation is that a few minutes after the message is sent, it will end up in the storage account container that I configured as the dead letter channel. You can use a tool like Azure Storage Explorer to see the new blob that’s created for each dead-lettered message. However, that isn’t very exciting. As an alternative, I want to be notified when a new message is added to the dead letter channel.

Inspecting Dead Letter Events

Because the dead letter event is just a new blob created in a container, Event Grid can be used to kick off another workflow that’s triggered when the file is created. Figure 5 demonstrates the workflow for how an event can originate from a storage account and ultimately be received by a Logic App and send an email.

Figure 5 Blob Events Handled by Logic Apps

The only thing I’ll need to put this together is a Logic App that begins with an Event Grid trigger. The first three steps in the application comprise the following actions and trigger:

Event Grid trigger is configured for the storage account. It uses two filters—one for the event type (Microsoft.Storage.BlobCreated) and the second for the container using the prefix filter option. Both of these filters ensure that the app is only invoked when a new file is created within the dedicated dead letter container.

Initialize variable is the next action that retrieves the URL from the body of the Event Grid message. Its value is the following expression:

triggerBody()?['data']['url']

Get blob content using path is the action that will read the contents of the blob. I format the value of the blob path using this string manipulation expression to remove the portion that isn’t needed:

replace(variables('DeadLetterUrl'),
  'https://<storage-account-name>.blob.core.windows.net', '')

These initial steps are shown in Figure 6.

Figure 6 An Event Grid-Triggered Logic App
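The trigger filters and the path expression described above can be approximated in shell to show what each one does. This is a sketch, not Logic Apps syntax: matches_filters and blob_path are hypothetical helpers, and the container name is the one used earlier in this article.

```shell
# Subject prefix for blobs in the dead letter container (blob event subject format).
deadletter_prefix="/blobServices/default/containers/deadletterevents"

# matches_filters EVENT_TYPE SUBJECT: succeed only when both trigger filters pass.
matches_filters() {
  [ "$1" = "Microsoft.Storage.BlobCreated" ] || return 1
  case "$2" in
    "$deadletter_prefix"/*) return 0 ;;   # blob lives in the dead letter container
    *) return 1 ;;
  esac
}

# blob_path URL: drop the scheme and storage host, keeping /container/blob.
blob_path() {
  printf '%s\n' "${1#https://*.blob.core.windows.net}"
}

blob_path "https://example.blob.core.windows.net/deadletterevents/sample.json"
```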

Here are the final actions of the Logic App:

Parse JSON will evaluate the contents of the blob and provide a set of variables I can reference later. The expression for the Content property is:

json(body('Get_blob_content_using_path'))

I’ll also need to provide a sample payload that looks like:

[{
  "id":"100",
  "eventTime":"2017-08-21T06:42:20.0000000+00:00",
  "eventType":"type",
  "dataVersion":"",
  "metadataVersion":"1",
  "topic":"endpoint",
  "subject":"testsubject",
  "deadLetterReason":"reason",
  "deliveryAttempts":1,
  "lastDeliveryOutcome":"BadRequest",
  "lastHttpStatusCode":400,
  "data":{"something":"data"}
}]

Send an email is the last action, which formats the body and subject of the email with the artifacts of the events (dead letter reason and subject). Because the payload actually contains an array, Logic Apps recognizes that and wraps the action in a for each loop.

As a side note, each step within a for each loop is done in parallel instead of sequentially in Logic Apps; this is the default behavior that can be changed to run sequentially, if desired. The end result is displayed in Figure 7.

Figure 7 Parsing JSON and Sending Notifications
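Outside of Logic Apps, the same parse-and-loop step can be sketched with jq, which is assumed to be installed (it is available in Azure Cloud Shell). The summarize_dead_letters function and the deadletter.json file name are hypothetical; the field names come from the sample payload.

```shell
# summarize_dead_letters: read a dead letter blob (a JSON array) on stdin and
# print one line per event, much like the for each loop around the email action.
summarize_dead_letters() {
  jq -r '.[] | "\(.subject): \(.deadLetterReason) (HTTP \(.lastHttpStatusCode))"'
}

# Usage, assuming the blob was saved locally as deadletter.json:
#   summarize_dead_letters < deadletter.json
```

Because the filter iterates over the array, a blob that contains several events produces one line for each of them.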

Testing this end-to-end will result in an email being sent for each dead letter event.

Wrapping Up

With its original design and deep integration into a growing list of Azure services, Event Grid is only beginning to reveal innovative ways to build event-driven solutions. In this article, I demonstrated how to leverage the new retry policies and dead lettering functionality in Azure Event Grid. These highly anticipated features are powerful options for making solutions built on Event Grid more resilient and scalable. The code in this article can be found at bit.ly/2LGKjhN.

David Barkol is an Azure specialist at Microsoft on the Global Black Belt Team. You can contact him on Twitter: @dbarkol or through email at dabarkol@microsoft.com. He blogs regularly about Event Grid at madeofstrings.com.

Thanks to the following Microsoft technical expert for reviewing this article: Bahram Banisadr

Bahram Banisadr is the PM in charge of Azure Event Grid working to build the connective tissue for Azure services.
