Microsoft Azure provides a fully managed event routing service called Azure Event Grid. It lets you respond to events received from various Azure services and external applications and forward them to different Azure services and endpoints.
Azure Event Grid provides a unified way to manage events in Azure with event-driven programming. With Event Grid, you can build event-driven applications in a serverless environment, reducing cost and latency.
Event sources send events to Event Grid topics. Each event is routed to every Event Grid subscription whose event filters match it, and is then forwarded to the endpoint specified in that subscription.
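The filter-based routing described above can be sketched in a few lines of Python. This is a simplified illustration, not Event Grid’s implementation: the SubscriptionFilter class, the route function, and the sample subscription names and event are hypothetical, and the matching here is case-sensitive for simplicity. The three filter kinds (included event types, subject prefix, subject suffix) mirror the basic filters an Event Grid subscription supports.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionFilter:
    """Simplified stand-in for the filter of one Event Grid subscription."""
    name: str
    included_event_types: list = field(default_factory=list)  # empty = all types
    subject_begins_with: str = ""
    subject_ends_with: str = ""

    def matches(self, event: dict) -> bool:
        # Reject the event if a type filter is set and the type is not listed.
        if self.included_event_types and event["eventType"] not in self.included_event_types:
            return False
        subject = event.get("subject", "")
        return (subject.startswith(self.subject_begins_with)
                and subject.endswith(self.subject_ends_with))


def route(event, subscriptions):
    """Return the names of every subscription whose filter matches the event."""
    return [s.name for s in subscriptions if s.matches(event)]


# Hypothetical subscriptions and event, for illustration only.
subscriptions = [
    SubscriptionFilter("orders", included_event_types=["Contoso.Orders.Created"]),
    SubscriptionFilter("eu-only", subject_begins_with="/orders/eu/"),
    SubscriptionFilter("invoices", subject_ends_with=".pdf"),
]
event = {"eventType": "Contoso.Orders.Created", "subject": "/orders/eu/1001"}
matched = route(event, subscriptions)
print(matched)
```

Note that one event can match several subscriptions; each matching subscription receives its own copy.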
For each subscription, you can define how long an event remains eligible for delivery (its time-to-live) and a retry policy that limits the number of delivery attempts.
Let’s take a closer look at Azure Event Grid dead letters and how you can monitor them effectively.
What is Azure Event Grid dead letter?
Each Event Grid subscription is associated with an endpoint to which matching events are delivered. If that endpoint is unavailable, Event Grid retries delivery according to the subscription’s retry policy.
Event Grid sends the event to the dead-letter location after all retry attempts are exhausted. It also dead-letters an event without further retries when the endpoint responds with 400 (Bad Request) or 413 (Request Entity Too Large), because these response codes indicate that delivery of the event will never succeed. Dead-lettered events are stored as individual blobs in the storage account configured as the dead-letter destination for your Event Grid subscription.
What happens if the Event Grid Subscription does not have dead-lettering enabled?
Without dead-lettering, Event Grid drops any event that it cannot deliver to the target endpoint within the retry period, which results in missing events and data loss.
Reasons why events end up as Event Grid dead letters
An event is considered a dead letter if either of the following conditions is true:
- The event is not delivered within its time-to-live period.
- The number of delivery attempts exceeds the configured limit.
When either condition is met, the event is dropped or dead-lettered. Dead-lettering is not enabled by default on an Event Grid subscription. To enable it, you must specify a storage account that holds all your undeliverable events.
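The two conditions above, together with the dead-lettering switch, can be expressed as a small decision function. This is a sketch under stated assumptions: delivery_outcome is a hypothetical helper, not an Event Grid API, and the defaults of 1440 minutes and 30 attempts reflect Event Grid’s documented default retry policy, which you may have changed on your subscription.

```python
from datetime import datetime, timedelta, timezone


def delivery_outcome(published_at, now, attempts, dead_letter_enabled,
                     ttl_minutes=1440, max_attempts=30):
    """Classify an undelivered event as 'retry', 'dead-letter', or 'drop'."""
    expired = now - published_at >= timedelta(minutes=ttl_minutes)
    exhausted = attempts >= max_attempts
    if not (expired or exhausted):
        return "retry"  # neither condition is met yet
    # One condition is met: keep the event only if dead-lettering is enabled.
    return "dead-letter" if dead_letter_enabled else "drop"


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(delivery_outcome(t0, t0 + timedelta(hours=1), attempts=3, dead_letter_enabled=True))
print(delivery_outcome(t0, t0 + timedelta(hours=25), attempts=3, dead_letter_enabled=True))
print(delivery_outcome(t0, t0 + timedelta(hours=1), attempts=30, dead_letter_enabled=False))
```

The last call shows why enabling dead-lettering matters: with the same failure, the event is dropped instead of being stored.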
Benefits and Use case of monitoring Azure Event Grid dead letter
Imagine an order processing system for an e-commerce application. Order data is received as events within an Event Grid topic in this scenario. The events are routed to the appropriate Event Grid subscription based on the event filters provided. Assume that the target endpoint will be unavailable for some time, for example, due to low storage capacity.
Dead-lettered (or undeliverable) events are stored as individual blobs in the storage account configured as the dead-letter destination for the Event Grid subscription. Once the target endpoint is available again, the dead-lettered events can be reprocessed, which avoids data loss.
Suppose the dead letter count in this scenario is tracked with a monitoring tool. Monitoring the dead letter count tells you how many events were not delivered to the target endpoint and still require processing.
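As a rough illustration of working with the stored blobs, the sketch below counts dead-lettered events by failure reason. It assumes the blob contents have already been downloaded as JSON text and that each event carries a deadLetterReason property, as in the Event Grid dead-letter schema. The summarize_dead_letters helper and all sample values are made up for the example.

```python
import json
from collections import Counter


def summarize_dead_letters(blob_texts):
    """Count dead-lettered events per deadLetterReason across blob contents."""
    reasons = Counter()
    for text in blob_texts:
        payload = json.loads(text)
        # Accept either a single event object or an array of events per blob.
        events = payload if isinstance(payload, list) else [payload]
        for event in events:
            reasons[event.get("deadLetterReason", "Unknown")] += 1
    return reasons


# Hypothetical blob contents; in practice these would be read from the
# dead-letter container of the configured storage account.
sample_blobs = [
    json.dumps({"id": "1001", "subject": "/orders/eu/1001",
                "deadLetterReason": "MaxDeliveryAttemptsExceeded",
                "deliveryAttempts": 30}),
    json.dumps([{"id": "1002", "subject": "/orders/eu/1002",
                 "deadLetterReason": "TimeToLiveExceeded",
                 "deliveryAttempts": 7}]),
]
summary = summarize_dead_letters(sample_blobs)
print(dict(summary))
```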
The benefits of monitoring dead letters include:
- Preventing data loss
- Tracking failures for troubleshooting
- Enabling business data validation without missing a single event
Monitoring dead-lettered events in Azure Event Grid using Azure Monitor
Using Azure Monitor, you can set up alert rules to track the number of dead-lettered events in an Event Grid subscription and take appropriate action when necessary.
Navigate to the relevant Event Grid subscription and view the real-time count of dead-lettered events available based on the specified time granularity.
You can customize the data visualization of your Event Grid metrics at the specified time granularity by drilling down into the graph data. Click the New Alert Rule option in this panel to start configuring data monitoring alert rules.
Select Dead Lettered Events as the signal; a signal is the metric that you monitor. The next step is to configure the alert logic by specifying thresholds and conditions. Finally, select the action group that defines where alert notifications are sent when the value exceeds the threshold.
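The alert logic amounts to aggregating the metric over an evaluation window and comparing the result with a threshold. The following sketch shows that idea only; alert_fires is a hypothetical function, it assumes a total aggregation with a "greater than" condition, and it does not call Azure Monitor.

```python
def alert_fires(samples, threshold, window):
    """Return True when the total of the last `window` samples exceeds the threshold.

    samples: dead-lettered event counts per time grain, oldest first.
    """
    return sum(samples[-window:]) > threshold


# Hypothetical per-minute counts of dead-lettered events.
quiet = [4, 0, 0, 1, 0]
noisy = [0, 0, 2, 3, 1]
print(alert_fires(quiet, threshold=5, window=3))
print(alert_fires(noisy, threshold=5, window=3))
```

A longer window smooths out short spikes, while a shorter one reacts faster; which trade-off fits depends on how quickly undelivered events must be handled.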
Event Grid dead letter monitoring challenges in Azure Monitor
Although Azure Monitor can track the Event Grid dead letter count, it falls short of dedicated monitoring tools in the following areas:
- Limited notification services
- No automated dead-letter processing when a rule is violated
- No options to process dead-lettered events
- No escalation system for alert notifications
- Difficulty in locating dead letters
- No ability to monitor Event Grid resources across multiple subscriptions and receive consolidated alert reports
Easily handle Event Grid dead letter events using Turbo360
Turbo360 makes handling and accessing dead-letter events from your Event Grid subscriptions easy.
Accessibility of dead-lettered events
You can access the dead-lettered events of an Event Grid subscription by associating the storage blob container and the Event Grid subscription with the same logical container, called a “Business Application” in Turbo360. Each failed event is stored as a separate blob.
Three significant actions can be performed over the dead-lettered events:
- Resubmit
- Repair and Resubmit
- Delete
Resubmit – The source event can remain at the dead-letter destination, while copies of the dead-letter event can be resent to configured endpoints for further processing. The source event properties (including event ID, subject, type, time, data version, metadata version, and event data) are preserved when resending an event.
Repair and Resubmit – Copies of dead-lettered events can be resent to a configured endpoint for further processing after their properties are modified. The source event is retained at the dead-letter destination.
Delete – A source event at the dead-letter destination can be removed entirely after it has been resubmitted or repaired and resubmitted.
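The difference between Resubmit and Repair and Resubmit can be illustrated with a small sketch. This is not Turbo360’s implementation: prepare_resubmission is a hypothetical helper, and the set of dead-letter properties it strips is assumed from the Event Grid dead-letter schema. In both cases the stored source event is left untouched and only a copy is prepared for sending.

```python
import copy

# Properties Event Grid adds when it dead-letters an event (assumed list).
DEAD_LETTER_FIELDS = {
    "deadLetterReason", "deliveryAttempts", "lastDeliveryOutcome",
    "publishTime", "lastDeliveryAttemptTime",
}


def prepare_resubmission(dead_lettered_event, repairs=None):
    """Return a copy ready to resend; the stored source event is not modified."""
    event = {key: copy.deepcopy(value)
             for key, value in dead_lettered_event.items()
             if key not in DEAD_LETTER_FIELDS}
    if repairs:  # "Repair and Resubmit": overwrite selected properties
        event.update(copy.deepcopy(repairs))
    return event


# Hypothetical dead-lettered order event with an invalid quantity.
stored = {"id": "1001", "subject": "/orders/eu/1001",
          "eventType": "Contoso.Orders.Created", "data": {"quantity": -1},
          "deadLetterReason": "MaxDeliveryAttemptsExceeded", "deliveryAttempts": 30}
resubmitted = prepare_resubmission(stored)
repaired = prepare_resubmission(stored, repairs={"data": {"quantity": 1}})
print(resubmitted)
print(repaired)
```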
Purging junk events
Over time, many dead-lettered events can accumulate in the storage account, most of which may already have been processed and can be discarded to free up storage. Turbo360 lets you delete these events by defining the period during which (or before which) they were created, without having to select and delete each event individually.
Junk events can be purged using the Purge option available within the Event Grid subscription.
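Selecting events by creation period is a simple date filter. The sketch below illustrates that selection logic only; select_for_purge and the blob names are hypothetical, and the code does not delete anything or reflect how Turbo360 implements the Purge option.

```python
from datetime import datetime, timezone


def select_for_purge(blobs, created_before, created_after=None):
    """Return the names of blobs created inside the purge period.

    blobs: iterable of (name, created_at) pairs.
    """
    return [name for name, created_at in blobs
            if created_at < created_before
            and (created_after is None or created_at >= created_after)]


# Hypothetical dead-letter blobs with their creation times.
blobs = [
    ("orders/a.json", datetime(2024, 1, 5, tzinfo=timezone.utc)),
    ("orders/b.json", datetime(2024, 2, 20, tzinfo=timezone.utc)),
    ("orders/c.json", datetime(2024, 3, 1, tzinfo=timezone.utc)),
]
cutoff = datetime(2024, 3, 1, tzinfo=timezone.utc)
to_purge = select_for_purge(blobs, created_before=cutoff)
print(to_purge)
```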
Automatic event handling
Handling dead-lettered events in an Event Grid subscription can be automated with Turbo360 by specifying actions, endpoint targets, and the events for which the action will be taken. Automating dead-letter event handling saves a lot of time and manual effort and improves the performance of event-driven applications.
Navigate to the following path: Business Application -> Automated Tasks to configure an automated task for dead-lettered event processing. These tasks can also be scheduled to run at specific times in specific time zones.
Turbo360 Terminology
Business Application – a logical container that groups the Azure services constituting a line-of-business solution.
Monitoring profile – a set of monitoring rules that can be configured for an Azure service type and associated with Business Applications to monitor multiple resources of the same type.
Escalation policy – defines how an alert is escalated to the configured notification channels after a predetermined time.
Notification channel – allows users to receive alerts from Azure services configured in Turbo360 while performing monitoring activities.
Step by Step guide to monitor Azure Event Grid dead letter using Turbo360
Azure Event Grid Monitoring Tool by Turbo360 is a feature-rich application that regularly expands its capabilities in managing and monitoring Azure Event Grid.
Suppose you want to track the number of dead-lettered events for multiple Azure Event Grid subscriptions available across multiple Azure subscriptions and receive a consolidated alert report.
Below is a step-by-step monitoring approach:
Step 1: Associate Event Grid subscriptions in a Business Application
The first step is to associate desired Event Grid subscriptions available across multiple Azure Subscriptions with Business Applications. You can associate resources when creating a Business Application or via the following path: Add -> Resources in the Business Application’s Resources section.
Step 2: Monitoring rule configuration
The second step is configuring your Event Grid subscription’s Dead Lettered Events monitoring rule. Monitoring rules can be set for each resource from the Monitoring screen available within the resource.
You can also use a monitoring profile to apply the same rules to multiple Event Grid subscriptions at once. A monitoring profile can be created in the following path: Monitoring settings -> Monitoring profiles.
The created profile can be mapped to a Business Application using the following path: Business Application -> Monitoring -> Profile settings -> Apply profile.
Step 3: Monitoring configuration
The third step is to adjust the monitoring configuration according to your business cycle. You can change the monitoring configuration at the Business Application level through the following path: Business Application -> Monitoring -> General settings. Alternatively, you can change the configuration of the profile assigned to the Business Application through the following path: Business Application -> Monitoring -> Profile settings, and then click the settings icon next to the desired profile.
Monitoring settings allow you to change the monitoring frequency, monitoring cycle schedule, escalation policy, and notification channel selection.
All set! You will now receive alert reports to the configured notification channels and mail addresses in case of violation during the monitoring cycle, and an hourly report on the health state of the resources will be generated based on the specified hours.
The image shown below is a consolidated alert report received in Turbo360 mail:
Turbo360 expands its monitoring capabilities by offering exciting features that set it apart.
Additional features to improve the Event Grid dead letter monitoring
Below are additional features that can improve the dead letter monitoring approach:
Automated dead-lettered event processing in case of violations
You can configure rule violation tasks to automatically handle dead-lettered events by performing specific actions (including resubmit, repair & resubmit, and delete) when the count exceeds a threshold. The configured task gets triggered when the corresponding rule is violated. This reduces the risks that may arise if no action is taken at a certain point.
Support for a wide range of notification services
Turbo360 integrates about ten notification services that business organizations commonly use to transmit and receive alert notifications.
The image shown below displays the list of supported services:
Imagine that all your business notifications are sent to ServiceNow. The Turbo360 and ServiceNow accounts can be linked, and you can choose them when setting up a Business Application’s monitoring configuration.
Escalation policy to notify the high-level team of priority alerts
Alert escalation policies help reduce risk and error by escalating alerts to the next level if they are not acknowledged within a specified time.
Escalation policies can be created in the Business Applications module’s Monitoring settings -> Escalation Policies section. You can create up to 5 escalation rules, each with different notification channels.
You can choose your escalation policy in the monitoring configuration of a Business Application (or) while mapping a monitoring profile to a Business Application.
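The idea of time-based escalation can be sketched as follows. This is an illustration under assumptions, not Turbo360’s implementation: channels_to_notify, the delays, and the channel names are hypothetical, and only the limit of five escalation rules comes from the description above.

```python
MAX_ESCALATION_RULES = 5  # limit stated in the text above


def channels_to_notify(minutes_unacknowledged, rules):
    """Return the channels whose escalation delay has already elapsed.

    rules: list of (escalate_after_minutes, channel_name) pairs.
    """
    if len(rules) > MAX_ESCALATION_RULES:
        raise ValueError("an escalation policy supports at most 5 rules")
    return [channel for delay, channel in sorted(rules)
            if minutes_unacknowledged >= delay]


# Hypothetical policy: notify the team first, then on-call, then management.
policy = [(0, "team-email"), (30, "on-call-sms"), (60, "manager-servicenow")]
print(channels_to_notify(45, policy))
```

An alert acknowledged before the next delay elapses never reaches the higher levels, which keeps routine violations away from senior staff.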
Conclusion
Event Grid plays a critical role in developing event-driven applications in a serverless environment. Data loss can lead to considerable recovery effort, so handling every event is essential. Storing undelivered events allows all business data to be processed without data loss.
Azure Event Grid dead letter monitoring helps you troubleshoot problems, prevent data loss, and enable validation of your business data.
Compared to Azure Monitor, Turbo360 addresses these customer requirements with features such as automated dead-letter processing and alert escalation.