Azure Event Grid cuts out the middleman in event routing by pushing events directly to endpoints such as functions or webhooks. You don’t build the routing infrastructure; you declare which events matter and where they should go.
The push model works like this: when a blob is created, a resource group changes, or a custom event fires, Event Grid sends the event to every subscribed endpoint. Subscribers don’t poll; they typically get updates within seconds. Supported endpoints include Azure Functions, Logic Apps, Event Hubs, Service Bus, webhooks, and Azure Relay hybrid connections.
Event Grid supports dozens of built-in event sources. Blob Storage fires on create and delete, Container Registry on image pushes, and Resource Manager on resource changes. You can also publish custom events via the Event Grid API. Events use either the CloudEvents schema (a CNCF standard) or the native Event Grid schema.
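To make the schema choice concrete, here is a minimal sketch of a custom event wrapped in a CloudEvents 1.0 envelope, using only the standard library. The event type, source, and payload fields are hypothetical, and actually sending the body to a topic endpoint (HTTP client, authentication) is deliberately left out.

```python
import json
import uuid
from datetime import datetime, timezone


def make_cloud_event(event_type: str, source: str, data: dict) -> dict:
    """Build a CloudEvents 1.0 envelope in structured JSON form."""
    return {
        # The four attributes CloudEvents 1.0 requires on every event.
        "specversion": "1.0",
        "id": str(uuid.uuid4()),
        "source": source,
        "type": event_type,
        # Optional attributes that most consumers find useful.
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": data,
    }


event = make_cloud_event(
    event_type="com.example.photo.uploaded",  # hypothetical event type
    source="/services/photo-upload",          # hypothetical source
    data={"blobName": "cat.jpg", "sizeBytes": 48213},
)
body = json.dumps(event)  # the JSON document a publisher would send
```

Generating a fresh `id` per event matters because subscribers can receive the same event more than once and usually deduplicate on it.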
When a subscriber fails to process an event, Event Grid retries with exponential backoff for up to 24 hours. If delivery still fails and dead-lettering is configured, the event is written to a blob in an Azure Storage container. Without dead-lettering, undeliverable events vanish silently after the retry window.
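The retry-then-dead-letter behaviour can be modelled in a few lines. This is an illustrative sketch, not Event Grid's real schedule: the service uses its own backoff intervals, and the base delay, delay cap, and attempt limit below are assumptions chosen to show the shape of the policy. Only the 24-hour window comes from the text above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RetryPolicy:
    ttl: timedelta = timedelta(hours=24)           # retry window from the text
    max_attempts: int = 30                         # assumed attempt limit
    base_delay: timedelta = timedelta(seconds=10)  # illustrative
    max_delay: timedelta = timedelta(hours=12)     # illustrative cap


def next_delay(policy: RetryPolicy, retry_number: int) -> timedelta:
    """Delay before the given retry (1-based): doubles each time, up to the cap."""
    return min(policy.base_delay * (2 ** (retry_number - 1)), policy.max_delay)


def plan_delivery(policy: RetryPolicy, dead_letter_configured: bool):
    """Simulate an endpoint that never succeeds.

    Returns the time offsets of each retry and what finally happens to the event.
    """
    elapsed = timedelta(0)
    offsets = []
    retry_number = 1
    while retry_number < policy.max_attempts:
        delay = next_delay(policy, retry_number)
        if elapsed + delay > policy.ttl:
            break  # the next retry would fall outside the retry window
        elapsed += delay
        offsets.append(elapsed)
        retry_number += 1
    outcome = "dead-lettered" if dead_letter_configured else "dropped"
    return offsets, outcome


offsets, outcome = plan_delivery(RetryPolicy(), dead_letter_configured=False)
```

With these assumed numbers the retries thin out quickly: most attempts happen in the first hour, and a subscriber that stays down past the window loses the event unless dead-lettering is on.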
In production we always wire Event Grid to Azure Monitor. The metrics and diagnostic logs give you per-topic ingress rates, latency, and failure counts. A single topic can sustain about 1,000 events per second, and you can push that to 10,000 with a premium tier, but the cost scales linearly with the number of operations; quotas vary by tier and change over time, so confirm these figures against the current service limits. I once hit the 1,000 eps limit on a photo-upload service, and the retry backlog grew until the dead-letter container filled up. The fix was to shard the workload across three topics and add a throttling layer in the publisher.
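The sharding and throttling fix can be sketched roughly as follows. This is one possible shape, not the code we ran: the topic names are hypothetical, the token bucket is a generic rate limiter rather than anything Event Grid provides, and the injectable clock exists only to make the limiter testable.

```python
import hashlib
import time


class TokenBucket:
    """Publisher-side throttle: allows bursts up to `capacity`, refills at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, capacity: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.clock = clock
        self.last = clock()

    def try_acquire(self, n: float = 1.0) -> bool:
        """Take n tokens if available; return False so the caller can wait or buffer."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False


# Hypothetical topic names; in practice these would map to real topic endpoints.
TOPICS = ["photos-topic-0", "photos-topic-1", "photos-topic-2"]


def pick_topic(partition_key: str, topics=TOPICS) -> str:
    """Route a key to a topic with a stable hash, so the same key always lands on the same shard."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return topics[int.from_bytes(digest[:8], "big") % len(topics)]
```

A publisher would keep one bucket per topic, call `pick_topic(user_id)` to choose the shard, and only send when `try_acquire()` returns True. Using `hashlib` instead of the built-in `hash()` keeps the mapping stable across processes.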
Treat dead-lettering as mandatory rather than optional: it’s your safety net. I’ve seen teams lose critical events during subscriber outages until they added it. The blob captures the event for later analysis or replay, which is invaluable for debugging.
Schema changes are a common source of bugs. When we moved from the native Event Grid schema to the CloudEvents 1.0 format, we had to update every Function binding and add a version field in the payload. The binding layer in Azure Functions can deserialize automatically, but only if the schema matches exactly; a missing attribute caused the endpoint to return a 400 response with no useful log output, and Event Grid treats a 400 as a permanent failure, so the event was never retried. Adding a small validation step before the business logic saved us a lot of noise in the dead-letter store.
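A validation step of the kind described above might look like the following sketch. It checks only the four attributes that CloudEvents 1.0 marks as required; the `handle` wrapper and its return values are hypothetical and stand in for whatever your function does on rejection (logging, metrics, an explicit error response).

```python
from typing import Callable, List

# Attributes that CloudEvents 1.0 requires on every event.
REQUIRED_ATTRIBUTES = ("specversion", "id", "source", "type")


def validate_cloud_event(event: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the envelope is usable."""
    problems = []
    for name in REQUIRED_ATTRIBUTES:
        value = event.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"missing or empty required attribute: {name}")
    version = event.get("specversion")
    if isinstance(version, str) and version and version != "1.0":
        problems.append(f"unsupported specversion: {version}")
    return problems


def handle(event: dict, process: Callable[[dict], object]):
    """Hypothetical wrapper: validate first, run business logic only on clean events."""
    problems = validate_cloud_event(event)
    if problems:
        # Surface the reason explicitly instead of failing deep inside deserialization.
        return ("rejected", problems)
    return ("processed", process(event))
```

Returning the list of problems, rather than raising on the first one, makes a rejected event self-explanatory when it is inspected later.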
Event Grid isn’t a one-size-fits-all tool. Use it when publishers shouldn’t know their subscribers—like reacting to resource changes or triggering webhooks. For ordered, session-based messaging, Service Bus wins. For high-volume streaming with retention, Event Hubs is the pick.
The three services coexist in event-driven architectures. Event Grid handles reactive routing, Service Bus ensures reliable delivery, and Event Hubs scales for massive data streams. I’ve built pipelines that chain all three, using Event Grid to trigger processing, Service Bus to queue results, and Event Hubs to archive logs.
Choosing among them comes down to what the consumer needs. Reacting to resource events or calling webhooks points to Event Grid, guaranteed delivery with sessions points to Service Bus, and retaining event streams for later analysis points to Event Hubs. They’re not competitors but parts of a larger system.