I've seen organisations struggle with Kafka's operational overhead, so the idea of a managed alternative is tantalising. Azure Event Hubs provides exactly that, with a Kafka-compatible API and a native SDK. But be warned: Event Hubs isn't a drop-in replacement for Kafka, and its architecture is fundamentally different.

When I migrated a high-throughput event stream from Kafka to Event Hubs, I had to plan the partition strategy carefully. You specify the partition count when you create an event hub, which is a challenge if your event volume is unpredictable. For example, we started with 16 partitions but had to double that to 32 within a few months as our event stream grew. This required careful coordination with our ops team to adjust the configuration without disrupting the event flow.

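Event Hubs is based on a partitioned log, similar to Kafka topics with partitions. Events are appended to a partition and retained for a configurable period: up to 7 days on the Standard tier and up to 90 days on the Premium and Dedicated tiers. For anything longer, Event Hubs Capture can archive events to your own storage account. Consumer groups maintain independent checkpoints, and partitions are the unit of parallelism: 32 partitions allow up to 32 parallel consumers within a consumer group. However, unlike Kafka, where partitions can be added to an existing topic, the Basic and Standard tiers fix the partition count once the event hub is created. Only the Premium and Dedicated tiers let you increase it afterwards, and no tier lets you decrease it.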
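To make the "partitions are the unit of parallelism" point concrete, here is a toy model in Python. It is only a sketch: `assign_partitions` is a hypothetical helper that spreads partitions round-robin, and it is not how the Event Hubs SDK balances load. The point is simply that any consumer beyond the partition count ends up with nothing to read.

```python
def assign_partitions(partition_count: int, consumers: list[str]) -> dict[str, list[int]]:
    """Spread partition IDs across consumers round-robin (illustrative only)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment: dict[str, list[int]] = {name: [] for name in consumers}
    for partition_id in range(partition_count):
        owner = consumers[partition_id % len(consumers)]
        assignment[owner].append(partition_id)
    return assignment


# 40 consumers in one consumer group, but only 32 partitions to share out.
consumers = [f"consumer-{i}" for i in range(40)]
assignment = assign_partitions(32, consumers)

# Consumers that own no partition sit idle.
idle = [name for name, owned in assignment.items() if not owned]
print(f"{len(idle)} of {len(consumers)} consumers are idle")
```

Running the same function with fewer consumers than partitions shows the opposite case: each consumer simply owns several partitions.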

One thing to watch out for is the cost of Event Hubs Capture. While it's a powerful feature for storing event streams in Azure Blob Storage or Azure Data Lake Storage (ADLS), the storage costs add up quickly if you're not careful. For example, we had a use case where we were storing event streams for 30 days, and the storage costs were significantly higher than expected. We ended up optimising our retention period and compression settings to reduce costs. It's also worth noting that Event Hubs Capture writes Avro, which is a good fit for event streams but may require additional processing for consumers that expect another format.

Event Hubs' Kafka endpoint supports Apache Kafka protocol version 1.0 and later, which means existing Kafka producers and consumers can be redirected to Event Hubs by changing the bootstrap server URL and authentication configuration. However, compatibility isn't 100%: Kafka admin operations such as topic creation and deletion work differently. Standard produce/consume workflows and consumer group management are compatible, though, and tools like MirrorMaker 2 and Confluent Replicator can help migrate from Kafka to Event Hubs.
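In practice, the redirect looks roughly like the sketch below. It builds a librdkafka-style configuration dictionary of the kind you would pass to a client such as `confluent-kafka`. The helper name `event_hubs_kafka_config`, the namespace `my-namespace` and the connection string are placeholders of mine. The broker port 9093, `SASL_SSL`, the `PLAIN` mechanism and the literal `$ConnectionString` username follow the pattern the Kafka endpoint expects for connection-string authentication.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str, group_id: str) -> dict[str, str]:
    """Build a librdkafka-style client config pointing at an Event Hubs namespace."""
    return {
        # The Kafka endpoint listens on port 9093 of the namespace host.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is this literal string; the secret goes in the password.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
        "group.id": group_id,
    }


# Placeholder values: substitute your own namespace and connection string.
config = event_hubs_kafka_config(
    namespace="my-namespace",
    connection_string="Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=<name>;SharedAccessKey=<key>",
    group_id="orders-consumers",
)
print(config["bootstrap.servers"])
```

The rest of the producer or consumer code should not need to change, which is what makes the Kafka endpoint attractive for a migration.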

Event Hubs Capture is a valuable feature for organisations that want to store incoming events in Azure Blob Storage or ADLS in Avro format. This provides a cold storage layer for the event stream without requiring consumer code, and it's useful for audit trails, replaying events for new consumers, and as a source for batch analytics. The capture overhead is low, and the storage cost for compressed Avro is significantly less than the Event Hub retention cost for extended periods.
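When replaying captured events for a new consumer, the first job is usually working out which blobs cover the time range you care about. The sketch below assumes the default Capture naming convention of `{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}.avro` and treats the timestamp as UTC. If you have customised the naming format, the parsing will need adjusting. The helper names and sample paths are hypothetical.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CaptureBlob(NamedTuple):
    namespace: str
    event_hub: str
    partition_id: int
    timestamp: datetime


def parse_capture_path(path: str) -> CaptureBlob:
    """Parse a blob path that follows the assumed default Capture naming."""
    if path.endswith(".avro"):
        path = path[: -len(".avro")]
    parts = path.strip("/").split("/")
    if len(parts) != 9:
        raise ValueError(f"unexpected Capture path layout: {path!r}")
    namespace, event_hub, partition, *stamp = parts
    year, month, day, hour, minute, second = (int(p) for p in stamp)
    when = datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)
    return CaptureBlob(namespace, event_hub, int(partition), when)


def blobs_in_range(paths: list[str], start: datetime, end: datetime) -> list[str]:
    """Keep the blobs whose timestamp falls in the half-open range [start, end)."""
    return [p for p in paths if start <= parse_capture_path(p).timestamp < end]


# Hypothetical blob listing for illustration.
paths = [
    "mynamespace/myhub/0/2024/03/01/10/00/00.avro",
    "mynamespace/myhub/0/2024/03/01/10/05/00.avro",
    "mynamespace/myhub/1/2024/03/01/11/00/00.avro",
]
start = datetime(2024, 3, 1, 10, 0, tzinfo=timezone.utc)
end = datetime(2024, 3, 1, 11, 0, tzinfo=timezone.utc)
to_replay = blobs_in_range(paths, start, end)
print(to_replay)
```

From there, each selected blob can be read with any Avro library and fed to the new consumer in timestamp order per partition.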

In 2021, Microsoft added an Event Hubs Schema Registry that provides a managed schema store for Avro, JSON Schema, and Protobuf schemas. Producers register schemas, and consumers validate against registered schemas. The schema ID travels with each event (depending on the client library, in the payload or in message metadata such as the content type), allowing consumers to look up the schema for each event. This provides schema governance comparable to the Confluent Schema Registry for Kafka.
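The register-then-look-up flow is easier to see in miniature. The following is a toy in-memory model, not the Azure SDK: `ToySchemaRegistry`, `produce` and `consume` are hypothetical names, the schema ID is just a truncated hash, and "validation" only checks that the declared fields are present.

```python
import hashlib
import json


class ToySchemaRegistry:
    """In-memory stand-in for a schema registry (illustrative only)."""

    def __init__(self) -> None:
        self._schemas: dict[str, dict] = {}

    def register(self, schema: dict) -> str:
        # Derive a stable ID from the canonical JSON form of the schema.
        canonical = json.dumps(schema, sort_keys=True)
        schema_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
        self._schemas[schema_id] = schema
        return schema_id

    def lookup(self, schema_id: str) -> dict:
        return self._schemas[schema_id]


def produce(registry: ToySchemaRegistry, schema: dict, record: dict) -> dict:
    """Register the schema and attach its ID to the outgoing event."""
    return {"schema_id": registry.register(schema), "body": json.dumps(record)}


def consume(registry: ToySchemaRegistry, event: dict) -> dict:
    """Look up the schema by ID and check that the declared fields are present."""
    schema = registry.lookup(event["schema_id"])
    record = json.loads(event["body"])
    missing = [f["name"] for f in schema["fields"] if f["name"] not in record]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")
    return record


registry = ToySchemaRegistry()
order_schema = {
    "type": "record",
    "name": "Order",
    "fields": [{"name": "id", "type": "string"}, {"name": "amount", "type": "double"}],
}
event = produce(registry, order_schema, {"id": "o-1", "amount": 9.5})
order = consume(registry, event)
print(order)
```

A real registry adds versioning, compatibility rules and access control on top of this, but the producer and consumer roles are the same.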