Azure Event Hubs exposes a Kafka-compatible API alongside its native SDK. For organisations standardising on Kafka tooling, it offers a managed alternative to operating a Kafka cluster.
The Event Hubs architecture
Event Hubs is a partitioned log, analogous to a Kafka topic with partitions. Events are appended to a partition and retained for a configurable period (up to 7 days on the Standard tier; up to 90 days on Premium and Dedicated). Consumer groups maintain independent checkpoints. Partitions are the unit of parallelism: 32 partitions allow 32 parallel consumers per consumer group. Unlike Kafka, the partition count is fixed when the event hub is created; Premium and Dedicated tiers allow increasing it later, but it can never be decreased.
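Because per-key ordering is only guaranteed within a partition, key-based routing matters. A minimal sketch of the idea, using an illustrative hash rather than the service's actual assignment algorithm:

```python
# Sketch: how key-based partition assignment works conceptually.
# Event Hubs (like Kafka) hashes the partition key so that every event
# with the same key lands on the same partition, preserving per-key order.
# The hash below is illustrative, not the service's real algorithm.
import hashlib

def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a stable partition index."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# Two events with the same key resolve to the same partition.
p1 = assign_partition("device-42", 32)
p2 = assign_partition("device-42", 32)
assert p1 == p2
```

This is also why the fixed partition count matters: changing the count would remap keys to different partitions and break ordering guarantees for in-flight consumers.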
Kafka protocol compatibility
Event Hubs' Kafka endpoint supports the Kafka protocol (version 1.0 and later). Existing Kafka producers and consumers can be redirected to Event Hubs by changing the bootstrap server URL and authentication configuration. Compatibility is not complete: Kafka admin operations such as topic creation and deletion work differently, but standard produce/consume workflows and consumer group management are compatible. Tools like MirrorMaker 2 and Confluent Replicator support Kafka-to-Event Hubs migration.
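The redirection typically amounts to a handful of client settings. A sketch of the configuration, using librdkafka-style property names; the namespace name and connection string are placeholders, and Event Hubs accepts the literal username `$ConnectionString` with the namespace connection string as the password:

```python
# Sketch: pointing an existing Kafka client at the Event Hubs Kafka
# endpoint. Property names follow the librdkafka/confluent-kafka
# convention; the namespace and connection string are placeholders.
def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict:
    """Build Kafka client settings for the Event Hubs Kafka endpoint."""
    return {
        # The Kafka endpoint listens on port 9093 of the namespace host.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # Event Hubs authenticates SASL PLAIN with the literal user
        # "$ConnectionString" and the connection string as the password.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }

cfg = event_hubs_kafka_config("myns", "Endpoint=sb://myns.servicebus.windows.net/;...")
```

No topic or consumer-group code changes are needed; only the connection block differs from a self-hosted Kafka deployment.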
Event Hubs Capture to Data Lake
Event Hubs Capture automatically writes all incoming events to Azure Blob Storage or ADLS in Avro format. This provides a cold-storage layer for the event stream without writing any consumer code, useful for audit trails, replaying events to new consumers, and feeding batch analytics. Capture overhead is low, and storing compressed Avro is significantly cheaper than extending Event Hubs retention for the same period.
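Capture blobs are laid out on a configurable path template; the default encodes the partition and capture time, so batch jobs can prune by partition or time window from the path alone. A sketch of parsing that default layout (the example path is hypothetical):

```python
# Sketch: recovering partition and capture time from a Capture blob path.
# The default Capture naming template is
#   {Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}
# so a batch job can filter blobs without opening the Avro files.
from datetime import datetime, timezone

def parse_capture_path(blob_path: str) -> dict:
    """Extract partition id and capture timestamp from a default-format path."""
    parts = blob_path.removesuffix(".avro").split("/")
    ns, hub, partition, y, mo, d, h, mi, s = parts
    return {
        "namespace": ns,
        "event_hub": hub,
        "partition_id": int(partition),
        "captured_at": datetime(int(y), int(mo), int(d),
                                int(h), int(mi), int(s), tzinfo=timezone.utc),
    }

info = parse_capture_path("myns/telemetry/3/2024/05/01/12/30/00.avro")
```

Replaying events to a new consumer then becomes a matter of listing the relevant paths and decoding the Avro bodies in timestamp order.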
Schema Registry
The Event Hubs Schema Registry (added in 2021) provides a managed schema store for Avro, JSON Schema, and Protobuf schemas. Producers register schemas; consumers validate against registered schemas. The schema ID is embedded in the event payload, allowing consumers to look up the schema for each event. This provides schema governance comparable to the Confluent Schema Registry in the Kafka ecosystem.