OpenTelemetry Reaches GA

I've been following OpenTelemetry since its inception, and it's exciting to see it reach general availability for traces and metrics in 2021. The project was created from the merger of OpenTracing and OpenCensus, and it provides a much-needed vendor-neutral observability instrumentation standard.

The lack of standardisation in observability instrumentation has been a major pain point for developers. Before OpenTelemetry, each vendor had its own instrumentation SDK, which meant re-instrumenting your codebase whenever you wanted to switch vendors. OpenTelemetry changes this by separating instrumentation from the backend: you instrument your application once with the OTEL SDK and route telemetry to any backend that supports OTLP, the OpenTelemetry Protocol.

For example, in one of my previous projects we had to switch from New Relic to Datadog, and the re-instrumentation work took two engineers about three months. With OpenTelemetry, this would have been much simpler: we could have changed the backend configuration and left the instrumentation code untouched. That saves a great deal of time and reduces the risk of introducing bugs during re-instrumentation.
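To make the separation concrete, here is a minimal, standard-library-only sketch of the idea. It is not the OpenTelemetry API: `Span`, `Tracer`, `InMemoryExporter`, and `handle_checkout` are hypothetical names invented for illustration, and a real setup would use the OTEL SDK with an OTLP exporter. The point is only that the instrumented function never changes when the backend does.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Protocol


@dataclass
class Span:
    """Toy span: a name, some attributes, and a duration."""
    name: str
    attributes: Dict[str, str] = field(default_factory=dict)
    duration_s: float = 0.0


class Exporter(Protocol):
    """Anything that can receive a finished span."""
    def export(self, span: Span) -> None: ...


class InMemoryExporter:
    """Hypothetical stand-in for a vendor backend; it just stores spans."""
    def __init__(self, backend: str) -> None:
        self.backend = backend
        self.spans: List[Span] = []

    def export(self, span: Span) -> None:
        self.spans.append(span)


class Tracer:
    """Toy tracer: the only object the application code talks to."""
    def __init__(self, exporter: Exporter) -> None:
        self._exporter = exporter

    @contextmanager
    def start_span(self, name: str, **attributes: str) -> Iterator[Span]:
        span = Span(name, dict(attributes))
        started = time.perf_counter()
        try:
            yield span
        finally:
            # Record the duration and hand the span to whichever exporter is configured.
            span.duration_s = time.perf_counter() - started
            self._exporter.export(span)


def handle_checkout(tracer: Tracer, order_id: str) -> str:
    # Instrumentation is written once and knows nothing about the backend.
    with tracer.start_span("checkout", order_id=order_id) as span:
        span.attributes["status"] = "ok"
        return f"order {order_id} processed"


# Switching vendors is a wiring change, not a change to handle_checkout.
vendor_a = InMemoryExporter("vendor-a")
vendor_b = InMemoryExporter("vendor-b")
handle_checkout(Tracer(vendor_a), "42")
handle_checkout(Tracer(vendor_b), "42")
```

Both exporters end up with the same span, which is the property that would have spared us the re-instrumentation work.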

The OTEL Collector is a key component of OpenTelemetry, acting as a vendor-agnostic agent that receives telemetry from applications, processes it, and exports it to one or more backends. In Kubernetes you can run it as a sidecar, as a per-node agent, or as a standalone gateway; in each case the routing configuration lives in the Collector rather than in your application code. The Collector's pipeline model, in which receivers, processors, and exporters are wired together per signal, makes it composable for complex telemetry routing requirements.

In a production environment, I've seen the OTEL Collector handle over 10,000 spans per second without any issues. It can also export to several backends at once, which makes it easy to send telemetry to different vendors or systems. For instance, you can use the Collector to send traces to Jaeger and metrics to Prometheus, all from the same application.
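The Jaeger and Prometheus split could look roughly like the following. The Collector is configured in YAML; the sketch below mirrors that structure as a Python dictionary so it can be checked in code. The hostnames, ports, and the `otlp/jaeger` exporter name are assumptions for illustration (it assumes a Jaeger instance that accepts OTLP), and `undefined_components` is a hypothetical helper, not part of the Collector.

```python
from typing import List

# Mirrors the shape of a Collector config: components are declared at the
# top level, then wired into one pipeline per signal under service.pipelines.
collector_config = {
    "receivers": {"otlp": {"protocols": {"grpc": {}, "http": {}}}},
    "processors": {"batch": {}},
    "exporters": {
        # Assumed endpoint: a Jaeger instance accepting OTLP over gRPC.
        "otlp/jaeger": {"endpoint": "jaeger:4317", "tls": {"insecure": True}},
        # Assumed endpoint: address on which Prometheus scrapes the Collector.
        "prometheus": {"endpoint": "0.0.0.0:8889"},
    },
    "service": {
        "pipelines": {
            "traces": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["otlp/jaeger"],
            },
            "metrics": {
                "receivers": ["otlp"],
                "processors": ["batch"],
                "exporters": ["prometheus"],
            },
        }
    },
}


def undefined_components(config: dict) -> List[str]:
    """Return pipeline references that point at components never declared."""
    missing = []
    for pipeline_name, pipeline in config["service"]["pipelines"].items():
        for kind in ("receivers", "processors", "exporters"):
            for component in pipeline.get(kind, []):
                if component not in config.get(kind, {}):
                    missing.append(f"{pipeline_name}.{kind}.{component}")
    return missing


problems = undefined_components(collector_config)
```

The application only ever sends OTLP to the Collector; adding or swapping a backend means editing the exporters and pipelines here.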

One of the most compelling features of OpenTelemetry is its auto-instrumentation capability. The OTEL auto-instrumentation agents can instrument HTTP clients, database drivers, message queue clients, and web framework request handlers without any changes to application code. The quality of auto-instrumentation varies by language and framework, but it delivers a significant amount of production observability for very little effort.

In my experience, the auto-instrumentation agents capture about 80% of the spans in a typical web application, with the remaining 20% requiring manual instrumentation. This is a significant time saver, as manual instrumentation can be tedious and error-prone. The agents can also capture telemetry from libraries and frameworks that don't have built-in support for OpenTelemetry, which makes auto-instrumentation a very powerful feature.
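How can an agent instrument a library without code changes? The mechanism differs by language, but in dynamic languages it often comes down to wrapping library functions at startup. The sketch below shows that general technique with the standard library only. It is a simplification and not how any particular OTEL agent is implemented; `fake_db`, `instrument`, and `recorded_spans` are hypothetical names.

```python
import functools
import time
import types

recorded_spans = []  # stand-in for a real span exporter


def instrument(module, func_name: str, span_name: str) -> None:
    """Replace module.func_name with a wrapper that records a span per call."""
    original = getattr(module, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        started = time.perf_counter()
        error = None
        try:
            return original(*args, **kwargs)
        except Exception as exc:
            error = type(exc).__name__
            raise
        finally:
            recorded_spans.append({
                "name": span_name,
                "duration_s": time.perf_counter() - started,
                "error": error,
            })

    setattr(module, func_name, wrapper)


# Hypothetical third-party database client that knows nothing about tracing.
fake_db = types.SimpleNamespace(query=lambda sql: [("row", 1)])


def list_users():
    # Application code: unchanged before and after instrumentation.
    return fake_db.query("SELECT * FROM users")


# The "agent" patches the library once at startup.
instrument(fake_db, "query", "db.query")
rows = list_users()
```

After the patch, every call to `fake_db.query` produces a span even though `list_users` was never edited. The spans that need manual work are typically the ones describing business logic that no library call reveals.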

The concept of baggage in OpenTelemetry is also worth noting. Baggage is a set of key-value pairs that is propagated across process boundaries alongside the trace context, which makes high-cardinality analysis possible. For example, you can set a user ID, feature flag value, or deployment version at the entry point of a request and read it from any service in the call chain.

Baggage is not attached to spans automatically, but once its values are copied onto span attributes it enables some powerful use cases. You can trace all requests from a specific user, compare performance between feature flag variants, or isolate errors to a specific deployment version. This level of analysis is critical for understanding complex distributed systems.
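On the wire, baggage typically travels in the W3C `baggage` HTTP header as comma-separated `key=value` pairs with percent-encoded values. The sketch below hand-rolls a simplified encoder and decoder to show the round trip from an entry-point service to a downstream one. In practice the OTEL SDK's propagators do this for you; `encode_baggage`, `decode_baggage`, and the example keys are hypothetical names for illustration.

```python
from typing import Dict
from urllib.parse import quote, unquote


def encode_baggage(entries: Dict[str, str]) -> str:
    """Serialise entries as a simplified W3C baggage header value."""
    return ",".join(
        f"{quote(key, safe='')}={quote(str(value), safe='')}"
        for key, value in entries.items()
    )


def decode_baggage(header: str) -> Dict[str, str]:
    """Parse a baggage header value, ignoring optional ';' properties."""
    entries: Dict[str, str] = {}
    for member in header.split(","):
        key_value = member.split(";", 1)[0]
        if "=" not in key_value:
            continue  # skip empty or malformed members
        key, value = key_value.split("=", 1)
        entries[unquote(key.strip())] = unquote(value.strip())
    return entries


# Entry-point service: set the values once per request.
entries = {
    "user.id": "u-123",
    "feature.checkout": "variant-b",
    "deployment.version": "1.4.2",
}
header = encode_baggage(entries)

# Downstream service: read the header and copy the values onto its span
# attributes so the backend can filter and group by them.
span_attributes = {"http.route": "/checkout", **decode_baggage(header)}
```

Copying baggage onto span attributes is an explicit step in each service, and everything placed in baggage is sent to every downstream service, so it is worth keeping the entries small and free of sensitive data.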

I've seen firsthand how difficult it can be to switch observability vendors when you've invested heavily in instrumenting your codebase. OpenTelemetry eliminates this vendor lock-in at the instrumentation layer, making it much easier to switch vendors if needed. This is a major win for developers and operators who want to avoid being tied to a specific vendor.

As OpenTelemetry continues to mature, I expect to see even more innovative use cases emerge. The project has already gained significant traction, and its vendor-neutral approach is likely to appeal to many organisations looking to standardise their observability instrumentation.