Service mesh adoption has grown alongside microservice architectures. Istio and Linkerd are the two dominant open-source options, and after years of production use the trade-offs between them are clear.

A service mesh handles service-to-service communication concerns: mutual TLS for encryption and authentication, load balancing and traffic management, observability, and traffic policies. These capabilities are typically implemented in sidecar proxies that run alongside each service container, which keeps them independent of the language each service is written in.

In the clusters I ran with Istio, the Envoy sidecar typically added about 30 MiB of resident memory per pod and a CPU overhead of 5-10% under steady traffic. The control plane components (Pilot, Citadel, and Galley) each needed their own resource allocation, so a 100-node cluster could easily consume a full core just to keep the mesh alive. The memory hit mattered most when we packed many small services onto a single node; we ended up scaling the node pool just to accommodate the mesh overhead.
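To make the packing problem concrete, here is a back-of-the-envelope estimate in Python. Only the 30 MiB per-sidecar figure comes from the measurements above; the pod density (60 pods) and node size (8 GiB) are hypothetical values chosen for illustration, as are the function names.

```python
MIB_PER_GIB = 1024


def sidecar_overhead_mib(pods_per_node: int, sidecar_mib: float) -> float:
    """Total resident memory consumed by sidecars on one node, in MiB."""
    return pods_per_node * sidecar_mib


def overhead_fraction(pods_per_node: int, sidecar_mib: float, node_gib: float) -> float:
    """Share of a node's memory taken by sidecars (0.0 to 1.0)."""
    return sidecar_overhead_mib(pods_per_node, sidecar_mib) / (node_gib * MIB_PER_GIB)


# 30 MiB per sidecar is the observed figure; 60 pods on an 8 GiB node is assumed.
total = sidecar_overhead_mib(pods_per_node=60, sidecar_mib=30)
share = overhead_fraction(pods_per_node=60, sidecar_mib=30, node_gib=8)
print(f"{total:.0f} MiB of sidecar memory, {share:.1%} of the node")
# prints "1800 MiB of sidecar memory, 22.0% of the node"
```

Under these assumptions more than a fifth of the node goes to proxies before any application code runs, which is the kind of ratio that pushes a team toward a larger node pool.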

Istio uses Envoy as its data plane proxy. Envoy's feature set is comprehensive: L7 traffic handling, gRPC support, and extensive configurability. Istio's control plane handles configuration distribution, certificate management, and telemetry collection. Istio's strength is breadth: it supports almost any service mesh use case. Its weakness is complexity: it is difficult to configure correctly and it is resource-intensive.

The biggest pain point I saw was the upgrade path. Moving from Istio 1.5 to 1.7 required a full control-plane rollout and a careful pause of traffic while the new CRDs propagated. We saw a few minutes of latency spikes because some services still referenced the old v1alpha3 APIs. The lesson was to script the whole process with Helm and to validate the new configuration against a staging cluster before touching production.
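Part of that staging validation can be automated. The sketch below scans manifest text for resources that still declare an API version you plan to retire before an upgrade. It is a plain text scan rather than a real YAML parser, and the function name and sample manifest are hypothetical; `networking.istio.io/v1alpha3` stands in for the v1alpha3 APIs mentioned above.

```python
import re
from typing import List, Tuple

# Match top-level "apiVersion: <value>" and "kind: <value>" lines.
API_VERSION_RE = re.compile(r"^apiVersion:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
KIND_RE = re.compile(r"^kind:[ \t]*(\S+)[ \t]*$", re.MULTILINE)
# Multi-document YAML separates documents with a line containing only '---'.
DOC_SEPARATOR_RE = re.compile(r"^---[ \t]*$", re.MULTILINE)


def find_stale_resources(manifest_text: str, stale_version: str) -> List[Tuple[int, str]]:
    """Return (document index, kind) for each YAML document using stale_version."""
    hits = []
    for index, doc in enumerate(DOC_SEPARATOR_RE.split(manifest_text)):
        version = API_VERSION_RE.search(doc)
        if version and version.group(1) == stale_version:
            kind = KIND_RE.search(doc)
            hits.append((index, kind.group(1) if kind else "unknown"))
    return hits


# Hypothetical two-document manifest used only to exercise the scan.
SAMPLE = """\
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews
"""

print(find_stale_resources(SAMPLE, "networking.istio.io/v1alpha3"))
# prints [(0, 'VirtualService')]
```

Running a check like this in CI against every service's manifests gives an early list of what must be migrated, before the control plane is touched.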

Linkerd uses a purpose-built Rust micro-proxy instead of Envoy, designed for minimal resource consumption and operational simplicity. Its configuration is simpler than Istio's, and mTLS is automatic: you install Linkerd, enable it on a namespace, and traffic between meshed services in that namespace is encrypted and authenticated with no further configuration. For teams that primarily want mTLS and observability without the full Istio feature set, Linkerd is the simpler path.
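Enabling Linkerd on a namespace usually means annotating the namespace with `linkerd.io/inject: enabled`, so that the proxy injector adds the sidecar to pods created there. A minimal sketch that builds such a manifest as JSON, which `kubectl apply -f` accepts as well as YAML; the namespace name `payments` and the helper function are hypothetical.

```python
import json


def meshed_namespace(name: str) -> dict:
    """Build a Namespace manifest that opts its pods into Linkerd proxy injection."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Linkerd's proxy injector looks for this annotation.
            "annotations": {"linkerd.io/inject": "enabled"},
        },
    }


# "payments" is an illustrative namespace name.
print(json.dumps(meshed_namespace("payments"), indent=2))
```

Injection happens when a pod is created, so workloads that were already running in the namespace need a restart before they get the proxy.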

When we switched a subset of workloads to Linkerd, the sidecar footprint dropped to roughly 10 MiB per pod and the CPU penalty fell below 2%. The latency added by the proxy was consistently under half a millisecond in our latency-sensitive APIs. The trade-off was that Linkerd's traffic-splitting primitives lack the fine-grained weight controls that Envoy offers, so we had to implement some routing logic in the application layer for complex canary scenarios.
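Application-layer routing of that kind can be as simple as a deterministic, hash-based bucket assignment. The sketch below is an illustration of the general technique, not the code from the deployment described here; the function name, the per-user routing key, and the 5% canary share are all assumptions.

```python
import hashlib


def route_to_canary(routing_key: str, canary_percent: float) -> bool:
    """Return True when routing_key falls into the canary share.

    The key is hashed into one of 10,000 buckets, so the same key always
    gets the same answer and the weight can be set in 0.01% steps.
    """
    if not 0.0 <= canary_percent <= 100.0:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < round(canary_percent * 100)


# Send roughly 5% of users (an assumed share) to the canary.
users = [f"user-{i}" for i in range(10_000)]
share = sum(route_to_canary(user, 5.0) for user in users) / len(users)
print(f"canary share: {share:.2%}")
```

Hashing a stable key rather than drawing a random number per request keeps each user pinned to one version, which matters when the canary changes response formats or session behavior.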

Cilium implements network policy and service mesh features using eBPF at the kernel level rather than sidecar proxies, which eliminates the per-pod proxy overhead of traditional service meshes. For clusters where the resource cost of sidecar proxies is a concern, Cilium enforces network policy and its Hubble component provides observability, both without a sidecar. However, its L7 traffic management is less feature-complete than that of Istio or Linkerd.
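As an illustration of the network policy side, the sketch below builds a `CiliumNetworkPolicy` that admits one HTTP method and path from one workload to another. The field names follow the `cilium.io/v2` schema as I understand it and should be checked against the Cilium version you run; the policy name, the `app` labels, the port, and the helper function are hypothetical.

```python
import json


def http_ingress_policy(name: str, server_app: str, client_app: str,
                        port: int, method: str, path_regex: str) -> dict:
    """Build a CiliumNetworkPolicy allowing one HTTP method and path from client to server."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            # Pods the policy applies to.
            "endpointSelector": {"matchLabels": {"app": server_app}},
            "ingress": [{
                # Pods allowed to connect.
                "fromEndpoints": [{"matchLabels": {"app": client_app}}],
                "toPorts": [{
                    # Port numbers are given as strings in this schema.
                    "ports": [{"port": str(port), "protocol": "TCP"}],
                    "rules": {"http": [{"method": method, "path": path_regex}]},
                }],
            }],
        },
    }


# All argument values here are illustrative.
policy = http_ingress_policy("allow-frontend-reads", "backend", "frontend",
                             8080, "GET", "/api/.*")
print(json.dumps(policy, indent=2))
```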