Istio was the most widely adopted service mesh in 2019, and I believe understanding its architecture is crucial for deploying it in production without surprises. Knowing how the pieces fit together helps you identify potential issues before they reach production.

The data plane in Istio consists of Envoy proxy sidecars injected into every pod. Each sidecar transparently intercepts all inbound and outbound pod traffic via iptables rules, which gives the mesh a high degree of control over traffic flow.

Envoy handles connection management, load balancing, circuit breaking, retries, mTLS, and telemetry generation. The application opens what looks like an ordinary connection, iptables redirects it to the Envoy listener on localhost, and the application remains unaware that Envoy is handling the actual service-to-service connection.

The control plane in Istio, at least in 2019, consisted of multiple components: Pilot for configuration and service discovery, Citadel as the certificate authority, Galley for configuration validation, and Mixer for policy enforcement and telemetry. The components communicate via gRPC, and Pilot uses gRPC streams to push configuration to the Envoy proxies.

When we first rolled out Istio on a 200‑node GKE cluster, Pilot quickly became the bottleneck. Each Envoy instance opened a gRPC stream to Pilot for LDS and CDS updates, and with roughly 4 000 sidecars the control plane was pushing about 12 MB/s of configuration data. We had to scale Pilot to three replicas behind a load balancer and bump the heap limit to 2 GB to keep the 95th‑percentile response time under 150 ms. The same story applied to Citadel; its certificate signing workload spikes during rollout, so we isolated it on its own node pool and enabled the cache in the CA to avoid hitting the private key store on every handshake.

Mixer introduced another layer of latency that surprised many teams. In our early tests a simple request count metric added 3 ms of overhead per hop, which multiplied across a chain of services. We mitigated this by moving the adapter to a local in‑process mode for high‑throughput services and by sharding the backend Redis store to spread the load. The trade‑off was losing some of the global policy enforcement granularity, but the latency budget we had for user‑facing APIs forced us to make that compromise.
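To make the cumulative cost concrete, here is a back-of-the-envelope sketch in Python. It assumes every hop pays the same fixed overhead (the 3 ms we measured for a simple request count metric) and that the calls in a chain happen sequentially. The function name and the chain lengths are illustrative, not measurements.

```python
def mixer_overhead_ms(hops: int, per_hop_ms: float = 3.0) -> float:
    """Total added latency for a sequential call chain, assuming a fixed per-hop cost."""
    if hops < 0:
        raise ValueError("hops must be non-negative")
    return hops * per_hop_ms


# Illustrative chain lengths only; real call graphs fan out and overlap.
for hops in (1, 3, 5, 8):
    print(f"{hops} hops: {mixer_overhead_ms(hops):.0f} ms of added latency")
```

Even under these simplified assumptions, a chain of five services spends 15 ms on policy and telemetry checks alone, which is a large slice of a tight latency budget for a user-facing API.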

Debugging configuration drift was a constant source of night‑time alerts. The Envoy admin endpoint gives you the current listener and cluster dump, but the JSON is massive and the version of Envoy bundled with Istio 1.1 had a bug that omitted the TLS context when mTLS was in PERMISSIVE mode. We ended up writing a small Go utility that compared the pilot‑generated config with the live dump and highlighted mismatches. Coupling that with Jaeger traces let us see where a request was being reset by a circuit‑breaker that never got the updated threshold.
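The core idea behind that utility can be sketched in a few lines. The version below is in Python rather than Go and is not our actual tool. It assumes both the expected configuration and the live dump have already been parsed from JSON into nested dicts, and `diff_config`, `MISSING`, and the sample keys are hypothetical names chosen for illustration. The sample data is deliberately simplified and does not follow the real Envoy dump schema.

```python
from typing import Any, Iterator, Tuple


class _Missing:
    """Sentinel type marking a key that is absent on one side."""

    def __repr__(self) -> str:
        return "<missing>"


MISSING = _Missing()


def diff_config(expected: Any, live: Any, path: str = "") -> Iterator[Tuple[str, Any, Any]]:
    """Yield (path, expected_value, live_value) for every place the two trees differ."""
    if isinstance(expected, dict) and isinstance(live, dict):
        # Walk the union of keys so that additions and omissions both show up.
        for key in sorted(set(expected) | set(live)):
            child = f"{path}.{key}" if path else str(key)
            yield from diff_config(expected.get(key, MISSING), live.get(key, MISSING), child)
    elif isinstance(expected, list) and isinstance(live, list) and len(expected) == len(live):
        for i, (e, l) in enumerate(zip(expected, live)):
            yield from diff_config(e, l, f"{path}[{i}]")
    elif expected != live:
        yield path, expected, live


# Simplified, made-up sample: a stale circuit-breaker threshold and a dropped TLS block.
expected = {"cluster": {"name": "reviews",
                        "circuit_breakers": {"max_connections": 100},
                        "tls_context": {"sni": "reviews.default"}}}
live = {"cluster": {"name": "reviews",
                    "circuit_breakers": {"max_connections": 10}}}

for path, want, got in diff_config(expected, live):
    print(f"{path}: expected {want!r}, live {got!r}")
```

Reporting a dotted path for each mismatch is what made the output usable for us: instead of scrolling through a huge JSON dump, you get a short list of the fields that actually diverged.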

In 2019, production operators had to manage the full multi-component deployment of the control plane, a complex task that required careful planning and execution. Istio 1.5 later simplified this by consolidating the control plane into a single binary, istiod.

Istio's VirtualService and DestinationRule resources configure the Envoy proxies for traffic management. A VirtualService defines routing rules for a service, such as sending a percentage of traffic to a specific subset, while a DestinationRule defines the subsets themselves and the load balancing policy.
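As a rough illustration of how the two resources fit together, the Python sketch below builds a DestinationRule with two subsets and a VirtualService that splits traffic between them, then prints both as JSON. The host name `reviews`, the `version` labels, the subset names, and the helper `canary_manifests` are all assumptions made for the example, and the `apiVersion` strings should be checked against the Istio release you run.

```python
import json


def canary_manifests(host: str, stable_weight: int, canary_weight: int) -> list:
    """Build a DestinationRule and a VirtualService that split traffic across two subsets."""
    if stable_weight + canary_weight != 100:
        raise ValueError("route weights must sum to 100")
    destination_rule = {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "DestinationRule",
        "metadata": {"name": host},
        "spec": {
            "host": host,
            "trafficPolicy": {"loadBalancer": {"simple": "ROUND_ROBIN"}},
            # Subsets select pods by label; the labels here are hypothetical.
            "subsets": [
                {"name": "v1", "labels": {"version": "v1"}},
                {"name": "v2", "labels": {"version": "v2"}},
            ],
        },
    }
    virtual_service = {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [{
                "route": [
                    {"destination": {"host": host, "subset": "v1"}, "weight": stable_weight},
                    {"destination": {"host": host, "subset": "v2"}, "weight": canary_weight},
                ],
            }],
        },
    }
    return [destination_rule, virtual_service]


manifests = canary_manifests("reviews", stable_weight=90, canary_weight=10)
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Generating the manifests from code rather than hand-editing YAML is a design choice, not a requirement. It makes it easy to validate that the weights sum to 100 before anything reaches the cluster.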

The combination of VirtualService and DestinationRule enables canary deployments, A/B testing, and fault injection without any changes to the application code, which makes it easier to test and deploy new versions of a service.
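Fault injection is the least obvious of the three, so here is a minimal Python sketch of a VirtualService that delays a fraction of requests. The helper `delay_fault_virtual_service`, the host `ratings`, the subset `v1`, and the chosen percentage and delay are hypothetical. The subset is assumed to be defined in a matching DestinationRule.

```python
import json


def delay_fault_virtual_service(host: str, subset: str, percent: float, fixed_delay: str) -> dict:
    """Build a VirtualService that injects a fixed delay into a percentage of HTTP requests."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-delay-fault"},
        "spec": {
            "hosts": [host],
            "http": [{
                # Envoy holds the selected requests for fixed_delay before forwarding them.
                "fault": {
                    "delay": {
                        "percentage": {"value": percent},
                        "fixedDelay": fixed_delay,
                    },
                },
                "route": [{"destination": {"host": host, "subset": subset}}],
            }],
        },
    }


print(json.dumps(delay_fault_virtual_service("ratings", "v1", 10.0, "5s"), indent=2))
```

Because the delay is applied by the sidecar, the service under test needs no special build or feature flag, which is the point of doing this in the mesh rather than in application code.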

Istio's PeerAuthentication resource controls the mTLS mode for workloads. Three modes can be set explicitly: PERMISSIVE accepts both mTLS and plaintext traffic, STRICT requires mTLS, and DISABLE turns mTLS off so that only plaintext traffic is accepted.
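A namespace-wide policy is small enough to show in full. The Python sketch below builds one as a dict and prints it as JSON. The helper name `peer_authentication`, the namespace `payments`, and the resource name `default` are assumptions made for the example, and the `apiVersion` should be checked against your Istio release.

```python
import json

VALID_MODES = ("PERMISSIVE", "STRICT", "DISABLE")


def peer_authentication(namespace: str, mode: str, name: str = "default") -> dict:
    """Build a PeerAuthentication that sets the mTLS mode for a whole namespace."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}")
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": name, "namespace": namespace},
        # No selector is set, so the policy applies to every workload in the namespace.
        "spec": {"mtls": {"mode": mode}},
    }


print(json.dumps(peer_authentication("payments", "STRICT"), indent=2))
```

Switching a namespace between modes is then a one-word change, which keeps the rollout easy to review and easy to revert.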

The standard adoption path for mTLS is to start in PERMISSIVE mode, observe the traffic using Kiali's traffic graph, and then switch to STRICT mode to block plaintext traffic. AuthorizationPolicy resources then define which services can communicate with each other.
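To show what such a policy looks like, the Python sketch below builds an AuthorizationPolicy that allows one caller, identified by its service account, to reach one workload. The helper `allow_from_service_account`, the `app` label, the workload and account names, and the default `cluster.local` trust domain are assumptions for the example. Matching on principals relies on mTLS being in effect between the two workloads.

```python
import json


def allow_from_service_account(namespace: str, workload_app: str,
                               caller_namespace: str, caller_service_account: str,
                               trust_domain: str = "cluster.local") -> dict:
    """Build an AuthorizationPolicy allowing a single service account to call a workload."""
    principal = f"{trust_domain}/ns/{caller_namespace}/sa/{caller_service_account}"
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "AuthorizationPolicy",
        "metadata": {
            "name": f"allow-{caller_service_account}-to-{workload_app}",
            "namespace": namespace,
        },
        "spec": {
            # The policy applies only to pods carrying this (hypothetical) app label.
            "selector": {"matchLabels": {"app": workload_app}},
            "action": "ALLOW",
            "rules": [{"from": [{"source": {"principals": [principal]}}]}],
        },
    }


policy = allow_from_service_account("payments", "ledger", "checkout", "checkout-sa")
print(json.dumps(policy, indent=2))
```

Once an ALLOW policy selects a workload, requests that match none of its rules are denied, so it is worth confirming the full set of legitimate callers in the traffic graph before applying one.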