I've seen a lot of fuss about Kubernetes 1.24, which was released in May 2022 and finally removed dockershim, a compatibility layer that had been deprecated since Kubernetes 1.20. The removal caused significant concern in the community, but it turned out to be a non-event for most teams.
Dockershim was a translation layer built into the kubelet that allowed Docker Engine to be used as the container runtime. It existed because Docker predates the Container Runtime Interface (CRI) and never implemented it, so the kubelet needed something to translate its CRI calls into Docker API calls. Once containerd, the runtime Docker itself is built on, could speak CRI directly, this layer was no longer needed.
For example, in one of the clusters I managed, we had around 500 nodes running Docker as the container runtime, and we were able to migrate to containerd with minimal downtime, around 30 minutes per node, using tools like kubeadm and kubectl. The key was to automate the process as much as possible, using scripts and CI/CD pipelines, to minimize manual errors and reduce the overall migration time.
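The per-node procedure can be sketched roughly as follows. This is a minimal sketch, not the scripts we actually ran: the helper name `node_migration_steps` and the node name `worker-01` are hypothetical, and the runtime swap itself (installing containerd, pointing the kubelet at its socket, restarting the kubelet) is distribution-specific and left as a placeholder comment. The function only builds the command list, so it can be reviewed or handed to a pipeline before anything is executed.

```python
import shlex


def node_migration_steps(node: str) -> list[list[str]]:
    """Return the kubectl commands wrapped around a runtime swap on one node.

    Hypothetical helper for illustration. The swap itself happens on the node
    and is not shown here.
    """
    return [
        # Stop new pods from being scheduled onto the node.
        ["kubectl", "cordon", node],
        # Evict running pods; DaemonSet pods are skipped, emptyDir data is dropped.
        ["kubectl", "drain", node, "--ignore-daemonsets", "--delete-emptydir-data"],
        # ... swap Docker for containerd on the node here (distribution-specific) ...
        # Allow scheduling again once the kubelet reports Ready.
        ["kubectl", "uncordon", node],
    ]


# Dry run: print the commands instead of executing them.
for cmd in node_migration_steps("worker-01"):
    print(shlex.join(cmd))
```

Keeping the steps as data rather than running them directly makes it easy to add a dry-run mode, which is worth having before touching hundreds of nodes.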
The announcement of dockershim removal sparked panic among teams that thought their containerized applications would stop running, but this came from confusing the container runtime with the container image format. Images built with Docker are OCI-compliant, which means they run on containerd, CRI-O, or any other OCI-compliant runtime without modification, so no rebuilding was required. As a bonus, we saw node resource utilization drop after migrating to containerd: roughly 10% less CPU usage and 15% less memory usage.
The actual changes needed for dockershim removal came down to two things: switching the container runtime on Kubernetes nodes from Docker to containerd or CRI-O, and updating any tooling that relied on the Docker socket on those nodes, such as monitoring agents, log collectors, or some CI/CD integrations. For managed Kubernetes services like EKS, AKS, and GKE, the runtime switch was handled by the cloud provider. In our case, the logging and monitoring stack built around Fluentd and Prometheus had to be reconfigured to work with containerd instead of the Docker socket, which required some additional configuration and testing.
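One way to find the tooling that depends on the Docker socket is to scan pod specs for `hostPath` volumes that point at it. The following is a minimal sketch under a few assumptions: the function name `pods_mounting_docker_socket` is hypothetical, the input is the parsed output of `kubectl get pods -A -o json`, and the sample data is made up. It will miss pods that mount a parent directory such as `/var/run`, and anything that reaches the socket outside of Kubernetes.

```python
# Common locations of the Docker socket on a node.
DOCKER_SOCKET_PATHS = {"/var/run/docker.sock", "/run/docker.sock"}


def pods_mounting_docker_socket(pod_list: dict) -> list[str]:
    """Return "namespace/name" for pods with a hostPath volume on the Docker socket.

    `pod_list` is the parsed JSON from `kubectl get pods -A -o json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        for volume in pod.get("spec", {}).get("volumes", []):
            host_path = volume.get("hostPath") or {}
            if host_path.get("path") in DOCKER_SOCKET_PATHS:
                meta = pod["metadata"]
                hits.append(f'{meta["namespace"]}/{meta["name"]}')
                break  # one hit per pod is enough
    return hits


# Made-up sample standing in for real kubectl output.
sample = {
    "items": [
        {
            "metadata": {"namespace": "logging", "name": "log-agent-abc12"},
            "spec": {
                "volumes": [
                    {"name": "dockersock", "hostPath": {"path": "/var/run/docker.sock"}}
                ]
            },
        },
        {
            "metadata": {"namespace": "default", "name": "web-5d9f"},
            "spec": {"volumes": [{"name": "cache", "emptyDir": {}}]},
        },
    ]
}

print(pods_mounting_docker_socket(sample))  # ['logging/log-agent-abc12']
```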
One of the trade-offs we had to consider during the migration was the choice between containerd and CRI-O as the new container runtime. We chose containerd because it is the same runtime Docker uses internally, which made the migration smoother, and because in our environment it worked better with the container networking and storage features we rely on. CRI-O, on the other hand, is a leaner runtime built specifically for Kubernetes, which could make it a better choice for teams with specific requirements, like high-density node deployments.
For self-managed clusters, the migration required a node upgrade process, but the practical effect on application workloads was zero. The default container runtime in EKS, AKS, and GKE is now containerd, which is the same code that Docker uses internally. In terms of tooling, wherever our CI/CD pipelines and automation scripts ran Docker CLI commands on nodes, we had to switch them to `crictl`, which required some changes to our deployment workflows. Note that `crictl` is a generic CRI client that works with any CRI-compatible runtime; it is not containerd's own CLI, which is `ctr`.
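To see which nodes still need the upgrade, the runtime each node reports can be read from `status.nodeInfo.containerRuntimeVersion`, which holds values such as `docker://20.10.12` or `containerd://1.6.4`. Here is a small sketch that groups nodes by runtime. The function name `nodes_by_runtime` and the sample nodes are hypothetical; the input is assumed to be the parsed output of `kubectl get nodes -o json`.

```python
from collections import defaultdict


def nodes_by_runtime(node_list: dict) -> dict[str, list[str]]:
    """Group node names by the runtime reported in status.nodeInfo.

    `node_list` is the parsed JSON from `kubectl get nodes -o json`.
    """
    groups = defaultdict(list)
    for node in node_list.get("items", []):
        version = node["status"]["nodeInfo"]["containerRuntimeVersion"]
        # "containerd://1.6.4" -> "containerd"
        runtime = version.split("://", 1)[0]
        groups[runtime].append(node["metadata"]["name"])
    return dict(groups)


# Made-up sample standing in for real kubectl output.
sample = {
    "items": [
        {
            "metadata": {"name": "node-a"},
            "status": {"nodeInfo": {"containerRuntimeVersion": "docker://20.10.12"}},
        },
        {
            "metadata": {"name": "node-b"},
            "status": {"nodeInfo": {"containerRuntimeVersion": "containerd://1.6.4"}},
        },
    ]
}

print(nodes_by_runtime(sample))  # {'docker': ['node-a'], 'containerd': ['node-b']}
```

For a quick manual check, `kubectl get nodes -o wide` shows the same information in its CONTAINER-RUNTIME column.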
The operational effect of the migration was that node-level troubleshooting now uses `crictl` instead of `docker` commands, but this is a minor adjustment that most platform teams can absorb with minimal disruption. I've seen teams make this change with barely any issues. `crictl` even has an advantage for node-level debugging: because it speaks CRI, it understands pods as well as containers, so `crictl pods` lists the pod sandboxes on a node directly, which the Docker CLI never could.
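For simple troubleshooting commands the switch is often just a change of binary name, since several subcommands exist under the same name in both CLIs. The sketch below illustrates that with a deliberately small, non-exhaustive allow-list. The helper `to_crictl` is hypothetical, and it passes flags through unchanged, which is an assumption that only holds for simple invocations; check `crictl <subcommand> --help` before relying on a particular flag.

```python
import shlex

# Subcommands that exist under the same name in both CLIs.
# Not exhaustive. `docker build` is absent on purpose: crictl cannot build images.
SAME_NAME = {"ps", "images", "logs", "exec", "inspect", "pull", "rmi", "stats"}


def to_crictl(docker_cmd: str) -> str:
    """Rewrite a simple `docker ...` command line as a `crictl ...` one."""
    parts = shlex.split(docker_cmd)
    if len(parts) < 2 or parts[0] != "docker":
        raise ValueError("expected a command starting with 'docker'")
    if parts[1] not in SAME_NAME:
        raise ValueError(f"no direct crictl equivalent listed for 'docker {parts[1]}'")
    return shlex.join(["crictl", *parts[1:]])


print(to_crictl("docker ps -a"))  # crictl ps -a
```

Anything outside the allow-list raises an error rather than guessing, which is the safer behavior when converting existing automation scripts.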
The removal of dockershim is a good example of how the Kubernetes community can make significant changes without causing major disruptions to applications. It's a testament to the flexibility and compatibility of the Kubernetes ecosystem. We've also seen some improvements in the overall stability and performance of our platform since the migration, such as fewer node crashes and improved pod scheduling, which we attribute to the move to containerd.
In the end, the dockershim removal was a non-event for most teams, and the Kubernetes community can now focus on more important things, like improving the overall performance and security of the platform.