Kubernetes grew out of Google's experience running Borg, its internal cluster management system. Google open-sourced Kubernetes in 2014, and it has since become the de facto standard for running containerized workloads at scale. If you're running containers in production across more than a few machines, you will likely want an orchestrator, and Kubernetes is the usual choice.
Kubernetes automates deployment, scaling, and management of containerized applications. You describe the desired state, and Kubernetes works to reach that state and keep it that way. If a container crashes, Kubernetes restarts it. If traffic spikes and autoscaling is configured, it adds more replicas. This means you no longer manage individual containers and servers by hand.
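The desired-state idea can be illustrated with a toy reconciliation step. This is a conceptual sketch in shell, not how Kubernetes controllers are actually implemented; the `reconcile` function and its output strings are hypothetical.

```shell
# Toy reconciliation step (hypothetical, for illustration only).
# Compares a desired replica count with the observed count and prints
# the corrective action a controller would take.
reconcile() {
  # $1 = desired replicas, $2 = replicas currently running
  if [ "$2" -lt "$1" ]; then
    echo "start $(( $1 - $2 ))"
  elif [ "$2" -gt "$1" ]; then
    echo "stop $(( $2 - $1 ))"
  else
    echo "in sync"
  fi
}

reconcile 3 1   # a crash left one replica running; prints "start 2"
reconcile 3 3   # nothing to do; prints "in sync"
```

A real controller runs a comparison like this continuously against the state stored in the cluster, which is why a crashed container gets replaced without anyone intervening.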
A Kubernetes cluster has two main parts: the control plane and worker nodes. The control plane is the brain. It runs the API server, scheduler, controller manager, and etcd. The API server is the entry point for all cluster operations. etcd is a distributed key-value store that holds the cluster's state. The scheduler assigns workloads to nodes. The controller manager keeps the actual state matching the desired state.
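Worker nodes are where your containers actually run. Each node runs three components: the kubelet, which takes instructions from the control plane and manages the containers on that node; kube-proxy, which maintains the network rules that route service traffic to pods; and a container runtime such as containerd.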
Some core concepts in Kubernetes include pods, deployments, services, ConfigMaps, Secrets, and namespaces. A pod is the smallest deployable unit. It usually wraps a single container, though you can run multiple tightly coupled containers in one pod. A deployment manages a set of pod replicas. You tell it how many copies of your app to run and what container image to use. It handles rolling updates and rollbacks.
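As a minimal sketch of what a deployment looks like, the snippet below writes a manifest to deployment.yaml. The name `webapp`, the `app: webapp` label, the replica count, the CPU request, and the `nginx:1.27` image are all placeholder assumptions, not values any particular application requires.

```shell
# Write a minimal Deployment manifest (all names and values are assumptions).
cat > deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  replicas: 3                # how many pod copies to keep running
  selector:
    matchLabels:
      app: webapp            # must match the pod template labels below
  template:                  # pod template: what each replica looks like
    metadata:
      labels:
        app: webapp
    spec:
      containers:
        - name: webapp
          image: nginx:1.27  # placeholder image
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: 100m      # assumed request; CPU-based autoscaling relies on one
EOF
```

The `template` section is the pod definition; the deployment's job is to keep `replicas` copies of that pod running and to replace them in a controlled way when the template changes.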
A service provides a stable network endpoint for a set of pods. Pods come and go, but the service keeps a consistent IP and DNS name. Kubernetes handles load balancing across the pods behind it. ConfigMaps and Secrets let you inject configuration and credentials into pods without hardcoding them in container images. Namespaces provide logical partitions within a cluster, which is useful for separating teams or environments.
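The following sketch writes a service and a ConfigMap to service.yaml. It assumes pods labeled `app: webapp` listening on port 80; the names `webapp` and `webapp-config` and the `LOG_LEVEL` key are hypothetical.

```shell
# Write a Service and a ConfigMap manifest (names and values are assumptions).
cat > service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: webapp
spec:
  selector:
    app: webapp        # traffic goes to pods carrying this label
  ports:
    - port: 80         # port the Service listens on
      targetPort: 80   # container port traffic is forwarded to
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: webapp-config
data:
  LOG_LEVEL: "info"    # hypothetical setting
EOF
```

With no `type` set, the service defaults to ClusterIP, which is reachable only from inside the cluster. A pod could load the ConfigMap's keys as environment variables by listing `webapp-config` under `envFrom` with a `configMapRef` in its container spec.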
To deploy an application, you write a YAML manifest describing your deployment and apply it with kubectl apply -f deployment.yaml. Kubernetes creates the pods and keeps them running. A deployment does not expose its pods by itself; for that, you define a service in the same manifest or a separate one. That's the basic flow. For scaling, you can run kubectl scale deployment webapp --replicas=5 or configure a Horizontal Pod Autoscaler (HPA) to scale automatically based on CPU or memory utilization.
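As a sketch of the autoscaling option, the snippet below writes an HPA manifest to hpa.yaml. It assumes a deployment named `webapp`; the replica bounds and the 70% CPU target are arbitrary example values.

```shell
# Write a HorizontalPodAutoscaler manifest (bounds and target are assumptions).
cat > hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: webapp
spec:
  scaleTargetRef:            # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: webapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # percent of requested CPU, averaged across pods
EOF
```

Applying this with kubectl apply -f hpa.yaml only has an effect if the cluster has a metrics source (typically metrics-server) and the target containers declare CPU requests, since utilization is measured relative to the request.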
Kubernetes also supports rolling updates and rollbacks. When you update a deployment to a new image version, Kubernetes rolls it out gradually. Old pods come down as new ones become ready. If the new version is broken, kubectl rollout undo deployment/webapp rolls the deployment back to its previous revision.
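The update-then-roll-back flow could be scripted roughly as follows. The `rollout_or_undo` helper is hypothetical, and the 120-second timeout is an arbitrary choice; the kubectl subcommands it calls (set image, rollout status, rollout undo) are standard.

```shell
# Hypothetical helper: update a deployment's image, wait for the rollout,
# and roll back if it does not complete in time.
# Usage: rollout_or_undo <deployment> <container> <image>
rollout_or_undo() {
  kubectl set image "deployment/$1" "$2=$3" || return 1
  if kubectl rollout status "deployment/$1" --timeout=120s; then
    echo "rollout of $3 succeeded"
  else
    echo "rollout of $3 failed, rolling back" >&2
    kubectl rollout undo "deployment/$1"
    return 1
  fi
}

# Example (requires a cluster and an existing deployment):
#   rollout_or_undo webapp webapp nginx:1.27
```

The helper returns a non-zero status whenever the new version did not go live, so a CI job calling it would fail visibly instead of leaving a half-finished rollout.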
Some essential kubectl commands include kubectl apply -f file.yaml to apply a configuration to the cluster; kubectl get pods to list the pods in the current namespace; kubectl describe pod <name> for detailed information about a pod; kubectl scale deployment <name> --replicas=N to scale a deployment; kubectl rollout status deployment/<name> to check rollout progress; kubectl rollout undo deployment/<name> to roll back to the previous revision; and kubectl logs <pod> to view a pod's logs.
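On Azure, AKS (Azure Kubernetes Service) manages the control plane for you. On the Free tier you pay only for the worker nodes and the resources they use, not the control plane; the paid tiers add a per-cluster charge in exchange for an uptime SLA. For most workloads this costs less than running and maintaining your own control plane nodes.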
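For reference, creating a small AKS cluster with the Azure CLI might look like the sketch below. The `create_aks_cluster` helper, the node count of 2, and the argument values in the example are assumptions; the block only defines the function and does not create anything until it is called.

```shell
# Hypothetical helper: create a resource group and a small AKS cluster,
# then merge the cluster's credentials into the local kubeconfig.
# Usage: create_aks_cluster <resource-group> <cluster-name> <location>
create_aks_cluster() {
  az group create --name "$1" --location "$3" &&
  az aks create --resource-group "$1" --name "$2" \
    --node-count 2 --generate-ssh-keys &&
  az aks get-credentials --resource-group "$1" --name "$2"
}

# Example (creates billable Azure resources):
#   create_aks_cluster demo-rg demo-aks westeurope
```

After get-credentials succeeds, kubectl commands run against the new cluster.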