Serverless computing has been 'the future' since AWS Lambda launched in 2014. Nearly a decade later, the adoption pattern is clear: serverless solves specific problems very well and others poorly. The key is knowing which is which.

Where serverless excels

Event-driven processing is serverless's home territory. Processing images after upload, sending emails triggered by user actions, running batch jobs on a schedule, and handling webhook payloads are all cases where the serverless execution model (pay per invocation, scale to zero) aligns with the workload shape. You pay for the compute you use, not for idle capacity. For irregular or unpredictable workloads, the cost savings over always-on containers are significant.
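A minimal sketch of what this looks like in practice, assuming an AWS Lambda handler triggered by S3 upload notifications; `process_image` is a hypothetical placeholder for the real work (fetching and resizing the object):

```python
import json

def process_image(bucket: str, key: str) -> str:
    # Hypothetical placeholder: a real handler would download the
    # object from S3 and run the actual image transformation here.
    return f"processed s3://{bucket}/{key}"

def handler(event, context):
    # The event shape follows the S3 notification format: each upload
    # arrives as a record carrying the bucket name and object key.
    results = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        results.append(process_image(bucket, key))
    return {"statusCode": 200, "body": json.dumps(results)}
```

The function runs only when an upload occurs and the platform scales it to zero in between, which is exactly the pay-per-invocation alignment described above.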

The cold start problem

Cold starts remain the most cited limitation of serverless. When a function has not been invoked recently, the runtime needs to initialise a new execution environment. For AWS Lambda running .NET or Java, cold starts can reach 1-3 seconds. For latency-sensitive user-facing APIs, this is unacceptable. Common mitigations include provisioned concurrency (keeping warm instances ready), lighter runtimes (Node.js and Python start faster than .NET or Java), and native AOT compilation for .NET Lambda, which can reduce cold starts to under 100ms.
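Beyond platform-level mitigations, code structure matters too: work placed at module scope runs once per execution environment during the cold start, while warm invocations reuse it. A minimal sketch, with `expensive_init` standing in for real setup such as SDK client construction or config loading:

```python
import time

def expensive_init():
    # Stand-in for expensive setup (SDK clients, connection pools,
    # parsing configuration). In a real function this might take
    # hundreds of milliseconds.
    time.sleep(0.05)
    return {"client": "ready"}

# Module scope: executed once per execution environment, i.e. paid
# during the cold start only.
CLIENT = expensive_init()

def handler(event, context):
    # Warm invocations reuse CLIENT and skip the init cost entirely.
    return {"initialized": CLIENT["client"], "echo": event}
```

Doing this initialisation inside the handler instead would re-pay the cost on every invocation, warm or cold.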

Observability challenges

Serverless functions are ephemeral and distributed. A single user request might invoke dozens of functions across a call chain. Correlating logs, traces, and metrics across Lambda invocations requires deliberate instrumentation. OpenTelemetry with AWS X-Ray or third-party providers like Datadog is the current standard. Teams that skip distributed tracing in serverless architectures spend disproportionate time debugging production issues that are trivial to diagnose with traces.
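The core idea behind that instrumentation, stripped of any tracing library, is propagating a correlation identifier through the call chain so log lines from different invocations can be joined. A stdlib-only sketch of the pattern (OpenTelemetry automates this with trace context; the field name `correlation_id` here is an illustrative assumption):

```python
import json
import logging
import uuid

logger = logging.getLogger("app")

def handler(event, context):
    # Reuse an upstream correlation id if one was passed in, otherwise
    # mint a new one at the edge of the call chain.
    corr_id = event.get("correlation_id") or str(uuid.uuid4())

    # Attach the id to every structured log line so the log backend
    # can stitch together one user request across many invocations.
    logger.info(json.dumps({"correlation_id": corr_id, "msg": "start"}))

    # Forward the id when invoking the next function downstream.
    return {"correlation_id": corr_id, "payload": event.get("payload")}
```

Real tracing systems carry this context in standard headers and record timing spans as well, but the discipline is the same: the identifier must be threaded through every hop deliberately.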

The right decision model

The question is not serverless versus containers; it is which workloads benefit from each model. Event-driven processing and APIs with irregular traffic: serverless. Long-running processes, latency-sensitive APIs with consistent traffic, and ML inference: containers. Most production architectures use both. The engineering mistake is forcing one model on all workloads for the sake of consistency rather than matching the infrastructure model to the workload characteristics.
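The rule of thumb above can be encoded as a toy decision helper; the inputs and branching are illustrative assumptions, not a formal model:

```python
def pick_model(event_driven: bool, traffic_is_steady: bool,
               long_running: bool, latency_sensitive: bool) -> str:
    """Sketch of the workload-matching heuristic described above."""
    # Long-running processes and latency-sensitive APIs with steady
    # traffic map to containers.
    if long_running or (latency_sensitive and traffic_is_steady):
        return "containers"
    # Event-driven work and irregular traffic map to serverless.
    if event_driven or not traffic_is_steady:
        return "serverless"
    # Steady, conventional workloads default to containers.
    return "containers"
```

The point is not the function itself but that the decision has explicit inputs: teams that can name these characteristics per workload avoid the one-model-for-everything mistake.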