Performance testing is often deferred until after launch or skipped entirely, and then happens only reactively, in response to incidents. A proactive performance testing practice catches scalability issues before they become visible to users.

Load testing vs stress testing

Load testing measures application performance at expected production load: throughput, latency distribution, resource utilisation. It validates that performance targets are met and establishes a baseline. Stress testing increases load beyond expected levels to find the breaking point: at what load does the system start degrading? What degrades first? Stress tests reveal the capacity ceiling and the failure mode at saturation. Both types should be part of the pre-production test suite.
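As a rough illustration of how a stress test locates the breaking point, the sketch below scans the results of a stepped run and reports the lowest load level that breaches a latency or error-rate target. The `StepResult` shape, the function name, the targets, and all numbers are assumptions made for this example, not the output format of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StepResult:
    """Measurements from one step of a stepped load run (hypothetical shape)."""
    requests_per_second: int
    p95_latency_ms: float
    error_rate: float  # fraction of failed requests, 0.0 to 1.0


def first_degraded_step(
    steps: Sequence[StepResult],
    max_p95_ms: float,
    max_error_rate: float,
) -> Optional[StepResult]:
    """Return the lowest-load step that breaches either target, or None."""
    for step in sorted(steps, key=lambda s: s.requests_per_second):
        if step.p95_latency_ms > max_p95_ms or step.error_rate > max_error_rate:
            return step
    return None


# Illustrative numbers only, not measurements from a real system.
results = [
    StepResult(100, 120.0, 0.000),
    StepResult(200, 135.0, 0.001),
    StepResult(400, 180.0, 0.002),
    StepResult(800, 950.0, 0.030),
]

# With these targets, the 800 requests/second step is the first to degrade.
breaking_point = first_degraded_step(results, max_p95_ms=300.0, max_error_rate=0.01)
```

Comparing which target was breached first (latency or errors) at the returned step gives a first hint about the failure mode at saturation.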

k6 and Gatling for load generation

k6 (Grafana Labs) and Gatling are two widely used open-source load testing tools for HTTP workloads. k6 test scripts are written in JavaScript; Gatling simulations are written in Scala, with Java and Kotlin DSLs also available. Both support realistic user journeys (not just single-endpoint hammering), configurable ramp-up and sustained load, detailed latency statistics (percentiles, not just averages), and integration with CI/CD pipelines for regression detection. Their output typically includes latency distributions, throughput over time, and error rates broken down by request or endpoint.
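To see why percentiles matter more than averages, here is a minimal sketch in plain Python, independent of k6 or Gatling. The `percentile` helper uses the nearest-rank method, which is one of several common definitions, so a load testing tool may report slightly different values for the same data. The latency samples are invented for the example.

```python
import math
from statistics import mean
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Illustrative data: 90 fast requests and 10 slow ones.
latencies_ms = [50.0] * 90 + [2000.0] * 10

average = mean(latencies_ms)          # 245.0, a latency no request actually had
p50 = percentile(latencies_ms, 50)    # 50.0, the typical request
p95 = percentile(latencies_ms, 95)    # 2000.0, the slow tail
```

The average of 245 ms suggests a uniformly mediocre service, while the percentiles show what actually happened: most requests were fast and one in ten was very slow.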

Performance regression detection in CI

Running a load test in CI and failing the build when latency exceeds a threshold catches performance regressions before they reach production. The practical challenge is duration: a load test takes minutes to execute, which is expensive in CI. The solution is to run abbreviated load tests (1-2 minutes of sustained load) in CI for regression detection, and to run full-duration load tests (30+ minutes) nightly or as part of release preparation. The abbreviated test catches obvious regressions; the full test validates production readiness.
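The build-failing step can be sketched as a small gate that compares measured latencies against thresholds and produces an exit code. Both k6 and Gatling have their own built-in mechanisms for this (thresholds and assertions respectively); the version below is a tool-agnostic illustration, and the metric names, limits, and measured values are all hypothetical.

```python
import sys
from typing import List, Mapping


def find_regressions(
    measured_ms: Mapping[str, float],
    thresholds_ms: Mapping[str, float],
) -> List[str]:
    """Return one message per metric that is missing or above its limit."""
    failures = []
    for metric, limit in thresholds_ms.items():
        value = measured_ms.get(metric)
        if value is None:
            # A missing metric is treated as a failure so that a broken
            # test run cannot pass silently.
            failures.append(f"{metric}: no measurement reported")
        elif value > limit:
            failures.append(f"{metric}: {value:.0f} ms exceeds limit of {limit:.0f} ms")
    return failures


def gate_exit_code(
    measured_ms: Mapping[str, float],
    thresholds_ms: Mapping[str, float],
) -> int:
    """Print any failures to stderr and return 1 if there are some, else 0."""
    failures = find_regressions(measured_ms, thresholds_ms)
    for message in failures:
        print(message, file=sys.stderr)
    return 1 if failures else 0


# Hypothetical thresholds and results from an abbreviated CI run.
thresholds = {"checkout_p95": 400.0, "search_p95": 250.0}
measured = {"checkout_p95": 520.0, "search_p95": 180.0}

exit_code = gate_exit_code(measured, thresholds)  # 1, because checkout_p95 regressed
```

In a pipeline, a wrapper script would pass this value to `sys.exit` so that the CI system marks the build as failed.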

The realistic workload question

Load tests against a single endpoint at constant throughput are not realistic. Production workloads mix many different operations at varying rates with different data profiles. A realistic load test models the mix of operations based on production traffic analysis, uses realistic data (not 'test user 1' making the same request in a loop), and simulates realistic think time between operations. Such workload models surface different bottlenecks than synthetic single-endpoint tests.
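A workload model along these lines can be sketched as a generator that picks each operation according to a weighted mix, varies the data per request, and inserts a randomized think time. The operation names, weights, ID range, and think-time bounds below are placeholders; in practice they would come from analysis of production traffic.

```python
import random
from typing import Iterator, Tuple

# Hypothetical operation mix (relative weights), standing in for
# proportions derived from production traffic analysis.
OPERATION_WEIGHTS = {
    "browse_catalog": 70,
    "search": 20,
    "add_to_cart": 8,
    "checkout": 2,
}


def user_session(
    rng: random.Random,
    steps: int,
    think_time_range_s: Tuple[float, float] = (1.0, 5.0),
    max_item_id: int = 100_000,
) -> Iterator[Tuple[str, int, float]]:
    """Yield (operation, item_id, think_time_s) for one simulated user."""
    operations = list(OPERATION_WEIGHTS)
    weights = list(OPERATION_WEIGHTS.values())
    for _ in range(steps):
        # Weighted choice reproduces the traffic mix over many steps.
        operation = rng.choices(operations, weights=weights, k=1)[0]
        # Vary the data so that requests do not all hit the same record.
        item_id = rng.randint(1, max_item_id)
        # Pause between operations, as a real user would.
        think_time_s = rng.uniform(*think_time_range_s)
        yield operation, item_id, think_time_s


# A seeded generator makes the plan reproducible between test runs.
plan = list(user_session(random.Random(42), steps=50))
```

A load generator would iterate over such a plan, issue the request for each operation, and sleep for the think time before the next one.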