API design is riddled with conflicting advice and inconsistent implementations. Because the guidance is so abundant and so contradictory, a few key design decisions deserve careful thought: they largely determine how well an API ages.
Inconsistent error responses are a frequent problem: different JSON shapes for different error types, HTTP status codes that don't reflect the error's semantics, and error messages that leak internal details. I recommend adopting RFC 7807 (Problem Details for HTTP APIs, since superseded by RFC 9457, which keeps the same core fields) as the standard for error responses.
A Problem Details response, as specified in RFC 7807, is served as application/problem+json and includes type (a URI identifying the error type), title (a short human-readable summary), status (the HTTP status code), detail (an explanation specific to this occurrence), and instance (a URI identifying this specific occurrence).
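As a rough illustration of those five fields, here is a minimal sketch of a helper that builds a Problem Details body. The function name `problem_details`, the example URIs, and the `balance` extension member are assumptions made for this example, not part of any particular framework.

```python
import json
from typing import Any, Optional

# Media type registered for Problem Details responses.
PROBLEM_JSON = "application/problem+json"


def problem_details(
    type_uri: str,
    title: str,
    status: int,
    detail: Optional[str] = None,
    instance: Optional[str] = None,
    **extensions: Any,
) -> dict:
    """Build a Problem Details body as a plain dict (hypothetical helper)."""
    body: dict = {"type": type_uri, "title": title, "status": status}
    if detail is not None:
        body["detail"] = detail
    if instance is not None:
        body["instance"] = instance
    # Extension members are allowed, but must not overwrite the standard ones.
    for key, value in extensions.items():
        body.setdefault(key, value)
    return body


example = problem_details(
    type_uri="https://example.com/probs/out-of-credit",
    title="You do not have enough credit.",
    status=403,
    detail="Your current balance is 30, but that costs 50.",
    instance="/account/12345/msgs/abc",
    balance=30,
)
print(json.dumps(example, indent=2))
```

Every error path in the API can then go through the same helper, which is what keeps the JSON shape consistent across error types.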
In my experience, the hardest part of implementing RFC 7807 is deciding what goes into the detail field. Should we expose internal system details or provide a generic explanation? For example, if a user's credit card fails to charge, it is tempting to pass along the exact error message from the payment gateway, but raw upstream messages can reveal how the system works internally and make it easier for malicious actors to probe it.
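One possible compromise, sketched below, is to log the raw gateway message server-side and return only a vetted public message, tied together by an incident identifier. The gateway codes, messages, logger name, and URIs here are all hypothetical and only illustrate the idea.

```python
import logging
import uuid

logger = logging.getLogger("payments")

# Hypothetical gateway codes mapped to messages that are safe to show clients.
PUBLIC_DETAILS = {
    "card_declined": "The card was declined. Try a different payment method.",
    "expired_card": "The card has expired.",
}
GENERIC_DETAIL = "The payment could not be processed."


def public_problem(gateway_code: str, gateway_message: str) -> dict:
    """Return a client-safe problem body; keep the raw message in the logs."""
    incident_id = str(uuid.uuid4())
    # The raw gateway message stays server-side, keyed by the incident id.
    logger.warning(
        "payment failed incident=%s code=%s message=%s",
        incident_id, gateway_code, gateway_message,
    )
    return {
        "type": "https://api.example.com/problems/payment-failed",
        "title": "Payment failed",
        "status": 402,
        # Unknown codes fall back to a generic explanation.
        "detail": PUBLIC_DETAILS.get(gateway_code, GENERIC_DETAIL),
        "instance": f"urn:uuid:{incident_id}",
    }


known = public_problem("expired_card", "card exp 01/20 rejected by issuer")
unknown = public_problem("risk_rule_17", "blocked by velocity rule 17")
```

Support staff can look up the incident id from the instance field in the logs, so the client-facing message can stay generic without losing diagnostic information.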
Offset-based pagination breaks down when the list changes between requests. An item inserted before the current offset shifts everything down by one, so the next page repeats an item the client has already seen; an item deleted before the offset shifts everything up, so the next page skips one. A user paginating through a live list will see inconsistent results. I recall a case where a customer's application hit this issue and had to be modified to handle the changing list. The cost of fixing and re-testing the application was substantial.
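The drift is easy to reproduce with a plain list. The sketch below uses a hypothetical `page_by_offset` helper and a newest-first feed of single-letter items.

```python
def page_by_offset(items: list, offset: int, limit: int) -> list:
    """Return one page using plain offset/limit slicing."""
    return items[offset:offset + limit]


# Case 1: an insert before the offset causes a repeat.
feed = ["e", "d", "c", "b", "a"]           # newest first
page1 = page_by_offset(feed, 0, 2)         # ['e', 'd']
feed.insert(0, "f")                        # a new item arrives at the head
page2 = page_by_offset(feed, 2, 2)         # ['d', 'c']: 'd' shows up again

# Case 2: a delete before the offset causes a skip.
feed2 = ["e", "d", "c", "b", "a"]
first = page_by_offset(feed2, 0, 2)        # ['e', 'd']
feed2.remove("e")                          # an already-seen item is deleted
second = page_by_offset(feed2, 2, 2)       # ['b', 'a']: 'c' is never returned

print(page1, page2)
print(first, second)
```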
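In contrast, cursor-based pagination is position-stable: the cursor encodes a position in the sorted list (for example, the sort key of the last item returned) rather than a count of items to skip. GitHub's GraphQL API, Stripe's list endpoints, and Relay's Cursor Connections Specification all use cursor-based pagination. Because the next page is defined relative to an item rather than an index, inserts and deletes elsewhere in the list do not cause repeats or skips.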
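A minimal sketch of the idea, assuming rows sorted by a unique `id` in descending order and an opaque base64 cursor. The names `encode_cursor`, `decode_cursor`, and `page_by_cursor` are illustrative; a real implementation would push the `id < last_id` filter into the database query.

```python
import base64
import json
from typing import List, Optional, Tuple


def encode_cursor(last_id: int) -> str:
    """Pack the last seen sort key into an opaque, URL-safe string."""
    raw = json.dumps({"last_id": last_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor.encode()))["last_id"]


def page_by_cursor(
    rows: List[dict], limit: int, cursor: Optional[str] = None
) -> Tuple[List[dict], Optional[str]]:
    """Return (page, next_cursor); rows must be sorted by id descending."""
    if cursor is not None:
        last_id = decode_cursor(cursor)
        # Resume strictly after the last item the client has seen.
        rows = [r for r in rows if r["id"] < last_id]
    page = rows[:limit]
    has_more = len(rows) > limit
    next_cursor = encode_cursor(page[-1]["id"]) if has_more else None
    return page, next_cursor


rows = [{"id": i} for i in (5, 4, 3, 2, 1)]
page1, cursor1 = page_by_cursor(rows, 2)            # ids 5, 4
rows.insert(0, {"id": 6})                           # new item at the head
page2, cursor2 = page_by_cursor(rows, 2, cursor1)   # ids 3, 2: no repeat
page3, cursor3 = page_by_cursor(rows, 2, cursor2)   # id 1, no further cursor
```

Keeping the cursor opaque leaves room to change what it encodes later (for example, a compound sort key) without breaking clients.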
While HATEOAS (Hypermedia as the Engine of Application State) promises to enable clients to discover available operations from API responses, it often adds unnecessary complexity and payload overhead. In practice, most API consumers still hardcode URLs.
A good rule of thumb is to use HATEOAS when the client is actually expected to navigate the link graph, for example a generic client that discovers related resources and available actions by following links rather than constructing URLs itself. For most CRUD operations, a simple self-link is sufficient.
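To make the trade-off concrete, here is a small sketch of an order representation that always carries a self-link and adds action links only when the order's state allows them. The `_links` layout loosely follows the HAL convention; the `method` field, the URL scheme, and the `order_representation` helper are assumptions for this example.

```python
def order_representation(order: dict) -> dict:
    """Attach links to an order dict (hypothetical resource layout)."""
    base = f"/orders/{order['id']}"
    # The self-link is cheap and is often all a CRUD client needs.
    links = {"self": {"href": base}}
    # State-dependent links let a link-following client discover actions.
    if order["status"] == "pending":
        links["pay"] = {"href": f"{base}/payment", "method": "POST"}
        links["cancel"] = {"href": f"{base}/cancel", "method": "POST"}
    return {**order, "_links": links}


pending = order_representation({"id": 42, "status": "pending"})
shipped = order_representation({"id": 43, "status": "shipped"})
```

A client that hardcodes URLs ignores everything except the payload, which is why the extra links only pay off when consumers are written to follow them.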
POST operations that create resources are not idempotent by default, which leaves clients uncertain after a network timeout whether the resource was created. An idempotency key, a client-generated unique value (typically a UUID) sent in the Idempotency-Key header, allows the server to recognize a retry and deduplicate it.
If the same idempotency key is seen again within a retention window, the server returns the original response without re-executing the operation. Stripe popularized this pattern, and it's essential for payment and order creation APIs. In my experience, the key to successful idempotency is configuring the retention window correctly: if it's too short, a late retry arrives after the key has expired and the operation runs twice, and if it's too long, you store responses long after any client could still be retrying.
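A good starting point for the retention window is 1-2 hours, depending on the business requirements and the rate of requests. The window must comfortably exceed the longest period over which a client might retry. For payment requests, where a duplicate means a double charge, err on the longer side rather than the shorter: a 30-minute window saves storage but raises the risk of re-executing an operation on a late retry. Stripe, for instance, documents keeping keys for at least 24 hours.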
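The deduplication logic can be sketched as a small in-memory store with a retention window. This is an illustration under simplifying assumptions: the `IdempotencyStore` class and `create_charge` function are hypothetical, the store is neither persistent nor safe under concurrent requests, and a production version would need an atomic check-and-insert in a shared store.

```python
import time
import uuid
from typing import Callable, Dict, Tuple


class IdempotencyStore:
    """Remember responses by idempotency key for a limited time."""

    def __init__(
        self,
        retention_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._retention = retention_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._responses: Dict[str, Tuple[float, dict]] = {}

    def execute(self, key: str, operation: Callable[[], dict]) -> dict:
        now = self._clock()
        # Drop entries that have outlived the retention window.
        self._responses = {
            k: v for k, v in self._responses.items()
            if now - v[0] < self._retention
        }
        if key in self._responses:
            # Retry within the window: replay the original response.
            return self._responses[key][1]
        response = operation()
        self._responses[key] = (now, response)
        return response


charges = []


def create_charge() -> dict:
    """Stand-in for a non-idempotent POST handler."""
    charges.append({"amount": 1000})
    return {"status": 201, "charge_id": len(charges)}


store = IdempotencyStore(retention_seconds=3600)
key = str(uuid.uuid4())                    # generated by the client
first = store.execute(key, create_charge)
retry = store.execute(key, create_charge)  # same key: no second charge
```

Here the retry returns the stored response and `charges` still holds a single entry; once the retention window has passed, the same key would execute the operation again, which is the failure mode a too-short window invites.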