Idempotency keys: designing safe retries for HTTP APIs
How idempotency keys let clients retry network failures without duplicate side effects, with storage patterns, HTTP semantics, and production pitfalls.
A payment API returns 502 Bad Gateway after the gateway timed out talking to the card processor. The client has no response body and no clear id. Did the charge go through? Retrying blindly might double-charge; giving up might leave money on the table. In freelance and consulting work, this class of problem shows up anywhere money, inventory, or external commitments are involved: the network is not reliable, but business rules require at-most-once side effects.
Idempotency keys are the standard pattern for making retries safe: the client sends a stable identifier with each logical operation; the server records outcomes keyed by that identifier and returns the same result when the same key is reused. This article explains why naive retries fail, how to implement keys correctly, and where the approach breaks down.
Why retries are dangerous without a contract
HTTP methods have semantics: GET is defined as safe, and PUT and DELETE as idempotent, but POST carries no such guarantee. Real APIs also wrap multi-step workflows (authorize, capture, emit events) behind a single request, so one call maps to several side effects. A timeout can occur after the payment succeeded but before the response reached the client. Without a shared idempotency token, the server cannot distinguish “first attempt” from “retry of the same intent.”
Duplicate processing is not only a payments problem. Examples include:
- Creating duplicate support tickets or user accounts
- Shipping an order twice
- Granting credits or promotional entitlements multiple times
The fix is not “never retry.” Retries are essential for resilience. The fix is to key logical operations so the server can deduplicate or replay stored outcomes.
Core concepts
Idempotency key: client responsibility
The client generates a unique string per logical request, typically a UUID v4 or similar. It is created once per user action (for example, when the user clicks “Pay”), not once per HTTP attempt. The same key must accompany every retry of that action. Keys are usually sent in a header, conventionally Idempotency-Key, though some APIs use a field in the body for JSON-RPC-style calls.
Scope
Keys are scoped to a resource + actor pair. The same key string may legitimately arrive from different users or different merchants, so implementations usually scope by authenticated principal (and sometimes by API version or environment) to ensure one tenant cannot collide with another.
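As a minimal sketch of that scoping, the storage key can be derived from the authenticated principal plus the client-supplied key. The `Principal` shape, the `endpoint` argument, and the `buildScopedKey` name are assumptions made for illustration; a real system would take these from its own auth layer.

```typescript
// Hypothetical principal shape; real systems derive this from auth middleware.
type Principal = { tenantId: string; userId: string };

// Builds the storage key from the principal, the endpoint, and the client key.
// encodeURIComponent escapes ":" so one component cannot imitate another.
function buildScopedKey(principal: Principal, endpoint: string, clientKey: string): string {
  return [principal.tenantId, principal.userId, endpoint, clientKey]
    .map((part) => encodeURIComponent(part))
    .join(":");
}
```

Escaping each component matters because a naive join would let a tenant id containing the separator produce the same storage key as a different tenant and user combination.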
Server behavior: three outcomes
On first sight of a key, the server performs the operation and persists the outcome (success response, validation error, or terminal failure) for a retention window—often 24 hours to several days, depending on product requirements.
On a repeat request with the same key and compatible payload:
- Return the stored response (including status code and body) without re-executing side effects.
If the same key arrives with a different payload (body or critical parameters), the server should reject the request—typically 422 or 409—because the client is contradicting itself.
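The three outcomes can be expressed as a small pure function. This is a sketch only: the `StoredOutcome` and `Decision` types and the `decide` name are hypothetical, and 409 is used for the conflict case as one of the two status codes mentioned above.

```typescript
// What the server keeps for a key it has already processed (hypothetical shape).
type StoredOutcome = { fingerprint: string; statusCode: number; body: unknown };

type Decision =
  | { kind: "execute" }
  | { kind: "replay"; statusCode: number; body: unknown }
  | { kind: "conflict"; statusCode: 409 };

// Maps "what we have stored" plus "what just arrived" to one of three outcomes.
function decide(existing: StoredOutcome | undefined, fingerprint: string): Decision {
  if (existing === undefined) {
    return { kind: "execute" }; // first sight of this key
  }
  if (existing.fingerprint !== fingerprint) {
    return { kind: "conflict", statusCode: 409 }; // same key, different payload
  }
  return { kind: "replay", statusCode: existing.statusCode, body: existing.body };
}
```

Keeping the decision separate from the side effect makes it easy to unit test the replay and conflict paths without touching a payment provider.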
Atomicity
The check “have we seen this key?” and the commit of the business operation must be one atomic unit from the perspective of duplicate prevention. Common patterns:
- Database transaction: insert idempotency row first with a unique constraint; if insert fails because the key exists, read and return cached response
- Distributed lock around key + operation for the duration of processing
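The claim-first shape behind both patterns can be sketched in memory. This is an illustration under a strong assumption: the check and the write are atomic here only because they run synchronously on a single JavaScript thread. A service running several processes would need a database unique constraint or a distributed lock instead. The names `claims` and `runOnce` are hypothetical.

```typescript
type Outcome = { statusCode: number; body: unknown };

// In-memory stand-in for a table with a unique constraint on the key.
const claims = new Map<string, Promise<Outcome>>();

// Runs `operation` at most once per key. Concurrent and later callers receive
// the first caller's promise. No await separates the get from the set, so two
// callers cannot both pass the check within one process.
function runOnce(key: string, operation: () => Promise<Outcome>): Promise<Outcome> {
  const existing = claims.get(key);
  if (existing !== undefined) {
    return existing;
  }
  const pending = operation();
  claims.set(key, pending);
  return pending;
}
```

Note that a rejected promise stays stored under this sketch, so a failed first attempt is replayed as a failure; whether that is desirable depends on the caching policy for errors.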
If the operation is slow, some implementations return 202 Accepted and store the key against an async job; retries then poll or receive the same job result.
HTTP and method choice
POST is the usual candidate for idempotency keys because it is the method most often used for non-idempotent creates. For PUT with a stable path, the resource id itself can play a similar role; idempotency keys still help when the server generates the id via a create endpoint, because the client has no id to retry against.
Document clearly:
- Which endpoints accept Idempotency-Key
- Maximum key length (e.g. 255 characters)
- Retention period for replay
- Whether keys are required or optional
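Documented limits are easiest to keep honest when the server enforces them in one place. The following sketch assumes the 255-character limit from the example above and a per-endpoint `required` flag; the function name and the error wording are hypothetical.

```typescript
const MAX_KEY_LENGTH = 255; // assumed limit, taken from the example above

type KeyCheck =
  | { ok: true; key: string | undefined }
  | { ok: false; status: 400; error: string };

// Validates the raw Idempotency-Key header value against the documented rules.
function checkIdempotencyKey(header: string | undefined, required: boolean): KeyCheck {
  const key = header === undefined ? "" : header.trim();
  if (key === "") {
    return required
      ? { ok: false, status: 400, error: "Idempotency-Key header required" }
      : { ok: true, key: undefined }; // optional and absent: proceed without replay
  }
  if (key.length > MAX_KEY_LENGTH) {
    return { ok: false, status: 400, error: "Idempotency-Key exceeds " + MAX_KEY_LENGTH + " characters" };
  }
  return { ok: true, key: key };
}
```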
Trade-offs and limitations
Storage and operations cost. Every keyed request implies durable storage and a lookup path. High-volume APIs need indexes, TTL eviction, and monitoring on idempotency store size.
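For illustration, TTL eviction over an in-memory store might look like the sketch below. With a real datastore this job usually belongs to the store's own expiry feature or a scheduled delete rather than application code; the `Entry` shape and `evictExpired` name are assumptions.

```typescript
type Entry = { createdAt: number };

// Removes entries whose age is at least ttlMs and returns how many were evicted.
function evictExpired(store: Map<string, Entry>, now: number, ttlMs: number): number {
  const expired: string[] = [];
  store.forEach((entry, key) => {
    if (now - entry.createdAt >= ttlMs) {
      expired.push(key);
    }
  });
  expired.forEach((key) => store.delete(key));
  return expired.length;
}
```

The return value is a convenient number to feed into monitoring, alongside the overall size of the store.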
Exactly-once illusion. The pattern gives effectively once processing for a given key within the retention window. It does not solve cross-system distributed transactions by itself; downstream systems may still need their own deduplication or natural idempotency.
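Key reuse and racing clients. Two different logical operations must not reuse the same key; that is a client bug, and defensive servers catch it by validating payload consistency. Two concurrent requests carrying the same key are a separate case: the server must ensure only one of them executes, which is what the atomicity patterns described earlier provide.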
Partial failure. If the server crashes after committing the side effect but before persisting the idempotency record, duplicates are still possible. Stronger guarantees require aligning the business write and idempotency record in the same transactional boundary where the datastore allows.
Practical example: Node.js handler sketch
The following illustrates the intent: validate key, short-circuit on replay, and tie the idempotency record to the business action. It is simplified—production code would add auth scoping, metrics, and structured logging.
// Conceptual types
type PaymentBody = { amountCents: number; currency: string };
type IdempotencyRecord = {
  key: string;
  fingerprint: string; // payload fingerprint, kept out of the response body
  statusCode: number;
  body: unknown;
  createdAt: number;
};

const store = new Map<string, IdempotencyRecord>(); // Replace with Redis/Postgres in production

function hashPayload(body: unknown): string {
  // Sensitive to key order; prefer stable canonical JSON or a hash in production
  return JSON.stringify(body);
}

// Placeholder for the provider call so the sketch runs; swap in a real client
async function chargeCard(body: PaymentBody): Promise<{ status: number; json: unknown }> {
  return { status: 201, json: { charged: body.amountCents, currency: body.currency } };
}

export async function handlePayment(req: {
  idempotencyKey: string | undefined;
  userId: string;
  body: PaymentBody;
}): Promise<{ status: number; json: unknown }> {
  const key = req.idempotencyKey?.trim();
  if (!key) {
    return { status: 400, json: { error: "Idempotency-Key header required" } };
  }
  const scopedKey = `${req.userId}:${key}`;
  const payloadFingerprint = hashPayload(req.body);

  const existing = store.get(scopedKey);
  if (existing) {
    // Reject a replay whose payload differs from the first request
    if (existing.fingerprint !== payloadFingerprint) {
      return {
        status: 409,
        json: { error: "Idempotency key reused with different payload" },
      };
    }
    return { status: existing.statusCode, json: existing.body };
  }

  // Perform charge with external provider; use provider idempotency if supported too.
  // Caveat: get-then-set around an await is not atomic. Two concurrent requests with
  // the same key can both reach this line; a real store needs a unique constraint or lock.
  const result = await chargeCard(req.body);
  store.set(scopedKey, {
    key: scopedKey,
    fingerprint: payloadFingerprint,
    statusCode: result.status,
    body: result.json,
    createdAt: Date.now(),
  });
  return { status: result.status, json: result.json };
}
Client usage with fetch:
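const idempotencyKey = crypto.randomUUID(); // generated once per user action

async function payOnce(maxAttempts = 3) {
  for (let attempt = 1; ; attempt++) {
    try {
      const res = await fetch("/api/payments", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Idempotency-Key": idempotencyKey, // same key on every attempt
        },
        body: JSON.stringify({ amountCents: 5000, currency: "USD" }),
      });
      // Retry only on known-safe conditions (502/503); anything else is final
      if (res.status !== 502 && res.status !== 503) {
        return res.json();
      }
      if (attempt >= maxAttempts) {
        throw new Error(`Payment failed with status ${res.status}`);
      }
    } catch (err) {
      // Network errors are also retried; production code would add backoff
      if (attempt >= maxAttempts) throw err;
    }
  }
}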
In engagements focused on production-ready APIs, pairing this server-side key with provider-level idempotency (many payment processors accept their own idempotency token) closes the remaining gap: the provider completed the charge, but your server crashed or timed out before it could record the outcome.
Common mistakes and pitfalls
Regenerating the key on each retry. The key must identify the user action, not the HTTP attempt. A new UUID per retry defeats deduplication.
Omitting payload checks. Reusing a key with different amounts is a bug; the server should reject, not silently return the first outcome.
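Payload checks depend on a fingerprint that does not change when a client reorders object keys. One possible approach is a canonical serialization with sorted keys, sketched below. It assumes the input came from `JSON.parse` (no `undefined`, functions, or cyclic references), and the `canonicalJson` name is hypothetical. A production version might hash the resulting string to keep stored fingerprints short.

```typescript
// Serializes a JSON-compatible value with object keys sorted, so that
// { a: 1, b: 2 } and { b: 2, a: 1 } produce the same string.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    // Array order is meaningful, so elements keep their positions.
    return "[" + value.map((item) => canonicalJson(item)).join(",") + "]";
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const keys = Object.keys(obj).sort();
    const members = keys.map((k) => JSON.stringify(k) + ":" + canonicalJson(obj[k]));
    return "{" + members.join(",") + "}";
  }
  return JSON.stringify(value); // strings, numbers, booleans, null
}
```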
Unbounded storage. Idempotency records need TTL or archival; otherwise the store grows without limit.
Treating all status codes as cacheable. Some teams cache only successes and deterministic validation errors; caching a 500 might lock the client into a transient failure until the key expires. Policies vary, so document whether retries after a server error should reuse the same key or wait for expiry.
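One possible policy, shown purely as an assumption rather than a standard, is to store every outcome below 500 except a short list of statuses that usually signal a transient condition. The list and the function name are illustrative choices.

```typescript
// Statuses treated as transient under this assumed policy:
// 408 Request Timeout, 425 Too Early, 429 Too Many Requests.
const TRANSIENT_STATUSES = [408, 425, 429];

// Decides whether an outcome is stored for replay. Server errors (5xx) and the
// transient statuses above are not stored, so a retry with the same key re-executes.
function shouldStoreOutcome(statusCode: number): boolean {
  if (statusCode >= 500) {
    return false;
  }
  return TRANSIENT_STATUSES.indexOf(statusCode) === -1;
}
```

Under this policy a client can safely reuse its key after a 503, while a 422 validation error is replayed as-is because resending the same payload would fail the same way.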
Ignoring downstream idempotency. Your API may be safe while a webhook or job queue still delivers duplicates unless those layers also deduplicate.
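The same idea applies on the consuming side. As a rough sketch, a webhook or queue consumer can deduplicate on a unique event id, assuming the sender provides one. The `WebhookEvent` shape and `handleOnce` name are hypothetical, and the in-memory set stands in for a durable store.

```typescript
// Hypothetical event shape; assumes the sender attaches a unique id per event.
type WebhookEvent = { id: string; type: string };

const processedEventIds = new Set<string>(); // replace with a durable store

// Returns true if the handler ran, false if the event was a duplicate delivery.
// The id is recorded only after the handler returns, so a handler that throws
// leaves the event eligible for redelivery.
function handleOnce(event: WebhookEvent, handler: (e: WebhookEvent) => void): boolean {
  if (processedEventIds.has(event.id)) {
    return false;
  }
  handler(event);
  processedEventIds.add(event.id);
  return true;
}
```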
Conclusion
Idempotency keys bridge unreliable networks and strict business rules: they turn ambiguous retries into replay of a single decision, scoped in time and tied to client intent. The implementation cost—storage, atomicity, and clear documentation—is modest compared to manual reconciliation after duplicate charges or shipments.
Key takeaways:
- Treat each key as the identifier of a logical operation, not a transport artifact
- Persist outcomes and return them on replay; reject conflicting payloads
- Align the idempotency record with the business commit when possible, and extend the same idea to external providers where they support it
For teams building scalable, production-oriented services and APIs, getting retries and idempotency right early avoids expensive fixes later. Questions about architecture or collaboration fit naturally through the contact page.