Designing idempotency keys for outbound event delivery
Most idempotency guidance is written for webhook receivers. The emitter side — especially under fan-out — has its own set of problems. A practical guide to designing keys that survive retries, replays, and multi-destination delivery.
Search "webhook idempotency" and almost every result is written from the receiver's seat: how to dedupe an inbound webhook so your handler doesn't process the same event twice. That's a real problem, and it's well-covered.
The emitter's problem is less covered, and arguably harder. Once a system commits to at-least-once delivery — which any reliable outbound pipeline must — the operator owns a different question: how do you make sure the same logical event, retried across multiple destinations, doesn't show up as fifteen rows in someone's CRM?
This post is about that question. It covers what an outbound idempotency key actually represents, two common ways to mint one and why neither alone is sufficient, the fan-out trap that catches most teams, the difference between retry and replay, and what destinations are actually willing to do with a key once it arrives.
What "the same event" means
The first thing to get straight is the unit of work. There are at least three candidates:
A logical event. A user signed up. There is exactly one such fact.
A delivery. That one signup needs to reach HubSpot, Salesforce, and Slack. Three deliveries.
An attempt. The HubSpot delivery failed with a 503 and will retry. Each retry is a new attempt against the same delivery.
Idempotency keys exist to make a destination treat multiple attempts as a single delivery. They do not, by themselves, do anything to coordinate state across destinations. Each destination is its own dedupe domain.
This sounds obvious written down. It is the source of most of the bugs.
Two ways to mint a key
There are two common approaches to generating an idempotency key. Both have failure modes.
Deterministic: hash the payload
The intuition: if the payload is identical, the hash is identical, and the destination dedupes. Concretely:
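A minimal sketch of the deterministic approach, assuming Node's `crypto` module; the SHA-256 choice and the recursive key-sorting canonicalization are illustrative, not a prescription:

```typescript
import { createHash } from "crypto";

// Produce a stable serialization: object keys are sorted recursively so
// that logically identical payloads serialize identically.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const obj = value as Record<string, unknown>;
  return `{${Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`)
    .join(",")}}`;
}

// Deterministic key: hash of the canonical payload. Identical payload,
// identical key; any payload mutation silently produces a new key.
function deterministicKey(payload: Record<string, unknown>): string {
  return createHash("sha256").update(canonicalize(payload)).digest("hex");
}
```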
This is replay-safe in the trivial case — re-emitting the same payload produces the same key — and requires no storage. It breaks the moment the payload changes between attempts.
In practice, payloads change. An enrichment step adds a traits block. A bugfix corrects a mistyped field. A schema migration adds a new property. Any of these regenerates the hash, and the destination has no idea the new "event" is the retry of the old one. Dedupe is silently lost.
Deterministic keys also have a subtler problem: if two genuinely distinct events happen to have identical payloads — two users with the same name signing up in the same second from the same IP, say — the destination will treat the second one as a duplicate and drop it.
Assigned: UUID at first emit
The other approach: generate a key once, at the moment the event is first accepted, and persist it alongside the event.
This survives payload mutation. The key was minted before any enrichment ran, and it's stored with the event record. Every retry attempt looks it up rather than recomputing it.
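A sketch of the assigned-key approach, with an in-memory map standing in for the persistent event store; `acceptEvent` and the store shape are assumptions for illustration:

```typescript
import { randomUUID } from "crypto";

// Stand-in for the event store; production would persist this durably.
const eventStore = new Map<string, { idempotencyKey: string; payload: unknown }>();

// Mint the key exactly once, at the moment the event is first accepted.
// Re-accepting the same event ID returns the already-assigned key.
function acceptEvent(eventId: string, payload: unknown): string {
  const existing = eventStore.get(eventId);
  if (existing) return existing.idempotencyKey;
  const idempotencyKey = randomUUID();
  eventStore.set(eventId, { idempotencyKey, payload });
  return idempotencyKey;
}
```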
The cost is that storage is now load-bearing. If the queue worker doesn't have access to the original assigned key — because it was lost in a crash, or because the retry path constructs the request from scratch — the worker either invents a new key (defeating the point) or fails closed (defeating reliability).
What actually works
The pattern that holds up in production is a hybrid:
Assign a key at first emit and persist it with the event. This is the canonical key for that logical event.
Reuse the assigned key on every retry attempt for the same delivery. Never regenerate.
Fall back to a deterministic hash only when the assigned key is unavailable — for example, an emergency replay from logs where event IDs were lost. Treat this as a degraded mode.
The mental model: the key is a property of the event, not a property of the request. Each retry is a new request carrying the same key.
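The hybrid lookup can be sketched as follows; the `resolveKey` name and the degraded-mode flag are illustrative, and the fallback hash here is deliberately naive:

```typescript
import { createHash } from "crypto";

// Degraded-mode fallback: hash the payload when the assigned key is gone.
function hashPayload(payload: unknown): string {
  return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
}

// Prefer the persisted assigned key; fall back to a deterministic hash
// only when it is unavailable, and flag that the result is degraded.
function resolveKey(
  storedKey: string | undefined,
  payload: unknown,
): { key: string; degraded: boolean } {
  if (storedKey !== undefined) return { key: storedKey, degraded: false };
  return { key: hashPayload(payload), degraded: true };
}
```

The `degraded` flag matters operationally: a spike in degraded resolutions means assigned keys are being lost somewhere in the retry path.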
The fan-out trap
Here is where most outbound systems quietly break.
The intuitive design is one key per logical event. The signup gets a UUID, and that UUID is sent to HubSpot, Salesforce, and Slack. It looks clean. It is wrong.
The reason is that destinations don't agree on what duplicate detection means. HubSpot might dedupe within a 24-hour window. Salesforce, configured with an External ID, dedupes effectively forever. Slack doesn't dedupe at all. So a retry that fires 26 hours after the original — well within a reasonable retry horizon for a flapping destination — gets:
Dropped by Salesforce as a duplicate.
Accepted by HubSpot as a fresh event, double-writing.
Posted to Slack again, double-notifying.
The result is inconsistent state across destinations, and it's effectively impossible to debug from logs because the failure mode looks like "HubSpot is unreliable" rather than "our key strategy doesn't survive long retry windows."
The correction is to scope the key per destination:
```typescript
function deliveryKey(eventId: string, destinationId: string): string {
  return `${eventId}:${destinationId}`;
}
```
Now each destination sees a stable key for its own retries. The HubSpot delivery has its own dedupe lineage; the Salesforce delivery has its own; the Slack delivery has its own. A retry to one destination cannot interact with another. And replaying a single destination — "resend everything from yesterday to HubSpot only, because we fixed the field mapping" — is a coherent operation rather than a global blast.
The general rule: the unit of idempotency is the (event, destination) pair, not the event alone.
Retry versus replay
A retry reuses the key. That is the entire point.
A replay often needs to do the opposite. When a human re-runs yesterday's events because a downstream bug was fixed and the destination needs to reprocess them, the desired behavior is usually that the destination accepts the duplicates. Otherwise the replay is a no-op.
There are two patterns for this.
Replay suffix. Append a replay identifier to the key:
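For example (the `replay` separator and the `replayId` naming are illustrative):

```typescript
// Suffix the per-destination key with a replay identifier so the
// destination sees a brand-new key and accepts the event again.
function replayKey(deliveryKey: string, replayId: string): string {
  return `${deliveryKey}:replay:${replayId}`;
}
```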
The destination sees a new key, treats the event as fresh, and writes again. The downside is that downstream systems with their own dedupe — say, a CRM that builds a contact-history timeline keyed off the external ID — now get two entries for the same logical event, and untangling that requires knowing the replay happened.
Replay envelope. Mark the message itself as a replay (an X-Replay-Id header, a replay: true field), and let the destination decide. Some destinations support this natively. Most don't. For those that don't, the suffix pattern is the only option.
Whichever pattern is chosen, replay needs to be a deliberate, audited operation rather than something that happens accidentally because a queue got reprocessed. Mixing retry semantics and replay semantics is the source of most "why does our CRM have duplicates?" tickets.
What destinations will actually accept
Once the keys are right on the emitter side, the next question is whether the destination will use them. Support varies sharply.
Native idempotency headers. A small number of APIs accept a dedicated header — Stripe's Idempotency-Key is the canonical example. These are the easy case: pass the per-destination key, get exactly-once semantics within the destination's dedupe window, done.
Embed in payload. Most CRMs and marketing tools don't have a dedicated header but will dedupe on a designated field. HubSpot accepts an external identifier in custom properties; Salesforce supports External ID fields on custom objects; Customer.io uses the id on track calls. The pattern is the same — pick a stable field, map the per-destination key to it, configure the destination to treat that field as the dedupe key.
No support. The long tail. Slack, Discord, plain webhook URLs, most chat and notification destinations. There is no idempotency mechanism on the receiving side. The only option is to dedupe on the emitter side: before sending, check whether this (event, destination) pair has already succeeded, and skip if so. This requires storing delivery outcomes, which most ad-hoc integration code does not.
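Emitter-side dedupe for this long tail can be sketched like so; the in-memory set stands in for a persisted delivery-outcome store, and the synchronous `send` callback is a simplification:

```typescript
// Record each successfully delivered (event, destination) pair and skip
// resends. Production would persist this, not hold it in memory.
const deliveredPairs = new Set<string>();

function sendOnce(
  eventId: string,
  destinationId: string,
  send: () => void,
): "sent" | "skipped" {
  const pair = `${eventId}:${destinationId}`;
  if (deliveredPairs.has(pair)) return "skipped";
  send();
  deliveredPairs.add(pair); // mark only after the send succeeds
  return "sent";
}
```

Note the ordering: the pair is recorded only after `send()` returns, so a crash mid-send errs toward a duplicate rather than a dropped event, which is the correct trade-off under at-least-once delivery.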
The practical consequence is that "we use idempotency keys" is not a single design — it's three designs, applied per destination, based on what the destination supports.
Storage and TTL
The last piece is how long keys live.
The constraint is straightforward: the TTL on the emitter's key store must exceed the destination's dedupe window plus the maximum retry horizon. If HubSpot dedupes for 24 hours and the retry policy backs off for up to 72 hours before giving up, the emitter must remember the key for at least 96 hours — otherwise a late retry will mint a fresh key and duplicate-write into a destination that has already forgotten the original.
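The constraint is simple enough to check at configuration time; a sketch, with the function name assumed:

```typescript
// The emitter's key TTL must cover the destination's dedupe window plus
// the maximum retry horizon, or late retries will mint fresh keys.
function keyTtlIsSafe(
  ttlHours: number,
  dedupeWindowHours: number,
  retryHorizonHours: number,
): boolean {
  return ttlHours >= dedupeWindowHours + retryHorizonHours;
}
```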
Two operational notes follow from this.
First, dead letter queues need their own re-key logic. An event that sat in a DLQ for two weeks before being manually retried cannot reuse its original key against a destination that only dedupes for 24 hours — the destination has forgotten, and the emitter is now sending what looks like a stale duplicate. The right move is usually to treat DLQ recovery as a replay rather than a retry.
Second, "key expired" is a real operational state, not a theoretical one. It needs monitoring. A spike in retries against expired keys is a signal that retry horizons and key TTLs have drifted out of sync, and that some non-zero fraction of recent retries silently double-wrote.
Closing
The unit of work in outbound delivery is this event, to this destination, on this attempt. Idempotency keys that don't reflect all three dimensions — that collapse destinations together, or regenerate on payload change, or expire before retries finish — leak duplicates. The leaks are quiet. Most teams don't notice until a customer asks why their CRM has three contacts for the same person.
Designing the key correctly costs a few extra lines of code and one storage decision. Designing it incorrectly is invisible until the bill comes due.
If the receiver-side picture is also useful, Idempotent Event Delivery covers what consumers should do when they're on the other end of this pipe, and Dead Letter Queues covers what happens when retries finally run out.
Want per-destination idempotency, retries, and dead letter capture without rebuilding the transport? Join Meshes — emit one event, and we handle outbound delivery to every destination.