Agents retry. They retry when the LLM call times out and the orchestrator falls back. They retry when a tool returns a 5xx. They retry when the network blips mid-stream. They retry when an upstream API rate-limits and the SDK transparently backs off.
None of this is a problem if the tools are idempotent. All of it is a problem if they aren't.
The agent invokes book_calendar(user, slot). The HTTP call returns a 504 after 30 seconds because the calendar API is slow. The orchestrator retries. The second call succeeds in 200ms. The user now has two identical bookings, because the first call actually went through on the server. The 504 only meant the response arrived too late, not that the booking failed.
The pattern
Two parts. Each is one line:
- Generate a deterministic key per logical action. Hash the action's semantic identity — not the timestamp, not a random UUID. The key is the same key for any retry of the same logical operation.
- Pass the key to every side-effecting tool call. The tool's implementation (or the upstream API) uses the key to dedupe.
"Semantic identity" is the operation's full intent: this user + this calendar + this exact slot, regardless of when the call happens. If the agent intentionally books the same user for the same slot a second time, that's a new logical action and it needs a new key, so some input to the hash (such as the session) has to differ. If it accidentally invokes the same tool twice, that's the same logical action with the same key, and the second call no-ops.
The implementation
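    const { createHash } = require('node:crypto');

    // calendarApi and stripe are assumed to be already-initialized clients.

    function idempotencyKey(actionType, payload) {
      // Only inputs that define what is being done: no timestamps, no nonces.
      const semantic = {
        action: actionType,
        user_id: payload.user_id,
        // First identifier present wins. If a payload carries more than one
        // (e.g. a calendar id and a slot id), hash them as separate fields.
        target: payload.target_id || payload.email || payload.slot_id,
        amount: payload.amount,
        // session_id scopes the key to one conversation: retries within it
        // dedupe, but a new conversation gets a fresh key
        session_id: payload.session_id,
      };
      // Sorted key list gives a stable field order; undefined fields are omitted.
      const canon = JSON.stringify(semantic, Object.keys(semantic).sort());
      return createHash('sha256').update(canon).digest('hex').slice(0, 32); // 32 hex chars = 128 bits
    }

    async function bookCalendar(payload) {
      const key = idempotencyKey('book_calendar', payload);
      return await calendarApi.book({
        ...payload,
        'Idempotency-Key': key, // Stripe-style header
      });
    }

    async function chargeCard(payload) {
      const key = idempotencyKey('charge_card', payload);
      return await stripe.charges.create(
        {
          amount: payload.amount,
          currency: payload.currency, // required by Stripe; assumed to be on the payload
          customer: payload.customer,
        },
        { idempotencyKey: key }, // stripe-node takes the key as a request option
      );
    }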
That's it. Stripe integrations have relied on this pattern for years. Most agent frameworks don't have it built in.
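When the upstream API does not dedupe for you, the tool has to do it itself. A minimal sketch of that tool-side half, assuming an in-memory map: `withIdempotency`, `seenKeys`, and the `bookSlot` stub are hypothetical names for illustration, and a real version would need a shared, durable store with expiry and handling for concurrent in-flight calls.

```javascript
// Hypothetical in-memory dedupe store: idempotency key -> first result.
const seenKeys = new Map();

// Runs `effect` at most once per key; repeat calls return the stored result.
function withIdempotency(key, effect) {
  if (seenKeys.has(key)) {
    return seenKeys.get(key); // retry of the same logical action: no-op
  }
  const result = effect();
  seenKeys.set(key, result);
  return result;
}

// Stand-in for a side-effecting tool; counts how often it really runs.
let bookingsCreated = 0;
function bookSlot(user, slot) {
  bookingsCreated += 1;
  return { bookingId: `bk_${bookingsCreated}`, user, slot };
}

// The same key on both calls, as a retry would send it.
const key = 'book_calendar:user_42:slot_9';
const first = withIdempotency(key, () => bookSlot('user_42', 'slot_9'));
const retry = withIdempotency(key, () => bookSlot('user_42', 'slot_9'));
```

The retry gets back the original booking object and `bookSlot` runs only once, which is the no-op behavior the pattern is after.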
What "semantic identity" actually means
The trickiest part of the pattern is picking the right hash inputs. Get this wrong and you either:
- Block legitimate re-actions — if the key only includes user_id + action_type, the user can never book a second calendar slot in the same session
- Allow accidental duplication — if the key includes a timestamp or random nonce, every retry has a different key and the dedupe protection vanishes
The right set is: operation + all parameters that distinguish this from a different valid call. For booking: user + calendar + specific time slot. For charging: customer + amount + invoice ID. For sending email: recipient + template + the action that triggered it (session_id pins the trigger).
If the same set of parameters appearing twice should mean "do it twice," you've picked the wrong set.
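The three outcomes (right set, too coarse, too fine) can be compared side by side. This is an illustrative sketch: `hashFields` and the field names `calendar_id`, `slot_id`, and `nonce` are assumptions made for the example, not part of any particular API.

```javascript
const { createHash } = require('node:crypto');

// Hash a flat object with its keys in sorted order so field order never matters.
function hashFields(fields) {
  const canon = JSON.stringify(fields, Object.keys(fields).sort());
  return createHash('sha256').update(canon).digest('hex').slice(0, 32);
}

const booking = { action: 'book_calendar', user_id: 'u1', calendar_id: 'c1', slot_id: 's9' };

// Right set: a retry of the same booking produces the same key,
// even if the fields arrive in a different order...
const keyA = hashFields(booking);
const keyRetry = hashFields({ slot_id: 's9', calendar_id: 'c1', user_id: 'u1', action: 'book_calendar' });
// ...and a different slot produces a different key.
const keyOtherSlot = hashFields({ ...booking, slot_id: 's10' });

// Too coarse: dropping the slot makes two distinct bookings collide.
const coarse = ({ action, user_id }) => hashFields({ action, user_id });
const coarseCollides = coarse(booking) === coarse({ ...booking, slot_id: 's10' });

// Too fine: a nonce makes every retry look like a new action.
const withNonce = (b, nonce) => hashFields({ ...b, nonce });
const nonceDiffers = withNonce(booking, 1) !== withNonce(booking, 2);
```

A quick check like this against a few real payloads is a cheap way to catch a key that is too coarse or too fine before it ships.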
The retry-from-where matters
Where retries originate determines what you need to handle:
- Inside the SDK: Many API SDKs auto-retry network errors and some 5xx responses. They reuse the same idempotency key if you pass one. Always pass one.
- From the agent orchestrator: Tools sometimes get retried at the orchestrator level after a longer timeout. Same key needs to flow through.
- By the user: User refreshes the form, agent gets the same intent twice. The session_id-pinned key handles this case — but only if you derive the key from session-stable data the user didn't change.
- By the LLM itself: The model sometimes decides to "try again" by re-invoking the tool. This is the case Issue #148's tool gate catches before it reaches the API.
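For the orchestrator-level case, the detail that matters is that the key is derived once, outside the retry loop, and passed unchanged to every attempt. A rough sketch under stated assumptions: `callWithRetry` and `flakyBook` are hypothetical, the tool stub dedupes on the key the way an idempotent upstream would, and backoff between attempts is left out for brevity.

```javascript
// Hypothetical retry helper: the caller computes the key once and every
// attempt carries that same key.
async function callWithRetry(tool, payload, key, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await tool(payload, key);
    } catch (err) {
      lastError = err; // e.g. a 504 that may or may not have taken effect
    }
  }
  throw lastError;
}

// Stand-in tool: performs the side effect, then fails the first response
// (the misleading 504). It dedupes on the key like an idempotent upstream.
const processed = new Map();
let attempts = 0;
async function flakyBook(payload, key) {
  attempts += 1;
  if (!processed.has(key)) {
    processed.set(key, { bookingId: 'bk_1', slot: payload.slot });
  }
  if (attempts === 1) {
    throw new Error('504 Gateway Timeout');
  }
  return processed.get(key);
}

// Resolves with the single booking even though the tool was called twice.
const done = callWithRetry(flakyBook, { user: 'u1', slot: 's9' }, 'key-for-u1-s9');
```

If the key were generated inside the loop instead, each attempt would look like a new action and the first, silently successful call would be duplicated.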
The audit signal
Log { action_type, idempotency_key, response_id } on every side-effecting call. Aggregate weekly. Two flags:
- Same key with different response_ids = your dedupe didn't work upstream. Vendor side. File a ticket.
- Same logical action with different keys = your key derivation is including non-semantic inputs. Fix the key generator before the next double-charge surfaces.
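Both flags fall out of a simple grouping pass over the log. The sketch below assumes each log entry also carries a `logical_action` field, a readable fingerprint of the semantic inputs; that field is an assumption added here so that "same logical action" can be detected independently of the key, and `auditIdempotencyLog` is a hypothetical helper name.

```javascript
// Groups log entries and reports the two audit flags.
function auditIdempotencyLog(entries) {
  const responsesByKey = new Map(); // idempotency_key -> Set of response_ids
  const keysByAction = new Map();   // logical_action  -> Set of idempotency_keys
  for (const e of entries) {
    if (!responsesByKey.has(e.idempotency_key)) responsesByKey.set(e.idempotency_key, new Set());
    responsesByKey.get(e.idempotency_key).add(e.response_id);
    if (!keysByAction.has(e.logical_action)) keysByAction.set(e.logical_action, new Set());
    keysByAction.get(e.logical_action).add(e.idempotency_key);
  }
  return {
    // Same key, different responses: upstream dedupe did not hold.
    dedupeFailures: [...responsesByKey].filter(([, ids]) => ids.size > 1).map(([key]) => key),
    // Same logical action, different keys: key derivation is unstable.
    unstableKeys: [...keysByAction].filter(([, keys]) => keys.size > 1).map(([action]) => action),
  };
}

const report = auditIdempotencyLog([
  { action_type: 'charge_card', logical_action: 'charge:cus_1:inv_7', idempotency_key: 'k1', response_id: 'ch_1' },
  { action_type: 'charge_card', logical_action: 'charge:cus_1:inv_7', idempotency_key: 'k1', response_id: 'ch_2' },
  { action_type: 'book_calendar', logical_action: 'book:u1:s9', idempotency_key: 'k2', response_id: 'bk_1' },
  { action_type: 'book_calendar', logical_action: 'book:u1:s9', idempotency_key: 'k3', response_id: 'bk_1' },
]);
// report.dedupeFailures -> ['k1'], report.unstableKeys -> ['book:u1:s9']
```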
For every side-effecting tool we ship at AutomateScale, the contract is the same: deterministic idempotency key from semantic inputs, logged on every call, audited weekly. The cost is two lines per tool. The cost of getting it wrong is one of your client's customers getting double-charged in front of their CFO.
The one-line summary
Agents retry. The retry isn't the bug — the missing idempotency key is. Hash the semantic identity of every side-effecting action, pass the key through, and your worst-case retry becomes a no-op instead of a duplicate. Pairs with the gate from #148 and the planner from #147.