Human-in-the-loop approval for LLM tool actions: policies, queues, and production UX
Gate high-risk agent tools behind durable approval workflows: policy engines, idempotent side effects, timeout semantics, and UX that keeps assistants useful without silent autonomy.
Your assistant can refund orders, rotate API keys, and post to Slack. In staging, the demo is magical: the model picks the right tool, the UI updates, everyone applauds. In production, the first incident is not prompt injection—it is a correct-looking tool call on the wrong account after a long thread, approved implicitly because nobody defined what “autonomous” means. Legal asks who clicked approve; engineering discovers there was no click, only a model completion that your server executed.
Human-in-the-loop (HITL) approval is how you keep agentic features shippable: the model may propose side effects, but your backend decides whether they run now, later, or never—based on policy, role, amount, environment, and audit requirements. This article covers the control plane (not the prompt tricks): durable approval records, idempotent execution, and UX patterns that do not stall every harmless read.
Why “ask the user in chat” is not approval
Chat UIs tempt you to treat “Should I proceed?” as consent. That fails in production for predictable reasons:
- No durable witness — Chat messages are not a legal or security audit trail unless you model them explicitly.
- Ambiguous scope — The user said “yes” three turns ago; the model now proposes a different amount, recipient, or tenant.
- Concurrent sessions — Mobile plus web means the approving principal may not be the session that triggered the tool.
- Automation bypass — A compromised prompt or retrieval chunk can mimic affirmative answers in the transcript.
HITL belongs in your domain layer: an approval_request row, a signed link or in-app inbox, and an execution gate that refuses to call side-effecting tools until status is approved (or policy auto-approves).
Teams I work with on production assistants usually already have tool routing; what they lack is a state machine between proposed and executed.
Classify tools: read, write, and irreversible
Before building UI, tag every tool with a risk tier. Keep the taxonomy small so product and security can reason about it.
| Tier | Examples | Default behavior |
|---|---|---|
| Read | getOrder, searchDocs, listInvoices | Auto-execute; log for audit |
| Write | updateShippingAddress, addComment | Policy-based: auto if low risk, else approval |
| Irreversible / high impact | issueRefund, deleteUser, transferFunds, sendExternalEmail | Approval required; optional second factor |
Encode tier in the tool registry—not only in documentation—so the orchestrator cannot “forget” to check:
export type ToolRisk = "read" | "write" | "irreversible";

export type RegisteredTool = {
  name: string;
  risk: ToolRisk;
  /** Stable id for policy rules (refunds, exports, …) */
  actionType: string;
  execute: (args: unknown, ctx: ToolContext) => Promise<unknown>;
};
Why action types matter: Policy rules attach to actionType, not function names. Renaming refundOrder to createRefund should not silently bypass compliance rules.
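One way to make the tier check hard to skip is to have the registry itself reject incomplete entries at startup. The following is a minimal sketch, not a definitive implementation: the ToolRegistry class, its byActionType method, and the reduced ToolContext are assumptions made for illustration.

```typescript
// Hypothetical registry sketch; ToolContext is reduced to the minimum for illustration.
type ToolRisk = "read" | "write" | "irreversible";
type ToolContext = { tenantId: string };
type RegisteredTool = {
  name: string;
  risk: ToolRisk;
  actionType: string;
  execute: (args: unknown, ctx: ToolContext) => Promise<unknown>;
};

class ToolRegistry {
  private readonly tools = new Map<string, RegisteredTool>();

  register(tool: RegisteredTool): void {
    // Fail at startup rather than at call time if metadata is missing or duplicated.
    if (!tool.actionType.trim()) throw new Error(`Tool ${tool.name} has no actionType`);
    if (this.tools.has(tool.name)) throw new Error(`Tool ${tool.name} is already registered`);
    this.tools.set(tool.name, tool);
  }

  get(name: string): RegisteredTool | undefined {
    return this.tools.get(name);
  }

  // Policy rules look tools up by actionType, so a function rename keeps the same rules.
  byActionType(actionType: string): RegisteredTool[] {
    return Array.from(this.tools.values()).filter((t) => t.actionType === actionType);
  }
}

const registry = new ToolRegistry();
registry.register({
  name: "createRefund", // renamed from refundOrder; actionType is unchanged
  risk: "irreversible",
  actionType: "refund.create",
  execute: async () => ({ ok: true }),
});
```

Because lookups return undefined for unknown names, the orchestrator has to handle a missing tool explicitly instead of assuming every model-proposed name exists.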
Policy engine: when approval is required
Policies should be deterministic code or data, not model judgment. The model proposes; code decides.
Typical inputs:
- tool.actionType, tool.risk
- Principal: user id, roles, tenant, impersonation flag
- Arguments: amount, currency, destination domain, record id
- Environment: production vs sandbox
- Session signals: new device, elevated risk score, rate limits
Example rule sketch:
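export type PolicyDecision =
  | { effect: "allow" }
  | { effect: "deny"; reason: string }
  | { effect: "require_approval"; reason: string; expiresInSec?: number };

export function evaluateToolPolicy(
  tool: RegisteredTool,
  args: Record<string, unknown>,
  ctx: ToolContext,
): PolicyDecision {
  // Reads never mutate state, so they run without a gate (still logged by the caller).
  if (tool.risk === "read") return { effect: "allow" };
  if (tool.actionType === "refund.create") {
    const amount = Number(args.amountCents);
    // A missing, negative, or non-numeric amount must not slip past the threshold check.
    if (!Number.isFinite(amount) || amount < 0) {
      return { effect: "deny", reason: "Invalid refund amount" };
    }
    if (amount > ctx.autoApproveRefundCents) {
      return {
        effect: "require_approval",
        reason: `Refund of ${amount} cents exceeds auto-approve limit`,
        expiresInSec: 3600,
      };
    }
    // At or below the limit the refund falls through to the tier checks below,
    // so a refund tool tagged "irreversible" still requires approval.
  }
  if (tool.risk === "irreversible") {
    return { effect: "require_approval", reason: "Irreversible action" };
  }
  return { effect: "allow" };
}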
Trade-off: Hard-coded thresholds are easy to ship; versioned policy documents (JSON/YAML in git, evaluated in CI) scale better for regulated tenants. Either way, log the policy version on every decision for audits.
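To make the versioned-document option concrete, here is a hedged sketch of rules held as data and evaluated by plain code. The PolicyDocument shape, the field names, and the version string are illustrative assumptions, not a standard format. Unlike the hard-coded example, this sketch fails closed (requires approval) when no rule matches, which is a design choice you may not want for read tools.

```typescript
// Hypothetical policy document shape; field names are illustrative.
type Effect = "allow" | "require_approval" | "deny";
type PolicyRule = { actionType: string; effect: Effect; maxAutoApproveCents?: number };
type PolicyDocument = { version: string; rules: PolicyRule[] };
type VersionedDecision = { effect: Effect; reason: string; policyVersion: string };

// In practice this would be loaded from a JSON/YAML file tracked in git.
const policy: PolicyDocument = {
  version: "2026-01-15.1",
  rules: [
    { actionType: "refund.create", effect: "require_approval", maxAutoApproveCents: 5000 },
    { actionType: "user.delete", effect: "deny" },
  ],
};

function evaluatePolicyDocument(
  doc: PolicyDocument,
  actionType: string,
  amountCents?: number,
): VersionedDecision {
  const policyVersion = doc.version;
  const rule = doc.rules.find((r) => r.actionType === actionType);
  // No matching rule: fail closed to approval rather than silently allowing.
  if (!rule) {
    return { effect: "require_approval", reason: "No rule for action type", policyVersion };
  }
  const withinLimit =
    rule.maxAutoApproveCents !== undefined &&
    amountCents !== undefined &&
    Number.isFinite(amountCents) &&
    amountCents >= 0 &&
    amountCents <= rule.maxAutoApproveCents;
  if (rule.effect === "require_approval" && withinLimit) {
    return { effect: "allow", reason: "Within auto-approve limit", policyVersion };
  }
  return { effect: rule.effect, reason: `Rule for ${actionType}`, policyVersion };
}
```

Every decision carries policyVersion, so the value can be written straight into the approval record and the audit log.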
Auto-approve is still a policy outcome
“No human” is not the absence of HITL—it is machine approval with explicit bounds. Document those bounds for security reviews and set monitoring on auto-approve rates per action type.
Durable approval records
Treat each gated tool call as a workflow entity, not a chat line.
Suggested fields:
- id (uuid), session_id, tenant_id
- requested_by (user), action_type, tool_name
- arguments_json: canonical JSON; hash for integrity
- arguments_hash: detect tampering between propose and execute
- status: pending | approved | rejected | expired | executed | failed
- policy_reason, policy_version
- approved_by, approved_at, rejection_reason
- idempotency_key: ties to tool round / client retry
- expires_at: pending approvals must not linger forever
- execution_result_json, executed_at
Store a human-readable summary generated at propose time (“Refund $240.00 to card •••• 4242 for order #8821”). Approvers should not parse raw JSON under pressure.
In consulting engagements, the mistake I see most often is storing approvals only in Redis: you lose audit history and make incident response painful. PostgreSQL (or your system of record) is the source of truth; Redis can cache pending counts for the inbox UI.
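For the arguments_json and arguments_hash fields, a small canonicalization step keeps the hash stable regardless of the key order the model happens to emit. This is a sketch under assumptions: the helper names canonicalJson and hashArguments are hypothetical, and the input is assumed to be plain JSON data (no functions, symbols, or cyclic references).

```typescript
import { createHash } from "node:crypto";

// Serialize with object keys sorted recursively so equal values always produce equal strings.
// Assumes plain JSON data, as produced by parsing a model tool call.
function canonicalJson(value: unknown): string {
  if (value === undefined) return "null"; // mirrors how JSON.stringify treats undefined in arrays
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalJson(v)).join(",")}]`;
  const obj = value as Record<string, unknown>;
  const keys = Object.keys(obj)
    .filter((k) => obj[k] !== undefined) // drop undefined properties, as JSON.stringify does
    .sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`).join(",")}}`;
}

// SHA-256 over the canonical string; store both the string and the hash at propose time.
function hashArguments(args: unknown): string {
  return createHash("sha256").update(canonicalJson(args)).digest("hex");
}
```

At execution time, recomputing the hash over the stored string and comparing it with arguments_hash detects any change made between propose and execute.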
Orchestration: integrate with the tool loop
Your multi-turn orchestrator already runs rounds: model → tool calls → results → model. Insert a gate before execute:
flowchart TD
A[Model proposes tool calls] --> B{Policy evaluate}
B -->|allow| C[Execute tool]
B -->|deny| D[Return denial to model]
B -->|require_approval| E[Persist approval_request]
E --> F[Notify approver]
F --> G[Return pending status to model / UI]
G --> H{Approver decision}
H -->|approved| I[Execute with same args hash]
H -->|rejected| J[Record rejection]
I --> C
Critical invariant: Execution uses stored arguments, not a fresh model re-generation. If the underlying data changes before execution (for example, the user edits the order amount in the UI), invalidate the request and require a new proposal.
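The gate amounts to a small state machine over the approval status. The sketch below encodes the legal transitions as data; the transition table is an assumption consistent with the status values used in this article (in particular, it treats failed as terminal, which you may want to relax to allow retries).

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected" | "expired" | "executed" | "failed";

// Each status maps to the statuses it may move to; empty arrays are terminal states.
const allowedTransitions: Record<ApprovalStatus, ApprovalStatus[]> = {
  pending: ["approved", "rejected", "expired"],
  approved: ["executed", "failed"],
  rejected: [],
  expired: [],
  executed: [],
  failed: [],
};

// Returns the next status, or throws if the move is not allowed.
function transition(current: ApprovalStatus, next: ApprovalStatus): ApprovalStatus {
  if (allowedTransitions[current].indexOf(next) === -1) {
    throw new Error(`Illegal transition ${current} -> ${next}`);
  }
  return next;
}
```

Note that there is no path from pending straight to executed: a tool can only run after an explicit approved state has been recorded.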
What the model should see while pending
Return a structured tool result, not silence:
{
"status": "pending_approval",
"approvalId": "apr_01H…",
"summary": "Refund $240.00 for order #8821 awaiting manager approval",
"expiresAt": "2026-06-03T15:00:00Z"
}
System instructions should tell the model to inform the user, avoid duplicate proposals for the same idempotency_key, and not claim the refund completed.
Notifications and approver UX
Approvers are busy; optimize for a decision in under 30 seconds.
- In-app inbox with filters (tenant, action type, age)
- Deep links with signed tokens (approve/reject one-time actions)
- Slack/email with the summary and buttons; ensure buttons hit your API, not the model
- Mobile: irreversible actions often need mobile-friendly approval; a desktop chat UI alone is insufficient
Show diff context: what changed since last state, linked CRM account, fraud score. Approvers are performing operational work, not chatting.
For teams building scalable, production-ready systems, invest early in separation of duties: the requester should not be the sole approver for high-impact actions unless policy explicitly allows it.
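Separation of duties can be expressed as a plain predicate evaluated before any approval is recorded. This is a minimal sketch: the role names, the requiredRole mapping, and the tenantId field on the approver are assumptions for illustration.

```typescript
type Approver = { userId: string; roles: string[]; tenantId: string };
type ApprovalTarget = { requested_by: string; tenant_id: string; action_type: string };

// Hypothetical mapping from action type to the role allowed to approve it.
const requiredRole: Record<string, string | undefined> = {
  "refund.create": "support_manager",
  "user.delete": "admin",
};

function canApprove(approver: Approver, row: ApprovalTarget): boolean {
  // Approvals never cross tenant boundaries.
  if (approver.tenantId !== row.tenant_id) return false;
  // The requester cannot approve their own request.
  if (approver.userId === row.requested_by) return false;
  // Unknown action types have no approver role, so they fail closed.
  const role = requiredRole[row.action_type];
  return role !== undefined && approver.roles.indexOf(role) !== -1;
}
```

If policy explicitly allows self-approval for some action types, model that as data next to requiredRole rather than as a special case in the UI.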
Timeouts, expiry, and user expectations
Pending approvals need expiry (e.g. 1–24 hours depending on action). When expires_at passes:
- Mark the request expired
- Do not execute
- Notify the requester session on next poll or via websocket
If the user still wants the action, the model must create a new proposal with a new idempotency key after re-validation, because amounts and inventory may have changed in the meantime.
Do not auto-approve on expiry unless legal/compliance explicitly permits it—that pattern has caused real financial loss.
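The expiry rule can be sketched as a sweep that only ever moves rows from pending to expired. The in-memory array below stands in for the database table, and the PendingRow type and expireStale name are assumptions; in a SQL store the same logic would typically be a single conditional UPDATE on status and expires_at.

```typescript
type PendingRow = {
  id: string;
  status: "pending" | "approved" | "expired";
  expires_at: Date;
};

// Flips stale pending rows to "expired" and returns their ids so requesters can be notified.
// It never approves or executes anything.
function expireStale(rows: PendingRow[], now: Date): string[] {
  const expiredIds: string[] = [];
  for (const row of rows) {
    if (row.status === "pending" && row.expires_at.getTime() <= now.getTime()) {
      row.status = "expired";
      expiredIds.push(row.id);
    }
  }
  return expiredIds;
}
```

Running the sweep on a schedule is not enough on its own: the approve path should also check expires_at, since a request can expire between two sweeps.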
Idempotency and exactly-once side effects
Approvals intersect with retries:
- Client retries the chat request → same idempotency_key → return existing approval_request, do not create duplicates
- Approver double-clicks Approve → second request is a no-op if status is already approved or executed
- Worker crashes after DB commit but before external API → resume execution using status = approved and idempotent downstream keys (payment provider idempotency headers, etc.)
Pattern:
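async function executeApprovedRequest(approvalId: string): Promise<void> {
  // Assumes this runs inside a transaction so the row lock is held until markExecuted commits.
  const row = await db.approvalRequests.findByIdForUpdate(approvalId);
  if (!row) throw new NotFoundError();
  if (row.status === "executed") return; // idempotent no-op on retry or double-click
  if (row.status !== "approved") throw new InvalidStateError(row.status);
  const tool = registry.get(row.tool_name);
  if (!tool) throw new InvalidStateError(`unknown tool ${row.tool_name}`);
  // Verify the stored string before parsing it; hash is the same SHA-256 helper used at propose time.
  if (hash(row.arguments_json) !== row.arguments_hash) throw new TamperError();
  const args = JSON.parse(row.arguments_json);
  const result = await tool.execute(args, buildContext(row));
  await db.approvalRequests.markExecuted(approvalId, result);
}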
Use row-level locking or UPDATE … WHERE status = 'approved' with affected-rows check to prevent double execution under concurrency.
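The conditional-update variant can be sketched without a database by simulating the affected-rows count. Everything here is illustrative: the in-memory Map stands in for the table, and the intermediate "executing" status is an assumption that is not part of the status list used elsewhere in this article.

```typescript
type ClaimStatus = "approved" | "executing" | "executed";
const approvalTable = new Map<string, { status: ClaimStatus }>();

// Stand-in for:
//   UPDATE approval_requests SET status = 'executing'
//   WHERE id = $1 AND status = 'approved'
// Returns the affected-row count, as a SQL driver would.
function claimForExecution(id: string): number {
  const row = approvalTable.get(id);
  if (!row || row.status !== "approved") return 0;
  row.status = "executing";
  return 1;
}

// Runs the side effect only for the caller that won the claim.
async function executeOnce(id: string, sideEffect: () => Promise<void>): Promise<boolean> {
  if (claimForExecution(id) !== 1) return false; // another worker claimed it, or wrong state
  // If sideEffect throws, the row stays in "executing"; a reconciler (not shown) must
  // resolve it using the downstream provider's idempotency key.
  await sideEffect();
  const row = approvalTable.get(id);
  if (row) row.status = "executed";
  return true;
}
```

Only the caller whose update reports one affected row proceeds, so two workers racing on the same approval cannot both reach the external API.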
Observability and security
Emit structured logs and metrics:
- approval.created, approval.approved, approval.rejected, approval.expired, approval.executed, approval.failed
- Histogram: time from create → decision → execute
- Alert: spike in denied or failed for a single action_type
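A lightweight way to get these signals is to emit one JSON log line per lifecycle event and derive the histogram inputs from stored timestamps. The helper names and the field set below are assumptions; feed the durations into whatever metrics client you already use.

```typescript
type ApprovalEventName =
  | "approval.created"
  | "approval.approved"
  | "approval.rejected"
  | "approval.expired"
  | "approval.executed"
  | "approval.failed";

type ApprovalEventFields = { approvalId: string; actionType: string; policyVersion: string };
type ApprovalTimestamps = { createdAt: Date; decidedAt?: Date; executedAt?: Date };

// One JSON object per line, with a stable event name and an ISO timestamp.
function formatApprovalEvent(event: ApprovalEventName, fields: ApprovalEventFields, at: Date): string {
  return JSON.stringify({ event, ...fields, at: at.toISOString() });
}

// Durations for the create -> decision -> execute histogram; undefined until that step has happened.
function lifecycleDurationsMs(t: ApprovalTimestamps): {
  toDecisionMs: number | undefined;
  toExecuteMs: number | undefined;
} {
  const toDecisionMs = t.decidedAt ? t.decidedAt.getTime() - t.createdAt.getTime() : undefined;
  const toExecuteMs =
    t.decidedAt && t.executedAt ? t.executedAt.getTime() - t.decidedAt.getTime() : undefined;
  return { toDecisionMs, toExecuteMs };
}
```

Keeping actionType and policyVersion on every event makes it possible to alert per action type and to correlate a spike with a policy change.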
Security notes:
- Sign approval links; bind to approver identity and short TTL
- Rate-limit approval endpoints separately from chat
- Replay — approval IDs are one-time consumables for execution, not reusable bearer tokens
- Align with LLM trust boundaries: injection may propose tools; policy + HITL must block execution
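Signed links bound to an approver and a short TTL can be sketched with an HMAC over the link claims. This is an illustrative token format, not a standard one: the LinkClaims shape and the signLink/verifyLink names are assumptions, and the "base64url" encoding requires a reasonably recent Node.js version.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type LinkClaims = {
  approvalId: string;
  approverId: string;
  action: "approve" | "reject";
  exp: number; // expiry as Unix seconds
};

// Token layout: base64url(JSON claims) + "." + base64url(HMAC-SHA256 of the encoded claims).
function signLink(claims: LinkClaims, secret: string): string {
  const payload = Buffer.from(JSON.stringify(claims), "utf8").toString("base64url");
  const sig = createHmac("sha256", secret).update(payload).digest("base64url");
  return `${payload}.${sig}`;
}

// Returns the claims only if the signature, expiry, and approver identity all check out.
function verifyLink(token: string, secret: string, approverId: string, nowSec: number): LinkClaims | null {
  const parts = token.split(".");
  if (parts.length !== 2) return null;
  const payload = parts[0];
  const given = Buffer.from(parts[1], "base64url");
  const expected = createHmac("sha256", secret).update(payload).digest();
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8")) as LinkClaims;
  if (claims.exp <= nowSec) return null; // short TTL
  if (claims.approverId !== approverId) return null; // bound to the logged-in approver
  return claims;
}
```

The token alone does not make the link one-time: that property comes from the approval row's status, which only allows a pending request to be decided once.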
Practical example: refund tool with approval gate
Below is a condensed but realistic flow in TypeScript. Adapt persistence and auth to your stack; the structure is what matters.
import { createHash, randomUUID } from "node:crypto";

type ApprovalRow = {
  id: string;
  status: string;
  action_type: string;
  tool_name: string;
  arguments_json: string;
  arguments_hash: string;
  idempotency_key: string;
  summary: string;
  expires_at: Date;
};

// Hash the exact string that is persisted, so execution can verify the stored payload byte for byte.
function hashArgs(argumentsJson: string): string {
  return createHash("sha256").update(argumentsJson).digest("hex");
}

export async function handleModelToolCalls(
  sessionId: string,
  toolCalls: Array<{ id: string; name: string; args: unknown }>,
  ctx: ToolContext,
): Promise<Array<{ toolCallId: string; content: string }>> {
  const results: Array<{ toolCallId: string; content: string }> = [];
  for (const call of toolCalls) {
    const tool = registry.get(call.name);
    if (!tool) {
      // The model can propose a tool name that does not exist; fail closed instead of throwing mid-loop.
      results.push({
        toolCallId: call.id,
        content: JSON.stringify({ status: "denied", reason: `Unknown tool ${call.name}` }),
      });
      continue;
    }
    const decision = evaluateToolPolicy(tool, call.args as Record<string, unknown>, ctx);
    if (decision.effect === "deny") {
      results.push({
        toolCallId: call.id,
        content: JSON.stringify({ status: "denied", reason: decision.reason }),
      });
      continue;
    }
    if (decision.effect === "require_approval") {
      const idempotencyKey = `${sessionId}:${call.id}`;
      const existing = await db.findApprovalByIdempotency(idempotencyKey);
      if (existing) {
        results.push({
          toolCallId: call.id,
          content: JSON.stringify({
            // Report the stored status; a retried call may find the request already decided.
            status: existing.status === "pending" ? "pending_approval" : existing.status,
            approvalId: existing.id,
            summary: existing.summary,
          }),
        });
        continue;
      }
      const args = call.args;
      const argumentsJson = JSON.stringify(args);
      // A unique index on idempotency_key is assumed, so a concurrent duplicate insert
      // fails instead of creating a second row.
      const row = await db.createApproval({
        id: randomUUID(),
        session_id: sessionId,
        status: "pending",
        action_type: tool.actionType,
        tool_name: tool.name,
        arguments_json: argumentsJson,
        arguments_hash: hashArgs(argumentsJson),
        idempotency_key: idempotencyKey,
        // Refund-specific for this example; pick the summary builder by action_type in practice.
        summary: buildRefundSummary(args),
        expires_at: new Date(Date.now() + (decision.expiresInSec ?? 3600) * 1000),
        policy_reason: decision.reason,
      });
      await notifyApprovers(row, ctx);
      results.push({
        toolCallId: call.id,
        content: JSON.stringify({
          status: "pending_approval",
          approvalId: row.id,
          summary: row.summary,
        }),
      });
      continue;
    }
    // Policy allowed the call: execute immediately.
    const output = await tool.execute(call.args, ctx);
    results.push({ toolCallId: call.id, content: JSON.stringify(output) });
  }
  return results;
}

export async function approveAndExecute(
  approvalId: string,
  approver: { userId: string; roles: string[] },
): Promise<void> {
  const row = await db.approvalRequests.findById(approvalId);
  if (!row || row.status !== "pending") throw new InvalidStateError();
  if (new Date(row.expires_at) < new Date()) {
    await db.markExpired(approvalId);
    throw new ExpiredError();
  }
  if (!canApprove(approver, row)) throw new ForbiddenError();
  // markApproved is assumed to be a conditional update (WHERE status = 'pending') that
  // returns false when another request has already decided this row.
  const won = await db.markApproved(approvalId, approver.userId);
  if (!won) throw new InvalidStateError();
  await executeApprovedRequest(approvalId);
}
Wire approveAndExecute to your inbox UI and signed deep links. Chat endpoints must never be able to invoke execute for a gated tool directly; every gated call passes through approval state first.
Common mistakes and pitfalls
- Re-running the model to “fill in” arguments at execution time — Changes scope; breaks audit. Execute exactly what was approved.
- Storing approvals only in the chat transcript — Not queryable, not legally robust, lost on session reset.
- Requiring approval for every tool — Users abandon the product; approvers fatigue and click through. Tier tools aggressively.
- No expiry — Pending refunds pile up; execution fires on stale business state.
- Same person proposes and approves high-impact actions — Fails SOC2-style controls; encode separation in policy.
- Letting the model call execution endpoints — Tool implementations must be server-side only; never expose raw side-effect APIs to the client.
- Ignoring idempotency on approval create — Duplicate rows confuse approvers and can double-charge if execution guards are weak.
- Opaque tool results to approvers — “Model wanted to run tool X” without amount, tenant, and target is how mistakes get approved.
Conclusion
Agentic LLM features become production-ready when side effects are gated by policy and durable human decisions, not by conversational politeness. Classify tools by risk, evaluate deterministic policies, persist approval requests with hashed arguments, execute only after explicit approval (or bounded auto-approve), and instrument the full lifecycle. Pair this with solid session and tool-loop design from multi-turn LLM backends and defense in depth from trust boundaries for a coherent safety story.
If you are designing approval workflows for an assistant or hardening an existing agent stack, get in touch—I help teams ship scalable, auditable backends without sacrificing UX.