JSON Schema validation for LLM tool calls: contracts, AJV, and fail-closed execution in TypeScript

Validate LLM tool arguments as untrusted input: JSON Schema design, AJV compilation, strict object rules, and a typed dispatcher pattern for production TypeScript backends.

Author: Matheus Palma · 8 min read
Software engineering · Artificial intelligence · TypeScript · Backend · API design · JSON Schema

Your assistant proposes delete_invoice with { "invoiceId": "123 OR 1=1" }. Another completion calls refund_payment with a string where the handler expects a number, throwing deep inside Stripe integration code. A third returns valid JSON but keys your dispatcher never heard of, silently falling through to a default branch that does something anyway. None of these failures are “hallucinations” in the poetic sense; they are predictable consequences of treating model-emitted structure as if it were client SDK output.

Function calling and structured outputs moved half of your integration surface area into strings that look typed. This article explains why JSON Schema validation belongs on the hot path, how to write schemas that align with real handlers, and how to wire AJV so validation is fast, explicit, and safe under load. The patterns come from the same place most backend hardening does: shipping assistants and automation APIs where side effects are irreversible and retries are common.

Why “the model returned JSON” is not enough

Providers may advertise JSON mode or constrained decoding, which reduces syntax errors. That is orthogonal to semantic validity:

  • Unknown tools — Models can emit tool names you removed from the prompt last week.
  • Shape drift — Optional fields disappear; numbers arrive as strings; arrays become single values when the user utterance was ambiguous.
  • Policy violations — Even honest completions can propose arguments that violate business rules (refunds above a cap, exports of the wrong tenant).

Your HTTP layer already validates request bodies. Tool calls are another request body that happens to be generated by an LLM instead of a browser. Skipping validation means pushing fuzzy data straight into code paths written for trusted internal callers.

For a broader threat model (prompt injection, data exfiltration), see LLM trust boundaries and defense in depth. Here the focus is narrower and equally critical: mechanical correctness and containment before any tool runs.

Schema design that matches what you actually execute

JSON Schema is expressive enough to encode most argument shapes you expose to models. The design goal is not maximal cleverness; it is one-to-one alignment between what the schema accepts and what your implementation accepts.

Prefer closed objects for tool arguments

Use type: "object" with additionalProperties: false (Draft 2020-12 / OpenAI “strict” style) whenever the tool’s handler branches on a fixed set of keys. Open objects encourage the model to invent configuration knobs you will ignore—or misinterpret if your code uses loose spreads.

Trade-off: every new option requires a schema + handler change together. That coupling is desirable for tools with side effects; it is how you keep refactors detectable instead of silent.

Be explicit about nullable and optional semantics

JSON Schema’s required array lists keys that must be present. If null is meaningful, model it with type: ["string", "null"] rather than leaving parsers to guess. If absence and null should behave differently, say so in the schema and document it in the tool description the model reads.

Constrain strings with intent

Use minLength, maxLength, pattern, and enum where they reflect real invariants (UUID formats, ISO dates, allowed statuses). Patterns should be maintainable: a 200-character regex copied from Stack Overflow will eventually disagree with the database.

Numbers, integers, and bounds

Financial and quantity fields benefit from multipleOf or integer types plus explicit minimum / maximum. This prevents absurd values from reaching payment APIs even when the user never typed digits—the model might.

Compile validators once, validate on every invocation

Interpreting JSON Schema on each tool call adds CPU overhead. AJV (Another JSON Schema Validator) compiles schemas into JavaScript functions; keep compiled validators in a Map keyed by tool name (or schema id).

Operational practices:

  • Fail closed — If no schema is registered for a tool name, reject the call before execution. Do not “best effort” execute.
  • Separate compile errors from warm-path errors — Schema compilation can throw, and should do so during deploy or startup tests; validation failures are expected, model-facing events on the warm path.
  • Stable error messages — Return structured errors to the model (as tool results) that say what failed without leaking internal stack traces to end users.

In high-traffic services, combine this layer with the resilience patterns discussed in production LLM API integration: timeouts, bounded tool rounds, and circuit breaking when downstream tools degrade.

From validated JSON to typed handlers

Validation should produce a narrowed type consumed by handlers—not unknown, and not raw any. Two common patterns:

  1. Discriminated registry — Tool name is a string literal union; each name maps to { schema, handle } where handle receives the validated output type.
  2. Branded types per tool — After validation, wrap the object in a nominal-style brand so downstream modules cannot accidentally pass RefundArgs into SendEmailArgs.

The important invariant: only the validation boundary may cast from unknown to concrete types.
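A minimal sketch of the branded-type pattern; `Brand`, `parseRefundArgs`, and `queueRefund` are illustrative names, not part of the registry example below:

```typescript
// Nominal-style brands: structurally distinct names make otherwise
// plain objects incompatible across tools.
declare const brand: unique symbol;
type Brand<T, Name extends string> = T & { readonly [brand]: Name };

type RefundArgs = Brand<{ paymentId: string; amountCents: number }, "RefundArgs">;
type SendEmailArgs = Brand<{ to: string; subject: string }, "SendEmailArgs">;

// The only place allowed to cast from unknown to a branded type.
function parseRefundArgs(raw: unknown): RefundArgs | null {
  if (typeof raw !== "object" || raw === null) return null;
  const o = raw as { paymentId?: unknown; amountCents?: unknown };
  if (typeof o.paymentId !== "string" || typeof o.amountCents !== "number") return null;
  return { paymentId: o.paymentId, amountCents: o.amountCents } as RefundArgs;
}

function queueRefund(args: RefundArgs): string {
  return `refund:${args.paymentId}`;
}

// Passing a SendEmailArgs value into queueRefund is a compile error even
// though nothing distinguishes the objects at runtime; the brand is the firewall.
const parsed = parseRefundArgs({ paymentId: "pay_1", amountCents: 100 });
console.log(parsed ? queueRefund(parsed) : "rejected");
```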

Practical example: registry with AJV and a fail-closed dispatcher

The following example is self-contained aside from installing AJV (npm install ajv). It shows compilation, strict object validation, deterministic errors fed back as tool results, and handler code that never sees unvalidated data.

import Ajv, { type ErrorObject, type JSONSchemaType } from "ajv";

type ToolError = {
  ok: false;
  tool: string;
  errors: Pick<ErrorObject, "instancePath" | "message" | "keyword">[];
};

type ToolSuccess<T> = { ok: true; data: T };

type ToolResult<T> = ToolSuccess<T> | ToolError;

type ToolDefinition<T> = {
  description: string;
  schema: JSONSchemaType<T>;
  execute: (args: T) => Promise<unknown>;
};

const ajv = new Ajv({
  allErrors: true,
  strict: true,
  removeAdditional: false, // we prefer explicit additionalProperties: false in schemas
});

function compileValidator<T>(schema: JSONSchemaType<T>) {
  return ajv.compile(schema);
}

/** Example tool args — note additionalProperties: false */
type RefundPaymentArgs = {
  paymentId: string;
  amountCents: number;
  reason: "duplicate" | "customer_request" | "fraud";
};

const refundPaymentSchema: JSONSchemaType<RefundPaymentArgs> = {
  type: "object",
  properties: {
    paymentId: { type: "string", minLength: 8, maxLength: 64 },
    amountCents: { type: "integer", minimum: 1, maximum: 500_000 },
    reason: { type: "string", enum: ["duplicate", "customer_request", "fraud"] },
  },
  required: ["paymentId", "amountCents", "reason"],
  additionalProperties: false,
};

async function refundPaymentHandler(args: RefundPaymentArgs): Promise<unknown> {
  // args is trusted in shape only — still check authz / tenancy / idempotency in real code
  return { status: "refund_queued", paymentId: args.paymentId, amountCents: args.amountCents };
}

const tools = {
  refund_payment: {
    description: "Queue a partial or full refund for a captured payment.",
    schema: refundPaymentSchema,
    execute: refundPaymentHandler,
  },
} as const satisfies Record<string, ToolDefinition<any>>;
// `any` only constrains the registry's shape: under ToolDefinition<unknown>,
// contravariance would reject concretely typed handlers. Each entry keeps
// its precise type via `as const`.

type ToolName = keyof typeof tools;

const validators = new Map<ToolName, ReturnType<typeof compileValidator>>(
  (Object.keys(tools) as ToolName[]).map((name) => [
    name,
    // The cast erases the per-tool generic: the Map stores heterogeneous
    // validators, and types are re-narrowed only inside dispatchToolCall.
    compileValidator(
      tools[name].schema as JSONSchemaType<
        (typeof tools)[typeof name] extends ToolDefinition<infer U> ? U : never
      >,
    ),
  ]),
);

export async function dispatchToolCall(
  name: string,
  rawArgs: unknown,
): Promise<{ result: unknown; validation: ToolResult<unknown> }> {
  if (!(name in tools)) {
    const err: ToolError = {
      ok: false,
      tool: name,
      errors: [{ instancePath: "", message: "unknown tool", keyword: "enum" }],
    };
    return { validation: err, result: err };
  }

  const toolName = name as ToolName;
  const validate = validators.get(toolName);
  if (!validate) {
    const err: ToolError = {
      ok: false,
      tool: name,
      errors: [{ instancePath: "", message: "validator missing", keyword: "x-internal" }],
    };
    return { validation: err, result: err };
  }

  if (!validate(rawArgs)) {
    const err: ToolError = {
      ok: false,
      tool: name,
      errors: (validate.errors ?? []).map((e) => ({
        instancePath: e.instancePath,
        message: e.message ?? "invalid",
        keyword: e.keyword,
      })),
    };
    return { validation: err, result: err };
  }

  const narrowed = rawArgs as (typeof tools)[typeof toolName] extends ToolDefinition<infer U> ? U : never;
  const data = await tools[toolName].execute(narrowed);
  return { validation: { ok: true, data: narrowed }, result: data };
}

In a full assistant loop, the ToolError object is what you serialize into a tool role message so the model can self-correct—mirroring how HTTP 400 bodies guide human clients. The multi-turn orchestration details are covered in session state and tool-call loops; this validation layer slots in before execute runs and after you parse JSON safely.
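As a sketch, assuming an OpenAI-style message shape (`{ role, tool_call_id, content }`) and re-declaring `ToolError` locally so the snippet stands alone, the serialization can look like:

```typescript
type ToolError = {
  ok: false;
  tool: string;
  errors: { instancePath: string; message: string; keyword: string }[];
};

function toToolMessage(toolCallId: string, err: ToolError) {
  return {
    role: "tool" as const,
    tool_call_id: toolCallId,
    // Deterministic, compact payload the model can act on; no stack traces.
    content: JSON.stringify({
      error: "invalid_arguments",
      tool: err.tool,
      violations: err.errors.map((e) => `${e.instancePath || "/"}: ${e.message}`),
    }),
  };
}

const msg = toToolMessage("call_1", {
  ok: false,
  tool: "refund_payment",
  errors: [{ instancePath: "/amountCents", message: "must be integer", keyword: "type" }],
});
console.log(msg.content);
```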

Common mistakes and pitfalls

  • Trusting provider-side “strict JSON Schema” as authorization — Structured output reduces malformed JSON; it does not prove the caller (or injected prompt) is allowed to perform the action. Always enforce authz after validation, on every path.
  • Using coerce types as a bandage — Automatically coercing "42" into 42 hides bugs and can launder unexpected values into financial code. Prefer explicit rejection and let the model retry with correct types.
  • Validators that diverge from handlers — If the schema allows a field the handler ignores, you will ship latent behavior: the model assumes an effect that never happens. Treat schema and handler as one commit whenever possible.
  • Logging raw arguments verbatim — They may contain PII or secrets echoed from context. Log hashes, truncated forms, or schema violation summaries instead.
  • Skipping validation on “internal” tools — Internal tools are often the most powerful (database queries, admin toggles). If the model can reach them, they are external surface area.

Conclusion

LLM tool calls are user-supplied structured data with extra entropy. JSON Schema gives you a declarative contract that models can be steered toward, while AJV gives you fast, deterministic enforcement on every invocation. Closed objects, explicit bounds, and a fail-closed dispatcher turn “JSON shaped like my API” into “JSON my API can actually run,” which is the bar for production assistants and automation endpoints.

Teams building scalable, production-ready assistants routinely invest in this layer early—not because models are adversarial by default, but because integration code should not depend on probabilistic syntax for correctness. If you are evaluating architecture for a new assistant or hardening an existing one, the combination of validation, session design, and trust-boundary work is what keeps side effects boring in the best sense.

For related reading, see multi-turn LLM backends and LLM trust boundaries. For engineering collaboration or reviews, use contact.
