D4.1 ended with a top rung: when a shape must not be violated, make it unrepresentable. This chapter is that rung’s machinery. It has two generations: the older tool-use pattern, which is still the right tool for open-ended schemas, and the newer grammar-constrained features, which eliminate schema-violation retries entirely. Because the substance here is named API fields, schema rules, and numeric limits, read it as a feature surface: the exact names and numbers are the material.

Diagnostic: Do I know this already?

Answer these confidently and you can skim ahead to Exam essentials; if any is shaky, read closely — each is developed below.

In the classic pattern, which field of the forced tool call holds your data, and what do you do with the tool’s “result”?
What does strict: true guarantee that the classic pattern alone does not — and through which integration path is it silently dropped?
Which JSON-Schema keyword is mandatory on every object node, and why does the constrained decoder require it?
Name the two failure modes constrained decoding cannot prevent, and the stop_reason value that signals each.
For open-ended extraction (“I don’t know which fields will appear”), do you reach for structured outputs or classic tool use, and why?

Check your answers

The data is the forced call’s tool_use.input — that object is your extracted JSON; the tool’s “result” is discarded entirely.
It grammar-constrains tool inputs to your schema, so there are no wrong types ("2" for 2) and no missing required fields; it is silently dropped by the OpenAI SDK compatibility layer, which accepts the request but gives no grammar guarantee.
additionalProperties: false — an open object has no closed grammar to compile, so the decoder must know exactly which keys are permitted at each step.
Refusal (stop_reason: "refusal" — a 200 you are billed for, output may not match) and truncation (stop_reason: "max_tokens" — every token schema-valid but the object never closed).
Classic tool use: open-ended extraction needs additionalProperties: true, which structured outputs cannot accept because the feature requires additionalProperties: false on every object.

The classic mechanism: a tool whose input is your output

The oldest reliable way to get JSON is to borrow the tool-call channel. Define a tool whose input_schema is exactly the shape you want back, force Claude to call it, and read the call’s input: that object is your extracted JSON, and you discard the tool’s “result” entirely. The convention is to name the tool print_X (print_summary, print_entities) so the model treats it as committing data rather than taking an action. [Official] Extracting Structured JSON using Claude and Tool Use · Anthropic Forcing the call is what guarantees the extraction happens: tool_choice: {type: "tool", name: "print_summary"}. [Official] Extracting Structured JSON using Claude and Tool Use · Anthropic

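import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The tool's input_schema is the shape we want back.
tools = [{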
    "name": "print_summary",
    "description": "Prints a summary of the article.",
    "input_schema": {
        "type": "object",
        "properties": {
            "author":  {"type": "string"},
            "topics":  {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        "required": ["author", "topics", "summary"],
    },
}]
resp = client.messages.create(model="claude-opus-4-8", max_tokens=1024, tools=tools,
    tool_choice={"type": "tool", "name": "print_summary"}, messages=[...])
json_summary = next(b.input for b in resp.content if b.type == "tool_use")

Strict tool use: from shape to guarantee

The classic pattern steers Claude toward your shape but does not enforce it: Claude could still emit "2" where you need 2, or omit a required field. Setting strict: true on the tool definition closes that gap: “Setting strict: true on a tool definition guarantees Claude’s tool inputs match your JSON Schema by constraining the model’s token sampling to schema-valid outputs (a technique called grammar-constrained sampling).” [Official] Strict tool use · Anthropic The motivation is operational: “Without strict mode, Claude might return incompatible types (‘2’ instead of 2) or missing required fields, breaking your functions and causing runtime errors.” [Official] Strict tool use · Anthropic For “call one of N candidate tools and validate its inputs,” combine tool_choice: {type: "any"} with strict: true on each tool. [Official] Strict tool use · Anthropic
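As a rough illustration of the one-of-N pattern, the sketch below builds request arguments with two strict tools and tool_choice any. The helper strict_tool, the tool names (print_invoice, print_receipt), and their fields are hypothetical, and the sketch assumes strict sits at the top level of each tool definition next to name and input_schema.

```python
def strict_tool(name: str, description: str, properties: dict) -> dict:
    """Build a strict tool definition in which every property is required.

    Hypothetical helper for illustration, not part of any SDK.
    """
    return {
        "name": name,
        "description": description,
        "strict": True,  # assumed placement: top level of the tool definition
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": list(properties),
            "additionalProperties": False,  # closed object, as strict schemas need
        },
    }


tools = [
    strict_tool(
        "print_invoice",
        "Records an invoice.",
        {"invoice_id": {"type": "string"}, "total_cents": {"type": "integer"}},
    ),
    strict_tool(
        "print_receipt",
        "Records a receipt.",
        {"receipt_id": {"type": "string"},
         "items": {"type": "array", "items": {"type": "string"}}},
    ),
]

# "any" forces a call to one of the tools; strict constrains that call's inputs.
request_kwargs = {
    "max_tokens": 1024,
    "tools": tools,
    "tool_choice": {"type": "any"},
}
```

These arguments would be passed to the messages call together with a model and messages; they are left out here so the sketch runs without network access.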

Structured outputs: constrain the response itself

Strict tool use constrains a tool call. Its sibling constrains Claude’s response directly: output_config.format coerces the final assistant text to a JSON schema using the same pipeline. The two are “two complementary features: JSON outputs (output_config.format) … Strict tool use (strict: true),” and the payoff is the elimination of retry loops: “Structured outputs guarantee schema-compliant responses through constrained decoding … No retries needed for schema violations.” [Official] Structured outputs · Anthropic The request carries output_config: {format: {type: "json_schema", schema: {...}}}, and the conforming JSON arrives in the response text.
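A minimal sketch of that request shape and of reading the result, under stated assumptions: SUMMARY_SCHEMA and parse_structured_text are illustrative names, and the response is a local stand-in object (not a real API call) whose content is a list of blocks with type and text attributes.

```python
import json
from types import SimpleNamespace

# Hypothetical schema for illustration; note the closed object.
SUMMARY_SCHEMA = {
    "type": "object",
    "properties": {
        "author": {"type": "string"},
        "topics": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["author", "topics"],
    "additionalProperties": False,
}

# Shape described in the text: output_config.format with type "json_schema".
output_config = {"format": {"type": "json_schema", "schema": SUMMARY_SCHEMA}}


def parse_structured_text(resp) -> dict:
    """Join the text blocks of a response and parse them as JSON."""
    text = "".join(b.text for b in resp.content if b.type == "text")
    return json.loads(text)


# Stand-in for an API response so the sketch runs offline.
fake_resp = SimpleNamespace(
    content=[SimpleNamespace(
        type="text",
        text='{"author": "A. Writer", "topics": ["json"]}',
    )]
)
summary = parse_structured_text(fake_resp)
```

In a real call, output_config would be passed alongside model, max_tokens, and messages; the parsing step is the same.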

Note the migration: “The output_format parameter has moved to output_config.format, and beta headers are no longer required. The old beta header (structured-outputs-2025-11-13) and output_format parameter will continue working for a transition period.” [Official] Structured outputs · Anthropic
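For code that still builds requests the old way, the move can be sketched as a small dictionary rewrite. This is an illustration only: migrate_output_format is a hypothetical helper, and it assumes the legacy output_format value has the same shape as the new output_config.format value.

```python
def migrate_output_format(kwargs: dict) -> dict:
    """Return a copy of kwargs with a legacy output_format moved to output_config.format.

    Assumption: the legacy value can be reused unchanged as the new format value.
    """
    migrated = dict(kwargs)
    legacy = migrated.pop("output_format", None)
    if legacy is not None:
        config = dict(migrated.get("output_config", {}))
        config.setdefault("format", legacy)  # an explicit new-style format wins
        migrated["output_config"] = config
    return migrated


old_style = {
    "max_tokens": 512,
    "output_format": {
        "type": "json_schema",
        "schema": {
            "type": "object",
            "properties": {"title": {"type": "string"}},
            "required": ["title"],
            "additionalProperties": False,
        },
    },
}
new_style = migrate_output_format(old_style)
```

The helper copies rather than mutates, so the original arguments stay available for comparison during the transition period.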

The JSON-Schema subset and its one mandatory rule

Both features accept a subset of JSON Schema, not the full draft. Objects, arrays, the scalar types, enum, const, anyOf, and internal $ref are supported; external $ref, recursive schemas, numerical bounds (minimum/maximum), and string-length bounds are not, and unsupported features return a 400. [Official] Structured outputs · Anthropic The one rule that catches everyone: additionalProperties: false is required on every object node, and omitting it is the most common 400 for hand-authored schemas. When you need a numeric or length bound, the SDK helpers strip it from the schema, encode it as description text, and validate it client-side after the call instead. [Official] Structured outputs · Anthropic
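A local pre-flight check can catch several of these rules before the API returns a 400. The sketch below is a hypothetical linter, not an official validator: it covers only the rules named above (closed objects, numeric and string-length bounds, external $ref), assumes the bound keywords are minimum, maximum, minLength, and maxLength, and does not attempt to detect recursive schemas.

```python
UNSUPPORTED_KEYWORDS = {"minimum", "maximum", "minLength", "maxLength"}


def lint_schema(node, path="$"):
    """Yield (path, problem) pairs for the subset rules described above."""
    if not isinstance(node, dict):
        return
    is_object = node.get("type") == "object" or "properties" in node
    if is_object and node.get("additionalProperties") is not False:
        yield path, "object node missing additionalProperties: false"
    for keyword in sorted(node.keys() & UNSUPPORTED_KEYWORDS):
        yield path, f"unsupported keyword: {keyword}"
    ref = node.get("$ref")
    if isinstance(ref, str) and not ref.startswith("#"):
        yield path, "external $ref"
    # Recurse into the places a sub-schema can live in this subset.
    for name, child in node.get("properties", {}).items():
        yield from lint_schema(child, f"{path}.properties.{name}")
    for name, child in node.get("$defs", {}).items():
        yield from lint_schema(child, f"{path}.$defs.{name}")
    if isinstance(node.get("items"), dict):
        yield from lint_schema(node["items"], f"{path}.items")
    for i, child in enumerate(node.get("anyOf", [])):
        yield from lint_schema(child, f"{path}.anyOf[{i}]")


schema = {
    "type": "object",
    "properties": {
        "seats": {"type": "integer", "minimum": 1},
        "owner": {"type": "object", "properties": {"name": {"type": "string"}}},
    },
    "required": ["seats", "owner"],
    "additionalProperties": False,
}
problems = list(lint_schema(schema))
```

Here the root object passes, while the bounded integer and the nested open object are both reported with their paths.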

Limits, caching, and the failure modes that still get through

Three operational facts complete the picture.

Caching: the compiled grammar carries a first-request latency, then is “cached for 24 hours from last use,” and the cache “invalidates if you change the JSON schema structure or set of tools. Changing only name or description fields does NOT invalidate cache.” [Official] Structured outputs · Anthropic

Limits: a request allows at most 20 strict tools, 24 cumulative optional parameters across strict schemas, and 16 union-typed parameters; beyond that (or an internal grammar-size cap) you get a 400 “Schema is too complex for compilation.” [Official] Structured outputs · Anthropic

The failures constrained decoding cannot prevent: grammar constraints guarantee every emitted token is schema-valid, but not that Claude emits a complete result, so two gaps survive and a caller must check stop_reason for both. First, a refusal: if Claude refuses, the response is stop_reason: "refusal" with a 200 status, you are billed, and the output may not match the schema. [Official] Structured outputs · Anthropic Second, truncation: max_tokens is a hard cap on output, and stop_reason: "max_tokens" is its output-budget signal. [Official] How the agent loop works · Anthropic If generation hits that cap mid-structure, every token emitted was schema-valid but the object never closed: the JSON is cut off, so a parser rejects it just the same. The fix is not a retry on the same budget but a larger max_tokens (or a smaller schema); the constrained decoder cannot finish a structure it ran out of room to write.
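The per-request limits quoted above can be estimated locally before sending. The sketch below is an approximation under loud assumptions: the text does not say exactly how the API counts, so this version counts an optional parameter as any property absent from its object's required list (at any nesting depth) and a union-typed parameter as any node using anyOf or a list-valued type. The function names are hypothetical.

```python
LIMITS = {"strict_tools": 20, "optional_params": 24, "union_params": 16}


def _walk(node):
    """Yield every schema node reachable through properties, items, and anyOf."""
    if not isinstance(node, dict):
        return
    yield node
    for child in node.get("properties", {}).values():
        yield from _walk(child)
    if isinstance(node.get("items"), dict):
        yield from _walk(node["items"])
    for child in node.get("anyOf", []):
        yield from _walk(child)


def strict_budget(tools: list) -> dict:
    """Estimate usage against the limits (the counting rules are assumptions)."""
    strict = [t for t in tools if t.get("strict")]
    optional = unions = 0
    for tool in strict:
        for node in _walk(tool["input_schema"]):
            props = node.get("properties", {})
            optional += len(set(props) - set(node.get("required", [])))
            if "anyOf" in node or isinstance(node.get("type"), list):
                unions += 1
    return {"strict_tools": len(strict), "optional_params": optional,
            "union_params": unions}


def over_budget(usage: dict) -> list:
    """Return the names of the limits that the estimated usage exceeds."""
    return [name for name, limit in LIMITS.items() if usage[name] > limit]


example_tool = {
    "name": "print_record",
    "strict": True,
    "input_schema": {
        "type": "object",
        "properties": {"a": {"type": "string"},
                       "b": {"type": ["string", "null"]}},
        "required": ["a"],
        "additionalProperties": False,
    },
}
usage = strict_budget([example_tool])
```

A check like this is only a guard against obvious overruns; the API's own 400 remains the authority, including for the internal grammar-size cap, which cannot be estimated from outside.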

This is also why the classic cookbook pattern has a permanent niche: open-ended extraction with additionalProperties: true (“I don’t know which fields will be present”) is something structured outputs cannot do, because they require additionalProperties: false on every object. For open-ended schemas, plain tool use stays the right tool. [Official] Extracting Structured JSON using Claude and Tool Use · Anthropic
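A minimal sketch of the open-ended variant, for contrast with the closed schemas above. The tool name and description are hypothetical, and the response is a local stand-in object rather than a real API result. Because nothing constrains the keys or value types here, the caller should validate whatever comes back.

```python
from types import SimpleNamespace

# Open object: any keys may appear, so this tool cannot be marked strict.
open_ended_tool = {
    "name": "print_all_characteristics",  # hypothetical name
    "description": "Prints every characteristic found in the text.",
    "input_schema": {
        "type": "object",
        "additionalProperties": True,
    },
}
tool_choice = {"type": "tool", "name": open_ended_tool["name"]}


def read_open_ended(resp) -> dict:
    """Return the forced call's input, checking only that it is an object."""
    data = next(b.input for b in resp.content if b.type == "tool_use")
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object from the tool call")
    return data


# Stand-in response so the sketch runs offline.
fake_resp = SimpleNamespace(content=[
    SimpleNamespace(type="tool_use", input={"color": "red", "legs": 4}),
])
characteristics = read_open_ended(fake_resp)
```

The trade is explicit: flexibility about which fields appear, in exchange for giving up the grammar guarantee on their types.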

Worked example: Reading a strict extraction without trusting it blindly

You force a strict print_record extraction. The grammar guarantees the shape — but a naive caller that reads the result directly still crashes on the two failures above. Guard them:

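import anthropic

class ExtractionRefused(Exception): ...   # the model declined; no conforming object
class OutputTruncated(Exception): ...     # max_tokens hit before the object closed

client = anthropic.Anthropic()
# print_record is the strict tool definition (a dict with "strict": True), defined elsewhere.
resp = client.messages.create(
    model="claude-opus-4-8", max_tokens=1024, tools=[print_record],
    tool_choice={"type": "tool", "name": "print_record"}, messages=[...])

# strict: true makes each emitted token schema-valid -- not the response complete.
if resp.stop_reason == "refusal":
    raise ExtractionRefused()        # 200 status, you were billed, may not match schema
if resp.stop_reason == "max_tokens":
    raise OutputTruncated()          # valid tokens, but the object never closed
record = next(b.input for b in resp.content if b.type == "tool_use")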

Trace the reasoning: the grammar buys you per-token schema-validity, so you will never see "3" where you required 3. It does not buy you a guarantee that a record arrived. A refusal returns a 200 you paid for with no conforming object; a max_tokens stop cuts the JSON off mid-write so tool_use.input is absent or malformed. Reading resp.content before checking stop_reason turns both into a confusing StopIteration or parse error three layers downstream. Two cheap branches convert them into legible, actionable failures — and the max_tokens branch tells you to raise the cap, not to retry the same doomed request.
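The "raise the cap, do not retry the same request" rule can be sketched as a small loop over increasing budgets. This is one possible shape, not a prescribed one: extract_record, the budget values, and the injected call function are all hypothetical, and the fake responses below stand in for real API results so the sketch runs offline.

```python
from types import SimpleNamespace


class ExtractionRefused(Exception):
    """The model refused; a larger budget will not change that."""


class OutputTruncated(Exception):
    """Every budget tried ended in a max_tokens stop."""


def extract_record(call, budgets=(1024, 4096)):
    """Try each max_tokens budget in turn; call(max_tokens) returns a response."""
    for max_tokens in budgets:
        resp = call(max_tokens)
        if resp.stop_reason == "refusal":
            raise ExtractionRefused()
        if resp.stop_reason == "max_tokens":
            continue  # truncated: move on to the next, larger budget
        return next(b.input for b in resp.content if b.type == "tool_use")
    raise OutputTruncated(f"still truncated at max_tokens={budgets[-1]}")


def fake_call(max_tokens):
    """Stand-in for the API: truncates below 2000 tokens, succeeds above."""
    if max_tokens < 2000:
        return SimpleNamespace(stop_reason="max_tokens", content=[])
    block = SimpleNamespace(type="tool_use", input={"seats": 3})
    return SimpleNamespace(stop_reason="tool_use", content=[block])


record = extract_record(fake_call)
```

Injecting call keeps the control flow testable without network access; in real use it would wrap the messages request with the chosen max_tokens.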

Practice

Exercise solutions

Solution 1

A. A single forced tool plus strict: true gives both guarantees you need: the forced tool_choice ensures exactly this extraction runs, and strict constrains the inputs by grammar so seats is a real integer, not "3" — exactly the “incompatible types breaking your functions” case strict mode exists to prevent. B is the open-ended pattern; additionalProperties: true is for when you don’t know the fields, and it forgoes the strict guarantee — wrong for a fixed, type-critical record. C silently drops strict through the OpenAI-compatibility layer, so you lose the type guarantee precisely where you needed it. D is the unconstrained baseline D4.1 warned about — it can parse fine and still hand you "3", the failure you are trying to design out.

Solution 2

The most likely cause is an object node missing additionalProperties: false, and the one-line fix is to add it to every object in the schema — nested ones included. The decoder requires it because an open object (one allowing arbitrary extra keys) has no closed set of valid continuations to compile into a grammar: at each decoding step the model must know exactly which keys are permitted, and additionalProperties: false is what closes that set. Without it there is no finite grammar to constrain sampling against, so the API rejects the schema with a 400 rather than allow unconstrained keys. It is the top 400 for hand-authored schemas precisely because standard JSON Schema defaults additionalProperties to true, so a schema that “validates fine in your editor” still fails compilation here.
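The "add it to every object, nested ones included" fix is easy to get wrong by hand, so it can be applied mechanically. The helper below is a hypothetical sketch: it walks properties, $defs, items, and anyOf only, and it overwrites any existing additionalProperties value, which is the wrong behavior if an object is meant to stay open.

```python
import copy


def close_objects(schema: dict) -> dict:
    """Return a deep copy with additionalProperties: false on every object node."""
    closed = copy.deepcopy(schema)

    def visit(node):
        if not isinstance(node, dict):
            return
        if node.get("type") == "object" or "properties" in node:
            node["additionalProperties"] = False
        for child in node.get("properties", {}).values():
            visit(child)
        for child in node.get("$defs", {}).values():
            visit(child)
        if isinstance(node.get("items"), dict):
            visit(node["items"])
        for child in node.get("anyOf", []):
            visit(child)

    visit(closed)
    return closed


draft = {
    "type": "object",
    "properties": {
        "owner": {"type": "object", "properties": {"name": {"type": "string"}}},
        "tags": {"type": "array", "items": {"type": "object", "properties": {}}},
    },
}
fixed = close_objects(draft)
```

The input schema is left untouched, so the open and closed versions can be compared side by side.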

Solution 3

The niche is open-ended extraction — “I don’t know which fields will be present” — and the schema feature that defines it is additionalProperties: true (an object that may carry arbitrary, unknown keys). Constrained decoding cannot serve it because it requires additionalProperties: false on every object: the grammar must enumerate the permitted keys ahead of time, and an object that allows any key has no closed grammar to compile. So when the set of fields is genuinely unknown in advance, the classic print_X tool-use pattern — which imposes no such closure — remains the right tool; structured outputs is for shapes you can pin down completely.

Solution 4

What is happening is truncation, not a grammar failure. max_tokens is a hard cap on total output, and the generation ran into it partway through writing the object; the grammar did its job — every token emitted was schema-valid — but it cannot guarantee the structure finishes within the budget, so the JSON is cut off before its closing braces and the parser rejects it. The confirming signal is stop_reason: "max_tokens" (the output-budget value), as opposed to end_turn. The fix is to raise max_tokens (or shrink the schema / split the extraction). A plain retry of the identical request is not the fix because it re-runs against the same budget and truncates at the same place — you must enlarge the room before the structure can complete.
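The parser's behavior on a cut-off object can be reproduced offline. The sketch below simulates truncation by slicing a complete JSON string; the field values and the try_parse helper are illustrative, and the slice merely stands in for a generation that stopped at max_tokens.

```python
import json

complete = json.dumps(
    {"author": "A. Writer", "topics": ["json", "schemas"], "summary": "Short."}
)
# Simulate generation stopping partway through the object.
truncated = complete[: len(complete) // 2]


def try_parse(text: str):
    """Return the parsed object, or None if the text is not complete JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


parsed_complete = try_parse(complete)
parsed_truncated = try_parse(truncated)
```

The truncated prefix contains no invalid content, only a missing ending, which is why the remedy is more room rather than another attempt at the same size.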

Exam essentials

Classic tool-use pattern — define a print_X tool whose input_schema is your output shape, force it with tool_choice: {type: "tool", name: ...}, read tool_use.input; the tool result is discarded.
strict: true — grammar-constrains tool inputs to the schema (no wrong types, no missing required fields); pair with tool_choice: {type: "any"} for “one-of-N and valid.” Ignored on the OpenAI-compat layer.
output_config.format — grammar-constrains Claude’s response to a JSON schema; “no retries needed for schema violations.” Migrated from the output_format param / structured-outputs-2025-11-13 beta header.
Schema subset — a subset of Draft 2020-12; additionalProperties: false is mandatory on every object (top 400 cause); no numeric/length bounds (SDKs strip them to descriptions + post-validate); no external $ref or recursion.
Limits + failure modes — 20 strict tools / 24 optional params / 16 union types per request; grammar cached 24h from last use (invalidated by schema/tool-set change, not name/description); two failures slip past constrained decoding — a refusal (stop_reason: "refusal"; 200, billed, may not match) and truncation (stop_reason: "max_tokens"; JSON cut off mid-object). Always check stop_reason for both; raise max_tokens for the second rather than retrying.