D4.1 ended with a top rung: when a shape must not be violated, make it unrepresentable. This chapter is that rung’s machinery. It has two generations — the older tool-use pattern that is still the right tool for open-ended schemas, and the newer grammar-constrained features that eliminate schema-violation retries entirely — and because the substance here is named API fields, schema rules, and numeric limits, it is a feature surface.
The classic mechanism: a tool whose input is your output
The oldest reliable way to get JSON is to borrow the tool-call channel. Define a tool whose input_schema is exactly the shape you want back, force Claude to call it, and read the call’s input — that object is your extracted JSON; you discard the tool’s “result” entirely. The convention is to name the tool print_X (print_summary, print_entities) so the model treats it as committing data rather than taking an action.
[Official] Extracting Structured JSON using Claude and Tool Use · Anthropic
Forcing the call is what guarantees the extraction happens: tool_choice: {type: "tool", name: "print_summary"}.
[Official] Extracting Structured JSON using Claude and Tool Use · Anthropic
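import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The tool's input_schema is exactly the output shape we want back.
tools = [{
    "name": "print_summary",
    "description": "Prints a summary of the article.",
    "input_schema": {
        "type": "object",
        "properties": {
            "author": {"type": "string"},
            "topics": {"type": "array", "items": {"type": "string"}},
            "summary": {"type": "string"},
        },
        "required": ["author", "topics", "summary"],
    },
}]
# Forcing the tool guarantees the extraction happens.
resp = client.messages.create(
    model="claude-opus-4-8", max_tokens=1024, tools=tools,
    tool_choice={"type": "tool", "name": "print_summary"},
    messages=[...],  # elided in the source: the user turn carrying the article
)
# The call's input is the extracted JSON; no tool result is ever sent back.
json_summary = next(b.input for b in resp.content if b.type == "tool_use")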
Strict tool use: from shape to guarantee
The classic pattern asks for a shape but does not enforce it: Claude could still emit "2" where you need 2, or leave out a required field. Setting strict: true on the tool definition closes that gap: “Setting strict: true on a tool definition guarantees Claude’s tool inputs match your JSON Schema by constraining the model’s token sampling to schema-valid outputs (a technique called grammar-constrained sampling).”
[Official] Strict tool use · Anthropic
The motivation is operational: “Without strict mode, Claude might return incompatible types (‘2’ instead of 2) or missing required fields, breaking your functions and causing runtime errors.”
[Official] Strict tool use · Anthropic
For “call one of N candidate tools and validate its inputs,” combine tool_choice: {type: "any"} with strict: true on each tool.
[Official] Strict tool use · Anthropic
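As a rough sketch of the one-of-N pattern, the request arguments might be assembled as below. The tool names and schemas (get_weather, book_table) and the helper with_strict are hypothetical, invented for illustration; only strict: true and tool_choice: {type: "any"} come from the text. Nothing here calls the API.

```python
import copy


def with_strict(tools):
    """Return copies of the tool definitions with strict mode switched on."""
    strict_tools = []
    for tool in tools:
        tool = copy.deepcopy(tool)  # leave the caller's definitions untouched
        tool["strict"] = True       # strict: true sits on the tool definition
        strict_tools.append(tool)
    return strict_tools


# Hypothetical candidate tools, for illustration only.
candidate_tools = [
    {
        "name": "get_weather",
        "description": "Look up the weather for a city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    },
    {
        "name": "book_table",
        "description": "Book a restaurant table.",
        "input_schema": {
            "type": "object",
            "properties": {"seats": {"type": "integer"}},
            "required": ["seats"],
            "additionalProperties": False,
        },
    },
]

request_kwargs = {
    "tools": with_strict(candidate_tools),
    "tool_choice": {"type": "any"},  # Claude must call one of the tools
}
```

These keyword arguments would be passed to client.messages.create alongside model, max_tokens, and messages.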
Structured outputs: constrain the response itself
Strict tool use constrains a tool call. Its sibling constrains Claude’s response directly: output_config.format coerces the final assistant text to a JSON schema using the same pipeline. The two are “two complementary features: JSON outputs (output_config.format) … Strict tool use (strict: true),” and the payoff is the elimination of retry loops: “Structured outputs guarantee schema-compliant responses through constrained decoding … No retries needed for schema violations.”
[Official] Structured outputs · Anthropic
The request carries output_config: {format: {type: "json_schema", schema: {...}}}, and the conforming JSON arrives in the response text.
Note the migration: “The output_format parameter has moved to output_config.format, and beta headers are no longer required. The old beta header (structured-outputs-2025-11-13) and output_format parameter will continue working for a transition period.”
[Official] Structured outputs · Anthropic
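A minimal sketch of that request shape and of reading the result, assuming a made-up invoice schema. The helper names (json_output_config, parse_json_text) and the sample text are hypothetical stand-ins, not part of any SDK, and no API call is made.

```python
import json

# Hypothetical schema, for illustration only.
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "total_cents": {"type": "integer"},
    },
    "required": ["vendor", "total_cents"],
    "additionalProperties": False,
}


def json_output_config(schema):
    """Build the output_config value that requests schema-constrained JSON."""
    return {"format": {"type": "json_schema", "schema": schema}}


def parse_json_text(text):
    """The conforming JSON arrives as response text, so json.loads is enough."""
    return json.loads(text)


output_config = json_output_config(invoice_schema)

# Stand-in for the text of a real response; not actual model output.
sample_text = '{"vendor": "Acme", "total_cents": 1299}'
invoice = parse_json_text(sample_text)
```

The output_config value would travel in the request next to model, max_tokens, and messages; the parsing step is the same whatever the schema.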
The JSON-Schema subset and its one mandatory rule
Both features accept a subset of JSON Schema, not the full draft. Objects, arrays, the scalar types, enum, const, anyOf, and internal $ref are supported; external $ref, recursive schemas, numerical bounds (minimum/maximum), and string-length bounds are not — unsupported features return a 400.
[Official] Structured outputs · Anthropic
The one rule that catches everyone: additionalProperties: false is required on every object node, and omitting it is the most common 400 for hand-authored schemas. When you need a numeric or length bound, the SDK helpers strip it from the schema, encode it as description text, and validate it client-side after the call instead.
[Official] Structured outputs · Anthropic
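The subset rules lend themselves to a pre-flight check. The walker below is a simplified sketch, not an official validator: lint_schema is a hypothetical helper that only looks at the keywords named above (the four bounds minimum, maximum, minLength, maxLength, external $ref, and open objects), and it assumes "type" is a plain string.

```python
UNSUPPORTED_BOUNDS = {"minimum", "maximum", "minLength", "maxLength"}


def lint_schema(node, path="$"):
    """Return a list of human-readable problems found in a schema dict."""
    problems = []
    if not isinstance(node, dict):
        return problems
    # Every object node must be closed with additionalProperties: false.
    if node.get("type") == "object" and node.get("additionalProperties") is not False:
        problems.append(f"{path}: object without additionalProperties: false")
    # Numeric and string-length bounds are outside the supported subset.
    for key in sorted(node.keys() & UNSUPPORTED_BOUNDS):
        problems.append(f"{path}: unsupported bound '{key}'")
    # Only internal references (starting with '#') are allowed.
    ref = node.get("$ref")
    if isinstance(ref, str) and not ref.startswith("#"):
        problems.append(f"{path}: external $ref '{ref}'")
    # Recurse into the places a subschema can live in this simple walker.
    for name, child in node.get("properties", {}).items():
        problems += lint_schema(child, f"{path}.properties.{name}")
    if isinstance(node.get("items"), dict):
        problems += lint_schema(node["items"], f"{path}.items")
    for i, child in enumerate(node.get("anyOf", [])):
        problems += lint_schema(child, f"{path}.anyOf[{i}]")
    for name, child in node.get("$defs", {}).items():
        problems += lint_schema(child, f"{path}.$defs.{name}")
    return problems


# A hand-authored draft with three problems: two open objects and one bound.
draft = {
    "type": "object",
    "properties": {
        "seats": {"type": "integer", "minimum": 1},
        "guest": {"type": "object", "properties": {"name": {"type": "string"}}},
    },
    "required": ["seats"],
}
problems = lint_schema(draft)
```

Running a check like this before the request turns a 400 from the API into a local error message that names the offending node.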
Limits, caching, and the failure modes that still get through
Three operational facts complete the picture. Caching: the compiled grammar carries a first-request latency, then is “cached for 24 hours from last use,” and the cache “invalidates if you change the JSON schema structure or set of tools. Changing only name or description fields does NOT invalidate cache.”
[Official] Structured outputs · Anthropic
Limits: a request allows at most 20 strict tools, 24 cumulative optional parameters across strict schemas, and 16 union-typed parameters; beyond that (or an internal grammar-size cap) you get a 400 “Schema is too complex for compilation.”
[Official] Structured outputs · Anthropic
The failures constrained decoding cannot prevent. Grammar constraints guarantee every emitted token is schema-valid, but not that Claude emits a complete result, so two gaps survive and a caller must check stop_reason for both. First, a refusal: if Claude refuses, the response is stop_reason: "refusal" with a 200 status, you are billed, and the output may not match the schema.
[Official] Structured outputs · Anthropic
Second, truncation: max_tokens is a hard cap on output, and stop_reason: "max_tokens" is its output-budget signal.
[Official] How the agent loop works · Anthropic
If generation hits that cap mid-structure, every token emitted was schema-valid but the object never closed: the JSON is cut off, so a parser rejects it just the same. The fix is not a retry on the same budget but a larger max_tokens (or a smaller schema); the constrained decoder cannot finish a structure it ran out of room to write.
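The two stop_reason checks can be sketched as a small guard in front of the JSON parser. This is an illustrative helper, not SDK code: parse_structured and the two exception classes are hypothetical, and the SimpleNamespace objects merely stand in for a response that exposes stop_reason and a list of content blocks.

```python
import json
from types import SimpleNamespace


class RefusalError(Exception):
    """Claude refused; the output may not match the schema."""


class TruncatedOutputError(Exception):
    """Generation hit max_tokens before the JSON structure closed."""


def parse_structured(response):
    """Check stop_reason for both surviving failure modes, then parse."""
    if response.stop_reason == "refusal":
        raise RefusalError("refused: do not parse, do not blindly retry")
    if response.stop_reason == "max_tokens":
        raise TruncatedOutputError("truncated: raise max_tokens or shrink the schema")
    text = "".join(b.text for b in response.content if b.type == "text")
    return json.loads(text)


# Stand-in responses for illustration; not real API objects.
complete = SimpleNamespace(
    stop_reason="end_turn",
    content=[SimpleNamespace(type="text", text='{"seats": 3}')],
)
truncated = SimpleNamespace(
    stop_reason="max_tokens",
    content=[SimpleNamespace(type="text", text='{"seats": ')],
)

result = parse_structured(complete)
```

Checking stop_reason first means a truncated response surfaces as a budget problem rather than as an opaque JSON decode error.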
This is also why the classic cookbook pattern has a permanent niche: open-ended extraction with additionalProperties: true — “I don’t know which fields will be present” — is something structured outputs cannot do, since it requires additionalProperties: false. For open-ended schemas, plain tool use stays the right tool.
[Official] Extracting Structured JSON using Claude and Tool Use · Anthropic
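For contrast, an open-ended tool definition might look like the sketch below. The tool name, description, and sample input are hypothetical; the point is only that the schema leaves the key set open and therefore carries no strict flag.

```python
# Hypothetical open-ended extraction tool: the keys are not known in advance.
open_ended_tool = {
    "name": "print_all_characteristics",
    "description": "Prints every characteristic found in the text.",
    "input_schema": {
        "type": "object",
        "additionalProperties": True,  # any key may appear
    },
}

# Still forced, exactly as in the classic pattern.
tool_choice = {"type": "tool", "name": open_ended_tool["name"]}


def extracted_fields(tool_input):
    """The field names are only known once the call comes back."""
    return sorted(tool_input.keys())


# Stand-in for a tool_use block's input; not actual model output.
sample_input = {"color": "red", "breed": "tabby"}
fields = extracted_fields(sample_input)
```

Because nothing constrains the keys or their types here, any downstream code has to validate what it receives.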
Practice
Exercise solutions
A. A single forced tool plus strict: true gives both guarantees you need: the forced tool_choice ensures exactly this extraction runs, and strict constrains the inputs by grammar so seats is a real integer, not "3" — exactly the “incompatible types breaking your functions” case strict mode exists to prevent. B is the open-ended pattern; additionalProperties: true is for when you don’t know the fields, and it forgoes the strict guarantee — wrong for a fixed, type-critical record. C silently drops strict through the OpenAI-compatibility layer, so you lose the type guarantee precisely where you needed it. D is the unconstrained baseline D4.1 warned about — it can parse fine and still hand you "3", the failure you are trying to design out.
The most likely cause is an object node missing additionalProperties: false, and the one-line fix is to add it to every object in the schema — nested ones included. The decoder requires it because an open object (one allowing arbitrary extra keys) has no closed set of valid continuations to compile into a grammar: at each decoding step the model must know exactly which keys are permitted, and additionalProperties: false is what closes that set. Without it there is no finite grammar to constrain sampling against, so the API rejects the schema with a 400 rather than allow unconstrained keys. It is the top 400 for hand-authored schemas precisely because standard JSON Schema defaults additionalProperties to true, so a schema that “validates fine in your editor” still fails compilation here.
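The fix can be applied mechanically. The helper below is a naive sketch (close_objects is a hypothetical name): it walks every nested dict and list, so it assumes the schema does not use "type": "object" as literal data inside enum or const values, and it overwrites any explicit additionalProperties: true, which changes the schema's meaning.

```python
import copy


def close_objects(schema):
    """Return a copy with additionalProperties: false on every object node."""

    def visit(node):
        if isinstance(node, dict):
            if node.get("type") == "object":
                node["additionalProperties"] = False
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    closed = copy.deepcopy(schema)  # never mutate the caller's schema
    visit(closed)
    return closed


# A schema that "validates fine in your editor" but has two open objects.
open_schema = {
    "type": "object",
    "properties": {
        "guest": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        },
        "seats": {"type": "integer"},
    },
    "required": ["guest", "seats"],
}
closed_schema = close_objects(open_schema)
```

After the call, both the root object and the nested guest object carry additionalProperties: false, while open_schema itself is unchanged.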
The niche is open-ended extraction — “I don’t know which fields will be present” — and the schema feature that defines it is additionalProperties: true (an object that may carry arbitrary, unknown keys). Constrained decoding cannot serve it because it requires additionalProperties: false on every object: the grammar must enumerate the permitted keys ahead of time, and an object that allows any key has no closed grammar to compile. So when the set of fields is genuinely unknown in advance, the classic print_X tool-use pattern — which imposes no such closure — remains the right tool; structured outputs is for shapes you can pin down completely.
What is happening is truncation, not a grammar failure. max_tokens is a hard cap on total output, and the generation ran into it partway through writing the object; the grammar did its job — every token emitted was schema-valid — but it cannot guarantee the structure finishes within the budget, so the JSON is cut off before its closing braces and the parser rejects it. The confirming signal is stop_reason: "max_tokens" (the output-budget value), as opposed to end_turn. The fix is to raise max_tokens (or shrink the schema / split the extraction). A plain retry of the identical request is not the fix because it re-runs against the same budget and truncates at the same place — you must enlarge the room before the structure can complete.
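One way to express "enlarge the room" in code is to retry only with a bigger budget. In this sketch, send is a hypothetical callable that wraps the real request and returns (stop_reason, text); the doubling policy, the default numbers, and fake_send are illustrative assumptions, not recommendations from the documentation.

```python
def call_with_growing_budget(send, max_tokens=1024, ceiling=8192):
    """Call send(max_tokens), doubling the budget while output is truncated."""
    while True:
        stop_reason, text = send(max_tokens)
        if stop_reason != "max_tokens":
            # Not truncated; the caller still has to handle e.g. "refusal".
            return stop_reason, text, max_tokens
        if max_tokens >= ceiling:
            raise RuntimeError(f"still truncated at max_tokens={max_tokens}")
        max_tokens = min(max_tokens * 2, ceiling)


def fake_send(max_tokens):
    """Stand-in for the API: pretends the full object needs 3000 tokens."""
    if max_tokens < 3000:
        return "max_tokens", '{"items": ["a", "b"'
    return "end_turn", '{"items": ["a", "b", "c"]}'


stop_reason, text, used = call_with_growing_budget(fake_send)
```

With the fake above, the budgets tried are 1024, 2048, and 4096, and the third attempt completes. A retry at the original 1024 would have truncated at the same place every time.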
Exam essentials
- Classic tool-use pattern: define a print_X tool whose input_schema is your output shape, force it with tool_choice: {type: "tool", name: ...}, read tool_use.input; the tool result is discarded.
- strict: true grammar-constrains tool inputs to the schema (no wrong types, no missing required fields); pair with tool_choice: {type: "any"} for “one-of-N and valid.” Ignored on the OpenAI-compat layer.
- output_config.format grammar-constrains Claude’s response to a JSON schema; “no retries needed for schema violations.” Migrated from the output_format param / structured-outputs-2025-11-13 beta header.
- Schema subset: a subset of Draft 2020-12; additionalProperties: false is mandatory on every object (top 400 cause); no numeric/length bounds (SDKs strip them to descriptions + post-validate); no external $ref or recursion.
- Limits + failure modes: 20 strict tools / 24 optional params / 16 union types per request; grammar cached 24h from last use (invalidated by schema/tool-set change, not name/description); two failures slip past constrained decoding: a refusal (stop_reason: "refusal"; 200, billed, may not match) and truncation (stop_reason: "max_tokens"; JSON cut off mid-object). Always check stop_reason for both; raise max_tokens for the second rather than retrying.