Part 2 Chapter 17 Last verified 2026-06-13 Fresh

Shaping Output — Structured & Reliable

The output half of shaping I/O — four levers that force reliable machine-readable output, ordered strongest-guarantee to lightest. tool_choice forces the call while strict guarantees the args; structured outputs add a grammar-backed guarantee that holds except for refusals and max_tokens cutoffs and only over the supported schema subset; prevent beats recover, so the retry loop is the fallback, not the primary path.

Volatility: feature-surface

Tools compared: claude-codecross-tool

On this page

One problem, four levers
Tool use: tool_choice forces the call, strict guarantees the args
The structured-outputs guarantee — and its limits
Prevent beats recover
The lightest lever: prompt-craft recipes
A note on evidence
Patterns
Quick reference
Practice

Before you start: The prompting-craft chapter just before this one (it shapes what goes in; this chapter shapes what comes out) and the spine's capability axis — output shape is a capability-axis decision. Tool use as a mechanism (the MCP and tool-minimization chapters) helps but is not required.

You will learn

The four output levers as one continuous surface — strongest guarantee down to lightest weight — and that choosing one is a guarantee-vs-flexibility decision
Why tool_choice and strict are two different levers: one forces the call, the other guarantees the arguments
The structured-outputs guarantee and its documented limits — so you never write “always valid JSON”
Why prevent beats recover, which makes the validation/retry loop the fallback rather than the primary path

The previous chapter shaped what goes into the model — the prompting craft. This one shapes what comes out: how you force output a program can parse instead of prose a human must read. It picks up a loose end from there — prefilling the assistant turn, a classic JSON-extraction trick, is being deprecated on the newest models, and the question “what do I reach for instead?” is exactly this chapter’s subject. The answer is four levers, arranged on one surface from the strongest guarantee to the lightest touch. The mechanics are Anthropic’s own API documentation — authoritative by construction, but a moving target: structured outputs is mid-GA-transition and the prefill gate grows with each model release, so this is a chapter to re-check per release, not to memorize.

One problem, four levers

Forcing reliable machine-readable output looks like several unrelated tricks — tool calls, a strict flag, retry loops, prefilling a brace — but they are one problem with four levers, and the levers form a single continuous surface from strongest guarantee to lightest weight.

Tool use turns a tool definition into an output mechanism: a tool’s input_schema is “[a JSON Schema] object defining the expected parameters for the tool,” [Official] Define tools · AnthropicT1-official original so you define one tool whose schema is the shape you want and parse the call it returns. [Official] Tool use with Claude · AnthropicT1-official original
Structured outputs / strict: true add a grammar-backed guarantee on top — schema-compliant output by construction. [Official] Structured outputs · AnthropicT1-official original
The validation + retry loop is the recover path for when the guarantee is unavailable or insufficient — define a schema, “and the SDK validates the output against it, re-prompting on mismatch.” [Official] Get structured output from agents · AnthropicT1-official original
The prompt-craft recipes — specify the format, prefill, stop sequences — are the lightest-weight, guarantee-free option. [Official] Increase output consistency · AnthropicT1-official original

This is a capability-axis decision in the spine’s sense: the output shape is part of what the agent’s tool surface costs and promises. The rest of the chapter walks the levers in order, and the discipline that orders them — prevent beats recover — falls out at the end.

The four output levers as one ladder, strongest guarantee at the top. Structured outputs / strict give a grammar-constrained guarantee — but only except refusals and max_tokens cutoffs, over the supported schema subset. tool_choice + strict force a call and constrain its arguments. The validation/retry loop recovers after an invalid output rather than preventing one. Prompt-asked JSON shapes likelihood, not the token grammar, so it carries no guarantee. Up the ladder is more guarantee; down it is more flexibility.

Tool use: `tool_choice` forces the call, `strict` guarantees the args

The first lever is the one most teams already have wired, because they use tool calls for everything else. Using a tool to shape output has two separable moves, and conflating them is the most common framing error.

The first move forces the model to emit a call. The tool_choice option {"type":"tool","name":"..."} “forces Claude to always use a particular tool.” [Official] Define tools · AnthropicT1-official original Give the model a single tool whose input_schema is your target shape, force that tool, and you are guaranteed it calls it.

The second move is not implied by the first. Forcing the call does not make the call’s arguments schema-valid — the model can still emit the wrong types or drop a required field. That guarantee is a separate addition: setting strict: true on the tool “guarantees Claude’s tool inputs match your JSON Schema by constraining the model’s token sampling to schema-valid outputs.” [Official] Strict tool use · AnthropicT1-official original

This chapter treats tool use as an output mechanism only. Which tools to expose, how few, and how to design their surface is the subject of the tool-minimization and MCP chapters earlier in this volume — a different question from “how do I get a known shape back.”

The structured-outputs guarantee — and its limits

The second lever is the strongest, and the one most often overstated. Structured outputs “guarantee schema-compliant responses through constrained decoding,” [Official] Structured outputs · AnthropicT1-official original and the guarantee is mechanistically grounded rather than a strong-prompt effect: structured outputs “use constrained sampling with compiled grammar artifacts” [Official] Structured outputs · AnthropicT1-official original — the schema is compiled into a grammar that constrains which tokens the model may sample. That compiled-grammar mechanism is why the guarantee holds and what distinguishes it from prompt-only JSON, which carries no such constraint.

But the guarantee is conditioned, and the conditions are load-bearing — you must render them every time.

This is the bridge to the first lever: tool_choice forces a call, and strict runs this same constrained-decoding pipeline over the call’s arguments. Schema-shaped return value (structured outputs) and schema-valid tool arguments (strict tool use) are the same guarantee applied to two surfaces.

Prevent beats recover

The third lever closes the loop when the output is still malformed — but its place in the ordering is the real lesson. There are two ways to deal with bad output, and they are not equals.

The recover path responds after the fact. On the tool-use API, when a client tool returns a tool_result with is_error: true, “Claude will then incorporate this error into its response,” [Official] Handle tool calls · AnthropicT1-official original and for an invalid or missing-parameter call, “Claude will retry 2-3 times with corrections before apologizing to the user.” [Official] Handle tool calls · AnthropicT1-official original On the Agent SDK, the same instinct is wired as a loop: you define a JSON Schema “and the SDK validates the output against it, re-prompting on mismatch,” [Official] Get structured output from agents · AnthropicT1-official original erroring out — surfaced as error_max_structured_output_retries — if validation does not succeed within the retry limit.

The prevent path is the second lever: strict / structured outputs eliminate the invalid call by construction, so there is nothing to recover from. The handle-tool-calls docs themselves point at strict as the way to “eliminate invalid calls” rather than retry them. [Official] Handle tool calls · AnthropicT1-official original

Recovery is not free and it is not certain. The SDK names three documented ways generation still fails: “This typically happens when the schema is too complex for the task, the task itself is ambiguous, or the agent hits its retry limit trying to fix validation errors.” [Official] Get structured output from agents · AnthropicT1-official original Each is a reason to prefer prevention: a simpler schema, a less ambiguous task, and a guarantee that needs no retries at all.

The lightest lever: prompt-craft recipes

The fourth lever is the oldest and the weakest — prompt-craft recipes that ask for a shape without constraining the tokens. They carry no schema guarantee at all, which is precisely when you want them: for flexibility beyond a strict schema, or on a path where the guarantee is not available.

The base recipe is to be explicit: “Precisely define your desired output format using JSON, XML, or custom templates so that Claude understands every output formatting element you require.” [Official] Increase output consistency · AnthropicT1-official original Two narrower JSON tricks have long ridden on top: prefilling the assistant turn to “skip the preamble and go straight to the JSON,” [Official] Prompting Claude for JSON mode · AnthropicT1-official original and pairing it with a stop sequence — “You can get rid of text that comes after the JSON by using a stop sequence.” [Official] Prompting Claude for JSON mode · AnthropicT1-official original

Here is where the loose end from the prompting chapter gets tied off. Prefilling the assistant turn is being deprecated on the newest models — it is not supported on Claude Opus 4.7, Opus 4.6, Sonnet 4.6, or Mythos Preview; on those models the documented replacement is structured outputs or system-prompt instructions. [Official] Increase output consistency · AnthropicT1-official original So the classic prefill-{ recipe is now legacy on current models — reach for the guarantee (lever two) or a system-prompt instruction instead. The stop-sequence recipe is unaffected by the gate and remains useful for trimming trailing prose.

These recipes shape the likelihood of a good shape; they do not constrain the grammar. That is the whole reason the docs redirect to structured outputs the moment you need a guarantee — the recipes are for the cases where you deliberately don’t.

A note on evidence

Everything in this chapter is Anthropic’s own API documentation — T1, authoritative by construction. That is the right tier for a vendor-API reference, but it is mono-vendor: there is no independent benchmark of how often structured outputs fail on a complex schema, or of the real-world distribution of refusal-versus-cutoff. This chapter does not invent one. The one number it quotes — “2-3 times” — is the docs’ own documented retry range, not a measured rate. If you want a failure rate for your schemas, you measure it; the docs tell you the guarantee and its conditions, not your distribution.

Patterns

Schema-as-output tool. Sketch: define one tool whose input_schema is your target shape, force it, parse the call. When to use: you already speak tool use and want a known shape back. Define tools · AnthropicT1-official original Mechanics: set tool_choice to {"type":"tool","name":"..."} to force the call; add strict: true to constrain the arguments. Strict tool use · AnthropicT1-official original Remember: tool_choice forces the call; strict is what makes the arguments conform — they are two levers.

Grammar-constrained guarantee. Sketch: use structured outputs / strict for a hard schema guarantee. When to use: you need schema-compliant output by construction, within the supported subset. Structured outputs · AnthropicT1-official original Mechanics: the schema compiles to a grammar that constrains token sampling; same pipeline drives strict tool use. Remember: the guarantee holds except refusals and max_tokens cutoffs, over the supported schema subset — still check stop_reason and a parse failure.

Prevent, then recover. Sketch: prevent invalid output with the guarantee; keep a retry loop for what it can’t reach. When to use: always order it this way. Get structured output from agents · AnthropicT1-official original Mechanics: structured outputs / strict first; fall back to validate-and-re-prompt (the SDK loop, or is_error feedback) where the guarantee is unavailable or insufficient. Handle tool calls · AnthropicT1-official original Remember: recovery costs a round trip per failure and can exhaust retries (complex schema, ambiguous task, retry-limit hit) — prevention costs neither.

Prompt-craft for flexibility. Sketch: specify the format and (optionally) use a stop sequence when you need reach beyond a strict schema. When to use: output flexibility the guarantee can’t express, or a path without it. Increase output consistency · AnthropicT1-official original Mechanics: state the format precisely; stop-sequence to trim trailing prose; do not prefill on current models (Opus 4.7 / Opus 4.6 / Sonnet 4.6 / Mythos Preview) — use system-prompt instructions or structured outputs. Prompting Claude for JSON mode · AnthropicT1-official original Remember: these shape likelihood, not the token grammar — no guarantee.

Quick reference

One surface, four levers: tool use → structured outputs / strict → validation/retry → prompt-craft — strongest guarantee to lightest weight. Structured outputs · AnthropicT1-official original
Two levers, not one: tool_choice forces the call; strict guarantees the arguments. Strict tool use · AnthropicT1-official original
The guarantee, stated honestly: schema-compliant output except refusals and max_tokens cutoffs, over the supported schema subset — never “always valid JSON.” Structured outputs · AnthropicT1-official original
Prevent beats recover: use the guarantee first; the retry loop is the fallback, not the default. Handle tool calls · AnthropicT1-official original
Prefill is deprecated on current models (Opus 4.7 / Opus 4.6 / Sonnet 4.6 / Mythos Preview) — use structured outputs or system-prompt instructions; the gate grows per release. Increase output consistency · AnthropicT1-official original
Volatility: structured-outputs GA is mid-transition and the prefill gate moves — re-check the param name and unsupported-model list per release.
Evidence: mono-vendor T1 docs; the only number (“2-3 times”) is the docs’ documented retry range, not a measured failure rate. Handle tool calls · AnthropicT1-official original

Practice

Exercise solutions

Solution ↑ Exercise

The two conflated levers are tool_choice (forces the model to emit a call) and strict (guarantees the call’s arguments match the JSON Schema by constraining token sampling). Forcing the tool with tool_choice {"type":"tool","name":"extract_invoice"} guarantees only that Claude calls extract_invoice — it does not guarantee the arguments are schema-valid; without strict, the docs note, the model “might return incompatible types or missing required fields.” The one addition that closes the gap is strict: true on the tool definition, which runs the constrained-decoding pipeline over the arguments so they conform to the input_schema. The teammate forced the call and assumed they had also constrained the contents; those are separate levers.

Solution ↑ Exercise

A pasteable sentence: “Structured outputs / strict guarantee schema-compliant output through constrained decoding — except for refusals (stop_reason: refusal) and max_tokens cutoffs, and only over the supported JSON-Schema subset.” “It always returns valid JSON” is an operational error, not just loose phrasing, because the guarantee is conditioned on a normal completion: a refusal ends generation before a complete object exists, and a max_tokens cutoff truncates mid-object — both can yield output that does not parse, with the guarantee fully in force. So a production path must still check stop_reason and still handle a parse failure; the guarantee shrinks that handling to a rare edge case but does not remove the need for it. Treating the guarantee as absolute is what removes those checks and turns a rare refusal or truncation into an unhandled crash.

Solution ↑ Exercise

Order by prevent-then-recover. First, the guarantee: reach for structured outputs / strict (lever two) — for a nested object you want the grammar-constrained guarantee, provided the schema stays inside the supported JSON-Schema subset; if the return is naturally a tool call, use tool_choice to force it and strict to constrain its arguments (the same pipeline). Prompt-craft drops out early: prefill is deprecated on current models (Opus 4.7 / Opus 4.6 / Sonnet 4.6 / Mythos Preview), so the classic prefill-{ recipe is off the table — if you need any prompt-side help, it is a system-prompt format instruction (and a stop sequence to trim trailing prose), used only for flexibility the schema can’t express. The validation/retry loop sits last, as the fallback — engaged when the guarantee is unavailable (a path or model without structured outputs) or insufficient (a schema richer than the supported subset). Even as a fallback it can still fail, and the docs name when: the schema is too complex for the task, the task itself is ambiguous, or the agent exhausts its retry limit — which is exactly the argument for keeping the schema focused and the task unambiguous so prevention carries the load and the loop rarely runs.

One problem, four levers

Tool use: tool_choice forces the call, strict guarantees the args

The structured-outputs guarantee — and its limits

Prevent beats recover

The lightest lever: prompt-craft recipes

A note on evidence

Patterns

Quick reference

Practice

Exercise solutions

Tool use: `tool_choice` forces the call, `strict` guarantees the args