The book closes on the question that underlies trust in any agent output: where did this claim come from, and is it still true? Provenance answers the first: a claim mapped to its source is auditable, while a claim without a source is a trust-me. Temporal validity answers the second. This chapter is the exam-angle treatment. The named features (the Citations API surface and its location modes) are the moving parts, so read it as a tour of a feature surface.

Do I know this already? Diagnostic

Answer these confidently and you can skim ahead to Exam essentials; if any is shaky, read closely — each is developed below.

What does the Citations API actually guarantee about a cited quote — and what does it not guarantee (e.g. a JSON grammar)?
Name the three citation location modes by document type.
You need structured JSON and per-claim attribution. Why does combining Citations with structured outputs fail, and what is the fallback?
Is a model’s reliable knowledge cutoff earlier or later than its training-data cutoff — and why?
Within one request, on how many of the documents must citations be enabled?

Check your answers

It guarantees the cited text is tied to an actual span in the source document — the model cannot fabricate a citation to text that is not there; it is span-bound, not grammar-constrained, so it guarantees nothing about output shape.
Plain text → char_location, PDF → page_location, custom content → content_block_location.
Cited text must interleave with the response prose, which a strict JSON schema forbids, so the API returns a 400; the fallback is the provenance triple (document_id, span_quote, confidence) verified caller-side.
Earlier (or equal) — data near the training cutoff is sparse, so the model’s reliable knowledge stops before its training does.
All or none — you cannot mix cited and uncited documents within a request.

Provenance maps every claim to its source

The point of provenance is verifiability. “Claude is capable of providing detailed citations when answering questions about documents, helping you track and verify information sources in responses. All active models support citations, with the exception of Haiku 3.” [Official] Citations · Anthropic You enable it per document with citations: {"enabled": true} on the document block, and each cited claim in the response carries a sibling citations array pointing back to the exact span of the source it came from. [Official] Citations · Anthropic One enablement rule to memorize: citations must be enabled on all or none of the documents within a request; you cannot mix cited and uncited documents. [Official] Citations · Anthropic
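The all-or-none rule is easy to check before a request is sent. A minimal sketch, assuming the document-block shape described in Anthropic's Citations documentation (`type`, `source`, `title`, `citations`); the helpers `text_document` and `check_all_or_none` are hypothetical names for illustration, and no API call is made.

```python
from typing import Any


def text_document(title: str, text: str, cite: bool = True) -> dict[str, Any]:
    """Build a plain-text document block (shape assumed from the Citations docs)."""
    return {
        "type": "document",
        "source": {"type": "text", "media_type": "text/plain", "data": text},
        "title": title,
        "citations": {"enabled": cite},
    }


def check_all_or_none(documents: list[dict[str, Any]]) -> bool:
    """Return True when citations are enabled on every document or on none."""
    flags = {
        bool(doc.get("citations", {}).get("enabled", False)) for doc in documents
    }
    return len(flags) <= 1


# Illustrative documents; the text is made up for the example.
docs = [
    text_document("Policy", "Refunds are issued within 30 days."),
    text_document("FAQ", "Shipping takes five business days."),
]
mixed = docs + [text_document("Notes", "Internal draft.", cite=False)]

print(check_all_or_none(docs))   # every document has citations enabled
print(check_all_or_none(mixed))  # one document opts out, so the set is mixed
```

Running the check client-side turns a rejected request into a local error you can report with the offending document's title.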

The Citations API and its location modes

How a citation points at its source depends on the document type, and there are three modes. Plain text is chunked into sentences and cited by char_location; a PDF is cited by page_location; custom content, where you supply the chunks, is cited by content_block_location. [Official] Citations · Anthropic The feature is also output-cheap: “the cited_text field is provided for convenience and does not count towards output tokens.” [Official] Citations · Anthropic
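The three modes can be handled with one small dispatcher. This is a sketch, not the SDK's own types: it treats a citation as a plain dict and assumes the field names `start_char_index`/`end_char_index`, `start_page_number`/`end_page_number`, and `start_block_index`/`end_block_index`, with the end index exclusive for character spans. Confirm both assumptions against the current Citations reference before relying on them. The function names are hypothetical.

```python
from typing import Any


def describe_location(citation: dict[str, Any]) -> str:
    """Turn a citation dict into a short human-readable locator."""
    kind = citation["type"]
    if kind == "char_location":
        return f"chars {citation['start_char_index']}-{citation['end_char_index']}"
    if kind == "page_location":
        return f"pages {citation['start_page_number']}-{citation['end_page_number']}"
    if kind == "content_block_location":
        return f"blocks {citation['start_block_index']}-{citation['end_block_index']}"
    raise ValueError(f"unknown citation type: {kind}")


def char_span_matches(source_text: str, citation: dict[str, Any]) -> bool:
    """Check a char_location citation against the plain text it points at.

    Assumes a 0-based start index and an exclusive end index.
    """
    start = citation["start_char_index"]
    end = citation["end_char_index"]
    return source_text[start:end] == citation["cited_text"]


# Illustrative data, not real API output.
source = "The grass is green. The sky is blue."
quote = "The grass is green."
char_cite = {
    "type": "char_location",
    "cited_text": quote,
    "start_char_index": 0,
    "end_char_index": len(quote),
}
page_cite = {"type": "page_location", "start_page_number": 3, "end_page_number": 4}

print(describe_location(char_cite), char_span_matches(source, char_cite))
print(describe_location(page_cite))
```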

The provenance triple: schema-friendly fallback

When the output must be structured JSON, the native Citations API is off the table, so you encode provenance into the schema yourself. This is the D4.4 hook applied to attribution. It is a design pattern this book recommends, not a platform feature: each extracted claim carries a source object with a document_id, a span_quote, and a confidence, and the caller verifies that span_quote actually appears in document_id. If it does not, the model fabricated the citation. The result is manual but checkable provenance that you can drop inside any schema.
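The caller-side check can be sketched in a few lines. The field names `document_id`, `span_quote`, and `confidence` come from the pattern above; the `Claim` and `Source` classes, the `verify_claim` helper, and the whitespace normalization are assumptions made for this illustration, and the sample document text is invented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    document_id: str
    span_quote: str
    confidence: float


@dataclass(frozen=True)
class Claim:
    claim: str
    source: Source


def _normalize(text: str) -> str:
    """Collapse whitespace runs so line wraps do not cause false mismatches."""
    return re.sub(r"\s+", " ", text).strip()


def verify_claim(claim: Claim, documents: dict[str, str]) -> bool:
    """True only if span_quote appears (modulo whitespace) in document_id."""
    doc_text = documents.get(claim.source.document_id)
    if doc_text is None:
        return False  # the model named a document the caller never supplied
    quote = _normalize(claim.source.span_quote)
    return bool(quote) and quote in _normalize(doc_text)


# Invented sample data for the example.
documents = {
    "q4-report": "Revenue for the quarter was $12.4 million,\nup 8% year over year."
}
good = Claim(
    "Revenue was $12.4 million",
    Source("q4-report", "Revenue for the quarter was $12.4 million", 0.9),
)
fabricated = Claim(
    "Revenue was $15 million",
    Source("q4-report", "Revenue reached $15 million", 0.8),
)

for c in (good, fabricated):
    print(c.claim, "->", "verified" if verify_claim(c, documents) else "REJECTED")
```

Exact substring matching is deliberately strict: a paraphrased quote fails, which is the point, since the confidence value is self-reported and only the span check is evidence.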

Temporal provenance: knowing when data is valid

Provenance is not only where a claim came from but when it can be trusted. Each model has a reliable knowledge cutoff (Opus 4.8 at January 2026, Sonnet 4.6 at August 2025, Haiku 4.5 at February 2025), and that reliable cutoff is earlier than (or equal to) the model’s training-data cutoff, not later: Sonnet 4.6 trained on data through January 2026 but is reliable only to August 2025, and Haiku 4.5 trained through July 2025 but is reliable to February 2025. [Official] Models overview · Anthropic Data near the training cutoff is sparse, so the model’s reliable knowledge stops before its training does. Past the reliable cutoff the model has no dependable knowledge, so a time-sensitive fact must come from a dated source supplied at request time (retrieval with a citation), not from the model’s memory. The use-side workflow for recording claim sources and decision dates is the handbook’s territory (its provenance and ADR material is forthcoming).
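The temporal rule reduces to a date comparison. A minimal sketch, using the reliable cutoffs quoted in this chapter at month granularity; the dictionary keys are display names rather than API model identifiers, and `needs_dated_source` is a hypothetical helper.

```python
from datetime import date

# Reliable knowledge cutoffs as stated in this chapter, as (year, month).
RELIABLE_CUTOFF = {
    "Opus 4.8": (2026, 1),
    "Sonnet 4.6": (2025, 8),
    "Haiku 4.5": (2025, 2),
}


def needs_dated_source(model: str, fact_date: date) -> bool:
    """True when the fact falls after the model's reliable cutoff month.

    The comparison deliberately uses the reliable cutoff, not the later
    training-data cutoff, because the earlier date is what bounds trust.
    """
    return (fact_date.year, fact_date.month) > RELIABLE_CUTOFF[model]


# Q4 2025 starts in October 2025, after Sonnet 4.6's reliable cutoff.
print(needs_dated_source("Sonnet 4.6", date(2025, 10, 1)))
# The same date is inside the reliable window quoted for Opus 4.8.
print(needs_dated_source("Opus 4.8", date(2025, 10, 1)))
```

A router could use this check to force retrieval for any question whose subject date is past the cutoff, rather than letting the model answer from memory.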

Provenance on both axes: where and when Worked example

A RAG assistant on Sonnet 4.6 (reliable knowledge cutoff August 2025) is asked: “What did the Q4 2025 earnings report say about revenue?” Both provenance axes are in play:

When — temporal validity first. Q4 2025 is after the model’s reliable cutoff (August 2025), so the model has no dependable knowledge of it — answering from memory risks a confident fabrication. The fact must come from a dated source supplied at request time, not the model’s weights. (Note it is irrelevant that Sonnet 4.6 trained through January 2026: the reliable cutoff is the earlier date, and it is what bounds trust.)
Where — bind the answer to the source. Supply the earnings PDF as a document with citations: {"enabled": true}. Because it is a PDF, the response cites by page_location, and each revenue claim carries the exact page span — auditable, not “trust me.” The cited_text echoes the source span back without costing output tokens.
If the pipeline also needs structured JSON, you hit the wall: Citations + output_config.format returns a 400. Fall back to the provenance triple: emit {claim, source: {document_id, span_quote, confidence}} per fact and verify caller-side that each span_quote appears in the cited PDF.

The closing synthesis of the book: a trustworthy claim needs where (a source span, via Citations or the triple) and when (a dated source, because the reliable cutoff — earlier than training — bounds what the model knows). Retrieval supplies the dated source; a citation binds the answer to it; verification proves the binding is real.

Practice

Exercise solutions

Solution

B. Citations and Structured Outputs are mutually exclusive — the 400 is the API telling you so — so when the structured shape is required, you encode provenance into the schema with a triple (document_id, span_quote, confidence) and verify each span against its source caller-side. A abandons the structured output the pipeline requires, trading one requirement for the other. C is the fabrication trap: an unverified span_quote may quote text that is not in the document, which is exactly the failure provenance exists to catch. D doubles cost and leaves you reconciling two responses with no guarantee the cited run and the schema run extracted the same facts.

Solution

The three modes are plain text → char_location (sentence-chunked; start/end character indices), PDF → page_location (start/end page numbers; scanned images without extractable text are not citable), and custom content → content_block_location (you supply the chunks; start/end block indices). cited_text is attractive on output cost because the field “is provided for convenience and does not count towards output tokens” — you get the quoted source span echoed back for verification without paying output tokens for it.

Solution

A question about an event after the model’s reliable knowledge cutoff should be answered from a supplied dated source because past that cutoff the model has no dependable knowledge — it may produce a plausible but fabricated answer. Crucially, the reliable cutoff is earlier than the training-data cutoff, so even data the model technically trained on near the boundary is unreliable; the earlier date is the one that bounds trust. Supplying the fact as a dated source at request time (retrieval plus a citation) makes the answer both correct and auditable. That connects to provenance broadly: provenance answers two questions — where a claim came from (a source span, via Citations or the triple) and when it is valid (a dated source past the cutoff). A time-sensitive claim needs both: an external dated source, bound to the answer by a verifiable citation.

Exam essentials

Provenance is verifiability — the Citations API (enable per document with citations: {"enabled": true}) ties each claim to a source span so a citation cannot be fabricated (span-bound, not grammar-constrained); cited_text does not count toward output tokens. Citations must be enabled on all or none of a request’s documents.
Three location modes — plain text → char_location, PDF → page_location, custom content → content_block_location; image citations are not yet supported.
Mutual exclusion — Citations + Structured Outputs return 400; when you need both a schema and provenance, use the provenance triple (document_id + span_quote + confidence) and verify the span caller-side.
Temporal provenance — each model has a reliable knowledge cutoff (Opus 4.8 Jan 2026, Sonnet 4.6 Aug 2025, Haiku 4.5 Feb 2025), which is earlier than (or equal to) the training-data cutoff — the model trains on later data but is only reliable to the earlier date (Sonnet 4.6 trained to Jan 2026, reliable to Aug 2025). Past the reliable cutoff, answer time-sensitive questions from a dated source, not the model’s memory.

Part 5 · D5 Review

6 exercises across 6 chapters — interleaved review.

d5-01-long-conversation-context

d5-01-ex-durable-instruction

Early in a long session you tell Claude "always cite a file:line for any behavioral claim." Forty turns later it stops doing so, right around when a compaction summary appeared. What is the most reliable fix?

- **A.** Re-paste the instruction manually every few turns to keep it fresh in context.
- **B.** Move the instruction into CLAUDE.md, which is re-injected on every request and so survives compaction.
- **C.** Switch to a model with a larger context window so compaction never triggers.
- **D.** Raise `max_tokens` so the model has more room to comply.

d5-02-escalation-ambiguity

d5-02-ex-ambiguous-db

An agent is scaffolding a new service and reaches a step that needs a database, but the request never said which one. What is the most reliable design?

- **A.** Pick the most common default (say, PostgreSQL) and proceed, noting the choice in a comment.
- **B.** Call `AskUserQuestion` with a short set of bounded options (Postgres / MySQL / SQLite) and continue once the user chooses.
- **C.** Infer the database from whatever is already installed on the build machine.
- **D.** Fail the task with an error explaining that the requirement was underspecified.

d5-03-error-propagation

d5-03-ex-silent-propagation

In a planner → coder → reviewer pipeline, the planner occasionally emits an ambiguous spec, the coder silently picks a wrong interpretation, and the reviewer, reading only the code, rates it fine. Bad output ships. What is the most effective fix?

- **A.** Upgrade the coder to a more capable model so it interprets ambiguous specs correctly.
- **B.** Make the planner emit a typed, schema-validated spec and add an independent validation step that checks the coder's output against that spec before the reviewer runs.
- **C.** Lengthen the reviewer's prompt to tell it to look harder for problems.
- **D.** Retry the whole pipeline a second time and compare the two outputs.

d5-04-large-codebase-context

d5-04-ex-explore-large

An agent must understand how authentication flows through a 5,000-file codebase before making a change, and the relevant code is spread across dozens of files. What keeps the main session's context usable?

- **A.** Read every file that might be relevant into the main session so nothing is missed.
- **B.** Dispatch subagents to explore subsystems and report back summaries, keeping the main context scoped to the change itself.
- **C.** Raise `max_tokens` so the main session can hold more of the codebase.
- **D.** Run `/compact` repeatedly while reading files so the window never fills.

d5-05-human-review-confidence

d5-05-ex-route-to-human

A clinical-data extraction pipeline auto-accepts every result. Most are fine, but occasionally a high-stakes field is extracted wrong with no flag, and it reaches a patient record. You want to catch these without manually reviewing all output. What is the best design?

- **A.** Retry every extraction with the model a second time and accept it if the two runs agree.
- **B.** Have the model emit a `confidence` field and cross-check signals (e.g. `conflict_detected`), route low-confidence and flagged records through an isolated judge, and send what the judge cannot clear to a human.
- **C.** Trust the model's self-reported confidence and auto-accept anything it marks "high."
- **D.** Send every extracted record to a human reviewer to be safe.

d5-06-information-provenance

d5-06-ex-provenance-with-schema

You are building an extraction pipeline that must return structured JSON *and* attribute each extracted fact to a source span. You try the Citations API with `output_config.format` and get a 400. What is the right design?

- **A.** Drop the JSON schema and use the Citations API alone, parsing the prose response downstream.
- **B.** Keep the schema and add a provenance triple per claim (`document_id`, `span_quote`, `confidence`), then have the caller verify each `span_quote` appears in its source.
- **C.** Trust the model's cited text without verification, since it was instructed to quote the source.
- **D.** Send the request twice, once with citations and once with the schema, and merge the two responses.