Assessment test

Before you read anything, find out where you stand. This test samples questions across all five CCA-F domains and scores them in your browser; the readout points you at the domains — and the chapters — where your answers were shaky. Strong domains you can skim on a first pass; weak ones deserve a close read.

Without JavaScript the questions still render below as a manual self-check, with each answer behind a reveal. For the complete bank grouped by domain, see the practice question bank.

Assessment test

Take a cross-domain assessment: a sampled form spread over every exam domain, scored with a weak-domain readout routing you to the chapters to (re)read.

d1-01-loop-end-turn mcq remember ◆◇◇◇

The agent loop repeats the tool-use round-trip until the model replies without a tool call. Which stop_reason marks that final reply?

Answer & rationale →

Options
Show answer

Correct: `end_turn`

d1-01-parallel-results-batch mcq apply ◆◆◇◇

A handler runs two independent read-only queries concurrently in response to one assistant turn that contained two tool_use blocks. The slower query finishes a few seconds after the faster one. To keep the next request well-formed, how should the handler send the results back?

Answer & rationale →

Options
Show answer

Correct: Collect both results and return them in the next user message, each keyed by its tool_use_id

d1-01-tool-error-return mcq apply ◆◆◇◇

Inside your tool handler the call throws an exception. To let the model see the failure and recover, what do you send back on the next turn?

Answer & rationale →

Options
Show answer

Correct: A `tool_result` block with `is_error: true`

d1-02-artifacts-pattern-return mcq apply ◆◆◇◇

A subagent gathers 4,000 tokens of raw source material, but the coordinator only needs its short conclusion to keep going. Per the chapter, how should the subagent return its work so the coordinator’s window stays lean?

Answer & rationale →

Options
Show answer

Correct: Write the full output to the filesystem (or external storage) and return a lightweight reference to it

d1-02-decompose-context-not-role mcq analyze ◆◆◆◇

A team must research three unrelated vendors and produce one comparison. An engineer proposes four subagents — researcher → analyst → writer → fact-checker — passing the task down the line. Applying the chapter’s decomposition rule, what is the better way to cut this work?

Answer & rationale →

Options
Show answer

Correct: Spawn one subagent per vendor (independent research paths), then have the coordinator synthesize the comparison in a single window

d1-02-verifier-early-victory mcq apply ◆◆◇◇

You add a verification subagent to blackbox-test your agent’s output, but it keeps declaring success after a single check. What is the chapter’s documented mitigation for this failure mode?

Answer & rationale →

Options
Show answer

Correct: Instruct it explicitly — e.g. "You MUST run the complete test suite before marking as passed"

d1-03-depth-one-limit mcq apply ◆◆◆◇

A subagent that analyzes a module wants to hand a slice of its work to a further helper agent, giving you a coordinator-to-subagent-to-sub-subagent chain. How should the architecture deliver that third layer?

Answer & rationale →

Options
Show answer

Correct: Orchestrate the extra layer from the parent; subagents cannot spawn subagents

d1-03-fresh-context-channel mcq apply ◆◆◇◇

You delegate to a subagent with the prompt “fix the bug we discussed,” and it flounders with no idea what bug you mean. To make the next delegation succeed, where must the specifics go?

Answer & rationale →

Options
Show answer

Correct: Write the file path and error text directly into the Agent-tool prompt string

d1-03-never-delegates-gate mcq analyze ◆◆◇◇

You define a focused doc-reviewer subagent with a specific, keyword-rich description, yet it never triggers — the main agent just reviews the file inline. The parent’s allowedTools is ["Read", "Edit", "Bash"]. Which fault best explains why nothing is delegated?

Answer & rationale →

Options
Show answer

Correct: "Agent" is missing from the parent's allowedTools, so the call is never approved

d1-04-enforcement-locus mcq apply ◆◆◇◇

A compliance pipeline runs the same four steps in the same order every time, must produce an audit trail, and needs a check between steps that a malformed output cannot slip past. Where should this workflow’s control flow live?

Answer & rationale →

Options
Show answer

Correct: Enforce it programmatically, because a fixed repeatable sequence that needs an audit trail and gates is what your code is for

d1-04-fresh-context-reviewer mcq apply ◆◆◇◇

Your quality workflow has Session A write a rate limiter and Session B review it, and a teammate proposes saving effort by handing Session B all of Session A’s prior context. In the Writer/Reviewer pattern, why must the reviewer not inherit the writer’s context?

Answer & rationale →

Options
Show answer

Correct: A fresh context cannot rationalize choices it never made, so it judges the work independently

d1-04-schema-only-gate mcq apply ◆◆◇◇

A pipeline validates every inter-step output against a JSON schema, yet it still ships data with a total that does not add up. The output parses cleanly and has every required field. What does the gate need to catch this?

Answer & rationale →

Options
Show answer

Correct: Add a semantic check, because a structurally valid output can still carry fabricated or contradictory content

d1-05-defer-pauses-query mcq apply ◆◆◇◇

A teammate writes a PreToolUse hook that returns defer carrying an updatedInput, expecting the rewritten command to then run. Will the command run, and will the rewrite take effect?

Answer & rationale →

Options
Show answer

Correct: Neither: defer ends the query for the host to resume later, and updatedInput is ignored with defer

d1-05-posttooluse-normalize mcq apply ◆◆◇◇

Every Bash result reaches the model carrying ANSI color codes, and you need them stripped before Claude reads the output — the call itself must still run normally. Which hook event and return field do that work?

Answer & rationale →

Options
Show answer

Correct: A PostToolUse hook returning updatedToolOutput, which replaces the result before the model reads it

d1-05-precedence-deny-wins mcq apply ◆◆◇◇

Three PreToolUse hooks fire on one tool call and return, respectively, allow, ask, and deny. What happens to the call, and what rule decides it?

Answer & rationale →

Options
Show answer

Correct: The call is blocked: matching hooks run in parallel and the most restrictive result wins

d1-06-adaptive-cost-tradeoff mcq apply ◆◆◇◇

A lead picks an adaptive multi-agent decomposition over a single-agent approach for an open-ended research task and is asked to justify the spend to a skeptical stakeholder. Compared to a single agent on an equivalent task, what is the rough token cost, and what does that spend actually buy?

Answer & rationale →

Options
Show answer

Correct: Roughly 3-10x more tokens, and what it buys is thoroughness, not speed

d1-06-over-decomposition-guard mcq apply ◆◆◇◇

An adaptive system keeps spawning dozens of subagents to answer questions that turn out to be simple — the “50 subagents for a simple query” failure — multiplying token cost for nothing. The task genuinely is open-ended, so abandoning adaptivity isn’t an option. What is the chapter’s mitigation for this failure mode?

Answer & rationale →

Options
Show answer

Correct: Tell the orchestrator to scale effort to complexity — 1 agent for simple fact-finding, 2-4 for a comparison, 10+ for complex research

d1-06-path-dependent-structure mcq apply ◆◆◇◇

You are scoping “find out whether any competitor shipped feature X — go as deep as the question needs,” where each thing you learn changes what you look at next and the depth is unknowable up front. Which decomposition structure does this task require, and on what property of the task does that choice turn?

Answer & rationale →

Options
Show answer

Correct: Adaptive decomposition, because the work is path-dependent — step N+1 depends on what step N discovered, so no design-time sequence can capture it

d1-07-crosshost-artifacts mcq apply ◆◆◇◇

A CI agent stops on error_max_turns partway through a refactor, and its container is torn down. You must finish the work tomorrow on a fresh worker that won’t have today’s transcript file. What is the more robust way to carry the work across hosts?

Answer & rationale →

Options
Show answer

Correct: Capture the artifacts that matter — decisions, the diff so far, the remaining plan — as application state and seed a fresh session's prompt

d1-07-fork-shared-disk mcq apply ◆◆◇◇

You fork your session to try a destructive refactor, expecting the original session’s files to be safe. The forked agent deletes a source file. Is that deletion isolated from the original session?

Answer & rationale →

Options
Show answer

Correct: No — forking branches the conversation, not the filesystem, so both forks share one disk and the deletion is real

d1-07-resume-fresh-cwd mcq apply ◆◆◇◇

A script captured a session_id from the first run’s ResultMessage and later calls resume with that exact ID, but the resumed session comes back empty and fresh instead of the expected history. What is the single most likely cause?

Answer & rationale →

Options
Show answer

Correct: You launched the resume from a different working directory, so the encoded-cwd lookup path no longer matches

d2-01-consolidate-selection-ambiguity mcq apply ◆◆◇◇

Your toolset exposes create_pr, review_pr, and merge_pr as three separate tools, and the agent frequently invokes the wrong one. Which redesign does the chapter document for exactly this set, and on what stated rationale?

Answer & rationale →

Options
Show answer

Correct: Fold them into one tool with an action parameter, because fewer, more capable tools reduce selection ambiguity

d2-01-description-highest-leverage mcq apply ◆◆◇◇

A tool you inherited carries the entire description “Gets data for a record,” and the agent keeps reaching for it at the wrong moments. Of the changes available, which one targets the single highest-leverage surface the chapter identifies?

Answer & rationale →

Options
Show answer

Correct: Rewrite the description to say what it does, when (and when not) to use it, and what each parameter means

d2-01-input-examples-400 mcq apply ◆◆◇◇

You add three input_examples to a working client tool, and the API now rejects the whole request with a 400 — your description and your code are untouched and fine. Which explanation matches the one hard rule the chapter states for input_examples?

Answer & rationale →

Options
Show answer

Correct: One example sets unit to a value the enum does not allow, so it fails to validate against input_schema

d2-02-error-flag-casing mcq understand ◆◆◇◇

A tool fails. In the Claude Messages API (direct, not MCP), which field on the tool_result block flags the failure — and in which casing?

Answer & rationale →

Options
Show answer

Correct: `is_error` (snake_case)

d2-02-validation-channel-routing mcq apply ◆◆◇◇

Your booking tool, exposed over MCP, receives a departure date that lies in the past — a business-logic failure the model could fix by asking for a new date. The MCP spec gives you two channels for reporting failures. Which channel should this failure travel down, and why?

Answer & rationale →

Options
Show answer

Correct: Return it as an execution error inside a successful result, so the model can read the message and self-correct

d2-03-mcp-scope-vs-bypass mcq apply ◆◆◇◇

An agent should be able to call every tool on one MCP server but gain no other broad privileges. Which configuration matches the chapter’s documented guidance for scoping that MCP access?

Answer & rationale →

Options
Show answer

Correct: Scope with an `allowedTools` wildcard like `mcp__github__*`, which grants exactly that server and nothing more

d2-03-thinking-forces-auto mcq apply ◆◆◇◇

A teammate wants the model to reason before acting, so they enable extended thinking, and they also want a specific tool always called, so they set tool_choice to a forced {"type": "tool", "name": …}. What happens when this request runs?

Answer & rationale →

Options
Show answer

Correct: The forced mode is incompatible with extended thinking, so the request errors before any tool fires

d2-04-local-scope-not-local-settings mcq apply ◆◆◇◇

A teammate added an MCP server at local scope, but it never loads. They have been editing .claude/settings.local.json to fix it, with no effect. Which file should you open instead to find and repair the server’s definition?

Answer & rationale →

Options
Show answer

Correct: Open the home-directory "~/.claude.json", under that project's key, where local-scoped servers actually live

d2-04-mcp-project-scope mcq understand ◆◆◇◇

You want an MCP server to be shared with everyone who checks out a repository — scoped to that project, not to your user account. Where do you declare it?

Answer & rationale →

Options
Show answer

Correct: `.mcp.json` at the repository root

d2-04-scope-precedence-wins mcq apply ◆◆◇◇

A repo ships a notion server in its committed .mcp.json (Project scope). A teammate also defines a server named notion at Local scope in their own ~/.claude.json. On the teammate’s machine, which definition actually connects?

Answer & rationale →

Options
Show answer

Correct: The Local-scope definition in their `~/.claude.json`, because precedence runs Local before Project

d2-04-verify-system-init-status mcq apply ◆◆◇◇

Your agent “doesn’t seem to have” its MCP tools, yet the config looks correct and no error is thrown. Before letting the agent act, what should the program do to turn this silent gap into an explicit failure?

Answer & rationale →

Options
Show answer

Correct: Read the "system:init" message and refuse to run if any server's status is not "connected"

d2-05-deny-beats-bypass mcq apply ◆◆◇◇

A headless agent runs under permission_mode="bypassPermissions" to suppress prompts, with one guardrail: disallowed_tools=["Bash(rm -rf *)"]. The model requests Bash(rm -rf /data). Will the deletion be blocked?

Answer & rationale →

Options
Show answer

Correct: Yes — deny rules are checked before the permission mode, so the rule binds even under bypassPermissions

d2-05-parallel-read-only mcq apply ◆◆◇◇

In one turn an agent issues two Grep calls, one Glob, and one Write. The chapter says the deciding property is whether a tool reads state or changes it. Which of these calls may the SDK run concurrently?

Answer & rationale →

Options
Show answer

Correct: The two Grep calls and the Glob, because read-only tools may run concurrently while Write modifies state

d3-01-claude-md-concatenate mcq understand ◆◆◇◇

Two CLAUDE.md files in the load chain give contradictory instructions. Which one takes precedence?

Answer & rationale →

Options
Show answer

Correct: Neither — they concatenate; both stay in context

d3-01-import-decline-permanent mcq apply ◆◆◇◇

Your CLAUDE.md pulls in shared standards with @./standards/api.md. The first time the import runs, Claude Code shows an approval dialog and the developer clicks decline. According to the chapter, what is the lasting consequence for that environment?

Answer & rationale →

Options
Show answer

Correct: Imports are disabled permanently — the approval dialog does not reappear, so the file never expands until the choice is reset

d3-01-settings-precedence-cli-wins mcq apply ◆◆◇◇

A developer sets “model”: “sonnet” in their user settings and “model”: “opus” in the project’s committed settings, then launches the session with —model haiku. Given how Claude Code resolves the settings hierarchy, which model actually runs, and why?

Answer & rationale →

Options
Show answer

Correct: haiku — the CLI argument sits above both file scopes on the precedence ladder

d3-02-commands-merged-into-skills mcq apply ◆◆◇◇

A colleague insists the only way to register a /deploy trigger is a flat file at .claude/commands/deploy.md. Per this chapter, which placement(s) will actually create a working /deploy?

Answer & rationale →

Options
Show answer

Correct: Either form works — a file at .claude/commands/deploy.md and a skill at .claude/skills/deploy/SKILL.md both create /deploy

d3-02-disable-model-invocation-description mcq apply ◆◆◇◇

You mark a risky /force-release skill disable-model-invocation: true, then ask Claude to “just release it” — and it acts as if the skill does not exist. According to this chapter, what did that flag do to the skill’s description?

Answer & rationale →

Options
Show answer

Correct: Its description is kept out of startup context entirely, so Claude is never told the skill exists and cannot auto-invoke it

d3-03-scope-load-order mcq apply ◆◆◇◇

Your machine has ~/.claude/rules/style.md (“prefer tabs”), and a repo you are working in ships .claude/rules/style.md (“prefer spaces”). Neither rule uses a paths glob, so both are in context. When the two instructions tension, which one does Claude favor and why?

Answer & rationale →

Options
Show answer

Correct: The project rule wins, because user-level rules load first and the project rule is read last at higher priority

d3-03-scoped-rule-silent mcq apply ◆◆◇◇

A rule at .claude/rules/backend/api.md carries paths: ["src/api/**/*.ts"]. During a session, Claude edits files under src/frontend/ and never reads anything under src/api/. Regarding that API rule, what is the effect of this session on its presence in context?

Answer & rationale →

Options
Show answer

Correct: Nothing happens — the rule stays out of context, because it activates only when Claude reads a file matching its glob

d3-04-approve-flips-write-mode mcq apply ◆◆◆◇

In plan mode, Claude presents a plan and you select the “accept edits” approve option. Considering only what the chapter says happens at approval, what permission mode is the session now in, and what does that mean for editing?

Answer & rationale →

Options
Show answer

Correct: It has exited plan mode into acceptEdits, so Claude can now apply edits; re-enter plan mode to research again

d3-04-opusplan-200k mcq apply ◆◆◆◇

You run claude --model opusplan for a refactor whose planning step must hold roughly 400K tokens of context at once. Before picking the alias, decide whether opusplan’s plan phase can provide that much context.

Answer & rationale →

Options
Show answer

Correct: No — opusplan's plan phase runs at the standard 200K; reach for opus[1m] for that phase instead

d3-04-prompts-still-apply mcq apply ◆◆◇◇

A teammate switches into plan mode expecting a silent run with no interruptions, then is surprised when Claude still pauses to ask before a shell command. What does plan mode actually do to the permission prompts?

Answer & rationale →

Options
Show answer

Correct: Tool-permission prompts still appear the same as in default mode; only source edits are withheld

d3-05-clear-after-third-correction mcq apply ◆◆◇◇

Three times this session you have corrected Claude on the same issue, and each fix drifts back toward the wrong approach. According to the chapter’s threshold for course-correction, what should you do next?

Answer & rationale →

Options
Show answer

Correct: Run /clear and restart with a more specific prompt that incorporates what the failed rounds taught you

d3-05-fresh-session-after-interview mcq apply ◆◆◇◇

You ran the interview pattern for a rate limiter: Claude questioned you with the AskUserQuestion tool and wrote a complete SPEC.md, which you reviewed and corrected. To begin implementation, which move does the chapter prescribe?

Answer & rationale →

Options
Show answer

Correct: Open a fresh session and prompt it to implement from the spec, so the work starts on clean context

d3-06-exit-code-masked mcq apply ◆◆◇◇

A CI step runs claude -p "review the diff" --max-turns 8 --output-format json > out.json || true. The agent thrashed past eight turns, out.json is truncated, yet the step shows green and the next step consumes the partial output. What went wrong?

Answer & rationale →

Options
Show answer

Correct: The run hit `--max-turns` and exited non-zero, but the trailing `|| true` swallowed the status, so the shell reported success and the step stayed green

d3-06-headless-flag mcq remember ◆◇◇◇

In a CI pipeline you need Claude Code to run a single prompt non-interactively, print the result to stdout, and exit. Which invocation does this?

Answer & rationale →

Options
Show answer

Correct: `claude -p '…'`

d3-06-json-schema-print-mode mcq apply ◆◆◇◇

An engineer wants a downstream step to receive a schema-conforming payload, so inside a normal interactive terminal session they reach for --json-schema with --output-format json — and it doesn’t behave as documented. Which statement correctly diagnoses the constraint and says where a valid result would arrive?

Answer & rationale →

Options
Show answer

Correct: `--json-schema` is print-mode only, so it works under `-p` but not in an interactive session; pairing it with `--output-format json` lands the conforming result in `structured_output`

d4-01-positive-over-negative mcq apply ◆◆◇◇

A teammate’s prompt keeps producing bulleted answers when the team wants paragraphs, so they ask you whether to add “do not use markdown lists” or “respond in smoothly flowing prose.” Which instruction steers more reliably toward paragraphs, and on what reasoning?

Answer & rationale →

Options
Show answer

Correct: "Respond in smoothly flowing prose" — it names the target shape instead of a forbidden region

d4-01-stop-at-the-rung mcq apply ◆◆◇◇

A summarization job returns a JSON object whose fields are all handled well by plain instruction except sentiment, which must be one of three labels and will crash the downstream pipeline if a fourth value ever appears. Following the chapter’s escalation discipline, how should you harden the prompt?

Answer & rationale →

Options
Show answer

Correct: Move only sentiment to an enum-constrained tool or structured output, leaving the other fields at explicit instruction

d4-02-edge-case-in-the-set mcq apply ◆◆◇◇

Your few-shot extraction returns the empty string for order_id when an input omits the order number, but the downstream consumer needs null. Following the chapter’s “target the ambiguous case” guidance, what is the most reliable fix?

Answer & rationale →

Options
Show answer

Correct: Add an example whose input has no order number and whose output shows the field as null, placed in the middle of the set

d4-02-single-example-quirk mcq apply ◆◆◇◇

A teammate steers an extraction task with a single carefully chosen example, and the model starts wrapping the first field in quotes on every output even when the input has no quotes. Which change does the chapter prescribe to stop the model copying that incidental trait?

Answer & rationale →

Options
Show answer

Correct: Move into the 3-5 range with examples that vary the incidental traits, so the wrapping is no longer common across the set

d4-03-classic-output-slot mcq understand ◆◆◇◇

In the classic structured-output pattern you force Claude to call a print_summary tool. Where is your extracted JSON, and what do you do with the tool’s “result”?

Answer & rationale →

Options
Show answer

Correct: In the tool call's `input` — you discard the tool's result

d4-03-strict-dropped-openai mcq analyze ◆◆◆◇

You set strict: true on a tool to guarantee schema-valid inputs. Through which integration path is that guarantee silently lost?

Answer & rationale →

Options
Show answer

Correct: The OpenAI SDK compatibility layer

d4-03-truncation-not-retry mcq apply ◆◆◇◇

Your strict extraction normally works, but on unusually long inputs the returned JSON is occasionally cut off mid-object and your parser throws — and stop_reason comes back as max_tokens. What is the correct fix?

Answer & rationale →

Options
Show answer

Correct: Raise `max_tokens` (or shrink the schema) — generation ran out of room to close the object

d4-04-stated-vs-calculated-total mcq apply ◆◆◇◇

An invoice extractor returns schema-valid JSON, but once in a while the total it reports does not match the sum of the line items, and those bad totals slip through to billing. Following the chapter’s semantic-hook approach, what schema-level design makes such a total mechanically checkable?

Answer & rationale →

Options
Show answer

Correct: Have the model emit both a stated_total and a calculated_total, then let application code compare them and route mismatches to review

d4-04-subtype-before-payload mcq apply ◆◆◇◇

A structured-output run sometimes succeeds and sometimes exhausts its retry budget, but downstream code occasionally processes garbage on the failure path instead of falling back. Per the chapter, how should the caller consume an Agent SDK structured-output result so the failure path is handled correctly?

Answer & rationale →

Options
Show answer

Correct: Branch on the result's subtype first, reading message.structured_output only on success and falling back otherwise

d4-04-truncation-not-retries mcq apply ◆◆◇◇

Your validate-retry loop keeps reaching error_max_structured_output_retries on your longest documents, which come back as cut-off JSON, while shorter documents extract cleanly. Per the chapter, what is the actual failure and the fix?

Answer & rationale →

Options
Show answer

Correct: The responses hit the max_tokens cap, so detect that stop_reason and raise the cap or shrink the schema before retrying

d4-05-streaming-dashboard mcq apply ◆◆◇◇

A teammate proposes a streaming batch so a dashboard can update ticket-by-ticket as each of 80,000 classifications arrives. Setting aside whether streaming is the right tool, which statement about this proposal does the chapter support?

Answer & rationale →

Options
Show answer

Correct: It cannot be built — streaming is not supported for batch requests

d4-05-succeeded-not-usable mcq apply ◆◆◇◇

Your overnight batch finishes and you iterate the results, keeping every entry whose result type is succeeded and treating it as a finished classification. Some downstream records turn out to hold refused or truncated text. Per the chapter, what must the per-result handler do that this design omits?

Answer & rationale →

Options
Show answer

Correct: Inspect each succeeded message's own stop_reason, because a refusal or truncation arrives as succeeded

d4-06-fleet-false-positives mcq apply ◆◆◇◇

A team runs several parallel specialist reviewers with no filtering step and finds the output noisy — each agent independently flags plausible-but-wrong issues that accumulate into untrustworthy review comments. Which change directly addresses this failure mode while keeping the fan-out?

Answer & rationale →

Options
Show answer

Correct: Add a verification pass that re-checks each candidate finding against actual code behavior before any are posted

d4-06-gate-on-neutral-check mcq apply ◆◆◇◇

A reviewer asks why a merge still went through even though Code Review posted an Important finding, and wants real bugs to block the merge queue going forward. How should the team make Code Review’s advisory output actually gate a merge?

Answer & rationale →

Options
Show answer

Correct: Read the severity breakdown from the check-run output in your own CI and fail the step when the Important count is positive

d5-01-accumulation-not-degradation mcq apply ◆◆◇◇

Turns in a session keep growing until the window is nearly full and the loop is about to compact, but nothing the model has produced is wrong or misremembered yet. Which response matches the failure mode actually in play?

Answer & rationale →

Options
Show answer

Correct: Reduce what accumulates — tighter tool outputs, or /clear and restart with a focused prompt

d5-01-buried-fact-no-compaction mcq apply ◆◆◇◇

A session is comfortably under the token limit and has never compacted, yet Claude misremembers a detail established about sixty turns ago in a long stretch of context. Which fix most directly addresses the failure mode at work here?

Answer & rationale →

Options
Show answer

Correct: Restate the fact near the end of the context so it sits where the model attends most reliably

d5-02-subagent-cannot-escalate mcq apply ◆◆◇◇

A coordinator dispatches a subagent to implement a feature, but the subagent discovers the spec is ambiguous and has no way to ask the user. How should the work be restructured so the ambiguity is handled correctly?

Answer & rationale →

Options
Show answer

Correct: Have the coordinator resolve the open questions first, then hand the subagent a fully-specified task

d5-02-suggest-alternative-pattern mcq apply ◆◆◇◇

Your canUseTool callback intercepts a Bash(curl … | sh) call. You want to block this specific command but also tell Claude to download, verify the checksum, and then run, so it can correct course on its own. Which response pattern does this?

Answer & rationale →

Options
Show answer

Correct: Suggest-alternative — deny with guidance in the message, which Claude reads and uses to adjust its next step

d5-03-green-evals-not-healthy mcq apply ◆◆◇◇

A multi-stage system shows broad, inconsistent quality degradation in production, yet every per-component evaluation stays green and neither internal usage nor the existing eval suite reproduces the problem. Following the April-23 postmortem’s conclusion, what kind of testing does the chapter say is actually needed?

Answer & rationale →

Options
Show answer

Correct: Run integration-level testing — per-model evals on every prompt change, ablation testing, and soak periods that exercise the system as it actually runs

d5-03-silent-ambiguity-resolution mcq apply ◆◆◇◇

In a planner-to-coder-to-reviewer pipeline, the planner emits a spec with one ambiguous requirement. The coder produces clean, working code, the reviewer rates it fine, and the wrong output ships with no warning anywhere. According to the chapter, what does the coder do at the moment it meets the ambiguity that makes the error go silent?

Answer & rationale →

Options
Show answer

Correct: The coder has no one to ask, so it resolves the ambiguity itself and hands the guess downstream as settled fact

d5-04-compact-vs-clear mcq apply ◆◆◇◇

A long task’s history is cluttered but still relevant; your next task is unrelated. Which command fits which?

Answer & rationale →

Options
Show answer

Correct: `/compact` to continue this task; `/clear` to switch to the new one

d5-04-scratchpad-survives mcq analyze ◆◆◆◇

You write a PLAN.md scratchpad, then run /compact, then later /clear. Is the plan still available to the agent?

Answer & rationale →

Options
Show answer

Correct: Yes — a disk scratchpad survives both

d5-05-funnel-judge-tier mcq apply ◆◆◇◇

Cheap automated checks pass an extraction — valid shape, no flagged conflict — yet it is wrong-but-plausible, and shipping it as-is is costly. The chapter’s funnel puts one tier between the auto-checks and the human to catch exactly this case. Which tier is it, and why does it catch what the auto-checks miss?

Answer & rationale →

Options
Show answer

Correct: An isolated judge in a fresh context, because it has no authorship bias toward the output it is checking

d5-05-measure-before-routing mcq apply ◆◆◇◇

An extraction pipeline emits a confidence field of high, medium, or low, and you want to use it to decide which records auto-accept and which route onward for a high-stakes clinical field. Before you trust the label to route, what does the chapter say you must do?

Answer & rationale →

Options
Show answer

Correct: Measure, on labeled past extractions, how often each stated level is actually correct, then route by that observed accuracy

d5-06-citations-schema-conflict mcq apply ◆◆◇◇

An extraction service enables the Citations API on a document and also sets a structured-output format on the same request, expecting JSON with per-claim citations. The API returns a 400 instead. According to the chapter, what is the root cause of this rejection?

Answer & rationale →

Options
Show answer

Correct: The two cannot coexist, because cited text must interleave with the response prose that a strict JSON schema forbids

d5-06-reliable-cutoff-bounds-trust mcq apply ◆◆◇◇

A teammate notes that Sonnet 4.6 trained on data through January 2026 but is reliable only to August 2025, and asks which date should decide whether a time-sensitive fact may be answered from the model’s own memory. Per the chapter, which date governs that decision, and why?

Answer & rationale →

Options
Show answer

Correct: August 2025, because the reliable knowledge cutoff is earlier than the training cutoff and is the date that bounds trust