Assessment test

Before you read anything, find out where you stand. This test samples questions across all five CCA-F domains and scores them in your browser; the readout points you at the domains — and the chapters — where your answers were shaky. Strong domains you can skim on a first pass; weak ones deserve a close read.

Without JavaScript the questions still render below as a manual self-check, with each answer behind a reveal. For the complete bank grouped by domain, see the practice question bank.

Assessment test

Take a cross-domain assessment: a sampled form spread over every exam domain, scored with a weak-domain readout routing you to the chapters to (re)read.

Questions: of 75

d1-01-loop-end-turn mcq remember ◆◇◇◇

The agent loop repeats the tool-use round-trip until the model replies without a tool call. Which stop_reason marks that final reply?

Answer & rationale →

Options a. `end_turn` b. `tool_use` c. `max_tokens` d. `stop_sequence`

Show answer

Correct: `end_turn`

d1-01-parallel-results-batch mcq apply ◆◆◇◇

A handler runs two independent read-only queries concurrently in response to one assistant turn that contained two tool_use blocks. The slower query finishes a few seconds after the faster one. To keep the next request well-formed, how should the handler send the results back?

Answer & rationale →

Options a. Collect both results and return them in the next user message, each keyed by its tool_use_id b. Return the first result now and send the second in a later user message once it resolves c. Merge both results into a single tool_result block so the next request stays well-formed d. Return whichever result finished first and let the model re-request the other on the next turn

Show answer

Correct: Collect both results and return them in the next user message, each keyed by its tool_use_id

d1-01-tool-error-return mcq apply ◆◆◇◇

Inside your tool handler the call throws an exception. To let the model see the failure and recover, what do you send back on the next turn?

Answer & rationale →

Options a. A `tool_result` block with `is_error: true` b. An HTTP 500 status to the caller c. Nothing — omit the `tool_result` so the model retries d. A fresh `tool_use` block describing the failure

Show answer

Correct: A `tool_result` block with `is_error: true`

d1-02-artifacts-pattern-return mcq apply ◆◆◇◇

A subagent gathers 4,000 tokens of raw source material, but the coordinator only needs its short conclusion to keep going. Per the chapter, how should the subagent return its work so the coordinator’s window stays lean?

Answer & rationale →

Options a. Write the full output to the filesystem (or external storage) and return a lightweight reference to it b. Stream all 4,000 tokens back to the coordinator so nothing is lost c. Append the raw material to the coordinator's system prompt for the rest of the run d. Summarize the 4,000 tokens to ~200 and re-inject that summary on every later turn

Show answer

Correct: Write the full output to the filesystem (or external storage) and return a lightweight reference to it

d1-02-decompose-context-not-role mcq analyze ◆◆◆◇

A team must research three unrelated vendors and produce one comparison. An engineer proposes four subagents — researcher → analyst → writer → fact-checker — passing the task down the line. Applying the chapter’s decomposition rule, what is the better way to cut this work?

Answer & rationale →

Options a. Spawn one subagent per vendor (independent research paths), then have the coordinator synthesize the comparison in a single window b. Keep the researcher → analyst → writer → fact-checker pipeline — it cleanly separates concerns c. Run the four-role pipeline inside one subagent per vendor, for maximum specialization d. Don't split at all — the comparison is too coupled for subagents to help

Show answer

Correct: Spawn one subagent per vendor (independent research paths), then have the coordinator synthesize the comparison in a single window

d1-02-verifier-early-victory mcq apply ◆◆◇◇

You add a verification subagent to blackbox-test your agent’s output, but it keeps declaring success after a single check. What is the chapter’s documented mitigation for this failure mode?

Answer & rationale →

Options a. Instruct it explicitly — e.g. "You MUST run the complete test suite before marking as passed" b. Give the verifier the full transcript of how the work was produced, so it can judge in context c. Replace the single verifier with three verifiers and take a majority vote d. Raise the verifier's token budget so it has room to keep checking

Show answer

Correct: Instruct it explicitly — e.g. "You MUST run the complete test suite before marking as passed"

d1-03-depth-one-limit mcq apply ◆◆◆◇

A subagent that analyzes a module wants to hand a slice of its work to a further helper agent, giving you a coordinator-to-subagent-to-sub-subagent chain. How should the architecture deliver that third layer?

Answer & rationale →

Options a. Orchestrate the extra layer from the parent; subagents cannot spawn subagents b. Add "Agent" to the analyzer's tools array so it can invoke its own helper c. Raise the analyzer's maxTurns so it has budget to run the nested delegation d. Give the analyzer a keyword-rich description so the helper is matched automatically

Show answer

Correct: Orchestrate the extra layer from the parent; subagents cannot spawn subagents

d1-03-fresh-context-channel mcq apply ◆◆◇◇

You delegate to a subagent with the prompt “fix the bug we discussed,” and it flounders with no idea what bug you mean. To make the next delegation succeed, where must the specifics go?

Answer & rationale →

Options a. Write the file path and error text directly into the Agent-tool prompt string b. Rely on the parent conversation, which the subagent inherits as shared history c. Set the subagent's model to inherit so it picks up the parent's prior turns d. Add the subagent's name to allowedTools so the discussed context is approved

Show answer

Correct: Write the file path and error text directly into the Agent-tool prompt string

d1-03-never-delegates-gate mcq analyze ◆◆◇◇

You define a focused doc-reviewer subagent with a specific, keyword-rich description, yet it never triggers — the main agent just reviews the file inline. The parent’s allowedTools is ["Read", "Edit", "Bash"]. Which fault best explains why nothing is delegated?

Answer & rationale →

Options a. "Agent" is missing from the parent's allowedTools, so the call is never approved b. The subagent's tools array omits "Agent", so it cannot accept the delegated call c. The subagent's model field is unset, so Claude has no engine to route the task to d. The subagent's prompt is too generic, so Claude cannot match the task to it

Show answer

Correct: "Agent" is missing from the parent's allowedTools, so the call is never approved

d1-04-enforcement-locus mcq apply ◆◆◇◇

A compliance pipeline runs the same four steps in the same order every time, must produce an audit trail, and needs a check between steps that a malformed output cannot slip past. Where should this workflow’s control flow live?

Answer & rationale →

Options a. Enforce it programmatically, because a fixed repeatable sequence that needs an audit trail and gates is what your code is for b. Stay prompt-based, because the model can self-direct any sequence more cheaply than orchestration code c. Enforce it programmatically, because prompt-based workflows cannot be translated to the SDK d. Stay prompt-based, because determinism only matters when the steps are not known in advance

Show answer

Correct: Enforce it programmatically, because a fixed repeatable sequence that needs an audit trail and gates is what your code is for

d1-04-fresh-context-reviewer mcq apply ◆◆◇◇

Your quality workflow has Session A write a rate limiter and Session B review it, and a teammate proposes saving effort by handing Session B all of Session A’s prior context. In the Writer/Reviewer pattern, why must the reviewer not inherit the writer’s context?

Answer & rationale →

Options a. A fresh context cannot rationalize choices it never made, so it judges the work independently b. The writer's context is too large to pass to the reviewer without exceeding the token budget c. Inheriting the context would let the reviewer finish faster and declare success too early d. The reviewer needs the writer's full transcript to judge the code accurately in context

Show answer

Correct: A fresh context cannot rationalize choices it never made, so it judges the work independently

d1-04-schema-only-gate mcq apply ◆◆◇◇

A pipeline validates every inter-step output against a JSON schema, yet it still ships data with a total that does not add up. The output parses cleanly and has every required field. What does the gate need to catch this?

Answer & rationale →

Options a. Add a semantic check, because a structurally valid output can still carry fabricated or contradictory content b. Tighten the schema with more required fields and stricter types until the bad data fails validation c. Pass the output downstream and let the publish step flag any content that looks wrong d. Replace the schema check with a semantic check, since structural validation adds no value

Show answer

Correct: Add a semantic check, because a structurally valid output can still carry fabricated or contradictory content

d1-05-defer-pauses-query mcq apply ◆◆◇◇

A teammate writes a PreToolUse hook that returns defer carrying an updatedInput, expecting the rewritten command to then run. Will the command run, and will the rewrite take effect?

Answer & rationale →

Options a. Neither: defer ends the query for the host to resume later, and updatedInput is ignored with defer b. Both: defer permits the call and the updatedInput rewrites it before it runs c. The rewrite holds but the call waits: defer just queues the rewritten command for the next turn d. The call is blocked outright, since defer is simply another name for the deny decision

Show answer

Correct: Neither: defer ends the query for the host to resume later, and updatedInput is ignored with defer

d1-05-posttooluse-normalize mcq apply ◆◆◇◇

Every Bash result reaches the model carrying ANSI color codes, and you need them stripped before Claude reads the output — the call itself must still run normally. Which hook event and return field do that work?

Answer & rationale →

Options a. A PostToolUse hook returning updatedToolOutput, which replaces the result before the model reads it b. A PreToolUse hook returning updatedInput, which rewrites the call before the tool runs c. A PostToolUse hook returning additionalContext, which strips the codes as it appends to the result d. A PreToolUse hook returning a deny decision so the noisy Bash call never runs

Show answer

Correct: A PostToolUse hook returning updatedToolOutput, which replaces the result before the model reads it

d1-05-precedence-deny-wins mcq apply ◆◆◇◇

Three PreToolUse hooks fire on one tool call and return, respectively, allow, ask, and deny. What happens to the call, and what rule decides it?

Answer & rationale →

Options a. The call is blocked: matching hooks run in parallel and the most restrictive result wins b. The call is allowed, because two of the three hooks did not return deny c. The call runs after a permission prompt, because ask outranks a lone deny d. The outcome depends on which hook finishes first, since order breaks the tie

Show answer

Correct: The call is blocked: matching hooks run in parallel and the most restrictive result wins

d1-06-adaptive-cost-tradeoff mcq apply ◆◆◇◇

A lead picks an adaptive multi-agent decomposition over a single-agent approach for an open-ended research task and is asked to justify the spend to a skeptical stakeholder. Compared to a single agent on an equivalent task, what is the rough token cost, and what does that spend actually buy?

Answer & rationale →

Options a. Roughly 3-10x more tokens, and what it buys is thoroughness, not speed b. Roughly 3-10x more tokens, and what it buys is a faster wall-clock from parallel subagents c. About the same tokens, since the work is identical, but a faster wall-clock from parallelism d. Roughly 3-10x more tokens, with the gain being deterministic, auditable, bounded cost

Show answer

Correct: Roughly 3-10x more tokens, and what it buys is thoroughness, not speed

d1-06-over-decomposition-guard mcq apply ◆◆◇◇

An adaptive system keeps spawning dozens of subagents to answer questions that turn out to be simple — the “50 subagents for a simple query” failure — multiplying token cost for nothing. The task genuinely is open-ended, so abandoning adaptivity isn’t an option. What is the chapter’s mitigation for this failure mode?

Answer & rationale →

Options a. Tell the orchestrator to scale effort to complexity — 1 agent for simple fact-finding, 2-4 for a comparison, 10+ for complex research b. Switch the work to a hardcoded sequential pipeline so the subagent count can never vary c. Cap the orchestrator's total token budget so it runs out before it can spawn too many subagents d. Give every spawned subagent the full transcript so each one can decide whether it is actually needed

Show answer

Correct: Tell the orchestrator to scale effort to complexity — 1 agent for simple fact-finding, 2-4 for a comparison, 10+ for complex research

d1-06-path-dependent-structure mcq apply ◆◆◇◇

You are scoping “find out whether any competitor shipped feature X — go as deep as the question needs,” where each thing you learn changes what you look at next and the depth is unknowable up front. Which decomposition structure does this task require, and on what property of the task does that choice turn?

Answer & rationale →

Options a. Adaptive decomposition, because the work is path-dependent — step N+1 depends on what step N discovered, so no design-time sequence can capture it b. A sequential pipeline, because mapping each finding to the next step in advance is what makes the cost auditable c. A sequential pipeline, because a clean fixed sequence is the structure least likely to be brittle under change d. Adaptive decomposition, because spawning many subagents up front guarantees the broadest possible coverage of the topic

Show answer

Correct: Adaptive decomposition, because the work is path-dependent — step N+1 depends on what step N discovered, so no design-time sequence can capture it

d1-07-crosshost-artifacts mcq apply ◆◆◇◇

A CI agent stops on error_max_turns partway through a refactor, and its container is torn down. You must finish the work tomorrow on a fresh worker that won’t have today’s transcript file. What is the more robust way to carry the work across hosts?

Answer & rationale →

Options a. Capture the artifacts that matter — decisions, the diff so far, the remaining plan — as application state and seed a fresh session's prompt b. Ship the original session's transcript file to the new worker and resume it by ID with a bigger budget c. Fork the original session so the new worker gets an independent copy of the conversation to continue from d. Use file checkpointing to snapshot the working directory and restore it on the new worker before resuming

Show answer

Correct: Capture the artifacts that matter — decisions, the diff so far, the remaining plan — as application state and seed a fresh session's prompt

d1-07-fork-shared-disk mcq apply ◆◆◇◇

You fork your session to try a destructive refactor, expecting the original session’s files to be safe. The forked agent deletes a source file. Is that deletion isolated from the original session?

Answer & rationale →

Options a. No — forking branches the conversation, not the filesystem, so both forks share one disk and the deletion is real b. Yes — a fork starts from a copy of the original's history, and that copy includes a snapshot of the working directory c. Yes — the new session ID gives the fork its own sandboxed filesystem, so edits cannot reach the original d. No — but only because the original session is still open; closing it first would have isolated the files

Show answer

Correct: No — forking branches the conversation, not the filesystem, so both forks share one disk and the deletion is real

d1-07-resume-fresh-cwd mcq apply ◆◆◇◇

A script captured a session_id from the first run’s ResultMessage and later calls resume with that exact ID, but the resumed session comes back empty and fresh instead of the expected history. What is the single most likely cause?

Answer & rationale →

Options a. You launched the resume from a different working directory, so the encoded-cwd lookup path no longer matches b. The original session hit error_max_turns, which clears the transcript and leaves nothing to resume c. You passed continue_conversation=True instead of the specific session ID, so the wrong session was reopened d. The session ID expired between runs, so the SDK had to mint a fresh empty session in its place

Show answer

Correct: You launched the resume from a different working directory, so the encoded-cwd lookup path no longer matches

d2-01-consolidate-selection-ambiguity mcq apply ◆◆◇◇

Your toolset exposes create_pr, review_pr, and merge_pr as three separate tools, and the agent frequently invokes the wrong one. Which redesign does the chapter document for exactly this set, and on what stated rationale?

Answer & rationale →

Options a. Fold them into one tool with an action parameter, because fewer, more capable tools reduce selection ambiguity b. Keep three tools but namespace each by service so their names no longer collide c. Keep three tools and add input_examples to each so the model picks the right one d. Keep three tools but make each response return only high-signal identifiers

Show answer

Correct: Fold them into one tool with an action parameter, because fewer, more capable tools reduce selection ambiguity

d2-01-description-highest-leverage mcq apply ◆◆◇◇

A tool you inherited carries the entire description “Gets data for a record,” and the agent keeps reaching for it at the wrong moments. Of the changes available, which one targets the single highest-leverage surface the chapter identifies?

Answer & rationale →

Options a. Rewrite the description to say what it does, when (and when not) to use it, and what each parameter means b. Rename it from a bare verb to a service-namespaced name like crm_get_record c. Add input_examples so the model can infer the intent from sample calls d. Trim the response to high-signal fields so the model wastes fewer tokens reading it

Show answer

Correct: Rewrite the description to say what it does, when (and when not) to use it, and what each parameter means

d2-01-input-examples-400 mcq apply ◆◆◇◇

You add three input_examples to a working client tool, and the API now rejects the whole request with a 400 — your description and your code are untouched and fine. Which explanation matches the one hard rule the chapter states for input_examples?

Answer & rationale →

Options a. One example sets unit to a value the enum does not allow, so it fails to validate against input_schema b. Three examples exceed the per-tool limit, so the request is rejected for too many examples c. input_examples are not permitted on this client tool and must be removed entirely d. The description disagrees with the examples, so the model cannot reconcile the two surfaces

Show answer

Correct: One example sets unit to a value the enum does not allow, so it fails to validate against input_schema

d2-02-error-flag-casing mcq understand ◆◆◇◇

A tool fails. In the Claude Messages API (direct, not MCP), which field on the tool_result block flags the failure — and in which casing?

Answer & rationale →

Options a. `is_error` (snake_case) b. `isError` (camelCase) c. `error_code` (snake_case) d. a JSON-RPC `-32602` error

Show answer

Correct: `is_error` (snake_case)

d2-02-validation-channel-routing mcq apply ◆◆◇◇

Your booking tool, exposed over MCP, receives a departure date that lies in the past — a business-logic failure the model could fix by asking for a new date. The MCP spec gives you two channels for reporting failures. Which channel should this failure travel down, and why?

Answer & rationale →

Options a. Return it as an execution error inside a successful result, so the model can read the message and self-correct b. Return a JSON-RPC -32602 protocol error, since invalid parameters are a protocol-level problem c. Return a JSON-RPC error so the host can decide whether to surface the failure to the user d. Return a successful empty result and rely on the model to notice the missing booking

Show answer

Correct: Return it as an execution error inside a successful result, so the model can read the message and self-correct

d2-03-mcp-scope-vs-bypass mcq apply ◆◆◇◇

An agent should be able to call every tool on one MCP server but gain no other broad privileges. Which configuration matches the chapter’s documented guidance for scoping that MCP access?

Answer & rationale →

Options a. Scope with an `allowedTools` wildcard like `mcp__github__*`, which grants exactly that server and nothing more b. Set `permissionMode: "bypassPermissions"`, which auto-approves the server's tools and is the documented MCP path c. Set `tool_choice` to `any` so the model is steered toward the MCP server's tools each request d. Add the server to `disallowedTools` so its tools are removed from the request and then re-approved on demand

Show answer

Correct: Scope with an `allowedTools` wildcard like `mcp__github__*`, which grants exactly that server and nothing more

d2-03-thinking-forces-auto mcq apply ◆◆◇◇

A teammate wants the model to reason before acting, so they enable extended thinking, and they also want a specific tool always called, so they set tool_choice to a forced {"type": "tool", "name": …}. What happens when this request runs?

Answer & rationale →

Options a. The forced mode is incompatible with extended thinking, so the request errors before any tool fires b. Forcing one named tool silently disables thinking, so the model answers but never reasons first c. The request runs, but the model emits a reasoning preamble that violates the strict schema d. Parallel tool use kicks in, so the model calls several tools at once instead of the one named tool

Show answer

Correct: The forced mode is incompatible with extended thinking, so the request errors before any tool fires

d2-04-local-scope-not-local-settings mcq apply ◆◆◇◇

A teammate added an MCP server at local scope, but it never loads. They have been editing .claude/settings.local.json to fix it, with no effect. Which file should you open instead to find and repair the server’s definition?

Answer & rationale →

Options a. Open the home-directory "~/.claude.json", under that project's key, where local-scoped servers actually live b. Open ".claude/settings.local.json" and add the missing server block there yourself c. Open the committed ".mcp.json" at the repo root, since every scope resolves through that file d. Open the user settings, because a local-scoped server is shared across all of your projects

Show answer

Correct: Open the home-directory "~/.claude.json", under that project's key, where local-scoped servers actually live

d2-04-mcp-project-scope mcq understand ◆◆◇◇

You want an MCP server to be shared with everyone who checks out a repository — scoped to that project, not to your user account. Where do you declare it?

Answer & rationale →

Options a. `.mcp.json` at the repository root b. `~/.claude.json` c. `.claude/settings.local.json` d. the `mcpServers` field in `package.json`

Show answer

Correct: `.mcp.json` at the repository root

d2-04-scope-precedence-wins mcq apply ◆◆◇◇

A repo ships a notion server in its committed .mcp.json (Project scope). A teammate also defines a server named notion at Local scope in their own ~/.claude.json. On the teammate’s machine, which definition actually connects?

Answer & rationale →

Options a. The Local-scope definition in their `~/.claude.json`, because precedence runs Local before Project b. The committed Project `.mcp.json`, because a shared, version-controlled config always overrides a personal one c. Both connect, and Claude Code merges the two definitions field by field d. Neither connects until the duplicate name is resolved, because matching names is treated as a conflict

Show answer

Correct: The Local-scope definition in their `~/.claude.json`, because precedence runs Local before Project

d2-04-verify-system-init-status mcq apply ◆◆◇◇

Your agent “doesn’t seem to have” its MCP tools, yet the config looks correct and no error is thrown. Before letting the agent act, what should the program do to turn this silent gap into an explicit failure?

Answer & rationale →

Options a. Read the "system:init" message and refuse to run if any server's status is not "connected" b. Catch the exception the server raises on connection failure and retry the query c. Lower the 60-second initialization timeout so a failing server reports sooner d. Set "strictMcpConfig: true" so only your declared servers can connect

Show answer

Correct: Read the "system:init" message and refuse to run if any server's status is not "connected"

d2-05-deny-beats-bypass mcq apply ◆◆◇◇

A headless agent runs under permission_mode="bypassPermissions" to suppress prompts, with one guardrail: disallowed_tools=["Bash(rm -rf *)"]. The model requests Bash(rm -rf /data). Will the deletion be blocked?

Answer & rationale →

Options a. Yes — deny rules are checked before the permission mode, so the rule binds even under bypassPermissions b. No — bypassPermissions sits above deny rules in the order, so it approves the call first c. No — the rule must instead go in allowed_tools, which is the list bypassPermissions honors d. Only if you also drop the mode to default, since deny rules apply in default mode alone

Show answer

Correct: Yes — deny rules are checked before the permission mode, so the rule binds even under bypassPermissions

d2-05-parallel-read-only mcq apply ◆◆◇◇

In one turn an agent issues two Grep calls, one Glob, and one Write. The chapter says the deciding property is whether a tool reads state or changes it. Which of these calls may the SDK run concurrently?

Answer & rationale →

Options a. The two Grep calls and the Glob, because read-only tools may run concurrently while Write modifies state b. All four, because the SDK fans out every tool call within a single turn by default c. The Glob and the Write, because file-system tools are the ones cleared to run in parallel d. None of the three reads, because parallelism is a per-request switch the developer must enable first

Show answer

Correct: The two Grep calls and the Glob, because read-only tools may run concurrently while Write modifies state

d3-01-claude-md-concatenate mcq understand ◆◆◇◇

Two CLAUDE.md files in the load chain give contradictory instructions. Which one takes precedence?

Answer & rationale →

Options a. Neither — they concatenate; both stay in context b. The Project file overrides the User file c. The Local file overrides every other scope d. The Managed-policy file overrides the rest

Show answer

Correct: Neither — they concatenate; both stay in context

d3-01-import-decline-permanent mcq apply ◆◆◇◇

Your CLAUDE.md pulls in shared standards with @./standards/api.md. The first time the import runs, Claude Code shows an approval dialog and the developer clicks decline. According to the chapter, what is the lasting consequence for that environment?

Answer & rationale →

Options a. Imports are disabled permanently — the approval dialog does not reappear, so the file never expands until the choice is reset b. Claude re-prompts on the next session start, since approval is tracked per session c. The import is skipped only this once, and the referenced file expands automatically next launch d. Claude raises a build error and refuses to start until the import line is removed

Show answer

Correct: Imports are disabled permanently — the approval dialog does not reappear, so the file never expands until the choice is reset

d3-01-settings-precedence-cli-wins mcq apply ◆◆◇◇

A developer sets “model”: “sonnet” in their user settings and “model”: “opus” in the project’s committed settings, then launches the session with —model haiku. Given how Claude Code resolves the settings hierarchy, which model actually runs, and why?

Answer & rationale →

Options a. haiku — the CLI argument sits above both file scopes on the precedence ladder b. opus — the project settings file is committed, so it outranks a session-only flag c. sonnet — the user file is the broadest scope, and broadest wins as it does for CLAUDE.md d. haiku — but only because all three values happen to merge, like permission rules do

Show answer

Correct: haiku — the CLI argument sits above both file scopes on the precedence ladder

d3-02-commands-merged-into-skills mcq apply ◆◆◇◇

A colleague insists the only way to register a /deploy trigger is a flat file at .claude/commands/deploy.md. Per this chapter, which placement(s) will actually create a working /deploy?

Answer & rationale →

Options a. Either form works — a file at .claude/commands/deploy.md and a skill at .claude/skills/deploy/SKILL.md both create /deploy b. Only .claude/commands/deploy.md, because a slash command is the sole way to register a /deploy trigger c. Only .claude/skills/deploy/SKILL.md, because the legacy flat-file commands no longer run d. Neither — /deploy must be a built-in command whose behavior is coded into the CLI

Show answer

Correct: Either form works — a file at .claude/commands/deploy.md and a skill at .claude/skills/deploy/SKILL.md both create /deploy

d3-02-disable-model-invocation-description mcq apply ◆◆◇◇

You mark a risky /force-release skill disable-model-invocation: true, then ask Claude to “just release it” — and it acts as if the skill does not exist. According to this chapter, what did that flag do to the skill’s description?

Answer & rationale →

Options a. Its description is kept out of startup context entirely, so Claude is never told the skill exists and cannot auto-invoke it b. Its ~100-token description still loads at startup, but Claude is forbidden from acting on the match c. Its description loads only after the user runs it once, so the first auto-invoke is what is blocked d. Its description is dropped only when the skill-listing budget overflows, otherwise Claude can still auto-invoke

Show answer

Correct: Its description is kept out of startup context entirely, so Claude is never told the skill exists and cannot auto-invoke it

d3-03-scope-load-order mcq apply ◆◆◇◇

Your machine has ~/.claude/rules/style.md (“prefer tabs”), and a repo you are working in ships .claude/rules/style.md (“prefer spaces”). Neither rule uses a paths glob, so both are in context. When the two instructions tension, which one does Claude favor and why?

Answer & rationale →

Options a. The project rule wins, because user-level rules load first and the project rule is read last at higher priority b. The user-level rule wins, because a machine-wide preference outranks anything a single repository ships c. Whichever rule carries a paths glob wins, since scoping a rule raises its priority over an unscoped one d. Neither applies — two rules giving conflicting instructions cancel out, so Claude falls back to its default

Show answer

Correct: The project rule wins, because user-level rules load first and the project rule is read last at higher priority

d3-03-scoped-rule-silent mcq apply ◆◆◇◇

A rule at .claude/rules/backend/api.md carries paths: ["src/api/**/*.ts"]. During a session, Claude edits files under src/frontend/ and never reads anything under src/api/. Regarding that API rule, what is the effect of this session on its presence in context?

Answer & rationale →

Options a. Nothing happens — the rule stays out of context, because it activates only when Claude reads a file matching its glob b. The rule loads at session start, since every rule under .claude/rules/ is read into context when Claude launches c. The rule loads as soon as Claude runs any tool during the session, even before touching a matching file d. The rule is ignored permanently this session, because a path-scoped rule that misses at launch never gets a second chance

Show answer

Correct: Nothing happens — the rule stays out of context, because it activates only when Claude reads a file matching its glob

d3-04-approve-flips-write-mode mcq apply ◆◆◆◇

In plan mode, Claude presents a plan and you select the “accept edits” approve option. Considering only what the chapter says happens at approval, what permission mode is the session now in, and what does that mean for editing?

Answer & rationale →

Options a. It has exited plan mode into acceptEdits, so Claude can now apply edits; re-enter plan mode to research again b. It remains in plan mode, with the approval recorded as a note for the next edit c. It is paused awaiting a second confirmation before any write mode takes effect d. It has switched to bypassPermissions, since approving a plan waives later checks

Show answer

Correct: It has exited plan mode into acceptEdits, so Claude can now apply edits; re-enter plan mode to research again

d3-04-opusplan-200k mcq apply ◆◆◆◇

You run claude --model opusplan for a refactor whose planning step must hold roughly 400K tokens of context at once. Before picking the alias, decide whether opusplan’s plan phase can provide that much context.

Answer & rationale →

Options a. No — opusplan's plan phase runs at the standard 200K; reach for opus[1m] for that phase instead b. Yes — opusplan's Opus plan phase receives the automatic 1M-context upgrade c. Yes — the 1M window activates once the planning context exceeds 200K d. No — but switching to Sonnet for execution would give the planning step the larger window

Show answer

Correct: No — opusplan's plan phase runs at the standard 200K; reach for opus[1m] for that phase instead

d3-04-prompts-still-apply mcq apply ◆◆◇◇

A teammate switches into plan mode expecting a silent run with no interruptions, then is surprised when Claude still pauses to ask before a shell command. What does plan mode actually do to the permission prompts?

Answer & rationale →

Options a. Tool-permission prompts still appear the same as in default mode; only source edits are withheld b. All prompts are suppressed, since plan mode runs as a quiet read-only sandbox c. Prompts are suppressed for shell commands but still appear for file reads d. Prompts disappear once Claude begins writing the plan, signalling research is underway

Show answer

Correct: Tool-permission prompts still appear the same as in default mode; only source edits are withheld

d3-05-clear-after-third-correction mcq apply ◆◆◇◇

Three times this session you have corrected Claude on the same issue, and each fix drifts back toward the wrong approach. According to the chapter’s threshold for course-correction, what should you do next?

Answer & rationale →

Options a. Run /clear and restart with a more specific prompt that incorporates what the failed rounds taught you b. Issue a third, more forceful correction since the feedback loop is finally getting tight c. Open plan mode and have Claude re-explore the codebase before you correct it once more d. Raise the context budget so the failed approaches and the new instruction both fit in the window

Show answer

Correct: Run /clear and restart with a more specific prompt that incorporates what the failed rounds taught you

d3-05-fresh-session-after-interview mcq apply ◆◆◇◇

You ran the interview pattern for a rate limiter: Claude questioned you with the AskUserQuestion tool and wrote a complete SPEC.md, which you reviewed and corrected. To begin implementation, which move does the chapter prescribe?

Answer & rationale →

Options a. Open a fresh session and prompt it to implement from the spec, so the work starts on clean context b. Keep going in the same session, since it already holds the full design discussion in context c. Start a fresh session but re-paste the interview transcript so no decisions are lost d. Delete the spec file and have the interview session implement from its own memory of the answers

Show answer

Correct: Open a fresh session and prompt it to implement from the spec, so the work starts on clean context

d3-06-exit-code-masked mcq apply ◆◆◇◇

A CI step runs claude -p "review the diff" --max-turns 8 --output-format json > out.json || true. The agent thrashed past eight turns, out.json is truncated, yet the step shows green and the next step consumes the partial output. What went wrong?

Answer & rationale →

Options a. The run hit `--max-turns` and exited non-zero, but the trailing `|| true` swallowed the status, so the shell reported success and the step stayed green b. `--output-format json` overrides the process exit code with the `result` field, so a parseable payload always reports the step as passing c. `--max-turns` only prints a warning and exits `0`, so the step was genuinely successful and the empty result is a separate stdout bug d. Headless `claude -p` always exits `0`; CI must inspect `total_cost_usd` to infer failure, which this pipeline failed to do

Show answer

Correct: The run hit `--max-turns` and exited non-zero, but the trailing `|| true` swallowed the status, so the shell reported success and the step stayed green

d3-06-headless-flag mcq remember ◆◇◇◇

In a CI pipeline you need Claude Code to run a single prompt non-interactively, print the result to stdout, and exit. Which invocation does this?

Answer & rationale →

Options a. `claude -p '…'` b. `claude --interactive` c. `claude --watch` d. `claude --serve`

Show answer

Correct: `claude -p '…'`

d3-06-json-schema-print-mode mcq apply ◆◆◇◇

An engineer wants a downstream step to receive a schema-conforming payload, so inside a normal interactive terminal session they reach for --json-schema with --output-format json — and it doesn’t behave as documented. Which statement correctly diagnoses the constraint and says where a valid result would arrive?

Answer & rationale →

Options a. `--json-schema` is print-mode only, so it works under `-p` but not in an interactive session; pairing it with `--output-format json` lands the conforming result in `structured_output` b. `--json-schema` validates output in any session, but only `-p` exposes the `structured_output` field, so the interactive run silently dropped it c. `--json-schema` requires `--output-format stream-json` to emit a schema-validated payload; `json` alone returns only prose d. `--json-schema` enforces the shape on every model turn, so an interactive session rejects it because it has no fixed turn count

Show answer

Correct: `--json-schema` is print-mode only, so it works under `-p` but not in an interactive session; pairing it with `--output-format json` lands the conforming result in `structured_output`

d4-01-positive-over-negative mcq apply ◆◆◇◇

A teammate’s prompt keeps producing bulleted answers when the team wants paragraphs, so they ask you whether to add “do not use markdown lists” or “respond in smoothly flowing prose.” Which instruction steers more reliably toward paragraphs, and on what reasoning?

Answer & rationale →

Options a. "Respond in smoothly flowing prose" — it names the target shape instead of a forbidden region b. "Do not use markdown lists" — it removes the specific failure the reviewers keep flagging c. "Avoid bullets and don't be too structured" — it rules out more than one unwanted form at once d. "Never use lists or headers under any circumstances" — its firmness leaves no room to drift

Show answer

Correct: "Respond in smoothly flowing prose" — it names the target shape instead of a forbidden region

d4-01-stop-at-the-rung mcq apply ◆◆◇◇

A summarization job returns a JSON object whose fields are all handled well by plain instruction except sentiment, which must be one of three labels and will crash the downstream pipeline if a fourth value ever appears. Following the chapter’s escalation discipline, how should you harden the prompt?

Answer & rationale →

Options a. Move only sentiment to an enum-constrained tool or structured output, leaving the other fields at explicit instruction b. Move every field to a structured output, so the whole contract is enforced uniformly at the strongest rung c. Keep all fields on explicit instruction and add retries, since a capable model usually matches the schema when told d. Add few-shot examples for every field so each one is demonstrated before the parser ever runs

Show answer

Correct: Move only sentiment to an enum-constrained tool or structured output, leaving the other fields at explicit instruction

d4-02-edge-case-in-the-set mcq apply ◆◆◇◇

Your few-shot extraction returns the empty string for order_id when an input omits the order number, but the downstream consumer needs null. Following the chapter’s “target the ambiguous case” guidance, what is the most reliable fix?

Answer & rationale →

Options a. Add an example whose input has no order number and whose output shows the field as null, placed in the middle of the set b. Add a sentence beside the examples instructing the model to use null, not an empty string, when no order number appears c. Add several more clean examples that all include an order number, giving the model more data to generalize from d. Leave the prompt alone and post-process every empty string in the output into null after the model returns

Show answer

Correct: Add an example whose input has no order number and whose output shows the field as null, placed in the middle of the set

d4-02-single-example-quirk mcq apply ◆◆◇◇

A teammate steers an extraction task with a single carefully chosen example, and the model starts wrapping the first field in quotes on every output even when the input has no quotes. Which change does the chapter prescribe to stop the model copying that incidental trait?

Answer & rationale →

Options a. Move into the 3-5 range with examples that vary the incidental traits, so the wrapping is no longer common across the set b. Add a prose instruction telling the model not to quote the first field unless the input quotes it c. Cut back to zero examples and let the model infer the shape from the instruction alone d. Keep the one example but raise the model's token budget so it has room to reconsider the quirk

Show answer

Correct: Move into the 3-5 range with examples that vary the incidental traits, so the wrapping is no longer common across the set

d4-03-classic-output-slot mcq understand ◆◆◇◇

In the classic structured-output pattern you force Claude to call a print_summary tool. Where is your extracted JSON, and what do you do with the tool’s “result”?

Answer & rationale →

Options a. In the tool call's `input` — you discard the tool's result b. In the tool's result — you return it to the model c. In the assistant text block — you parse it with a regex d. In `output_config.format` — you read it from the response

Show answer

Correct: In the tool call's `input` — you discard the tool's result

d4-03-strict-dropped-openai mcq analyze ◆◆◆◇

You set strict: true on a tool to guarantee schema-valid inputs. Through which integration path is that guarantee silently lost?

Answer & rationale →

Options a. The OpenAI SDK compatibility layer b. The native Anthropic SDK c. The Message Batches API d. Any call that also sets `tool_choice` to `any`

Show answer

Correct: The OpenAI SDK compatibility layer

d4-03-truncation-not-retry mcq apply ◆◆◇◇

Your strict extraction normally works, but on unusually long inputs the returned JSON is occasionally cut off mid-object and your parser throws — and stop_reason comes back as max_tokens. What is the correct fix?

Answer & rationale →

Options a. Raise `max_tokens` (or shrink the schema) — generation ran out of room to close the object b. Retry the identical request — a fresh sample will usually complete the object c. Add `additionalProperties: false` to every object node so the grammar can compile d. Switch off `strict` and parse the response text instead, since the grammar is failing

Show answer

Correct: Raise `max_tokens` (or shrink the schema) — generation ran out of room to close the object

d4-04-stated-vs-calculated-total mcq apply ◆◆◇◇

An invoice extractor returns schema-valid JSON, but once in a while the total it reports does not match the sum of the line items, and those bad totals slip through to billing. Following the chapter’s semantic-hook approach, what schema-level design makes such a total mechanically checkable?

Answer & rationale →

Options a. Have the model emit both a stated_total and a calculated_total, then let application code compare them and route mismatches to review b. Add a minimum constraint on the total field so a value below the line-item floor is rejected by the schema c. Mark the total field required and strict so its type and presence are guaranteed on every record d. Raise the model's reasoning budget so it has room to add up the line items more carefully before answering

Show answer

Correct: Have the model emit both a stated_total and a calculated_total, then let application code compare them and route mismatches to review

d4-04-subtype-before-payload mcq apply ◆◆◇◇

A structured-output run sometimes succeeds and sometimes exhausts its retry budget, but downstream code occasionally processes garbage on the failure path instead of falling back. Per the chapter, how should the caller consume an Agent SDK structured-output result so the failure path is handled correctly?

Answer & rationale →

Options a. Branch on the result's subtype first, reading message.structured_output only on success and falling back otherwise b. Wrap the call in a try/catch so the exhausted-retries case is caught as an exception and handled there c. Read message.structured_output and treat an empty object as the signal that the retries were exhausted d. Hard-code the documented retry count so the caller knows exactly how many attempts preceded the result

Show answer

Correct: Branch on the result's subtype first, reading message.structured_output only on success and falling back otherwise

d4-04-truncation-not-retries mcq apply ◆◆◇◇

Your validate-retry loop keeps reaching error_max_structured_output_retries on your longest documents, which come back as cut-off JSON, while shorter documents extract cleanly. Per the chapter, what is the actual failure and the fix?

Answer & rationale →

Options a. The responses hit the max_tokens cap, so detect that stop_reason and raise the cap or shrink the schema before retrying b. The retry budget is too small, so increase the number of attempts until one of them completes the object c. The semantic cross-checks are too strict, so loosen them so a partial object can pass validation d. The grammar cache is stale, so vary the schema between attempts to force a fresh compilation each retry

Show answer

Correct: The responses hit the max_tokens cap, so detect that stop_reason and raise the cap or shrink the schema before retrying

d4-05-streaming-dashboard mcq apply ◆◆◇◇

A teammate proposes a streaming batch so a dashboard can update ticket-by-ticket as each of 80,000 classifications arrives. Setting aside whether streaming is the right tool, which statement about this proposal does the chapter support?

Answer & rationale →

Options a. It cannot be built — streaming is not supported for batch requests b. It works if every request carries a unique custom_id to order the live updates c. It works only after raising max_tokens with the extended-output beta header d. It cannot be built — a single batch is capped at far fewer than that many requests

Show answer

Correct: It cannot be built — streaming is not supported for batch requests

d4-05-succeeded-not-usable mcq apply ◆◆◇◇

Your overnight batch finishes and you iterate the results, keeping every entry whose result type is succeeded and treating it as a finished classification. Some downstream records turn out to hold refused or truncated text. Per the chapter, what must the per-result handler do that this design omits?

Answer & rationale →

Options a. Inspect each succeeded message's own stop_reason, because a refusal or truncation arrives as succeeded b. Treat every succeeded result as a valid classification, since unusable outputs come back as errored c. Skip stop_reason and instead re-poll processing_status until the batch reports usable results d. Filter out succeeded results, since only canceled and expired results carry a real message

Show answer

Correct: Inspect each succeeded message's own stop_reason, because a refusal or truncation arrives as succeeded

d4-06-fleet-false-positives mcq apply ◆◆◇◇

A team runs several parallel specialist reviewers with no filtering step and finds the output noisy — each agent independently flags plausible-but-wrong issues that accumulate into untrustworthy review comments. Which change directly addresses this failure mode while keeping the fan-out?

Answer & rationale →

Options a. Add a verification pass that re-checks each candidate finding against actual code behavior before any are posted b. Cut the fleet back to a single reviewer so there are fewer independent sources of flagged issues c. Give every specialist the same broad mandate so their overlapping findings confirm each other d. Raise each reviewer's token budget so it has room to recheck its own findings before posting

Show answer

Correct: Add a verification pass that re-checks each candidate finding against actual code behavior before any are posted

d4-06-gate-on-neutral-check mcq apply ◆◆◇◇

A reviewer asks why a merge still went through even though Code Review posted an Important finding, and wants real bugs to block the merge queue going forward. How should the team make Code Review’s advisory output actually gate a merge?

Answer & rationale →

Options a. Read the severity breakdown from the check-run output in your own CI and fail the step when the Important count is positive b. Tighten the branch protection rule so Code Review's neutral check is treated as a required passing status c. Switch the trigger from manual to after-every-push so a fresh review runs and blocks each new commit d. Add more specialist reviewers so the fleet produces enough Important findings to halt the merge

Show answer

Correct: Read the severity breakdown from the check-run output in your own CI and fail the step when the Important count is positive

d5-01-accumulation-not-degradation mcq apply ◆◆◇◇

Turns in a session keep growing until the window is nearly full and the loop is about to compact, but nothing the model has produced is wrong or misremembered yet. Which response matches the failure mode actually in play?

Answer & rationale →

Options a. Reduce what accumulates — tighter tool outputs, or /clear and restart with a focused prompt b. Re-surface the key facts near the end of the context to counter lost-in-the-middle c. Move the standing instructions into CLAUDE.md so they survive the next summary d. Treat it as a quality bug and add stricter instructions to make the model attend better

Show answer

Correct: Reduce what accumulates — tighter tool outputs, or /clear and restart with a focused prompt

d5-01-buried-fact-no-compaction mcq apply ◆◆◇◇

A session is comfortably under the token limit and has never compacted, yet Claude misremembers a detail established about sixty turns ago in a long stretch of context. Which fix most directly addresses the failure mode at work here?

Answer & rationale →

Options a. Restate the fact near the end of the context so it sits where the model attends most reliably b. Move the fact into CLAUDE.md so it is re-injected and survives the next compaction c. Switch to a model with a larger context window so the conversation has more room d. Raise the output token budget so the model has room to recall the detail

Show answer

Correct: Restate the fact near the end of the context so it sits where the model attends most reliably

d5-02-subagent-cannot-escalate mcq apply ◆◆◇◇

A coordinator dispatches a subagent to implement a feature, but the subagent discovers the spec is ambiguous and has no way to ask the user. How should the work be restructured so the ambiguity is handled correctly?

Answer & rationale →

Options a. Have the coordinator resolve the open questions first, then hand the subagent a fully-specified task b. Give the subagent its own canUseTool callback so it can surface the question itself c. Let the subagent pick a reasonable default and record the assumption in its final report d. Raise the subagent's token budget so it has room to reason the ambiguity away

Show answer

Correct: Have the coordinator resolve the open questions first, then hand the subagent a fully-specified task

d5-02-suggest-alternative-pattern mcq apply ◆◆◇◇

Your canUseTool callback intercepts a Bash(curl … | sh) call. You want to block this specific command but also tell Claude to download, verify the checksum, and then run, so it can correct course on its own. Which response pattern does this?

Answer & rationale →

Options a. Suggest-alternative — deny with guidance in the message, which Claude reads and uses to adjust its next step b. Approve-with-changes — rewrite the command in updatedInput so the pipe-to-shell becomes a safe download c. Reject — return a plain deny so the blocked command simply does not run d. Approve-and-remember — echo a PermissionUpdate so this command is auto-allowed next time

Show answer

Correct: Suggest-alternative — deny with guidance in the message, which Claude reads and uses to adjust its next step

d5-03-green-evals-not-healthy mcq apply ◆◆◇◇

A multi-stage system shows broad, inconsistent quality degradation in production, yet every per-component evaluation stays green and neither internal usage nor the existing eval suite reproduces the problem. Following the April-23 postmortem’s conclusion, what kind of testing does the chapter say is actually needed?

Answer & rationale →

Options a. Run integration-level testing — per-model evals on every prompt change, ablation testing, and soak periods that exercise the system as it actually runs b. Raise the coverage threshold on the existing component evals until each agent's suite is exhaustive c. Add a structured-error-context schema to every agent boundary so faults are reported in machine-readable form d. Replace the slowest agent in the chain, since the degradation must originate in whichever component is least reliable

Show answer

Correct: Run integration-level testing — per-model evals on every prompt change, ablation testing, and soak periods that exercise the system as it actually runs

d5-03-silent-ambiguity-resolution mcq apply ◆◆◇◇

In a planner-to-coder-to-reviewer pipeline, the planner emits a spec with one ambiguous requirement. The coder produces clean, working code, the reviewer rates it fine, and the wrong output ships with no warning anywhere. According to the chapter, what does the coder do at the moment it meets the ambiguity that makes the error go silent?

Answer & rationale →

Options a. The coder has no one to ask, so it resolves the ambiguity itself and hands the guess downstream as settled fact b. The coder copies the planner's ambiguous wording forward verbatim, so the reviewer inherits the same unclear spec c. The coder escalates the ambiguity to a human, but the reply arrives too late to change the committed output d. The coder retries the planner step until the spec comes back unambiguous, masking the original defect

Show answer

Correct: The coder has no one to ask, so it resolves the ambiguity itself and hands the guess downstream as settled fact

d5-04-compact-vs-clear mcq apply ◆◆◇◇

A long task’s history is cluttered but still relevant; your next task is unrelated. Which command fits which?

Answer & rationale →

Options a. `/compact` to continue this task; `/clear` to switch to the new one b. `/clear` to continue this task; `/compact` to switch c. `/compact` for both — it's the only one that frees context d. `/clear` for both — `/compact` only changes the model

Show answer

Correct: `/compact` to continue this task; `/clear` to switch to the new one

d5-04-scratchpad-survives mcq analyze ◆◆◆◇

You write a PLAN.md scratchpad, then run /compact, then later /clear. Is the plan still available to the agent?

Answer & rationale →

Options a. Yes — a disk scratchpad survives both b. No — `/compact` summarizes it away c. No — `/clear` wipes the working directory d. Only until `/clear`, which deletes the file

Show answer

Correct: Yes — a disk scratchpad survives both

d5-05-funnel-judge-tier mcq apply ◆◆◇◇

Cheap automated checks pass an extraction — valid shape, no flagged conflict — yet it is wrong-but-plausible, and shipping it as-is is costly. The chapter’s funnel puts one tier between the auto-checks and the human to catch exactly this case. Which tier is it, and why does it catch what the auto-checks miss?

Answer & rationale →

Options a. An isolated judge in a fresh context, because it has no authorship bias toward the output it is checking b. A second run of the same model, accepting the record when the two runs agree c. The model's own self-reported confidence score, gating the human queue directly d. Sending every record straight to the human reviewer to be safe

Show answer

Correct: An isolated judge in a fresh context, because it has no authorship bias toward the output it is checking

d5-05-measure-before-routing mcq apply ◆◆◇◇

An extraction pipeline emits a confidence field of high, medium, or low, and you want to use it to decide which records auto-accept and which route onward for a high-stakes clinical field. Before you trust the label to route, what does the chapter say you must do?

Answer & rationale →

Options a. Measure, on labeled past extractions, how often each stated level is actually correct, then route by that observed accuracy b. Auto-accept every record marked high, since high monotonically outranks medium and low c. Drop the confidence field and route purely on whichever label the model emits most often d. Treat the field as calibrated and reuse the same thresholds after the model is upgraded

Show answer

Correct: Measure, on labeled past extractions, how often each stated level is actually correct, then route by that observed accuracy

d5-06-citations-schema-conflict mcq apply ◆◆◇◇

An extraction service enables the Citations API on a document and also sets a structured-output format on the same request, expecting JSON with per-claim citations. The API returns a 400 instead. According to the chapter, what is the root cause of this rejection?

Answer & rationale →

Options a. The two cannot coexist, because cited text must interleave with the response prose that a strict JSON schema forbids b. The PDF was scanned without extractable text, so its pages could not be cited under page_location c. Citations were enabled on only some of the request's documents rather than all of them d. The cited_text field overflowed the output token budget once the schema was attached

Show answer

Correct: The two cannot coexist, because cited text must interleave with the response prose that a strict JSON schema forbids

d5-06-reliable-cutoff-bounds-trust mcq apply ◆◆◇◇

A teammate notes that Sonnet 4.6 trained on data through January 2026 but is reliable only to August 2025, and asks which date should decide whether a time-sensitive fact may be answered from the model’s own memory. Per the chapter, which date governs that decision, and why?

Answer & rationale →

Options a. August 2025, because the reliable knowledge cutoff is earlier than the training cutoff and is the date that bounds trust b. January 2026, because the model trained on data through that date and can be trusted to that point c. Either date is fine, since both cutoffs describe the same boundary of dependable knowledge d. Neither date applies, because supplying a dated source removes any cutoff from consideration

Show answer

Correct: August 2025, because the reliable knowledge cutoff is earlier than the training cutoff and is the date that bounds trust