Chapters

Every chapter, grouped by Part. Use the card metadata to calibrate how much trust to place in a chapter's specific claims.

Part 1

  1. Chapter 1 architectural-pattern Fresh verified 2026-06-13

    Agent = Model + Harness

    The introduction. This book engineers the two layers around the model: the environment an agent acts in and the context it reasons over. This chapter grounds that thesis in the frame the book stands on: an agent is a model plus the deterministic harness that wraps it. It covers the three layers, the components, the nested loop, the book's map, and what the book leaves to companion volumes.

    claude-code cross-tool
  2. Chapter 2 architectural-pattern Fresh verified 2026-06-13

    Beyond Autocomplete: The Environment & Context Discipline

    The argued opener for this book. The discipline that turns a model into an agent is the engineering of the two layers around it — the environment it acts in and the context it reasons over — and it is the most underappreciated, highest-leverage thing an architect designs.

    claude-code cross-tool
  3. Chapter 3 architectural-pattern Fresh verified 2026-05-29

    Repo & Doc Design for Agents

    The first environment chapter — the repository is the substrate a coding agent operates in. Design it to maximize the signal the agent reads and the machine-checkable feedback it gets back. Five converged-craft moves, with their evidence tiers stated honestly.

    claude-code cross-tool
  4. Chapter 4 architectural-pattern Fresh verified 2026-05-29

    The Instruction Layer: CLAUDE.md & AGENTS.md

    The always-loaded config file (CLAUDE.md / AGENTS.md) is not documentation — it is a permanent slice of the context budget. Spend it only on broadly-applicable, can't-infer-from-code context. The one measured result inverts the naive prior.

    claude-code cross-tool
  5. Chapter 5 feature-surface Fresh verified 2026-05-29

    Skills & Progressive Disclosure

    A Skill is procedural knowledge you author once that loads only when relevant. Progressive disclosure is the payoff, the description is the load-bearing interface, and a Skill is ergonomics — not a security boundary.

    claude-code cross-tool
  6. Chapter 6 feature-surface Fresh verified 2026-06-13

    Guardrails, Permissions & Reversibility

    The safety layer of the environment — express intent in policy, contain failure in mechanism. The permission model gates what the agent may attempt; sandbox isolation relaxes prompts safely; and reversibility must be out-of-band, because the agent's self-report can't be trusted.

    claude-code cross-tool
  7. Chapter 7 architectural-pattern Fresh verified 2026-05-29

    Environments at Scale: Large Codebases & Monorepos

    When the repo is too big to load, legibility stops meaning "document everything" and starts meaning "bound what the agent must load." Interface contracts, a shallow-but-deeply-linked index, per-decision ADRs, and scope-to-workspace monorepo structure.

    claude-code cross-tool
  8. Chapter 8 feature-surface Fresh verified 2026-06-13

    Context Rot: Why Windows Degrade

    The evidence that long context does not degrade gracefully — four distinct failure modes, why the robust claim is directional not numeric, and why "architectural and unsolved" overshoots in 2026. This is the problem context assembly answers.

    claude-code cross-tool
  9. Chapter 9 feature-surface Fresh verified 2026-06-13

    Context Assembly: Engineering the Window

    The engineering response to context rot — the harness owns the boundary deciding what enters the window and when. Cache stability, just-in-time loading, compaction, attention placement, and assembly-as-prompt.

    claude-code cross-tool
  10. Chapter 10 feature-surface Fresh verified 2026-06-13

    Memory: Persisting Context Across Sessions

    Memory is just recalled context, so every memory anti-pattern is a context anti-pattern. Typing enables decay; the doc-vs-memory boundary is durable-shared vs fast-private; and repo-as-memory is the cheap floor you outgrow. An openly unsolved design space.

    claude-code cross-tool
  11. Chapter 11 architectural-pattern Fresh verified 2026-05-29

    Designing the Whole: Environment + Context as One System

    The capstone — an integrative design workflow that composes the book's eight core chapters into one discipline, with decision points and an honest map of what is settled, converged, first-party-only, and openly unsolved.

    claude-code cross-tool

Part 2

  1. Chapter 12 architectural-pattern Fresh verified 2026-06-13

    Beyond One Agent, One Tool

    The spine of the Tools & Orchestration volume. Two axes organize everything that follows — capability is a context cost (so the default is to subtract), and coordination is a context-isolation move (so a new unit is a fresh window, not an added skill). The chapter sets the volume's altitude and maps its chapters onto those two axes.

    claude-code cross-tool
  2. Chapter 13 architectural-pattern Fresh verified 2026-06-13

    Build vs. Buy: Choosing a Harness

    The first move on the capability axis — start direct on the API and add a harness abstraction only when it earns its keep. Why a framework's convenience is bought with abstraction that obscures prompts and is harder to debug, why a custom harness is a standing maintenance liability as models improve, what the framework landscape offers per each vendor's own docs, and why the realistic answer is the configure-wrap-extend middle path rather than the build-or-buy binary.

    claude-code cross-tool
  3. Chapter 14 architectural-pattern Fresh verified 2026-06-13

    Tool Minimization: Subtract First

    The governing default of the volume's capability axis — the smallest tool set that covers the workflow beats a complete one. Why an extra tool is paid twice (definition tokens at rest, selection errors at runtime), the three independent production reports that converge on subtract-first, the two highest-leverage heuristics (consolidate, return high-signal), and the dynamic complement — load tools on demand when scale forces it.

    claude-code cross-tool
  4. Chapter 15 feature-surface Fresh verified 2026-06-13

    MCP: Designing External Capability

    How to wire external capability against a least-privilege, capability-negotiated protocol — and design against a known moving target. The host/client/server split and its design-time isolation, the three primitives as three control modes, the OAuth-2.1 authorization posture, and how to build to MCP's stable core while isolating what the announced 2026-07-28 release candidate changes.

    claude-code cross-tool
  5. Chapter 16 feature-surface Fresh verified 2026-06-13

    Shaping Input — The Prompting Craft

    The craft that shapes what goes into the agent — five moves in the source's own order (be clear, show examples, elicit reasoning, structure with XML and roles, chain). The lead mental model is the brilliant-but-new employee; examples are the most reliable lever; two techniques changed under newer models (manual chain-of-thought is now a fallback, prefill on the last assistant turn is deprecated); and chaining is single-thread decomposition, not orchestration.

    claude-code cross-tool
  6. Chapter 17 feature-surface Fresh verified 2026-06-13

    Shaping Output — Structured & Reliable

    The output half of shaping I/O — four levers that force reliable machine-readable output, ordered strongest-guarantee to lightest. tool_choice forces the call while strict guarantees the args; structured outputs add a grammar-backed guarantee that holds except for refusals and max_tokens cutoffs and only over the supported schema subset; prevent beats recover, so the retry loop is the fallback, not the primary path.

    claude-code cross-tool
  7. Chapter 18 architectural-pattern Fresh verified 2026-06-14

    Sub-Agents: The Context-Isolation Primitive

    The first move of the coordination axis — a sub-agent is isolation, not capability. A fresh window that inherits nothing and returns only the relevant result; the fresh-in / result-only-out contract that makes it composable; separation of concerns; roles as description plus system prompt plus scoped tools; and when the isolation earns its keep versus when it is pure overhead.

    claude-code cross-tool
  8. Chapter 19 architectural-pattern Fresh verified 2026-06-14

    Multi-Agent: Coordinating Many

    Coordinating many agents as one decision chain — topology, then coordinator, then verifier, then a cost gate. Orchestrator-worker and the centralized-to-decentralized axis; the decompose-delegate-aggregate loop two independent first-party posts describe; the in-orchestration verifier; and the genuinely open, unflattened question of when multi-agent is worth its cost — Anthropic ships it, Cognition argues against it, and they share the parallelizability test.

    claude-code cross-tool
  9. Chapter 20 architectural-pattern Fresh verified 2026-06-14

    Composing Tools & Orchestration: The Two Axes as One System

    The capstone of the Tools & Orchestration volume — composing its chapters into one sequenced design workflow on the spine's two axes (capability and coordination), the recurring decision points, an honest map of the evidence tiers, and the boundary this volume leaves to Operations.

    claude-code cross-tool

Part 3

  1. Chapter 21 stable-principle Fresh verified 2026-06-14

    Measuring & Operating Agents: The Discipline

    The spine of the Evaluation & Operations volume. Once an agent is built, the discipline shifts from construction to operation, and the first move is to make what counts as good measurable before scaling. The chapter maps the volume's five operational surfaces (eval, observability, cost, oversight, security) and states the volume's evidence-honesty rule up front: five of the six surface chapters rest on first-party-authoritative evidence rather than triangulation, with security the one genuine convergence.

    claude-code cross-tool
  2. Chapter 22 feature-surface Fresh verified 2026-06-14

    Evaluating a Prompt: The Four-Step Loop

    How you know a prompt is good and iterate it — a four-step loop, not a one-shot check. Define measurable criteria, build a representative test set, iterate with tooling, and grade by reliability-per-effort, with criteria and tests fixed before you touch the prompt. The unit of analysis is a single prompt, and the LLM judge here is merely used — its calibration belongs to the next chapter.

    claude-code cross-tool
  3. Chapter 23 stable-principle Fresh verified 2026-06-14

    Evaluating an Agent: Harnesses, Suites & the Judge

    Evaluating a whole agent rather than a single prompt — the unit of analysis is a trajectory, a run. The chapter builds the eval before the harness, keeps the task suite small and failure-derived, reads every result as a measurement with uncertainty rather than a point score, and treats the LLM judge as a calibrated instrument with known error rather than an oracle.

    claude-code cross-tool
  4. Chapter 24 feature-surface Fresh verified 2026-06-14

    Observability: Seeing What the Agent Did

    Observability is four instrumentation surfaces stacked on one ground truth — the session-log transcript. Logging persists it, OpenTelemetry GenAI conventions trace it, attribution ties a diff back to it, and cost-surfacing shows the price. The chapter holds two boundaries — attribution is a provenance hook not an approval gate, and surfacing a cost number is not modeling the economics.

    claude-code cross-tool
  5. Chapter 25 feature-surface Fresh verified 2026-06-14

    Cost: The Economics of Running Agents

    The economics of running an agent, on one premise — context is compute, so the input context an agent reprocesses each turn is the dominant cost driver, not output generation. Four composable levers manage that spend — reduce input context, cache the stable prefix, route by model tier, and batch the non-urgent work — and they stack rather than compete. Cache economics are stated as ratios and the model ladder qualitatively, because the underlying pricing surface is volatile.

    claude-code cross-tool
  6. Chapter 26 feature-surface Fresh verified 2026-06-14

    Human-in-the-Loop: Keeping a Human in Control

    The oversight surface of the Evaluation & Operations volume. Keeping a human in control of a production agent is one move, control over the irreversible or wrong action, expressed four ways (the approval gate, plan mode, calibration, escalation in automation), all of them a workflow layered on top of Vol 1's permission model. The chapter draws the workflow-on-model line sharply, names the default-ask versus approval-fatigue trade-off as genuinely open, and treats agent self-calibration as a sparse, explicitly imperfect pattern.

    claude-code cross-tool
  7. Chapter 27 architectural-pattern Fresh verified 2026-06-14

    Security: The Adversarial-Input Layer

    The adversarial-input layer — who is really issuing the instruction. Prompt injection and Willison's lethal trifecta as the necessary-conditions threat model; the incidents (EchoLeak, Comet, ShadowPrompt) as one attack shape; why detection-only fails by construction and design-by-construction is this volume's one genuine convergence; the honest residual that defenses reduce, not eliminate; and a supply chain whose trust the registry delegates to you. The authorized-but-forged counterpart to Vol 1's authorized-but-risky guardrails.

    claude-code cross-tool
  8. Chapter 28 architectural-pattern Fresh verified 2026-06-14

    Operating the Whole: Eval + Ops as One Loop

    The Vol 3 capstone: the five operational surfaces as one closed operate-and-improve loop. A production failure surfaces in the session log, becomes an eval case, and drives a fix bounded by cost, oversight, and security, which is then measured again. An honest map of where Vol 3's evidence stands, the unsolved trade-offs the discipline navigates rather than solves, and a short close on Design v1.0.

    claude-code cross-tool