Part 1 Chapter 4 Last verified 2026-05-29 Fresh

The Instruction Layer: CLAUDE.md & AGENTS.md

The always-loaded config file (CLAUDE.md / AGENTS.md) is not documentation — it is a permanent slice of the context budget. Spend it only on broadly-applicable, can't-infer-from-code context. The one measured result inverts the naive prior.

Volatility: architectural-pattern

Tools compared: claude-codecross-tool

On this page

The always-loaded file is a budget
The measured result that inverts the prior
Official, practitioner, and a study converge
Presence is not usage
Push the rest out
Patterns
Quick reference
Practice

The previous chapter ended on a hook: adding more documentation for an agent is not automatically better, and there is a measured result that proves it. This chapter is that result’s home. The file in question — CLAUDE.md, or the cross-tool AGENTS.md — looks like documentation, but it behaves like a permanent line item in the agent’s context budget, and that single fact drives everything here.

The always-loaded file is a budget

A CLAUDE.md is “a special file that Claude reads at the start of every conversation,” [Official] Best practices for Claude Code · AnthropicT1-official original and it “is loaded every session, so only include things that apply broadly.” Best practices for Claude Code · AnthropicT1-official original That is the whole game: every line you put in it is spent on every turn, whether or not it is relevant to the task at hand.

So the discipline is not “write good docs” — it is budget the always-on context. Anthropic’s curation test is per-line and ruthless: for each line, ask whether removing it would cause a mistake — “If not, cut it. Bloated CLAUDE.md files cause Claude to ignore your actual instructions!” Best practices for Claude Code · AnthropicT1-official original The file should carry “Bash commands, code style, and workflow rules” Best practices for Claude Code · AnthropicT1-official original — broadly-applicable, can’t-infer-from-code context — and nothing else.

The measured result that inverts the prior

Almost everything in this book is converged craft. This chapter is the exception: there is one controlled study, and it cuts against the intuitive default.

Researchers at ETH Zurich tested whether repository-level context files actually help, across multiple agents and models. The headline: “context files tend to reduce task success rates compared to providing no repository context, while also increasing inference cost by over 20%.” Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? · Gloaguen, Mündler, Müller, Raychev, Vechev (ETH Zurich) (2026)T3-practitioner original The mechanism they propose is that context files encourage broader exploration, and agents dutifully follow their instructions — so their normative conclusion is that “unnecessary requirements from context files make tasks harder, and human-written context files should describe only minimal requirements.” Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? · Gloaguen, Mündler, Müller, Raychev, Vechev (ETH Zurich) (2026)T3-practitioner original

This is not “never use a context file.” It is a measured basis for the line-budget rule: the harm comes from unnecessary content, so the file should carry only minimal, broadly-applicable requirements.

Official, practitioner, and a study converge

The practitioner rules turn out to minimize exactly the harm the study measured. HumanLayer reports keeping “our root CLAUDE.md file… less than sixty lines” [Practitioner] Writing a good CLAUDE.md · Kyle (HumanLayer) (2025)T3-practitioner original and warns that the file “is the highest leverage point of the harness, so avoid auto-generating it.” Writing a good CLAUDE.md · Kyle (HumanLayer) (2025)T3-practitioner original

Convergence claude-codecross-tool

Three independent evidence classes land on the same rule — keep the always-loaded file short and hand-curated: Anthropic’s official “cut any line whose removal wouldn’t cause a mistake,” Best practices for Claude Code · AnthropicT1-official original HumanLayer’s practitioner “under sixty lines, don’t auto-generate,” Writing a good CLAUDE.md · Kyle (HumanLayer) (2025)T3-practitioner original and the ETH study’s measured “describe only minimal requirements.” Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents? · Gloaguen, Mündler, Müller, Raychev, Vechev (ETH Zurich) (2026)T3-practitioner original Official + practitioner + a controlled study agreeing is the strongest convergence in this book. [Convergence]

Presence is not usage

One more honest reading, because adoption numbers are easy to misuse. The AGENTS.md site reports the format is “used by over 60k open-source projects” AGENTS.md · Agentic AI Foundation (Linux Foundation)T2-release-notes original — but that is a presence count (a code-search for the file), a vendor self-report. The first large-scale trace-based study found that “more than 40% of file-using projects have no commit-level activity” Agentic Much? Adoption of Coding Agents on GitHub · Robbes, Matricon, Degueule, Hora, Zacchiroli (2026)T3-practitioner original — the file is present, but the tooling is not being exercised.

Push the rest out

If the always-loaded file is only for broadly-applicable context, where does everything else go? It loads on demand. AGENTS.md supports nested files where “the closest one takes precedence and every subproject can ship tailored instructions,” AGENTS.md · Agentic AI Foundation (Linux Foundation)T2-release-notes original and the broader principle — letting an agent “load information only as needed” [Official] Equipping agents for the real world with Agent Skills · Anthropic (2025)T1-official original — is the bridge to the next chapter on Skills.

Patterns

What earns a place. Sketch: the file holds only broadly-applicable, can’t-infer-from-code context. When to use: deciding any line. Best practices for Claude Code · AnthropicT1-official original Mechanics: Bash commands, code style, workflow rules — yes; task- or subproject-specific detail — no. Remember: it is paid every turn; scope to “applies broadly.”

The curation test. Sketch: per line, “would removing this cause a mistake?” When to use: every edit to the file. Best practices for Claude Code · AnthropicT1-official original Mechanics: if removal wouldn’t cause a mistake, cut it. Remember: bloat makes the agent ignore your real instructions — measured harm, not style.

Hand-write, don’t auto-generate. Sketch: craft the file by hand; never machine-generate it. When to use: always. Writing a good CLAUDE.md · Kyle (HumanLayer) (2025)T3-practitioner original Mechanics: short root file; review every line. Remember: highest-leverage point of the harness — and auto-generation is what the ETH study implicates.

Push the rest out (progressive disclosure). Sketch: layer instructions; load on demand. When to use: anything not broadly applicable. AGENTS.md · Agentic AI Foundation (Linux Foundation)T2-release-notes original Mechanics: nested AGENTS.md (nearest wins) for subprojects; Skills for procedures (next chapter). Remember: always-loaded is for the broadly-true; the rest loads when relevant.

Measure by traces. Sketch: judge adoption by AI-assisted commit traces, not file presence. When to use: any rollout or adoption claim. Agentic Much? Adoption of Coding Agents on GitHub · Robbes, Matricon, Degueule, Hora, Zacchiroli (2026)T3-practitioner original Mechanics: look for co-authored commits/PRs, not stars or file counts. Remember: presence ≠ usage; >40% of file-using repos show no activity.

Quick reference

It’s a budget, not docs — every line is paid on every turn.
Scope: broadly-applicable, can’t-infer-from-code context only (Bash commands, code style, workflow rules).
The one measured result: unnecessary context-file content reduces success and adds >20% cost — keep it minimal.
Convergence: official + practitioner + study agree on short + hand-curated. Never quote a line number.
Presence ≠ usage: measure adoption by traces, not file counts.
Everything else loads on demand — nested files, then Skills.

Practice

Exercise solutions

Solution ↑ Exercise

Most hand-written files lose a surprising fraction to the curation test — anything the agent could infer from the code (file locations it can grep, conventions visible in nearby code) and anything task-specific (how to do one particular migration) fails “would removing it cause a mistake?” for the general session. Survivors are typically: the build/test commands, a few non-obvious workflow rules, and code-style choices not enforced by a linter. If your file is long, that is the signal — the ETH result says the excess is not neutral, it is actively costing success and tokens. Cut to the broadly-true core; route the rest to on-demand loading.