Environments at Scale: Large Codebases & Monorepos
When the repo is too big to load, legibility stops meaning "document everything" and starts meaning "bound what the agent must load." Four moves do that: interface contracts, a shallow but deeply linked index, per-decision ADRs, and a scope-to-workspace monorepo structure.
The repo-design chapter made a small repo legible: one entry-point map, examples, sensors. This chapter asks what changes when the repo is too big to load. The answer is a shift in what “legibility” even means — and every move here is converged craft, not measured effect, with two of the launchpad’s catchy names turning out to be folk coinages.
Bound what the agent must load
At scale, you cannot make the repo legible by documenting all of it: there is too much, and an agent that ingests everything drowns. Legibility becomes constraining the loadable surface so an agent can work in one domain without reading the whole repo. As one practitioner puts it, “context construction should be scoped, not exhaustive” (Coding Agents in the Monorepo: Why Context Windows and 50-Service Repos Don't Mix · Tian Pan (2026); T3-practitioner).
Interface contracts at boundaries
The first move exposes a boundary the agent reads instead of the implementation. Once agents must navigate a large repo, “explicit interface contracts matter more than they used to” (Coding Agents in the Monorepo: Why Context Windows and 50-Service Repos Don't Mix · Tian Pan (2026); T3-practitioner). In practice this is a per-domain file: “each major service or domain owns a file that describes its conventions, its interface contracts, and its dependencies” (Tian Pan, 2026). The agent reads the relevant one, not all of them.
This is the established contract-first tradition applied to agents: the contract is defined up front so that, before implementation, “the contract has already been defined and communicated with potential consumers” (API Contract Definitions: Contract first, implementation first, OpenAPI, GraphQL, gRPC · Lena Fuhrimann (2022); T3-practitioner). The agent is just another consumer working against the declared interface.
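As a rough illustration of reading the boundary instead of the implementation, the sketch below models a per-domain contract file as a small record and assembles only what an agent working in one domain would need: that domain's own conventions and interface, plus the declared interface of its direct dependencies. The `DomainContract` type, the `boundary_context` helper, and the example domains are all hypothetical names invented for this sketch; none of them come from the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainContract:
    """Hypothetical in-memory form of one per-domain contract file."""
    name: str
    conventions: tuple[str, ...] = ()
    interface: tuple[str, ...] = ()      # declared signatures only
    dependencies: tuple[str, ...] = ()   # names of other domains


def boundary_context(domain: str, contracts: dict[str, DomainContract]) -> list[str]:
    """Collect the lines an agent working in `domain` would load.

    The domain's own contract is included in full. For each direct
    dependency only the declared interface is included, never its
    conventions or implementation details.
    """
    own = contracts[domain]
    lines = [f"# {own.name}"]
    lines += [f"convention: {c}" for c in own.conventions]
    lines += [f"exposes: {s}" for s in own.interface]
    for dep_name in own.dependencies:
        dep = contracts[dep_name]
        lines += [f"uses {dep.name}: {s}" for s in dep.interface]
    return lines


# Illustrative data only; a real repo would parse these from its contract files.
CONTRACTS = {
    "billing": DomainContract(
        name="billing",
        conventions=("amounts are integer cents",),
        interface=("charge(customer_id, cents) -> receipt_id",),
        dependencies=("ledger",),
    ),
    "ledger": DomainContract(
        name="ledger",
        conventions=("entries are append-only",),
        interface=("record(entry) -> entry_id",),
    ),
    "search": DomainContract(name="search", interface=("query(text) -> hits",)),
}
```

Calling `boundary_context("billing", CONTRACTS)` yields the billing contract plus the one `ledger` signature it depends on; the unrelated `search` domain never enters the result.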
Shallow index, deep links
The second move keeps a top-level map shallow but deeply linked. Anthropic’s large-codebase guidance is that “the root file describes only the highest-level structure, and subdirectory CLAUDE.md files provide the next level of detail” (How Claude Code works in large codebases: Best practices and where to start · Anthropic (2026); T1-official), with those files loaded on demand as the agent moves through the tree. The named mechanism is “progressive disclosure, which allows agents to incrementally discover relevant context through exploration” (Seeing like an agent: how we design tools in Claude Code · Thariq Shihipar (Anthropic) (2026); T1-official).
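The layering described above can be sketched as a path walk: starting from the repo root, collect the context file found in each directory on the way down to the file being edited, and nothing from sibling subtrees. This illustrates the idea only and is not a description of how Claude Code actually loads files; the `context_chain` helper and its root-to-leaf ordering are assumptions made for this example.

```python
from pathlib import Path


def context_chain(root: Path, target: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return the context files between `root` and `target`, root first.

    Only directories on the direct path to `target` are inspected, so
    context files in sibling subtrees are never returned.
    Raises ValueError if `target` is not inside `root`.
    """
    root = root.resolve()
    target = target.resolve()
    rel = target.relative_to(root)

    # If target is a file (or does not exist yet), stop at its parent directory.
    parts = rel.parts if target.is_dir() else rel.parts[:-1]

    dirs = [root]
    current = root
    for part in parts:
        current = current / part
        dirs.append(current)

    return [d / name for d in dirs if (d / name).is_file()]
```

For a target under `services/billing/`, the result would contain the root file and the billing file, while a `CLAUDE.md` under `services/search/` stays unloaded until the agent actually moves there.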
ADRs: the why, one decision at a time
The third move records the why. Interface contracts expose what a boundary is; Architecture Decision Records (ADRs) expose why it’s structured that way, in a form that scales. The failure mode they fix: “large documents are never kept up to date. Small, modular documents have at least a chance” (Documenting Architecture Decisions · Michael Nygard (2011); T3-practitioner). ADRs are “numbered sequentially and monotonically. Numbers will not be reused” (Michael Nygard, 2011), each capturing “a set of forces and a single decision in response” (Michael Nygard, 2011). The maintained collection, one record per decision with each capturing “a single AD and its rationale” (Architectural Decision Records (ADRs) · ADR GitHub organization; T2-release-notes), is a navigable decision log an agent loads one decision at a time, not a monolith it must read whole.
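The numbering rule quoted above is simple enough to sketch. The helper below picks the next ADR number as one more than the highest number already present, so gaps are never filled and numbers are never reused as long as old records stay in the directory. The `NNNN-slug.md` filename convention and both function names are assumptions for this example; the cited sources specify sequential numbering but not a file layout.

```python
import re

# Assumed filename convention: four-digit number, hyphen, lowercase slug.
ADR_NAME = re.compile(r"^(\d{4})-[a-z0-9-]+\.md$")


def next_adr_number(filenames: list[str]) -> int:
    """Return max existing ADR number + 1 (1 if there are none).

    Gaps in the sequence are deliberately left alone: taking the maximum
    keeps numbering monotonic instead of recycling a missing number.
    """
    numbers = [int(m.group(1)) for f in filenames if (m := ADR_NAME.match(f))]
    return max(numbers, default=0) + 1


def adr_filename(number: int, title: str) -> str:
    """Build a filename such as '0004-some-decision.md' from a title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"
```

A superseding decision gets its own new number and file under this scheme; the older record stays in place rather than being edited or removed.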
Monorepos: scope to the workspace
The fourth move bounds the package layout. Without bounds, “the agent searches the whole repo and wastes context on irrelevant packages” (AI agents in monorepos: what to configure differently from a single-product repo · Dave Barnwell (2026); T3-practitioner), because “in a monorepo, the root is an index, not the real unit of work” (Dave Barnwell, 2026). The single highest-leverage fix: “make the agent decide which workspace it is operating in” (Dave Barnwell, 2026). Then it “traverses the graph in steps, building up a coherent picture rather than trying to ingest everything at once” (Coding Agents in the Monorepo: Why Context Windows and 50-Service Repos Don't Mix · Tian Pan (2026); T3-practitioner).
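Both halves of this move, picking the workspace first and then walking the dependency graph in steps, can be sketched in a few lines. The example assumes the workspace roots and the dependency graph are already known (for instance, exported from the monorepo tool); `owning_workspace`, `load_order`, and the `max_hops` bound are hypothetical names chosen for this sketch, not part of any tool's API.

```python
from collections import deque
from pathlib import PurePosixPath
from typing import Optional


def owning_workspace(path: str, workspaces: list[str]) -> Optional[str]:
    """Return the deepest workspace root that contains `path`, if any."""
    parents = PurePosixPath(path).parents
    matches = [w for w in workspaces if PurePosixPath(w) in parents]
    # The deepest match wins so nested workspaces resolve to the inner one.
    return max(matches, key=lambda w: len(PurePosixPath(w).parts), default=None)


def load_order(start: str, deps: dict[str, list[str]], max_hops: int) -> list[str]:
    """Breadth-first walk from `start`, following at most `max_hops` edges.

    The returned order is the order in which packages would be read:
    the starting workspace first, then its direct dependencies, and so on.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for dep in deps.get(node, []):
            if dep not in seen:
                seen.add(dep)
                order.append(dep)
                queue.append((dep, hops + 1))
    return order
```

With a hop budget of 1 the agent reads its own workspace and its direct dependencies only; raising the budget widens the picture one layer at a time, and packages outside the reachable graph are never loaded.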
Patterns
Interface/contract docs at boundaries. Sketch: per-domain file with conventions, contracts, dependencies. When to use: multi-service / multi-domain repos (Coding Agents in the Monorepo: Why Context Windows and 50-Service Repos Don't Mix · Tian Pan (2026); T3-practitioner). Mechanics: expose the declared interface; the agent reads the boundary, not the implementation. Remember: it’s the contract-first pattern; name it as such rather than calling it INTERFACE.md.
Shallow index, deep links. Sketch: a shallow root that points to on-demand detail. When to use: hundreds of top-level folders (How Claude Code works in large codebases: Best practices and where to start · Anthropic (2026); T1-official). Mechanics: root = highest-level structure only; subdirectory files carry the next layer, loaded as the agent explores. Remember: a flat dump doesn’t scale; progressive disclosure does.
ADRs, append-only. Sketch: one numbered, immutable record per decision. When to use: any non-obvious structural choice (Documenting Architecture Decisions · Michael Nygard (2011); T3-practitioner). Mechanics: sequential numbers never reused; supersede via a new file; keep in source control (Architectural Decision Records (ADRs) · ADR GitHub organization; T2-release-notes). Remember: a monolith rots; a per-decision log an agent loads one at a time survives.
Scope to the workspace. Sketch: make the agent pick its package before acting. When to use: any monorepo (AI agents in monorepos: what to configure differently from a single-product repo · Dave Barnwell (2026); T3-practitioner). Mechanics: tags/graph to navigate-before-read; walk the dependency graph in steps (Nx and AI — Why They Work so Well Together · Victor Savkin (Nx) (2025); T2-release-notes). Remember: the root is an index, not the unit of work, so bound the agent to one workspace.
Quick reference
- At scale, legibility = bound what must be loaded (not document everything).
- Interface contracts: read the boundary, not the implementation (contract-first; not INTERFACE.md).
- Shallow index, deep links: shallow root + on-demand layers (progressive disclosure at repo scale).
- ADRs: numbered, append-only, one decision each — beats a monolithic architecture doc.
- Monorepos: scope to one workspace; navigate-before-read via tags/graph.
- Evidence: converged craft, no effect sizes.
Practice
Exercise solutions
For a one-service change the agent should not load: other services’ implementations (interface contracts let it read just the boundaries it depends on), the full repo tree (a shallow index points it to the right subtree), the history of why unrelated domains are structured as they are (per-decision ADRs scope the why to the relevant decision), and sibling packages’ code (scope-to-workspace bounds it to its package and the dependency graph it actually touches). Each move removes one class of unnecessary load — together they keep the agent’s window on the one domain it’s working in.