Three hours in, the agent is hallucinating function names and ignoring rules that were working fine at session start. What went wrong? More importantly: how do you escape? This chapter is reactive — the catalog of characteristic failures every practitioner eventually hits, with a concrete recovery guide for each.

Representation

Every agentic-coding tool has characteristic misuse patterns. They’re not model bugs or tool deficiencies — they arise from treating the agent as infinitely capable when it is in fact a bounded system with a finite context window, degrading attention, and no memory across sessions.

The four failure layers

When the agent keeps failing on the same problem across sessions, the temptation is to blame the model. Usually wrong. Work through the four layers in order — most failures resolve at Layer 1 or Layer 2.

Layer 1: The prompt. Is the request ambiguous? Does it assume context the agent doesn’t have? Test: rephrase with explicit constraints and an example of what “correct” looks like. If the agent succeeds, you had a specification problem, not a model problem.
Layer 2: The briefing doc (CLAUDE.md / GEMINI.md / AGENTS.md). Is there a conflicting rule? A rule too vague to enforce? A rule the agent interprets differently than you intended? Test: temporarily move the briefing doc aside (mv .claude/CLAUDE.md .claude/CLAUDE.md.bak) and retry. If the problem disappears, a rule is interfering. Add rules back incrementally to isolate it.
Layer 3: The codebase. Does the existing code teach the wrong patterns? The codebase-is-the-curriculum principle applies in reverse here: if 50 existing functions use pattern A and you want pattern B, the agent will default to A regardless of instructions. Test: ask the agent why it chose its approach. If it cites existing code, the codebase is teaching patterns your instruction is supposed to override; make the instruction explicit about the exception.
Layer 4: The model. Genuine limitations exist: certain reasoning patterns, mathematical computations, or domain-specific conventions the model gets wrong consistently. Test: try max reasoning depth; try a different model; if all fail the same way, accept the limitation and design a workaround (verification step, hook, or manual review).

In practice, the large majority of persistent failures resolve at Layer 1 or Layer 2. Practitioners who jump to “the model is wrong” usually haven’t verified their instructions are unambiguous and conflict-free.
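The Layer 2 step of adding rules back incrementally can be sketched as a short loop. This is a minimal sketch, assuming the briefing doc can be split into independent rules; `task_passes` is a hypothetical callback standing in for "retry the failing task with only these rules active and check the result", not an API of any agent tool.

```python
from typing import Callable, List, Optional, Sequence


def find_interfering_rule(
    rules: Sequence[str],
    task_passes: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Add rules back one at a time; return the first rule that breaks the task.

    `task_passes` is a hypothetical callback supplied by the caller. It should
    retry the failing task with only the given rules active and report whether
    the result was correct.
    """
    active: List[str] = []
    # Baseline: if the task fails with no rules at all, the briefing doc is
    # not the culprit and this search does not apply.
    if not task_passes(active):
        return None
    for rule in rules:
        active.append(rule)
        if not task_passes(active):
            return rule
    return None
```

When two rules conflict with each other, this returns the later one of the pair, so it is worth rereading the rules above it before deleting anything.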

Operation

The catalog. Eight common anti-patterns, each with a Recovery box you can apply on the spot. Prevention links back to the chapter that covers it properly.

1. Context overload

Symptom: the agent ignores important rules. Instructions from the briefing doc are followed inconsistently. Behavior degrades over long sessions.

Root cause: the briefing doc has grown to contain every rule, convention, and preference accumulated over months. Attention dilutes across all of it, and critical rules get the same weight as minor preferences.

Prevention: hub-and-spoke architecture with path-scoped rule files. Keep the core briefing doc to roughly 300 lines or fewer. Non-negotiable standards become hooks, not advisory lines.
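The line budget is easy to check mechanically, for example from a pre-commit hook or a CI step. The sketch below is one possible check, not a prescribed one: counting only non-blank lines is an assumption, and the default of 300 simply mirrors the figure above.

```python
def briefing_doc_over_budget(text: str, max_lines: int = 300) -> bool:
    """Return True when the doc has more non-blank lines than the budget.

    Counting only non-blank lines is an assumption of this sketch; count
    every line instead if you prefer a stricter limit.
    """
    non_blank = [line for line in text.splitlines() if line.strip()]
    return len(non_blank) > max_lines


# Example usage (path is illustrative):
#   from pathlib import Path
#   briefing_doc_over_budget(Path("CLAUDE.md").read_text())
```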

2. The kitchen-sink session

Symptom: performance degrades midway through a session. The agent repeats itself, forgets earlier decisions, or produces lower-quality output.

Root cause: multiple unrelated tasks in a single session. Each task’s context remains, consuming tokens that contribute nothing to the current task.

Prevention: one session per logical task; clear the context between tasks. Covered in depth in Ch 2 Context as Currency.

3. Over-correcting

Symptom: three or more rounds of “no, that’s not what I meant” followed by increasingly desperate attempts at the same task.

Root cause: each correction adds noise — the original error, your correction, the agent’s acknowledgment, its retry. After three rounds, the context is dominated by failure patterns.

Prevention: the two-failure rule (covered in Ch 4 Session Loop). After two failed corrections, clear and re-prompt with precision.

4. The permanent prototype

Symptom: code has been “working” for weeks but has no tests, no type hints, no error handling. “We’ll add those later” becomes permanent.

Root cause: the exploration phase has no defined exit criteria. Without an explicit transition, the code accumulates users and dependencies while remaining at prototype quality.

Prevention: explicit phase transitions in the briefing doc, with graduation checklists and target dates. Covered in Ch 5 Edit-Test-Commit.

5. The verification gap

Symptom: agent-generated code accepted without testing. Subtle bugs surface in production weeks later.

Root cause: AI-generated code looks correct. The syntax is clean, variable names reasonable, logic reads well. This appearance of correctness is precisely why verification is essential — the bugs are subtle, not obvious.

Prevention: provide verification criteria in every prompt, write tests alongside the code rather than after it, and enforce test requirements with hooks. Full treatment in Ch 5 Edit-Test-Commit.
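A hook-enforced test requirement can start as a check that no source file changes without a matching test change. The sketch below assumes a hypothetical naming convention (a change to `foo.py` must come with a change to some `test_foo.py`) and takes the list of changed paths as input. Wiring it into a particular hook system is left out because that varies by tool.

```python
from pathlib import PurePosixPath
from typing import List


def sources_missing_tests(changed_paths: List[str]) -> List[str]:
    """Return changed Python source files that have no changed test file.

    Assumes a hypothetical convention: `pkg/foo.py` is covered by any changed
    file named `test_foo.py`. Non-Python files are ignored.
    """
    changed_names = {PurePosixPath(p).name for p in changed_paths}
    missing = []
    for path in changed_paths:
        name = PurePosixPath(path).name
        # Skip non-Python files and the test files themselves.
        if not name.endswith(".py") or name.startswith("test_"):
            continue
        if f"test_{name}" not in changed_names:
            missing.append(path)
    return missing
```

A hook built on this would reject the change whenever the returned list is non-empty. Files such as `__init__.py` would need an allowlist in real use.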

6. Infinite exploration

Symptom: you asked the agent to “investigate” something without scoping. It reads 40 files, filling the context window with exploration results that crowd out implementation.

Root cause: unbounded investigation prompts give the agent no stopping criteria. Every related file gets read; context that should be reserved for the actual task is consumed.

Prevention: scope investigations narrowly with explicit deliverables. Use subagents for exploration so research context doesn’t consume your main session.

7. Big-bang refactoring

Symptom: a large rewrite fails partway through, leaving the codebase broken.

Root cause: the scope of the rewrite exceeds what can be held in context. The agent loses track of the original behavior and introduces regressions.

Prevention: incremental refactoring — extract one function, write characterization tests, refactor, verify, commit. Each step independently shippable. Characterization-test mechanics are covered in Ch 5 Edit-Test-Commit; the full four-step refactoring protocol (extract → test → harden → promote) is covered in Ch 10 Starting and Refactoring Projects.
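For readers who have not written one, a characterization test pins what the code does today rather than what a spec says it should do. The example below is illustrative only: `legacy_shipping_cost` is a hypothetical function invented for the sketch, and the expected values are what that function currently returns.

```python
def legacy_shipping_cost(weight_kg: float, express: bool) -> float:
    """Hypothetical legacy function that is about to be refactored."""
    cost = 5.0 + 1.2 * weight_kg
    if express:
        cost = cost * 1.5 + 2
    return round(cost, 2)


# Characterization cases: outputs recorded from the current implementation,
# not derived from a spec. They pin today's behavior, quirks included.
RECORDED_CASES = [
    ((0.0, False), 5.0),
    ((2.0, False), 7.4),
    ((2.0, True), 13.1),
    ((10.0, True), 27.5),
]


def test_shipping_cost_is_unchanged() -> None:
    """Fails if a refactoring step changes any recorded output."""
    for args, expected in RECORDED_CASES:
        assert legacy_shipping_cost(*args) == expected
```

Run the test before the refactor to confirm it passes against the untouched code, then after every step. A failure means the step changed behavior, whether or not the change was intended.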

8. Trusting the briefing doc untested

Symptom: you added a rule to your briefing doc weeks ago. You assume it’s in force. It isn’t.

Root cause: a rule in the briefing doc is advisory — the agent is told to follow it, not prevented from violating it. Rules that are too vague, buried among competing rules, or contradicted by codebase patterns silently fail.

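Prevention: test every new rule immediately after adding it. Give the agent a small task that should trigger the rule and confirm the output complies before you rely on it.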
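Checking whether a rule took effect can be partly mechanical when the rule is concrete enough to match with a pattern. The sketch below assumes a hypothetical rule ("use pathlib for filesystem paths; do not use os.path") and scans code the agent produced for violations. Getting that code out of the agent is left to you, since it depends on the tool.

```python
import re
from typing import List

# Hypothetical rule just added to the briefing doc:
#   "Use pathlib for filesystem paths; do not use os.path."
FORBIDDEN = re.compile(r"\bos\.path\b")


def rule_violations(generated_code: str) -> List[int]:
    """Return 1-based line numbers in the agent's output that break the rule."""
    return [
        lineno
        for lineno, line in enumerate(generated_code.splitlines(), start=1)
        if FORBIDDEN.search(line)
    ]
```

An empty list after a task that would naturally tempt the agent toward `os.path` is reasonable evidence that the rule is in force. A rule too vague to express as a check like this may also be too vague for the agent to follow.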

Evolution

Anti-patterns are the most tool-agnostic territory in the book. The failure modes are properties of agent-as-bounded-system, not of any specific product.

Convergence: the four-layer diagnosis. Prompt → briefing doc → codebase → model is universal. The briefing-doc filename changes (CLAUDE.md / GEMINI.md / AGENTS.md); the layer it represents doesn’t. Once a team has internalized the layered diagnosis, it tends to spend far less time debugging the agent, because it stops jumping to Layer 4 prematurely.

Emerging: automated anti-pattern detection. Some teams have started instrumenting their agent sessions to auto-flag anti-patterns in real time — “you’ve corrected the same issue three times, consider clearing” or “this session has touched 12 files; kitchen-sink warning.” The tooling is hand-rolled in 2026; expect first-class product support by 2027.
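A hand-rolled detector of this kind can be very small. The sketch below tracks the two signals quoted above. The class name, method names, and the way events reach it are all hypothetical, and the thresholds (3 corrections, 12 files) are simply the numbers from the examples, not recommended values.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class SessionMonitor:
    """Flags two anti-patterns from session events; thresholds are illustrative."""

    max_corrections: int = 3
    max_files: int = 12
    corrections: Dict[str, int] = field(default_factory=dict)
    files_touched: Set[str] = field(default_factory=set)

    def record_correction(self, issue: str) -> Optional[str]:
        """Count a correction for `issue`; warn once the threshold is reached."""
        count = self.corrections.get(issue, 0) + 1
        self.corrections[issue] = count
        if count >= self.max_corrections:
            return f"Corrected '{issue}' {count} times; consider clearing."
        return None

    def record_file(self, path: str) -> Optional[str]:
        """Record a touched file; warn once the session spans too many files."""
        self.files_touched.add(path)
        if len(self.files_touched) >= self.max_files:
            return (
                f"Session has touched {len(self.files_touched)} files; "
                "kitchen-sink warning."
            )
        return None
```

The hard part in practice is deciding what counts as "the same issue". This sketch sidesteps that by having the caller supply a label.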

How these anti-patterns scale

These are personal anti-patterns. When you scale agent-assisted work to a team, new patterns emerge — and the solutions shift from personal discipline to organizational infrastructure. Shared briefing docs, team hook libraries, enforced phase transitions, code-review guidelines that surface verification gaps. That scope is beyond this chapter, but the pattern is: individual-level anti-patterns become team-level infrastructure requirements as you scale.

Quick reference

The agent is a powerful but bounded system. Most anti-patterns arise from forgetting the bounds.
Four-layer diagnosis: prompt → briefing doc → codebase → model. Most failures resolve at Layer 1 or 2 — check those before blaming the model.
Context overload: keep the hub briefing doc under 300 lines; offload specifics to path-scoped rules.
Kitchen-sink session: one session per logical task; clear between.
Over-correcting: two-failure rule; after two rounds, clear and re-prompt with precision.
Permanent prototype: declare phases explicitly; set graduation checklists.
Verification gap: every agent-generated change earns verification criteria; no exceptions.
Infinite exploration: scope investigations with deliverables; delegate unbounded research to subagents.
Big-bang refactoring: incremental protocol; each step independently shippable.
Untested briefing-doc rule: smoke-test every new rule before trusting it.
Most anti-patterns are tool-agnostic — the recovery procedures port across Claude, Gemini, and Codex with only minor command changes.