The agent mental model
What every CLI-agent actually is — an agent loop with three durable properties and four engineering principles that apply regardless of which tool you use. The foundation the rest of the book builds on.
You have just installed a CLI agent: Claude Code, Gemini CLI, or Codex CLI. Before your first prompt, take five minutes to understand what you are working with. It is not a chatbot or an autocomplete engine but an agent with a specific architecture, specific constraints, and a predictable failure profile. The mental model you build in this chapter will inform every decision in the rest of the book.
Representation
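Every modern CLI-agent (Claude Code, Gemini CLI, Codex CLI) implements the same underlying structure, the agent loop: the agent receives a prompt, reasons about the next step, calls a tool, observes the result, decides whether the task is done, and repeats. Naming varies; the shape does not.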
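The shared structure is a loop, and a toy version fits in a few lines of Python. This is an illustrative sketch, not any vendor's implementation: `TOOLS`, `Step`, `decide_next_step`, and `run_agent` are hypothetical names, and the scripted `decide_next_step` stands in for the model's reasoning.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical tool registry: tool name -> function taking one string argument.
TOOLS: dict[str, Callable[[str], str]] = {
    "read": lambda path: f"<contents of {path}>",
    "grep": lambda pattern: f"<lines matching {pattern}>",
}


@dataclass
class Step:
    tool: Optional[str]  # None means "the task is done"
    arg: str = ""


def decide_next_step(history: list[str]) -> Step:
    """Stand-in for the model: a fixed script indexed by observations so far."""
    script = [Step("grep", "TODO"), Step("read", "main.py"), Step(None)]
    observations = sum(1 for entry in history if entry.startswith("observe:"))
    return script[min(observations, len(script) - 1)]


def run_agent(prompt: str, max_turns: int = 10) -> list[str]:
    history = [f"prompt: {prompt}"]
    for _ in range(max_turns):               # repeat
        step = decide_next_step(history)     # reason and decide
        if step.tool is None:
            history.append("done")
            break
        result = TOOLS[step.tool](step.arg)  # tool call
        history.append(f"observe: {step.tool}({step.arg}) -> {result}")  # observe
    return history
```

The `max_turns` cap is a design choice in this sketch: a loop that decides for itself when to stop needs an external bound so a confused policy cannot run forever.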
Three system properties are universal across the category:
Context window. Everything the agent knows about your task (your prompt, files it has read, tool results, conversation history) occupies a finite token window. When the window fills, older information is compacted or dropped. This is not infinite memory; it is a working budget. The implications run deep enough to warrant their own chapter (Chapter 2, Context as Currency).
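The budget idea can be made concrete with a small sketch. It assumes a crude one-token-per-word estimate and a simple drop-the-oldest policy; real agents use model-specific tokenizers and may summarize older content rather than discard it. `ContextWindow` and `estimate_tokens` are hypothetical names.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


class ContextWindow:
    def __init__(self, budget: int) -> None:
        self.budget = budget
        self.entries = deque()
        self.used = 0

    def add(self, text: str) -> None:
        self.entries.append(text)
        self.used += estimate_tokens(text)
        # Evict the oldest entries until the window fits the budget again,
        # always keeping at least the newest entry.
        while self.used > self.budget and len(self.entries) > 1:
            dropped = self.entries.popleft()
            self.used -= estimate_tokens(dropped)
```

With a budget of five, adding "a b c", then "d e", then "f g" evicts the first entry: the oldest material is what the window loses.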
Tool use. The agent does not type code into a file. It calls tools: Read, Edit, Write, Bash, Glob, Grep, and a varying set of specialized helpers. Each tool call costs context (the request + the result) and produces observable effects. Understanding this matters because every operation has a token price, and the agent’s strategy is shaped by that price.
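To see why the token price shapes strategy, consider a rough ledger that charges each call for its request plus its result. The token estimate (one per word) and the `ToolLedger` class are assumptions made for illustration; the file path and search string are invented.

```python
from dataclasses import dataclass, field


def estimate_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class ToolLedger:
    calls: list = field(default_factory=list)

    def record(self, tool: str, request: str, result: str) -> int:
        # A call costs context twice: once for the request, once for the result.
        cost = estimate_tokens(request) + estimate_tokens(result)
        self.calls.append((tool, cost))
        return cost

    def total(self) -> int:
        return sum(cost for _, cost in self.calls)


ledger = ToolLedger()
# Reading a whole file puts every line of it into context...
whole_file = ledger.record("Read", "src/app.py", "line " * 400)
# ...while a targeted search returns only the matching lines.
targeted = ledger.record(
    "Grep", "def handle_request", "src/app.py:120: def handle_request(req):"
)
```

Under these assumptions the full read is far more expensive than the targeted search, which is one reason an agent tends to search before it reads.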
Configuration layers. The agent’s behavior is not a single rule set — it is a stack. A briefing doc at the project root (CLAUDE.md / GEMINI.md / AGENTS.md), optional scoped rules, tool-level permissions, and in some tools an output-style layer. These layers are not suggestions — they are the operating system through which the agent interprets your project.
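One way to picture the stack is as a chain of lookups where the most specific layer wins. The sketch below uses Python's `collections.ChainMap`; the layer names, keys, and precedence order are assumptions for illustration, and each tool documents its own actual resolution rules.

```python
from collections import ChainMap

# Hypothetical layers; keys and values are made up for this example.
global_layer = {"style": "concise", "allow_shell": False}
project_layer = {"test_command": "pytest", "allow_shell": True}
local_layer = {"style": "verbose"}

# ChainMap searches its maps left to right, so the most specific layer goes first.
effective = ChainMap(local_layer, project_layer, global_layer)

style = effective["style"]                # from the local layer
allow_shell = effective["allow_shell"]    # project overrides global
test_command = effective["test_command"]  # only the project layer defines it
```

The point of the model is that no single file tells you what the agent will do; the effective behavior is whatever survives the whole stack.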
Four engineering principles that apply regardless
The principles below are older than agentic coding. They apply to code written by humans, code written by AI, and code written by both together. They matter more with AI in the loop because AI amplifies whatever patterns it finds in your codebase — good patterns propagate at the same rate as bad ones.
Never fail silently. Every error must be explicitly reported with recovery options. Silent failures are the most expensive kind of bug: they produce incorrect results that look correct, propagate through downstream systems undetected, and surface only when the damage is difficult to reverse. In AI-assisted development, vague instructions like “handle errors gracefully” invite the agent to catch and suppress. Specific instructions like “report errors with the message, your analysis of the root cause, and 2–3 options for resolution” make every failure visible and actionable.
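The contrast between "catch and suppress" and "report with options" might look like the following sketch. Both function names are hypothetical, and the suggested cause and recovery options in the message are examples of the format, not a prescribed list.

```python
import json


def load_config_silent(text: str) -> dict:
    # Anti-pattern: the parse error vanishes and callers receive a
    # plausible-looking empty config.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return {}


def load_config_loud(text: str) -> dict:
    # The failure is reported with the message, a likely cause, and options.
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Config is not valid JSON (line {exc.lineno}, column {exc.colno}): {exc.msg}. "
            "Likely cause: a hand-edited file. "
            "Options: fix the syntax, restore the last committed version, "
            "or regenerate the file."
        ) from exc
```

The silent version returns `{}` for malformed input, which downstream code cannot distinguish from a legitimately empty config. The loud version keeps the original exception attached via `from exc`, so nothing about the failure is lost.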
Simplicity over complexity. Short functions, flat structure, self-documenting names. A 20-line function with a clear name is superior to a 5-line function with three levels of abstraction. Simple code is easier for the agent to reason about, test, and modify correctly. Each layer of indirection is another opportunity for misunderstanding. When the agent reads your codebase to learn conventions, simple patterns propagate accurately; complex patterns propagate errors.
Immutability by default. Return new data structures; mark mutations explicitly. Pure functions are easier to test, easier to parallelize, and easier for the agent to reason about. When mutation is necessary (performance, I/O, state), make it explicit — name the function with a verb that signals the mutation, use mutable types deliberately, and document the side effects. The reader (human or agent) should never be surprised by what a function modifies.
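A small sketch of the two styles side by side, using invented function names: the pure version returns a new tuple, while the mutating version announces itself through its name and its `None` return.

```python
def with_discount(prices: tuple, rate: float) -> tuple:
    """Pure: returns a new tuple and leaves the input untouched."""
    return tuple(round(p * (1 - rate), 2) for p in prices)


def apply_discount_in_place(prices: list, rate: float) -> None:
    """Mutates `prices`; the name and the None return both signal the side effect."""
    for i, p in enumerate(prices):
        prices[i] = round(p * (1 - rate), 2)


original = (10.0, 20.0)
discounted = with_discount(original, 0.5)  # original is unchanged

mutable = [10.0, 20.0]
apply_discount_in_place(mutable, 0.5)      # mutable itself now holds the new values
```

Returning `None` from the mutating function follows the convention of Python's own `list.sort`, which makes it hard to mistake a mutation for a transformation.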
Fail fast with diagnostics. Stop immediately on problems with full context: what failed, what was expected, what was received, and what the caller can do about it.
```python
import pandas as pd


def process_data(df: pd.DataFrame, min_rows: int) -> pd.DataFrame:
    # Fail fast: stop before any processing if the input is too small.
    if len(df) < min_rows:
        raise ValueError(
            f"Need {min_rows} rows, got {len(df)}. "
            "Check data source or reduce min_rows parameter."
        )
    # ...
```
The error message includes what went wrong, where to look, and what to do about it. This is the difference between a ten-second fix and a thirty-minute investigation.
Operation
The three CLI-agents this book covers all run the agent loop. The surface area differs. The table maps each core primitive to its tool-specific form.
| Primitive | Claude Code | Gemini CLI | Codex CLI |
|---|---|---|---|
| Project briefing doc | CLAUDE.md | GEMINI.md | AGENTS.md |
| File read | Read tool | read_file | native read |
| File edit | Edit / Write | replace / write_file | native edit |
| Shell execution | Bash tool | run_shell_command | native exec |
| Search | Grep / Glob | search_file_content | native grep |
| Plan-mode entry | Shift+Tab | /plan | approval modes (--suggest, -a on-request) |
| Configuration scope | layered (global / project / local / enterprise / user) | global + project (GEMINI.md) | global + project (~/.codex/config.toml) |
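Because the primitives line up across tools, tooling can be written against the primitive rather than the product. As a small illustration, the sketch below uses the briefing-doc filenames from the table to report which agents a project is already briefed for. `BRIEFING_DOCS` and `briefed_tools` are hypothetical helpers, not part of any of the three CLIs.

```python
from pathlib import Path

# Briefing-doc filenames as listed in the table above.
BRIEFING_DOCS = {
    "Claude Code": "CLAUDE.md",
    "Gemini CLI": "GEMINI.md",
    "Codex CLI": "AGENTS.md",
}


def briefed_tools(project_root: Path) -> list:
    """Return the tools that have a briefing doc at the project root."""
    return [
        tool
        for tool, filename in BRIEFING_DOCS.items()
        if (project_root / filename).is_file()
    ]
```

Calling `briefed_tools(Path("."))` from a repository root would list the agents that will find a briefing doc there.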
Evolution
The agent-loop abstraction has converged across the CLI-agent category faster than almost any other pattern in agentic coding. What remains contested is the shape of the tool surface, the depth of configuration, and which workflow primitives graduate into first-class commands.
Convergence: engineering principles are pre-agentic. The four principles above — never fail silently, simplicity, immutability, fail fast — are not changing. They were best practice before AI-generated code existed; they remain best practice after. What changed is the amplification factor. A silent-failure pattern in a 10,000-line codebase now propagates into every new file the agent writes. The principles themselves are stable.
Emerging: plan-mode as a primitive. All three tools have shipped (or committed to) an explicit planning phase where the agent reads and proposes without writing. Claude Code made it first-class first; Gemini CLI shipped an explicit /plan command later. Codex approximates the same behavior through its approval-mode flow (--suggest / -a on-request): the agent proposes each action and waits for operator approval before executing. The framing differs, but the result is functionally equivalent for review-before-edit workflows. Expect full first-class convergence within a year.
Emerging: delegation / subagents. Spawning a child agent with its own context for a bounded sub-task exists in Claude Code today and is signalled but not shipped in Gemini CLI. Codex does not have it yet. The pattern is a natural fit once the agent-loop model is internalized, but the engineering to do it safely (context isolation, result summarization, permission propagation) is non-trivial. Expect partial convergence in 2026 and full convergence in 2027.
Quick reference
- The agent loop is the foundational primitive — prompt, reason, tool, observe, decide, repeat.
- Context window is finite. Tool use has a token cost. Configuration layers shape interpretation.
- Four engineering principles (never fail silently, simplicity, immutability, fail fast) predate agentic coding and matter more with AI in the loop.
- The codebase is the curriculum — AI amplifies whatever patterns are already there.
- Practices written to the loop-shape port across tools; practices written to a specific tool’s command names do not.
- Plan-mode and subagents are emerging convergences; expect full parity within 12–18 months.