Part I built the agent and its orchestration; Part II turns to the tools that agent reaches for. A tool is a contract between a deterministic system and a non-deterministic caller — and the architect’s leverage is not the implementation behind it but the surfaces the model actually reads: the description, the input examples, the operation boundary, the name, and the response. Get those right and a capable model selects the tool correctly; get them wrong and no amount of model quality rescues it.
The description is the highest-leverage surface
Of every field on a tool definition, the description moves performance the most: detailed descriptions are “by far the most important factor in tool performance.” ([Official] Define tools · Anthropic) A description is not documentation for a human reader; it is the surface the model selects from, so it must spell out what the tool does, when it should be used (and when it should not), what each parameter means, and any caveats. ([Official] Define tools · Anthropic) The guidance even sets a floor: aim for “at least 3-4 sentences per tool description, more if the tool is complex.” ([Official] Define tools · Anthropic)
The gap is concrete. A get_stock_price described as “Retrieves the current stock price for a given ticker symbol… returns the latest trade price in USD… It will not provide any other information” tells the model exactly when to reach for it and what it gets back; the same tool described as “Gets the stock price for a ticker” leaves it guessing about inputs, outputs, and boundaries. ([Official] Define tools · Anthropic)
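The same contrast can be sketched in code. The tool below (get_exchange_rate), its wording, and the helper names are hypothetical and written only for illustration; the sentence counter is a rough heuristic for the 3-4 sentence floor, not an official check.

```python
import re

# Hypothetical tool used only for illustration; it is not taken from the cited docs.
_SCHEMA = {
    "type": "object",
    "properties": {
        "base": {"type": "string", "description": "ISO 4217 code to convert from"},
        "quote": {"type": "string", "description": "ISO 4217 code to convert to"},
    },
    "required": ["base", "quote"],
}

VAGUE_TOOL = {
    "name": "get_exchange_rate",
    "description": "Gets the exchange rate for a currency pair.",
    "input_schema": _SCHEMA,
}

DETAILED_TOOL = {
    "name": "get_exchange_rate",
    "description": (
        "Retrieves the latest exchange rate between two currencies. "
        "Use it when the user asks what one currency is worth in another right now; "
        "do not use it for historical rates or for converting amounts. "
        "Both parameters are three-letter ISO 4217 codes such as USD or EUR. "
        "Returns a single decimal rate and nothing else."
    ),
    "input_schema": _SCHEMA,
}


def sentence_count(text: str) -> int:
    """Rough heuristic: split after '.', '!' or '?' when followed by whitespace."""
    return len([s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s])


def meets_description_floor(tool: dict, minimum: int = 3) -> bool:
    """True when the description has at least `minimum` sentences."""
    return sentence_count(tool["description"]) >= minimum
```

A lint like this only catches descriptions that are too short; whether the sentences actually cover purpose, boundaries, and parameters still needs a human or model review.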
Show correct usage with input_examples
The description tells the model how to use a tool; input_examples show it. This optional field carries an array of example argument objects that demonstrate correct calls (the documented “Tool Use Examples” feature). ([Official] Define tools · Anthropic) A weather tool can ship three: a full call, a call with a different unit, and a call that omits the optional field, teaching the model the shape by demonstration rather than prose.
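A minimal sketch of such a weather tool follows. The field names location and unit, the description wording, and the example values are assumptions made for illustration; only the name, description, input_schema, and input_examples keys come from the text above.

```python
# Hypothetical weather tool; parameter names and values are illustrative.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": (
        "Returns the current weather for one location. "
        "Use it when the user asks about present conditions, not forecasts. "
        "The location is a city and region; the unit is optional."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and region, such as San Francisco, CA",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit for the result",
            },
        },
        "required": ["location"],
    },
    "input_examples": [
        {"location": "San Francisco, CA", "unit": "fahrenheit"},  # full call
        {"location": "Tokyo, Japan", "unit": "celsius"},          # different unit
        {"location": "New York, NY"},                             # optional field omitted
    ],
}
```

Each example is an argument object, not a full request, so it has exactly the shape the model is expected to produce in a tool call.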
The one hard rule: each example must validate against the tool’s input_schema, or the request returns a 400. ([Official] Define tools · Anthropic) Two more facts for the exam: input_examples are for client (user-defined) tools, not server-side tools, and they cost roughly 20–50 tokens for simple examples and 100–200 for complex nested ones, a context cost you pay deliberately where ambiguity is high. ([Official] Define tools · Anthropic)
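Because an invalid example fails the whole request, it can help to check examples locally before sending. The sketch below is a deliberately simplified stand-in that covers only required fields, primitive types, and enum values; it is not a full JSON Schema validator, and the schema and function name are hypothetical.

```python
from __future__ import annotations

from typing import Any

# Maps a few JSON Schema type names to Python types (simplified on purpose).
_PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validation_errors(example: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means the example passed."""
    errors = []
    properties = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in example:
            errors.append(f"missing required field: {field}")
    for field, value in example.items():
        spec = properties.get(field)
        if spec is None:
            continue  # additionalProperties is not modeled in this sketch
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            errors.append(f"{field}: expected {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{field}: {value!r} is not one of {spec['enum']}")
    return errors


# Hypothetical schema used to exercise the checker.
WEATHER_SCHEMA = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}
```

For instance, `validation_errors({"unit": "kelvin"}, WEATHER_SCHEMA)` reports two problems: the missing location and the illegal enum value. A production check would more plausibly use a complete JSON Schema library.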
Consolidate operations to reduce selection ambiguity
The next surface is the operation boundary: how much each tool does. The documented default is to consolidate: “Consolidate related operations into fewer tools. Rather than creating a separate tool for every action (create_pr, review_pr, merge_pr), group them into a single tool with an action parameter. Fewer, more capable tools reduce selection ambiguity.” ([Official] Define tools · Anthropic) Every extra near-equivalent tool is one more line the model can pick wrong.
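One way to picture the consolidated shape is a single tool whose action parameter routes to the right handler. The tool name github_pr, the handler bodies, and the response fields below are hypothetical placeholders, not a real GitHub integration.

```python
# Placeholder handlers standing in for real pull-request operations.
def _create(args: dict) -> dict:
    return {"status": "created", "title": args.get("title", "")}


def _review(args: dict) -> dict:
    return {"status": "reviewed", "pr_number": args.get("pr_number")}


def _merge(args: dict) -> dict:
    return {"status": "merged", "pr_number": args.get("pr_number")}


_HANDLERS = {"create": _create, "review": _review, "merge": _merge}

# One tool definition replaces create_pr, review_pr, and merge_pr.
GITHUB_PR_TOOL = {
    "name": "github_pr",
    "description": (
        "Creates, reviews, or merges a pull request. "
        "Set action to choose the operation. "
        "pr_number is needed for review and merge; title is needed for create."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "enum": sorted(_HANDLERS)},
            "pr_number": {"type": "integer"},
            "title": {"type": "string"},
        },
        "required": ["action"],
    },
}


def handle_github_pr(tool_input: dict) -> dict:
    """Dispatch on the action parameter; unknown actions get a readable error."""
    action = tool_input.get("action")
    handler = _HANDLERS.get(action)
    if handler is None:
        return {"error": f"unknown action {action!r}; expected one of {sorted(_HANDLERS)}"}
    return handler(tool_input)
```

Deriving the enum from the handler table keeps the schema and the dispatcher from drifting apart as actions are added.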
The deeper principle is to design for the agent’s affordances, not mirror your API’s endpoints: rather than make the model chain list_users + list_events + create_event, give it one schedule_event; rather than get_customer_by_id + list_transactions + list_notes, give it get_customer_context. ([Official] Writing tools for agents · Anthropic) A tool that returns exactly the workflow the agent needs beats three tools it must orchestrate.
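As a rough sketch of the get_customer_context idea, the function below does the three lookups itself and returns one combined result. The in-memory dictionaries stand in for real backend calls, and every identifier, field name, and default here is an assumption made for illustration.

```python
# In-memory stand-ins for three backend endpoints (all data is made up).
_PROFILES = {"cust_ada": {"name": "Ada Example", "tier": "gold"}}
_TRANSACTIONS = {
    "cust_ada": [  # newest first
        {"id": "txn_003", "amount_usd": 19.0},
        {"id": "txn_002", "amount_usd": 250.0},
        {"id": "txn_001", "amount_usd": 42.5},
    ]
}
_NOTES = {"cust_ada": ["Prefers email contact."]}


def get_customer_context(customer_id: str, max_transactions: int = 2) -> dict:
    """Return profile, recent transactions, and notes in a single call."""
    profile = _PROFILES.get(customer_id)
    if profile is None:
        return {"error": f"no customer with id {customer_id!r}"}
    return {
        "customer_id": customer_id,
        "name": profile["name"],
        "tier": profile["tier"],
        "recent_transactions": _TRANSACTIONS.get(customer_id, [])[:max_transactions],
        "notes": _NOTES.get(customer_id, []),
    }
```

The orchestration that the model would otherwise perform across three calls now lives in deterministic code, where it is cheaper and easier to test.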
Namespace tool names by service
A name is the model’s fastest disambiguator, and the documented convention is to namespace by service: “Use meaningful namespacing in tool names… prefix names with the service (e.g., github_list_prs, slack_send_message). This makes tool selection unambiguous as your library grows.” ([Official] Define tools · Anthropic) Bare search becomes a liability the moment a second search exists; github_search and jira_search never collide.
Names also carry hard constraints that differ by regime. A Claude API tool name must match ^[a-zA-Z0-9_-]{1,64}$. ([Official] Define tools · Anthropic) An MCP tool name should be 1–128 characters of ASCII letters, digits, underscore, hyphen, or dot, with no spaces, and unique within its server. ([Official] Tools - Model Context Protocol Specification 2025-11-25 · Anthropic) Those MCP tools then reach the agent through a fixed pattern, mcp__<server>__<tool>: a list_issues tool on a server keyed github becomes mcp__github__list_issues. ([Official] Connect to external tools with MCP · Anthropic)
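The naming rules above translate directly into checks. In this sketch the Claude API pattern is the one quoted in the text; the MCP pattern is my own encoding of the character and length guidance, and the function names are hypothetical.

```python
import re

# Pattern quoted above for Claude API tool names.
_API_NAME = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")

# Assumed encoding of the MCP guidance: 1-128 ASCII letters, digits, '_', '-', '.'.
_MCP_NAME = re.compile(r"^[A-Za-z0-9_.-]{1,128}$")


def is_valid_api_tool_name(name: str) -> bool:
    # fullmatch avoids accepting a name that ends in a stray newline.
    return _API_NAME.fullmatch(name) is not None


def is_valid_mcp_tool_name(name: str) -> bool:
    return _MCP_NAME.fullmatch(name) is not None


def agent_tool_name(server: str, tool: str) -> str:
    """Build the name an MCP tool surfaces under: mcp__<server>__<tool>."""
    return f"mcp__{server}__{tool}"
```

Note the asymmetry: a dotted name such as slack.send passes the MCP check but fails the Claude API pattern, which allows no dot.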
Return only high-signal information
The response is the half of the contract authors forget. The model reads every token a tool returns, so a tool should “return only high-signal information… semantic, stable identifiers (e.g., slugs or UUIDs) rather than opaque internal references, and include only the fields Claude needs to reason about its next step.” ([Official] Define tools · Anthropic) Bloated responses waste the context window and bury the fields that matter. The shape of the response also shapes the next call: a semantic identifier the model can pass straight into the following tool keeps a multi-step task cheap; an opaque internal handle forces a re-lookup. ([Official] Writing tools for agents · Anthropic)
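A small projection step is one way to enforce this at the tool boundary. The raw record, its field names, and the allow-list below are invented for illustration; the point is only that the tool returns a chosen subset rather than whatever the backend emits.

```python
# A made-up backend record mixing useful fields with internal noise.
RAW_DOCUMENT = {
    "row_id": 48213,                      # opaque internal reference
    "slug": "q3-launch-plan",             # semantic, stable identifier
    "title": "Q3 Launch Plan",
    "status": "draft",
    "etag": "W/\"5f2a\"",
    "acl": ["group:eng", "group:pm"],
    "storage_shard": 7,
}

# Assumed allow-list: the fields the model needs to choose its next step.
HIGH_SIGNAL_FIELDS = ("slug", "title", "status")


def to_tool_response(raw: dict, keep: tuple = HIGH_SIGNAL_FIELDS) -> dict:
    """Keep only allow-listed fields, in a stable order."""
    return {key: raw[key] for key in keep if key in raw}
```

Here `to_tool_response(RAW_DOCUMENT)` yields `{"slug": "q3-launch-plan", "title": "Q3 Launch Plan", "status": "draft"}`, and the slug can be passed unchanged to a follow-up tool.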
When the response should be machine-shaped, MCP lets a tool declare an optional outputSchema; when it does, the server MUST return structuredContent conforming to that schema, and should also mirror the same data as serialized JSON in a text block for backward compatibility. ([Official] Tools - Model Context Protocol Specification 2025-11-25 · Anthropic) That is the output-side analogue of the required input schema; the structured-output machinery that drives it is Domain 4’s subject (D4.3).
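The shape of such a result can be sketched as plain dictionaries. The tool get_forecast and its fields are hypothetical; the keys inputSchema, outputSchema, content, and structuredContent follow the MCP tools specification as I understand it, and no MCP SDK is used here.

```python
import json

# Hypothetical MCP tool definition that declares an output schema.
FORECAST_TOOL = {
    "name": "get_forecast",
    "description": "Returns current temperature and conditions for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    "outputSchema": {
        "type": "object",
        "properties": {
            "temperature_c": {"type": "number"},
            "conditions": {"type": "string"},
        },
        "required": ["temperature_c", "conditions"],
    },
}


def build_tool_result(data: dict) -> dict:
    """Return structured data plus a serialized text mirror of the same data."""
    return {
        "content": [{"type": "text", "text": json.dumps(data)}],
        "structuredContent": data,
    }
```

Serializing the same object into the text block means clients that ignore structuredContent still see identical data.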
The structural floor: an object input schema
Beneath the design judgments sits a requirement no interface can skip. Every tool’s input schema is a JSON Schema object: in the Claude API a tool definition’s three required fields are name, description, and an input_schema object; ([Official] Define tools · Anthropic) in MCP the inputSchema is required and must be a valid JSON Schema object, not null. ([Official] Tools - Model Context Protocol Specification 2025-11-25 · Anthropic) A tool that takes no arguments still declares an empty object schema; the object is the floor every interface stands on.
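For a tool with no arguments, the empty object schema might look like the sketch below. The tool health_ping is hypothetical, the additionalProperties form for MCP is one option I believe the specification recommends, and the checker covers only the object floor, not full schema validity.

```python
# Hypothetical no-argument tool, Claude API shape.
PING_TOOL_API = {
    "name": "health_ping",
    "description": (
        "Checks whether the backend is reachable. "
        "Takes no arguments. "
        "Use it before a long task to fail fast if the service is down."
    ),
    "input_schema": {"type": "object", "properties": {}},
}

# The same tool in MCP shape; rejecting extra properties is an assumed choice.
PING_TOOL_MCP = {
    "name": "health_ping",
    "description": "Checks whether the backend is reachable. Takes no arguments.",
    "inputSchema": {"type": "object", "additionalProperties": False},
}


def has_object_schema(schema) -> bool:
    """True when the schema is a dict whose top-level type is 'object'."""
    return isinstance(schema, dict) and schema.get("type") == "object"
```

A check like `has_object_schema(None)` returning False mirrors the rule that the schema may be empty but never null.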
Practice
Exercise solutions
Consolidate the three into one get_customer_context tool (namespace it — e.g. crm_get_customer_context — if the agent spans services). Its description should state what it returns and when to use it: “Returns a customer’s profile, recent transactions, and notes for a given customer ID; use it whenever you need context about a customer before acting.” The redesign applies consolidation (fewer, more capable tools reduce selection ambiguity) and design-for-affordances (one call returns the context the agent needs instead of three CRUD calls it must chain). The agent stalled because three thin tools forced multi-step chaining the descriptions never made obvious; a single high-signal response also lets any follow-up call reuse the returned identifiers cheaply.
The most likely cause is that one of the examples does not validate against the tool’s input_schema — an invalid input_examples entry returns a 400. Every example must conform to the same input_schema the real calls do (right types, required fields present, enum values legal); a single bad example (a typo’d enum, a missing required field) fails the whole request. The examples have to agree with input_schema — which is also why they double as a check on the schema itself.
A good description must add, at minimum: (1) what the tool does concretely (not “gets data” but which data, in what form); (2) when to use it and when not to — the boundary that prevents misrouting; (3) what each parameter means (and what the response returns). Aim for 3–4 sentences. The audience is the model, which selects tools by description alone and never reads the implementation — so an opaque description is a performance bug the model cannot route around, making the description the single highest-leverage fix (“by far the most important factor in tool performance”). Adding input_examples compounds the gain by showing correct argument shape.
Exam essentials
- The description is the highest-leverage surface: “by far the most important factor in tool performance.” Say what the tool does, when (and when not) to use it, and what each parameter means; 3–4 sentences minimum.
- input_examples show correct usage: an array of example argument objects; each must validate against input_schema (invalid → 400). Client tools only, not server tools; ~20–50 tokens for simple examples, ~100–200 for complex nested ones.
- Consolidate to reduce selection ambiguity: fewer, more capable tools (an action parameter over create_pr / review_pr / merge_pr); design for the agent’s affordances, not your API’s endpoints.
- Namespace names by service: github_list_prs, not a bare search. API names match ^[a-zA-Z0-9_-]{1,64}$; MCP names are 1–128 ASCII chars and surface as mcp__<server>__<tool>.
- Return only high-signal information: semantic, stable identifiers and only the fields the model needs. MCP’s optional outputSchema governs the machine-shaped output (the server must then return conforming structuredContent).
- The input schema must be an object: the structural floor of every tool; strict: true (D2.2/D2.3) then makes inputs conform to it.