30 - Agents
Earlier lectures covered the pieces in isolation: tool calling (lecture 26), MCP (lecture 27), skills (lecture 28). This lecture covers the layer that ties them together — agents and subagents — and how the dominant CLI coding harnesses (Claude Code, OpenAI Codex, opencode, forgecode) expose that layer to you.
The mental model is the same everywhere. The spelling differs.
- Tools are things the model can call.
- Skills are instructions the model loads on demand.
- MCP is how you plug in external capabilities.
- Agents are the loop that decides what to do next.
- Subagents are agents spawned by other agents, each with its own fresh context.
Once you understand subagents, the differences between Claude Code and Codex and opencode shrink to configuration syntax.
The Problem Subagents Solve
Context windows are finite. Even with 1M tokens, you run out because most of what an agent sees during a task is exploration output it will never reference again.
Consider three everyday moments in a coding session:
- Running a test suite that produces 10,000 lines of log output — most of which is dots and timings. You only care about the failures.
- Searching a 500-file codebase for every caller of normalizeUser(). You want the list of call sites, not the 50 surrounding lines of each file.
- Fetching a documentation page to check whether a library supports a specific flag. Once you have the answer, the rest of the page is dead weight.
If the main agent does this work itself, its context fills with garbage. Every subsequent turn pays for tokens that have no bearing on the next decision. Eventually the window compacts (lossily), or you hit the turn budget, or the signal-to-noise drops so far that the model starts making mistakes.
A subagent fixes this. The parent delegates the noisy work to a child agent with its own context. The child does the exploration, builds whatever working memory it needs, then returns a summary — typically a few hundred tokens. The 10,000-line log, the 500 search hits, the full doc page — none of it touches the parent's window.
The subagent's context is thrown away when it returns. That is the point.
Anatomy of an Agent
Before subagents, pin the vocabulary. An agent is a loop: read the current conversation, generate a response (text, tool calls, or both), execute any tool calls, append the results, repeat. Lecture 01 built one from scratch.
An agent has six configurable pieces:
- System prompt — the long-standing instructions that define role, constraints, output style.
- Tools — what it can call. From lecture 26.
- MCP servers — external tool collections it can attach. From lecture 27.
- Skills — on-demand instruction modules. From lecture 28.
- Model — which LLM runs the loop. Opus, Sonnet, Haiku, GPT-5.4, etc.
- Turn / token budget — when to stop.
A subagent is all of the above, packaged as a child process the parent can spawn.
The Shared Subagent Contract
Before the platform-specific details, here is what every subagent system agrees on:
- The subagent gets its own context window. It sees its own system prompt, its own input task, its own tool results. It does not see the parent's conversation history.
- The parent hands it a task description, not raw state. You design the task brief like you design a function signature.
- The subagent has its own tools and permissions. Often narrower than the parent's — a read-only Explore subagent cannot write files.
- Subagents typically cannot spawn further subagents. Nesting depth is 1 by default across all four platforms. This prevents runaway fan-out.
- Siblings don't share memory. If you spawn two subagents in parallel, neither sees the other's work until both return and the parent merges their summaries.
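The "task brief as function signature" point is worth making concrete. A sketch of what a well-specified brief carries — the field names are illustrative, not any platform's API:

```python
from dataclasses import dataclass, field

@dataclass
class TaskBrief:
    """What the parent hands a subagent: everything it needs, nothing more."""
    goal: str                       # the one thing to accomplish
    inputs: list[str]               # paths, queries, or facts the child starts from
    return_format: str              # the contract for the summary coming back
    constraints: list[str] = field(default_factory=list)  # e.g. "read-only"

brief = TaskBrief(
    goal="Find every caller of normalizeUser()",
    inputs=["src/"],
    return_format="JSON list of {file, line} call sites, no surrounding code",
    constraints=["read-only tools only"],
)
```

Because the child never sees the parent's history, anything missing from the brief is simply unknown to it — exactly like a function that only sees its arguments.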
Here is how the four platforms name and configure the same idea:
| Platform | Term | Config location | Format |
|---|---|---|---|
| Claude Code | subagent | .claude/agents/, ~/.claude/agents/ | Markdown + YAML frontmatter |
| OpenAI Codex | custom agent / subagent | .codex/agents/, ~/.codex/agents/ | TOML |
| opencode | agent (mode: subagent) | .opencode/agents/, ~/.config/opencode/agents/, opencode.json | Markdown + YAML, or JSON |
| forgecode | agent (project via AGENTS.md) | project root, ~/.forge/ | Markdown + .forge.toml |
The rest of this lecture drills into each.
Claude Code
Claude Code is Anthropic's reference implementation and has the most mature subagent story. If you learn the Claude Code model, the others map onto it with small renames.
Official documentation: https://code.claude.com/docs/en/sub-agents
Built-in subagents
Three are always available:
- Explore — runs on Haiku, read-only tools, optimized for codebase search and file discovery. Claude delegates to Explore automatically when it needs to understand code without changing it. You can hint a thoroughness level (quick, medium, very thorough) in the task.
- Plan — used during plan mode. Read-only. Gathers context before presenting a plan. Subagents cannot spawn subagents themselves, which prevents infinite nesting.
- general-purpose — all tools, inherits the parent's model. For multi-step tasks that need both exploration and modification.
Two helpers fire automatically for specific slash commands: statusline-setup (Sonnet, runs on /statusline) and Claude Code Guide (Haiku, answers Claude-Code-feature questions).
Creating a custom subagent
Run /agents inside Claude Code — it opens an interactive builder that generates the name, description, and system prompt, asks which tools to allow, picks a model, and saves the file. That is the recommended path.
Under the hood, a subagent is just a Markdown file with YAML frontmatter:
---
name: code-reviewer
description: Reviews code for quality, security issues, and best practices. Use after any code change.
tools: Read, Glob, Grep, Bash
model: sonnet
---
You are a senior code reviewer. When invoked:
1. Run `git diff HEAD` to see recent changes.
2. Review the diff for correctness, security issues, and style.
3. Return a prioritized list of findings. Flag any secrets, SQL injection risks, or missing error handling.
4. Suggest concrete fixes, not vague advice.
The frontmatter drives selection and tooling. The body is the system prompt.
Frontmatter fields
Only name and description are required. The full set:
| Field | Purpose |
|---|---|
| name | Lowercase, hyphens only. Filesystem-unique. |
| description | When Claude should delegate. This is the trigger. Agents undertrigger — be explicit. |
| tools | Allowed tool list. Omit to inherit all. |
| disallowedTools | Deny list subtracted from the inherited or specified set. |
| model | sonnet, opus, haiku, a full model ID, or inherit (default). |
| permissionMode | default, acceptEdits, auto, dontAsk, bypassPermissions, plan. |
| maxTurns | Cap on agentic turns before the subagent stops. |
| skills | Skills to load into the subagent at startup. Subagents do not inherit skills from the parent. |
| mcpServers | MCP servers this subagent can reach. |
| hooks | Lifecycle hooks scoped to this subagent. |
| memory | user, project, or local — persistent memory across sessions. |
| background | Run as a background task (default false). |
| effort | low, medium, high, max. Overrides session effort. |
| isolation | Set to worktree to spawn in a temporary git worktree, isolating edits. |
| color | Display color in the UI. |
| initialPrompt | Auto-submitted first user turn when the subagent is the main session agent. |
The description is the most important field. The parent agent picks a subagent by matching the task to its description. Vague descriptions never trigger — the same failure mode as skills in lecture 28.
Scope and priority
Subagents live in multiple places. When names collide, the higher-priority scope wins:
- Managed settings (organization-wide) — highest.
- --agents CLI flag (JSON, session-only).
- .claude/agents/ (project, checked into git).
- ~/.claude/agents/ (user, all projects).
- Plugin agents/ directory — lowest.
Project subagents are the right home for team-shared reviewers and domain experts. User subagents are right for personal helpers you use across all projects. The --agents flag is useful for automation scripts and quick tests:
claude --agents '{
"code-reviewer": {
"description": "Expert code reviewer. Use proactively after code changes.",
"prompt": "You are a senior code reviewer. Focus on quality, security, best practices.",
"tools": ["Read", "Grep", "Glob", "Bash"],
"model": "sonnet"
}
}'
Worktree isolation
Set isolation: worktree and the subagent runs in a temporary git worktree — its own copy of the repo. Edits don't touch the parent's working tree until you merge. This is how Claude Code supports parallel implementation: spawn two subagents on the same codebase, each in its own worktree, let both edit without colliding, compare results.
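The underlying mechanism is ordinary git. A sketch of what worktree isolation amounts to in plain git — the actual implementation is Claude Code internals, and the paths and branch names here are illustrative:

```shell
# Set up a throwaway repo for the demonstration.
repo=$(mktemp -d); wt="$repo-wt"
cd "$repo"
git init -q -b main
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"

# Spawn-time: the subagent gets its own checkout on its own branch.
git worktree add -q "$wt" -b subagent/a

# The subagent edits freely without touching the parent's working tree.
echo "body { color-scheme: dark; }" > "$wt/theme.css"
git -C "$wt" add theme.css
git -C "$wt" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "subagent edit"

# Return-time: the parent inspects the result and decides whether to merge.
git diff --stat main subagent/a

# Cleanup: drop the worktree; the branch keeps the commit if you want it.
git worktree remove --force "$wt"
```

Two worktrees of the same repo share history but have independent working trees, which is exactly the property parallel implementation needs.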
Subagents vs agent teams
Claude Code also has a separate feature called agent teams, for cases where you need multiple agents communicating across sessions (e.g. long-running background work). Subagents are a single-session concept. For the coordination patterns in this lecture, subagents are the right tool. See https://code.claude.com/docs/en/agent-teams for the team model.
SDK users access the same primitives from code via the Claude Agent SDK: https://platform.claude.com/docs/en/agent-sdk/subagents
OpenAI Codex
Codex CLI is OpenAI's terminal coding agent. It adds subagents as first-class citizens but with one important philosophical difference from Claude Code: Codex only spawns a subagent when you explicitly ask it to. There is no automatic delegation based on description matching.
Official docs: https://developers.openai.com/codex/subagents · CLI: https://developers.openai.com/codex/cli
AGENTS.md — the project instructions layer
Before subagents, Codex has AGENTS.md, the equivalent of Claude Code's CLAUDE.md. Codex reads it before every task.
Two scopes:
- ~/.codex/AGENTS.md — your personal defaults across every project.
- Project-level AGENTS.md in the repo root and any subdirectory. Instructions cascade: files closer to the working directory override earlier ones. You can also drop an AGENTS.override.md to replace higher-level instructions wholesale.
Typical pattern:
- Root AGENTS.md: "Run npm run lint before opening a PR."
- services/payments/AGENTS.override.md: "Use make test-payments instead of npm test."
When Codex works in the payments directory, both apply, with the override replacing the parent's test directive.
Full spec: https://developers.openai.com/codex/guides/agents-md
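The cascade half of the lookup is simple to picture: walk from the repo root down to the working directory, collecting AGENTS.md files in order, with later files taking precedence. A sketch under that assumption — override-file semantics are omitted, and this walk is illustrative, not Codex's implementation:

```python
from pathlib import Path

def collect_agents_md(repo_root: str, workdir: str) -> list[Path]:
    """AGENTS.md files from repo root down to workdir, nearest last.
    Later entries take precedence over earlier ones."""
    root, cwd = Path(repo_root).resolve(), Path(workdir).resolve()
    # Directories on the path from root to cwd, inclusive.
    chain = [p for p in [cwd, *cwd.parents] if p == root or root in p.parents]
    chain.reverse()                       # root first, working directory last
    return [d / "AGENTS.md" for d in chain if (d / "AGENTS.md").exists()]
```

Feeding the list to the model root-first means the deepest file "wins" simply by appearing last in context.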
Custom subagents
TOML files in ~/.codex/agents/ (personal) or .codex/agents/ (project). Each file defines one agent.
Required fields: name, description, developer_instructions. Optional: model, model_reasoning_effort, sandbox_mode, mcp_servers, skills.config, nickname_candidates.
name = "reviewer"
description = "PR reviewer focused on correctness, security, and missing tests."
model = "gpt-5.4"
model_reasoning_effort = "high"
sandbox_mode = "read-only"
developer_instructions = """
Review code like an owner.
Prioritize correctness, security, behavior regressions, and missing test coverage.
"""
nickname_candidates = ["Atlas", "Delta", "Echo"]
A custom agent file accepts the same settings as a normal Codex session config, so you have the full TOML surface to configure sandbox, MCP servers, and model parameters per agent.
Invocation
You ask for a subagent by name, in natural language:
Have pr_explorer map the affected code paths in this PR.
Spawn one reviewer agent per changed service and consolidate findings.
No @mention, no slash command. Codex parses your request and orchestrates accordingly.
Global agent settings
In ~/.codex/config.toml:
[agents]
max_threads = 6 # concurrent subagents allowed
max_depth = 1 # default: subagents cannot spawn subagents
job_max_runtime_seconds = 1800
Parallel fan-out
When you ask for several agents at once, Codex spawns them in parallel and waits until all return before consolidating:
Spawn three reviewers — security, performance, docs — and merge their findings.
Each inherits the parent's sandbox policy. Output consolidates into a single response.
Codex + Agents SDK
Codex exposes the CLI as an MCP server, which means you can orchestrate it from the OpenAI Agents SDK for deterministic, multi-stage pipelines — plan → implement → review → deploy coded as an Agents SDK graph that invokes Codex at each stage. See https://developers.openai.com/codex/guides/agents-sdk.
opencode
opencode is the open-source terminal coding agent from the SST team. Its core distinction is the explicit mode field on every agent: primary, subagent, or all.
Official docs: https://opencode.ai/docs/agents/ · GitHub: https://github.com/anomalyco/opencode
Primary vs subagent
- Primary agents are directly invocable by the user. You switch between them with Tab or a keybind. Built-in primaries:
  - Build — default, all tools enabled.
  - Plan — analysis and planning; edit and bash permissions set to ask.
- Subagents are invoked by primaries (or by the user via @name). Built-in subagents:
  - General — multi-step research, full tool access except todo.
  - Explore — fast, read-only.
- all mode works either way. It is the default when mode is unspecified.
Three hidden system agents (Compaction, Title, Summary) run automatically behind the scenes.
Configuration
Two equivalent forms. JSON, in opencode.json:
{
"agent": {
"code-reviewer": {
"mode": "subagent",
"description": "Review the diff against our security checklist",
"model": "anthropic/claude-sonnet-4-20250514",
"permission": {
"edit": "deny",
"bash": "ask",
"webfetch": "allow"
}
}
}
}
Markdown, in ~/.config/opencode/agents/ (global) or .opencode/agents/ (project):
---
description: Security-focused reviewer
mode: subagent
model: anthropic/claude-sonnet-4-20250514
---
You are a security reviewer. Focus on injection, auth, and secret handling.
Invocation
- @code-reviewer — explicit @-mention.
- Automatic — the primary delegates based on the subagent's description, just like Claude Code.
- Task tool — one agent invokes another programmatically.
Fine-grained permissions
opencode offers finer per-command permission granularity than the others. You can allow git status * and ask for git push in the same config:
"permission": {
"bash": {
"*": "ask",
"git status *": "allow",
"git push": "ask"
},
"task": {
"*": "deny",
"code-reviewer": "ask"
}
}
permission.task lets a primary agent control which subagents it may invoke — useful when you want a Plan primary that can call Explore but not Build.
Set hidden: true on a subagent to remove it from the @ autocomplete menu while still letting other agents invoke it via the Task tool.
forgecode
Forge Code (forgecode.dev) is a multi-provider terminal harness. Its distinguishing feature is model flexibility: it talks to 300+ models via OpenRouter, plus direct OpenAI, Anthropic, and open-weight providers, and it lets you switch models mid-session with :model.
Official docs: https://forgecode.dev/docs/ · GitHub: https://github.com/tailcallhq/forgecode
Configuration
- Project instructions: AGENTS.md in the repo root (same file format as Codex).
- Settings: .forge.toml (main config), .mcp.json (MCP integration), .ignore (file exclusion).
- Skills: SKILL.md folders, compatible with the Agent Skills spec from lecture 28.
- Shell integration: zsh plugin that lets you trigger Forge from the shell prompt with :.
- Install: curl -fsSL https://forgecode.dev/cli | sh.
Subagents in forgecode
forgecode's subagent story is lighter than Claude Code's or Codex's. It supports custom agents configured via AGENTS.md and project-level config, but the automatic-delegation and parallel-fan-out patterns are less developed. Treat forgecode as the right choice when:
- You want a single harness across many providers (cloud, open-weight, local).
- You are optimizing cost by routing to cheaper models mid-session.
- You already know the Agent Skills spec and want to reuse skills across platforms.
Use Claude Code or Codex when subagent coordination is the core of your workflow. Check the current docs for the live state of subagent support — this part of the ecosystem is moving quickly.
Platform comparison at a glance
| Feature | Claude Code | Codex | opencode | forgecode |
|---|---|---|---|---|
| Subagent file format | Markdown + YAML | TOML | Markdown + YAML, or JSON | Markdown + TOML |
| Project location | .claude/agents/ | .codex/agents/ | .opencode/agents/ | project root + .forge.toml |
| User location | ~/.claude/agents/ | ~/.codex/agents/ | ~/.config/opencode/agents/ | ~/.forge/ |
| Project instructions file | CLAUDE.md | AGENTS.md | AGENTS.md | AGENTS.md |
| Automatic delegation | Yes (description match) | No (explicit only) | Yes (description match) | Limited |
| Parallel fan-out | Yes | Yes (max_threads) | Via Task tool | Limited |
| Worktree isolation | Yes (isolation: worktree) | No (sandbox modes) | No | No |
| Per-subagent model | Yes | Yes | Yes | Yes |
| Per-subagent tool/permission scope | Yes | Yes (sandbox) | Yes (fine-grained) | Partial |
| Built-in subagents | Explore, Plan, general-purpose | none pre-shipped | Build, Plan, General, Explore | none pre-shipped |
| Depth limit | 1 | 1 (max_depth) | 1 | 1 |
Coordination Patterns
Once you have subagents, three patterns recur constantly.
Fan-out research
The parent needs to learn several independent things. Spawn one subagent per thread of investigation, run them in parallel, merge their summaries.
Token math: if each subagent used 100K tokens of exploration, the parent saved 300K tokens. It sees ~1–2K tokens of merged summary instead.
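A parent orchestrating fan-out looks roughly like this — `spawn_subagent` stands in for whatever Task/spawn primitive your harness exposes, and here it just fabricates a summary:

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_subagent(brief: str) -> str:
    """Stand-in for the harness's spawn primitive: in reality the child
    burns its own context on exploration and returns only a short summary."""
    return f"summary of: {brief}"

briefs = [
    "Find the settings page components; list files and structure.",
    "Search for theme / dark mode references; report what exists.",
    "Check whether the design system exposes color tokens.",
]

# Fan-out: siblings run concurrently and share no memory.
with ThreadPoolExecutor(max_workers=len(briefs)) as pool:
    summaries = list(pool.map(spawn_subagent, briefs))

# The parent merges only the summaries -- a few KB, not the raw exploration.
merged = "\n".join(f"- {s}" for s in summaries)
```

The keep-or-drop decision happens at the boundary: only `merged` enters the parent's context.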
Pipeline — research, plan, implement, review
Sequential, because each stage depends on the previous:
- Researcher (Explore subagent) — maps the relevant code. Returns a short context brief.
- Planner (Plan subagent) — designs the change. Returns a step-by-step plan.
- Implementer (main agent or general-purpose subagent) — writes the code.
- Reviewer (custom subagent) — checks the diff against a checklist. Returns findings.
Each stage is cheap because it sees only what it needs. The parent orchestrates.
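The same primitive, chained sequentially — each stage sees only the previous stage's summary, never the raw exploration. `spawn_subagent` again stands in for the harness's spawn primitive:

```python
def spawn_subagent(role: str, brief: str) -> str:
    """Stand-in: run a subagent with the given role, return its summary."""
    return f"[{role}] result for: {brief}"

def pipeline(task: str) -> str:
    # Each stage's output becomes the only context the next stage receives.
    context = spawn_subagent("explorer", f"Map code relevant to: {task}")
    plan = spawn_subagent("planner", f"Design the change. Context: {context}")
    diff = spawn_subagent("implementer", f"Execute this plan: {plan}")
    findings = spawn_subagent("reviewer", f"Check this diff: {diff}")
    return findings  # the parent only ever held four short summaries
```

The parent's job reduces to threading summaries between stages and deciding when the pipeline is done.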
Isolated parallel edits
Two subagents, same task, different approaches. Each runs in its own git worktree (Claude Code's isolation: worktree). The parent compares the results and picks the better diff. Useful when you want A/B approaches without merge conflicts.
Delivering a Feature End-to-End
Worked example: "Add dark mode to the settings page." Here is what happens when you give that prompt to a subagent-aware harness.
Step 1 — Main agent triages. Reads the user request. Decides it needs to understand (a) how the settings page is built and (b) what theming already exists.
Step 2 — Fan-out research. Spawns two parallel Explore subagents:
- Subagent A: "Find the settings page components. List files and the structure."
- Subagent B: "Search for theme, dark mode, color scheme references. Report what's there and what's missing."
Each subagent burns tens of thousands of tokens reading files. Each returns ~500 tokens of summary. The parent sees only the summaries.
Step 3 — Plan. Main agent synthesizes the two summaries into a plan: "Extend existing ThemeProvider, add a toggle to SettingsLayout, add a prefers-color-scheme media listener." No Plan subagent needed for a small change; for larger work, delegate to Plan.
Step 4 — Implement. Main agent makes the edits directly. It has the plan, it has the file paths, the context is clean.
Step 5 — Test. Spawns a subagent: "Run npm test and report failures." Test output can be 5,000+ lines. The subagent reads the log, distills to "3 tests failed, all in theme.test.ts, cause: missing default color on the color-scheme CSS var." The raw log stays in the subagent.
Step 6 — Fix. Main agent fixes the three tests based on the distilled summary.
Step 7 — Review. Spawns a code-reviewer subagent on the diff. It returns a findings list: "Theme toggle missing aria-label; localStorage access not guarded for SSR." Two small fixes.
Step 8 — Commit. Main agent commits.
What made this work: at no point did the main agent's context contain 500-file search dumps or 10K test log lines. It saw only the essential summaries, the plan, the diff, and the findings. That kept the reasoning crisp across eight steps.
Context Management Deep Dive
A concrete comparison. Suppose your session has 20 MCP tools attached. Each tool definition is roughly 1,000–1,400 tokens. MCP tools load on every turn (they are part of the system prompt), so a 40-turn session pays 20 × 1,200 × 40 ≈ 960,000 tokens just to define tools you might not use.
Replace the MCP servers with skills that invoke a smaller set of deterministic scripts. Each skill costs ~100 tokens for discovery at startup. Only the active skill loads its full body (~2–5K tokens). Over 40 turns: 20 × 100 + ~3,000 = ~5,000 tokens of tool overhead.
Now add subagents. The parent has 5 tools. It delegates verbose exploration to subagents. The subagents each load their own skills (skills do not inherit from the parent; each subagent starts with its own clean skill roster). The parent's per-turn overhead stays near 5 tools × 500 tokens = 2,500 tokens.
The headline: tools, skills, and subagents compose to keep the parent's context budget for reasoning, not for reading noise.
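The arithmetic in this section, spelled out:

```python
# MCP baseline: 20 tool definitions re-sent on every one of 40 turns.
mcp_overhead = 20 * 1_200 * 40          # tools x tokens/definition x turns
assert mcp_overhead == 960_000

# Skills: ~100-token discovery stubs at startup, plus one ~3K active body.
skills_overhead = 20 * 100 + 3_000
assert skills_overhead == 5_000

# Lean parent with 5 tools: per-turn definition overhead only.
parent_per_turn = 5 * 500               # tools x tokens/definition
assert parent_per_turn == 2_500
```

Roughly a 200x reduction in tool-definition overhead between the first and second configurations — before subagents even enter the picture.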
Two important caveats:
- Subagent output may be truncated. Many harnesses cap subagent return payloads at 10–30K characters. Design your task brief so the summary fits. If you need the full thing, design the subagent to write it to a file and return the path.
- Sometimes you need the verbose output back in the parent. If the parent has to quote exact error lines, or reason about precise diffs, a summary loses fidelity. In those cases, do not subagent — handle it inline, or have the subagent produce a file the parent reads directly.
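The file-handoff workaround from the first caveat, sketched — the function, its return shape, and the log content are all illustrative, not any harness's API:

```python
import tempfile

def run_tests_subagent() -> dict:
    """Sketch: the subagent writes the full log to disk and returns only a
    compact summary plus the path; the parent reads the file only if needed."""
    # Stand-in for thousands of lines of real test output.
    full_log = "\n".join(f"test {i}: ok" for i in range(5000))
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
        f.write(full_log)
        path = f.name
    return {
        "summary": "5000 tests ran, 0 failures",  # fits any return-size cap
        "full_log_path": path,                    # full fidelity on demand
    }

result = run_tests_subagent()
```

The summary always fits the parent's budget; the path gives it an escape hatch when exact lines matter.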
When to Use Subagents
- Verbose side-quests. Test logs, log tailing, large search results, doc pages, large file surveys.
- Independent parallel research. Three questions that don't depend on each other — answer them concurrently.
- Specialized personas. A security reviewer, a SQL reviewer, a docs writer. Each gets a tight system prompt that would bloat your main agent if inlined.
- Cost routing. Explore on Haiku, plan on Sonnet, stay on Opus for the heavy reasoning turns.
- Long-running work. Background subagents let the main conversation keep moving while a migration analysis runs.
- Isolated experimentation. Worktree-isolated subagents try a risky refactor without touching the main tree.
When NOT to Use Subagents
- Tightly coupled reasoning. If every step needs to see every previous step, a subagent's "summary back to parent" loses information and adds a round-trip.
- Tasks smaller than the summary. If the whole task is five lines, spawning a subagent costs more than doing it inline.
- Interactive clarification. Subagents cannot ask the user questions. If the task needs clarification, the parent must handle it.
- Deterministic, reproducible execution. A skill with a script is more reliable than a subagent being asked to "run the validator." Use skills (lecture 28) for deterministic work; use subagents for judgment.
- When the output IS the artifact. If the parent needs the full file, not a summary, a subagent adds a layer for no gain. Read the file directly.
Common Pitfalls
- Vague descriptions. The parent selects subagents by matching the task to each subagent's description field. Generic descriptions never fire. Be explicit about when to use the subagent, what keywords should trigger it, and what makes it different from siblings. Same trap as skill descriptions in lecture 28.
- Summaries that drop the crucial detail. Plan what the subagent must return. "Report findings" is too vague. "Return a JSON list of {file, line, severity, message} for each issue, max 20 entries" is enforceable.
- Assuming siblings share memory. If subagent A discovers something subagent B needs, you must either serialize them (pipeline, not fan-out) or have the parent pass A's result into B's task brief.
- Skill re-load cost. Every subagent loads its own skills. Ten subagents with five skills each can add up. Prune aggressively.
- Over-eager delegation. Not every task deserves a subagent. Four-line fixes, trivial answers, and "what does this constant mean" questions belong inline.
- Forgetting the depth limit. Subagents normally cannot spawn further subagents (depth 1). If your plan depends on three-level nesting, you need to rearrange as a pipeline or lift work up to the parent.
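The "enforceable summary" point is cheap to implement: validate the subagent's return payload before trusting it. A sketch using the {file, line, severity, message} shape from the pitfall above — the validator itself is illustrative:

```python
import json

REQUIRED = {"file", "line", "severity", "message"}

def parse_findings(payload: str, max_entries: int = 20) -> list[dict]:
    """Reject malformed subagent summaries instead of silently merging them."""
    findings = json.loads(payload)
    if not isinstance(findings, list) or len(findings) > max_entries:
        raise ValueError(f"findings must be a list of at most {max_entries} entries")
    for entry in findings:
        missing = REQUIRED - entry.keys()
        if missing:
            raise ValueError(f"finding missing fields: {sorted(missing)}")
    return findings

ok = parse_findings('[{"file": "auth.py", "line": 42, '
                    '"severity": "high", "message": "unescaped SQL"}]')
```

A failed parse is a signal to re-prompt the subagent with the format requirement restated, rather than to guess at what it meant.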
References
Claude Code
- Subagents: https://code.claude.com/docs/en/sub-agents
- Agent SDK subagents: https://platform.claude.com/docs/en/agent-sdk/subagents
- Agent teams: https://code.claude.com/docs/en/agent-teams
- Context window visualization: https://code.claude.com/docs/en/context-window
- Plan mode in common workflows: https://code.claude.com/docs/en/common-workflows
OpenAI Codex
- Subagents: https://developers.openai.com/codex/subagents
- AGENTS.md: https://developers.openai.com/codex/guides/agents-md
- Agents SDK integration: https://developers.openai.com/codex/guides/agents-sdk
- CLI overview: https://developers.openai.com/codex/cli
- CLI reference: https://developers.openai.com/codex/cli/reference
- Introducing Codex: https://openai.com/index/introducing-codex/
opencode
- Agents: https://opencode.ai/docs/agents/
- Commands: https://opencode.ai/docs/commands/
- Config: https://opencode.ai/docs/config/
- CLI: https://opencode.ai/docs/cli/
- GitHub: https://github.com/sst/opencode
forgecode
- Docs: https://forgecode.dev/docs/
- GitHub: https://github.com/tailcallhq/forgecode
- Agent Client Protocol (related): https://github.com/forge-agents/forge
Ecosystem and write-ups
- Awesome Claude Code: https://github.com/hesreallyhim/awesome-claude-code
- VoltAgent awesome-claude-code-subagents: https://github.com/VoltAgent/awesome-claude-code-subagents
- Simon Willison on Codex subagents: https://simonwillison.net/2026/Mar/16/codex-subagents/
- Inside Claude Code architecture: https://www.penligent.ai/hackinglabs/inside-claude-code-the-architecture-behind-tools-memory-hooks-and-mcp/