30 - Agents
Earlier lectures covered the pieces in isolation: tool calling (lecture 26), MCP (lecture 27), skills (lecture 28). This lecture covers the layer that ties them together — agents and subagents — and how the dominant CLI coding harnesses (Claude Code, OpenAI Codex, opencode, forgecode) expose that layer to you.
The mental model is the same everywhere. The spelling differs.
- Tools are things the model can call.
- Skills are instructions the model loads on demand.
- MCP is how you plug in external capabilities.
- Agents are the loop that decides what to do next.
- Subagents are agents spawned by other agents, each with its own fresh context.
Once you understand subagents, the differences between Claude Code and Codex and opencode shrink to configuration syntax.
The Problem Subagents Solve
Context windows are finite. Even with 1M tokens, you run out because most of what an agent sees during a task is exploration output it will never reference again.
Consider three everyday moments in a coding session:
- Running a test suite that produces 10,000 lines of log output — most of which is dots and timings. You only care about the failures.
- Searching a 500-file codebase for every caller of normalizeUser(). You want the list of call sites, not the 50 surrounding lines of each file.
- Fetching a documentation page to check whether a library supports a specific flag. Once you have the answer, the rest of the page is dead weight.
If the main agent does this work itself, its context fills with garbage. Every subsequent turn pays for tokens that have no bearing on the next decision. Eventually the window compacts (lossily), or you hit the turn budget, or the signal-to-noise drops so far that the model starts making mistakes.
A subagent fixes this. The parent delegates the noisy work to a child agent with its own context. The child does the exploration, builds whatever working memory it needs, then returns a summary — typically a few hundred tokens. The 10,000-line log, the 500 search hits, the full doc page — none of it touches the parent's window.
The subagent's context is thrown away when it returns. That is the point.
Anatomy of an Agent
Before subagents, pin the vocabulary. An agent is a loop: read the current conversation, generate a response (text, tool calls, or both), execute any tool calls, append the results, repeat. Lecture 01 built one from scratch.
An agent has six configurable pieces:
- System prompt — the long-standing instructions that define role, constraints, output style.
- Tools — what it can call. From lecture 26.
- MCP servers — external tool collections it can attach. From lecture 27.
- Skills — on-demand instruction modules. From lecture 28.
- Model — which LLM runs the loop. Opus, Sonnet, Haiku, GPT-5.4, etc.
- Turn / token budget — when to stop.
A subagent is all of the above, packaged as a child process the parent can spawn.
The Shared Subagent Contract
Before the platform-specific details, here is what every subagent system agrees on:
- The subagent gets its own context window. It sees its own system prompt, its own input task, its own tool results. It does not see the parent's conversation history.
- The parent hands it a task description, not raw state. You design the task brief like you design a function signature.
- The subagent has its own tools and permissions. Often narrower than the parent's — a read-only Explore subagent cannot write files.
- Subagents typically cannot spawn further subagents. Nesting depth is 1 by default across all four platforms. This prevents runaway fan-out.
- Siblings don't share memory. If you spawn two subagents in parallel, neither sees the other's work until both return and the parent merges their summaries.
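The "task brief as function signature" point is worth making concrete. A sketch of what a well-specified brief carries — the field names are illustrative, not any platform's API:

```python
from dataclasses import dataclass, field

@dataclass
class TaskBrief:
    """What the parent hands a subagent: everything it needs, nothing more."""
    goal: str                       # the one thing to accomplish
    inputs: list[str]               # paths, queries, or facts the child starts from
    return_format: str              # the contract for the summary coming back
    constraints: list[str] = field(default_factory=list)  # e.g. "read-only"

brief = TaskBrief(
    goal="Find every caller of normalizeUser()",
    inputs=["src/"],
    return_format="JSON list of {file, line} call sites, no surrounding code",
    constraints=["read-only tools only"],
)
```

Because the child never sees the parent's history, anything missing from the brief is simply unknown to it — exactly like a function that only sees its arguments.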
Here is how the four platforms name and configure the same idea:
| Platform | Term | Config location | Format |
|---|---|---|---|
| Claude Code | subagent | .claude/agents/, ~/.claude/agents/ | Markdown + YAML frontmatter |
| OpenAI Codex | custom agent / subagent | .codex/agents/, ~/.codex/agents/ | TOML |
| opencode | agent (mode: subagent) | .opencode/agents/, ~/.config/opencode/agents/, opencode.json | Markdown + YAML, or JSON |
| forgecode | agent (project via AGENTS.md) | project root, ~/.forge/ | Markdown + .forge.toml |
The rest of this lecture drills into each.
Claude Code
Claude Code is Anthropic's reference implementation and has the most mature subagent story. If you learn the Claude Code model, the others map onto it with small renames.
Official documentation: https://code.claude.com/docs/en/sub-agents
Built-in subagents
Three are always available:
- Explore — runs on Haiku, read-only tools, optimized for codebase search and file discovery. Claude delegates to Explore automatically when it needs to understand code without changing it. You can hint a thoroughness level (quick, medium, very thorough) in the task.
- Plan — used during plan mode. Read-only. Gathers context before presenting a plan. Subagents cannot spawn subagents themselves, which prevents infinite nesting.
- general-purpose — all tools, inherits the parent's model. For multi-step tasks that need both exploration and modification.
Two helpers fire automatically for specific slash commands: statusline-setup (Sonnet, runs on /statusline) and Claude Code Guide (Haiku, answers Claude-Code-feature questions).
Creating a custom subagent
Run /agents inside Claude Code — it opens an interactive builder that generates the name, description, and system prompt, asks which tools to allow, picks a model, and saves the file. That is the recommended path.
Under the hood, a subagent is just a Markdown file with YAML frontmatter:
---
name: code-reviewer
description: Reviews code for quality, security issues, and best practices. Use after any code change.
tools: Read, Glob, Grep, Bash
model: sonnet
---
You are a senior code reviewer. When invoked:
1. Run `git diff HEAD` to see recent changes.
2. Review the diff for correctness, security issues, and style.
3. Return a prioritized list of findings. Flag any secrets, SQL injection risks, or missing error handling.
4. Suggest concrete fixes, not vague advice.
The frontmatter drives selection and tooling. The body is the system prompt.
Frontmatter fields
Only name and description are required. The full set:
| Field | Purpose |
|---|---|
| name | Lowercase, hyphens only. Filesystem-unique. |
| description | When Claude should delegate. This is the trigger. Agents undertrigger — be explicit. |
| tools | Allowed tool list. Omit to inherit all. |
| disallowedTools | Deny list subtracted from the inherited or specified set. |
| model | sonnet, opus, haiku, a full model ID, or inherit (default). |
| permissionMode | default, acceptEdits, auto, dontAsk, bypassPermissions, plan. |
| maxTurns | Cap on agentic turns before the subagent stops. |
| skills | Skills to load into the subagent at startup. Subagents do not inherit skills from the parent. |
| mcpServers | MCP servers this subagent can reach. |
| hooks | Lifecycle hooks scoped to this subagent. |
| memory | user, project, or local — persistent memory across sessions. |
| background | Run as a background task (default false). |
| effort | low, medium, high, max. Overrides session effort. |
| isolation | Set to worktree to spawn in a temporary git worktree, isolating edits. |
| color | Display color in the UI. |
| initialPrompt | Auto-submitted first user turn when the subagent is the main session agent. |
The description is the most important field. The parent agent picks a subagent by matching the task to its description. Vague descriptions never trigger — the same failure mode as skills in lecture 28.
Scope and priority
Subagents live in multiple places. When names collide, the higher-priority scope wins:
- Managed settings (organization-wide) — highest.
- --agents CLI flag (JSON, session-only).
- .claude/agents/ (project, checked into git).
- ~/.claude/agents/ (user, all projects).
- Plugin agents/ directory — lowest.
Project subagents are the right home for team-shared reviewers and domain experts. User subagents are right for personal helpers you use across all projects. The --agents flag is useful for automation scripts and quick tests:
claude --agents '{
"code-reviewer": {
"description": "Expert code reviewer. Use proactively after code changes.",
"prompt": "You are a senior code reviewer. Focus on quality, security, best practices.",
"tools": ["Read", "Grep", "Glob", "Bash"],
"model": "sonnet"
}
}'
Worktree isolation
Set isolation: worktree and the subagent runs in a temporary git worktree — its own copy of the repo. Edits don't touch the parent's working tree until you merge. This is how Claude Code supports parallel implementation: spawn two subagents on the same codebase, each in its own worktree, let both edit without colliding, compare results.
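The underlying mechanism is ordinary git. A sketch of what worktree isolation amounts to in plain git — the actual implementation is Claude Code internals, and the paths and branch names here are illustrative:

```shell
# Set up a throwaway repo for the demonstration.
repo=$(mktemp -d); wt="$repo-wt"
cd "$repo"
git init -q -b main
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "init"

# Spawn-time: the subagent gets its own checkout on its own branch.
git worktree add -q "$wt" -b subagent/a

# The subagent edits freely without touching the parent's working tree.
echo "body { color-scheme: dark; }" > "$wt/theme.css"
git -C "$wt" add theme.css
git -C "$wt" -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "subagent edit"

# Return-time: the parent inspects the result and decides whether to merge.
git diff --stat main subagent/a

# Cleanup: drop the worktree; the branch keeps the commit if you want it.
git worktree remove --force "$wt"
```

Two worktrees of the same repo share history but have independent working trees, which is exactly the property parallel implementation needs.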
Subagents vs agent teams
Claude Code also has a separate feature called agent teams, for cases where you need multiple agents communicating across sessions (e.g. long-running background work). Subagents are a single-session concept. For the coordination patterns in this lecture, subagents are the right tool. See https://code.claude.com/docs/en/agent-teams for the team model.
SDK users access the same primitives from code via the Claude Agent SDK: https://platform.claude.com/docs/en/agent-sdk/subagents
OpenAI Codex
Codex CLI is OpenAI's terminal coding agent. It adds subagents as first-class citizens but with one important philosophical difference from Claude Code: Codex only spawns a subagent when you explicitly ask it to. There is no automatic delegation based on description matching.
Official docs: https://developers.openai.com/codex/subagents · CLI: https://developers.openai.com/codex/cli
AGENTS.md — the project instructions layer
Before subagents, Codex has AGENTS.md, the equivalent of Claude Code's CLAUDE.md. Codex reads it before every task.
Two scopes:
- ~/.codex/AGENTS.md — your personal defaults across every project.
- Project-level AGENTS.md in the repo root and any subdirectory. Instructions cascade: files closer to the working directory override earlier ones. You can also drop an AGENTS.override.md to replace higher-level instructions wholesale.
Typical pattern:
- Root AGENTS.md: "Run npm run lint before opening a PR."
- services/payments/AGENTS.override.md: "Use make test-payments instead of npm test."
When Codex works in the payments directory, both apply, with the override replacing the parent's test directive.
Full spec: https://developers.openai.com/codex/guides/agents-md
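The cascade half of the lookup is simple to picture: walk from the repo root down to the working directory, collecting AGENTS.md files in order, with later files taking precedence. A sketch under that assumption — override-file semantics are omitted, and this walk is illustrative, not Codex's implementation:

```python
from pathlib import Path

def collect_agents_md(repo_root: str, workdir: str) -> list[Path]:
    """AGENTS.md files from repo root down to workdir, nearest last.
    Later entries take precedence over earlier ones."""
    root, cwd = Path(repo_root).resolve(), Path(workdir).resolve()
    # Directories on the path from root to cwd, inclusive.
    chain = [p for p in [cwd, *cwd.parents] if p == root or root in p.parents]
    chain.reverse()                       # root first, working directory last
    return [d / "AGENTS.md" for d in chain if (d / "AGENTS.md").exists()]
```

Feeding the list to the model root-first means the deepest file "wins" simply by appearing last in context.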
Custom subagents
TOML files in ~/.codex/agents/ (personal) or .codex/agents/ (project). Each file defines one agent.
Required fields: name, description, developer_instructions. Optional: model, model_reasoning_effort, sandbox_mode, mcp_servers, skills.config, nickname_candidates.
name = "reviewer"
description = "PR reviewer focused on correctness, security, and missing tests."
model = "gpt-5.4"
model_reasoning_effort = "high"
sandbox_mode = "read-only"
developer_instructions = """
Review code like an owner.
Prioritize correctness, security, behavior regressions, and missing test coverage.
"""
nickname_candidates = ["Atlas", "Delta", "Echo"]
A custom agent file accepts the same settings as a normal Codex session config, so you have the full TOML surface to configure sandbox, MCP servers, and model parameters per agent.
Invocation
You ask for a subagent by name, in natural language:
Have pr_explorer map the affected code paths in this PR.
Spawn one reviewer agent per changed service and consolidate findings.
No @mention, no slash command. Codex parses your request and orchestrates accordingly.
Global agent settings
In ~/.codex/config.toml:
[agents]
max_threads = 6 # concurrent subagents allowed
max_depth = 1 # default: subagents cannot spawn subagents
job_max_runtime_seconds = 1800
Parallel fan-out
When you ask for several agents at once, Codex spawns them in parallel and waits until all return before consolidating:
Spawn three reviewers — security, performance, docs — and merge their findings.
Each inherits the parent's sandbox policy. Output consolidates into a single response.
Codex + Agents SDK
Codex exposes the CLI as an MCP server, which means you can orchestrate it from the OpenAI Agents SDK for deterministic, multi-stage pipelines — plan → implement → review → deploy coded as an Agents SDK graph that invokes Codex at each stage. See https://developers.openai.com/codex/guides/agents-sdk.
opencode
opencode is the open-source terminal coding agent from the SST team. Its core distinction is the explicit mode field on every agent: primary, subagent, or all.
Official docs: https://opencode.ai/docs/agents/ · GitHub: https://github.com/anomalyco/opencode
Primary vs subagent
- Primary agents are directly invocable by the user. You switch between them with Tab or a keybind. Built-in primaries:
  - Build — default, all tools enabled.
  - Plan — analysis and planning; edit and bash permissions set to ask.
- Subagents are invoked by primaries (or by the user via @name). Built-in subagents:
  - General — multi-step research, full tool access except todo.
  - Explore — fast, read-only.
- all mode works either way. It is the default when mode is unspecified.
Three hidden system agents (Compaction, Title, Summary) run automatically behind the scenes.
Configuration
Two equivalent forms. JSON, in opencode.json:
{
"agent": {
"code-reviewer": {
"mode": "subagent",
"description": "Review the diff against our security checklist",
"model": "anthropic/claude-sonnet-4-20250514",
"permission": {
"edit": "deny",
"bash": "ask",
"webfetch": "allow"
}
}
}
}
Markdown, in ~/.config/opencode/agents/ (global) or .opencode/agents/ (project):
---
description: Security-focused reviewer
mode: subagent
model: anthropic/claude-sonnet-4-20250514
---
You are a security reviewer. Focus on injection, auth, and secret handling.
Invocation
- @code-reviewer — explicit @-mention.
- Automatic — the primary delegates based on the subagent's description, just like Claude Code.
- Task tool — one agent invokes another programmatically.
Fine-grained permissions
opencode offers finer per-command permission granularity than the others. You can allow git status * and ask for git push in the same config:
"permission": {
"bash": {
"*": "ask",
"git status *": "allow",
"git push": "ask"
},
"task": {
"*": "deny",
"code-reviewer": "ask"
}
}
permission.task lets a primary agent control which subagents it may invoke — useful when you want a Plan primary that can call Explore but not Build.
Set hidden: true on a subagent to remove it from the @ autocomplete menu while still letting other agents invoke it via the Task tool.
forgecode
Forge Code (forgecode.dev) is a multi-provider terminal harness. Its distinguishing feature is model flexibility: it talks to 300+ models via OpenRouter, plus direct OpenAI, Anthropic, and open-weight providers, and it lets you switch models mid-session with :model.
Official docs: https://forgecode.dev/docs/ · GitHub: https://github.com/tailcallhq/forgecode
Configuration
- Project instructions: AGENTS.md in the repo root (same file format as Codex).
- Settings: .forge.toml (main config), .mcp.json (MCP integration), .ignore (file exclusion).
- Skills: SKILL.md folders, compatible with the Agent Skills spec from lecture 28.
- Shell integration: zsh plugin that lets you trigger Forge from the shell prompt with :.
- Install: curl -fsSL https://forgecode.dev/cli | sh.
Subagents in forgecode
forgecode's subagent story is lighter than Claude Code's or Codex's. It supports custom agents configured via AGENTS.md and project-level config, but the automatic-delegation and parallel-fan-out patterns are less developed. Treat forgecode as the right choice when:
- You want a single harness across many providers (cloud, open-weight, local).
- You are optimizing cost by routing to cheaper models mid-session.
- You already know the Agent Skills spec and want to reuse skills across platforms.
Use Claude Code or Codex when subagent coordination is the core of your workflow. Check the current docs for the live state of subagent support — this part of the ecosystem is moving quickly.
Platform comparison at a glance
| Feature | Claude Code | Codex | opencode | forgecode |
|---|---|---|---|---|
| Subagent file format | Markdown + YAML | TOML | Markdown + YAML, or JSON | Markdown + TOML |
| Project location | .claude/agents/ | .codex/agents/ | .opencode/agents/ | project root + .forge.toml |
| User location | ~/.claude/agents/ | ~/.codex/agents/ | ~/.config/opencode/agents/ | ~/.forge/ |
| Project instructions file | CLAUDE.md | AGENTS.md | AGENTS.md | AGENTS.md |
| Automatic delegation | Yes (description match) | No (explicit only) | Yes (description match) | Limited |
| Parallel fan-out | Yes | Yes (max_threads) | Via Task tool | Limited |
| Worktree isolation | Yes (isolation: worktree) | No (sandbox modes) | No | No |
| Per-subagent model | Yes | Yes | Yes | Yes |
| Per-subagent tool/permission scope | Yes | Yes (sandbox) | Yes (fine-grained) | Partial |
| Built-in subagents | Explore, Plan, general-purpose | none pre-shipped | Build, Plan, General, Explore | none pre-shipped |
| Depth limit | 1 | 1 (max_depth) | 1 | 1 |
Coordination Patterns
Once you have subagents, three patterns recur constantly.
Fan-out research
The parent needs to learn several independent things. Spawn one subagent per thread of investigation, run them in parallel, merge their summaries.
Token math: if each subagent used 100K tokens of exploration, the parent saved 300K tokens. It sees ~1–2K tokens of merged summary instead.
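A parent orchestrating fan-out looks roughly like this — `spawn_subagent` stands in for whatever Task/spawn primitive your harness exposes, and here it just fabricates a summary:

```python
from concurrent.futures import ThreadPoolExecutor

def spawn_subagent(brief: str) -> str:
    """Stand-in for the harness's spawn primitive: in reality the child
    burns its own context on exploration and returns only a short summary."""
    return f"summary of: {brief}"

briefs = [
    "Find the settings page components; list files and structure.",
    "Search for theme / dark mode references; report what exists.",
    "Check whether the design system exposes color tokens.",
]

# Fan-out: siblings run concurrently and share no memory.
with ThreadPoolExecutor(max_workers=len(briefs)) as pool:
    summaries = list(pool.map(spawn_subagent, briefs))

# The parent merges only the summaries -- a few KB, not the raw exploration.
merged = "\n".join(f"- {s}" for s in summaries)
```

The keep-or-drop decision happens at the boundary: only `merged` enters the parent's context.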
Pipeline — research, plan, implement, review
Sequential, because each stage depends on the previous:
- Researcher (Explore subagent) — maps the relevant code. Returns a short context brief.
- Planner (Plan subagent) — designs the change. Returns a step-by-step plan.
- Implementer (main agent or general-purpose subagent) — writes the code.
- Reviewer (custom subagent) — checks the diff against a checklist. Returns findings.
Each stage is cheap because it sees only what it needs. The parent orchestrates.
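The same primitive, chained sequentially — each stage sees only the previous stage's summary, never the raw exploration. `spawn_subagent` again stands in for the harness's spawn primitive:

```python
def spawn_subagent(role: str, brief: str) -> str:
    """Stand-in: run a subagent with the given role, return its summary."""
    return f"[{role}] result for: {brief}"

def pipeline(task: str) -> str:
    # Each stage's output becomes the only context the next stage receives.
    context = spawn_subagent("explorer", f"Map code relevant to: {task}")
    plan = spawn_subagent("planner", f"Design the change. Context: {context}")
    diff = spawn_subagent("implementer", f"Execute this plan: {plan}")
    findings = spawn_subagent("reviewer", f"Check this diff: {diff}")
    return findings  # the parent only ever held four short summaries
```

The parent's job reduces to threading summaries between stages and deciding when the pipeline is done.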
Isolated parallel edits
Two subagents, same task, different approaches. Each runs in its own git worktree (Claude Code's isolation: worktree). The parent compares the results and picks the better diff. Useful when you want A/B approaches without merge conflicts.
Delivering a Feature End-to-End
Worked example: "Add dark mode to the settings page." Here is what happens when you give that prompt to a subagent-aware harness.
Step 1 — Main agent triages. Reads the user request. Decides it needs to understand (a) how the settings page is built and (b) what theming already exists.
Step 2 — Fan-out research. Spawns two parallel Explore subagents:
- Subagent A: "Find the settings page components. List files and the structure."
- Subagent B: "Search for theme, dark mode, color scheme references. Report what's there and what's missing."
Each subagent burns tens of thousands of tokens reading files. Each returns ~500 tokens of summary. The parent sees only the summaries.
Step 3 — Plan. Main agent synthesizes the two summaries into a plan: "Extend existing ThemeProvider, add a toggle to SettingsLayout, add a prefers-color-scheme media listener." No Plan subagent needed for a small change; for larger work, delegate to Plan.
Step 4 — Implement. Main agent makes the edits directly. It has the plan, it has the file paths, the context is clean.
Step 5 — Test. Spawns a subagent: "Run npm test and report failures." Test output can be 5,000+ lines. The subagent reads the log, distills to "3 tests failed, all in theme.test.ts, cause: missing default color on the color-scheme CSS var." The raw log stays in the subagent.
Step 6 — Fix. Main agent fixes the three tests based on the distilled summary.
Step 7 — Review. Spawns a code-reviewer subagent on the diff. It returns a findings list: "Theme toggle missing aria-label; localStorage access not guarded for SSR." Two small fixes.
Step 8 — Commit. Main agent commits.
What made this work: at no point did the main agent's context contain 500-file search dumps or 10K test log lines. It saw only the essential summaries, the plan, the diff, and the findings. That kept the reasoning crisp across eight steps.
Context Management Deep Dive
A concrete comparison. Suppose your session has 20 MCP tools attached. Each tool definition is roughly 1,000–1,400 tokens. MCP tools load on every turn (they are part of the system prompt), so a 40-turn session pays 20 × 1,200 × 40 ≈ 960,000 tokens just to define tools you might not use.
Replace the MCP servers with skills that invoke a smaller set of deterministic scripts. Each skill costs ~100 tokens for discovery at startup. Only the active skill loads its full body (~2–5K tokens). Over 40 turns: 20 × 100 + ~3,000 = ~5,000 tokens of tool overhead.
Now add subagents. The parent has 5 tools. It delegates verbose exploration to subagents. The subagents each load their own skills (skills do not inherit from the parent; each subagent starts with its own clean skill roster). The parent's per-turn overhead stays near 5 tools × 500 tokens = 2,500 tokens.
The headline: tools, skills, and subagents compose to keep the parent's context budget for reasoning, not for reading noise.
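The arithmetic in this section, spelled out:

```python
# MCP baseline: 20 tool definitions re-sent on every one of 40 turns.
mcp_overhead = 20 * 1_200 * 40          # tools x tokens/definition x turns
assert mcp_overhead == 960_000

# Skills: ~100-token discovery stubs at startup, plus one ~3K active body.
skills_overhead = 20 * 100 + 3_000
assert skills_overhead == 5_000

# Lean parent with 5 tools: per-turn definition overhead only.
parent_per_turn = 5 * 500               # tools x tokens/definition
assert parent_per_turn == 2_500
```

Roughly a 200x reduction in tool-definition overhead between the first and second configurations — before subagents even enter the picture.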
Two important caveats:
- Subagent output may be truncated. Many harnesses cap subagent return payloads at 10–30K characters. Design your task brief so the summary fits. If you need the full thing, design the subagent to write it to a file and return the path.
- Sometimes you need the verbose output back in the parent. If the parent has to quote exact error lines, or reason about precise diffs, a summary loses fidelity. In those cases, do not subagent — handle it inline, or have the subagent produce a file the parent reads directly.
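The file-handoff workaround from the first caveat, sketched — the function, its return shape, and the log content are all illustrative, not any harness's API:

```python
import tempfile

def run_tests_subagent() -> dict:
    """Sketch: the subagent writes the full log to disk and returns only a
    compact summary plus the path; the parent reads the file only if needed."""
    # Stand-in for thousands of lines of real test output.
    full_log = "\n".join(f"test {i}: ok" for i in range(5000))
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as f:
        f.write(full_log)
        path = f.name
    return {
        "summary": "5000 tests ran, 0 failures",  # fits any return-size cap
        "full_log_path": path,                    # full fidelity on demand
    }

result = run_tests_subagent()
```

The summary always fits the parent's budget; the path gives it an escape hatch when exact lines matter.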
When to Use Subagents
- Verbose side-quests. Test logs, log tailing, large search results, doc pages, large file surveys.
- Independent parallel research. Three questions that don't depend on each other — answer them concurrently.
- Specialized personas. A security reviewer, a SQL reviewer, a docs writer. Each gets a tight system prompt that would bloat your main agent if inlined.
- Cost routing. Explore on Haiku, plan on Sonnet, stay on Opus for the heavy reasoning turns.
- Long-running work. Background subagents let the main conversation keep moving while a migration analysis runs.
- Isolated experimentation. Worktree-isolated subagents try a risky refactor without touching the main tree.
When NOT to Use Subagents
- Tightly coupled reasoning. If every step needs to see every previous step, a subagent's "summary back to parent" loses information and adds a round-trip.
- Tasks smaller than the summary. If the whole task is five lines, spawning a subagent costs more than doing it inline.
- Interactive clarification. Subagents cannot ask the user questions. If the task needs clarification, the parent must handle it.
- Deterministic, reproducible execution. A skill with a script is more reliable than a subagent being asked to "run the validator." Use skills (lecture 28) for deterministic work; use subagents for judgment.
- When the output IS the artifact. If the parent needs the full file, not a summary, a subagent adds a layer for no gain. Read the file directly.
Common Pitfalls
- Vague descriptions. The parent selects subagents by matching the task to each subagent's description field. Generic descriptions never fire. Be explicit about when to use the subagent, what keywords should trigger it, and what makes it different from siblings. Same trap as skill descriptions in lecture 28.
- Summaries that drop the crucial detail. Plan what the subagent must return. "Report findings" is too vague. "Return a JSON list of {file, line, severity, message} for each issue, max 20 entries" is enforceable.
- Assuming siblings share memory. If subagent A discovers something subagent B needs, you must either serialize them (pipeline, not fan-out) or have the parent pass A's result into B's task brief.
- Skill re-load cost. Every subagent loads its own skills. Ten subagents with five skills each can add up. Prune aggressively.
- Over-eager delegation. Not every task deserves a subagent. Four-line fixes, trivial answers, and "what does this constant mean" questions belong inline.
- Forgetting the depth limit. Subagents normally cannot spawn further subagents (depth 1). If your plan depends on three-level nesting, you need to rearrange as a pipeline or lift work up to the parent.
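The "enforceable summary" point is cheap to implement: validate the subagent's return payload before trusting it. A sketch using the {file, line, severity, message} shape from the pitfall above — the validator itself is illustrative:

```python
import json

REQUIRED = {"file", "line", "severity", "message"}

def parse_findings(payload: str, max_entries: int = 20) -> list[dict]:
    """Reject malformed subagent summaries instead of silently merging them."""
    findings = json.loads(payload)
    if not isinstance(findings, list) or len(findings) > max_entries:
        raise ValueError(f"findings must be a list of at most {max_entries} entries")
    for entry in findings:
        missing = REQUIRED - entry.keys()
        if missing:
            raise ValueError(f"finding missing fields: {sorted(missing)}")
    return findings

ok = parse_findings('[{"file": "auth.py", "line": 42, '
                    '"severity": "high", "message": "unescaped SQL"}]')
```

A failed parse is a signal to re-prompt the subagent with the format requirement restated, rather than to guess at what it meant.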
References
Claude Code
- Subagents: https://code.claude.com/docs/en/sub-agents
- Agent SDK subagents: https://platform.claude.com/docs/en/agent-sdk/subagents
- Agent teams: https://code.claude.com/docs/en/agent-teams
- Context window visualization: https://code.claude.com/docs/en/context-window
- Plan mode in common workflows: https://code.claude.com/docs/en/common-workflows
OpenAI Codex
- Subagents: https://developers.openai.com/codex/subagents
- AGENTS.md: https://developers.openai.com/codex/guides/agents-md
- Agents SDK integration: https://developers.openai.com/codex/guides/agents-sdk
- CLI overview: https://developers.openai.com/codex/cli
- CLI reference: https://developers.openai.com/codex/cli/reference
- Introducing Codex: https://openai.com/index/introducing-codex/
opencode
- Agents: https://opencode.ai/docs/agents/
- Commands: https://opencode.ai/docs/commands/
- Config: https://opencode.ai/docs/config/
- CLI: https://opencode.ai/docs/cli/
- GitHub: https://github.com/sst/opencode
forgecode
- Docs: https://forgecode.dev/docs/
- GitHub: https://github.com/tailcallhq/forgecode
- Agent Client Protocol (related): https://github.com/forge-agents/forge
Ecosystem and write-ups
- Awesome Claude Code: https://github.com/hesreallyhim/awesome-claude-code
- VoltAgent awesome-claude-code-subagents: https://github.com/VoltAgent/awesome-claude-code-subagents
- Simon Willison on Codex subagents: https://simonwillison.net/2026/Mar/16/codex-subagents/
- Inside Claude Code architecture: https://www.penligent.ai/hackinglabs/inside-claude-code-the-architecture-behind-tools-memory-hooks-and-mcp/