Mastering Claude Code — the technical guide
A Practical Guide for the Curious
A friend-to-friend guide to getting more from Claude Code than it gives you out of the box.
What you’re reading. This is a written guide with tables, diagrams, and a small bonus prompt at the very end that’s not meant for you — it’s meant for an AI you can paste it into. There’s also a narrated version of this same guide (same words, fewer tables) if you’d rather listen than read.
Why this exists. Most people who hear about Claude Code paste a question into it, get an answer, and move on. That works, but it leaves nine-tenths of the tool on the shelf. This guide is for the moment you decide to pick up the rest of it.
Sharing. Pass this along. The whole point is to make this less of a secret club and more of a shared toolkit.
How to read this
The guide is in four parts.
Part 1: Getting Started assumes you’ve heard the name “Claude Code” and not much else. It explains what the tool actually is, what you’ll need, and walks you through a first session so the rest of the guide has somewhere to land.
Part 2: The Foundations is the heart of it. Five chapters covering the five things you build once and then use every session forever: a standing letter to your assistant (CLAUDE.md), a long-term knowledge store (Memory), one-command procedures (Skills), rules your assistant cannot skip (Hooks), and delegating to specialists (Subagents). Each chapter is independent — read in any order.
Part 3: Composition is four chapters about combining the foundations into real workflows: state at the right scope, work that runs while you sleep, getting a second opinion automatically, and choosing tools well. These build on the foundations and refer back freely.
Part 4: Putting It Together walks through a single person’s setup across all nine pillars, suggests an adoption order, and tells you when not to bother.
Annex at the end is a block of text for an AI, not for you. It’s a copy-and-paste prompt you can drop into a fresh Claude session to bring the assistant up to speed on this guide. Ignore it unless you want it.
| Part | Chapters | What you’ll get |
|---|---|---|
| 1. Getting Started | 0 | Enough background to follow the rest |
| 2. Foundations | 1–5 | The five building blocks of a real setup |
| 3. Composition | 6–9 | How to combine them into useful workflows |
| 4. Putting It Together | — | A walked example, adoption order, caveats |
Part 1 — Getting Started
Chapter 0 — What Claude Code actually is
Claude Code is a program you run in your terminal that talks to Claude (the AI). That’s the headline. The detail is what makes it interesting.
When you use Claude through a web browser, you get a conversation window: you type, Claude replies, you type again. When you use Claude Code, you get the same conversation plus the ability for Claude to read files on your computer, run commands, edit files, and remember things between sessions — provided you set those things up. It’s the difference between asking a friend for advice over the phone and giving them the keys to your kitchen.
Why bother going past the browser
Most people who try Claude in the browser run into the same wall: every conversation starts from zero. You have to re-explain your project, re-state your preferences, re-paste the same files. By the third or fourth conversation, you stop bothering with anything complicated and use it for one-off questions only.
Claude Code lets you build a working environment that remembers, enforces rules, and does work while you’re not at the keyboard. The investment to set up each piece is small. The pay-off is that conversation number five doesn’t start at the same place conversation number one did — it starts wherever you left off, knowing what you’ve already established.
A few words you’ll see throughout
| Word | What it means in plain English |
|---|---|
| Session | One run of Claude Code in your terminal, from open to close. Like one phone call. |
| Prompt | A message you type to Claude. Same as in the browser. |
| Context | Everything Claude can see right now — your prompts, its replies, anything it’s read so far. A finite scratch-pad. |
| Tool | A specific action Claude can take, like “read this file” or “run this command”. |
| Agent (also “subagent”) | A second copy of Claude you can hand a task to, briefed cold, while your main session keeps doing other things. |
| Memory | A folder of small text files Claude reads at the start of every session to remember what matters across conversations. |
| Skill | A named procedure you’ve written down once that Claude can run on demand by typing /your-skill. |
| Hook | A small script that fires automatically at certain moments — like before any file is written — to enforce a rule. |
| CLAUDE.md | A plain-text file at the root of your project that Claude reads at the start of every session. Like a standing letter. |
If those last few don’t make sense yet, don’t worry — each one has its own chapter. The table is here so you can come back to it.
What you’ll need
- A computer running macOS, Linux, or Windows (via WSL).
- A terminal — the black-text-on-a-screen window. Comfort with it at the level of “I can change directories and run a command” is enough.
- A Claude account. The Pro plan ($20/mo) is the usual starting point; you can also pay per use through Anthropic’s API if you’d rather.
- About thirty minutes to install and run a first session.
Your first session, walked through
The official install instructions live at the Claude Code documentation; they change occasionally, so I won’t try to mirror them here. Once installed, this is what a first session looks like.
You open your terminal, change to a folder you care about — let’s say
a folder containing some notes you’ve been making for a side project —
and type claude. A prompt appears. You type:
What’s in this folder?
Claude reads the directory and tells you what it found. You type:
Open ideas.md and summarise the three most recent entries.
Claude opens the file, reads it, and produces a summary. So far this is what the browser does too.
Now the difference. You type:
Save my preferences: I want concise answers, no preamble, and bullet points only when they actually help.
Claude, by default, doesn’t have a way to do that — but with a memory store set up (Chapter 2), it can write that to a small file and read it at the start of every future session. From that point on, every conversation knows that about you. You stop saying it.
That’s the rhythm of the whole guide: each pillar makes one piece of “I have to say this every time” stop happening.
Part 2 — The Foundations
These five chapters are the building blocks. Read them in any order. Pick the one that matches your current frustration.
Chapter 1 — CLAUDE.md: a standing letter to your assistant
The idea
When you open Claude Code in a folder, it looks for a file called
CLAUDE.md and reads it first — every single time, no matter
what you’re about to ask. Think of it as a one-page letter you’ve left
on the desk for the assistant: here’s what this place is, here are
the rules of the house, here’s where things live. The assistant
doesn’t have to be told all of that again; it’s already on the desk when
they walk in.
Why bother
Without it, every session starts at zero on your project. You re-explain the directory layout, re-state the conventions, re-mention the thing-that-must-not-be-touched. Halfway through the third re-explanation in a week, you realise you’ve been describing the same workspace over and over and the assistant still doesn’t know it.
With it, the assistant arrives already oriented. The first minute of every session — the re-onboarding minute — disappears. Multiplied over a year of sessions, that’s real time.
Where the file lives
There are two scopes, and you can use both:
~/.claude/CLAUDE.md ← user-wide; applies to every session anywhere
your-project/CLAUDE.md ← project-only; applies when you open this folder
The user-wide file is for things that are true about you regardless of project: how you like to be talked to, your preferred file formats, any rule that should hold everywhere. The project file is for things specific to that project: the layout, the glossary, the gotchas.
Both files are loaded together when you open the project. The project file extends the user file; it doesn’t silently overwrite it.
What earns a place in the file
| Belongs in CLAUDE.md | Does not |
|---|---|
| Two-line description of what this project is | A full description of every directory |
| The three or four commands you actually run | Every command anyone could possibly run |
| House conventions (naming, file organisation) | Things explained well by the code itself |
| Domain glossary — terms with specific meaning here | A textbook on the domain |
| A few gotchas that have bitten before | Every conceivable gotcha |
| Pointers to the rest of your setup (“Memory lives at…”) | The full contents of those other places |
The discipline is length. Every word in CLAUDE.md costs you context budget on every session, whether or not today’s task needs that word. Aim for fifty to a hundred and fifty lines. If you’re over two hundred, start cutting.
A starting template
# Project: [Name]
## What this is
[One or two sentences. Why this folder exists.]
## Where things live
- `src/` — main code
- `docs/` — written material
- `data/` — datasets
## Commands I actually run
- Build: `make build`
- Test: `pytest tests/`
- Deploy: see `/deploy` skill
## Gotchas
- The staging server runs on a different database than production.
Always confirm which one you're talking to.
## Pointers
- Memory: `~/.claude/projects/this-project/memory/MEMORY.md`
- Skills: `.claude/skills/`

Ten or twenty lines is a fine starting size. Grow it by noticing the things you keep re-explaining and adding them one at a time.
What does not belong here
Secrets. This file is plain text on disk, almost certainly checked into version control at some point. API keys, passwords, tokens — none of these go in CLAUDE.md. They belong in a password manager or environment variable.
Current task state. “We’re on step three of seven” is true today and stale tomorrow. That’s plan-state (Chapter 6), not a standing instruction.
Hard rules that must never be broken. If the consequence of skipping a rule is real — a corrupted file, a deleted dataset, a leaked credential — that rule belongs in a Hook (Chapter 4), not in text the assistant should read but might miss when its attention is elsewhere.
Failure modes
The 5,000-word file. It started at twenty lines and grew over a year into a reference manual. Now it loads on every session, eats context, and the assistant has learned to skim it because there’s too much. Quarterly trim. Move stable knowledge to memory; move procedures to skills; move enforcement to hooks.
Mixing user-wide and project-specific. Your communication preferences end up in one project’s file because that’s where you happened to write them. Six months later you’re in a different project and the assistant doesn’t know them. Put cross-project content in the user-wide file. Put project content in the project file.
Writing it like a README. A README is for humans new to the project — it explains, motivates, walks through installation. CLAUDE.md is for the assistant — terse, directive, focused on what’s needed to behave correctly. Conflating them produces a file that does neither job well.
When to use this vs the other foundations
| If the thing is… | Put it in… |
|---|---|
| Always-relevant context the assistant should arrive knowing | CLAUDE.md |
| Knowledge that accumulates and should be retrieved by topic | Memory (Chapter 2) |
| A procedure you’ll invoke on demand | Skill (Chapter 3) |
| A rule that must hold regardless of what the model decides | Hook (Chapter 4) |
Chapter 2 — Memory: knowledge that survives the session
The idea
Memory is a folder of small text files that Claude reads at the start of each session. Unlike CLAUDE.md (which loads in full, every time), memory is indexed — Claude reads the index, then opens only the files relevant to today’s task. It’s the difference between leaving a one-page letter on the desk and giving the assistant access to a filing cabinet they can open by labelled drawer.
Why bother
CLAUDE.md is short by design. Memory is where the bulk of your “what I’d otherwise have to re-explain” lives — your preferences, the lessons learned from past mistakes, the location of credentials, the quirks of a vendor’s API, the decisions already made on each active project.
A memory store that started small at week one feels indispensable by month six. The compounding is real: every session adds a tiny bit of accumulated knowledge that all future sessions can draw on.
The four kinds of memory
There are exactly four types, and the distinction matters because each type behaves differently as it ages.
| Type | What goes here | Example |
|---|---|---|
| user | Who you are and how you prefer to work | “I want concise answers and bullet points only when they help.” |
| feedback | A lesson learned from a mistake — what went wrong and the rule to apply next time | “When I ask for a summary, don’t add a ‘next steps’ section unless I asked.” |
| project | Where files live for an ongoing effort, decisions already made, integrations in use | “The kitchen reno is in fit-out phase; cabinets arrive Friday; certifier needs 48h notice.” |
| reference | Stable external facts you’d otherwise have to look up | “Vendor X’s quoting endpoint is at /api/v3/quotes, not /quotes.” |
The two types you’ll use most are feedback and project. Feedback compounds — every mistake you bother to write down is a regression prevented across every future session. Project files are where the running context of “where am I on the kitchen reno / book draft / consulting engagement” sits.
How it’s organised
The store lives at
~/.claude/projects/<some-slug>/memory/. Inside that
folder:
memory/
├── MEMORY.md ← the index (one row per file)
├── me.md ← a user file
├── feedback_concise.md ← a feedback file
├── reno_kitchen.md ← a project file
└── vendor_x_api.md ← a reference file
MEMORY.md is a table. Each row is one file: short
description, hashtags, link. Claude reads this index first and uses the
hashtags to decide which other files to open.
| File | Description | Tags |
|------|-------------|------|
| me.md | Who I am, output preferences | #user #prefs |
| feedback_concise.md | Don't add unrequested "next steps" sections | #feedback #style |
| reno_kitchen.md | Kitchen reno — phase tracker, vendors, dates | #project #renovation |
| vendor_x_api.md | Quoting endpoint quirks at Vendor X | #reference #api |

Each file has a short header at the top (called frontmatter) and then a body:
---
name: Kitchen Renovation
type: project
tags: [renovation, kitchen]
---
# Kitchen Renovation
Phase: Fit-out (started Mon 12 May)
Cabinets: ordered, delivery Fri 23 May
Certifier: 48 hours notice required for any change.
…

How to start
You don’t need all four types on day one. The right order is:
- One user file. Call it me.md. Put in it: who you are in a sentence or two, your preferred output style, any rule that holds across everything you do. Five to ten lines is enough.
- One project file for whatever’s most active. Where things live, what decisions are settled, what’s outstanding. Ten to twenty lines.
- A feedback file the first time the assistant does something you want it to stop doing. Write it the moment the friction happens, while it’s fresh. Short header explaining what went wrong, short “do this instead” rule.
- A reference file the second time you find yourself looking up the same external fact. Once is fine; twice means it belongs in memory.
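Step one might produce something like this — a sketch only; the frontmatter shape mirrors the project-file example above, and every detail is a placeholder to replace with your own:

```markdown
---
name: Me
type: user
tags: [user, prefs]
---
# Me
- Product designer; most sessions are about specs, notes, and prototypes.
- Concise answers, no preamble; bullets only when they genuinely help.
- Flag thin evidence instead of smoothing over it.
```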
Claude reads MEMORY.md automatically when it’s present.
If you want certain files loaded on every session regardless, that’s
what CLAUDE.md is for — point at memory and let the index do the
rest.
Failure modes
Saving things the codebase already knows. A note
saying “the main function is in src/app.py on line 42” is
duplicating what grep would tell you in less time than it
took to write the note. The codebase moves; the note doesn’t. Save
decisions, conclusions, and external facts. Don’t save things the
project itself can answer.
Stale “current state” memory. “We’re in fit-out phase, cabinets arrive Friday” is true this week and wrong next week. That kind of state belongs in a plan file (Chapter 6), not memory. Memory carries policies and standing facts; plans carry progress.
Index bloat. Every entry in the index adds a tiny bit of cognitive overhead on every session. A 25-file index that’s been pruned beats a 60-file one with 20 stale files. Skim quarterly; archive what’s no longer relevant.
Two files saying nearly the same thing. They will drift apart. Pick one canonical home for each piece of knowledge and point from the other if you need to.
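The quarterly skim can be partly automated. A sketch — `MEMORY_DIR` is a placeholder path; point it at your actual store:

```shell
# List memory files untouched for 90+ days — candidates to archive.
# MEMORY_DIR is an assumed example path; substitute your own memory folder.
MEMORY_DIR="${MEMORY_DIR:-$HOME/.claude/projects/demo/memory}"
find "$MEMORY_DIR" -name '*.md' -mtime +90 2>/dev/null
```

Anything it prints hasn’t been touched in a quarter — read it once more, then archive it or delete it (and prune its row from the index).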
A worked example
A researcher writes a doctoral thesis over eighteen months. She makes
a user file noting she prefers inline citations over
footnotes and wants her claims flagged when evidence is thin. She makes
a project file describing her research question, the
structure of her argument chapters, and the location of her bibliography
database.
Three months in, she adds a feedback file: when she asks
for “a more careful tone”, Claude over-qualifies; the rule is “tighten
the hedges, don’t add new ones”. A few weeks later, another feedback
file: a particular statistical method was questioned by her supervisor;
only use it when the dataset meets condition Y.
A reference file holds her department’s style guide —
specifically, the edge cases she keeps re-looking-up (figure caption
format, citing unpublished conference proceedings, word-limit
rules).
By month twelve, every session opens already knowing her preferred citation style, the structure of her argument, which statistical decisions are settled, and which writing patterns she’s already corrected. The session begins at the level of a collaborator who’s been alongside the project from the start, not a fresh assistant who needs re-onboarding.
Chapter 3 — Skills: one-command procedures
The idea
A skill is a small file that bundles three things: a name (the
slash-command you type, like /style-pass), instructions for
what to do when called, and a description that tells Claude when to use
it. Once you’ve written a skill, you can invoke it explicitly
(/style-pass on the last section) or Claude can invoke it
automatically when your prompt matches its description.
Why bother
Without skills, multi-step procedures live in your head or in old transcripts. You re-explain the same sequence every session, details vary each time, the assistant drifts from your intent. Skills convert “something I worked out once” into “something we repeat reliably”. When Claude reads its skill library at the start of each session, it sees the procedures available and can reach for one without being told.
What a skill looks like
A skill is a folder containing a single SKILL.md
file:
~/.claude/skills/
└── style-pass/
└── SKILL.md
The file has a short header and a body:
---
name: style-pass
description: Apply house style to the current section — cap sentences at 28 words,
flag passive constructions, normalise em-dashes, and check for blacklist
transition words. Use when the user says "style pass", "clean up the prose",
or after finishing a draft section.
allowed-tools: [Read, Edit]
---
# Style Pass
Run these checks in order on the section the user names:
1. Read the section.
2. Cap sentence length: any sentence over 28 words gets flagged and a
shorter rewrite proposed.
3. Flag passive constructions in topic sentences.
4. Normalise em-dashes: " — " with spaces, never "—" with none.
5. Scan for blacklist transition words (very, really, basically,
actually, simply, just).
6. Produce a change log. Wait for the user to confirm before
   saving any edits.

The header (frontmatter) controls metadata. The body is the procedure itself — write it like you’re briefing a colleague who will do exactly what you’ve written, no more, no less.
The description does double duty
The description field is the most important field in the
file. It does two jobs at once:
- It tells you (and future readers) what the skill does.
- It tells Claude when to reach for the skill without being asked.
Vague descriptions like “useful for prose” never trigger automatically. The pattern that works: state what the skill produces, then list the literal phrases you might type when you want it. Over-specify. The system under-triggers more often than it over-triggers.
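For instance — two hypothetical description fields for the same skill, one that will never trigger automatically and one that will:

```yaml
# Too vague — matches almost nothing:
description: Useful for prose.

# Specific — names the output, then lists the literal trigger phrases:
description: Apply house style to a named section and produce a change log.
  Use when the user says "style pass", "clean up the prose", or "make this
  match house style".
```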
Where skills live
Three scopes, picked by where the file sits:
| Scope | Location | When it’s available |
|---|---|---|
| User | ~/.claude/skills/<name>/SKILL.md | Every session, every folder |
| Project | your-project/.claude/skills/<name>/SKILL.md | Only when you open this folder |
| Plugin | Distributed as a package | Wherever installed |
A project-scope skill with the same name as a user-scope skill takes precedence. Useful when you want a generic skill but override it for one specific project.
A starter set worth building
The third-time test: if you’ve explained the same multi-step procedure three times across sessions, it belongs in a skill. Things worth starting with:
- A morning brief. “Pop the inbox, list anything urgent in the active project, and give me a one-line health check on the systems I care about.”
- A weekly review. “Summarise the week’s commits / notes / activity in the project; flag anything I said I’d do but haven’t.”
- A finishing-touches pass on writing. Style checks, citation format, the usual cleanup before a thing leaves your hands.
- A deploy/release procedure. The exact sequence of
commands you run to push a change live. Document once; type
/deploythereafter.
How to apply it
- Notice the candidate. Third-time rule.
- Draft the skill. Write the file by hand or run the /skill-creator skill that ships with Anthropic’s plugin pack — it interviews you about edge cases and produces a draft.
- Scope the tools. Only list what the skill genuinely needs. A prose skill doesn’t need shell access.
- Test the transcript. Invoke it against a realistic scenario. Read the full back-and-forth, not just the final output, and look for missed steps.
- Tune the description until Claude triggers it on the phrases you actually use.
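Applied to the morning brief from the starter set, a first draft might look like this — a sketch whose paths, phrases, and steps are placeholders to adapt:

```markdown
---
name: morning-brief
description: Summarise what needs attention today — urgent items in the
  active project and a one-line health check per system. Use when the user
  says "morning brief", "what's on today", or "catch me up".
allowed-tools: [Read, Bash]
---
# Morning Brief
1. List anything urgent or overdue in the active project's notes.
2. Give a one-line health check on each system the user cares about
   (see the pointers in CLAUDE.md).
3. Keep the whole brief under ten lines. No preamble.
```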
Failure modes
Vague descriptions that never trigger. “Useful for writing tasks” matches almost nothing. Name the output, list the trigger phrases, name the context.
Skills that wrap a single command. If the body is one shell command with no branching, you don’t earn the maintenance overhead. Type the command directly.
Skills that should be agents. A skill that runs for forty-five minutes, fans out in parallel, or asks the user mid-procedure isn’t a skill — it’s an agent (Chapter 5).
Hardcoded paths in a user-scope skill.
/home/yourname/specific-project/outline.md does the wrong
thing everywhere else. Use relative paths or arguments.
Skill vs hook vs subagent
| If the thing… | …it’s a |
|---|---|
| Must fire automatically on a system event (session start, file write) | Hook (Chapter 4) |
| The user invokes on demand | Skill (this chapter) |
| Is long, parallel, or needs a specialist with its own tool kit | Subagent (Chapter 5) |
Chapter 4 — Hooks: rules your assistant cannot skip
The idea
A hook is a small script that fires automatically at specific moments in a session — before any file is written, after a tool runs, when the session starts, when the assistant finishes a turn. It runs in your shell, as you, deterministically. It can let the event continue, or stop it — either by exiting with a blocking exit code or by printing a structured “block” decision, as the protective example below shows.
Hooks are the difference between telling the assistant to follow a rule and making the rule happen. CLAUDE.md text can be skimmed when context is crowded. A hook cannot be skimmed. It runs every time.
Why bother
Some rules need to hold every time, no exceptions. “Never write to the production database directly.” “This folder of documents is frozen — no edits without approval.” “Send me a notification when a long job finishes.” These are not preferences; they’re invariants. Putting them in CLAUDE.md gives you the appearance of safety. Putting them in hooks gives you the substance.
When hooks fire
| Event | When it fires |
|---|---|
| SessionStart | Once, immediately after the session opens |
| UserPromptSubmit | Each time you submit a message, before Claude processes it |
| PreToolUse | Before any tool call (you can scope to specific tools) |
| PostToolUse | After a tool call returns |
| Stop | When the assistant finishes a turn |
| SubagentStop | When a subagent finishes a turn |
| Notification | When Claude emits a user-facing notification |
PreToolUse and PostToolUse can be scoped to
specific tools, so you can have a hook that fires only on file writes,
or only on shell commands, or only on a particular external
integration.
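Scoping is done with a matcher in the hook configuration. A sketch of what that might look like in settings.json, following the same shape as the wiring example later in this chapter — treat the exact field names as something to verify against the current documentation:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          { "type": "command", "command": "/path/to/check-frozen-paths" }
        ]
      }
    ]
  }
}
```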
What hooks are most useful for
| Use case | Event | Effect |
|---|---|---|
| Inject today’s date and one-line status | SessionStart | Assistant arrives knowing what day it is and recent state |
| Block writes to protected paths | PreToolUse (Write/Edit) | Frozen files literally cannot be edited |
| Send a notification when a long task finishes | Stop | You get pinged when something is done; no need to watch the terminal |
| Append an audit log entry after a sensitive command | PostToolUse (Bash) | A record exists regardless of what the assistant decided |
A minimal first hook
The cheapest useful hook is a SessionStart script that
injects today’s date — Claude doesn’t otherwise have a reliable view of
the current date unless you tell it.
#!/usr/bin/env bash
DATE=$(date '+%Y-%m-%d')
cat <<EOF
{
"hookSpecificOutput": {
"hookEventName": "SessionStart",
"additionalContext": "Today is $DATE."
}
}
EOF

Saved as a file, made executable, and wired up in
~/.claude/settings.json:
{
"hooks": {
"SessionStart": [
{
"hooks": [
{ "type": "command", "command": "/path/to/the/script" }
]
}
]
}
}

From this point forward, every session opens already knowing today’s date.
A protective hook
A more useful pattern: a PreToolUse hook on
Write and Edit that checks the file path and
blocks the operation if it points at a frozen directory.
#!/usr/bin/env bash
PAYLOAD=$(cat) # the tool call arrives on stdin
TARGET=$(echo "$PAYLOAD" | jq -r '.tool_input.file_path')
case "$TARGET" in
"$HOME/research/approved/"*)
cat <<EOF
{
"decision": "block",
"reason": "This document is frozen. To edit it, move it back to drafts/ first."
}
EOF
exit 0
;;
esac
exit 0

Every attempted write goes through this check. The assistant cannot edit frozen documents, not because it was told not to, but because the harness physically stops the write.
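Before wiring a protective hook up, you can dry-run it by piping a fake payload on stdin, the same way the harness delivers it. A self-contained sketch — the script here is a trimmed stand-in for the real hook, and it parses the path with sed so the dry run has no jq dependency:

```shell
# Write a trimmed stand-in hook to a temp file.
hook=$(mktemp)
cat > "$hook" <<'SCRIPT'
#!/usr/bin/env bash
# Extract file_path from the JSON payload on stdin (sed instead of jq).
TARGET=$(sed -n 's/.*"file_path" *: *"\([^"]*\)".*/\1/p')
case "$TARGET" in
  "$HOME/research/approved/"*) echo '{"decision":"block"}' ;;
esac
exit 0
SCRIPT
chmod +x "$hook"

# Simulate a write into the frozen directory — expect a block decision.
echo "{\"tool_input\":{\"file_path\":\"$HOME/research/approved/ch1.md\"}}" | "$hook"
```

If the decision comes back, the matcher logic works; if a path outside the frozen directory produces no output, the hook is staying out of legitimate work.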
Failure modes
Encoding a real rule as text instead of a hook.
“Never write to the production database” in CLAUDE.md is a wish. The
same rule in a PreToolUse hook is enforced. The test: would
it matter if this rule were skipped once? If yes, make it a
hook.
Overbroad matchers. A hook that blocks all writes to a directory will also block writes the assistant legitimately needs to make. Be precise; include a bypass mechanism (a one-shot unlock file, a flag) so legitimate work isn’t permanently stuck.
Context-window bloat from SessionStart. A hook that injects a giant document on every session start eats budget that could go toward actual work. Keep injected context to a few lines.
Host-specific hooks without environment checks. A
hook that calls a tool only available on one machine will fail silently
or noisily on every other machine where the same
settings.json applies. Either check for the dependency
inside the hook, or scope the hook to the project that needs it.
The three-way choice
| When you need… | Reach for… |
|---|---|
| A rule the model must obey, every time | Hook |
| A standing piece of context the model should have | CLAUDE.md |
| A procedure the user invokes on demand | Skill |
A reminder belongs in CLAUDE.md. An invariant belongs in a hook. A repeated action belongs in a skill.
Chapter 5 — Subagents: delegate the heavy lifting
The idea
A subagent is a second copy of Claude you can hand a specific task. It arrives with no memory of your conversation, carries only the tools you’ve permitted it, and may run on a different model than your main session. You write its briefing once and store it as a file; from then on, your main session can call it by name whenever that kind of work appears.
The main session orchestrates — it holds the plan and the judgement calls. The subagent executes a bounded task and reports back. The separation is not a courtesy; it’s the structural reason subagents are useful at all.
Why bother
Three reasons:
Context hygiene. Every tool call that returns a large result — a long log file, a directory full of documents, a database query — lands in your context window and stays there. A subagent does the heavy reading and returns only the distilled answer. Your main session holds reasoning; the subagent absorbs the noise.
Parallelism. Three independent research questions can run at the same time, finishing in the time the slowest takes. A single session can’t do that for itself.
Specialisation. An agent briefed as a domain specialist is less likely to drift than a general session handling one more thing at the bottom of a long thread. The brief is the agent’s whole world; that focus is a correctness advantage as much as a speed one.
What a subagent looks like
A subagent is a Markdown file at
~/.claude/agents/<name>.md:
---
name: research-analyst
description: Use this agent for literature searches, citation checking, and
summarising sources on a given topic. Invoke when the task is "find what is
known about X" rather than "decide what to do about X".
tools: [Read, Write, Bash]
model: sonnet
---
You are a research analyst. When given a topic and a source archive:
1. Search the archive for relevant material.
2. For each source found, capture: title, author, date, key claims,
and your assessment of evidence strength.
3. Write the structured summary to the output path the brief gives you.
4. Do not synthesise or recommend — that's the orchestrator's job.
If a source is paywalled or unreadable, note it but don't pretend you
read it.
Always cite the file path of each source you used.

The frontmatter defines the agent. The body is its persistent briefing — its role, the conventions it follows, the rules it applies when it hits something unexpected.
Tool allowlists matter
A research agent that reads files and runs searches doesn’t need
Write or shell access. A formatting agent doesn’t need
Bash. Tight allowlists mean three things: less risk if the
agent misinterprets its brief, faster invocation (less to load), and
clearer intent — the allowlist documents what the agent is supposed to
touch.
Picking the right model
| Model | When |
|---|---|
| Haiku | High-volume mechanical work — extracting structured data, formatting, simple classification |
| Sonnet | Standard professional work where judgement matters |
| Opus | Reasoning-heavy work where getting it wrong is expensive — legal analysis, architecture, subtle debugging |
Running N parallel Haiku agents to summarise N documents costs a fraction of a single Opus call and often produces equivalent results.
The brief is half the job
The agent definition is permanent. The brief is what you pass at invocation time — the specific task right now. A good brief looks like:
Goal: One sentence saying what "done" looks like.
Input: Exact file paths or data — not "the report" but "/path/to/report.md".
Output: Format and destination — "Write findings to /path/to/findings.md
as a structured list."
Constraints: Things not to do, word limits, style.
If stuck: Report back rather than guess.
A vague brief produces vague work. The agent cannot ask you a clarifying question mid-task unless it’s designed to do so, so the brief must be self-contained.
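Put together, a filled-in brief for the research-analyst agent defined earlier might read like this — every path and number is hypothetical:

```text
Goal: A structured summary of what the source archive says about topic X.
Input: /home/me/research/archive/ — the PDFs and notes in that folder.
Output: Write findings to /home/me/research/findings_x.md as a structured
list, one entry per source.
Constraints: No synthesis, no recommendations; under 1,500 words.
If stuck: If a source is unreadable, note it and move on; if most are,
stop and report back.
```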
When to delegate vs do it yourself
| Situation | Reach for… |
|---|---|
| Task fits in three lines, output is small | Do it inline |
| Output will be large (and you don’t want it cluttering context) | Subagent |
| Task is one of several independent things | Subagent (in parallel) |
| The work has a specialist angle you’ve defined an agent for | Subagent |
| It’s a mechanical, repeatable sequence with no judgement | Probably a Skill, not an agent |
The break-even point for delegation is roughly thirty to sixty seconds of careful effort. Below that, do it yourself; agent invocation overhead exceeds the gain.
Fan-out: N tasks in parallel
When you have N independent subtasks of the same kind — summarise these N documents, draft these N sections, check these N references — spawn N agents in one go.
┌────────┐
│ Main │
│ session│
└───┬────┘
│
┌──────────┬──────────┼──────────┬──────────┐
▼ ▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ agent │ │ agent │ │ agent │ │ agent │ │ agent │
│ 1 │ │ 2 │ │ 3 │ │ 4 │ │ 5 │
└────────┘ └────────┘ └────────┘ └────────┘ └────────┘
│ │ │ │ │
└──────────┴──────────┴──────────┴──────────┘
│
▼
┌────────┐
│ Main │
│ reads │
│ outputs│
└────────┘
A few rules for fan-out:
- Validate the outputs. Read what the agents actually produced before treating it as done. Confident-sounding output is not always correct output.
- Watch for sibling collisions. If the agents share a design space — they must produce distinct domains, consistent naming, coordinated structure — flat fan-out without coordination causes overlap: two agents independently picking the same obvious example. The fix is a manager agent that assigns scope to each worker before any of them begin.
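The fan-out shape can be mocked with ordinary shell job control. In this sketch, `head -1` stands in for a real per-document worker (in practice each worker would be a subagent invocation; the file names are illustrative):

```shell
# Flat fan-out sketch: N independent workers in parallel, then a collect step.
# "head -1" is a stand-in for a real worker such as a Haiku subagent.
mkdir -p docs out
printf 'alpha summary\nbody\n' > docs/a.txt
printf 'beta summary\nbody\n'  > docs/b.txt
for f in docs/*.txt; do
  head -1 "$f" > "out/$(basename "$f")" &   # launch every worker at once
done
wait                                        # block until all workers finish
cat out/a.txt out/b.txt                     # main session reads the outputs
```

The `wait` matters: the main session only reads outputs after every worker has finished, mirroring the diagram above.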
Failure modes
The terse brief. “Fix the formatting issues in the report” produces shallow work because the agent must guess what the standard is, where the issues are, and what good looks like. Output quality reflects brief quality.
Over-delegation of trivial tasks. Spawning an agent to rename a variable or change a date is slower than doing it inline. Agent invocation has overhead.
Serial fan-out. Launching agents one at a time when they’re independent wastes the whole point of the pattern.
Trusting the summary. An agent that says “done, all three sections updated” might have updated two correctly and one incorrectly. Read the diff yourself.
Building a subagent for what should be a skill. If the “agent” is a fixed sequence of commands with no judgement calls, it’s a skill. Subagent = packaged expert; skill = packaged procedure.
Part 3 — Composition
The foundations work on their own. The composition chapters describe how to combine them into real workflows.
Chapter 6 — The persistence ladder: state at the right scope
The idea
Information you produce in a session lives at one of four scopes: within a single response, across this conversation, across all conversations, or across parallel branches of work. Each scope has a natural home, and each carries overhead proportional to how durable it is. The discipline is picking the smallest scope that genuinely fits.
| Scope | Home | Lifetime |
|---|---|---|
| Within one response | Todo list (TodoWrite) | Dies with the session |
| Across sessions, one project | Plan file (~/.claude/plans/<slug>.md) | Lives as long as the project |
| Across projects, across time | Memory store | Indefinite, until you archive |
| Parallel branches of work | Worktree / sandbox copy | Until merged or discarded |
Why bother
When you store the wrong thing at the wrong scope, two failures occur in opposite directions. Putting plan-state into memory means every future session loads stale progress into context. Putting durable conventions into a session-only todo list means they evaporate by tomorrow. Neither failure announces itself loudly — you just gradually notice things drifting.
Todos: in-session only
The TodoWrite tool maintains a checklist that Claude
tracks within the current conversation. Use it freely for the sequence
of steps in front of you right now — call the plumber, draft the
variation notice, confirm the inspection.
When the session ends, the list ends. That’s the design. Don’t treat a todo list as a deliverable, a decision log, or anything Claude should remember next time.
Plans: cross-session, one project
A plan is what a todo list becomes when the work spans more than one session, requires a decision between phases, or has dependencies worth reasoning about in advance.
A todo answers: what am I doing right now in this session?
A plan answers: what are the phases, in what order, and what does "done"
mean for each?
Save plans to a known location —
~/.claude/plans/<slug>.md is the convention. Not in
/tmp, not in a session buffer. The plan is the artefact: it
lives as long as the work lives, and gets archived when the work is
done.
Memory: durable across projects
Memory (Chapter 2) carries policies and standing facts, not progress. The test is one question: “Will I want this in a session that has nothing to do with the current work?”
- Yes → memory.
- “Only while this project is active” → plan or project file.
The classic mistake is writing plan-state to memory — “currently in fit-out phase, cabinets arrive Friday” — which is wrong by next week and poisons every session that loads it.
Worktrees / sandboxes: parallel branches
If your work is code, a git worktree is a second checked-out copy on a different branch. You can experiment without touching the mainline; merge or discard the result.
If your work is documents, the equivalent is a sandbox copy: a separate folder holding an experimental variant while the authoritative version stays untouched.
Use worktrees when you’d otherwise risk clobbering or contaminating finished work. The overhead is real — you eventually have to reconcile or delete — so don’t reach for one for a small in-place edit.
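For the code case, the whole worktree round-trip is a handful of standard git commands. This is a self-contained demo (the repo, branch name, and inline identity config are only there to make it runnable):

```shell
# Worktree round-trip: branch off, experiment, discard. The mainline never moves.
set -e
git init -q demo && cd demo
git -c user.email=a@b.c -c user.name=demo commit -q --allow-empty -m init
git worktree add ../demo-sandbox -b experiment        # second checked-out copy
git -C ../demo-sandbox -c user.email=a@b.c -c user.name=demo \
    commit -q --allow-empty -m "risky change"
git worktree remove ../demo-sandbox                   # experiment didn't survive scrutiny
git branch -q -D experiment                           # discard its branch too
git log --oneline                                     # mainline still holds only init
```

The reconciliation step the chapter warns about is the last two commands: a worktree you create is a worktree you must eventually merge or remove.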
Applying the ladder
The default walk for any new piece of work:
- Start with todos. Open the session, scope the task, let Claude track the steps. If the work finishes in one session, you never climb higher.
- Promote to a plan when the work spans sessions or needs a sign-off between phases.
- Persist to memory only when a decision has proven to generalise beyond this project.
- Branch a worktree when you’d contaminate the mainline by working in place.
Failure modes
Promoting todos to permanent documents. If a list is worth saving, rewrite it as a proper plan with phases and success criteria — not as a copy of the in-session checklist.
Plans for one-session work. A five-phase plan for “call the plumber” is overhead. Plans pay off only when sessions or sign-offs are genuinely involved.
Plan-state in memory. “Phase 2 complete, awaiting certification” is plan-state. Loading stale progress into every subsequent session contaminates them all.
A worktree for a trivial edit. The overhead of a second branch is only justified when isolation is genuinely needed.
A homeowner’s renovation
Month one: a homeowner uses TodoWrite to track each
day’s site work — confirm skip-bin delivery, photograph existing walls,
call the certifier. The lists do their job and are gone by evening.
Month two: scope has grown. Three phases, two hold points for
sign-off, a materials schedule tied to lead times. She writes a plan to
~/.claude/plans/kitchen-reno.md with labelled phases. Each
session begins by reading the plan.
Month four: she notices she’s restating the same conventions at the start of every session — variation notices follow a specific format, trades get 48 hours’ written notice before any schedule change. These are policies, not plan-state. She moves them into a memory entry.
When the builder suggests moving the island bench 600mm — a change that might void the tile order — she doesn’t edit the confirmed plan. She makes a sandbox copy and works through the implications there. The change doesn’t survive scrutiny. She discards the sandbox. The authoritative plan is untouched.
Chapter 7 — Async: work that runs while you sleep
The idea
You’re not always at the keyboard. A mature setup builds that assumption in rather than hoping you’ll be there. The async layer has three pieces: outbound notifications (something pings you when a task completes), inbound inbox (you can drop a message in from your phone that the next session picks up), and background runs (long jobs that don’t block your main session).
Why bother
Long-running work without notifications forces a choice: park yourself at the terminal or check back hours later to find it either finished cold or errored on step two. An inbound inbox solves the symmetric problem: thoughts that occur away from the machine evaporate before the next session starts unless you have somewhere to put them.
Without the async layer, Claude Code is a synchronous command prompt. With it, it’s a delegatable agent that works while you move and sleep.
Outbound notifications
The simplest form is a one-line shell wrapper around your preferred push channel — Telegram, Pushover, ntfy.sh, an email gateway, anything you actually check.
notify "Asset bake complete — 12 levels processed, 1 flagged for review."
Keep the messages substantive. “Done” is useless. “Done — 0 failures, output at /path/to/report” is actionable. A notification is a state-transition signal — task complete, decision needed, long job starting — not activity noise.
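A minimal wrapper might look like this. ntfy.sh is one possible channel and the `NTFY_TOPIC` variable is an assumption; when no topic is configured, the sketch just prints locally:

```shell
# Sketch of a notify wrapper: push via ntfy.sh if a topic is set, else print.
notify() {
  if [ -n "$NTFY_TOPIC" ]; then
    curl -s -d "$1" "https://ntfy.sh/$NTFY_TOPIC" >/dev/null   # push to phone
  else
    printf 'NOTIFY: %s\n' "$1"                                 # local fallback
  fi
}
notify "Done — 0 failures, output at /path/to/report"
```

Swap the `curl` line for Telegram, Pushover, or an email gateway; the point is that the rest of your setup only ever calls `notify`.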
Inbound inbox
A persistent queue you can append to from anywhere — your phone, another machine, a web form. The next Claude session pops messages off it before starting other work.
[09:14, from phone] If the lighting pass on level 18 fails again,
skip it and flag it — don't block the whole bake.
[09:22, from phone] Jump-height values in levels 12–15 were rebalanced
today; note that in the manifest.
The queue must be persistent (not in-memory), capped (so it doesn’t grow forever), and read non-destructively (so you can inspect it without popping entries). The assistant’s job at session start is: check inbox, act on any messages, then proceed.
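All three properties fit in a few lines of shell. The inbox location and the 100-entry cap below are assumptions for the sketch:

```shell
# Minimal persistent inbox: append-only file, capped, read without popping.
INBOX=inbox.txt                                   # assumed location
inbox_add() {
  printf '[%s, from phone] %s\n' "$(date +%H:%M)" "$1" >> "$INBOX"
  tail -n 100 "$INBOX" > "$INBOX.tmp" && mv "$INBOX.tmp" "$INBOX"   # cap entries
}
inbox_add "If the lighting pass on level 18 fails again, skip it and flag it."
cat "$INBOX"                                      # non-destructive inspection
```

A phone-side webhook or shortcut that appends a line to this file is all the inbound plumbing you need.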
Background runs
The Bash and Agent tool calls both accept a
run_in_background parameter. Use it for any subprocess that
will take more than a few seconds. The main session doesn’t block; you
get notified when it finishes.
Pair background runs with the Monitor tool, which streams events from a running background process. Each line of output becomes a notification, so you can watch a build log in real time without blocking the main session.
Pacing primitives
| Cadence | Tool |
|---|---|
| Fixed schedule (nightly build, weekly review) | cron |
| Self-paced loops (poll until X happens) | /loop skill or equivalent |
| In-session “wake me in N minutes” | ScheduleWakeup |
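For the fixed-schedule row, a single crontab entry is enough. The `claude -p` (print-mode) invocation below is illustrative; treat the prompt text and paths as assumptions:

```
# m h dom mon dow  command
0 22 * * *  cd /path/to/project && claude -p "run the nightly asset bake" >> bake.log 2>&1
```

cron handles the cadence; the notification and inbox layers above handle everything after the job starts.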
For in-session waits, watch the prompt cache window. Roughly: under five minutes, the cache stays warm and re-entry is cheap; from five minutes to about an hour, you pay a cache-miss; longer than that, the cost is irrelevant. The awkward middle is a twelve-minute sleep on an eight-minute job — you cross the cache boundary for no reason. Either stay short or commit to a real pause.
A nightly bake
A solo game developer runs a nightly asset bake — 40 level maps, sprite atlases, lighting pass, packaged build. Three to five hours, no reason to sit and watch.
She fires it at 10pm. The opening notification reads: “Starting
nightly bake, 40 levels queued, estimated 3–4 hours.” The export
subprocess runs with run_in_background: true. The Monitor
tool streams the log.
She drops two notes from her phone before bed:
If level 18 fails the lighting pass again, skip and flag it — don’t block the bake.
Jump-height values in levels 12–15 were rebalanced today; note that in the manifest.
At the midpoint she gets a checkpoint: “20/40 baked, no failures. Atlas export next.”
At 1am, level 18’s lighting pass fails. The assistant checks the inbox, finds the standing instruction, skips, flags, continues. No one is woken.
At 5:30am the bake completes: “Done — 39/40 packaged, level 18 flagged, jump-height note added to manifest.” She reads it over breakfast.
When to ping vs stay quiet
Ping on:
- Completion of anything that took more than a few minutes
- Errors you can’t resolve from standing instructions
- Natural checkpoints in multi-phase work
- “Starting now” heads-up for genuinely long jobs
Don’t ping on:
- Routine intra-phase progress
- Decisions you can resolve yourself
- Anything that will finish before the human could read the notification
A quiet channel is a trusted channel. One substantive, well-timed notification beats five noisy ones.
Failure modes
Alert fatigue. Notifications on every tool call mean you’ll mute the channel within a day.
Silent completion. Opposite failure. The job finishes at 2am, you find out at 8am, half the day already gone.
Polling loops that thrash the cache. A
sleep 10 loop for a 20-minute process generates calls every
ten seconds for no reason.
Inbox messages without context. “Fix the thing” is useless by morning. Each message needs enough context for a cold session to act.
The async harness silencing the human. When a background task is running, the assistant’s bias toward “do the work, report back” can cause it to queue your new prompts behind the running task. A new message from you is always an interrupt. If the message has a question, answer it now.
Chapter 8 — Multi-agent patterns: a second pair of eyes
The idea
A single agent is fast but biased toward its first plausible answer. Once it commits, every subsequent step reinforces the direction rather than questioning it. A second agent briefed cold — meaning it hasn’t seen any of the first agent’s reasoning — can’t anchor on a framing it never received. That structural independence is the point.
The cost is roughly linear in tokens. The quality lift, especially where a confident wrong answer is worse than a flagged uncertainty, is not.
Why bother
A reviewer who didn’t write the plan sees gaps the author assumed away. A second agent briefed cold is in exactly that position. This matters most for hard-to-reverse decisions and for judgement calls where “looks right” isn’t the same as “is right”.
Cheap verification you can do yourself — running tests, checking outputs, eyeballing a diff — doesn’t need a second agent. When verification requires a perspective you cannot structurally take, add an agent.
Five useful patterns
1. Critic loop. One agent produces; a second agent, briefed cold, critiques against an explicit rubric; the producer incorporates the critique; the critic re-checks. Set a ceiling — two or three rounds — before you start. The critic must receive only the output, never the producer’s reasoning. A critic who sees intent evaluates effort, not result.
2. Planner/executor split. A reasoning-heavy planner (the most capable model) produces a structured plan. A cheaper executor carries out each step. Different bottlenecks: planning is reasoning-intensive and mistakes compound; execution is often mechanical once the plan is sound.
3. Second opinion. For a single high-stakes judgement — is this clause ambiguous? is this schedule feasible? — spawn a second agent with the same inputs and a brief that references nothing the first produced. Agreement raises confidence. Disagreement locates exactly where the question is genuinely hard.
4. Parallel fan-out with synthesis. N agents on N independent questions, then a synthesiser agent that produces a unified result. The synthesiser is briefed to reconcile contradictions and flag gaps, not just stitch outputs together.
5. Build-and-grade. A producer drafts; a grader scores against an explicit rubric; producer iterates; grader re-evaluates. The score makes convergence visible — you can see whether rounds are improving the work or spinning in place.
A wedding plan
An event planner has drafted a 200-person wedding plan: venue logistics, catering, run-sheet from access to carriages, three vendor contracts, a risk register.
A single end-to-end review will miss cross-domain conflicts. The food-safety reviewer won’t catch the clause in the venue contract that prohibits outside catering without a supplement.
Instead, parallel fan-out to five specialist critics, each holding the full plan but briefed cold on a specific domain: venue logistics, catering, run-sheet, contracts (instructed to cross-reference the others), and contingency (instructed to assume the three most likely failure modes). Each writes findings to a named output file.
Then a synthesis agent, briefed cold on the five critique files, looks for contradictions, repeated flags (real structural problems, not quirks), and a ranked list of blocking items by impact on the date.
The pattern earns its cost because the five sub-tasks are genuinely independent, domain-specific bias would cause a generalist to miss problems systematically, and the event date can’t move. Where a confident wrong answer means a ruined wedding instead of an extra revision, the redundancy is cheap.
Failure modes
Critic loops that never converge. No stop rule means the critic always finds something. Set a ceiling. Accept good-enough after three rounds.
Planners that over-specify. Specifying every micro-step leaves the executor no room to respond to what it actually finds. Plans should specify outcomes per step, not a rigid command sequence.
Second opinions that share the first agent’s conclusion. “Agent A concluded X; what do you think?” is a request for validation, not a second opinion. The second agent must receive the same raw inputs as the first, nothing more.
Parallel fan-out for dependent tasks. Running B in parallel with A when B needs A’s output produces error-embedded results that look complete and aren’t.
Sibling collisions in flat fan-out. N parallel agents picking the same obvious example domain independently. The fix is a manager agent that assigns distinct scope before workers begin.
Critic monoculture. Five critics all briefed with the same axis — all adversarial, all security-focused, all compliance-oriented — will collectively miss every failure outside that axis. At minimum, one critic should occupy the “use this as an actual operator would” angle — treating the work as a product to be used, not a specification to be verified.
Adding agents to mask a bad brief. A critic loop on bad input produces critiqued bad input. Sharpen the brief before reaching for a second agent.
When to add an agent vs sharpen the brief
The first move is almost always to improve the brief. A second agent on a bad brief produces a second bad result. Add an agent only when one of three conditions holds:
- Independent perspective. The work needs a viewpoint the original brief can’t structurally provide.
- Real parallelism. Sub-tasks are genuinely independent and wall-clock time matters.
- Cost of failure greatly exceeds cost of the extra agent. A second-opinion call on an irreversible decision costs one model call; the alternative may cost much more.
If none of those holds, you need a better brief, not another agent.
Chapter 9 — Tool discipline: pick the right hammer
The idea
Every tool call has a cost — latency, tokens, context window space, sometimes a permission prompt — and a value, which is the information or change it produces. Tool discipline is the habit of reaching for the cheapest tool that delivers the needed value, and of batching independent calls into a single turn so the cache stays warm and the wall clock stays short.
Done well, this is invisible: the session finds what it needs quickly, changes only what it should, and leaves the context holding distilled information rather than noise.
Why bother
Same task, wrong tools: slower, noisier, more permission prompts,
more context consumed. That context is finite — it has to hold your
reasoning, not raw file dumps. A session that uses cat
instead of Read, serialises five searches instead of
batching, or pulls a 30KB log into the main context, will feel sluggish
compared to one that uses the right tool the right way.
The hierarchy
Dedicated file tools over shell equivalents.
| For this… | Use this… | Not this… |
|---|---|---|
| Reading a file | Read | cat |
| Editing a file | Edit | sed |
| Creating / overwriting a file | Write | echo > file |
| Listing files by pattern | Glob | find |
| Searching file contents | Grep | grep-via-shell |
The dedicated tools render line-numbered output (auditable), produce
structured diffs (reviewable), avoid most permission prompts, and refuse
ambiguous edits (safer than sed). Bash is for genuine
subprocess work: running a build, querying a live system, restarting a
service.
Parallel calls when work is independent. If two or more tool calls don’t depend on each other’s results, issue them in the same turn. The runtime is bounded by the slowest call, not the sum.
# Bad: serialise
Grep("MyConfig", "/src") wait...
Grep("MyConfig", "/config") wait...
Grep("MyConfig", "/tests") wait...
# Good: batch
Grep("MyConfig", "/src")
Grep("MyConfig", "/config") ← all in one turn
Grep("MyConfig", "/tests")
Subagent as a context buffer. When a tool call would return a large payload — a full log, a directory listing, a bulk query result — ask whether you actually need all of it in the main context. If what you need is a summary, a count, or the answer to a specific question, delegate the retrieval and the interpretation to a subagent. The subagent does the heavy reading; you get back 200 words instead of 25 kilobytes.
Right host, not just right tool. Tool discipline extends to where tools run. If you have a fast-feedback orchestrator host and a separate heavy-compute executor, long-running work belongs on the executor. Builds, test suites, ETL — none of that should run on the host you’re trying to keep responsive. Symptom: “everything feels slow” even though the executor is idle. Cause: running heavy work on the interactive host.
Deferred tools and ToolSearch
Most tools are not loaded by default. Before reaching for a one-off
ToolSearch to load a single tool, ask whether there’s a
logical bundle. A single search that names a toolkit returns the whole
set in one round-trip. Loading tools one at a time is wasted overhead;
loading more than you need wastes a different kind of overhead. Load the
bundle for the domain you’re entering.
When to use which tool tier for an external system
| Tier | What it is | When |
|---|---|---|
| Dedicated MCP | A purpose-built integration for a specific system | Always prefer if it exists |
| Browser MCP | Drive a browser to navigate and extract | When no dedicated MCP exists |
| Computer-use | Interpret pixels on the desktop | Last resort — fragile, slow |
Reach down the tiers only when the upper tiers are unavailable or insufficient.
A research historian
A historian is working through several hundred digitised field notes spread across thematic subdirectories. She needs to find every reference to a specific surveying instrument, read the relevant passages, then edit a draft essay to replace an outdated name.
Her first instinct: open files one by one. cat the first
directory’s listings, read ten files in sequence, find two references.
She’s 10% through and her context is full of files that contained
nothing useful.
She resets. One Grep call across the entire archive —
recursive, with line numbers and context. 23 matches in 14 files,
returned in a few hundred tokens. Then 14 parallel Read
calls scoped to a narrow line range around each match. Her context now
holds exactly the passages she needs. Finally, Edit with
replace_all to rename the instrument throughout her essay:
one call, no regex, a clean diff.
Three turns instead of several dozen. The context holds evidence and reasoning, not hundreds of lines of irrelevant material.
Failure modes
cat instead of Read. No
line numbers, no audit trail, permission prompts. Familiarity is the
only winner.
sed instead of Edit. Blind
regex replacement with no diff. Edit forces you to be
explicit about what you’re changing and fails safely when the target
string isn’t found.
Serialising independent searches. Pure dead time. “To be safe” is not safe; it’s just slow.
Pulling a large result into main context. A 20KB log in the main context occupies space that should hold your reasoning. If you need “errors in this log”, the log itself is not what you needed.
Computer-use when a native MCP exists. Pixel-based interaction breaks on layout changes. API-based interaction doesn’t.
Heavy work on the orchestrator host. Push it to the executor where it belongs.
A short rule of thumb
Use the most specific tool that does the job. Batch every independent call into the same turn. Route large outputs through a subagent and bring back only the derived result. Load deferred tool bundles once, not one tool at a time.
Part 4 — Putting It Together
A walked example: Alex and the consultancy
Alex runs a one-person IT consultancy serving six small manufacturing clients. Different cloud providers, different on-premise hardware, different support agreements. Recurring monthly health checks, periodic security reviews, change requests, occasional incidents. Alex is simultaneously the strategist, the engineer, the documentarian, and the account manager.
A general-purpose chat assistant that starts at zero every session is not a useful tool here. A properly built setup is.
What Alex’s setup looks like
Memory store. About twenty entries across the four
types. Two user files capture standing preferences. Six
project files, one per client, hold infrastructure
overview, config repository location, decisions already made, known
quirks. A growing collection of feedback files documents
lessons hard-won — why one client’s firewall silently drops health
checks (expected behaviour, not a failure); that another’s change
approval requires 48 hours’ notice. Several reference files
hold vendor API quirks.
Skills. A /monthly-health-check skill
takes a client name, reads the client’s project memory file, runs the
check sequence, and produces a formatted report. A
/change-request skill drafts a scope document. A
/incident-triage skill opens with structured symptom
gathering. Each skill reads memory at invocation, so one definition
serves all six clients.
Subagents. A security-analyst agent —
tools Read and Write only — reviews
configuration files against a hardening checklist. A
documentation-drafter handles heavy writing. When a
security review covers three independent subsystems, Alex fans out three
analysts in parallel.
Hooks. A PreToolUse hook on
Write and Edit blocks anything pointed at a
client’s production configuration directory unless the assistant first
emits a clear confirmation. A SessionStart hook injects
today’s date and checks the overnight inbox. A Stop hook
fires a Telegram message when a long session finishes.
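The core of Alex's protective hook is a path check with the exit-code contract from Chapter 4 (exit 0 allows the event, non-zero blocks it). The path pattern here is a hypothetical stand-in for Alex's real directory layout:

```shell
# PreToolUse guard sketch: refuse any tool call aimed at production config.
guard() {
  case "$1" in
    */prod-config/*)
      echo "Blocked: production configuration path" >&2
      return 2 ;;            # non-zero blocks the tool call
  esac
  return 0                   # everything else passes through
}
guard "/clients/acme/prod-config/firewall.yaml" || echo "write was blocked"
```

Because the check is a deterministic script rather than an instruction in CLAUDE.md, it holds even on the session where the assistant forgets.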
CLAUDE.md. A user-wide file with cross-client conventions — house tone for client-facing reports, the rule that hostnames are never written in plain text in outputs that might be archived externally. A project-scope file at the root of each client’s working directory captures the interpretive frame for that client: a two-line glossary, the standing constraints (“change windows are Friday 9pm to Saturday 6am only”), and pointers to the relevant memory file.
Persistence ladder. In-session todos for today’s
health check. A plan file at
~/.claude/plans/client-x-upgrade.md for a six-week
multi-phase upgrade. Memory for the certificate-rotation procedure that
turns out to apply across four clients. Worktrees for risky
architectural changes that might not work.
Async. Health checks for all six clients run as background subagents on Friday afternoons. Each sends a “starting” notification and a completion summary. If a check encounters something needing a decision, it pauses and alerts immediately. A phone-side webhook appends messages to an inbox file that next session pops first.
Multi-agent. A critic agent reviews draft security reports before they go to a client. A second-opinion agent runs on high-stakes change-request scoping when rollback complexity is significant.
Tool discipline. Parallel Grep calls
when looking for service names across config repositories.
Read scoped to line ranges, not whole-file dumps.
Edit for templates, not sed. Vendor MCPs
loaded as bundles. Heavy work pushed to the executor host.
The arc of adoption
| Time | What to build | What it gets you |
|---|---|---|
| Week 1 | One user memory file. One project memory file. A ten-line CLAUDE.md. A SessionStart hook for today’s date. | Sessions stop starting at zero. |
| Month 1 | Three skills for the most-repeated procedures. A feedback file after the first preventable mistake. A plan file for any work spanning sessions. One subagent for a category of heavy work. | Repetition stops being a tax. |
| Quarter 1 | A protective hook on a path that must not be touched. One outbound notification channel wired and tested. One background job run end-to-end. Memory index trimmed. | Long jobs run while you live your life. |
| Year 1 | A dozen memory entries, mostly feedback. Six to ten skills. Two or three subagent definitions. Hooks that have caught real problems. Multi-agent critic loops on high-stakes deliverables. | A working environment that compounds instead of resetting. |
When NOT to bother
The patterns in this guide pay off when work compounds — when you’ll be in the same project tomorrow, next week, next month; when the same procedures recur; when the same mistakes can keep happening. They don’t pay off for one-off questions, throwaway scripts, or research you’ll never revisit.
Don’t build a memory store for a project you’ll finish this week. Don’t write a skill for something you’ll do once. Don’t write a hook for a rule no one would ever break anyway. Don’t add an agent when sharpening the brief would fix the problem.
The system serves the work, not the other way around.
Where to next
If a specific friction is what brought you here, go to that chapter directly:
- I keep re-explaining myself at session start. → Chapter 1 (CLAUDE.md) and Chapter 2 (Memory).
- I keep doing the same multi-step procedure manually. → Chapter 3 (Skills).
- Something keeps getting touched that shouldn’t. → Chapter 4 (Hooks).
- My main session is always clogged with large outputs. → Chapter 5 (Subagents) and Chapter 9 (Tool discipline).
- I want long jobs to run while I’m away. → Chapter 7 (Async).
- A high-stakes deliverable needs a cold second look. → Chapter 8 (Multi-agent).
If you’re starting from zero, the suggested order is: Chapter 1, then Chapter 2, then Chapter 3. Three chapters and you’re already past the most common friction.
The system you build doesn’t need to look like Alex’s. The pillars are the same; the shape they take in your work will be different. Map the patterns to what you do — where is your persistent context, what procedures recur, what work would benefit from a specialist, what invariants must always hold. The answers will be yours.
Closing
Mastery of Claude Code is not about better prompts. It is about choosing the right durable structure for each piece of knowledge, procedure, expert, and guardrail your work needs. A cleverly worded instruction in a chat window is fragile. The same intent expressed as a memory file, a skill, a subagent brief, or a hook is robust. Over weeks and months, a well-structured setup improves; a prompt-engineering practice degrades as the gap between what you remember to say and what the session needs to know keeps widening.
Build incrementally. Add the next pillar when you feel the friction it solves. The goal is not to have all nine pillars running — it is to have the right pillars running for the work you’re doing today, and a clear path to add the next one when the work calls for it.
Annex — System Prompt for an LLM
This section is not for you. It’s a copy-paste payload for a clean LLM session — one you can drop into a fresh Claude (or other model) conversation to bring it up to speed on the contents of this guide. Skip it. Or, if you’d like to chat with an AI that has this guide in its head, copy everything below the line into a fresh conversation.
You are about to assist a user who is learning Claude Code — Anthropic's
terminal-based coding assistant. They've read a guide called "Mastering
Claude Code". Internalise the following condensed version of that guide
and answer their questions in plain, friendly language. Avoid jargon.
Be concrete. Use examples from non-software domains when they help.
THE NINE PILLARS
1. CLAUDE.md — A plain-text file at the root of a project (or in
~/.claude/ for user scope) that Claude reads at the start of every
session. Use it for standing context: what this project is, where
things live, conventions, gotchas, and pointers to the rest of the
setup. Keep it under 150 lines. Never put secrets, current task
state, or hard rules that must never be broken here (those belong
in hooks).
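A sketch of what such a file might contain (the project, paths,
and conventions are invented for illustration):

```markdown
# Newsletter project
A weekly email digest. Drafts live in drafts/, published issues
in archive/, subscriber tooling in scripts/.

Conventions: British spelling; issue files are named YYYY-WW.md.
Gotcha: archive/ is generated; never edit it by hand.
See ~/.claude/skills/ for the digest procedures and the memory
folder for running project context.
```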
2. Memory — A folder of small text files at
~/.claude/projects/<slug>/memory/ that Claude reads selectively via
an index file (MEMORY.md). Four types: user (who you are),
feedback (lessons from mistakes), project (running context of an
active effort), reference (stable external facts). Feedback and
project are used most. Memory carries policies and standing facts,
not progress.
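For instance, a MEMORY.md index might look like this (the entry
file names are invented):

```markdown
# MEMORY.md
- user.md: who I am and how I like results presented
- feedback.md: lessons from past mistakes; read before editing
- newsletter.md: running context for the newsletter effort
- style-guide.md: stable facts about the house style (reference)
```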
3. Skills — Named procedures stored as Markdown files at
~/.claude/skills/<name>/SKILL.md. The frontmatter has a name,
description (which Claude matches against to invoke
automatically), and tool allowlist. The body is the procedure.
Type /name to invoke explicitly; Claude can also invoke
proactively based on the description. Third-time rule: if a
procedure has been explained three times, it belongs in a skill.
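A sketch of a SKILL.md (the skill and its steps are invented; the
frontmatter keys follow the description above, though exact key
names may differ in your version):

```markdown
---
name: weekly-digest
description: Compile the weekly digest from the inbox folder. Use
  when the user asks for the digest or a weekly roundup.
allowed-tools: Read, Grep, Write
---
1. Read every file in inbox/ modified in the last seven days.
2. Group items by topic; flag anything still unanswered.
3. Write the result to digests/ and show a three-line summary.
```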
4. Hooks — Shell scripts wired to lifecycle events
(SessionStart, PreToolUse, PostToolUse, Stop, etc.) in
settings.json. Run deterministically; exit code 0 allows the
event, non-zero blocks it. Use for rules that must hold every
time — protective guards, context injection, notifications,
audit logs. The test: if the consequence of skipping a rule
once is real, it belongs in a hook, not CLAUDE.md.
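As a testable sketch, here is the decision a protective guard
might make, written as a shell function (the frozen/ path is
invented; a real hook would read the tool's JSON input from stdin
and be wired to PreToolUse in settings.json):

```shell
#!/bin/sh
# Sketch: refuse any tool call that targets a path under frozen/.
# A non-zero exit from the hook script blocks the event.
guard() {
  case "$1" in
    frozen/*|*/frozen/*)
      echo "blocked: $1 is under frozen/" >&2
      return 1   # non-zero: block
      ;;
  esac
  return 0       # otherwise: allow
}
```

Here guard drafts/issue.md succeeds, while guard frozen/config.yml
fails and prints the reason, so the event is blocked.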
5. Subagents — Second copies of Claude with their own definition
files at ~/.claude/agents/<name>.md. Briefed cold (no memory of
your conversation), carry only allowed tools, can run on
different models (haiku/sonnet/opus). Used for: context hygiene
(large outputs stay out of main context), parallelism (N tasks
simultaneously), specialisation (briefed perspective beats
general session). The brief at invocation matters as much as
the definition. Watch for sibling collisions in flat fan-out;
use a manager agent when workers share a design space.
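A sketch of a subagent definition file (the role is invented; the
frontmatter keys mirror the description above):

```markdown
---
name: source-checker
description: Checks every claim in a draft against its cited source.
tools: Read, Grep
model: haiku
---
You are a fact-checker. You receive a draft and a list of sources.
For each claim, find the supporting passage or flag the claim as
unsupported. Report only the flags, one line per flag.
```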
6. Persistence ladder — State at four scopes:
- Todos (in-session, dies with the session)
- Plans (~/.claude/plans/<slug>.md, lives as long as the project)
- Memory (indefinite, across projects)
- Worktrees / sandboxes (parallel branches, until merged or
discarded)
Discipline: pick the smallest scope that fits. Plan-state in
memory poisons future sessions. Conventions in todos evaporate.
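A plan file at the second rung can be as small as this (the slug
and contents are invented):

```markdown
# Plan: newsletter-relaunch
Goal: relaunch the weekly newsletter on the new template.
Done: template chosen; archive migrated.
Next: port the subscriber list; dry-run one issue.
Decisions: keep the old URL scheme (redirects are cheap).
```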
7. Async & notifications — Outbound push (a `notify` wrapper around
your push channel of choice), inbound inbox (a queue Claude pops
at session start), background runs (run_in_background: true on
Bash and Agent calls), Monitor tool for streaming output. Pacing:
under five minutes keeps the prompt cache warm; 5–60 min pays a
cache-miss penalty; longer makes no difference. Substantive notifications
only — no activity noise. A new user message is always an
interrupt, never queued behind a running task.
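A sketch of the `notify` wrapper (NOTIFY_URL stands in for
whatever push channel you use; with nothing configured it falls
back to printing the message):

```shell
#!/bin/sh
# Sketch of a `notify` wrapper around a push channel.
notify() {
  if [ -n "${NOTIFY_URL:-}" ]; then
    curl -fsS -d "$1" "$NOTIFY_URL" >/dev/null  # push to the channel
  else
    printf 'notify: %s\n' "$1"                  # local fallback
  fi
}
```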
8. Multi-agent patterns — Critic loop, planner/executor split,
second opinion, parallel fan-out with synthesis, build-and-grade.
Use when the work needs a perspective a single agent can't take
on itself (independent eye on its own output), when parallelism
matters, or when failure cost greatly exceeds extra-agent cost.
The first move is almost always to sharpen the brief; add an
agent only when sharpening won't fix the issue. Critic roster
should not be a monoculture — include at least one
operator-experience angle.
9. Tool discipline — Dedicated file tools (Read, Edit, Write,
Grep, Glob) over shell equivalents. Parallel calls when work
is independent. Subagent as context buffer for large outputs.
Right host: heavy work belongs on the executor, not the
orchestrator. Deferred tools loaded in bundles via
ToolSearch, not one at a time. Tier for external systems:
dedicated MCP > browser MCP > computer-use.
PATTERNS THAT RECUR ACROSS PILLARS
- Cold briefing: agents that need an independent perspective must
not see prior reasoning.
- Read the diff, not the summary: an agent's confident "done"
report is not verification.
- The third-time test: explain something three times, write it
down (skill, memory, CLAUDE.md, hook — whichever fits).
- Right place for each rule: standing context → CLAUDE.md;
knowledge → memory; procedure → skill; invariant → hook.
ADOPTION ORDER
Week 1: one user memory file, one project memory file, a 10-line
CLAUDE.md, a SessionStart hook for today's date.
Month 1: three skills for the most-repeated procedures, a feedback
file after the first preventable mistake, a plan file for any
multi-session work, one subagent for heavy work.
Quarter 1: a protective hook on a frozen path, an outbound
notification channel, a background run, memory trim.
Year 1: a dozen memory entries, six to ten skills, two or three
subagent definitions, multi-agent critic loops on high-stakes
deliverables.
WHEN NOT TO BOTHER
Don't build memory for a project that ends this week. Don't
write a skill for something done once. Don't write a hook for
a rule no one would ever break. Don't add an agent when
sharpening the brief would fix the problem.
You may now answer the user's questions. Keep responses
concrete and short. When they describe a friction, name the
specific pillar that addresses it and walk them through the
minimal first step. Use analogies from outside software when
useful.
End of guide. Updates from the last ten weeks live in the
updates/ folder — start with updates/index.md
for a rollup, or open updates/week-NN.md for any specific
week.