Chapter 102, Lesson 3

Docs ship in the PR, or they're already wrong

The discipline that keeps documentation accurate: shipping every doc update in the same pull request as the code change that affects it.

A pull request splits the test suite in two, unit tests and end-to-end tests, each with its own command, and renames the script the README tells people to run from pnpm test to pnpm test:unit. The change is correct. CI is green. The PR merges.

The README still says pnpm test.

Three weeks later someone new clones the repo, opens the README to get the project running, and types the command it tells them to: pnpm test. pnpm answers that there is no script by that name. They check that they typed it right. They check their Node version. They re-read the setup steps in case they missed one. Eventually they give up and ask in the team channel, and someone who knows answers in ten seconds: it’s test:unit now. The newcomer lost an afternoon, and they learned something they’ll carry for the rest of their time on the codebase: the README can’t be trusted. Next time it tells them something, they’ll check the code first.

Nothing in that story was a coding mistake. The code change was right. The test split was right. What went wrong is that a PR changed something the README makes a claim about, and the README shipped wrong, in the same repo, on the same day, because nobody asked the question this lesson is about: what did this PR claim, and which docs make a claim about that?

This is the last lesson of the chapter, and it closes out the documentation half of the unit. The previous chapter named the four artifacts a repo maintains from the outside: README, AGENTS.md, ADRs, and source-as-reference. This chapter went inside the source file for two more: TSDoc on the public surface and // why comments on the lines that need one. Every one of those is one merge away from being silently wrong, and this lesson is the discipline that keeps all of them true.

The lesson has four parts. The rule is that a doc ships in the PR that changed the thing it describes. The map tells you which artifact moves with which kind of change. The checklist is the five doc checks a reviewer runs on any diff. And the boundary marks where automation stops: a machine catches the drift it can compare, a reviewer catches the drift only a human can read. The deliverable isn’t a tool. It’s a reflex you run the moment you open a PR, automatic after a quarter of practice.

A wrong doc is worse than no doc

The reason comes before the mechanics, because the map in the next sections is only worth memorizing once you feel the cost of getting it wrong.

Start with the obvious case: a missing doc. A function with no doc comment, a README that doesn’t mention some command, a config value nobody wrote down. A missing doc has a real cost: the reader has to go read the code to learn what the doc would have told them. But that cost is bounded. It’s paid once, it’s paid up front, and the reader knows they’re paying it, because there’s no doc, so they go straight to the source. They never get misled, since there was nothing there to mislead them.

Now the case that looks the same but isn’t: a wrong doc, one that asserts something the code no longer does. This is drift, and it costs the next reader far more than a missing doc ever could. Trace what the README cost the newcomer in the opening:

They read it. (That’s the time a missing doc would have saved them.)
They believed it and acted on it: they ran the command and watched it fail.
They spent an afternoon discovering it was wrong, re-checking their own setup and suspecting themselves before suspecting the doc, because a doc is supposed to be true.
They stopped trusting it. Not just that line, but the whole README and every doc next to it. A doc caught wrong once poisons every doc around it, because now the reader has to verify all of them against the code, which is exactly the work the docs were supposed to save.

That last cost is the one that compounds. A missing doc is a hole; a wrong doc is a trap, and once a reader steps in one, they distrust the whole floor. The following figure puts the two side by side as a ledger of stacking costs. Flip between the tabs and watch the second column run past the first.

Figure: cost ledger with two tabs, “No doc” and “Wrong doc”. The “No doc” tab reads:

Missing doc cost ledger
1. Reader goes and reads the code instead.
Total: one cost, paid once, up front. The reader is never misled.
Bounded, one-time, and honest: the reader knows to go read the code.

The economics make this a reflex rather than a nice idea. Writing the doc update inside the PR, while the change is fresh in your head and the diff is right in front of you, costs about fifteen minutes. Discovering a wrong doc in production costs days: a teammate or a customer hits the wrong claim, traces it, fixes it, and rebuilds their trust in the docs. The asymmetry is so lopsided that “I’ll do it later” is never actually the cheaper option. It only feels cheaper, because the fifteen minutes lands on you now and the days land on someone else later. The PR is where you pay the small cost so nobody pays the large one.

The rule: the doc ships in the PR that changed the thing

Here’s the discipline, in one sentence you can repeat to yourself:

A code change that breaks a doc claim updates that doc claim in the same PR. Not the next PR. Not a follow-up ticket. Not “before release.” The same one.

The reason isn’t tidiness; it’s structural, and it’s the whole argument. Walk a change through the checkpoints it passes on its way to production. You write it. You open a PR. A reviewer, whether a human or a reviewing agent, reads the diff with attention. That is the one moment where someone is deliberately looking at exactly what changed. They approve. It merges. It deploys.

Count the checkpoints where someone looks at the change closely, and there’s only one: code review. After that, the very next checkpoint is production. So a doc the change should have updated has exactly one shot at being noticed, that single review, before it’s wrong in front of real users with nobody looking. The PR is the moment of leverage, because it’s the only moment when the change and the docs that describe it are in front of a person at the same time. Before the PR, the change is still in flight and the docs aren’t wrong yet. After it merges, the docs are wrong and nobody is reviewing them.

That’s also why “I’ll fix the docs in a quick follow-up PR” doesn’t escape the problem; it just narrows it. Picture it: the code PR merges Monday, the doc PR merges Wednesday. Between those two merges, main is self-contradictory, because the code says one thing and the doc says another. On a team that deploys main continuously, that window isn’t theoretical. It’s a live wrong doc in production for two days, available to every reader and every clone. The same-PR rule is the only option where that window has zero width, because the code and its doc land in the same merge or neither does.

The next two sections make this usable: a map of what to update, and a checklist for catching what got missed.

The doc-change map: which artifact moves with which change

When you open a PR, run one question down a short list: for each doc surface, did this diff invalidate a claim that surface makes? The list is the same every time, which is why it becomes automatic. You aren’t deciding which surfaces to check; you’re checking the same seven and answering yes or no on each.

The skill isn’t only knowing which surface to update. It’s knowing which to skip. Most PRs touch one or two of these surfaces and none of the rest, so half the value of the map is learning the “usually doesn’t move” case for each. Recognizing fast that a surface is irrelevant to this change is as much the reflex as catching the one that isn’t. Here are the seven, each with its trigger and its quiet case.

README moves when the local-dev sequence changes, a common-task command changes (the opening anecdote), or a major stack swap happens. Most feature PRs don’t touch it. (You met the thin README last chapter. It’s deliberately small, so the few claims it makes are the ones worth watching.)
AGENTS.md moves when a convention shifts: a new “don’t” rule, a new module added to the repo layout, a build or test command renamed, a tool added to the stack. Feature PRs rarely move it; refactor and infrastructure PRs often do.
ADRs. A new ADR is added in the PR that ships an architectural decision; an existing ADR’s status flips to “superseded” in the PR that overturns it. Both happen in the deciding PR, never in a follow-up, so the decision and its record land together. (This is the one-decision-per-file ADR from last chapter, and its three-test inclusion bar still applies: not every change is a decision worth an ADR.)
TSDoc moves when an exported function’s signature, contract, side effects, or failure modes change: adding @throws for a new error path, updating @param, marking @deprecated, or just refreshing the summary sentence so it still describes what the function does. All of it goes in the PR that changed the code. (This is TSDoc on the public surface. The cut for which functions earn a block hasn’t changed; this is about keeping the blocks you already wrote honest.)
Inline // why comments move when the why changes. The constraint got fixed upstream, so the comment is now a fossil comment and the workaround it guards should go with it; or the constraint got promoted into enforcement, so the comment dies and a type, test, or transaction takes its place. The comment travels with the lines it explains, or it dies with them; it never outlives the reason it existed. (This is the carry-or-promote reflex from commenting the why.)
Schema header comments. The one-paragraph header on a pgTable declaration moves when the table’s purpose, scope, or invariants change. A new column usually doesn’t trigger it, but a new invariant does: a uniqueness rule, a tenancy constraint, a new lifecycle the table now enforces. (The schema-header pattern is from last chapter.)
.env.example moves whenever env.ts adds, removes, or renames a key. These two files are siblings: env.ts is the validated source of truth, and .env.example is the human-readable hint a new developer copies to get started. An env.ts change without a matching .env.example change is an incomplete PR.

That last pair is worth a second look, because it shows why the human checklist exists at all. env.ts is enforced: a missing required variable fails pnpm build, so the code can’t ship with env.ts out of sync with reality. But .env.example is just a text file, and nothing validates it. Add RESEND_API_KEY to env.ts and forget the example, and the build stays green while the next developer who clones the repo has no idea the variable exists until something fails at runtime. The build catches a missing var; nothing catches a stale example. That gap, enforced on one side and hint-only on the other, is exactly the kind of drift only a person reading the diff will catch.

The whole map compresses into one picture. The following diagram puts the kinds of code change on the left and the doc artifacts each one moves on the right; the edges are the reflex itself, drawn out. The goal is that after you’ve seen it, opening a PR fires the right edge automatically: renamed a command lights up AGENTS.md and README, changed a signature lights up TSDoc, and so on.

flowchart LR
  cmd["Renamed a command"]
  env["Added / renamed<br/>an env var"]
  sig["Changed an<br/>exported signature"]
  adr["Made an<br/>architectural decision"]
  why["Fixed the bug a<br/><code>// why</code> guarded"]
  inv["Changed a<br/>table's invariant"]
  conv["Added a convention<br/>/ new module"]

  readme["README"]
  agents["AGENTS.md"]
  envfiles["<code>.env.example</code> + <code>env.ts</code>"]
  tsdoc["TSDoc"]
  adrdoc["ADR<br/>(new / superseded)"]
  whydoc["<code>// why</code> comment<br/>(moves / dies)"]
  schema["Schema header"]

  cmd --> readme
  cmd --> agents
  env --> envfiles
  sig --> tsdoc
  adr --> adrdoc
  why --> whydoc
  inv --> schema
  conv --> agents

  class cmd,env,sig,adr,why,inv,conv change
  class readme,agents,envfiles,tsdoc,adrdoc,whydoc,schema artifact
  classDef change fill:#dbeafe,stroke:#1d4ed8,color:#111,stroke-width:2px
  classDef artifact fill:#bbf7d0,stroke:#15803d,color:#111,stroke-width:2px

Open a PR, run down the left column, and follow the edges. Most PRs light up one or two artifacts and leave the rest dark, and knowing which to skip is half the reflex.

One edge tempts people the most, and it’s the one that separates following the map from understanding it: a plain new column on a table does not move the schema header. Adding a nullable feature_flag_enabled column for a flag is just more shape. The table’s purpose, scope, and invariants are unchanged, so the header that describes them is still true. The header moves only when an invariant changes, which is when the new column comes with a uniqueness rule, a tenancy constraint, or a new rule the table now enforces. The cut is purpose and invariants, never field count.

Now drill the map. Match each code change on the left to the doc artifact that has to move with it.

Match each code change to the doc artifact that must move with it in the same PR.

Renamed getInvoice and added a new @throws case → Its TSDoc block
Added RESEND_API_KEY to env.ts → .env.example
Decided to move from polling to webhooks → A new ADR
Deleted a setTimeout whose race was fixed upstream → The fossil // why comment beside it
Renamed the lint command from lint to check → AGENTS.md
Added a unique constraint for a new tenant invariant → The schema header comment

The reviewer’s doc checklist

Everything so far has been from the author’s seat: I’m opening a PR, which docs did my change touch? Now flip to the reviewer’s chair. The author’s reflex becomes the reviewer’s pass, a short, fixed list of doc checks to run against any diff. It’s the same map you just learned, read from the other side of the PR. The author asks “which docs did I move?”; the reviewer asks “did they move the docs they should have?”

This is the structural enforcement the last two lessons kept pointing forward to. TSDoc taught you to write a good doc block; comments taught you to carry a comment through a refactor or promote it. Both said the thing that keeps those docs accurate is review, and this is it. Here are the five checks, run in order on the diff:

Signatures. Did any exported function’s signature, contract, or set of thrown errors change? If so, did its TSDoc update to match? A new parameter with no @param, a new error path with no @throws, a summary sentence that now describes the old behavior: all drift.
Env vars. Did any environment variable get added or renamed? If so, did env.ts and .env.example both update? The build enforces one side; you’re the only thing enforcing the other.
Conventions and layout. Did any convention or repo-layout fact change, such as a renamed command, a new module, or a new rule? If so, did AGENTS.md update? This is the surface that rots most quietly, because no single small change feels like it touches it.
Decisions. Did this PR make an architectural decision, or overturn one? If so, is there a new ADR, or a status flip on the old one? A cross-cutting pattern introduced with no ADR is a decision nobody recorded.
Stripped comments. Was a // why comment removed in a refactor? If so, was the constraint it protected either preserved in a moved comment or upgraded to enforcement? If the comment is just gone and the constraint is gone with it, that’s a bug walking back in.

Check five is the subtle one, and it’s the reason “comments are part of the code” was a whole section of the last lesson. The first four checks ask you to spot something that’s in the diff: a changed signature, a new env var, a new pattern. Check five asks you to spot something that’s missing from it, a comment that used to be there and isn’t anymore. Noticing an absence in a diff is the hardest thing a reviewer does, because nothing on the screen draws your eye to a deleted line. You have to read the minus lines as carefully as the plus ones. Remember the deleted setTimeout from the last lesson: the load-bearing sleep that got tidied away with no comment to explain why it was there, and surfaced as a flaky production bug a week later. A // why line beside it would have turned that silent deletion into one a reviewer could question, and this is the checkpoint where that question gets asked. A reviewer running check five sees a comment vanish, stops, and asks what it was protecting, before the bug it was holding back can ship.
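To make “upgraded to enforcement” concrete, here is a minimal sketch in which a // why comment about integer cents is replaced by a branded type and a runtime check. The names Cents, toCents, and applyDiscount are hypothetical and are not taken from this lesson’s codebase; the point is only the shape of the change a reviewer should expect to see when a // why line disappears.

```typescript
// Before (hypothetical): the constraint lived only in a comment.
//
//   // why: amounts are integer cents; fractional values caused rounding drift
//   const total = subtotal - discount;
//
// After: the constraint is enforced, so the comment can be deleted safely.

// A branded number: a plain number is not assignable to Cents without toCents.
type Cents = number & { readonly __brand: "Cents" };

// The only way to create a Cents value; rejects anything that is not a whole number.
function toCents(value: number): Cents {
  if (!Number.isInteger(value)) {
    throw new RangeError(`Expected integer cents, got ${value}`);
  }
  return value as Cents;
}

// Callers can no longer pass a raw number here; the compiler asks for Cents.
function applyDiscount(subtotal: Cents, discount: Cents): Cents {
  return toCents(subtotal - discount);
}

const discountedTotal = applyDiscount(toCents(1050), toCents(150));
```

A reviewer who sees the comment vanish in a diff like this can confirm that the constraint survived, as a type and a check rather than as prose. If the comment is gone and nothing like toCents replaced it, that is the moment to ask what it was protecting.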

Now sit in the reviewer’s chair for real. The following is a small pull request across four files. Review it the way you’d review a teammate’s: read the diff, find each place where a doc no longer matches the code it describes, and name the drift. There are four defects, each caught by a different check from the list above.

src/lib/billing/charge.ts

/**
 * Charges a finalized invoice through Stripe and records the result.
 *
 * @param invoiceId - the invoice to charge
 * @throws when the invoice is not in the `finalized` state
 */
export const chargeInvoice = async (invoiceId: string): Promise<Result<Charge>> => {
  const invoice = await getInvoice(invoiceId);
  if (invoice.status !== 'finalized') return err('not_finalized', 'Invoice must be finalized before charging.');
  if (invoice.amountCents > org.chargeLimitCents) {
    throw new ChargeLimitError(invoice.amountCents);
  }
  return ok(await stripe.charge(invoice));
};

src/env.ts

export const env = createEnv({
  server: {
    DATABASE_URL: z.url(),
    STRIPE_SECRET_KEY: z.string().min(1),
    RESEND_API_KEY: z.string().min(1),
  },
  // …
});

src/lib/invoices/finalize.ts

export const finalizeInvoice = async (input: FinalizeInput) => {
  // Order matters: the audit row must commit before the receipt enqueues,
  // or a crash between the two loses the audit but still sends the email.
  await writeAuditRow(input);
  await enqueueReceiptEmail(input.invoiceId);
  await persistInvoiceResult(input);
  return ok();
};

src/lib/notifications/dispatch.ts

export const dispatch = async (event: NotifiableEvent) => {
  await sendEmail(event);
  await sendInbox(event);
  // every channel now also mirrors to the new webhook fan-out service;
  // downstream teams subscribe instead of us pushing per-integration
  await sendWebhookFanout(event);
};

That fourth plant, a decision shipped with no ADR, raises a question the checklist doesn’t answer: what does a reviewer do when a PR is incomplete like this? The mechanism is a block. The reviewer declines to approve until the missing doc ships, because a PR that changes a contract without updating its doc isn’t done. How to block well is the whole subject of the next chapter: the language of it, when to block versus suggest, and how to phrase it so it lands as help instead of an obstacle. Here, just hold the shape: an incomplete PR gets blocked, and the doc is part of what makes it complete.

Where automation stops and review begins

There’s an obvious objection forming by now: if doc drift is this mechanical, why is a human the load-bearing check? Why not just add a CI rule and be done?

Because some drift a machine can compare and some it can’t, and the line between the two is what this section is about. Get it wrong in the cheap direction, assuming “tooling will catch it,” and you stop reading diffs, so the half of drift that no tool can see ships unguarded.

Here’s the split, with concrete cases on each side.

Mechanical drift, which is automatable. This is drift a machine can detect by comparing two things for a structural match:

The cleanest example is a .env.example-vs-env.ts key-parity check. The two files must declare the same set of keys. That’s a set comparison: a script computes both key sets and fails if they differ. Pure mechanics, zero judgment.
A doc-comment linter flagging a @param that names an argument the function no longer has (eslint-plugin-jsdoc’s check-param-names rule, for one), or a malformed tag (eslint-plugin-tsdoc’s syntax rule). The tool isn’t reading meaning; it’s matching tag names against the signature or the tag grammar.
A test that imports a Server Action and exercises its success and failure paths. When the contract changes in a way the test pins, the test goes red, and the behavior change announces itself.

Each of those works because there’s a structural fact to compare: key set against key set, tag name against parameter name, asserted behavior against actual behavior. No interpretation required.
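The key-parity check can be sketched as a pure function over two inputs: the list of keys env.ts declares and the text of .env.example. This is a minimal sketch under stated assumptions, not a drop-in CI step. The helper names parseEnvExampleKeys and checkEnvParity are hypothetical, and how the schema keys get extracted from env.ts depends on the validation library, so that part is left out and the keys are passed in directly.

```typescript
// Hypothetical helper: collect the variable names declared in .env.example text.
function parseEnvExampleKeys(exampleText: string): Set<string> {
  const keys = new Set<string>();
  for (const rawLine of exampleText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue; // skip blanks and comments
    // Accept `KEY=value` and `export KEY=value`; anything else is ignored.
    const match = /^(?:export\s+)?([A-Za-z_][A-Za-z0-9_]*)\s*=/.exec(line);
    const key = match?.[1];
    if (key !== undefined) keys.add(key);
  }
  return keys;
}

type ParityReport = {
  missingFromExample: string[]; // declared in env.ts, absent from .env.example
  staleInExample: string[]; // present in .env.example, no longer in env.ts
};

// Hypothetical check: a plain set comparison, with no judgment involved.
function checkEnvParity(schemaKeys: readonly string[], exampleText: string): ParityReport {
  const exampleKeys = parseEnvExampleKeys(exampleText);
  const schemaSet = new Set(schemaKeys);
  return {
    missingFromExample: schemaKeys.filter((k) => !exampleKeys.has(k)).sort(),
    staleInExample: Array.from(exampleKeys).filter((k) => !schemaSet.has(k)).sort(),
  };
}

// Usage: RESEND_API_KEY was added to the schema but not to the example file.
const envParityReport = checkEnvParity(
  ["DATABASE_URL", "STRIPE_SECRET_KEY", "RESEND_API_KEY"],
  "# local development\nDATABASE_URL=postgres://localhost/app\nSTRIPE_SECRET_KEY=\n",
);
// envParityReport.missingFromExample is ["RESEND_API_KEY"]
```

A CI step built on something like this would fail whenever either array is non-empty. Nothing in it reads meaning, which is what makes it a fair anchor for the mechanical side of the split.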

Semantic drift, which is review-only. This is drift where being right requires reading intent against behavior, and no linter can do that:

Does the TSDoc summary sentence still describe what the function does? The tags can all be present and well-formed while the sentence quietly describes last quarter’s behavior. A machine sees a valid summary; only a reader sees that it’s wrong.
Does the README’s “getting started” prose still match the actual steps? The commands might all exist, so a parity check passes, while the order or the explanation is now wrong.
Is a // why comment still true, or is it a fossil for a bug that’s since been fixed? The comment is syntactically fine; it’s just describing a world that no longer exists.
Did this PR’s new behavior warrant an ADR that nobody wrote? No tool can decide a change was architecturally significant.

None of those have a structural fact to compare. They require someone who knows what the code is supposed to do, reading the doc against that. That someone is the reviewer.
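The first case in that list is easy to picture with a small example. The sketch below is hypothetical throughout (InvoiceRecord, invoiceStore, and invoiceTotalCents are invented for illustration): every tag is well-formed and matches the signature, so a tag-level linter would have nothing to report, yet the summary sentence describes behavior the function no longer has.

```typescript
// Hypothetical invoice shape and in-memory store, for illustration only.
type InvoiceRecord = { id: string; grossCents: number; discountCents: number };

class InvoiceNotFoundError extends Error {}

const invoiceStore = new Map<string, InvoiceRecord>([
  ["inv_1", { id: "inv_1", grossCents: 1000, discountCents: 100 }],
]);

/**
 * Returns the gross invoice total in cents, before discounts.
 *
 * @param invoiceId - the invoice to total
 * @throws InvoiceNotFoundError when no invoice has this id
 */
function invoiceTotalCents(invoiceId: string): number {
  const record = invoiceStore.get(invoiceId);
  if (!record) throw new InvoiceNotFoundError(invoiceId);
  // A later change started netting out discounts here. The tags above still
  // match the signature, but the summary sentence now describes old behavior.
  return record.grossCents - record.discountCents;
}

const staleDocTotal = invoiceTotalCents("inv_1"); // net of the discount, not gross
```

An honest summary would say the total is net of discounts. The only way to notice the mismatch is to read the sentence against the return statement, which is the reviewer’s job.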

So here is the threshold to carry out of this section: lint catches the drift a machine can compare; the reviewer catches the drift only a human can read. The env-parity check is the cleanest anchor for the mechanical side: it is the one drift class that’s a set equality and nothing else, which is exactly why it’s the easiest to automate and the least interesting to a reviewer. The further you get from “two sets must be equal,” the more the check belongs to a person.

Move the reflex into the workflow

A reflex is worth nothing if it evaporates the first busy week. Two pieces of lightweight scaffolding keep it alive. They aren’t the teaching, just the rails that hold the discipline in place when nobody has time to be disciplined.

The PR template

A pull-request template is a markdown file at a known path that GitHub uses to pre-fill the description box every time someone opens a PR. Put two checkboxes in it and the author has to look at the doc surfaces at the exact moment of leverage, PR-open time, before they request review.

## Docs

- [ ] I updated the docs affected by this change (TSDoc, `// why`, README, AGENTS.md, ADR).
- [ ] If I added, removed, or renamed an env var, `env.ts` and `.env.example` match.

Keep it to two lines, not an essay. A template that runs to a page of checkboxes gets approved unread, which is worse than no template, because now everyone is trained to tick boxes without looking. Keep it short enough that ticking it honestly is faster than ticking it dishonestly.

The obvious failure mode is worth naming: the author ticks the box without doing the work. The template is a prompt, not enforcement; it can’t make anyone actually update a doc. That’s fine, because it was never meant to be the enforcement. The reviewer’s five-check pass from two sections ago is what catches a box that was ticked falsely. The box and the checklist are two halves of one mechanism: the box is the author-side prompt that makes them look, and the checklist is the reviewer-side verification that they actually did. Neither works alone; together they close the loop.

The quarterly meta-doc review

There’s a class of drift the per-PR reflex can’t catch, no matter how disciplined everyone is. Some docs rot without any single PR touching them. The README’s “getting started” sequence and the AGENTS.md conventions list drift slowly: a dozen small changes each leave them a little more out of date, and no one of those changes is wrong enough, or close enough to the doc, to trip the per-PR check. Each PR is individually clean; the doc is collectively stale. Per-PR review structurally can’t see it, because there’s no single PR to point at.

The counter isn’t a better review; it’s a cadence. Every quarter or so, someone follows the README from a genuinely clean clone, fresh machine state and no assumptions, and writes down every place it deviates from reality. They do the same for the AGENTS.md conventions, reading them against what the codebase actually does now. State the cadence once, schedule it like any other recurring engineering chore, and it works.
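One slice of that walk-through is mechanical enough to script: whether each pnpm script the README names still exists in package.json. The sketch below is a rough aid under loud assumptions, not a replacement for the clean-clone pass. It assumes commands sit on their own line (as they would inside a code block), the list of pnpm built-in subcommands is partial, and the function names and sample script values are hypothetical.

```typescript
// pnpm subcommands that are not package.json scripts (a partial, assumed list).
const PNPM_BUILTINS = new Set(["install", "i", "add", "remove", "update", "dlx", "exec", "create", "run"]);

// Collect script names from lines shaped like `pnpm <name>` or `pnpm run <name>`.
function scriptsMentionedInReadme(readmeText: string): string[] {
  const found = new Set<string>();
  const pattern = /^\s*(?:\$\s+)?pnpm\s+(?:run\s+)?([A-Za-z][\w:-]*)/gm;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(readmeText)) !== null) {
    const scriptName = match[1];
    if (scriptName !== undefined && !PNPM_BUILTINS.has(scriptName)) found.add(scriptName);
  }
  return Array.from(found).sort();
}

// Report README script names that package.json no longer defines.
function findStaleReadmeScripts(readmeText: string, packageScripts: Record<string, string>): string[] {
  return scriptsMentionedInReadme(readmeText).filter(
    (scriptName) => !Object.prototype.hasOwnProperty.call(packageScripts, scriptName),
  );
}

// Usage: the README from the opening anecdote, after the test split.
const readmeSample = ["## Getting started", "", "pnpm install", "pnpm dev", "pnpm test", ""].join("\n");
const scriptsAfterRename = { dev: "next dev", "test:unit": "vitest run", "test:e2e": "playwright test" };
const staleReadmeScripts = findStaleReadmeScripts(readmeSample, scriptsAfterRename);
// staleReadmeScripts is ["test"]
```

A check like this only confirms that names exist. Whether the steps are in the right order, and whether the prose around them still explains the right thing, remains work for the person following the README from a clean clone.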

The anti-pattern is what happens to every recurring chore that isn’t defended: “everyone’s too busy this quarter,” and then the quarter after, and the meta-docs rot in exactly the slow, invisible way the cadence existed to prevent. The quarterly review has to be defended on the calendar the same way you’d defend dependency upgrades or on-call rotation, not because it’s urgent any given week, but because it’s the only thing that catches the rot nothing else can.

One small connection is worth noting: if your team uses conventional commits, a line in the commit body can flag that the PR touched docs, and changelog tooling picks it up. It’s useful but not load-bearing. The discipline is the PR review, not the commit prefix. Don’t mistake the prefix for the protection.

The reflex, and the chapters it closes

Everything in this lesson, the map, the checklist, the boundary, the template, and the cadence, is scaffolding around a single question. Pull it out and carry it, because after a quarter of running it on every PR you open, it fires on its own:

Before you request review, ask: what did this PR claim, and which docs make a claim about that?

The list of docs is the same seven surfaces every time, and once the question is a habit it takes seconds. It’s the entire payload of this lesson: the map tells you where to look, the checklist is the same question from the reviewer’s seat, and the rest keeps the habit alive when you’re busy. The reflex is the deliverable.

This also closes the documentation half of the unit, so step back and see the shape these two chapters were building toward. Four ideas, each a lesson’s worth of work, stack into one posture:

Docs live next to the truth, in the repo and beside the code, where they can’t be forgotten in a wiki nobody opens.
You link instead of duplicating, so a doc can’t drift from the source it points at, because it doesn’t store a second copy of anything.
Volume tracks value, through the public-surface cut for TSDoc and the why-not-what cut for comments, so the docs that exist are the ones worth reading and the signal isn’t drowned in noise.
And the lock on all of it: the doc ships with the change that affects it, or it’s already wrong.

The first three keep docs worth reading. This last one keeps them true, and it’s the load-bearing one, because without it the other three decay: a doc placed next to the truth drifts away from it on the next merge, a link goes stale, a perfectly cut TSDoc block ends up describing a function that no longer behaves that way. With this discipline, the three compound instead. Every accurate doc makes the next reader trust the docs more, which makes them read them, which makes them worth maintaining.

One last property is worth stating plainly: in 2026, a repo’s docs are read by whoever, or whatever, edits the code next. A stale AGENTS.md or a wrong TSDoc steers the next change wrong before a human ever looks at it. That raises the stakes on accuracy, but it doesn’t change the discipline at all. It’s the same discipline a careful team has always run: the doc ships in the PR, or it’s already wrong. The only thing that’s changed is that it’s no longer optional.

The next chapter picks up where the reviewer’s pass left off. It covers the whole craft of reviewing a pull request: what to look for beyond docs, when to suggest versus when to block, and the language that makes a review land as help instead of an obstacle. You’ve been doing review from one narrow seat this whole lesson. Next, you take the whole chair.

External resources

A few resources are worth bookmarking: the philosophy this whole unit rests on, the reviewer’s-pass guidance read from a different angle, and the file you’ll wire up to put the reflex into a real repo.

Docs as Code (writethedocs.org)
Write the Docs' canonical writeup of the philosophy behind this unit: docs live in the repo, ship in the PR, and get reviewed like code.

What to look for in a code review (google.github.io)
Google's engineering-practices guide: the reviewer's pass widened past docs, including a dedicated documentation check.

Creating a pull request template (docs.github.com)
GitHub's docs on the .github/pull_request_template.md file and where it lives, for the two-checkbox version.