Chapter 86, Lesson 3

Coverage as a diagnostic, not a target

Reading Vitest coverage reports as a tool for locating untested code, not a score to chase.

The last lesson left you with a clear standard: the bar is not “we have tests,” it is “do the tests fail on the bugs that ship.” A green suite that sails past the cross-tenant leak hasn’t earned the confidence it projects; it has manufactured it. So you need an instrument that tells you whether the suite is real, and the one everyone reaches for is coverage. It is also the single most misread number in testing.

Here is the misreading, made concrete. The report says 87% lines, 72% branches, and two engineers look at it. The first reads it as a single grade: 87% is a passing number, so set the CI gate at “≥80% lines” and push the next release toward 100%. The second ignores the average entirely and asks which files, which lines, because that 87% is hiding the case where the webhook receiver sits at 40% and a one-line getter sits at 100%. This lesson teaches the second read. By the end you’ll have a coverage block on the vitest.config.ts from the first lesson, with thresholds on the surfaces where coverage means something and an include that drags untested files into the light. The config is the small part. The larger goal is learning to read the report the way an experienced engineer does: as a diagnostic that locates the untested seam before it ships, not a target to optimize toward.

What the coverage report actually measures

Before you can judge how to read coverage, you need a plain definition of what it is, because the whole “diagnostic, not target” idea rests on it.

Coverage measures which parts of your source ran while the suite executed. That is all it measures. The report breaks “which parts” into four numbers: lines (which source lines executed), statements (which statements ran, close to lines, but several statements can share a line), functions (which functions were called at least once), and branches (which sides of each decision were taken). All four are lenses on a single fact: what got reached.

Hold onto one sentence, because the rest of the lesson turns on it. Coverage records what executed, never what was checked. A line can run inside a test that asserts nothing at all and still count as covered: the line ran, so it shows green, and the report has no idea whether anything downstream was verified. Keep that gap in mind, because we come back to it.
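A minimal sketch of that gap, with no test runner involved. The function applyDiscount, its deliberate bug, and the hand-rolled executed set are all hypothetical stand-ins: the set plays the role of the coverage collector, and the bare call plays the role of a test with no assertion.

```typescript
// Hypothetical stand-in for a coverage collector: records which functions ran.
const executed = new Set<string>();

// Hypothetical function with a deliberate bug: it adds the discount instead of subtracting it.
function applyDiscount(price: number, percent: number): number {
  executed.add('applyDiscount');
  return price + (price * percent) / 100;
}

// A "test" that calls the code and checks nothing. It cannot fail.
function assertionFreeTest(): void {
  applyDiscount(100, 10);
}

assertionFreeTest();

// The collector now reports the function as covered...
const covered = executed.has('applyDiscount');

// ...while the behavior is wrong: 10% off 100 should be 90, and this returns 110.
const actual = applyDiscount(100, 10);
```

The function shows up as covered and the bug is still there. Only an assertion on the returned value would have turned the run red.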

The numbers come from the @vitest/coverage-v8 provider you installed in the first lesson. It reads the coverage data that V8 already produces during a normal run, so there is no instrumentation step rewriting your source first. That makes it fast, and you already have the dependency. The alternative provider, @vitest/coverage-istanbul, exists for the rare case where V8’s output doesn’t fit your needs, but provider: 'v8' is the default, so the choice is already made for you.

Turning the report on is a small addition to the config you already have. A provider and a reporter are enough to produce one:

// inside test: { ... }
coverage: {
  provider: 'v8',
  reporter: ['text', 'html'],
},
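For orientation, here is a sketch of where that block sits in a complete file. It assumes the vitest.config.ts from the first lesson uses defineConfig from vitest/config; whatever other test options you already have stay alongside coverage and are omitted here.

```typescript
// vitest.config.ts (sketch; existing test options are assumed and omitted)
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    coverage: {
      provider: 'v8', // reads the coverage data V8 already collects
      reporter: ['text', 'html'], // terminal summary plus a browsable report
    },
  },
});
```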

Run it with the pnpm test:coverage script from the first lesson. Of the four numbers it prints, one is worth reading first, and it isn’t the one most people look at.

Read branch coverage before line coverage

The most common way to misread coverage is to treat the line percentage as the headline number. That is the wrong instinct, because line coverage is the lens least connected to where bugs live.

The distinction is sharp once you see it. Line coverage rewards executing every line. Branch coverage rewards taking every path through every decision: each side of an if, each case, each side of an && or a ?: that can short-circuit, each catch that may or may not fire. The two come apart constantly, because a single test can run every line of a function while only ever taking one side of its decisions.

Take a guard you’ve written a dozen times by now, here as a one-line early return:

export function loadReport(role: Role) {
  if (role !== 'admin') return forbidden();
  return buildReport();
}

Write one test that calls loadReport('admin'). It runs the function declaration, evaluates the if, and runs the return buildReport(), so every line executes and line coverage reports 100%. But the test only ever took the false side of role !== 'admin'. The forbidden() call shares a line with the condition, so that line counts as run even though the call never fired. Branch coverage reports 50%, and the branch it tells you is missing is the authorization denial, exactly the path whose absence ships as a bug. Spread the same guard across three lines and the return forbidden() line would show as missed, which tells you something uncomfortable: the line number moves with formatting, while the branch number does not. This is the branch the fail-closed reflex from the errors-and-security work cares about most: every doubt is a deny, and the deny is the branch line coverage is blind to.

The following figure puts the two readings side by side so you can see why they diverge. The same function, the same single test: one column counts lines, the other counts branches.

Figure: loadReport.ts under one test, loadReport('admin')

| Lines | loadReport.ts | Branches |
|---|---|---|
| ✓ | 1 export function loadReport(role: Role) { | |
| ✓ | 2   if (role !== 'admin') return forbidden(); | false ✓ true ✕ |
| ✓ | 3   return buildReport(); | |
| | 4 } | |
| 100% | | 1 / 2, 50% |

Legend: ✓ covered, ✕ missed.

Every line ran; only one side of the decision did. Line coverage says 100%, branch coverage says 50%, and the missing branch is the denial.

So the rule worth keeping is this: read branch coverage first. Line coverage hovering near 100% with branch coverage well below it is the signature of a suite that calls the code but doesn’t test its decisions, and the decisions are where the seam bugs are.
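Closing that gap takes one more call, the one that takes the other side of the decision. The sketch below is self-contained so it runs without the app: the Role members, the Result shape, and the bodies of forbidden() and buildReport() are assumptions made for illustration, and in a real suite each check would live in its own it() block with an assertion on the result.

```typescript
// Assumed shapes for illustration; the real Role and helpers live in the app.
type Role = 'admin' | 'member';
type Result = { status: 403 } | { status: 200; body: string };

const forbidden = (): Result => ({ status: 403 });
const buildReport = (): Result => ({ status: 200, body: 'report' });

function loadReport(role: Role): Result {
  if (role !== 'admin') return forbidden();
  return buildReport();
}

// Takes the false side of the guard: the only path the single original test reached.
const allowed = loadReport('admin');

// Takes the true side of the guard: the denial branch that branch coverage flagged as missing.
const denied = loadReport('member');
```

With both calls in the suite, each side of role !== 'admin' has been taken once, and asserting on denied.status is what turns the run into a check of the denial.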

Why 100% coverage is theatre

If branch coverage is the number to read first, you might reasonably conclude the goal is to push every number to 100%. It isn’t, and treating it as one is the most expensive mistake in the chapter. A 100%-coverage badge is a yellow flag, not a gold star.

Start with the cost, because it is the obvious problem. Hitting 100% means writing a test for every getter, every defaulted parameter, every error class’s name, and every branch the framework injects that your code never chooses. The effort is enormous and the bug-finding signal is close to zero: you are spending real hours to turn green a body of code nobody was ever going to break.

The deeper problem is corruption. Chasing the number rewards tests that exercise code paths instead of behaviors, which is where the earlier sentence pays off: coverage measures what ran, not what was checked. A 100% number is therefore fully achievable by a suite that asserts almost nothing. Call each function with a fixture, let the lines tick over, and never check the result. Worse than wasting effort, a 100% culture quietly converts your test code from a contract about behavior into a mirror of the source, and once the team senses the tests only echo the implementation, they stop trusting them. You have paid for a suite and bought yourself noise.

It helps to recognize the concrete shapes these tests take, because they all pass and they all look like work:

it('exports the function', () => expect(typeof fn).toBe('function')) confirms the function exists but asserts nothing about what it does.
A snapshot that captures whatever the function returned today and asserts nothing about behavior.
A test that checks a function’s return type matches its declared type. The type system already guarantees that, as you saw back in the TypeScript work, so the test is redundant the moment it compiles.
A test that mocks every dependency and then asserts the mocks were called with the values the test itself just wired in. It tests the wiring, not the function. This is why the conventions forbid mocking Drizzle, Stripe, and Resend: a test built on those mocks can only ever confirm its own setup.

A single question separates these from real tests, and it is worth carrying everywhere: what would have to change for this test to fail meaningfully? If the only honest answer is “delete the test,” it is theatre. A real test fails when the behavior breaks; a theatre test fails only when it is removed.

Try it on the file below. You are reviewing a teammate’s new test file, and most of it is theatre: three of these four tests add coverage but no signal. They pass, they tick lines green, and they fail only if you delete them. For each test you would flag, name why it is theatre, not just that it is.

src/lib/billing.test.ts

import { expect, it, vi } from 'vitest';
import { formatPlan, chargeCustomer } from './billing';

it('exports formatPlan', () => {
  expect(typeof formatPlan).toBe('function');
});

it('returns a string', () => {
  const result = formatPlan('pro');
  expect(typeof result).toBe('string');
});

it('formats the pro plan label', () => {
  expect(formatPlan('pro')).toBe('Pro plan');
});

it('charges the customer', () => {
  const stripe = { charge: vi.fn() };
  chargeCustomer(stripe, 4200);
  expect(stripe.charge).toHaveBeenCalledWith(4200);
});

expect(typeof formatPlan).toBe('function') checks that the export exists and nothing else. It passes for any implementation — even formatPlan = () => undefined. The only change that turns it red is deleting the export. It runs the import line and ticks coverage, but it verifies zero behavior.

A real test names the behavior:

expect(formatPlan('pro')).toBe('Pro plan');

There is a tool that measures the thing coverage can’t, and it is worth knowing as a mental model even though you won’t install it. Mutation testing (the tool is Stryker) flips operators in your source and asks whether any test notices. That is the clean way to hold coverage’s limitation in your head: coverage measures what ran; mutation measures what’s checked. Stryker is overkill for most SaaS suites, so take the idea and leave the tool. When you ask “what would have to change for this test to fail,” you are running the mutation test in your head.
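To make the mental model concrete, here is a hand-rolled sketch of the idea. It is not Stryker’s API: the function isOverLimit, the single hand-written mutant, and the modeling of a test as a function that returns true when it passes are all assumptions chosen to keep the example small.

```typescript
// An implementation under test and a hand-written "mutant" of it.
type Impl = (count: number, limit: number) => boolean;
const isOverLimit: Impl = (count, limit) => count > limit;
const mutant: Impl = (count, limit) => count >= limit; // operator flipped: > became >=

// A test is modeled as a function that returns true when it passes.
type Test = (impl: Impl) => boolean;

// Theatre: runs the code and checks only the return type.
const theatreTest: Test = (impl) => typeof impl(10, 5) === 'boolean';

// Real: pins the behavior at the boundary, where the mutation changes the answer.
const boundaryTest: Test = (impl) => impl(5, 5) === false && impl(6, 5) === true;

// A mutant "survives" when the test passes against both the original and the mutant.
const mutantSurvives = (test: Test): boolean => test(isOverLimit) && test(mutant);

const theatreResult = mutantSurvives(theatreTest); // true: nothing noticed the flip
const boundaryResult = mutantSurvives(boundaryTest); // false: the boundary test caught it
```

Both tests execute the same line, so coverage scores them identically. Only the one that would fail under the flipped operator is checking anything.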

Reading the report to find the under-tested seam

If 100% isn’t the goal and theatre tests are the trap, the fair question is what coverage is actually for. This is where it earns its keep. Read correctly, the report is the fastest way to locate load-bearing code paths your suite never exercised, and those paths cluster at the seams.

Flip your mental model of the report. It isn’t a scoreboard, it is a map of un-exercised paths, and you read a map by location, not by average. The diagnostic move is to scan the per-file breakdown for under-coverage in the seams. The catalog you built last lesson is exactly the list to scan: authedAction’s catch branch, the webhook receiver’s signature-failure path, safeLimit’s fail-open carve-out, the cross-tenant 404 branch, the error mapper’s fallback case. Each uncovered branch there is a line that a missing test will ship as a production bug.

Here is the contrast that makes the whole lesson land. A 100%-covered /lib mapper sitting next to a 40%-covered Server Action wrapper is the clearest signature of a suite that ships bugs: all the testing effort pooled at the safe base, none at the dangerous seam. That is the coverage-report restatement of last lesson’s pyramid failure: the effort went where the diagram said the bugs were, and the bug went where the diagram wasn’t looking. The average hides it; the per-file view makes it impossible to miss.

The figure below is what that view looks like. Don’t read it like a scoreboard. Your eye should go straight to the rows in red, and within those rows, to the gap between the line and branch columns.

| File | Lines | Branches | Uncovered lines | Status |
|---|---|---|---|---|
| lib/money.ts | 100% | 100% | none | healthy |
| lib/invoice-mapper.ts | 96% | 92% | 41, 58 | healthy |
| lib/auth/authed-action.ts | 100% | 45% | 22–29 | under-covered seam (red) |
| app/api/webhooks/stripe/route.ts | 40% | 30% | 18–52 | under-covered seam (red) |
| lib/rate-limit.ts | 88% | 52% | 31–37 | under-covered seam (red) |

Headline average: 85% lines, 64% branches. It is the one number that tells you nothing.

authed-action.ts: 100% lines, 45% branches. The denial branches never ran. This is the trap line coverage hides.

stripe/route.ts: your highest-stakes seam, your least-covered file.

Don't read the average. A healthy-looking headline number across this table would tell you nothing. Read which seams are red.

So how do you actually get in front of this? pnpm test:coverage writes an HTML report under coverage/, since the html reporter is on by default. Open it, drill into a seam file, and read which lines and branches are red. Never read the top-line percentage; it is the one number on the page that can’t help you. The two reporters serve two surfaces. The text reporter prints a summary table to the terminal, the at-a-glance view a teammate or CI scans, while the HTML report is the developer’s drill-down. Treat this as a periodic diagnostic read, once a sprint and whenever you touch a seam, not a per-save habit. Running with --coverage on every save is slow and pointless, so leave it for the moments you are actually asking the diagnostic question.
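The same read can be sketched as a small filter. This is not a Vitest API: the FileCoverage shape, the hand-copied rows from the sample table, and the seamPrefixes list are assumptions standing in for whatever seam catalog your project keeps.

```typescript
interface FileCoverage {
  file: string;
  lines: number; // percent
  branches: number; // percent
}

// Rows copied by hand from the sample table.
const report: FileCoverage[] = [
  { file: 'lib/invoice-mapper.ts', lines: 96, branches: 92 },
  { file: 'lib/auth/authed-action.ts', lines: 100, branches: 45 },
  { file: 'app/api/webhooks/stripe/route.ts', lines: 40, branches: 30 },
  { file: 'lib/rate-limit.ts', lines: 88, branches: 52 },
];

// Hypothetical seam catalog: path prefixes you have decided are load-bearing.
const seamPrefixes = ['lib/auth/', 'app/api/webhooks/', 'lib/rate-limit.ts'];

// Keep only seam files whose branch coverage is under the floor, worst first.
function underCoveredSeams(rows: FileCoverage[], minBranches: number): FileCoverage[] {
  return rows
    .filter((row) => seamPrefixes.some((prefix) => row.file.startsWith(prefix)))
    .filter((row) => row.branches < minBranches)
    .sort((a, b) => a.branches - b.branches);
}

const flagged = underCoveredSeams(report, 85).map((row) => row.file);
```

The filter never looks at the average or at the line column. It asks one question per seam file: did the decisions run?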

On a pull request, the signal a reviewer wants is differential, not absolute. Differential coverage answers “this PR added 20 lines and covered 15 of them,” which is actionable, where “the overall number moved 0.4%” is noise. Vitest can ratchet thresholds up automatically with coverage.thresholds.autoUpdate, but the course doesn’t use it, because the churn it creates outweighs the gain. Surface the differential rather than auto-ratcheting. Wiring that signal into CI and the PR comment is a later chapter’s job; here you only need to know which signal to look for.
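The arithmetic behind a differential number is small enough to sketch. The function below is illustrative only: it handles a single file, and the line numbers are invented to reproduce the 20-added, 15-covered example. A real implementation would key both inputs by file path and read them from the diff and the coverage report.

```typescript
interface DiffCoverage {
  added: number;
  covered: number;
  percent: number;
}

// Of the lines a PR added, how many did the suite execute?
function differentialCoverage(
  addedLines: number[],
  coveredLines: ReadonlySet<number>,
): DiffCoverage {
  const covered = addedLines.filter((line) => coveredLines.has(line)).length;
  const added = addedLines.length;
  const percent = added === 0 ? 100 : (covered / added) * 100;
  return { added, covered, percent };
}

// Hypothetical PR: adds lines 101..120, and the suite executes the first 15 of them.
const addedLines = Array.from({ length: 20 }, (_, i) => 101 + i);
const coveredLines = new Set(addedLines.slice(0, 15));

const result = differentialCoverage(addedLines, coveredLines);
```

Here result reports 15 of 20 added lines covered, which is 75%, and the five uncovered lines are a concrete list a reviewer can ask about.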

Per-directory thresholds as a backstop, not a goal

The reading discipline is the real work, but a discipline that lives only in your head erodes the moment you stop looking. So you encode a piece of it into CI, carefully, because a threshold framed wrong recreates every problem this lesson just dismantled.

Start with the principle, because the framing is the whole game. A threshold is a backstop: a floor that catches the case where a previously-tested seam loses coverage, like someone adding an else without a test for it. It is not a target you climb toward. The team writes tests for behaviors that exist, and the threshold is the speed-bump that catches the regression. It goes only on surfaces where coverage means something, /lib purity and the seams, and nowhere on framework-mediated surfaces, because chasing coverage there is the theatre you already know to avoid.

Here is the course baseline. Notice that every threshold ships with a one-line justification, since a number without a reason is a number nobody can defend later:

src/lib/**, 90% lines, 85% branches. Pure logic, the wide base of the honeycomb; if it lives in /lib, it is testable, so it should be near-fully covered.
src/app/api/webhooks/**, 85% branches. The highest-stakes seam; every uncovered branch is a webhook that mishandles a real provider event.
src/lib/auth/**, src/lib/error-mapping.ts, src/lib/rate-limit.ts, 85% branches. Load-bearing helpers; an uncovered branch here is an auth bypass, a leaked stack trace, or a fail-open that should have failed closed.
Everything else, deliberately unthresholded.

One note before you write it, because the tooling changed and the stale advice is everywhere. In Vitest 4, per-glob thresholds are keys inside coverage.thresholds, and they no longer inherit the top-level perFile setting. If a glob needs per-file checking, you set perFile on that glob’s own object. Older guides won’t mention this, so write it as the version on disk behaves.

The exclusion list is the paired idea. Thresholds declare where coverage matters; coverage.exclude declares where it is pure noise, the files whose coverage number would only ever mislead. The course excludes config files (**/*.config.{ts,js}), type-only files (**/*.d.ts, **/types.ts), barrel files (**/index.ts), framework-orchestrated route files (app/**/page.tsx, app/**/layout.tsx, tested through integration at the seam rather than by re-testing the framework’s rendering), Storybook stories (**/*.stories.tsx), one-off scripts (scripts/**), and mock directories (**/__mocks__/**). The discipline is the same as for thresholds: every exclusion carries a recorded reason. An unexplained exclude entry is exactly how a seam quietly disappears from the report and rots out of sight.
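One way to make “every exclusion carries a recorded reason” hard to skip is to keep the reason next to the glob and derive the exclude array from that. This is a sketch of a project convention, not a Vitest feature: the Exclusion shape and the toExcludeGlobs helper are hypothetical, and the reason strings paraphrase the list above.

```typescript
interface Exclusion {
  glob: string;
  reason: string;
}

const exclusions: Exclusion[] = [
  { glob: '**/*.config.{ts,js}', reason: 'config files: no behavior to test' },
  { glob: '**/*.d.ts', reason: 'type-only: nothing executes' },
  { glob: '**/types.ts', reason: 'type-only: nothing executes' },
  { glob: '**/index.ts', reason: 'barrel files: re-exports only' },
  { glob: 'app/**/page.tsx', reason: 'framework-orchestrated: tested through integration' },
  { glob: 'app/**/layout.tsx', reason: 'framework-orchestrated: tested through integration' },
  { glob: '**/*.stories.tsx', reason: 'Storybook stories: not shipped logic' },
  { glob: 'scripts/**', reason: 'one-off scripts' },
  { glob: '**/__mocks__/**', reason: 'mock directories: test support code' },
];

// Refuses to produce the list if any entry is missing its justification.
function toExcludeGlobs(entries: Exclusion[]): string[] {
  for (const entry of entries) {
    if (entry.reason.trim() === '') {
      throw new Error(`Exclusion ${entry.glob} has no recorded reason`);
    }
  }
  return entries.map((entry) => entry.glob);
}

const coverageExclude = toExcludeGlobs(exclusions);
```

The resulting coverageExclude array is what you would pass as coverage.exclude. A plain array with a comment above each entry works just as well, as long as the reason is written down.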

Walk through the assembled block one part at a time:

// inside test: { ... }
coverage: {
  provider: 'v8',
  reporter: ['text', 'html'],
  thresholds: {
    'src/lib/**': { lines: 90, branches: 85 },
    'src/app/api/webhooks/**': { branches: 85, perFile: true },
    'src/lib/auth/**': { branches: 85 },
  },
  exclude: [
    '**/*.config.{ts,js}',
    '**/*.d.ts',
    'app/**/{page,layout}.tsx',
    '**/*.stories.tsx',
  ],
},

First, produce the report: the v8 provider and two reporters, text for the terminal summary and html for the drill-down you open when you are hunting a seam.

A floor on the wide base. Pure logic in /lib should be near-fully covered, so 90% lines and 85% branches catches a regression where someone ships untested logic into it.

Branch floors on the seams. These protect the denial and fail-closed branches specifically, the webhook’s signature-failure path and the auth wrapper’s reject path, the exact branches line coverage hides.

A Vitest 4 detail: glob thresholds don’t inherit the top-level perFile, so you opt in per glob. Here it means every webhook file clears 85%, not just the average across them, so one untested receiver can’t hide behind the others.

Strip the noise. Config, type-only, and framework-orchestrated files would only ever report misleading numbers, so each entry is coverage you refuse to let count against you.

That config is the backstop and nothing more. It catches regressions on surfaces you have already decided to care about; it does not, and cannot, tell you whether a test asserts the right thing. The reading discipline from the previous section is still the actual work.
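The difference the perFile opt-in makes on the webhook glob can be shown with a few lines of arithmetic. This models the idea only, not Vitest’s internal calculation, and the second receiver path and all of the branch counts are hypothetical.

```typescript
interface BranchCounts {
  file: string;
  covered: number;
  total: number;
}

const percent = (covered: number, total: number): number =>
  total === 0 ? 100 : (covered / total) * 100;

// Aggregate check: pool the counts across every file the glob matched.
function passesAggregate(files: BranchCounts[], min: number): boolean {
  const covered = files.reduce((sum, f) => sum + f.covered, 0);
  const total = files.reduce((sum, f) => sum + f.total, 0);
  return percent(covered, total) >= min;
}

// Per-file check: each file has to clear the floor on its own.
function filesBelowFloor(files: BranchCounts[], min: number): string[] {
  return files.filter((f) => percent(f.covered, f.total) < min).map((f) => f.file);
}

// Hypothetical receivers and branch counts.
const webhooks: BranchCounts[] = [
  { file: 'src/app/api/webhooks/stripe/route.ts', covered: 38, total: 40 }, // 95%
  { file: 'src/app/api/webhooks/resend/route.ts', covered: 7, total: 10 }, // 70%
];

const aggregateOk = passesAggregate(webhooks, 85); // pooled: 45 of 50 branches, 90%
const failing = filesBelowFloor(webhooks, 85);
```

Pooled, the two receivers clear the 85% floor. Checked one at a time, the second receiver fails, which is the case the per-file setting exists to catch.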

Before you move on, practice the distinction the section turns on: which files earn a coverage threshold, and which are exempt. Sort each file below into one of the two buckets. Ask: if this file lost coverage, would a real bug be hiding behind the number?

Buckets:

Gets a coverage threshold: load-bearing, coverage here means something.
Excluded or unthresholded: framework-mediated or pure noise.

Files to sort:

lib/money.ts
lib/auth/authed-action.ts
app/api/webhooks/stripe/route.ts
lib/rate-limit.ts
app/dashboard/page.tsx
env.config.ts
components/ui/card.tsx
lib/types.ts

There is one gap coverage cannot see by itself, and it is the subtlest of the lot, so it is the right note to end on. Coverage reports on what ran. A file that no test ever imports therefore doesn’t run during the suite, which means it doesn’t appear in the report at all.

Consider how that compounds. A file with a single trivial test that runs every line is 100% covered and effectively untested, which you already know to distrust. But a file with no test at all is worse, because it doesn’t even drag the average down. In Vitest 4’s default, it is simply absent. Coverage tells you what ran; it is completely silent about what was never written. The most dangerous file in your codebase can be the one the report doesn’t mention.

Here is the mechanism, and read it carefully because the behavior changed and the old advice will derail you. In Vitest 4, the report by default includes only files that were loaded during the test run. The old behavior, reporting every file by default via coverage.all: true, was removed in Vitest 4, and coverage.all no longer exists. To pull untested files into the report, you set coverage.include to globs covering your load-bearing surface. Any file matched by include that was never imported now shows up at 0% instead of vanishing:

// inside test: { ... }
coverage: {
  include: ['src/lib/**/*.ts', 'src/app/api/**/*.ts'],
},

But include only makes the gap visible; it can’t close it. The audit is a human habit: for every file in /lib and /app/api, confirm a test sits beside it (or in tests/integration/ for the seams that cross modules) and actually exercises the public surface. The config surfaces the 0% rows; you still have to read them and write the missing test. In the figure earlier, an untested file would simply be a new row sitting at 0%, and once you have set include, it can’t hide.
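The first pass of that audit can be roughed out as a pure function over a file listing. It assumes the sibling convention, where src/lib/foo.ts is tested by src/lib/foo.test.ts, and the listing itself is hypothetical. Seams covered from tests/integration/ will show up as false positives here, and a test file existing says nothing about whether it exercises the public surface, so the output is a reading list, not a verdict.

```typescript
// Returns source files that have no sibling *.test.ts file in the same listing.
function filesMissingTests(files: string[]): string[] {
  const isTest = (file: string): boolean => file.endsWith('.test.ts');
  const tested = new Set(
    files.filter(isTest).map((file) => file.replace(/\.test\.ts$/, '.ts')),
  );
  return files.filter((file) => !isTest(file) && !tested.has(file));
}

// Hypothetical listing of the load-bearing surface.
const files = [
  'src/lib/money.ts',
  'src/lib/money.test.ts',
  'src/lib/rate-limit.ts',
  'src/app/api/webhooks/stripe/route.ts',
  'src/app/api/webhooks/stripe/route.test.ts',
];

const missing = filesMissingTests(files);
```

In this listing only src/lib/rate-limit.ts comes back, which is the file you would open next.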

One honest carve-out, because rules without judgment are how teams learn to resent their own tooling. A brand-new feature behind a flag, in its first sprint, may legitimately ship under-tested while its surface is still moving. The professional move is to time-box the gap out loud: add the directory to a temporary exclude, ship it, write the tests in the follow-up PR, and remove the exclude. That is the opposite of hiding the absence behind a coverage line that happens to clear the threshold. The discipline was never “always 90%,” it is naming the gap instead of pretending it isn’t there.

Here is the finished coverage block in one place, so you hold the whole shape rather than the pieces:

// inside test: { ... }
coverage: {
  provider: 'v8',
  reporter: ['text', 'html'],
  include: ['src/lib/**/*.ts', 'src/app/api/**/*.ts'],
  thresholds: {
    'src/lib/**': { lines: 90, branches: 85 },
    'src/app/api/webhooks/**': { branches: 85, perFile: true },
    'src/lib/auth/**': { branches: 85 },
    'src/lib/error-mapping.ts': { branches: 85 },
    'src/lib/rate-limit.ts': { branches: 85 },
  },
  exclude: [
    '**/*.config.{ts,js}',
    '**/*.d.ts',
    '**/types.ts',
    '**/index.ts',
    'app/**/{page,layout}.tsx',
    '**/*.stories.tsx',
    'scripts/**',
    '**/__mocks__/**',
  ],
},

Before the close, one question to make sure the senior read stuck.

A teammate opens a PR that raises the project’s overall coverage from 84% to 91% and points to it as evidence the suite got stronger. You pull up the report. Which observation would most justify pushing back?

A. The branch coverage is still trailing the line coverage, so a handful of decisions remain untested.

B. The seven new points all landed on /lib getters now sitting at 100%, while the webhook receiver and the auth wrapper haven’t moved off 40%.

C. The PR didn’t also switch on coverage.thresholds.autoUpdate, so the new floor isn’t locked in.

D. 91% still leaves a measurable slice of untested code, so the work isn’t finished until it reaches 100%.

The average is the one number on the page that can’t tell you whether the suite got stronger. Seven points earned at the safe base while the seams sit at 40% is a suite that still ships the seam bug — now wearing a better badge. Read where the coverage is, never the headline; and 100% was never the target, so “still short of 100%” is the wrong complaint. The line-vs-branch gap is real but weak: it’s true of almost every suite, and autoUpdate is something the course deliberately declines.

Where this leaves you

Coverage is a flashlight for finding untested seams, not a score to maximize. It tells you what your tests ran, but the number that actually decides whether a test is worth keeping is whether it asserted the right thing, and no coverage tool can see that. You can hit 100% and assert nothing, or sit at 70% with a suite that catches every bug that matters.

That lands you exactly where the next lesson begins. The value of a test is in what it asserts, not what it executes, so next you will learn to write that assertion well: Arrange, Act, Assert; one behavior per test; an assertion that fails on the real bug and survives the refactor that doesn’t change behavior. The flashlight found the seam; now you write the test that holds it.

External resources

Test Coverage, Martin Fowler (martinfowler.com). The canonical statement of this lesson's thesis: coverage is for finding untested code, useless as a numeric grade of test quality.

Getting Started with Vitest Code Coverage, ViteConf (youtube.com). Vitest maintainer Ari Perkkiö (11 min) on how files are included and which coverage provider to choose: the tooling under this config.

Vitest, Coverage configuration (vitest.dev). Every coverage option, including the threshold, reporter, include, and exclude shapes this lesson set.

Stryker, Mutation testing for JavaScript (stryker-mutator.io). The tool behind the mental model: it mutates your source and asks whether any test notices, measuring what coverage can't.