Chapter 94, Lesson 5

Lighthouse as the pre-launch gate

Run Lighthouse in two postures: a one-off pre-launch audit and a recurring Lighthouse CI gate that fails a pull request when a performance budget busts.

The product ships in two weeks, and the team needs two things before that date arrives. The first is a structural way to stop a performance regression from reaching production: something that fails a pull request before the slow code merges, not a dashboard someone notices three weeks later. The second is a one-off deep pass over the pages that actually matter, the landing page and the dashboard, so launch day doesn’t ship with a four-second hero image nobody profiled.

Both of those jobs are Lighthouse. There’s a tension to clear first, though, because the earlier lessons of this chapter told you the opposite. The Core Web Vitals drew a hard line: field data is the verdict and Lighthouse is the regression catcher, so chase your real-user numbers, not a synthetic score. That advice is correct, but only after launch.

Before launch, there is no field data. The Core Web Vitals you see in Speed Insights come from CrUX, the Chrome User Experience Report, which aggregates data from real Chrome users. CrUX needs traffic to exist, and an unlaunched product has none. So in the two weeks before ship, the synthetic lab run is not the inferior signal you’d ignore in favor of field data: it’s the only signal you have. That resolves the apparent contradiction. Lighthouse drops to being the regression catcher once real users arrive, but until they do, it’s the whole picture.

This lesson sets up Lighthouse in those two postures: the pre-launch deep audit you run once, and the CI regression gate you run forever. By the end you’ll have written a lighthouserc.json, the single config file that encodes the gate, and you’ll have a threshold cheat sheet calibrated for a 2026 SaaS, so the numbers in that config aren’t pulled from the air.

What Lighthouse measures (and the one Vital it can’t)

A Lighthouse run is a single page load under a fixed synthetic profile. It throttles the network to a simulated Slow 4G connection, throttles the CPU to mimic a mid-tier mobile phone, disables browser extensions, and starts from a cold cache. The result is one run, on demand, and reproducible: the exact opposite of field data’s “real users, real devices, rolling 28-day window.”

That fixed profile is why Lighthouse works as the regression catcher. When a number moves between two runs, the network didn’t get slower and the user’s phone didn’t change, so the only thing that moved is your code. Field data can’t tell you that, because a Vital regression in CrUX could be a code change or just a bad week of mobile traffic. The lab holds every variable steady except the one you care about.

A run scores four categories: Performance, Accessibility, Best Practices, and SEO. There used to be a fifth, PWA, but it was removed in Lighthouse 12. If a tutorial shows you five categories with a Progressive Web App score, it’s out of date, the same way a tutorial citing FID instead of INP is out of date.

The category you came for is Performance, and this is where the most important correction lives. The Performance score is a weighted blend of five lab metrics:

The Lighthouse 12 Performance score

First Contentful Paint (FCP): 10%
Speed Index (SI): 10%
Largest Contentful Paint (LCP): 25%
Total Blocking Time (TBT): 30%
Cumulative Layout Shift (CLS): 25%

TBT is the heaviest single lever, and it is Lighthouse’s stand-in for INP. Lighthouse never reports an INP number.

FCP 10%, Speed Index 10%, LCP 25%, TBT 30%, CLS 25%. The two Vitals from the first lesson, LCP and CLS, carry half the score between them.
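To see how the weights combine, here is a minimal sketch of the blend as a weighted average. It assumes each metric has already been converted to a 0 to 1 score (Lighthouse derives those from the raw timings with its own scoring curves, which this sketch does not reproduce); the performanceScore helper and the example values are hypothetical.

```typescript
// Weights for the five lab metrics, as listed in this lesson.
const WEIGHTS = { fcp: 0.1, si: 0.1, lcp: 0.25, tbt: 0.3, cls: 0.25 } as const;

type MetricKey = keyof typeof WEIGHTS;

// Each value is a per-metric score between 0 and 1, not a raw timing.
type MetricScores = Record<MetricKey, number>;

// Weighted average of the per-metric scores, reported on the 0-100 scale.
function performanceScore(scores: MetricScores): number {
  let total = 0;
  for (const key of Object.keys(WEIGHTS) as MetricKey[]) {
    total += WEIGHTS[key] * scores[key];
  }
  return Math.round(total * 100);
}

// Hypothetical page: every metric perfect except a half-score TBT.
const tbtHeavyPage = performanceScore({ fcp: 1, si: 1, lcp: 1, tbt: 0.5, cls: 1 });
console.log(tbtHeavyPage); // 85
```

A half-score TBT alone costs 15 points, which is why TBT is the first place to look when the overall number sags.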

Two of those five are minor and you can mostly set them aside: FCP (when the first pixel of content paints) and Speed Index (how quickly the page visually fills in during load) carry 10% each. The other three are where the lesson lives, and there are two things to take from them.

First, LCP and CLS are already familiar. These are the same two Vitals from The Core Web Vitals, and between them they carry half the Performance score. So the work you already know moves this number directly: the next/image preload from Priority on the LCP element drives LCP, and the reserve-space discipline that keeps layout from jumping drives CLS. You are not learning a new set of fixes; you’re learning a tool that scores the fixes you already have.

Second, and this is the correction: the third Vital, INP, is not on the bar. Lighthouse cannot measure INP. Interaction to Next Paint is the latency of real user interactions, and a lab run never clicks anything. There’s no user, so there’s no interaction, so there’s no INP to report. What Lighthouse measures instead is TBT, Total Blocking Time: for every main-thread task that runs longer than 50 milliseconds during the page load, the time past that 50 ms mark, summed. TBT is a partial proxy for INP: a page whose main thread is jammed during load will almost certainly be slow to respond to clicks too, so a high TBT is your pre-launch warning that INP will be bad in the field. But it’s only a proxy, because it captures the input-delay component of INP and nothing else.

The practical rule follows from this: never quote an INP number from a Lighthouse report, because there isn’t one. Read TBT instead, treat it as the early-warning signal, and remember that the actual INP fix from the first lesson was to ship less client JavaScript. TBT is how you catch that whole class of problem before you have a single real INP measurement. When you finally want a true INP number, you go to Speed Insights or the DevTools Performance panel, not Lighthouse.
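The TBT arithmetic is simple enough to sketch. This is an illustration only: the totalBlockingTime helper and the task durations are hypothetical, and it ignores the measurement window Lighthouse applies (it counts only long tasks between First Contentful Paint and Time to Interactive).

```typescript
// A main-thread task counts as "long" once it runs past 50 ms.
const LONG_TASK_THRESHOLD_MS = 50;

// Sum only the portion of each task that exceeds the 50 ms threshold.
function totalBlockingTime(taskDurationsMs: number[]): number {
  return taskDurationsMs.reduce(
    (sum, duration) => sum + Math.max(0, duration - LONG_TASK_THRESHOLD_MS),
    0,
  );
}

// Hypothetical load with three tasks: 30 ms, 70 ms, and 250 ms.
// The 30 ms task contributes nothing; the others contribute 20 ms and 200 ms.
console.log(totalBlockingTime([30, 70, 250])); // 220
```

One 250 ms task blocks far more than several short ones, which is why splitting or removing a single heavy script often moves TBT more than trimming many small ones.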

Three ways to run Lighthouse, and when each earns it

Three surfaces run Lighthouse, and they aren’t interchangeable. Before the mechanics, get the decision straight. The question an experienced engineer asks isn’t “what are the options,” it’s “what am I trying to do right now.”

The first surface is the DevTools Lighthouse panel. Reach for it when you want a quick, ad-hoc check while you’re developing, on the exact page you’re looking at. It’s one click and it runs locally. Two caveats trip people up constantly, though. Run it against a production build, meaning pnpm build then pnpm start, never pnpm dev. The dev bundle is unminified, unoptimized, and full of development-only code, so every metric off pnpm dev is meaningless. And even on a production build, localhost over the loopback interface has near-zero network latency, so the numbers read artificially fast. Keep throttling on and treat the absolute values with suspicion; the panel is best for relative checks, as in “did this change make it worse?”

The second surface is PageSpeed Insights, Google’s hosted Lighthouse. Reach for it when you want the pre-launch audit of your marketing page. It earns that spot for one reason: it runs Lighthouse on Google’s own infrastructure and overlays CrUX field data in the same view, tying lab and field together. There’s a pre-launch nuance worth knowing here. For a URL with no traffic yet, the field section simply doesn’t appear, and you see a lab-only report. That’s not a bug; it’s the correct, expected pre-launch state, and it’s the concrete face of “there is no field data yet.” After launch, that section fills in and becomes the verdict.

The third surface is @lhci/cli, Lighthouse CI. Reach for it as the automated regression gate. It runs Lighthouse against a built copy of your app inside CI, asserts your thresholds, and fails the build when they’re missed. This is the course default for the gate and the backbone of the next two sections.

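That decision, as the order to ask the questions in rather than a flat feature list. You need to run Lighthouse: which surface do you reach for?

A quick, ad-hoc check on the page you’re developing: the DevTools Lighthouse panel, against a production build.
The pre-launch audit of the marketing page: PageSpeed Insights.
An automated regression gate on pull requests: @lhci/cli.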

Two surfaces and the threshold cheat sheet

Before wiring the gate, you need the targets it asserts against, otherwise the numbers in the config are arbitrary. So what do you aim at, and on which pages?

You do not audit every route. You pick two, because a SaaS has two performance regimes and two pages cover both. The first is the marketing or landing page: your highest-traffic surface, the SEO-sensitive first impression, and mostly static. The second is one critical authenticated screen: the dashboard home or the primary task screen, whichever ships the most JavaScript. That second page is the realistic interactive worst case, because it’s behind login, it’s heavier, and it’s where bundle bloat shows up. The rest of the app’s pages are variations on one of these two, so auditing both covers the static-marketing regime and the JS-heavy-app regime, and that’s enough.

Here are the starting targets. Treat them as 2026 SaaS defaults you tighten quarterly, not as laws.

2026 SaaS audit budgets

Surface | Performance score | LCP (lab) | CLS | TBT (lab) | Total JS (gzip)
Marketing page (static, public, SEO-sensitive) | ≥ 90 | ≤ 2.5 s | ≤ 0.1 | ≤ 200 ms | ≤ 200 KB
Authenticated dashboard (behind login, JS-heavy) | ≥ 85 | ≤ 3.0 s | ≤ 0.1 | ≤ 300 ms | ≤ 350 KB

2026 SaaS starting points. Tighten them quarterly as the app gets faster, and never loosen them.

The reasoning behind those numbers is worth spelling out, because every one of them connects to something you already know.

LCP and CLS match the good Vital bands from the first lesson, LCP ≤ 2.5 s and CLS ≤ 0.1. The lab target equals the field target here, because you want the synthetic run to clear the same bar real users will be scored against.

TBT stands in for what you might have expected to be an INP budget. There is no lab INP, so there’s no INP number to assert. You budget TBT instead, the proxy, and the dashboard gets a looser 300 ms because an interactive screen legitimately runs more JavaScript on load than a static landing page does.

The JS budgets lead straight back to the treemap. Reading the bundle treemap taught you to find where the bytes went; this is where you cap them. Notice too that the dashboard’s larger JS budget is the reason it scores lower on Performance: more JavaScript means more main-thread work, which means higher TBT, which means a lower score. The looser targets aren’t a double standard; they’re the honest cost of an interactive surface.
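The two budget rows can also be written down as data and checked mechanically. A minimal sketch, assuming hypothetical Budget and Measured shapes and a hypothetical bustedBudgets helper; the measured values are invented for illustration.

```typescript
interface Budget {
  minScore: number; // Performance score on the 0-100 scale
  maxLcpMs: number;
  maxCls: number;
  maxTbtMs: number;
  maxJsKb: number; // total JavaScript, gzip transfer size
}

// The two cheat-sheet rows from this lesson.
const BUDGETS: Record<"marketing" | "dashboard", Budget> = {
  marketing: { minScore: 90, maxLcpMs: 2500, maxCls: 0.1, maxTbtMs: 200, maxJsKb: 200 },
  dashboard: { minScore: 85, maxLcpMs: 3000, maxCls: 0.1, maxTbtMs: 300, maxJsKb: 350 },
};

interface Measured {
  score: number;
  lcpMs: number;
  cls: number;
  tbtMs: number;
  jsKb: number;
}

// Return the name of every budget the measured run exceeds.
function bustedBudgets(measured: Measured, budget: Budget): string[] {
  const busted: string[] = [];
  if (measured.score < budget.minScore) busted.push("score");
  if (measured.lcpMs > budget.maxLcpMs) busted.push("lcp");
  if (measured.cls > budget.maxCls) busted.push("cls");
  if (measured.tbtMs > budget.maxTbtMs) busted.push("tbt");
  if (measured.jsKb > budget.maxJsKb) busted.push("js");
  return busted;
}

// Hypothetical dashboard run: only TBT is over its own budget.
const dashboardRun: Measured = { score: 88, lcpMs: 2800, cls: 0.05, tbtMs: 340, jsKb: 310 };
console.log(bustedBudgets(dashboardRun, BUDGETS.dashboard)); // [ 'tbt' ]
```

Checking the same run against the marketing row would flag score, LCP, TBT, and JS at once, which is the reason each surface gets its own row.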

One thing the cheat sheet deliberately does not say is “100 everywhere.” Chasing a perfect 100 hits diminishing returns. The jump from 90 to 100 often involves micro-optimizations that change nothing for a user on a flaky mobile connection, and that user, not the engineer on fiber, is who the score is supposed to protect. The cheat sheet encodes “good enough that you don’t lose users,” not a vanity number.

Before you wire any of this up, make sure the surface class, not the page’s name, is what determines which budget applies. Each page below gets audited against one of the two profiles in the cheat sheet. Sort them by which budget set applies, judging by what the page is, not by its name.

Marketing profile: static, public, SEO-sensitive (the ≥ 90 / ≤ 200 KB JS row)

Authenticated profile: behind login, JS-heavy (the ≥ 85 / ≤ 350 KB JS row)

The public pricing page

The signup landing page a Google ad points to

The public product changelog

The org settings screen behind login

The invoice list a logged-in user opens

The analytics dashboard with three charts

Wiring the CI gate with `@lhci/cli`

This is the recurring posture, the floor that holds as the app grows. The whole gate lives in one config file you should be able to write yourself after this section.

Install it as a dev dependency:

pnpm add -D @lhci/cli

The current @lhci/cli is the 0.15.x line, which bundles Lighthouse 12, the version whose Performance score you just learned. It runs fine on Node 24, the course default. Don’t memorize the version literal, since LHCI’s versioning moves and you’ll get whatever’s current.

Everything else is one file, lighthouserc.json. It has three essential parts: what to audit, what to assert, and where to put the report. Walk through it pane by pane.

{
  "ci": {
    "collect": {
      "url": ["http://localhost:3000/", "http://localhost:3000/dashboard"],
      "startServerCommand": "pnpm start",
      "numberOfRuns": 3
    },
    "assert": {
      "preset": "lighthouse:recommended",
      "assertions": {
        "categories:performance": ["error", { "minScore": 0.9 }],
        "largest-contentful-paint": ["error", { "maxNumericValue": 2500 }],
        "total-blocking-time": ["error", { "maxNumericValue": 300 }],
        "cumulative-layout-shift": ["error", { "maxNumericValue": 0.1 }],
        "resource-summary:script:size": ["error", { "maxNumericValue": 358400 }]
      }
    },
    "upload": {
      "target": "temporary-public-storage"
    }
  }
}

The collect block lists the two surfaces from the cheat sheet, the marketing root and the dashboard, and sets startServerCommand so that LHCI boots your production app itself with pnpm start. It expects the app to have been built first.

numberOfRuns: 3. A single Lighthouse run is noisy, since one slow garbage-collection pause can swing a metric. LHCI runs three times per URL and takes the median, smoothing run-to-run jitter into a number you can assert against.
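To make the smoothing concrete, here is a minimal sketch of taking the median across runs. The median helper and the LCP samples are hypothetical, and the sketch assumes median aggregation as described above; LHCI’s assert options also include an aggregationMethod setting, so confirm which method your version applies.

```typescript
// Median of a list of numbers; averages the two middle values for even counts.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical LCP values (ms) from three runs; one run hit a slow pause.
const lcpRuns = [2100, 3900, 2300];
console.log(median(lcpRuns)); // 2300
```

The 3900 ms outlier would fail a 2500 ms budget on its own, but the median of the three runs passes.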

The assert block is the heart of the gate. preset: "lighthouse:recommended" pulls in Lighthouse’s full recommended assertion set as a baseline, so you’re checking dozens of audits for free.

These are the overrides that tie the gate to your cheat sheet. categories:performance minScore: 0.9 is the ≥ 90 score (scores run 0–1 here, so 0.9 = 90). Then come the per-metric budgets: LCP and TBT in milliseconds, CLS unitless. These are the lines that fail the build when a budget busts.

This is the structural resource budget. resource-summary:script:size caps shipped JavaScript, and the value is in bytes, so the 350 KB from the cheat sheet becomes 358400. Bust the JS budget and the pull request fails mechanically, with no human judgment required.

upload with target: "temporary-public-storage" takes zero setup and gives you a shareable hosted report URL for every run. The heavier alternative is a self-hosted LHCI server for historical trends; this lesson names it but doesn’t build it.
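The unit conversions in that walkthrough are easy to get wrong by hand: score as 0 to 1, times in milliseconds, sizes in bytes. A minimal sketch, assuming a hypothetical SurfaceBudget shape and toAssertions helper (neither is part of LHCI), that turns one set of human-readable budgets into the assertion entries used in the config above.

```typescript
type Level = "off" | "warn" | "error";
type Assertion = [Level, { minScore?: number; maxNumericValue?: number }];

// Budgets in the units a person would write them in (hypothetical shape).
interface SurfaceBudget {
  score: number; // 0-100
  lcpSeconds: number;
  cls: number;
  tbtMs: number;
  jsKb: number;
}

// Convert to the units the assertions expect: 0-1 score, milliseconds, bytes.
function toAssertions(budget: SurfaceBudget): Record<string, Assertion> {
  return {
    "categories:performance": ["error", { minScore: budget.score / 100 }],
    "largest-contentful-paint": ["error", { maxNumericValue: budget.lcpSeconds * 1000 }],
    "total-blocking-time": ["error", { maxNumericValue: budget.tbtMs }],
    "cumulative-layout-shift": ["error", { maxNumericValue: budget.cls }],
    "resource-summary:script:size": ["error", { maxNumericValue: budget.jsKb * 1024 }],
  };
}

// The values behind the config in this lesson.
const gateAssertions = toAssertions({ score: 90, lcpSeconds: 2.5, cls: 0.1, tbtMs: 300, jsKb: 350 });
console.log(JSON.stringify(gateAssertions, null, 2));
```

Generating the assertions from one source of truth keeps the cheat sheet and the config from drifting apart when you tighten a budget.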

Two ideas in that config carry the lesson’s weight.

The first is to assert the metric budget, not the aggregate score. The gate’s job is to fail when LCP busts 2.5 seconds or the JS budget blows past its cap, not when the overall score drifts from 95 to 93. The score is diagnostic, so it tells a human where to look. The assertions are the structural protection, so they fail the build mechanically. This is the through-line the whole chapter runs on, structural protection over vanity metric, the same shape as the ESLint ban on raw <img> in Priority on the LCP element and sideEffects: false in The barrel-export trap.

The second is that the resource-summary line is a performance budget, and budgets are the structural form of this idea. Beyond JS size, LHCI lets you cap image bytes, font bytes, total page weight, and timing budgets like LCP and FCP. The advice is one budget per resource class, and the JS budget is the central one for a SaaS, because JavaScript is what drives TBT, which drives the score.

One honest simplification: this config applies one assertion set to both URLs, so the two cheat-sheet rows get mixed. The marketing page is held to the dashboard’s looser 300 ms TBT and 350 KB JS budgets rather than its own 200 ms and 200 KB, while the dashboard is held to the marketing page’s stricter 0.9 score and 2.5 s LCP. That’s fine as a starting gate, because it still catches a runaway dependency. When you want a tighter per-surface budget, LHCI’s assertMatrix lets you key different assertions to different URL patterns, a small extension once the basic gate is in place.
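A per-surface split might look like the sketch below. It assumes that each assertMatrix entry carries a matchingUrlPattern treated as a regular expression plus its own assertions, and that assertMatrix takes the place of the top-level assertions rather than sitting beside them; check the LHCI configuration docs before relying on either. The patterns, the MatrixEntry type, and the entriesFor helper are hypothetical.

```typescript
type Assertion = ["off" | "warn" | "error", { minScore?: number; maxNumericValue?: number }];

interface MatrixEntry {
  matchingUrlPattern: string; // assumed to be evaluated as a regular expression
  assertions: Record<string, Assertion>;
}

// One entry per cheat-sheet row.
const assertMatrix: MatrixEntry[] = [
  {
    matchingUrlPattern: "^http://localhost:3000/$",
    assertions: {
      "categories:performance": ["error", { minScore: 0.9 }],
      "largest-contentful-paint": ["error", { maxNumericValue: 2500 }],
      "total-blocking-time": ["error", { maxNumericValue: 200 }],
      "resource-summary:script:size": ["error", { maxNumericValue: 200 * 1024 }],
    },
  },
  {
    matchingUrlPattern: "^http://localhost:3000/dashboard",
    assertions: {
      "categories:performance": ["error", { minScore: 0.85 }],
      "largest-contentful-paint": ["error", { maxNumericValue: 3000 }],
      "total-blocking-time": ["error", { maxNumericValue: 300 }],
      "resource-summary:script:size": ["error", { maxNumericValue: 350 * 1024 }],
    },
  },
];

// Local sanity check: which entries would apply to a given URL?
function entriesFor(url: string): MatrixEntry[] {
  return assertMatrix.filter((entry) => new RegExp(entry.matchingUrlPattern).test(url));
}

// Print the fragment that would go under "ci" in lighthouserc.json.
console.log(JSON.stringify({ assert: { assertMatrix } }, null, 2));
```

Anchoring the root pattern with `$` matters here: without it, the marketing entry would also match the dashboard URL and both budget sets would apply to one page.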

To run it, add one script line to package.json:

"scripts": {
  "lhci": "lhci autorun"
}

autorun is the one command that does everything. lhci autorun chains a healthcheck, then runs collect, assert, and upload in sequence, exiting non-zero if any assertion fails. That non-zero exit is what a CI system reads as “this build failed.”

That raises the obvious question of where this runs. The conceptual flow on a pull request that touches UI or dependencies is to build the app, start it, run lhci autorun, and let CI fail if any assertion does. But the thing that orchestrates that flow is GitHub Actions: the workflow file, the triggers that decide which PRs run it, the runner, dependency caching, secrets, and a logged-in test user for the dashboard route. You haven’t met GitHub Actions yet.

There’s one more real requirement to name and defer. Your dashboard surface lives behind login, and a fresh Lighthouse run arrives with no session, so it would be redirected to the sign-in page and audit that instead. Two approaches solve it: a puppeteerScript that logs in before the audit, or injecting a session cookie. Both depend on your auth stack and on CI secret plumbing, so the implementation waits for the GitHub Actions chapter. For now, just know it’s a requirement, not a detail you can skip.

Reading a Lighthouse report

A report’s job is to become an action list, not a number. You read it in four passes, top to bottom, and you act on the third pass first.

Figure: the Performance score dial, banded 0–49, 50–89, and 90–100.

Start with the Performance score, but only to set it aside. It’s the weighted aggregate of the five metrics, and its diagnostic value is entirely in the breakdown beneath it. To restate the discipline once more: don’t chase the number, find the red metric.

Drop to the metrics strip to see which metric is red. This matters because the metric tells you the fix. A red LCP sends you to the image work in Priority on the LCP element; a red TBT sends you to the client-JavaScript work in the first lesson and the bundle triage in Reading the bundle treemap. Same red color, opposite fixes.

Then read Opportunities, and act here first. These are ranked by estimated time savings, and almost every one maps to a fix this chapter already taught. Oversized-image opportunities point to Priority on the LCP element, meaning next/image with preload and correct sizing. Unused-JavaScript and bundle opportunities point to The barrel-export trap (optimizePackageImports) plus Reading the bundle treemap for the treemap triage. Render-blocking resources point at font and CSS handling.

Last, read Diagnostics, the structural issues with no time estimate attached: excessive DOM size, long main-thread tasks, console errors. Long main-thread tasks here are the TBT story again, pointing back at client JavaScript.

The cross-tool routing ties it all together. A red metric in Lighthouse is the symptom. The tool that gives you the diagnosis depends on the metric: bundle weight goes to the treemap from Reading the bundle treemap; a slow render time goes to the RSC waterfall view in a Sentry trace (the next lesson); a database-bound slow TTFB goes to the query work in the lesson after that. Lighthouse tells you that the page is slow and roughly where, and the chapter’s other tools tell you why.

Match each Lighthouse signal to the tool or fix you’d reach for. A red Lighthouse signal is a symptom; each pair below gives the signal first, then the fix or diagnosis tool.

Signal: LCP 4.1 s, and the LCP element is the hero image.
Fix: Mark the hero with next/image preload and give it correct sizes.

Signal: Opportunity: 600 KB of unused JavaScript.
Fix: Open the treemap, find the surprise dependency, fix the barrel import.

Signal: CLS 0.28, with content jumping as images load.
Fix: Reserve width and height on every dynamic element.

Signal: Metric is fine but a Diagnostic flags long main-thread tasks.
Fix: Cut client JavaScript. TBT is the symptom, less JS is the fix.

Signal: Score is green but the page feels slow to load, and LCP lags TTFB badly.
Fix: Open the trace and read the RSC waterfall. The data fetch is on the critical path.

Finally, a word on the three categories you didn’t come for. Lighthouse surfaces accessibility, SEO, and best-practices gaps almost for free: missing alt text, unlabeled form fields, low contrast, and missing landmarks for accessibility; missing meta tags for SEO; mixed content or insecure requests for best practices. None of these are taught at depth here, and accessibility and SEO each have their own lessons elsewhere, but the audit flags them, and fixing each one is the relevant lesson’s job. One thing is worth internalizing now, though, because it’s a trap teams fall into:

Pre-launch deep pass vs. the recurring gate

Step back from the tool. What this chapter is really teaching is a cadence, and Lighthouse runs in two of them. Getting the cadence right is the whole point: the same tool, used two different ways, solving two different problems.

The pre-launch deep pass is the one-off. Before ship, run PageSpeed Insights against the marketing page, the dashboard, and two or three other critical screens like signup, the primary task flow, and settings. Triage every finding through the chapter’s fix map: bundle bloat to the treemap, image problems to next/image, slow renders to the RSC waterfall, slow database queries to the query work. Ship the fixes, re-audit, and repeat until every surface clears the cheat-sheet targets. This is a deep dive you do once, and remember that the field section of every report will be empty because there’s no traffic yet. Lab is the whole picture here.

The recurring CI gate is the floor. It runs @lhci/cli on every pull request that touches UI or dependencies, against the two surfaces, with the budget assertions. It can’t make your app fast, because it has no opinions about your code; it can only stop the app from silently regressing past the budget as it grows. Pair it with the post-launch verdict from Speed Insights and the chapter’s other recurring checks, like the weekly slow-query review in the database lesson.

There’s a third move that only becomes available after launch: calibration. Once real field data exists, recalibrate the CI budgets against your historical field data, not against vanity scores. If the field consistently beats a lab budget, you’ve earned the right to tighten it. If the field is worse than the lab predicted, your lab profile is too generous and you should trust the field for prioritization, which is the first lesson’s rule, finally actionable.
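The calibration rule can be sketched as a small decision function. This is one possible reading of the rule, not a prescribed procedure: the calibrate helper, its inputs, and the sample numbers are all hypothetical, and it assumes a lower-is-better metric such as LCP.

```typescript
type Calibration = "tighten-lab-budget" | "trust-field" | "keep";

// labBudget: the threshold the CI gate asserts.
// labMeasured: what the lab run currently reports for the metric.
// fieldSamples: field values for the same metric over several periods.
function calibrate(labBudget: number, labMeasured: number, fieldSamples: number[]): Calibration {
  if (fieldSamples.length === 0) return "keep"; // no field data yet, nothing to calibrate

  // The field consistently beats the lab budget: there is room to tighten it.
  if (fieldSamples.every((value) => value <= labBudget)) return "tighten-lab-budget";

  // The field is consistently worse than the lab predicted: the lab profile is too generous.
  if (fieldSamples.every((value) => value > labMeasured)) return "trust-field";

  return "keep";
}

// Hypothetical LCP numbers in milliseconds.
console.log(calibrate(2500, 2000, [2100, 2200, 2050])); // tighten-lab-budget
console.log(calibrate(2500, 2000, [2900, 3100, 2700])); // trust-field
```

Requiring every sample to agree is a deliberately conservative choice, so one good or bad week does not move the budget.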

The two postures, side by side:

One-off: the pre-launch deep pass. Find every problem once, before ship.

Trigger: the weeks before ship.
Tool: PageSpeed Insights, plus DevTools for spot checks.
Surfaces: marketing and dashboard, plus 2–3 critical screens (signup, primary task, settings).
Can: find every problem on the pages that matter and drive them to the cheat-sheet targets.
Can’t: see field data, because there’s no traffic yet and lab is all you have. It’s also a snapshot, not a guard.

The snapshot: run PageSpeed Insights against the pages that matter, triage every finding, ship, and re-audit until each surface clears the cheat sheet.

Recurring: the CI gate. It runs @lhci/cli on every pull request that touches UI or dependencies, and it stops the app from silently regressing past the budget.

That two-cadence picture places this lesson in the chapter: the pre-launch gate covering LCP and CLS directly and INP through its TBT proxy, sitting between the per-metric fix lessons before it and the server-side diagnosis lessons after it. The first lesson framed performance vigilance as a pre-launch deep pass plus recurring guards; this lesson is where that framing gets its tooling.

Your team merges a pull request that adds a charting library to the dashboard. Two months later, that library has quietly pushed the dashboard’s bundle past its budget. Which Lighthouse posture was supposed to catch this at merge time?

The pre-launch PageSpeed Insights deep pass

The @lhci/cli CI gate’s JS resource budget, which fails the PR when shipped JavaScript busts its cap

Speed Insights field data, once the 28-day window fills in

A DevTools Lighthouse run on localhost

External resources

These are the references that back this lesson: the LHCI docs you’ll consult while writing your config, and the source for why Lighthouse can’t measure INP.

Lighthouse CI — Getting Started

github.com

The canonical reference for lighthouserc.json — the collect, assert, and upload blocks behind the gate.

@lhci/cli on npm

npmjs.com

Current version and changelog, where you check which Lighthouse release it bundles.

Interaction to Next Paint (INP)

web.dev

Why INP is field-only and TBT is just a lab proxy, this lesson's central correction, from web.dev.

PageSpeed Insights

pagespeed.web.dev

The hosted Lighthouse plus CrUX overlay you run against the marketing page pre-launch.