Chapter 90, Lesson 3

The four-path catalog

Writing Playwright end-to-end specs for the four money paths that justify the E2E layer: sign-in, Stripe Checkout, invitation acceptance, and your product's value loop.

Lesson 1 gave you the filter (money, identity, unrecoverable data) and the discipline to keep E2E off by default. Lesson 2 gave you the kit: the config that runs against a production build, storageState so you sign in once, role-first locators, auto-waiting assertions, fixtures, the saas_e2e database, and the trace viewer. This lesson spends both. You’ll point the kit at the four paths in our invoice app that actually clear the filter, and write the spec for each.

There are exactly four, and we’ll take them in this order: sign-in to the paid surface, the Stripe Checkout round-trip, invitation acceptance with a seat grant, and the invoice value loop, which means create, send, get paid, and see it flip to paid. The order is not arbitrary. It’s the order a real team adopts them in: the first three are universal to every SaaS, while the last depends on what your product actually does. Think of these four as a destination, not a day-one checklist. A team reaches first for sign-in and checkout, and adds the other two only once verifying every release by hand stops scaling. After they’re in place, the catalog becomes the canon: you extend it the same way, without re-arguing the trigger every time.

Each spec below is one you could paste into tests/e2e/ and adapt. To keep your attention on the one thing each path teaches, the specs appear as focused excerpts of one or two behaviors apiece, not the full multi-assertion file you’d actually commit. Each one leads with what failure costs, then what to assert, then the spec, then the single new mechanic the path introduces.

Money path 1: sign-in to the paid surface

This is the first money path of every SaaS, and the filter is easy to apply out loud: if sign-in breaks on production, identity breaks, and paying users can’t reach the product they pay for. Nothing downstream matters if the front door is jammed.

One detail makes this path worth leading with. The entire storageState machinery from Lesson 2 exists so that every other test skips the login screen. This is the one test that can’t use it, because the thing under test is the login. It drives a fresh browser context with no saved session and actually types credentials into the form, the same way a real user does on a cold visit.

The app’s sign-in is the Better Auth email-and-password flow from earlier, the one whose server result is a discriminated union: a success branch or a tagged failure like 'invalid-credentials'. Four behaviors are worth asserting, and they map straight onto those branches:

/sign-in renders with email and password fields you can find by their labels.
Valid credentials redirect to /dashboard, with the signed-in user’s name visible.
Invalid credentials surface an error alert and leave you on /sign-in. This is the 'invalid-credentials' branch, seen from the browser.
The dual-key rate limiter blocks the sixth bad attempt and shows the lockout copy. This is the 'too-many-attempts' branch.

The spec below covers the happy redirect and the invalid-credentials branch in one file. Notice the import: test and expect come from ./fixtures, the local re-export from Lesson 2, never from @playwright/test directly.

import { test, expect } from './fixtures';

test('signs in and lands on the dashboard', async ({ page }) => {
  await page.goto('/sign-in');
  await page.getByLabel(/email/i).fill('owner@e2e.test');
  await page.getByLabel(/password/i).fill('correct-horse-battery');
  await page.getByRole('button', { name: /sign in/i }).click();

  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.getByText(/welcome, ada/i)).toBeVisible();
});

test('rejects a wrong password', async ({ page }) => {
  await page.goto('/sign-in');
  await page.getByLabel(/email/i).fill('owner@e2e.test');
  await page.getByLabel(/password/i).fill('wrong-password');
  await page.getByRole('button', { name: /sign in/i }).click();

  await expect(page.getByRole('alert')).toHaveText(/invalid email or password/i);
  await expect(page).toHaveURL(/\/sign-in/);
});

The import is from ./fixtures, never @playwright/test directly. The { page } fixture you get here is a fresh context with no storageState. This is the one test in the whole suite that logs in for real, because the thing under test is the login.

Drive the form with the same role-first label ladder from Lesson 2: getByLabel for the fields, getByRole('button', …) to submit. owner@e2e.test is the seeded owner credential, consistent with Lesson 2’s seed.

This is the happy assertion, the payload of the test. Auto-waiting toHaveURL waits out the post-login redirect to /dashboard, then the signed-in user’s name confirms the session actually landed in the browser.

The negative branch asserts the error alert by its accessible role, and that we’re still on /sign-in. This is the server’s 'invalid-credentials' discriminant, observed from the browser instead of read off the return value.

Two details belong to this path specifically. The first is cross-browser coverage. Sign-in is one of only two paths, checkout being the other, that also runs in WebKit and Firefox, via the opt-in PLAYWRIGHT_PROJECTS=all projects from Lesson 2. Auth breaks in browser-specific ways often enough to justify the extra runtime here, and not on lower-stakes paths: a cookie attribute one engine honors and another drops, or a redirect one engine handles differently.

The second is that the lockout assertion only means something from a known starting count. If a previous run already burned five attempts, the sixth-attempt test is testing nothing.
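
The arithmetic behind that warning can be sketched with a toy model. Everything here is hypothetical: `AttemptCounter` is not the app’s limiter, it tracks a single key where the real limiter is dual-key, and the five-failure allowance is an assumption taken from the “sixth bad attempt” wording above.

```typescript
// Toy model of a failed-attempt limiter, for illustration only.
// Assumption: five failures are allowed per key; the next attempt is blocked.
const MAX_FAILURES = 5;

type AttemptResult = 'invalid-credentials' | 'too-many-attempts';

class AttemptCounter {
  private failures = new Map<string, number>();

  // Pretend an earlier run already burned `count` attempts for this key.
  seed(key: string, count: number): this {
    this.failures.set(key, count);
    return this;
  }

  // Blocks once the allowance is used up; otherwise records one more failure.
  failedAttempt(key: string): AttemptResult {
    const used = this.failures.get(key) ?? 0;
    if (used >= MAX_FAILURES) return 'too-many-attempts';
    this.failures.set(key, used + 1);
    return 'invalid-credentials';
  }
}

// Returns the 1-based attempt number at which the lockout first appears.
function firstBlockedAttempt(counter: AttemptCounter, key: string): number {
  let attempt = 1;
  while (counter.failedAttempt(key) !== 'too-many-attempts') attempt += 1;
  return attempt;
}

const key = 'owner@e2e.test';
// From a reset counter, the lockout first shows on attempt six.
const fresh = firstBlockedAttempt(new AttemptCounter(), key);
// From a stale counter, it shows on the very first attempt.
const stale = firstBlockedAttempt(new AttemptCounter().seed(key, 5), key);
```

With a stale count the lockout copy shows up on attempt one, so a spec that expects five rejections followed by a lockout no longer exercises the transition it was written for.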

Money path 2: the Stripe Checkout round-trip

This is the canonical money path and the centerpiece of the chapter. The filter is blunt here: if this breaks, money moves wrong. A user pays and doesn’t get the plan, or gets the plan without paying. Either way you have an angry customer and a refund to process.

What makes this the path that justifies the entire E2E layer is where the bug lives. It doesn’t live in any one component; it lives in the composition. The session has to survive a redirect out to a third-party origin (checkout.stripe.com) and back. And the plan the user sees on return was written not by the page they’re looking at, but by a webhook: a different process, arriving on its own schedule, that the success page has to wait for.

You’ve already seen the webhook tested in isolation at the integration layer. That test proves that given this Stripe event, Postgres flips the plan. It’s worth having, and it’s cheaper. But it can’t tell you whether a real user clicking Pay completes the round-trip and ends up looking at “Pro.” That’s a different class of bug entirely, and only the browser, driving the whole flow, can catch it. That distinction is the reason this test exists and the reason it can’t be replaced by something faster.

The diagram below draws out the composition. Run your eye across it and watch where the flow leaves your app and comes back, and where the webhook arrives on its own track. Two of these arrows are spanned by nothing but the browser test: the redirect round-trip, and the wait for the webhook before the UI shows the new plan. Those two arrows are why this path is E2E-only.

%%{init: {'themeCSS': '.messageText, .messageText tspan, .noteText, .noteText tspan, .actor tspan { font-size: 17px !important; }'} }%%
sequenceDiagram
  participant B as Browser<br/>(user)
  participant App as Next app
  participant Stripe as checkout.stripe.com
  participant Hook as Stripe<br/>(webhook sender)
  participant PG as Postgres

  B->>App: click "Upgrade to Pro" on /billing
  App->>App: server action creates a Checkout Session

  rect rgba(129, 140, 248, 0.16)
    Note over B,Stripe: only the browser test spans this — a redirect to another origin and back
    App->>B: 303 redirect to checkout.stripe.com
    B->>Stripe: fill the test card in Stripe's iframes, submit
    Stripe->>B: redirect back to /billing/success
  end

  Hook--)App: POST checkout.session.completed (async, own schedule)
  App->>PG: verify signature, write plan_entitlements = Pro

  rect rgba(52, 211, 153, 0.16)
    Note over B,App: only the browser test spans this — the wait for an out-of-band webhook
    B->>App: success page polls / router.refresh() until plan reads Pro
    App->>B: UI now shows "Pro"
  end

The composition is the justification: two processes and a third-party origin in one flow the user perceives as a single click. (Unit 18)

Now the spec. A signed-in owner, this time with storageState since the login isn’t what we’re testing, starts on /billing, clicks Upgrade to Pro, pays with Stripe’s universal test card, and returns to see the plan badge read “Pro.”

import { test, expect } from './fixtures';

test('upgrades to Pro through Stripe Checkout', async ({ page }) => {
  await page.goto('/billing');
  await page.getByRole('button', { name: /upgrade to pro/i }).click();
  await expect(page).toHaveURL(/checkout\.stripe\.com/);

  const card = page.frameLocator('iframe[name^="__privateStripeFrame"]');
  await card.getByLabel(/card number/i).fill('4242 4242 4242 4242');
  await card.getByLabel(/expiration/i).fill('12 / 34');
  await card.getByLabel(/cvc/i).fill('123');
  await page.getByRole('button', { name: /pay/i }).click();

  await expect(page).toHaveURL(/\/billing\/success/);
  await expect(page.getByRole('status')).toHaveText(/pro/i);
});

The owner arrives already authenticated, because storageState is wired per-project in Lesson 2, so this test skips the login screen entirely. It lands on /billing and clicks Upgrade to Pro with the same role-first locator ladder.

Assert we actually left for Stripe’s origin. Auto-waiting toHaveURL waits out the 303 redirect to checkout.stripe.com, the first arrow only the browser test spans.

Here is the new mechanic. Stripe nests its card fields in iframes, and frameLocator is the handle that reaches inside them. Inside the frame you still pin to role and label, never CSS. 4242 4242 4242 4242 is Stripe’s documented universal test card.

Submit the payment, then assert the redirect back to /billing/success. The session survived the round-trip out to a third-party origin and home again.

The payload is the plan badge reading “Pro”, read straight from the DOM. This is what the success page produces once it polls until the webhook lands: the user’s view of truth, never an internal table read.
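
The “poll until the webhook lands” behavior can be sketched on its own. This is a minimal illustration under stated assumptions, not the course’s success page: `pollUntil` and `readPlan` are hypothetical names, and the plan lookup is simulated so that the webhook appears to land between the second and third read.

```typescript
// Hypothetical helper: re-read a value until it satisfies a predicate,
// giving up after a fixed number of reads.
async function pollUntil<T>(
  read: () => Promise<T>,
  done: (value: T) => boolean,
  opts: { attempts: number; sleep: () => Promise<void> },
): Promise<T> {
  let last = await read();
  for (let i = 1; i < opts.attempts && !done(last); i += 1) {
    await opts.sleep(); // in a real page this would be a short delay
    last = await read();
  }
  if (!done(last)) throw new Error(`gave up after ${opts.attempts} reads`);
  return last;
}

// Simulated plan lookup: reads one and two see 'free', read three sees 'pro'.
let reads = 0;
const readPlan = async (): Promise<string> => {
  reads += 1;
  return reads >= 3 ? 'pro' : 'free';
};

// A promise that resolves once the plan reads 'pro'.
const settled = pollUntil(readPlan, (plan) => plan === 'pro', {
  attempts: 5,
  sleep: async () => {}, // no real delay, to keep the sketch fast
});
```

The spec never calls anything like this itself. It leans on the auto-waiting toHaveText assertion, which keeps retrying until the badge matches or the assertion times out.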

The one genuinely new API here is frameLocator. Stripe’s hosted checkout page nests the card entry in iframes, and a normal locator can’t see across that boundary. page.frameLocator('iframe[name="..."]') gives you a scoped locator for the frame’s contents. From there you pin to role and name inside the frame, with getByLabel(/card number/i), exactly the ladder you already use, never CSS. Newer Playwright also offers locator.contentFrame() as an equivalent idiom, but frameLocator stays our default because it’s the most direct shape for a hosted page.

There’s a tension worth naming, because Lesson 2 deliberately left it for here. Playwright’s own best-practices guidance warns you against testing third parties you don’t control, and as a default that’s correct. Driving Stripe test mode is the course’s deliberate exception. The reasoning is simple: in a money path, Stripe is part of the system under test. You’re not validating Stripe’s UI; you’re validating that your session, your redirect, and your webhook handling compose correctly around it. And test mode plus the 4242 card is a documented, stable contract, not the flaky live dependency the warning is about. Drive it without guilt.

Two scope guards are worth naming once each so you don’t over-build. First, a single browser is fine for checkout’s own coverage: it’s a cross-browser path only because sign-in is, and the next chapter revisits how deep to go. Second, Stripe Test Clocks, which fast-forward a billing cycle to test renewals and dunning, are integration territory, not E2E. Driving a full billing cycle inside one browser test would blow the runtime budget for no composition payoff.

This spec is the template the project chapter builds on. When you get there, you’ll harden exactly this flow into a real, graded suite, so it’s worth getting comfortable with its shape now.

Money path 3: invitation acceptance with seat grant

This path moves no money directly, yet it has the highest correctness stakes of the four. The filter still applies, through its third clause: a botched invitation is an unrecoverable data-boundary breach with multi-tenant blast radius. Grant a new member access to the wrong organization and you’ve leaked one tenant’s data to another, the kind of bug that ends up in an incident report.

What makes this path interesting to test is that the invitee is a brand-new user on every run. So, like sign-in, this test runs without storageState: it creates a fresh user, walks them through acceptance, and lets the next database reset clean up. Recall the app’s invitation model from earlier: a signed token rides on the accept URL, four different arrival situations all funnel through one accept route, invite-sourced signups get their email auto-verified, and accepting switches the user’s active organization.

Four behaviors carry the weight here, the seat-grant happy path capped by the boundary guard:

A fresh user receives an invitation. The token arrives via a seed-inserted row. Sending the email itself was the integration test’s job from earlier, not this one’s, so name that boundary and step over it.
Opening the accept URL with the signed token lands on a sign-up form, with the invited email pre-filled.
Submitting credentials lands on the organization’s dashboard, with the assigned role visible.
The boundary assertion: the new member tries to reach another organization’s resource and gets a 404. The seat grant and its scoping, proven in one flow.

Before the spec, settle one thing in your head: this test drives exactly one of the four arrival situations. The diagram below lays out all four, then narrows to the one the spec exercises, so you don’t mistake a single spec for coverage of the whole feature. Walk through it.

Signed in, same email: one-click accept.
Signed in, different email: re-auth first.
Signed out, has an account: sign in first.
Signed out, no account: sign up first.
One accept route: the signed-token URL.
Seat granted: scoped to this org, no other.

Four ways an invite can arrive — all four hit the same signed accept URL.

Already signed in with the invited email? Accept in one click, no new credentials.

Signed in as someone else, or signed out with an existing account? Re-auth, then accept.

Signed out with no account — the shape our spec drives: sign up, then accept.

Whichever way they arrive, the outcome is the same: a seat granted, scoped to this org and no other.

Now the spec for that one shape: a signed-out user with no account yet.

import { test, expect } from './fixtures';

test('accepts an invite and is scoped to the org', async ({ page, invite }) => {
  await page.goto(`/accept-invitation/${invite.token}`);
  await expect(page.getByLabel(/email/i)).toHaveValue(invite.email);

  await page.getByLabel(/password/i).fill('correct-horse-battery');
  await page.getByRole('button', { name: /accept invitation/i }).click();

  await expect(page).toHaveURL(/\/dashboard/);
  await expect(page.getByText(/member/i)).toBeVisible();

  await page.goto(`/orgs/${invite.otherOrgId}/invoices`);
  await expect(page.getByText(/not found/i)).toBeVisible();
});

A fresh user, no storageState. The invite fixture seed-inserts the token (the Lesson-2 fixtures pattern) and yields the token, the invited email, and a second org id to probe the boundary with.

Open the signed accept URL and assert the email arrives pre-filled. The token carried the identity, so the sign-up form knows who you are before you type a thing.

Submit the sign-up and assert we land on the org’s dashboard with the assigned role visible. That’s the seat grant, observed from the browser.

This is the load-bearing check: the new member reaches for another org’s invoices and gets a 404. The seat was granted and scoped. This assertion is the whole reason the test earns its place.

That 404 is the multi-tenant guard doing its job, observed end to end. A smaller sibling test, worth a sentence and not a full spec, covers the expired-token case: open an accept URL whose seven-day token has lapsed and assert the rejection alert renders. That’s the expiry as a security primitive, surfaced to the user.
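
To set up that sibling test you need a token that has already lapsed, which in practice means seeding a row with a back-dated issue time. The sketch below shows only the date arithmetic. `isInviteExpired` and `daysAgo` are hypothetical helpers, and treating a token as expired at exactly seven days is an assumption. The app’s real token verification isn’t shown in this lesson.

```typescript
// Seven-day lifetime, per the lesson text; the helper names are hypothetical.
const DAY_MS = 24 * 60 * 60 * 1000;
const INVITE_TTL_MS = 7 * DAY_MS;

// Assumption: a token is treated as expired once it is seven days old or more.
function isInviteExpired(issuedAt: Date, now: Date): boolean {
  return now.getTime() - issuedAt.getTime() >= INVITE_TTL_MS;
}

// Produces a back-dated timestamp a seed script could store on the invite row.
function daysAgo(now: Date, days: number): Date {
  return new Date(now.getTime() - days * DAY_MS);
}

const now = new Date('2024-01-15T12:00:00Z');
const lapsed = isInviteExpired(daysAgo(now, 8), now); // issued eight days ago
const live = isInviteExpired(daysAgo(now, 6), now); // issued six days ago
```

Seeding the invite eight days in the past keeps the expired case well clear of the boundary, so the test doesn’t depend on how the app rounds the seventh day.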

Money path 4: the invoice value loop

The first three paths are universal: every SaaS signs people in, charges them, and grants access. This fourth one is where your product enters the picture. It’s the one end-to-end loop where every layer has to align for the customer to actually receive the value they pay for. For our invoice app, that loop is create an invoice → send it → the recipient pays via Stripe → the invoice flips to paid in the UI.

Start with the part that generalizes, because the specifics here are ours, not yours: find the one or two loops where your product’s core promise is delivered, and put those in the catalog. In a project-management tool it might be “create a task, assign it, mark it done, see it move.” In a file host it’s “upload, share a link, the recipient downloads.” The slot is always there. What fills it is the judgment call about your own product that paths 1 through 3 don’t ask of you.

The reassuring part is that the pattern is one you already wrote. It’s path 2’s skeleton applied to your own object: sign in as the right role with storageState, exercise the create surface, walk the third-party round-trip if there is one, return, and assert on the user-visible outcome. Same shape, different object. You don’t need a fresh mental model, just the ability to recognize the loop.

Because this path reuses path 2’s structure, the spec below doesn’t get a full re-walk. It focuses on the one new thing this path introduces: data hygiene on a shared database.

import { test, expect } from './fixtures';

test('creates and lists an invoice without colliding', async ({ page }) => {
  const ref = `invoice-${test.info().title}-${Date.now()}`;

  await page.goto('/invoices/new');
  await page.getByLabel(/reference/i).fill(ref);
  await page.getByLabel(/amount/i).fill('250.00');
  await page.getByRole('button', { name: /create invoice/i }).click();

  await expect(page.getByRole('row', { name: ref })).toBeVisible();
});

This is the new bit. Value-loop tests write to the shared seeded org, since parallel workers share one saas_e2e, so each test names its records with a unique id: the test title plus a timestamp. Two workers writing at once never collide on the same row.

The create surface, driven role-first with the same locator ladder. It’s the same skeleton as path 2, pointed at your own object.

The payload asserts on your own row, addressed by its unique ref, never “the first row” or a row count. That’s what keeps the assertion stable while other workers write to the same table.

The owner’s org is shared across workers, so the records inside it are addressed by name, never by position. Notice too that there’s no per-test cleanup, no afterEach deleting rows. Cleanup is deferred: the next pnpm db:e2e:reset is the canonical clean, the run-level isolation seam Lesson 2 set up. Each test adds its own uniquely-named records and trusts the reset to wipe the slate between runs.
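
If several specs need unique names, the naming can be pulled into one small helper. This is a variation on the spec above, not the course’s own code: `slugify` and `uniqueRef` are hypothetical, and adding the worker index is an extra guard beyond the title-plus-timestamp scheme the lesson uses.

```typescript
// Turns a test title into a compact, URL-safe fragment.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of other characters to one dash
    .replace(/^-+|-+$/g, ''); // trim leading and trailing dashes
}

// Builds a record name from the test title, the worker index, and a timestamp.
// In a spec the inputs would be test.info().title, test.info().workerIndex,
// and Date.now(); here they are plain parameters.
function uniqueRef(title: string, workerIndex: number, now: number): string {
  return `invoice-${slugify(title)}-w${workerIndex}-${now}`;
}

const title = 'creates and lists an invoice without colliding';
const refA = uniqueRef(title, 0, 1700000000000);
const refB = uniqueRef(title, 1, 1700000000000);
```

Two workers that run the same title in the same millisecond still produce different names, because the worker index differs.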

What belongs in the catalog, and what has a cheaper home

This section is the disciplinary heart of the lesson, so let’s make it concrete. The rule, one more time: every candidate that isn’t a money-path composition has a cheaper, more reliable home. The skill that matters most from this whole chapter is routing a candidate to the right layer instead of reaching for Playwright by reflex. Here are the false candidates that come up constantly, and where each actually belongs:

Form validation branches. Does the form reject a blank amount, a negative number, a too-long string? That’s a component test. The browser adds nothing.
Search, filter combinations, pagination cursors. These are URL-state behaviors, covered by integration tests against the route.
Settings page, docs page, marketing landing. No money flow, so they’re off the menu entirely.
A Server Action’s behavior in isolation. That’s an integration test against the action, not a browser driving the form.
A component rendering correctly in a specific locale. That’s a component test.
“Smoke” tests that only check a page returns 200. That’s a curl-based health check in CI, not a Playwright run.
Visual snapshots. Reach for a dedicated tool like Chromatic if you can afford it, otherwise leave them off the menu.

Now drill it. Sort each candidate below into the catalog or its cheaper home. This is the filter applied to concrete cases, the single most transferable thing you’ll take from this chapter.

Each of these is a real test someone wanted to write. The two buckets:

E2E money path: composition across the whole stack, worth the seconds.
Cheaper home: integration, component, or a health check.

The candidates:

Sign-in redirects to the dashboard
Stripe Checkout returns and the UI shows Pro
An accepted invite grants the right org and 404s on others
The invoice form rejects a blank amount
The invoice list pagination cursor advances
A webhook with a bad signature is rejected
The marketing landing page renders
A dashboard string shows in the right locale

One case is worth a focused look: OAuth sign-in. For many real apps the primary sign-in isn’t email and password at all: it’s “Sign in with Google.” So how do you E2E-test an OAuth flow?

You have two options, and neither is clean. Option (a) drives a real Google test account through the consent screen. It’s slow, brittle, and conflicts with many providers’ terms of service on automation, a fragile foundation for a per-PR gate. Option (b) drives the flow up to the redirect to the provider, then asserts that the redirect URL your app constructed is correct: the right client ID, the right scopes, the right callback. That’s lower fidelity, but it’s fast and stable, and it tests the part you actually own.

For the per-PR suite, reach for option (b). Save the full round-trip for a quarterly manual-QA pass, if you do it at all. This is the same principle as the gate from Lesson 1: don’t let a third party you can’t reliably drive turn a money path into a flaky test. Assert the seam you do control, the URL you build, and stop there.

In practice the assertion is small. Click the provider button, then check you landed on the provider with the right parameters.

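// Click the provider button; the app should redirect to Google's OAuth endpoint.
await page.getByRole('button', { name: /continue with google/i }).click();
// Auto-waits until the URL is on accounts.google.com and carries a client_id parameter.
await expect(page).toHaveURL(/accounts\.google\.com.*client_id=/);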
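
The regex above only proves that some client_id is present. To check the scopes and the callback as well, one option is to parse the URL instead of growing the regex. The sketch below is an illustration, not part of the chapter’s kit: checkOAuthRedirect, the ExpectedOAuth shape, and any values you pass to it are hypothetical, and it assumes the provider receives the standard OAuth 2.0 authorization parameters client_id, redirect_uri, and scope.

```typescript
// Hypothetical helper: compares the OAuth redirect your app built against what you expect.
interface ExpectedOAuth {
  host: string;         // provider host, e.g. 'accounts.google.com'
  clientId: string;
  redirectUri: string;  // your app's callback URL
  scopes: string[];     // scopes your app must request
}

// Returns a list of human-readable problems; an empty list means the redirect is correct.
function checkOAuthRedirect(rawUrl: string, expected: ExpectedOAuth): string[] {
  const url = new URL(rawUrl);
  const problems: string[] = [];

  if (url.hostname !== expected.host) {
    problems.push(`host: expected ${expected.host}, got ${url.hostname}`);
  }
  if (url.searchParams.get('client_id') !== expected.clientId) {
    problems.push('client_id does not match');
  }
  if (url.searchParams.get('redirect_uri') !== expected.redirectUri) {
    problems.push('redirect_uri does not match');
  }
  // OAuth 2.0 scopes are space-delimited; URLSearchParams decodes "+" and "%20" to spaces.
  const requested = (url.searchParams.get('scope') ?? '').split(' ').filter(Boolean);
  for (const scope of expected.scopes) {
    if (!requested.includes(scope)) problems.push(`missing scope: ${scope}`);
  }
  return problems;
}
```

In the spec, this would sit after the toHaveURL assertion, along the lines of expect(checkOAuthRedirect(page.url(), expected)).toEqual([]). It assumes the provider keeps those parameters in the address bar once the page settles; if it rewrites the URL, the check would have to move to the first navigation request instead.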

Wiring the catalog into CI

The wiring is short, because Lesson 2 already owns the config. This section just states the workflow shape and the runtime budget for the catalog specifically. The four-path suite runs in CI after the build job, depends on the database being reset first, and on failure uploads the HTML report and trace artifacts. Recall from Lesson 2 that the trace travels with the failure as a GitHub Actions artifact, which lets the reviewer reconstruct exactly what happened rather than guess from a red check.

The runtime budget is worth internalizing. On Chromium only, the four paths run in roughly three to six minutes. Adding WebKit and Firefox for sign-in and checkout brings it to eight to ten. Past about fifteen minutes you’d reach for sharding. Lesson 2 named that seam but didn’t set it up, because a four-path suite is nowhere near needing it.
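
One way to hold that budget is to run every spec on Chromium and only sign-in and checkout on the other two engines. A minimal sketch of the projects block, assuming spec files named sign-in.spec.ts and checkout.spec.ts under tests/e2e/ (the first file name is a guess, and Lesson 2’s real config carries more than this):

```typescript
// playwright.config.ts (sketch): only the project split is shown here.
import { defineConfig, devices } from '@playwright/test';

// Assumed file names; adjust the pattern to match your own spec files.
const crossBrowserSpecs = /(sign-in|checkout)\.spec\.ts$/;

export default defineConfig({
  testDir: './tests/e2e',
  projects: [
    // Chromium runs all four paths.
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    // WebKit and Firefox run only the two paths worth the extra minutes.
    { name: 'webkit', use: { ...devices['Desktop Safari'] }, testMatch: crossBrowserSpecs },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] }, testMatch: crossBrowserSpecs },
  ],
});
```

With a split like this, the CI install step would also need to name webkit and firefox, since the workflow as written installs Chromium only.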

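e2e:
  needs: build   # wait for the build job to pass first
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    # Assumes Node and pnpm are already available to this job (setup steps omitted here).
    - run: pnpm install --frozen-lockfile
    - run: pnpm exec playwright install --with-deps chromium   # Chromium only
    - run: pnpm db:e2e:reset   # reseed the E2E database before any spec runs
    - run: pnpm exec playwright test
    - uses: actions/upload-artifact@v4
      if: failure()   # upload the report only when the suite fails
      with:
        name: playwright-report
        path: playwright-report/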
Three lines carry the weight. needs: build holds the suite until the build job has passed, so it never runs against a tree that failed to build. On its own, needs only orders the jobs; each job starts on a fresh runner, so testing the exact artifact the build produced also requires passing it between the jobs. pnpm db:e2e:reset runs before the test so every run starts from the seed, with counters cleared and fixtures fresh, exactly the precondition the sign-in lockout and value-loop hygiene depend on. And if: failure() uploads the HTML report only when something broke, so the reviewer downloads the trace and opens it in the viewer instead of guessing.

The reviewer’s checklist for a new Playwright PR

Everything in this chapter compresses into one gate: the questions you ask when a teammate opens a PR adding a Playwright test. Six fast checks:

Which money path does this cover? Name it in the PR description. If you can’t name one, it doesn’t pass the filter, and that’s the first question to ask.
Role-first locators throughout? Any data-testid or CSS selector needs a justification.
storageState instead of a UI login? Only the sign-in path itself earns the exception.
Passes ten times locally with retries off (for example, --repeat-each=10 --retries=0)? Flake is structural, and a retry only hides it.
Does it touch a third party? If Stripe, is the test-mode key in use? If something else, why is this E2E and not a seam test?
Has a trace.zip been generated and reviewed against the assertions?
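
Some of these checks are mechanical enough to pre-screen before a human reads the diff. The sketch below is a hypothetical helper, not something the chapter’s kit ships: scanSpec and its rule list are invented for illustration, it matches text rather than parsing code, and it assumes the sign-in route is /sign-in. It can only surface candidates; naming the money path and reading the trace stay with the reviewer.

```typescript
// Hypothetical pre-review scan: flags lines in a spec that the checklist would question.
// A rough text match, not a parser; it only surfaces candidates for a human reviewer.
interface Finding {
  line: number;   // 1-based line number in the spec source
  rule: string;
}

const rules: { rule: string; pattern: RegExp }[] = [
  { rule: 'hardcoded wait (waitForTimeout)', pattern: /\.waitForTimeout\(/ },
  { rule: 'CSS/XPath locator needs justification', pattern: /\.locator\(/ },
  { rule: 'data-testid locator needs justification', pattern: /\.getByTestId\(/ },
  { rule: 'UI login; prefer storageState outside the sign-in spec', pattern: /goto\(['"]\/sign-in['"]\)/ },
];

function scanSpec(source: string): Finding[] {
  const findings: Finding[] = [];
  source.split('\n').forEach((text, index) => {
    for (const { rule, pattern } of rules) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule });
    }
  });
  return findings;
}

// Example input: the role-first line passes, the other two lines are flagged.
const sample = [
  "await page.getByRole('button', { name: /upgrade/i }).click();",
  "await page.locator('.plan-card .cta').click();",
  "await page.waitForTimeout(1500);",
].join('\n');

console.log(scanSpec(sample));
```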

Apply it now. The PR below adds a checkout test that looks plausible on a quick read. Leave inline comments on what an experienced reviewer would flag against the checklist.

tests/e2e/checkout.spec.ts

import { test, expect } from '@playwright/test';

test('checkout works', async ({ page }) => {
  await page.goto('/sign-in');
  await page.getByLabel(/email/i).fill('owner@e2e.test');
  await page.getByLabel(/password/i).fill('correct-horse-battery');
  await page.getByRole('button', { name: /sign in/i }).click();

  await page.goto('/billing');
  await page.locator('.upgrade-btn').click();
  await page.waitForTimeout(3000);
  await expect(page).toHaveURL(/checkout\.stripe\.com/);
});

Step back and look at the shape of where you’ve landed. A team starting fresh in 2026 ships year one with zero Playwright tests: the integration suite catches the seam bugs, production observability catches the unknowns, and someone clicks through the money paths by hand before each release. In year two, when that manual pass stops scaling, they reach first for sign-in and checkout, then invitation and the value loop. Zero to four. That’s the whole trajectory.

So the four-path catalog isn’t a target you race toward. It’s the destination a disciplined team converges on. And here’s the instinct this entire chapter has been building toward: when you review someone’s first Playwright PR, push for fewer, better-chosen tests, not more. A good Playwright PR removes a test as often as it adds one.

External resources

The two new mechanics this lesson introduced each have a canonical doc worth bookmarking, alongside the official discipline guide behind the reviewer’s checklist and a deeper talk from the Playwright team.

Playwright: Frame locators (playwright.dev). The frameLocator API behind the Stripe iframe card-fill in the checkout spec.

Stripe: Testing & test cards (docs.stripe.com). Test mode, the 4242 card, and the other documented cards for simulating outcomes.

Playwright: Best Practices (playwright.dev). The official source for the reviewer’s checklist: role-first locators, web-first assertions, and no hardcoded waits.

Advanced Playwright Techniques, Debbie O’Brien (youtube.com). A 50-minute NDC talk from the Playwright team’s developer advocate on the capabilities you reach for once the catalog grows.
