Where the eyes go first
How to read a pull request as a senior reviewer, using a five-layer attention stack that spends your scrutiny on the highest-stakes concerns first.
A pull request is open in front of you. Thirty files changed, give or take. The clock is running, three other reviews are waiting, and the author wants this merged today. Where do your eyes go first, and what are they looking for?
For most people the honest answer is the top of the first file, line by line, flagging anything that looks off. It feels thorough, but it is the worst thing you can do. Reading a diff in file order surfaces the cheap stuff first, a stray import or a name you’d have spelled differently, and the formatter already owns most of that. It also spends your attention on the way down, so by the time you reach the query on line 200 that quietly dropped its tenant filter, you’re skimming. The most expensive bug in the diff gets the least attention, because it had the bad luck to live near the bottom of a file.
This lesson rests on one small reframe that changes how you read every diff: a reviewer doesn’t read the file top-down, they read the diff’s concerns top-down. Every change touches something the codebase already decided, like a way to scope a query to a tenant, a wrapper that does auth, or a shape that mutations return. So the question for each touched concern isn’t “is this line correct?” It’s “did this change defend the invariant that decision exists to enforce, or punch a hole through it?” You scan the diff once, map each change onto the principles and patterns you already know, and the comments that matter write themselves.
This chapter is the third leg of a stool you’ve already built two legs of. Earlier in this part of the course, the docs chapter covered the artifacts that describe the system, the README and the ADRs. Then the in-source-docs chapter covered the discipline that keeps those descriptions honest as the code moves. Code review is where a human reads a diff with attention before it becomes production behavior, the checkpoint where every one of those decisions either holds or quietly erodes. This lesson gives you two things: a five-layer stack that tells you where your finite attention goes first, and a map of the principles and patterns to scan the diff against. The next lesson teaches how to phrase what you find; this one is about what to look for.
Review defends invariants, it doesn’t hunt for mistakes
Section titled “Review defends invariants, it doesn’t hunt for mistakes”Before the mechanics, start with the mindset the rest of the lesson rests on.
Every architectural principle and SaaS pattern this course taught exists for one reason: to make some failure mode either impossible to represent or guaranteed to be caught. A discriminated union makes a contradictory state unrepresentable. A tenant-scoped query helper makes a cross-tenant read impossible to write by accident. A webhook dedup key makes a double-charge a non-event. These aren’t style choices. They’re load-bearing decisions the codebase already paid for, in design time, in review time, and in the bugs they prevented.
A code review is the moment those decisions either get defended or get quietly eroded, and the erosion almost never comes from malice. It comes from a well-meaning diff: an author who hand-writes a type that already exists in the schema, reaches past the tenant helper to a raw query, or re-invents a wrapper. The author isn’t careless; they just didn’t know the established surface was there. Your job as the reviewer is to catch that. You aren’t there to find typos. You’re there to defend the invariants.
Hold that distinction, because it sorts everything that follows. Mistake-hunting, scanning for typos, misalignment, or a missing semicolon, is the linter’s job, and the linter is faster and more consistent at it than you will ever be. Invariant defense, noticing that this diff re-implemented something the codebase already guarantees, is the human’s job, and nothing else can do it. When you spend a review hunting mistakes, you’re doing the machine’s work and leaving your own undone.
One idea recurs through every section from here: your attention is finite and expensive, so spend it where a miss costs the most. That is the reason the stack is ordered the way it is.
The review stack: five layers, top-down
Section titled “The review stack: five layers, top-down”The first deliverable is the decision about where your attention goes: five layers, highest-stakes at the top, run in order on every diff.
- Correctness and security. Does the code actually do what the PR claims it does? And does it leak a secret, drop an auth check, expose the tenant boundary, log PII, or trust client input the server is supposed to validate? This is the layer where a miss becomes a breach.
- Architectural principles (#1–#7). Does the diff respect the seven structural rules the codebase is built on: co-location, schema-as-truth, the side-effect boundary, naming for intent, framework conventions over invention, explicit over magic, and impossible states unrepresentable?
- SaaS patterns (#1–#15). Wherever the diff touches an established surface, does it use it: the tenant helper, the authed wrapper, the billing carve-out, webhook idempotency, the Result shape, URL state, cache decisions, soft delete, the notification layer, the migration cadence, the security baseline, the time and i18n primitives, the performance reach?
- Tests and contract integrity. Are the new code paths covered? Do the tests assert behavior the user cares about, or just the implementation? Does the TSDoc still match the signature? Did
.env.examplemove whenenv.tsdid? - Style and naming polish. The linter already caught the formatting. What’s left is reading names against intent and flagging the genuinely confusing one. This is the last layer. If your review only ever reached here, it missed.
The order is the entire point of the stack. The layers are sorted by what a miss costs in production. A dropped tenant boundary at layer 1 is a customer reading another customer’s data, a breach you report to regulators. A confusing variable name at layer 5 is a five-minute rename someone does next week. Your attention drains as you read a diff, the same way it drains reading anything, so you spend it from the top down: on the stack, not on the file.
That is the lever this lesson turns on: top-down on the stack, not top-down on the file. A reviewer who starts at file one, line one is sorting the diff by file position, an order that has nothing to do with risk. A reviewer who starts at “does this leak a secret or drop a tenant filter” is sorting by severity. Same diff, same thirty files, completely different review. The first runs out of attention before the bug; the second spends it on the bug first and the bikeshed last, or never.
Most of the actual work lives in layers 2 and 3, which are the next two sections: the map is the middle of the stack, zoomed in. Here’s the shape to carry with you first.
One more point before the map: a coding agent’s PR lands in this same stack. You don’t get a softer set of layers for machine-written code, and you don’t get a harsher one. The stack works on any author’s diff because the principles and patterns it checks are stable; they don’t change based on who typed the keys. We’ll come back to what that means near the end.
The principle map: what each principle catches at review time
Section titled “The principle map: what each principle catches at review time”Here is the first half of the reference: the seven structural rules the codebase is built on. You’ve met each one in the lesson that owns it, so what follows isn’t a re-teach. It’s how each one shows up at the review surface. For every principle there are two things: the diff signature that tips you off that something skipped the established way, and the comment you leave, one line pointing the author back at the rule.
Read the table by its middle column. The diff signature is the pattern your eye learns to catch on a scan; once you see it, the comment is almost automatic.
| Principle | Diff signature | The comment |
| --- | --- | --- |
| #1 Co-locate by feature, not layer (Ch29, File-system routing) | A new file landing in /utils/, /services/, or /types/ instead of its feature module | “Belongs in lib/<feature>/.” |
| #2 The schema is the source of truth (Drizzle chapters, Unit 5) | A hand-written type duplicating a Drizzle/Zod shape; a hardcoded list the schema already enumerates | “Derive it from the schema.” |
| #3 Pure functions in /lib, effects at named boundaries (Server Actions, Unit 6) | A db.insert or fetch inside a lib/<feature>/ helper; business logic in a route handler | “Move the effect to the action seam.” |
| #4 Name for intent, not implementation (naming convention) | handleClick, processData, helper, utils, manager | “Name what it means, not what it does.” |
| #5 Use framework conventions; don’t invent (App Router & Server Actions) | A custom router, a hand-rolled action-call wrapper, a homemade form lib where native forms + Server Actions cover it | “The framework already has this.” (authedAction/authedRoute are the sanctioned carve-outs) |
| #6 Prefer explicit over magic (boundary directives) | A global side-effect import mutating module state; a magic string driving control flow; a missing 'use client'/'use server' directive | “Name the boundary.” |
| #7 Make impossible states unrepresentable (TypeScript, Unit 1) | A boolean pair that admits a contradiction; a should-be-required-together optional pair; a string that wants to be a discriminated union | “Tighten the type.” |
Six of those rows are clear enough in words. The seventh, making impossible states unrepresentable, is worth seeing in code once, because its diff signature is a shape rather than a keyword. The author models one piece of async state as a few independent fields, and the type now permits combinations the UI should never be in.
type RequestState = { isLoading: boolean; data: Invoice[] | null; error: string | null;};The violation. Three flags give eight combinations, most of them nonsense. What does isLoading: true with both data and an error mean? The type allows it; the UI can’t render it.
type RequestState = | { status: 'loading' } | { status: 'success'; data: Invoice[] } | { status: 'error'; error: string };Tightened. One discriminant, three legal states, zero contradictions. The impossible combinations are now compile errors, not bugs to remember.
Two pieces of vocabulary recur in both maps. Drizzle names the data layer, and a Server Action names the seam for any mutation. You met both in earlier units; the one-line reminders are here so the map reads without you re-opening them.
The pattern map: what each pattern catches at review time
Section titled “The pattern map: what each pattern catches at review time”Here is the second half of the reference, in the same three columns: pattern, diff signature, and the comment. The principles are structural rules about how code is shaped; each pattern pins a specific SaaS concern. You scan and read it the same way. This table is just longer because there are more concerns to pin, and that’s by design: it’s a reference you return to at a real diff, not prose to read once.
| Pattern | Diff signature | The comment |
| --- | --- | --- |
| #1 Tenant-aware query (Ch56, Organizations) | db.select().from(table) with no org filter | “Scope it with tenantDb(orgId) in the where.” |
| #2 Authed action wrapper (Server Actions / RBAC) | A Server Action opening with manual auth() + if (!user) throw | “Wrap it in authedAction(role, schema, fn).” |
| #3 Seat management (Ch56, Organizations) | Adding/removing members around the seat & role primitives | “Go through the seat primitive.” |
| #4 Billing carve-out (Stripe billing, Unit 11) | import Stripe outside lib/billing/ | “Go through the billing interface.” |
| #5 Webhook idempotency (Ch63, Webhook ingestion) | A webhook handler that doesn’t dedup on the provider’s event id | “Add the processed_events dedup key.” |
| #6 Result return shape (Server Actions, Unit 6) | A Server Action that throws on a validation failure | “Return { ok: false, error }; throw only at the framework edge.” |
| #7 URL-state list view (Ch60, URL-state list views) | Filter/sort/page state in useState instead of searchParams | “Lift the state to the URL (nuqs).” |
| #8 Cache decisions (caching & invalidation) | A fetch with no explicit cache directive on mutable data; a missing revalidateTag/revalidatePath after a mutation | “Name the cache decision.” |
| #9 Soft delete / archive (Ch60, list views & soft delete) | A DELETE FROM that should set deletedAt; a unique constraint blind to soft-deleted rows | “Soft-delete, or make the constraint partial.” |
| #10 Notification layer (notifications, Unit 13) | A direct Resend/Twilio call inside a feature module | “Route it through the notification dispatcher.” |
| #11 Expand-migrate-contract (migrations, Unit 20) | A NOT NULL on an existing column, a rename, or a drop landing in one PR | “Split it into the three-step cadence.” |
| #12 Security baseline (the security baseline, Unit 16) | Untyped JSON.parse of a request body; a secret in a client component; a missing mutating-route guard | “Name the baseline being violated.” |
| #13/#14 Time and i18n (time & internationalization, Unit 17) | new Date() for user-visible formatting in a server component; a hardcoded English string in a translatable surface | “Use the Temporal / Intl / next-intl primitives.” |
| #15 Performance vigilance (performance, Unit 19) | A list query with no pagination; an N+1 on a relation that has a join helper | “Paginate, or use the join.” |
That’s the whole map: seven principles, fifteen patterns, twenty-two reasons to leave a comment. It looks like a lot on the page, but it is not a lot to carry, because you didn’t learn it here. You learned each item in the lesson that owns it, and the map just consolidates them into a single scan. After a quarter of doing this, you hold the signatures in peripheral memory and the comments write themselves.
What the map doesn’t do is tell you which concern matters most when a diff trips several at once. That’s the stack’s job, and it’s the skill worth drilling. Try sorting these diff signatures onto the layer they belong to.
Each item is something you spotted in a diff. Sort it onto the layer whose severity it earns — when a defect could sit in two layers, the higher layer wins. Drag each item into the bucket it belongs to, then press Check.
process.env.STRIPE_SECRET_KEY read inside a 'use client' filedb.select().from(invoices) with no org filterJSON.parse with no validation before it hits the DB/utils/ instead of its feature moduleprocessData as a function nameThe tenant-filter item is the one to sit with. On the map it’s pattern #1, “use tenantDb.” But a missing tenant filter doesn’t just skip a convention; it lets one customer read another customer’s rows. The consequence is a cross-tenant breach, so you review it as a layer-1 security finding, not a layer-3 style nudge. When the same defect could sit in two layers, the higher layer’s severity wins. That rule is why the stack is ordered the way it is, and it’s worth more than memorizing any single row.
What does not get a comment
Section titled “What does not get a comment”Knowing what to flag is half the skill. The other half, the one that separates a reviewer people want to work with from one they dread, is knowing what to stay silent on. Restraint is a senior move, not a footnote, so it gets its own section.
Here’s what you deliberately don’t comment on:
- Formatting. Biome owns it. A comment about indentation or quote style is wasted attention on both sides, yours to write it and the author’s to read it, for something a tool already settled.
- Naming or casing the linter already accepts. If the rule passed, bikeshedding
camelCaseversus your personal preference is noise. The codebase has a convention, the linter enforces it, and you’re done. - Personal style preferences. “I’d have used a ternary here.” “I prefer an early return.” These aren’t defects. They’re you projecting your own hands onto someone else’s code, so leave them out.
- Off-topic rewrites that aren’t load-bearing for this PR. “You could’ve used the visitor pattern.” “While you’re in here, refactor the export logic.” A novel-architecture proposal is a real conversation, but it doesn’t belong as a line comment on someone else’s PR. It belongs in a design doc or an ADR thread, which is exactly what the next chapter builds.
One principle sits under all four: the review’s job is to defend the established principles and patterns, not to redesign the system and not to impose your taste. Every item in that list sits below layer 5 or outside the stack entirely. None of them is on the map, so none of them is your beat as the human reviewer. The senior move is to look at a diff full of things you’d have done differently and comment only on the ones that actually break an invariant.
When CI already caught it, you don’t
Section titled “When CI already caught it, you don’t”One reviewer reads every PR before you do, never gets tired, and never disagrees with itself: the toolchain. tsc, Biome, and the test suite run on every push. Anything they catch deterministically is not your beat, because your attention is far too expensive to spend on a type error a compiler flags in milliseconds. If CI is red, that’s not a review comment; it’s a precondition the author hasn’t met yet.
Beyond those defaults, a mature codebase keeps promoting checks into CI. You met several of these as you built the app: a type-coverage gate, a migration linter that catches an unsafe NOT NULL, an env-parity check that fails when env.ts and .env.example drift apart. Each one started life as something a human used to catch by hand.
That is the senior’s long game, and it closes the loop on this whole lesson. When a class of comment recurs across PRs, the same forgotten wrapper or the same missing check week after week, you stop saying it by hand and promote it to a lint rule or a CI gate. Reviews encode patterns, and patterns become structural enforcement. As the structural net widens, your beat as a human reviewer narrows. That’s the reassuring part of the map: it isn’t a permanent twenty-two-item burden you carry forever. Every quarter the machine takes a few more rows off your plate, and you’re left with the judgment calls only a human can make.
Right-sizing the review: size, the description, and tempo
Section titled “Right-sizing the review: size, the description, and tempo”The map and the stack assume the review can even happen: that the diff is readable, that you understand what it claims, and that you respond while the work is still warm. Three structural disciplines gate exactly that. They aren’t about the diff’s content; they’re the conditions for reviewing it well.
PR size, and the “split this” comment. Reviewer accuracy doesn’t degrade gracefully as a diff grows; it falls off a cliff. Both public research and ordinary team experience put the threshold around 400 lines of meaningful change. Past that, one of two things happens, and neither is a review: you rubber-stamp it because reading it properly is too much, or you read it properly and miss the load-bearing issue anyway because attention doesn’t stretch that far. So when a PR is too big, the right comment is “split this; the review can’t catch what it can’t read.” That’s not a soft suggestion you tack on at the end. It’s structural, and it holds the merge the same way a security finding does, because an unreviewable PR can’t be reviewed by definition. The next lesson gives you the vocabulary to mark a comment as merge-blocking; for now, just know “split this” is non-negotiable, not a nit.
The description is the review’s contract. Before you read a single line of the diff, you read the PR description and ask one question: do I understand what this claims to do, and do I have the context to verify the claim? A one-line description on a 600-line diff is a contract gap, because you’re being asked to verify a claim you were never told. When that happens, your first comment isn’t on line 47; it’s “what does this change, and why?” This is the same in-source discipline you met last chapter, where the lesson on shipping docs in the PR made the case that a description is documentation. Here, its job is to scope the review.
Tempo, and the asymmetry. A review that lands a week late is a review that didn’t happen. The author has moved on to three other things, the context has evaporated from their head, and the merge ends up happening under deadline pressure with no review input at all, because the realistic alternative to a slow review is no review, not a careful one. So the reflex is a first response within a business day, ideally within hours, and same-day turnaround on small PRs. State the asymmetry plainly to yourself when you’re tempted to let one sit: a slow review costs more than a fast imperfect one, because the alternative to slow isn’t thorough, it’s nothing.
Reading tests with the same eyes as code
Section titled “Reading tests with the same eyes as code”Layer 4 hides a failure mode common enough to earn its own section: the test file gets a free pass. A junior reviewer treats tests as the author’s private business and ships the PR as long as they’re green. Tests are not the author’s private business. They’re part of the diff, and you read them with the same posture as production code.
Ask one question of every assertion: does this check behavior the user cares about, or does it only verify the implementation the test happened to be written next to? The rule to hold is that tests defend the contract, not the implementation. A test that asserts a mock was called or a spy fired isn’t a test; it’s a snapshot of today’s code that will keep passing no matter how badly the behavior breaks tomorrow. The sharpest version of this anti-pattern is a test that mocks the very thing it claims to be testing. It proves nothing except that the mock you wrote does what you told it to.
The contrast carries the whole point. Here is the same test with two ways of writing the assertion.
test('createInvoice saves the invoice', async () => { const save = vi.spyOn(invoiceRepo, 'save').mockResolvedValue(undefined); await createInvoice(validInput); expect(save).toHaveBeenCalledOnce();});The violation. This proves only that the action called the function we mocked. It would pass even if createInvoice saved garbage, or saved nothing the user could ever read back. That’s not a test; it’s a snapshot of the wiring.
test('createInvoice persists a readable invoice', async () => { const result = await createInvoice(validInput); expect(result.ok).toBe(true); const saved = await listInvoices(); expect(saved).toContainEqual(expect.objectContaining({ number: validInput.number }));});Defends behavior. This asserts what the user cares about: the call succeeded (result.ok) and the invoice is really there to read back through listInvoices, which closes over tenantDb(orgId). Break the implementation any way you like; if the outcome breaks, this test fails.
When you see the first shape in a diff, the comment is one line: this asserts the mock, not the behavior, so what’s the observable outcome a user would notice if it broke? That’s defending the contract, which is the only thing a test is for.
The always-look-for security scan
Section titled “The always-look-for security scan”Layer 1 sits at the top of the stack for a reason, so it earns a concrete, runnable form: five things you scan for on every review, in about five seconds, before you do anything else. This is a scan, not an audit; the depth lives in the security unit you’ve already worked through. The point is to make the top of the stack a fixed reflex instead of a vague intention.
The five:
- Secrets on client-bundle paths. Anything imported from a
'use client'file that reads a server secret, such asprocess.env.STRIPE_SECRET_KEY. That doesn’t keep the secret on the server; it ships it to every browser that loads the page. - Missing tenant filters. A query without the org scope. This overlaps the pattern map, but here it’s a security concern, because the consequence is one tenant reading another’s data, not just a skipped convention.
- Unvalidated user input crossing a boundary. Client input flowing into SQL or HTML without a parse or an escape step in between.
- Auth checks living in components instead of the action seam. A guard in JSX is decoration: it hides a button, it doesn’t stop a request. The real check belongs at
authedActionorrequireOrgUser, on the server, where it can’t be bypassed. - PII in logs. Emails, tokens, and raw request payloads logged where they’ll sit in plaintext in your log drain forever.
Notice that two of these, the tenant filter and the input validation, also appear on the pattern map. That overlap is deliberate, and it teaches the stack’s most important rule one more time: the same defect can sit in two layers, and when it does, the higher layer’s severity wins. A missing tenant filter caught during the security scan is a layer-1 finding that holds the merge, not a layer-3 style note you might let slide. Where a defect lands on the stack is decided by what it costs, not by which checklist you happened to spot it on.
Same bar for every author, including the agent
Section titled “Same bar for every author, including the agent”In 2026, a meaningful share of the PRs in your queue weren’t typed by a person. A coding agent opened them. The durable thing to internalize is that an agent’s diff goes through the exact same stack, the exact same map, and the exact same five layers, no softer and no harsher.
Agent-written code does have a characteristic failure shape, and it maps cleanly onto things you already check. It re-invents a wrapper the codebase already has (principle #5, and half the pattern map). It adds tests that look plausible but don’t exercise the contract (layer 4, the tautological-test problem at scale). It drifts from the conventions written down in AGENTS.md (the in-source discipline from the last chapter). None of that is new. It’s the same map, applied to a diff that happens to have been generated.
The trap cuts both ways, and that symmetry is the lesson. Don’t over-tighten on agent code because “the machine should’ve gotten it right”; that’s nitpicking, and it’s noise. And don’t loosen on a trusted teammate’s PR because “they know what they’re doing”; that’s rubber-stamping, and it’s how a tired senior ships the bug. The bar is the same in both directions. The reason it holds steady regardless of author is the thread running through this entire lesson: the checklist works because the principles and patterns are stable. They don’t bifurcate by who or what wrote the code, so neither does the review.
Run the stack on a diff
Section titled “Run the stack on a diff”You now have the stack, the map, and the negative space. The skill this lesson is really teaching, though, isn’t any single row; it’s the order you ask the questions in. So walk it once. The tree below forces severity-first: at each step you pick where your eyes go, and it won’t let you jump to the variable name on line three before you’ve checked whether the change leaks a secret.
Correctness and security is the top of the stack. A leaked secret or a dropped tenant filter holds the merge before anything else gets discussed, so you don’t move to principles or polish until this is resolved. (The next lesson gives you the word for “this blocks the merge.”)
You spent your attention top-down on the stack: correctness and security first, naming and polish last. That’s the whole move. A confusing name is a five-minute fix, and you got to it because the expensive layers were already clean.
Naming is the bottom of the stack. Starting there means sorting the diff by file position rather than by what a miss costs, so you’ll run out of attention before the layer-1 bug. Reset, and start at correctness and security.
That walk is the opening reframe, except now you’re performing it instead of reading it. The eyes don’t go to the top of the file. They go to the top of the stack.
Where to read next
Section titled “Where to read next”Two references are worth keeping if you want to go deeper on the priorities themselves, rather than on how to phrase comments, which the next lesson owns.
A large org's reviewer guide: design, functionality, and tests before style, the same severity-first ordering as the stack.
The Cisco-study data behind the ~400-line reviewability threshold: defect detection falls off a cliff past it.
You can now open a diff and name where to look first and why. You can spot the diff signature of a principle or pattern that got skipped, decide which concerns earn a comment and which don’t, call “split this” when the diff is too big to review, and route the CI-catchable stuff away from your own attention. The mental model to leave with is one sentence: a diff is not lines to read, it’s a set of invariants to verify, and you spend your attention top-down on the stack, never top-down on the file. What’s left is learning to phrase the finding so it lands as collaboration instead of a gate, which is exactly where the next lesson goes.