PR 2 (Migrate): dual-write, backfill, dual-read
PR 1 left production in a strange but deliberate state: the invoices table now carries subtotal and tax columns, both nullable, and not a single line of the running app touches them. They sit there empty while every read and write still flows through the old combined total. That gap is the whole point of expand — the new shape exists in the schema before any code depends on it.
Now you close the gap. This lesson is the dangerous middle of the change: the moment where the application has to know about both shapes at once. Every mutation will write subtotal, tax, and total together; every read will resolve through the new pair, falling back to the old column for the rows nobody has filled yet; a one-shot script will backfill those legacy rows in production after the dual-write is live; and a final small migration will promote the two columns to NOT NULL. When you are done, the inspector’s dual-write panel shows subtotal + tax = total on the newest rows, split-coverage reads 100%, the data-integrity diff is empty, and the schema-state panel reports subtotal NOT NULL and tax NOT NULL — all without a single moment where the live app and the live schema disagree.
Your mission
Section titled “Your mission”Teach the invoices surface to speak both money shapes. Up to now an invoice has had one amount; from here on every create and edit captures a separate subtotal and a separate tax, and the app keeps the old combined total column populated so nothing downstream breaks while the deploy rolls and old code is still in flight. Once every legacy row has been filled, the two new columns become required and the combined amount stops being a stored thing — but that last part is two PRs of patience away, and this lesson stops short of it.
The shape of the solution is dictated almost entirely by one rule: there must be no instant where a row exists with a subtotal/tax that disagrees with its total. That makes the dual-write structural — all three money columns live inside a single .set({...})/.values({...}), computed from the same inputs in the same statement. The classic inexperienced-engineer bug here is a helper that “writes total separately, later” — two statements where there should be one, a window where a crash or a concurrent read sees half the write. That half-write is exactly the divergence the inspector’s data-integrity diff exists to catch, so do not give it anything to find.
Order is the other load-bearing decision, and it is not negotiable. The backfill runs only after the dual-write code is live in production. Run it earlier and every invoice created in the gap between the backfill finishing and the new code deploying lands with a null subtotal/tax again — you would be racing your own script. Merge the code, watch it serve, then fill the old rows. The backfill itself must be idempotent and bounded: it selects only rows that still need it (subtotal IS NULL), works in batches rather than loading the table into memory, and re-checks the same guard inside the UPDATE so that a second run — after a network hiccup, or against a row a live write already filled — touches zero rows. Run it over the unpooled connection: a long-running script wants a direct session, not the transaction-mode pooler that fronts your serverless functions.
Money stays a string end to end — numeric maps to string at the Drizzle runtime, so the combined total is computed in integer cents and never with a floating-point + on dollar strings. The bridge you are building (total = subtotal + tax, plus a tolerance for posts that still send only a combined amount) is deliberately temporary; it is born in this PR to die in the next one. And the SET NOT NULL promotion ships as its own small PR after the production backfill completes, so the one irreversible-ish tightening in the whole cadence gets reviewed on its own.
A few things are explicitly out of scope, but worth naming so you know the boundary. At this project’s seed size a SET NOT NULL is instant; on a million-row table that statement takes an ACCESS EXCLUSIVE lock and you would reach for a validated CHECK constraint instead. A 200,000-row backfill belongs on a durable job runner, not a local script. And subtotal = total, tax = 0 is a modeling simplification — a real product would reconstruct tax from a per-invoice rate history. Dropping total, removing the fallback, and deleting the coalesce bridge are PR 3, the next lesson; leave all of that standing.
subtotal, tax, and a total equal to the integer-cents sum of the two.subtotal/tax and recomputes total as their sum, leaving the version-precondition behavior intact.subtotal becomes that amount, tax becomes '0') rather than rejected.subtotal/tax for un-backfilled rows by falling through to the legacy total (as subtotal) and 0 (as tax).subtotal/tax.SET NOT NULL promotion ships as its own PR, merges green, and succeeds against the fully-backfilled table; the schema-state probe then shows subtotal/tax as NOT NULL.subtotal + tax = total, and Sentry stays quiet.docs/runbooks/migration-subtotal-tax.md records dual-write live, backfill complete, columns NOT NULL.Coding time
Section titled “Coding time”Cut a branch — migrate/subtotal-tax-dual-write — and implement the dual-write, the dual-read fall-through, and the backfill against the brief and the tests. Rehearse the whole thing on a Neon preview branch first, merge PR 2 with no schema migration in it, run the production backfill, and only then ship the small promotion PR. Try it yourself before opening the walkthrough.
Reference solution and walkthrough
The migrate step touches five files plus one follow-up migration. We will go through them in the order they fall out of the work: the money helper that everything else leans on, the dual-write in the actions, the dual-read in the queries, the form that feeds the pair, the backfill script, and finally the promotion PR.
The combined-amount helper
Section titled “The combined-amount helper”total is about to become a value the application computes rather than a value the user types. That computation needs one home, because it is going to be called from the write path (to fill the total column), from the list table, and from the conflict banner. A standalone helper is that home.
export const combinedAmount = (money: { subtotal: string; tax: string;}): string => { const cents = Math.round(Number(money.subtotal) * 100) + Math.round(Number(money.tax) * 100); return (cents / 100).toFixed(2);};The one decision worth pausing on is why integer cents. numeric(12,2) comes back from Drizzle as a string, and the obvious move — Number(subtotal) + Number(tax) — drifts: 0.1 + 0.2 is 0.30000000000000004 in IEEE-754, and toFixed(2) on that is a coin flip at the boundaries. Rounding each operand to whole cents before adding keeps the arithmetic in integers, where there is no drift, and toFixed(2) formats it back to the numeric(12,2) shape the column expects. The money-as-string discipline itself is carried in from the chapter where you built the invoicing CRUD surface — this helper just gives the combine a name.
With the helper in place, the two read surfaces switch from the stored column to the derived value: table.tsx renders combinedAmount(row) for its Total column and conflict-banner.tsx renders combinedAmount(current), instead of reading row.total / current.total. They were displaying the combined amount all along; now they compute it.
The dual-write in the actions
Section titled “The dual-write in the actions”This is the heart of the migrate step. Both createInvoice and updateInvoice change in the same three ways: the Zod schema accepts the new pair, the write carries all three money columns in one statement, and a small fallback tolerates a post that still sends only a combined amount. The lifecycle actions touch no money column, so they are left exactly as they are.
const updateInvoiceSchema = z.strictObject({ id: z.string(), customerName: z.string().min(1), status: z.enum(STATUS_VALUES), subtotal: z.string().min(1).optional(), tax: z.string().min(1).optional(), total: z.string().min(1).optional(), version: z.coerce.number().int(), overwrite: z.coerce.boolean().default(false),});
// Tolerate a post that still sends only the combined amount during the deploy// window: treat it as the subtotal, with zero tax. Born to die in PR 3.const resolveMoney = (input: { subtotal?: string; tax?: string; total?: string;}): { subtotal: string; tax: string } => ({ subtotal: input.subtotal ?? input.total ?? '0', tax: input.tax ?? '0',});
// ...inside updateInvoice, after the version precondition passes:const money = resolveMoney(input);
const [updated] = await tx .update(invoices) .set({ customerName: input.customerName, status: input.status, subtotal: money.subtotal, tax: money.tax, total: combinedAmount(money), version: row.version + 1, }) .where( and(eq(invoices.organizationId, ctx.orgId), eq(invoices.id, input.id)), ) .returning();The schema now accepts the subtotal/tax pair the new form posts. They are optional only so the legacy fallback can stand in for them during the deploy window — once the form is everywhere posting the pair, this is the path every real request takes.
const updateInvoiceSchema = z.strictObject({ id: z.string(), customerName: z.string().min(1), status: z.enum(STATUS_VALUES), subtotal: z.string().min(1).optional(), tax: z.string().min(1).optional(), total: z.string().min(1).optional(), version: z.coerce.number().int(), overwrite: z.coerce.boolean().default(false),});
// Tolerate a post that still sends only the combined amount during the deploy// window: treat it as the subtotal, with zero tax. Born to die in PR 3.const resolveMoney = (input: { subtotal?: string; tax?: string; total?: string;}): { subtotal: string; tax: string } => ({ subtotal: input.subtotal ?? input.total ?? '0', tax: input.tax ?? '0',});
// ...inside updateInvoice, after the version precondition passes:const money = resolveMoney(input);
const [updated] = await tx .update(invoices) .set({ customerName: input.customerName, status: input.status, subtotal: money.subtotal, tax: money.tax, total: combinedAmount(money), version: row.version + 1, }) .where( and(eq(invoices.organizationId, ctx.orgId), eq(invoices.id, input.id)), ) .returning();total stays accepted, also optional. This is the tolerance for in-flight clients: an old tab or a queued request that still sends only the combined amount must not be rejected mid-deploy.
const updateInvoiceSchema = z.strictObject({ id: z.string(), customerName: z.string().min(1), status: z.enum(STATUS_VALUES), subtotal: z.string().min(1).optional(), tax: z.string().min(1).optional(), total: z.string().min(1).optional(), version: z.coerce.number().int(), overwrite: z.coerce.boolean().default(false),});
// Tolerate a post that still sends only the combined amount during the deploy// window: treat it as the subtotal, with zero tax. Born to die in PR 3.const resolveMoney = (input: { subtotal?: string; tax?: string; total?: string;}): { subtotal: string; tax: string } => ({ subtotal: input.subtotal ?? input.total ?? '0', tax: input.tax ?? '0',});
// ...inside updateInvoice, after the version precondition passes:const money = resolveMoney(input);
const [updated] = await tx .update(invoices) .set({ customerName: input.customerName, status: input.status, subtotal: money.subtotal, tax: money.tax, total: combinedAmount(money), version: row.version + 1, }) .where( and(eq(invoices.organizationId, ctx.orgId), eq(invoices.id, input.id)), ) .returning();The legacy-amount fallback, at the action layer. A post with the pair uses the pair; a post with only total becomes subtotal = total, tax = '0'. This is requirement 3 — tolerate, never reject.
const updateInvoiceSchema = z.strictObject({ id: z.string(), customerName: z.string().min(1), status: z.enum(STATUS_VALUES), subtotal: z.string().min(1).optional(), tax: z.string().min(1).optional(), total: z.string().min(1).optional(), version: z.coerce.number().int(), overwrite: z.coerce.boolean().default(false),});
// Tolerate a post that still sends only the combined amount during the deploy// window: treat it as the subtotal, with zero tax. Born to die in PR 3.const resolveMoney = (input: { subtotal?: string; tax?: string; total?: string;}): { subtotal: string; tax: string } => ({ subtotal: input.subtotal ?? input.total ?? '0', tax: input.tax ?? '0',});
// ...inside updateInvoice, after the version precondition passes:const money = resolveMoney(input);
const [updated] = await tx .update(invoices) .set({ customerName: input.customerName, status: input.status, subtotal: money.subtotal, tax: money.tax, total: combinedAmount(money), version: row.version + 1, }) .where( and(eq(invoices.organizationId, ctx.orgId), eq(invoices.id, input.id)), ) .returning();The structural dual-write: all three money columns inside one .set({...}), with total: combinedAmount(money) computed from the very same pair. One statement, so the three values can never disagree — there is no window for a crash to leave a half-write behind.
The non-negotiable detail is that subtotal, tax, and total sit in the same .set({...}) (and the same .values({...}) over in createInvoice). That is the structural dual-write the requirement asks for. The tempting refactor — compute and write total in a follow-up statement — is precisely the divergence bug: two writes mean a moment, however brief, where the row is internally inconsistent, and that is the one thing the inspector’s data-integrity diff is built to surface. The version precondition and the admin-only overwrite escape hatch are untouched: this PR changes the money columns, not the optimistic-concurrency guard. createInvoice gets the identical treatment — the pair in the schema, the fallback, and total: combinedAmount(money) inside its .values({...}). And rowToInvoice now maps the pair (subtotal: row.subtotal, tax: row.tax) onto the InvoiceRow the queries return.
The dual-read in the queries
Section titled “The dual-read in the queries”The write side now fills all three columns going forward, but the rows that already existed before this PR still have a null subtotal/tax — the backfill has not run yet, and even after it runs you want reads to be correct during the run. So the read side resolves the pair through a coalesce fall-through: a filled row reads its real subtotal/tax; an un-backfilled row falls through to the legacy total (as subtotal) and 0 (as tax). The combined total stays selected so any caller that still wants it keeps working.
export type InvoiceRow = { id: string; organizationId: string; number: string; customerName: string; status: InvoiceStatus; subtotal: string; tax: string; total: string; currency: string; createdAt: Date; dueAt: Date | null; deletedAt: Date | null; archivedAt: Date | null; version: number;};
// Dual-read: a backfilled row reads its real subtotal/tax; an un-backfilled row// falls through to the legacy total (as subtotal) and 0 (as tax). The combined// total stays available to callers. Born to die in PR 3.const subtotalExpr = sql<string>`coalesce(${invoices.subtotal}, ${invoices.total})`;const taxExpr = sql<string>`coalesce(${invoices.tax}, 0)`;
// The combined-amount sort orders on the resolved pair, not the raw column.const amountExpr = sql`(${subtotalExpr} + ${taxExpr})`;Both listInvoices and getInvoiceDetail select subtotalExpr as subtotal, taxExpr as tax, and invoices.total as total. The sort for total/-total orders on amountExpr — the resolved sum — so sorting by amount stays correct whether or not a row has been backfilled. The fall-through is what makes the dual-read safe to ship the instant the PR merges: there is never a read that returns a null money value, even with zero rows backfilled.
The form posts the pair
Section titled “The form posts the pair”The edit form drove a single combined-amount input; now it posts the two fields the dual-write reads. Only the field block changes — the version round-trip, the conflict resolution, and the useActionState wiring are owned by the chapter where you built the invoicing CRUD surface and stay exactly as they are.
<div className="space-y-1.5"> <Label htmlFor="subtotal">Subtotal</Label> <Input id="subtotal" name="subtotal" data-testid="subtotal-input" defaultValue={seed.subtotal} /> </div>
<div className="space-y-1.5"> <Label htmlFor="tax">Tax</Label> <Input id="tax" name="tax" data-testid="tax-input" defaultValue={seed.tax} /> </div>The single name="total" input is gone — replaced by name="subtotal" and name="tax", each reading its value from the seed row the form already has. Because the queries now surface subtotal/tax through the coalesce fall-through, the form has a sensible default to render even for a row that has not been backfilled yet.
The backfill script
Section titled “The backfill script”This is the most decision-dense file in the PR, and the one where the inexperienced-engineer traps are most expensive. The script fills the legacy rows — the ones created before the dual-write went live — by copying total into subtotal and setting tax to '0'. It is run by hand with pnpm db:backfill, never imported by the app, never inside a request.
import { sql } from 'drizzle-orm';
import { dbUnpooled } from '@/db/index';
const BATCH_SIZE = 1000;
export const runBackfill = async (): Promise<void> => { let totalUpdated = 0;
while (true) { const ids = await dbUnpooled.execute<{ id: string }>(sql` select id::text as id from invoices where subtotal is null limit ${BATCH_SIZE} `);
const batch = Array.from(ids).map((row) => row.id); if (batch.length === 0) { break; }
const updated = await dbUnpooled.execute<{ id: string }>(sql` update invoices set subtotal = total, tax = '0' where id = any(${batch}::uuid[]) and subtotal is null returning id::text as id `);
totalUpdated += Array.from(updated).length; console.log(`[backfill] updated ${totalUpdated} rows so far`); }
console.log(`[backfill] done — ${totalUpdated} rows backfilled`);};The backfill runs on dbUnpooled — the direct connection. The pooled db your serverless functions use runs in transaction mode and is the wrong tool for a long-running script that holds a session across many round trips.
import { sql } from 'drizzle-orm';
import { dbUnpooled } from '@/db/index';
const BATCH_SIZE = 1000;
export const runBackfill = async (): Promise<void> => { let totalUpdated = 0;
while (true) { const ids = await dbUnpooled.execute<{ id: string }>(sql` select id::text as id from invoices where subtotal is null limit ${BATCH_SIZE} `);
const batch = Array.from(ids).map((row) => row.id); if (batch.length === 0) { break; }
const updated = await dbUnpooled.execute<{ id: string }>(sql` update invoices set subtotal = total, tax = '0' where id = any(${batch}::uuid[]) and subtotal is null returning id::text as id `);
totalUpdated += Array.from(updated).length; console.log(`[backfill] updated ${totalUpdated} rows so far`); }
console.log(`[backfill] done — ${totalUpdated} rows backfilled`);};Each pass selects only rows that still need filling, bounded by a batch limit. Without the where subtotal is null filter every run rewrites the whole table; without the limit you load every id into memory at once.
import { sql } from 'drizzle-orm';
import { dbUnpooled } from '@/db/index';
const BATCH_SIZE = 1000;
export const runBackfill = async (): Promise<void> => { let totalUpdated = 0;
while (true) { const ids = await dbUnpooled.execute<{ id: string }>(sql` select id::text as id from invoices where subtotal is null limit ${BATCH_SIZE} `);
const batch = Array.from(ids).map((row) => row.id); if (batch.length === 0) { break; }
const updated = await dbUnpooled.execute<{ id: string }>(sql` update invoices set subtotal = total, tax = '0' where id = any(${batch}::uuid[]) and subtotal is null returning id::text as id `);
totalUpdated += Array.from(updated).length; console.log(`[backfill] updated ${totalUpdated} rows so far`); }
console.log(`[backfill] done — ${totalUpdated} rows backfilled`);};The UPDATE repeats the and subtotal is null guard in its WHERE. This is the idempotency-and-concurrency lock: a second run, or a row a live dual-write already filled between the select and the update, matches zero rows. Requirement 5.
import { sql } from 'drizzle-orm';
import { dbUnpooled } from '@/db/index';
const BATCH_SIZE = 1000;
export const runBackfill = async (): Promise<void> => { let totalUpdated = 0;
while (true) { const ids = await dbUnpooled.execute<{ id: string }>(sql` select id::text as id from invoices where subtotal is null limit ${BATCH_SIZE} `);
const batch = Array.from(ids).map((row) => row.id); if (batch.length === 0) { break; }
const updated = await dbUnpooled.execute<{ id: string }>(sql` update invoices set subtotal = total, tax = '0' where id = any(${batch}::uuid[]) and subtotal is null returning id::text as id `);
totalUpdated += Array.from(updated).length; console.log(`[backfill] updated ${totalUpdated} rows so far`); }
console.log(`[backfill] done — ${totalUpdated} rows backfilled`);};The loop drains the table batch by batch and stops the instant a pass comes back empty. One query cannot drain a table larger than the batch size; the empty pass is how the run knows it is done.
Two things look unusual at a glance and both are deliberate. First, the script issues raw SQL through dbUnpooled.execute(sql\…`)rather than the typed Drizzle builder. That is what lets the same script keep compiling after PR 3 drops thetotalcolumn from the schema: the typed builder would no longer know abouttotaland the file would fail to typecheck, but asqlliteral references the live database, not the TypeScript schema. Second, thewhere subtotal is nullguard appears twice — once to scope the select, once inside the update — and that repetition is the entire idempotency story. Re-running the backfill after a hiccup is a no-op; running it concurrently with live dual-writes is safe, because any row a live write already filled fails thesubtotal is null` predicate and is skipped.
The promotion PR
Section titled “The promotion PR”After the production backfill completes and split-coverage reads 100%, every row has a non-null subtotal/tax, which means the columns can finally be tightened to NOT NULL. That tightening ships as its own small PR — branch migrate-notnull/subtotal-tax — so the one near-irreversible statement in the cadence is reviewed in isolation. Promote both columns to .notNull() in schema.ts (consuming the TODO(L4) marker), then pnpm db:generate produces the migration:
ALTER TABLE "invoices" ALTER COLUMN "subtotal" SET NOT NULL;--> statement-breakpointALTER TABLE "invoices" ALTER COLUMN "tax" SET NOT NULL;Shipping SET NOT NULL inside the contract PR would also be correct — it is the same forward-only tightening either way. This project prefers the separate PR purely for reviewability: a reviewer sees one irreversible-ish change on its own, with the backfill that made it safe already merged and run.
The workflow
Section titled “The workflow”The code is only half the lesson; the order you ship it in is the other half. Rehearse the whole thing on a Neon preview branch before any of it reaches production:
-
Open PR 2 — title it
migrate: dual-write subtotal and tax, backfill, dual-read fall-through— with no schema migration in it. The migrate step is application code plus a by-hand script; the only migration in this lesson is theNOT NULLpromotion, which is a separate PR. -
On the preview deployment, point
pnpm db:backfillat the preview branch’s unpooled URL and run it. Confirm in the inspector that split-coverage hits 100%, the newest rows show all three columns withsubtotal + tax = total, and the data-integrity diff is empty. This is the rehearsal — the bug you are hunting for is a single-site dual-write split, and it shows up here as a divergent row. -
Merge PR 2 once CI is green. Production is now dual-writing all three columns and reading through the
coalescefall-through, but the legacy rows are still nullsubtotal/tax. -
Run
pnpm db:backfillagainst production, with the production unpooled URL supplied in that one shell session only — never committed to the repo. Watch the inspector for the first ten minutes or so: a dual-write split would surface immediately as asubtotal + tax <> totalrow. Split-coverage climbs to 100%. -
Open and merge the promotion PR (
migrate-notnull/subtotal-tax). Its migration runsSET NOT NULLagainst the now-fully-backfilled table and succeeds. The schema-state panel flipssubtotalandtaxtoNOT NULL. -
Fill the PR-2 section of
docs/runbooks/migration-subtotal-tax.md: dual-write live, backfill complete, columnsNOT NULL.
The cadence theory behind all of this — dual-write, dual-read, bounded-idempotent backfill, forward-only — is taught in the previous chapter on the expand-migrate-contract cadence; authedAction, withTenant, and logAudit come from the organizations and RBAC chapters. This lesson is where you wire them together against a live database.
The cadence this PR implements, step by step — the doc the video itself points you to.
Spin up a copy-on-write branch to rehearse the backfill against production-like data before merging.
The generate/migrate workflow behind the SET NOT NULL promotion PR.
Why money lives in exact numeric, not floating point — the drift the integer-cents helper exists to avoid.
Moment of truth
Section titled “Moment of truth”The suite for this lesson exercises the one piece of behavior it can run directly — the combinedAmount helper — and reads the source of the actions, the backfill, and the form to confirm they carry the structure that produces the observable behavior. Run it from the project root:
pnpm test:lesson 4That runs tests/lessons/Lesson 4.test.ts through the provided scripts/test-lesson.mjs. It checks that combinedAmount adds in integer cents and formats to two decimals — including the 0.1 + 0.2 drift case the lesson exists to teach; that createInvoice and updateInvoice write subtotal and tax inside a single structural .set({...}) while keeping the version precondition and the honest 409 intact; that the edit form posts the subtotal/tax pair and no longer carries a single total input; and that the backfill runs on dbUnpooled, selects un-backfilled rows in bounded batches, re-guards the UPDATE on subtotal IS NULL, and loops until a pass touches no rows. A green run looks like this:
✓ tests/lessons/Lesson 4.test.ts (11 tests)
Test Files 1 passed (1) Tests 11 passed (11)The tests prove the shape of the migrate step, but they run in a Node environment with no database and no preview branch, so the deployment invariants are yours to confirm by hand. Tick each one off as you verify it against the live preview, then production:
subtotal and tax as NOT NULL./invoices renders from the new pair, new create/edit mutations populate all three columns with subtotal + tax = total, and Sentry logs zero new errors through both merges.docs/runbooks/migration-subtotal-tax.md records dual-write live, backfill complete, and columns NOT NULL.