Neon branching and scale-to-zero
How Neon's separation of storage from compute lets serverless Postgres give every preview deployment its own copy-on-write database branch that scales to zero when idle.
You’re shipping a SaaS in 2026. Every pull request your team opens gets its own live preview URL, a full running copy of the app a reviewer can click through before anyone merges. And like everything else in your stack, the database bills you for what you use, not a flat rate for a box sitting in a rack. Picture the production database behind all of that. What does it actually look like?
Two harder questions hide inside that one. First, each preview needs its own data. If two reviewers are poking at two different PRs against the same shared database, they corrupt each other’s tests, and nobody trusts what they see. So how does every preview get an isolated copy? Second, if there’s a database behind every open PR, are you paying a monthly bill for each one, even the ones nobody has opened in a week? On a traditional Postgres server both of these are dealbreakers. A per-PR database is too slow to copy and too expensive to leave idle, so teams don’t even try. This lesson is about why that ceiling doesn’t exist on Neon. Both answers fall out of a single architectural choice Neon makes, and by the end you’ll be able to look at a real Neon project and explain why this setup is both cheap and safe. The previous lesson ended by pointing your DATABASE_URL at a Neon branch and promising the explanation later. Here it is: we’re going to find out what that branch is, and why branching is the entire pitch.
Storage and compute, pulled apart
Everything in this lesson (branching, the database-per-preview trick, the bill that drops to zero) follows from one fact about how Neon is built. Once you understand that fact, the rest stops being a list of features to memorize and becomes a set of things you could have predicted. That payoff is why this section moves slowly.
Start with the model you already hold, since the new model is easiest to see as a change from it. A traditional Postgres server is one process welded to its disk. There’s a program running, the thing that parses your SQL, plans queries, and reads and writes rows, and there are the data files it owns, sitting on a disk bolted to that same machine. The running program and the stored data are a single inseparable unit. They boot together, they live together, they die together.
That welding is invisible until you try to do something with it, and then you feel it in three specific places:
- Copying the database means physically copying the disk. To duplicate a 10 GB database you run pg_dump and restore it somewhere else: minutes of work and a second, full 10 GB on disk. The data has nowhere to live except attached to a process, so a copy is a whole new attached disk.
- An idle server still costs money. The process runs whether or not anyone is querying it. A database behind a preview nobody has touched in a week is burning the same compute as one under load, because the process is always on. Always-on is the only state it has.
- You can’t add a read-only copy without replicating everything. Want a second process to spread out read traffic? It needs its own disk, which means copying all the data and keeping the copy in sync. The disk is chained to one process.
Each of those three pains is about to dissolve.
The reframe is a single move. Neon splits the process apart from the disk. Storage becomes its own thing: a log of page versions living in durable cloud storage, completely independent of any running program. Compute becomes its own thing too: a stateless Postgres process the platform starts on demand and points at that storage. Stateless is the part that matters here: the compute holds no permanent state of its own. Kill it and the data is untouched, because the data was never in the process; it was in storage all along. Start a brand-new process, point it at the same storage, and it picks up exactly where the last one left off.
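To make “stateless” concrete, here is a minimal TypeScript sketch. It is a toy model, not Neon’s implementation: DurableStore and StatelessCompute are hypothetical names invented for illustration, and an in-memory map stands in for durable cloud storage. What it shows is the claim above: throw the compute away, and the data is still there for the next one.

```typescript
// Toy model only. DurableStore and StatelessCompute are illustrative names,
// not part of any Neon API.
class DurableStore {
  private rows = new Map<string, string>();

  put(key: string, value: string): void {
    this.rows.set(key, value);
  }

  get(key: string): string | undefined {
    return this.rows.get(key);
  }
}

class StatelessCompute {
  // The only thing a compute holds is a reference to storage.
  private readonly store: DurableStore;

  constructor(store: DurableStore) {
    this.store = store;
  }

  insert(key: string, value: string): void {
    this.store.put(key, value);
  }

  select(key: string): string | undefined {
    return this.store.get(key);
  }
}

const sharedStore = new DurableStore();

// First compute writes a row, then is discarded.
let computeA: StatelessCompute | null = new StatelessCompute(sharedStore);
computeA.insert("invoice:1", "paid");
computeA = null; // "kill" the process; storage is untouched

// A brand-new compute pointed at the same storage sees the same data.
const computeB = new StatelessCompute(sharedStore);
const afterRestart = computeB.select("invoice:1");
console.log(afterRestart); // paid
```

Notice that StatelessCompute has no field of its own besides the pointer to storage. That absence is the whole design: there is nothing in the process worth preserving.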
[Figure: two panels, “Traditional Postgres” on the left and “Neon” on the right.]
Look at the right side of that figure. Each of the three pains turns into its opposite:
- Storage is shared, so a second process doesn’t need its own copy of the data. That’s what makes branching cheap. (Next section.)
- Compute is disposable, so when nobody’s querying, the platform can throw the process away entirely and you pay nothing for it. That’s scale-to-zero. (A couple of sections down.)
- Multiple computes can read one storage, so a read-only replica is just another process pointed at the same log, with no replication and no second disk. Neon supports this for scaling reads and analytics. The course never needs it, so we name it once and move on.
Three of Postgres’s oldest constraints disappear, all for the same reason. Keep the split in mind, because every section from here just asks the same question: given that storage and compute are separate, what becomes possible?
Branches share storage, so they’re nearly free
A branch is a new compute pointed at a snapshot of a parent’s storage at a moment in time. The snapshot is the interesting part, so look closely at how it’s taken. It’s copy-on-write: the new branch starts by sharing every existing page with its parent rather than duplicating any of them. Nothing is copied up front. Only when the branch actually changes a page (you insert a row, you update a value) does that one changed page get written fresh and kept separate. Everything you haven’t touched is still, physically, the parent’s pages.
So put the two models next to each other. To copy a 10 GB database the old way takes pg_dump, a restore, minutes of work, and a fresh 10 GB on disk. To branch it on Neon takes under a second and roughly zero added storage, because at the instant of branching the branch shares all 10 GB and has changed nothing. You only ever pay storage for the pages your branch diverges on. A branch you create and barely touch costs almost nothing to keep around, and as you’re about to see, that “almost nothing” is what makes the whole economic story work.
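The storage accounting can be sketched in a few lines of TypeScript. This is a toy under stated assumptions, not how Neon stores data: PageRef, createBranch, writePage, and divergedPages are made-up names, a 1,000-slot array stands in for the database, and the toy copies the table of references for brevity where a real engine would avoid even that.

```typescript
// Toy copy-on-write model. All names here are illustrative only.
type PageRef = { readonly bytes: string };

// The parent's page table: 1,000 slots, each pointing at a page object.
const parentPages: PageRef[] = Array.from({ length: 1000 }, (_, i) => ({
  bytes: `parent page ${i}`,
}));

// Branching copies the table of references, never the pages themselves.
function createBranch(pages: readonly PageRef[]): PageRef[] {
  return pages.slice();
}

// Writing replaces one slot with a fresh page; every other slot stays shared.
function writePage(pages: PageRef[], pageNo: number, bytes: string): void {
  pages[pageNo] = { bytes };
}

// Slots where the branch no longer points at the same page object as the parent.
function divergedPages(
  branch: readonly PageRef[],
  parent: readonly PageRef[],
): number {
  let count = 0;
  for (let i = 0; i < branch.length; i++) {
    if (branch[i] !== parent[i]) count++;
  }
  return count;
}

const previewPages = createBranch(parentPages);
const divergedAtCreation = divergedPages(previewPages, parentPages);
console.log(divergedAtCreation); // 0

writePage(previewPages, 7, "row inserted on the branch");
writePage(previewPages, 42, "value updated on the branch");
const divergedAfterWrites = divergedPages(previewPages, parentPages);
console.log(divergedAfterWrites); // 2
```

The two numbers are the point. At creation the branch owns nothing of its own, and after two writes it owns exactly two pages, while the other 998 are still the parent’s.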
There’s one thing about branches that trips up nearly everyone, so it’s worth getting clear now. A branch captures its parent as it was at the moment you created it, and then the two go their separate ways. This is what point-in-time means, and it has real consequences:
Read that diagram closely, because it clears up the most common misconception. A branch is not a live mirror of its parent. Rows the parent gains after you branch do not show up in the branch, so an old branch shows old data. If you ever wonder why a new production row isn’t appearing in a branch, the answer is timing: the branch was cut before that row existed. The same independence runs the other way, and that’s the part that makes branches safe: rows your branch writes never flow back to the parent. You can do anything to a branch, fill it with garbage or drop a table, and the parent doesn’t feel it.
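If it helps to see that independence run in both directions, here is a minimal TypeScript sketch. It assumes a simplified model in which each branch is an append-only log plus a cutoff into its parent’s log; BranchTimeline and its methods are hypothetical names, not Neon’s internals.

```typescript
// Toy model of point-in-time branching. BranchTimeline is an illustrative name.
class BranchTimeline {
  private readonly parent: BranchTimeline | null;
  // How many of the parent's own log entries existed when this branch was cut.
  private readonly cutoff: number;
  private readonly log: string[] = [];

  constructor(parent: BranchTimeline | null = null) {
    this.parent = parent;
    this.cutoff = parent === null ? 0 : parent.log.length;
  }

  insert(row: string): void {
    this.log.push(row);
  }

  // Rows visible here: the parent's history up to the cutoff, then own writes.
  rows(limit: number = this.log.length): string[] {
    const inherited: string[] =
      this.parent === null ? [] : this.parent.rows(this.cutoff);
    return [...inherited, ...this.log.slice(0, limit)];
  }
}

const prod = new BranchTimeline();
prod.insert("order-1");
prod.insert("order-2");

const monday = new BranchTimeline(prod); // branch cut here

prod.insert("order-3"); // production moves on after the branch
monday.insert("test-garbage"); // the branch writes too

console.log(prod.rows().join(", ")); // order-1, order-2, order-3
console.log(monday.rows().join(", ")); // order-1, order-2, test-garbage
```

The branch never sees order-3, and production never sees test-garbage. The cutoff is fixed at creation and nothing ever moves it.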
That safety is what makes branching worth building a workflow around. Here’s what it unlocks, each expanded in the sections that follow:
- A branch per preview deployment. Every open PR gets its own isolated, prod-shaped database, with no shared mess. (Next section, in full.)
- Two long-lived branches as your baseline. A production branch (the primary, the one real users hit) and a staging branch forked off it that lives indefinitely. This pair is the course’s working setup; treat it as the default everything else hangs off.
- What-if and migration rehearsal. About to run a destructive migration against production and not entirely sure it’s safe? Branch production, run the scary thing on the branch, inspect the result, and throw the branch away if it went wrong. The branch is your undo button. The actual mechanics of running a migration this way come a few chapters out, in the migrations chapter. For now just know the capability exists, and that it’s the main reason branching belongs in a senior’s toolkit.
There’s nothing for you to type in this section. On a real project these branches are created by the platform and the integration we’re about to meet, not by you running a command. The point here is the model, not the keystrokes.
A database branch for every preview
This is the headline, and the direct answer to the first question we opened with: how does every preview get its own data? Watch one pull request travel from “opened” to “merged” and the answer becomes something you can picture rather than recite.
PR opened. A developer pushes a Git branch and opens a pull request on GitHub. So far the database side is unchanged: production and staging exist, and there’s no preview branch yet.
Neon branch created. The Vercel–Neon integration sees the new deployment and instantly creates a Neon branch off the parent, named something like preview/<git-branch>. It’s copy-on-write, so it shares storage with its parent and costs almost nothing to create.
Preview deployed with the branch’s URL. Vercel builds the preview and injects that branch’s connection string as the deployment’s DATABASE_URL. The preview app now talks to its own private database, so seeding it or mutating it can’t touch production or any other PR.
Iterate safely. The developer pushes more commits, runs migrations against the preview branch, and QA clicks through real, isolated data. Nothing here leaks into production or collides with another open PR.
PR merged or closed → branch deleted. With auto-delete enabled, the integration tears the preview branch down and reclaims the storage its diverged pages used. Back to just production and staging.
Scrub through that and the shape is clear: a database appears for the life of a PR and vanishes when the PR is done. That last step, the branch being thrown away, isn’t incidental. It’s the whole reason this is affordable, and it sets up the next section.
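The lifecycle can also be read as a tiny state machine. The TypeScript below is a sketch of that reading, not the integration’s code: PrEvent, previewBranchName, applyEvent, and replay are hypothetical, and only the preview/<git-branch> naming pattern comes from the steps above. It replays the same events with auto-delete on and off.

```typescript
// Toy model of the preview lifecycle. All names are illustrative.
type PrEvent = { kind: "opened" | "closed"; gitBranch: string };

function previewBranchName(gitBranch: string): string {
  return `preview/${gitBranch}`;
}

// Returns the branch list after one event. "closed" covers merged or closed.
function applyEvent(
  branches: readonly string[],
  ev: PrEvent,
  autoDelete: boolean,
): string[] {
  const target = previewBranchName(ev.gitBranch);
  if (ev.kind === "opened") {
    return branches.indexOf(target) !== -1
      ? [...branches]
      : [...branches, target];
  }
  // On close, the preview branch only disappears if auto-delete is on.
  return autoDelete ? branches.filter((b) => b !== target) : [...branches];
}

function replay(events: readonly PrEvent[], autoDelete: boolean): string[] {
  let branches: string[] = ["production", "staging"];
  for (const ev of events) {
    branches = applyEvent(branches, ev, autoDelete);
  }
  return branches;
}

const prEvents: PrEvent[] = [
  { kind: "opened", gitBranch: "fix-invoices" },
  { kind: "opened", gitBranch: "new-onboarding" },
  { kind: "closed", gitBranch: "fix-invoices" },
  { kind: "closed", gitBranch: "new-onboarding" },
];

console.log(replay(prEvents, true).join(", "));
// production, staging
console.log(replay(prEvents, false).join(", "));
// production, staging, preview/fix-invoices, preview/new-onboarding
```

With auto-delete on, the project returns to its two long-lived branches. With it off, every closed PR leaves a branch behind.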
This is what makes the pattern matter in practice: it replaces the single shared staging database, which is a well-known anti-pattern. Picture the old way, with one staging database and two PRs in flight. Reviewer A seeds it with test invoices, reviewer B’s destructive test deletes a table A was relying on, and a third person’s migration leaves it half-broken, so now nobody believes anything they see in staging. A shared staging database degrades because everyone writes to it and no one owns it. A branch per PR removes that problem entirely: every preview gets a clean, isolated, production-shaped database that exactly one PR can touch. And because branches are nearly free to create and, as we’re about to see, nearly free to keep around while idle, you can genuinely afford one per open pull request without thinking about it.
One detail to recognize rather than configure, because it decides how cleanup happens. As of 2026 the Vercel–Neon integration comes in two flavors, and the only thing they differ on is who deletes the branch:
- Neon-managed: you switch on “automatically delete obsolete branches,” and cleanup runs the next time a preview deploys.
- Vercel-managed: a preview branch is deleted when Vercel removes the matching deployment, governed by Vercel’s deployment-retention policy, which defaults to roughly six months.
Don’t memorize the setup; the deployment chapter later in the course owns the actual wiring. The one thing to carry out of this is the watch-out it implies: auto-delete is a setting you have to turn on and understand. If you leave it off, preview branches quietly accumulate until your project is cluttered with the database of every PR you ever opened. That isn’t a flaw in branching, just a switch nobody flipped.
How the preview app actually opens a connection over that injected URL, which driver it uses and why the endpoint has a -pooler in it, is the next lesson’s entire subject. Here it’s enough to say that the integration injects a connection string, and the app reads it.
Compute sleeps when nobody’s looking
Recall the second pain from the storage/compute split: an idle traditional server burns money because its process is always on. That pain disappears for the same reason branching was cheap.
Because a Neon compute holds no permanent state, the platform can do something a welded server never could. After a stretch of no activity, it suspends the compute entirely: the process goes away. While it’s gone you pay nothing for compute, and your data is completely safe, because the data was never in the process; it’s sitting in storage, untouched. The moment a query arrives, the platform spins a fresh compute back up, points it at the same storage, and serves the query. The database fell asleep and woke up, and nothing was lost. The current numbers, as of mid-2026:
- Idle window: after about five minutes with no activity, the compute suspends. On paid (Scale) plans this is a dial you can turn, as short as a minute or all the way to “never suspend, stay awake always.” On the free tier the five-minute window is fixed; you take it as it comes.
- Wake time (the cold start): roughly 300–800 milliseconds to resume, with total time-to-first-query usually landing between half a second and a second.
This is the answer to the second question we opened with. An idle preview branch with no traffic bills you for storage only, because its compute has dropped to zero. That’s what makes “a branch per PR” sustainable rather than a nice idea you can’t afford. Ten open pull requests are not ten databases burning compute around the clock. They’re ten cheap storage snapshots whose compute flickers on only when someone actually opens that preview, then drops back to zero when they leave. Cheap-to-create branches and scale-to-zero compute are the two halves of one affordability story: branching makes the copy nearly free, and scale-to-zero makes idle compute free.
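A back-of-the-envelope calculation makes the gap concrete. The TypeScript below is only arithmetic, and every usage number in it is an assumption chosen for illustration (two ten-minute review sessions per preview per day, a 30-day month). It counts compute-hours rather than dollars, since prices drift.

```typescript
// Rough compute-hours for ten preview branches over one month.
// Every usage number below is an assumption for illustration.
const previewCount = 10;
const daysInMonth = 30;
const sessionsPerDay = 2; // assumed: each preview is opened twice a day
const minutesPerSession = 10; // assumed: a reviewer clicks around for ten minutes
const idleWindowMinutes = 5; // compute stays awake this long after the last query

// Always-on: every database runs 24 hours a day whether or not anyone looks.
const alwaysOnHours = previewCount * 24 * daysInMonth;

// Scale-to-zero: compute is awake only during a session plus the idle window.
const awakeHoursPerDay =
  (sessionsPerDay * (minutesPerSession + idleWindowMinutes)) / 60;
const scaleToZeroHours = previewCount * awakeHoursPerDay * daysInMonth;

console.log(alwaysOnHours); // 7200
console.log(scaleToZeroHours); // 150
console.log(alwaysOnHours / scaleToZeroHours); // 48
```

Under these assumptions the always-on fleet uses 48 times the compute. Change the session numbers and the ratio moves, but a preview that nobody opens always contributes zero.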
Scale-to-zero is a trade, not a free lunch, and the experienced move is knowing exactly what you’re trading. The first query after a suspend pays the wake latency, that 300–800ms. Every query after that, on the now-warm compute, is fast. So the cost lands entirely on the first query after an idle period, and the skill is deciding, per branch, whether that cost is acceptable there:
Framed as a per-branch decision, cold start stops being a gotcha and becomes a setting you choose deliberately: previews scale to zero where cheap matters more, and production stays warm where speed matters more.
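To see where the wake latency lands, here is a small TypeScript simulation. It is a simplified sketch, not a model of Neon’s scheduler: paysColdStart is a hypothetical helper, the compute is assumed to start suspended, queries are treated as instantaneous, and the idle clock restarts at each query’s arrival.

```typescript
// Toy model: which queries pay the wake latency for a given idle window?
function paysColdStart(
  arrivalsSec: readonly number[],
  idleWindowSec: number,
): boolean[] {
  const result: boolean[] = [];
  let lastQueryAt: number | null = null; // null = no query yet, compute suspended
  for (const t of arrivalsSec) {
    // Cold if nothing has run yet, or the gap since the last query
    // reached the idle window and the compute was suspended.
    const cold = lastQueryAt === null || t - lastQueryAt >= idleWindowSec;
    result.push(cold);
    lastQueryAt = t;
  }
  return result;
}

const arrivals = [0, 10, 200, 600, 620]; // query arrival times in seconds

// Preview branch: five-minute idle window.
const previewCold = paysColdStart(arrivals, 5 * 60);
console.log(previewCold.join(", ")); // true, false, false, true, false

// Production tuned to never suspend: only the initial start-up is cold here.
const productionCold = paysColdStart(arrivals, Infinity);
console.log(productionCold.join(", ")); // true, false, false, false, false
```

The preview pays twice: once to start, and once after the 400-second gap. The never-suspend setting pays only the initial start in this model, which is the trade you are buying for production.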
One honest limit is worth noting. Neon’s free plan gives you something like 100 compute-hours per project a month, half a gigabyte of storage, and around ten branches. The exact numbers drift, so treat those as “roughly,” and check the pricing page in the resources below for the current figures. That’s plenty to learn this model on and run a side project, but it is not sized for live production traffic, where you’ll burn through the compute-hours under real load and need a paid plan. The free tier teaches the architecture perfectly; it just isn’t where a real product’s production lives.
The shape of a production Neon project
Pull the pieces together and a recognizable shape falls out, the one you should be able to read off a real project at a glance. We’re describing it, not building it; you don’t touch Vercel in this lesson. But when you see this shape later, it should look familiar.
[Figure: the production Neon project as a branch tree. The production branch is labeled “DATABASE_URL points here · kept warm.”]
Read top to bottom, that figure is the entire production setup:
- A Neon project holding a production primary branch (this is the branch your DATABASE_URL points at in production) and a long-lived staging branch forked off it.
- A Vercel–Neon integration that cuts an ephemeral branch per preview deployment, injects that branch’s connection string into the preview, and, with auto-delete switched on, cleans the branch up afterward so they don’t accumulate.
- Production compute tuned to stay warm, so no real user ever waits through a cold start, while preview and dev branches are left to scale to zero, where the savings matter more than the wake latency.
Now look at production sitting in that tree and notice what protects it: only a name. This is the watch-out worth carrying out of the whole lesson. production is just a branch by convention. The platform does not stop anyone from running a destructive migration straight against it without branching first. There’s no “are you sure,” no technical guard of any kind. What protects production is discipline: branch it, test the change on the branch, then apply it for real. Teaching that habit of branch, run, then promote is the migrations chapter’s whole job. Neon gives you the safety net, but using it before you do something irreversible is up to you.
Three things this section deliberately leaves for later, so you know where they live. Provisioning a project and wiring the integration for real is the deployment chapter, later in the course. The migration discipline of branch, run, promote is the migrations chapter. And how your application code actually connects to whichever branch’s URL it’s handed is the very next lesson.
Check your understanding
You’ve got the whole model now: storage split from compute, branches that share storage, compute that sleeps. Before the next lesson dives into how the app connects, make sure you can use this model, mapping a real situation to the right move and avoiding the two misconceptions that catch people most often.
Start with the decisions. Each scenario below is a spot you’ll actually land in; follow it to the move an experienced engineer would make.
Branching is your undo button. Cut a branch off production, run the destructive migration there, and inspect the result. If it’s wrong, throw the branch away and production never knew. (The branch, run, promote workflow itself is the migrations chapter.)
A shared staging database is the problem here; a database per PR solves it. Each preview gets its own isolated, prod-shaped branch, so one PR’s test data can’t corrupt another’s.
The first request after an idle period is waking a scaled-to-zero compute. That’s fine for dev and previews, but not for users. Tune production to stay warm, with a longer or disabled idle window, so real traffic never pays the wake latency.
Preview branches don’t clean themselves up unless you tell them to. Turn on auto-delete so each branch is torn down when its PR closes, and the storage its diverged pages used gets reclaimed instead of creeping.
The order matters more than any single answer: a senior reads the situation first, then reaches for the Neon feature that fits it. The two questions below cover the misconceptions that cause the most confusion later, so it’s worth getting them right.
On Monday you create a branch off production. On Tuesday, three new rows are inserted into production — nothing else touches the branch. When you query the branch on Wednesday, what does it return?
- … production as it changes.
- … production.
- … production into the branch.

… production’s Tuesday writes stay on production and never appear in it. The same independence runs the other way, which is the whole point of a branch: anything you write to the branch never leaks back to production, so you can experiment freely.

A preview branch has sat with zero traffic for the last hour; nobody opened its preview. The branch still holds the few pages it diverged from production. Which part of it is Neon billing you for during that idle hour?
Where to read more
These are the canonical pages worth opening, especially for the numbers in this lesson, which drift over time, so trust the docs over any figure printed here.
The canonical reference for the idle window and wake latency — the numbers that move.
The copy-on-write model in depth: how a branch shares its parent's storage.
The integration that creates a database branch for every Vercel preview deployment.
The authoritative source for free-tier limits and compute-hour allowances.
A one-minute interactive demo: branch a 1 TB database, change rows, and revert — copy-on-write made tangible.
A distributed-systems engineer's deep dive into the storage/compute split — for when you want the layer below this lesson.