Chapter 36, Lesson 3

Neon branching and scale-to-zero

How Neon's separation of storage from compute makes serverless Postgres give every preview deployment its own copy-on-write database branch that scales to zero when idle.

You’re shipping a SaaS in 2026. Every pull request your team opens gets its own live preview URL, a full running copy of the app a reviewer can click through before anyone merges. And like everything else in your stack, the database bills you for what you use, not a flat rate for a box sitting in a rack. Picture the production database behind all of that. What does it actually look like?

Two harder questions hide inside that one. First, each preview needs its own data. If two reviewers are poking at two different PRs against the same shared database, they corrupt each other’s tests, and nobody trusts what they see. So how does every preview get an isolated copy? Second, if there’s a database behind every open PR, are you paying a monthly bill for each one, even the ones nobody has opened in a week?

On a traditional Postgres server both of these are dealbreakers. A per-PR database is too slow to copy and too expensive to leave idle, so teams don’t even try. This lesson is about why that ceiling doesn’t exist on Neon. Both answers fall out of a single architectural choice Neon makes, and by the end you’ll be able to look at a real Neon project and explain why this setup is both cheap and safe.

The previous lesson ended by pointing your DATABASE_URL at a Neon branch and promising the explanation later. Here it is: we’re going to find out what that branch is, and why branching is the entire pitch.

Everything in this lesson (branching, the database-per-preview trick, the bill that drops to zero) follows from one fact about how Neon is built. Once you understand that fact, the rest stops being a list of features to memorize and becomes a set of things you could have predicted. That payoff is why this section moves slowly.

Start with the model you already hold, since the new model is easiest to see as a change from it. A traditional Postgres server is one process welded to its disk. There’s a program running, the thing that parses your SQL, plans queries, and reads and writes rows, and there are the data files it owns, sitting on a disk bolted to that same machine. The running program and the stored data are a single inseparable unit. They boot together, they live together, they die together.

That welding is invisible until you try to do something with it, and then you feel it in three specific places:

  • Copying the database means physically copying the disk. To duplicate a 10 GB database you run pg_dump and restore it somewhere else: minutes of work and a second, full 10 GB on disk. The data has nowhere to live except attached to a process, so a copy is a whole new attached disk.
  • An idle server still costs money. The process runs whether or not anyone is querying it. A database behind a preview nobody has touched in a week is burning the same compute as one under load, because the process is always on. Always-on is the only state it has.
  • You can’t add a read-only copy without replicating everything. Want a second process to spread out read traffic? It needs its own disk, which means copying all the data and keeping the copy in sync. The disk is chained to one process.

Each of those three pains is about to dissolve.

The reframe is a single move. Neon splits the process apart from the disk. Storage becomes its own thing: a log of page versions living in durable cloud storage, completely independent of any running program. Compute becomes its own thing too: a stateless Postgres process the platform starts on demand and points at that storage. Stateless is the part that matters here: the compute holds no permanent state of its own. Kill it and the data is untouched, because the data was never in the process; it was in storage all along. Start a brand-new process, point it at the same storage, and it picks up exactly where the last one left off.
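
As a rough mental model of that split, here is a minimal TypeScript sketch. The `PageLog` and `Compute` classes are hypothetical names invented for illustration, not Neon’s API or internals. The sketch shows only one thing: a process that holds no state of its own can be discarded and replaced without losing data.

```typescript
// Toy model of the storage/compute split. PageLog and Compute are
// hypothetical names for illustration; this is not Neon's implementation.

interface PageVersion {
  page: number;
  value: string;
}

// Storage: an append-only log of page versions that outlives any process.
class PageLog {
  private versions: PageVersion[] = [];

  append(page: number, value: string): void {
    this.versions.push({ page: page, value: value });
  }

  // Scan the log in order so the most recent version of a page wins.
  read(page: number): string | undefined {
    let latest: string | undefined = undefined;
    for (const version of this.versions) {
      if (version.page === page) {
        latest = version.value;
      }
    }
    return latest;
  }
}

// Compute: holds a reference to storage and no data of its own.
class Compute {
  private storage: PageLog;

  constructor(storage: PageLog) {
    this.storage = storage;
  }

  write(page: number, value: string): void {
    this.storage.append(page, value);
  }

  read(page: number): string | undefined {
    return this.storage.read(page);
  }
}

const storage = new PageLog();

function writeThenDiscard(): void {
  const first = new Compute(storage);
  first.write(1, "invoice #1");
  // `first` goes out of scope here: the "process" is gone, the log is not.
}
writeThenDiscard();

// A brand-new compute pointed at the same storage sees the earlier write.
const second = new Compute(storage);
const recovered = second.read(1);
```

Because `Compute` keeps nothing but a reference, any number of instances could point at the same `PageLog`, which is the property the rest of the lesson leans on.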

[Figure: Left, Traditional Postgres: a Postgres process and its disk (data files), one inseparable unit. Right, Neon: several stateless Postgres computes, started on demand and pointed at one storage layer, a log of page versions.]
Neon's storage is durable and shared; its compute is disposable. Every feature in this lesson is a consequence of that one split.

Look at the right side of that figure. Each of the three pains turns into its opposite:

  • Storage is shared, so a second process doesn’t need its own copy of the data. That’s what makes branching cheap. (Next section.)
  • Compute is disposable, so when nobody’s querying, the platform can throw the process away entirely and you pay nothing for it. That’s scale-to-zero. (A couple of sections down.)
  • Multiple computes can read one storage, so a read-only replica is just another process pointed at the same log, with no replication and no second disk. Neon supports this for scaling reads and analytics. The course never needs it, so we name it once and move on.

Three of Postgres’s oldest constraints disappear, all for the same reason. Keep the split in mind, because every section from here just asks the same question: given that storage and compute are separate, what becomes possible?

Branches share storage, so they’re nearly free

A branch is a new compute pointed at a snapshot of a parent’s storage at a moment in time. The snapshot is the interesting part, so look closely at how it’s taken. It’s copy-on-write: the new branch starts by sharing every existing page with its parent rather than duplicating any of them. Nothing is copied up front. Only when the branch actually changes a page (you insert a row, you update a value) does that one changed page get written fresh and kept separate. Everything you haven’t touched is still, physically, the parent’s pages.

So put the two models next to each other. To copy a 10 GB database the old way takes pg_dump, a restore, minutes of work, and a fresh 10 GB on disk. To branch it on Neon takes under a second and roughly zero added storage, because at the instant of branching the branch shares all 10 GB and has changed nothing. You only ever pay storage for the pages your branch diverges on. A branch you create and barely touch costs almost nothing to keep around, and as you’re about to see, that “almost nothing” is what makes the whole economic story work.
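
To put rough numbers on that comparison, here is a back-of-envelope sketch in TypeScript. It assumes the default Postgres page size of 8 KiB, treats the 10 GB database as 10 GiB, and counts whole changed pages. The function names are invented for illustration, and a real storage engine’s accounting will differ in detail.

```typescript
// Back-of-envelope only. Assumes the default 8 KiB Postgres page size and
// counts whole changed pages; a real system's accounting will differ.

const PAGE_SIZE_BYTES = 8 * 1024;
const GIB = 1024 * 1024 * 1024;

// Old way: a copy duplicates every byte, touched or not.
function fullCopyAddedBytes(databaseBytes: number): number {
  return databaseBytes;
}

// Copy-on-write: only pages the branch has changed take new space.
function branchAddedBytes(changedPages: number): number {
  return changedPages * PAGE_SIZE_BYTES;
}

const databaseBytes = 10 * GIB;
const totalPages = databaseBytes / PAGE_SIZE_BYTES; // 1,310,720 pages

const copyCost = fullCopyAddedBytes(databaseBytes); // all 10 GiB again
const branchCostAtFork = branchAddedBytes(0); // 0 bytes
const branchCostAfterEdits = branchAddedBytes(500); // 4,096,000 bytes, about 3.9 MiB
```

Under these assumptions, a branch that changes 500 of roughly 1.3 million pages adds a few megabytes, where a full copy adds the whole 10 GiB.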

There’s one thing about branches that trips up nearly everyone, so it’s worth getting clear now. A branch captures its parent as it was at the moment you created it, and then the two go their separate ways. This is what point-in-time means, and it has real consequences:

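[Figure: two timelines. Production (primary) holds rows A, B on Monday and A, B, C on Tuesday. The branch holds A, B on Monday, as captured at the fork, and A, B, Z on Tuesday.]
A branch is a snapshot, not a live mirror. After the fork, new rows on either side stay on their own side.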

Read that diagram closely, because it clears up the most common misconception. A branch is not a live mirror of its parent. Rows the parent gains after you branch do not show up in the branch, so an old branch shows old data. If you ever wonder why a new production row isn’t appearing in a branch, the answer is timing: the branch was cut before that row existed. The same independence runs the other way, and that’s the part that makes branches safe: rows your branch writes never flow back to the parent. You can do anything to a branch, fill it with garbage or drop a table, and the parent doesn’t feel it.
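
The same Monday/Tuesday story can be replayed in a toy TypeScript model. `Branch` here is a hypothetical class written for this illustration, not Neon’s API: it remembers how many of the parent’s rows existed at fork time and never looks past that point.

```typescript
// Toy model of point-in-time branching. Branch is a hypothetical class for
// illustration, not Neon's API.

class Branch {
  private parent: Branch | null;
  private forkPoint: number; // how many of the parent's own rows existed at fork time
  private ownRows: string[] = [];

  constructor(parent: Branch | null = null) {
    this.parent = parent;
    this.forkPoint = parent === null ? 0 : parent.ownRows.length;
  }

  insert(row: string): void {
    this.ownRows.push(row);
  }

  // Rows visible on this branch: the parent's rows as of the fork, then our own.
  rows(ownLimit: number = this.ownRows.length): string[] {
    const inherited: string[] =
      this.parent === null ? [] : this.parent.rows(this.forkPoint);
    return inherited.concat(this.ownRows.slice(0, ownLimit));
  }
}

const production = new Branch();
production.insert("A");
production.insert("B"); // Monday's data

const preview = new Branch(production); // fork on Monday

production.insert("C"); // Tuesday, on production only
preview.insert("Z"); // Tuesday, on the branch only

const productionRows = production.rows(); // ["A", "B", "C"]
const previewRows = preview.rows(); // ["A", "B", "Z"]
```

Row C never reaches the branch and row Z never reaches production, which is the two-way independence described above.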

That safety is what makes branching worth building a workflow around. Here’s what it unlocks, each expanded in the sections that follow:

  • A branch per preview deployment. Every open PR gets its own isolated, prod-shaped database, with no shared mess. (Next section, in full.)
  • Two long-lived branches as your baseline. A production branch (the primary, the one real users hit) and a staging branch forked off it that lives indefinitely. This pair is the course’s working setup; treat it as the default everything else hangs off.
  • What-if and migration rehearsal. About to run a destructive migration against production and not entirely sure it’s safe? Branch production, run the scary thing on the branch, inspect the result, and throw the branch away if it went wrong. The branch is your undo button. The actual mechanics of running a migration this way come a few chapters out, in the migrations chapter. For now just know the capability exists, and that it’s the main reason branching belongs in a senior’s toolkit.

There’s nothing for you to type in this section. On a real project these branches are created by the platform and the integration we’re about to meet, not by you running a command. The point here is the model, not the keystrokes.

This is the headline, and the direct answer to the first question we opened with: how does every preview get its own data? Watch one pull request travel from “opened” to “merged” and the answer becomes something you can picture rather than recite.

[Figure, shown once per step: a pull request and its preview deployment beside a Neon project whose branches all sit on shared storage (the page log). production is the primary and stays warm; staging is long-lived; preview/feature-x is copy-on-write with ~0 storage, and is marked deleted, with its storage reclaimed, in the final step.]

PR opened. A developer pushes a Git branch and opens a pull request on GitHub. So far the database side is unchanged: production and staging exist, and there’s no preview branch yet.

Neon branch created. The Vercel–Neon integration sees the new deployment and instantly creates a Neon branch off the parent, named something like preview/<git-branch>. It’s copy-on-write, so it shares storage with its parent and costs almost nothing to create.

Preview deployed with the branch’s URL. Vercel builds the preview and injects that branch’s connection string as the deployment’s DATABASE_URL. The preview app now talks to its own private database, so seeding it or mutating it can’t touch production or any other PR.

Iterate safely. The developer pushes more commits, runs migrations against the preview branch, and QA clicks through real, isolated data. Nothing here leaks into production or collides with another open PR.

PR merged or closed → branch deleted. With auto-delete enabled, the integration tears the preview branch down and reclaims the storage its diverged pages used. Back to just production and staging.

Walk through those steps and the shape is clear: a database appears for the life of a PR and vanishes when the PR is done. That last step, the branch being thrown away, isn’t incidental. It’s the whole reason this is affordable, and it sets up the next section.

This is what makes the pattern matter in practice: it replaces the single shared staging database, which is a well-known anti-pattern. Picture the old way, with one staging database and two PRs in flight. Reviewer A seeds it with test invoices, reviewer B’s destructive test deletes a table A was relying on, and a third person’s migration leaves it half-broken, so now nobody believes anything they see in staging. A shared staging database degrades because everyone writes to it and no one owns it. A branch per PR removes that problem entirely: every preview gets a clean, isolated, production-shaped database that exactly one PR can touch. And because branches are nearly free to create and, as we’re about to see, nearly free to keep around while idle, you can genuinely afford one per open pull request without thinking about it.

One detail to recognize rather than configure, because it decides how cleanup happens. As of 2026 the Vercel–Neon integration comes in two flavors, and the only thing they differ on is who deletes the branch:

  • Neon-managed: you switch on “automatically delete obsolete branches,” and cleanup runs the next time a preview deploys.
  • Vercel-managed: a preview branch is deleted when Vercel removes the matching deployment, governed by Vercel’s deployment-retention policy, which defaults to roughly six months.

Don’t memorize the setup; the deployment chapter later in the course owns the actual wiring. The one thing to carry out of this is the watch-out it implies: auto-delete is a setting you have to turn on and understand. If you leave it off, preview branches quietly accumulate until your project is cluttered with the database of every PR you ever opened. That isn’t a flaw in branching, just a switch nobody flipped.
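
To make the cleanup rule concrete, here is a small TypeScript sketch of which branches count as obsolete. It illustrates the rule only and is not how either integration is implemented. It assumes preview branches follow the preview/<git-branch> naming convention, and the function and constant names are invented for this example.

```typescript
// Illustration of the cleanup rule, not the integration's actual code.
// Assumes preview branches follow the preview/<git-branch> naming convention.

const PREVIEW_PREFIX = "preview/";

function obsoletePreviewBranches(
  neonBranches: string[],
  openGitBranches: string[]
): string[] {
  const stillNeeded = openGitBranches.map(
    (gitBranch) => PREVIEW_PREFIX + gitBranch
  );
  return neonBranches.filter((name) => {
    const isPreview = name.indexOf(PREVIEW_PREFIX) === 0;
    const hasOpenPr = stillNeeded.indexOf(name) !== -1;
    return isPreview && !hasOpenPr; // long-lived branches are never candidates
  });
}

const branches = [
  "production",
  "staging",
  "preview/feature-x",
  "preview/fix-login",
  "preview/old-experiment",
];

// Only feature-x still has an open pull request.
const toDelete = obsoletePreviewBranches(branches, ["feature-x"]);
// toDelete is ["preview/fix-login", "preview/old-experiment"]
```

With auto-delete off, nothing ever runs the equivalent of this filter, and the list of leftover preview branches only grows.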

How the preview app actually opens a connection over that injected URL, which driver it uses and why the endpoint has a -pooler in it, is the next lesson’s entire subject. Here it’s enough to say that the integration injects a connection string, and the app reads it.
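
A minimal TypeScript sketch of that handoff, stopping short of opening any connection. The `Env` type and helper names are hypothetical, and the connection string is a placeholder rather than a real Neon endpoint. The env object is passed in explicitly so the example stays self-contained.

```typescript
// Sketch only: the env shape and helper names are hypothetical, and the
// connection string is a placeholder, not a real Neon endpoint.

type Env = { [key: string]: string | undefined };

// Naming convention from the walkthrough: preview/<git-branch>.
function previewBranchName(gitBranch: string): string {
  return "preview/" + gitBranch;
}

// Stand-in for what the integration does: give one deployment its own URL.
function withBranchUrl(baseEnv: Env, branchUrl: string): Env {
  return { ...baseEnv, DATABASE_URL: branchUrl };
}

// What the app does: read whatever URL it was handed, and fail loudly if none.
function databaseUrlFrom(env: Env): string {
  const url = env["DATABASE_URL"];
  if (url === undefined || url === "") {
    throw new Error("DATABASE_URL is not set for this deployment");
  }
  return url;
}

const baseEnv: Env = { NODE_ENV: "production" };
const previewEnv = withBranchUrl(
  baseEnv,
  "postgres://user:password@preview-host.example/app"
);

const branchName = previewBranchName("feature-x"); // "preview/feature-x"
const previewUrl = databaseUrlFrom(previewEnv);
```

The app code never decides which branch it talks to. It reads one variable, and the deployment it runs in determines what that variable holds.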

Recall the second pain from the storage/compute split: an idle traditional server burns money because its process is always on. That pain disappears for the same reason branching was cheap.

Because a Neon compute holds no permanent state, the platform can do something a welded server never could. After a stretch of no activity, it suspends the compute entirely: the process goes away. While it’s gone you pay nothing for compute, and your data is completely safe, because the data was never in the process; it’s sitting in storage, untouched. The moment a query arrives, the platform spins a fresh compute back up, points it at the same storage, and serves the query. The database fell asleep and woke up, and nothing was lost. The current numbers, as of mid-2026:

  • Idle window: after about five minutes with no activity, the compute suspends. On paid (Scale) plans this is a dial you can turn, as short as a minute or all the way to “never suspend, stay awake always.” On the free tier the five-minute window is fixed; you take it as it comes.
  • Wake time (the cold start): roughly 300–800 milliseconds to resume, with total time-to-first-query usually landing between half a second and a second.
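
The suspend-and-wake cycle can be sketched as a tiny simulation in TypeScript. `SuspendableCompute` is a hypothetical class, the five-minute window comes from the figures above, and the 500 ms wake cost is an assumed value picked from inside the 300–800 ms range. Time is passed in as a number so the example is deterministic.

```typescript
// Toy simulation of scale-to-zero. SuspendableCompute is hypothetical, and
// the constants are illustrative values taken from the ranges in the text.

const IDLE_TIMEOUT_MS = 5 * 60 * 1000; // about five minutes of inactivity
const COLD_START_MS = 500; // assumed wake cost, inside the 300-800 ms range

class SuspendableCompute {
  // null means the compute has not served a query yet.
  private lastActivityMs: number | null = null;

  isRunning(nowMs: number): boolean {
    return (
      this.lastActivityMs !== null &&
      nowMs - this.lastActivityMs < IDLE_TIMEOUT_MS
    );
  }

  // Serves a query at time nowMs and returns the extra wake latency it paid.
  query(nowMs: number): number {
    const wasRunning = this.isRunning(nowMs);
    this.lastActivityMs = nowMs;
    return wasRunning ? 0 : COLD_START_MS;
  }
}

const previewCompute = new SuspendableCompute();

const firstQuery = previewCompute.query(0); // 500: nothing was running yet
const warmQuery = previewCompute.query(1000); // 0: one second later, still warm
const afterIdle = previewCompute.query(10 * 60 * 1000); // 500: idle past the window
```

Only the first query after an idle stretch pays the wake cost; everything in between is served by the already-running process.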

This is the answer to the second question we opened with. An idle preview branch with no traffic bills you for storage only, because its compute has dropped to zero. That’s what makes “a branch per PR” sustainable rather than a nice idea you can’t afford. Ten open pull requests are not ten databases burning compute around the clock. They’re ten cheap storage snapshots whose compute flickers on only when someone actually opens that preview, then drops back to zero when they leave. Cheap-to-create branches and scale-to-zero compute are the two halves of one affordability story: branching makes the copy free, and scale-to-zero makes the idle free.
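
A quick worked comparison shows the size of the gap. The usage figure below (each preview’s compute running about half an hour a day) is a made-up assumption chosen only to show the shape of the arithmetic, and the function name is invented for this example.

```typescript
// Hypothetical usage figures, chosen only to show the shape of the comparison.

function computeHoursPerMonth(
  branches: number,
  runningHoursPerDay: number,
  days: number
): number {
  return branches * runningHoursPerDay * days;
}

const openPullRequests = 10;
const daysInMonth = 30;

// Always-on: every preview's compute runs around the clock.
const alwaysOn = computeHoursPerMonth(openPullRequests, 24, daysInMonth); // 7200

// Scale-to-zero: assume each preview's compute runs about 0.5 hours a day.
const scaleToZero = computeHoursPerMonth(openPullRequests, 0.5, daysInMonth); // 150

const ratio = alwaysOn / scaleToZero; // 48
```

Under that assumption the same ten previews consume 150 compute-hours instead of 7,200. The exact ratio depends entirely on how often previews are opened, but the direction does not.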

Scale-to-zero is a trade, not a free lunch, and the experienced move is knowing exactly what you’re trading. The first query after a suspend pays the wake latency, that 300–800 milliseconds. Every query after that, on the now-warm compute, is fast. So the cost lands entirely on the first query after an idle period, and the skill is deciding, per branch, whether that cost is acceptable there.

Framed as a per-branch decision, cold start stops being a gotcha and becomes a setting you choose deliberately: previews scale to zero where cheap matters more, and production stays warm where speed matters more.
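
One way to picture that per-branch choice is as a small policy function. The `ComputePolicy` shape below is hypothetical and is not Neon’s settings schema. Treating every non-production branch, staging included, as scale-to-zero is an assumption made for this sketch, as is the 300-second idle window.

```typescript
// Hypothetical policy shape for illustration; not Neon's settings schema.

interface ComputePolicy {
  scaleToZero: boolean;
  // Idle seconds before suspend; null means the compute never suspends.
  suspendAfterSeconds: number | null;
}

function policyFor(branchName: string): ComputePolicy {
  if (branchName === "production") {
    // Real users should never wait through a cold start.
    return { scaleToZero: false, suspendAfterSeconds: null };
  }
  // Assumption: every other branch prefers cheap over instantly warm.
  return { scaleToZero: true, suspendAfterSeconds: 300 };
}

const productionPolicy = policyFor("production");
const previewPolicy = policyFor("preview/feature-x");
```

Writing the decision down like this makes it reviewable: anyone can see which branches accept a cold start and which do not.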

One honest limit is worth noting. Neon’s free plan gives you something like 100 compute-hours per project a month, half a gigabyte of storage, and around ten branches. The exact numbers drift, so treat those as “roughly,” and check the pricing page in the resources below for the current figures. That’s plenty to learn this model on and run a side project, but it is not sized for live production traffic, where you’ll burn through the compute-hours under real load and need a paid plan. The free tier teaches the architecture perfectly; it just isn’t where a real product’s production lives.

Pull the pieces together and a recognizable shape falls out, the one you should be able to read off a real project at a glance. We’re describing it, not building it; you don’t touch Vercel in this lesson. But when you see this shape later, it should look familiar.

[Figure: one Neon project on shared storage (the page log), with three kinds of branch. production: primary, DATABASE_URL points here, kept warm. staging: long-lived. preview/*: ephemeral, one per PR, created and destroyed automatically, scale to zero.]
One project: two long-lived branches (production kept warm, staging), and a fan of ephemeral preview branches that the integration creates and destroys per pull request.

Read top to bottom, that figure is the entire production setup:

  • A Neon project holding a production primary branch (this is the branch your DATABASE_URL points at in production) and a long-lived staging branch forked off it.
  • A Vercel–Neon integration that cuts an ephemeral branch per preview deployment, injects that branch’s connection string into the preview, and, with auto-delete switched on, cleans the branch up afterward so they don’t accumulate.
  • Production compute tuned to stay warm, so no real user ever waits through a cold start, while preview and dev branches are left to scale to zero, where the savings matter more than the wake latency.

Now look at production sitting in that tree and notice what protects it: only a name. This is the watch-out worth carrying out of the whole lesson. production is just a branch by convention. The platform does not stop anyone from running a destructive migration straight against it without branching first. There’s no “are you sure,” no technical guard of any kind. What protects production is discipline: branch it, test the change on the branch, then apply it for real. Teaching that habit of branch, run, then promote is the migrations chapter’s whole job. Neon gives you the safety net, but using it before you do something irreversible is up to you.
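
Since the platform supplies no guard, a team can write the discipline down in its own tooling. The sketch below is a hypothetical example of such a check, with an invented `MigrationPlan` shape. It is not a Neon feature and not the migration workflow the later chapter teaches, only an illustration of the rule that a destructive change is rehearsed on a branch first.

```typescript
// Hypothetical guard, written as if it lived in your own deploy tooling.
// Nothing like this is enforced by the platform.

interface MigrationPlan {
  targetBranch: string;
  destructive: boolean;
  // Branch where the change was tested first, or null if it never was.
  rehearsedOnBranch: string | null;
}

function assertSafeToRun(plan: MigrationPlan): void {
  const hitsProduction = plan.targetBranch === "production";
  if (hitsProduction && plan.destructive && plan.rehearsedOnBranch === null) {
    throw new Error(
      "Refusing to run: rehearse this migration on a branch of production first"
    );
  }
}

const rehearsed: MigrationPlan = {
  targetBranch: "production",
  destructive: true,
  rehearsedOnBranch: "preview/drop-legacy-table",
};

const unrehearsed: MigrationPlan = {
  targetBranch: "production",
  destructive: true,
  rehearsedOnBranch: null,
};

assertSafeToRun(rehearsed); // passes: the change was tested on a branch first
// assertSafeToRun(unrehearsed) would throw.
```

The check is only as good as the habit behind it, but it turns "production is protected by convention" into something a script can refuse to skip.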

Three things this section deliberately leaves for later, so you know where they live. Provisioning a project and wiring the integration for real is the deployment chapter, later in the course. The migration discipline of branch, run, promote is the migrations chapter. And how your application code actually connects to whichever branch’s URL it’s handed is the very next lesson.

You’ve got the whole model now: storage split from compute, branches that share storage, compute that sleeps. Before the next lesson dives into how the app connects, make sure you can use this model, mapping a real situation to the right move and avoiding the two misconceptions that catch people most often.

Start with the decisions. Each scenario below is a spot you’ll actually land in; follow it to the move an experienced engineer would make.

What's the Neon move here?

The order matters more than any single answer: a senior reads the situation first, then reaches for the Neon feature that fits it. The two questions below cover the misconceptions that cause the most confusion later, so it’s worth getting them right.

On Monday you create a branch off production. On Tuesday, three new rows are inserted into production — nothing else touches the branch. When you query the branch on Wednesday, what does it return?

  • Monday’s data, without the three Tuesday rows.
  • All of Monday’s data plus the three Tuesday rows, because the branch tracks production as it changes.
  • The three Tuesday rows, and any rows you wrote to the branch now also show up in production.
  • Nothing, until you manually pull the latest data from production into the branch.

A preview branch has sat with zero traffic for the last hour — nobody opened its preview. The branch still holds the few pages it diverged from production. Which part of it is Neon billing you for during that idle hour?

  • Neither part: once a branch goes idle it costs nothing until a query wakes it.
  • The diverged pages it’s holding, but not the process that serves queries.
  • The process that serves queries, kept running so the first query after idle is instant.
  • A fixed per-branch charge that’s the same whether the branch is busy or asleep.

These are the canonical pages worth opening, especially for the numbers in this lesson, which drift over time, so trust the docs over any figure printed here.