Chapter 94 · Lesson 6

RSC waterfalls and Promise.all

Spot and fix the data-fetching waterfalls that quietly slow Server Components, using Promise.all, React cache, and Suspense streaming.

Here is a puzzle that has stumped more than one engineer. The dashboard loads, and it feels sluggish: not broken, just slow enough to notice. You open the trace, and the server render took about 320ms. So you check the database, because that is where slowness usually hides. Every query is under 80ms. There is no slow third-party call, no cold start, nothing red anywhere. Every single operation is fast, and yet the page is slow. Where did the time go?

The answer is that the component awaited four reads one after another when three of them had no reason to wait. You already know the rule that fixes this. Back in “parallel by default, sequential by dependency” you learned to run independent work in parallel and to go sequential only when a real dependency forces it. This lesson is where that rule earns its keep inside a real Server Component. You have also met this bug before in another costume: the N+1 query was this exact mistake one layer down, at the database, and the waterfall is its twin at the component layer. By the end you will be able to open a slow trace, spot the shape, run the dependency check, and reach for the right fix, because there is more than one.

What a waterfall looks like

Before any code, you need to be able to see this bug, because seeing it is most of the work. So let’s put it on a time axis.

Picture an RSC that needs four things to render. It awaits the user. Then it awaits the org, which needs user.orgId, so that one genuinely has to wait. Then it awaits the invoices, which need org.id. Then the team, which also needs org.id. That is four awaits, each roughly 80ms, stacked end to end: about 320ms.

Now look closer at the dependencies. org depends on user, so that wait is real. But invoices and team both depend on org and on nothing else. They do not depend on each other. There is no reason team should wait for invoices to come back; they could have left at the same time. The fastest this page can possibly go is user, then org, then invoices and team running together, with the page only as slow as the slower of the two. That is user → org → max(invoices, team), roughly 240ms. A quarter of the time gone, and we have not touched a single query.

That dependency graph, who waits for whom, is the thing you will reason about for the rest of this lesson. Hold it in your head: org waits for user, and invoices and team are siblings off org.
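The arithmetic above can be checked mechanically. The sketch below is only an illustration: the `reads` table, the 80ms durations, and the helper names are assumptions taken from the example, not part of any real API. It compares the fully serial total with the best case the dependency graph allows.

```typescript
// Hypothetical model of the dashboard's reads: each has a duration and the
// list of reads it must wait for. Names and timings mirror the example above.
type Read = { ms: number; needs: string[] };

const reads: Record<string, Read> = {
  user: { ms: 80, needs: [] },
  org: { ms: 80, needs: ["user"] },
  invoices: { ms: 80, needs: ["org"] },
  team: { ms: 80, needs: ["org"] },
};

// Fully serial: every read waits for the previous one, so durations add up.
function serialTotal(graph: Record<string, Read>): number {
  return Object.values(graph).reduce((sum, read) => sum + read.ms, 0);
}

// Best case for one read: it starts as soon as everything it needs has finished.
function finishTime(graph: Record<string, Read>, name: string): number {
  const read = graph[name];
  const start = Math.max(0, ...read.needs.map((dep) => finishTime(graph, dep)));
  return start + read.ms;
}

// The page is done when the last read finishes.
function criticalPath(graph: Record<string, Read>): number {
  return Math.max(...Object.keys(graph).map((name) => finishTime(graph, name)));
}

console.log(serialTotal(reads)); // 320
console.log(criticalPath(reads)); // 240
```

The gap between the two numbers is the waterfall: time spent waiting that the dependencies never required.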

The diagram below is the whole lesson in three steps. Scrub through it.

[Timeline, step 1 of 3: user, org, invoices, and team as four 80ms bars laid end to end on a 0 to 340ms axis, totaling ~320ms.]

The waterfall: four awaits, each waiting for the last. Stacked into a descending staircase, they add up to ~320ms.

[Timeline, step 2 of 3: the same four 80ms bars, still totaling ~320ms, with invoices and team marked "same dependency (org), independent of each other".]

Find the independent reads: invoices and team both wait on org, but not on each other. Nothing forces them to run one after the other.

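[Timeline, step 3 of 3: user and org run in series, then invoices and team overlap, totaling ~240ms on the same 0 to 340ms axis; the reclaimed span is marked "saved".]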

Overlap the siblings: they leave together right after org resolves, so four serial waits become three. The page is only as slow as the slower of the pair, about 240ms, and the green span is the ~80ms you reclaimed.

The takeaway is spatial, and it is the only reading skill you need: bars that stack into a staircase are running in series; bars that overlap are running in parallel. Carry that picture into the next section, because you are about to meet it in the wild.

The trace is the only place you see it

What makes this bug hard to catch is that it is invisible from every angle except one.

Go back to the engineer from the opening. They suspected the database, so they profiled each query in isolation. getUser was fast. getOrg was fast. listInvoices and listTeamMembers were both fast. Every part is healthy, so they conclude the database is fine and move on, page still slow and no idea why. They were not wrong about any individual query. The cost is not in any one read; it is in the order, the serialization, and you cannot see serialization by looking at parts. You can only see it on a timeline that shows the parts relative to each other.

That timeline is a trace. You already have one: back in “wiring up Sentry” you raised tracesSampleRate and let the instrumentation record spans, so the dashboard request is already producing exactly the picture you need. (Vercel’s observability tab shows the same shape if you prefer it.) Each span is a single operation with a start time and a duration, and the trace stacks them on a shared time axis. That means the entire reading skill from the last section transfers directly:

Spans that stack top-to-bottom without overlapping are sequential. Each one started when the last one finished.
Spans that overlap on the time axis ran in parallel.

So the diagnostic process is mechanical. Open the slow page’s trace. Find a run of spans that stack into that descending staircase. Then, for each adjacent pair, ask one question: does the second span actually need the result of the first? If yes, the wait is real, so leave it. If no, you have found a waterfall, and you can fix it.

That diagonal staircase is the tell. Once you have seen it a few times, you spot it from across the room.
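The check described above can be written down as a small function. This is a sketch under stated assumptions: real traces do not record which read needs which, so the `needs` field here is a hand-written annotation, and the `Span` shape is hypothetical rather than any tracing vendor's format.

```typescript
// A span with hand-annotated dependencies (the annotation is an assumption;
// tracing tools only give you name, start, and end).
type Span = { name: string; start: number; end: number; needs: string[] };

const spans: Span[] = [
  { name: "getUser", start: 0, end: 80, needs: [] },
  { name: "getOrg", start: 80, end: 160, needs: ["getUser"] },
  { name: "listInvoices", start: 160, end: 240, needs: ["getOrg"] },
  { name: "listTeamMembers", start: 240, end: 320, needs: ["getOrg"] },
];

// Two spans overlap if each starts before the other ends.
const overlaps = (a: Span, b: Span): boolean => a.start < b.end && b.start < a.end;

// Flag adjacent pairs that ran back to back even though the second
// does not need the first's result.
function findWaterfalls(trace: Span[]): string[] {
  const sorted = [...trace].sort((a, b) => a.start - b.start);
  const flagged: string[] = [];
  for (let i = 1; i < sorted.length; i++) {
    const prev = sorted[i - 1];
    const next = sorted[i];
    const sequential = !overlaps(prev, next);
    const dependent = next.needs.includes(prev.name);
    if (sequential && !dependent) flagged.push(`${prev.name} -> ${next.name}`);
  }
  return flagged;
}

console.log(findWaterfalls(spans)); // ["listInvoices -> listTeamMembers"]
```

The first two steps of the staircase are real dependencies, so they pass the check; only the last pair is flagged.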

[Trace: GET /dashboard, 320ms on a 0 to 320ms axis. Child spans getUser, getOrg, listInvoices, and listTeamMembers take 80ms each, and each bar starts where the last ended (sequential).]

The dashboard's server trace. Each child span starts exactly where the previous one ended; that descending diagonal is the waterfall.

Try reading one yourself. The trace below has five spans: some form a genuine dependency chain, and some are needlessly serial.

Here is the trace for GET /settings. Each bar starts where the one above it ended — a five-step staircase. The note in parentheses is what each read needs before it can start.

getUser            ████                  0–60ms
getOrg                 ████              60–130ms   (needs user.orgId)
listProjects               ████          130–210ms  (needs org.id)
listMembers                    ████      210–290ms  (needs org.id)
getBillingSummary                  ████  290–360ms  (needs org.id)

This page currently takes ~360ms. Which of these reads belong in one parallel group — bars that have no reason to stack and could all leave together? Select every read that belongs in that group.

getUser

getOrg

listProjects

listMembers

getBillingSummary

Once you can find the staircase and ask “does this span depend on the one before it?”, you can find this bug in any codebase. Now for the fixes.

Rewriting co-located awaits with Promise.all

The simplest waterfall, and the most common, is the one from the opening: a single component body with awaits stacked in a row. The fix is the most direct payoff of everything you already know.

Here is the habit to build, stated for Server Components: before you add a second await in a component body, ask whether this read needs the value you just awaited. If it does not, the two should leave together with Promise.all. If it does, keep them sequential, and now the order is load-bearing, because the second one genuinely cannot start without the first.

Apply it to the dashboard. org needs user, so that await stays. But invoices and team only need org, not each other, so once org resolves they go together.

Waterfall first, then the parallel rewrite:

export default async function DashboardPage() {
  const { orgId } = await requireOrgUser();
  const org = await getOrganization(orgId);
  const invoices = await listInvoices(org.id, org.billingPeriod);
  const team = await listTeamMembers(org.id);

  return <Dashboard org={org} invoices={invoices} team={team} />;
}

Four round trips, all in series. org genuinely waits on the auth result, and invoices needs the org’s billing period, but team is stuck behind invoices for no reason at all. About 320ms.

export default async function DashboardPage() {
  const { orgId } = await requireOrgUser();
  const org = await getOrganization(orgId);
  const [invoices, team] = await Promise.all([
    listInvoices(org.id, org.billingPeriod),
    listTeamMembers(org.id),
  ]);

  return <Dashboard org={org} invoices={invoices} team={team} />;
}

org still resolves first, because both reads need it. But invoices and team don’t need each other, so they leave together and are awaited as a pair. Same correctness, about 240ms.

Four serial waits became three: user → org → max(invoices, team). That 80ms is per render, and it is paid on every request, so under load it is not 80ms once, it is 80ms multiplied by your traffic, with every request held open that much longer. The rewrite costs you one line.

There are two ways people get Promise.all wrong, and both are worth learning now because both are silent.

The first is about failure. Promise.all rejects the moment any one of its promises rejects, and it hands you that first rejection while the others keep running in the background, their results thrown away. For a page where you need everything or nothing, that is exactly right: if one read fails the render is dead anyway. But when you would rather render what succeeded and degrade the rest, Promise.all is the wrong shape, so reach for Promise.allSettled (from Promise combinators), which waits for every promise and reports each outcome separately. Treat it as a choice rather than a warning: an all-or-nothing render calls for Promise.all, and a render-what-you-can page calls for allSettled (or streaming, which is coming up).
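To make the difference concrete, here is a minimal sketch of the two shapes side by side. The two reads are hypothetical stand-ins (one succeeds, one always fails); only `Promise.all` and `Promise.allSettled` are real APIs.

```typescript
// Hypothetical reads: one succeeds, one fails.
const listInvoices = async (): Promise<string[]> => ["INV-1", "INV-2"];
const listTeamMembers = async (): Promise<string[]> => {
  throw new Error("team service unavailable");
};

// All-or-nothing: the first rejection wins and the invoices result is discarded.
async function allOrNothing(): Promise<string> {
  try {
    const [invoices, team] = await Promise.all([listInvoices(), listTeamMembers()]);
    return `rendered ${invoices.length} invoices, ${team.length} members`;
  } catch (error) {
    return `render failed: ${(error as Error).message}`;
  }
}

// Render-what-you-can: every outcome is reported separately.
async function renderWhatYouCan(): Promise<string> {
  const [invoices, team] = await Promise.allSettled([listInvoices(), listTeamMembers()]);
  const invoiceCount = invoices.status === "fulfilled" ? invoices.value.length : 0;
  const teamPart = team.status === "fulfilled" ? `${team.value.length} members` : "team unavailable";
  return `rendered ${invoiceCount} invoices, ${teamPart}`;
}

allOrNothing().then(console.log); // render failed: team service unavailable
renderWhatYouCan().then(console.log); // rendered 2 invoices, team unavailable
```

Note that `allSettled` never rejects, so the degraded path has to be handled explicitly by checking each `status`.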

The second mistake is the dangerous one, because it does not throw at all. If you wrap two reads in Promise.all and the second genuinely needed the first’s value, the second now runs with undefined and quietly produces wrong data: no error, no crash, just a page built on bad values. The dependency check is the only thing standing between you and that bug, which is why it is the part of the habit you cannot skip.
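Here is one way that silent failure can arise, as a hedged sketch. In many codebases TypeScript or a ReferenceError would catch the mistake, but it slips through when the dependency travels via shared state or an optional value and the read is forgiving about missing input. Every name below is hypothetical.

```typescript
type Org = { id: string };
const invoicesByOrg: Record<string, string[]> = { org_1: ["INV-1", "INV-2"] };

const getOrg = async (): Promise<Org> => ({ id: "org_1" });
// A forgiving read: a missing or unknown org id yields an empty list, not an error.
const listInvoices = async (orgId: string | undefined): Promise<string[]> =>
  orgId ? (invoicesByOrg[orgId] ?? []) : [];

// Wrong: listInvoices is called while the array is being built,
// before getOrg has resolved, so it receives undefined.
async function loadWrong() {
  const state: { org?: Org } = {};
  const [, invoices] = await Promise.all([
    getOrg().then((org) => {
      state.org = org;
    }),
    listInvoices(state.org?.id), // state.org is still undefined here
  ]);
  return { org: state.org, invoices };
}

// Right: the dependency is real, so the two reads stay sequential.
async function loadRight() {
  const org = await getOrg();
  const invoices = await listInvoices(org.id);
  return { org, invoices };
}

loadWrong().then((result) => console.log(result.invoices.length)); // 0, with no error
loadRight().then((result) => console.log(result.invoices.length)); // 2
```

Both functions resolve successfully, which is exactly the problem: the wrong version returns an empty invoice list for an org that has two.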

Time to do the rewrite yourself. In the exercise below, the reads are modeled as timed async functions. One of them has a real dependency and must stay sequential; the other two are independent. Turn the independent pair into a single Promise.all and watch the total time drop.

getOrg must run first — it returns the org the other two reads need. But listInvoices and listMembers only depend on the org, not on each other, so they should leave together. Rewrite loadDashboard so the two independent reads run in parallel with Promise.all while getOrg stays sequential. The starter takes ~240ms; the rewrite should bring it to ~160ms — and dropping getOrg into the same Promise.all breaks the data the reads need.

Solution

export const loadDashboard = async () => {
  const org = await getOrg();
  const [invoices, members] = await Promise.all([
    listInvoices(org.id),
    listMembers(org.id),
  ]);
  return { org, invoices, members };
};

getOrg stays its own await because both reads need org.id, so that wait is a real dependency. Once it resolves, listInvoices and listMembers leave together inside one Promise.all, since neither needs the other’s result. Three serial waits become two: org → max(invoices, members), about 160ms. Folding getOrg into the same Promise.all would force org.id to be read before it exists, and the dependency check is what stops you doing that.
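If you want to reproduce those timings outside the exercise, a sketch like the following works. The helpers are assumptions: the exercise's real starter code is not shown here, so `sleep`, the 80ms delays, and the `timed` wrapper are illustrative stand-ins.

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-ins for the exercise's reads, each taking about 80ms.
const getOrg = async () => { await sleep(80); return { id: "org_1" }; };
const listInvoices = async (orgId: string) => { await sleep(80); return [`${orgId}:INV-1`]; };
const listMembers = async (orgId: string) => { await sleep(80); return [`${orgId}:ada`]; };

// Starter shape: three awaits in a row.
async function loadSequential() {
  const org = await getOrg();
  const invoices = await listInvoices(org.id);
  const members = await listMembers(org.id);
  return { org, invoices, members };
}

// Rewrite: getOrg stays sequential, the independent pair leaves together.
async function loadParallel() {
  const org = await getOrg();
  const [invoices, members] = await Promise.all([listInvoices(org.id), listMembers(org.id)]);
  return { org, invoices, members };
}

// Measure wall-clock time for one run.
async function timed<T>(label: string, run: () => Promise<T>): Promise<number> {
  const startedAt = Date.now();
  await run();
  const elapsed = Date.now() - startedAt;
  console.log(`${label}: ~${elapsed}ms`);
  return elapsed;
}

timed("sequential", loadSequential).then(() => timed("parallel", loadParallel));
```

Exact numbers vary by a few milliseconds per run, but the sequential version should land near 240ms and the parallel one near 160ms.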

When the waterfall hides in the component tree

The waterfall in a single function body is the easy one: all the awaits are right there in front of you. The harder waterfall, and the one that accounts for most real cases, never appears in any single function. It is created by the shape of your component tree.

Here is how it happens. A parent component awaits its own read and renders. One of its children is also a Server Component, and it awaits its own read. There is no data dependency between them, since the child does not use anything the parent fetched. And yet the child’s read cannot start until the parent has finished rendering, because rendering is sequential: the server renders the parent, reaches the child in the output, and only then runs the child’s body. The two reads serialize, not because the data forces it, but because the render order does. This version is easy to miss, because the code looks clean: each component fetches its own data, well co-located, with no prop-drilling, and it waterfalls anyway.

Scrub through the trace below. The parent DashboardPage awaits the org, and its child InvoiceList awaits the invoices. Watch the server-render phase: the child’s read sits idle until the parent’s resolves.

Render order is fetch order

So the waterfall here is structural: it is what naive nesting does, not a typo. You have three ways out, and they grow more sophisticated in turn.

Option one: hoist the fetch up. Move both reads into the parent, fire them (with Promise.all if they are independent), and pass the results down as props. The timing is fixed, since both reads run at the top, together. The cost is that the parent now has to know about data its children consume, and in a deep tree that turns into prop-drilling a value through three components that do not care about it.

Option two: React cache(), the modern default. This is the move an experienced engineer reaches for, because it fixes the timing without giving up co-location. You wrap the read function in React’s cache(), which deduplicates it within a single render: call it five times with the same arguments in one render pass and it runs once, handing every caller the same in-flight promise. Now the parent can start the read by calling it without awaiting, so the request is already in flight by the time the child calls await listInvoices(orgId), and the child receives that very same promise instead of kicking off a fresh one. The child stays self-contained, the timing goes parallel, and nobody drills a prop.

There is a stack-specific reason this matters here, and it is worth pausing on. When you fetch with fetch(), Next.js deduplicates identical GET requests within a render pass automatically, so fetch callers get this dedup for free. A Drizzle query does not. db.query.invoices.findMany(...) called twice in one render runs twice, because nothing memoizes it, and this app reads through Drizzle, not fetch. So in this codebase, the dedup and the kick-off pattern are not automatic: you have to wrap the query in cache() yourself to get them.

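import { cache } from 'react';
import { eq } from 'drizzle-orm';
// Assumed app-local paths; point these at your own Drizzle client and schema.
import { db } from '@/db';
import { invoices } from '@/db/schema';
// OrgHeader and InvoiceTable are app components, imported from wherever they live.

// Request-scoped memoization: within one render, calls with the same orgId share one promise.
export const listInvoices = cache(async (orgId: string) => {
  return db.query.invoices.findMany({ where: eq(invoices.organizationId, orgId) });
});

export default async function DashboardPage({ orgId }: { orgId: string }) {
  // Kick off the read without awaiting it; `void` marks the unawaited call as intentional.
  void listInvoices(orgId);

  return (
    <section>
      <OrgHeader orgId={orgId} />
      <InvoiceList orgId={orgId} />
    </section>
  );
}

async function InvoiceList({ orgId }: { orgId: string }) {
  // Receives the promise the parent already started, so there is no second round trip.
  const invoices = await listInvoices(orgId);
  return <InvoiceTable rows={invoices} />;
}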

Wrapping the Drizzle read in React cache() makes it request-scoped: called many times in one render, it runs once and shares the result. Drizzle reads aren’t auto-deduped, so this wrap is what unlocks the pattern.

The parent starts the read without awaiting it. This is deliberate: it warms the cache so the promise is already in flight while the rest of the tree renders. It is not a forgotten await.

The child still awaits its own read, co-located and self-contained, but it receives the same in-flight promise the parent started, so it doesn’t pay a second round trip.
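To see why the kick-off works, it can help to look at a hand-rolled stand-in. The sketch below is explicitly not React's cache() implementation and leaves out its request scoping; `memoizePerRequest` is a hypothetical helper that only illustrates the one property the pattern relies on, which is that repeat callers get the same in-flight promise.

```typescript
// Simplified stand-in for request-scoped memoization (NOT React's cache()).
function memoizePerRequest<T>(read: (orgId: string) => Promise<T>) {
  const inFlight = new Map<string, Promise<T>>();
  return (orgId: string): Promise<T> => {
    let promise = inFlight.get(orgId);
    if (!promise) {
      promise = read(orgId); // first caller starts the read
      inFlight.set(orgId, promise);
    }
    return promise; // later callers share it
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

let roundTrips = 0;
const listInvoices = memoizePerRequest(async (orgId: string) => {
  roundTrips += 1; // counts how often the underlying read actually runs
  await sleep(20);
  return [`${orgId}:INV-1`];
});

async function renderDashboard(orgId: string) {
  void listInvoices(orgId); // parent: kick off, do not await
  await sleep(10); // stand-in for the parent's own rendering work
  const rows = await listInvoices(orgId); // child: same promise, no second read
  return { rows, roundTrips };
}

renderDashboard("org_1").then((result) => console.log(result.roundTrips)); // 1
```

The child's await resolves as soon as the read the parent started finishes, so the time the parent spent rendering overlaps with the fetch instead of preceding it.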

Option three: sibling Suspense boundaries. Split the children so each one fetches under its own <Suspense>. Siblings under separate boundaries fetch in parallel and stream in independently. That is the bridge to the next section, where we look at when streaming is the right call rather than parallel awaits.

When partial paint beats waiting: Suspense streaming

So far the goal has been to make independent reads overlap. But there is a different kind of slow page where overlapping is not the real lever.

Picture a dashboard with two reads: an analytics aggregation that genuinely takes ~800ms, and a user profile that takes ~50ms. Parallelize them perfectly with Promise.all and the page still cannot paint until the slower one resolves, so the user stares at a blank screen for 800ms before anything appears, even though the profile was ready in 50. The cost here is not serialization, because the reads are already parallel. The cost is that the slow read is blocking the first paint of everything else.

The fix is to stop making the fast content wait. Wrap the slow region in <Suspense fallback={...}>: the fast content paints immediately, a skeleton holds the slow region’s place, and the slow content streams in when it is ready.

One point to be precise about, because it is the most common misconception here: Suspense is not a speed-up. The slow fetch is still exactly as slow, and nothing about the 800ms changed. What changed is when the user sees something, with first paint moving from 800ms to 50ms, not how long the work takes. (The mechanics of how Suspense and streaming actually work belong to the App Router unit; here you only need the shape and the decision.)

That decision is the core of this section. You now have two fix shapes that both involve “don’t block,” and you have to pick between them:

Parallel-await when all the data must be present before the page is worth showing. A transactional page, such as an invoice you are about to approve and pay, should not paint half-formed; show it complete or show a loader. Use Promise.all.
Suspense streaming when partial paint is genuinely useful. A dashboard of independent widgets has no reason to hold the fast ones hostage to the slow one. Stream the slow region.

Both still start with the dependency graph. Streaming does not exempt you from the dependency check; it just changes what you do with an independent slow read once you have found it.
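The two shapes can be compared with plain promises. This is a simulation, not Suspense: the `paint` callback and both reads are hypothetical, and the delays are scaled down tenfold from the 50ms and 800ms in the example so the demo runs quickly. What it shows is the order in which content becomes visible.

```typescript
function sleep<T>(ms: number, value: T): Promise<T> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// Hypothetical reads, scaled down 10x from the example.
const getProfile = () => sleep(5, "profile");
const getAnalytics = () => sleep(80, "analytics");

// Blocking: nothing paints until every read has resolved.
async function renderBlocking(paint: (what: string) => void): Promise<void> {
  const [profile, analytics] = await Promise.all([getProfile(), getAnalytics()]);
  paint(`${profile} + ${analytics}`);
}

// Streaming-style: the shell waits only for the fast read and shows a
// placeholder for the slow region, which fills in when its read resolves.
async function renderStreaming(paint: (what: string) => void): Promise<void> {
  const analytics = getAnalytics(); // start the slow read right away
  const profile = await getProfile();
  paint(`${profile} + analytics skeleton`);
  paint(await analytics);
}

async function collectPaints(render: (paint: (what: string) => void) => Promise<void>) {
  const paints: string[] = [];
  await render((what) => paints.push(what));
  return paints;
}

collectPaints(renderBlocking).then(console.log); // ["profile + analytics"]
collectPaints(renderStreaming).then(console.log); // ["profile + analytics skeleton", "analytics"]
```

Both versions finish at about the same time, because the slow read is equally slow in each. The streaming version simply has something on screen much earlier.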

[Timeline: profile (50ms) and analytics (800ms) run in parallel on a 0 to 900ms axis; first paint at ~800ms.]

Block on all (Promise.all): both reads run in parallel, but the page can't paint until the slower one finishes, so the first-paint line sits at the end of analytics. The user stares at a blank screen for ~800ms.

[Timeline: profile (50ms) paints first while analytics shows a skeleton and streams in, on a 0 to 900ms axis; first paint at ~50ms.]

Stream the slow one: wrap analytics in <Suspense> and the fast profile paints immediately, so first paint jumps to ~50ms and a skeleton holds the slow region’s place. Note that the analytics bar did not move: the fetch is still 800ms. Only the first-paint line moved.

Now make the call yourself. Walk the decision below for a few scenarios and watch which shape it lands on.

Which fix does this page need?

Caching removes the duplicate; parallelism removes the wait

There is one last distinction to nail down, because conflating these two is the most common conceptual error in this whole area. You have actually been fixing two different problems, and they have two different fixes.

Serialization is independent reads running one after another. It wastes time by waiting, and you fix it with parallelism: Promise.all or streaming.

Duplication is the same read happening many times: getUser called in the layout, again in the header, again in a sidebar widget, all in one render. It wastes time by repeating, and you fix it with caching.

The two are independent, and a single page can suffer both at once: over-serialized and over-duplicated. Knowing which one you are looking at tells you which tool to pick. Two caching tools matter, and the difference between them is exactly the difference people get wrong:

React cache() is request-scoped memoization. The same read called N times within a single render runs once. You already used it above for the kick-off pattern. It forgets everything when the request ends.
The 'use cache' directive is cross-request persistence. The result survives between requests, so the next visitor reuses it. (Its mechanics belong to the Cache Components material.)

So when you reach for caching, ask which kind of duplication you have. If the same read fires three times in this render, use cache(). If every visitor re-runs the same expensive read, use 'use cache'. One is per-render and the other is cross-request. They are not the same tool, and using the wrong one either does nothing or caches something it should not.
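The difference in lifetime is easy to see with a toy model. The sketch below uses a hypothetical `memoize` helper as a stand-in for both tools: it is neither React's cache() nor the 'use cache' directive, and it ignores expiry and invalidation entirely. It only counts how many times the underlying read runs across two simulated requests.

```typescript
type GetPlan = (orgId: string) => Promise<string>;

// Wrap a read so repeated calls with the same orgId share one promise.
function memoize(read: GetPlan): GetPlan {
  const memo = new Map<string, Promise<string>>();
  return (orgId) => {
    let promise = memo.get(orgId);
    if (!promise) {
      promise = read(orgId);
      memo.set(orgId, promise);
    }
    return promise;
  };
}

// One render: three components ask for the same value.
async function render(getPlan: GetPlan): Promise<void> {
  await Promise.all([getPlan("org_1"), getPlan("org_1"), getPlan("org_1")]);
}

async function runDemo() {
  let dbReads = 0;
  const readPlan: GetPlan = async (orgId) => {
    dbReads += 1;
    return `${orgId}:pro`;
  };

  // No caching: every call in every request hits the database.
  await render(readPlan);
  await render(readPlan);
  const uncached = dbReads;

  // Request-scoped: a fresh memo per request, forgotten when the request ends.
  dbReads = 0;
  await render(memoize(readPlan)); // visitor 1
  await render(memoize(readPlan)); // visitor 2
  const requestScoped = dbReads;

  // Cross-request: one memo outlives both requests.
  dbReads = 0;
  const shared = memoize(readPlan);
  await render(shared); // visitor 1
  await render(shared); // visitor 2
  const crossRequest = dbReads;

  return { uncached, requestScoped, crossRequest };
}

runDemo().then(console.log); // { uncached: 6, requestScoped: 2, crossRequest: 1 }
```

Request scoping removes the duplicates inside each render but still pays once per visitor; only the shared store removes the repeat across visitors, which is also why it is the one that can serve stale or wrongly shared data if misapplied.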

Recall the cousin we opened with. This component-tree waterfall is the N+1 query one layer up: a list renders, each row is a child that awaits its own read, and the database sees N serial queries instead of one. Same shape, same family of fix: hoist the fetch and batch it into a single query, then pass the rows down. The SQL-side fix has its own lesson; the point here is to recognize it as the same bug.

That is the whole approach, and it is small enough to make a habit. Once a week, open one slow trace. Look for the diagonal staircase. Run the dependency check on it. Co-location is the React way, but co-location plus parallel awaits plus cache() is the combination an experienced engineer reaches for, and it costs you almost nothing once the habit is in place.

External resources

Next.js: Fetching Data (nextjs.org). The official guide, with side-by-side sequential vs. parallel examples and the Promise.all rewrite.

React: cache() reference (react.dev). Per-request memoization, the preload pattern, and the pitfalls: the API behind the kick-off move.

Avoiding Server Component Waterfalls with cache() (aurorascharff.no). Aurora Scharff contrasts hoisting with Promise.all against the cache() preload pattern that keeps co-location.