Parallel data fetching in React Server Components: avoiding request waterfalls in Next.js

Why stacked awaits serialize I/O in the App Router, how sibling Server Components and Promise.all overlap work, and how caching interacts with parallel fetches in production Next.js routes.

Author: Matheus Palma · 8 min read
Software engineering · Frontend · Next.js · React · Performance · TypeScript

You deploy a product page that looks innocent: three cards (pricing, inventory, and reviews), each backed by a microservice with p50 around 40 ms. In the browser DevTools waterfall you expect the document to arrive in roughly the time of the slowest call. Instead p95 for the document is north of 200 ms, and it grows linearly with every new block you add. The UI is not “slow because microservices”; it is slow because the server fetched the cards' data one call after another, and each await held back the RSC payload until the previous dependency resolved.

That failure mode is easy to introduce in React Server Components (RSC) with the Next.js App Router: async components read like synchronous code, so the natural first draft is a vertical stack of awaits. This article explains why those stacks serialize I/O, how to restructure trees so independent work runs concurrently, and where caching (fetch defaults, unstable_cache, tags) changes the cost model. The guidance mirrors what shows up repeatedly when hardening client-facing routes: the goal is predictable TTFB, not micro-optimizing JSX.

The mental model: RSC render is still a tree walk

A Server Component runs on the server during a navigation or a rerender pass. React walks the element tree, suspends at boundaries when a child needs async data, and stitches a serialized payload for the client. Crucially, parent components run before children in the usual depth-first sense, and if the parent itself awaits work, children cannot start their own async work until the parent resumes.

Consider a single component that awaits three unrelated calls:

async function ProductRail({ sku }: { sku: string }) {
  // Each call starts only after the previous one has resolved.
  const pricing = await fetchPricing(sku);
  const stock = await fetchStock(sku);
  const reviews = await fetchReviews(sku);
  return (
    <section>
      <Pricing data={pricing} />
      <Stock data={stock} />
      <Reviews data={reviews} />
    </section>
  );
}

If each remote call is 45 ms and connection reuse is healthy, the critical path is ~135 ms plus framework overhead—even though the data sources do not depend on each other. That is a self-inflicted waterfall: not the network’s fault, not React “being slow,” but await ordering expressed as program text.

Understanding the tree walk matters because fixes are not “use a faster hook”; they are graph surgery: move independent awaits behind boundaries that React can schedule together, or hoist them into a single concurrent batch.

Pattern 1: Promise.all for independent fetches in one component

When all inputs are known in the same scope and you want one component to remain the orchestrator, batch awaits with Promise.all (or Promise.allSettled when partial failure is acceptable):

async function ProductRail({ sku }: { sku: string }) {
  // All three calls are started before the single await below.
  const [pricing, stock, reviews] = await Promise.all([
    fetchPricing(sku),
    fetchStock(sku),
    fetchReviews(sku),
  ]);
  return (
    <section>
      <Pricing data={pricing} />
      <Stock data={stock} />
      <Reviews data={reviews} />
    </section>
  );
}

The critical path drops toward max(pricing, stock, reviews) instead of the sum, modulo connection limits and CPU.
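
The sum-versus-max difference is easy to reproduce outside React. The following is a minimal, framework-free sketch: `simulateCall` is a hypothetical stand-in for `fetchPricing`, `fetchStock`, and `fetchReviews`, and the 45 ms delay is the assumed per-call latency from the estimate above, not a measurement.

```typescript
// Hypothetical stand-in for a remote call: resolves with `label` after `ms` milliseconds.
function simulateCall(label: string, ms: number): Promise<string> {
  return new Promise((resolve) => setTimeout(() => resolve(label), ms));
}

// Stacked awaits: each call starts only after the previous one has resolved.
async function runSequential(ms: number): Promise<number> {
  const start = Date.now();
  await simulateCall("pricing", ms);
  await simulateCall("stock", ms);
  await simulateCall("reviews", ms);
  return Date.now() - start;
}

// Promise.all: all three calls are started before the first await.
async function runParallel(ms: number): Promise<number> {
  const start = Date.now();
  await Promise.all([
    simulateCall("pricing", ms),
    simulateCall("stock", ms),
    simulateCall("reviews", ms),
  ]);
  return Date.now() - start;
}

// Runs both variants back to back and reports the elapsed time of each.
async function compare(): Promise<{ sequentialMs: number; parallelMs: number }> {
  const sequentialMs = await runSequential(45);
  const parallelMs = await runParallel(45);
  return { sequentialMs, parallelMs };
}

compare().then(({ sequentialMs, parallelMs }) => {
  console.log(`sequential: ${sequentialMs} ms, parallel: ${parallelMs} ms`);
});
```

On an idle machine the sequential figure should land near three times the per-call delay and the parallel figure near one; exact values depend on timer resolution and load.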

Trade-offs:

  • All-or-nothing latency — You wait for the slowest peer. If reviews are nice-to-have, bundling them with pricing may be the wrong UX; see granular splitting below.
  • Error coupling — One rejection fails the whole Promise.all. For optional panels, either catch per promise, use allSettled, or split components so one failure does not block the entire rail.
  • Back-pressure — Opening ten parallel queries per request can overwhelm a shared DB pool. Pair parallelism with sensible limits at the service layer (the same concern arises in request coalescing designs).
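
To make the error-coupling point concrete, here is a minimal sketch of the `Promise.allSettled` variant, assuming pricing is required and reviews are optional. The `Loaders` type, `loadRail`, and the data shapes are hypothetical names introduced for illustration; they are not part of the original example.

```typescript
type Pricing = { amount: number; currency: string };
type Review = { id: string; score: number };

// Hypothetical loaders, injected so the sketch stays self-contained.
type Loaders = {
  fetchPricing: (sku: string) => Promise<Pricing>;
  fetchReviews: (sku: string) => Promise<Review[]>;
};

type RailData = { pricing: Pricing; reviews: Review[] | null };

// Both calls run concurrently. A rejected reviews call degrades to `null`
// instead of failing the whole rail; a rejected pricing call still throws.
async function loadRail(sku: string, loaders: Loaders): Promise<RailData> {
  const [pricing, reviews] = await Promise.allSettled([
    loaders.fetchPricing(sku),
    loaders.fetchReviews(sku),
  ]);
  if (pricing.status === "rejected") {
    throw pricing.reason; // required data: surface the failure
  }
  return {
    pricing: pricing.value,
    reviews: reviews.status === "fulfilled" ? reviews.value : null,
  };
}
```

The component can then render a placeholder when `reviews` is `null`. Note that the latency is still coupled: `allSettled` waits for the slowest call, so this addresses failure isolation only.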

Pattern 2: split Server Components so siblings fetch in parallel

React can render sibling async Server Components concurrently. If each child owns its own fetch, their I/O overlaps without a manual Promise.all in the parent—provided the parent does not await anything heavy before emitting children:

async function PricingPanel({ sku }: { sku: string }) {
  const pricing = await fetchPricing(sku);
  return <Pricing data={pricing} />;
}

async function StockPanel({ sku }: { sku: string }) {
  const stock = await fetchStock(sku);
  return <Stock data={stock} />;
}

async function ReviewsPanel({ sku }: { sku: string }) {
  const reviews = await fetchReviews(sku);
  return <Reviews data={reviews} />;
}

export default function ProductRail({ sku }: { sku: string }) {
  return (
    <section>
      <PricingPanel sku={sku} />
      <StockPanel sku={sku} />
      <ReviewsPanel sku={sku} />
    </section>
  );
}

Here ProductRail is synchronous; each child suspends independently. This pattern shines when panels have different SLAs or error policies: wrap a nice-to-have in <Suspense fallback={...}> so pricing and stock stream first while reviews load.

Caveat: do not hide sequential work behind children. If PricingPanel needed stock to compute tax display, splitting without passing data down would either refetch redundantly or push complexity into the client—sometimes the right answer is a single BFF endpoint that joins server-side, trading HTTP-level parallelism for one round trip.

Pattern 3: align fetch caching with how you parallelize

In Next.js, deduplication and caching interact with parallelism:

  • Multiple components calling fetch("https://api/a", { same options }) during one render pass can be deduplicated into one network call—parallel structure does not always mean duplicate traffic.
  • Changing cache options breaks dedupe keys; subtle differences (next: { revalidate: 60 } vs 120) multiply actual requests.
  • For non-fetch data access (ORM, gRPC clients), dedupe does not magically apply; you may need explicit per-request caches or a data loader.

When consulting teams migrate pages from getServerSideProps-style monoliths to RSC, a recurring bug is “parallel components but duplicate queries” because each child uses a slightly different Prisma call or cache tag set. The render is concurrent; the database still sees N identical selects. Fix that at the repository layer (request-scoped memoization) rather than by re-collapsing everything into one mega-component—keep the UX structure, dedupe the IO.
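
Request-scoped memoization can be sketched in a few lines. The helper below is an illustrative assumption, not the author's implementation: `createRequestCache` and the counting loader are hypothetical, and the cache is meant to be created once per incoming request so entries never leak between users. In Server Components, React's `cache` function serves a similar purpose.

```typescript
// Concurrent callers asking for the same key share one in-flight promise
// instead of each issuing their own query.
function createRequestCache<T>(load: (key: string) => Promise<T>) {
  const inFlight = new Map<string, Promise<T>>();
  return function get(key: string): Promise<T> {
    let pending = inFlight.get(key);
    if (!pending) {
      pending = load(key);
      inFlight.set(key, pending);
    }
    return pending;
  };
}

// Demonstration with a hypothetical loader that counts how often it runs.
async function demo(): Promise<number> {
  let queryCount = 0;
  const findProduct = async (sku: string) => {
    queryCount += 1;
    return { sku };
  };
  const getProduct = createRequestCache(findProduct);
  // Three sibling "components" ask concurrently; two want the same SKU.
  await Promise.all([
    getProduct("sku-1"),
    getProduct("sku-1"),
    getProduct("sku-2"),
  ]);
  return queryCount; // two distinct keys, so the loader ran twice
}
```

One design consequence: a rejected promise stays in the map for the lifetime of the cache, so every caller in the same request sees the same failure. That is usually the desired behavior within a single render.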

Suspense boundaries are a UX tool—not a substitute for batching

<Suspense> around a subtree controls what the user sees while waiting; it does not, by itself, shorten server work unless combined with streaming and independent subtrees. Used well, Suspense lets the shell and primary merchandising paint while secondary panels resolve; used poorly, it becomes a spinner quilt that masks the same sequential backend work.

Heuristic from production pages:

  • Put tight fallbacks on small, known-size placeholders (skeleton lines).
  • Avoid wrapping the entire document unless you genuinely stream meaningful HTML early; a single root Suspense can still wait on a parent await above it.

Practical example: page shell with parallel panels and graded fallbacks

The following example uses sibling Server Components for independent services, Suspense only where delayed content is acceptable, and Promise.all inside a panel when two calls are independent of each other but both required by that panel.

import { Suspense } from "react";

async function fetchJson<T>(url: string): Promise<T> {
  const res = await fetch(url, { next: { revalidate: 30 } });
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return res.json() as Promise<T>;
}

async function HeroPricing({ sku }: { sku: string }) {
  const [list, promo] = await Promise.all([
    fetchJson<{ currency: string; amount: number }>(
      `https://internal.example/pricing/${encodeURIComponent(sku)}`
    ),
    fetchJson<{ code: string } | null>(
      `https://internal.example/promos/${encodeURIComponent(sku)}`
    ),
  ]);
  return (
    <header>
      <h1>{sku}</h1>
      <p>
        {list.amount} {list.currency}
        {promo ? ` — code ${promo.code}` : null}
      </p>
    </header>
  );
}

async function InventoryTable({ sku }: { sku: string }) {
  const rows = await fetchJson<Array<{ loc: string; qty: number }>>(
    `https://internal.example/stock/${encodeURIComponent(sku)}`
  );
  return (
    <table>
      <tbody>
        {rows.map((r) => (
          <tr key={r.loc}>
            <td>{r.loc}</td>
            <td>{r.qty}</td>
          </tr>
        ))}
      </tbody>
    </table>
  );
}

async function ReviewList({ sku }: { sku: string }) {
  const reviews = await fetchJson<Array<{ id: string; score: number; body: string }>>(
    `https://internal.example/reviews/${encodeURIComponent(sku)}`
  );
  return (
    <ul>
      {reviews.map((rv) => (
        <li key={rv.id}>
          <strong>{rv.score}/5</strong> — {rv.body}
        </li>
      ))}
    </ul>
  );
}

export default function ProductPage({ params }: { params: { sku: string } }) {
  const { sku } = params;
  return (
    <main>
      <Suspense fallback={<header aria-busy="true">Loading pricing…</header>}>
        <HeroPricing sku={sku} />
      </Suspense>

      <section aria-label="Inventory">
        <h2>In stock</h2>
        <Suspense fallback={<p>Loading inventory…</p>}>
          <InventoryTable sku={sku} />
        </Suspense>
      </section>

      <section aria-label="Reviews">
        <h2>Reviews</h2>
        <Suspense fallback={<p>Loading reviews…</p>}>
          <ReviewList sku={sku} />
        </Suspense>
      </section>
    </main>
  );
}

HeroPricing still uses Promise.all because list and promo are independent yet both required for the hero text. InventoryTable and ReviewList run in parallel with each other and with HeroPricing because they are siblings under a synchronous page component.

For routes where read-your-writes consistency matters after a mutation, pair this structure with the routing discipline described in read-your-writes consistency with replicas: parallelism is not an excuse to read stale inventory immediately after checkout.

Common mistakes and pitfalls

Sequential awaits in a layout chain: app/layout.tsx awaits the user session, then page.tsx awaits catalog config, then a child awaits the product, and the layout ends up blocking the whole subtree. Hoist independent session and catalog work with parallel patterns, or move non-blocking adornments lower.

Micro-splitting without dedupe — Ten sibling components each issuing the same SQL is parallel and wasteful. Measure query counts per request when refactoring.

Treating Client Components as parallel fetchers — Moving useEffect fetches to the client shifts work off TTFB and onto client waterfalls; sometimes correct for personalization, often a regression for SEO and first paint. Decide explicitly; do not “fix” server waterfalls by hiding them in the browser.

Ignoring downstream connection limits: HTTP/2 peers limit concurrent streams per connection, and databases limit connections. If every page opens twenty internal calls, deploy aggregation endpoints or a GraphQL/BFF layer rather than pushing pressure to shared pools.
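
Where an aggregation endpoint is not available yet, one stopgap is to bound fan-out inside the process. The following is a minimal sketch of a concurrency-limited map; `mapWithConcurrency` is a hypothetical helper written for this illustration, not a Next.js or React API.

```typescript
// Runs `task` over `items` with at most `limit` tasks in flight at once.
// Results keep the order of `items`. A rejection rejects the whole call,
// like Promise.all; tasks that are already running are not cancelled.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;

  // Each worker claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await task(items[index], index);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A call such as `mapWithConcurrency(skus, 4, fetchStock)` would keep at most four stock lookups open at a time, where `skus` and `fetchStock` are placeholders for the route's own data. This caps pressure per request only; limits across requests still belong at the service or pool layer.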

Over-batching unrelated domains: Promise.all cannot flatten a dependency chain (B needs A’s id), and even for independent calls it ties every result to the slowest one. Do not block shipping labels on hero images if the UX allows progressive reveal.
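
When one panel genuinely needs both a dependency chain and an independent call, the chain can run as its own promise alongside the independent work. This is a minimal sketch under assumed names: `fetchProduct`, `fetchVariants`, `fetchStock`, and the data shapes are hypothetical, with variants needing the product id and stock needing only the SKU.

```typescript
type Product = { id: string };
type Variant = { productId: string; name: string };
type Stock = { qty: number };

// Hypothetical loaders, injected so the sketch stays self-contained.
type Deps = {
  fetchProduct: (sku: string) => Promise<Product>;
  fetchVariants: (productId: string) => Promise<Variant[]>; // needs the product id
  fetchStock: (sku: string) => Promise<Stock>; // independent of the chain
};

async function loadProductView(sku: string, deps: Deps) {
  // The chain stays sequential internally because step two needs step one.
  const chain = (async () => {
    const product = await deps.fetchProduct(sku);
    const variants = await deps.fetchVariants(product.id);
    return { product, variants };
  })();

  // The independent call starts right away and overlaps with the chain.
  const [{ product, variants }, stock] = await Promise.all([
    chain,
    deps.fetchStock(sku),
  ]);
  return { product, variants, stock };
}
```

Wrapping the chain in one promise and joining it with `Promise.all` also means a failure on either side is observed by the same await. If the independent data is merely nice-to-have, a separate sibling component behind its own Suspense boundary is usually the better fit.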

Conclusion

React Server Components make server-side data fetching feel local and linear, which is ergonomically excellent and operationally dangerous when independent I/O is written as a vertical list of awaits. The fixes are structural: Promise.all where one component owns independent calls, sibling Server Components where panels have different failure or latency profiles, and explicit caching/dedupe when the data layer is not fetch. Measure whole-route latency and per-query counts together; parallelism that multiplies identical queries is not scalability.

Used deliberately, these patterns keep Next.js routes responsive under real traffic and make later work—caching policies, admission control, and autoscaling—about genuine load, not self-imposed waterfalls. That is the bar teams expect when building production-ready, user-facing surfaces—and the same bar applied when helping partners ship scalable systems end to end.
