Scaling Node.js APIs to Handle Real Traffic
The non-obvious bottlenecks that hit Node.js services at 10K RPS, and the patterns that actually fix them.
Node.js gets unfair criticism at scale. The truth is most performance problems aren't Node's fault — they're patterns we inherited from small apps that don't survive contact with real traffic.
These are the issues I actually saw at Bolt (10K RPS peak) and what fixed them.
1. Event loop starvation from sync code
`JSON.parse` on a 2MB response blocks the event loop. So does `crypto.pbkdf2Sync`, a backtracking-heavy regex on user input, and basically anything that touches a big buffer.
Fix:
- Stream large responses instead of buffering.
- Move CPU work to worker threads (`worker_threads`).
- Profile with `--prof` or `clinic flame` — you'll usually find one or two hot functions eating 30% of event loop time.
Don't trust your intuition on what's slow. Profile.
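A minimal sketch of pushing the `pbkdf2` example off the event loop. The worker source is inlined with `eval: true` so the snippet is self-contained; in a real service you'd point `Worker` at a separate file. The function name and parameters are illustrative:

```js
import { Worker } from "node:worker_threads";

// Run a CPU-heavy hash in a worker thread so it can't block the event
// loop. The worker script is plain CommonJS evaluated from a string.
function pbkdf2InWorker(password, salt) {
  const workerSource = `
    const { parentPort, workerData } = require("node:worker_threads");
    const crypto = require("node:crypto");
    const key = crypto.pbkdf2Sync(
      workerData.password, workerData.salt, 100_000, 32, "sha256"
    );
    parentPort.postMessage(key.toString("hex"));
  `;
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { password, salt },
    });
    worker.once("message", resolve);
    worker.once("error", reject);
  });
}

const hex = await pbkdf2InWorker("hunter2", "salty");
```

Spawning a worker per call is itself expensive; at real traffic you'd keep a small pool of long-lived workers and post jobs to them.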
2. Unbounded concurrency on downstream calls
A request comes in, spawns 50 parallel calls to a downstream service, and the downstream falls over. Your service looks fine on the dashboard. The downstream team sends an angry Slack message.
Fix: per-call concurrency limits. I use p-limit or a small token-bucket wrapper.
```js
import pLimit from "p-limit";

const limit = pLimit(10);
const results = await Promise.all(
  ids.map((id) => limit(() => fetchById(id)))
);
```

Ten parallel is usually enough. Twenty is enough for anything. Fifty is an outage waiting to happen.
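If you'd rather not take the dependency, the core of the idea fits in about twenty lines. A rough, non-production sketch (`makeLimiter` is a made-up name):

```js
// Minimal concurrency limiter: at most `max` tasks in flight, the
// rest wait in a FIFO queue until a slot frees up.
function makeLimiter(max) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= max || queue.length === 0) return;
    active++;
    const { fn, resolve, reject } = queue.shift();
    fn().then(resolve, reject).finally(() => {
      active--;
      next(); // a slot freed up — start the next queued task
    });
  };
  return (fn) =>
    new Promise((resolve, reject) => {
      queue.push({ fn, resolve, reject });
      next();
    });
}

// Usage: six tasks, never more than three in flight at once.
const limit = makeLimiter(3);
let inFlight = 0, peak = 0;
const fakeFetch = async (id) => {
  peak = Math.max(peak, ++inFlight);
  await new Promise((r) => setTimeout(r, 10));
  inFlight--;
  return id * 2;
};
const results = await Promise.all(
  [1, 2, 3, 4, 5, 6].map((id) => limit(() => fakeFetch(id)))
);
```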
3. Connection pool exhaustion
`pg`, `mysql2`, and most HTTP clients have a default pool size around 10. Under load every request waits for a connection. Your p99 goes from 80ms to 8s, not because the database is slow, but because you're queueing in your own process.
Fix:
- Size the pool for `RPS × p99_latency`.
- Monitor connection wait time (most libraries expose it).
- If you have many Node.js instances and one shared database, sum the pools — you'll often find you've oversubscribed the database itself.
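The sizing rule is Little's law on the back of an envelope. A sketch with hypothetical numbers (the helper name and the 1.5× headroom factor are my own):

```js
// Connections needed ≈ per-instance RPS × p99 latency in seconds,
// plus some headroom for bursts.
function poolSize(rpsPerInstance, p99Seconds, headroom = 1.5) {
  return Math.ceil(rpsPerInstance * p99Seconds * headroom);
}

// 500 RPS per instance at a 40ms p99 → 500 × 0.04 × 1.5 = 30.
const perInstance = poolSize(500, 0.04);

// With 8 instances sharing one Postgres, the database sees the SUM —
// check it against the server's max_connections.
const totalOnDb = perInstance * 8;
```

For the monitoring half: node-postgres, for example, exposes `pool.waitingCount` — if it's persistently above zero, requests are queueing for connections in-process.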
4. The async_hooks tax
Request-scoped context (like OpenTelemetry spans) uses `AsyncLocalStorage`, which uses `async_hooks`. Every Promise allocation pays a small tax. At 10K RPS that tax is real.
Fix:
- Use `AsyncLocalStorage` only where you need the context (tracing, logging). Don't wrap everything.
- Avoid creating unnecessary `async` wrappers — `async () => syncFn()` allocates a Promise for nothing.
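The "enter the store once, at the boundary" pattern looks like this — a minimal sketch with a hypothetical handler and request ID:

```js
import { AsyncLocalStorage } from "node:async_hooks";

// One store, entered once per request. Everything inside the callback
// (including code after awaits) sees the same context, without
// threading a ctx argument through every function.
const requestContext = new AsyncLocalStorage();

function logWithRequestId(msg) {
  const ctx = requestContext.getStore(); // undefined outside run()
  return `[${ctx?.requestId ?? "no-ctx"}] ${msg}`;
}

async function handleRequest(requestId) {
  return requestContext.run({ requestId }, async () => {
    await Promise.resolve(); // context survives across awaits
    return logWithRequestId("done");
  });
}

const line = await handleRequest("req-42");
```

Everything outside `run()` pays nothing for the context, which is the point: scope the store to the request boundary instead of wrapping the whole app.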
5. GC pressure from object churn
Node will spend 10% of CPU on GC if you're allocating millions of small objects per second (typical JSON parse-then-map pipeline).
Fix:
- Reuse objects where safe (be careful — shared mutable state bites).
- Use `Buffer.allocUnsafe` for I/O-bound buffers you fill immediately.
- Bump the heap to a sensible size (`--max-old-space-size`) so major GCs are rare, not frequent.
Profile with `--trace-gc-verbose` and look at pause times.
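On the `allocUnsafe` point: it skips the zero-fill that `Buffer.alloc` does, so the memory starts out uninitialized — only safe when you overwrite every byte before anything reads it. A minimal sketch:

```js
// Copy an incoming chunk without paying for zero-fill. Safe here
// because chunk.copy() overwrites every byte of the new buffer
// before it's returned.
function copyChunk(chunk) {
  const out = Buffer.allocUnsafe(chunk.length); // uninitialized memory
  chunk.copy(out);                              // every byte overwritten
  return out;
}

const copied = copyChunk(Buffer.from("hello"));
```

If there's any path where part of the buffer could be read before being filled, use `Buffer.alloc` — leaking stale heap contents is worse than the zero-fill cost.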
6. Graceful shutdown is always broken
The naive Node shutdown on SIGTERM (call `server.close()` and exit):
- Stop accepting new connections. ✅
- Wait for in-flight requests to finish? ❌ The process just exits.
In Kubernetes with rolling deploys, every deploy drops in-flight requests. Users see random 502s.
Fix:
```js
process.on("SIGTERM", async () => {
  // Stop accepting new connections, and wait for open ones to finish.
  await new Promise((resolve) => server.close(resolve));
  await drainQueues(); // finish in-flight work
  await closeDB();
  process.exit(0);
});
```

And set `terminationGracePeriodSeconds` in your pod spec to match.
7. The logging layer
`console.log` is synchronous and slow. `winston` in a naive config is async but unbounded — under pressure it holds gigabytes of pending logs in memory.
Fix:
- Use `pino`. It's genuinely faster than alternatives by an order of magnitude.
- Ship logs over a side channel (stdout → log shipper). Don't have the app call the log API directly.
- Sample high-volume logs. At 10K RPS you don't need every request logged at INFO — 1% sampled + all errors is plenty.
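The sampling rule can live in a tiny wrapper around whatever logger you use. A sketch (the helper name is mine, and a plain array stands in for pino so the example is self-contained; the `random` parameter is injected only to make the example deterministic):

```js
// Sampling wrapper: keep a fraction of info-level lines, but always
// pass errors through untouched.
function makeSampledLogger(sink, sampleRate, random = Math.random) {
  return {
    info(msg) {
      if (random() < sampleRate) sink.push(`INFO ${msg}`);
    },
    error(msg) {
      sink.push(`ERROR ${msg}`); // errors are never sampled away
    },
  };
}

const lines = [];
// Deterministic draws for the example: 1% sample rate, so only the
// draw below 0.01 gets through.
const draws = [0.5, 0.005];
const log = makeSampledLogger(lines, 0.01, () => draws.shift());

log.info("request ok"); // draw 0.5   → dropped
log.info("request ok"); // draw 0.005 → kept
log.error("boom");      // always kept
```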
What I'd prioritise
If you're scaling a Node.js API right now:
- Profile the event loop. Fix the top 2 hot paths.
- Size your pools honestly.
- Cap downstream concurrency.
- Get graceful shutdown right.
- Switch to `pino` if you haven't.
Node is fast enough for almost any workload. Most "Node doesn't scale" stories are actually "our code doesn't scale" stories. The fix is usually in the patterns, not the runtime.
Want a senior set of eyes on your Node.js architecture? Let's talk.
Have a system that needs to scale — or stop breaking?
I work with a small number of teams each month on architecture reviews, scaling, and hands-on backend engineering. If that sounds like you, let's talk.