Sometimes systems don't get slow because they can't handle load.
They get slow because they're doing work too early.
This is a short write-up of one such situation: not a failure, just a correction.
TL;DR
We were preparing data before it was needed:
- pre-computing responses
- caching aggressively
- exposing multiple projected endpoints
Under real usage, most requests didn't need half of it.
We simplified:
- collapsed projections
- delayed computation
- removed the waterfall-style flow
System load dropped, but more importantly, the system became easier to understand and change.
<aside> 💡 Doing less early didn't make it fragile. It made it honest.
</aside>

The simple version
Imagine cooking everything on the menu because someone might order it.
That's what our system was doing.
Instead of responding to what was asked, it was preparing for everything that could be asked.
It felt safe.
It wasn't cheap.
What we were trying to do
This was a backend-heavy system with:
- multiple consumers
- different data needs
- some pressure around performance
To be safe and "fast", the design leaned towards:
- pre-computed responses
- caching wherever possible
- multiple endpoints with slightly different projections
On a whiteboard, it looked clean.
In code, it grew sharp edges.
The original flow (waterfall-style)
Every request followed roughly the same path:
request
↓ fetch base data
↓ compute all projections
↓ cache results
↓ respond
Even if the caller needed only a small subset, everything ran.
It worked.
But it worked too much.
What showed up in reality
Once the system saw real usage, a few things became obvious:
- Most requests didn't need full data
- Some expensive computations were barely used
- Cache invalidation became harder than expected
- Small changes touched more code than they should have
We weren't slow.
We were consistently doing unnecessary work.
The uncomfortable realization
<aside> 🤔 The problem wasn't caching. The problem was anticipation.
</aside>

We were optimizing for scenarios that hadn't happened yet.
So the question changed from:
"How do we optimize this further?"
to:
"What breaks if we don't do this upfront?"
In most cases, the answer was: not much.
The shift: removing the waterfall
Instead of preparing everything early, we moved to a more lazy, on-demand approach.
Conceptually, the flow became:
request
↓ fetch minimum
↓ respond
↳ compute more only if needed
That was the big change.
A very small code-level example
Before (eager computation):
```javascript
function handleRequest() {
  const base = fetchBaseData()
  const stats = computeStats(base)
  const insights = computeInsights(base)
  return { base, stats, insights }
}
```
Even if the caller only needed base, everything ran.
After (lazy evaluation):
```javascript
function handleRequest({ includeStats } = {}) {
  const base = fetchBaseData()
  return {
    base,
    stats: includeStats ? computeStats(base) : undefined
  }
}
```
Nothing fancy.
Just doing less until asked.
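If the same expensive field can be asked for more than once in a request's lifetime, the idea extends one step further: compute on first access, then remember the result. A minimal sketch, assuming hypothetical `fetchBaseData` and `computeStats` helpers (not our exact code):

```javascript
// Hypothetical stand-ins for the real data layer.
function fetchBaseData() {
  return { items: [1, 2, 3] }
}

function computeStats(base) {
  return { count: base.items.length }
}

// Wrap a computation so it runs at most once, on first access.
function lazy(fn) {
  let cached
  let computed = false
  return () => {
    if (!computed) {
      cached = fn()
      computed = true
    }
    return cached
  }
}

function handleRequest() {
  const base = fetchBaseData()
  // stats is a thunk: nothing runs until a caller actually asks.
  return { base, stats: lazy(() => computeStats(base)) }
}

const res = handleRequest()
const count = res.stats().count // computeStats runs here, on first access
```

The trade-off is that consumers call `res.stats()` instead of reading a field; in exchange, untouched fields cost nothing.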
What changed after this
Some measurable things improved:
- lower CPU usage
- simpler cache logic
- more predictable performance
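The "simpler cache logic" point is worth a concrete illustration. When every projection was pre-computed, invalidation had to chase every derived cache; caching only the base fetch and deriving projections on demand leaves one entry to invalidate. A sketch with hypothetical names (`getBaseFromStore`, `computeStats`), not our exact code:

```javascript
// Hypothetical stand-ins for the real store and projection.
function getBaseFromStore(id) {
  return { id, items: [1, 2, 3] }
}

function computeStats(base) {
  return { count: base.items.length }
}

// Only the base fetch is cached; projections are derived on demand.
const baseCache = new Map()

function getBase(id) {
  if (!baseCache.has(id)) {
    baseCache.set(id, getBaseFromStore(id))
  }
  return baseCache.get(id)
}

// Invalidation touches one entry, not a family of projection caches.
function invalidate(id) {
  baseCache.delete(id)
}

function handleRequest(id, { includeStats } = {}) {
  const base = getBase(id)
  return {
    base,
    stats: includeStats ? computeStats(base) : undefined
  }
}
```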
But the bigger win was mental.
- Debugging became easier
- Changes were more localized
- The system fit back into one person's head
That matters when you're owning the full stack.
Why this mattered for us
Early systems don't have stable usage patterns.
You don't know yet:
- what's critical
- what's rarely used
- what's actually worth optimizing
Designing for all of it upfront feels responsible.
Most of the time, it's just premature certainty.
The takeaway
<aside> ✨ Robust systems aren't the ones that handle everything early.
They're the ones that:
- delay work
- keep assumptions small
- stay easy to change
</aside>

Anticipation feels like optimization.
Often, it's just work done too soon.
Let usage earn complexity.
Closing thought
Doing less early didn't make the system weaker.
It made it honest.
And honesty scales better than assumptions.