Sometimes systems don't get slow because they can't handle load.
They get slow because they're doing work too early.
This is a short write-up of one such situation: not a failure, just a correction.
TL;DR
We were preparing data before it was needed:
- pre-computing responses
- caching aggressively
- exposing multiple projected endpoints
Under real usage, most requests didn't need half of it.
We simplified:
- collapsed projections
- delayed computation
- removed the waterfall-style flow
System load dropped, but more importantly, the system became easier to understand and change.
<aside> 💡 Doing less early didn't make it fragile. It made it honest.
</aside>

The simple version
Imagine cooking everything on the menu because someone might order it.
That's what our system was doing.
Instead of responding to what was asked, it was preparing for everything that could be asked.
It felt safe.
It wasn't cheap.
What we were trying to do
This was a backend-heavy system with:
- multiple consumers
- different data needs
- some pressure around performance
To be safe and "fast", the design leaned towards:
- pre-computed responses
- caching wherever possible
- multiple endpoints with slightly different projections
On a whiteboard, it looked clean.
In code, it grew sharp edges.
The original flow (waterfall-style)
Every request followed roughly the same path:
request
↓ fetch base data
↓ compute all projections
↓ cache results
↓ respond
Even if the caller needed only a small subset, everything ran.
It worked.
But it worked too much.
What showed up in reality
Once the system saw real usage, a few things became obvious:
- Most requests didn't need full data
- Some expensive computations were barely used
- Cache invalidation became harder than expected
- Small changes touched more code than they should have
We weren't slow.
We were consistently doing unnecessary work.
The uncomfortable realization
<aside> 🤔 The problem wasn't caching. The problem was anticipation.
</aside>

We were optimizing for scenarios that hadn't happened yet.
So the question changed from:
"How do we optimize this further?"
to:
"What breaks if we don't do this upfront?"
In most cases, the answer was: not much.
The shift: removing the waterfall
Instead of preparing everything early, we moved to a more lazy, on-demand approach.
Conceptually, the flow became:
request
↓ fetch minimum
↓ respond
↳ compute more only if needed
That was the big change.
A very small code-level example
Before (eager computation):
```javascript
function handleRequest() {
  const base = fetchBaseData()
  const stats = computeStats(base)
  const insights = computeInsights(base)
  return { base, stats, insights }
}
```
Even if the caller only needed base, everything ran.
After (lazy evaluation):
```javascript
function handleRequest({ includeStats } = {}) {
  const base = fetchBaseData()
  return {
    base,
    stats: includeStats ? computeStats(base) : undefined
  }
}
```
Nothing fancy.
Just doing less until asked.
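If the same expensive field can be asked for more than once in a request's lifetime, the idea extends one step further: compute on first access, then remember the result. A minimal sketch, assuming hypothetical `fetchBaseData` and `computeStats` helpers (not our exact code):

```javascript
// Hypothetical stand-ins for the real data layer.
function fetchBaseData() {
  return { items: [1, 2, 3] }
}

function computeStats(base) {
  return { count: base.items.length }
}

// Wrap a computation so it runs at most once, on first access.
function lazy(fn) {
  let cached
  let computed = false
  return () => {
    if (!computed) {
      cached = fn()
      computed = true
    }
    return cached
  }
}

function handleRequest() {
  const base = fetchBaseData()
  // stats is a thunk: nothing runs until a caller actually asks.
  return { base, stats: lazy(() => computeStats(base)) }
}

const res = handleRequest()
const count = res.stats().count // computeStats runs here, on first access
```

The trade-off is that consumers call `res.stats()` instead of reading a field; in exchange, untouched fields cost nothing.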
What changed after this
Some measurable things improved:
- lower CPU usage
- simpler cache logic
- more predictable performance
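The "simpler cache logic" point is worth a concrete illustration. When every projection was pre-computed, invalidation had to chase every derived cache; caching only the base fetch and deriving projections on demand leaves one entry to invalidate. A sketch with hypothetical names (`getBaseFromStore`, `computeStats`), not our exact code:

```javascript
// Hypothetical stand-ins for the real store and projection.
function getBaseFromStore(id) {
  return { id, items: [1, 2, 3] }
}

function computeStats(base) {
  return { count: base.items.length }
}

// Only the base fetch is cached; projections are derived on demand.
const baseCache = new Map()

function getBase(id) {
  if (!baseCache.has(id)) {
    baseCache.set(id, getBaseFromStore(id))
  }
  return baseCache.get(id)
}

// Invalidation touches one entry, not a family of projection caches.
function invalidate(id) {
  baseCache.delete(id)
}

function handleRequest(id, { includeStats } = {}) {
  const base = getBase(id)
  return {
    base,
    stats: includeStats ? computeStats(base) : undefined
  }
}
```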
But the bigger win was mental.
- Debugging became easier
- Changes were more localized
- The system fit back into one person's head
That matters when you're owning the full stack.
Why this mattered for us
Early systems don't have stable usage patterns.
You don't know yet:
- what's critical
- what's rarely used
- what's actually worth optimizing
Designing for all of it upfront feels responsible.
Most of the time, it's just premature certainty.
The takeaway
<aside> ✨ Robust systems aren't the ones that handle everything early.
They're the ones that:
- delay work
- keep assumptions small
- stay easy to change
</aside>

Anticipation feels like optimization.
Often, it's just work done too soon.
Let usage earn complexity.
Closing thought
Doing less early didn't make the system weaker.
It made it honest.
And honesty scales better than assumptions.