It started as a normal bug report.
"Sometimes I click Mark as Done and it still shows Pending."
Not "the API fails". Not "it crashes". Just… the UI feeling off.
And those are the worst problems because they sound small, but they silently destroy trust.
This was happening on a dashboard I was working on — a content review queue.
Picture a simple setup:
- A list of items to review.
- Each item has a status badge (Pending / Approved / Rejected).
- Clicking an item loads details and history.
- After an action, the status should update instantly.
Simple. Except it wasn't.
The Two Symptoms That Looked Unrelated
Symptom 1: "Ghost status"
I'd approve an item, refresh the page, and it would still show Pending for a bit. Then it would magically fix itself after some time.
Like the app needed "a moment to accept the truth." Classic ghost UI state.
Symptom 2: "Feels slower as the queue grows"
When the queue was small, everything felt smooth. As it grew past 10k items, the whole thing started feeling heavy: clicking an item felt delayed, filters felt sticky, and sometimes it felt like the page was doing extra thinking.
Both symptoms felt like "frontend issues". They weren't.
What Was Actually Happening
I had introduced Redis caching (good idea… used badly)
To protect the database and reduce load, I cached the queue list. Reads became fast.
But here's the catch: the cache was time-based, meaning it only refreshed when the TTL expired.
So if an item changed from Pending → Approved, the database was updated instantly… but the list UI could still show the old data until the cache decided: "Okay fine, I'll expire now."
Technically the data wasn't wrong. The backend was correct. But the UI was showing stale state long enough to confuse humans.
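To make the failure concrete, here is roughly what the read path looked like: a minimal sketch assuming ioredis and a 60-second TTL. `fetchQueueFromDb`, the key name, and the item shape are hypothetical stand-ins, not my actual code.

```ts
import Redis from "ioredis";

// Hypothetical shapes for illustration; the real queue item has more fields.
type QueueItem = { id: string; status: "Pending" | "Approved" | "Rejected" };
declare function fetchQueueFromDb(): Promise<QueueItem[]>;

const redis = new Redis();
const QUEUE_KEY = "review-queue:list";

async function getQueue(): Promise<QueueItem[]> {
  const cached = await redis.get(QUEUE_KEY);
  if (cached) return JSON.parse(cached) as QueueItem[]; // possibly stale

  const items = await fetchQueueFromDb();
  // The only freshness mechanism is this TTL: an approval that lands one
  // second after this write stays invisible for the next ~59 seconds.
  await redis.set(QUEUE_KEY, JSON.stringify(items), "EX", 60);
  return items;
}
```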
My frontend was doing "search engine work"
The second issue was classic scaling debt.
To render some UI bits, I was doing lookups like this: fetch a big list, use .find() repeatedly to locate related records, scan arrays to "match" IDs.
Works great at 200 items. At 10,000+ items, it becomes a slow-motion disaster.
This wasn't "server slow". This was O(N) pain becoming visible.
Fix 1: I stopped treating cache like a timer
The moment something important happens (Approve / Reject), the cache needs to react immediately.
So instead of waiting for TTL expiry, I made cache invalidation event-aware: action occurs, backend knows which entity changed, the specific cache keys get deleted right away, next request fetches fresh state.
No "Pending but actually Approved" for 60 seconds.
Rule that stayed with me:
Don't build caches that wait for timers. Build caches that listen for events.
Fix 2: I stopped making the browser do database jobs
I accepted a hard truth:
The browser is not a search engine.
So I killed the "fetch everything → scan everything" approach and added targeted endpoints.
Instead of shipping a massive payload and doing .find() loops on the client, I did indexed lookups on the backend.
Example mindset shift:
- "send 10k items and let frontend locate the one" ❌
- "ask backend for the exact record by ID (with indexes)" ✅
Because even with a network roundtrip, a 10ms indexed lookup beats a 500ms client-side scan through bloated JSON.
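As a sketch of the replacement endpoint, assuming Express and a generic Postgres-style client; the route, table, and column names are hypothetical:

```ts
import express from "express";

// Generic SQL client stub; swap in your actual driver.
declare const db: { query(sql: string, params: unknown[]): Promise<any[]> };

const app = express();

app.get("/api/items/:id", async (req, res) => {
  // Point lookup by primary key: the index makes this roughly
  // constant-time regardless of how many rows the table holds.
  const rows = await db.query(
    "SELECT id, status, updated_at FROM items WHERE id = $1",
    [req.params.id]
  );
  if (rows.length === 0) {
    res.status(404).end();
    return;
  }
  res.json(rows[0]);
});
```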
I also loved this line while working on the fix:
Indexes don't make databases faster. They make questions cheaper.
Fix 3: I made the UI refresh on purpose, not "eventually"
On the frontend, I was using SWR (stale-while-revalidate). Initially, revalidation was mostly passive (polling / interval refresh).
But for action-driven workflows, I needed intentional sync: user clicks approve, API succeeds, I trigger revalidation immediately, UI becomes consistent instantly.
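In SWR terms, the change was roughly this (component, props, and routes are hypothetical): call `mutate` on the affected keys as soon as the action succeeds.

```tsx
import { useSWRConfig } from "swr";

// Hypothetical action button; the real one carries more props.
function ApproveButton({ itemId }: { itemId: string }) {
  const { mutate } = useSWRConfig();

  async function onApprove() {
    await fetch(`/api/items/${itemId}/approve`, { method: "POST" });

    // Revalidate exactly the keys this action touched, immediately,
    // instead of waiting for the next polling interval.
    await mutate(`/api/items/${itemId}`);
    await mutate("/api/review-queue");
  }

  return <button onClick={onApprove}>Approve</button>;
}
```

SWR also supports optimistic updates through `mutate`, but plain post-action revalidation was enough to kill the ghost.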
That "ghost feeling" disappeared overnight.
What Changed After This (The Real Win)
This wasn't just a speed upgrade. It was a trust upgrade.
Status updates felt instant (because the cache stopped lying), the UI didn't get slower as data grew (because I removed the scaling debt), and performance became stable across sizes, not "fast until it isn't".
And the best part? I didn't need a rewrite. Just a better mental model.
The Simple Takeaway
If your system feels weird at scale, it's usually one of these: time-based cache causing stale UI, O(N) frontend scans becoming visible, or both (lucky you).
And the fix is almost always boring: event-aware invalidation, indexed lookups, intentional revalidation.
The UI doesn't need to be "smart". It needs to be honest.