Wallet Reservations at the Redis Layer: How We Stopped Double-Billing Users
When users pay per minute for a call, the wallet is on the critical path. Here's how we used Redis-level reservations and locks to prevent double-billing and negative balances.
There is a special kind of bug that wakes you up at 3am. It is the bug where a user has been charged for a call they didn't take. Or worse: the bug where two of a user's calls run simultaneously and their balance goes negative.
We had both, once.
This post is about the Redis-level wallet system we built so we'd stop having either.
The Problem
Mynaksh's backend is a Node.js / Express microservice setup, with the wallet living as one of the more critical services. (The AI workloads sit in a separate Python service; the wallet is Node-only — there's no LLM in this story.)
We charge users per minute for astrologer consultations. That's the business model. It also means wallet balance is on the critical path of every call:
- Before a call starts, we verify the user has enough balance to begin.
- During the call, we deduct from the balance per minute.
- After the call ends, we reconcile and finalize the charge.
This is fine in a single-threaded world. The problems start when you remember that:
- A user can start two calls back-to-back, so the first is still finalizing while the second is starting.
- A user can have the app open on two devices and start a call from each.
- Network glitches mean retries, and naive retries mean repeated charges.
- Concurrent reads from the call service and the deduction service can race each other.
The naive version — "read balance, check it, charge it" — has all the classic concurrency bugs.
The Two Failure Modes We Cared About
- Double-billing. A user is charged twice for the same minute or the same session. Customer-trust killer.
- Negative balance. A user starts a call when they shouldn't have, because two parallel reads both saw "yes, has balance."
Either one is a refund-and-apology email at best, a churned user at worst.
The Reservation Pattern
We solved both with a reservation pattern, implemented at the Redis layer.
When a user starts a call:
- Check the user's available balance in Redis. (Redis is the single source of truth for hot wallet state — fast reads and writes, atomic operations available.)
- Reserve the expected cost of the call duration upfront. The reservation reduces available balance immediately.
- The call runs. Per-minute deductions come out of the reservation, not the main balance.
- When the call ends, reconcile: actual cost is settled against the reservation, the reservation is released, any unused portion is returned to available balance.
The key idea: a reservation is a hold, not a charge. The hold prevents double-spending. The actual money movement happens on reconcile.
Two parallel "start call" attempts hit the reservation step in some order. Whichever wins reserves first; the second one finds that available balance is now lower and either fails or succeeds based on what's left. There's no window where both can pass the check.
The reservation primitive is a Redis Lua script that does check + reserve atomically:
local user_key = KEYS[1]        -- wallet hash: { available, reserved }
local reservation_key = KEYS[2] -- per-reservation hash
local amount = tonumber(ARGV[1])
local reservation_id = ARGV[2]
local ttl = tonumber(ARGV[3])
-- Read the available balance; treat a missing field as zero.
local available = tonumber(redis.call('HGET', user_key, 'available') or 0)
if available < amount then
  -- Insufficient funds: return the balance so the caller can report it.
  return {-1, available}
end
-- Move funds from available to reserved, atomically with the check above.
redis.call('HINCRBY', user_key, 'available', -amount)
redis.call('HINCRBY', user_key, 'reserved', amount)
redis.call('HSET', reservation_key, 'id', reservation_id, 'amount', amount, 'user_key', user_key)
-- The TTL is the safety net: an abandoned reservation expires on its own.
redis.call('EXPIRE', reservation_key, ttl)
return {0, available - amount}
Returning -1 plus the current balance lets the caller distinguish "insufficient funds" from "Redis error" without an extra round trip. The reservation TTL is a safety net — if a process dies between reserve and reconcile, the reservation expires and the funds return to available.
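The caller's side of that contract is easy to get wrong, so here is a minimal sketch of the check-and-reserve semantics with an in-memory stand-in for Redis (all names are illustrative; in production this is a single EVALSHA call against the script above):

```javascript
// In-memory stand-in for the Lua script's check-and-reserve semantics.
// In production the whole tryReserve body is one atomic EVALSHA.
function makeWallet(available) {
  const state = { available, reserved: 0, reservations: new Map() };
  // Mirrors the Lua script: atomic check + reserve, returns [status, balance].
  state.tryReserve = (reservationId, amount) => {
    if (state.available < amount) return [-1, state.available]; // insufficient funds
    state.available -= amount;
    state.reserved += amount;
    state.reservations.set(reservationId, amount);
    return [0, state.available]; // success, new available balance
  };
  return state;
}

const wallet = makeWallet(100);
const [status, balance] = wallet.tryReserve('resv-1', 60);
// The hold is now visible: a second overlapping attempt sees only 40 available.
const [status2, balance2] = wallet.tryReserve('resv-2', 60);
```

The two-element return is the whole interface: status 0 with the new available balance, or -1 with the unchanged balance. There is no path where both overlapping attempts see the full 100.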
The Lock
Reservations alone aren't enough. You also need to lock the wallet during the reservation operation to prevent two concurrent reservation attempts from interleaving in subtle ways.
We use Redis locks, scoped to the user's wallet. Acquired with SET NX and a 5-second TTL:
// lockId ties the lock to this operation, so we never release someone else's lock.
const lockId = uuid(); // e.g. crypto.randomUUID()
// SET ... NX EX 5: acquire only if the key is absent, with a 5-second safety TTL.
const acquired = await redis.set(
  `lock:wallet:${userId}`, lockId, 'NX', 'EX', 5
);
if (!acquired) {
  // Another wallet operation is in flight.
  // Retry with backoff up to 3x, then surface a clean failure.
}
The lock is short-lived — measured in milliseconds in normal operation. The 5-second TTL is a safety floor: if a process dies holding the lock, it clears automatically. We use a Redlock-style implementation, with the standard caveats.
If we fail to acquire the lock after retries, we surface "something went wrong, please try again" to the user instead of risking a duplicate hold. Better to ask them to retry than to charge them twice.
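Release needs the same care as acquire: only the holder that wrote the lockId may delete the key, otherwise a slow process can clobber a lock someone else re-acquired after the TTL fired. In production that check-and-delete is a small Lua script; this runnable sketch uses an in-memory map in place of Redis (function names are illustrative):

```javascript
// Lock acquire/release with holder verification. The map stands in for Redis;
// in production, release is an atomic GET + DEL inside a Lua script.
const locks = new Map(); // key -> lockId

function acquire(key, lockId) {
  if (locks.has(key)) return false; // equivalent to SET ... NX failing
  locks.set(key, lockId);
  return true;
}

function release(key, lockId) {
  // If the TTL expired and another process re-acquired, its lockId differs —
  // deleting here would free a lock we no longer own.
  if (locks.get(key) !== lockId) return false;
  locks.delete(key);
  return true;
}
```

The compare-before-delete is the whole point: a plain DEL on release would silently break mutual exclusion for whichever process acquired after our TTL expired.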
The Frontend Queue
Backend locks prevent double-spending at the wallet itself. They don't prevent the frontend from firing concurrent wallet requests in the first place — they just make sure the second one fails cleanly when it arrives. That's correct, but it's wasteful: rejected requests are extra network traffic, extra error-handling code paths, and occasional UI spinners that resolve into "please try again" toasts.
So we added a frontend queue, behind a feature flag.
When the flag is on, all wallet operations on the user's app go through a single in-app queue. Any new wallet request joins the queue and waits for the previous one to complete (or fail) before firing. If a recharge, a balance check, and a call-start hit the wallet at roughly the same time, they execute serially in arrival order — no parallel hits, no rejected lock acquisitions, no surprising UI states.
The queue is per-user, lives in the app's process, and clears when the app is killed. Pending requests have a timeout so a hung backend call doesn't block the queue forever.
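A serviceable version of the queue is just a promise chain per user. A minimal sketch, omitting the timeout handling the real queue has (`makeWalletQueue` is an illustrative name):

```javascript
// One queue per user: each wallet operation starts only after the previous
// one settles, so requests hit the backend strictly in arrival order.
function makeWalletQueue() {
  let tail = Promise.resolve();
  return function enqueue(op) {
    // Run op after the current tail, whether the previous op resolved or rejected.
    const run = tail.then(op, op);
    // A failed op must not poison the chain for later operations.
    tail = run.catch(() => {});
    return run;
  };
}
```

Callers get back the operation's own promise, so error handling stays local to each request while ordering stays global.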
The flag existed so we could roll the queue out gradually. We turned it on for 10% of users first, watched for queue starvation or unexpected serialization slowdowns, then ramped to 100% over a couple of days. The backend locks still run regardless — the queue is belt-and-suspenders, not a replacement.
The combined effect of "queue on the frontend, locks on the backend" is that wallet contention is essentially impossible from a single user's app. The backend locks handle the genuinely concurrent cases — multiple devices, retries from network failures, the user reopening the app while a previous session is still finalizing. Lock-contention warnings, which had been a small but steady stream in our logs, basically stopped after the rollout.
Reconciliation
Real-time wallet operations live in Redis. The eventual source of truth lives in Postgres.
We reconcile asynchronously:
- Every settled call writes a transaction record to Postgres immediately on call-end.
- A background job runs every 5 minutes per shard, auditing Redis state against the database — checking that hot balance + outstanding reservations + completed transactions sum to the user's authoritative balance.
- If they diverge by more than rounding error, it pages an engineer.
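The invariant itself is a one-liner once the three inputs are in hand. A sketch of the per-user check the audit job runs (field names and the epsilon are illustrative):

```javascript
// Audit: Redis hot state (available + reserved) plus completed transaction
// totals must equal the user's authoritative Postgres balance.
function auditUser(hot, settledTotal, authoritativeBalance, epsilon = 0.01) {
  const derived = hot.available + hot.reserved + settledTotal;
  const drift = Math.abs(derived - authoritativeBalance);
  // Anything beyond rounding error is a paging event, not a log line.
  return { ok: drift <= epsilon, drift };
}
```

Keeping the check pure — three numbers in, a verdict out — makes it trivial to unit-test and to replay against historical snapshots when something does page.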
Production divergences are rare. The only one we've seen was a brief Redis failover that lost a few seconds of writes. The reconcile job caught it within five minutes; affected users were credited automatically; no support tickets.
What This Cost Us
Building this was about three weeks of focused work. Most of that time was not the implementation — it was the testing.
We built a chaos harness that:
- Spins up a fresh Redis cluster and Postgres instance.
- Seeds 1,000 fake users with varied balances.
- Fires concurrent call-start, call-end, and recharge events with randomized timing.
- Asserts at the end that no user's balance is negative, no transaction is recorded twice, and no reservation is orphaned.
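The final assertion step is a pure function over the end state, which keeps it easy to reason about. A sketch of those checks (data shapes are illustrative):

```javascript
// End-of-run checks from the chaos harness: no negative balances,
// no duplicate transactions, no reservation left neither settled nor released.
function checkInvariants(users, txns, reservations) {
  const failures = [];
  for (const u of users) {
    if (u.balance < 0) failures.push(`negative balance: ${u.id}`);
  }
  const seen = new Set();
  for (const t of txns) {
    if (seen.has(t.id)) failures.push(`duplicate transaction: ${t.id}`);
    seen.add(t.id);
  }
  for (const r of reservations) {
    if (!r.settled && !r.released) failures.push(`orphaned reservation: ${r.id}`);
  }
  return failures; // empty array means the run passed
}
```

Collecting every failure rather than throwing on the first makes a bad run far easier to diagnose: one randomized seed often trips several invariants at once.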
The harness caught two real bugs in our own logic before they hit production: a race where the reconcile job and a concurrent reservation could both touch the same key, and an off-by-one in the reservation extension flow where extending could briefly double-count the extension amount. Both got fixed before any of this saw user traffic.
The implementation itself is small. The properties it gives you are large: atomic reservations, no double-spends, recoverable state after process crashes, auditable history.
What's Still Imperfect
The biggest remaining risk is Redis itself. If the Redis cluster falls over hard, we degrade to a slower fallback path that goes directly through Postgres with explicit row-level locking. This is correct but ~10x slower, and at peak traffic it can't keep up — calls fail to start. We accept that. Better to refuse calls than to start them on stale data.
The fallback path has been exercised in load tests but never in a real production outage. We run Redis with replicas and Sentinel for automatic failover, which has handled the few real incidents without engaging the fallback.
Long-call handling: if a call exceeds the upfront reservation, we extend it mid-call when the consumed portion crosses 80% of the reserved amount. The extension is its own atomic operation, with the same lock-and-reserve pattern. It's mostly invisible to the user — they see a continuous wallet meter — but it's where the most subtle bugs lived during development.
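The trigger logic for that extension is worth pinning down precisely, because the off-by-one mentioned earlier lived exactly here. A sketch of the decision (threshold, block size, and names are illustrative; the reserve itself goes through the same atomic lock-and-reserve path as the initial hold):

```javascript
// Decide whether a running call needs its reservation extended, and by how much.
// Returns null while there is headroom; otherwise the additional amount to
// reserve. The return value is the delta we ask Redis to move — counted once,
// never a new total for the reservation.
function planExtension(consumed, reserved, perMinuteRate, blockMinutes = 5, threshold = 0.8) {
  if (consumed < reserved * threshold) return null;
  return perMinuteRate * blockMinutes;
}
```

Treating the extension strictly as a delta is what rules out the double-counting class of bug: the atomic reserve op only ever sees "move this much more", never a recomputed total it could apply on top of a hold that already grew.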