Before learning the solution, we must deeply understand the problem: the catastrophic failure of standard hashing algorithms when scaling systems out.
A system that decides which server should handle a request.
If we use the Round Robin technique (which is simple and has no memory), the same user may hit different servers on subsequent requests. This means caching on the server makes no sense!
Imagine the following flow:
If next time the user goes to a new server, then there is absolutely no meaning of caching. The new server won't have the cache!
To achieve stickiness, we use hashing based on the number of available machines (N).
Mostly, request_id is the `user_id` so we get the exact same hash value always.
Same User → Same Server.
The problem isn't the load balancer. Adding or removing servers is.
Each time we increase or decrease servers, the value of N changes. Consequently, almost every user will be forwarded to a completely new server because their modulo result changes.
Almost all users map to new servers. Every existing cache becomes totally irrelevant.
Because caches miss, every request hits the DB again simultaneously.
The DB gets overwhelmed with load it was never scaled for, slowing down completely or crashing.
"The Load Balancer is doing its job. But the hashing strategy is flawed."
A visual look into why adding a server is so destructive.
25 Bucks in each server
100 / 5 = 20 Bucks each
As the new server comes, we must re-distribute everything. 20 buckets move directly to the new server (S4). But severely, the modulo mapping of the remaining buckets changes entirely.
Almost ALL 100 buckets changed their target server!
To perfectly resolve this issue, we use"Consistent Hashing"
Understanding the core mechanics and the "Virtual Nodes" game changer.
M is the size of the hashing space.
Instead of thinking of hashes on a straight line 0 ----- 99, we wrap it into a continuous circle.
The Hash Ring:
0 → 1 → 2 ... → 99 → back to 0
Instead of modulo by N (changing servers), we do 2 distinct things using the fixed M ring:
hash(serverId) % MServers become fixed points mapped directly onto the circle.
hash(requestId) % MRequests also map onto that exact same ring.
For each request computationally placed on the ring, move clockwise along the ring to find the very first nearest server.
Only the requests lying immediately between the new server and the previous server are affected. NOT the entire system.
Only its adjacent regional requests shift to the next server gracefully. Minimal change on the ring.
Core Intuition: Old Modulo Hashing = Global Reshuffle ❌.
Consistent Hashing = Local Adjustment ✅.
Since servers are placed pseudo-randomly via hash, distances between them are rarely exactly equal.
S1 ------------------- S2 - S3 - S4
👉On the ring above, S1 gets a HUGE chunk of the geometry. Others get tiny chunks, resulting in a massive Load Imbalance.
Instead of 1 server = 1 point on the ring,
1 server = multiple points on the ring.
Let k = virtual nodes per server.
If k = 3, each server appears 3 times uniquely on the ring. 4 physical servers = 12 total hashed points.
k ≈ log(N) theoretically guarantees very low imbalance probability without exploding memory overhead.Imagine a pizza. Without virtual nodes: 4 people get 4 big slices. If the cuts are unequal, it's totally unfair.
With virtual nodes: Cut the pizza into 100 tiny slices. Each person is randomly assigned 25 specific slices scattered around the pie. Everyone inherently gets an almost perfectly equal share of the cheese!
M Size of Hash Ring
k Virtual Nodes per Server
k times each)."Map both servers and requests on a ring, assign via nearest clockwise server, and break big chunks into tiny pieces for perfect balance."