>_
EngineeringNotes
← Back to System Design

Consistent Hashing

Before learning the solution, we must deeply understand the problem: the catastrophic failure of standard hashing algorithms when scaling systems out.

Let's recap Load Balancers

A system that decides which server should handle a request.

ClientLoad BalancerS1S2

If we use the Round Robin technique (which is simple and has no memory), the same user may hit different servers on subsequent requests. This means caching on the server makes no sense!

Why does stickiness matter?

Imagine the following flow:

  • 1.User logs in.
  • 2.Server caches user data (profile, session configs, etc.)
  • This is done so no continuous Database calls happen.
But...

If next time the user goes to a new server, then there is absolutely no meaning of caching. The new server won't have the cache!

The standard solution: Hashing

To achieve stickiness, we use hashing based on the number of available machines (N).

Server = Hash (request_id) % NN = number of active machines

Mostly, request_id is the `user_id` so we get the exact same hash value always.
Same User → Same Server.

So, What's the problem?

The problem isn't the load balancer. Adding or removing servers is.

The severe flaw in modulo Math

Each time we increase or decrease servers, the value of N changes. Consequently, almost every user will be forwarded to a completely new server because their modulo result changes.

Suppose Hash(A) = 15
4 Servers:15 % 4 = 3Server 3
Now we add 1 new server:
5 Servers:15 % 5 = 0Server 0 💥
Target changed entirely!

The Catastrophe

Massive Cache Invalidation

Almost all users map to new servers. Every existing cache becomes totally irrelevant.

Database Overload

Because caches miss, every request hits the DB again simultaneously.

System Crash

The DB gets overwhelmed with load it was never scaled for, slowing down completely or crashing.

"The Load Balancer is doing its job. But the hashing strategy is flawed."

The Cost of Rehashing

A visual look into why adding a server is so destructive.

4
Machines
100
Buckets

25 Bucks in each server

+ 1 New Server
5
Machines
100Buckets

100 / 5 = 20 Bucks each

As the new server comes, we must re-distribute everything. 20 buckets move directly to the new server (S4). But severely, the modulo mapping of the remaining buckets changes entirely.

Almost ALL 100 buckets changed their target server!

So, What Do We Actually Need?

1Load Balancing
2Stickiness
3Minimal Changes on Add/Remove

To perfectly resolve this issue, we use"Consistent Hashing"

The Clean Layered Explanation

Understanding the core mechanics and the "Virtual Nodes" game changer.

1What is M?

M is the size of the hashing space.

Instead of thinking of hashes on a straight line 0 ----- 99, we wrap it into a continuous circle.

The Hash Ring:
0 → 1 → 2 ... → 99 → back to 0

0/MM/2

2The Mapping Logic (Very Important)

Instead of modulo by N (changing servers), we do 2 distinct things using the fixed M ring:

Step 1: Place Servers

hash(serverId) % M

Servers become fixed points mapped directly onto the circle.

Step 2: Place Requests

hash(requestId) % M

Requests also map onto that exact same ring.

The Golden Rule

For each request computationally placed on the ring, move clockwise along the ring to find the very first nearest server.

Case 1: Add a Server

Only the requests lying immediately between the new server and the previous server are affected. NOT the entire system.

Case 2: Remove a Server

Only its adjacent regional requests shift to the next server gracefully. Minimal change on the ring.

Core Intuition: Old Modulo Hashing = Global Reshuffle ❌.
Consistent Hashing = Local Adjustment ✅.

!

The Real-life Problem & The Game Changer

The Flaw: Skewed Load

Since servers are placed pseudo-randomly via hash, distances between them are rarely exactly equal.

S1 ------------------- S2 - S3 - S4

👉On the ring above, S1 gets a HUGE chunk of the geometry. Others get tiny chunks, resulting in a massive Load Imbalance.

Virtual Servers (v-nodes)

Instead of 1 server = 1 point on the ring,
1 server = multiple points on the ring.

Let k = virtual nodes per server.

If k = 3, each server appears 3 times uniquely on the ring. 4 physical servers = 12 total hashed points.

The distribution goes from:[S1] ....... [S2] . [S3] . [S4]To much more evenly spread:[S1]-[S2]-[S3]-[S4]-[S1]-[S3]-[S2]...
Why this is genius:
  • Remove S1: 3 small regions vanish and load spreads cleanly and proportionally across S2,S3,S4 instead of crushing one unlucky neighbor.
  • Add S5: Intersects and takes tiny regions from existing servers harmoniously, ensuring a perfectly smooth transition.
  • Note: typically k ≈ log(N) theoretically guarantees very low imbalance probability without exploding memory overhead.

The Pizza Analogy 🍕

Imagine a pizza. Without virtual nodes: 4 people get 4 big slices. If the cuts are unequal, it's totally unfair.

With virtual nodes: Cut the pizza into 100 tiny slices. Each person is randomly assigned 25 specific slices scattered around the pie. Everyone inherently gets an almost perfectly equal share of the cheese!

Final Mental Model

Terms

M Size of Hash Ring

k Virtual Nodes per Server

Flow
  1. Hash servers → place them on ring (k times each).
  2. Hash request → place it securely on ring.
  3. Move clockwise → assign to first found server.

One Line Execution Summary

"Map both servers and requests on a ring, assign via nearest clockwise server, and break big chunks into tiny pieces for perfect balance."