When a single database can no longer scale vertically or hold all your data, it's time to split it up. Discover the roadmap from a single node to high-throughput distributed systems.
In reality, a database is just a process running on a server (like an EC2 instance).
"We represent it with a cylinder symbol, but it's fundamentally a process coupled with high-performance storage."
As your product goes viral, your single small server will start gasping for air. Here is how we evolve.
The first instinct: Scale Up. Give the database more CPU, more RAM, and more Disk. The process stays the same, but the hardware bucket grows.
Relying on a single master (Read Replica model) creates a Single Point of Failure. If the master dies, your entire system goes read-only or crashes.
A Replica Set is a group of nodes that maintain the exact same data set. One node is elected as the Primary, while others are Secondaries.
Secondaries can serve read requests, significantly boosting your system's read throughput.
When vertical scaling hits its limit, we Scale Out. We split the data across multiple identical servers.
A physical server instance added to the system infrastructure. We 'shard' the database.
The act of splitting the data itself into mutually exclusive subsets. We 'partition' the data.
Combined Capacity: 2000 Writes/sec
Deciding which data goes where isn't random. It follows a Deterministic Logic based on how your system handles traffic and storage.
The goal is to avoid Hot Shards — situations where one server is overwhelmed while others sit idle. This is achieved by picking a partition key that distributes data uniformly.
Splitting by Rows. Each shard holds full objects for a subset of IDs.
Splitting by Columns. Sensitive or rarely accessed attributes move to a separate shard.
"If P5 becomes too hot, we can move it from Shard Beta to another shard."
A good partition key should have High Cardinality. For example, partitioning by UserID is better than partitioning by Status, which might only have a few values, causing massive data skew as your system grows.
These concepts are often used interchangeably, but their combinations define your architecture's capability.
Sharding \ Partitioning | NO | YES |
|---|---|---|
| NO | Standard DB (Day 0) Single process, single data set. Limited scale. | Logical Partitions Splitting data within same server (e.g., Table partitioning). |
| YES | Read Replicas Multiple physical servers, but NO data splitting. Full replication. | Full Sharding Physical server splitting + Data partitioning. Massive scale. |
Higher Throughput: Halve the load per server and handle twice the traffic.
Increased Storage: Bypass physical disk limits (e.g., 100TB) by spreading data across nodes.
High Availability: If one shard goes down, only a portion of your users are affected.
Operational Complexity: Managing replication lags, consistent hashing, and rebalancing is hard.
Expensive Cross-Shard Queries: Joining tables across shards is slow and often not natively possible.
Rebalancing Costs: Moving a "hot" partition to another shard is a heavy background task.
Sharding isn't just about splitting tables. It's about designing access patterns that ensure most queries hit exactly one shard.