System Design Overview

System design is the process of planning and defining how a software system will work internally.

Core Objective

Defining a system's architecture, components, interfaces, and data flows to meet specific functional and non-functional requirements.

The Blueprint

Creating a robust blueprint for software systems that ensures Scalability, Performance, Reliability, and Efficiency.

Target Application

Handle anywhere from 1 user to 10 crore (100 million) users efficiently

How to manage millions of records?

How to handle millions of requests?

How to handle system failures?

Architecture

Monolithic / Microservices

Components

Frontend, Backend, DB

The standard lifecycle

Requirements

System Design

Coding

Testing

Deployment

Types of System Design

System Design is broadly categorized into two phases: High-Level Design (HLD) and Low-Level Design (LLD).

High-Level Design (HLD)

"Bird's eye view of the system"

What it answers

What components exist?
How do they interact?
How does the system scale?

What it includes

Architecture
Frontend / Backend
Databases
APIs & Load Balancers
Caching & CDN

Example: Instagram Flow

Users send requests → Load Balancer → Backend → DB & Object Storage + CDN.
👉 No class, no code — just flow.

Goal:

Handle millions of users • Ensure scalability & reliability

Low-Level Design (LLD)

"Zoomed-in internal view"

What it answers

What classes and objects?
What methods and algorithms?
What data structures and logic?

What it includes

Classes & Objects
DB schemas & Relations
API payloads
Algorithms
Design Patterns

Example: Instagram Internals

class User, Post, FeedService
DB Tables: users, posts, followers
POST /createPost, GET /feed
👉 Close to actual coding.
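
To make the LLD example concrete, here is a minimal in-memory sketch of the User, Post, and FeedService classes. The field names, method names, and the newest-first ordering are assumptions made for illustration; the source only names the classes, tables, and endpoints.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    user_id: int
    name: str
    following: set = field(default_factory=set)  # ids of users this user follows


@dataclass
class Post:
    post_id: int
    author_id: int
    caption: str


class FeedService:
    """In-memory stand-in for the posts table and the feed logic (hypothetical)."""

    def __init__(self) -> None:
        self.posts = []

    def create_post(self, author: User, caption: str) -> Post:
        # Corresponds loosely to POST /createPost.
        post = Post(post_id=len(self.posts) + 1, author_id=author.user_id, caption=caption)
        self.posts.append(post)
        return post

    def feed_for(self, user: User) -> list:
        # Corresponds loosely to GET /feed: posts from followed users, newest first.
        followed = [p for p in self.posts if p.author_id in user.following]
        return sorted(followed, key=lambda p: p.post_id, reverse=True)


alice = User(1, "alice")
bob = User(2, "bob", following={1})
service = FeedService()
service.create_post(alice, "first")
service.create_post(bob, "mine")
service.create_post(alice, "second")
feed_captions = [p.caption for p in service.feed_for(bob)]
```

In a real system the lists would be replaced by the users, posts, and followers tables, but the class boundaries would look similar.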

Goal:

Make system implementable • Ensure clean, maintainable code

Feature	High-Level Design (HLD)	Low-Level Design (LLD)
Focus	System architecture	Code-level design
Level	Abstract	Detailed
Concern	Scalability	Implementation
Includes	Services, DB, APIs	Classes, methods
Used by	Architects	Developers

10 Core Concepts

Mastering these principles makes a system reliable, scalable, and maintainable.

Vertical Scaling

Increase the capacity of a single resource.

Pizza Shop Analogy

Instead of hiring another chef, you give your current chef better tools, a faster oven, or caffeine to handle more orders.

Real World System

Upgrading a single server with a more powerful CPU, more RAM, or a faster SSD to handle increased traffic.

Preprocessing

Performing tasks during off-peak hours (for example, via cron jobs) to save resources for busy times.

Pizza Shop Analogy

Pre-making pizza bases or chopping veggies during off-peak hours so you're ready when the dinner rush hits.

Real World System

Running background tasks (like generating daily reports or updating caches) during off-peak hours to reduce latency during busy periods.
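
A rough sketch of the idea: an off-peak job aggregates data once, and the busy read path only performs a lookup. The order data, the function names, and the dictionary cache are all hypothetical; a real system would typically use a scheduler and an external cache or table.

```python
from collections import defaultdict

# Hypothetical raw order log: (day, amount) pairs.
ORDERS = [("mon", 12.0), ("mon", 8.5), ("tue", 20.0)]

report_cache = {}


def build_daily_report(orders) -> None:
    """Off-peak job (e.g. triggered by cron): aggregate once and store the result."""
    totals = defaultdict(float)
    for day, amount in orders:
        totals[day] += amount
    report_cache.clear()
    report_cache.update(totals)


def get_daily_total(day: str) -> float:
    """Peak-time read path: a dictionary lookup with no aggregation work."""
    return report_cache.get(day, 0.0)


build_daily_report(ORDERS)
monday_total = get_daily_total("mon")
```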

Backup Server

Introducing redundancy to eliminate single points of failure.

Pizza Shop Analogy

Having a spare, identical oven ready to be fired up immediately if the primary oven breaks down.

Real World System

Implementing redundancy: if your primary database or server crashes, a standby replica or secondary unit automatically takes over, so there is no single point of failure.
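
The failover behaviour can be sketched as a client that tries the primary and falls back to the standby. The Server and FailoverClient classes are hypothetical stand-ins; production failover usually also involves health checks and data replication, which this sketch leaves out.

```python
class Server:
    def __init__(self, name: str, healthy: bool = True) -> None:
        self.name = name
        self.healthy = healthy

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class FailoverClient:
    """Tries the primary first and falls back to the standby on failure."""

    def __init__(self, primary: Server, standby: Server) -> None:
        self.primary = primary
        self.standby = standby

    def send(self, request: str) -> str:
        try:
            return self.primary.handle(request)
        except ConnectionError:
            # Primary unavailable: the standby takes over.
            return self.standby.handle(request)


primary = Server("primary")
standby = Server("standby")
client = FailoverClient(primary, standby)
before = client.send("GET /menu")
primary.healthy = False  # simulate a crash of the primary
after = client.send("GET /menu")
```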

Horizontal Scaling

Adding more resources or machines to handle increased demands.

Pizza Shop Analogy

Instead of one super-fast chef, you hire 10 regular chefs. You can easily add more chefs as demand increases.

Real World System

Instead of one powerful server, you deploy multiple smaller servers. This allows you to scale by simply adding more nodes to the network.

Microservices

Dividing the system into specialized, manageable units where each component has a specific responsibility.

Pizza Shop Analogy

Breaking down the kitchen into specialized stations: one chef just prepares dough, one does toppings, and one bakes. Each station operates independently.

Real World System

Breaking a monolithic application into smaller, specialized services (e.g., Auth, Payment, Profile). Each can be scaled or updated independently.

Distributed Systems

Spreading the system across different locations to improve fault tolerance and response time.

Pizza Shop Analogy

Opening multiple pizza branches across the city so a customer gets their pizza hot from the nearest branch.

Real World System

Spreading your service across geographical locations (e.g., AWS regions). This lets a user in Tokyo access data from a server near them, which reduces latency and increases fault tolerance.
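
As an illustration, region selection can be sketched as picking the lowest-latency region that is still healthy. The latency figures below are invented for the example, and pick_region is a hypothetical helper; real deployments usually delegate this to DNS-based or anycast routing.

```python
# Invented round-trip latencies in milliseconds, as seen by a user in Tokyo.
LATENCY_MS = {"ap-northeast-1": 8, "ap-southeast-1": 70, "us-east-1": 160}


def pick_region(latency_ms: dict, healthy: set) -> str:
    """Choose the lowest-latency region that is currently healthy."""
    candidates = {region: ms for region, ms in latency_ms.items() if region in healthy}
    if not candidates:
        raise RuntimeError("no healthy region available")
    return min(candidates, key=candidates.get)


all_up = pick_region(LATENCY_MS, {"ap-northeast-1", "ap-southeast-1", "us-east-1"})
# If the nearest region fails, traffic moves to the next nearest one.
tokyo_down = pick_region(LATENCY_MS, {"ap-southeast-1", "us-east-1"})
```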

Load Balancing

Using a central authority to route requests efficiently between resources based on real-time data.

Pizza Shop Analogy

A skilled manager at the front taking orders and intelligently assigning them to the cook who currently has the smallest backlog.

Real World System

A 'traffic cop' (like Nginx or AWS ELB) sits in front of your servers and distributes incoming user requests across them, for example by sending each request to the least busy server, to optimize response times.
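
A least-busy policy can be sketched as a balancer that tracks active requests per server and always picks the smallest count. The class and server names are hypothetical, and this is only one of several policies a real load balancer may offer (round robin is another common one).

```python
class LeastConnectionsBalancer:
    def __init__(self, servers) -> None:
        # Number of in-flight requests per server name.
        self.active = {name: 0 for name in servers}

    def acquire(self) -> str:
        # Pick the server with the fewest active requests; ties go to the
        # server that was registered first.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name: str) -> None:
        # Call when a request finishes so the server's load drops again.
        self.active[name] -= 1


lb = LeastConnectionsBalancer(["s1", "s2", "s3"])
first = lb.acquire()
second = lb.acquire()
third = lb.acquire()
lb.release("s2")       # s2 finishes its request first
fourth = lb.acquire()  # so the next request goes back to s2
```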

Decoupling

Separating different parts of a system so they can operate and evolve independently.

Pizza Shop Analogy

The cashier writes the order on a ticket and puts it on a rail. They don't need to yell at or wait for the chef; they just fire the order and take the next customer.

Real World System

Using message queues (Kafka, RabbitMQ) to ensure the 'Ordering' service doesn't need to know how the 'Shipping' service works. They communicate via events, keeping the system flexible.
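
The ticket-rail idea can be sketched with Python's standard-library queue standing in for a broker such as Kafka or RabbitMQ. The event shape and function names are assumptions for illustration; the point is that the ordering side only publishes an event and never calls the shipping side directly.

```python
import queue
import threading

events = queue.Queue()  # in-process stand-in for a message broker
shipped = []


def place_order(order_id: int) -> None:
    """Ordering side: publish an event and return immediately."""
    events.put({"type": "order_placed", "order_id": order_id})


def shipping_worker() -> None:
    """Shipping side: consume events at its own pace."""
    while True:
        event = events.get()
        if event is None:  # sentinel value: stop the worker
            break
        shipped.append(event["order_id"])


worker = threading.Thread(target=shipping_worker)
worker.start()
for order_id in (101, 102, 103):
    place_order(order_id)
events.put(None)
worker.join()
```

Because the two sides share only the event format, either one can be replaced or scaled without changing the other.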

Logging and Metrics

Recording system events and summarizing them into metrics to identify performance bottlenecks.

Pizza Shop Analogy

Keeping a detailed ledger of what time orders were placed, how long they took, and tracking who was working when the oven malfunctioned.

Real World System

Implementing tools like Prometheus (metrics) or the ELK stack (logs) to monitor system health. If a server's 'oven' (database) is slow, you can quickly identify the bottleneck from the metrics and logs.
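
A minimal sketch of the same idea using only the standard library: each operation is timed, logged, and recorded so the slowest one can be found. The timed helper and the operation names are hypothetical, and time.sleep stands in for real work; tools like Prometheus or ELK do this collection at much larger scale.

```python
import logging
import time
from collections import defaultdict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pizza")

latencies = defaultdict(list)  # operation name -> list of durations in seconds


def timed(operation: str, func, *args):
    """Run func, then log and record how long it took."""
    start = time.perf_counter()
    try:
        return func(*args)
    finally:
        elapsed = time.perf_counter() - start
        latencies[operation].append(elapsed)
        logger.info("%s took %.4f s", operation, elapsed)


def slowest_operation() -> str:
    """Return the operation with the highest average latency."""
    return max(latencies, key=lambda op: sum(latencies[op]) / len(latencies[op]))


timed("cache_lookup", time.sleep, 0.001)  # simulated fast operation
timed("db_query", time.sleep, 0.05)       # simulated slow operation
bottleneck = slowest_operation()
```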

Extensibility

Design the system to be flexible so it can adapt to new business requirements without needing a total rewrite.

Pizza Shop Analogy

Designing your kitchen layout so that you can easily add a new deep fryer for wings later without having to rebuild the entire building.

Real World System

Writing modular code so that adding a new feature (like supporting a new payment method) doesn't require rewriting the codebase, ensuring the system grows with business needs.
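
One way to get this kind of modularity is a registry of handlers, sketched below. The payment_method decorator, the handler names, and the returned strings are all hypothetical; the point is that adding a payment method means adding a function, while checkout stays unchanged.

```python
from typing import Callable, Dict

PAYMENT_HANDLERS: Dict[str, Callable[[float], str]] = {}


def payment_method(name: str):
    """Decorator that registers a handler under a payment method name."""
    def register(func):
        PAYMENT_HANDLERS[name] = func
        return func
    return register


@payment_method("card")
def pay_by_card(amount: float) -> str:
    return f"charged {amount:.2f} to card"


# Added later: a new method is a new function, and checkout() is untouched.
@payment_method("upi")
def pay_by_upi(amount: float) -> str:
    return f"requested {amount:.2f} via UPI"


def checkout(method: str, amount: float) -> str:
    try:
        handler = PAYMENT_HANDLERS[method]
    except KeyError:
        raise ValueError(f"unsupported payment method: {method}") from None
    return handler(amount)


receipt = checkout("upi", 499)
```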