Concept 02

HTTP Protocol

The foundational set of rules for transferring web data, enabling client-server communication via a request-response model.

01

Overview & Characteristics

HTTP (Hypertext Transfer Protocol) is typically built on top of TCP/IP and coordinates how text, images, JSON, and other formats are moved between a web server and a browser.

Security Note: While HTTP sends data in plain text, HTTPS adds a crucial layer of TLS/SSL encryption.

Stateless

  • No memory of past interactions. Every request is completely independent.
  • On each request, all necessary metadata must be provided.
  • Simple to build and highly scalable as the server doesn't need to track state across requests.

🤝 Client-Server Model

  • Usually a web browser or app initiates communication by sending a request.
  • Communication is strictly initiated by the client. The server simply responds.
02

The Evolution of HTTP

HTTP/0.9 & HTTP/1.0 (1991, 1996)

The very early web. HTTP/1.0 introduced the concept of HTTP headers and status codes. Every request required a new, completely discrete TCP connection.

HTTP/1.1 (1997)

Introduced Persistent Connections (Keep-Alive). A single TCP connection could remain open to serve multiple HTTP requests, massively improving performance.

HTTP/2 (2015)

Introduced multiplexing, which allows multiple requests and responses to be in flight at the same time over a single TCP connection. Also added header compression (HPACK).

HTTP/3 (2019/2022)

A major architectural shift. It replaces TCP with QUIC (which runs over UDP), reducing connection setup latency and removing the transport-level head-of-line blocking that affects HTTP/2 over TCP.

03

HTTP Headers Deep Dive

Client → Server

Request Headers

User-Agent

Identifies the client software originating the request (e.g. Chrome Browser, Postman, a mobile app).

Authorization

Sends credentials, often JWTs or bearer tokens, to authenticate the user.

Cookie

Contains previously stored cookies sent back to the server.

Accept

Informs the server what kind of content format the client is expecting to receive (e.g. application/json).

Both Ways

General Headers

Date
Cache-Control
Connection

Payload Info

Representation Headers

Content-Type
Content-Length
Content-Encoding
ETag

Server → Client Guidelines

🔐 Security Headers

Strict-Transport-Security

(HSTS): Ensures the client only communicates with the server over HTTPS, actively preventing protocol downgrade attacks.

Content-Security-Policy

(CSP): Restricts the sources from which content like JavaScript, CSS, and images can be loaded, helping prevent cross-site scripting (XSS) attacks.

X-Frame-Options

Prevents the web page from being embedded in an iframe, mitigating clickjacking attacks.

X-Content-Type-Options

Ensures that the browser does not try to guess the MIME type of the content, preventing MIME type sniffing attacks.

Set-Cookie

Sets a cookie on the client. Its HttpOnly attribute makes the cookie inaccessible to JavaScript, and its Secure attribute ensures the cookie is sent only over encrypted HTTPS connections.
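A minimal sketch of how a backend might assemble the security headers described above. The function names and the specific policy values (for example default-src 'self' and DENY) are illustrative assumptions, not recommendations from the original notes.

```python
def security_headers(max_age_seconds=31536000):
    """Return a baseline set of security response headers (illustrative values)."""
    return {
        # HSTS: tell the browser to use HTTPS only for the given period
        "Strict-Transport-Security": f"max-age={max_age_seconds}; includeSubDomains",
        # CSP: only load scripts, styles, images, etc. from the page's own origin
        "Content-Security-Policy": "default-src 'self'",
        # Refuse to be rendered inside any iframe
        "X-Frame-Options": "DENY",
        # Disable MIME type sniffing
        "X-Content-Type-Options": "nosniff",
    }


def session_cookie(name, value):
    """Build a Set-Cookie value carrying the Secure and HttpOnly attributes."""
    return f"{name}={value}; Path=/; Secure; HttpOnly"
```

A server would merge security_headers() into every response and use session_cookie() as the value of a Set-Cookie header.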

💡 Key Concepts of HTTP Headers

1. Extensibility

HTTP is highly extensible because headers can be added and customized easily. Developers can create Custom Headers (e.g., X-Custom-Header) to pass application-specific metadata.

2. Remote Control

Headers act like a remote control for how the other side handles a message:

  • Client can request specific formats via headers.
  • Server can use headers for controls like Cache-Control (defining how long a resource should be cached by the client).
04

Real World Example (Request & Response)

REQUEST
PUT /api/users/12345 HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/93.0.4577.82 Safari/537.36
Content-Type: application/json
Content-Length: 123
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Accept: application/json
Accept-Encoding: gzip, deflate, br
Connection: keep-alive
Referer: https://example.com/dashboard
Cookie: sessionId=abc123xyz456; lang=en-US

// the blank line above marks the end of the headers; the body follows

{
  "firstName": "John",
  "lastName": "Doe",
  "email": "john.doe@example.com",
  "age": 30
}
RESPONSE
HTTP/1.1 200 OK
Date: Fri, 20 Sep 2024 12:00:00 GMT
Content-Type: application/json
Content-Length: 85
Server: Apache/2.4.41 (Ubuntu)
Cache-Control: no-store
X-Request-ID: abcdef123456
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
Set-Cookie: sessionId=abc123xyz456; Path=/; Secure; HttpOnly
Vary: Accept-Encoding
Connection: keep-alive

// the blank line above marks the end of the headers; the body follows

{
  "message": "User updated successfully",
  "userId": 12345,
  "status": "success"
}
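To make the message layout concrete, here is a rough sketch of how an HTTP/1.1 message could be split into its start line, headers, and body. The helper name parse_http_message and the shortened request are assumptions for illustration; real servers rely on a hardened parser.

```python
def parse_http_message(raw):
    """Split a raw HTTP/1.1 message into (start_line, headers, body)."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


raw_request = (
    b"PUT /api/users/12345 HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/json\r\n"
    b"Content-Length: 20\r\n"
    b"\r\n"
    b'{"firstName":"John"}'
)
start_line, headers, body = parse_http_message(raw_request)
method, target, version = start_line.split(" ")
```

Here Content-Length matches the number of bytes in the body, which is how the receiver knows where the message ends.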
05

HTTP Methods

The Method defines the Intent of Communication. It tells the server what the client wants to do with the target resource.

GET
POST
PUT
PATCH
DELETE
Key Difference

PUT vs PATCH

PUT

Full Replacement. Replaces the entire resource with the payload provided. If a field is omitted in the request, it typically gets removed or set to null.

PATCH

Partial Update. Modifies only the specific fields provided in the payload. The untouched fields remain exactly as they were.
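The difference can be sketched with plain dictionaries. The functions apply_put and apply_patch are hypothetical helpers, and the shallow merge used for PATCH is only one possible patch convention.

```python
def apply_put(stored, payload):
    """PUT: the payload becomes the entire new representation."""
    return dict(payload)


def apply_patch(stored, payload):
    """PATCH: only the supplied fields change; the others are kept."""
    return {**stored, **payload}


user = {"firstName": "John", "lastName": "Doe", "age": 30}
after_put = apply_put(user, {"firstName": "Jane"})      # lastName and age are gone
after_patch = apply_patch(user, {"firstName": "Jane"})  # lastName and age survive
```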

Idempotent vs Non-Idempotent

Idempotent: GET, PUT, DELETE

Repeating the request any number of times leaves the server in the same state as sending it once.

Non-Idempotent: POST

Doing it multiple times produces different results (e.g., submitting a POST request twice creates two new identical objects).
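A small in-memory sketch of the same idea. UserStore is a hypothetical stand-in for a real backend, used only to show how repeated calls affect state.

```python
import itertools


class UserStore:
    """Toy resource store that mimics POST, PUT, and DELETE semantics."""

    def __init__(self):
        self.users = {}
        self._ids = itertools.count(1)

    def post(self, payload):
        user_id = next(self._ids)          # every POST allocates a new id
        self.users[user_id] = dict(payload)
        return user_id

    def put(self, user_id, payload):
        self.users[user_id] = dict(payload)

    def delete(self, user_id):
        self.users.pop(user_id, None)      # deleting a missing user is a no-op


store = UserStore()
store.post({"name": "John"})
store.post({"name": "John"})       # identical POST, yet a second user is created
count_after_posts = len(store.users)

store.put(1, {"name": "Jane"})
store.put(1, {"name": "Jane"})     # repeating the PUT changes nothing further
store.delete(2)
store.delete(2)                    # repeating the DELETE changes nothing further
```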

OPTIONS & CORS

The OPTIONS method is used to discover what a server permits, most notably during Cross-Origin Resource Sharing (CORS). By default, browsers restrict frontend code from reading responses that come from a different origin (a different scheme, domain, or port) to prevent malicious behavior.

In CORS there are two main types of flows:
  • 1. Simple Request: Sent directly to the server without an initial pre-check.
  • 2. Preflighted Request: An OPTIONS request is sent first to ask the server whether it permits the actual request before the state-changing operation (like a PUT, or a POST with a JSON body) is performed.
example.com → api.anotherdomain.com

GET /api/products/123 HTTP/1.1
Host: api.anotherdomain.com
Origin: https://example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

{ "product": { "id": 123, "name": "Example Product", "price": 29.99 } }

The response arrives, but the browser blocks the frontend from reading it because the Access-Control-Allow-Origin header is missing.

HTTP/1.1 200 OK
Content-Type: application/json
Access-Control-Allow-Origin: https://example.com

{ "product": { "id": 123, "name": "Example Product", "price": 29.99 } }

The server explicitly allows the origin via the Access-Control-Allow-Origin header. The frontend can now read the data safely.

🛡️ Preflighted Request Example & Triggers

A preflight OPTIONS request is triggered if any of the following are true:

  • The method is NOT GET, POST, or HEAD (e.g., using PUT, DELETE).
  • The request includes non-simple headers (like Authorization, X-Custom-Header).
  • The request has a Content-Type other than application/x-www-form-urlencoded, multipart/form-data, or text/plain (e.g., the widely used application/json).
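The trigger rules above can be approximated in a few lines. This is a simplified sketch: needs_preflight is a hypothetical helper, it only looks at headers set by application code (not browser-managed ones such as Host or Origin), and real browsers apply extra conditions that are omitted here.

```python
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SAFELISTED_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, headers):
    """Return True if a cross-origin request would trigger an OPTIONS preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SAFELISTED_HEADERS:
            return True                      # e.g. Authorization, X-Custom-Header
        if lowered == "content-type":
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True                  # e.g. application/json
    return False
```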
Preflight Flow

1. Browser asks (OPTIONS)

OPTIONS /api/resource HTTP/1.1
Host: api.anotherdomain.com
Origin: https://example.com
Access-Control-Request-Method: PUT
Access-Control-Request-Headers: Authorization

2. Server grants permission

HTTP/1.1 204 No Content
Access-Control-Allow-Origin: https://example.com
Access-Control-Allow-Methods: PUT, DELETE
Access-Control-Allow-Headers: Authorization
Access-Control-Max-Age: 86400

Once the server approves, the browser automatically sends the actual PUT request. This flow is common because modern APIs rely heavily on application/json payloads and Authorization headers, both of which trigger a preflight.
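On the server side, answering a preflight could look roughly like this. The allow-lists and the handle_preflight function are assumptions for illustration, and returning 403 for a rejected preflight is one possible choice; some servers simply omit the CORS headers instead.

```python
ALLOWED_ORIGINS = {"https://example.com"}
ALLOWED_METHODS = {"PUT", "DELETE"}
ALLOWED_HEADERS = {"authorization"}


def handle_preflight(request_headers):
    """Return (status, response_headers) for an OPTIONS preflight request."""
    origin = request_headers.get("Origin", "")
    method = request_headers.get("Access-Control-Request-Method", "")
    requested = {
        h.strip().lower()
        for h in request_headers.get("Access-Control-Request-Headers", "").split(",")
        if h.strip()
    }
    allowed = (
        origin in ALLOWED_ORIGINS
        and method in ALLOWED_METHODS
        and requested <= ALLOWED_HEADERS     # every requested header must be allowed
    )
    if not allowed:
        return 403, {}
    return 204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(sorted(ALLOWED_METHODS)),
        "Access-Control-Allow-Headers": ", ".join(sorted(ALLOWED_HEADERS)),
        "Access-Control-Max-Age": "86400",   # browser may cache this answer for a day
    }


status, response_headers = handle_preflight({
    "Origin": "https://example.com",
    "Access-Control-Request-Method": "PUT",
    "Access-Control-Request-Headers": "Authorization",
})
```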
06

HTTP Status Codes

When a client makes an HTTP request, the server responds with a 3-digit status code. These codes indicate whether a specific request has been successfully completed, and they are divided into 5 distinct classes. Importantly, these are universal—they mean the exact same thing whether you're using Node.js, Python, Java, or Go.

1xx Informational

Indicates that the request was received and the server is continuing the process. You rarely have to handle these directly.

  • 100 Continue
  • 101 Switching Protocols (e.g., upgrading to WebSockets)

2xx Success

The action was successfully received, understood, and accepted.

  • 200 OK: Standard response for successful requests.
  • 201 Created: The request succeeded and a new resource was created as a result (commonly used for POST).
  • 204 No Content: The request succeeded, but there is no content to send in the payload (commonly used for DELETE).

3xx Redirection

Further action must be taken by the client to complete the request (usually handled automatically by browsers).

  • 301 Moved Permanently: The URL of the requested resource has been changed permanently.
  • 304 Not Modified: Used for caching. Tells the client the response has not been modified, so the client can continue to use the same cached version.

4xx Client Errors

The request contains bad syntax or cannot be fulfilled. This means the frontend/client made a mistake.

  • 400 Bad Request: The server cannot or will not process the request due to perceived client error.
  • 401 Unauthorized: Authentication is required and has failed or has not yet been provided.
  • 403 Forbidden: The server understood the request but refuses to authorize it. Typically the client is authenticated but does not have permission to access the requested resource.
  • 404 Not Found: The server cannot find the requested resource.
  • 429 Too Many Requests: The user has sent too many requests in a given amount of time (Rate Limiting).

5xx Server Errors

The server failed to fulfill an apparently valid request. This means the backend made a mistake or crashed.

  • 500 Internal Server Error: A generic error message, given when an unexpected condition was encountered and no more specific message is suitable.
  • 502 Bad Gateway: The server, while acting as a gateway or proxy, received an invalid response from the upstream server.
  • 503 Service Unavailable: The server is not ready to handle the request, usually because it's overloaded or down for maintenance.
🏆 Best Practices for Developers
  • Always return the most specific status code possible (e.g., return 404 not 400 if a user is looking for a missing ID).
  • Don't return a 200 OK with an error message inside the JSON payload. If it's an error, use a 4xx or 5xx code.
  • Use 201 Created instead of 200 OK when returning from a successful POST request that creates a database entry.
  • Remember that status codes are framework-agnostic. Whether you use Node.js/Express, Python/Django, Spring Boot, or Laravel, the HTTP spec remains exactly identical.
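As a rough illustration of these practices, the sketch below uses Python's standard http.HTTPStatus enum. The handlers get_user and create_user and the in-memory users dictionary are hypothetical.

```python
from http import HTTPStatus


def status_class(code):
    """Map a status code to the name of its class (1xx to 5xx)."""
    classes = {
        1: "Informational",
        2: "Success",
        3: "Redirection",
        4: "Client Error",
        5: "Server Error",
    }
    return classes[code // 100]


def get_user(users, user_id):
    if user_id not in users:
        # A missing ID is 404, not a generic 400 and never a 200 with an error body
        return HTTPStatus.NOT_FOUND, {"error": "user not found"}
    return HTTPStatus.OK, users[user_id]


def create_user(users, payload):
    if "email" not in payload:
        return HTTPStatus.BAD_REQUEST, {"error": "email is required"}
    new_id = max(users, default=0) + 1
    users[new_id] = payload
    # A successful POST that creates a resource answers 201, not 200
    return HTTPStatus.CREATED, {"id": new_id}
```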
07

HTTP Caching

HTTP Caching is the practice of storing a copy of a given resource and serving it back when requested. When a web cache has a requested resource in its store, it intercepts the request and returns its copy instead of redownloading the resource from the originating server.

This significantly improves performance by reducing response times, decreasing network traffic, and lowering the load on servers.

Mechanisms

⚙️ How Caching is Done

Cache-Control

The primary header used to specify caching policies in both client requests and server responses.

  • max-age=<seconds>: Specifies the maximum amount of time a resource will be considered fresh.
  • no-cache: Forces caches to submit the request to the origin server for validation before releasing a cached copy.
  • no-store: The cache should not store anything about the client request or server response.
  • public / private: Indicates whether the response may be cached by any cache or only by the user's browser cache.
ETag / Last-Modified

Used for Validation. When a cached resource expires (exceeds max-age), the client can ask the server if the cache is still valid.

  • ETag: An opaque identifier assigned by the web server to a specific version of a resource. The client sends it back via If-None-Match. If the ETag matches, server returns 304 Not Modified.
  • Last-Modified: Indicates the date and time at which the origin server believes the resource was last modified. Client sends it back via If-Modified-Since.
🔄 The Caching Flow
  1. Is it in cache? If no, fetch from server. If yes, go to step 2.
  2. Is it fresh? (Based on max-age). If yes, serve directly from cache (fastest). If expired, go to step 3.
  3. Is it still valid? The client makes a conditional request (e.g., using If-None-Match with the ETag).
  4. Server response:
    • If unchanged: Returns 304 Not Modified (empty body). Client updates expiration time and uses cached copy.
    • If changed: Returns 200 OK with the new resource body and new cache directives.
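The four steps above can be sketched as a small simulation. Everything here is hypothetical scaffolding: origin_server stands in for a real server, time is passed in as a plain number, and a fixed max_age of 60 seconds is assumed.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    body: bytes
    etag: str
    expires_at: float


def origin_server(resource, if_none_match=None):
    """Hypothetical origin: 304 if the client's ETag still matches, else 200."""
    if if_none_match == resource["etag"]:
        return 304, b"", resource["etag"]
    return 200, resource["body"], resource["etag"]


def fetch(cache, url, resource, now, max_age=60.0):
    """Return (source, body), following the caching flow step by step."""
    entry = cache.get(url)
    if entry is None:                                    # step 1: not in cache
        _, body, etag = origin_server(resource)
        cache[url] = CacheEntry(body, etag, now + max_age)
        return "origin", body
    if now < entry.expires_at:                           # step 2: still fresh
        return "cache", entry.body
    status, body, etag = origin_server(resource, if_none_match=entry.etag)  # step 3
    if status == 304:                                    # step 4: unchanged
        entry.expires_at = now + max_age
        return "revalidated", entry.body
    cache[url] = CacheEntry(body, etag, now + max_age)   # step 4: changed
    return "origin", body
```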
08

Content Negotiation

Content Negotiation is the mechanism that makes it possible to serve different versions of a document (or resource) at the same URI, so that user agents (like browsers) can specify which version fits their capabilities best.

🖥️ Server-Driven

The client sends headers (like Accept, Accept-Language, or Accept-Encoding) to tell the server its preferences. The server decides which version to send back.

Accept: application/json
Accept-Language: en-US, fr;q=0.9
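A minimal sketch of how a server might rank these preferences by their q values. The helpers parse_accept and choose are hypothetical, and wildcards such as */* are deliberately ignored to keep the example short.

```python
def parse_accept(header):
    """Parse an Accept-style header into (value, q) pairs, highest q first."""
    items = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        value, q = pieces[0], 1.0            # q defaults to 1.0 when omitted
        for param in pieces[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        if value:
            items.append((value, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose(header, available, default):
    """Pick the client's most preferred value that the server can provide."""
    for value, q in parse_accept(header):
        if q > 0 and value in available:
            return value
    return default
```

For example, with Accept-Language: en-US, fr;q=0.9 and a server that only offers fr and de, choose() would select fr.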

👤 Agent-Driven

The server sends back an initial response containing a list of available resources (often as alternative links). The client/agent then chooses the best one and makes a second request for it.

Less common due to latency

👻 Transparent

An intermediate cache (like a CDN or proxy) performs the negotiation on behalf of the origin server. It reduces load on the main server while efficiently choosing the right format.

Client ↔ Cache ↔ Server
09

HTTP Compression

HTTP Compression (using formats like gzip or deflate) significantly reduces the size of server responses, saving bandwidth and massively improving load times.

⚙️ The Mechanism

  1. The client sends a request indicating what compression it understands:
    Accept-Encoding: gzip, deflate
  2. If the server supports it, it dynamically compresses the payload and responds with:
    Content-Encoding: gzip
  3. The browser receives the compressed bytes and automatically decompresses them before rendering or passing to JavaScript.
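The three steps above can be sketched with Python's standard gzip module. The function names are assumptions, and the Accept-Encoding check is simplified (it ignores q values such as gzip;q=0).

```python
import gzip


def maybe_compress(body, accept_encoding):
    """Server side: gzip the body only if the client advertised support."""
    encodings = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    if "gzip" in encodings:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


def decode_body(body, headers):
    """Client side: undo the encoding named in Content-Encoding."""
    if headers.get("Content-Encoding") == "gzip":
        return gzip.decompress(body)
    return body


original = b'{"status": "success"}' * 200
sent, response_headers = maybe_compress(original, "gzip, deflate")
received = decode_body(sent, response_headers)
```

Repetitive text such as JSON compresses well, so sent is much smaller than original while received is byte-for-byte identical to it.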

📉 Real-World Impact

26 MB Uncompressed
3.8 MB Gzipped

In this example, gzip cuts the transfer size by roughly 85%, which shortens load times and lowers bandwidth costs.

10

Persistent Connections (Keep-Alive)

Establishing a TCP connection (the 3-Way Handshake) is relatively slow and resource-intensive.

  • Evolution: Early HTTP (1.0) was inefficient because, by default, it closed the TCP connection after every single request. HTTP/1.1 made persistent connections the default.
  • Benefit: Multiple requests and responses can be sequentially sent over a single, open TCP connection. This eliminates the overhead of establishing new connections constantly.
  • Control: In HTTP/1.0, a client opts in with the header Connection: keep-alive. In HTTP/1.1, connections stay open by default, and either side can send Connection: close to end the connection after the current response.
HTTP/1.0 (Without Keep-Alive)
[Connect] → [Req/Res] → [Close][Connect] → [Req/Res] → [Close]
HTTP/1.1 (With Keep-Alive)
[Connect] → [Req/Res] → [Req/Res] → [Req/Res] ...
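A back-of-the-envelope model of the saving, assuming the TCP handshake costs about one round trip and each request/response costs one more. It ignores TLS setup, server processing, and transfer time, and the 50 ms round-trip time is an arbitrary example value.

```python
def total_time_ms(requests, rtt_ms, keep_alive):
    """Estimate total latency for sequential requests over TCP."""
    handshakes = 1 if keep_alive else requests   # one per connection opened
    return handshakes * rtt_ms + requests * rtt_ms


without_keep_alive = total_time_ms(10, 50, keep_alive=False)   # 10 connects + 10 requests
with_keep_alive = total_time_ms(10, 50, keep_alive=True)       # 1 connect + 10 requests
```

Under these assumptions, ten sequential requests take 1000 ms without Keep-Alive and 550 ms with it.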
11

Handling Large Data Transfers

Buffering 10 GB of video data as a single in-memory HTTP message body could exhaust the server's memory. To handle large transfers reliably, HTTP offers ways to split and stream the payload.

⬆️ Multipart Data (Uploads)

Used heavily for uploading large files (images/videos) from the client to the server via web forms.

Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7M

This technique splits binary data and textual strings into isolated parts separated by a unique "boundary string", enabling the backend to cleanly parse and reconstruct the file components.
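To show what the boundary-separated body looks like on the wire, here is a hedged sketch that builds one by hand. The encode_multipart helper, the boundary string, and the file contents are made up for illustration; in practice an HTTP client library generates this for you.

```python
def encode_multipart(fields, files, boundary):
    """Build a multipart/form-data body.

    files maps a field name to a (filename, content_type, data) tuple.
    """
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n'
                f"\r\n"
                f"{value}\r\n"
            ).encode()
        )
    for name, (filename, content_type, data) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n"
            f"\r\n"
        ).encode()
        parts.append(header + data + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode())   # closing delimiter
    return b"".join(parts)


boundary = "----ExampleBoundary123"
body = encode_multipart(
    {"title": "Holiday"},
    {"photo": ("beach.png", "image/png", b"\x89PNG...")},
    boundary,
)
content_type = f"multipart/form-data; boundary={boundary}"
```

Each part starts with two hyphens plus the boundary, and the final delimiter adds two trailing hyphens so the parser knows no more parts follow.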

⬇️ Streaming / Chunked (Downloads)

Used to send enormous responses from the server to the client without buffering it entirely in server memory first.

Transfer-Encoding: chunked
Content-Type: text/event-stream

The server transmits data continuously in sequential chunks, each prefixed with its size. The persistent connection stays open until the server sends a final zero-length chunk to signal the end of the response.
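The chunked wire format is simple enough to sketch directly: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the stream. The helpers below are illustrative only and skip optional features such as chunk extensions and trailers.

```python
def encode_chunked(chunks):
    """Yield the wire bytes for each chunk, then the zero-length terminator."""
    for chunk in chunks:
        if chunk:   # skip empty chunks: a zero size would end the stream early
            yield f"{len(chunk):x}\r\n".encode() + chunk + b"\r\n"
    yield b"0\r\n\r\n"


def decode_chunked(data):
    """Reassemble the original body from chunked wire bytes."""
    body, pos = b"", 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)      # chunk size is hexadecimal
        if size == 0:
            return body                         # final chunk reached
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2                  # skip the CRLF after the data


wire = b"".join(encode_chunked([b"Hello, ", b"", b"world!"]))
decoded = decode_chunked(wire)
```

Because encode_chunked is a generator, a server can produce and send one chunk at a time without holding the whole response in memory.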

12

HTTPS, SSL, and TLS

🔐

HTTPS (Hypertext Transfer Protocol Secure) is simply standard HTTP requests secured and encrypted by Transport Layer Security (TLS).

Core Functions

  • 01 Encryption: Data is mathematically scrambled in transit so intermediate devices (routers/ISPs/hackers) cannot eavesdrop on passwords or data payloads.
  • 02 Authentication: Ensures the server the client is connecting to is genuinely who it claims to be, validated through public/private key pairs and digital certificates.
  • 03 Integrity: Cryptographically guarantees the payload was not maliciously modified or corrupted while traversing the internet.

SSL vs. TLS Standard

People colloquially say "SSL Certificate", but SSL (Secure Sockets Layer) is entirely obsolete, deprecated, and insecure.

The modern protocol that secures HTTPS is TLS. TLS 1.3 is the current version; it is faster to set up than its predecessors and is the recommended configuration today.
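In Python, a client can enforce this with the standard ssl module. This is a minimal sketch; choosing TLS 1.2 as the minimum version is an assumption made here for compatibility, and a stricter deployment could require TLS 1.3 instead.

```python
import ssl


def make_client_context():
    """Create a TLS client context that rejects SSL and early TLS versions."""
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    # Refuse anything older than TLS 1.2 (assumed floor for this sketch).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
```

The resulting context can be passed to standard-library clients, for example as the context argument of http.client.HTTPSConnection.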