Authentication & Authorization
Understanding the mechanisms to verify user identity and manage access controls in modern backend systems.
AuthN vs. AuthZ
Authentication (AuthN)
The process of verifying who you are in a given context (like a platform or OS).
Authorization (AuthZ)
The process of determining what you are allowed to do and which resources you have permission to access.
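A minimal sketch of the distinction, assuming a hypothetical in-memory user table and role-to-permission map (the names `USERS`, `PERMISSIONS`, `authenticate`, and `authorize` are illustrative, not from any library). Passwords are kept in plain text here only to keep the sketch short.

```python
from typing import Optional

# Hypothetical data. A real system would store password hashes, never raw passwords.
USERS = {
    "alice": {"password": "correct-horse", "role": "admin"},
    "bob": {"password": "battery-staple", "role": "viewer"},
}
PERMISSIONS = {"admin": {"read", "delete"}, "viewer": {"read"}}


def authenticate(username: str, password: str) -> Optional[str]:
    """AuthN: 'who are you?'. Returns the username if the credentials match."""
    user = USERS.get(username)
    if user is not None and user["password"] == password:
        return username
    return None


def authorize(username: str, action: str) -> bool:
    """AuthZ: 'what may you do?'. Checks the permission set of the user's role."""
    role = USERS[username]["role"]
    return action in PERMISSIONS.get(role, set())
```

Here `authenticate` runs once at login, while `authorize` runs on every protected action for an already identified user.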
The Historical Evolution
Authentication hasn't always been about database tables and passwords. The way people establish trust has evolved alongside technology.
🤝 Pre-industrial
Authentication was implicit, based on personal familiarity and face-to-face recognition (e.g., village elders vouching in person for a stranger).
📜 Medieval Period
Explicit authentication grew largely out of long-distance trade. To deter forgery, people relied on physical tokens and agreed conventions: wax seals, watermarks, and coded messages.
🚂 Industrial Revolution
The telegraph forced a major shift. Because a physical seal or document cannot be inspected over a wire, trust moved to the principle of "something you know" (shared secrets or static passphrases).
💻 Computational Era (1961)
Early time-sharing mainframes at MIT needed separate accounts for multiple users, which gave rise to the digital password. Initially, passwords were simply stored unhashed in a plain-text file.
🔐 Cryptographic Advancements
Cryptographic hash functions made it possible to store an irreversible, fixed-length representation of a password instead of the raw text. The 1970s introduced asymmetric (public-key) cryptography, starting with Diffie-Hellman key exchange, and the 1980s brought ticket-based systems like Kerberos. Both are precursors to the token systems we use today.
Core Components of Modern Auth
🌐 HTTP is Inherently Stateless
By design, HTTP treats every request as independent. The protocol itself carries no memory of past exchanges, so once a user logs in, their next click looks like it came from a stranger unless state is explicitly attached to the request.
Sessions
To support stateful interactions (like a shopping cart), the server creates a "Session". A unique, random Session ID is generated upon successful login and sent to the client, which returns it with each later request.
Key Constraint: The server must look up the session in a database or cache on every request, which adds latency and a shared dependency in large-scale systems.
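The session mechanism can be sketched as follows, assuming a plain dictionary stands in for the session database or cache. The function names and the 30-minute lifetime are illustrative assumptions.

```python
import secrets
import time
from typing import Dict, Optional

SESSION_TTL_SECONDS = 30 * 60  # assumed lifetime, not taken from the text

# Stand-in for the server-side session store (a database or cache in practice).
_sessions: Dict[str, dict] = {}


def create_session(user_id: str) -> str:
    """Called after a successful login; returns the random Session ID for the client."""
    session_id = secrets.token_urlsafe(32)
    _sessions[session_id] = {
        "user_id": user_id,
        "expires_at": time.time() + SESSION_TTL_SECONDS,
    }
    return session_id


def get_user_for_session(session_id: str) -> Optional[str]:
    """The per-request lookup: this is the cost the constraint above refers to."""
    record = _sessions.get(session_id)
    if record is None or record["expires_at"] < time.time():
        _sessions.pop(session_id, None)  # drop expired entries
        return None
    return record["user_id"]


def destroy_session(session_id: str) -> None:
    """Logout or forced revocation: deleting the entry takes effect immediately."""
    _sessions.pop(session_id, None)
```

Because the server owns the record, deleting it logs the user out on the very next request.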
JSON Web Tokens (JWT)
A compact, standardized format for transferring claims between parties. JWTs are self-contained: the token string itself carries the user data (IDs, roles) together with a cryptographic signature. The payload is only base64url-encoded, not encrypted, so it should never hold secrets.
Scales Well: Removes the need for server-side session storage and a per-request database lookup, because the server only has to verify the signature.
The Delivery Vehicle: Cookies
A Cookie is the browser's built-in mechanism that lets the backend server store small pieces of data (like a Session ID or a JWT) in the user's browser. Once the server sets it via the Set-Cookie response header, the browser automatically sends that cookie back with every subsequent request to the same site, creating the illusion of a continuous, logged-in state without manual frontend work. How safely it travels depends on attributes such as HttpOnly, Secure, and SameSite.
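As an illustration, the sketch below builds a Set-Cookie header value and reads the cookie back from a request's Cookie header using Python's standard `http.cookies` module. The cookie name `session_id` and the chosen attribute values are assumptions for the example.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_session_cookie(session_id: str, max_age_seconds: int = 1800) -> str:
    """Return the value for a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["httponly"] = True   # not readable from page JavaScript
    morsel["secure"] = True     # only sent over HTTPS
    morsel["samesite"] = "Lax"  # limits cross-site sending
    morsel["path"] = "/"
    morsel["max-age"] = max_age_seconds
    return morsel.OutputString()


def read_session_cookie(cookie_header: str) -> Optional[str]:
    """Extract the session ID from the Cookie request header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None
```

The frontend never has to handle the value: the browser stores it from the response header and replays it on later requests.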
JWT Limitations & Hybrid Approach
⚠️ No Instant Revocation
Because JWTs are self-contained, the server does not track them in a database. This creates a security challenge: if a token is stolen or a user is banned, you cannot instantly revoke that token. It stays valid on every service that trusts the signature until it expires.
🔄 The Hybrid Architecture
To solve the invalidation problem without sacrificing scalability, modern applications use the Hybrid Approach:
- Access Token (DB-Free): A short-lived JWT (e.g., 15 minutes) sent with every request and verified statelessly, with no database lookup.
- Refresh Token (Stateful): A long-lived token stored in an HttpOnly, Secure cookie and tracked in the database so it can be revoked. Used only to call the endpoint that issues fresh Access Tokens.
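The refresh side of the hybrid approach might look like the sketch below. A dictionary stands in for the database table, and the 30-day lifetime, the hashed storage, and the single-use rotation are assumptions added for illustration rather than requirements stated above. The access token is represented only by the claims it would carry.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple

ACCESS_TTL_SECONDS = 15 * 60            # short-lived access token, as in the text
REFRESH_TTL_SECONDS = 30 * 24 * 3600    # assumed 30-day refresh lifetime

# Stand-in for a database table: sha256(refresh token) -> record.
_refresh_store: Dict[str, dict] = {}


def _digest(token: str) -> str:
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def issue_refresh_token(user_id: str, now: Optional[float] = None) -> str:
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _refresh_store[_digest(token)] = {
        "user_id": user_id,
        "expires_at": now + REFRESH_TTL_SECONDS,
    }
    return token


def refresh(refresh_token: str, now: Optional[float] = None) -> Optional[Tuple[dict, str]]:
    """Trade a valid refresh token for new access-token claims and a rotated refresh token."""
    now = time.time() if now is None else now
    record = _refresh_store.pop(_digest(refresh_token), None)  # single use
    if record is None or record["expires_at"] <= now:
        return None
    access_claims = {"sub": record["user_id"], "exp": now + ACCESS_TTL_SECONDS}
    return access_claims, issue_refresh_token(record["user_id"], now)


def revoke_all(user_id: str) -> None:
    """Ban or logout-everywhere: the user can no longer mint new access tokens."""
    for key in [k for k, r in _refresh_store.items() if r["user_id"] == user_id]:
        del _refresh_store[key]
```

After `revoke_all`, any access token already issued still works until it expires, which is why its lifetime is kept to minutes.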
Types of Authentication
Session-Based
The traditional method. The server stores the session data and the client holds only a Session ID. Revocation is immediate (just delete the session entry, e.g., the key in Redis), but every request pays for a session-store lookup, which adds latency and can become a scaling bottleneck.
Token-Based
Uses signed tokens, typically JWTs. The client holds the claims and presents them with each request. Well suited to distributed microservice architectures because each service can replace a database lookup with a fast signature check.
API Key
A long-lived, high-entropy static string that identifies a client application. Used mainly for backend-to-backend calls (e.g., calling OpenAI or Stripe servers with no human user in the loop).
OAuth 2.0 & OpenID Connect
Before OAuth, the internet suffered from the Password Sharing Anti-Pattern. If Yelp wanted to scan your Google Contacts to find friends, Yelp asked you to type your Google password directly into Yelp's website and then ran a script that logged into Google as you.
Yelp now had full, open-ended access to your entire Google account. If Yelp got hacked, your Google account was compromised. You couldn't revoke just Yelp's access without changing your Google password.
OAuth 1.0 (2007)
Created to solve the password-sharing problem. Instead of passwords, it issued tokens and required clients to cryptographically sign their requests.
OAuth 2.0 (2012)
A rewrite that is not backward compatible with OAuth 1.0. It replaced per-request signatures with simple Bearer tokens sent over HTTPS. It dominates the industry today for Delegated Authorization.
⚙️ The Core Mechanism (How OAuth 2.0 works)
Let's trace how Spotify lets you "Import contacts from Google".
User clicks "Import from Google" on Spotify. Spotify redirects the User's browser to the Google Login Page asking for specific "Scopes" (e.g., read:contacts).
The User logs into Google directly, using their actual credentials. Google then prompts the User on-screen: "Spotify wants to view your contacts. Allow?"
If accepted, Google redirects the browser back to Spotify with a temporary Authorization Code appended to the URL. The code is short-lived and useless on its own, because it cannot be redeemed without Spotify's Client Secret.
Spotify's backend sends that Authorization Code, together with its own Client ID and Client Secret, directly to Google's backend (server-to-server) and receives the Access Token in exchange.
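The redirect and exchange steps above can be sketched as plain data, without any network calls. The endpoint URL, client ID, and redirect URI are placeholders, and the `state` parameter is an addition the steps above do not mention: the OAuth 2.0 specification recommends it to tie the callback to the original request.

```python
import secrets
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder values; a real client gets these when registering with the provider.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"
CLIENT_ID = "example-client-id"
REDIRECT_URI = "https://client.example.com/callback"


def build_authorization_url(scope: str) -> Tuple[str, str]:
    """Step 1: the URL the user's browser is redirected to, plus the state to remember."""
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": scope,
        "state": state,
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params), state


def parse_callback(callback_url: str, expected_state: str) -> Optional[str]:
    """Step 3: pull the Authorization Code out of the redirect back to the client."""
    query = parse_qs(urlparse(callback_url).query)
    if query.get("state", [None])[0] != expected_state:
        return None  # callback does not belong to a request we started
    return query.get("code", [None])[0]


def build_token_request(code: str, client_secret: str) -> dict:
    """Step 4: form fields the backend POSTs to the provider's token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "client_id": CLIENT_ID,
        "client_secret": client_secret,
    }
```

Note that the client secret appears only in the server-to-server request and never in a URL that passes through the browser.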
The 4 Players
- 👤 Resource Owner: You (the user)
- 📱 Client: Spotify (the app)
- 🛂 Authorization Server: Google's login and consent page
- 🗄️ Resource Server: Google Contacts API
OpenID Connect (OIDC)
Because OAuth 2.0 only covers authorization and does not define how to convey identity, the industry created OIDC (2014). It is an identity layer built on top of the OAuth 2.0 flow.
The ID Token is a standard JWT containing claims about the user (such as name, email, avatar). Once its signature is verified, it tells the client who the user is and which provider authenticated them. This is what powers "Sign in with Google" and "Sign in with Apple" buttons across the internet.
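Beyond the signature, a client is expected to check a few standard claims before trusting an ID Token. The sketch below assumes the signature has already been verified against the provider's published keys (not shown) and that `claims` is the decoded payload. The function name and example values are hypothetical.

```python
import time
from typing import Optional


def validate_id_token_claims(
    claims: dict,
    expected_issuer: str,
    expected_audience: str,
    now: Optional[float] = None,
) -> bool:
    """Check issuer, audience, and expiry of an already signature-verified ID Token."""
    current = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # "aud" may be a string or a list
    exp = claims.get("exp")
    return (
        claims.get("iss") == expected_issuer        # issued by the provider we expect
        and expected_audience in audiences          # issued for our client ID
        and isinstance(exp, (int, float))
        and current < exp                           # not expired
    )
```

The audience check matters because a token issued to a different app is still validly signed by the same provider.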
The Industry Standard Summary
OAuth 2.0: Delegated Authorization. Grants an app limited access to specific data on your behalf.
OpenID Connect: Federated Authentication. Verifies the user's identity via signed JWT ID Tokens.
Security Best Practices
Backend engineers are often the last line of defense against data breaches, so a solid grasp of defensive patterns is essential.
🚨 Generic Error Messages
Never return different messages such as "User not found" vs "Incorrect password". The distinction enables Username Enumeration: attackers can probe the login form to build a list of valid emails on your server. Return one generic message such as "Invalid email or password" for both cases.
⏱️ Mitigating Timing Attacks
In a timing attack, the attacker measures how long your server takes to respond in order to deduce whether a username exists (for example, because verifying a password hash for a real user can take on the order of 100 ms longer than rejecting an unknown username immediately).
Backend engineers should use constant-time comparison functions (like Node's native crypto.timingSafeEqual()) when comparing secrets, and make the failure path do the same work as the success path, for example by hashing against a dummy value when the user does not exist. Adding artificial delay to fast failures is another option, though random jitter alone can be averaged out over many requests.
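A minimal sketch combining the generic error message with timing mitigation, written in Python, where `hmac.compare_digest` plays the role of Node's `crypto.timingSafeEqual()`. The user table, the PBKDF2 iteration count, and the dummy-hash approach are illustrative assumptions. The dummy hash is swapped in here instead of injected jitter, so unknown users cost the same hashing work as known ones.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

ITERATIONS = 100_000  # illustrative work factor, tune for real deployments
GENERIC_ERROR = "Invalid email or password"


def _hash(password: str, salt: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


# Hypothetical user table: email -> (salt, password hash).
_salt = os.urandom(16)
USERS: Dict[str, Tuple[bytes, bytes]] = {
    "alice@example.com": (_salt, _hash("correct-horse", _salt)),
}

# Dummy credentials used when the email is unknown.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = _hash("not-a-real-password", _DUMMY_SALT)


def login(email: str, password: str) -> Tuple[bool, str]:
    """Return (success, message); both failure modes share one message and one code path."""
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = _hash(password, salt)                 # always pay the hashing cost
    matches = hmac.compare_digest(candidate, stored)  # constant-time comparison
    if email in USERS and matches:
        return True, "ok"
    return False, GENERIC_ERROR
```

A caller sees the same message and roughly the same latency whether the email was unknown or the password was wrong.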