EngineeringNotes
Module 05

Mastering PM2 & CI/CD

The bridge between "it works on my laptop" and real production.

01

What is PM2?

PM2 is a production process manager. Think of it as a smart supervisor sitting between your operating system and your application.

Linux OS → PM2 → Your App (Node/FastAPI)

It keeps your app alive forever, restarts it automatically if it crashes, handles logs, and allows for zero-downtime reloads.

02

The Real Problem

Without PM2

  • Terminal closed → App dies
  • SSH disconnected → App dies
  • App crashes → Dead forever
  • Server restarts → Never comes back
Terminal (bash)
node app.js

With PM2

  • Runs in background
  • Survives SSH close
  • Auto-restarts on crash
  • Auto-starts on OS reboot (after the setup in section 06)
Terminal (bash)
pm2 start app.js
03

How It Works Internally

This is the key concept. PM2 itself is a daemon (background service). When you start an app, PM2 forks a child process and keeps its PID (Process ID).

Process Structure
pm2 daemon
├── app-1 (node / uvicorn)
├── app-2
└── log manager

PM2 constantly supervises these processes. If one exits unexpectedly, PM2 restarts it immediately.

💡 Insight: When you close your terminal, the PM2 daemon keeps running, and because your app is a child of the daemon (not your shell), your app keeps running too.

04

Process Management Logic

Why does the app restart if you kill it manually?

Simulating a crash (bash)
kill -9 <pid>

PM2 sees that the process exited unexpectedly. Its logic is simple:

IF process_exit AND NOT intentionally_stopped
RESTART IT

To actually stop an app, you must tell PM2:

Terminal (bash)
pm2 stop app-name
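
The restart rule above can be made concrete with a toy supervisor loop. This is a minimal sketch of the idea only, not PM2's actual implementation; the function name `supervise`, the stop-file convention (standing in for `pm2 stop`), and the restart cap are all assumptions made for illustration.

```shell
# Toy supervisor illustrating the rule: restart on exit unless intentionally stopped.
# NOT PM2's real implementation. `supervise`, the stop file, and the restart cap
# are hypothetical names chosen for this sketch.
#
# Usage: supervise <stop_file> <max_restarts> <command> [args...]
supervise() {
  local stop_file="$1" max_restarts="$2" restarts=0 code=0
  shift 2
  while true; do
    code=0
    "$@" || code=$?                 # run the app in the foreground until it exits
    if [ -e "$stop_file" ]; then    # "intentionally_stopped": someone asked for a stop
      echo "stopped intentionally after $restarts restart(s)"
      return 0
    fi
    if [ "$restarts" -ge "$max_restarts" ]; then
      # cap the restarts so this sketch cannot loop forever
      echo "giving up after $restarts restart(s)"
      return 1
    fi
    restarts=$((restarts + 1))
    echo "process exited with code $code, restart #$restarts"
  done
}
```

For example, `supervise /tmp/app.stop 3 node app.js` would rerun `node app.js` up to three times, while `touch /tmp/app.stop` plays the role of `pm2 stop`.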
05

Node.js & Python Support

Node.js (Express/Nest/Next)

Standard way to start a Node application:

Terminal (bash)
pm2 start app.js --name node-api

Python (FastAPI/Django)

PM2 is not limited to Node.js: it can supervise any command. For FastAPI, it wraps the uvicorn command that serves the app.

Terminal (bash)
pm2 start "uvicorn main:app --host 0.0.0.0 --port 8000" \
  --name fastapi-app \
  --interpreter bash

Internally: PM2 → Bash → Uvicorn → Python App

06

Surviving Server Reboots

This is where PM2 becomes production-grade. We need to tell the OS (systemd) to launch PM2 on boot.

Step 1

Save running apps

Terminal (bash)
pm2 save

Dumps current process list to ~/.pm2/dump.pm2

Step 2

Generate startup script

Terminal (bash)
pm2 startup

`pm2 startup` prints a command (typically starting with `sudo`). Copy and run that command to register PM2 with systemd.

07

Scaling with Cluster Mode

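A Node.js process runs your JavaScript on a single thread, so one instance uses only one CPU core. PM2's cluster mode (Node.js apps only) lets you use all CPU cores without changing your code.

Terminal (bash)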
pm2 start app.js -i max

`-i max` starts one instance per CPU core, and PM2 balances incoming connections across them:

CPU 1 → Instance 1
CPU 2 → Instance 2
CPU 3 → Instance 3
08

Continuous Integration (CI)

CI is the habit of automatically checking your code every time you push it. It is a safety net that saves developers hours of debugging locally.

What CI Actually Does:

  1. Install Dependencies: It runs `npm install` in a clean environment, which catches packages you installed locally but never saved to package.json.
  2. Build Project: It runs `npm run build`. If the code has a syntax or compile error, it fails here.
  3. Run Linters: Checks formatting and style. "Fail fast" if code is messy.
  4. Run Tests: Executes unit tests. If logic is broken, it stops the pipeline.
Result: Code is "Verified Safe" to Merge.
Push → CI Gate (test, build, ...)
  X Fail → Stop
  ✔ Pass → Allow → CD Gate (deploy)

* flow only reaches CD when everything passes in CI
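
The four CI steps can be sketched as a single fail-fast shell function. This is only an illustration of the gate, not a specific CI product's configuration. It assumes that package.json defines `build`, `lint`, and `test` scripts and that a lockfile exists so `npm ci` (a clean, reproducible install) can replace `npm install`. The name `run_ci` is hypothetical.

```shell
# Sketch of the CI gate: run each step in order and stop at the first failure.
# Assumptions: package.json defines "build", "lint", and "test" scripts, and a
# lockfile exists so `npm ci` can be used. `run_ci` is a hypothetical name.
run_ci() {
  npm ci        || { echo "CI failed: install"; return 1; }
  npm run build || { echo "CI failed: build";   return 1; }
  npm run lint  || { echo "CI failed: lint";    return 1; }
  npm test      || { echo "CI failed: test";    return 1; }
  echo "CI passed: safe to merge"
}
```

Because every step returns early on failure, a lint error means the tests never run and the function exits non-zero, which is what stops the flow from reaching the CD gate.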

09

Continuous Deployment (CD)

If CI says "Code is Safe", CD says "Ship it". It automates the boring manual work of SSH-ing into servers.

Typical CD Steps

  1. SSH into EC2 Instance
  2. Pull latest code (git pull)
  3. Install dependencies (npm i)
  4. Build Project (npm run build)
  5. Restart App (pm2 reload)
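
Steps 2 to 5 can be sketched as a small function that runs on the server. This is a minimal sketch under stated assumptions: the repository is already cloned at the given directory, the app is already registered in PM2 under the given name, and package.json has a `build` script. The name `deploy` is hypothetical, and `npm ci` is used in place of `npm i` for a reproducible install.

```shell
# Sketch of the server-side half of a CD job (steps 2-5 above).
# Assumptions: the repo is already cloned at <app_dir>, the app is already
# registered in PM2 under <pm2_name>, and package.json has a "build" script.
#
# Usage: deploy <app_dir> <pm2_name>
deploy() (
  # The ( ... ) body runs in a subshell, so `cd` does not leak to the caller.
  # The && chain stops at the first failing step, so a broken build never reloads.
  cd "$1" &&
    git pull --ff-only &&
    npm ci &&
    npm run build &&
    pm2 reload "$2"
)
```

A pipeline would run this on the server over SSH (step 1); the host, user, and paths depend on your setup.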

Delivery vs. Deployment

Continuous Deployment:

Automatic. Every push to main that passes CI goes live without human intervention. Common in small teams that ship quickly.

Continuous Delivery:

Manual trigger. Code is *ready* to deploy, but a human clicks the button. Common where releases need explicit sign-off.

10

Why 'Reload' Matters

Why is `pm2 reload` superior to `pm2 restart`?

Scenario: The "Crash Loop"

Imagine your CD pipeline passes (Build success). But your new code has a runtime error:
Error: Database Connection Failed (Wrong .env)

With `pm2 restart`:
  • Kills old app.
  • Tries to start new app.
  • New app crashes.
  • Result: Website Down.
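With `pm2 reload`:
  • Starts new app in background.
  • New app crashes?
  • PM2 keeps old app alive.
  • Result: Zero Downtime.
Note: this zero-downtime behavior applies to apps running in cluster mode (started with `-i`). In fork mode, `pm2 reload` behaves like `pm2 restart`. If a reload cannot complete in time, PM2 falls back to a regular restart, so treat reload as strong protection, not a guarantee.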
11

Handling Logic Errors & Rollback

Some errors slip past both CI and CD because no build step or test exercises them.
(bad business logic / wrong API response / slow queries)

👉 That's where Monitoring / Logs / Health Checks / Rollback come in.

Rollback is simple but powerful

If something breaks after deploy:
$ git checkout <previous-commit>
$ npm run build
$ pm2 reload app.js
PM2 logs everything and restarts crashed processes automatically. It cannot catch a logic error directly, but it supports reload and integrates well with CI/CD, which is why it is still used.
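
The health check and rollback ideas can be combined into one post-deploy step. The sketch below is an assumption-heavy illustration, not a PM2 feature: it assumes the app exposes an HTTP health endpoint, that `curl` is installed, and that you know a good previous commit. The names `wait_healthy` and `reload_or_rollback`, the five attempts, and the one-second pause are all hypothetical choices.

```shell
# Sketch: reload, verify with a health check, and roll back if it fails.
# Assumptions: the app exposes an HTTP health endpoint, <previous_commit> is a
# known-good commit, and `curl` is installed. All names here are hypothetical.

# wait_healthy <url> <attempts>: succeed as soon as the URL stops returning an error.
wait_healthy() {
  local url="$1" attempts="$2" i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f: exit non-zero on HTTP status >= 400, -sS: quiet but show errors,
    # -o /dev/null: discard the response body
    curl -fsS -o /dev/null "$url" && return 0
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# reload_or_rollback <pm2_name> <health_url> <previous_commit>
reload_or_rollback() {
  local name="$1" url="$2" previous="$3"
  pm2 reload "$name" || return 1
  if wait_healthy "$url" 5; then
    echo "deploy healthy"
    return 0
  fi
  echo "health check failed, rolling back to $previous"
  git checkout "$previous" && npm run build && pm2 reload "$name"
  return 1    # report failure: the new release did not go live
}
```

A health check only proves that the app answers requests. Bad business logic or slow queries still need monitoring and logs to surface.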
ERRORS are not friendly

They don't announce themselves.

  • Production is the real proving ground. Everything may work perfectly on dev/staging, but real-world behavior only shows up in production.
  • So, all these steps are very important.
  • 💡 "Dev testing is about correctness.
    Production testing is about damage control."
  • High Stakes: Trust, money, and responsibilities depend on it.
Why is Prod different?
Dev Env
  • 2-5 users (you)
  • Local DB (~0 ms latency)
  • No firewalls
  • Clean/mock data
Prod Env
  • 10k users (race conditions)
  • Cloud DB (network latency)
  • Real security rules
  • Messy real data