CI/CD, Docker, and
Zero-Downtime Deployments
Making deployment the least stressful part of shipping software.
Deployment isn't a step at the end of development — it's a system you design. Here's how to think about shipping software reliably, from your local machine to production.
Most developers experience deployment as anxiety. You push code, something breaks in production, users complain, you frantically roll back. That chaos isn't inevitable — it's what happens when deployment is treated as a manual, last-minute act rather than an engineered system.
The three ideas in this article — containerisation, continuous delivery, and zero-downtime strategies — aren't about specific tools. They're about bringing the same rigour to shipping software that you bring to writing it.
"If it hurts, do it more often — and the pain will go away."
The counterintuitive DevOps principle: deploying 10 times a day is safer than deploying once a week, because each change is smaller and the consequences of failure are contained.
Part I — Docker: Packaging the Environment, Not Just the Code
The classic failure mode: code works on a developer's machine, breaks on the server. The culprit is almost always environmental difference — different Node version, missing system library, different path to a config file. Docker doesn't solve your code bugs; it eliminates the environmental variable entirely.
What a Container Actually Is
A container is a process running in an isolated namespace with its own filesystem, network, and process tree — but sharing the host OS kernel. It's not a virtual machine (no emulated hardware, no guest OS), so it starts in milliseconds and has negligible overhead.
The key insight: a Docker image is an immutable snapshot of everything your application needs to run — OS libraries, runtime, dependencies, app code, config. When you run that image anywhere — your laptop, a CI server, a cloud VM — you get an identical environment.
Image: Read-only. Built once, stored in a registry. Tagged by version. Shared across environments.
Container: Running instance of an image. Ephemeral — stateful data must be written to external volumes or cloud storage.
Writing a Production-Grade Dockerfile
Most tutorial Dockerfiles are fine for demos but wrong for production. A real production image is built in stages (to keep the final image small), runs as a non-root user, and separates dependency installation from code copying — so the expensive npm install layer is cached when only app code changes.
```dockerfile
# ── Stage 1: Install deps (cached layer)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

# ── Stage 2: Build (only if there's a build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# ── Stage 3: Minimal production image
FROM node:20-alpine AS runner
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app  # non-root user
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER app
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
Put COPY package*.json and the npm ci step before COPY . . so that dependencies are only reinstalled when package.json actually changes — not on every code edit.

Part II — CI/CD: The Automated Path from Code to Production
Continuous Integration (CI) means every commit is automatically validated — built, linted, and tested — before it's allowed to merge. Continuous Delivery (CD) means the validated artifact is automatically deployed to an environment. Together they form a pipeline: a deterministic, repeatable path from git push to running in production.
The Pipeline as a System
Think of the pipeline as a set of gates. Code must pass each gate before proceeding. A failing test doesn't just notify you — it stops the process. Nothing broken ships to production automatically.
to main
lint check
integration
push to registry
health check
GitHub Actions in Practice
GitHub Actions expresses this pipeline as a YAML file that lives in your repository. The pipeline is code — version controlled, reviewed, and auditable. Here's what a real workflow looks like:
```yaml
name: Build, Test & Deploy

on:
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm ci
      - run: npm run lint
      - run: npm test -- --coverage

  deploy:
    needs: ci  # only runs if ci job passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build & push Docker image
        run: |
          echo ${{ secrets.REGISTRY_TOKEN }} | docker login -u ${{ secrets.REGISTRY_USER }} --password-stdin ghcr.io
          docker build -t ghcr.io/you/app:${{ github.sha }} .
          docker push ghcr.io/you/app:${{ github.sha }}
      - name: Deploy (rolling update)
        run: |
          ssh deploy@${{ secrets.SERVER_IP }} \
            "docker pull ghcr.io/you/app:${{ github.sha }} && \
             docker service update --image ghcr.io/you/app:${{ github.sha }} app_web"
```
Using :latest as your image tag is an operational anti-pattern — it's impossible to know what version is running. Tag every image with the git commit SHA. Every deployed image is then pinned, auditable, and precisely rollback-able.

Environment Consistency: Dev ≈ Staging ≈ Production
The goal of containerisation is to make your laptop and the production server run the same image. But even with Docker, there's a critical gap: environment-specific configuration.
The rule: code is environment-agnostic; configuration is injected at runtime via environment variables. Never bake secrets or environment-specific URLs into a Docker image.
```yaml
services:
  app:
    image: ghcr.io/you/app:${IMAGE_TAG:-latest}
    env_file: .env.local   # injected — not baked in
    ports: ["3000:3000"]
    depends_on: [db, redis]
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 10s
      retries: 3

  db:
    image: postgres:16-alpine
    volumes: ["postgres_data:/var/lib/postgresql/data"]
    environment: { POSTGRES_PASSWORD: "${DB_PASS}" }

  redis:
    image: redis:7-alpine

volumes:
  postgres_data:   # declare the named volume used by db
```
Running docker compose up locally spins up the same stack shape as production: your app, your DB, your cache — all wired together identically. The only differences are environment variables.
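On the application side, the same rule means reading configuration at startup and failing fast when something required is missing, instead of crashing mid-request hours later. A minimal sketch in Node (the variable names beyond PORT are illustrative, not a fixed contract):

```javascript
// Validate required environment variables once, at boot.
// A container that can't configure itself should exit immediately
// so the orchestrator's health check catches the bad deploy.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL', 'REDIS_URL'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    port: Number(env.PORT || 3000),
  };
}
```

The same image then runs unchanged in every environment; only the injected variables differ.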
Zero-Downtime Deployment: The Mechanics
"Zero-downtime" means users experience no interruption during a deploy. Achieved by ensuring there is always at least one healthy instance serving traffic while new ones spin up.
Rolling Deployment
The standard approach for containerised workloads. If you have 4 instances running v1, a rolling deploy replaces them one at a time: bring up v2, wait for its health check to pass, route traffic to it, then terminate a v1 instance. Repeat.
Capacity stays constant throughout; the fleet gradually transitions from one version to the other, with both briefly serving traffic side by side. This is what Docker Swarm's --update-parallelism and Kubernetes' maxUnavailable control.
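In Kubernetes, those knobs live on the Deployment itself. A sketch under assumed names (app-web, the label web, and the image tag are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below full capacity
      maxSurge: 1         # bring up one extra new pod at a time
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: app
          image: ghcr.io/you/app:a3f9c12
          readinessProbe:   # gate traffic on the health check
            httpGet: { path: /health, port: 3000 }
```

With maxUnavailable: 0 and maxSurge: 1, the rollout is exactly the one-at-a-time replacement described above.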
Blue-Green Deployment
Run two identical environments — Blue (live) and Green (idle). Deploy the new version to Green, run your smoke tests, then switch the load balancer to point to Green. Blue becomes the idle standby.
The advantage over rolling: the switch is instant and atomic. The disadvantage: you need double the infrastructure during the transition. But for critical services, the operational confidence is worth it.
Rolling:
- ✓ No extra infrastructure
- ✓ Gradual — easy to spot issues
- ✗ Mixed versions live simultaneously
- ✗ DB migrations must be backward-compatible

Blue-Green:
- ✓ Instant, clean cutover
- ✓ Instant rollback (flip back)
- ✗ Requires 2× capacity briefly
- ✗ More complex infrastructure
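With a plain load balancer in front, the blue-green "switch" can be as small as editing one upstream line and reloading. A sketch with nginx (the ports and upstream name are assumptions):

```nginx
# Blue listens on :3001, Green on :3002.
upstream app_live {
    server 127.0.0.1:3002;   # was 3001 — flip this line to cut over
}

server {
    listen 80;
    location / {
        proxy_pass http://app_live;
    }
}
```

Reloading nginx applies the new upstream without dropping established connections, which is what makes the cutover feel atomic to users.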
Rollback Strategies: Designing for Recovery
A deployment system without a tested rollback path is incomplete. Rollback isn't an emergency procedure — it's a normal, expected operation that should take under 2 minutes and require no manual steps.
Immutable Images Make Rollback Trivial
Because each deploy is tagged with a git SHA and stored immutably in a registry, rolling back is just deploying the previous image tag. No rebuilding, no guessing what was deployed before.
```shell
# Docker Swarm
docker service rollback app_web

# Kubernetes
kubectl rollout undo deployment/app-web

# Or pin to a specific previous version
kubectl set image deployment/app-web \
  app=ghcr.io/you/app:a3f9c12   # the last known-good SHA
```
Health Checks Are the Gatekeeper
An automated rollback is only possible if the orchestrator knows a deploy is unhealthy. That means every service must expose a /health endpoint — not just returning 200, but actually checking that its dependencies (DB connection, Redis ping) are healthy. The orchestrator runs this check on every new container before routing traffic to it.
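A /health handler that actually exercises its dependencies can be factored as a small aggregator. This is a sketch, not a fixed API: the checker names (db, redis) and the idea of passing ping functions in are assumptions for illustration.

```javascript
// Run every dependency check and report which ones failed.
// Each check is an async function that rejects when unhealthy
// (e.g. a SELECT 1 against the DB, a PING against Redis).
async function checkHealth(checks) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(names.map((name) => checks[name]()));
  const failed = names.filter((name, i) => results[i].status === 'rejected');
  return { ok: failed.length === 0, failed };
}

// In the HTTP handler, map the result to a status code the
// orchestrator understands: 200 when ok, 503 otherwise, so a
// broken container never receives production traffic.
// res.statusCode = health.ok ? 200 : 503;
```

Returning which dependency failed also turns the endpoint into a first debugging signal when a deploy is rolled back automatically.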
Deployment as a First-Class System
The shift in mindset is: deployment is not the end of development — it's a system you operate continuously. Docker gives you environment parity. CI/CD gives you a deterministic, automated path. Zero-downtime strategies and rollback make that path safe to run dozens of times a day.
Teams that deploy rarely do so because deployment is risky. Deployment is risky because they do it rarely. The feedback loop runs the wrong way. When you containerise your app, automate your pipeline, and instrument rollbacks from day one — shipping software becomes the least stressful part of the job.
Key Takeaways
- Docker packages the environment, not just the code — it eliminates the 'works on my machine' class of problems entirely.
- Tag images with the git SHA, never with :latest — every deployed version becomes auditable and precisely rollback-able.
- Put COPY package*.json before COPY . . in your Dockerfile — dependency layers are cached, code changes rebuild fast.
- Health checks are the gatekeeper between a broken deploy and production traffic — they enable automated rollback.
- Deploying 10× a day is safer than deploying once a week — smaller changes mean smaller blast radius when something breaks.