DevOps · Docker · CI/CD · Cloud

CI/CD, Docker, and Zero-Downtime Deployments

Making deployment the least stressful part of shipping software.

Deployment isn't a step at the end of development — it's a system you design. Here's how to think about shipping software reliably, from your local machine to production.

February 2026 · 13 min read · Saptarshi Sadhu

Most developers experience deployment as anxiety. You push code, something breaks in production, users complain, you frantically roll back. That chaos isn't inevitable — it's what happens when deployment is treated as a manual, last-minute act rather than an engineered system.

The three ideas in this article — containerisation, continuous delivery, and zero-downtime strategies — aren't about specific tools. They're about bringing the same rigour to shipping software that you bring to writing it.

"If it hurts, do it more often — and the pain will go away."
The counterintuitive DevOps principle: deploying 10 times a day is safer than deploying once a week, because each change is smaller and the consequences of failure are contained.

Part I — Docker: Packaging the Environment, Not Just the Code

The classic failure mode: code works on a developer's machine, breaks on the server. The culprit is almost always environmental difference — different Node version, missing system library, different path to a config file. Docker doesn't solve your code bugs; it eliminates the environmental variable entirely.

What a Container Actually Is

A container is a process running in an isolated namespace with its own filesystem, network, and process tree — but sharing the host OS kernel. It's not a virtual machine (no emulated hardware, no guest OS), so it starts in milliseconds and has negligible overhead.

The key insight: a Docker image is an immutable snapshot of everything your application needs to run — OS libraries, runtime, dependencies, app code, config. When you run that image anywhere — your laptop, a CI server, a cloud VM — you get an identical environment.

Image (Blueprint)

Read-only. Built once, stored in a registry. Tagged by version. Shared across environments.

Container (Instance)

Running instance of an image. Ephemeral — stateful data must be written to external volumes or cloud storage.

Writing a Production-Grade Dockerfile

Most tutorial Dockerfiles are fine for demos but wrong for production. A real production image is built in stages (to keep the final image small), runs as a non-root user, and separates dependency installation from code copying — so the expensive npm install layer is cached when only app code changes.

Dockerfile · Node.js Multi-stage Build
# ── Stage 1: Install production deps (cached layer)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev

# ── Stage 2: Build (needs devDependencies for the build tooling)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# ── Stage 3: Minimal production image
FROM node:20-alpine AS runner
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app  # non-root user
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER app
EXPOSE 3000
CMD ["node", "dist/index.js"]
Note: layer caching is the key to fast builds. Docker rebuilds from the first changed layer downward. Put COPY package*.json and npm ci before COPY . . so that dependencies are only reinstalled when package.json actually changes — not on every code edit.

Part II — CI/CD: The Automated Path from Code to Production

Continuous Integration (CI) means every commit is automatically validated — built, linted, and tested — before it's allowed to merge. Continuous Delivery (CD) means the validated artifact is automatically deployed to an environment. Together they form a pipeline: a deterministic, repeatable path from git push to running in production.

The Pipeline as a System

Think of the pipeline as a set of gates. Code must pass each gate before proceeding. A failing test doesn't just notify you — it stops the process. Nothing broken ships to production automatically.

Push (git push to main) → Build (compile, lint check) → Test (unit tests, integration) → Package (docker build, push to registry) → Deploy (rolling update, health check)

GitHub Actions in Practice

GitHub Actions expresses this pipeline as a YAML file that lives in your repository. The pipeline is code — version controlled, reviewed, and auditable. Here's what a real workflow looks like:

GitHub Actions · .github/workflows/deploy.yml
name: Build, Test & Deploy

on:
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with: { node-version: '20' }

      - run: npm ci
      - run: npm run lint
      - run: npm test -- --coverage

  deploy:
    needs: ci      # only runs if ci job passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build & push Docker image
        run: |
          echo ${{ secrets.REGISTRY_TOKEN }} | docker login -u ${{ secrets.REGISTRY_USER }} --password-stdin ghcr.io
          docker build -t ghcr.io/you/app:${{ github.sha }} .
          docker push ghcr.io/you/app:${{ github.sha }}

      - name: Deploy (rolling update)
        run: |
          ssh deploy@${{ secrets.SERVER_IP }} \
            "docker pull ghcr.io/you/app:${{ github.sha }} && \
             docker service update --image ghcr.io/you/app:${{ github.sha }} app_web"
Tip: tag images with the git commit SHA. Using :latest as your image tag is an operational anti-pattern — it's impossible to know what version is running. Tag every image with the commit SHA instead; every deployed image is then pinned, auditable, and precisely rollback-able.

Environment Consistency: Dev ≈ Staging ≈ Production

The goal of containerisation is to make your laptop and the production server run the same image. But even with Docker, there's a critical gap: environment-specific configuration.

The rule: code is environment-agnostic; configuration is injected at runtime via environment variables. Never bake secrets or environment-specific URLs into a Docker image.

docker-compose.yml · Local dev mirrors production shape
services:
  app:
    image: ghcr.io/you/app:${IMAGE_TAG:-latest}
    env_file: .env.local          # injected — not baked in
    ports: ["3000:3000"]
    depends_on: [db, redis]
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 10s
      retries: 3

  db:
    image: postgres:16-alpine
    volumes: [postgres_data:/var/lib/postgresql/data]
    environment: { POSTGRES_PASSWORD: "${DB_PASS}" }

  redis:
    image: redis:7-alpine

Running docker compose up locally spins up the same stack shape as production: your app, your DB, your cache — all wired together identically. The only differences are environment variables.
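In application code, the same rule can be enforced by reading every environment-specific value in one place at startup and failing fast if anything required is missing. A minimal sketch in Node — the variable names DATABASE_URL and REDIS_URL are illustrative, not from this article:

```javascript
// config.js — centralise environment-specific values; fail fast when one is missing.
function loadConfig(env = process.env) {
  const required = (name) => {
    const value = env[name];
    if (value === undefined) throw new Error(`Missing required env var: ${name}`);
    return value;
  };
  return {
    port: Number(env.PORT ?? 3000),         // safe default for local dev
    databaseUrl: required("DATABASE_URL"),  // no default — must be injected
    redisUrl: required("REDIS_URL"),
  };
}

module.exports = { loadConfig };
```

A container that starts with a misconfigured environment then crashes immediately and fails its health check, instead of limping along and failing on the first request.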

Zero-Downtime Deployment: The Mechanics

"Zero-downtime" means users experience no interruption during a deploy. It's achieved by ensuring there is always at least one healthy instance serving traffic while new ones spin up.

Rolling Deployment

The standard approach for containerised workloads. If you have 4 instances running v1, a rolling deploy replaces them one at a time: bring up v2, wait for its health check to pass, route traffic to it, then terminate a v1 instance. Repeat.

At no point does all capacity sit on either the old or the new version. Capacity stays constant while the version gradually transitions. This is what Docker Swarm's --update-parallelism flag and Kubernetes' maxUnavailable and maxSurge settings control.
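In Kubernetes terms, those knobs live on the Deployment's update strategy. A sketch, with an illustrative name and replica count:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one instance down at any moment
      maxSurge: 1         # at most one extra instance during the transition
```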

Blue-Green Deployment

Run two identical environments — Blue (live) and Green (idle). Deploy the new version to Green, run your smoke tests, then switch the load balancer to point to Green. Blue becomes the idle standby.

The advantage over rolling: the switch is instant and atomic. The disadvantage: you need double the infrastructure during the transition. But for critical services, the operational confidence is worth it.
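On a single host, the cutover can be as simple as repointing the load balancer and reloading it. A hypothetical sketch, assuming nginx includes whatever upstream file a symlink points at (paths and filenames are made up for illustration):

```
# Flip the symlink from blue.conf to green.conf, then reload
ln -sfn /etc/nginx/upstreams/green.conf /etc/nginx/upstreams/live.conf
nginx -s reload    # new connections now go to Green

# Rollback: point the symlink back at blue.conf and reload again
```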

Rolling Update
  • ✓ No extra infrastructure
  • ✓ Gradual — easy to spot issues
  • ✗ Mixed versions live simultaneously
  • ✗ DB migrations must be backward-compatible
Blue-Green
  • ✓ Instant, clean cutover
  • ✓ Instant rollback (flip back)
  • ✗ Requires 2× capacity briefly
  • ✗ More complex infrastructure
Warning: the database migration problem. Zero-downtime deployments fail when the new application version requires a DB schema that the old version can't read. The rule: always deploy migrations before the application code that depends on them, in a backward-compatible way. Add a nullable column first, deploy the app that writes to it, then add the NOT NULL constraint later.

Rollback Strategies: Designing for Recovery

A deployment system without a tested rollback path is incomplete. Rollback isn't an emergency procedure — it's a normal, expected operation that should take under 2 minutes and require no manual steps.

Immutable Images Make Rollback Trivial

Because each deploy is tagged with a git SHA and stored immutably in a registry, rolling back is just deploying the previous image tag. No rebuilding, no guessing what was deployed before.

Rollback · One command
# Docker Swarm
docker service rollback app_web

# Kubernetes
kubectl rollout undo deployment/app-web

# Or pin to a specific previous version
kubectl set image deployment/app-web \
  app=ghcr.io/you/app:a3f9c12  # the last known-good SHA

Health Checks Are the Gatekeeper

An automated rollback is only possible if the orchestrator knows a deploy is unhealthy. That means every service must expose a /health endpoint — not just returning 200, but actually checking that its dependencies (DB connection, Redis ping) are healthy. The orchestrator runs this check on every new container before routing traffic to it.
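Such an endpoint can be sketched as a handler that runs each dependency check and returns 503 if any fails. The HTTP wiring is omitted, and the checks themselves (a DB query, a Redis ping) are passed in as async functions:

```javascript
// Health-check sketch: report 200 only when every dependency check passes.
async function healthHandler(checks) {
  const results = {};
  let healthy = true;
  for (const [name, check] of Object.entries(checks)) {
    try {
      await check();              // e.g. SELECT 1 against the DB, PING to Redis
      results[name] = "ok";
    } catch {
      results[name] = "failed";
      healthy = false;
    }
  }
  return { status: healthy ? 200 : 503, body: results };
}

module.exports = { healthHandler };
```

Returning the per-dependency results in the body also makes the endpoint useful for debugging: a 503 immediately tells you which dependency is down.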

Note: automate the canary check. After deploying, run a quick smoke test against the live endpoint — just enough to verify the critical user paths work. If any check fails, the pipeline triggers an automatic rollback before a human even notices. This is the loop that makes frequent deploys safe.
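The check-then-rollback loop reduces to a retry wrapper: probe the live endpoint a few times, and signal failure so the pipeline can roll back. A sketch where the probe itself (an HTTP request to the health endpoint) is left abstract as any async function that throws on failure:

```javascript
// Post-deploy smoke check sketch: retry `check` a few times before giving up.
async function smokeTest(check, { attempts = 5, delayMs = 2000 } = {}) {
  for (let i = 1; i <= attempts; i++) {
    try {
      await check();
      return true;                       // healthy — the deploy stands
    } catch {
      if (i === attempts) return false;  // signal the pipeline to roll back
      await new Promise((r) => setTimeout(r, delayMs));
    }
  }
}

module.exports = { smokeTest };
```

The retry budget matters: a service may legitimately take a few seconds to warm up, so failing on the first probe would trigger spurious rollbacks.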

Deployment as a First-Class System

The shift in mindset is: deployment is not the end of development — it's a system you operate continuously. Docker gives you environment parity. CI/CD gives you a deterministic, automated path. Zero-downtime strategies and rollback make that path safe to run dozens of times a day.

Teams that deploy rarely do so because deployment is risky. Deployment is risky because they do it rarely. The feedback loop runs the wrong way. When you containerise your app, automate your pipeline, and instrument rollbacks from day one — shipping software becomes the least stressful part of the job.



Saptarshi Sadhu
System-focused developer at the intersection of AI, backend engineering, and scalable infrastructure. Builds things that have to work in the real world.