CI/CD, Docker, and
Zero-Downtime Deployments
Making deployment the least stressful part of shipping software.
Deployment isn't a step at the end of development — it's a system you design. Here's how to think about shipping software reliably, from your local machine to production.
Most developers experience deployment as anxiety. You push code, something breaks in production, users complain, you frantically roll back. That chaos isn't inevitable — it's what happens when deployment is treated as a manual, last-minute act rather than an engineered system.
The three ideas in this article — containerisation, continuous delivery, and zero-downtime strategies — aren't about specific tools. They're about bringing the same rigour to shipping software that you bring to writing it.
"If it hurts, do it more often — and the pain will go away."
The counterintuitive DevOps principle: deploying 10 times a day is safer than deploying once a week, because each change is smaller and the consequences of failure are contained.
Part I — Docker: Packaging the Environment, Not Just the Code
The classic failure mode: code works on a developer's machine, breaks on the server. The culprit is almost always environmental difference — different Node version, missing system library, different path to a config file. Docker doesn't solve your code bugs; it eliminates the environmental variable entirely.
What a Container Actually Is
A container is a process running in an isolated namespace with its own filesystem, network, and process tree — but sharing the host OS kernel. It's not a virtual machine (no emulated hardware, no guest OS), so it starts in milliseconds and has negligible overhead.
The key insight: a Docker image is an immutable snapshot of everything your application needs to run — OS libraries, runtime, dependencies, app code, config. When you run that image anywhere — your laptop, a CI server, a cloud VM — you get an identical environment.
Image: Read-only. Built once, stored in a registry. Tagged by version. Shared across environments.
Container: Running instance of an image. Ephemeral — stateful data must be written to external volumes or cloud storage.
Writing a Production-Grade Dockerfile
Most tutorial Dockerfiles are fine for demos but wrong for production. A real production image is built in stages (to keep the final image small), runs as a non-root user, and separates dependency installation from code copying — so the expensive npm install layer is cached when only app code changes.
```dockerfile
# ── Stage 1: Install deps (cached layer)
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

# ── Stage 2: Build (only if there's a build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

# ── Stage 3: Minimal production image
FROM node:20-alpine AS runner
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app  # non-root user
COPY --from=builder /app/dist ./dist
COPY --from=deps /app/node_modules ./node_modules
USER app
EXPOSE 3000
CMD ["node", "dist/index.js"]
```
Put COPY package*.json and the npm ci step before COPY . . so that dependencies are only reinstalled when package.json actually changes — not on every code edit.

Part II — CI/CD: The Automated Path from Code to Production
Continuous Integration (CI) means every commit is automatically validated — built, linted, and tested — before it's allowed to merge. Continuous Delivery (CD) means the validated artifact is automatically deployed to an environment. Together they form a pipeline: a deterministic, repeatable path from git push to running in production.
The Pipeline as a System
Think of the pipeline as a set of gates. Code must pass each gate before proceeding. A failing test doesn't just notify you — it stops the process. Nothing broken ships to production automatically.
to main
lint check
integration
push to registry
health check
GitHub Actions in Practice
GitHub Actions expresses this pipeline as a YAML file that lives in your repository. The pipeline is code — version controlled, reviewed, and auditable. Here's what a real workflow looks like:
```yaml
name: Build, Test & Deploy

on:
  push:
    branches: [main]

jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with: { node-version: '20' }
      - run: npm ci
      - run: npm run lint
      - run: npm test -- --coverage

  deploy:
    needs: ci  # only runs if ci job passes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build & push Docker image
        run: |
          echo ${{ secrets.REGISTRY_TOKEN }} | docker login -u ${{ secrets.REGISTRY_USER }} --password-stdin ghcr.io
          docker build -t ghcr.io/you/app:${{ github.sha }} .
          docker push ghcr.io/you/app:${{ github.sha }}
      - name: Deploy (rolling update)
        run: |
          ssh deploy@${{ secrets.SERVER_IP }} \
            "docker pull ghcr.io/you/app:${{ github.sha }} && \
             docker service update --image ghcr.io/you/app:${{ github.sha }} app_web"
```
Using :latest as your image tag is an operational anti-pattern — it's impossible to know what version is running. Tag every image with the git commit SHA. Every deployed image is then pinned, auditable, and precisely rollback-able.

Environment Consistency: Dev ≈ Staging ≈ Production
The goal of containerisation is to make your laptop and the production server run the same image. But even with Docker, there's a critical gap: environment-specific configuration.
The rule: code is environment-agnostic; configuration is injected at runtime via environment variables. Never bake secrets or environment-specific URLs into a Docker image.
```yaml
services:
  app:
    image: ghcr.io/you/app:${IMAGE_TAG:-latest}
    env_file: .env.local   # injected — not baked in
    ports: ["3000:3000"]
    depends_on: [db, redis]
    healthcheck:
      test: ["CMD", "wget", "-qO-", "http://localhost:3000/health"]
      interval: 10s
      retries: 3

  db:
    image: postgres:16-alpine
    volumes: ["postgres_data:/var/lib/postgresql/data"]
    environment: { POSTGRES_PASSWORD: "${DB_PASS}" }

  redis:
    image: redis:7-alpine

volumes:
  postgres_data:   # declare the named volume used by db
```
Running docker compose up locally spins up the same stack shape as production: your app, your DB, your cache — all wired together identically. The only differences are environment variables.
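On the application side, the same rule means reading configuration at startup and failing fast when something required is missing, instead of crashing mid-request hours later. A minimal sketch in Node (the variable names beyond PORT are illustrative, not a fixed contract):

```javascript
// Validate required environment variables once, at boot.
// A container that can't configure itself should exit immediately
// so the orchestrator's health check catches the bad deploy.
function loadConfig(env = process.env) {
  const required = ['DATABASE_URL', 'REDIS_URL'];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(', ')}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    port: Number(env.PORT || 3000),
  };
}
```

The same image then runs unchanged in every environment; only the injected variables differ.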
Zero-Downtime Deployment: The Mechanics
"Zero-downtime" means users experience no interruption during a deploy. Achieved by ensuring there is always at least one healthy instance serving traffic while new ones spin up.
Rolling Deployment
The standard approach for containerised workloads. If you have 4 instances running v1, a rolling deploy replaces them one at a time: bring up v2, wait for its health check to pass, route traffic to it, then terminate a v1 instance. Repeat.
Capacity stays constant throughout; the fleet gradually transitions from one version to the other, with both briefly serving traffic side by side. This is what Docker Swarm's --update-parallelism and Kubernetes' maxUnavailable control.
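In Kubernetes, those knobs live on the Deployment itself. A sketch under assumed names (app-web, the label web, and the image tag are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-web
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below full capacity
      maxSurge: 1         # bring up one extra new pod at a time
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: app
          image: ghcr.io/you/app:a3f9c12
          readinessProbe:   # gate traffic on the health check
            httpGet: { path: /health, port: 3000 }
```

With maxUnavailable: 0 and maxSurge: 1, the rollout is exactly the one-at-a-time replacement described above.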
Blue-Green Deployment
Run two identical environments — Blue (live) and Green (idle). Deploy the new version to Green, run your smoke tests, then switch the load balancer to point to Green. Blue becomes the idle standby.
The advantage over rolling: the switch is instant and atomic. The disadvantage: you need double the infrastructure during the transition. But for critical services, the operational confidence is worth it.
Rolling:
- ✓ No extra infrastructure
- ✓ Gradual — easy to spot issues
- ✗ Mixed versions live simultaneously
- ✗ DB migrations must be backward-compatible

Blue-Green:
- ✓ Instant, clean cutover
- ✓ Instant rollback (flip back)
- ✗ Requires 2× capacity briefly
- ✗ More complex infrastructure
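With a plain load balancer in front, the blue-green "switch" can be as small as editing one upstream line and reloading. A sketch with nginx (the ports and upstream name are assumptions):

```nginx
# Blue listens on :3001, Green on :3002.
upstream app_live {
    server 127.0.0.1:3002;   # was 3001 — flip this line to cut over
}

server {
    listen 80;
    location / {
        proxy_pass http://app_live;
    }
}
```

Reloading nginx applies the new upstream without dropping established connections, which is what makes the cutover feel atomic to users.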
Rollback Strategies: Designing for Recovery
A deployment system without a tested rollback path is incomplete. Rollback isn't an emergency procedure — it's a normal, expected operation that should take under 2 minutes and require no manual steps.
Immutable Images Make Rollback Trivial
Because each deploy is tagged with a git SHA and stored immutably in a registry, rolling back is just deploying the previous image tag. No rebuilding, no guessing what was deployed before.
```shell
# Docker Swarm
docker service rollback app_web

# Kubernetes
kubectl rollout undo deployment/app-web

# Or pin to a specific previous version
kubectl set image deployment/app-web \
  app=ghcr.io/you/app:a3f9c12   # the last known-good SHA
```
Health Checks Are the Gatekeeper
An automated rollback is only possible if the orchestrator knows a deploy is unhealthy. That means every service must expose a /health endpoint — not just returning 200, but actually checking that its dependencies (DB connection, Redis ping) are healthy. The orchestrator runs this check on every new container before routing traffic to it.
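A /health handler that actually exercises its dependencies can be factored as a small aggregator. This is a sketch, not a fixed API: the checker names (db, redis) and the idea of passing ping functions in are assumptions for illustration.

```javascript
// Run every dependency check and report which ones failed.
// Each check is an async function that rejects when unhealthy
// (e.g. a SELECT 1 against the DB, a PING against Redis).
async function checkHealth(checks) {
  const names = Object.keys(checks);
  const results = await Promise.allSettled(names.map((name) => checks[name]()));
  const failed = names.filter((name, i) => results[i].status === 'rejected');
  return { ok: failed.length === 0, failed };
}

// In the HTTP handler, map the result to a status code the
// orchestrator understands: 200 when ok, 503 otherwise, so a
// broken container never receives production traffic.
// res.statusCode = health.ok ? 200 : 503;
```

Returning which dependency failed also turns the endpoint into a first debugging signal when a deploy is rolled back automatically.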
Deployment as a First-Class System
The shift in mindset is: deployment is not the end of development — it's a system you operate continuously. Docker gives you environment parity. CI/CD gives you a deterministic, automated path. Zero-downtime strategies and rollback make that path safe to run dozens of times a day.
Teams that deploy rarely do so because deployment is risky. Deployment is risky because they do it rarely. The feedback loop runs the wrong way. When you containerise your app, automate your pipeline, and instrument rollbacks from day one — shipping software becomes the least stressful part of the job.
Key Takeaways
- Docker packages the environment, not just the code — it eliminates the 'works on my machine' class of problems entirely.
- Tag images with the git SHA, never with :latest — every deployed version becomes auditable and precisely rollback-able.
- Put COPY package*.json before COPY . . in your Dockerfile — dependency layers are cached, code changes rebuild fast.
- Health checks are the gatekeeper between a broken deploy and production traffic — they enable automated rollback.
- Deploying 10× a day is safer than deploying once a week — smaller changes mean smaller blast radius when something breaks.