15 July 2024 · 15 min read

Docker Best Practices: Building Production-Ready Containers

Docker · Containers · DevOps · Security

Practical guidelines for building secure, efficient Docker containers. Multi-stage builds, security hardening, and image optimization.



After years of shipping containers to production, I've learned that containers are easy to build badly. The difference between a development container and a production-ready one involves security, size, reliability, and observability considerations that are easy to overlook.

Multi-Stage Builds

Multi-stage builds are essential for production containers. They separate build-time dependencies from runtime, dramatically reducing image size and attack surface.

Basic Multi-Stage Pattern

# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies: the build step usually needs devDependencies
RUN npm ci
COPY . .
RUN npm run build
# Strip devDependencies before copying node_modules into the production stage
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine AS production
WORKDIR /app

# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001 -G nodejs

# Copy only what's needed
COPY --from=builder --chown=nextjs:nodejs /app/dist ./dist
COPY --from=builder --chown=nextjs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nextjs:nodejs /app/package.json ./

USER nextjs
EXPOSE 3000
CMD ["node", "dist/server.js"]

Go Application Example

# Build stage
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /app/server

# Production stage - distroless for minimal attack surface
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /
USER nonroot:nonroot
ENTRYPOINT ["/server"]

The Go example produces a ~10MB image, versus the ~800MB you'd ship if the full golang build image were also the runtime.

Base Image Selection

Your base image choice affects size, security, and compatibility.

Image Comparison

Base Image               Size     Use Case
ubuntu:22.04             ~77MB    Full Linux environment, debugging
debian:bookworm-slim     ~74MB    Standard Linux, smaller than Ubuntu
alpine:3.19              ~7MB     Minimal, musl libc (check compatibility)
distroless/static        ~2MB     Compiled binaries only
scratch                  ~0MB     Absolute minimum (static binaries only)

My Recommendations

  • Node.js: node:20-alpine for most cases
  • Go: gcr.io/distroless/static for production, golang:alpine for development
  • Python: python:3.12-slim (avoid alpine: many wheels have no musl builds and must compile from source; see the sketch after this list)
  • Java: eclipse-temurin:21-jre-alpine
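
As a concrete sketch of the Python recommendation (the module name, port, and requirements file are illustrative placeholders, not part of any specific project):

FROM python:3.12-slim
WORKDIR /app

# Non-root user, Debian syntax (see the hardening section below)
RUN groupadd -g 1001 appgroup && \
    useradd -u 1001 -g appgroup appuser

# Dependencies first so this layer caches across code changes
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY . .
USER appuser
EXPOSE 8000

# Assumes gunicorn is listed in requirements.txt; "app:app" is a placeholder entry point
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "app:app"]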

Security Hardening

Security isn't optional—it's the default expectation in production.

Run as Non-Root User

Never run containers as root. Create a dedicated user:

# Alpine
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup

# Debian/Ubuntu
RUN groupadd -g 1001 appgroup && \
    useradd -u 1001 -g appgroup -s /bin/sh appuser

USER appuser

Don't Include Secrets

Secrets should never be in your image. They leak through:

  • Environment variables in Dockerfile
  • COPY commands
  • Image layer history
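
You can verify the last point directly: docker history prints the instruction that created each layer, ENV values included, so anyone who can pull the image can read a baked-in key.

# The CREATED BY column shows each layer's original instruction
docker history --no-trunc myimage:latest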

Bad:

# DON'T DO THIS
ENV API_KEY=sk_live_abc123
COPY .env /app/

Good:

# Pass secrets at runtime:
# docker run -e API_KEY=$API_KEY myimage
# Or use Docker secrets, Kubernetes secrets, or Vault
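
When a secret is genuinely needed at build time (a private registry token, say), BuildKit secret mounts expose it to a single RUN step without persisting it in any layer. A minimal sketch, assuming BuildKit is enabled and your npm auth token lives in an .npmrc file:

# syntax=docker/dockerfile:1
# The .npmrc (with its auth token) exists only for the duration of this RUN step
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc npm ci

# Build with:
# docker build --secret id=npmrc,src=$HOME/.npmrc .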

Read-Only Filesystem

Run with read-only root filesystem when possible:

# docker-compose.yml
services:
  app:
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
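
The same hardening from the plain CLI, with flags docker run supports directly:

docker run --read-only --tmpfs /tmp --tmpfs /var/run myimage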

Drop Capabilities

Remove unnecessary Linux capabilities:

# docker-compose.yml
services:
  app:
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # Only if needed
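
And the docker run equivalent (NET_BIND_SERVICE is only needed when a non-root process binds a port below 1024):

docker run --cap-drop ALL --cap-add NET_BIND_SERVICE myimage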

Scan for Vulnerabilities

Integrate scanning into your CI/CD:

# Trivy (recommended)
trivy image myapp:latest

# Docker Scout
docker scout cves myapp:latest

# Snyk
snyk container test myapp:latest
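
To gate a pipeline instead of just reporting, trivy can fail the build on serious findings (the same effect the GitHub Actions example below achieves through the action's inputs):

# Exit non-zero when CRITICAL or HIGH vulnerabilities are found
trivy image --exit-code 1 --severity CRITICAL,HIGH myapp:latest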

Image Optimization

Smaller images are faster to pull, use less storage, and have smaller attack surfaces.

Layer Caching Strategy

Order instructions from least to most frequently changed:

# GOOD: Dependencies first, code last
FROM node:20-alpine
WORKDIR /app

# These rarely change - cached layers
COPY package*.json ./
RUN npm ci

# This changes often - invalidates only this layer
COPY . .
RUN npm run build

Combine RUN Commands

Each RUN creates a layer. Combine related operations:

# BAD: Multiple layers, larger image
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean

# GOOD: Single layer, cleanup included
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Use .dockerignore

Exclude unnecessary files from the build context:

# .dockerignore
.git
.gitignore
node_modules
npm-debug.log
Dockerfile*
docker-compose*
.env*
*.md
.vscode
.idea
coverage
tests
__pycache__
*.pyc

Pin Versions

Always pin versions for reproducible builds:

# BAD: Unpredictable builds
FROM node:latest
RUN npm install express

# GOOD: Reproducible builds
FROM node:20.11.0-alpine3.19
# npm ci requires both package.json and package-lock.json
COPY package*.json ./
RUN npm ci
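
If you want builds that survive even a re-pushed tag, pin the digest as well; the hash below is a placeholder, not a real digest:

# A digest pin ignores any later re-tag of 20.11.0-alpine3.19
FROM node:20.11.0-alpine3.19@sha256:<digest>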

Runtime Configuration

Health Checks

Health checks enable orchestrators to detect and replace unhealthy containers:

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

The check runs inside the container, so curl must actually be present in the image; on BusyBox-based images like alpine, wget -qO- http://localhost:3000/health works as a substitute.

For non-HTTP services:

HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
  CMD pg_isready -U postgres || exit 1

Resource Limits

Always set memory and CPU limits in production:

# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
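
Outside Compose, the same limits map onto docker run flags:

docker run --cpus 1.0 --memory 512m --memory-reservation 256m myimage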

Graceful Shutdown

Handle SIGTERM for graceful shutdown:

// Node.js example
process.on('SIGTERM', async () => {
  console.log('SIGTERM received, shutting down gracefully');
  // http.Server#close takes a callback, not a promise, so wrap it
  await new Promise((resolve) => server.close(resolve));
  await database.disconnect();
  process.exit(0);
});

Use exec form in CMD to receive signals properly:

# GOOD: Receives signals
CMD ["node", "server.js"]

# BAD: Signals go to shell, not app
CMD node server.js
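
If the process can't act as PID 1 at all (no signal handling, no zombie reaping), Docker can inject a minimal init that forwards signals for you:

# --init makes tini PID 1; it forwards signals to your app and reaps zombies
docker run --init myimage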

Logging Best Practices

Log to stdout/stderr for container log collection:

// Node.js - use console or a structured logger
console.log(JSON.stringify({
  level: 'info',
  message: 'Server started',
  port: 3000,
  timestamp: new Date().toISOString()
}));

Don't log to files inside containers—use log drivers:

# docker-compose.yml
services:
  app:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"

Common Mistakes to Avoid

1. Running as Root

Problem: Security vulnerability, privilege escalation risk

Solution: Always use USER directive

2. Using Latest Tag

Problem: Unpredictable builds, silent breaking changes

Solution: Pin specific versions

3. Large Images

Problem: Slow deployments, wasted resources

Solution: Multi-stage builds, minimal base images

4. Secrets in Images

Problem: Credentials exposed in image layers

Solution: Runtime injection, secrets management

5. No Health Checks

Problem: Dead containers stay in rotation

Solution: Add HEALTHCHECK instruction

6. Ignoring Signal Handling

Problem: Data loss on shutdown, stuck containers

Solution: Handle SIGTERM, use exec form CMD

CI/CD Integration

GitHub Actions Example

name: Build and Push

on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Scan for vulnerabilities
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          exit-code: '1'
          severity: 'CRITICAL,HIGH'

      - name: Push to registry
        run: |
          docker tag myapp:${{ github.sha }} registry/myapp:${{ github.sha }}
          docker push registry/myapp:${{ github.sha }}

Key Takeaways

  1. Multi-stage builds are non-negotiable: Separate build and runtime environments
  2. Security by default: Non-root users, minimal base images, no secrets in images
  3. Pin everything: Base images and package versions, for reproducible builds
  4. Health checks enable resilience: Let orchestrators detect and recover from failures
  5. Handle signals: Graceful shutdown prevents data loss and connection issues
  6. Log to stdout: Let the platform handle log aggregation
  7. Scan regularly: Vulnerabilities emerge constantly; automate scanning in CI/CD

Production containers aren't just "it works on my machine" wrapped in Docker. They're secure, efficient, observable, and resilient. Every Dockerfile decision should consider what happens when this runs at scale with real traffic.
