Docker Best Practices: Building Production-Ready Containers
Practical guidelines for building secure, efficient Docker containers. Multi-stage builds, security hardening, and image optimization.
After years of shipping containers to production, I've learned that containers are easy to build badly. The difference between a development container and a production-ready one involves security, size, reliability, and observability considerations that are easy to overlook.
Multi-Stage Builds
Multi-stage builds are essential for production containers. They separate build-time dependencies from runtime, dramatically reducing image size and attack surface.
Basic Multi-Stage Pattern
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Install all dependencies (dev deps are needed for the build step)
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so the production stage gets a lean node_modules
RUN npm prune --omit=dev
# Production stage
FROM node:20-alpine AS production
WORKDIR /app
# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nextjs -u 1001
# Copy only what's needed
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
USER nextjs
EXPOSE 3000
CMD ["node", "dist/server.js"]
Go Application Example
# Build stage
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-w -s" -o /app/server
# Production stage - distroless for minimal attack surface
FROM gcr.io/distroless/static-debian12
COPY --from=builder /app/server /
USER nonroot:nonroot
ENTRYPOINT ["/server"]
The Go example produces a roughly 10MB image instead of the ~800MB you'd get by shipping the full Go toolchain.
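A side benefit of naming stages is that you can build them individually, which is handy in CI (image tags here are illustrative):

```
# Build just the builder stage, e.g. to run tests with the full toolchain
docker build --target builder -t myapp:build .

# Build the final production image
docker build -t myapp:prod .

# Compare the sizes of the two images
docker images myapp
```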
Base Image Selection
Your base image choice affects size, security, and compatibility.
Image Comparison
| Base Image | Size | Use Case |
|---|---|---|
| ubuntu:22.04 | ~77MB | Full Linux environment, debugging |
| debian:bookworm-slim | ~74MB | Standard Linux, smaller than Ubuntu |
| alpine:3.19 | ~7MB | Minimal, musl libc (check compatibility) |
| distroless/static | ~2MB | Compiled binaries only |
| scratch | ~0MB | Absolute minimum (static binaries only) |
My Recommendations
- Node.js: node:20-alpine for most cases
- Go: gcr.io/distroless/static for production, golang:alpine for development
- Python: python:3.12-slim (avoid alpine due to compilation issues)
- Java: eclipse-temurin:21-jre-alpine
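The Python caveat exists because many popular packages publish glibc-only (manylinux) wheels, so on Alpine's musl libc pip often falls back to compiling from source. A slim-based sketch (file names like requirements.txt and app.py are assumptions):

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./
# slim is glibc-based, so pip can use prebuilt manylinux wheels;
# on alpine (musl) many of these would have to compile from source
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "app.py"]
```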
Security Hardening
Security isn't optional—it's the default expectation in production.
Run as Non-Root User
Never run containers as root. Create a dedicated user:
# Alpine
RUN addgroup -g 1001 -S appgroup && \
    adduser -S appuser -u 1001 -G appgroup
# Debian/Ubuntu
RUN groupadd -g 1001 appgroup && \
    useradd -u 1001 -g appgroup -s /bin/sh appuser
USER appuser
Don't Include Secrets
Secrets should never be in your image. They leak through:
- Environment variables in Dockerfile
- COPY commands
- Image layer history
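You can see the layer-history leak directly: docker history prints the command behind every layer, including ENV instructions (the image name is illustrative):

```
docker history --no-trunc myimage:latest
# Any ENV API_KEY=... baked into the Dockerfile shows up here,
# visible to anyone who can pull the image
```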
Bad:
# DON'T DO THIS
ENV API_KEY=sk_live_abc123
COPY .env /app/
Good:
# Pass secrets at runtime
# docker run -e API_KEY=$API_KEY myimage
# Or use Docker secrets, Kubernetes secrets, or Vault
Read-Only Filesystem
Run with read-only root filesystem when possible:
# docker-compose.yml
services:
  app:
    read_only: true
    tmpfs:
      - /tmp
      - /var/run
Drop Capabilities
Remove unnecessary Linux capabilities:
# docker-compose.yml
services:
  app:
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE  # Only if needed
Scan for Vulnerabilities
Integrate scanning into your CI/CD:
# Trivy (recommended)
trivy image myapp:latest
# Docker Scout
docker scout cves myapp:latest
# Snyk
snyk container test myapp:latest
Image Optimization
Smaller images are faster to pull, use less storage, and have smaller attack surfaces.
Layer Caching Strategy
Order instructions from least to most frequently changed:
# GOOD: Dependencies first, code last
FROM node:20-alpine
WORKDIR /app
# These rarely change - cached layers
COPY package*.json ./
RUN npm ci
# This changes often - invalidates only this layer
COPY . .
RUN npm run build
Combine RUN Commands
Each RUN creates a layer. Combine related operations:
# BAD: Multiple layers, larger image
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get clean
# GOOD: Single layer, cleanup included
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
Use .dockerignore
Exclude unnecessary files from the build context:
# .dockerignore
.git
.gitignore
node_modules
npm-debug.log
Dockerfile*
docker-compose*
.env*
*.md
.vscode
.idea
coverage
tests
__pycache__
*.pyc
Pin Versions
Always pin versions for reproducible builds:
# BAD: Unpredictable builds
FROM node:latest
RUN npm install express
# GOOD: Reproducible builds
FROM node:20.11.0-alpine3.19
COPY package-lock.json ./
RUN npm ci
Runtime Configuration
Health Checks
Health checks enable orchestrators to detect and replace unhealthy containers:
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1
Note that alpine and slim base images don't include curl by default; install it in the image or use busybox wget (wget -qO- http://localhost:3000/health).
For non-HTTP services:
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD pg_isready -U postgres || exit 1
Resource Limits
Always set memory and CPU limits in production:
# docker-compose.yml
services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M
Graceful Shutdown
Handle SIGTERM for graceful shutdown:
// Node.js example
process.on('SIGTERM', () => {
  console.log('SIGTERM received, shutting down gracefully');
  // http.Server#close takes a callback (it does not return a promise);
  // the callback fires once in-flight requests have finished
  server.close(async () => {
    await database.disconnect();
    process.exit(0);
  });
});
Use exec form in CMD to receive signals properly:
# GOOD: Receives signals
CMD ["node", "server.js"]
# BAD: Signals go to shell, not app
CMD node server.js
Logging Best Practices
Log to stdout/stderr for container log collection:
// Node.js - use console or structured logger
console.log(JSON.stringify({
  level: 'info',
  message: 'Server started',
  port: 3000,
  timestamp: new Date().toISOString()
}));
Don't log to files inside containers; use log drivers instead:
# docker-compose.yml
services:
  app:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
Common Mistakes to Avoid
1. Running as Root
Problem: Security vulnerability, privilege escalation risk
Solution: Always use USER directive
2. Using Latest Tag
Problem: Unpredictable builds, silent breaking changes
Solution: Pin specific versions
3. Large Images
Problem: Slow deployments, wasted resources
Solution: Multi-stage builds, minimal base images
4. Secrets in Images
Problem: Credentials exposed in image layers
Solution: Runtime injection, secrets management
5. No Health Checks
Problem: Dead containers stay in rotation
Solution: Add HEALTHCHECK instruction
6. Ignoring Signal Handling
Problem: Data loss on shutdown, stuck containers
Solution: Handle SIGTERM, use exec form CMD
CI/CD Integration
GitHub Actions Example
name: Build and Push
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .
      - name: Scan for vulnerabilities
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          exit-code: '1'
          severity: 'CRITICAL,HIGH'
      - name: Push to registry
        run: |
          docker tag myapp:${{ github.sha }} registry/myapp:${{ github.sha }}
          docker push registry/myapp:${{ github.sha }}
Key Takeaways
- Multi-stage builds are non-negotiable: Separate build and runtime environments
- Security by default: Non-root users, minimal base images, no secrets in images
- Pin everything: Base images and package versions, for reproducible builds
- Health checks enable resilience: Let orchestrators detect and recover from failures
- Handle signals: Graceful shutdown prevents data loss and connection issues
- Log to stdout: Let the platform handle log aggregation
- Scan regularly: Vulnerabilities emerge constantly; automate scanning in CI/CD
Production containers aren't just "it works on my machine" wrapped in Docker. They're secure, efficient, observable, and resilient. Every Dockerfile decision should consider what happens when this runs at scale with real traffic.