
Containers: Docker Deep Reality

Why containers exist, Dockerfiles, layer caching, networking, volumes, Compose, and common failures


Why Containers Exist (The VM Pain)

Before containers, deploying software meant one of:

  1. Bare metal: config drift, "works on my machine", slow provisioning
  2. VMs: better isolation, but each VM runs a full guest OS with its own kernel (GBs of overhead, minutes to boot)

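Containers solve this by sharing the host kernel while isolating each container's filesystem, network, and process views with kernel namespaces (and limiting resources with cgroups). The result: startup that is often well under a second, images measured in MB rather than GB, and consistent environments across machines.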

VMs:
┌──────────┬──────────┬──────────┐
│ App A    │ App B    │ App C    │
│ Libs     │ Libs     │ Libs     │
│ Guest OS │ Guest OS │ Guest OS │
├──────────┴──────────┴──────────┤
│           Hypervisor           │
│            Host OS             │
└────────────────────────────────┘

Containers:
┌──────────┬──────────┬──────────┐
│ App A    │ App B    │ App C    │
│ Libs     │ Libs     │ Libs     │
├──────────┴──────────┴──────────┤
│       Container Runtime        │
│    Host OS Kernel (shared)     │
└────────────────────────────────┘

Image vs Container vs Runtime

| Concept   | What it is |
|-----------|------------|
| Image     | Immutable, read-only template built from layers of filesystem changes |
| Container | Running (or stopped) instance of an image, with its own writable layer |
| Registry  | Storage for images (Docker Hub, ECR, GCR, ghcr.io) |
| Runtime   | The software that runs containers (containerd, runc) |

# Images
docker images # list local images
docker pull nginx:1.25 # pull from registry
docker push myrepo/myapp:1.0 # push to registry
docker rmi nginx:1.25 # remove image
docker image prune # remove unused images
# Containers
docker ps # running containers
docker ps -a # all containers (including stopped)
docker run nginx # create + start
docker start <id> # start stopped container
docker stop <id> # graceful stop (SIGTERM, then SIGKILL after a 10s default timeout)
docker kill <id> # immediate kill (SIGKILL)
docker rm <id> # remove stopped container
docker rm -f <id> # force remove running container

Dockerfile

Layer Caching

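Each RUN, COPY, and ADD instruction creates a new layer. Docker caches layers: if an instruction and its inputs (for COPY and ADD, the checksums of the copied files) are unchanged, and every earlier layer was also a cache hit, Docker reuses the cached layer instead of rebuilding it.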

# BAD: any code change invalidates the cache for the dependency install
FROM node:20-alpine
COPY . .
RUN npm install

# GOOD: copy package files first and install dependencies (cache hit when only code changes)
FROM node:20-alpine
WORKDIR /app
# Only changes when dependencies change
COPY package*.json ./
# Cached until package*.json changes
RUN npm ci
# Code changes land here and only invalidate this layer and the ones after it
COPY . .

Cache invalidation rule: when any layer changes, every instruction after it in the Dockerfile is rebuilt, even if those instructions themselves did not change.
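One way to build intuition for this rule is a toy model in which each layer's cache key is a checksum of its parent's key plus the instruction text. This is only a sketch and not Docker's real algorithm (Docker also checksums the files used by COPY and ADD); the `layer_keys` function is a hypothetical helper made up for illustration.

```shell
#!/bin/sh
# Toy model of layer caching (NOT Docker's actual implementation):
# key(layer N) = checksum(key(layer N-1) + instruction text of layer N)
layer_keys() {
  parent=""
  for instruction in "$@"; do
    # Chain the parent key into the new key, so a change propagates downward.
    parent=$(printf '%s|%s' "$parent" "$instruction" | cksum | cut -d' ' -f1)
    printf '%s\n' "$parent"
  done
}

# Only the last instruction differs: the first two keys match (cache hits),
# the third key differs (rebuild).
layer_keys "FROM node:20-alpine" "COPY package.json" "COPY src v1"
layer_keys "FROM node:20-alpine" "COPY package.json" "COPY src v2"
```

Changing an early instruction in this model changes every key after it, which mirrors why rarely changing steps belong near the top of a Dockerfile.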

ENTRYPOINT vs CMD

# CMD: default command, easily overridden
CMD ["nginx", "-g", "daemon off;"]
# docker run myimage /bin/sh          -> replaces CMD with /bin/sh

# ENTRYPOINT: fixed executable; arguments to docker run are appended to it
# (it can only be replaced explicitly with docker run --entrypoint)
ENTRYPOINT ["python3", "server.py"]
# docker run myimage --port 9000      -> runs: python3 server.py --port 9000

# Common pattern: ENTRYPOINT for the executable, CMD for default args
ENTRYPOINT ["python3", "server.py"]
CMD ["--port", "8080"]
# docker run myimage                  -> runs: python3 server.py --port 8080
# docker run myimage --port 9000      -> runs: python3 server.py --port 9000
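The combination rule can be sketched as a small shell function. This is a simplified model that assumes exec-form ENTRYPOINT and CMD and represents them as space-separated strings; `effective_command` is a hypothetical helper, not part of Docker.

```shell
#!/bin/sh
# Hypothetical model of how the final command line is assembled:
#   $1 = ENTRYPOINT (may be empty), $2 = CMD, remaining args = arguments
#   given after the image name in "docker run IMAGE [ARGS...]".
effective_command() {
  entrypoint=$1
  default_cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    args=$*            # run-time arguments replace CMD entirely
  else
    args=$default_cmd  # no arguments given: fall back to CMD
  fi
  # Prefix the entrypoint only when one is set.
  printf '%s\n' "${entrypoint:+$entrypoint }$args"
}

effective_command "python3 server.py" "--port 8080"
# prints: python3 server.py --port 8080
effective_command "python3 server.py" "--port 8080" --port 9000
# prints: python3 server.py --port 9000
```

The model leaves out shell-form instructions and the `--entrypoint` flag, both of which change the result in real Docker.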

Environment Handling

# Baked into image (visible in docker inspect, don't use for secrets)
ENV APP_ENV=production
ENV PORT=8080
# Build argument (only available during build, not in running container)
ARG BUILD_VERSION
RUN echo "Building version $BUILD_VERSION"

# Pass env vars at runtime
docker run -e DATABASE_URL=postgres://... myimage
# Load from env file
docker run --env-file .env.production myimage
# Build with ARG
docker build --build-arg BUILD_VERSION=1.2.3 .

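Never bake secrets into images: ENV values are visible in docker inspect, and anything written during the build (including ARG values, which show up in docker history) persists in the image layers.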

Multi-Stage Builds

Reduce the final image size by building in one stage and copying only the build artifacts into the final stage.

# Stage 1: Build
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
# Build output goes to /app/dist
RUN npm run build

# Stage 2: Final image
FROM nginx:alpine
COPY --from=builder /app/dist /usr/share/nginx/html
EXPOSE 80

The final image is just nginx plus the static files: no Node.js, no node_modules, no source code.

# Build only up to a specific stage (for debugging)
docker build --target builder -t myapp:debug .

Image Size & Attack Surface

Smaller images = faster pulls, smaller attack surface, less to patch.

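# Use slim or alpine variants (pick one base image per stage).
# Sizes are approximate and vary by tag and by compressed vs. unpacked size.
# python:3.12-slim: ~50MB vs ~1GB for the full python image
FROM python:3.12-slim
# node:20-alpine: ~170MB vs ~1GB for the full node image
# FROM node:20-alpine

# Remove package manager caches in the same RUN layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*

# Run as non-root user (useradd exists on Debian-based images; Alpine uses adduser)
RUN useradd -m appuser
USER appuser

# Analyze image layers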
docker history myimage
dive myimage # interactive layer explorer (install separately)
# Scan for vulnerabilities
docker scout cves myimage
trivy image myimage

Volumes vs Bind Mounts

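|               | Volume | Bind Mount |
|---------------|--------|------------|
| Location      | Managed by Docker (/var/lib/docker/volumes/) | Anywhere on host |
| Controlled by | Docker | Host OS |
| Use case      | Persistent data (DB, uploads) | Development (code hot-reload) |
| Performance   | Native on Linux; usually faster than bind mounts on Docker Desktop (macOS/Windows) | Native on Linux; slower on Docker Desktop |
| Backup        | Mount into a helper container and archive with tar | Standard file backup |

# Named volume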
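# (the official postgres image refuses to start without POSTGRES_PASSWORD)
docker run -d -e POSTGRES_PASSWORD=mypass -v mydb:/var/lib/postgresql/data postgres
# Bind mount (development)
docker run -v "$(pwd)/src:/app/src" myimage
# Read-only bind mount
docker run -v "$(pwd)/config:/etc/myapp:ro" myimage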
# Inspect volume
docker volume ls
docker volume inspect mydb
# Backup a volume
docker run --rm -v mydb:/data -v "$(pwd):/backup" alpine \
  tar czf /backup/mydb-backup.tar.gz -C /data .

Docker Networking

# List networks
docker network ls
# Default networks:
# bridge: default for containers on the same host (no name-based DNS between containers)
# host: shares the host network stack (no isolation)
# none: no network
# Create custom network (containers can talk by name)
docker network create myapp-network
# Connect container to network
docker run -d --network myapp-network --name db -e POSTGRES_PASSWORD=mypass postgres
docker run -d --network myapp-network --name app myapp
# "app" container can reach "db" container by hostname "db"
# Inspect network
docker network inspect myapp-network

Docker Compose Patterns

Compose defines and runs a multi-container app as a single unit.

# docker-compose.yml
# The top-level "version" key is obsolete in current Compose and can be dropped.
version: '3.8'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      DATABASE_URL: postgres://myuser:mypass@db:5432/mydb
      REDIS_URL: redis://redis:6379
    depends_on:
      db:
        condition: service_healthy # wait for health check
    volumes:
      - ./src:/app/src # dev: hot reload
    networks:
      - backend
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: myuser
      POSTGRES_PASSWORD: mypass
      POSTGRES_DB: mydb
    volumes:
      - pgdata:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U myuser"]
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - backend
  redis:
    image: redis:7-alpine
    networks:
      - backend
volumes:
  pgdata:
networks:
  backend:

# Start all services
docker compose up -d
# Follow logs
docker compose logs -f
docker compose logs -f app # specific service
# Run a one-off command
docker compose exec app bash
docker compose run --rm app python manage.py migrate
# Stop and remove containers
docker compose down
# Stop and remove containers + volumes (WARNING: data loss)
docker compose down -v
# Rebuild images
docker compose build
docker compose up -d --build

Common Failures

Container Exits Immediately

# Check exit code
docker ps -a | grep myapp
# STATUS: Exited (1) 3 minutes ago
# Check logs
docker logs myapp
docker logs --tail 50 myapp
# Debug interactively: override ENTRYPOINT to get a shell
docker run -it --entrypoint /bin/sh myimage
# Use bash instead if the image ships it
docker run -it --entrypoint /bin/bash myimage

Common causes:

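  • Application crashed on startup (check logs)
  • Missing environment variable, leading to a panic or exception
  • Entrypoint script finishes (and the container exits with it) because the main process was started in the background instead of being the last, foreground command
  • PID 1 problem: if the entrypoint is a shell script that does not exec the main process, the shell stays PID 1 and does not forward signals, so docker stop waits for the timeout and then kills the container

#!/bin/sh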
# In entrypoint.sh β€” always exec the main process
exec python3 server.py "$@"
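The effect of exec can be checked without Docker. The sketch below compares the PID of a wrapper shell with the PID of the command it launches; the two function names are made up for this illustration.

```shell
#!/bin/sh
# With exec: the wrapper shell is replaced by the command, so both lines
# show the same PID (inside a container, that PID would be 1).
pids_with_exec() {
  sh -c 'echo $$; exec sh -c "echo \$\$"'
}

# Without exec: the wrapper shell stays alive as the parent and the command
# runs as a child with a different PID. The trailing ":" keeps the child from
# being the last command, which some shells would silently exec.
pids_without_exec() {
  sh -c 'echo $$; sh -c "echo \$\$"; :'
}

echo "with exec:";    pids_with_exec
echo "without exec:"; pids_without_exec
```

In a container, the "without exec" case means signals from docker stop reach the wrapper shell rather than the application.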

Port Not Exposed

# Check what port the container is listening on INSIDE
docker exec myapp ss -tlnp
# Check port mapping
docker port myapp
# Verify the container is publishing the port
docker inspect myapp | grep PortBindings
# Fix: publish the port at run time (EXPOSE in a Dockerfile is documentation only)
docker run -p 8080:8080 myimage # host:container

Env Misconfiguration

# Check what env vars are set in running container
docker exec myapp env
docker exec myapp printenv DATABASE_URL
# Inspect environment from outside
docker inspect myapp | grep -A10 '"Env"'
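To surface missing variables early instead of crashing somewhere deep in the application, an entrypoint script can validate its environment first. A minimal sketch follows; `require_env` is a hypothetical helper, and it treats a variable as present if it is exported, even when its value is empty.

```shell
#!/bin/sh
# Hypothetical helper: report every required variable that is not set
# in the environment, and return non-zero if any is missing.
require_env() {
  missing=0
  for name in "$@"; do
    if ! printenv "$name" >/dev/null; then
      echo "missing required env var: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage at the top of an entrypoint script (variable names taken from the
# examples in this document):
# require_env DATABASE_URL REDIS_URL || exit 1
```

Failing before the application starts puts a clear message at the top of docker logs.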

Dependency Not Ready

# Use health checks in Compose (not just depends_on)
# Without health checks, depends_on only waits for container start,
# not for the service to be ready
# Alternative: use a wait script
while ! nc -z db 5432; do sleep 1; done
# Or use wait-for-it.sh / dockerize
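The plain loop above waits forever if the dependency never comes up. A bounded variant might look like the sketch below; `wait_for` is a hypothetical helper, and the one-second retry interval is an arbitrary choice.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it succeeds,
# giving up after a fixed number of attempts (one second apart).
#   $1 = maximum attempts, remaining args = the check command
wait_for() {
  attempts=$1
  shift
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Sleep only if another attempt is coming.
    if [ "$attempts" -gt 0 ]; then
      sleep 1
    fi
  done
  return 1
}

# Usage in an entrypoint script (host and port from the loop above):
# wait_for 30 nc -z db 5432 || { echo "db not reachable" >&2; exit 1; }
```

Exiting with an error after the budget runs out lets a restart policy or orchestrator take over.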