Docker Guide
1. What Docker Is
Docker packages an application with everything it needs to run -- the system libraries, the runtime, the dependencies, and the configuration -- into a single unit called a container. A container runs the same way on your laptop, on a test server, and in production. If you can run it on your machine, you can run it anywhere.
This is different from a virtual machine. A VM emulates an entire computer, including the operating system kernel. A container shares the host's kernel but isolates the application's environment. Containers start in seconds rather than minutes and use far less memory.
The core promise: environmental reproducibility. "Works on my machine" becomes "works in this container," and that's a statement you can verify on any machine that runs Docker.
2. Dockerfile Basics
A Dockerfile is a text file that describes how to build a container image. Each line is an instruction:
- FROM -- the starting point. Every Dockerfile begins with a base image. FROM node:20-slim starts with a minimal Node.js 20 environment.
- COPY -- copies files from your project into the image. COPY package.json . puts your package.json inside the container.
- RUN -- executes a command during the build. RUN npm ci installs your dependencies inside the container.
- EXPOSE -- documents which port the application listens on. EXPOSE 3000 tells anyone reading the Dockerfile that the app serves on port 3000. It doesn't actually open the port -- that happens when you run the container.
- CMD -- the command that runs when the container starts. CMD ["node", "server.js"] starts your application.
A simple Dockerfile for a Node.js backend:
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]
3. The Layer Model
Every instruction in a Dockerfile creates a layer. Layers stack on top of each other to form the final image.
Docker caches layers. If an instruction hasn't changed since the last build, Docker reuses the cached layer instead of rebuilding it. This is why the order of instructions matters:
- COPY package*.json ./ then RUN npm ci -- the dependency layer only rebuilds when package.json changes
- COPY . . after the dependency install -- changing application code doesn't trigger a dependency reinstall
If you put COPY . . before RUN npm ci, every code change would rebuild the dependency layer too. Same result, much slower.
Each layer adds to the image size. Installing build tools, copying source code, and running tests all contribute. A layer that installs gcc and make for a native module adds hundreds of megabytes that stay in the final image even if you only needed them during the build. This is the awareness that leads to multi-stage builds later -- but for now, keep your layers intentional and know that each one costs space.
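You can see what each layer costs once you've built and tagged an image (my-app here, matching the build command later in this guide):

```shell
# Show every layer of the image, the instruction that created it, and its size
docker history my-app

# Add --no-trunc to see the full instruction text instead of a shortened form
docker history --no-trunc my-app
```

A layer created by an instruction that installs build tools will show up here with its full size -- a quick way to spot where the megabytes went.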
4. Base Images
The base image is the foundation of your container. For Node.js applications, common choices:
- node:20 -- full Debian-based image with build tools, utilities, and system libraries. ~350MB. Has everything, including things you probably don't need.
- node:20-slim -- stripped-down Debian image with Node.js and essential libraries. ~180MB. Good default for most applications.
- node:20-alpine -- Alpine Linux-based image. ~50MB. Smallest option but uses musl libc instead of glibc, which can cause compatibility issues with some native modules.
Pin the version. FROM node:latest or FROM node:20 uses a floating tag -- the image you get today might be different from the one you get next month. FROM node:20-slim is better but still floats within minor and patch versions. For production use, pin the specific version: FROM node:20.11-slim. For learning, node:20-slim is fine.
Never use latest. It's the default tag if you don't specify one. It means "whatever the maintainer decided to publish most recently." Your Dockerfile might build differently tomorrow than it does today, and you won't know why.
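If you want a base image that can never change underneath you, pin by digest rather than by tag. Docker can tell you the digest a tag currently resolves to:

```shell
# Pull the tag, then ask Docker which content-addressed digest it resolved to
docker pull node:20-slim
docker inspect --format '{{index .RepoDigests 0}}' node:20-slim
```

You can then use that digest in the Dockerfile as FROM node:20-slim@sha256:... (with the actual digest the command above prints -- the sha256 value is specific to the image you pulled). A digest-pinned FROM ignores any future movement of the tag.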
5. Build Context and .dockerignore
When you run docker build, Docker sends the entire build context (the directory you're building from) to the Docker daemon. Everything in that directory becomes available to COPY instructions.
This means Docker also sends files you don't want in the image:
- .git/ -- your entire Git history, potentially hundreds of megabytes
- node_modules/ -- your local dependencies, which you're reinstalling inside the container anyway
- .env -- your environment variables, which might contain secrets
A .dockerignore file tells Docker what to exclude from the build context:
node_modules
.git
.env
.env.*
*.log
.DS_Store
Without a .dockerignore, Docker copies everything. Your image gets bigger, builds get slower, and you might accidentally bake secrets into the image.
6. Running Containers
Building an image:
docker build -t my-app .
The -t flag gives the image a name (tag). The . says "use the current directory as the build context."
Running a container:
docker run -p 3000:3000 my-app
The -p 3000:3000 maps port 3000 on your machine to port 3000 inside the container. Without this, the container runs but you can't reach it.
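The two sides of -p don't have to match. If port 3000 is already taken on your machine, map a different host port to the container's port 3000:

```shell
# host port 8080 -> container port 3000
docker run -p 8080:3000 my-app

# the app still listens on 3000 inside the container; from outside you use 8080
curl http://localhost:8080
```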
Passing environment variables:
docker run -p 3000:3000 -e DATABASE_URL=postgres://... my-app
The -e flag passes environment variables into the container. The application inside reads them as usual (process.env.DATABASE_URL).
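For more than a couple of variables, repeating -e gets unwieldy. Docker can read them from a file with --env-file -- which pairs naturally with the .env file you excluded from the build context:

```shell
# .env contains KEY=value pairs, one per line
docker run -p 3000:3000 --env-file .env my-app
```

This keeps secrets out of the image and out of your shell history, while still delivering them to the running container.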
Running in the background:
docker run -d -p 3000:3000 my-app
The -d flag runs the container in detached mode -- it starts and returns you to the terminal.
Stopping a container:
docker stop <container_id>
Starting a stopped container:
docker start <container_id>
Listing running containers:
docker ps
Checking image sizes:
docker image ls
Running a command inside a running container:
docker exec <container_id> whoami
This runs whoami inside the container. Useful for checking which user the container runs as.
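Two more commands worth knowing when poking at a running container (as above, <container_id> is whatever docker ps shows):

```shell
# Stream the container's stdout/stderr -- whatever your app logs
docker logs -f <container_id>

# Open an interactive shell inside the container
# (-i keeps stdin open, -t allocates a terminal; use sh if bash isn't installed)
docker exec -it <container_id> sh

# Remove a stopped container you no longer need
docker rm <container_id>
```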