Step 1: Understand containers and images
Open materials/docker-guide.md.
A container packages your code, runtime, libraries, and system tools into a single unit that runs the same everywhere. It is not a virtual machine -- it shares the host's kernel but isolates everything else. When you say "it works in a container," you are making a guarantee about the environment, not just the code.
An image is the blueprint. A container is a running instance of it. You build images from Dockerfiles -- layer-by-layer instructions. The guide covers the key Dockerfile commands: FROM, COPY, RUN, EXPOSE, CMD. Read through them. Each one adds a layer to the image.
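The five commands can be seen together in a minimal sketch. This is illustrative only, not the Dockerfile you will build in Step 4; the file names (requirements.txt, serve.py) are assumptions borrowed from the project described later.

```dockerfile
# FROM sets the base image: the first layer everything else builds on
FROM python:3.11.8-slim

# COPY brings files from your project directory into the image
COPY requirements.txt .

# RUN executes a command at build time; its result is baked into a layer
RUN pip install -r requirements.txt

# EXPOSE documents which port the app listens on
EXPOSE 8000

# CMD is the default command when a container starts from this image
CMD ["python", "serve.py"]
```

Each instruction produces one layer; Docker caches layers, so ordering stable instructions (like installing dependencies) before frequently changing ones (like copying source code) speeds up rebuilds.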
Step 2: Understand why pinning and multi-stage builds matter
The Docker guide explains two critical concepts.
Pinning: FROM python:3 means "whatever Python 3 is today." If the base image updates next month with a breaking change, your build breaks even though you changed nothing. FROM python:3.11.8-slim means "exactly this version, always." The -slim variant excludes development tools you do not need for serving.
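The contrast looks like this in a Dockerfile (two alternative fragments, not one file):

```dockerfile
# Unpinned: resolves to whatever image the python:3 tag points at today
FROM python:3

# Pinned: the same base version on every build
FROM python:3.11.8-slim
```

Note that even a specific tag can be re-pushed by the maintainer; pinning by image digest (FROM python@sha256:<digest>) is the only fully immutable option, but a specific version tag like 3.11.8-slim is the usual practical middle ground.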
Multi-stage builds: The build stage needs compilers, build tools, and full PyTorch. The serving stage only needs the inference runtime, the trained model, and FastAPI. A single-stage build ships everything into the serving image. A multi-stage build starts fresh from a slim base and copies only serving artifacts from the build stage.
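A sketch of the two-stage shape, assuming the files named in Step 4 and a uvicorn entrypoint (the module path serve:app is an assumption about how serve.py is structured):

```dockerfile
# Stage 1: build environment -- install dependencies with full tooling
FROM python:3.11.8-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Stage 2: serving image -- starts fresh from the slim base
FROM python:3.11.8-slim
WORKDIR /app
# Copy only the installed packages, not the build tooling
COPY --from=build /install /usr/local
# Copy only the serving artifacts
COPY serve.py feature_pipeline.py model.pt sensor-schema.json ./
EXPOSE 8000
CMD ["uvicorn", "serve:app", "--host", "0.0.0.0", "--port", "8000"]
```

Everything installed in stage 1 that is not explicitly copied with --from=build is discarded, which is what keeps the serving image small.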
Step 3: Talk to Marcus Webb
Open a chat with Marcus Webb. He's a senior ML infrastructure engineer who's deployed models in production.
Ask Marcus about container design for ML serving. He'll talk about separating build environments from serving environments, why serving images should be small, and what belongs in the container versus what should stay out. His perspective is practical -- he's seen the consequences of bloated containers and unpinned dependencies.
Step 4: Create an initial Dockerfile
Direct Claude to create a Dockerfile for the serving endpoint:
Create a Dockerfile for the serving endpoint. The endpoint uses serve.py with feature_pipeline.py, model.pt, requirements.txt, and sensor-schema.json. The base image should be pinned. Use a multi-stage build: the first stage installs all dependencies, the second stage copies only what's needed for serving. Also create a .dockerignore that excludes notebooks, training data, training code, git history, virtual environments, and documentation.
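A .dockerignore matching those exclusions might look like the following; the directory names are assumptions about the project layout, so adjust them to what actually exists in your repository:

```
# Hypothetical .dockerignore -- exclude everything the serving image does not need
.git/
notebooks/
*.ipynb
data/
training/
venv/
.venv/
docs/
__pycache__/
```

Anything matched here never enters the build context, so a careless COPY . . cannot accidentally pull it into the image.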
Step 5: Review AI's Dockerfile
Read the generated Dockerfile carefully.
AI commonly generates Dockerfiles with several problems: unpinned base images (FROM python:3 instead of a specific version), single-stage builds that include everything, COPY commands that bring in the entire project directory, and no .dockerignore. Check for each of these.
If you want a second opinion, ask a different model to review the Dockerfile. Different models have different failure modes. A fresh context can catch issues that the generating model normalized.
If the Dockerfile copies training code, notebooks, or datasets into the serving image, that's a problem. The serving image needs: serve.py, feature_pipeline.py, model.pt, requirements.txt, sensor-schema.json, and the Python runtime. Nothing else.
Step 6: Build, run, and test the container
Build the Docker image:
docker build -t finca-serving .
The build output should show both stages completing. Watch for the slim base image in the second stage's FROM line.
Run the container:
docker run -p 8000:8000 finca-serving
Test from a separate terminal:
curl http://localhost:8000/health
curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"farm_id": "farm_01", "temperature": 22.5, "rainfall": 15.0, "soil_moisture": 45.0, "humidity": 72.0, "altitude": 1650.0}'
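The same request can be made from Python using only the standard library; a sketch assuming the endpoint and payload shown in the curl command above:

```python
import json
import urllib.request

# Payload with the same fields as the curl example.
payload = {
    "farm_id": "farm_01",
    "temperature": 22.5,
    "rainfall": 15.0,
    "soil_moisture": 45.0,
    "humidity": 72.0,
    "altitude": 1650.0,
}

def predict(payload, url="http://localhost:8000/predict"):
    """POST the payload as JSON and return the decoded response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

This is handy for scripting repeated checks against the running container without shell quoting issues.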
If either request against the container fails, the error reveals an assumption your local environment was hiding: a missing system dependency, a hardcoded file path, an environment variable that existed on your machine but not in the container. The container does not hide problems. It surfaces them.
Check: curl http://localhost:8000/health returns a successful response from the container, and curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"farm_id": "farm_01", ...}' returns a prediction.