Learn by Directing AI

The Brief

Valentina Reyes is back. The yield predictions worked -- her Copenhagen roaster was impressed that she could give confident numbers, and the model beat her gut feeling on 10 of 12 farms. But her data person Carlos tried to run the notebook and got different results -- something about Python versions or packages. And when Valentina needed updated predictions after a dry spell, she had to email you and wait.

The model works. The problem is that nobody else can run it, it only exists on one machine, and there is no record of what it predicted.

Your Role

You're taking the model from P4 and making it deliverable. No new data, no new training. The work is infrastructure: convert the notebook to a scripted pipeline, containerize the serving endpoint so it runs on any machine, and add prediction logging so every prediction is recorded.

You direct Claude Code through infrastructure work you haven't done before. Docker, scripted pipelines, and prediction logging are all new terrain. Templates and guides provide structure for each -- you fill them with design decisions. The AI relationship is the same as P4: you specify structural constraints and review what Claude generates, catching unpinned dependencies, bloated containers, and missing metadata.
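For a concrete sense of what that review catches, here is a minimal sketch of a pinned, serving-only Dockerfile. It assumes the P4 artifacts (serve.py, requirements.txt) plus a hypothetical model.pt; the base image tag and filenames are placeholders, not the ticket's actual spec:

```dockerfile
# Pin the base image -- a bare "python:3" tag drifts between builds
FROM python:3.11-slim

WORKDIR /app

# Install pinned runtime dependencies first, so this layer caches
# across code changes (requirements.txt should use exact == versions)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy only what serving needs -- no notebooks, no training code
COPY serve.py model.pt ./

EXPOSE 8000
CMD ["uvicorn", "serve:app", "--host", "0.0.0.0", "--port", "8000"]
```

The review questions to ask of an AI-generated version: is every version pinned, is the base image slim, and does `COPY` pull in anything the endpoint doesn't use?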

What's New

Last time you trained a PyTorch model, encountered temporal leakage, and learned that invisible errors in the evaluation are the most dangerous kind. The model was the deliverable.

This time the model is the starting point. The gap you're closing is between "it works on my machine" and "Carlos can run it in ten minutes." Docker containers make the environment declaration explicit -- but AI generates unpinned, bloated Dockerfiles that include everything from notebooks to training code. Scripted pipelines replace notebooks with modular code and YAML configuration -- but AI generates monolithic scripts with hardcoded values. Prediction logging creates an audit trail -- but AI generates logging that captures request bodies without the metadata that makes logs useful. Each piece of infrastructure requires you to specify what AI won't get right on its own.
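AI-generated logging often stops at dumping the request body. A minimal sketch of a more useful record, in Python -- the function name, field names, and file path here are illustrative assumptions, not the P4 schema or the logging spec template:

```python
import hashlib
import json
import time
import uuid


def log_prediction(inputs: dict, prediction: float, model_version: str,
                   log_path: str = "predictions.jsonl") -> dict:
    """Append one prediction record with the metadata that makes it auditable."""
    record = {
        "request_id": str(uuid.uuid4()),  # correlate with client-side logs
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "model_version": model_version,   # which trained artifact answered
        # Hash of the canonicalized input, to spot duplicate or replayed requests
        "input_hash": hashlib.sha256(
            json.dumps(inputs, sort_keys=True).encode()).hexdigest()[:12],
        "inputs": inputs,
        "prediction": prediction,
    }
    with open(log_path, "a") as f:          # JSON Lines: one record per line
        f.write(json.dumps(record) + "\n")
    return record
```

Without the version, timestamp, and request identity, a log of raw request bodies can't answer "what did we predict for this farm, with which model, and when" -- which is the question an audit trail exists to answer.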

Tools

  • Docker -- containerization (new)
  • Python -- scripted pipeline, serving endpoint
  • FastAPI / uvicorn -- serving endpoint (familiar from P3-P4)
  • YAML -- pipeline configuration (new usage)
  • Claude Code -- AI direction
  • Git / GitHub -- version control
  • curl -- endpoint testing (familiar)
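The "new usage" of YAML is moving values out of code: anything hardcoded in the notebook becomes a named setting the pipeline reads. A sketch of what that might look like -- every key below is hypothetical, not the pipeline template's actual schema:

```yaml
# config.yaml -- settings that were previously hardcoded in the notebook
data:
  sensor_csv: data/sensors.csv
features:
  window_days: 30          # rolling window for aggregates
model:
  artifact: artifacts/model.pt
serving:
  port: 8000
```

The payoff is that Carlos can change a path or a window size without touching Python, and the config file itself documents what the pipeline depends on.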

Materials

You receive:

  • Valentina's follow-up email describing the operational problem
  • Model artifacts from P4 (serve.py, feature pipeline, trained model, requirements, sensor schema)
  • A pipeline template showing modular structure and module stubs
  • A Docker guide covering container concepts, Dockerfiles, and multi-stage builds
  • A logging spec template with fields to design and health check prompts
  • A ticket breakdown covering setup, pipeline conversion, containerization, logging, and delivery
  • A project governance file (CLAUDE.md)