Step 1: Set up the project
Open a terminal, navigate to your dev directory, and start Claude Code.
cd ~/dev
claude
Paste this setup prompt:
Create the folder ~/dev/ml/p5. Download the project materials from https://learnbydirectingai.dev/materials/ml/p5/materials.zip and extract them into that folder. Read CLAUDE.md -- it's the project governance file.
Claude downloads the materials, extracts them, and reads the governance file. Once it finishes, you have a project workspace with everything you need.
Step 2: Read Valentina's email
Open materials/valentina-followup.md.
Valentina is back. The yield predictions worked -- her Copenhagen roaster was impressed. But two problems surfaced. Carlos, her data person, tried to run the notebook and got different results. Something about Python versions or packages. And Valentina has no record of what the model predicted versus what actually happened.
The model works. The problem is that it only works on your machine. Carlos has a different setup, different packages, a different Python version. The same code producing different results on different machines is exactly the kind of invisible dependency that local development hides.
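A quick way to see the divergence concretely is to capture an environment fingerprint on both machines and diff the output. This is a minimal sketch, not part of the project materials -- the package list here is an assumption about what the model likely depends on.

```python
# Sketch: capture the environment details that cause "works on my machine"
# divergence. Run this on your machine and on Carlos's, then compare.
import platform
import sys
from importlib import metadata

def environment_report(packages=("torch", "fastapi", "numpy")):
    """Return a dict describing the interpreter and key package versions."""
    report = {
        "python": sys.version.split()[0],   # e.g. "3.11.4"
        "platform": platform.platform(),    # OS and architecture
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report

if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Any line that differs between the two reports is a candidate explanation for Carlos's different results.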
Step 3: Talk to Valentina
Open a chat with Valentina. She's available to clarify what "running properly" means to her.
Some things to ask about: what Carlos tried, what system he's using, what timeline she has in mind, and what she means by "a record of predictions." Her answers will sharpen your understanding of the delivery target. She's practical and direct -- ask specific questions and she'll give you specific answers.
Step 4: Review the model artifacts
Look through materials/model-artifacts/. This is the starting point -- the model output from P4, provided as a baseline so everyone starts from the same place.
Five files:
- serve.py -- the FastAPI serving endpoint with /health and /predict routes. This is the existing API that runs the model.
- feature_pipeline.py -- transforms raw sensor input (temperature, rainfall, soil moisture, humidity, altitude) into the features the model expects. Normalizes values and creates derived features.
- model.pt -- the trained PyTorch model. A simple feedforward network.
- requirements.txt -- Python dependencies with pinned versions.
- sensor-schema.json -- defines the expected input fields and valid ranges for each sensor reading.
Read through serve.py and feature_pipeline.py. Understand how a prediction request flows: raw sensor data comes in, feature_pipeline.py transforms it, the model produces a yield estimate.
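The flow can be sketched end to end in a few lines. This is illustrative only -- the function names, normalization constants, and the stand-in linear model below are assumptions, not the actual contents of feature_pipeline.py or model.pt.

```python
# Hedged sketch of the request flow: raw sensor JSON -> feature
# transformation -> model -> yield estimate. Names and numbers are
# illustrative, not taken from the project files.
import json

def transform_features(raw):
    # Stand-in for feature_pipeline.py: scale each sensor reading
    # into a rough 0-1 range so the model sees comparable magnitudes.
    return [
        raw["temperature"] / 40.0,
        raw["rainfall"] / 100.0,
        raw["soil_moisture"] / 100.0,
        raw["humidity"] / 100.0,
        raw["altitude"] / 3000.0,
    ]

def predict(raw):
    features = transform_features(raw)
    # Stand-in for the trained PyTorch model: a fixed linear combination.
    weights = [120.0, 80.0, 60.0, 40.0, 30.0]
    yield_kg = sum(w * f for w, f in zip(weights, features))
    return {"farm_id": raw["farm_id"], "predicted_yield_kg": round(yield_kg, 1)}

request = {"farm_id": "farm_01", "temperature": 22.5, "rainfall": 15.0,
           "soil_moisture": 45.0, "humidity": 72.0, "altitude": 1650.0}
print(json.dumps(predict(request)))
```

The real serve.py wires this same flow behind FastAPI routes; the point of the sketch is the shape of the data at each stage, not the math.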
Step 5: Verify the endpoint runs locally
Install the dependencies and start the endpoint:
pip install -r materials/model-artifacts/requirements.txt
cd materials/model-artifacts && python serve.py
In a separate terminal, test both endpoints:
curl http://localhost:8000/health
curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"farm_id": "farm_01", "temperature": 22.5, "rainfall": 15.0, "soil_moisture": 45.0, "humidity": 72.0, "altitude": 1650.0}'
The health endpoint should return a status response. The predict endpoint should return a JSON object with farm_id and predicted_yield_kg. If either fails, check the error messages -- missing dependencies or a wrong path to model.pt are the usual culprits.
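If you prefer testing from Python instead of curl, the same two requests can be made with only the standard library. This assumes the server from this step is running on localhost:8000; the helper names are mine, not part of the materials.

```python
# Alternative to curl: exercise /health and /predict using only the
# standard library. Assumes serve.py is already running on port 8000.
import json
from urllib import request as urlrequest

BASE = "http://localhost:8000"

def get_health():
    with urlrequest.urlopen(f"{BASE}/health") as resp:
        return json.load(resp)

def build_predict_request(payload):
    # Build the POST request with a JSON body and the right content type.
    return urlrequest.Request(
        f"{BASE}/predict",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def post_predict(payload):
    with urlrequest.urlopen(build_predict_request(payload)) as resp:
        return json.load(resp)

if __name__ == "__main__":
    print(get_health())
    print(post_predict({"farm_id": "farm_01", "temperature": 22.5,
                        "rainfall": 15.0, "soil_moisture": 45.0,
                        "humidity": 72.0, "altitude": 1650.0}))
```

A script like this also becomes reusable later for the end-to-end verification in T5, where curl one-liners get unwieldy.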
Step 6: Review the ticket breakdown
Open materials/tickets.md. Five tickets cover the full scope:
- T1: Project setup and model artifact review (what you just finished)
- T2: Notebook-to-script pipeline conversion
- T3: Docker containerization of serving endpoint
- T4: Prediction logging and health monitoring
- T5: End-to-end verification and delivery
The sequence matters. The pipeline comes first because you need a scripted training process before you can containerize. Containerization comes before logging because the serving endpoint needs to be stable before you add monitoring to it. Delivery comes last because it integrates everything.
Check: The student can run python serve.py locally and get a successful response from both the /health and /predict endpoints using curl.