
CLAUDE.md

Finca Esperanza -- Production Delivery

Client

Valentina Reyes, Owner and Export Director at Finca Esperanza Exports. Specialty coffee exporter in Neiva, Colombia. She needs the yield prediction model from the previous project running as a proper system that her data person Carlos can operate independently.

What you're building

A production-ready ML system for coffee yield prediction. The model already works -- you're making it deliverable. Three components: a scripted training pipeline (replacing the notebook), a Docker-containerized serving endpoint (replacing "it works on my machine"), and prediction logging (replacing "I don't know what it predicted").

Tech stack

  • Python 3.11
  • PyTorch (model inference)
  • FastAPI + uvicorn (serving endpoint)
  • Docker (containerization)
  • YAML (pipeline configuration)
  • curl (endpoint testing)
  • Git/GitHub (version control)

File structure

.
├── CLAUDE.md                  # This file
├── config.yaml                # Pipeline configuration (you create)
├── train.py                   # Pipeline entry point (you create)
├── data_loader.py             # Data loading module (you create)
├── feature_engineer.py        # Feature engineering module (you create)
├── trainer.py                 # Training module (you create)
├── evaluator.py               # Evaluation module (you create)
├── serve.py                   # Serving endpoint (provided, you modify)
├── feature_pipeline.py        # Feature processing (provided)
├── model.pt                   # Trained model (provided)
├── requirements.txt           # Python dependencies (provided)
├── sensor-schema.json         # Input schema (provided)
├── Dockerfile                 # Container definition (you create)
├── .dockerignore              # Build context exclusions (you create)
├── logs/                      # Prediction logs (created at runtime)
│   └── predictions.jsonl      # Structured prediction log
└── materials/
    ├── valentina-followup.md  # Valentina's email
    ├── tickets.md             # Work breakdown
    ├── pipeline-template.md   # Pipeline guide
    ├── docker-guide.md        # Docker concepts
    └── logging-spec-template.md  # Logging schema template

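The modules above wire together in train.py roughly like this. This is a minimal sketch, not the required design: the stage functions here are inline stand-ins for the real data_loader / feature_engineer / trainer / evaluator modules, and their names and signatures are assumptions until you define them.

```python
# train.py -- pipeline skeleton (sketch: each stage function below is a
# stand-in for the corresponding module listed in the file structure)
import argparse

def load_data(cfg):              # stand-in for data_loader
    return list(range(cfg.get("rows", 10)))

def build_features(rows, cfg):   # stand-in for feature_engineer
    return [r * cfg.get("scale", 1.0) for r in rows]

def train_model(feats, cfg):     # stand-in for trainer
    return {"weights": sum(feats) / len(feats)}

def evaluate(model, feats, cfg): # stand-in for evaluator
    return {"mae": abs(model["weights"] - feats[0])}

def run_pipeline(config: dict) -> dict:
    """Run every stage in order, passing config sections to each module."""
    rows = load_data(config.get("data", {}))
    feats = build_features(rows, config.get("features", {}))
    model = train_model(feats, config.get("training", {}))
    return evaluate(model, feats, config.get("evaluation", {}))

def main() -> None:
    parser = argparse.ArgumentParser(description="Coffee yield training pipeline")
    parser.add_argument("--config", required=True, help="Path to config.yaml")
    args = parser.parse_args()
    import yaml  # third-party: pyyaml, listed in requirements.txt
    with open(args.config) as f:
        print(run_pipeline(yaml.safe_load(f)))

# In the real train.py, invoke main() under an `if __name__ == "__main__":` guard.
```

Keeping each stage behind a small function boundary is what makes the notebook-to-script conversion (T2) operable by Carlos: he can rerun `python train.py --config config.yaml` without touching the code.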
Key materials

  • tickets.md -- work breakdown with 5 tickets (T1-T5)
  • pipeline-template.md -- modular pipeline structure guide
  • docker-guide.md -- Docker concepts and commands reference
  • logging-spec-template.md -- logging schema design template
  • model-artifacts/ -- starting model files from previous project

Tickets

  • T1: Project setup and model artifact review
  • T2: Notebook-to-script pipeline conversion
  • T3: Docker containerization of serving endpoint
  • T4: Prediction logging and health monitoring
  • T5: End-to-end verification and delivery
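For T4, the prediction log can be a small append-only helper called from the /predict handler in serve.py. This is a sketch: the function name and field names are assumptions, and the authoritative schema comes from logging-spec-template.md.

```python
# Sketch of a JSONL prediction logger for T4 (names are assumptions;
# the real schema is defined in materials/logging-spec-template.md)
import json
import time
from datetime import datetime, timezone
from pathlib import Path

LOG_PATH = Path("logs/predictions.jsonl")

def log_prediction(features: dict, output: float, confidence: float,
                   started: float, path: Path = LOG_PATH) -> dict:
    """Append one structured entry per prediction, one JSON object per line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_features": features,
        "output": output,
        "confidence": confidence,
        "response_time_ms": round((time.perf_counter() - started) * 1000, 2),
    }
    path.parent.mkdir(parents=True, exist_ok=True)  # logs/ is created at runtime
    with path.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```

In the /predict handler you would capture `started = time.perf_counter()` before inference and call `log_prediction(...)` after, so the logged response time covers the full request.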

Verification targets

  • python train.py --config config.yaml runs end-to-end without errors
  • curl http://localhost:8000/health returns 200 from containerized endpoint
  • curl -X POST http://localhost:8000/predict with a JSON body matching sensor-schema.json returns a prediction from the containerized endpoint
  • Prediction log contains timestamp, input features, output, confidence, response time per entry
  • Health endpoint verifies that the model is loaded and that dependencies are available
  • Full pipeline-to-container-to-prediction-to-log sequence works end-to-end
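The per-entry log requirement above is easy to check mechanically. Here is a sketch of a validator for logs/predictions.jsonl; the field names mirror the targets listed above but should be adjusted to whatever schema you settle on in logging-spec-template.md.

```python
# Sketch: validate logs/predictions.jsonl against the required fields
# (field names are assumptions mirroring the verification targets)
import json
from pathlib import Path

REQUIRED = {"timestamp", "input_features", "output", "confidence", "response_time_ms"}

def check_log(path: Path) -> int:
    """Return the number of valid entries; raise if any entry is malformed."""
    count = 0
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        entry = json.loads(line)  # each line must be a standalone JSON object
        missing = REQUIRED - entry.keys()
        if missing:
            raise ValueError(f"line {lineno} missing fields: {sorted(missing)}")
        count += 1
    return count
```

Running a check like this as the last step of T5 turns "the log looks right" into a pass/fail signal Carlos can rerun himself.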

Commit convention

Commit after completing each ticket. Use descriptive messages, e.g. "feat: convert training notebook to modular pipeline", "feat: add multi-stage Dockerfile for serving endpoint", "feat: add prediction logging middleware". Push to GitHub when the full system is verified.