tickets.md

Tickets

T1: Project setup and model artifact review

Set up the project workspace and verify the existing model artifacts work locally.

What to do:

  • Download and extract the project materials
  • Read Valentina's follow-up email to understand the operational problem
  • Talk to Valentina to clarify what she needs
  • Review the model artifacts (serve.py, feature_pipeline.py, model.pt, requirements.txt, sensor-schema.json)
  • Run the serving endpoint locally and confirm it responds

Acceptance criteria:

  • Project workspace set up with all materials present
  • python serve.py starts the endpoint on port 8000
  • curl http://localhost:8000/health returns a successful response
  • curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"farm_id": "farm_01", "temperature": 22.5, "rainfall": 15.0, "soil_moisture": 45.0, "humidity": 72.0, "altitude": 1650.0}' returns a prediction
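To sanity-check the endpoint, it helps to know what a valid payload looks like. The sketch below is a hypothetical version of the input validation the /predict handler might perform — the field names are taken from the curl example above, and the real sensor-schema.json may define them differently:

```python
# Hypothetical input validation for the /predict payload. Field names come
# from the curl example in this ticket; sensor-schema.json is authoritative.
REQUIRED_FIELDS = {
    "farm_id": str,
    "temperature": float,
    "rainfall": float,
    "soil_moisture": float,
    "humidity": float,
    "altitude": float,
}

def validate_payload(payload: dict) -> list[str]:
    """Return a list of validation errors (empty if the payload is valid)."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif expected is float and not isinstance(payload[name], (int, float)):
            errors.append(f"field {name} must be numeric")
        elif expected is str and not isinstance(payload[name], str):
            errors.append(f"field {name} must be a string")
    return errors
```

A payload that validates cleanly here should be the same one the acceptance-criteria curl command sends.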

T2: Notebook-to-script pipeline conversion

Convert the training logic from notebook-style code into a modular scripted pipeline with YAML configuration.

What to do:

  • Read the pipeline template to understand modular pipeline structure
  • Create pipeline modules: data_loader.py, feature_engineer.py, trainer.py, evaluator.py
  • Create config.yaml with hyperparameters, data paths, and pipeline settings externalized
  • Create train.py as the entry point that reads config and orchestrates the modules
  • Review AI output for monolithic tendencies and hardcoded values
  • Run the pipeline and verify it produces model output
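The module boundaries above can be sketched as plain functions with explicit inputs and outputs. This is illustrative only — the function names, the config shape, and the stand-in data are assumptions, and the real modules would wrap the notebook's actual data and PyTorch logic:

```python
# Illustrative module boundaries for the scripted pipeline. Names and the
# config shape are assumptions; the stand-in records exist only so the
# composition in run_pipeline() is concrete.

def load_data(config):                    # data_loader.py
    """Read raw records (stand-in data here) and return them."""
    return [{"temperature": 22.5, "rainfall": 15.0}]

def engineer_features(records, config):   # feature_engineer.py
    """Turn raw records into model-ready feature vectors."""
    return [[r["temperature"], r["rainfall"]] for r in records]

def train_model(features, config):        # trainer.py
    """Fit a model on the features; a stub summary stands in for training."""
    return {"n_samples": len(features)}

def evaluate(model, features, config):    # evaluator.py
    """Score the trained model and return metrics."""
    return {"n_samples_seen": model["n_samples"]}

def run_pipeline(config):                 # train.py orchestration
    records = load_data(config)
    features = engineer_features(records, config)
    model = train_model(features, config)
    return evaluate(model, features, config)
```

The point of the shape is that each module can be tested in isolation and train.py is only composition plus config handling.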

Acceptance criteria:

  • Four pipeline modules exist with clear input/output interfaces
  • config.yaml contains all hyperparameters and paths (nothing hardcoded in scripts)
  • python train.py --config config.yaml runs end-to-end without errors
  • Pipeline produces model output files equivalent to those from the original notebook code
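One possible shape for the externalized configuration is sketched below. Every key name and value here is an assumption for illustration, not the project's actual schema — the requirement is only that hyperparameters and paths live in the file, not in the scripts:

```yaml
# Illustrative config.yaml shape -- key names and values are assumptions.
data:
  train_path: data/train.csv
  target_column: yield_kg
features:
  numeric: [temperature, rainfall, soil_moisture, humidity, altitude]
model:
  hidden_size: 64
  learning_rate: 0.001
  epochs: 50
  seed: 42
output:
  model_path: artifacts/model.pt
```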

T3: Docker containerization of serving endpoint

Containerize the serving endpoint so it runs consistently on any machine.

What to do:

  • Read the Docker guide to understand containers, images, and Dockerfiles
  • Ask Marcus Webb about container design for ML serving
  • Create a Dockerfile for the serving endpoint
  • Review for common problems: unpinned base image, unnecessary dependencies, missing .dockerignore
  • Implement a multi-stage Dockerfile separating build from serving dependencies
  • Build the image and run the container
  • Test the containerized endpoint with curl

Acceptance criteria:

  • Dockerfile uses pinned base image (e.g., python:3.11.8-slim)
  • Multi-stage build separates build dependencies from serving dependencies
  • .dockerignore excludes unnecessary files (notebooks, datasets, training code)
  • docker build -t finca-serving . completes without errors
  • docker run -p 8000:8000 finca-serving starts the container
  • curl http://localhost:8000/health returns a successful response from the container
  • curl -X POST http://localhost:8000/predict ... returns prediction from the container
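A minimal sketch of the multi-stage layout the criteria describe is below. The file list, the requirements split, and the entry point are assumptions about this repo rather than verified facts:

```dockerfile
# Sketch of a multi-stage Dockerfile; copied paths are assumptions.

# Stage 1: build -- install dependencies into an isolated prefix
FROM python:3.11.8-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt

# Stage 2: serve -- ship only installed packages and serving code
FROM python:3.11.8-slim
WORKDIR /app
COPY --from=build /install /usr/local
COPY serve.py feature_pipeline.py model.pt sensor-schema.json ./
EXPOSE 8000
CMD ["python", "serve.py"]
```

Pairing this with a .dockerignore that excludes notebooks, datasets, and training code keeps the build context and final image small.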

T4: Prediction logging and health monitoring

Add prediction logging and health monitoring so every prediction is recorded and system status is visible.

What to do:

  • Read the logging spec template to understand required fields
  • Design the logging schema: what to capture per prediction
  • Implement prediction logging middleware in serve.py
  • Review AI logging for missing metadata (confidence scores, timezone, feature distributions)
  • Design health check logic that verifies the model is loaded and dependencies are available
  • Update the /health endpoint with substantive checks
  • Run predictions and verify logs capture everything
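
A substantive health check can be sketched as a function that runs named checks and reports a degraded status when any fail. The check names and the model_path default here are illustrative assumptions, not the project's API:

```python
import os

def health_check(model, model_path="model.pt"):
    """Return a health report instead of a bare {"status": "ok"}.
    Check names and the model_path default are illustrative assumptions."""
    checks = {
        "model_loaded": model is not None,
        "model_file_present": os.path.exists(model_path),
    }
    status = "ok" if all(checks.values()) else "degraded"
    return {"status": status, "checks": checks}
```

The /health endpoint would return this dict, so a caller can see which check failed rather than just that something did.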

Acceptance criteria:

  • Every prediction logged with: timestamp (with timezone), request_id, farm_id, input_features, predicted_yield_kg, confidence_score, response_time_ms
  • Log file uses JSON Lines format (one JSON object per line)
  • /health endpoint verifies the model is loaded and reports dependency status (not just {"status": "ok"})
  • After 5 curl predictions, the log file contains 5 entries with all required fields
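
One way to produce entries with the required fields is a small helper that serve.py calls after each prediction — a sketch only, since the exact integration point in serve.py is an assumption:

```python
import json
import uuid
from datetime import datetime, timezone

def log_prediction(log_file, farm_id, features, predicted_yield_kg,
                   confidence_score, response_time_ms):
    """Append one JSON Lines entry with the fields the criteria require.
    How serve.py invokes this is an assumption of the sketch."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # tz-aware
        "request_id": str(uuid.uuid4()),
        "farm_id": farm_id,
        "input_features": features,
        "predicted_yield_kg": predicted_yield_kg,
        "confidence_score": confidence_score,
        "response_time_ms": response_time_ms,
    }
    log_file.write(json.dumps(entry) + "\n")  # one JSON object per line
    return entry
```

Because each line is an independent JSON object, the 5-entry check in the criteria is just counting parseable lines in the log file.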

T5: End-to-end verification and delivery

Verify the complete system works end-to-end, write documentation, and deliver to Valentina.

What to do:

  • Write a README explaining how Carlos runs the system
  • Run the full sequence: train pipeline, build container, start serving, make predictions, check logs
  • Push to GitHub with clean commit history
  • Send Valentina the delivery summary

Acceptance criteria:

  • README covers: build the container, run it, make predictions, check logs
  • Full sequence works: python train.py --config config.yaml -> docker build -> docker run -> curl predictions -> log entries verified
  • Clean Git history with descriptive commits
  • Valentina receives delivery summary