tickets.md

Tickets

T1: Project setup and model artifact review

Set up the project workspace and verify the existing model artifacts work locally.

What to do:

  • Download and extract the project materials
  • Read Valentina's follow-up email to understand the operational problem
  • Talk to Valentina to clarify what she needs
  • Review the model artifacts (serve.py, feature_pipeline.py, model.pt, requirements.txt, sensor-schema.json)
  • Run the serving endpoint locally and confirm it responds

Acceptance criteria:

  • Project workspace set up with all materials present
  • python serve.py starts the endpoint on port 8000
  • curl http://localhost:8000/health returns a successful response
  • curl -X POST http://localhost:8000/predict -H "Content-Type: application/json" -d '{"farm_id": "farm_01", "temperature": 22.5, "rainfall": 15.0, "soil_moisture": 45.0, "humidity": 72.0, "altitude": 1650.0}' returns a prediction
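To sanity-check the endpoint, it helps to know what a valid payload looks like. The sketch below is a hypothetical version of the input validation the /predict handler might perform — the field names are taken from the curl example above, and the real sensor-schema.json may define them differently:

```python
# Hypothetical input validation for the /predict payload. Field names come
# from the curl example in this ticket; sensor-schema.json is authoritative.
REQUIRED_FIELDS = {
    "farm_id": str,
    "temperature": float,
    "rainfall": float,
    "soil_moisture": float,
    "humidity": float,
    "altitude": float,
}

def validate_payload(payload: dict) -> list[str]:
    """Return a list of validation errors (empty if the payload is valid)."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif expected is float and not isinstance(payload[name], (int, float)):
            errors.append(f"field {name} must be numeric")
        elif expected is str and not isinstance(payload[name], str):
            errors.append(f"field {name} must be a string")
    return errors
```

A payload that validates cleanly here should be the same one the acceptance-criteria curl command sends.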

T2: Notebook-to-script pipeline conversion

Convert the training logic from notebook-style code into a modular scripted pipeline with YAML configuration.

What to do:

  • Read the pipeline template to understand modular pipeline structure
  • Create pipeline modules: data_loader.py, feature_engineer.py, trainer.py, evaluator.py
  • Create config.yaml with hyperparameters, data paths, and pipeline settings externalized
  • Create train.py as the entry point that reads config and orchestrates the modules
  • Review AI output for monolithic tendencies and hardcoded values
  • Run the pipeline and verify it produces model output
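The module boundaries above can be sketched as plain functions with explicit inputs and outputs. This is illustrative only — the function names, the config shape, and the stand-in data are assumptions, and the real modules would wrap the notebook's actual data and PyTorch logic:

```python
# Illustrative module boundaries for the scripted pipeline. Names and the
# config shape are assumptions; the stand-in records exist only so the
# composition in run_pipeline() is concrete.

def load_data(config):                    # data_loader.py
    """Read raw records (stand-in data here) and return them."""
    return [{"temperature": 22.5, "rainfall": 15.0}]

def engineer_features(records, config):   # feature_engineer.py
    """Turn raw records into model-ready feature vectors."""
    return [[r["temperature"], r["rainfall"]] for r in records]

def train_model(features, config):        # trainer.py
    """Fit a model on the features; a stub summary stands in for training."""
    return {"n_samples": len(features)}

def evaluate(model, features, config):    # evaluator.py
    """Score the trained model and return metrics."""
    return {"n_samples_seen": model["n_samples"]}

def run_pipeline(config):                 # train.py orchestration
    records = load_data(config)
    features = engineer_features(records, config)
    model = train_model(features, config)
    return evaluate(model, features, config)
```

The point of the shape is that each module can be tested in isolation and train.py is only composition plus config handling.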

Acceptance criteria:

  • Four pipeline modules exist with clear input/output interfaces
  • config.yaml contains all hyperparameters and paths (nothing hardcoded in scripts)
  • python train.py --config config.yaml runs end-to-end without errors
  • Pipeline produces model output files equivalent to those from the original notebook code
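One possible shape for the externalized configuration is sketched below. Every key name and value here is an assumption for illustration, not the project's actual schema — the requirement is only that hyperparameters and paths live in the file, not in the scripts:

```yaml
# Illustrative config.yaml shape -- key names and values are assumptions.
data:
  train_path: data/train.csv
  target_column: yield_kg
features:
  numeric: [temperature, rainfall, soil_moisture, humidity, altitude]
model:
  hidden_size: 64
  learning_rate: 0.001
  epochs: 50
  seed: 42
output:
  model_path: artifacts/model.pt
```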

T3: Docker containerization of serving endpoint

Containerize the serving endpoint so it runs consistently on any machine.

What to do:

  • Read the Docker guide to understand containers, images, and Dockerfiles
  • Ask Marcus Webb about container design for ML serving
  • Create a Dockerfile for the serving endpoint
  • Review for common problems: unpinned base image, unnecessary dependencies, missing .dockerignore
  • Implement a multi-stage Dockerfile separating build from serving dependencies
  • Build the image and run the container
  • Test the containerized endpoint with curl

Acceptance criteria:

  • Dockerfile uses pinned base image (e.g., python:3.11.8-slim)
  • Multi-stage build separates build dependencies from serving dependencies
  • .dockerignore excludes unnecessary files (notebooks, datasets, training code)
  • docker build -t finca-serving . completes without errors
  • docker run -p 8000:8000 finca-serving starts the container
  • curl http://localhost:8000/health returns a successful response from the container
  • curl -X POST http://localhost:8000/predict ... returns prediction from the container
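A minimal sketch of the multi-stage layout the criteria describe is below. The file list, the requirements split, and the entry point are assumptions about this repo rather than verified facts:

```dockerfile
# Sketch of a multi-stage Dockerfile; copied paths are assumptions.

# Stage 1: build -- install dependencies into an isolated prefix
FROM python:3.11.8-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt

# Stage 2: serve -- ship only installed packages and serving code
FROM python:3.11.8-slim
WORKDIR /app
COPY --from=build /install /usr/local
COPY serve.py feature_pipeline.py model.pt sensor-schema.json ./
EXPOSE 8000
CMD ["python", "serve.py"]
```

Pairing this with a .dockerignore that excludes notebooks, datasets, and training code keeps the build context and final image small.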

T4: Prediction logging and health monitoring

Add prediction logging and health monitoring so every prediction is recorded and system status is visible.

What to do:

  • Read the logging spec template to understand required fields
  • Design the logging schema: what to capture per prediction
  • Implement prediction logging middleware in serve.py
  • Review AI logging for missing metadata (confidence scores, timezone, feature distributions)
  • Design health check logic that verifies the model is loaded and dependencies are available
  • Update the /health endpoint with substantive checks
  • Run predictions and verify logs capture everything
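
A substantive health check can be sketched as a function that runs named checks and reports a degraded status when any fail. The check names and the model_path default here are illustrative assumptions, not the project's API:

```python
import os

def health_check(model, model_path="model.pt"):
    """Return a health report instead of a bare {"status": "ok"}.
    Check names and the model_path default are illustrative assumptions."""
    checks = {
        "model_loaded": model is not None,
        "model_file_present": os.path.exists(model_path),
    }
    status = "ok" if all(checks.values()) else "degraded"
    return {"status": status, "checks": checks}
```

The /health endpoint would return this dict, so a caller can see which check failed rather than just that something did.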

Acceptance criteria:

  • Every prediction logged with: timestamp (with timezone), request_id, farm_id, input_features, predicted_yield_kg, confidence_score, response_time_ms
  • Log file uses JSON Lines format (one JSON object per line)
  • /health endpoint verifies the model is loaded and reports dependency status (not just {"status": "ok"})
  • After 5 curl predictions, the log file contains 5 entries with all required fields
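
One way to produce entries with the required fields is a small helper that serve.py calls after each prediction — a sketch only, since the exact integration point in serve.py is an assumption:

```python
import json
import uuid
from datetime import datetime, timezone

def log_prediction(log_file, farm_id, features, predicted_yield_kg,
                   confidence_score, response_time_ms):
    """Append one JSON Lines entry with the fields the criteria require.
    How serve.py invokes this is an assumption of the sketch."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # tz-aware
        "request_id": str(uuid.uuid4()),
        "farm_id": farm_id,
        "input_features": features,
        "predicted_yield_kg": predicted_yield_kg,
        "confidence_score": confidence_score,
        "response_time_ms": response_time_ms,
    }
    log_file.write(json.dumps(entry) + "\n")  # one JSON object per line
    return entry
```

Because each line is an independent JSON object, the 5-entry check in the criteria is just counting parseable lines in the log file.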

T5: End-to-end verification and delivery

Verify the complete system works end-to-end, write documentation, and deliver to Valentina.

What to do:

  • Write a README explaining how Carlos runs the system
  • Run the full sequence: train pipeline, build container, start serving, make predictions, check logs
  • Push to GitHub with clean commit history
  • Send Valentina the delivery summary

Acceptance criteria:

  • README covers: build the container, run it, make predictions, check logs
  • Full sequence works: python train.py --config config.yaml -> docker build -> docker run -> curl predictions -> log entries verified
  • Clean Git history with descriptive commits
  • Valentina receives delivery summary