
logging-spec-template.md

Logging Spec Template: Prediction Logging Schema

Why log predictions

Every prediction your model makes is a data point you might need later. Without prediction logs, you cannot answer basic questions: What did the model predict last week? How far off were those predictions from actual results? Is the model getting worse over time? Prediction logging is infrastructure -- it creates the raw material for every monitoring, evaluation, and debugging task that follows.

Required fields

Design your logging schema by deciding what to capture for each prediction. The fields below are the minimum. For each, decide the format and any constraints.

| Field | Purpose | Your design |
| --- | --- | --- |
| timestamp | When the prediction was made. Include timezone -- predictions from different systems in different timezones need to be comparable. | |
| request_id | Unique identifier for this prediction request. Enables tracing a specific prediction through the system. | |
| farm_id | Which farm this prediction is for. Comes from the input. | |
| input_features | The raw input values sent to the model. Captures what the model saw when it made this prediction. | |
| predicted_yield_kg | The model's output. The actual prediction value. | |
| confidence_score | How confident the model is in this prediction. If your model does not natively produce a confidence score, note what you would use as a proxy or leave a design note. | |
| response_time_ms | How long the prediction took from request to response. Performance monitoring -- a sudden spike means something changed. | |
| model_version | Which model file served this prediction. When you retrain and deploy a new model, this field tells you which predictions came from which version. | |
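One way to pin down the format and constraints for each field is to express the schema in code. The sketch below uses a Python dataclass with the fields from the table; the example values (request IDs, farm IDs, feature names) are hypothetical placeholders, not part of the spec.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import datetime

@dataclass
class PredictionLogRecord:
    timestamp: str               # ISO 8601 with timezone, so systems stay comparable
    request_id: str              # unique per prediction request
    farm_id: str                 # comes from the input
    input_features: dict         # raw inputs, may be nested
    predicted_yield_kg: float    # the model's output
    confidence_score: Optional[float]  # None if the model has no native score
    response_time_ms: float      # request-to-response latency
    model_version: str           # which model file served this prediction

# Hypothetical example record
record = PredictionLogRecord(
    timestamp=datetime.datetime.now(datetime.timezone.utc).isoformat(),
    request_id="req-0001",
    farm_id="farm-42",
    input_features={"rainfall_mm": 10.2, "soil": {"ph": 6.5}},
    predicted_yield_kg=1234.5,
    confidence_score=None,       # design note: model has no native score
    response_time_ms=12.7,
    model_version="v3",
)
```

A typed record like this makes the constraints explicit (which fields are required, which may be null) and `asdict(record)` yields a dict ready to serialize.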

Storage format

Prediction logs should be append-only. Each prediction adds one entry. Two common formats:

JSON Lines (.jsonl): One JSON object per line. Each line is independently parseable. Easy to append. Easy to read with standard tools (cat, jq, Python's json module). Good for structured data with nested fields (like input_features).

CSV: One row per prediction. Columns map to fields. Simple but struggles with nested data (input_features would need to be flattened or serialized).

For predictions with nested input features, JSON Lines is typically the better choice.
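The append-only JSON Lines pattern is a few lines of code: serialize each record to one line and open the file in append mode. A minimal sketch, with hypothetical record contents:

```python
import json
import os
import tempfile

def append_prediction(log_path, record):
    # Append-only: one JSON object per line; earlier lines are never rewritten.
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

log_path = os.path.join(tempfile.mkdtemp(), "predictions.jsonl")
append_prediction(log_path, {"request_id": "req-1", "predicted_yield_kg": 1200.0,
                             "input_features": {"rainfall_mm": 10.2}})
append_prediction(log_path, {"request_id": "req-2", "predicted_yield_kg": 980.5,
                             "input_features": {"rainfall_mm": 4.1}})

# Each line parses independently, so streaming tools and partial reads work.
with open(log_path) as f:
    records = [json.loads(line) for line in f]
```

Note that nested input_features round-trips without any flattening, which is where CSV would start to hurt.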

Health check design

The /health endpoint answers one question: "Is the system ready to serve predictions?" A health check that returns {"status": "ok"} without checking anything is not a health check -- it is a liveness probe that tells you the process is running, not that it can serve.

What should /health verify? List everything the endpoint depends on that could fail:

  • _________________________________
  • _________________________________
  • _________________________________
  • _________________________________

For each item, decide: what does "healthy" look like? What does "unhealthy" look like? What should the response include when something is unhealthy?
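As one possible shape for the answer, a health check can run each dependency check, report per-check results, and return 503 when any fail. The checks below (model loaded, model file present, log directory writable) are illustrative assumptions, not the required list -- your list comes from the exercise above. The function returns a status code and body rather than binding to any particular web framework.

```python
import os
import tempfile

def health_check(model, model_path, log_dir):
    """Return (http_status, body) for a /health endpoint.

    Illustrative checks only -- the real list is whatever the
    endpoint depends on that could fail.
    """
    checks = {
        "model_loaded": model is not None,          # model object in memory
        "model_file_present": os.path.exists(model_path),
        "log_dir_writable": os.access(log_dir, os.W_OK),
    }
    healthy = all(checks.values())
    # Unhealthy responses include per-check results, so the caller
    # can see *what* failed, not just that something did.
    body = {"status": "ok" if healthy else "unhealthy", "checks": checks}
    return (200 if healthy else 503), body

# Hypothetical usage: a loaded model, an existing file, a writable directory
log_dir = tempfile.mkdtemp()
status, body = health_check(model=object(), model_path=__file__, log_dir=log_dir)
```

Returning `{"status": "ok"}` unconditionally would pass the liveness bar but fail this readiness bar: here, "ok" means every dependency check actually ran and passed.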

The availability vs correctness distinction

Health monitoring answers "is the system available?" It tells you the process is running, the model is loaded, the dependencies are reachable. It does not tell you "is the system correct?" -- whether the predictions are accurate, whether the model has degraded, whether the input data distribution has shifted.

Prediction logs are the bridge. By logging every prediction with its inputs and outputs, you create the raw material to eventually answer correctness questions. When actual harvest data comes in, you can compare predictions to reality. That comparison is drift detection -- a future concern. For now, the logs are infrastructure: they exist so that future monitoring is possible.
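To make the bridge concrete, here is a sketch of the future comparison the logs enable: joining logged predictions to actuals and computing an error metric. The records and harvest figures are invented for illustration; the point is that farm_id and predicted_yield_kg from the logs are all the join needs.

```python
# Hypothetical parsed prediction-log records (one per JSONL line)
predictions = [
    {"request_id": "req-1", "farm_id": "farm-42", "predicted_yield_kg": 1200.0},
    {"request_id": "req-2", "farm_id": "farm-7",  "predicted_yield_kg": 980.5},
]

# Hypothetical actual harvest results, keyed by farm
actuals = {"farm-42": 1150.0, "farm-7": 1010.0}

# Join each prediction to reality and compute absolute error
errors = [abs(p["predicted_yield_kg"] - actuals[p["farm_id"]])
          for p in predictions]
mae = sum(errors) / len(errors)
print(mae)  # → 39.75
```

None of this code can exist without the logs: the comparison is only possible because every prediction was captured with its identifiers at serving time.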