Logging Spec Template: Prediction Logging Schema
Why log predictions
Every prediction your model makes is a data point you might need later. Without prediction logs, you cannot answer basic questions: What did the model predict last week? How far off were those predictions from actual results? Is the model getting worse over time? Prediction logging is infrastructure -- it creates the raw material for every monitoring, evaluation, and debugging task that follows.
Required fields
Design your logging schema by deciding what to capture for each prediction. The fields below are the minimum. For each, decide the format and any constraints.
| Field | Purpose | Your design |
|---|---|---|
| timestamp | When the prediction was made. Include timezone -- predictions from different systems in different timezones need to be comparable. | |
| request_id | Unique identifier for this prediction request. Enables tracing a specific prediction through the system. | |
| farm_id | Which farm this prediction is for. Comes from the input. | |
| input_features | The raw input values sent to the model. Captures what the model saw when it made this prediction. | |
| predicted_yield_kg | The model's output. The actual prediction value. | |
| confidence_score | How confident the model is in this prediction. If your model does not natively produce a confidence score, note what you would use as a proxy or leave a design note. | |
| response_time_ms | How long the prediction took from request to response. Performance monitoring -- a sudden spike means something changed. | |
| model_version | Which model file served this prediction. When you retrain and deploy a new model, this field tells you which predictions came from which version. | |
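Taken together, one log entry covering the required fields might look like the sketch below. The field values (farm ID format, feature names, model filename) are illustrative placeholders, not a fixed contract -- your schema design decides the formats and constraints.

```python
import json
import uuid

# One prediction log entry. Every value here is an example;
# substitute your own formats once you fill in the "Your design" column.
entry = {
    "timestamp": "2024-06-01T14:03:22+00:00",   # ISO 8601, timezone included
    "request_id": str(uuid.uuid4()),             # unique per request
    "farm_id": "farm-0042",                      # hypothetical ID format
    "input_features": {"rainfall_mm": 412.5, "soil_ph": 6.3},
    "predicted_yield_kg": 1830.0,
    "confidence_score": 0.87,                    # or your proxy, per the design note
    "response_time_ms": 41,
    "model_version": "yield-model-v3.pkl",       # placeholder filename
}

# Serialized as one JSON Lines record: one object, one line.
line = json.dumps(entry)
```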
Storage format
Prediction logs should be append-only. Each prediction adds one entry. Two common formats:
JSON Lines (.jsonl): One JSON object per line. Each line is independently parseable. Easy to append. Easy to read with standard tools (cat, jq, Python's json module). Good for structured data with nested fields (like input_features).
CSV: One row per prediction. Columns map to fields. Simple but struggles with nested data (input_features would need to be flattened or serialized).
For predictions with nested input features, JSON Lines is typically the better choice.
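The append-only pattern can be sketched in a few lines, assuming a local file (`predictions.jsonl` is a placeholder path):

```python
import json

def log_prediction(entry: dict, path: str = "predictions.jsonl") -> None:
    """Append one prediction as a single JSON Lines record."""
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def read_predictions(path: str = "predictions.jsonl"):
    """Yield each logged prediction; every line parses on its own."""
    with open(path) as f:
        for line in f:
            yield json.loads(line)
```

Opening in append mode ("a") is what makes the log append-only in practice: existing entries are never rewritten, only extended.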
Health check design
The /health endpoint answers one question: "Is the system ready to serve predictions?" A health check that returns {"status": "ok"} without checking anything is not a health check -- it is a liveness probe that tells you the process is running, not that it can serve.
What should /health verify? List everything the endpoint depends on that could fail:
- _________________________________
- _________________________________
- _________________________________
- _________________________________
For each item, decide: what does "healthy" look like? What does "unhealthy" look like? What should the response include when something is unhealthy?
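As one possible shape for the answers above, here is a hedged sketch of a health check that actually verifies its dependencies. The `model`, `sample_input`, and `log_path` names are placeholders for whatever your service uses; the point is that each check exercises something that could really fail.

```python
import os

def health_check(model, sample_input, log_path: str = "predictions.jsonl") -> dict:
    """Return a health report that verifies dependencies, not just liveness.

    model        -- your loaded model object (assumed to expose .predict)
    sample_input -- a known-good input used to exercise the model
    log_path     -- where prediction logs are written
    """
    checks = {
        "model_loaded": model is not None,
        "model_can_predict": False,
        "log_path_writable": os.access(os.path.dirname(log_path) or ".", os.W_OK),
    }
    if model is not None:
        try:
            model.predict(sample_input)   # can it actually serve a prediction?
            checks["model_can_predict"] = True
        except Exception as exc:
            checks["predict_error"] = str(exc)  # include detail when unhealthy
    healthy = (checks["model_loaded"]
               and checks["model_can_predict"]
               and checks["log_path_writable"])
    return {"status": "ok" if healthy else "unhealthy", "checks": checks}
```

Returning the per-dependency `checks` map means an unhealthy response says what is broken, not just that something is.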
The availability vs correctness distinction
Health monitoring answers "is the system available?" It tells you the process is running, the model is loaded, the dependencies are reachable. It does not tell you "is the system correct?" -- whether the predictions are accurate, whether the model has degraded, whether the input data distribution has shifted.
Prediction logs are the bridge. By logging every prediction with its inputs and outputs, you create the raw material to eventually answer correctness questions. When actual harvest data comes in, you can compare predictions to reality. That comparison is drift detection -- a future concern. For now, the logs are infrastructure: they exist so that future monitoring is possible.
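The eventual comparison is simple once the logs exist. As a sketch, assuming actual harvest results arrive keyed by request_id (the keying scheme is an assumption, yours may join on farm_id and date instead):

```python
def prediction_errors(logged, actuals):
    """Join logged predictions to actual yields and return absolute errors.

    logged  -- iterable of prediction log entries (dicts)
    actuals -- mapping of request_id -> actual yield in kg
    Only predictions with a matching actual are compared.
    """
    errors = {}
    for entry in logged:
        rid = entry["request_id"]
        if rid in actuals:
            errors[rid] = abs(entry["predicted_yield_kg"] - actuals[rid])
    return errors
```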