Step 1: Review the Validation Ticket
Open materials/tickets.md and read T-01 (Pydantic input validation) and T-02 (structured error responses). The goal is straightforward: the API should reject bad input with clear error messages and accept valid input. But "bad input" needs a definition, and that definition comes from the training data.
The model was trained on subscriber data with specific ranges: tenure between 1 and 72 months, monthly charges between 200 and 5000 NGN, specific contract types and payment methods. If someone sends a request with tenure of -5 or a contract type the model never saw, the prediction is meaningless. The model will still produce a number -- it just won't mean anything.
Step 2: Extract Validation Boundaries from the Training Data
Open materials/api-baseline/data_profile.json. This file contains the ranges and valid values from the training data. Direct Claude to load it and display the constraints: which numeric fields have which ranges, which categorical fields accept which values.
These constraints are the validation boundaries. The Pydantic model you build should encode them. The connection matters: the validation isn't arbitrary -- it encodes what the model was trained on.
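The exact shape of data_profile.json isn't reproduced here, but based on the ranges quoted above it will look something like the sketch below. The structure, field names, and categorical values are illustrative -- treat the actual file as the source of truth:

```json
{
  "numeric": {
    "tenure_months": {"min": 1, "max": 72},
    "monthly_charges": {"min": 200, "max": 5000}
  },
  "categorical": {
    "contract_type": ["month-to-month", "one-year", "two-year"],
    "payment_method": ["credit_card", "bank_transfer", "ussd"]
  }
}
```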
Step 3: Build the Pydantic Validation Model
Direct Claude to add a Pydantic model to api-baseline/app.py that validates prediction requests against the training data's constraints. Something like: "Add a Pydantic model for the predict endpoint that validates input features against data_profile.json. Numeric fields should have min/max constraints matching the training data ranges. Categorical fields should only accept values from the training data."
Review what Claude produces. Generated validation commonly covers type checking but misses domain-specific range constraints. Check whether the Pydantic model validates that tenure_months falls between 1 and 72, or whether it just checks that tenure_months is an integer. If the model only validates types, direct Claude to add the range constraints from the data profile.
Step 4: Add Structured Error Responses
Direct Claude to replace default error handling with structured JSON responses. When validation fails, the response should tell the caller what went wrong and how to fix it: which field failed, what the constraint was, what value was received.
An error response like {"detail": [{"field": "tenure_months", "constraint": "must be between 1 and 72", "received": -5}]} teaches the caller how to fix the problem. A 500 Internal Server Error with a numpy traceback teaches nothing.
Step 5: Test the Validation
Test with three categories of input using curl:
A valid request -- all fields within the training data's ranges:
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"tenure_months": 24, "monthly_charges": 1500, "total_charges": 36000, "num_complaints": 2, "data_usage_gb": 15.5, "contract_type": "month-to-month", "payment_method": "credit_card", "segment": "prepaid"}'
An out-of-range request -- tenure_months set to -5:
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"tenure_months": -5, "monthly_charges": 1500, "total_charges": 36000, "num_complaints": 2, "data_usage_gb": 15.5, "contract_type": "month-to-month", "payment_method": "credit_card", "segment": "prepaid"}'
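The third category -- a categorical value outside the training data. The "weekly" contract type below is an invented example; any value absent from data_profile.json works:

```shell
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"tenure_months": 24, "monthly_charges": 1500, "total_charges": 36000, "num_complaints": 2, "data_usage_gb": 15.5, "contract_type": "weekly", "payment_method": "credit_card", "segment": "prepaid"}'
```

This should also come back as a 422, with an error naming contract_type and listing the accepted values.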
The valid request should return a prediction. The invalid request should return a 422 with a structured error naming the field and constraint.
Check: Sending a request with an out-of-range value (e.g., tenure_months=-5) returns a 422 response with a JSON body that names the field and the valid range. Sending a valid request returns a prediction with no errors.