Step 1: Examine the Outliers
Direct Claude to identify outliers in the engineered features and harvest yields. Ask it to show you which farms and harvest periods have values that fall outside the typical range.
Some farms will show yields significantly above or below the median. Before doing anything about them, ask: what are these data points? A yield of 1,200 kg when most farms produce 2,000+ could be a sensor malfunction, a bad growing season, or the result of switching to a different coffee variety. The answer determines what you do with the data.
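A minimal sketch of this identification step, using the standard 1.5×IQR fence. The farm IDs and yield values here are illustrative stand-ins, not the real dataset; the point is that the fence only flags points, it doesn't interpret them.

```python
import pandas as pd

# Hypothetical yield data -- farm IDs and values are invented for illustration.
df = pd.DataFrame({
    "farm_id": [f"farm_{i:02d}" for i in range(1, 11)],
    "yield_kg": [2100, 2250, 1980, 2300, 1200, 2050, 2150, 2200, 1150, 1990],
})

# 1.5x-IQR fence: a common default for spotting unusual values.
q1, q3 = df["yield_kg"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["yield_kg"] < lower) | (df["yield_kg"] > upper)]
print(outliers[["farm_id", "yield_kg"]])
```

With these made-up numbers, the two low-yield farms fall outside the fence, but the statistics alone can't tell you whether that's a sensor fault or a variety change.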
Check the farms Valentina mentioned -- farm_05 and farm_09 switched from Castillo to Gesha about eighteen months ago. Their yield dropped but their quality scores jumped. That's not an anomaly. That's a real variety effect showing up in the data.
Step 2: Review AI's Outlier Proposal
Direct Claude to propose an outlier handling strategy. AI commonly defaults to removing data points beyond a statistical threshold -- anything more than 1.5 times the interquartile range beyond the quartiles, or more than three standard deviations from the mean.
Review the proposal. Does it propose removing the Gesha farms? If so, that would remove real signal from the dataset. The Gesha variety legitimately produces less volume at higher quality. Removing those data points because they're statistically unusual would mean the model can't account for the variety difference -- and Valentina's predictions for those farms would be wrong.
The sensor gaps from Unit 1 are a different story. If sensor anomalies during the outage periods produced questionable feature values, those might genuinely need handling -- imputation or flagging.
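One way to handle the sensor-gap case is to track how many readings actually fed each aggregate and flag values built from incomplete windows. This is a sketch under assumed column names (`mean_rainfall_30d`, `n_readings`) and an assumed 80% coverage threshold -- the threshold itself is a judgment call.

```python
import pandas as pd

# Illustrative feature table; column names and values are assumptions.
features = pd.DataFrame({
    "farm_id": ["farm_01", "farm_02", "farm_03"],
    "mean_rainfall_30d": [4.2, 3.8, 5.1],
    "n_readings": [30, 10, 29],  # daily readings that fed each 30-day mean
})

EXPECTED_READINGS = 30
MIN_COVERAGE = 0.8  # require at least 80% of the window

# Flag aggregates computed from too few readings.
features["rainfall_reliable"] = (
    features["n_readings"] >= EXPECTED_READINGS * MIN_COVERAGE
)

# Optionally null out unreliable values so downstream imputation can handle them.
features.loc[~features["rainfall_reliable"], "mean_rainfall_30d"] = pd.NA
```

Flagging rather than silently dropping keeps the decision visible and reversible.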
Step 3: Make the Domain Judgment
Decide how to handle each category of outlier:
- Gesha farm data (farm_05, farm_09 post-switch): Keep. These are real data points reflecting a genuine variety change. The model needs to see them.
- Sensor-gap anomalies: Flag or impute. If the sensor gaps produced incomplete feature aggregations (e.g., mean rainfall computed from 10 days instead of 30 because readings were missing), those values are unreliable and should be handled.
- Other variation: Keep unless there's a specific reason to exclude.
Direct Claude to implement your decisions. This is a domain judgment, not a statistical formula. Document the reasoning for each decision.
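The implementation might look something like this: keep the Gesha farms untouched, flag sensor-gap rows with a reason, and capture the rationale in a small record. All names here (`sensor_gap`, `exclude_reason`, the yields) are hypothetical.

```python
import pandas as pd

# Illustrative data; values and the sensor_gap column are assumptions.
df = pd.DataFrame({
    "farm_id": ["farm_01", "farm_05", "farm_09", "farm_03"],
    "yield_kg": [2100, 1200, 1150, 1980],
    "sensor_gap": [False, False, False, True],
})

GESHA_FARMS = {"farm_05", "farm_09"}

# Gesha farms: no action -- low yield is real signal, so the rows stay as-is.
# Sensor-gap rows: flag with a reason rather than deleting.
df["exclude_reason"] = pd.NA
df.loc[df["sensor_gap"], "exclude_reason"] = "unreliable: sensor outage"

# Record the reasoning so it can go straight into the documentation.
decisions = {
    "kept_as_signal": sorted(GESHA_FARMS),
    "rows_flagged": int(df["sensor_gap"].sum()),
    "rationale": "variety effects are signal; sensor-gap aggregates are not",
}
```

Note that nothing is dropped: the model (or a later reviewer) can still see every decision in the data itself.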
Step 4: Document Feature Decisions
Direct Claude to produce a feature documentation file. For each derived feature, record:
- What it is (e.g., "mean_temp_flowering: average temperature during October-November")
- Why it was created (the domain hypothesis -- "temperature during flowering affects cherry set rate")
- What relationship you expect ("higher flowering temperatures may reduce yield due to heat stress")
For the outlier handling, record what was kept, what was handled, and why. This documentation makes the pipeline auditable. Another practitioner reading it can understand the reasoning behind every transformation.
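One possible shape for that documentation file, written as JSON. The schema (field names like `definition`, `hypothesis`, `expected_effect`) is an assumption, not a standard -- any structure works as long as every feature and outlier decision carries its reasoning.

```python
import json

# Hypothetical documentation schema; field names are illustrative.
feature_docs = {
    "mean_temp_flowering": {
        "definition": "average temperature during October-November",
        "hypothesis": "temperature during flowering affects cherry set rate",
        "expected_effect": "higher flowering temperatures may reduce yield (heat stress)",
    },
}
outlier_docs = {
    "farm_05/farm_09 post-switch yields": "kept -- real Gesha variety effect",
    "sensor-gap aggregations": "flagged -- computed from incomplete windows",
}

with open("feature_documentation.json", "w") as f:
    json.dump({"features": feature_docs, "outlier_handling": outlier_docs}, f, indent=2)
```

A structured file like this is also exactly what the fresh review session in the next step needs as input.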
Step 5: Cross-Model Review
Open a fresh Claude Code session. This is separate from the session where you built the pipeline. Give the new session only two things: Valentina's requirements (from her email) and the feature documentation you just created. Don't give it the conversation history from building the pipeline.
Ask it to review: Do the features address Valentina's prediction needs? Are the outlier handling decisions justified? Is anything missing or contradictory?
A fresh perspective catches gaps that the original session normalized. The first session chose the features and justified them; the second session evaluates them without that justification context. If the feature set has a gap -- say, no features capturing soil moisture during the critical cherry development period -- a fresh review is more likely to catch it.
Check: Outlier handling distinguishes between variety effects (kept) and sensor anomalies (flagged/handled). Feature documentation explains the domain hypothesis for each derived feature. Cross-model review completed.