Step 1: Design the validation strategy
In previous projects, someone told you what to check. This time, you design the validation strategy yourself.
Start by thinking about what could be wrong with the inferential analysis. The regression produced coefficients, p-values, and effect sizes. What would make those unreliable? What assumptions could be violated? What alternative specifications might change the conclusion?
For an inferential analysis, the validation strategy should include:
- Assumption checks: Normality of residuals, homoscedasticity, multicollinearity (VIF). You started these in Unit 3. Now formalize what you found.
- Sensitivity analysis: Does the conclusion hold if you change the model specification? What happens if you remove the Luxor launch variable? What happens if you use a different seasonal specification?
- Effect size interpretation: Is the effect large enough to act on, or just statistically detectable?
- Cross-model review: A second AI reviewing the methodology with fresh context.
Write this strategy down before running any of the checks. The strategy is itself a deliverable -- it shows Hassan's silent partner that the validation was designed, not ad hoc.
Step 2: Use meta-prompting for novel checks
You have not designed a validation strategy for an inferential model at this level before. That is not a reason to skip validation. It is a reason to use AI to expand what you can verify.
Direct AI with something like: "I have run a linear regression testing whether a marketing shift is associated with booking growth, controlling for seasonality and a new product launch. What are the ways this analysis could be wrong? What checks would catch each failure mode? Design a validation plan."
Read AI's response critically. Some suggestions will be useful -- specific checks you had not considered, like testing for structural breaks or checking whether the effect is driven by a few outlier months. Some suggestions will be generic -- boilerplate validation steps that apply to any model regardless of question type.
Keep the checks that address this specific analysis. Discard the ones that are generic validation theater. Add any useful checks to your validation strategy.
Step 3: Run the validation checks
Execute the checks from your validation strategy. Direct AI to run each one and interpret the results:
Residual diagnostics: Generate a residuals-versus-fitted-values plot and a Q-Q plot. Check whether residuals are roughly randomly scattered (no patterns, no funnel shapes) and roughly normally distributed (points close to the diagonal on the Q-Q plot).
Multicollinearity: Compute VIF values for all predictors. Values above 5-10 indicate concerning multicollinearity. If the marketing_shift and Luxor launch variables are highly correlated (both are time-based), the individual coefficient estimates may be unstable even if the overall model is sound.
Sensitivity runs: Run at least two alternative specifications:
- Drop the Luxor launch variable. Does the marketing_shift coefficient change substantially? If the marketing_shift coefficient absorbs the Luxor effect, the two variables are confounded and cannot be separated.
- Change the seasonal specification -- use individual month dummies instead of a peak-season flag (or vice versa). Does the marketing_shift coefficient change? If the conclusion depends on how you model seasonality, it is fragile.
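Both sensitivity runs can be expressed as alternative model formulas fit on the same data. This is a sketch on synthetic data -- the variable names and the 48-month window are assumptions, and the formulas should mirror your actual specification.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical 48 months; names are illustrative stand-ins for your columns.
rng = np.random.default_rng(2)
t = np.arange(48)
df = pd.DataFrame({
    "month": t % 12,
    "marketing_shift": (t >= 24).astype(int),
    "luxor_launch": (t >= 30).astype(int),
})
df["peak_season"] = df["month"].isin([5, 6, 7]).astype(int)
df["bookings"] = (100 + 6 * df["marketing_shift"] + 4 * df["luxor_launch"]
                  + 10 * df["peak_season"] + rng.normal(0, 3, 48))

specs = {
    "full": "bookings ~ marketing_shift + luxor_launch + peak_season",
    "drop_luxor": "bookings ~ marketing_shift + peak_season",
    "month_dummies": "bookings ~ marketing_shift + luxor_launch + C(month)",
}
coefs = {name: smf.ols(formula, data=df).fit().params["marketing_shift"]
         for name, formula in specs.items()}
for name, b in coefs.items():
    print(f"{name}: marketing_shift = {b:.2f}")
# A large jump under drop_luxor suggests marketing_shift is absorbing
# the launch effect; a shift under month_dummies suggests the conclusion
# depends on the seasonal specification.
```

Comparing the marketing_shift coefficient across the three fits is the whole sensitivity analysis: a stable coefficient supports the conclusion, a moving one reveals fragility.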
Document what each check found. Not just "PASS" or "FAIL" -- what the diagnostic showed, what it means for the analysis, and whether any adjustment is needed.
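One lightweight way to keep that documentation structured is a small validation log. The check names and findings below are purely illustrative, not results from any real run.

```python
# A minimal validation log; check names and findings are illustrative.
validation_log = [
    {
        "check": "Q-Q plot of residuals",
        "found": "points close to the diagonal, mild deviation in the tails",
        "implication": "normality assumption acceptable at this sample size",
        "action": "none needed",
    },
    {
        "check": "VIF for marketing_shift",
        "found": "elevated VIF, driven by overlap with the Luxor launch",
        "implication": "individual coefficient estimates may be unstable",
        "action": "flag the attribution limitation in the memo",
    },
]
for entry in validation_log:
    print(f"{entry['check']}: {entry['found']} -> {entry['action']}")
```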
Step 4: Cross-model review
Open a second Claude Code session. Give it only three things: Hassan's original email, the data dictionary, and your methodology memo. Do not give it your notebook, your code, or your results. The point is fresh context without accumulated assumptions.
Ask the second AI to review your methodology:
- Is inference the right question type for this brief?
- Are the confounders adequately controlled?
- Are the effect sizes honestly interpreted?
- Is the attribution limitation properly addressed?
Read the review. Address legitimate findings -- a genuine methodological concern deserves a response, even if the response is "acknowledged as a limitation." Dismiss concerns that reflect the second AI's biases rather than real issues.
Document what the cross-model review found, what you addressed, and what you dismissed (and why). This goes in the methodology memo under "Cross-Model Review."
Step 5: Document the validation report
Write up the complete validation in the methodology memo under "Validation Strategy." Structure it as:
- What was checked: Each diagnostic and sensitivity test, named specifically
- Why those checks: Connect each check to a potential failure mode. "We checked multicollinearity because the marketing shift and Luxor launch are both time-based variables that could be confounded."
- What was found: Results of each check, interpreted
- What confidence the analysis supports: Based on the validation results, how confident are you in the marketing_shift finding? Are there caveats the checks revealed?
This section is what earns trust from Hassan's silent partner. An analysis with documented validation is categorically more credible than one without.
Check:
- Validation strategy designed (not copied from a checklist)
- Meta-prompting used to identify at least one check you would not have thought of
- Sensitivity analysis run (at least two alternative specifications)
- Cross-model review completed with fresh context
- Validation report documents what was checked, why, and what was found