Learn by Directing AI
Unit 4

The warm-start analysis

Step 1: Prepare a context brief

Before you start this analysis session, write a context brief. This is a structured description of what this session needs to accomplish and what AI should know going into it:

"This session: prepare the air quality data for interrupted time series analysis. Current data: daily PM2.5 readings from 40 stations across three cities, 7 years. Established methodology: interrupted time series with seasonal, weather, and trend controls. Conventions: temporal splits only, report effect sizes alongside p-values, check assumptions before parametric tests."

This is proactive context design. You are deciding what AI needs to know before the first analytical prompt -- not reacting to degradation mid-session, but preventing it by front-loading the right information. The difference between "prepare the data" and "prepare the data for an interrupted time series analysis, using temporal splits only, controlling for seasonal and weather effects" is the difference between AI using its defaults and AI working within your constraints.
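One way to make the brief reusable across sessions is to keep it as a structured template and render it into the opening message each time. This is a sketch only; the field names and the `render_brief` helper are illustrative, not part of any prescribed format:

```python
# Hypothetical structured version of the context brief above.
# Field names are assumptions -- adapt them to your own project.
BRIEF = {
    "goal": "prepare air quality data for interrupted time series analysis",
    "data": "daily PM2.5, 40 stations, three cities, 7 years",
    "methodology": "interrupted time series with seasonal, weather, trend controls",
    "conventions": [
        "temporal splits only",
        "report effect sizes alongside p-values",
        "check assumptions before parametric tests",
    ],
}

def render_brief(brief: dict) -> str:
    """Turn the structured brief into the session's opening message."""
    lines = [
        f"This session: {brief['goal']}.",
        f"Current data: {brief['data']}.",
        f"Established methodology: {brief['methodology']}.",
        "Conventions: " + "; ".join(brief["conventions"]) + ".",
    ]
    return "\n".join(lines)

print(render_brief(BRIEF))
```

The payoff is consistency: every session starts from the same front-loaded context instead of an ad hoc rephrasing.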

Step 2: Prepare the data

Direct AI to prepare the air quality data for the interrupted time series analysis. The preparation should include:

  • Aggregating to an appropriate time resolution (weekly or monthly averages often reduce noise better than daily values while still preserving the seasonal pattern)
  • Handling the station data -- you may analyze each city separately or pool stations within a city
  • Creating the time series structure: a pre-regulation period and a post-regulation period, with the intervention date set at January 1, 2022
  • Merging weather data as control variables
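The preparation steps above can be sketched in pandas. The data here is synthetic stand-in data (two cities, city-level readings, made-up weather columns), and the column names are assumptions; the real files will differ, but the shape of the work is the same:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real inputs: daily city-level PM2.5 plus
# daily weather. Column names (date, city, pm25, temp, wind) are assumptions.
rng = np.random.default_rng(0)
dates = pd.date_range("2018-01-01", "2024-12-31", freq="D")
daily = pd.DataFrame({
    "date": np.tile(dates, 2),
    "city": ["A"] * len(dates) + ["B"] * len(dates),
    "pm25": rng.normal(35, 8, 2 * len(dates)),
})
weather = pd.DataFrame({
    "date": dates,
    "temp": rng.normal(10, 9, len(dates)),
    "wind": rng.gamma(2.0, 1.5, len(dates)),
})

# 1. Aggregate to monthly means per city (real station data would first be
#    pooled within each city) to cut noise while keeping the seasonal cycle.
monthly = (daily.groupby(["city", pd.Grouper(key="date", freq="MS")])["pm25"]
                .mean()
                .reset_index())

# 2. Merge monthly-averaged weather controls.
weather_m = (weather.groupby(pd.Grouper(key="date", freq="MS"))
                    .mean()
                    .reset_index())
monthly = monthly.merge(weather_m, on="date", how="left")

# 3. Mark the pre/post periods around the intervention date.
intervention = pd.Timestamp("2022-01-01")
monthly["post"] = (monthly["date"] >= intervention).astype(int)
```

Note the split is defined by the intervention date, not by shuffling rows: that is the temporal-split convention doing its job at the data-prep stage.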

Watch how AI prepares the data this time. With the project memory active, AI should follow the encoded conventions. Temporal splits, not random. Assumption checks before parametric operations.

Step 3: Run the interrupted time series analysis

Direct AI to model the pre-regulation trend, seasonal patterns, and weather effects, then test for a significant change in PM2.5 levels after the regulation.

The model should control for:

  • Seasonal variation (PM2.5 is higher in winter, lower in summer)
  • Weather effects (cold, calm days concentrate pollutants)
  • Long-term trends (air quality may have been improving regardless of the regulation)

What remains after these controls is the regulation's estimated effect -- if there is one.
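The segmented-regression form of that model can be sketched with plain least squares. Here synthetic monthly data with a known -5 ug/m3 level shift stands in for the prepared series; the coefficient on the `post` indicator is the regulation's estimated effect after the trend, seasonal, and weather terms have claimed their share:

```python
import numpy as np

# Synthetic monthly series with a known process; in practice you would
# load the prepared data from Step 2 instead.
rng = np.random.default_rng(1)
n = 84                               # 7 years of monthly observations
t = np.arange(n)
post = (t >= 48).astype(float)       # intervention at month 48 (Jan 2022)
month = t % 12
temp = rng.normal(10, 9, n)

# True process: slow improvement, winter peaks, a -5 ug/m3 level shift.
season = 6 * np.cos(2 * np.pi * month / 12)
pm25 = 40 - 0.05 * t + season - 0.1 * temp - 5 * post + rng.normal(0, 1, n)

# Design matrix: intercept, long-term trend, level shift, slope change,
# seasonal harmonics, and a weather control.
X = np.column_stack([
    np.ones(n),
    t,
    post,                            # level change at the intervention
    (t - 48) * post,                 # slope change after the intervention
    np.cos(2 * np.pi * month / 12),
    np.sin(2 * np.pi * month / 12),
    temp,
])
beta, *_ = np.linalg.lstsq(X, pm25, rcond=None)
print(f"estimated level shift: {beta[2]:.2f} ug/m3")  # true simulated shift is -5
```

A dedicated library (e.g. statsmodels) would also give standard errors and residual diagnostics; the point here is only the structure of the design matrix, and that the effect estimate is what survives the controls.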

Step 4: Compare against the cold-start baseline

Go back to what you observed in Unit 1 when AI worked without project memory. Compare the output side by side:

  • Did AI use temporal splits this time? In Unit 1, it likely used random splits.
  • Did AI report effect sizes? In Unit 1, it likely reported only p-values.
  • Did AI check assumptions? In Unit 1, it likely skipped them.

The same data. The same question. The same AI. The difference is the infrastructure you built in Unit 3. This is the experiential proof: infrastructure determines outcomes more than any individual prompt.
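The first contrast on that list is easy to make concrete. With time series data, a random 80/20 split scatters post-intervention months into the training set, leaking the future into the model; a temporal split at the cutoff cannot. The indices below are illustrative (84 months, intervention at month 48):

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(84)

# Random 80/20 split (the cold-start default).
train_random = rng.permutation(t)[:67]

# Temporal split at the regulation date (the encoded convention).
train_temporal = t[t < 48]

# Count post-intervention months that leak into the random training set.
leaked = int(np.sum(train_random >= 48))
print(f"random split leaks {leaked} post-intervention months; temporal split leaks 0")
```

A difference like this is exactly the kind of concrete observation Step 5 asks you to write down.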

Step 5: Document the contrast

Record what was different between the cold-start and warm-start sessions. Be specific. List at least three concrete differences -- not "it was better" but "in the cold start, AI used a random 80/20 split; with project memory, it used a temporal split at the regulation date."

This documentation is the evidence for your decision record later. It captures the before/after contrast while the details are fresh.

✓ Check

Check: Context brief prepared before session start. Data prepared for interrupted time series. Analysis run with seasonal and weather controls. Before/after contrast documented with at least three specific differences between cold-start and warm-start output.