The Brief
Astrid Lindqvist runs the Air Quality Program at the Nordic Environmental Research Institute in Gothenburg -- an independent nonprofit with 30 researchers and 40 monitoring stations across Sweden, Norway, and Denmark.
The Swedish Environmental Protection Agency wants to know: has the vehicle emission regulation that took effect three years ago actually reduced particulate levels? Seven years of monitoring data across Stockholm, Gothenburg, and Malmo. PM2.5, NO2, ozone, SO2. Weather variables at each station. About 2.5 million data points.
The problem is noise. Season affects air quality. Weather affects it. Day of the week affects it. A naive before-after comparison of means tells you nothing useful. The agency needs an analysis that separates the regulation's effect from everything else -- and reports uncertainty honestly. This is going into a government report.
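Separating the regulation's effect from season, weather, and weekday is the job of a regression with a post-regulation indicator and confounder controls. A minimal sketch of the idea, on entirely synthetic data -- the real column names come from the data dictionary, and the variable names and simulated effect sizes here are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical simulation: daily PM2.5 with seasonal, weekday, and weather
# structure, a slow wind trend, and a true regulation effect of -2.0 µg/m³.
rng = np.random.default_rng(42)
n = 7 * 365
dates = pd.date_range("2017-01-01", periods=n, freq="D")
wind = rng.gamma(2.0, 2.0, n) + np.arange(n) / 1000   # windier later years
temp = 8 + 10 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 2, n)
post = (dates >= "2021-01-01").astype(int)            # effect starts in year 5
pm25 = (
    12
    + 3 * np.sin(2 * np.pi * dates.dayofyear / 365)   # seasonality
    - 0.4 * wind                                      # wind disperses particles
    + 1.0 * (dates.dayofweek < 5)                     # weekday traffic
    - 2.0 * post                                      # true regulation effect
    + rng.normal(0, 2, n)
)
df = pd.DataFrame({"pm25": pm25, "temp": temp, "wind": wind,
                   "post": post, "month": dates.month, "dow": dates.dayofweek})

# Naive before-after difference: absorbs the wind trend and anything else
# that changed alongside the regulation.
naive = df.loc[df.post == 1, "pm25"].mean() - df.loc[df.post == 0, "pm25"].mean()

# Adjusted model: the coefficient on `post` is the regulation effect
# net of season, weekday, and weather.
model = smf.ols("pm25 ~ post + C(month) + C(dow) + temp + wind", data=df).fit()
print(f"naive difference:  {naive:.2f}")
print(f"adjusted estimate: {model.params['post']:.2f}")
print(model.conf_int().loc["post"])
```

In this simulation the naive difference overstates the drop because the post-regulation years happen to be windier; the adjusted coefficient recovers the true effect, with a confidence interval to report alongside it.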
Your Role
You deliver a methodologically defensible analysis of whether the regulation changed PM2.5 levels. The analytical terrain is familiar -- inferential analysis with confounders, the kind of work you practiced on Hassan's booking data.
What is new is what happens behind the analysis. For the first time, you build the project's AI infrastructure yourself. Every previous project handed you a CLAUDE.md file that made AI productive from the first prompt. This time, you write that file. You encode the analytical conventions you have learned -- temporal splitting, effect sizes, assumption checks -- into persistent infrastructure that shapes every AI session on this project.
What's New
Last time, you determined the question type from an ambiguous brief, designed a validation strategy, and used meta-prompting to verify unfamiliar territory. You owned the methodology.
This time, you still own it -- but you also build the infrastructure underneath. You will run the analysis once without any project memory and see what AI does by default. Then you build the memory file, run the analysis again, and compare. The difference is the lesson.
The hard part is not the analysis. It is writing infrastructure specific enough that AI follows your conventions from the first prompt -- and recognizing when vague infrastructure produces vague compliance.
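Concretely, "specific enough" might look something like this -- a hypothetical CLAUDE.md fragment encoding the conventions the brief names; the exact wording, structure, and level of detail are yours to decide:

```markdown
# CLAUDE.md: Air Quality Regulation Analysis

## Analytical conventions
- Split data temporally, never randomly: hold out the most recent
  full year before any exploratory fitting.
- Report an effect size with a confidence interval alongside every
  p-value; never report a p-value alone.
- Check model assumptions explicitly (residuals, homoscedasticity,
  autocorrelation for daily series) and keep the diagnostics visible
  in the notebook.
- Treat season, weather, and day of week as confounders in any
  before/after comparison of the regulation.

## Project facts
- Regulation took effect three years into the seven-year window.
- Daily averages from 40 stations in Stockholm, Gothenburg, and Malmo.
```

Vague lines like "use good statistical practice" produce vague compliance; lines like these are checkable against AI's output.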
Tools
- Python 3.11+ via your conda "ds" environment
- Jupyter Notebook for the analysis
- pandas for data handling
- statsmodels for hypothesis tests and assumption checks
- scipy for statistical tests and effect size calculations
- matplotlib / seaborn for visualization
- Claude Code as the AI you direct
- Git / GitHub for version control
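The division of labor between scipy and statsmodels shows up in small ways. A hedged example on synthetic numbers (not the project data): scipy supplies the t-test, but Cohen's d is a standard hand-computed formula, not a scipy function.

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post PM2.5 samples; the means and spread are made up.
rng = np.random.default_rng(0)
pre = rng.normal(14.0, 4.0, 1000)
post = rng.normal(12.5, 4.0, 1000)

t, p = stats.ttest_ind(pre, post)

# Cohen's d from the pooled standard deviation (textbook formula).
pooled_sd = np.sqrt(((len(pre) - 1) * pre.var(ddof=1)
                     + (len(post) - 1) * post.var(ddof=1))
                    / (len(pre) + len(post) - 2))
d = (pre.mean() - post.mean()) / pooled_sd
print(f"t = {t:.2f}, p = {p:.2g}, Cohen's d = {d:.2f}")
```

A tiny p-value with a small d is exactly the pattern a government report needs stated honestly: detectable, but how large?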
Materials
You receive:
- Seven years of air quality monitoring data (daily averages from 40 stations across three cities)
- Weather data matched to each station (temperature, wind speed, precipitation)
- Station metadata (locations, types, cities)
- A data dictionary describing all three datasets
- A methodology memo template
- No CLAUDE.md -- you create this yourself