The Brief
Astrid Lindqvist runs the Air Quality Program at the Nordic Environmental Research Institute in Gothenburg -- an independent nonprofit with 30 researchers and 40 monitoring stations across Sweden, Norway, and Denmark.
The Swedish Environmental Protection Agency wants to know: has the vehicle emission regulation that took effect three years ago actually reduced particulate levels? Seven years of monitoring data across Stockholm, Gothenburg, and Malmo. PM2.5, NO2, ozone, SO2. Weather variables at each station. About 2.5 million data points.
The problem is noise. Season affects air quality. Weather affects it. Day of the week affects it. A naive before-after comparison of means tells you nothing useful. The agency needs an analysis that separates the regulation's effect from everything else -- and reports uncertainty honestly. This is going into a government report.
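Separating the regulation's effect from season, weather, and weekday is the job of a regression with a post-regulation indicator and confounder controls. A minimal sketch of the idea, on entirely synthetic data -- the real column names come from the data dictionary, and the variable names and simulated effect sizes here are hypothetical:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical simulation: daily PM2.5 with seasonal, weekday, and weather
# structure, a slow wind trend, and a true regulation effect of -2.0 µg/m³.
rng = np.random.default_rng(42)
n = 7 * 365
dates = pd.date_range("2017-01-01", periods=n, freq="D")
wind = rng.gamma(2.0, 2.0, n) + np.arange(n) / 1000   # windier later years
temp = 8 + 10 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 2, n)
post = (dates >= "2021-01-01").astype(int)            # effect starts in year 5
pm25 = (
    12
    + 3 * np.sin(2 * np.pi * dates.dayofyear / 365)   # seasonality
    - 0.4 * wind                                      # wind disperses particles
    + 1.0 * (dates.dayofweek < 5)                     # weekday traffic
    - 2.0 * post                                      # true regulation effect
    + rng.normal(0, 2, n)
)
df = pd.DataFrame({"pm25": pm25, "temp": temp, "wind": wind,
                   "post": post, "month": dates.month, "dow": dates.dayofweek})

# Naive before-after difference: absorbs the wind trend and anything else
# that changed alongside the regulation.
naive = df.loc[df.post == 1, "pm25"].mean() - df.loc[df.post == 0, "pm25"].mean()

# Adjusted model: the coefficient on `post` is the regulation effect
# net of season, weekday, and weather.
model = smf.ols("pm25 ~ post + C(month) + C(dow) + temp + wind", data=df).fit()
print(f"naive difference:  {naive:.2f}")
print(f"adjusted estimate: {model.params['post']:.2f}")
print(model.conf_int().loc["post"])
```

In this simulation the naive difference overstates the drop because the post-regulation years happen to be windier; the adjusted coefficient recovers the true effect, with a confidence interval to report alongside it.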
Your Role
You deliver a methodologically defensible analysis of whether the regulation changed PM2.5 levels. The analytical terrain is familiar -- inferential analysis with confounders, the kind of work you practiced on Hassan's booking data.
What is new is what happens behind the analysis. For the first time, you build the project's AI infrastructure yourself. Every previous project handed you a CLAUDE.md file that made AI productive from the first prompt. This time, you write that file. You encode the analytical conventions you have learned -- temporal splitting, effect sizes, assumption checks -- into persistent infrastructure that shapes every AI session on this project.
What's New
Last time, you determined the question type from an ambiguous brief, designed a validation strategy, and used meta-prompting to verify unfamiliar territory. You owned the methodology.
This time, you still own it -- but you also build the infrastructure underneath. You will run the analysis once without any project memory and see what AI does by default. Then you build the memory file, run the analysis again, and compare. The difference is the lesson.
The hard part is not the analysis. It is writing infrastructure specific enough that AI follows your conventions from the first prompt -- and recognizing when vague infrastructure produces vague compliance.
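Concretely, "specific enough" might look something like this -- a hypothetical CLAUDE.md fragment encoding the conventions the brief names; the exact wording, structure, and level of detail are yours to decide:

```markdown
# CLAUDE.md: Air Quality Regulation Analysis

## Analytical conventions
- Split data temporally, never randomly: hold out the most recent
  full year before any exploratory fitting.
- Report an effect size with a confidence interval alongside every
  p-value; never report a p-value alone.
- Check model assumptions explicitly (residuals, homoscedasticity,
  autocorrelation for daily series) and keep the diagnostics visible
  in the notebook.
- Treat season, weather, and day of week as confounders in any
  before/after comparison of the regulation.

## Project facts
- Regulation took effect three years into the seven-year window.
- Daily averages from 40 stations in Stockholm, Gothenburg, and Malmo.
```

Vague lines like "use good statistical practice" produce vague compliance; lines like these are checkable against AI's output.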
Tools
- Python 3.11+ via your conda "ds" environment
- Jupyter Notebook for the analysis
- pandas for data handling
- statsmodels for hypothesis tests and assumption checks
- scipy for statistical tests and effect size calculations
- matplotlib / seaborn for visualization
- Claude Code as the AI you direct
- Git / GitHub for version control
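The division of labor between scipy and statsmodels shows up in small ways. A hedged example on synthetic numbers (not the project data): scipy supplies the t-test, but Cohen's d is a standard hand-computed formula, not a scipy function.

```python
import numpy as np
from scipy import stats

# Hypothetical pre/post PM2.5 samples; the means and spread are made up.
rng = np.random.default_rng(0)
pre = rng.normal(14.0, 4.0, 1000)
post = rng.normal(12.5, 4.0, 1000)

t, p = stats.ttest_ind(pre, post)

# Cohen's d from the pooled standard deviation (textbook formula).
pooled_sd = np.sqrt(((len(pre) - 1) * pre.var(ddof=1)
                     + (len(post) - 1) * post.var(ddof=1))
                    / (len(pre) + len(post) - 2))
d = (pre.mean() - post.mean()) / pooled_sd
print(f"t = {t:.2f}, p = {p:.2g}, Cohen's d = {d:.2f}")
```

A tiny p-value with a small d is exactly the pattern a government report needs stated honestly: detectable, but how large?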
Materials
You receive:
- Seven years of air quality monitoring data (daily averages from 40 stations across three cities)
- Weather data matched to each station (temperature, wind speed, precipitation)
- Station metadata (locations, types, cities)
- A data dictionary describing all three datasets
- A methodology memo template
- No CLAUDE.md -- you create this yourself