Step 1: What project memory does
Every previous project had a CLAUDE.md file in the materials. When you started Claude Code and it read that file, AI already knew the project context: the client, the data, the analytical constraints, the conventions. You did not have to re-state "always use temporal splits" or "always report effect sizes" -- the file did it for you.
That file is project memory. It is infrastructure that loads at session start, giving AI persistent context across every session. The quality of that file determines the quality of AI's baseline behavior. A file that says "always use temporal splits on time-dependent data; never use random splits" produces specific, verifiable output. A file that says "follow best practices" produces whatever AI's defaults happen to be.
You have been benefiting from this infrastructure since your first project. Now you build it yourself.
Step 2: Author CLAUDE.md
Create a CLAUDE.md file in your project directory. This is the file Claude Code will read at the start of every session. It needs to contain everything AI should know about this project from the first prompt:
- Project name and client: NERI Air Quality Regulation Analysis for Astrid Lindqvist
- What you are building: Interrupted time series analysis of PM2.5 trends in three Swedish cities, controlling for seasonal patterns, weather effects, and long-term trends
- Dataset descriptions: Air quality data (daily averages from 40 stations, 7 years), weather data (temperature, wind speed, precipitation per station), station metadata
- Analytical conventions: Always use temporal splits on time-dependent data. Always report effect sizes alongside p-values. Always check statistical assumptions before parametric tests. Always report confidence intervals. Compare against naive baselines.
- Known data quality considerations: Document anything you noticed during profiling in Unit 1
- Verification targets: What should be checked after each analytical step
- Commit convention: How and when to commit
Direct AI to help you write this file, but review every entry. The conventions you encode are the ones you have learned across six projects. Temporal splitting discipline from P5. Effect size reporting from P6. Assumption checking from P6. These are your earned conventions -- not a list from a textbook, but constraints you know matter because you have seen what happens without them.
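Pulled together, the entries above might sketch out like this. Every line here is illustrative -- the headings, wording, and commit rule are assumptions to adapt, not a template to copy:

```markdown
# CLAUDE.md — NERI Air Quality Regulation Analysis

## Project
Interrupted time series analysis of PM2.5 trends in three Swedish cities
for Astrid Lindqvist (NERI), controlling for seasonality, weather, and
long-term trend.

## Data
- Air quality: daily averages from 40 stations, 7 years
- Weather: temperature, wind speed, precipitation per station
- Station metadata

## Analytical conventions
- Always use temporal splits on time-dependent data; never random splits.
- Always report effect sizes alongside p-values.
- Check statistical assumptions before any parametric test.
- Always report confidence intervals; compare against naive baselines.

## Verification
After each analytical step, confirm the split is temporal and that
assumptions were checked before interpreting results.

## Commits
Commit after each verified analytical step with a one-line summary.
```

Notice that every convention line is phrased so its violation is detectable -- "never use random splits" can be checked; "use good methods" cannot.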
Step 3: Test specificity
Write one convention vaguely and one specifically. For example:
- Vague: "Use appropriate statistical methods"
- Specific: "Always use temporal splits on time-dependent data; never use random splits"
Direct AI to prepare the data for analysis. Observe which convention produces verifiable output and which produces defaults. The vague entry gives AI permission to use its defaults -- which, as you saw in Unit 1, includes random splitting on time-dependent data. The specific entry produces output you can verify: did AI use a temporal split? Yes or no.
This is the specificity lesson. Infrastructure quality is not about length -- it is about precision. A 100-line file of vague guidance is worse than a 20-line file of specific constraints.
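That yes-or-no check can itself be automated rather than eyeballed. A minimal sketch, assuming the prepared data lands in a pandas DataFrame with a `date` column -- the column, variable, and function names here are all illustrative:

```python
# Illustrative check: a split is temporal when every training date
# precedes every test date. Names and columns are assumptions.
import pandas as pd

def is_temporal_split(train: pd.DataFrame, test: pd.DataFrame) -> bool:
    """True if the training period ends before the test period begins."""
    return bool(train["date"].max() < test["date"].min())

dates = pd.date_range("2018-01-01", periods=10, freq="D")
df = pd.DataFrame({"date": dates, "pm25": range(10)})

print(is_temporal_split(df.iloc[:7], df.iloc[7:]))     # first week vs. rest → True
print(is_temporal_split(df.iloc[::2], df.iloc[1::2]))  # interleaved dates  → False
```

Run a check like this on whatever split AI produces: the vague convention gives you nothing to feed it, while the specific convention gives you a pass/fail answer.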
Step 4: Author AGENTS.md
Now create a second file: AGENTS.md. This is the cross-platform standard -- a file that any compliant AI coding agent reads at session start. The standard exists precisely to make project memory portable: one file, any agent.
CLAUDE.md captures Claude Code-specific features (ancestor hierarchy, auto-memory, path-scoped rules). AGENTS.md captures the universal constraints that transfer. The analytical conventions, the dataset descriptions, the verification targets -- these belong in both files. The Claude Code-specific configuration belongs only in CLAUDE.md.
Writing both is not duplication. It is authoring for portability from the first file. The filename is a lookup; the understanding is universal.
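As a sketch, the portable subset might look like this -- the same core constraints as CLAUDE.md, with nothing Claude Code-specific. The exact headings and wording are assumptions:

```markdown
# AGENTS.md — NERI Air Quality Regulation Analysis

Portable project memory for any compliant coding agent.

## Data
Daily PM2.5 averages from 40 stations over 7 years, plus per-station
weather (temperature, wind speed, precipitation) and station metadata.

## Conventions (apply in every session)
- Temporal splits only on time-dependent data; never random splits.
- Report effect sizes alongside p-values, with confidence intervals.
- Check statistical assumptions before parametric tests.
- Compare every model against a naive baseline.
```

Anything that mentions ancestor hierarchy, auto-memory, or path-scoped rules stays out of this file and lives only in CLAUDE.md.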
Step 5: Verify both files
Start a new Claude Code session. Watch what happens. AI reads the project files at session start. Its first response should reference the project context -- the dataset, the analytical conventions, the client -- without you having prompted any of it.
Check: does AI mention temporal splits? Effect sizes? Assumption checking? If these conventions appear in AI's first unprompted response, the infrastructure is working. If they do not, the file needs revision -- the entries may be too buried or too vague.
Check that:
- CLAUDE.md exists and contains the dataset schema, the analytical conventions (temporal splits, effect sizes, assumption checks), and the verification targets.
- AGENTS.md exists and contains the same core constraints.
- A fresh session shows AI referencing these conventions unprompted.
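The file-content half of that checklist can be smoke-tested mechanically. A sketch, assuming you run it from the project root -- the phrases checked are examples, not an exhaustive list of your conventions:

```python
# Hypothetical smoke test for the two memory files. File names come from
# this project; REQUIRED_PHRASES is an illustrative subset of conventions.
from pathlib import Path

REQUIRED_PHRASES = ["temporal split", "effect size", "assumption"]

def check_memory_file(path: str) -> list[str]:
    """Return a list of problems for one memory file; empty means it passes."""
    p = Path(path)
    if not p.is_file():
        return [f"missing: {path}"]
    text = p.read_text().lower()
    return [f"{path} lacks: {phrase}"
            for phrase in REQUIRED_PHRASES if phrase not in text]

for name in ("CLAUDE.md", "AGENTS.md"):
    problems = check_memory_file(name)
    print(name, "OK" if not problems else problems)
```

This only confirms the conventions are written down; the fresh-session check above is still what confirms AI actually loads and applies them.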