The Brief
Wei Liang is Director of Marketing at BrightSmile Dental, a chain of six dental clinics in Chengdu, China. Sixty employees across the six locations. He ran a multi-channel marketing campaign last quarter -- WeChat ads, local KOL partnerships, and a patient referral bonus program. Total spend: CNY 180,000.
New patient bookings increased 22% during the campaign period compared to the prior quarter. The problem: bookings also increase every Q4 because of the run-up to Chinese New Year and school holiday scheduling. Wei's boss wants to know if the 22% is campaign-driven or just seasonal. The board meets in three weeks.
Wei has two years of daily booking data across all six clinics -- about 28,000 rows. He can see the numbers went up. He cannot tell whether they would have gone up anyway.
Your Role
You're answering a question that descriptive analysis cannot answer. "Did bookings go up?" is observable. "Did the campaign cause the increase?" requires a statistical test. The analytical method changes -- from counting and charting to hypothesis testing with p-values and confidence intervals.
AI's failure modes change too. In previous projects, AI made computational errors -- wrong aggregation levels, silent row drops, inconsistent metric definitions. Here, AI makes judgment errors: selecting the wrong statistical test for the data type. Your job is to verify the method, not just the numbers.
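What "verify the method" can look like in practice: a sketch that checks a test's distributional assumption before trusting it. Everything here is invented for illustration -- the Poisson booking counts are synthetic, not Wei's data, and the 0.05 threshold is a convention, not a rule:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical daily booking counts: prior quarter vs. campaign quarter
prior = rng.poisson(lam=38, size=90)
campaign = rng.poisson(lam=45, size=90)

# Count data is often skewed -- check normality before reaching for a t-test
_, p_prior = stats.shapiro(prior)
_, p_campaign = stats.shapiro(campaign)

if min(p_prior, p_campaign) < 0.05:
    # Normality looks shaky: fall back to a rank-based test
    stat, p_value = stats.mannwhitneyu(prior, campaign, alternative="two-sided")
    test_used = "Mann-Whitney U"
else:
    stat, p_value = stats.ttest_ind(prior, campaign, equal_var=False)
    test_used = "Welch's t-test"

print(f"{test_used}: statistic={stat:.2f}, p={p_value:.4f}")
```

If AI proposes a plain t-test on skewed count data, a check like this is how you catch the judgment error rather than just re-running the arithmetic.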
What's New
Last time, you built interactive plotly charts and dual-audience Metabase dashboards for Amina's bookstore chain. You planned decomposition before starting and curated context for multi-concern AI sessions.
This time, the terrain shifts. You cross from descriptive analysis to inferential statistics. "Bookings increased 22%" becomes "bookings increased between X% and Y% with 95% confidence, and that increase is / is not statistically significant after accounting for seasonality." scipy.stats and statsmodels enter.
The hard part is not the data -- it is a single source, reasonably clean, with familiar columns. The hard part is framing the right hypothesis, choosing the right test, and reporting the result honestly.
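One way to frame "campaign-driven or just seasonal" is a regression with seasonal controls: quarter dummies absorb the recurring Q4 pattern, so the campaign coefficient estimates the lift beyond a normal Q4. A minimal statsmodels sketch on synthetic data -- the lift sizes, dates, and base rate are invented, not Wei's numbers:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
# Hypothetical two years of daily bookings with a built-in Q4 seasonal lift
dates = pd.date_range("2023-01-01", "2024-12-31", freq="D")
df = pd.DataFrame({"date": dates})
df["quarter"] = df["date"].dt.quarter
df["campaign"] = (df["date"] >= "2024-10-01").astype(int)  # campaign in final quarter only
seasonal_lift = np.where(df["quarter"] == 4, 6, 0)         # every Q4 rises regardless
df["bookings"] = rng.poisson(lam=40 + seasonal_lift + 4 * df["campaign"])

# C(quarter) controls for seasonality; Q4 2023 (no campaign) anchors the baseline,
# so "campaign" picks up only the extra lift in Q4 2024
model = smf.ols("bookings ~ campaign + C(quarter)", data=df).fit()
print(model.params["campaign"])          # point estimate of daily lift
print(model.conf_int().loc["campaign"])  # 95% confidence interval
```

Note the design: the comparison is identified only because the data includes a campaign-free Q4. That is the "framing the right hypothesis" step, and no library call does it for you.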
Tools
- Python 3.11+ (via Miniconda, "analytics" environment)
- DuckDB
- Jupyter Notebook
- pandas
- scipy.stats (new -- hypothesis testing)
- statsmodels (new -- regression and statistical modeling)
- matplotlib / seaborn (statistical visualizations with confidence intervals)
- Metabase (via Docker -- continuing from previous projects)
- Docker
- Claude Code
- Git / GitHub
Materials
- Booking data -- two years of daily bookings across six clinics, about 28,000 rows. Each row is a booking with date, clinic, patient type, service category, source channel, and revenue.
- Campaign calendar -- start/end dates for each channel and the cost breakdown.
- Data dictionary -- column definitions and business terminology for the booking data.
- Statistical testing guide -- hypothesis testing concepts, test selection decision tree, confidence interval reporting, and code examples for scipy.stats and statsmodels.
- CLAUDE.md -- project governance file with client context, work breakdown, and verification targets.