Learn by Directing AI
Unit 4

Communicating uncertainty

Step 1: The confidence interval chart

Wei needs a chart he can put in front of the board. A bar chart showing "bookings increased 22%" is simple but dishonest -- it hides the uncertainty. The honest chart shows the estimated effect with error bars.

Direct AI to build a matplotlib chart comparing the campaign-period and baseline new patient rates, with 95% confidence interval error bars:

Build a matplotlib bar chart comparing the new patient booking rate during the campaign period (Oct-Dec year 2, excluding Gaoxin) vs the baseline (Oct-Dec year 1). Add 95% confidence interval error bars to each bar. Title it clearly. Label the y-axis as the new patient proportion.

The error bars show the range of plausible values for each rate. Where the bars' error ranges overlap, the difference between the two periods is less certain. Where they do not overlap, the difference is more convincing. (Overlap is a visual heuristic, not a formal test -- the significance call still comes from the statistical test itself.)
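A minimal sketch of the chart AI should produce, using hypothetical booking counts (the `successes` and `totals` values are placeholders, not the real data) and a normal-approximation 95% CI for each proportion:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical counts -- substitute the real booking data.
# successes = new patient bookings, totals = all bookings in each window
successes = np.array([180, 260])   # baseline (Oct-Dec yr 1), campaign (Oct-Dec yr 2)
totals = np.array([1000, 1050])

rates = successes / totals
# Normal-approximation 95% CI half-width: 1.96 * sqrt(p(1-p)/n)
half_widths = 1.96 * np.sqrt(rates * (1 - rates) / totals)

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(["Baseline\n(Oct-Dec yr 1)", "Campaign\n(Oct-Dec yr 2)"],
       rates, yerr=half_widths, capsize=8, color=["#9aa5b1", "#3b82f6"])
ax.set_ylabel("New patient proportion")
ax.set_title("New Patient Booking Rate: Campaign vs Baseline (95% CI)")
fig.tight_layout()
fig.savefig("campaign_vs_baseline.png", dpi=150)
```

With small samples or rates near 0% or 100%, ask AI to use a Wilson interval instead of the normal approximation.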

Step 2: Channel comparison with uncertainty

Each campaign channel has a different level of evidence behind it. Some channels had more bookings (narrower confidence intervals, higher certainty). Others had fewer bookings (wider intervals, lower certainty).

Build a bar chart showing the estimated incremental new patients by channel (WeChat ads, KOL, referral). Add 95% confidence interval error bars. Add a horizontal dashed line at zero to mark "no effect." Title: "Campaign Effect by Channel: Incremental New Patients (95% CI)".

The zero line matters. Any bar whose error bars cross zero means the data is consistent with no effect from that channel -- even if the point estimate is positive. Bars whose error bars stay above zero show channels where the evidence supports a real campaign effect.
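A sketch of the channel chart with the zero line. The point estimates and CI bounds below are hypothetical placeholders; matplotlib's `yerr` takes error-bar *lengths*, so the absolute CI bounds must be converted first:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical channel estimates -- replace with values from your analysis.
channels = ["WeChat ads", "KOL", "Referral"]
point = np.array([28.0, 12.0, 5.0])      # estimated incremental new patients
ci_low = np.array([18.0, 2.0, -4.0])     # lower 95% CI bounds
ci_high = np.array([38.0, 22.0, 14.0])   # upper 95% CI bounds

# Convert absolute bounds to (lower, upper) error-bar lengths
yerr = np.vstack([point - ci_low, ci_high - point])

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(channels, point, yerr=yerr, capsize=8, color="#3b82f6")
ax.axhline(0, linestyle="--", color="black", linewidth=1)  # "no effect" line
ax.set_ylabel("Incremental new patients")
ax.set_title("Campaign Effect by Channel: Incremental New Patients (95% CI)")
fig.tight_layout()
fig.savefig("channel_effect.png", dpi=150)
```

In this illustrative data, the Referral bar's interval crosses the zero line -- exactly the pattern the paragraph above warns about.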

Step 3: Service category breakdown

The chi-squared test showed that the campaign effect differs across service categories. Display this visually -- cosmetic dentistry versus general dentistry, side by side, with confidence intervals.

Build a grouped bar chart showing the new patient rate by service category for the campaign period vs the baseline. Include error bars. Highlight which categories showed statistically significant increases and which did not.

The cosmetic dentistry increase should be visually obvious -- the campaign-period bar clearly higher than the baseline, with error bars that do not overlap. The general dentistry bars should be closer together, with overlapping intervals.
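A sketch of the grouped chart, again with placeholder rates and half-widths; the `significant` flags stand in for the per-category test results and are marked with an asterisk above the campaign bar:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical rates and CI half-widths per category -- substitute real values.
categories = ["Cosmetic", "General"]
baseline = np.array([0.12, 0.18])
campaign = np.array([0.21, 0.19])
baseline_err = np.array([0.020, 0.015])
campaign_err = np.array([0.025, 0.015])
significant = [True, False]  # e.g. from a per-category two-proportion test

x = np.arange(len(categories))
width = 0.35

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(x - width / 2, baseline, width, yerr=baseline_err, capsize=6,
       label="Baseline", color="#9aa5b1")
bars = ax.bar(x + width / 2, campaign, width, yerr=campaign_err, capsize=6,
              label="Campaign", color="#3b82f6")
# Mark statistically significant increases with an asterisk above the bar
for bar, err, sig in zip(bars, campaign_err, significant):
    if sig:
        ax.text(bar.get_x() + bar.get_width() / 2,
                bar.get_height() + err + 0.005, "*", ha="center", fontsize=14)
ax.set_xticks(x)
ax.set_xticklabels(categories)
ax.set_ylabel("New patient proportion")
ax.set_title("New Patient Rate by Category (95% CI; * = significant)")
ax.legend()
fig.tight_layout()
fig.savefig("category_breakdown.png", dpi=150)
```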

Step 4: Writing findings in uncertainty language

Charts show the uncertainty. The written findings must do the same. AI defaults to point estimates -- "the campaign increased bookings by 22%" -- because single numbers sound authoritative. The professional obligation is to include the range.

Direct AI to draft the findings:

Write a summary of the campaign effectiveness findings using uncertainty language. For each finding, include the point estimate, the 95% confidence interval, and the p-value. Use the format from the statistical testing guide's "Reporting with Confidence Intervals" section.

Review what AI produces. Check for these patterns:

  • Point estimates without intervals. "The campaign generated 45 additional patients" should be "The campaign generated an estimated 35-55 additional patients (95% CI)."
  • Binary significance language. "The effect on general dentistry was not significant" should be "The increase in general dentistry bookings was not statistically significant -- the observed increase is consistent with normal seasonal variation."
  • Missing qualifiers. "The campaign worked" should be "The data provides statistically significant evidence of a campaign effect on cosmetic dentistry bookings."

The difference between "not statistically significant" and "no effect" is critical. A non-significant result means the data does not provide strong enough evidence -- not that the campaign definitely had no impact. AI conflates these regularly.
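The reporting format above can be generated mechanically. This sketch computes the point estimate, 95% CI, and two-sided p-value for a difference in proportions (normal approximation) and renders the sentence in uncertainty language; the counts passed at the bottom are hypothetical:

```python
import math

def two_proportion_summary(x1, n1, x2, n2, label):
    """Report the difference p2 - p1 with a 95% CI and p-value,
    phrased in uncertainty language (normal approximation)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    # Unpooled SE for the confidence interval
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    lo, hi = diff - 1.96 * se, diff + 1.96 * se
    # Pooled SE for the z-test of "no difference"
    p_pool = (x1 + x2) / (n1 + n2)
    se_pool = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = diff / se_pool
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided
    verdict = ("statistically significant" if p_value < 0.05 else
               "not statistically significant -- consistent with no effect")
    return (f"{label}: estimated change of {diff:+.1%} "
            f"(95% CI {lo:+.1%} to {hi:+.1%}, p = {p_value:.3f}; {verdict}).")

# Hypothetical baseline vs campaign counts -- replace with the real ones.
print(two_proportion_summary(120, 1000, 210, 1050, "Cosmetic dentistry"))
print(two_proportion_summary(180, 1000, 195, 1050, "General dentistry"))
```

Compare AI's draft sentences against output like this -- every finding should carry all three numbers.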

Step 5: Updating the Metabase dashboard

Add a summary panel to the Metabase dashboard showing the statistical findings. Wei's board needs both the visual charts and the dashboard KPIs.

Create a Metabase question using SQL that displays the primary statistical findings: overall campaign effect with confidence interval, channel-level breakdown, and service category results. Format the output for a dashboard panel.

The dashboard KPI for the campaign effect should include the range, not just the midpoint. "Estimated additional new patients: 35-55 (95% CI)" communicates honestly. A single number -- "45 additional patients" -- does not.

Step 6: Cost per incremental patient

Wei's fifth requirement: "Give me a number I can put in front of the board." The number is cost per incremental patient, and it needs to be a range.

Direct AI to compute it:

Calculate the cost per incremental new patient by channel. Total campaign spend was CNY 180,000 (WeChat: 80,000, KOL: 60,000, Referral: 40,000). Divide each channel's spend by the estimated incremental patients from that channel. Express the result as a range using the confidence interval bounds, not as a single number.

The math is straightforward: spend divided by incremental patients. But because the incremental patient count is an estimate with a confidence interval, the cost per patient is also a range. Note that the bounds flip: the upper bound of the patient CI (more patients) gives the lower bound of the cost, and vice versa. A wide confidence interval on the patient count produces a wide range on the cost -- that is the honest version.
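A sketch of the calculation. The spend figures come from the brief; the per-channel incremental-patient CIs are hypothetical placeholders. The code also handles the awkward case where a CI includes zero, which makes the cost per patient unbounded above:

```python
# Spend per channel (CNY), from the campaign brief.
spend = {"WeChat ads": 80_000, "KOL": 60_000, "Referral": 40_000}

# Hypothetical (lower, upper) 95% CI bounds on incremental patients --
# replace with the estimates from the channel analysis.
incremental_ci = {
    "WeChat ads": (18, 38),
    "KOL": (2, 22),
    "Referral": (-4, 14),
}

for channel, cost in spend.items():
    lo, hi = incremental_ci[channel]
    if lo <= 0:
        # CI includes zero patients: the upper cost bound is undefined.
        print(f"{channel}: at least CNY {cost / hi:,.0f} per incremental "
              f"patient; upper bound undefined (CI includes zero effect)")
    else:
        # Bounds flip: more patients -> lower cost per patient.
        print(f"{channel}: CNY {cost / hi:,.0f} - {cost / lo:,.0f} "
              f"per incremental patient (95% CI)")
```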

Step 7: The referral attribution gap

Ask Wei about the referral program tracking. Open the chat:

I noticed the referral booking source appears in both years of data -- before and after the campaign. Can you tell me how referral tracking worked before the bonus program started?

Wei responds: "The referral bonus was CNY 200 per referred patient. Before the campaign, we tracked referrals but couldn't separate organic referrals from bonus-driven ones. So the baseline referral rate is approximate."

This means the referral channel's confidence interval is wider than the numbers suggest. The baseline includes organic referrals, and the campaign period mixes organic and bonus-driven referrals under the same tag. Adjust the interpretation: note in the findings that the referral channel estimate carries additional uncertainty beyond what the statistical test captures.

✓ Check

Check: You should have at least one chart with error bars or confidence interval bands, findings written with uncertainty language (not point estimates), and a cost-per-incremental-patient figure expressed as a range.