Step 1: Read the communication plan
Open materials/project-plan.md, Section 5 (Communication).
Wanjiku does not want a coefficient table. She does not want R-squared. She wants two things: a ranked list of appointments by no-show risk, so Grace can prioritize reminder calls, and a summary that tells her how much to trust the predictions. Everything you have built so far exists to produce these two artifacts.
Step 2: Generate the ranked list
Direct AI to take the test set appointments and rank them by predicted no-show probability, highest risk at the top. Include the columns that matter for Grace's work: date, time slot, visit type, client tenure, pet species, and the predicted probability formatted as a percentage.
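One way this step could look in pandas, assuming a test-set DataFrame with hypothetical column names (`pred_prob` for the model's predicted no-show probability; adjust the names to match your actual data):

```python
import pandas as pd

# Hypothetical test-set rows -- stand-ins for your real appointments.
test_df = pd.DataFrame({
    "date": ["2024-03-04", "2024-03-04", "2024-03-04"],
    "time_slot": ["morning", "afternoon", "afternoon"],
    "visit_type": ["checkup", "vaccination follow-up", "dental"],
    "client_tenure_years": [4.0, 0.2, 2.5],
    "pet_species": ["dog", "cat", "dog"],
    "pred_prob": [0.08, 0.61, 0.27],
})

# Rank highest risk first, and format the probability as a percentage
# so the printed list reads naturally at the reception desk.
ranked = (
    test_df
    .sort_values("pred_prob", ascending=False)
    .assign(no_show_risk=lambda d: (d["pred_prob"] * 100)
            .round().astype(int).astype(str) + "%")
    .drop(columns="pred_prob")
    .reset_index(drop=True)
)
print(ranked.to_string(index=False))
```

The point of the formatting step is the audience: Grace works from a printed list, so "61%" communicates where a raw 0.6123 does not.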
This is the artifact Wanjiku asked for. Not the model. Not the metrics. A list that Grace can print and use at the reception desk on Monday morning.
The top of the list should make domain sense. Vaccination follow-ups with new clients in afternoon slots should rank high. Morning appointments with returning clients should rank low. If the ordering does not match what you know about the clinic's no-show patterns, something went wrong upstream.
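A quick way to run that sanity check is to compare mean predicted risk across segments rather than eyeballing individual rows. A minimal sketch, with hypothetical predictions standing in for your temporal-split test set:

```python
import pandas as pd

# Hypothetical predictions -- replace with your actual test-set output.
preds = pd.DataFrame({
    "time_slot": ["morning", "morning", "afternoon", "afternoon"],
    "client_tenure_years": [5.0, 3.2, 0.3, 0.1],
    "pred_prob": [0.05, 0.09, 0.55, 0.62],
})

# Mean predicted risk per slot; given the clinic's known patterns,
# afternoon slots should average higher than morning slots.
slot_risk = preds.groupby("time_slot")["pred_prob"].mean()
print(slot_risk)
```

If a comparison like this comes out backwards, treat it as a symptom of an upstream problem, not something to patch in the list itself.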
Step 3: Write the client summary
Direct AI to write a summary for Wanjiku explaining three things: what the model does, how accurate it is, and what the ranked list means for her scheduling.
The accuracy question is where most AI summaries fail. AI will report RMSE as a number. Wanjiku needs that number translated. "The model is typically off by about X percentage points" is useful. "RMSE = 0.27" is not. The translation from metric to practical language is the communication work.
R-squared and RMSE are not grades. They are measures of how useful the model is for Wanjiku's specific decision: which appointments to double-book or call about. Frame the accuracy in those terms.
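The translation from metric to plain language is mechanical once you see it: on a 0-to-1 probability scale, RMSE times 100 is "percentage points of typical error." A minimal sketch with made-up outcomes and predictions:

```python
import numpy as np

# Hypothetical data: 1 = no-show, 0 = showed up, plus predicted probabilities.
actual = np.array([1, 0, 0, 1, 0])
predicted = np.array([0.7, 0.2, 0.4, 0.5, 0.1])

# RMSE on the probability scale...
rmse = np.sqrt(np.mean((actual - predicted) ** 2))

# ...translated into the sentence Wanjiku actually needs.
print(f"The model is typically off by about {rmse * 100:.0f} percentage points.")
```

The sentence in the print statement is the artifact; the number alone is not.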
Step 4: Review the translation
Read what AI produced. Check whether the summary actually speaks Wanjiku's language.
Common failure modes: AI dumps a coefficient table and calls it a summary. AI reports R-squared without explaining what it means for scheduling. AI hedges with so many caveats that the summary says nothing actionable.
If the summary reads like a statistics report, direct AI to revise: "Wanjiku needs to know whether this is accurate enough for scheduling decisions, not what R-squared means." A focused revision prompt produces better results than "make it simpler."
Step 5: Check for context degradation
This is a long session. You have made specific decisions across multiple units: the cleaning strategy for missing values, the temporal split instead of a random split, specific coefficient interpretations. AI has been tracking all of this in its context window.

Context windows have limits. As a session grows, earlier decisions can drift. AI may contradict something it agreed to earlier -- not because it changed its mind, but because the earlier context has faded. This is context degradation. It is a structural property of how context windows work, not a bug.
Read through the summary AI wrote. Check specific claims against your earlier work:
- Does the summary reference the temporal split, or did AI slip back to the random split results?
- Does RMSE match the temporal-split evaluation, not the optimistic random-split number?
- Are the cleaning decisions described accurately, or did AI introduce details you did not decide?
- Do the coefficient descriptions match the signs and magnitudes you verified in Unit 4?
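Some of these checks can be partially automated. A minimal sketch of the RMSE check, with hypothetical values standing in for your earlier evaluations and the drafted summary:

```python
import re

# Hypothetical figures from your earlier work -- replace with your own.
temporal_rmse = 0.27   # the temporal-split evaluation you verified
summary_text = "The model was evaluated on held-out future weeks; RMSE = 0.27."

# Pull any RMSE figure quoted in the summary and check it against the
# temporal-split evaluation, not the optimistic random-split one.
match = re.search(r"RMSE\s*=\s*(\d+\.\d+)", summary_text)
quoted = float(match.group(1)) if match else None
assert quoted == temporal_rmse, "Summary quotes the wrong evaluation's RMSE"
print("Summary RMSE matches the temporal-split evaluation.")
```

Checks on methodology wording and coefficient signs still need a human read; automation only catches the numbers.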
Context degradation is detectable through specific symptoms: AI re-introduces a methodology you corrected, contradicts a constraint from earlier in the session, or loses track of which analytical decisions have already been made.
Step 6: Correct if needed
If you find contradictions, direct AI to re-check specific claims against the earlier output. Be precise: "In Unit 3, we re-split the data temporally. The summary says the model was evaluated on a random holdout. Correct this to reflect the temporal split."
Specific correction prompts work. Vague ones do not. "Fix the summary" gives AI no anchor. "The RMSE in the summary should match the temporal-split evaluation from Unit 4" tells AI exactly what to verify and where.
If the summary is consistent with all earlier decisions, note that and move on. Context degradation does not always occur. The practice is checking for it, not finding it.
Check: The ranked list uses temporal-split predictions, not random-split. The summary translates RMSE into practical language. No contradictions with earlier analytical decisions.