Learn by Directing AI

Step 1: Dr. Petrova's prompt

Dr. Nadia Petrova, a senior data scientist, sends a brief message: "the client said 'understand our booking patterns.' that's not a question type. what are they actually asking -- description, inference, or causation? the methodology depends entirely on the answer."

She is right. Every previous project gave you the question type. P1 was descriptive. P2 was predictive regression. P3 was inferential. P4 was classification. P5 was demand forecasting. This time, nobody tells you. The brief is ambiguous, and the analytical approach you choose determines what the entire analysis looks like.

Step 2: Articulate the framings

Direct AI to articulate what each possible framing of Hassan's request would produce. Give it the brief -- "I want to understand our booking patterns" -- and ask: what would a descriptive analysis produce? What would an inferential analysis produce? What would a predictive analysis produce? What would a causal analysis produce?

Each framing leads to a fundamentally different analysis:

Descriptive: "What are the booking patterns?" Seasonal charts, segment breakdowns, growth rates. Output: a report showing what happened.
Inferential: "What factors are associated with booking growth?" Regression with coefficients, effect sizes, confidence intervals. Output: findings with evidence about which factors matter.
Predictive: "How many bookings next quarter?" A forecast model with holdout evaluation. Output: point predictions with error bounds.
Causal: "Did the marketing shift cause the growth?" Causal inference methods with explicit controls for confounding. Output: causal claims that hold only under strong assumptions.

The framing choice is not cosmetic. Each produces a different deliverable, requires different methods, and answers a different question.
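One way to keep the four framings side by side while you compare them is a small lookup structure. The sketch below is only an illustration: the `Framing` class, the `FRAMINGS` name, and the `summarize` helper are hypothetical, and the text inside them is copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Framing:
    """One possible reading of "understand our booking patterns"."""
    question: str
    methods: tuple
    output: str


# Hypothetical container; the contents restate the four framings listed above.
FRAMINGS = {
    "descriptive": Framing(
        question="What are the booking patterns?",
        methods=("seasonal charts", "segment breakdowns", "growth rates"),
        output="a report showing what happened",
    ),
    "inferential": Framing(
        question="What factors are associated with booking growth?",
        methods=("regression with coefficients", "effect sizes", "confidence intervals"),
        output="findings with evidence about which factors matter",
    ),
    "predictive": Framing(
        question="How many bookings next quarter?",
        methods=("forecast model", "holdout evaluation"),
        output="point predictions with error bounds",
    ),
    "causal": Framing(
        question="Did the marketing shift cause the growth?",
        methods=("causal inference methods", "confounding controls"),
        output="causal claims with strong assumptions",
    ),
}


def summarize(framings):
    """Return one line per framing: name, question, and deliverable."""
    return [f"{name}: {f.question} -> {f.output}" for name, f in framings.items()]


for line in summarize(FRAMINGS):
    print(line)
```

Reading the four lines together makes the point concrete: the same brief maps to four different deliverables.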

Step 3: Notice AI's default

Now ask AI a different question. Without specifying inference, ask it to suggest the best analytical approach for Hassan's data. Something like: "Given this booking data and marketing spend data, what's the best way to analyze whether Hassan's business decisions are working?"

Read what AI suggests. It will likely propose building a prediction model -- forecasting future bookings, perhaps with time series methods or a regression model evaluated on holdout accuracy. This is the most technically impressive approach. It is also not what Hassan needs.

Hassan's decision is whether to double down on digital marketing. He needs to know whether the marketing shift is associated with booking growth after accounting for other factors. That is an inference question, not a prediction question. A booking forecast tells Hassan how many bookings to expect next month. It does not tell him whether the marketing change worked.
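The difference can be seen on a toy example. The sketch below uses synthetic numbers invented purely for illustration (they are not Hassan's data, and the built-in effect sizes are assumptions): bookings are constructed as a fixed function of an exchange-rate index plus a step when digital marketing starts. The helper `ols` and all variable names are hypothetical, and the code sticks to the standard library so it runs anywhere. A real analysis would more likely use a statistics package that also reports confidence intervals, which this sketch does not compute.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic data (an assumption for illustration): 24 months, digital marketing
# switched on at month 12, exchange-rate index drifting upward with a wiggle.
months = list(range(24))
digital = [1.0 if t >= 12 else 0.0 for t in months]
fx = [100 + 0.5 * t + 3 * ((t % 4) - 1.5) for t in months]
# By construction: +2 bookings per index point, +15 bookings with digital on.
bookings = [50 + 2.0 * f + 15.0 * d for f, d in zip(fx, digital)]

# Predictive framing: a straight-line trend gives a number for next month,
# but the model contains no marketing term at all.
trend_intercept, trend_slope = ols([[1.0, float(t)] for t in months], bookings)
forecast_next_month = trend_intercept + trend_slope * 24

# Naive before/after comparison: mixes the marketing step with the FX drift.
naive_gap = sum(bookings[12:]) / 12 - sum(bookings[:12]) / 12

# Inferential framing: the coefficient on `digital` after controlling for FX.
intercept, digital_coef, fx_coef = ols(
    [[1.0, d, f] for d, f in zip(digital, fx)], bookings
)

print(f"forecast for month 24: {forecast_next_month:.1f}")
print(f"naive before/after gap: {naive_gap:.1f}")
print(f"digital coefficient, controlling for FX: {digital_coef:.1f}")
```

In this constructed example the forecast is a usable number yet says nothing about marketing, and the naive before/after gap is larger than the built-in marketing effect because the exchange rate rose over the same period. Only the regression coefficient isolates the association Hassan is asking about.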

AI tends to default to prediction because it gravitates toward the most technically elaborate approach on offer. This is a consistent bias, not a one-time mistake. Recognizing it is part of directing AI effectively.

Step 4: Ask Hassan to clarify

Go back to the platform. Present the framings to Hassan and ask which one serves his decision. Something like: "Before I start the analysis, I want to make sure we're answering the right question. Are you looking to describe what's been happening, figure out what's driving the growth, or predict future bookings?"

Hassan pauses, then replies, thinking it through: "all of those, but -- actually, I mostly need to know whether moving to digital marketing worked, because I need to decide whether to double down. If Instagram and Google are working, I'll increase the budget. If the growth is mostly the exchange rate, that's not something I control and I need a different strategy."

That confirms it. The primary question is inferential: is the shift to digital marketing associated with increased bookings, after controlling for seasonality and exchange rate trends? Descriptive analysis of seasonal patterns supports the secondary question about staffing.
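Written as a model, the primary question might look like the following. This is one possible specification, not the one the project prescribes: it assumes monthly data, and every symbol is introduced here for illustration ($D_t$ is an indicator for months after the shift to digital marketing, $\text{fx}_t$ is the exchange rate, and the month indicators stand in for seasonality).

```latex
\text{bookings}_t
  = \beta_0
  + \beta_1 D_t
  + \beta_2\,\text{fx}_t
  + \sum_{m=2}^{12} \gamma_m \,\mathbf{1}[\text{month}_t = m]
  + \varepsilon_t
```

Under this framing the quantity of interest is $\beta_1$, with its effect size and confidence interval: the association between the marketing shift and bookings once seasonality and the exchange rate are held fixed. It is an association, not a causal estimate.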

Step 5: Document the methodology choice

Open materials/methodology-memo-template.md. This template has a new section compared to P5: "Analytical Question and Framing." It asks you to document the question type determination, the alternative framings you considered, and your rationale.

Direct AI to start filling in the methodology memo. Document:

Question type: Inference. The primary question is whether the digital marketing shift is associated with booking growth after controlling for seasonality and exchange rate.
Why not prediction: AI suggested prediction. A forecast model answers "how many bookings next quarter?" but does not tell Hassan whether his marketing change worked. Hassan's decision -- whether to increase the digital budget -- requires knowing which factors are associated with the growth, not how much growth to expect.
Descriptive support: Seasonal patterns and segment breakdowns serve Hassan's staffing and inventory planning. This is a secondary descriptive component, not the primary analysis.

This is the most consequential decision in the project. Everything downstream -- the model, the validation, the deliverable -- follows from it.

Step 6: Plan the analytical decomposition

Before starting any computation, write down the analysis plan. Direct AI to help you structure the sequence, but own the plan:

Clean data -- remove or flag cancellations, investigate attribution reliability
Descriptive analysis -- seasonal patterns, segment profiles, growth trends
Inferential analysis -- regression with the marketing shift as the key variable, controlling for seasonality and exchange rate
Validation -- assumption checks, effect sizes, sensitivity analysis, cross-model review
Translation -- findings in Hassan's decision language, technical appendix for his partner

Each step has a reason for its position. Cleaning must happen before analysis. Descriptive analysis gives you the lay of the land before you build the regression. Validation happens after the model, not during it. Translation is last because it depends on having validated findings.

Write this plan in your notebook or in the methodology memo. Having the sequence documented before you start prevents the common failure of jumping straight to modeling.
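If you keep the plan in a notebook, it can also be written as data so the order is explicit. The sketch below is a hypothetical convenience, not part of the project materials: `PLAN` restates the five steps above, and `next_step` is an illustrative helper that refuses to skip ahead.

```python
# The five steps from the plan above, in order. Step names are shorthand
# chosen for this sketch.
PLAN = [
    ("clean", "remove or flag cancellations, investigate attribution reliability"),
    ("descriptive", "seasonal patterns, segment profiles, growth trends"),
    ("inferential", "regression on the marketing shift, controlling for seasonality and exchange rate"),
    ("validation", "assumption checks, effect sizes, sensitivity analysis, cross-model review"),
    ("translation", "findings in Hassan's decision language, technical appendix for his partner"),
]


def next_step(completed):
    """Return the first unfinished step, or None when the plan is done.

    Raises ValueError if a later step is marked complete before an earlier one.
    """
    names = [name for name, _ in PLAN]
    unknown = set(completed) - set(names)
    if unknown:
        raise ValueError(f"not in the plan: {sorted(unknown)}")
    for position, name in enumerate(names):
        if name not in completed:
            jumped = [later for later in names[position + 1:] if later in completed]
            if jumped:
                raise ValueError(f"{jumped} marked done before {name!r}")
            return name
    return None


print(next_step({"clean", "descriptive"}))
```

Calling `next_step({"inferential"})` raises an error, which is the documented plan doing its job: modeling cannot be marked done before cleaning and description.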
