CLAUDE.md

Nile Compass Tours Booking Analysis

Client

Hassan El-Amin, Founder and Managing Director of Nile Compass Tours (Cairo, Egypt). Nile Compass Tours is a growing tour operator offering private cultural tours, multi-day Egypt itineraries, and Nile cruise packages. It handles about 3,000 bookings per year.

What you are building

An inferential analysis of booking patterns to determine what factors are associated with booking growth -- specifically, whether the shift to digital marketing 18 months ago is associated with increased bookings after controlling for seasonality and exchange rate trends. The inferential work is supported by a descriptive analysis of seasonal patterns to inform staffing decisions.

Tech stack

  • Python 3.11+ (conda "ds" environment)
  • Jupyter Notebook
  • pandas
  • statsmodels (OLS regression, hypothesis tests, assumption checks)
  • scipy (statistical tests, effect size calculations)
  • scikit-learn (supplementary modeling if needed)
  • matplotlib / seaborn (visualization)

File structure

p6/
  materials/
    bookings.csv              -- 3 years of booking data (~6,300 rows)
    marketing-spend.csv       -- Monthly marketing spend by channel (144 rows)
    data-dictionary.md        -- Field definitions for both datasets
    methodology-memo-template.md -- Template for documenting analytical decisions
    CLAUDE.md                 -- This file
  analysis.ipynb              -- Main analysis notebook (student creates)
  findings-summary.md         -- Findings for Hassan (student creates)
  technical-appendix.md       -- Methodology for silent partner (student creates)
  methodology-memo.md         -- Completed methodology memo (student creates)
  decision-record.md          -- Key decision documentation (student creates)

Key analytical concepts

  • Question typology: The brief is ambiguous ("understand our booking patterns"). The student must determine whether this is a descriptive, inferential, predictive, or causal question.
  • Inference vs prediction: Hassan needs to know whether the marketing shift worked (inference), not how many bookings to expect next quarter (prediction).
  • Effect sizes: Statistical significance alone is not enough. Effect sizes tell Hassan whether the finding is large enough to act on.
  • Self-reported attribution: The marketing_channel field is self-reported and systematically biased. This limitation must be documented.
  • Confounding: Multiple factors changed around the same time (marketing shift, exchange rate, new tour types). The regression controls for measured confounders but cannot prove causation.
  • Assumption checking: Check regression assumptions (normality of residuals, homoscedasticity, absence of strong multicollinearity among predictors) before interpreting coefficients.
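
The confounding and assumption-checking concepts above can be sketched as one small regression workflow. This is a minimal illustration on synthetic monthly data, not the project's actual model. The column names (`bookings`, `marketing_shift`, `exchange_rate`, `month_of_year`) and all values are assumptions made up for the example; the real field names come from data-dictionary.md.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic stand-in for 36 months of aggregated bookings (hypothetical columns).
rng = np.random.default_rng(0)
t = np.arange(36)
months = pd.date_range("2022-01-01", periods=36, freq="MS")
monthly = pd.DataFrame({
    "month_of_year": months.month,
    "marketing_shift": (t >= 18).astype(int),  # 1 after the shift, 0 before
    "exchange_rate": 30 + 0.5 * t + rng.normal(0, 1, 36),
})
monthly["bookings"] = (
    200
    + 20 * monthly["marketing_shift"]
    + 15 * np.sin(2 * np.pi * monthly["month_of_year"] / 12)
    + rng.normal(0, 10, 36)
)

# OLS with the shift indicator as key predictor, controlling for
# exchange rate and seasonality (month-of-year dummies).
model = smf.ols(
    "bookings ~ marketing_shift + exchange_rate + C(month_of_year)",
    data=monthly,
).fit()

# Assumption checks, run before reading the coefficients.
shapiro_stat, shapiro_p = stats.shapiro(model.resid)             # normality of residuals
bp_lm, bp_p, bp_f, bp_f_p = het_breuschpagan(model.resid, model.model.exog)  # homoscedasticity
exog, names = model.model.exog, model.model.exog_names
vif = {
    name: variance_inflation_factor(exog, names.index(name))     # multicollinearity
    for name in ("marketing_shift", "exchange_rate")
}

print(f"Shapiro-Wilk p = {shapiro_p:.3f}, Breusch-Pagan p = {bp_p:.3f}")
print("VIF:", {k: round(v, 2) for k, v in vif.items()})
print("marketing_shift coefficient:", round(model.params["marketing_shift"], 2))
print("95% CI:", model.conf_int().loc["marketing_shift"].round(2).tolist())
```

In this synthetic setup the exchange rate trends upward over the same period as the shift, so the two predictors are likely to show inflated VIFs. That mirrors the confounding concern: when factors move together, the coefficient on `marketing_shift` describes an association, not a proven cause.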

Task list

  1. Profile data -- load and profile bookings.csv and marketing-spend.csv
  2. Determine question type -- analyze Hassan's brief, identify the question type, document the framing decision
  3. Clean and describe -- handle cancellations, investigate attribution reliability, run descriptive analysis
  4. Run inferential analysis -- regression with marketing_shift as key predictor, compute effect sizes
  5. Design and run validation -- design validation strategy, meta-prompting, sensitivity analysis, cross-model review
  6. Translate findings -- translate statistical findings into business terms, prepare two deliverables
  7. Deliver and close -- send to Hassan, handle scope extensions, write decision record, commit and push
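
As a rough illustration of the cleaning step in task 3, the sketch below separates cancelled from confirmed bookings and aggregates confirmed bookings by month. The tiny inline table and the column names (`booking_id`, `booking_date`, `status`) are hypothetical placeholders; the actual fields and status values should be taken from data-dictionary.md.

```python
import pandas as pd

# Hypothetical miniature of bookings.csv; real column names may differ.
bookings = pd.DataFrame({
    "booking_id": [1, 2, 3, 4, 5, 6],
    "booking_date": pd.to_datetime([
        "2023-01-05", "2023-01-20", "2023-02-03",
        "2023-02-14", "2023-02-28", "2023-03-09",
    ]),
    "status": ["confirmed", "cancelled", "confirmed",
               "confirmed", "cancelled", "confirmed"],
})

# Split once, keep both subsets: cancellations are analyzed, not discarded.
is_cancelled = bookings["status"].eq("cancelled")
confirmed = bookings.loc[~is_cancelled]
cancelled = bookings.loc[is_cancelled]
cancel_rate = is_cancelled.mean()

# Monthly counts of confirmed bookings, the unit used for the regression.
monthly_confirmed = (
    confirmed.groupby(confirmed["booking_date"].dt.to_period("M"))
    .size()
    .rename("confirmed_bookings")
)

print(f"Cancellation rate: {cancel_rate:.1%}")
print(monthly_confirmed)
```

Keeping the cancelled subset in its own frame makes it possible to check later whether cancellation rates themselves changed around the marketing shift.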

Verification targets

  • Question type documented and justified (inference, not prediction)
  • Cancelled bookings (~15%) separated from confirmed
  • Self-reported attribution limitation documented
  • Regression assumptions checked before interpreting coefficients
  • Effect sizes computed alongside p-values
  • Cross-model review completed with a separate AI context
  • All five of Hassan's requirements addressed
  • Findings stated in business terms, not statistical language
  • Confounding limitations stated honestly (association, not causation)
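
One way to meet the "effect sizes computed alongside p-values" target is sketched below: a Welch t-test reported together with Cohen's d and a percentage change. The monthly counts are invented for the example, and `cohens_d` is a hypothetical helper, not a scipy function. This comparison is unadjusted, so it complements the regression coefficient and does not replace it.

```python
import numpy as np
from scipy import stats

# Invented monthly confirmed-booking counts, before and after the shift.
before = np.array([180, 195, 210, 188, 202, 199], dtype=float)
after = np.array([205, 221, 230, 214, 226, 218], dtype=float)


def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
    """Standardized mean difference (b minus a) using the pooled sample SD."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return float((b.mean() - a.mean()) / np.sqrt(pooled_var))


# Welch's t-test does not assume equal variances in the two periods.
t_stat, p_value = stats.ttest_ind(after, before, equal_var=False)
d = cohens_d(before, after)
pct_change = (after.mean() - before.mean()) / before.mean() * 100

print(f"p-value: {p_value:.4f}")
print(f"Cohen's d: {d:.2f}")
print(f"Change in mean monthly bookings: {pct_change:.1f}%")
```

The percentage change is usually the figure to lead with for Hassan, since it states the size of the difference in business terms; the p-value and d belong in the technical appendix.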

Commit convention

Commit after each major analytical milestone with a meaningful message describing what was decided and why. Example: "feat: determine question type as inference, document prediction alternative"