CLAUDE.md

Cumbre Adventures A/B Test Analysis

Project

You are analysing an A/B test for Marco Quispe, founder and lead guide of Cumbre Adventures in La Paz, Bolivia. Marco redesigned his online booking page and ran a 60-day experiment: half of website visitors saw the old page (version A), half saw the new page (version B). The overall booking rate went up, but premium trek bookings dropped. His web developer and operations manager disagree on whether the new page is working. Marco needs a proper statistical analysis.

Client

Marco Quispe, founder. Cumbre Adventures offers mountain biking on Death Road, trekking in the Cordillera Real, climbing Huayna Potosi, and paragliding over La Paz. The company has 15 employees and has been in business for seven years. Most bookings come through the website; some come from partner hostels and travel agencies.

Tech Stack

  • Python 3.11+ (via Miniconda, "analytics" environment)
  • DuckDB (local analytical database, also accessed via MCP)
  • Jupyter Notebook
  • pandas
  • scipy.stats (chi-squared test; z-test for proportions built on scipy.stats.norm, since scipy.stats has no dedicated two-proportion z-test function)
  • matplotlib / seaborn (visualisation)
  • Claude Code with DuckDB MCP server
  • Git / GitHub

Data Sources

  • ab-test-data.csv -- ~4,200 rows, one per visitor. Columns: visitor_id, page_version (A/B), visit_date, tour_selected, booking_completed, booking_value, visitor_source (organic, paid_ad, hostel_referral, agency).
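
As a first profiling pass over these columns, the sketch below counts visitors and completed bookings per page version. It uses only the standard library so it runs outside the "analytics" environment; in the notebook the same summary would normally be a pandas groupby. The inline sample rows, dates, and tour labels are invented for illustration, and the 0/1 encoding of booking_completed is an assumption that should be checked against the real file.

```python
import csv
import io
from collections import defaultdict

# Tiny invented sample using the documented column names. NOT real data.
SAMPLE_CSV = """visitor_id,page_version,visit_date,tour_selected,booking_completed,booking_value,visitor_source
1,A,2024-01-01,death_road,1,85,organic
2,A,2024-01-01,huayna_potosi,0,0,paid_ad
3,B,2024-01-02,death_road,1,85,hostel_referral
4,B,2024-01-02,paragliding,1,60,organic
5,B,2024-01-03,trek,0,0,agency
6,A,2024-01-03,trek,0,0,organic
"""


def conversion_by_version(csv_file):
    """Return {page_version: (visitors, bookings, booking_rate)}."""
    visitors = defaultdict(int)
    bookings = defaultdict(int)
    for row in csv.DictReader(csv_file):
        version = row["page_version"]
        visitors[version] += 1
        # Assumption: booking_completed is stored as 0/1.
        bookings[version] += int(row["booking_completed"])
    return {
        v: (visitors[v], bookings[v], bookings[v] / visitors[v])
        for v in sorted(visitors)
    }


summary = conversion_by_version(io.StringIO(SAMPLE_CSV))
for version, (n, k, rate) in summary.items():
    print(f"{version}: {k}/{n} visitors booked ({rate:.1%})")
```

To run this against the real file, pass `open("ab-test-data.csv", newline="")` instead of the in-memory sample.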

Deliverables

An experiment report using the statistical testing template (statistical-testing-template.md) with:

  • Test setup and metric definition
  • Overall and per-tour-type test results (p-values, confidence intervals, effect sizes)
  • Confound analysis (ad budget shift, pricing display difference, language limitation)
  • Actionable recommendation for Marco
  • Forward-looking suggestions for future A/B tests
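
The three quantities the report asks for (p-value, confidence interval, effect size) can be sketched for a single comparison as follows. This is one reasonable construction rather than the project's prescribed method: a pooled standard error for the test statistic, an unpooled (Wald) standard error for the interval, and Cohen's h as the effect size. The function name and the example counts are hypothetical, and `statistics.NormalDist` stands in for `scipy.stats.norm` so the block runs on the standard library alone.

```python
import math
from statistics import NormalDist


def two_proportion_ztest(k_a, n_a, k_b, n_b, alpha=0.05):
    """Two-sided z-test for a difference in booking rates (B minus A).

    k_* are completed bookings, n_* are visitors in each arm.
    """
    p_a, p_b = k_a / n_a, k_b / n_b
    diff = p_b - p_a

    # Test statistic: pooled standard error, valid under H0 (no difference).
    p_pool = (k_a + k_b) / (n_a + n_b)
    se_pooled = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Confidence interval for the difference: unpooled (Wald) standard error.
    se_unpooled = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    # Effect size: Cohen's h (difference of arcsine-transformed proportions).
    cohens_h = 2 * math.asin(math.sqrt(p_b)) - 2 * math.asin(math.sqrt(p_a))

    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci, "cohens_h": cohens_h}


# Hypothetical counts for illustration only; not taken from ab-test-data.csv.
result = two_proportion_ztest(k_a=100, n_a=2000, k_b=130, n_b=2000)
print(f"difference = {result['diff']:.4f}")
print(f"z = {result['z']:.3f}, p = {result['p_value']:.4f}")
print(f"95% CI = ({result['ci'][0]:.4f}, {result['ci'][1]:.4f})")
print(f"Cohen's h = {result['cohens_h']:.3f}")
```

When reporting the result, the p-value is the probability of seeing a difference at least this large if the two pages truly converted equally; it is not the probability that version B is better. The sketch does not guard against arms with zero visitors or a pooled rate of exactly 0 or 1, and small per-tour samples may call for an exact test instead of the normal approximation.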

Work Breakdown

  1. Setup and discovery -- Read Marco's email, set up the project, profile the dataset
  2. Metric definition and question framing -- Define "conversion rate" precisely, frame the analytical questions, discover the pricing display confound
  3. Statistical tests -- Run the overall and per-tour-type tests, verify against targets, catch p-value framing errors
  4. MCP connection -- Load data into DuckDB, connect AI via MCP, experience the capability shift, discover the ad budget confound
  5. Confound analysis -- Segment by visitor source and time period, discover the language limitation, assess validity threats
  6. Recommendation report -- Structure findings using the template, visualise with uncertainty, deliver to Marco
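
The segmentation in step 5 can be illustrated with a small sketch showing why a pooled comparison can mislead when the traffic mix differs between arms, for instance if ad spend shifts partway through the experiment. The helper names and every count below are invented for illustration; only the column names come from the dataset description. In the project itself this would more likely be a pandas groupby or a DuckDB query.

```python
from collections import defaultdict


def rate_by_segment(rows, segment_key):
    """Return {(segment, page_version): (visitors, bookings, rate)}."""
    counts = defaultdict(lambda: [0, 0])  # [visitors, bookings]
    for row in rows:
        key = (row[segment_key], row["page_version"])
        counts[key][0] += 1
        counts[key][1] += int(row["booking_completed"])
    return {key: (n, k, k / n) for key, (n, k) in sorted(counts.items())}


def make_rows(source, version, visitors, bookings):
    """Build invented visitor rows: the first `bookings` of them convert."""
    return [
        {
            "visitor_source": source,
            "page_version": version,
            "booking_completed": 1 if i < bookings else 0,
        }
        for i in range(visitors)
    ]


# Invented scenario: identical rates within each source, but version B
# received a larger share of paid_ad traffic than version A.
rows = (
    make_rows("organic", "A", 40, 4)
    + make_rows("organic", "B", 40, 4)
    + make_rows("paid_ad", "A", 10, 3)
    + make_rows("paid_ad", "B", 30, 9)
)

by_source = rate_by_segment(rows, "visitor_source")
# Pooled view: give every row the same segment label.
overall = rate_by_segment([{**r, "all": "all"} for r in rows], "all")

for (segment, version), (n, k, rate) in {**overall, **by_source}.items():
    print(f"{segment:>8} {version}: {k}/{n} ({rate:.1%})")
```

In this invented scenario version B looks better overall even though neither source converts differently between versions; the gap comes entirely from the traffic mix. The same function applied to a time-period column would support the before/after comparison around a budget change.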

Verification Targets

See verification-targets.md for known-good values for the overall conversion rate test. Compare AI-generated p-values, confidence intervals, and effect sizes against these targets.
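
One way to make that comparison systematic is a small tolerance check, sketched below. The contents of verification-targets.md are not reproduced here, so the target names and numbers are hypothetical placeholders, and the tolerances are an assumption to adjust to the precision the targets are stated in.

```python
import math


def check_against_targets(computed, targets, rel_tol=1e-3, abs_tol=1e-4):
    """Return {name: (computed_value, target_value)} for every mismatch.

    A value missing from `computed` is reported as a mismatch with None.
    """
    mismatches = {}
    for name, expected in targets.items():
        actual = computed.get(name)
        if actual is None or not math.isclose(
            actual, expected, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches[name] = (actual, expected)
    return mismatches


# Hypothetical placeholders; replace with the values in verification-targets.md.
targets = {"p_value": 0.0416, "ci_low": 0.0006}
# Hypothetical AI-generated results to be checked.
computed = {"p_value": 0.04163, "ci_low": 0.0106}

mismatches = check_against_targets(computed, targets)
for name, (actual, expected) in mismatches.items():
    print(f"MISMATCH {name}: computed {actual}, target {expected}")
if not mismatches:
    print("All values match the targets within tolerance.")
```

A mismatch is a prompt to inspect how the value was computed (one-sided versus two-sided test, pooled versus unpooled standard error, a different metric definition) before trusting either number.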

Commit Convention

Write descriptive commit messages that explain what the analysis found, not just what code was written. Commit after completing each step in the Work Breakdown (setup, metric definition, tests, MCP connection, confound analysis, report).