Cumbre Adventures A/B Test Analysis

Project

You are analysing an A/B test for Marco Quispe, founder and lead guide of Cumbre Adventures in La Paz, Bolivia. Marco redesigned his online booking page and ran a 60-day experiment: half of website visitors saw the old page (version A), half saw the new page (version B). The overall booking rate went up, but premium trek bookings dropped. His web developer and operations manager disagree on whether the new page is working. Marco needs a proper statistical analysis.

Client

Marco Quispe, founder. Cumbre Adventures offers mountain biking on Death Road, trekking in the Cordillera Real, climbing Huayna Potosi, and paragliding over La Paz. The company has 15 employees and has been in business for seven years. Most bookings come through the website, with some from partner hostels and travel agencies.

Tech Stack

Python 3.11+ (via Miniconda, "analytics" environment)
DuckDB (local analytical database, also accessed via MCP)
Jupyter Notebook
pandas
scipy.stats (two-proportion z-test computed with norm, chi-squared with chi2_contingency)
matplotlib / seaborn (visualisation)
Claude Code with DuckDB MCP server
Git / GitHub

Data Sources

ab-test-data.csv -- ~4,200 rows, one per visitor. Columns: visitor_id, page_version (A/B), visit_date, tour_selected, booking_completed, booking_value, visitor_source (organic, paid_ad, hostel_referral, agency).
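
A first profiling pass over this file could look like the sketch below. The column names come from the description above. The helper name profile_ab_data is hypothetical, and the code assumes booking_completed is stored as 0/1 or True/False, which should be checked against the real file. The small inline frame contains made-up rows for illustration only, not Marco's data.

```python
import pandas as pd

EXPECTED_COLUMNS = [
    "visitor_id", "page_version", "visit_date", "tour_selected",
    "booking_completed", "booking_value", "visitor_source",
]

def profile_ab_data(df: pd.DataFrame) -> pd.DataFrame:
    """Return visitors, bookings and booking rate per page version."""
    missing = [c for c in EXPECTED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Assumption: booking_completed is 0/1 or True/False.
    booked = df["booking_completed"].astype(int)
    summary = (
        df.assign(booked=booked)
          .groupby("page_version")
          .agg(visitors=("visitor_id", "nunique"), bookings=("booked", "sum"))
    )
    summary["booking_rate"] = summary["bookings"] / summary["visitors"]
    return summary

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "visitor_id": [1, 2, 3, 4],
    "page_version": ["A", "A", "B", "B"],
    "visit_date": ["2024-01-01"] * 4,
    "tour_selected": ["death_road", "trek", "trek", "paragliding"],
    "booking_completed": [1, 0, 1, 1],
    "booking_value": [90.0, 0.0, 250.0, 70.0],
    "visitor_source": ["organic", "paid_ad", "organic", "agency"],
})
demo_summary = profile_ab_data(demo)
print(demo_summary)
```

On the real file, the same helper would be called as profile_ab_data(pd.read_csv("ab-test-data.csv")). A visitor count per version that is far from an even split would be worth noting before any testing.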

Deliverables

An experiment report using the statistical testing template (statistical-testing-template.md) with:

Test setup and metric definition
Overall and per-tour-type test results (p-values, confidence intervals, effect sizes)
Confound analysis (ad budget shift, pricing display difference, language limitation)
Actionable recommendation for Marco
Forward-looking suggestions for future A/B tests
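
The p-value, confidence interval, and effect size for a booking-rate comparison can be sketched as a two-proportion z-test. This is a minimal version under stated assumptions, not the project's required implementation: the function name two_proportion_ztest is hypothetical, the test uses the pooled standard error under the null hypothesis, the interval uses the unpooled standard error, and the counts in the example call are invented.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(x_a, n_a, x_b, n_b, alpha=0.05):
    """Two-sided z-test for B versus A, with a CI on the rate difference."""
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a  # absolute effect size (difference in booking rate)

    # Pooled standard error: valid under H0 (both versions share one rate).
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pool
    p_value = 2 * norm.sf(abs(z))  # two-sided

    # Unpooled standard error: used for the confidence interval.
    se_diff = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = norm.ppf(1 - alpha / 2)
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

    return {"diff": diff, "z": z, "p_value": p_value, "ci": ci}

# Invented counts for illustration: 100/1000 bookings on A, 130/1000 on B.
result = two_proportion_ztest(x_a=100, n_a=1000, x_b=130, n_b=1000)
print(result)
```

The same function can be applied per tour type by passing the counts for that subset. Running several per-tour tests raises the chance of a false positive, so the report should say how many comparisons were made.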

Work Breakdown

Setup and discovery -- Read Marco's email, set up the project, profile the dataset
Metric definition and question framing -- Define "conversion rate" precisely, frame the analytical questions, discover the pricing display confound
Statistical tests -- Run the overall and per-tour-type tests, verify against targets, catch p-value framing errors
MCP connection -- Load data into DuckDB, connect AI via MCP, experience the capability shift, discover the ad budget confound
Confound analysis -- Segment by visitor source and time period, discover the language limitation, assess validity threats
Recommendation report -- Structure findings using the template, visualise with uncertainty, deliver to Marco
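
For the confound analysis step, one possible starting point is to compare booking rates by segment and to check whether the traffic mix differs between versions. The sketch below uses pandas rather than DuckDB for brevity. The helper names rate_by_segment and source_mix are hypothetical, booking_completed is assumed to be 0/1 or True/False, and the demo rows are made up.

```python
import pandas as pd

def rate_by_segment(df: pd.DataFrame, segment_col: str) -> pd.DataFrame:
    """Visitors and booking rate for each (segment, page_version) pair."""
    return (
        df.assign(booked=df["booking_completed"].astype(int))
          .groupby([segment_col, "page_version"])["booked"]
          .agg(visitors="count", booking_rate="mean")
          .reset_index()
    )

def source_mix(df: pd.DataFrame) -> pd.DataFrame:
    """Share of each visitor_source within each page version (rows sum to 1)."""
    return pd.crosstab(df["page_version"], df["visitor_source"], normalize="index")

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "visitor_source": ["organic", "organic", "paid_ad", "paid_ad", "organic", "paid_ad"],
    "page_version": ["A", "B", "A", "B", "A", "B"],
    "booking_completed": [1, 0, 0, 1, 1, 1],
})
segments = rate_by_segment(demo, "visitor_source")
mix = source_mix(demo)
print(segments)
print(mix)
```

A time-period split works the same way once a period column (for example, a week number derived from visit_date) is added and passed as segment_col. If the source mix differs noticeably between A and B, the overall comparison may partly reflect who saw each page rather than the page itself.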

Verification Targets

See verification-targets.md for known-good values for the overall conversion rate test. Compare AI-generated p-values, confidence intervals, and effect sizes against these targets.
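
The comparison against targets can be automated with a tolerance check, as in the sketch below. The layout of verification-targets.md is not described here, so the targets are shown as a plain dictionary. The helper name check_against_targets, the metric names, the tolerances, and every number in the example are placeholders, not the real targets.

```python
from math import isclose

def check_against_targets(computed, targets, rel_tol=1e-3, abs_tol=1e-4):
    """Return {metric: bool}; a metric missing from `computed` counts as a failure."""
    report = {}
    for name, target in targets.items():
        value = computed.get(name)
        report[name] = value is not None and isclose(
            value, target, rel_tol=rel_tol, abs_tol=abs_tol
        )
    return report

# Placeholder numbers only; substitute the values from verification-targets.md.
targets = {"p_value": 0.0355, "diff": 0.03}
computed = {"p_value": 0.03551, "diff": 0.031}

report = check_against_targets(computed, targets)
print(report)  # {'p_value': True, 'diff': False}
```

A failed check is a prompt to inspect the calculation (pooled versus unpooled standard error, one-sided versus two-sided test) rather than proof that either number is wrong.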

Commit Convention

Write descriptive commit messages that explain what the analysis found, not just what code was written. Commit after completing each unit's work (setup, metric definition, tests, MCP connection, confound analysis, report).
