CLAUDE.md

Baan Suan Hotels -- Multi-Source Property Performance Analysis

Project

You are analyzing guest experience and property performance for Somchai Rattanapong, Director of Operations at Baan Suan Hotels. Baan Suan operates five boutique properties across Thailand: Koh Samui (beach resort), Krabi (beach resort), Bangkok (city hotel), Chiang Mai (cultural property), and Khao Yai (nature retreat).

What you are building

A findings report for the board that combines three data sources (bookings, reviews, revenue), determines which differences in guest satisfaction between properties are statistically real and which are noise, identifies what drives satisfaction, and presents the results in board-ready language.

Tech stack

  • Python 3.11+ (conda "ds" environment)
  • Jupyter Notebook
  • pandas
  • DuckDB (SQL-based multi-source analysis)
  • scikit-learn (Ridge, Lasso, cross_val_score)
  • scipy (statistical tests, assumption checking)
  • matplotlib / seaborn
  • Claude Code
  • Git / GitHub

Data sources

Key materials

Tickets

  • T1: Profile all three datasets independently (shape, types, nulls, date formats)
  • T2: Join sources -- resolve date format discrepancy (DD/MM vs MM/DD), standardize revenue categories, verify row counts
  • T3: Inferential analysis -- assumption checking, ANOVA or Kruskal-Wallis, effect sizes, pairwise comparisons with correction, interaction effects
  • T4: Prediction model -- feature selection, regularization (Ridge/Lasso), cross-validation reporting mean and standard deviation, feature importance
  • T5: Cross-model review -- a second AI reviews the methodology memo with fresh context
  • T6: Board findings report -- translate statistical results to business language, address all four requirements
  • T7: Deliver to Somchai, handle scope extension, write decision record, push to GitHub
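
The join step in T2 can be sketched as follows. This is a minimal illustration, not the project's actual pipeline: the column names (`booking_id`, `property`, `stay_date`, `revenue`), the toy rows, and the helper names `parse_dates` and `checked_left_join` are all hypothetical, and which source uses which date format is an assumption made only for the example. The point is to parse each source with an explicit format, because a value such as 05/03/2024 is valid under both DD/MM and MM/DD and would otherwise be misread silently, and to compare row counts around the join.

```python
import pandas as pd


def parse_dates(series: pd.Series, fmt: str) -> pd.Series:
    """Parse with an explicit format; raises ValueError if any value does not fit."""
    return pd.to_datetime(series, format=fmt, errors="raise")


def checked_left_join(left, right, on, validate="many_to_one"):
    """Left join that reports row counts and fails if the join changed them."""
    before = len(left)
    # validate= makes pandas raise if the key relationship is not the expected one;
    # indicator= adds a _merge column so unmatched left rows can be counted.
    merged = left.merge(right, on=on, how="left", validate=validate, indicator=True)
    after = len(merged)
    unmatched = int((merged["_merge"] == "left_only").sum())
    print(f"rows before={before}, after={after}, unmatched={unmatched}")
    if after != before:
        raise ValueError(f"join changed row count: {before} -> {after}")
    return merged.drop(columns="_merge")


# Hypothetical toy data, not Baan Suan records.
bookings = pd.DataFrame({
    "booking_id": [1, 2, 3],
    "property": ["Krabi", "Bangkok", "Krabi"],
    "stay_date": ["13/02/2024", "05/03/2024", "28/02/2024"],  # assumed DD/MM/YYYY
})
revenue = pd.DataFrame({
    "property": ["Krabi", "Bangkok", "Krabi"],
    "stay_date": ["02/13/2024", "03/05/2024", "02/28/2024"],  # assumed MM/DD/YYYY
    "revenue": [4200.0, 3100.0, 3900.0],
})

# Normalize both sources to real datetimes before joining on them.
bookings["stay_date"] = parse_dates(bookings["stay_date"], "%d/%m/%Y")
revenue["stay_date"] = parse_dates(revenue["stay_date"], "%m/%d/%Y")

joined = checked_left_join(
    bookings, revenue, on=["property", "stay_date"], validate="one_to_one"
)
```

A nonzero `unmatched` count is worth investigating before moving on to T3, since it usually points at a remaining format or category mismatch rather than genuinely missing data.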

Verification guidance

  • Check row counts before and after every join
  • Check date formats match across sources before joining
  • Run assumption checks (normality, homoscedasticity) before statistical tests
  • Report effect sizes alongside p-values
  • Apply multiple comparison corrections for pairwise tests
  • Report cross-validation mean AND standard deviation
  • Exclude leaky features from the prediction model (anything derived from the satisfaction outcome or not available at prediction time)
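
The statistical checks above can be sketched as one routine. This is a hedged example under stated assumptions, not a prescribed method: the function names `compare_groups` and `holm_adjust` are hypothetical, the scores are synthetic, Shapiro-Wilk and Levene are one reasonable choice of assumption checks, Holm is one reasonable choice of correction, and the Kruskal-Wallis effect size uses the H-based eta-squared, (H - k + 1) / (n - k).

```python
from itertools import combinations

import numpy as np
from scipy import stats


def holm_adjust(pvalues):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    adjusted = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(np.argsort(p)):
        # Smallest p is multiplied by m, the next by m - 1, and so on;
        # the running max keeps adjusted values monotone.
        running_max = max(running_max, (m - rank) * p[idx])
        adjusted[idx] = min(1.0, running_max)
    return adjusted


def compare_groups(groups, alpha=0.05):
    """groups: dict mapping a group name to a 1-D sequence of scores."""
    names = list(groups)
    arrays = [np.asarray(groups[name], dtype=float) for name in names]
    n_total, k = sum(len(a) for a in arrays), len(arrays)

    # Assumption checks: normality per group, equal variances across groups.
    normal = all(stats.shapiro(a).pvalue > alpha for a in arrays)
    equal_var = stats.levene(*arrays).pvalue > alpha
    parametric = normal and equal_var

    if parametric:
        test, res = "ANOVA", stats.f_oneway(*arrays)
        grand_mean = np.concatenate(arrays).mean()
        ss_between = sum(len(a) * (a.mean() - grand_mean) ** 2 for a in arrays)
        ss_total = sum(((a - grand_mean) ** 2).sum() for a in arrays)
        effect_name, effect = "eta_squared", ss_between / ss_total
    else:
        test, res = "Kruskal-Wallis", stats.kruskal(*arrays)
        effect_name = "eta_squared_H"
        effect = (res.statistic - k + 1) / (n_total - k)

    # Pairwise follow-up tests, corrected for multiple comparisons.
    pairs = list(combinations(range(k), 2))
    if parametric:
        raw = [stats.ttest_ind(arrays[i], arrays[j]).pvalue for i, j in pairs]
    else:
        raw = [
            stats.mannwhitneyu(arrays[i], arrays[j], alternative="two-sided").pvalue
            for i, j in pairs
        ]
    adjusted = holm_adjust(raw)
    pairwise = {(names[i], names[j]): float(p) for (i, j), p in zip(pairs, adjusted)}

    return {
        "test": test,
        "pvalue": float(res.pvalue),
        "effect_name": effect_name,
        "effect": float(effect),
        "pairwise": pairwise,
    }


# Synthetic scores for illustration only, not Baan Suan data.
rng = np.random.default_rng(0)
scores = {
    "Koh Samui": rng.normal(4.3, 0.4, 60),
    "Bangkok": rng.normal(3.9, 0.4, 60),
    "Khao Yai": rng.normal(4.3, 0.4, 60),
}
result = compare_groups(scores)
print(result["test"], result["effect_name"], round(result["effect"], 3))
```

Returning the effect size next to the p-value keeps both in the same record, so the board report can state how large a difference is and not only whether it is detectable.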

Commit convention

Commit after completing each ticket. Use descriptive messages that explain what analytical work was done, not just what files changed.