Daun & Co -- Product and Channel Analysis
Client
Farid Hassan, Managing Director of Daun & Co, an organic tea company in Kota Kinabalu, Sabah, Malaysia. The company has 12 employees and three sales channels: a retail shop, wholesale to cafes/hotels, and an online store (Shopify).
What you're building
A multi-source product and channel analysis. Farid's revenue has been flat for two years. You will combine data from three systems (POS, Shopify, wholesale spreadsheets), author metric definitions from scratch, analyse product profitability by channel, identify trends, and deliver a narrative report addressing Farid's five requirements.
Tech stack
- Python 3.11+ (Miniconda, "analytics" environment)
- DuckDB (analytical database)
- pandas (data manipulation)
- Jupyter Notebook (analysis environment)
- matplotlib / seaborn (charts)
- Claude Code (AI direction)
- Git / GitHub (version control)
File structure
    analytics/p2/
        materials/
            CLAUDE.md                  (this file)
            product-catalog.csv        (authoritative product names and costs)
            retail-sales.csv           (POS export, 14 months)
            online-orders.csv          (Shopify export, 14 months)
            wholesale-invoices.xlsx    (operations manager's records, 14 months)
        notebooks/
            analysis.ipynb             (main analysis notebook -- you create this)
        output/
            metric-definitions.md      (you create this)
            cleaning-log.md            (you create this)
            deliverable.md             (narrative report for Farid)
Key material references
- product-catalog.csv -- The authoritative product list with standard names and cost-per-unit. Use these names as canonical when reconciling across sources.
- retail-sales.csv -- POS export. Product names are variants of the catalog names.
- online-orders.csv -- Shopify export. Product names use Shopify-style formatting. Date format is MM/DD/YYYY.
- wholesale-invoices.xlsx -- Free-text "items" column needs parsing. No per-unit price breakdown.
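The Shopify date format is the most common silent trap among the quirks above: MM/DD/YYYY can be misread as day-first if parsing is left to inference. A sketch of parsing it explicitly, using a made-up sample row (the real column headers should be confirmed during profiling):

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for a row of online-orders.csv.
sample = StringIO(
    "order_id,date,product,total\n"
    "1001,03/15/2024,Sabah Gold - Loose Leaf,42.50\n"
)
online = pd.read_csv(sample)

# Shopify dates arrive as MM/DD/YYYY; give pandas the format explicitly
# so 03/15 is never interpreted day-first.
online["date"] = pd.to_datetime(online["date"], format="%m/%d/%Y")
```

The same explicit-over-inferred principle applies to the wholesale workbook: read it with `pd.read_excel` and parse the free-text "items" column yourself rather than trusting whatever types pandas guesses.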
Work breakdown
- The brief and the data -- Read Farid's email, set up the project, profile all three data sources, identify inconsistencies
- Clean and unify -- Standardise product names, parse wholesale data, handle nulls, document cleaning decisions, produce unified dataset
- Define the metrics -- Author metric definitions (plain language + SQL) for product profitability, channel revenue, channel growth
- Analyse and follow the thread -- Compute metrics, create charts, pursue follow-up questions as they arise, verify the AI's aggregations
- Deliver to Farid -- Write narrative addressing all five requirements, document data quality issues, push to GitHub
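The "standardise product names" step above usually comes down to a hand-built mapping from observed variants to the catalog's canonical names. A sketch with invented variants (the real mapping must be built by inspecting the distinct names in each source against product-catalog.csv):

```python
# Hypothetical variants you might observe during profiling.
name_map = {
    "sabah gold loose leaf": "Sabah Gold",
    "Sabah Gold - Loose Leaf": "Sabah Gold",
    "SABAH GOLD LL": "Sabah Gold",
}

def canonical_name(raw: str, mapping: dict[str, str]) -> str:
    """Map a raw product name to its catalog name.

    Tries an exact match first, then a lowercased match, and falls back
    to the stripped raw value so unmapped names surface in review rather
    than disappearing.
    """
    cleaned = raw.strip()
    return mapping.get(cleaned, mapping.get(cleaned.lower(), cleaned))
```

The fallback-to-raw behaviour is deliberate: any name that survives unification unmapped is a gap in the mapping, and checking for such leftovers is part of the verification step.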
Verification approach
- Check product name mapping against the product catalog after unification
- Verify metric definition SQL matches plain language definition
- Compute key metrics by hand (independent code path) and compare against the AI's output
- Review narrative against Farid's five requirements
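One way to make the "compute manually and compare" check concrete is to recompute a metric through a second, independent code path, e.g. a plain Python loop against a pandas groupby, and assert the two agree. A sketch with made-up numbers:

```python
import pandas as pd

# Invented rows standing in for the unified dataset.
sales = pd.DataFrame({
    "product": ["Sabah Gold", "Highland Green", "Sabah Gold"],
    "units": [10, 5, 4],
    "unit_price": [12.0, 9.0, 12.0],
    "unit_cost": [7.0, 5.0, 7.0],
})

# Pipeline figure (the number the AI-assisted aggregation would produce).
sales["profit"] = (sales["unit_price"] - sales["unit_cost"]) * sales["units"]
pipeline_profit = sales.groupby("product")["profit"].sum()

# Independent manual check: no pandas aggregation, just a loop.
manual = {}
for row in sales.itertuples():
    manual[row.product] = (
        manual.get(row.product, 0.0)
        + (row.unit_price - row.unit_cost) * row.units
    )

assert all(abs(manual[p] - pipeline_profit[p]) < 1e-9 for p in manual)
```

If the two paths disagree, the discrepancy itself is a finding worth logging before trusting any downstream chart.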
Commit convention
Descriptive messages tied to the work: "profile three data sources", "create product name mapping", "define revenue metric -- plain language and SQL", "add channel analysis with charts", "deliver narrative report to Farid". Commit after each meaningful step.