CLAUDE.md

Branzeria Carpati Profitability Pipeline

Client: Mihai Popescu, Owner and Head Cheesemaker at Branzeria Carpati, Sibiu, Romania

What you are building

A dbt project that connects three disconnected data sources -- production logs, sales records, and milk purchase ledgers -- to produce variety-level profitability analysis for a small artisan cheese operation. Mihai makes six traditional Romanian cheeses (telemea, cascaval, branza de burduf, urda, cas, nasal) from sheep and cow milk sourced from local shepherds. He tracks everything but cannot answer the question his accountant asks every quarter: which cheese variety actually makes money after milk cost, yield, and aging time?
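The accountant's question boils down to per-variety arithmetic that the final mart will formalize. A minimal sketch of the margin calculation — every name and number below is illustrative, not taken from the spec:

```python
def variety_margin(revenue: float, milk_cost: float) -> float:
    """Gross margin after milk cost, as a fraction of revenue.

    Yield and aging time enter indirectly: a low-yield or long-aged
    variety consumes more milk cost per unit of revenue.
    """
    if revenue == 0:
        raise ValueError("no sales recorded for this variety")
    return (revenue - milk_cost) / revenue


# Hypothetical example: 4200 RON of telemea sales against 2900 RON of milk
print(round(variety_margin(4200.0, 2900.0), 3))  # → 0.31
```

The real mart must additionally allocate milk purchases to batches and batches to sales, which is exactly the join work in tickets T3–T4.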

Tech stack

  • Python (Miniconda environment)
  • DuckDB (via Python duckdb package) -- local analytical database
  • dbt Core + dbt-duckdb adapter -- transformation framework
  • SQL (via DuckDB and dbt)
  • Git / GitHub -- version control
  • Claude Code -- AI agent (primary worker)

File structure

project/
  materials/
    production-log.csv          # 240 batch records
    sales.csv                   # 380 sales transactions
    milk-purchases.csv          # 195 milk purchase records
    production-log-sample.csv   # 12-row sample for initial review
    sales-sample.csv            # 12-row sample for initial review
    pipeline-spec.md            # Requirements and verification targets
    verification-checklist.md   # Expected values for profitability
    CLAUDE.md                   # This file
    dbt-template/               # Pre-configured dbt project scaffold
      dbt_project.yml
      profiles.yml
      models/
        schema.yml
        staging/
        marts/
  branzeria_carpati/            # Working dbt project (copied from template)
    dbt_project.yml
    profiles.yml
    models/
      schema.yml                # Source definitions and tests
      staging/
        stg_production.sql
        stg_sales.sql
        stg_purchases.sql
      marts/
        fct_variety_profitability.sql

Key material references

  • pipeline-spec.md -- Mihai's requirements, data source descriptions, dbt naming conventions, test requirements, and verification targets
  • verification-checklist.md -- Expected row counts, profitability margins per variety, and idempotency check procedure
  • dbt-template/ -- Pre-configured dbt project scaffold with DuckDB connection ready. Copy into your working directory to start building models immediately.
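The copy step in the last bullet can be scripted with the standard library. A sketch — the default paths match the file structure above, but confirm them before running:

```python
import shutil
from pathlib import Path


def init_project(template: str = "materials/dbt-template",
                 target: str = "branzeria_carpati") -> Path:
    """Copy the pre-configured dbt scaffold into a fresh working directory."""
    dest = Path(target)
    if dest.exists():
        # Refuse to clobber an existing project; remove it manually first.
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(template, dest)
    return dest
```

After copying, running dbt debug from inside the new directory should confirm the DuckDB connection that the template pre-configures.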

Ticket backlog

  • T1: Load and profile all three data sources (production log, sales, milk purchases) in DuckDB
  • T2: Initialize dbt project from template and configure source definitions
  • T3: Build staging models (stg_production, stg_sales, stg_purchases) with naming conventions
  • T4: Build profitability mart (fct_variety_profitability) joining all three sources
  • T5: Add dbt tests (unique, not_null, accepted_values, relationships) and verify profitability against checklist
  • T6: Generate profitability outputs for client and write pipeline summary

Verification targets

  • Row count baselines: production-log.csv = 240 rows, sales.csv = 380 rows, milk-purchases.csv = 195 rows. Staging models must match raw counts exactly.
  • Profitability margins: Cross-reference verification-checklist.md for expected margin ranges per variety. If aging duration is miscalculated for null end dates (~20% of production records), cascaval and branza de burduf profitability will be overstated.
  • dbt tests: All built-in tests pass (unique, not_null, accepted_values, relationships).
  • Idempotency: Running dbt run twice produces identical output -- same row counts, same values, no duplicates.

Commit convention

Commit after each ticket with a meaningful message describing what the change does. Examples: "Load and profile three data sources in DuckDB", "Add stg_production staging model with source-conforming columns", "Add dbt tests and verify profitability against checklist".