Branzeria Carpati Profitability Pipeline
Client: Mihai Popescu, Owner and Head Cheesemaker at Branzeria Carpati, Sibiu, Romania
What you are building
A dbt project that connects three disconnected data sources -- production logs, sales records, and milk purchase ledgers -- to produce variety-level profitability analysis for a small artisan cheese operation. Mihai makes six traditional Romanian cheeses (telemea, cascaval, branza de burduf, urda, cas, nasal) from sheep and cow milk sourced from local shepherds. He tracks everything but cannot answer the question his accountant asks every quarter: which cheese variety actually makes money after milk cost, yield, and aging time?
Tech stack
- Python (Miniconda de environment)
- DuckDB (via Python duckdb package) -- local analytical database
- dbt Core + dbt-duckdb adapter -- transformation framework
- SQL (via DuckDB and dbt)
- Git / GitHub -- version control
- Claude Code -- AI agent (primary worker)
File structure
project/
  materials/
    production-log.csv            # 240 batch records
    sales.csv                     # 380 sales transactions
    milk-purchases.csv            # 195 milk purchase records
    production-log-sample.csv     # 12-row sample for initial review
    sales-sample.csv              # 12-row sample for initial review
    pipeline-spec.md              # Requirements and verification targets
    verification-checklist.md     # Expected values for profitability
    CLAUDE.md                     # This file
    dbt-template/                 # Pre-configured dbt project scaffold
      dbt_project.yml
      profiles.yml
      models/
        schema.yml
        staging/
        marts/
  branzeria_carpati/              # Working dbt project (copied from template)
    dbt_project.yml
    profiles.yml
    models/
      schema.yml                  # Source definitions and tests
      staging/
        stg_production.sql
        stg_sales.sql
        stg_purchases.sql
      marts/
        fct_variety_profitability.sql
Key material references
- pipeline-spec.md -- Mihai's requirements, data source descriptions, dbt naming conventions, test requirements, and verification targets
- verification-checklist.md -- Expected row counts, profitability margins per variety, and idempotency check procedure
- dbt-template/ -- Pre-configured dbt project scaffold with DuckDB connection ready. Copy into your working directory to start building models immediately.
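Copying the scaffold can be sketched in Python as below. This is a minimal illustration, not a required step of the pipeline: the helper name init_project is hypothetical, and the exact template and target paths depend on where the materials sit on your machine.

```python
import shutil
from pathlib import Path


def init_project(template_dir: Path, project_dir: Path) -> Path:
    """Copy the dbt scaffold into a new working directory (hypothetical helper)."""
    if project_dir.exists():
        # Refuse to overwrite an existing working project.
        raise FileExistsError(f"{project_dir} already exists")
    # copytree creates project_dir and copies the whole scaffold into it.
    shutil.copytree(template_dir, project_dir)
    return project_dir
```

For example, init_project(Path("dbt-template"), Path("branzeria_carpati")) would create the working project next to the template, assuming both paths are relative to the current directory.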
Ticket backlog
- T1: Load and profile all three data sources (production log, sales, milk purchases) in DuckDB
- T2: Initialize dbt project from template and configure source definitions
- T3: Build staging models (stg_production, stg_sales, stg_purchases) with naming conventions
- T4: Build profitability mart (fct_variety_profitability) joining all three sources
- T5: Add dbt tests (unique, not_null, accepted_values, relationships) and verify profitability against checklist
- T6: Generate profitability outputs for client and write pipeline summary
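For T1, a quick profile of each CSV before loading it into DuckDB might look like the sketch below. It uses only the Python standard library and reports the row count plus the share of empty values per column. The function name profile_csv is an assumption for illustration, and treating blank strings as nulls is a simplification that may not match how DuckDB infers nulls.

```python
import csv
from collections import Counter


def profile_csv(path):
    """Return (row_count, {column: share of empty values}) for one CSV file."""
    empty = Counter()
    rows = 0
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        columns = reader.fieldnames or []
        for row in reader:
            rows += 1
            for col in columns:
                # Missing or whitespace-only cells count as empty (assumption).
                if (row[col] or "").strip() == "":
                    empty[col] += 1
    null_share = {c: (empty[c] / rows if rows else 0.0) for c in columns}
    return rows, null_share
```

Calling profile_csv on production-log.csv should return a row count that can be compared with the 240-row baseline, and the per-column shares show which fields carry missing values.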
Verification targets
- Row count baselines: production-log.csv = 240 rows, sales.csv = 380 rows, milk-purchases.csv = 195 rows. Staging models must match raw counts exactly.
- Profitability margins: Cross-reference verification-checklist.md for expected margin ranges per variety. If aging duration is miscalculated for null end dates (~20% of production records), cascaval and branza de burduf profitability will be overstated.
- dbt tests: All built-in tests pass (unique, not_null, accepted_values, relationships).
- Idempotency: Running dbt run twice produces identical output -- same row counts, same values, no duplicates.
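One way to check idempotency is to fingerprint a model's rows after each run and compare the results. The sketch below is an illustration under assumptions, not the procedure from verification-checklist.md: table_fingerprint is a hypothetical helper, and it expects the rows as an iterable of tuples, for example from a fetchall() call on a DuckDB connection.

```python
import hashlib


def table_fingerprint(rows):
    """Order-independent fingerprint: (row count, digest of sorted row hashes)."""
    row_hashes = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    # Sorting first makes the digest independent of row order;
    # a duplicated row changes both the count and the digest.
    digest = hashlib.sha256("".join(row_hashes).encode("utf-8")).hexdigest()
    return len(row_hashes), digest
```

If the fingerprint taken after the first run equals the one taken after the second, the model produced the same number of rows with the same values. A differing count points to duplicates or dropped rows.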
Commit convention
Commit after each ticket with a meaningful message describing what the piece does. Examples: "Load and profile three data sources in DuckDB", "Add stg_production staging model with source-conforming columns", "Add dbt tests and verify profitability against checklist".