verification-checklist.md

Verification Checklist -- Branzeria Carpati Profitability

Source row count baselines

| Source file | Expected rows |
| --- | --- |
| production-log.csv | 240 |
| sales.csv | 380 |
| milk-purchases.csv | 195 |

Staging verification

Each staging model must match its raw source row count exactly:

  • stg_production: 240 rows
  • stg_sales: 380 rows
  • stg_purchases: 195 rows
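The row count comparison can be scripted. The following is a minimal sketch, not part of the project: the function name `row_count_mismatches` is hypothetical, and it assumes you have already obtained each staging model's row count yourself (for example with a `select count(*)` query against the warehouse).

```python
# Expected staging row counts, copied from the list above.
EXPECTED_STAGING_ROWS = {
    "stg_production": 240,
    "stg_sales": 380,
    "stg_purchases": 195,
}


def row_count_mismatches(actual_counts, expected=EXPECTED_STAGING_ROWS):
    """Return {model: (expected, actual)} for every model whose count differs.

    A model missing from actual_counts is reported with an actual value of None.
    """
    return {
        model: (want, actual_counts.get(model))
        for model, want in expected.items()
        if actual_counts.get(model) != want
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration: stg_sales has one extra row.
    observed = {"stg_production": 240, "stg_sales": 381, "stg_purchases": 195}
    for model, (want, got) in row_count_mismatches(observed).items():
        print(f"{model}: expected {want} rows, found {got}")
```

An empty result means all three staging models match their baselines exactly.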

Column naming must follow snake_case convention consistently across all three models. All columns needed for the profitability mart must be present:

  • production: variety, kilos_milk_in, kilos_cheese_out, aging_start_date, aging_end_date
  • sales: variety, quantity_sold_kg, price_per_kg
  • purchases: shepherd_name, liters_received, price_per_liter

Profitability verification targets

Expected profitability per variety (values assume correct aging duration handling):

| Variety | Expected margin | Notes |
| --- | --- | --- |
| Telemea | ~18-22% | High volume, moderate price, good yield |
| Cascaval | ~10-14% | Higher price but longer aging increases cost |
| Branza de burduf | ~15-18% | Highest revenue per kg but lowest yield |
| Urda | ~5-8% | Lowest revenue but highest yield (whey cheese), slim margin |
| Cas | ~20-25% | High yield, moderate price, solid profitability |
| Nasal | ~12-16% | Moderate across all dimensions |
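Checking computed margins against these targets is easy to automate. The sketch below is illustrative only: `margins_out_of_range` and its `tolerance` parameter are hypothetical names, and because the targets are approximate ("~"), treating the ranges as hard bounds is a simplifying assumption that the tolerance is meant to soften.

```python
# Expected margin ranges in percent, copied from the table above.
EXPECTED_MARGIN_PCT = {
    "Telemea": (18, 22),
    "Cascaval": (10, 14),
    "Branza de burduf": (15, 18),
    "Urda": (5, 8),
    "Cas": (20, 25),
    "Nasal": (12, 16),
}


def margins_out_of_range(computed_pct, tolerance=0.0):
    """Return {variety: margin} for margins that are missing or outside range.

    computed_pct maps variety name to margin in percent, as read from the mart.
    tolerance widens each range by that many percentage points on both sides.
    """
    flagged = {}
    for variety, (low, high) in EXPECTED_MARGIN_PCT.items():
        margin = computed_pct.get(variety)
        if margin is None or not (low - tolerance <= margin <= high + tolerance):
            flagged[variety] = margin
    return flagged
```

A Cascaval or Branza de burduf margin flagged as too high is the pattern to watch for, since it points at underestimated aging cost.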

Critical check: About 20% of production records have a null aging end date. If aging duration is calculated incorrectly for those batches, aging cost is underestimated and profitability for Cascaval and Branza de burduf is systematically overstated. This is the gap between "all tests pass" and "the numbers are right."
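To make the failure mode concrete, here is a small sketch contrasting a naive calculation with one possible correction. The checklist does not state the intended rule for null end dates; this example assumes a null aging_end_date means the batch is still aging and should be counted up to a reference date. Both function names and the `as_of` parameter are hypothetical.

```python
from datetime import date
from typing import Optional


def aging_days_naive(start: date, end: Optional[date]) -> int:
    """Buggy pattern: a null end date silently becomes zero days of aging."""
    if end is None:
        return 0
    return (end - start).days


def aging_days(start: date, end: Optional[date], as_of: date) -> int:
    """Assumed fix: a batch with no end date is still aging, so count to as_of."""
    effective_end = end if end is not None else as_of
    return (effective_end - start).days


if __name__ == "__main__":
    # Hypothetical batch that started aging on 1 January 2024 and has no end date.
    start = date(2024, 1, 1)
    report_date = date(2024, 3, 1)
    print(aging_days_naive(start, None))         # 0 days, so no aging cost
    print(aging_days(start, None, report_date))  # 60 days of aging cost
```

Both versions return a non-null integer, so a not_null test on the duration column passes either way. Only the margin comparison reveals the difference.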

dbt test expectations

All built-in tests must pass:

  • unique on batch_number (staging)
  • not_null on key columns across all models
  • accepted_values on variety (six cheese types)
  • relationships between production and purchases on shepherd_name

The test suite verifies structural correctness, not business logic. Profitability values must be checked manually against the margin targets above.

Idempotency check

Execute dbt run twice in succession. The second run must produce identical output:

  • Same row counts in all models
  • Same profitability values in the mart
  • No duplicate records

If the second run produces different results or additional rows, the model pattern is not idempotent.
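The comparison between runs can be sketched as a small helper. This is an assumption-laden illustration rather than project code: `idempotency_problems` is a hypothetical name, and it assumes you export each model's rows after each run as a list of tuples (tuples are needed because the rows must be hashable).

```python
from collections import Counter


def idempotency_problems(first_run, second_run):
    """Compare two snapshots of one model's rows taken after consecutive runs.

    Each snapshot is a list of tuples. Returns a list of problem descriptions,
    which is empty when the second run reproduced the first exactly.
    """
    problems = []
    if len(first_run) != len(second_run):
        problems.append(f"row count changed: {len(first_run)} -> {len(second_run)}")
    # Counter comparison ignores row order but respects multiplicity.
    if Counter(first_run) != Counter(second_run):
        problems.append("row contents differ between runs")
    duplicates = [row for row, n in Counter(second_run).items() if n > 1]
    if duplicates:
        problems.append(f"{len(duplicates)} duplicated row(s) after second run")
    return problems


if __name__ == "__main__":
    # Hypothetical mart rows: (variety, margin in percent).
    run_one = [("Telemea", 20.0), ("Urda", 6.5)]
    run_two = run_one + run_one  # simulates a model that appends on every run
    for problem in idempotency_problems(run_one, run_two):
        print(problem)
```

Run the check once per model. The mart matters most, because duplicated rows there change the profitability values directly.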