Step 1: Think about test strategy as architecture
Until now, you've added tests as you built models -- a uniqueness test here, a not-null test there. That works for individual models but doesn't answer the harder question: what catches what, and where?
A quality testing strategy is a deliberate design. Staging tests validate source data against expectations. Intermediate tests verify transformation logic. Mart tests verify business rules and consumer contracts. Each layer catches a different class of failure, and each has a different cost of false positives.
The question is not "how many tests do I have?" but "what failure would I not catch?"
Step 2: Design staging-layer tests
Staging tests are boundary defenses. They catch problems at the point where external data enters your pipeline.
For each staging model, design tests that validate:
- Schema presence: expected columns exist (catches field renames from source systems)
- Data types: quantities are numeric, dates are dates (catches type changes that produce silent nulls)
- Value ranges: delivery quantities are positive, prices are within realistic ranges
- Null patterns: columns that should never be null vs columns where nulls are legitimate (billing_status has legitimate nulls)
Direct AI to create these tests. AI commonly generates tests only at the mart layer because that's where the final output lives. But a field rename caught at staging is one failure. The same rename caught at the mart layer is twenty failures -- every downstream model inherited the problem.
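As a sketch, the staging checks above might be expressed in a dbt schema file like this (the model and column names are assumptions, and the range test requires the dbt_utils package):

```yaml
# models/staging/_stg_deliveries.yml (hypothetical model and column names)
version: 2

models:
  - name: stg_deliveries
    columns:
      - name: delivery_qty
        tests:
          - not_null
          - dbt_utils.accepted_range:    # from the dbt_utils package
              min_value: 0
              inclusive: false           # quantities must be strictly positive
      - name: delivery_date
        tests:
          - not_null
      - name: billing_status
        # deliberately no not_null test: nulls are legitimate here
```

A test that fails here, at the boundary, localizes the problem to the source system before any downstream model inherits it.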
Step 3: Design intermediate-layer tests
Intermediate tests verify transformation logic. After joining data from four factories into unified views, design tests that validate:
- Join key correctness: no orphaned records (deliveries without matching materials). A LEFT JOIN that drops records means material code resolution failed for some rows.
- Material code resolution completeness: every row has a standard material code. Rows with NULL standard codes mean the mapping table was incomplete.
- Deduplication verification: no unintended duplicates from the union of four sources.
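One way to express the resolution-completeness check is a dbt singular test: a SQL file under tests/ that fails if it returns any rows (model and column names here are assumptions):

```sql
-- tests/assert_material_codes_resolved.sql (hypothetical names)
-- Fails if any unified row lacks a standard material code,
-- i.e. the mapping table was incomplete for that row.
select
    delivery_id,
    source_factory,
    raw_material_code
from {{ ref('int_deliveries_unified') }}
where standard_material_code is null
```

The orphaned-records and deduplication checks follow the same pattern: a query that selects only the rows that violate the rule.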
Step 4: Design mart-layer tests
Mart tests verify business rules and consumer contracts:
- Business logic validation: cost attribution totals match source sums. If the sum of fct_cost_attribution.total_kwd doesn't match the sum across all staging models, something was lost or duplicated in transformation.
- Freshness constraints: mart tables should be no older than a defined threshold.
- Consumer contracts: the columns and types the CFO's reports depend on must be present and valid.
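The reconciliation rule can be sketched as another singular test; a small tolerance absorbs rounding differences (all model and column names are assumptions):

```sql
-- tests/assert_cost_attribution_reconciles.sql (hypothetical names)
-- Fails if the mart total drifts from the combined staging totals.
with mart_total as (
    select sum(total_kwd) as amount
    from {{ ref('fct_cost_attribution') }}
),

staging_total as (
    select sum(cost_kwd) as amount
    from (
        select cost_kwd from {{ ref('stg_factory_a__costs') }}
        union all
        select cost_kwd from {{ ref('stg_factory_b__costs') }}
        -- union all ... the remaining factories
    ) as combined
)

select *
from mart_total, staging_total
where abs(mart_total.amount - staging_total.amount) > 0.01
```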
Step 5: Run coverage analysis
After implementing tests across all three layers, run a coverage analysis. Direct AI to report:
- Which models have tests? Which don't?
- Which models have only structural tests (unique, not_null) but no business logic tests?
- Which business rules in Fatimah's requirements are verified by tests? Which are not?
The coverage analysis is professional judgment about risk. 100% coverage with tautological tests (testing that a column is not null when the schema already enforces NOT NULL) is worse than 60% coverage of the things that could actually go wrong.
Identify the gaps. What failure could happen right now that no test would catch?
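dbt's node selection can give a quick per-layer count as a starting point; a sketch, assuming a conventional models/staging, models/intermediate, models/marts folder layout:

```
# Count tests attached to each layer (folder names are assumptions)
dbt ls --resource-type test --select staging | wc -l
dbt ls --resource-type test --select intermediate | wc -l
dbt ls --resource-type test --select marts | wc -l
```

Counts alone say nothing about tautological tests, so follow the counts with the risk questions above.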
Step 6: Set up quality metrics tracking
Tests that pass today might become flaky tomorrow. Set up tracking for:
- Test pass rates over time
- Failure frequency by test (which tests fail most often?)
- Flaky test identification (tests that fail intermittently without a clear cause)
A test that fails every Tuesday and gets manually overridden is not a quality gate. It's noise that erodes trust in the entire testing infrastructure. Flaky tests that the team ignores are worse than no tests at all -- they create false confidence.
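For tracking over time, dbt writes a run_results.json artifact after every invocation, and it can persist the rows that made each test fail. A minimal project-config sketch (the store_failures and schema configs are standard dbt test configs; the schema name is an assumption):

```yaml
# dbt_project.yml (fragment)
tests:
  +store_failures: true     # persist failing rows to audit tables
  +schema: test_failures    # land them in a dedicated schema
```

Pass rates and failure frequency can then be built by collecting run_results.json from each run, and flaky tests show up as tests whose failing rows differ run to run.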
Step 7: Document the quality strategy
Write a quality strategy document that communicates: what's tested, at what layer, with what coverage, and what remains undefended.
This is professional documentation -- not a list of test names, but a description of the architecture. A new engineer reading this document should understand why specific thresholds were chosen, why certain business rules are tested at the mart layer instead of staging, and what known gaps exist.
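A skeleton for such a document might look like this (the section names are only a suggestion):

```markdown
# Data Quality Strategy

## What is tested, and at which layer
- Staging: schema presence, types, value ranges, null patterns
- Intermediate: join integrity, material code resolution, deduplication
- Mart: business rules, freshness, consumer contracts

## Thresholds and why they were chosen

## Why certain rules are tested at the mart layer instead of staging

## Known gaps (failures no test would catch)
```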
Check: run dbt test and report how many tests exist at each layer (staging, intermediate, and mart). Confirm that at least one business logic test exists at the mart layer (e.g., cost attribution totals match source sums).