Learn by Directing AI
Unit 6

Present results and close the project

Step 1: Prepare a summary for Francoise

Before contacting Francoise, assemble the key results from your pipeline. She needs specific numbers, not a description of what you built.

Query the mart for chain of custody completion:

SELECT chain_of_custody_status, COUNT(*) AS shipments
FROM fct_shipments
GROUP BY chain_of_custody_status;
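
Francoise will want a single completion rate, not only the per-status counts. A minimal sketch of turning the query result into a percentage follows. The status label `complete` and the sample counts are assumptions for illustration (they mirror the approximate checklist figures, not real pipeline output); substitute whatever values your mart actually uses.

```python
def completion_rate(status_counts, complete_label="complete"):
    """Return the percentage of shipments whose status equals complete_label."""
    total = sum(status_counts.values())
    if total == 0:
        raise ValueError("no shipments in status_counts")
    return round(100.0 * status_counts.get(complete_label, 0) / total, 1)


# Rows shaped like the GROUP BY result: (chain_of_custody_status, shipments).
# The labels and counts are illustrative assumptions.
rows = [("complete", 160), ("incomplete", 20)]
rate = completion_rate(dict(rows))
print(f"{rate}% of shipments have a complete chain of custody")
```

Reporting the rate alongside the raw counts lets Francoise see both the headline number and how many shipments sit behind it.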

Query the error quarantine for the gap report:

SELECT reason, COUNT(*) AS records
FROM error_quarantine
GROUP BY reason;

Query inventory across the four concessions:

SELECT concession_id, species, COUNT(*) AS logs, SUM(volume_m3) AS total_volume
FROM stg_forestry__logs
GROUP BY concession_id, species
ORDER BY concession_id, species;

Query yield by species from the intermediate model:

SELECT species, ROUND(AVG(yield_pct), 1) AS avg_yield
FROM int_yield
GROUP BY species;

These four queries give Francoise what she needs: how many shipments have full traceability, where the gaps are, what inventory looks like across concessions, and how efficient the sawmill is by species.

Step 2: Present results to Francoise

Open the chat with Francoise. Share the chain of custody completion rate, the quarantine report, the inventory summary, and the yield figures.

Francoise responds formally. On the chain of custody documentation, she says: "This is acceptable." From Francoise, that is high praise.

She notes the error quarantine report with approval: "Good. I need to know where the gaps are." The quarantined records represent timber that cannot be traced from forest to export -- each one is a potential FLEGT compliance risk. The fact that you can enumerate them precisely, with reasons, is exactly what she needed.

Step 3: Handle the scope request

Francoise makes a request: "I also need to track timber that is in transit -- between the sawmill and the port. Right now I have a blind spot. Logs leave the sawmill and I don't know their status until customs processes them at the port."

This is a scope expansion. It adds another data stream (transit/logistics tracking) to the pipeline. The request has value, because the transit blind spot is real, but it is not part of the current project.

Acknowledge the value of the request and defer it. Francoise is direct, and she respects directness in return. Something like: "Transit tracking would close the visibility gap between sawmill and port. I'll document it as a next-phase addition so it's ready when you want to proceed."

Francoise accepts the deferral professionally: "Understood. Add it to the next phase."

Step 4: End-to-end verification

Run the final verification. This is not the same as "dbt build passes." A passing build means the code runs without errors. End-to-end verification means the output matches expected values.

Start the Dagster UI so you can trigger a full materialization:

dagster dev

In the Dagster UI, materialize all assets. Every asset should turn green.

Then run the dbt tests:

dbt test

All structural and business logic tests should pass.
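
If you would rather script this check than read the terminal output, one option is to rely on the command's exit status. The sketch below is an assumption about how you might wrap it, not part of the project's code; the helper names `run_step` and `run_checks` are hypothetical.

```python
import subprocess


def run_step(command):
    """Run one CLI step; return True when it exits with status 0."""
    result = subprocess.run(command)
    return result.returncode == 0


def run_checks(commands):
    """Run steps in order and stop at the first one that fails."""
    for command in commands:
        if not run_step(command):
            print("FAILED:", " ".join(command))
            return False
    print("all steps passed")
    return True
```

For this project the call would look like `run_checks([["dbt", "test"]])`, assuming dbt is on your PATH and you run it from the dbt project directory.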

Finally, compare the mart output against materials/verification-checklist.md. Check:

  • Forestry: 500 rows
  • Sawmill: 320 rows
  • Customs: 180 rows
  • Tag-batch mapping: 480 rows
  • Error quarantine: approximately 40 records
  • Chain of custody: approximately 160 complete out of 180 shipments (approximately 89% completion rate)
  • Yield: 35-65% range by species

If a row count differs from the checklist, or an approximate figure is far from its expected value, trace it back through the layers (staging, intermediate, mart). The three-layer architecture and the Dagster lineage give you the tools to find where the discrepancy starts.
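
The checklist comparison can be sketched as a small script. The expected values below are copied from the list above; the dictionary keys, the 10% tolerance for the "approximately" items, and the species names in the usage line are assumptions made for illustration, so adjust them to match your own definitions.

```python
# Expected values from the verification checklist.
EXACT = {"forestry": 500, "sawmill": 320, "customs": 180, "tag_batch_mapping": 480}
APPROXIMATE = {"error_quarantine": 40, "complete_shipments": 160}
YIELD_RANGE = (35.0, 65.0)
TOLERANCE = 0.10  # assumption: "approximately" means within 10% of expected


def verify(actual_counts, yields_by_species):
    """Return a list of discrepancies; an empty list means everything matches."""
    problems = []
    for name, expected in EXACT.items():
        actual = actual_counts.get(name)
        if actual != expected:
            problems.append(f"{name}: expected {expected}, got {actual}")
    for name, expected in APPROXIMATE.items():
        actual = actual_counts.get(name)
        if actual is None or abs(actual - expected) > TOLERANCE * expected:
            problems.append(f"{name}: expected about {expected}, got {actual}")
    low, high = YIELD_RANGE
    for species, pct in yields_by_species.items():
        if not low <= pct <= high:
            problems.append(f"yield for {species}: {pct}% is outside {low}-{high}%")
    return problems
```

Calling `verify(counts, {"species_a": 48.2})` with the counts from your own queries returns one line per discrepancy, which tells you which source to start tracing from.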

Step 5: Commit to Git

Commit the final state of the project. Include all dbt models, Dagster configuration, tests, and the extraction scripts.

git add -A
git commit -m "feat: complete Bois du Littoral chain of custody pipeline"

Push to GitHub:

git push origin main

Step 6: Write the README

Direct Claude to write a project README.

Write a README.md for this project. Include: what was built (multi-source data pipeline for timber chain of custody), what the pipeline does (extracts from four sources, resolves identity across systems, produces chain of custody status per shipment), the key results (completion rate, yield ranges, quarantine count), how to run it (dagster dev for orchestration, dbt build for transformation), and what's deferred (transit tracking between sawmill and port).

Review the README and commit it:

git add README.md
git commit -m "docs: add project README"

✓ Check

All tests pass. The mart matches the checklist. The Dagster materialization succeeds. The quarantine contains the expected records. The README is committed.

Project complete

Nice work. Ready for the next one?