Step 1: Prepare the configuration-aware analysis
The pipeline is complete. SCD dimensions track configuration history. MetricFlow standardizes metrics. PII is masked. Quality tests pass. Now prepare the analysis that answers Katrine's original question: "Did the blade upgrade actually improve power output?"
Query a specific turbine with a known blade upgrade. DK01-T02 received an upgrade from LM 46.8 to LM 43.2 on March 15, 2024. Compare average power output at comparable wind speeds before and after the change:
SELECT
    CASE WHEN timestamp < '2024-03-15' THEN 'Pre-upgrade (LM 46.8)'
         ELSE 'Post-upgrade (LM 43.2)' END AS period,
    blade_model,
    ROUND(AVG(power_output_kw), 1) AS avg_power_kw,
    COUNT(*) AS readings
FROM fct_turbine_performance
WHERE turbine_id = 'DK01-T02'
  AND wind_speed_ms BETWEEN 8 AND 12
GROUP BY period, blade_model
ORDER BY period;
The wind speed filter (8-12 m/s) controls for weather. You're comparing the same turbine in the same wind conditions, before and after the upgrade. The SCD effective dates ensure each reading links to the correct configuration.
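Under the hood, that link is a temporal join between the fact readings and the SCD dimension. A sketch of what it might look like (table and column names here -- `stg_turbine_readings`, `dim_turbine_configuration`, `valid_from`, `valid_to` -- are illustrative and depend on how your SCD dimension was built):

```sql
-- Sketch: attach each reading to the configuration row whose
-- validity window contains the reading's timestamp.
SELECT r.turbine_id,
       r.timestamp,
       r.power_output_kw,
       c.blade_model
FROM stg_turbine_readings r
JOIN dim_turbine_configuration c
  ON r.turbine_id = c.turbine_id
 AND r.timestamp >= c.valid_from
 AND r.timestamp <  COALESCE(c.valid_to, '9999-12-31');
```

The `COALESCE` handles the current configuration row, whose `valid_to` is typically NULL until the next component change closes it.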
Step 2: Query standardized metrics
Query availability and capacity factor across all farms using the MetricFlow definitions:
SELECT farm_id,
       ROUND(availability, 1) AS availability_pct,
       ROUND(capacity_factor, 1) AS capacity_factor_pct
FROM farm_metrics
ORDER BY farm_id;
Every farm gets the same formula. The same metric name means the same calculation everywhere. When Katrine shows these numbers to farm owners, there's no "what does availability mean?" argument. The definition is in the semantic layer, auditable, and consistent.
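For reference, a ratio metric like availability is defined once in the semantic layer YAML, roughly like this (a sketch -- the measure names `operational_hours` and `total_hours` are illustrative, not from the project):

```yaml
metrics:
  - name: availability
    description: Share of time the turbine was operational.
    type: ratio
    type_params:
      numerator: operational_hours
      denominator: total_hours
```

Anyone who queries `availability` gets this exact ratio; changing the definition means changing this one file, which is what makes the metric auditable.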
Step 3: Present to Katrine
Email Katrine with the results. Show her:
- The configuration-aware performance comparison for DK01-T02 -- before and after the blade upgrade, controlling for wind speed
- The standardized availability and capacity factor metrics across all 14 farms
- How the system automatically tracks configuration changes -- when the next component is replaced, the historical boundary is created without manual tagging
Frame the results in business terms. Katrine doesn't need to know about SCD Type 2 or temporal joins. She needs to know that the system answers her question: the blade upgrade improved power output by approximately 8-10% at comparable wind speeds, and now every farm is measured the same way.
Step 4: Handle scope creep
Katrine will respond positively -- and then ask for something new. After seeing the configuration history working, she wants warranty tracking: "Can we also track warranty periods? Some components are under warranty and I need to know if a failure happened within the warranty window -- that's a claim we can file."
This is scope creep, but it's legitimate. Warranty tracking is a natural extension of the SCD pattern you already built. The question is whether it fits within the current architecture or requires structural changes.
Evaluate the request:
- Warranty periods are temporal data (start date, end date) associated with components
- This could be a new attribute pair on the existing SCD dimension (warranty_start, warranty_end)
- Or it could be a separate dimension (dim_warranty) linked to the component change events
The first option is simpler -- it extends what exists. The second option is cleaner if warranty data comes from a different source than the component change log. Consider what Katrine said: "Right now I check the spreadsheet manually." The warranty data likely lives in the same spreadsheet as the component changes, which makes it an attribute of the configuration, not a separate entity.
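If you go with the first option, the warranty check becomes a small extension of the temporal join you already have. A sketch, assuming `warranty_start`/`warranty_end` columns on the SCD dimension and a hypothetical failures fact table (`fct_component_failures` is illustrative):

```sql
-- Sketch: flag failures that fall inside the warranty window of the
-- configuration that was active when the failure occurred.
SELECT f.turbine_id,
       f.failure_timestamp,
       c.blade_model,
       f.failure_timestamp BETWEEN c.warranty_start AND c.warranty_end
           AS within_warranty
FROM fct_component_failures f
JOIN dim_turbine_configuration c
  ON f.turbine_id = c.turbine_id
 AND f.failure_timestamp >= c.valid_from
 AND f.failure_timestamp <  COALESCE(c.valid_to, '9999-12-31');
```

Because the warranty dates ride along on the same SCD rows, no new join logic is needed -- which is the argument for treating warranty as an attribute of the configuration rather than a separate dimension.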
Step 5: Respond to scope creep
Respond to Katrine professionally. Acknowledge the request, evaluate feasibility, and give a clear recommendation:
- If you think warranty tracking fits as SCD attributes: explain that the existing configuration tracking can be extended to include warranty periods, with minimal additional complexity
- If you think it needs a separate dimension: explain the trade-off -- a cleaner data model but additional build work
Either way, note what data Katrine would need to provide (warranty start and end dates per component) and whether that data exists in her current spreadsheet.
Step 6: Push and close
Commit the final state and push to GitHub. Write a README that covers:
- What the pipeline does
- The SCD strategy and why it was chosen
- How to query the standardized metrics
- The PII masking approach
- How to run the quality tests
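A minimal skeleton covering those points (headings are suggestions; adapt to the repo's conventions):

```markdown
# Turbine Performance Pipeline

## What it does
Ingests turbine telemetry, tracks configuration history, standardizes
metrics through MetricFlow, and masks PII.

## SCD strategy
Component changes are captured as SCD Type 2 rows, so every reading
joins to the configuration that was active at the time.

## Querying standardized metrics
Query the semantic-layer metrics (e.g. `availability`,
`capacity_factor`) rather than hand-rolled SQL.

## PII masking
...

## Quality tests
...
```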
git add -A
git commit -m "feat: complete turbine performance pipeline with SCD tracking, MetricFlow metrics, and PII masking"
git push
The project is complete. The schema design decisions -- SCD strategies, metric definitions, model contracts -- act as agreements that shape every future analysis. Katrine's farm owners get consistent numbers. Her configuration history answers the question that started this project: "Did the upgrade work?"
Check: Can you show Katrine a specific turbine's performance before and after a component change, with the SCD tracking ensuring the comparison is fair?