Learn by Directing AI
Unit 3

Metric hierarchies and OEE

Step 1: The metric hierarchy template

Open materials/metric-hierarchy-template.md. This is the primary scaffolding document for this project -- it replaces the statistical testing guide from last time.

The template introduces three ideas:

  1. OEE decomposition -- Overall Equipment Effectiveness breaks into Availability, Performance Rate, and Quality Rate. These are not three separate metrics. They are components of one metric, and the parent number is the product of the three.
  2. Leading and lagging indicators -- some metrics predict future outcomes (leading), others confirm past outcomes (lagging). The distinction determines whether you can act before a problem or only report on it after.
  3. Hierarchy documentation -- which metrics depend on which, what changes cascade where, who owns each definition.

Read the worked example in the template. It uses a different industry so it does not give away the answer for Verdant Packaging. The structure is what matters: parent metric at the top, components below, data source for each, cascade relationships documented.

Step 2: Define Availability

Start with the first component. Availability measures how much of the planned production time was actually used for production.

The formula: planned production time minus unplanned downtime, divided by planned production time. Source: production logs. The planned_time_minutes and downtime_minutes columns give you everything you need.

Direct AI to calculate Availability from the production logs:

Using the production logs, calculate Availability for each production line: (planned_time_minutes - downtime_minutes) / planned_time_minutes. Show the average Availability per line.

You should see Availability around 92% across lines. That looks healthy -- but it is only one-third of the picture.
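The calculation AI produces should look something like the sketch below. The rows here are hypothetical stand-ins; the real planned_time_minutes and downtime_minutes values come from the production logs.

```python
from collections import defaultdict

# Hypothetical production-log rows -- real data comes from the production logs.
production_logs = [
    {"line": "LINE-A", "planned_time_minutes": 480, "downtime_minutes": 35},
    {"line": "LINE-A", "planned_time_minutes": 480, "downtime_minutes": 42},
    {"line": "LINE-B", "planned_time_minutes": 480, "downtime_minutes": 38},
]

# Sum planned time and unplanned downtime per line.
totals = defaultdict(lambda: {"planned": 0, "downtime": 0})
for row in production_logs:
    totals[row["line"]]["planned"] += row["planned_time_minutes"]
    totals[row["line"]]["downtime"] += row["downtime_minutes"]

# Availability = (planned - unplanned downtime) / planned
availability = {
    line: (t["planned"] - t["downtime"]) / t["planned"]
    for line, t in totals.items()
}
for line, a in sorted(availability.items()):
    print(f"{line}: {a:.1%}")
```

The same aggregation works per line, per shift, or per day; the grain you choose is the grain the metric reports at.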

Step 3: Define Performance Rate

Performance Rate measures whether the production line is running at its rated speed when it is running.

The formula: actual output divided by maximum possible output at the line's rated speed. Source: production logs. The units_produced and rated_capacity columns give you both terms.

Direct AI to calculate Performance Rate:

Calculate Performance Rate for each production line: units_produced / rated_capacity. Show the average per line.

Performance Rate should be around 88%. The lines are running, but not at full speed. Again, one piece of the picture.
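The arithmetic is a single ratio per line. A minimal sketch, again with hypothetical numbers in place of the real production-log values:

```python
# Hypothetical rows -- units_produced and rated_capacity come from the production logs.
rows = [
    {"line": "LINE-A", "units_produced": 10450, "rated_capacity": 12000},
    {"line": "LINE-B", "units_produced": 10700, "rated_capacity": 12000},
]

# Performance Rate = actual output / maximum possible output at rated speed
for row in rows:
    rate = row["units_produced"] / row["rated_capacity"]
    print(f"{row['line']}: {rate:.1%}")
```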

Step 4: Define Quality Rate

Quality Rate measures how many of the units produced were actually good.

The formula: good units divided by total units produced. This one is trickier -- the pass/fail results come from the quality results (one record per batch), while production volume comes from the production logs. Two data sources feeding one metric component. Because quality is recorded at the batch level, the batch pass rate stands in for the unit-level Quality Rate.

Direct AI to calculate Quality Rate:

Join quality results with production logs on production_line_id and production_date. Calculate Quality Rate: count of pass results / total batches tested, grouped by production line.

Quality Rate should be noticeably lower for LINE-A (food containers) -- around 78% compared to 88-92% for the other lines. This is where the story changes. The food container line is not slow and not down -- it is producing units that fail quality testing.
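The batch-level pass rate can be sketched directly from the quality results; the join to the production logs matters when you weight by units produced rather than counting batches. The records below are hypothetical:

```python
from collections import defaultdict

# Hypothetical quality results: one pass/fail record per batch.
quality_results = [
    {"production_line_id": "LINE-A", "production_date": "2024-03-01", "result": "fail"},
    {"production_line_id": "LINE-A", "production_date": "2024-03-01", "result": "pass"},
    {"production_line_id": "LINE-A", "production_date": "2024-03-02", "result": "pass"},
    {"production_line_id": "LINE-B", "production_date": "2024-03-01", "result": "pass"},
]

# Quality Rate = pass results / total batches tested, per line
counts = defaultdict(lambda: {"pass": 0, "total": 0})
for r in quality_results:
    counts[r["production_line_id"]]["total"] += 1
    if r["result"] == "pass":
        counts[r["production_line_id"]]["pass"] += 1

quality_rate = {line: c["pass"] / c["total"] for line, c in counts.items()}
for line, q in sorted(quality_rate.items()):
    print(f"{line}: {q:.1%}")
```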

Step 5: Talk to Siobhan about efficiency

Ask Siobhan how Verdant Packaging currently measures production efficiency. Her response reveals a blind spot: "We look at output per shift, basically."

Output per shift is a single number that hides three different problems. Machine downtime, line speed, and quality failures all reduce "output" -- but the fix for each is different. When Siobhan sees the OEE breakdown -- 92% availability, 88% performance, 78% quality on the food container line -- the diagnosis changes. The quality issues are the bottleneck, not machine time.

Tell Siobhan: "Your line looks 85% efficient overall, but that's hiding a 92% availability rate and a 78% quality rate. The quality issues are your bottleneck, not machine downtime."

Her response: "That's exactly the kind of thing I've been missing. The quality issues are the bottleneck, not machine time."

Step 6: Review AI's hierarchy decomposition

Now direct AI to generate the full OEE hierarchy for Verdant Packaging:

Generate an OEE metric hierarchy for Verdant Packaging. OEE = Availability x Performance Rate x Quality Rate. For each component, include the definition, formula, data source, and current value per production line.

Review what AI produces. AI generates metric hierarchies that are mathematically consistent but may not match how the business makes decisions. AI might decompose by product line when Siobhan thinks in terms of production stage. Or AI might group the components differently from how the operations team would use them.

Check: does the decomposition match how Verdant Packaging actually operates? The three production lines make different products with different quality profiles. LINE-A (food containers) uses PLA resin and has the highest failure rate. The hierarchy should reflect the operational reality, not just the math.
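Composing the parent metric is one multiplication per line. The sketch below uses the approximate component values from the earlier steps; the real per-line availability and performance vary slightly, so the OEE values are illustrative:

```python
# Approximate component values from the earlier steps (illustrative, not exact).
components = {
    "LINE-A": {"availability": 0.92, "performance": 0.88, "quality": 0.78},
    "LINE-B": {"availability": 0.92, "performance": 0.88, "quality": 0.92},
    "LINE-C": {"availability": 0.92, "performance": 0.88, "quality": 0.88},
}

# OEE is the product of its three components, not an average.
oee = {
    line: c["availability"] * c["performance"] * c["quality"]
    for line, c in components.items()
}
for line, value in sorted(oee.items()):
    print(f"{line}: OEE = {value:.1%}")
```

Note how the multiplication punishes the weakest component: LINE-A's 78% quality drags its OEE well below the other lines even though its availability and performance match them.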

Dr. Nkechi Obi, a senior colleague, reviews your first attempt at the hierarchy. Her feedback is brief and questioning: "OEE looks like one number. It's three. Define each one separately before you compose them." She pushes you to make sure each component stands on its own before combining them.

```mermaid
graph TD
    OEE["OEE<br/>~63% food containers<br/>~75% mailer bags<br/>~73% industrial wrap"]
    A["Availability<br/>~92%<br/>Source: production logs"]
    P["Performance Rate<br/>~88%<br/>Source: production logs"]
    Q["Quality Rate<br/>~78% LINE-A | ~92% LINE-B | ~88% LINE-C<br/>Source: quality results + production logs"]
    OEE --> A
    OEE --> P
    OEE --> Q
    Q -.->|"cascade: redefining<br/>quality rate changes OEE"| OEE
```

Step 7: Define leading and lagging indicators

Leading indicators predict future outcomes. Lagging indicators confirm past outcomes. The difference determines whether the operations team can act proactively or only react.

Define at least three of each for Verdant Packaging:

Leading indicators:

  • PLA moisture content -- when moisture exceeds 4%, batch failure rates on the food container line increase sharply. This is measurable before production starts.
  • Maintenance frequency -- lines with less frequent maintenance have higher unplanned downtime. Measurable before downtime occurs.
  • Raw material order lead times -- PLA resin from ChemPlas GmbH takes 35-50 days. Monitoring lead time trends predicts supply disruptions.

Lagging indicators:

  • Batch failure rate -- confirms past quality problems after they have happened.
  • Delivery delay rate -- confirms past scheduling failures.
  • Waste percentage -- confirms past production losses.

AI commonly classifies indicators as leading or lagging based on temporal ordering rather than causal relationship. A metric that happens to be measured first is not automatically a leading indicator. The classification depends on whether the metric has a causal relationship with the outcome -- whether acting on it can change what happens next.

Step 8: Document hierarchy relationships and test the cascade

Document the metric dependencies: which metrics depend on which, what changes cascade where.

Then test the cascade. Redefine Quality Rate to include rework (batches that were reprocessed and passed on the second attempt). What happens to OEE?

Recalculate Quality Rate including reprocessed batches as "pass" instead of "fail." Show the new Quality Rate per line and the new OEE. Compare to the original values.

When you include rework, Quality Rate increases -- batches that failed initially but were reprocessed and passed now count as good. OEE increases too, because OEE depends on Quality Rate. But the underlying operational problem has not changed. The food container line still has the same failure rate at the first pass. The metric looks better, but the reality is the same.
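The cascade can be sketched numerically. The batch outcomes and component values below are hypothetical; "rework_pass" marks a batch that failed initially but passed after reprocessing:

```python
# Hypothetical batch outcomes -- real data comes from the quality results.
batches = ["pass", "fail", "rework_pass", "pass", "fail", "pass"]

availability, performance = 0.92, 0.88  # unchanged components

def oee(quality_rate):
    # OEE = Availability x Performance Rate x Quality Rate
    return availability * performance * quality_rate

# Original definition: only first-pass results count as good.
original_q = batches.count("pass") / len(batches)
# Redefined: reworked batches count as good too.
rework_q = (batches.count("pass") + batches.count("rework_pass")) / len(batches)

print(f"Quality Rate: {original_q:.1%} -> {rework_q:.1%}")
print(f"OEE:          {oee(original_q):.1%} -> {oee(rework_q):.1%}")
```

Only the definition changed, yet both Quality Rate and OEE move. That is the cascade the hierarchy documentation has to capture.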

This is cascading governance. Changing a component metric's definition changes every metric that depends on it. Change "quality rate" without reviewing what depends on it, and you break metrics you did not intend to touch.

Document the hierarchy relationships: OEE depends on Availability, Performance Rate, and Quality Rate. Quality Rate uses data from both quality results and production logs. Redefining any component changes the parent. This documentation is governance infrastructure -- it tells the next person who touches these metrics what will break if they change a definition.

✓ Check

OEE decomposes into three components; each component is defined with its data source; leading and lagging indicators are identified; the cascade is tested.