Step 1: Quality assessment from experience
No template this time. In P4, you worked through a structured quality assessment -- null patterns, impossible values, consistency, business-relevant anomalies. That process is yours now. You decide what to check based on what the data looks like.
The POS data has prices, quantities, dates, and categorical columns. Each of those has its own failure modes. Direct Claude to check for them:
Run a data quality assessment on the POS data. Check for: null values in any column, negative or zero prices, impossible quantities, dates outside the expected range (Oct 2024 to Mar 2026), categories or stores not in the data dictionary, and duplicate transactions. Summarize findings as fixable, flaggable, or blocking.
Read the results. Classify what you find. Fixable issues are things Claude can clean automatically. Flaggable issues need a human decision -- usually the client's. Blocking issues stop the analysis until resolved.
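The checks in that prompt translate directly into queries. Here is a minimal sketch using Python's stdlib sqlite3 (the project itself runs DuckDB, but these aggregate checks are portable between the two); the table name, column names, and sample rows are hypothetical stand-ins for the real POS schema:

```python
import sqlite3

# Hypothetical schema -- the real POS column names may differ.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE transactions (
        txn_id INTEGER, store TEXT, category TEXT,
        price REAL, quantity INTEGER, txn_date TEXT
    )
""")
con.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Kariakoo",   "Cafe",       2.5,  2,  "2025-01-10"),
        (2, "Oyster Bay", "Stationery", -1.0, 1,  "2025-02-03"),  # negative price
        (3, "Kariakoo",   "Books",      8.0,  40, "2025-01-15"),  # bulk order: flaggable, not an error
        (3, "Kariakoo",   "Books",      8.0,  40, "2025-01-15"),  # duplicate txn_id: fixable
        (4, "Oyster Bay", "Cafe",       3.0,  1,  "2023-06-01"),  # date outside Oct 2024 - Mar 2026
    ],
)

# One counting query per failure mode from the prompt.
checks = {
    "null_values":  "SELECT COUNT(*) FROM transactions "
                    "WHERE price IS NULL OR quantity IS NULL OR txn_date IS NULL",
    "bad_prices":   "SELECT COUNT(*) FROM transactions WHERE price <= 0",
    "out_of_range": "SELECT COUNT(*) FROM transactions "
                    "WHERE txn_date NOT BETWEEN '2024-10-01' AND '2026-03-31'",
    "duplicates":   "SELECT COUNT(*) FROM (SELECT txn_id FROM transactions "
                    "GROUP BY txn_id HAVING COUNT(*) > 1)",
}
results = {name: con.execute(sql).fetchone()[0] for name, sql in checks.items()}
print(results)  # {'null_values': 0, 'bad_prices': 1, 'out_of_range': 1, 'duplicates': 1}
```

Each nonzero count then gets sorted into fixable, flaggable, or blocking -- the count itself does not tell you which bucket it belongs in.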
Step 2: Ask Amina
Some findings need context only Amina can provide. Open the chat with Amina and ask about the anomalies you found.
She reveals three things:
- School bulk orders. The school partnerships she mentioned account for roughly 20% of annual revenue, but they are concentrated in January and September. Regular retail sales are 1-3 items per transaction. School orders are 10-50. These spikes are real business, not data errors -- but they distort monthly averages if you do not account for them.
- Cafe price change. Six months ago, cafe prices went up about 15%. Any year-over-year comparison of cafe revenue will look like growth when it is really a price adjustment. The underlying volume has not changed much.
- Kariakoo demographics. Kariakoo has significantly higher foot traffic but lower average transaction values. The neighborhood is a busy commercial district -- more customers, smaller purchases. Comparing Kariakoo's average transaction value to Oyster Bay's without context is misleading.
These are not data quality problems in the traditional sense. The data is accurate. But what counts as a quality problem depends on the question being asked. School bulk orders are accurate transactions -- but they are a quality problem if your question is "what is the monthly retail trend?"
Step 3: Define the metrics
Amina wants a store-by-store comparison. That requires defining "store performance" in a way that accounts for what you just learned.
Three metrics need explicit definitions:
Revenue per store. Does this include school bulk orders? Both answers are valid. Including them shows the full business reality. Excluding them shows the underlying retail trend. You need both versions -- one for "total business health" and one for "retail trend analysis." Make the decision explicit.
Average transaction value. Kariakoo's demographics make this number fundamentally different from Oyster Bay's. Comparing them raw is misleading. You can annotate the demographic context, normalize against a per-store baseline, or present the stores separately with explanations. Pick an approach.
Category growth trends. The cafe price change means cafe revenue growth is partly price inflation, not volume growth. Year-over-year cafe comparison needs a price-adjusted version. Other categories do not have this problem.
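The arithmetic behind the cafe price adjustment is worth seeing once. A toy example with invented revenue figures, assuming the roughly 15% increase and flat volume:

```python
# Hypothetical monthly cafe revenue, before and after the ~15% price increase.
# Volume is flat in this example; only prices changed.
pre_increase_revenue = 10_000.0   # monthly cafe revenue at old prices
post_increase_revenue = 11_500.0  # same volume at prices 15% higher

raw_growth = post_increase_revenue / pre_increase_revenue - 1
adjusted = post_increase_revenue / 1.15  # deflate back to old price levels
adjusted_growth = adjusted / pre_increase_revenue - 1

print(f"raw growth: {raw_growth:.0%}")              # raw growth: 15%
print(f"adjusted growth: {adjusted_growth:.0%}")    # adjusted growth: 0%
```

The unadjusted number reports 15% growth that is entirely price inflation; deflating by the adjustment factor recovers the flat underlying volume. This is why the adjusted query needs both a date boundary and an explicit factor.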
Step 4: Write definitions as SQL
Each metric definition needs two forms: plain language (for Amina and her managers) and SQL (for the dashboards). The SQL is what runs behind both the Plotly dashboards and Metabase. If the definitions are not precise, the two tools will produce different numbers for the same metric.
Direct Claude to write the SQL:
Write SQL metric definitions for the following. Each needs a one-sentence plain language definition and a DuckDB SQL query.
1. Total revenue per store (all transactions)
2. Retail revenue per store (excluding school bulk orders -- transactions with quantity >= 10)
3. Average transaction value per store (with a note about Kariakoo demographics)
4. Category revenue growth month-over-month (with a price-adjusted version for Cafe items after September 2025)
Run each query against the POS data and show the results.
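As a sketch of what the first two definitions might look like -- again using sqlite3 as a stand-in for DuckDB, with a hypothetical transactions schema and made-up rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE transactions (store TEXT, price REAL, quantity INTEGER)")
con.executemany("INSERT INTO transactions VALUES (?, ?, ?)", [
    ("Kariakoo",   2.0, 3),   # regular retail sale
    ("Kariakoo",   8.0, 40),  # school bulk order (quantity >= 10)
    ("Oyster Bay", 5.0, 2),
])

# Metric 1: total revenue per store (all transactions).
total = con.execute("""
    SELECT store, SUM(price * quantity) AS revenue
    FROM transactions GROUP BY store ORDER BY store
""").fetchall()

# Metric 2: retail revenue per store -- the exclusion rule is explicit in the WHERE clause.
retail = con.execute("""
    SELECT store, SUM(price * quantity) AS revenue
    FROM transactions WHERE quantity < 10
    GROUP BY store ORDER BY store
""").fetchall()

print(total)   # [('Kariakoo', 326.0), ('Oyster Bay', 10.0)]
print(retail)  # [('Kariakoo', 6.0), ('Oyster Bay', 10.0)]
```

The only difference between the two metrics is one WHERE clause -- which is exactly why the inclusion/exclusion rule has to be stated, not implied.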
Notice what you did: you specified the constraints before the computation. Each metric has an explicit inclusion/exclusion rule. The cafe adjustment has a specific date boundary and a specific adjustment factor. These constraints shape what Claude produces -- vague constraints produce vague metrics.
Step 5: Cross-check the definitions
Run the metric definitions and verify the numbers make sense. Then cross-check with a second AI.
Here are the metric definitions and their SQL queries. [Paste the definitions.] Run each query and verify: (1) the total revenue across all stores sums correctly, (2) the retail-only revenue is lower than total revenue by roughly 20%, (3) the cafe price-adjusted trend shows a different growth rate than the unadjusted version. Flag anything that looks wrong.
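The shape of those verification checks can be written down as plain assertions. The revenue figures below are invented placeholders; real values would come from running the queries against the POS data:

```python
# Hypothetical outputs from the two revenue queries.
total_revenue  = 100_000.0  # all transactions
retail_revenue = 81_000.0   # excluding bulk orders (quantity >= 10)

# (1) Retail revenue can never exceed total revenue.
assert retail_revenue <= total_revenue

# (2) Bulk orders should account for roughly 20% of revenue, per Amina.
#     A share far outside that band suggests the exclusion rule is wrong.
bulk_share = 1 - retail_revenue / total_revenue
assert 0.10 <= bulk_share <= 0.30, f"unexpected bulk share: {bulk_share:.0%}"

print(f"bulk share: {bulk_share:.0%}")  # bulk share: 19%
```

Checks like these are cheap to run after every query change, and they encode Amina's business context as testable expectations rather than tribal knowledge.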
AI models have different analytical failure modes. A metric definition that one model accepts without question might trigger a useful objection from another. The cross-check is not about finding the "right" answer -- it is about surfacing assumptions you may have missed.
The revenue comparison with and without school bulk orders should make the distortion visible. The January and September spikes flatten when you exclude bulk orders, revealing the underlying retail trend. This is the analytical design decision for the project: Amina's dashboards need both views, clearly labeled, so she knows which question each one answers.
Check: What is total revenue for each store with school orders included versus excluded? How does the cafe price change affect year-over-year comparison for the past six months? Can you state each metric definition in one sentence?