Learn by Directing AI
Unit 1

Amina's message and the POS data

Step 1: Amina's message

Amina Msangi runs Soma Books, three bookstores in Dar es Salaam: Oyster Bay, Kariakoo, and Mikocheni. Eighteen employees, curated titles, stationery, and a cafe in each store.

Open the chat with Amina and read her WhatsApp message. She wants five things:

  1. A dashboard where she clicks a store and sees the breakdown -- categories, subcategories, months
  2. Store-by-store comparison on the same metrics
  3. Trends -- which categories are growing, which are declining
  4. Her managers need access to their own store's numbers
  5. A clear signal when there is not enough data to draw a conclusion

She communicates via WhatsApp. Professional, direct, no wasted words. She has 18 months of POS data from all three stores -- about 42,000 rows.

Step 2: Project setup

Open a terminal and start Claude Code:

cd ~/dev
claude

Paste this prompt:

Create the folder ~/dev/analytics/p5. Download the project materials from https://learnbydirectingai.dev/materials/analytics/p5/materials.zip and extract them into that folder. Read CLAUDE.md -- it's the project governance file. If anything needs admin access, tell me what to run in a separate terminal.

Claude creates the folder, downloads the materials, and reads CLAUDE.md. That file describes Amina's situation, the deliverables, the tech stack, and the work breakdown. Once Claude confirms it has read CLAUDE.md, you are set up.

Confirm Docker and Metabase are still running from P4. If they are not, direct Claude to restart them.

Step 3: The data dictionary

Open materials/data-dictionary.md. It describes the POS export columns: store, category, subcategory, title, author, price_tzs, quantity, transaction_date. Five categories: Fiction, Non-Fiction, Children's, Stationery, Cafe. Each has subcategories.

Note the business terminology section. School partnerships are bulk orders from local schools, placed in January and September. These orders are identifiable by quantity: regular sales are 1-3 items, school orders are 10-50.
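The quantity rule can be sketched as a small classifier. This is a minimal illustration in Python, not part of the project materials. The function name `classify_sale` and the "unclassified" label are assumptions: the ranges quoted above do not say how to treat quantities of 4-9 or above 50, so the sketch flags them instead of guessing.

```python
def classify_sale(quantity: int) -> str:
    """Label a POS row by quantity, using the two ranges quoted from the data dictionary."""
    if 1 <= quantity <= 3:
        return "regular"
    if 10 <= quantity <= 50:
        return "school_order"
    # Quantities outside both stated ranges (for example 4-9) are not covered by the rule.
    return "unclassified"


# Hypothetical quantities, invented for illustration only.
sample_quantities = [1, 3, 7, 10, 25, 50]
labels = [classify_sale(q) for q in sample_quantities]
print(labels)
```

If the real data contains many "unclassified" rows, that is a question to take back to Amina, not something to settle by assumption.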

Three stores in three neighborhoods. Oyster Bay is affluent with more international titles. Kariakoo is a busy commercial district with higher foot traffic. Mikocheni is a balanced residential neighborhood.

Step 4: Data profiling

The POS export (materials/soma-books-pos.csv) covers all three stores, 18 months of transactions. Direct AI to load it into DuckDB and profile it. Be specific -- one thing at a time:

Load materials/soma-books-pos.csv into DuckDB. Show me the column names, types, and row count.
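To make the expected answer concrete, here is a rough sketch of what a basic profile contains. It uses Python's standard `csv` module as a stand-in, not DuckDB, and it runs on three invented rows, not the real export. The column names come from the data dictionary; the row values, the subcategory names, the date format, and the helper `guess_type` are assumptions made for illustration. DuckDB does its own type detection, so the types it reports will likely differ from this naive guess.

```python
import csv
import io

# Invented sample rows that reuse the column names from the data dictionary.
SAMPLE_CSV = """store,category,subcategory,title,author,price_tzs,quantity,transaction_date
Oyster Bay,Fiction,Literary,Example Title A,Example Author A,35000,1,2024-01-05
Kariakoo,Stationery,Notebooks,Example Notebook,,4000,20,2024-01-12
Mikocheni,Cafe,Coffee,Example Coffee,,5000,2,2024-02-03
"""


def guess_type(values):
    """Naive guess: INTEGER if every non-empty value is all digits, otherwise TEXT."""
    non_empty = [v for v in values if v != ""]
    if non_empty and all(v.isdigit() for v in non_empty):
        return "INTEGER"
    return "TEXT"


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
columns = list(rows[0].keys())
profile = {col: guess_type([r[col] for r in rows]) for col in columns}

print("row count:", len(rows))
for col, col_type in profile.items():
    print(f"{col}: {col_type}")
```

Whatever tool produces the profile, the result to look for is the same: eight columns, sensible types for price_tzs, quantity, and transaction_date, and a row count close to the 42,000 Amina mentioned.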

After the basic profile, send a second prompt:

Show me the number of transactions per month, ordered by date. I want to see if there are any months with unusual volumes.
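The monthly count requested here is a simple group-and-count. The sketch below shows the idea in plain Python on a handful of invented dates; it assumes transaction_date is stored in ISO format (YYYY-MM-DD), which the real export may or may not use, and it treats each row as one transaction.

```python
from collections import Counter

# Invented transaction dates (ISO format assumed), for illustration only.
sample_dates = [
    "2024-01-05",
    "2024-01-12",
    "2024-01-20",
    "2024-02-03",
    "2024-03-15",
    "2024-03-28",
]

# The first seven characters of an ISO date are the year and month, e.g. "2024-01".
monthly_counts = Counter(d[:7] for d in sample_dates)

# ISO year-month strings sort chronologically, so a plain sort orders them by date.
for month in sorted(monthly_counts):
    print(month, monthly_counts[month])
```

When you read the AI's real output, check that it groups by year and month together. Grouping by month alone would merge the same month from different years across the 18-month range.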

Step 5: Initial patterns

Look at the monthly transaction volumes. January and September should stand out -- they have noticeably more transactions than typical months. These are the school bulk order months. The data dictionary mentioned school partnerships, and the profile confirms the pattern.

Compare the three stores next. Oyster Bay likely shows higher revenue per transaction. Kariakoo likely shows more transactions but lower average values. These differences will matter when Amina asks you to "compare stores fairly."
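A per-store comparison of this kind comes down to two numbers per store: how many rows it has and the average value of a row. The sketch below is one way to compute them, under stated assumptions: price_tzs is taken to be a unit price, so row value is price_tzs times quantity, and each row is treated as one transaction. The rows are invented and say nothing about the real stores.

```python
from collections import defaultdict

# Invented rows: (store, price_tzs, quantity). Not taken from the real export.
sample_rows = [
    ("Oyster Bay", 40000, 1),
    ("Oyster Bay", 30000, 2),
    ("Kariakoo", 10000, 1),
    ("Kariakoo", 5000, 2),
    ("Kariakoo", 8000, 1),
]

# Accumulate row count and revenue per store.
totals = defaultdict(lambda: {"rows": 0, "revenue": 0})
for store, price_tzs, quantity in sample_rows:
    totals[store]["rows"] += 1
    totals[store]["revenue"] += price_tzs * quantity  # assumes price_tzs is per unit

# Average value per row for each store.
summary = {
    store: {"rows": t["rows"], "avg_value": t["revenue"] / t["rows"]}
    for store, t in totals.items()
}

for store, stats in summary.items():
    print(store, stats["rows"], round(stats["avg_value"]))
```

Confirm both assumptions against the data dictionary before trusting any real figures, since a per-line total in price_tzs would make this calculation overstate revenue.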

At this point, you know the data shape. You know there are seasonal patterns that could distort trend analysis. You know the stores have different profiles. You do not yet know about the cafe price change or the full story behind Kariakoo's different numbers -- those come from talking to Amina.

✓ Check

How many total transactions are in the dataset? How many unique stores? What is the date range? Can you identify any months with unusually high transaction volumes?