Learn by Directing AI
Unit 1

The Brief and the Environment

Step 1: Set Up the Project

Open your terminal and start Claude Code:

cd ~/dev
claude

Paste this prompt:

Create the folder ~/dev/cybersecurity/p2. Download the project materials from https://learnbydirectingai.dev/materials/cybersecurity/p2/materials.zip and extract them into that folder. Read CLAUDE.md -- it's the project governance file.

Claude creates the folder, downloads the materials, and reads the governance file. When it finishes, look at what's in materials/. You should see CLAUDE.md, docker-compose.yml, scope-document.md, ttp-selection.md, client-email.md, sigma-rule-starter.yml, report-template.md, and a vulnerable-app/ directory. These are your working inputs for the full assessment.
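
If you want to confirm the download yourself rather than take Claude's word for it, a short script can compare the folder against the list above. This is a minimal sketch, not part of the project materials. The function name missing_materials and the path ~/dev/cybersecurity/p2/materials are assumptions; adjust the path if the archive extracted somewhere else.

```python
from pathlib import Path

# Entries named in this step. "vulnerable-app" is a directory, the rest are files.
EXPECTED_MATERIALS = [
    "CLAUDE.md",
    "docker-compose.yml",
    "scope-document.md",
    "ttp-selection.md",
    "client-email.md",
    "sigma-rule-starter.yml",
    "report-template.md",
    "vulnerable-app",
]


def missing_materials(materials_dir):
    """Return the expected entries that are absent from materials_dir."""
    root = Path(materials_dir).expanduser()
    return [name for name in EXPECTED_MATERIALS if not (root / name).exists()]


if __name__ == "__main__":
    # Assumed location of the extracted materials; change it if yours differs.
    missing = missing_materials("~/dev/cybersecurity/p2/materials")
    print("All materials present" if not missing else f"Missing: {missing}")
```

An empty result means every input listed above is in place. Anything reported as missing is worth resolving before you move on to the lab environment.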

CLAUDE.md lists six tickets (T1 through T6), verification targets, and a commit convention. Each ticket maps to one phase of the assessment pipeline. When you direct Claude on a task, reference the ticket number so it stays anchored to the scope.

Step 2: Launch the Lab Environment

The docker-compose.yml defines six services: the main application on port 8080, a staging copy on port 8081, a MySQL database, and a monitoring stack (Grafana, Loki, and Alloy). This is a bigger lab than P1's single DVWA container.

Direct Claude to start the environment:

Run docker compose up -d using the docker-compose.yml in the materials folder. Wait for all containers to be healthy, then tell me which containers are running.

This takes a few minutes the first time as Docker pulls the images. Once Claude confirms the containers are running, open a browser and go to http://localhost:8080. You should see Gintaro Kelias -- an amber jewelry shop with product listings, prices in euros, a search bar, and links to customer login and product reviews. This is not DVWA. It looks like a real store because it is modeled on one.
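
To double-check Claude's report on container status, you can read the output of docker compose ps yourself. The sketch below assumes a Docker Compose v2 CLI whose JSON output includes the Name, State, and Health fields, and it must run from the folder that holds docker-compose.yml. The helper names parse_compose_ps, not_ready, and check_lab are made up for illustration.

```python
import json
import subprocess


def parse_compose_ps(output):
    """Parse `docker compose ps --format json` output into a list of dicts.

    Accepts either a single JSON array or one JSON object per line,
    since the shape differs between Compose versions.
    """
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def not_ready(containers):
    """Return names of containers that are not running or not healthy."""
    problems = []
    for c in containers:
        state = c.get("State", "")
        health = c.get("Health", "")  # assumed empty when no healthcheck is defined
        if state != "running" or health not in ("", "healthy"):
            problems.append(c.get("Name", "<unknown>"))
    return problems


def check_lab():
    """Run docker compose ps in the current folder and list containers that need attention."""
    result = subprocess.run(
        ["docker", "compose", "ps", "--all", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return not_ready(parse_compose_ps(result.stdout))
```

Calling check_lab() from the materials folder should return an empty list once all six services are up. A container that is still starting shows up in the list until its healthcheck passes.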

Now open a second tab and go to http://localhost:3000. This is Grafana. Navigate to the Explore view, select the Loki data source, and run a query like {container="gintaro-app"}. You should see HTTP request logs streaming in from the application container -- timestamps, request methods, URLs, status codes. This is the defender's view. Keep both tabs open.
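
The same query can also be sent to Loki directly through its HTTP API, which is useful later when you want log evidence outside the Grafana UI. The sketch below only builds the request URL. It assumes Loki listens on its default port 3100 and that the lab publishes that port to your machine, which the docker-compose.yml may not do. The function name loki_query_url is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: Loki's default port. Check docker-compose.yml for the real mapping.
LOKI_BASE = "http://localhost:3100"


def loki_query_url(logql, limit=20, base=LOKI_BASE):
    """Build a URL for Loki's query_range endpoint for the given LogQL query."""
    params = urlencode({"query": logql, "limit": limit})
    return f"{base}/loki/api/v1/query_range?{params}"


# Same label selector as the Grafana Explore query in this step.
print(loki_query_url('{container="gintaro-app"}'))
```

The braces and quotes in the selector have to be percent-encoded, which urlencode handles. If port 3100 is not published, stay with the Grafana Explore view.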

The staging site at http://localhost:8081 runs the same application code. The scope document authorizes testing on both, but the staging site is where you can test more aggressively without worrying about disrupting the main instance.

Step 3: Read the Client Email

Open materials/client-email.md. This is from Ruta Kazlauskiene, who runs Gintaro Kelias -- a family amber jewelry workshop in Klaipeda, Lithuania. Three generations of amber craft, four years of online sales, shipping to 15 countries.

Three weeks ago, one of Ruta's loyal customers in Germany received an email that looked exactly like it came from the shop. The right logo, the right colors, asking her to "verify her account details." The customer did not click, but she was upset. Ruta does not know whether someone copied her branding or actually got into her customer list.

Her nephew Tomas built the shop on WordPress with WooCommerce. He has not updated the plugins in six months. Christmas orders start in October. The shop cannot go offline, but Ruta also cannot sleep knowing there might be a hole in the system.

She needs the answer in clear language. She knows amber, jewelry design, and customer service. She does not know cybersecurity.

Everything you do in this assessment is for this person. The scans, the exploits, the detection rules, the report -- all of it answers Ruta's question: is my customers' data safe?

Step 4: Read the Scope Document

Open materials/scope-document.md. This defines the assessment boundary -- what you are authorized to test and what you are not.

The scope matters more this time because there are more targets. P1 had one application at one address. This assessment covers the main application at port 8080, the staging site at port 8081, and the MySQL database at port 3306. The monitoring infrastructure (Grafana, Loki, Alloy) is out of scope -- you use it for detection, not as an attack target. Stripe payment processing is also out of scope.

Read the authorized activities section. Notice that the scope lists specific input fields: search, product reviews, login forms, and order processing. These are the surfaces you will test. If Claude suggests probing something not listed here -- the host operating system, the monitoring containers, any external service -- that is out of scope regardless of how interesting it looks.

The scope also specifies a constraint: the shop must remain operational throughout testing. No denial-of-service tests, no destructive operations on the database. Ruta's customers are still shopping. This constraint shapes how you direct every tool.
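
One way to keep the boundary concrete is to write it down as an allowlist and check every target against it before a tool runs. This is a minimal sketch under stated assumptions: it treats the lab as running on your own machine, it only looks at host and port, and the name in_scope is made up for illustration. The scope document remains the authority.

```python
from urllib.parse import urlsplit

# In-scope ports from the scope summary above: main app, staging site, MySQL.
IN_SCOPE_PORTS = {8080, 8081, 3306}
# Assumption: the lab runs locally, so only these hosts qualify.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def in_scope(target):
    """Return True if target (a URL or host:port) is inside the assessment boundary."""
    if "://" not in target:
        target = "//" + target  # lets urlsplit parse a bare host:port
    parts = urlsplit(target)
    return parts.hostname in LOCAL_HOSTS and parts.port in IN_SCOPE_PORTS


for candidate in [
    "http://localhost:8080/",   # main application
    "http://localhost:8081/",   # staging site
    "localhost:3306",           # MySQL database
    "http://localhost:3000/",   # Grafana, monitoring only
]:
    print(candidate, "in scope" if in_scope(candidate) else "OUT OF SCOPE")
```

A check like this covers the where, not the how. It cannot tell you whether a given test would violate the constraint that the shop stays operational, so that judgment stays with you.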

Step 5: Read the TTP Selection

Open materials/ttp-selection.md. In P1, this document listed one attack type: SQL injection. This time it lists five.

The five are reflected cross-site scripting, stored cross-site scripting, command injection, credential testing, and SQL injection. Each entry names the target, the testing method, the tool, and the ATT&CK technique mapping. The priority guidance at the bottom ranks findings by business impact -- stored XSS affects every visitor to a compromised page, while reflected XSS requires a crafted link.

This is the biggest shift from P1. You are not looking for one thing. You are running a pipeline that tests for multiple vulnerability types, each requiring a different approach and producing a different signature in the logs. The sequence of your testing matters -- passive reconnaissance before active scanning, because what you learn passively shapes which active tests to run.

Read each vulnerability type and its testing method. You do not need to memorize the details. The document stays open throughout the assessment, and you will reference it when directing Claude through each exploitation phase.

✓ Check: The WooCommerce-style application loads in the browser at http://localhost:8080. Grafana at http://localhost:3000 shows log entries from the application container.