Learn by Directing AI

Step 1: Design the test strategy

In Unit 3 you tested the auth system manually -- curl requests as the wrong role, checking for 403 responses. That verified the system works. Automated tests verify it keeps working as the code changes.

Open materials/templates/test-strategy-template.md. The template asks a good question: which behaviors belong at which test layer?

Unit tests verify isolated logic. Password hashing correctness -- does bcrypt hash and compare correctly? Role permission checks -- does the permission function return true for doctor + clinical-notes and false for nurse + clinical-notes? These run fast, test one thing, and don't need a database or server.
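
The role permission check above can be sketched as a pure function. This is a minimal illustration, not the project's actual code: the name canAccess, the in-memory matrix, and the third "receptionist" role are all assumptions made for the example.

```typescript
// Hypothetical roles and resources; "receptionist" is an assumed third role.
type Role = "doctor" | "nurse" | "receptionist";
type Resource = "clinical-notes" | "care-plan" | "contact";

// Assumed access matrix: which roles may read each resource.
const ACCESS: Record<Resource, readonly Role[]> = {
  "clinical-notes": ["doctor"],
  "care-plan": ["doctor", "nurse"],
  "contact": ["doctor", "nurse", "receptionist"],
};

// Pure function: no database and no server, so a unit test can call it directly.
function canAccess(role: Role, resource: Resource): boolean {
  return ACCESS[resource].includes(role);
}
```

In a Vitest unit test, each cell of the matrix becomes one assertion, for example expect(canAccess("nurse", "clinical-notes")).toBe(false).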

Integration tests verify that components work together with realistic inputs. A request to /api/patients/:id/clinical-notes with a doctor's session returns 200 with the data. The same request with a nurse's session returns 403. The same request with no session returns 401. These tests hit real API routes with real middleware but may use a test database.

E2E tests verify complete user journeys through the entire stack. A user navigates to the login page, types credentials, clicks submit, arrives at the role-appropriate dashboard, clicks a patient record, and sees the correct data. These tests run in a real browser against the full application.

Each layer has a cost. Unit tests are fast and precise but catch nothing about integration. E2E tests catch integration failures but are slow, harder to debug, and more brittle. Integration tests sit in the middle -- fast enough to run frequently, broad enough to catch real failures. The test strategy is the decision about where to invest.

Direct Claude to produce a test strategy document from the template. Specify the auth-specific scenarios: correct role, wrong role, unauthenticated, expired session, invalid credentials.

Open materials/guides/auth-guide.md and review the Common Auth Pitfalls section. Several of these pitfalls -- client-side enforcement, missing rate limiting, session storage in localStorage -- are things your tests should catch.

Step 2: Write integration tests for protected routes

Start with integration tests. These are the automated versions of the adversarial tests you ran manually in Unit 3.

Direct Claude to write integration tests for the protected API routes. Specify the test scenarios explicitly:

Write integration tests for the patient portal's protected API routes. Test three scenarios for each protected endpoint: authenticated with the correct role (expect 200), authenticated with the wrong role (expect 403), and unauthenticated (expect 401). Use Vitest. Test at minimum: /api/patients/:id/clinical-notes (doctor-only), /api/patients/:id/care-plan (doctor and nurse), /api/patients/:id/contact (all roles).

Watch what Claude generates. If the tests only cover the happy path -- correct role, authenticated, expect 200 -- they prove nothing about security. The adversarial tests are the ones that matter. A test suite where every test passes because every test uses a doctor account is a test suite that never verifies authorization.
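
One way to make the adversarial cases hard to leave out is to generate the scenario list from the access rules instead of writing each test by hand. The sketch below assumes three roles (the "receptionist" role is hypothetical) and a helper named buildScenarios that does not exist in the project; the endpoints and allowed roles come from the prompt above.

```typescript
// Hypothetical role names; the third role stands in for "all roles".
type Role = "doctor" | "nurse" | "receptionist";
const ALL_ROLES: readonly Role[] = ["doctor", "nurse", "receptionist"];

interface Scenario {
  endpoint: string;
  role: Role | null; // null means the request carries no session
  expectedStatus: 200 | 401 | 403;
}

// Allowed roles per endpoint, as listed in the prompt.
const PROTECTED_ROUTES: Record<string, readonly Role[]> = {
  "/api/patients/:id/clinical-notes": ["doctor"],
  "/api/patients/:id/care-plan": ["doctor", "nurse"],
  "/api/patients/:id/contact": ALL_ROLES,
};

// Expand the rules into one scenario per (endpoint, role) pair,
// plus one unauthenticated scenario per endpoint.
function buildScenarios(): Scenario[] {
  const scenarios: Scenario[] = [];
  for (const [endpoint, allowed] of Object.entries(PROTECTED_ROUTES)) {
    for (const role of ALL_ROLES) {
      scenarios.push({
        endpoint,
        role,
        expectedStatus: allowed.includes(role) ? 200 : 403,
      });
    }
    scenarios.push({ endpoint, role: null, expectedStatus: 401 });
  }
  return scenarios;
}
```

A list like this can be passed to Vitest's it.each so that every row runs as its own test against the real route. Note that the contact endpoint produces no 403 row, because no role is the wrong role for it.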

Consider what's mocked and what's real. Every mock is a claim that the real thing would behave the same way. If you mock the auth middleware and it always returns "authorised," your tests pass even when the real middleware is broken. For auth tests, the middleware should be real. The database can be a test database.
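
The cost of mocking the middleware can be shown with a small, self-contained sketch. Everything here is hypothetical: buggyRealCheck stands in for a real middleware with an authorization bug, and clinicalNotesStatus stands in for a doctor-only route handler.

```typescript
type Session = { role: string } | null;
type Status = 200 | 401 | 403;
// An auth check decides the status for a session and a list of allowed roles.
type AuthCheck = (session: Session, allowedRoles: string[]) => Status;

// Hypothetical "real" middleware with a bug: it never compares the role.
const buggyRealCheck: AuthCheck = (session) => (session ? 200 : 401);

// A mock that always authorises.
const alwaysAuthorisedMock: AuthCheck = () => 200;

// Hypothetical doctor-only handler that delegates to whichever check it is given.
function clinicalNotesStatus(session: Session, check: AuthCheck): Status {
  return check(session, ["doctor"]);
}

const doctor: Session = { role: "doctor" };
const nurse: Session = { role: "nurse" };

// Happy-path test against the mock: passes, and says nothing about the real check.
const happyPathWithMock = clinicalNotesStatus(doctor, alwaysAuthorisedMock) === 200;

// Adversarial test against the real check: fails, which exposes the bug.
const nurseBlockedByReal = clinicalNotesStatus(nurse, buggyRealCheck) === 403;

console.log({ happyPathWithMock, nurseBlockedByReal });
// { happyPathWithMock: true, nurseBlockedByReal: false }
```

Only the second check touches the code that enforces access, so only the second check can detect that it is broken.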

Step 3: Set up Playwright

Playwright is a browser automation framework. It launches a real browser, fills in forms, clicks buttons, and asserts what appears on the page. It tests the application the way a user experiences it -- from the browser.

Install Playwright and configure it for the project. Direct Claude to set up Playwright with a specific constraint:

Set up Playwright for E2E testing. Configure it to run against the local development server. Use Playwright's built-in auto-waiting -- do not use page.waitForTimeout() or hardcoded delays. Use role-based locators (getByRole, getByLabel, getByText) instead of CSS selectors where possible.

AI commonly generates E2E tests with page.waitForTimeout(3000) scattered throughout -- pausing for three seconds and hoping the page has loaded. This produces tests that pass on fast machines and fail on slow ones, and that waste time on fast machines even when they pass. Playwright's auto-waiting mechanism waits for elements to be visible, enabled, and stable before interacting with them. The wait is tied to a condition rather than a clock, so it ends as soon as the page is ready. Hardcoded waits are guesses.
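
A configuration along these lines would satisfy the prompt's first constraint. Treat it as a sketch: the e2e directory, the npm run dev script, and port 3000 are assumptions about the project, not details taken from it.

```typescript
// playwright.config.ts -- a sketch; directory, script name, and port are assumptions.
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./e2e", // assumed location of the E2E specs
  use: {
    // Lets tests call page.goto("/login") with a relative path.
    baseURL: "http://localhost:3000", // assumed dev server address
  },
  // Start the dev server before the tests, or reuse one already running locally.
  webServer: {
    command: "npm run dev", // assumed script name
    url: "http://localhost:3000",
    reuseExistingServer: !process.env.CI,
  },
  projects: [
    { name: "chromium", use: { ...devices["Desktop Chrome"] } },
  ],
});
```

With webServer set, Playwright waits for the URL to respond before running any test, which removes one common source of "page not loaded yet" failures.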

Write the first E2E test: the complete login flow. A user navigates to the login page, enters their email and password, submits the form, and arrives at the role-appropriate dashboard.
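
A first test for that flow might look like the sketch below. The /login and /dashboard paths, the field labels, the button text, the heading, and the credentials are all assumed for illustration; substitute whatever the application actually renders.

```typescript
import { test, expect } from "@playwright/test";

test("doctor logs in and lands on the doctor dashboard", async ({ page }) => {
  // Relative URL; assumes baseURL is set in the Playwright config.
  await page.goto("/login");

  // Role-based locators; the label and button text are assumed.
  await page.getByLabel("Email").fill("doctor@example.test");
  await page.getByLabel("Password").fill("correct-password");
  await page.getByRole("button", { name: "Sign in" }).click();

  // Web-first assertions retry until they pass or time out,
  // so no page.waitForTimeout() is needed.
  await expect(page).toHaveURL(/\/dashboard/);
  await expect(
    page.getByRole("heading", { name: "Doctor dashboard" })
  ).toBeVisible();
});
```

The final assertion checks role-specific content, not just the URL, so the test fails if a doctor is sent to the wrong dashboard.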

Step 4: Write E2E tests for auth flows

With Playwright configured and the first test passing, write the auth-specific E2E tests. Four flows:

Login with valid credentials. Enter correct email and password. Submit. Arrive at the dashboard. The test should verify the dashboard content matches the user's role.

Login with invalid credentials. Enter wrong password. Submit. An error message appears on the login page. The test should verify the error message is visible and the user stays on the login page.

Access protected page without authentication. Navigate directly to a protected route (like /patients) without logging in. The user should be redirected to the login page.

Role-based access denial. Log in as a nurse. Navigate to a doctor-only page (like /patients/:id/clinical-notes). An access denied message should appear, not the clinical notes data.

Each of these tests exercises a different aspect of the auth system. Together, they verify that the complete auth flow works from the user's perspective -- not just at the API level (integration tests) or the logic level (unit tests), but through the full stack with a real browser.
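
The three failure-path flows can be sketched as follows. As before, the paths, labels, messages, credentials, and the patient id 123 are placeholders, and the use of an element with the alert role for the login error is an assumption about how the page reports errors.

```typescript
import { test, expect } from "@playwright/test";

test("invalid password shows an error and stays on login", async ({ page }) => {
  await page.goto("/login");
  await page.getByLabel("Email").fill("doctor@example.test");
  await page.getByLabel("Password").fill("wrong-password");
  await page.getByRole("button", { name: "Sign in" }).click();

  await expect(page.getByRole("alert")).toBeVisible(); // assumed error element
  await expect(page).toHaveURL(/\/login/);
});

test("unauthenticated visit to /patients redirects to login", async ({ page }) => {
  await page.goto("/patients");
  await expect(page).toHaveURL(/\/login/);
});

test("nurse is denied on a doctor-only page", async ({ page }) => {
  await page.goto("/login");
  await page.getByLabel("Email").fill("nurse@example.test");
  await page.getByLabel("Password").fill("correct-password");
  await page.getByRole("button", { name: "Sign in" }).click();
  await expect(page).toHaveURL(/\/dashboard/);

  await page.goto("/patients/123/clinical-notes"); // 123 is a placeholder id
  await expect(page.getByText("Access denied")).toBeVisible();
  // The notes themselves must not be rendered at all.
  await expect(
    page.getByRole("heading", { name: "Clinical notes" })
  ).toHaveCount(0);
});
```

The last test asserts both halves of the requirement: the denial message is present and the protected content is absent.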

Step 5: Address test flakiness

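Run the full test suite multiple times. For the Playwright tests, npx playwright test --repeat-each=5 runs every test five times in one invocation. If any test sometimes passes and sometimes fails, that's flakiness -- and flakiness is a signal, not randomness.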

Common causes of flaky auth tests:

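Timing assumptions. The test expects an element to appear immediately after form submission. On a fast machine, it does. On a slow machine or under load, it doesn't. Playwright's auto-waiting handles most of this for actions, and its web-first assertions (such as expect(locator).toBeVisible()) retry until the condition holds or a timeout expires. Custom assertions that read a value once and compare it do not retry, so they may need to wait explicitly for a specific element or state first.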

Shared state between tests. Test A creates a user. Test B expects that user to exist. Test C deletes that user. Run them in a different order and they break. Each test should set up its own state and clean up after itself.
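
One way to avoid shared state is to give every test its own user. The factory below is a minimal sketch with hypothetical names; it only builds the data and leaves creating and deleting the record to whatever setup code the project uses.

```typescript
interface TestUser {
  email: string;
  password: string;
  role: string;
}

// Module-level counter so two users built in the same millisecond still differ.
let sequence = 0;

// Build a user that no other test shares; nothing is persisted here.
function makeTestUser(role: string): TestUser {
  sequence += 1;
  return {
    email: `${role}-${Date.now()}-${sequence}@example.test`,
    password: `test-password-${sequence}`,
    role,
  };
}
```

Calling makeTestUser in a beforeEach hook and removing the user in afterEach (both Vitest and Playwright provide these hooks) keeps each test independent of execution order.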

Network dependencies. If the test relies on an external service (a real auth provider, an external API), network latency or downtime causes intermittent failures.

When the test suite is stable, run a cross-check. Direct a second AI model to review your test coverage against the access control documentation. Ask it: "Given this access control matrix and these test files, which permissions are tested and which are not?" A fresh perspective catches gaps that familiarity normalises.
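
Part of that question can also be checked mechanically. The sketch below assumes you can list the access control matrix and the permissions your tests exercise as plain data; the function name findUntested and the sample entries are illustrative only.

```typescript
type Permission = { role: string; resource: string };

// Stable string key so permissions can be compared by value.
const permissionKey = (p: Permission): string => `${p.role}:${p.resource}`;

// Return every permission in the matrix that no test exercises.
function findUntested(matrix: Permission[], tested: Permission[]): Permission[] {
  const covered = new Set(tested.map(permissionKey));
  return matrix.filter((p) => !covered.has(permissionKey(p)));
}

// Illustrative data only.
const matrix: Permission[] = [
  { role: "doctor", resource: "clinical-notes" },
  { role: "nurse", resource: "clinical-notes" },
  { role: "nurse", resource: "care-plan" },
];
const tested: Permission[] = [{ role: "doctor", resource: "clinical-notes" }];

const gaps = findUntested(matrix, tested);
console.log(gaps.map(permissionKey));
// [ 'nurse:clinical-notes', 'nurse:care-plan' ]
```

A set difference like this finds permissions with no test at all. It cannot judge whether an existing test asserts the right thing, which is where the second reviewer still helps.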

✓ Check: Run the full test suite. All integration tests for protected routes pass, including the adversarial tests (wrong role returns 403, unauthenticated returns 401). All Playwright E2E tests pass, including login, invalid login, redirect of unauthenticated access, and role-based access denial.

Testing the Auth System
