Learn by Directing AI
Unit 6

Error Tracking and Observability

Step 1: Set up error tracking

The portal works. The tests pass. But in production, things break that testing doesn't predict -- sessions expire mid-use, network requests fail, database connections drop during the Jarabacoa rainy season. You need to know when these things happen, not discover them when Lucia calls to say the system is down.

Error tracking (Sentry or an equivalent service) is not a replacement for logging. It's a categorisation layer. It groups identical exceptions, counts their frequency, captures stack traces with environment context, and alerts on new error types. A single "AuthenticationError: Invalid credentials" entry with 47 occurrences in the past hour shows at a glance that one problem is recurring, which 47 individual log lines buried in a file do not.
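To make the grouping idea concrete, here is a minimal sketch of what a categorisation layer does with raw error events. It is a simplification, not how Sentry works internally: real services use richer signals such as the stack trace. The fingerprint() and group_errors() helpers and the event shape are assumptions made for illustration.

```python
from collections import Counter


def fingerprint(error_type: str, message: str) -> str:
    """Identical exceptions share one fingerprint (here: type plus message)."""
    return f"{error_type}: {message}"


def group_errors(events):
    """Collapse raw error events into issues with occurrence counts."""
    return Counter(fingerprint(e["type"], e["message"]) for e in events)


# 47 identical failures plus one unrelated error collapse into two issues.
events = [{"type": "AuthenticationError", "message": "Invalid credentials"}] * 47
events.append({"type": "TimeoutError", "message": "Database connection dropped"})
issues = group_errors(events)
```

Reading issues gives one entry per error type with its count, which is the view an error tracking dashboard is built around.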

Direct Claude to set up Sentry (or an equivalent) and configure it to capture unhandled exceptions. Treat auth failures as their own category: failed logins, expired sessions, and authorization denials each produce a different error, and Sentry should group each one as a separate issue.
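One way to keep the three auth failures apart is to give each its own exception class, so the tracker never sees them as the same error. The sketch below assumes hypothetical class names and a hypothetical error_category() helper. How the resulting key is attached to an event (for example as a tag or a custom fingerprint) depends on the SDK you use and is not shown here.

```python
class AuthError(Exception):
    """Base class for authentication and authorization failures."""
    category = "auth.unknown"


class LoginFailed(AuthError):
    category = "auth.login_failed"


class SessionExpired(AuthError):
    category = "auth.session_expired"


class AuthorizationDenied(AuthError):
    category = "auth.authorization_denied"


def error_category(exc: BaseException) -> str:
    """Return a grouping key that an error tracker could use for this exception."""
    if isinstance(exc, AuthError):
        return exc.category
    # Anything that is not an auth error is grouped by its exception type.
    return f"unhandled.{type(exc).__name__}"
```

Because each class carries its own category, a spike in expired sessions shows up separately from a spike in wrong passwords.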

Step 2: Implement auth event logging

Beyond error tracking, every auth event should be logged with structured context. Not just "login failed" but who, when, from where, and what they were trying to access.

Direct Claude to implement structured logging for auth events:

  • Login success: user ID, role, IP address, timestamp
  • Login failure: attempted email, IP address, timestamp
  • Authorization check: user ID, role, route accessed, permitted or denied, timestamp
  • Session creation and destruction: user ID, session ID, timestamp
  • Patient record access: user ID, patient ID, role, route, timestamp

Each log entry should be a structured JSON object, not a plain text string. Structured logs are searchable, filterable, and parseable. A log that says "Error occurred" tells you nothing. A log that says {"action":"login_failure","attempted_email":"unknown@example.com","ip":"192.168.1.50","timestamp":"2026-04-11T14:31:00Z"} tells you exactly what happened, when, and from where.
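A minimal sketch of structured auth logging, using only Python's standard logging and json modules. The logger name "portal.auth" and the helpers build_auth_logger() and log_auth_event() are assumptions for illustration. The portal's actual stack may use a different language or logging library.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "action": record.getMessage(),
        }
        # Merge the auth-specific context passed through `extra`.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


def build_auth_logger(stream=None):
    """Create a logger that writes JSON lines (to stderr when stream is None)."""
    logger = logging.getLogger("portal.auth")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_auth_event(logger, action, **context):
    """Log one auth event; keyword arguments become fields in the JSON entry."""
    logger.info(action, extra={"context": context})
```

A failed login would then be recorded with something like log_auth_event(logger, "login_failure", attempted_email="unknown@example.com", ip="192.168.1.50"), and the timestamp is added by the formatter.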

Auth failures in production look different from auth failures in development. In development, tokens rarely get the chance to expire because work sessions are short. OAuth callback URLs work because everything runs on localhost. Sessions don't race because there is only one user. In production, a nurse at the Constanza clinic might leave the portal open all morning, and her session expires while she's with a patient. When she comes back and clicks, she sees -- what? That depends on whether your error handling catches the expired session and redirects to login, or shows stale data, or crashes.
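The expired-session case can be sketched as a single check that runs before any protected route. This is an illustration under stated assumptions: the 30-minute idle timeout, the "/login" URL, the Session class, and check_session() are all hypothetical, and a real portal would wire this into its framework's middleware.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

SESSION_TTL = timedelta(minutes=30)  # assumed idle timeout
LOGIN_URL = "/login"                 # hypothetical route


@dataclass
class Session:
    user_id: str
    last_seen: datetime


def check_session(session, now):
    """Return ("ok", None) for a live session, ("redirect", url) otherwise."""
    if session is None or now - session.last_seen > SESSION_TTL:
        # Missing or expired: send the user to login instead of showing stale data.
        return ("redirect", LOGIN_URL + "?reason=session_expired")
    session.last_seen = now  # sliding expiry: activity extends the session
    return ("ok", None)


morning = datetime(2026, 4, 11, 8, 0, tzinfo=timezone.utc)
nurse = Session(user_id="nurse-1", last_seen=morning)
outcome = check_session(nurse, morning + timedelta(hours=3))
```

With the values above, the nurse who returns three hours later is redirected to login rather than seeing stale data or a crash.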

Step 3: Review the error tracking dashboard

Trigger some auth errors deliberately. Try to log in with wrong credentials. Access a protected route with an expired session. Make a request as the wrong role.

Then check the Sentry dashboard. Each error type should appear as a distinct issue with its own count, timeline, and context tags.

Check the application logs. The failed login attempt should appear with the structured context you specified -- IP address, timestamp, attempted email. The authorization denial should show the user ID, role, and route.
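Checking the logs by eye works for one or two entries. A small script can confirm that each entry carries the context it is supposed to. The sketch below assumes the field names user_id, role, route, and permitted (only action, attempted_email, ip, and timestamp appear in the example log entry from Step 2), and missing_fields() is a hypothetical helper.

```python
import json

# Required context per auth event. Field names other than "action",
# "attempted_email", "ip", and "timestamp" are assumptions.
REQUIRED_FIELDS = {
    "login_success": {"user_id", "role", "ip", "timestamp"},
    "login_failure": {"attempted_email", "ip", "timestamp"},
    "authorization_check": {"user_id", "role", "route", "permitted", "timestamp"},
}


def missing_fields(log_line: str) -> set:
    """Return the required fields that are absent from one JSON log line."""
    entry = json.loads(log_line)
    required = REQUIRED_FIELDS.get(entry.get("action"), set())
    return required - set(entry)
```

An empty set means the entry has everything it needs. Anything else names the context that was dropped.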

The difference between useful observability and useless noise is context. A dashboard full of "Error: something went wrong" entries is useless. A dashboard that shows "3 failed login attempts from the same IP in the last 5 minutes" is actionable.
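The "3 failed login attempts from the same IP in the last 5 minutes" signal can be derived directly from structured log entries. A minimal sketch, assuming entries shaped like the JSON example in Step 2. The function name suspicious_ips() and its default window and threshold are illustrative choices.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def suspicious_ips(entries, now, window=timedelta(minutes=5), threshold=3):
    """Return {ip: count} for IPs with at least `threshold` failed logins in the window."""
    counts = defaultdict(int)
    for entry in entries:
        if entry["action"] != "login_failure":
            continue
        # Accept the trailing "Z" used in the log example as UTC.
        ts = datetime.fromisoformat(entry["timestamp"].replace("Z", "+00:00"))
        if timedelta(0) <= now - ts <= window:
            counts[entry["ip"]] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}
```

Here now must be a timezone-aware datetime, for example datetime.now(timezone.utc). This kind of query is only possible because the entries are structured. The same question asked of plain text lines would need fragile string matching.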

Step 4: Build an audit trail

Lucia hasn't thought about audit logs. But she should have.

Every time someone accesses a patient record, the system should record who, when, and what they accessed. This is not about distrusting the staff. It's about accountability in a medical context. If a patient asks "who has been looking at my records," Lucia needs to answer that question. If a provider accesses records they shouldn't have, the audit trail shows it. Dominican health regulations require retention of clinical records -- the access history is part of that obligation.

Direct Claude to implement a basic audit trail: a database table that logs every patient record access. User ID, patient ID, action (viewed, created, updated), timestamp. The middleware that checks authorization can also write the audit entry -- the two concerns live at the same enforcement point.
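A minimal sketch of that idea, using SQLite from Python's standard library so it runs anywhere. The table name audit_log, the PERMISSIONS policy, the role names, and the three functions are assumptions for illustration. The portal's real database and authorization rules will differ.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical role-to-action policy, for illustration only.
PERMISSIONS = {
    "provider": {"viewed", "created", "updated"},
    "receptionist": {"viewed"},
}


def init_audit_table(conn):
    """Create the audit table with the four fields described above."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS audit_log (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               user_id TEXT NOT NULL,
               patient_id TEXT NOT NULL,
               action TEXT NOT NULL,
               timestamp TEXT NOT NULL
           )"""
    )


def authorize_and_audit(conn, user_id, role, patient_id, action):
    """Check authorization and, if access is permitted, record it in the audit trail."""
    permitted = action in PERMISSIONS.get(role, set())
    if permitted:
        conn.execute(
            "INSERT INTO audit_log (user_id, patient_id, action, timestamp) "
            "VALUES (?, ?, ?, ?)",
            (user_id, patient_id, action, datetime.now(timezone.utc).isoformat()),
        )
        conn.commit()
    return permitted


def who_accessed(conn, patient_id):
    """Answer "who has been looking at my records" for one patient."""
    rows = conn.execute(
        "SELECT user_id, action, timestamp FROM audit_log "
        "WHERE patient_id = ? ORDER BY id",
        (patient_id,),
    )
    return rows.fetchall()
```

Putting the check and the audit write in one function means a record cannot be accessed through this path without leaving a trace. Denied attempts are not written here. In this sketch they are assumed to be covered by the authorization-check log entries from Step 2.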

When the audit trail is working, explain the concept to Lucia. Frame it in terms she'll understand: "Every time someone looks at a patient record, the system records who, when, and what they accessed. Not because you don't trust your staff, but because if there's ever a question about who saw what, you can answer it."

She hadn't thought about it, but she sees the value immediately. In a small clinic network where everyone knows each other, trust is implicit. But accountability protects both the patients and the staff.

✓ Check: Trigger a failed login attempt. Check Sentry (or equivalent) -- the error appears with context (IP address, timestamp, attempted email). Check the application logs -- the failed attempt is logged with auth-specific context.