
Debug Like a Detective: bmad-investigate's Evidence-Graded Investigation

[Image: detective evidence board with three tiers of colored cards in a dark cinematic room]
New here? What is BMAD?
BMAD Method is a free, open-source AI workflow framework for Claude Code. It provides structured AI-powered tools: multi-agent code review, brainstorming facilitation, custom AI personas, and the subject of this post, a forensic debugging skill that enforces evidence discipline. You install it with a single command and invoke its tools via slash commands in Claude Code.

The typical debugging approach is to read the error, form a hunch, then go looking for evidence that confirms it. That is not investigation — that is confirmation bias with a keyboard. BMAD's bmad-investigate enforces the opposite discipline: every claim gets graded, every inference must trace back to something directly observed, and every hypothesis stays on record even after it turns out to be wrong.


The three evidence tiers

Every finding in a bmad-investigate case file carries one of three grades. The grade is not optional — it is the whole point.

Confirmed: Directly observed. Cited with a path:line reference, log timestamp, or commit hash. If you cannot cite it, it is not Confirmed.
Deduced: Logically follows from Confirmed evidence. The reasoning chain must be shown explicitly ("A is Confirmed, B follows from A because..."), with no jumps allowed.
Hypothesized: Plausible but unconfirmed. Every hypothesis must state what would confirm it and what would refute it. Hypotheses are never deleted, even when proven wrong; they become part of the forensic record.

The workflow anchors in Confirmed and expands outward. You do not start from a hunch and hunt for backup — you start from what you can see and reason forward. The grade system makes it impossible to quietly slip from "I observed X" into "therefore Y must be the bug" without labeling the leap.
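The grading rules above can be sketched as a small data model. This is a minimal illustration, not how bmad-investigate works internally (the skill is prompt-based): the `Finding` class, its field names, and the `validate` function are all assumptions made for this example, and the file path in the usage lines is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Grade(Enum):
    CONFIRMED = "Confirmed"
    DEDUCED = "Deduced"
    HYPOTHESIZED = "Hypothesized"


@dataclass
class Finding:
    # Hypothetical structure for illustration; the real case file is markdown.
    claim: str
    grade: Grade
    citation: str = ""      # path:line, log timestamp, or commit hash
    reasoning: str = ""     # explicit chain, required for Deduced
    based_on: list["Finding"] = field(default_factory=list)
    confirm_if: str = ""    # what would confirm a hypothesis
    refute_if: str = ""     # what would refute a hypothesis


def traces_to_confirmed(finding: Finding) -> bool:
    """True if every supporting branch ends in cited Confirmed evidence."""
    if finding.grade is Grade.CONFIRMED:
        return bool(finding.citation)
    if finding.grade is Grade.DEDUCED:
        return bool(finding.based_on) and all(
            traces_to_confirmed(f) for f in finding.based_on
        )
    return False  # a hypothesis cannot support a deduction


def validate(finding: Finding) -> list[str]:
    """Return rule violations; an empty list means the grade is earned."""
    problems = []
    if finding.grade is Grade.CONFIRMED and not finding.citation:
        problems.append("Confirmed needs a citation")
    if finding.grade is Grade.DEDUCED:
        if not finding.reasoning:
            problems.append("Deduced needs an explicit reasoning chain")
        if not traces_to_confirmed(finding):
            problems.append("Deduced must trace back to Confirmed evidence")
    if finding.grade is Grade.HYPOTHESIZED:
        if not (finding.confirm_if and finding.refute_if):
            problems.append("Hypothesized needs confirm and refute criteria")
    return problems


# Usage: the path below is a made-up example, not from a real project.
observed = Finding(
    "getUserProfile reads a property of a null value",
    Grade.CONFIRMED,
    citation="src/users/profile.js:47",
)
leap = Finding("The cache returns null on a miss", Grade.DEDUCED)
print(validate(observed))  # no violations
print(validate(leap))      # flags the missing reasoning chain and missing support
```

The point of the sketch is that a grade is a claim about provenance that can be checked, so an unlabeled leap shows up as a violation rather than passing silently.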


Launch — three example invocations

Pass anything specific enough to start from: an error message, a code path, a ticket ID, a log file, or the path to an existing case file from a prior session.

Error message (stack trace, line number, or any error text):
/bmad-investigate TypeError: Cannot read properties of null at getUserProfile:47
Code area exploration (no symptom required; builds a mental model of the area):
/bmad-investigate src/payments/
Resume a prior case (loads open hypotheses and backlog, asks which thread to pull):
/bmad-investigate investigations/proj-4821-investigation.md

Other valid inputs: ticket IDs, log file paths, diagnostic archives, or a plain problem description in natural language. The skill calibrates its approach based on what you hand it — defect-chasing when there is a symptom, area-exploration when there is not.
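To make the calibration concrete, here is a rough sketch of how the three example invocations above map to modes. The real skill is prompt-based and decides this through the model's reading of the input, not through string rules; the `guess_mode` function and its heuristics are assumptions for illustration only, and they ignore inputs such as ticket IDs and log files.

```python
from enum import Enum


class Mode(Enum):
    RESUME = "resume prior case"
    AREA_EXPLORATION = "area-exploration"
    DEFECT_CHASING = "defect-chasing"


def guess_mode(argument: str) -> Mode:
    """Crude heuristic (assumed, not the skill's logic) for the three examples."""
    text = argument.strip()
    # An existing case file follows the {slug}-investigation.md naming pattern.
    if text.endswith("-investigation.md"):
        return Mode.RESUME
    # A single token ending in "/" is treated as a directory to explore.
    if " " not in text and text.endswith("/"):
        return Mode.AREA_EXPLORATION
    # Anything else is treated as a symptom to chase.
    return Mode.DEFECT_CHASING


for arg in (
    "TypeError: Cannot read properties of null at getUserProfile:47",
    "src/payments/",
    "investigations/proj-4821-investigation.md",
):
    print(guess_mode(arg).value)
```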


What the case file looks like

Every investigation produces a structured case file. It is a living document, not a summary — designed to be picked up mid-investigation or handed off to another developer. The most forensically interesting sections:

Evidence Inventory: A graded list of every piece of evidence collected so far. Each item is tagged Confirmed, Deduced, or Hypothesized. Path:line citations use CWD-relative format (no leading slash) so they are clickable in IDE-embedded terminals.
Timeline of Events: Timestamped, confidence-graded entries for everything that happened in the code history, in the logs, and in the incident. Tracks what changed when, and how certain each entry is.
Hypothesized Paths (with confirm/refute criteria): Every hypothesis written out with two things: what evidence would confirm it, and what evidence would refute it. This is the enforcement mechanism; a hypothesis without refutation criteria is just a hunch with nicer formatting. Entries are never deleted, even when proven wrong. Wrong hypotheses document the shape of the investigation.
Deduced Conclusions: Reasoning chains built from Confirmed evidence. Each step must be explicit: "A is Confirmed at path:line. B follows from A because of C. Therefore D." No silent jumps from observation to conclusion.
Source Code Trace: The actual code paths traced during investigation, with line-level citations. When you come back to this case a week later, you can see exactly which functions were walked and in what order.
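The CWD-relative citation format used in the Evidence Inventory is easy to show in code. The sketch below is an assumption about what "CWD-relative, no leading slash" means in practice; the `cite` helper and the example paths are hypothetical, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath


def cite(path: str, line: int, cwd: str) -> str:
    """Format a path:line citation relative to the working directory."""
    p = PurePosixPath(path)
    if p.is_absolute():
        # Raises ValueError if the file lies outside the working directory.
        p = p.relative_to(PurePosixPath(cwd))
    return f"{p}:{line}"


# Example paths are made up for illustration.
print(cite("/repo/src/payments/charge.py", 47, cwd="/repo"))
# src/payments/charge.py:47
```

A citation without the leading slash resolves against the directory the terminal was opened in, which is why an IDE-embedded terminal can turn it into a link.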

The file also includes: Hand-off Brief (3-sentence summary), Case Info metadata, Problem Statement, Investigation Backlog, Confirmed Findings, Missing Evidence, Conclusion with confidence level, Recommended Next Steps, Side Findings, and Follow-up blocks for same-day reentry.
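The Hypothesized Paths rules (criteria are mandatory, entries are never deleted) amount to an append-only log in which a hypothesis changes status instead of disappearing. A minimal sketch of that idea follows; the `HypothesisLog` class and its status names are assumptions for illustration, since the actual case file is a markdown document maintained by the skill.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    statement: str
    confirm_if: str
    refute_if: str
    status: str = "open"   # "open", "confirmed", or "refuted"
    resolution: str = ""   # the evidence that settled it


class HypothesisLog:
    """Append-only: there is deliberately no delete method."""

    def __init__(self) -> None:
        self._entries: list[Hypothesis] = []

    def add(self, statement: str, confirm_if: str, refute_if: str) -> Hypothesis:
        if not confirm_if or not refute_if:
            raise ValueError("a hypothesis needs both confirm and refute criteria")
        entry = Hypothesis(statement, confirm_if, refute_if)
        self._entries.append(entry)
        return entry

    def resolve(self, entry: Hypothesis, status: str, evidence: str) -> None:
        if status not in ("confirmed", "refuted"):
            raise ValueError("status must be 'confirmed' or 'refuted'")
        entry.status = status
        entry.resolution = evidence

    def open_entries(self) -> list[Hypothesis]:
        return [e for e in self._entries if e.status == "open"]

    def all_entries(self) -> list[Hypothesis]:
        return list(self._entries)
```

Refuting an entry with `resolve` keeps it in `all_entries()` alongside the evidence that ruled it out, which is the property the case file relies on.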


Not just for bugs

The skill handles two distinct modes with the same discipline:

🔍 Mode 1: Defect-chasing (symptom-driven)
You have an error, a crash, or an incident. The skill traces the symptom backward through evidence tiers until it hits a Confirmed root cause, or clearly marks where the chain breaks.
🗺️ Mode 2: Area-exploration (no symptom)
You're about to work on an unfamiliar codebase section and want a mental model before touching it. Pass a directory path. The skill maps the code, grades what it knows about behavior and assumptions, and surfaces what is unclear.

Same evidence tiers, same case file structure, same anti-confirmation-bias discipline. The output is a navigable map of what is actually known vs. what is assumed — useful before writing a single line of new code.


Resuming prior cases

Passing a path to an existing case file is a first-class workflow called Outcome 0. The skill loads the file, surfaces all open hypotheses that were never resolved, pulls up the Investigation Backlog, and asks which thread to pull next.

New evidence from the session appends as a ## Follow-up: {YYYY-MM-DD} block at the bottom of the file. The original case is never rewritten — the investigation history stays intact. If you had five hypotheses last week and three turned out to be wrong, those are still in the file, labeled as refuted, so you can see what you ruled out and why.
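The append-only behavior described above can be illustrated with a few lines of file handling. This is a sketch of the idea, not the skill's implementation: the `append_follow_up` function is hypothetical, and the only detail taken from the skill is the `## Follow-up: {YYYY-MM-DD}` heading format.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def append_follow_up(case_file: Path, notes: str, today: Optional[date] = None) -> None:
    """Add a dated follow-up block at the end of an existing case file."""
    stamp = (today or date.today()).isoformat()  # YYYY-MM-DD
    block = f"\n## Follow-up: {stamp}\n\n{notes.rstrip()}\n"
    # Append mode leaves everything already in the file untouched.
    with case_file.open("a", encoding="utf-8") as handle:
        handle.write(block)
```

Opening the file in append mode is the design choice that matters here: earlier findings and refuted hypotheses cannot be overwritten by a later session.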

Why this matters
A case file you can resume is a case file you can hand off. When a second developer picks up the investigation, they get the full reasoning history — not just the current best guess.

Where case files are saved

Case files land at {implementation_artifacts}/investigations/{slug}-investigation.md. If the project has no configured implementation_artifacts path, the skill falls back to ./investigations/ in the current working directory.

The slug is derived from the investigation subject — an error type, a ticket number, or a path. Each investigation gets its own file; running /bmad-investigate twice on the same subject opens a new case unless you explicitly pass the existing file path to resume.
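The path rule above can be written out as a short function. The location pattern and the fallback come from the description; the slug derivation is an assumption (the skill's exact rule is not documented here), as are the `slugify` and `case_file_path` names and the example artifacts directory. The sketch also does not model how a second case on the same subject gets a distinct file.

```python
import re
from pathlib import Path
from typing import Optional


def slugify(subject: str) -> str:
    """Assumed slug rule: lowercase, runs of non-alphanumerics become '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", subject.lower()).strip("-")
    return slug or "case"


def case_file_path(subject: str, implementation_artifacts: Optional[str] = None) -> Path:
    """Build {implementation_artifacts}/investigations/{slug}-investigation.md."""
    # Fall back to the current working directory when no path is configured.
    base = Path(implementation_artifacts) if implementation_artifacts else Path(".")
    return base / "investigations" / f"{slugify(subject)}-investigation.md"


print(case_file_path("PROJ-4821").as_posix())
# investigations/proj-4821-investigation.md
print(case_file_path("PROJ-4821", "docs/artifacts").as_posix())
# docs/artifacts/investigations/proj-4821-investigation.md
```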


Not party mode — a different skill entirely
bmad-investigate is a standalone forensic investigation tool registered in the dev agent's menu under code IN. It has no relationship to party mode or the multi-agent review crew. Party mode is a roundtable discussion format. This is a structured evidence-grading workflow for a single investigator — human or AI — tracing a problem through a codebase. The output is a case file, not a conversation transcript.

More in the BMAD 6.9.0 release

bmad-investigate is one of five new tools that shipped in 6.9.0. Each gets its own deep-dive.

Available now in BMAD

bmad-investigate ships as part of BMAD Method for Claude Code. It is a pure prompt-based skill — no separate install, no executable script. If you have BMAD configured, the skill is already available.

