Have You Invited Your AI Agents to the Party Yet?

If you're using Claude Code to write software, you've probably hit the ceiling. You ask a single AI to review your code and it plays architect, security engineer, pragmatist, and craftsman all at once — in the same head. The security concerns get softened by the part that wants to ship. The perfectionist yields to the part that's reasonable. You get responses that are balanced, agreeable, and safe.

That's exactly the problem BMAD party mode solves.

Think of it like this...

🏢

Traditional AI

A single consultant doing everything

They'll give you something useful, but they're managing their own contradictions in real time before they speak. The security voice and the ship-it voice negotiate inside one head — and the result is polite compromise.

👥

BMAD Party Mode

A full team of specialists in the room

Each one is genuinely expert in their domain, and they actually disagree with each other — out loud, in front of you, where you can hear the argument. The clash surfaces things no single mind would reach alone.

Here's what it looks like in practice

You just finished a pull request adding a new user authentication flow. You run it through the built-in Code Review Crew with --party code-review-crew --mode subagent:

<div class="chat-review">
  <div class="chat-turn">
    <div class="chat-speaker">🔒 Vex — Security Engineer</div>
    <div class="chat-body vex">The JWT secret is pulled from <code>process.env.JWT_SECRET</code> at line 47, but there's no validation it's set before startup. An empty string means every token signed with <code>""</code> is technically valid. I can forge one right now that passes this check.</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">😤 Grumbal — The Adversary</div>
    <div class="chat-body grumbal">And the refresh token rotation at line 89 doesn't invalidate the old token on issue. If someone lifts a refresh token before it's used, both copies work until expiry. That's not a rotation, that's a copy.</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">🌶️ Boundary — Edge-Case Hunter</div>
    <div class="chat-body boundary">What happens if <code>userId</code> in the token payload is <code>null</code>? Line 112 passes it straight to the DB query with no null check. That's either a crash or an injection path depending on how your ORM handles null params.</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">🎯 Yui — The Craftsman</div>
    <div class="chat-body yui">You've got <code>validateToken()</code> and <code>verifyJWT()</code> doing overlapping work — one calls the other but both have independent error-handling paths. Pick one.</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">🚢 Dana — The Pragmatist</div>
    <div class="chat-body dana">Vex's startup validation is a must-ship fix, agreed. The refresh token issue is real but narrow — requires stealing a token in a specific window. File it. The null userId is a one-liner. Yui's refactor waits for tech debt week.</div>
  </div>
</div>

<p>That's a code review. Five genuinely different minds, each attacking from their own angle, then arguing about what actually ships versus what gets filed. No single AI produces this — one mind can't simultaneously assume the code is broken and advocate for shipping the 80%.</p>

What changed in 6.9.0

Before this release, party mode worked with the BMAD agents you had installed — Winston the Architect, Amelia the Developer, and the rest of the built-in roster. You couldn't build your own room or bring in specialists that didn't exist in the default set.

6.9.0 changes that completely. The Code Review Crew above ships out of the box, and the whole persona system is now open.

<div class="persona-grid">
  <div class="persona-card"><div class="persona-icon">🔒</div><div><div class="persona-title">Security Engineer</div><div class="persona-name">Vex</div><div class="persona-desc">Threat-models everything. Assumes every input is hostile. Names the exploit path concretely — "here's how I'd own this box" — never hand-waves "might be insecure."</div></div></div>
  <div class="persona-card"><div class="persona-icon">😤</div><div><div class="persona-title">The Adversary</div><div class="persona-name">Grumbal</div><div class="persona-desc">Assumes the code is broken. Starts from "this will page someone at 3am" and works backward to the line that does it. Allergic to optimism and "should be fine."</div></div></div>
  <div class="persona-card"><div class="persona-icon">🌶️</div><div><div class="persona-title">Edge-Case Hunter</div><div class="persona-name">Boundary</div><div class="persona-desc">Walks every branch. Empty input, null, the off-by-one, the concurrent call, the unicode name, the timezone, the retry storm.</div></div></div>
  <div class="persona-card"><div class="persona-icon">🎯</div><div><div class="persona-title">The Craftsman</div><div class="persona-name">Yui</div><div class="persona-desc">Allergic to cleverness and duplication. "You reimplemented something that already exists." Wants the boring, obvious, maintainable version.</div></div></div>
  <div class="persona-card"><div class="persona-icon">🚢</div><div><div class="persona-title">The Pragmatist</div><div class="persona-name">Dana</div><div class="persona-desc">Counters the pile-on. "Does this actually matter to a user? Ship the 80%, file the rest." Pushes back on gold-plating and theoretical risks.</div></div></div>
</div>

<p>You can define your own members the same way — give each one a voice specific enough that you could pick them out blind. <strong>Custom personas</strong> are written in TOML override files (more on that below). You can also create <strong>open-cast rooms</strong> with no fixed roster — write a scene that describes a universe and the model casts whoever fits the topic on the fly.</p>

<div class="code-block">
  <div class="code-header">_bmad/custom/my-review-crew.toml: defining a custom persona</div>
  <pre><span style="color:#7c5fff">[party.my-review-crew]</span>

# The scene sets the room dynamic: who's there, what they're doing, how they interact
scene = "An adversarial code review panel. Each reviewer attacks from their own lens and argues openly about what actually matters."

[[party.my-review-crew.members]]
code = "vex"
icon = "🔒"
title = "Security Engineer"
# Persona must be specific enough you'd recognize them blind, not "a security expert"
persona = """
Threat-models everything. Assumes every input is hostile and every caller is an attacker.
Names the exploit path concretely ('here is how I would own this box right now') and never hand-waves 'might be insecure.'
Cites CVEs when relevant.
"""

[[party.my-review-crew.members]]
code = "dana"
icon = "🚢"
title = "The Pragmatist"
persona = """
Counters the pile-on. Distinguishes theoretical risk from real shipping risk.
'Ship the 80%, file the rest.' Pushes back on gold-plating with concrete probability estimates.
Has final say on what actually goes in this PR.
"""</pre>
</div>

<div class="callout" style="border-color: #7c5fff; background: rgba(124,95,255,0.06);">
  <div class="callout-title" style="color:#7c5fff">You don't write this by hand</div>
  The <strong>bmad-customize</strong> skill does it for you. Describe your persona in plain English and it proposes the TOML, shows you a diff, waits for an explicit yes, then writes the file and confirms the merge landed correctly.
</div>
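
The propose, diff, confirm, write sequence can be sketched in a few lines. This is only an illustration of the flow described above, not how bmad-customize is implemented: the `propose_change` function and its `confirm` callback are hypothetical names, and the diff comes from Python's standard `difflib`.

```python
from __future__ import annotations

import difflib
from pathlib import Path
from typing import Callable


def propose_change(path: Path, proposed: str, confirm: Callable[[str], bool]) -> bool:
    """Show a diff of the proposed file and write it only after an explicit yes.

    Hypothetical helper: `confirm` receives the diff text and returns True to accept.
    """
    current = path.read_text(encoding="utf-8") if path.exists() else ""
    diff = "\n".join(
        difflib.unified_diff(
            current.splitlines(),
            proposed.splitlines(),
            fromfile=f"{path.name} (current)",
            tofile=f"{path.name} (proposed)",
            lineterm="",
        )
    )
    if not confirm(diff):
        return False  # nothing is written without approval
    path.write_text(proposed, encoding="utf-8")
    return True
```

The point of the shape is that the write sits behind the confirmation: a declined proposal leaves the file exactly as it was.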

Four Run Modes

session

One mind voices every persona. Zero overhead, fast, conversational. Good for ideation and quick takes.

auto

Inline for banter, spawns real independent agents when genuine divergence matters. If one mind voices every reviewer, they drift toward consensus. Independent agents don't.

subagent

A real agent per round. Each persona thinks without seeing what the others said. The orchestrator weaves their replies — reordering turns so rebuttals land after what they rebut, never changing what anyone argued. Recommended for code review.

agent-team

Persistent team that addresses each other directly. Claude Code only, highest fidelity, highest cost.
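
To make the "weaving" in subagent mode concrete, here is a minimal sketch of reordering replies so that a rebuttal lands after the reply it answers, without touching anyone's text. It is an assumption-heavy illustration, not BMAD's orchestrator: the `Reply` type, its `rebuts` field, and the one-reply-per-speaker-per-round simplification are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reply:
    speaker: str
    text: str
    rebuts: Optional[str] = None  # hypothetical: speaker whose point this reply answers


def weave(replies: list[Reply]) -> list[Reply]:
    """Reorder replies so each rebuttal follows its target. Text is never edited."""
    by_speaker = {r.speaker: r for r in replies}  # assumes one reply per speaker
    ordered: list[Reply] = []
    placed: set[str] = set()

    def place(reply: Reply, visiting: frozenset) -> None:
        if reply.speaker in placed or reply.speaker in visiting:
            return  # already emitted, or we looped back in a rebuttal cycle
        target = by_speaker.get(reply.rebuts) if reply.rebuts else None
        if target is not None:
            place(target, visiting | {reply.speaker})  # rebutted reply goes first
        placed.add(reply.speaker)
        ordered.append(reply)

    for reply in replies:
        place(reply, frozenset())
    return ordered
```

If Dana's reply rebuts Vex but arrives first, `weave` emits Vex, then Dana, and leaves unrelated replies in their original relative order.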

<div class="code-block">
  <div class="code-header">Invoking a run mode — pass --mode when you open the party</div>
  <pre><span style="color:#7080a0"># Fast, conversational — one mind, zero overhead</span>

/bmad-party —party code-review-crew —mode session

# Let BMAD decide — banter inline, real agents when it matters /bmad-party —party code-review-crew —mode auto

# Independent agents per round — each thinks alone, orchestrator weaves replies # Best for code review: genuine divergence, no consensus drift /bmad-party —party code-review-crew —mode subagent

# Persistent team that talks to each other directly (Claude Code only) /bmad-party —party code-review-crew —mode agent-team

Party Memory

Without memory, every session starts cold. Vex doesn't know your codebase. Grumbal doesn't remember the auth module already burned you twice. It's a room of strangers each time.

Named groups now keep a per-party memlog — append-only, persists across sessions. The room reads a distilled brief on entry and picks up from where things stood.

Why this matters for developers

Grumbal already has your payment module in his crosshairs because it paged someone last sprint. Vex and Dana reached a truce on the rate-limiting debate — that alliance holds next session without you re-litigating it. The room accumulates knowledge about your specific codebase: recurring weak spots, areas that always get glossed over, habitual trade-offs. Less like a one-time panel, more like a crew that's been watching your repo for months.

The memlog is append-only. To wipe a party's memory, delete its folder. To correct something logged wrong, append a new entry that supersedes it. Faces from an open-cast scene can be saved into the roster at the end so they return as regulars.
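
The append-and-supersede rule is easy to picture as a small log. The sketch below is not BMAD's memlog format, which isn't shown here: the JSON-lines layout, the `supersedes` field, and the function names are assumptions made for illustration.

```python
from __future__ import annotations

import json
from pathlib import Path


def append_entry(log: Path, entry_id: str, note: str, supersedes: str | None = None) -> None:
    """Add one record to the end of the log. Earlier lines are never rewritten."""
    record = {"id": entry_id, "note": note, "supersedes": supersedes}
    with log.open("a", encoding="utf-8") as fh:  # "a" = append only
        fh.write(json.dumps(record) + "\n")


def current_notes(log: Path) -> list[str]:
    """Return the notes still in force: entries that no later entry supersedes."""
    lines = log.read_text(encoding="utf-8").splitlines()
    records = [json.loads(line) for line in lines if line]
    superseded = {r["supersedes"] for r in records if r["supersedes"]}
    return [r["note"] for r in records if r["id"] not in superseded]
```

A wrong entry stays in the file as history. The correction is a new line that names the entry it replaces, so a reader of the distilled view only sees the latest version.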

bmad-customize: The Backbone

An open persona system wouldn't matter much without an easy way to build your own personas. bmad-customize translates plain English into TOML override files under _bmad/custom/. Invoke it in Claude Code:

<div class="code-block">
  <div class="code-header">Invoking bmad-customize</div>
  <pre>/bmad-customize</pre>
</div>

What is TOML? Tom's Obvious Minimal Language — a human-readable config format. Think Rust's Cargo.toml or Python's pyproject.toml. BMAD uses it for skill configuration because it handles arrays of structured objects cleanly. You don't need to write it — bmad-customize does that — but it's useful to know what you're looking at when you open the files.

bmad-customize proposes the TOML, shows a diff if the file exists, waits for an explicit yes, writes it, then runs the resolver to confirm the merge landed. Overrides come in two flavors: team (committed to git) and personal (gitignored).
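
For intuition about what "merging on top of defaults" means, here is a rough sketch with plain dictionaries standing in for parsed TOML. It makes two assumptions that this post does not confirm: that layers apply in the order defaults, then team, then personal, and that tables merge key by key while any other value is replaced outright. The `merge` function is hypothetical and is not BMAD's resolver.

```python
from __future__ import annotations


def merge(base: dict, override: dict) -> dict:
    """Return a new dict: nested tables merge key by key, other values are replaced."""
    merged = dict(base)  # shallow copy so the base layer is left untouched
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Made-up layers for illustration only.
defaults = {"party": {"code-review-crew": {"scene": "Adversarial code review panel."}}}
team = {"party": {"backend-hardening": {"scene": "Hostile review panel for backend code."}}}
personal = {"party": {"backend-hardening": {"scene": "Hostile review, payments focus."}}}

# Assumed precedence: defaults < team < personal.
resolved = merge(merge(defaults, team), personal)
```

Under these assumptions the built-in crew survives untouched, the team adds a new party, and the personal file changes only the one key it names.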

<div class="code-block">
  <div class="code-header">_bmad/custom/party.personal.toml: what bmad-customize actually writes</div>
  <pre><span style="color:#7080a0"># Override file: merges on top of defaults, doesn't replace them</span>
# This is personal (gitignored). Use party.team.toml to share with your whole team.

[party.backend-hardening]
scene = "A hostile review panel for backend code. Focus: security, reliability, and blast radius. Each reviewer approaches from their own angle and argues openly about priority."

[[party.backend-hardening.members]]
code = "pierce"
icon = "🔐"
title = "Infra Security"
persona = """
Reviews from the outside in: network exposure, IAM boundaries, secrets in env, attack surface of every new endpoint.
Assumes the perimeter is already breached and asks what damage is possible from inside.
"""

[[party.backend-hardening.members]]
code = "margot"
icon = "💣"
title = "Blast Radius"
persona = """
Asks: what breaks if this call fails, is slow, or returns garbage?
Maps cascades, timeout chains, and retry storms before they happen.
'This has no circuit breaker and it calls three downstream services.'
"""</pre>
</div>

<div class="callout" style="border-color:#4dffb4; background:rgba(77,255,180,0.05);">
  <div class="callout-title" style="color:#4dffb4">Invoke it in Claude Code</div>
  Type <strong>/bmad-customize</strong> and describe what you want in plain English. It handles the TOML. You just say: "Add a backend hardening party with a security engineer who thinks like an attacker and someone focused on blast radius."
</div>

bmad-forge-idea: Pressure-Test Before You Build

bmad-forge-idea is the one I wish had existed six months ago. The premise: find the fatal crack in an idea now, in conversation, where changing your mind is free, not after you've built the thing.

It opens cold — no validation, no reassurance. You state the idea and the goal, and two voices join each branch of the interrogation: one from your BMAD roster, one conjured on the fly for the specific crack being tested.

<p><strong>How to invoke it:</strong> type <span class="invoke-pill">/bmad-forge-idea</span> in Claude Code and describe your idea.</p>

<p><strong>What a session looks like.</strong> You have an idea: add a freemium tier to your SaaS to lower the conversion barrier.</p>

<div class="chat-review">
  <div class="chat-turn">
    <div class="chat-speaker">You</div>
    <div class="chat-body user">I want to add a free tier so more people try the product. Goal: harden the idea.</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">🔮 The Forge</div>
    <div class="chat-body forge">Before we build the case — who is "more people"? Free tiers attract volume, not necessarily the buyers you want. What's the conversion rate you're assuming from free to paid, and what data do you have for that number?</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">🧑‍💼 CFO (generated for this branch)</div>
    <div class="chat-body cfo">Free users don't convert, they complain. Your support load triples, infrastructure costs climb, and the people who email you most will never pay. What's your cost per free user per month, and what's your payback model?</div>
  </div>
  <div class="chat-turn">
    <div class="chat-speaker">😤 Grumbal (from your roster)</div>
    <div class="chat-body grumbal">You haven't defined what "free" means. Free-forever? Free trial? Feature-limited? These are completely different products with different conversion economics. You're not pressure-testing an idea, you're pressure-testing a category.</div>
  </div>
</div>

<p>Two turns in and you've already surfaced three unresolved questions: who you're actually targeting, what the conversion assumption is, and what "free tier" even means. Finding those cracks in conversation costs nothing. Finding them after you've built the billing system costs weeks.</p>
<p>The session persists to disk, survives interruption, and ends with an HTML report stamped with a wax-seal verdict: <strong>HARDENED</strong>, <strong>KILLED</strong> (with cause of death), or wherever else it landed.</p>

bmad-investigate: Forensic Bug Investigation

Not party mode — a completely different skill

This is a forensic investigation tool for debugging and navigating unfamiliar code. Think incident response, not roundtable discussion.

The typical developer approach to a bug: read the code, form a hunch, look for evidence to confirm it. Confirmation bias kicks in early and you find what you're looking for, not what's actually true.

bmad-investigate enforces the opposite: anchor on one Confirmed piece of evidence, then expand outward. Every finding is graded as Confirmed, Deduced, or Hypothesized.

<div class="code-block">
  <div class="code-header">Invoking bmad-investigate: pass a ticket, stack trace, log path, or code area</div>
  <pre><span style="color:#7080a0"># Open a new case from a stack trace or error message</span>
/bmad-investigate TypeError: Cannot read properties of null at getUserProfile:47

# Investigate a ticket by ID
/bmad-investigate PROJ-4821

# Explore an unfamiliar module before you touch it
/bmad-investigate src/payments/

# Resume a prior case file
/bmad-investigate investigations/proj-4821.md</pre>
</div>

<div class="evidence-grid">
  <div class="evidence-card"><span class="ev-badge ev-confirmed">Confirmed</span><span class="ev-text">Directly observed — cited as <code>path:line</code>, log timestamp, or commit hash. This is your anchor.</span></div>
  <div class="evidence-card"><span class="ev-badge ev-deduced">Deduced</span><span class="ev-text">Logically follows from Confirmed evidence, with the full reasoning chain shown. No leaps.</span></div>
  <div class="evidence-card"><span class="ev-badge ev-hypothesized">Hypothesized</span><span class="ev-text">Plausible but unconfirmed — explicit confirm/refute criteria required. Never deleted, even when wrong.</span></div>
</div>
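
One way to see why the three grades matter is to write down what each one obliges you to record. The sketch below is a hypothetical data model, not the case-file format bmad-investigate writes: the `Finding` fields, the `problems` check, and the example path are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Grade(Enum):
    CONFIRMED = "Confirmed"
    DEDUCED = "Deduced"
    HYPOTHESIZED = "Hypothesized"


@dataclass
class Finding:
    claim: str
    grade: Grade
    citation: str = ""  # Confirmed: path:line, log timestamp, or commit hash
    reasoning: list[str] = field(default_factory=list)  # Deduced: the full chain
    confirm_if: str = ""  # Hypothesized: evidence that would confirm it
    refute_if: str = ""  # Hypothesized: evidence that would refute it


def problems(finding: Finding) -> list[str]:
    """List what a finding is missing for the grade it claims."""
    issues = []
    if finding.grade is Grade.CONFIRMED and not finding.citation:
        issues.append("Confirmed finding needs a citation")
    if finding.grade is Grade.DEDUCED and not finding.reasoning:
        issues.append("Deduced finding needs its reasoning chain")
    if finding.grade is Grade.HYPOTHESIZED and not (finding.confirm_if and finding.refute_if):
        issues.append("Hypothesized finding needs confirm and refute criteria")
    return issues
```

A hunch with no stated way to confirm or refute it fails the check, which is the discipline the grading is meant to enforce.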
<div class="usecase-grid">
  <div class="usecase-card">
    <div class="usecase-label">Scenario 1</div>
    <div class="usecase-title">Production incident at 2am</div>
    <div class="usecase-desc">You get paged. Error points somewhere vague. The skill anchors on the Confirmed line, traces backward through callers, reconstructs a timeline from logs, builds a case file. The next person picks it up cold at standup.</div>
  </div>
  <div class="usecase-card">
    <div class="usecase-label">Scenario 2</div>
    <div class="usecase-title">Unfamiliar module before a feature</div>
    <div class="usecase-desc">You need to add payment processing to code you've never touched. You get a structured area model: I/O mapping, control flow, state transitions, caller chain, boundaries. A mental model built systematically, not from skimming.</div>
  </div>
  <div class="usecase-card">
    <div class="usecase-label">Scenario 3</div>
    <div class="usecase-title">Flaky test with no obvious cause</div>
    <div class="usecase-desc">The skill traces from the test backward to production code, cross-references git log for recent changes, maps the condition that produces the failure, surfaces exactly what evidence would pin the root cause.</div>
  </div>
</div>

New Shorthand Workflow Skills

/bmad-prd

Create, update, or validate a PRD. Coaching path (work it together) or fast path with [ASSUMPTION] tags you review.

/bmad-architecture

Produces an architecture spine — the invariants keeping independently-built units from diverging. Not a full doc; just the durable calls that can't be inferred from compliant code.

/bmad-spec

Distills any input into a five-field SPEC kernel: Why, Capabilities, Constraints, Non-goals, Success signal. The machine contract downstream skills consume.

/bmad-ux

Produces two peer contracts: DESIGN.md (visual identity) and EXPERIENCE.md (information architecture, behavior, states). Distilled from a coaching session, not hand-written.
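
As a rough picture of the five-field SPEC kernel, the sketch below models it as a small record with a completeness check. The field names follow the five fields listed for /bmad-spec, but the types, the `SpecKernel` class, and the `missing_fields` helper are assumptions, not the format the skill emits.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SpecKernel:
    why: str
    capabilities: tuple[str, ...]
    constraints: tuple[str, ...]
    non_goals: tuple[str, ...]
    success_signal: str


def missing_fields(spec: SpecKernel) -> list[str]:
    """Name every field that was left empty."""
    return [f.name for f in fields(spec) if not getattr(spec, f.name)]
```

A kernel with an empty non-goals list is the usual gap, and a check like this would flag it before a downstream consumer has to guess.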

<div class="code-block">
  <div class="code-header">Invoking the shorthand skills: type the command in Claude Code</div>
  <pre><span style="color:#7080a0"># Draft or update a PRD (coaching path by default)</span>
/bmad-prd

# Build an architecture spine for your project or a single feature
/bmad-architecture

# Distill a vague idea, PRD, or RFC into a five-field SPEC kernel
/bmad-spec

# Produce DESIGN.md + EXPERIENCE.md from a coaching session
/bmad-ux</pre>
</div>

How to Update

<div class="code-block">
  <div class="code-header">bash</div>
  <pre>npx bmad-method@latest install --action update --tools claude-code</pre>
</div>

46 skills. Everything above is included.

GitHub Discord bmadcode.com

What to try first

Run a sprint's worth of PRs through the Code Review Crew with --mode subagent and memory on. After a few sessions the room starts to know the codebase — Grumbal's running debt list, Vex's recurring concerns, Dana's pushback on which risks are theoretical. That's a code review process worth having.
