PARLEY: TWO AIs DEBATE UNTIL THEY AGREE

Select two models. Enter a planning prompt. Watch them collaborate until consensus.

OSS 2026.02.02

THE PROBLEM

Single-model planning has blind spots. One AI proposes a plan, and you either accept or reject it. No dialectic. No stress-testing. No natural refinement.

Committees are slow but thorough. Can we get the thoroughness without the overhead?

THE SOLUTION

Two AI models, different perspectives, forced to reach agreement. The conversation doesn't stop until both say "I agree."

User Prompt ──→ Agent A (Proposer)
                    │
                    ↓
              Initial Plan
                    │
                    ↓
              Agent B (Critic)
                    │
                    ↓
              Critique/Questions
                    │
                    ↓
              Agent A responds
                    │
                    ↓
              ... loop continues ...
                    │
                    ↓
           Both agents agree → DONE

A human can intervene at any time: add context, steer direction, or just watch the debate unfold.

THE ARCHITECTURE

Built on Cloudflare Workers with SvelteKit. Real-time streaming via Server-Sent Events. Model selection through OpenRouter (300+ models).
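
To make the streaming layer concrete, here is a minimal sketch of how text chunks could be framed as Server-Sent Events and exposed as a byte stream. The helper names (formatSseEvent, sseStream) and the event names ("token", "done") are hypothetical and are not taken from Parley's source; only the SSE wire format itself is standard.

```typescript
// Hypothetical helpers; names and event types are assumptions, not Parley's code.

/** Frame one Server-Sent Event: an "event:" line, a "data:" line, then a blank line. */
function formatSseEvent(event: string, data: unknown): string {
  // JSON.stringify escapes newlines, so the payload always fits on one data line.
  const payload = JSON.stringify(data);
  return `event: ${event}\ndata: ${payload}\n\n`;
}

/** Wrap an async source of text chunks in a byte stream of SSE frames. */
function sseStream(
  chunks: AsyncIterable<string>,
  agent: string
): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    async start(controller) {
      // Forward each model chunk to the browser as soon as it arrives.
      for await (const text of chunks) {
        controller.enqueue(encoder.encode(formatSseEvent("token", { agent, text })));
      }
      // Signal the end of this agent's turn, then close the stream.
      controller.enqueue(encoder.encode(formatSseEvent("done", { agent })));
      controller.close();
    },
  });
}
```

An endpoint could then return something like new Response(sseStream(chunks, "agent-a"), { headers: { "Content-Type": "text/event-stream" } }), and the browser would read it with EventSource or a fetch reader.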

Agent A: The Proposer

## Your Role
You are the PRIMARY PROPOSER. Your job is to:
1. Propose initial plans and refinements
2. Respond constructively to Agent B's critiques
3. Ask clarifying questions when needed
4. Agree when the plan is complete

## Tools
- <think>...</think>: Internal reasoning
- <propose_plan>...</propose_plan>: Submit the plan
- <respond>...</respond>: Address feedback
- <agree>...</agree>: Accept the current plan

Agent B: The Critic

## Your Role
You are the CRITICAL REVIEWER. Your job is to:
1. Evaluate Agent A's proposals carefully
2. Identify gaps, risks, or improvements
3. Ask clarifying questions
4. Agree when the plan is truly complete

## Tools
- <think>...</think>: Internal reasoning
- <critique>...</critique>: Provide feedback
- <respond>...</respond>: Answer questions
- <agree>...</agree>: Accept the current plan

THE PLANNING LOOP

Server-side orchestration

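// Planning continues until both agents agree on the same plan.
// Assumed setup (not shown in this excerpt): modelA, modelB, and messages
// come from the caller, and streamCompletion resolves to the full text.
let agentAAgreed = false;
let agentBAgreed = false;
let currentAgent = 'agent-a';

while (!agentAAgreed || !agentBAgreed) {
  // Get current agent's model
  const model = currentAgent === 'agent-a' ? modelA : modelB;

  // Stream the response
  const response = await streamCompletion(model, messages);

  // Record the turn so the other agent sees it (message shape is an assumption)
  messages.push({ role: 'assistant', content: response });

  // Parse tool usage
  const parsed = parseAgentResponse(response);

  // Check for agreement. Each flag reflects that agent's latest turn only,
  // so a new proposal or critique withdraws an earlier agreement.
  const agreed = isAgreement(parsed);
  if (currentAgent === 'agent-a') agentAAgreed = agreed;
  else agentBAgreed = agreed;

  // Switch turns
  currentAgent = currentAgent === 'agent-a' ? 'agent-b' : 'agent-a';
}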

Each agent's response is parsed for tool calls. Agreement detection requires explicit agreement from both agents; one "looks good" isn't enough.
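
One way the parsing step could work is sketched below. The tag names come from the agent prompts above, and the function names match the loop, but the bodies are assumptions: this is a regex-based illustration, not Parley's actual parser. It also assumes that an agreement only counts when the same turn carries no new proposal or critique.

```typescript
// Hypothetical sketch; the parsing strategy and types are assumptions.
type ToolName = "think" | "propose_plan" | "critique" | "respond" | "agree";

interface ToolCall {
  tool: ToolName;
  content: string;
}

/** Extract every <tool>...</tool> block, in order of appearance. */
function parseAgentResponse(response: string): ToolCall[] {
  // A fresh regex per call avoids sharing lastIndex state between calls.
  const pattern = /<(think|propose_plan|critique|respond|agree)>([\s\S]*?)<\/\1>/g;
  const calls: ToolCall[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(response)) !== null) {
    calls.push({ tool: match[1] as ToolName, content: match[2].trim() });
  }
  return calls;
}

/** True only for an explicit <agree> with no new proposal or critique attached. */
function isAgreement(calls: ToolCall[]): boolean {
  const agreed = calls.some((call) => call.tool === "agree");
  const stillNegotiating = calls.some(
    (call) => call.tool === "propose_plan" || call.tool === "critique"
  );
  return agreed && !stillNegotiating;
}
```

Under this sketch, free text such as "Looks good to me" yields no tool calls at all, so it never counts as agreement.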

HUMAN IN THE LOOP

This isn't a black box. You can intervene at any point:

  • Add context: "Don't forget about the database migration"
  • Steer direction: "Focus more on error handling"
  • Ask questions: "What happens if the API is down?"
  • Stop planning: Cut it short if you've seen enough

Your input gets injected into both agents' context as a human intervention. They treat it as authoritative steering.
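
A rough sketch of that injection step follows. The message shape, the "[HUMAN INTERVENTION]" marker, and the function name injectIntervention are all hypothetical; the only thing taken from the description above is that the same input reaches both agents' contexts.

```typescript
// Hypothetical types and marker text; not taken from Parley's source.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface AgentContexts {
  agentA: ChatMessage[];
  agentB: ChatMessage[];
}

/** Append the same human message to both agents' contexts without mutating the input. */
function injectIntervention(contexts: AgentContexts, text: string): AgentContexts {
  const message: ChatMessage = {
    role: "user",
    // The marker lets a system prompt tell the agents to treat this as authoritative.
    content: `[HUMAN INTERVENTION] ${text.trim()}`,
  };
  return {
    agentA: [...contexts.agentA, message],
    agentB: [...contexts.agentB, message],
  };
}
```

Returning new arrays rather than pushing in place keeps earlier snapshots of the conversation intact, which can help if the UI replays or rewinds the debate.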

MODEL MIXING

The real power: mix and match models.

  • Claude Opus (proposer) + GPT-4 (critic): different training, different blind spots
  • Gemini Pro + Claude Sonnet: Google vs Anthropic perspectives
  • Same model, different temperatures: more deterministic vs more creative
  • Cheap model (proposer) + expensive model (critic): cost optimization

OpenRouter gives you 300+ models. Any combination, any providers.
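
As an illustration, a pairing can be expressed as two small configs that feed OpenRouter's OpenAI-compatible chat completions endpoint. The function name buildCompletionRequest, the config shape, and the temperature values are assumptions made for this sketch, and the model IDs are examples only; check OpenRouter's model list for current identifiers.

```typescript
// Hypothetical config shape and helper; model IDs below are illustrative.
interface AgentConfig {
  model: string;
  temperature: number;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface CompletionRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

/** Build (but do not send) a streaming chat completion request for one agent. */
function buildCompletionRequest(
  agent: AgentConfig,
  messages: ChatMessage[],
  apiKey: string
): CompletionRequest {
  return {
    url: OPENROUTER_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: agent.model,
        messages,
        temperature: agent.temperature,
        stream: true, // tokens arrive incrementally
      }),
    },
  };
}

// Example pairing: a creative proposer and a stricter critic.
const pairing: { proposer: AgentConfig; critic: AgentConfig } = {
  proposer: { model: "anthropic/claude-3-opus", temperature: 0.7 },
  critic: { model: "openai/gpt-4", temperature: 0.2 },
};
```

Sending it is then a matter of fetch(request.url, request.init); swapping providers only changes the model string.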

WHY CONSENSUS MATTERS

Single-model output is confident but fragile. It sounds authoritative but misses edge cases the same way every time.

Dual-model consensus is negotiated and robust. If two different models with different training data both agree, the plan has been stress-tested from multiple angles.

You're not getting "better answers." You're getting vetted answers.

TRY IT

parley.coey.dev: live demo

github.com/acoyfellow/parley: source code

Deploy your own:

Quick start

git clone https://github.com/acoyfellow/parley
cd parley
bun install
echo "OPENROUTER_API_KEY=your_key" > .dev.vars
bun run dev

INSPIRATION

This builds on ideas from planning-with-files β€” using the filesystem as extended memory for planning agents.

Also inspired by Manus AI's context engineering: persistent markdown files as checkpoints, hooks for attention manipulation, and treating the filesystem as unlimited storage for finite context windows.