PARLEY: TWO AIs DEBATE UNTIL THEY AGREE
Select two models. Enter a planning prompt. Watch them collaborate until consensus.
THE PROBLEM
Single-model planning has blind spots. One AI proposes a plan; you either accept or reject it. No dialectic. No stress-testing. No natural refinement.
Committees are slow but thorough. Can we get the thoroughness without the overhead?
THE SOLUTION
Two AI models, different perspectives, forced to reach agreement. The conversation doesn't stop until both say "I agree."
User Prompt ──→ Agent A (Proposer)
        ↓
Initial Plan
        ↓
Agent B (Critic)
        ↓
Critique/Questions
        ↓
Agent A responds
        ↓
... loop continues ...
        ↓
Both agents agree ──→ DONE

Human can intervene at any time: add context, steer direction, or just watch the debate unfold.
THE ARCHITECTURE
Built on Cloudflare Workers with SvelteKit. Real-time streaming via Server-Sent Events. Model selection through OpenRouter (300+ models).
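For a sense of how the streaming side can work, here is a minimal sketch of a Server-Sent Events endpoint in a SvelteKit route; the route path and payload shape are illustrative assumptions, not Parley's actual code:

import type { RequestHandler } from '@sveltejs/kit';

export const GET: RequestHandler = () => {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      // Each chunk of agent output becomes one SSE "data:" frame.
      const frame = `data: ${JSON.stringify({ agent: 'agent-a', token: 'Hello' })}\n\n`;
      controller.enqueue(encoder.encode(frame));
      controller.close();
    }
  });
  return new Response(stream, {
    headers: {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache'
    }
  });
};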
Agent A: The Proposer
## Your Role
You are the PRIMARY PROPOSER. Your job is to:
1. Propose initial plans and refinements
2. Respond constructively to Agent B's critiques
3. Ask clarifying questions when needed
4. Agree when the plan is complete
## Tools
- <think>...</think> → Internal reasoning
- <propose_plan>...</propose_plan> → Submit the plan
- <respond>...</respond> → Address feedback
- <agree>...</agree> → Accept the current plan

Agent B: The Critic
## Your Role
You are the CRITICAL REVIEWER. Your job is to:
1. Evaluate Agent A's proposals carefully
2. Identify gaps, risks, or improvements
3. Ask clarifying questions
4. Agree when the plan is truly complete
## Tools
- <think>...</think> → Internal reasoning
- <critique>...</critique> → Provide feedback
- <respond>...</respond> → Answer questions
- <agree>...</agree> → Accept the current plan

THE PLANNING LOOP
Server-side orchestration
// Planning continues until both agents agree
while (!agentAAgreed || !agentBAgreed) {
  // Get the current agent's model
  const model = currentAgent === 'agent-a' ? modelA : modelB;

  // Stream the response
  const response = await streamCompletion(model, messages);

  // Parse tool usage
  const parsed = parseAgentResponse(response);

  // Check for agreement
  if (isAgreement(parsed)) {
    if (currentAgent === 'agent-a') agentAAgreed = true;
    else agentBAgreed = true;
  }

  // Switch turns
  currentAgent = currentAgent === 'agent-a' ? 'agent-b' : 'agent-a';
}

Each agent's response is parsed for tool calls. Agreement detection ensures both agents must explicitly agree; one "looks good" isn't enough.
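As an illustration, a minimal parser and agreement check might look like this. The tag names come from the prompts above, but these versions of parseAgentResponse and isAgreement are sketches, not the project's actual implementation:

type Tool = 'think' | 'propose_plan' | 'critique' | 'respond' | 'agree';
interface ToolCall { tool: Tool; content: string; }

function parseAgentResponse(text: string): ToolCall[] {
  // Collect every <tag>...</tag> pair the agent emitted this turn.
  const re = /<(think|propose_plan|critique|respond|agree)>([\s\S]*?)<\/\1>/g;
  const calls: ToolCall[] = [];
  for (const m of text.matchAll(re)) {
    calls.push({ tool: m[1] as Tool, content: m[2].trim() });
  }
  return calls;
}

function isAgreement(calls: ToolCall[]): boolean {
  // Only an explicit <agree> tag counts; prose like "looks good" does not.
  return calls.some((c) => c.tool === 'agree');
}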
HUMAN IN THE LOOP
This isn't a black box. You can intervene at any point:
- Add context → "Don't forget about the database migration"
- Steer direction → "Focus more on error handling"
- Ask questions → "What happens if the API is down?"
- Stop planning → Cut it short if you've seen enough
Your input gets injected into both agents' context as a human intervention. They treat it as authoritative steering.
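A minimal sketch of that injection, assuming each agent keeps its own message history; the [HUMAN INTERVENTION] prefix is an illustrative convention, not necessarily what Parley uses:

interface Message { role: 'system' | 'user' | 'assistant'; content: string; }

function injectIntervention(messagesA: Message[], messagesB: Message[], note: string): void {
  const intervention: Message = {
    role: 'user',
    content: `[HUMAN INTERVENTION] ${note}`
  };
  // Both agents see the same steering note on their next turn.
  messagesA.push(intervention);
  messagesB.push(intervention);
}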
MODEL MIXING
The real power: mix and match models.
- Claude Opus (proposer) + GPT-4 (critic) → different training, different blind spots
- Gemini Pro + Claude Sonnet → Google vs. Anthropic perspectives
- Same model, different temperatures → deterministic vs. creative
- Cheap model (proposer) + expensive model (critic) → cost optimization
OpenRouter gives you 300+ models. Any combination, any providers.
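Wiring up a pairing is one API call per turn. Here is a minimal sketch against OpenRouter's chat completions endpoint; the model ids and this streamCompletion shape are assumptions for illustration:

interface Message { role: 'system' | 'user' | 'assistant'; content: string; }

async function streamCompletion(model: string, messages: Message[]): Promise<ReadableStream | null> {
  const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
    method: 'POST',
    headers: {
      // Key loaded from .dev.vars locally, or env bindings on Workers.
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model, messages, stream: true })
  });
  return res.body; // SSE stream of completion chunks
}

// Example pairing: a cheap proposer, a stronger critic.
const modelA = 'openai/gpt-4o-mini';
const modelB = 'anthropic/claude-3.5-sonnet';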
WHY CONSENSUS MATTERS
Single-model output is confident but fragile. It sounds authoritative but misses edge cases the same way every time.
Dual-model consensus is negotiated and robust. If two different models with different training data both agree, the plan has been stress-tested from multiple angles.
You're not getting "better answers." You're getting vetted answers.
TRY IT
parley.coey.dev → live demo
github.com/acoyfellow/parley → source code
Deploy your own:
Quick start
git clone https://github.com/acoyfellow/parley
cd parley
bun install
echo "OPENROUTER_API_KEY=your_key" > .dev.vars
bun run dev

INSPIRATION
This builds on ideas from planning-with-files: using the filesystem as extended memory for planning agents.
Also inspired by Manus AI's context engineering: persistent markdown files as checkpoints, hooks for attention manipulation, and treating the filesystem as unlimited storage for finite context windows.