DITTO: EDGE-NATIVE PARALLEL LLM ORCHESTRATION
Multiple models. One call. Consensus merging.
Run multiple AI models in parallel, merge their outputs with consensus, get back typed results. Built on Durable Objects for durability and idempotency.
THE QUICK VERSION
Single model = fragile. Multiple models in parallel = robust + fewer hallucinations. But combining N different API calls = tedious.
Ditto handles the orchestration. You call one function. It runs models in parallel using Effect for true concurrency, merges outputs with consensus or cooperative strategies, tracks individual responses, and returns typed results.
Per-request Durable Object job. Effect-based execution. Unbounded concurrency. All on the edge.
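For orientation, the response shape used throughout the client examples can be sketched as a type. The field names mirror what the CLIENT section prints; they are illustrative, not Ditto's authoritative type definitions.

```typescript
// Hypothetical typing of a Ditto response; field names mirror the
// client example (result, structured, responses, timings)
interface ModelResponse {
  model: string;
  output: string;
}

interface DittoResponse {
  result: string; // merged output
  structured: {
    intent?: string;
    confidence: number; // 0..1 agreement score
    supportingModels: string[];
  };
  responses: ModelResponse[]; // individual model outputs
  timings: { total: number; fanout: number; slowest: number; merge: number }; // ms
}

// Example value, for shape only
const example: DittoResponse = {
  result: "Urgent: needs a reply today.",
  structured: { confidence: 0.92, supportingModels: ["@cf/meta/llama-3.1-8b-instruct"] },
  responses: [{ model: "@cf/meta/llama-3.1-8b-instruct", output: "Urgent: needs a reply today." }],
  timings: { total: 420, fanout: 5, slowest: 390, merge: 12 },
};
```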
ARCHITECTURE
Client → Worker → JobDO (per request)
           ↓            ↓
       Effect.all   Parallel model calls
           ↓            ↓
       Consensus    Merged result

WHY PARALLEL MODELS?
Single models fail. Multiple models agree = higher confidence.
// Why parallel models?

// Single model = fragile
const single = await ai.run("@cf/meta/llama-3.1-8b-instruct", { prompt });
// One failure = complete failure
// One hallucination = wrong answer

// Multiple models in parallel = robust
const multi = await ditto({
  prompt,
  models: [
    "@cf/meta/llama-3.1-8b-instruct",
    "@cf/mistral/mistral-7b-instruct",
    "@cf/qwen/qwen-2.5-7b-instruct"
  ],
  strategy: "consensus"
});

// Majority agreement = higher confidence
// Disagreement = flag for review
// One failure = others continue

CLIENT
Install ditto-ai, call one function. Get merged results with confidence scores.
// client.ts
import { dittoClient } from "ditto-ai";

const ditto = dittoClient({
  endpoint: "https://your-worker.workers.dev/llm",
});

const response = await ditto({
  prompt: "Summarize this email...",
  models: [
    "@cf/meta/llama-3.1-8b-instruct",
    "@cf/mistral/mistral-7b-instruct"
  ],
  strategy: "consensus",
});

console.log(response.result);     // merged output
console.log(response.structured); // intent, confidence, supporting models
console.log(response.responses);  // individual model outputs
console.log(response.timings);    // total, fanout, slowest, merge time

INFRASTRUCTURE
One namespace, one Worker—Alchemy wires the bindings.
// alchemy.run.ts
import alchemy from "alchemy";
import { Worker, DurableObjectNamespace } from "alchemy/cloudflare";

const app = await alchemy("ditto");

const jobs = DurableObjectNamespace("job", { className: "JobDO" });

await Worker("api", {
  entrypoint: "./src/worker.ts",
  bindings: { JOBS: jobs, AI: "AI" }
});

await app.finalize();

WORKER
Create per-request job DO. Poll for result or use WebSocket for real-time.
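The worker code that follows imports a getJobResult helper that is never shown. Here is a minimal polling sketch, assuming the JobDO's /result route from the job section; the interval and attempt limits are illustrative.

```typescript
// Hypothetical polling helper for the JobDO's /result route.
// Only a fetch method is required, so a DurableObjectStub fits.
type Stub = { fetch(url: string): Promise<Response> };

export async function getJobResult(
  stub: Stub,
  { intervalMs = 100, maxAttempts = 50 } = {}
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await stub.fetch('https://do/result');
    const body = await res.json() as {
      state: string; result: unknown; responses: unknown[];
    };
    if (body.state === 'complete') return body;
    if (body.state === 'error') throw new Error('job failed');
    // Not done yet: wait before the next poll
    await new Promise(r => setTimeout(r, intervalMs));
  }
  throw new Error('job timed out');
}
```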
// src/worker.ts
import { Hono } from 'hono';
import { getJobResult } from './job';

type Env = { JOBS: DurableObjectNamespace; AI: Ai };

const app = new Hono<{ Bindings: Env }>();

app.post('/llm', async (c) => {
  const { prompt, models, strategy } = await c.req.json();
  const jobId = crypto.randomUUID();
  const stub = c.env.JOBS.get(c.env.JOBS.idFromName(jobId));

  // Create job DO, runs models in parallel
  await stub.fetch('https://do/start', {
    method: 'POST',
    body: JSON.stringify({ prompt, models, strategy })
  });

  // Poll for result (or use WebSocket for real-time)
  const result = await getJobResult(stub);
  return c.json(result);
});

export default app;

JOB DURABLE OBJECT
Effect.all runs models in parallel with unbounded concurrency. Merge based on strategy.
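The DO code below uses Effect for the fan-out; the same failure-isolated semantics can be seen in a dependency-free sketch using Promise.allSettled, shown purely for comparison.

```typescript
type ModelCall = () => Promise<{ model: string; output: string }>;

// Run every call concurrently; failed calls are dropped rather than
// failing the whole batch (the behavior the fan-out needs)
async function fanOut(calls: ModelCall[]) {
  const settled = await Promise.allSettled(calls.map(c => c()));
  return settled
    .filter(
      (s): s is PromiseFulfilledResult<{ model: string; output: string }> =>
        s.status === 'fulfilled'
    )
    .map(s => s.value);
}
```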
// src/job.ts (Durable Object)
import { DurableObject } from "cloudflare:workers";
import { Effect, Either } from "effect";

type Env = { AI: Ai; JOBS: DurableObjectNamespace };

export class JobDO extends DurableObject<Env> {
  private state: 'pending' | 'running' | 'complete' | 'error' = 'pending';
  private responses: any[] = [];
  private result: any = null;

  async fetch(req: Request) {
    const url = new URL(req.url);

    if (url.pathname === '/start' && req.method === 'POST') {
      const { prompt, models, strategy } = await req.json() as {
        prompt: string; models: string[]; strategy: string;
      };
      this.state = 'running';

      // Effect.all runs models in parallel; mode "either" isolates
      // failures so one bad model doesn't sink the whole batch
      const program = Effect.all(
        models.map(model =>
          Effect.tryPromise({
            try: () => this.callModel(model, prompt),
            catch: () => new Error(`Model ${model} failed`)
          })
        ),
        { concurrency: 'unbounded', mode: 'either' }
      );

      // Execute with Effect runtime, keep only successful responses
      const outcomes = await Effect.runPromise(program);
      const responses = outcomes.filter(Either.isRight).map(o => o.right);
      if (responses.length === 0) {
        this.state = 'error';
        return Response.json({ success: false }, { status: 502 });
      }
      this.responses = responses;

      // Merge based on strategy
      this.result = strategy === 'consensus'
        ? this.mergeConsensus(responses)
        : this.mergeCooperative(responses);
      this.state = 'complete';
      return Response.json({ success: true });
    }

    if (url.pathname === '/result') {
      return Response.json({
        state: this.state,
        result: this.result,
        responses: this.responses
      });
    }

    return new Response('not found', { status: 404 });
  }

  private async callModel(model: string, prompt: string) {
    const response = await this.env.AI.run(model, {
      messages: [{ role: 'user', content: prompt }]
    });
    return { model, output: response.response };
  }

  private mergeConsensus(responses: any[]) {
    // Confidence scoring, majority voting, etc.
    return {
      result: responses[0].output, // simplified
      structured: {
        confidence: 0.92, // placeholder; derive from agreement in practice
        supportingModels: responses.map(r => r.model)
      }
    };
  }

  private mergeCooperative(responses: any[]) {
    // Simplified: a true cooperative pass would chain calls so each
    // model refines the previous one's output
    return { result: responses[responses.length - 1].output };
  }
}

STRATEGIES
Consensus: parallel merge with confidence scoring. Cooperative: sequential, models build on each other.
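The mergeConsensus stub in the job DO hardcodes its confidence score. One simple way to actually score agreement, sketched here as an exact-match majority vote rather than Ditto's real algorithm:

```typescript
interface ModelResponse { model: string; output: string }

// Group identical (normalized) outputs, pick the largest group,
// and report the agreement ratio as the confidence score
function mergeConsensus(responses: ModelResponse[]) {
  const groups = new Map<string, ModelResponse[]>();
  for (const r of responses) {
    const key = r.output.trim().toLowerCase();
    groups.set(key, [...(groups.get(key) ?? []), r]);
  }
  const winner = [...groups.values()].sort((a, b) => b.length - a.length)[0];
  return {
    result: winner[0].output,
    structured: {
      confidence: winner.length / responses.length,
      supportingModels: winner.map(r => r.model),
    },
  };
}
```

In practice, exact-match grouping only works for short, constrained outputs (classification, yes/no); free-form text would need semantic similarity instead.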
// Strategies

// CONSENSUS: Parallel merge with confidence scoring
const consensus = await ditto({
  prompt: "Is this email urgent?",
  models: ["@cf/meta/llama-3.1-8b-instruct", "@cf/mistral/mistral-7b-instruct"],
  strategy: "consensus"
});
// Returns: merged output + confidence score + supporting models

// COOPERATIVE: Sequential, models build on each other
const cooperative = await ditto({
  prompt: "Extract key points, then summarize",
  models: ["@cf/meta/llama-3.1-8b-instruct", "@cf/mistral/mistral-7b-instruct"],
  strategy: "cooperative"
});
// Returns: final refined output
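The job DO above selects the last parallel response for the cooperative strategy; a genuinely sequential cooperative pass, where each model refines the previous draft, could be sketched like this (callModel stands in for the Workers AI call):

```typescript
type CallModel = (model: string, input: string) => Promise<string>;

// Fold the prompt through the models in order: each model after the
// first sees the original instruction plus the previous draft
async function runCooperative(
  models: string[],
  prompt: string,
  callModel: CallModel
) {
  let draft = '';
  for (const model of models) {
    const input = draft
      ? `${prompt}\n\nRefine this draft:\n${draft}`
      : prompt;
    draft = await callModel(model, input);
  }
  return { result: draft };
}
```

The trade-off versus the parallel design: sequential chaining gives each model full context, but total latency becomes the sum of model latencies rather than the slowest one.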