DITTO: EDGE-NATIVE PARALLEL LLM ORCHESTRATION

Multiple models. One call. Consensus merging.

TECHNICAL GUIDE 2025.11.21

Run multiple AI models in parallel, merge their outputs with consensus, get back typed results. Built on Durable Objects for durability and idempotency.


THE QUICK VERSION

Single model = fragile. Multiple models in parallel = robust + fewer hallucinations. But combining N different API calls = tedious.

Ditto handles the orchestration. You call one function. It runs models in parallel using Effect for true concurrency, merges outputs with consensus or cooperative strategies, tracks individual responses, and returns typed results.

Per-request Durable Object job. Effect-based execution. Unlimited concurrency. All on the edge.

ARCHITECTURE

Client → Worker → JobDO (per request)
           ↓         ↓
        Effect.all  Parallel model calls
           ↓         ↓
        Consensus   Merged result

WHY PARALLEL MODELS?

Single models fail. Multiple models agree = higher confidence.

// Why parallel models?

// Single model = fragile
const single = await ai.run("@cf/meta/llama-3.1-8b-instruct", { prompt });
// One failure = complete failure
// One hallucination = wrong answer

// Multiple models in parallel = robust
const multi = await ditto({
  prompt,
  models: [
    "@cf/meta/llama-3.1-8b-instruct",
    "@cf/mistral/mistral-7b-instruct",
    "@cf/qwen/qwen-2.5-7b-instruct"
  ],
  strategy: "consensus"
});
// Majority agreement = higher confidence
// Disagreement = flag for review
// One failure = others continue

CLIENT

Install ditto-ai, call one function. Get merged results with confidence scores.

// client.ts
import { dittoClient } from "ditto-ai";

const ditto = dittoClient({
  endpoint: "https://your-worker.workers.dev/llm",
});

const response = await ditto({
  prompt: "Summarize this email...",
  models: [
    "@cf/meta/llama-3.1-8b-instruct",
    "@cf/mistral/mistral-7b-instruct"
  ],
  strategy: "consensus",
});

console.log(response.result);      // merged output
console.log(response.structured);  // intent, confidence, supporting models
console.log(response.responses);   // individual model outputs
console.log(response.timings);     // total, fanout, slowest, merge time
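The fields logged above imply a response shape roughly like the following. This is a sketch inferred from the example, not the actual ditto-ai type definitions:

```typescript
// Hypothetical response types inferred from the client example
interface ModelResponse {
  model: string;   // e.g. "@cf/meta/llama-3.1-8b-instruct"
  output: string;  // raw model output
}

interface DittoResponse {
  result: string;                 // merged output
  structured: {
    intent?: string;
    confidence: number;           // 0..1 agreement score
    supportingModels: string[];
  };
  responses: ModelResponse[];     // individual model outputs
  timings: {
    total: number;    // ms end-to-end
    fanout: number;   // ms to dispatch all calls
    slowest: number;  // ms of slowest model
    merge: number;    // ms spent merging
  };
}
```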

INFRASTRUCTURE

One namespace, one Worker; Alchemy wires the bindings.

// alchemy.run.ts
import alchemy from "alchemy";
import { Worker, DurableObjectNamespace } from "alchemy/cloudflare";

const app = await alchemy("ditto");

const jobs = DurableObjectNamespace("job", { className: "JobDO" });

await Worker("api", {
  entrypoint: "./src/worker.ts",
  bindings: { JOBS: jobs, AI: "AI" }
});

await app.finalize();

WORKER

Create per-request job DO. Poll for result or use WebSocket for real-time.

// src/worker.ts
import { Hono } from 'hono';
import { getJobResult } from './job';

// Bindings match the Alchemy config: the DO namespace and Workers AI
type Bindings = { JOBS: DurableObjectNamespace; AI: Ai };

const app = new Hono<{ Bindings: Bindings }>();

app.post('/llm', async (c) => {
  const { prompt, models, strategy } = await c.req.json();
  
  const jobId = crypto.randomUUID();
  const stub = c.env.JOBS.get(c.env.JOBS.idFromName(jobId));
  
  // Create job DO, runs models in parallel
  await stub.fetch('https://do/start', {
    method: 'POST',
    body: JSON.stringify({ prompt, models, strategy })
  });
  
  // Poll for result (or use WebSocket for real-time)
  const result = await getJobResult(stub);
  
  return c.json(result);
});

export default app;
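The `getJobResult` helper isn't shown. A minimal polling implementation, assuming the DO's `/result` route returns `{ state, result, responses }` as in the next section, might look like:

```typescript
// Hypothetical polling helper for the job DO's /result route
async function getJobResult(
  stub: { fetch(url: string): Promise<Response> },
  { intervalMs = 100, timeoutMs = 30_000 } = {}
) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const res = await stub.fetch('https://do/result');
    const body = await res.json() as {
      state: string; result: unknown; responses: unknown[];
    };
    if (body.state === 'complete' || body.state === 'error') return body;
    // Not done yet; wait before the next poll
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('job timed out');
}
```

Since `/start` in this example only responds after the merge completes, the first poll typically succeeds immediately; polling earns its keep once `/start` is made fire-and-forget or replaced with a WebSocket.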

JOB DURABLE OBJECT

Effect.all runs models in parallel with unbounded concurrency. Merge based on strategy.

// src/job.ts (Durable Object)
import { DurableObject } from "cloudflare:workers";
import { Effect } from "effect";

export class JobDO extends DurableObject {
  private state: 'pending' | 'running' | 'complete' | 'error' = 'pending';
  private responses: any[] = [];
  private result: any = null;

  async fetch(req: Request) {
    const url = new URL(req.url);
    
    if (url.pathname === '/start' && req.method === 'POST') {
      const { prompt, models, strategy } = await req.json();
      
      this.state = 'running';

      // Effect.all runs models in parallel; mode "either" collects
      // per-model successes/failures instead of failing the whole batch
      const program = Effect.all(
        models.map((model: string) =>
          Effect.tryPromise({
            try: () => this.callModel(model, prompt),
            catch: () => new Error(`Model ${model} failed`)
          })
        ),
        { concurrency: 'unbounded', mode: 'either' }
      );
      
      // Execute with the Effect runtime, keep only successful responses
      const results = await Effect.runPromise(program);
      this.responses = results
        .filter((r: any) => r._tag === 'Right')
        .map((r: any) => r.right);
      
      // Merge based on strategy
      this.result = strategy === 'consensus' 
        ? this.mergeConsensus(responses)
        : this.mergeCooperative(responses);
      
      this.state = 'complete';
      return Response.json({ success: true });
    }
    
    if (url.pathname === '/result') {
      return Response.json({
        state: this.state,
        result: this.result,
        responses: this.responses
      });
    }
    
    return new Response('not found', { status: 404 });
  }
  
  private async callModel(model: string, prompt: string) {
    const response = await this.env.AI.run(model, {
      messages: [{ role: 'user', content: prompt }]
    });
    return { model, output: response.response };
  }
  
  private mergeConsensus(responses: any[]) {
    // Simplified placeholder: a real merge normalizes outputs,
    // majority-votes, and derives confidence from agreement
    return {
      result: responses[0]?.output,
      structured: {
        confidence: 0.92, // hardcoded for illustration
        supportingModels: responses.map(r => r.model)
      }
    };
  }
  
  private mergeCooperative(responses: any[]) {
    // Simplified: the full cooperative strategy runs models
    // sequentially, each refining the previous output
    return { result: responses[responses.length - 1]?.output };
  }
}
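`mergeConsensus` above is intentionally simplified. A minimal majority-vote version (my sketch, not Ditto's actual scoring) could normalize outputs, group identical answers, and report the largest group with its agreement ratio:

```typescript
interface ModelOutput { model: string; output: string }

// Sketch of consensus merging: normalize, group identical answers,
// and report the largest group with its agreement ratio
function mergeConsensusSketch(responses: ModelOutput[]) {
  const normalize = (s: string) => s.trim().toLowerCase();
  const groups = new Map<string, ModelOutput[]>();
  for (const r of responses) {
    const key = normalize(r.output);
    const group = groups.get(key) ?? [];
    group.push(r);
    groups.set(key, group);
  }
  // Pick the group with the most supporting models
  let winner: ModelOutput[] = [];
  for (const group of groups.values()) {
    if (group.length > winner.length) winner = group;
  }
  return {
    result: winner[0]?.output ?? null,
    structured: {
      confidence: responses.length ? winner.length / responses.length : 0,
      supportingModels: winner.map((r) => r.model),
      // Below-majority agreement can be flagged for human review
      needsReview: winner.length <= responses.length / 2
    }
  };
}
```

Exact-match grouping only works for short, constrained answers ("urgent" / "not urgent"); free-form summaries would need embedding similarity or another fuzzier comparison.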

STRATEGIES

Consensus: parallel merge with confidence scoring. Cooperative: sequential, models build on each other.

// Strategies

// CONSENSUS: Parallel merge with confidence scoring
const consensus = await ditto({
  prompt: "Is this email urgent?",
  models: ["@cf/meta/llama-3.1-8b-instruct", "@cf/mistral/mistral-7b-instruct"],
  strategy: "consensus"
});
// Returns: merged output + confidence score + supporting models

// COOPERATIVE: Sequential, models build on each other
const cooperative = await ditto({
  prompt: "Extract key points, then summarize",
  models: ["@cf/meta/llama-3.1-8b-instruct", "@cf/mistral/mistral-7b-instruct"],
  strategy: "cooperative"
});
// Returns: final refined output
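The cooperative strategy described above is sequential. A sketch of that flow, assuming a `callModel` function like the one in the job DO, would chain each model's output into the next model's prompt:

```typescript
// Sketch of a cooperative run: each model receives the previous
// output and refines it (callModel signature is an assumption)
async function runCooperative(
  models: string[],
  prompt: string,
  callModel: (model: string, prompt: string) => Promise<string>
) {
  let current = prompt;
  for (const model of models) {
    // Feed the running result to the next model in the chain
    current = await callModel(model, `Refine this:\n${current}`);
  }
  return { result: current };
}
```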