GATE-REVIEW: RED-TEAM YOUR TESTS

The tool that emerged from a failed experiment.

TOOL 2026.01.30

Context: This is Part 2 of a series about building agent infrastructure. Part 1 covered deja (agent memory) and discovered that agents writing their own tests produce theater. This tool is the fix.

THE PROBLEM

In Part 1, I asked an agent to build tests for its own code. It wrote gates like:

Gate: "POST /learn returns id"

Test: Check response has { id, status: "stored" }

Looks reasonable. All gates passed. Ship it.

Then I asked: "What bad code would pass this gate?"

Attack:

Return { id: "fake", status: "stored" } without writing anything to the database.

Gate passes. Data lost.

The gate was theater. It verified that the API responded, not that the data was stored.
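To make the failure concrete, here is a minimal sketch of that gate run against both an honest handler and the attack. Everything in it is a stand-in: the in-memory `db`, the `honestLearn` and `fakeLearn` functions, and the `learning` field are assumptions for illustration, not deja's actual code.

```javascript
// Hypothetical in-memory store standing in for the real database.
const db = new Map();

// Honest handler: writes the learning before reporting success.
function honestLearn(body) {
  const id = `l_${db.size + 1}`;
  db.set(id, body);
  return { id, status: "stored" };
}

// The attack: reports success but never touches the store.
function fakeLearn(_body) {
  return { id: "fake", status: "stored" };
}

// The weak gate: only inspects the shape of the response.
function weakGate(learn) {
  const res = learn({ trigger: "deploy", learning: "run migrations first" });
  return typeof res.id === "string" && res.status === "stored";
}

console.log(weakGate(honestLearn)); // true
console.log(weakGate(fakeLearn));   // true: the broken handler passes too
```

Both calls print `true`, so the gate cannot tell working code from code that throws the data away.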

THE INSIGHT

An agent writing its own gates has the same blind spots that cause bugs.

The agent didn't think about persistence verification when coding. It also didn't think about it when testing.

The fix: For each gate, ask "What bad code passes this?"

If you can write broken code that passes, the gate is too weak.
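One way to apply that question mechanically is to run the gate against code you know is broken and require it to fail. The sketch below assumes a hypothetical in-memory API built by `makeApi`, where a `persist` flag switches between honest and fake behavior. `roundTripGate` and `survivesAttack` are illustrative names, not part of any real tool.

```javascript
// Builds a hypothetical in-memory API. `persist` toggles whether learn() really writes.
function makeApi(persist) {
  const db = new Map();
  return {
    learn(body) {
      const id = persist ? `l_${db.size + 1}` : "fake";
      if (persist) db.set(id, body);
      return { id, status: "stored" };
    },
    get(id) {
      return db.has(id) ? db.get(id) : null;
    },
  };
}

// Stronger gate: store, then read back by ID and compare the contents.
function roundTripGate(api) {
  const body = { trigger: "deploy", learning: "run migrations first" };
  const res = api.learn(body);
  if (res.status !== "stored") return false;
  const found = api.get(res.id);
  return (
    found !== null &&
    found.trigger === body.trigger &&
    found.learning === body.learning
  );
}

// A gate survives an attack only if the known-bad code fails it.
function survivesAttack(gate, brokenApi) {
  return gate(brokenApi) === false;
}

console.log(roundTripGate(makeApi(true)));                  // true
console.log(survivesAttack(roundTripGate, makeApi(false))); // true
```

The honest API passes and the fake one is rejected, which is the property the shape-only gate lacked.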

THE TOOL

gate-review automates this adversarial analysis.

# Review a gate by description
$ node gate-review --gate "POST /learn returns id"

── GATE: POST /learn returns id
⚠️ Trusts response without verify
Attack: Return success without writing
Fix: GET by ID and verify round-trip

It reads gate descriptions and generates attacks based on common patterns:

Persistence gates → "Return success without writing"
Shape-checking gates → "Return empty/dummy data"
Validation gates → "Reject everything"
Search gates → "Return empty array always"

REAL EXAMPLE

Running gate-review on deja's 18 gates:

$ node gate-review gates/run-all.mjs
Found 18 gates

── GATE: persistence: store and retrieve by ID
⚠️ Trusts response without verify

── GATE: validation: rejects missing trigger
⚠️ Only checks invalid case
Fix: Also verify valid succeeds

── GATE: semantic: stored learning is findable
⚠️ Doesn't verify search works
Fix: Store data, query, verify found

22 weaknesses found

Even my "real" gates had weaknesses. The tool found them in seconds.
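The validation finding is easy to reproduce in isolation. The sketch below uses hypothetical validators (`honestValidate`, `rejectEverything`) and an assumed `{ ok, error }` result shape. It only illustrates why checking the invalid case alone is not enough.

```javascript
// Hypothetical validator standing in for the input check on /learn.
const honestValidate = (body) =>
  typeof body.trigger === "string" && body.trigger.length > 0
    ? { ok: true }
    : { ok: false, error: "missing trigger" };

// The attack: reject every request, valid or not.
const rejectEverything = (_body) => ({ ok: false, error: "missing trigger" });

// Weak gate: only checks that the invalid case is rejected.
const weakValidationGate = (validate) =>
  validate({ learning: "x" }).ok === false;

// Strengthened gate: invalid input is rejected AND valid input is accepted.
const strongValidationGate = (validate) =>
  validate({ learning: "x" }).ok === false &&
  validate({ trigger: "deploy", learning: "x" }).ok === true;

console.log(weakValidationGate(rejectEverything));   // true: attack passes
console.log(strongValidationGate(rejectEverything)); // false: attack caught
console.log(strongValidationGate(honestValidate));   // true
```

The same pairing applies to search gates: assert that stored data is found, not only that the query returns an array.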

HOW IT WORKS

gate-review is simple: pattern matching on gate names, plus a list of known attack vectors.

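// If gate mentions "persist" or "store"
// (gateName is the gate's description string; the exact check shown here is assumed)
if (/persist|store/i.test(gateName)) {
  attacks.push({
    weakness: 'Trusts response without verify',
    attack: 'Return success without writing',
    fix: 'GET by ID and verify round-trip'
  });
}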

It also stores findings in deja for future reference.

Not AI-powered. Just codified experience from getting burned by weak tests.
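Extending that snippet to several patterns might look like the rule table below. This is a sketch, not the tool's source: the `RULES` array, the regexes, and the `reviewGate` function are assumptions, and only the weakness, attack, and fix strings come from the output shown earlier.

```javascript
// Each rule pairs a name pattern with a canned finding.
// The regexes are guesses at reasonable keywords, not gate-review's actual rules.
const RULES = [
  {
    pattern: /persist|store/i,
    weakness: "Trusts response without verify",
    attack: "Return success without writing",
    fix: "GET by ID and verify round-trip",
  },
  {
    pattern: /reject|valid/i,
    weakness: "Only checks invalid case",
    attack: "Reject everything",
    fix: "Also verify valid succeeds",
  },
  {
    pattern: /search|find/i,
    weakness: "Doesn't verify search works",
    attack: "Return empty array always",
    fix: "Store data, query, verify found",
  },
];

// Returns every finding whose pattern matches the gate name.
function reviewGate(name) {
  return RULES
    .filter((rule) => rule.pattern.test(name))
    .map(({ weakness, attack, fix }) => ({ weakness, attack, fix }));
}

for (const finding of reviewGate("semantic: stored learning is findable")) {
  console.log(`⚠️ ${finding.weakness}\nAttack: ${finding.attack}\nFix: ${finding.fix}`);
}
```

In this sketch a single gate name can match more than one rule, so the number of findings may exceed the number of gates.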

THE WORKFLOW

Before implementing, run gate-review on your gates:

1. Write gates
2. Run gate-review
3. For each weakness: Can I actually write that attack?
4. If yes, strengthen the gate
5. Repeat until attacks fail
6. Now implement

Takes 2 minutes per gate. Saves hours of debugging fake-green tests.

GET IT

GitHub

github.com/acoyfellow/gate-review

Usage

git clone https://github.com/acoyfellow/gate-review

node gate-review/index.mjs --gate "your gate description"

node gate-review/index.mjs path/to/gates.js

THE SERIES

Building agent infrastructure, piece by piece:

Part 1: deja — Memory across sessions
Part 2: gate-review — Adversarial test analysis (you are here)
Part 3: preflight — Slow down and think
Part 4: loop-demo — Dogfooding gateproof