GATE-REVIEW: RED-TEAM YOUR TESTS
The tool that emerged from a failed experiment.
Context: This is Part 2 of a series about building agent infrastructure. Part 1 covered deja (agent memory) and discovered that agents writing their own tests produce theater. This tool is the fix.
THE PROBLEM
In Part 1, I asked an agent to build tests for its own code. It wrote gates like:
Gate: "POST /learn returns id"
Test: Check response has { id, status: "stored" }
Looks reasonable. All gates passed. Ship it.
Then I asked: "What bad code would pass this gate?"
Attack:
Return { id: "fake", status: "stored" } without writing anything to the database.
Gate passes. Data lost.
The gate was theater. It verified the API responded, not that it worked.
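To make the failure concrete, here is a minimal sketch that models the endpoint as a plain function instead of a real HTTP server. The names `db`, `fakeLearn`, and `weakGate`, and the request fields, are hypothetical stand-ins for illustration, not code from deja.

```javascript
// Hypothetical in-memory stand-in for the database behind POST /learn.
const db = new Map();

// The "attack": a handler that reports success but never writes to db.
function fakeLearn(body) {
  return { id: "fake", status: "stored" };
}

// The gate as described above: it only inspects the response shape.
function weakGate(handler) {
  const res = handler({ trigger: "deploy", learning: "run migrations first" });
  return typeof res.id === "string" && res.status === "stored";
}

console.log(weakGate(fakeLearn)); // true: the gate passes
console.log(db.size);             // 0: nothing was stored
```

The gate goes green while the store stays empty, which is exactly the gap the attack exploits.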
THE INSIGHT
An agent writing its own gates has the same blind spots that cause bugs.
The agent didn't think about persistence verification when coding. It also didn't think about it when testing.
The fix: For each gate, ask "What bad code passes this?"
If you can write broken code that passes, the gate is too weak.
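One way to act on that question is to run the gate against a deliberately broken implementation and check that it fails. The sketch below assumes a tiny in-memory API with `learn` and `get` methods. Both implementations and the `roundTripGate` name are made up for illustration.

```javascript
// Hypothetical in-memory store shared by both implementations.
const store = new Map();
let nextId = 1;

// Honest implementation: actually persists what it is given.
const honest = {
  learn(body) {
    const id = String(nextId++);
    store.set(id, body);
    return { id, status: "stored" };
  },
  get(id) { return store.get(id) ?? null; },
};

// Broken implementation: returns success without writing.
const broken = {
  learn() { return { id: "fake", status: "stored" }; },
  get(id) { return store.get(id) ?? null; },
};

// Round-trip gate: write, then read back by ID and compare.
function roundTripGate(api) {
  const body = { trigger: "deploy", learning: "run migrations first" };
  const res = api.learn(body);
  if (res.status !== "stored") return false;
  const saved = api.get(res.id);
  return saved !== null && saved.learning === body.learning;
}

console.log(roundTripGate(honest)); // true
console.log(roundTripGate(broken)); // false
```

If the broken version still passes, the gate needs another assertion before it can be trusted.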
THE TOOL
gate-review automates this adversarial analysis.
# Review a gate by description
$ node gate-review --gate "POST /learn returns id"
── GATE: POST /learn returns id
⚠️ Trusts response without verify
Attack: Return success without writing
Fix: GET by ID and verify round-trip
It reads gate descriptions and generates attacks based on common patterns:
- Persistence gates → "Return success without writing"
- Shape-checking gates → "Return empty/dummy data"
- Validation gates → "Reject everything"
- Search gates → "Return empty array always"
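The mapping above can be sketched as a small rule table. The keyword regexes here are my guesses at what such patterns might look like, and `RULES` and `attacksFor` are hypothetical names. The actual matching in gate-review may differ.

```javascript
// Hypothetical rule table: a regex on the gate description maps to an attack.
const RULES = [
  { match: /persist|store|returns id/i, attack: "Return success without writing" },
  { match: /shape|schema|has field/i, attack: "Return empty/dummy data" },
  { match: /valid|reject/i, attack: "Reject everything" },
  { match: /search|find|query/i, attack: "Return empty array always" },
];

// Collect every attack whose pattern matches the description.
function attacksFor(description) {
  return RULES.filter((r) => r.match.test(description)).map((r) => r.attack);
}

console.log(attacksFor("POST /learn returns id"));
// [ 'Return success without writing' ]
console.log(attacksFor("validation: rejects missing trigger"));
// [ 'Reject everything' ]
```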
REAL EXAMPLE
Running gate-review on deja's 18 gates:
$ node gate-review gates/run-all.mjs
Found 18 gates
── GATE: persistence: store and retrieve by ID
⚠️ Trusts response without verify
── GATE: validation: rejects missing trigger
⚠️ Only checks invalid case
Fix: Also verify valid succeeds
── GATE: semantic: stored learning is findable
⚠️ Doesn't verify search works
Fix: Store data, query, verify found
22 weaknesses found
Even my "real" gates had weaknesses. The tool found them in seconds.
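Here is roughly what the validation fix looks like in practice: check the invalid case and the valid case together, so a "reject everything" implementation cannot pass. The `validate` function and both gate names are hypothetical and not taken from deja's actual gates.

```javascript
// Hypothetical validator under test.
function validate(body) {
  if (!body || typeof body.trigger !== "string" || body.trigger === "") {
    return { ok: false, error: "missing trigger" };
  }
  return { ok: true };
}

// The attack: reject every request, valid or not.
const rejectAll = () => ({ ok: false, error: "missing trigger" });

// Weak gate: only checks the invalid case.
const weakValidationGate = (fn) => fn({ learning: "x" }).ok === false;

// Stronger gate: invalid input is rejected AND valid input is accepted.
const strongValidationGate = (fn) =>
  fn({ learning: "x" }).ok === false &&
  fn({ trigger: "deploy", learning: "x" }).ok === true;

console.log(weakValidationGate(rejectAll));   // true: the attack slips through
console.log(strongValidationGate(rejectAll)); // false: the attack is caught
console.log(strongValidationGate(validate));  // true: real code still passes
```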
HOW IT WORKS
gate-review is simple: it pattern-matches on gate names and looks up known attack vectors.
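// If the gate name mentions "persist" or "store"
// (gateName is an assumed variable name for the gate description)
if (/persist|store/i.test(gateName)) {
  attacks.push({
    weakness: 'Trusts response without verify',
    attack: 'Return success without writing',
    fix: 'GET by ID and verify round-trip'
  });
}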
It also stores findings in deja for future reference.
Not AI-powered. Just codified experience from getting burned by weak tests.
THE WORKFLOW
Before implementing, run gate-review on your gates:
- Write gates
- Run gate-review
- For each weakness: Can I actually write that attack?
- If yes, strengthen the gate
- Repeat until attacks fail
- Now implement
Takes 2 minutes per gate. Saves hours of debugging fake-green tests.
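The loop above can be sketched as a small harness that runs a gate against a set of deliberately broken implementations and lists the ones that survive. Everything here (`survivingAttacks`, the attack factories, both gates) is a hypothetical illustration of the workflow, not part of gate-review.

```javascript
// A gate is a function (makeApi) => boolean.
// Each attack is a factory for a deliberately broken implementation.
function survivingAttacks(gate, attacks) {
  return Object.entries(attacks)
    .filter(([, makeApi]) => gate(makeApi))
    .map(([name]) => name);
}

const attacks = {
  "empty array always": () => ({ store() {}, search: () => [] }),
  "dummy hit always": () => ({ store() {}, search: () => ["dummy"] }),
};

// First draft: only checks that search returns an array.
const draftGate = (makeApi) => Array.isArray(makeApi().search("deploy"));

// Strengthened: store a known item, then require it in the results.
const strongGate = (makeApi) => {
  const api = makeApi();
  api.store("deploy: run migrations first");
  return api.search("deploy").includes("deploy: run migrations first");
};

console.log(survivingAttacks(draftGate, attacks));
// [ 'empty array always', 'dummy hit always' ]
console.log(survivingAttacks(strongGate, attacks));
// []
```

An empty list means every attack I could think of now fails the gate, which is the signal to start implementing.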
GET IT
GitHub
github.com/acoyfellow/gate-review
Usage
git clone https://github.com/acoyfellow/gate-review
node gate-review/index.mjs --gate "your gate description"
node gate-review/index.mjs path/to/gates.js
THE SERIES
Building agent infrastructure, piece by piece:
- Part 1: deja — Memory across sessions
- Part 2: gate-review — Adversarial test analysis (you are here)
- Part 3: preflight — Slow down and think
- Part 4: loop-demo — Dogfooding gateproof