LOOP

I've been thinking about AI coding workflows wrong.

AI WORKFLOWS 2026.01.20

THE OLD MODEL

I write prompts, AI writes code, I review, repeat. I'm the orchestrator. I'm in control.

THE NEW MODEL

I'm not in control. I'm just another iteration. The repo is the thing that persists. I prepare it to be entered, set up guardrails, and observe what happens.

THE PROBLEM

Every AI coding session starts fresh. No memory. No plan. No accountability.

Geoffrey Huntley calls this "Ralph Wiggum mode" — the model is like a well-meaning kid who forgets everything between messages. It has capability but zero continuity.

Most people fight this. They try to maintain context, write longer prompts, keep the session alive.

I stopped fighting it. I leaned into it.

If every message is a fresh start, then the only thing I actually control is what's in the context window when it starts.

That's the lever. That's the whole game.

WHAT LOOP DOES

Initialize:

npx @acoyfellow/loop init

This scaffolds your repo to be "loopable":

AGENTS.md              # how to build/test (north star)
tasks.md               # what to do (north star)
loop.json              # config
.loop/
  progress.md          # what's done (append-only)
  errors.md            # what went wrong (append-only)
  PAUSED               # kill switch
.github/workflows/
  loop.yml             # CI trigger

Then run:

npx @acoyfellow/loop enter    # one iteration
npx @acoyfellow/loop watch    # loop until paused

Each iteration:

1. reads the north star files (AGENTS.md, tasks.md)
2. reads recent progress and errors
3. builds a context window
4. calls the agent (claude, opencode, aider, whatever)
5. checks guardrails
6. records what happened
7. exits

The next iteration sees what the last one did. Errors persist. Progress persists. The repo is the memory.
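
To make that concrete, here's a rough sketch of what a single iteration could look like. This is not the actual @acoyfellow/loop source; the file name, the tail helper, and the way the agent command receives its context are assumptions, but the flow matches the seven steps above:

// iteration.ts -- a hypothetical sketch, not the real @acoyfellow/loop internals
import { readFileSync, existsSync, appendFileSync } from "node:fs";
import { execSync } from "node:child_process";

// Keep the context small: only the tail of the append-only notes matters.
const tail = (path: string, lines = 50) =>
  existsSync(path) ? readFileSync(path, "utf8").split("\n").slice(-lines).join("\n") : "";

export function enter() {
  if (existsSync(".loop/PAUSED")) return;                        // kill switch

  const config = JSON.parse(readFileSync("loop.json", "utf8"));  // { agent, guardrail }

  // 1-3. north star files plus recent memory become the context window
  const context = [
    readFileSync("AGENTS.md", "utf8"),
    readFileSync("tasks.md", "utf8"),
    tail(".loop/progress.md"),
    tail(".loop/errors.md"),
  ].join("\n\n");

  // 4. call the agent (the exact invocation is up to the tool; stubbed here)
  execSync(config.agent, { input: context, stdio: ["pipe", "inherit", "inherit"] });

  // 5-6. guardrail: exit 0 passes, anything else fails and leaves a note
  try {
    execSync(config.guardrail, { stdio: "pipe" });
    appendFileSync(".loop/progress.md", `${new Date().toISOString()} guardrail passed\n`);
  } catch (err: any) {
    appendFileSync(".loop/errors.md", `${new Date().toISOString()}\n${err.stderr}\n`);
  }
  // 7. exit; the next iteration starts from whatever the repo looks like now
}

watch is effectively this in a while loop that keeps going until .loop/PAUSED shows up.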

GUARDRAILS

The default guardrail is simple: something must change. If the agent produces no diff, that's a failure.
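
In code terms, the default check can be as small as this (a sketch, not the tool's actual implementation, assuming git is available):

// did-anything-change.ts -- a sketch of the default guardrail
import { execSync } from "node:child_process";

// --porcelain prints one line per modified or untracked file, nothing if clean
const status = execSync("git status --porcelain", { encoding: "utf8" });
process.exit(status.trim().length > 0 ? 0 : 1);   // no diff means fail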

But you can plug in anything:

{
  "agent": "claude",
  "guardrail": "npm test"
}

Or use gateproof for real observability:

{
  "guardrail": "bun gate.ts"
}

The guardrail is just a command. Exit 0 = pass. Exit 1 = fail. stderr gets appended to errors.md so the next iteration knows what went wrong.
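
Here's what a gate.ts could look like. This is a made-up example, not gateproof's actual API; the only contract it honors is the exit code and stderr:

// gate.ts -- a made-up example guardrail, assuming Bun's shell API
import { $ } from "bun";

const checks: Array<[string, string]> = [
  ["typecheck", "tsc --noEmit"],
  ["tests", "npm test"],
];

for (const [name, cmd] of checks) {
  const result = await $`sh -c ${cmd}`.quiet().nothrow();
  if (result.exitCode !== 0) {
    // stderr is what the next iteration will see in .loop/errors.md
    console.error(`guardrail "${name}" failed:\n${result.stderr.toString()}`);
    process.exit(1);
  }
}
process.exit(0);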

THE INVERSION

Old way:
I write code
Tests validate
I am the harness

New way:
I write tests
Agent fills code
Tests are the harness, I observe

I'm not building features anymore. I'm building checks. The harder part isn't "make the AI write code" — it's "write tests that specify what I actually want."
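
A check can be as plain as a test that encodes the behavior I want before any code exists. Hypothetical example (the pricing module and applyDiscount function are made up; the agent's job is to make this pass):

// pricing.test.ts -- a hypothetical spec for the loop to grind against
import { describe, expect, it } from "bun:test";
import { applyDiscount } from "./pricing";   // does not exist yet; the agent writes it

describe("applyDiscount", () => {
  it("never produces a negative total", () => {
    expect(applyDiscount(10, 0.95)).toBeGreaterThanOrEqual(0);
    expect(applyDiscount(10, 2)).toBe(0);     // over-discounting clamps to zero
  });

  it("rounds to cents", () => {
    expect(applyDiscount(19.99, 0.1)).toBe(17.99);
  });
});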

WHY THIS MATTERS

Everything is a loop.

Debugging is a loop. Learning is a loop. CI/CD is a loop. You wake up, consult yesterday's notes, run one day, sleep, repeat.

We're all just iterations leaving notes for the next version of ourselves.

Once you see it, you can't unsee it. The question stops being "how do I control the AI" and starts being "how do I set up guardrails so the loop doesn't destroy itself."

That's what loop does. It prepares the vessel. You enter, observe, steer.

THE TRAP

Guardrails help, but they don't eliminate risk. There are real costs here.

Context bloat: When context gets bloated, mistakes compound. You need more guardrails and more tokens just to process the mess. Small context windows, small iterations, and good notes left for the next iteration: that's token efficiency, and that's what loop does. Let contexts bloat and you're burning tokens on noise. Current pricing might be subsidized, and these patterns may not stay economically viable for long.

The feeling vs reality: It feels productive. You build things. But that feeling can decouple from value. You can build entire projects without any real check. As long as nobody looks under the hood, you're good. But the moment someone pokes at it, it looks pretty crazy.

Asymmetric costs: It takes you a minute to prompt and wait. But reviewing a pull request takes many times longer. The asymmetry is brutal. Guardrails help you, but they don't help the person reviewing your PR.

When to stop: If you're looping past the point where you'd normally stop, that's not productivity. That's a signal to step away. Guardrails prevent slop, but they don't prevent you from building things you don't need.

Two things are both true: AI agents are amazing. They are also slop machines if you turn off your brain. Loop gives you guardrails. It doesn't give you judgment.

IF THIS IS WRONG

If this approach is wrong, you've only gained skills in normal software engineering.

You get better at building harnesses. You get better at writing tests. You get better at end-to-end self-feedback loops. You get better at code review. You get better at code sandboxing. You get better at self-healing tools.

These are all useful skills. Whether the loop approach works or not, you've practiced the fundamentals. You've learned to think in terms of constraints, verification, and observation. That's never wasted.

TRY IT

npx @acoyfellow/loop init
# edit AGENTS.md and tasks.md
npx @acoyfellow/loop enter

Repo: github.com/acoyfellow/loop

The repo is the rig. Git is the memory. Guardrails are what keep it from spinning forever.

You're not orchestrating tasks. You're orchestrating the conditions under which loops can safely run.