Ralph Wiggum loops with Codex: run iterative agent passes until done


If you’ve been using Claude Code recently, you’ve probably seen people talking about the Ralph Wiggum recursive loop pattern.

This post covers how to run the same idea with OpenAI Codex.

What the Ralph Wiggum loop is

In short, a Ralph loop is an agent run pattern where you repeatedly re-invoke the agent against the same task state until it:

  • finishes (DONE)
  • gets blocked (BLOCKED)
  • reaches a max iteration limit

Digging a bit deeper

The Claude Code Ralph Wiggum plugin describes it as putting a subagent in a recursive Bash loop on a task. The value is not a single giant prompt; the value is the repeated execution cycle.

  • Definition: A bash script loop (while :; do...) that repeatedly feeds an AI agent (like Claude Code) a prompt, ignoring errors, until the goal is achieved.
  • Philosophy: As Geoff Huntley puts it, it is “ignorant, persistent, and optimistic” — named after the Simpsons character because it keeps trying until success.
  • Core Principle: It avoids “context rot” (AI getting dumber the longer a session runs) by restarting the session every cycle and using Git as the memory.
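
The loop shape described above can be sketched in a few lines. This is a minimal illustration, not the plugin's implementation: run_agent is a hypothetical stand-in for whatever agent CLI you invoke, the DONE sentinel file is borrowed from the script later in this post, and the Git side of the pattern is left out.

```shell
#!/usr/bin/env bash
# Scratch directory so the sketch does not touch a real project.
WORK_DIR="$(mktemp -d)"

# Hypothetical placeholder for a real agent invocation.
# It simulates an agent that fails twice, then finishes on the third pass.
run_agent() {
    if [ "$1" -lt 3 ]; then
        return 1
    fi
    touch "$WORK_DIR/DONE"
}

ralph_sketch() {
    local max_iters="$1"
    local i=1
    while [ "$i" -le "$max_iters" ]; do
        # In a real loop each pass would start a fresh agent process.
        # "|| true" is the persistent part: a failed pass does not stop the loop.
        run_agent "$i" || true
        if [ -f "$WORK_DIR/DONE" ]; then
            echo "done after $i"
            return 0
        fi
        i=$((i + 1))
    done
    echo "gave up after $max_iters"
    return 3
}

ralph_sketch 5
```

With the simulated agent above, the call prints "done after 3". Swapping the while condition for `while :` gives the unbounded form quoted in the definition; the bounded form is the one used in the rest of this post.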

Claude Code origin, Codex adaptation

Ralph Wiggum became popular in Claude Code workflows. Codex does not need the same plugin to get a similar execution model: you can reproduce the pattern with codex exec in a bounded shell loop.

NOTE: Codex is excellent at looping internally during long runs. It will often handle multiple small passes and update state before a single invocation exits. This means MAX_ITERS caps the number of times the shell script re-invokes Codex, not the total number of internal passes Codex makes at the task. In practice, Codex often accomplishes more per iteration than Claude Code would in the same pattern.

Codex Ralph loop script

Create scripts/ralph-loop.sh:

#!/usr/bin/env bash
set -euo pipefail

STATE_DIR="${2:-./ralph}"
TASK_FILE="${1:-$STATE_DIR/TASK.md}"
MAX_ITERS="${MAX_ITERS:-20}"

mkdir -p "$STATE_DIR"
[[ -f "$TASK_FILE" ]] || {
    echo "Missing task file: $TASK_FILE" >&2
    exit 1
}

touch "$STATE_DIR/NOTES.md"
: > "$STATE_DIR/RUN_LOG.md"
rm -f "$STATE_DIR/DONE" "$STATE_DIR/BLOCKED"

for ((i = 1; i <= MAX_ITERS; i++)); do
    printf "\n## Iteration %d (%s)\n" "$i" "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" >>"$STATE_DIR/RUN_LOG.md"

    codex -a never exec -C "$(pwd)" -s workspace-write "
You are running one Ralph loop iteration.

Read:
- $TASK_FILE
- $STATE_DIR/NOTES.md

Do exactly one smallest useful step toward completion.
Append what you changed and why to $STATE_DIR/NOTES.md.

If complete, create $STATE_DIR/DONE with a concise summary.
If blocked, create $STATE_DIR/BLOCKED with blocker details and required input.

Before stopping, run only the smallest relevant verification command.
" 2>&1 | tee -a "$STATE_DIR/RUN_LOG.md"

    if [[ -f "$STATE_DIR/DONE" ]]; then
        echo "Done in $i iteration(s)."
        exit 0
    fi

    if [[ -f "$STATE_DIR/BLOCKED" ]]; then
        echo "Blocked at iteration $i."
        exit 2
    fi
done

echo "Hit MAX_ITERS=$MAX_ITERS without DONE/BLOCKED."
exit 3

Make it executable:

chmod +x scripts/ralph-loop.sh

Minimal task file format

Create ralph/TASK.md:

# Goal

Fix flaky Cypress test on /search route.

# Constraints

- Only edit test files and the related component.
- Do not change production behaviour outside this bug.

# Done when

- Failing test is stable for 3 consecutive runs.
- npm run lint passes.

Then run:

MAX_ITERS=12 ./scripts/ralph-loop.sh
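
The script exits with 0 when DONE appears, 2 when BLOCKED appears, and 3 when the cap is hit, so a wrapper can branch on the result. A minimal sketch, assuming the default ralph state directory; describe_ralph_exit is a hypothetical helper name, not part of Codex:

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps the loop script's exit status to a short message.
describe_ralph_exit() {
    case "$1" in
        0) echo "done" ;;
        2) echo "blocked: read ralph/BLOCKED" ;;
        3) echo "iteration cap reached" ;;
        # Anything else, e.g. a missing task file or a failed codex call.
        *) echo "unexpected failure (exit $1)" ;;
    esac
}

# Usage (commented out so the sketch runs without Codex installed):
# status=0
# MAX_ITERS=12 ./scripts/ralph-loop.sh || status=$?
# describe_ralph_exit "$status"
```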

Guardrails that matter

Ralph loops are powerful, but they can run away if unbounded. Each iteration is a full Codex run that consumes tokens, so an unguarded loop can quietly burn through budget.

Always define:

  • max iterations — the MAX_ITERS cap in the script
  • writable scope — enforced here by the -s workspace-write sandbox flag passed to Codex
  • explicit done conditions — a DONE file with a summary
  • explicit blocked conditions — a BLOCKED file with blocker details

If you skip those four, you get drift instead of progress.
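
Two of these guardrails can be checked before the loop starts. The sketch below is one possible preflight, under assumptions: preflight is a hypothetical function name, and the "# Done when" heading is simply the convention from the task file shown earlier, not something Codex requires.

```shell
#!/usr/bin/env bash
# Hypothetical preflight: refuse to start without a sane iteration cap
# and an explicit done condition in the task file.
preflight() {
    local task_file="$1"
    local max_iters="$2"

    # Reject empty values, non-digits, and anything starting with 0.
    case "$max_iters" in
        '' | *[!0-9]* | 0*)
            echo "MAX_ITERS must be a positive integer, got: $max_iters" >&2
            return 1
            ;;
    esac

    # The heading name is an assumption taken from the example TASK.md.
    if ! grep -q '^# Done when' "$task_file" 2>/dev/null; then
        echo "Task file has no '# Done when' section: $task_file" >&2
        return 1
    fi

    echo "preflight ok"
}
```

Calling `preflight "$TASK_FILE" "$MAX_ITERS" || exit 1` near the top of the loop script would stop a run that has no cap or no stated finish line.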

Practical improvements

After the basic loop works, add:

  1. git diff --stat snapshot after each iteration
  2. targeted test command in the task file
  3. final full validation (npm run test, npm run lint, npm run build) before merge

This keeps the rapid loop, but still enforces production-grade checks before shipping.
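
A rough sketch of the first and third improvements follows. Both function names are hypothetical, and the validation commands are passed in as arguments so nothing here assumes an npm project.

```shell
#!/usr/bin/env bash
# Appends a `git diff --stat` snapshot for one iteration to a log file.
# Note: git diff --stat only reports changes to tracked files.
snapshot_iteration() {
    local iter="$1"
    local log_file="$2"
    {
        printf '\n### Diff after iteration %s\n' "$iter"
        git diff --stat
    } >>"$log_file"
}

# Runs each validation command in order and stops at the first failure.
final_validation() {
    local cmd
    for cmd in "$@"; do
        echo "validate: $cmd"
        if ! bash -c "$cmd"; then
            echo "validation failed: $cmd" >&2
            return 1
        fi
    done
    echo "validation passed"
}
```

In the loop script, `snapshot_iteration "$i" "$STATE_DIR/RUN_LOG.md"` could run right after the codex call, and `final_validation "npm run test" "npm run lint" "npm run build"` could run once DONE appears and before you merge.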

Why not just let Codex run longer?

Codex’s built-in long-running mode is good for single-pass work where the agent can hold context for the full task. The Ralph loop pattern wins when:

  • the task is large enough that context rot becomes a real risk
  • you want a clean git diff after every iteration for auditability
  • you need hard caps on both cost and iteration count

If your task fits in a single Codex session, skip the loop. If it doesn’t, the loop gives you control.
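
MAX_ITERS caps the number of invocations, not spend. One rough proxy for a cost cap is a wall-clock budget checked between iterations. This is an assumption layered on top of the script, not a Codex feature, and budget_exceeded is a hypothetical helper:

```shell
#!/usr/bin/env bash
# Succeeds (returns 0) once the elapsed time reaches the budget.
# Arguments: start time and budget in seconds; the optional third
# argument overrides "now" and defaults to the current epoch time.
budget_exceeded() {
    local started_at="$1"
    local budget_secs="$2"
    local now="${3:-$(date +%s)}"
    [ $((now - started_at)) -ge "$budget_secs" ]
}

# Usage inside the loop (commented out; 1800 is an arbitrary example):
# started_at="$(date +%s)"
# ...
# if budget_exceeded "$started_at" 1800; then
#     echo "Time budget exhausted." >&2
#     exit 3
# fi
```

Elapsed time only approximates token spend, so treat this as a backstop alongside the iteration cap rather than a replacement for it.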

When to use Ralph loops in Codex

Use this pattern when work is:

  • multistep but deterministic
  • bounded to a known repo area
  • easy to validate with commands

Do not use it for open-ended design exploration where success criteria are unclear.

The key idea is straightforward: keep each agent pass small, persist state in files, and loop until there is objective evidence of done.

Beyond Ralph: OpenClaw and structured agentic loops

OpenClaw takes the basic Ralph Wiggum principle of persistent iteration and adds multi-agent coordination on top.

  • Multi-Agent: OpenClaw orchestrates multiple agents rather than looping a single one. Agents can validate each other’s outputs, reducing hallucination risk.
  • Self-Reinforcing Loop: A “heartbeat” mechanism wakes agents on a schedule to scrape recent state and act on it, creating a self-reinforcing dialogue rather than a linear pass.
  • Emergent Behaviour: This coordination sometimes produces unexpected results — the community’s “Crustafarian” crab cult meme originated from agents recursively riffing on each other’s outputs in an OpenClaw loop.

The trade-off is complexity: OpenClaw’s multi-agent coordination introduces real security and cost risks that a simple bash loop avoids. You need to trust (and audit) the agent interaction surface carefully.

Both Anthropic and OpenAI continue to push how long an agent can work autonomously while maintaining quality, so scaffolding your own loop may only be necessary for a short while. METR (Model Evaluation & Threat Research) tracks this progress if you want to see where the frontier is.

Agent Skills are also worth watching as a way to extend core agent functionality without building full orchestration.
