PAPER-2026-002

Ralph vs Gastown: Comparing Agent Orchestration Patterns

Two approaches to autonomous agent orchestration: Ralph spawns fresh contexts per iteration for sequential overnight work; Gastown coordinates persistent parallel sessions via tmux. Both achieve reliable outcomes through different architectural choices.

Architecture - 10 min read - Intermediate

Abstract

Autonomous AI agents require orchestration patterns that survive crashes, context limits, and interruptions. This paper compares two production-validated approaches: Ralph, which spawns fresh Claude Code instances per iteration to work through PRD stories sequentially, and Gastown, which coordinates multiple persistent sessions via tmux for parallel execution. We analyze when each pattern excels, their cost implications, and how they complement each other. Key finding: Ralph is the default for autonomous work (simpler, cheaper, overnight-friendly); Gastown is reserved for true parallelism needs (3+ independent features simultaneously).

Sequential: Ralph execution
Parallel: Gastown execution
Fresh: Ralph context per iteration
Persistent: Gastown sessions

1. The Orchestration Problem

AI agents face inherent limitations: context windows fill, sessions crash, network connections drop. Any orchestration system must answer: how does work survive these failures?

Both Ralph and Gastown solve this through nondeterministic idempotence: the path varies, but the destination is certain. Different iterations, same outcome. Crashes happen; work still completes.

The key question is not "which is better" but "which fits your situation."

Challenge	Ralph Solution	Gastown Solution
Context overflow	Fresh context per iteration	Persistent sessions with handoff
Session crashes	Restart resumes at the first incomplete story	Beads state survives; worker respawns
Progress tracking	prd.json + progress.txt	Beads + hooks + checkpoints
Human oversight	Check results in the morning	Monitor via tmux sessions

2. Ralph: Fresh Context Per Iteration

Ralph is a bash script that spawns fresh Claude Code instances to work through user stories defined in a PRD (Product Requirements Document). Each iteration gets a clean context window.

2.1 Architecture

ralph.sh
├── Read prd.json
├── Find story where passes == false
├── Spawn fresh Claude Code instance
├── Claude implements story, commits, updates prd.json
├── Log to progress.txt
├── If all stories pass → done
└── Repeat
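The loop above can be sketched in Python for illustration. This is a minimal sketch, not the actual ralph.sh: the function names are hypothetical, and `run_iteration` is an assumed stand-in for spawning a fresh Claude Code instance, which is expected to update prd.json itself when a story passes.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def next_story(prd: dict) -> Optional[dict]:
    """Return the first story whose `passes` flag is false, or None if all pass."""
    for story in prd["stories"]:
        if not story.get("passes", False):
            return story
    return None


def ralph_loop(prd_path: Path,
               run_iteration: Callable[[dict], None],
               max_iterations: int = 10) -> int:
    """Run up to max_iterations iterations; return how many were used."""
    for used in range(max_iterations):
        # Re-read state from disk every time: nothing is carried in memory.
        prd = json.loads(prd_path.read_text())
        story = next_story(prd)
        if story is None:
            return used  # all stories pass, so the loop is done
        # Hypothetical hook: a fresh agent implements the story, commits,
        # and flips `passes` to true in prd.json.
        run_iteration(story)
    return max_iterations
```

Because the only input to each iteration is the file on disk, the loop holds no state of its own, which is what makes a restart safe.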

2.2 Key Insight: Context Pollution Prevention

When Claude works on story 1, it accumulates implementation details. By story 5, the context window is cluttered with irrelevant code from earlier stories. Fresh context per iteration means each story gets Claude's full attention.

What Works

Overnight autonomous work
Sequential story dependencies
Simple setup (bash script + JSON)
Predictable costs per story

Limitations

Sequential only (no parallelism)
Context lost between iterations
No live monitoring
Stories must be self-contained

2.3 PRD Format

The PRD is Claude's task board. Stories are marked passes: false until complete:

{
  "title": "User Authentication",
  "stories": [
    {
      "id": "story-1",
      "title": "Add login form",
      "acceptance": [
        "Form renders at /login",
        "Email validation works",
        "Tests pass"
      ],
      "files": ["src/routes/login/+page.svelte"],
      "passes": false
    }
  ]
}
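A malformed story can stall an overnight run, so it may help to check the PRD before starting. The sketch below assumes every story must carry the five fields shown in the example above; that requirement is our assumption, not a documented schema, and `validate_story` is a hypothetical helper.

```python
from typing import List

# Assumed required fields, taken from the example PRD above.
REQUIRED_FIELDS = ("id", "title", "acceptance", "files", "passes")


def validate_story(story: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the story looks usable."""
    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in story]
    # `passes` drives story selection, so it must be a real boolean.
    if not isinstance(story.get("passes", False), bool):
        problems.append("passes must be true or false")
    # A story without acceptance criteria gives the agent no definition of done.
    if "acceptance" in story and not story["acceptance"]:
        problems.append("acceptance must list at least one criterion")
    return problems
```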

3. Gastown: Persistent Parallel Sessions

Gastown coordinates multiple Claude Code instances via tmux. Each worker maintains a persistent session, enabling true parallel execution with session recovery.

3.1 Architecture

WezTerm
└── tmux (session persistence)
    ├── gt-coordinator    (you + Claude Code)
    ├── gt-witness-csm    (monitors per rig)
    ├── gt-refinery-csm   (merge queue per rig)
    ├── gt-steward        (background daemon)
    └── gt-worker-N       (ephemeral workers)

3.2 Key Insight: GUPP (Get Up and Push Protocol)

Workers check their hook on startup. If work exists, they start immediately. No asking permission. This enables autonomous execution while maintaining coordination through Beads issue tracking.
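The startup rule can be sketched as follows. This is an illustration only: `read_hook` and `start_work` are hypothetical callables standing in for the Beads hook lookup and the worker's execution step, neither of which is specified here.

```python
from typing import Callable, Optional


def worker_startup(read_hook: Callable[[], Optional[str]],
                   start_work: Callable[[str], None]) -> str:
    """Check the hook once at startup and begin immediately if work is assigned."""
    issue_id = read_hook()  # assumed to return an issue id, or None if the hook is empty
    if issue_id is None:
        return "idle"       # nothing assigned: wait for the coordinator
    start_work(issue_id)    # no confirmation step: assigned work starts right away
    return "working"
```

The absence of a confirmation branch is the point: the only decision a worker makes at startup is whether its hook is empty.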

What Works

3+ independent features in parallel
3x speedup for parallelizable work
Live monitoring via tmux
Persistent context across restarts

Limitations

Complex setup (tmux, Beads, multiple sessions)
Higher cost (parallel API calls)
Merge conflicts between workers
Requires monitoring

3.3 Convoy Pattern

A convoy batches related issues for parallel execution:

# Create convoy with three issues
gt convoy create "Auth feature" cs-login cs-session cs-middleware

# Assign to workers (run in parallel)
gt-smart-sling cs-login csm
gt-smart-sling cs-session csm
gt-smart-sling cs-middleware csm

# 90 min sequential → 30 min parallel
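The 90 to 30 minute figure follows from simple scheduling arithmetic if each of the three issues takes about 30 minutes, which is our assumption. The sketch below estimates wall-clock time for any number of workers with a greedy longest-first assignment; it ignores merge and coordination overhead, and `parallel_minutes` is a hypothetical helper.

```python
import heapq
from typing import List


def parallel_minutes(durations: List[int], workers: int) -> int:
    """Greedy estimate of wall-clock minutes when issues are spread over `workers`."""
    if workers < 1:
        raise ValueError("need at least one worker")
    loads = [0] * workers  # minutes of work assigned to each worker so far
    heapq.heapify(loads)
    for minutes in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)  # the least-loaded worker takes the next issue
        heapq.heappush(loads, lightest + minutes)
    return max(loads)                    # the busiest worker sets the finish time


issues = [30, 30, 30]                    # assumed duration of each convoy issue
print(parallel_minutes(issues, 1))       # sequential: one worker does everything
print(parallel_minutes(issues, 3))       # one worker per issue
```

With two workers the same convoy would take an estimated 60 minutes, so the speedup depends on having as many workers as independent issues.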

4. Key Differences

Dimension	Ralph	Gastown
Execution	Sequential (one story at a time)	Parallel (multiple workers)
Context	Fresh per iteration	Persistent per session
State Storage	prd.json + progress.txt	Beads + hooks + molecules
Setup Complexity	Low (bash script)	High (tmux + Beads + roles)
Monitoring	Check progress.txt later	Live via tmux sessions
Cost Model	Predictable per story	Higher (parallel calls)
Best For	Overnight autonomous work	Parallel independent features

4.1 Context Model

The fundamental difference is how each handles context:

Ralph: Stateless Iterations

Each iteration spawns a fresh Claude Code instance. The only state is prd.json (which stories are done) and git history. No memory pollution between stories.

Gastown: Stateful Sessions

Workers maintain persistent sessions. Context accumulates within a session and survives restarts via Beads, which makes this model better suited to complex coordinated work.

4.2 Failure Recovery

Both achieve nondeterministic idempotence through different mechanisms:

Ralph: If iteration 5 crashes, restart the script. It reads prd.json, finds the next incomplete story, and continues. Previous work persists in git.
Gastown: If worker-3 crashes, the Steward respawns it. The worker checks its hook (Beads), sees the assigned issue, and continues. State persists in Beads (Git-synced).
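Both recovery paths rely on the same property: completion is recorded only after a unit of work succeeds, so a restart repeats at most the unit that was in flight. The simulation below illustrates this with an in-memory dict standing in for prd.json or Beads; all names are hypothetical.

```python
from typing import Callable, Dict, List


def run_once(state: Dict[str, bool], order: List[str],
             implement: Callable[[str], None]) -> None:
    """One run: work through incomplete items in order. May raise part-way through."""
    for item_id in order:
        if state[item_id]:
            continue            # finished in an earlier run, never repeated
        implement(item_id)      # a crash here leaves state[item_id] False
        state[item_id] = True   # recorded only after the item succeeds


def run_with_restarts(state: Dict[str, bool], order: List[str],
                      implement: Callable[[str], None],
                      max_restarts: int = 5) -> int:
    """Restart after each simulated crash; return the number of restarts needed."""
    for restarts in range(max_restarts + 1):
        try:
            run_once(state, order, implement)
            return restarts
        except RuntimeError:
            continue            # equivalent to re-running the script or respawning the worker
    raise RuntimeError("gave up after max_restarts")
```

Whichever item the crash interrupts, the final state is the same: every item marked complete, and no completed item executed twice.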

5. Cost Comparison

5.1 Ralph Cost Model

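Ralph costs are predictable because they scale with the number of iterations. The estimates below work out to roughly $0.30 per iteration, with the iteration budget set somewhat above the story count: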

Iterations	Estimated Cost	Use Case
5	~$1.50	Small feature (3-4 stories)
10	~$3.00	Medium feature (6-8 stories)
20	~$6.00	Large feature (12-15 stories)
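The table is linear in the iteration count, so a budget can be estimated before a run. The sketch below assumes a flat rate of 30 cents per iteration, derived from the table (about $1.50 for 5 iterations); actual cost will vary with story size and model, and `estimate_cost_usd` is a hypothetical helper.

```python
# Assumed flat rate derived from the table above: ~$1.50 / 5 iterations.
CENTS_PER_ITERATION = 30


def estimate_cost_usd(iterations: int) -> str:
    """Format an approximate run cost, using integer cents to avoid float rounding."""
    cents = iterations * CENTS_PER_ITERATION
    return f"~${cents // 100}.{cents % 100:02d}"


for n in (5, 10, 20):
    print(n, estimate_cost_usd(n))
```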

5.2 Gastown Cost Model

Gastown costs depend on parallelism and model routing:

Approach	Models	Cost	Time
Sequential Sonnet	4x Sonnet	$0.04	120 min
Parallel Sonnet	4x Sonnet	$0.04	30 min
Haiku Swarm	1x Sonnet + 4x Haiku + 1x Opus	$0.114	30 min

5.3 When Gastown's Higher Cost is Justified

Time-critical work: a 3-4x speedup (90 → 30 min in the convoy example, 120 → 30 min in the table above) can justify 2-3x the cost when a deadline is the constraint
True parallelism: 4 independent features done in 30 min vs 120 min
Complex coordination: Workers can communicate via mail protocol

Rule of thumb: Ralph is cheaper for sequential work. Gastown costs more but delivers faster for parallel work. Choose based on whether time or cost is the constraint.

6. Decision Matrix

Use this matrix to choose the right tool:

Scenario	Tool	Why
Overnight feature development	Ralph	Simpler, cheaper, no monitoring needed
Sequential stories with dependencies	Ralph	Fresh context prevents pollution
Quick test-fix loop (same session)	Ralph (/ralph-loop)	Single-session refinement
3+ independent features simultaneously	Gastown	True parallelism, 3x speedup
Need live progress monitoring	Gastown	tmux sessions visible
Complex multi-step orchestration	Gastown	Molecules, convoys, merge queue
Worker needs to coordinate with others	Gastown	Mail protocol between agents

"Default to Ralph. Reserve Gastown for when you explicitly need parallelism."

- CREATE SOMETHING Orchestration Philosophy

7. How They Complement Each Other

Ralph and Gastown are not mutually exclusive. They can be combined for sophisticated orchestration patterns.

7.1 Ralph for Worker Self-Rescue

When a Gastown worker gets stuck, it can use the Ralph pattern (specifically /ralph-loop) to iterate on a fix before escalating:

# Gastown worker hits an error
# Before sending HELP message, try self-rescue:

/ralph-loop "
  Fix failing tests in auth module.
  Test output: [paste errors]
  Output <promise>TESTS_PASS</promise> when green.
" --max-iterations 10

# If Ralph loop succeeds → continue
# If Ralph loop fails → gt mail send HELP
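The control flow of self-rescue is a bounded retry followed by escalation. The sketch below shows that shape only; `attempt_fix` and `tests_pass` are hypothetical callables, and it does not model how /ralph-loop itself detects the promise tag.

```python
from typing import Callable


def self_rescue(attempt_fix: Callable[[], None],
                tests_pass: Callable[[], bool],
                max_iterations: int = 10) -> str:
    """Try a fix up to max_iterations times, then hand off to a human or coordinator."""
    for _ in range(max_iterations):
        attempt_fix()
        if tests_pass():
            return "continue"  # rescue succeeded: the worker carries on
    return "escalate"          # rescue failed: send the HELP message
```

The iteration cap matters as much as the retry: without it, a stuck worker would keep spending on a problem it cannot solve.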

7.2 Gastown for Ralph Parallelization

For very large features, use Gastown to run multiple Ralph instances in parallel:

# Split large feature into independent PRDs
# auth-prd.json, dashboard-prd.json, api-prd.json

# Run each in a Gastown worker
gt sling worker-1 "cd /project && ./ralph.sh --prd auth-prd.json"
gt sling worker-2 "cd /project && ./ralph.sh --prd dashboard-prd.json"
gt sling worker-3 "cd /project && ./ralph.sh --prd api-prd.json"

# Refinery merges when all complete
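The fan-out and the "merge when all complete" gate can be illustrated with a thread pool. This is a sketch of the coordination shape only, not of how `gt sling` or the Refinery work: `run_prd` is a hypothetical callable standing in for one worker running ralph.sh against one PRD and reporting whether every story passed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def run_prds_in_parallel(prd_files: List[str],
                         run_prd: Callable[[str], bool]) -> Dict[str, bool]:
    """Run one hypothetical Ralph instance per PRD concurrently; map each PRD to its result."""
    # One thread per PRD; max() guards against an empty list, which the pool rejects.
    with ThreadPoolExecutor(max_workers=max(1, len(prd_files))) as pool:
        results = list(pool.map(run_prd, prd_files))  # map preserves input order
    return dict(zip(prd_files, results))


def ready_to_merge(results: Dict[str, bool]) -> bool:
    """Merge only when every PRD finished with all stories passing."""
    return bool(results) and all(results.values())
```

This arrangement only pays off when the PRDs touch disjoint parts of the codebase; otherwise the merge step inherits the conflicts.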

7.3 Integration with Harness

Both patterns integrate with the Harness quality gate system:

Ralph stories can include acceptance criteria that trigger Harness reviews (security, architecture, quality)
Gastown workers run through Harness baseline checks before starting, with Ralph self-rescue if baseline fails

8. Practical Recommendations

8.1 Start with Ralph

Create a PRD using /prd-to-ralph or manually
Run ./ralph.sh --max-iterations 10
Go to sleep or have dinner
Check progress.txt and git log in the morning

8.2 Graduate to Gastown When

You have 3+ truly independent features to implement
Time pressure justifies the complexity overhead
You're comfortable with tmux and Beads
You have budget for parallel API calls

8.3 Common Mistakes

Mistake	Why It Hurts	Fix
Using Gastown for sequential work	Complexity without benefit	Use Ralph
Ralph stories too big	Can't complete in one iteration	Break into atomic stories
Vague acceptance criteria	Claude marks done prematurely	Specific, testable criteria
Parallel work with dependencies	Merge conflicts, coordination failures	Sequential for dependent work

9. Conclusion

Ralph and Gastown represent two valid approaches to agent orchestration, optimized for different scenarios:

Ralph: Simple, Sequential, Overnight

Fresh context per iteration prevents pollution. Ideal for overnight autonomous work on sequential stories. Lower cost, simpler setup.

Gastown: Complex, Parallel, Monitored

Persistent sessions enable parallelism. Ideal for 3+ independent features with time pressure. Higher cost, more infrastructure.

Both achieve nondeterministic idempotence: work survives crashes, context limits, and interruptions. The difference is in execution model (sequential vs parallel) and context model (fresh vs persistent).

Key takeaway: Default to Ralph for autonomous work. Reserve Gastown for when you explicitly need parallelism. Both patterns complement each other and integrate with the broader Harness quality system.

The infrastructure disappears; only the work remains.

How to Apply This

If you're starting with agent orchestration:

Start with Ralph. Create a PRD, run the script, check results later.
Write specific, testable acceptance criteria for each story.
Keep stories atomic (one context window each).
Graduate to Gastown only when you have true parallelism needs.

If you're building orchestration infrastructure:

Support both patterns. They serve different use cases.
Fresh context (Ralph) prevents pollution; persistent context (Gastown) enables coordination.
Build integration points: Ralph self-rescue in Gastown workers.
Track costs per pattern to inform routing decisions.

If you're choosing between approaches:

Ask: is this truly parallelizable? If no, use Ralph.
Ask: do I need live monitoring? If no, use Ralph.
Ask: is time pressure worth the complexity? Only then consider Gastown.
When in doubt, default to Ralph.
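Read literally, the three questions form a chain in which any "no" falls back to Ralph. The sketch below encodes that one reading; it is a simplification (the decision matrix treats live monitoring alone as a reason to pick Gastown), and `choose_tool` is a hypothetical helper.

```python
def choose_tool(parallelizable: bool,
                needs_live_monitoring: bool,
                time_pressure_worth_complexity: bool) -> str:
    """Apply the three questions in order; default to Ralph on any 'no'."""
    if not parallelizable:
        return "Ralph"
    if not needs_live_monitoring:
        return "Ralph"
    if not time_pressure_worth_complexity:
        return "Ralph"
    return "Gastown"  # only reached when all three answers are yes
```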

Related Research

The Autonomous Harness - Agent orchestration with human agency through progress reports

Haiku Optimization - Intelligent model routing for cost-effective orchestration

The Norvig Partnership - Human-AI collaboration achieving 20x productivity gains