PAPER-2026-002

Ralph vs Gastown: Comparing Agent Orchestration Patterns

Two approaches to autonomous agent orchestration: Ralph spawns fresh contexts per iteration for sequential overnight work; Gastown coordinates persistent parallel sessions via tmux. Both achieve reliable outcomes through different architectural choices.

Architecture - 10 min read - Intermediate

Abstract

Autonomous AI agents require orchestration patterns that survive crashes, context limits, and interruptions. This paper compares two production-validated approaches: Ralph, which spawns fresh Claude Code instances per iteration to work through PRD stories sequentially, and Gastown, which coordinates multiple persistent sessions via tmux for parallel execution. We analyze when each pattern excels, their cost implications, and how they complement each other. Key finding: Ralph is the default for autonomous work (simpler, cheaper, overnight-friendly); Gastown is reserved for true parallelism needs (3+ independent features simultaneously).

Sequential: Ralph execution
Parallel: Gastown execution
Fresh: Ralph context per iteration
Persistent: Gastown sessions

1. The Orchestration Problem

AI agents face inherent limitations: context windows fill, sessions crash, network connections drop. Any orchestration system must answer: how does work survive these failures?

Both Ralph and Gastown solve this through nondeterministic idempotence: the path varies, but the destination is certain. Different iterations, same outcome. Crashes happen; work still completes.

The key question is not "which is better" but "which fits your situation."

| Challenge | Ralph Solution | Gastown Solution |
| --- | --- | --- |
| Context overflow | Fresh context per iteration | Persistent sessions with handoff |
| Session crashes | Restart picks up from next story | Beads state survives; worker respawns |
| Progress tracking | prd.json + progress.txt | Beads + hooks + checkpoints |
| Human oversight | Check results in the morning | Monitor via tmux sessions |

2. Ralph: Fresh Context Per Iteration

Ralph is a bash script that spawns fresh Claude Code instances to work through user stories defined in a PRD (Product Requirements Document). Each iteration gets a clean context window.

2.1 Architecture

ralph.sh
├── Read prd.json
├── Find story where passes == false
├── Spawn fresh Claude Code instance
├── Claude implements story, commits, updates prd.json
├── Log to progress.txt
├── If all stories pass → done
└── Repeat
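The loop above can be sketched in Python to make the control flow concrete. This is a minimal illustration, not the actual ralph.sh: `run_iteration` is a hypothetical callable standing in for the fresh Claude Code instance, and the log line format is an assumption.

```python
import json
from pathlib import Path
from typing import Callable, Optional


def next_story(prd: dict) -> Optional[dict]:
    """Return the first story whose `passes` flag is false, or None."""
    for story in prd["stories"]:
        if not story["passes"]:
            return story
    return None


def ralph_loop(prd_path: Path, progress_path: Path,
               run_iteration: Callable[[Path, dict], None],
               max_iterations: int = 10) -> bool:
    """Run up to max_iterations; return True once every story passes."""
    for iteration in range(1, max_iterations + 1):
        # State is re-read from disk every time; nothing is kept in memory.
        prd = json.loads(prd_path.read_text())
        story = next_story(prd)
        if story is None:
            return True  # all stories pass
        # Hypothetical stand-in: in Ralph this is a fresh Claude Code
        # instance that implements the story, commits, and updates prd.json.
        run_iteration(prd_path, story)
        with progress_path.open("a") as log:
            log.write(f"iteration {iteration}: {story['id']}\n")
    return next_story(json.loads(prd_path.read_text())) is None
```

Because the only inputs are the files on disk, calling `ralph_loop` again after an interruption resumes from the first story still marked `passes: false`.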

2.2 Key Insight: Context Pollution Prevention

In a single long-running session, Claude accumulates implementation details as it works on story 1, and by story 5 the context window is cluttered with code from earlier stories that no longer matters. Giving each iteration a fresh context means each story gets Claude's full attention.

What Works

  • Overnight autonomous work
  • Sequential story dependencies
  • Simple setup (bash script + JSON)
  • Predictable costs per story

Limitations

  • Sequential only (no parallelism)
  • Context lost between iterations
  • No live monitoring
  • Stories must be self-contained

2.3 PRD Format

The PRD is Claude's task board. Stories are marked passes: false until complete:

{
  "title": "User Authentication",
  "stories": [
    {
      "id": "story-1",
      "title": "Add login form",
      "acceptance": [
        "Form renders at /login",
        "Email validation works",
        "Tests pass"
      ],
      "files": ["src/routes/login/+page.svelte"],
      "passes": false
    }
  ]
}
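Since the loop depends entirely on this file, it can help to check a PRD before an overnight run. The sketch below is one possible check; which fields count as required is an assumption (the source does not specify a schema), and `validate_prd` is a hypothetical helper name.

```python
from typing import List

# Assumed required fields, taken from the example PRD above.
REQUIRED_FIELDS = ("id", "title", "acceptance", "passes")


def validate_prd(prd: dict) -> List[str]:
    """Return a list of problems; an empty list means the PRD looks usable."""
    problems = []
    for index, story in enumerate(prd.get("stories", [])):
        label = story.get("id", f"stories[{index}]")
        for field in REQUIRED_FIELDS:
            if field not in story:
                problems.append(f"{label}: missing '{field}'")
        # An empty acceptance list gives the agent nothing to verify against.
        if "acceptance" in story and not story["acceptance"]:
            problems.append(f"{label}: empty acceptance list")
        # The loop relies on `passes` being a real JSON boolean.
        if "passes" in story and not isinstance(story["passes"], bool):
            problems.append(f"{label}: 'passes' must be a boolean")
    return problems
```

Running this before starting the script catches a story with no acceptance criteria or a `passes` value typed as a string.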

3. Gastown: Persistent Parallel Sessions

Gastown coordinates multiple Claude Code instances via tmux. Each worker maintains a persistent session, enabling true parallel execution with session recovery.

3.1 Architecture

WezTerm
└── tmux (session persistence)
    ├── gt-coordinator    (you + Claude Code)
    ├── gt-witness-csm    (monitors per rig)
    ├── gt-refinery-csm   (merge queue per rig)
    ├── gt-steward        (background daemon)
    └── gt-worker-N       (ephemeral workers)

3.2 Key Insight: GUPP (Get Up and Push Protocol)

Workers check their hook on startup. If work exists, they start immediately. No asking permission. This enables autonomous execution while maintaining coordination through Beads issue tracking.
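The startup behavior can be sketched as a small function. How a hook is actually stored and read is not described here, so `read_hook` and `start_work` are hypothetical callables, and the returned status strings are illustrative only.

```python
from typing import Callable, Optional


def worker_startup(read_hook: Callable[[], Optional[str]],
                   start_work: Callable[[str], None]) -> str:
    """Check the hook once at startup and begin work without prompting."""
    issue_id = read_hook()   # hypothetical: returns the assigned issue, if any
    if issue_id is None:
        return "idle"        # nothing assigned, so there is nothing to push
    start_work(issue_id)     # no confirmation step: assigned work starts now
    return "working"
```

The point of the pattern is the missing branch: there is no "ask the coordinator" path between finding work and starting it.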

What Works

  • 3+ independent features in parallel
  • About 3x speedup when work splits across three parallel workers
  • Live monitoring via tmux
  • Persistent context across restarts

Limitations

  • Complex setup (tmux, Beads, multiple sessions)
  • Higher cost (parallel API calls)
  • Merge conflicts between workers
  • Requires monitoring

3.3 Convoy Pattern

A convoy batches related issues for parallel execution:

# Create convoy with three issues
gt convoy create "Auth feature" cs-login cs-session cs-middleware

# Assign to workers (run in parallel)
gt-smart-sling cs-login csm
gt-smart-sling cs-session csm
gt-smart-sling cs-middleware csm

# 90 min sequential → 30 min parallel
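The arithmetic behind that last comment: sequential time is the sum of the issue durations, while parallel time is the longest single issue. The sketch assumes each of the three issues takes 30 minutes (implied by 90 → 30), that a worker is free for every issue, and that merge time is negligible.

```python
from typing import List, Tuple


def convoy_timing(durations_min: List[int]) -> Tuple[int, int, float]:
    """Return (sequential minutes, parallel minutes, speedup)."""
    sequential = sum(durations_min)  # one worker handles issues back to back
    parallel = max(durations_min)    # one worker per issue; slowest one decides
    return sequential, parallel, sequential / parallel


# Assumed per-issue durations; the source gives only the 90 and 30 min totals.
issue_minutes = {"cs-login": 30, "cs-session": 30, "cs-middleware": 30}
sequential, parallel, speedup = convoy_timing(list(issue_minutes.values()))
print(f"{sequential} min sequential, {parallel} min parallel, {speedup:.0f}x speedup")
```

If the issues are uneven, the speedup falls below the worker count because the longest issue sets the finish time.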

4. Key Differences

| Dimension | Ralph | Gastown |
| --- | --- | --- |
| Execution | Sequential (one story at a time) | Parallel (multiple workers) |
| Context | Fresh per iteration | Persistent per session |
| State Storage | prd.json + progress.txt | Beads + hooks + molecules |
| Setup Complexity | Low (bash script) | High (tmux + Beads + roles) |
| Monitoring | Check progress.txt later | Live via tmux sessions |
| Cost Model | Predictable per story | Higher (parallel calls) |
| Best For | Overnight autonomous work | Parallel independent features |

4.1 Context Model

The fundamental difference is how each handles context:

Ralph: Stateless Iterations

Each iteration spawns a fresh Claude Code instance. The only state is prd.json (which stories are done) and git history. No memory pollution between stories.

Gastown: Stateful Sessions

Workers maintain persistent sessions. Context accumulates within a session, and the work state survives restarts via Beads. Better for complex coordinated work.

4.2 Failure Recovery

Both achieve nondeterministic idempotence through different mechanisms:

  • Ralph: If iteration 5 crashes, restart the script. It reads prd.json, finds the next incomplete story, and continues. Previous work persists in git.
  • Gastown: If worker-3 crashes, the Steward respawns it. The worker checks its hook (Beads), sees the assigned issue, and continues. State persists in Beads (Git-synced).
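Both recovery paths rest on the same idea: completion is recorded only after a unit of work succeeds, so a restart skips finished work and retries the rest. The simulation below is an in-memory sketch of that idea, not either tool's implementation; the `Crash` exception and `flaky_implementer` helper are hypothetical.

```python
class Crash(Exception):
    """Stands in for a dropped session or a killed process."""


def run_until_done(state: dict, implement) -> None:
    """Work through unfinished stories; state changes only after success."""
    for story_id, done in state.items():
        if not done:
            implement(story_id)      # may raise Crash
            state[story_id] = True   # recorded only once the work succeeded


def flaky_implementer(crash_on: str):
    """Build an implementer that crashes once on the given story."""
    completed = []
    crashed = False

    def implement(story_id: str) -> None:
        nonlocal crashed
        if story_id == crash_on and not crashed:
            crashed = True
            raise Crash(story_id)
        completed.append(story_id)

    return implement, completed


state = {"story-1": False, "story-2": False, "story-3": False}
implement, completed = flaky_implementer(crash_on="story-2")
restarts = 0
while not all(state.values()):
    try:
        run_until_done(state, implement)
    except Crash:
        restarts += 1  # equivalent to "restart the script"
```

The run takes a different path than a crash-free one (it includes a restart), yet it ends in the same state with each story completed exactly once.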

5. Cost Comparison

5.1 Ralph Cost Model

Ralph costs scale linearly with the iteration count, which makes them predictable per story:

| Iterations | Estimated Cost | Use Case |
| --- | --- | --- |
| 5 | ~$1.50 | Small feature (3-4 stories) |
| 10 | ~$3.00 | Medium feature (6-8 stories) |
| 20 | ~$6.00 | Large feature (12-15 stories) |
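Every row of the table works out to the same rate of about $0.30 per iteration, so a rough budget for any run is a single multiplication. The sketch below derives that rate from the table; treating it as constant for other iteration counts is an assumption, and `estimate_cost` is a hypothetical helper.

```python
# Iterations -> estimated cost in dollars, copied from the table above.
TABLE = {5: 1.50, 10: 3.00, 20: 6.00}

# Cost per iteration implied by each row.
per_iteration = {iterations: cost / iterations for iterations, cost in TABLE.items()}


def estimate_cost(iterations: int, rate: float = 0.30) -> float:
    """Estimate a run's cost, assuming the table's flat per-iteration rate."""
    return round(iterations * rate, 2)


print(per_iteration)
print(estimate_cost(15))
```

Note that each row budgets more iterations than stories, so the iteration count, not the story count, is the number to multiply.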

5.2 Gastown Cost Model

Gastown costs depend on parallelism and model routing:

| Approach | Models | Cost | Time |
| --- | --- | --- | --- |
| Sequential Sonnet | 4x Sonnet | $0.04 | 120 min |
| Parallel Sonnet | 4x Sonnet | $0.04 | 30 min |
| Haiku Swarm | 1x Sonnet + 4x Haiku + 1x Opus | $0.114 | 30 min |
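To compare the rows directly, each can be expressed relative to the sequential baseline. The sketch below uses only the figures from the table; `relative_to` is a hypothetical helper.

```python
from typing import Dict, Tuple

# Approach -> (cost in dollars, wall-clock minutes), copied from the table.
ROWS: Dict[str, Tuple[float, int]] = {
    "Sequential Sonnet": (0.04, 120),
    "Parallel Sonnet": (0.04, 30),
    "Haiku Swarm": (0.114, 30),
}


def relative_to(rows: Dict[str, Tuple[float, int]],
                baseline: str) -> Dict[str, Tuple[float, float]]:
    """Return approach -> (cost ratio, speedup) against the baseline row."""
    base_cost, base_minutes = rows[baseline]
    return {
        name: (cost / base_cost, base_minutes / minutes)
        for name, (cost, minutes) in rows.items()
    }


ratios = relative_to(ROWS, "Sequential Sonnet")
for name, (cost_ratio, speedup) in ratios.items():
    print(f"{name}: {cost_ratio:.2f}x cost, {speedup:.0f}x faster")
```

On these figures, both parallel rows finish four times faster than the baseline; Parallel Sonnet does so at the same cost, and only the Haiku Swarm row costs more (a little under 3x).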

5.3 When Gastown's Higher Cost is Justified

  • Time-critical work: a 3-4x speedup can justify 2-3x the cost when a deadline is the constraint
  • True parallelism: 4 independent features done in 30 min instead of 120 min
  • Complex coordination: Workers can communicate via mail protocol

Rule of thumb: Ralph is cheaper for sequential work. Gastown costs more but delivers faster for parallel work. Choose based on whether time or cost is the constraint.

6. Decision Matrix

Use this matrix to choose the right tool:

| Scenario | Tool | Why |
| --- | --- | --- |
| Overnight feature development | Ralph | Simpler, cheaper, no monitoring needed |
| Sequential stories with dependencies | Ralph | Fresh context prevents pollution |
| Quick test-fix loop (same session) | Ralph (/ralph-loop) | Single-session refinement |
| 3+ independent features simultaneously | Gastown | True parallelism, 3x speedup |
| Need live progress monitoring | Gastown | tmux sessions visible |
| Complex multi-step orchestration | Gastown | Molecules, convoys, merge queue |
| Worker needs to coordinate with others | Gastown | Mail protocol between agents |

"Default to Ralph. Reserve Gastown for when you explicitly need parallelism."

- CREATE SOMETHING Orchestration Philosophy

7. How They Complement Each Other

Ralph and Gastown are not mutually exclusive. They can be combined for sophisticated orchestration patterns.

7.1 Ralph for Worker Self-Rescue

When a Gastown worker gets stuck, it can use the Ralph pattern (specifically /ralph-loop) to iterate on a fix before escalating:

# Gastown worker hits an error
# Before sending HELP message, try self-rescue:

/ralph-loop "
  Fix failing tests in auth module.
  Test output: [paste errors]
  Output <promise>TESTS_PASS</promise> when green.
" --max-iterations 10

# If Ralph loop succeeds → continue
# If Ralph loop fails → gt mail send HELP
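The control flow of self-rescue is a bounded retry followed by escalation. The sketch below shows only that shape; `try_fix` and `escalate` are hypothetical callables standing in for one loop iteration and for the HELP message, respectively.

```python
from typing import Callable


def self_rescue(try_fix: Callable[[int], bool],
                escalate: Callable[[], None],
                max_iterations: int = 10) -> bool:
    """Retry a fix up to max_iterations times, then escalate once."""
    for attempt in range(1, max_iterations + 1):
        if try_fix(attempt):  # hypothetical: True means the tests are green
            return True       # rescued; the worker continues its issue
    escalate()                # hypothetical: stands in for sending HELP
    return False
```

The iteration cap matters: without it, a worker stuck on an unfixable failure would keep spending API calls instead of asking for help.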

7.2 Gastown for Ralph Parallelization

For very large features, use Gastown to run multiple Ralph instances in parallel:

# Split large feature into independent PRDs
# auth-prd.json, dashboard-prd.json, api-prd.json

# Run each in a Gastown worker
gt sling worker-1 "cd /project && ./ralph.sh --prd auth-prd.json"
gt sling worker-2 "cd /project && ./ralph.sh --prd dashboard-prd.json"
gt sling worker-3 "cd /project && ./ralph.sh --prd api-prd.json"

# Refinery merges when all complete

7.3 Integration with Harness

Both patterns integrate with the Harness quality gate system:

  • Ralph stories can include acceptance criteria that trigger Harness reviews (security, architecture, quality)
  • Gastown workers run through Harness baseline checks before starting, with Ralph self-rescue if baseline fails

8. Practical Recommendations

8.1 Start with Ralph

  1. Create a PRD using /prd-to-ralph or manually
  2. Run ./ralph.sh --max-iterations 10
  3. Go to sleep or have dinner
  4. Check progress.txt and git log in the morning

8.2 Graduate to Gastown When

  • You have 3+ truly independent features to implement
  • Time pressure justifies the complexity overhead
  • You're comfortable with tmux and Beads
  • You have budget for parallel API calls

8.3 Common Mistakes

| Mistake | Why It Hurts | Fix |
| --- | --- | --- |
| Using Gastown for sequential work | Complexity without benefit | Use Ralph |
| Ralph stories too big | Can't complete in one iteration | Break into atomic stories |
| Vague acceptance criteria | Claude marks done prematurely | Specific, testable criteria |
| Parallel work with dependencies | Merge conflicts, coordination failures | Sequential for dependent work |
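One way to catch the last mistake early is to compare the `files` lists of PRDs before running them in parallel. The sketch below assumes those lists are complete, which the PRD format does not guarantee; `shared_files` is a hypothetical helper, and the file paths other than the login page are made up for illustration.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def shared_files(prds: Dict[str, dict]) -> Dict[Tuple[str, str], List[str]]:
    """Map each pair of PRD names to the files both declare they will touch."""
    touched = {
        name: {path for story in prd["stories"] for path in story.get("files", [])}
        for name, prd in prds.items()
    }
    overlaps = {}
    for first, second in combinations(sorted(touched), 2):
        common = touched[first] & touched[second]
        if common:
            overlaps[(first, second)] = sorted(common)
    return overlaps


# Illustrative PRDs; only the login page path appears in the example PRD.
prds = {
    "auth-prd.json": {"stories": [{"id": "story-1", "files": [
        "src/routes/login/+page.svelte", "src/lib/session.ts"]}]},
    "api-prd.json": {"stories": [{"id": "story-1", "files": ["src/lib/session.ts"]}]},
    "dashboard-prd.json": {"stories": [{"id": "story-1", "files": [
        "src/routes/dashboard/+page.svelte"]}]},
}
conflicts = shared_files(prds)
print(conflicts)
```

Any pair that shows up in the result is a candidate for sequential execution rather than parallel workers.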

9. Conclusion

Ralph and Gastown represent two valid approaches to agent orchestration, optimized for different scenarios:

Ralph: Simple, Sequential, Overnight

Fresh context per iteration prevents pollution. Ideal for overnight autonomous work on sequential stories. Lower cost, simpler setup.

Gastown: Complex, Parallel, Monitored

Persistent sessions enable parallelism. Ideal for 3+ independent features with time pressure. Higher cost, more infrastructure.

Both achieve nondeterministic idempotence: work survives crashes, context limits, and interruptions. The difference is in execution model (sequential vs parallel) and context model (fresh vs persistent).

Key takeaway: Default to Ralph for autonomous work. Reserve Gastown for when you explicitly need parallelism. Both patterns complement each other and integrate with the broader Harness quality system.

The infrastructure disappears; only the work remains.

How to Apply This

If you're starting with agent orchestration:

  1. Start with Ralph. Create a PRD, run the script, check results later.
  2. Write specific, testable acceptance criteria for each story.
  3. Keep stories atomic (one context window each).
  4. Graduate to Gastown only when you have true parallelism needs.

If you're building orchestration infrastructure:

  1. Support both patterns. They serve different use cases.
  2. Fresh context (Ralph) prevents pollution; persistent context (Gastown) enables coordination.
  3. Build integration points: Ralph self-rescue in Gastown workers.
  4. Track costs per pattern to inform routing decisions.

If you're choosing between approaches:

  1. Ask: is this truly parallelizable? If no, use Ralph.
  2. Ask: do I need live monitoring? If no, use Ralph.
  3. Ask: is time pressure worth the complexity? Only then consider Gastown.
  4. When in doubt, default to Ralph.
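The checklist above reads as three successive gates, each of which falls back to Ralph. A minimal sketch of that reading follows; the function and parameter names are hypothetical, and real decisions will involve more judgment than three booleans.

```python
def choose_tool(parallelizable: bool,
                needs_live_monitoring: bool,
                time_pressure_worth_complexity: bool) -> str:
    """Apply the three questions in order; any 'no' defaults to Ralph."""
    if not parallelizable:
        return "Ralph"
    if not needs_live_monitoring:
        return "Ralph"
    if not time_pressure_worth_complexity:
        return "Ralph"
    return "Gastown"  # all three gates passed: consider Gastown


print(choose_tool(True, True, False))
```

Only one of the eight possible answer combinations leads to Gastown, which matches the advice to default to Ralph when in doubt.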

Related Research

The Autonomous Harness - Agent orchestration with human agency through progress reports

Haiku Optimization - Intelligent model routing for cost-effective orchestration

The Norvig Partnership - Human-AI collaboration achieving 20x productivity gains