This is the triage guide. Match your symptom, apply the fix, then
re-check radioactive_ralph doctor to confirm.
Stale heartbeat¶
Symptom¶
radioactive_ralph status returns a response but with stale data, or
says the service is running but run --variant ... hangs waiting for
the control plane.
What it means¶
The service process crashed without cleaning its control-plane socket and heartbeat file. The socket / named pipe still exists on disk, so clients connect successfully — but nothing is on the other end.
Fix¶
radioactive_ralph service clean
radioactive_ralph service start # or launchctl/systemctl/SCM restart
Verify:
radioactive_ralph status --json
Response should have a fresh started_at timestamp.
Dead socket or pipe¶
Symptom (Unix)¶
error: ipc: dial /path/to/control.sock: connection refused
Symptom (Windows)¶
error: ipc: dial \\.\pipe\radioactive-ralph-<slug>: pipe does not exist
Fix¶
Either the service never started or the OS cleaned the endpoint.
# 1. Is the service actually running?
pgrep -f "radioactive_ralph service" # Unix
Get-Process | Where-Object { $_.ProcessName -like "*radioactive*" } # Windows
# 2a. If yes: the endpoint is stale — clean + restart
radioactive_ralph service clean
radioactive_ralph service start
# 2b. If no: just start it
radioactive_ralph service start
On Windows, named pipes are killed by reboot even with an installed service. That’s normal — the service wrapper recreates the pipe on start. Don’t panic.
Provider CLI missing or unauthenticated¶
Symptom¶
doctor: [FAIL] claude — claude binary not on PATH
or
service: worker failed to spawn: claude -p: exit 1 (sign in required)
Fix¶
See Provider auth for the full flow. The short version:
# Install if missing
curl -fsSL https://anthropic.com/install.sh | sh
# Authenticate
claude
# (first run prompts you to sign in at console.anthropic.com)
# Verify
radioactive_ralph doctor
If you’re not using claude/codex/gemini, remove the variant’s
provider = ... line from config.toml instead of installing a
provider you won’t use.
Service-install errors¶
Symptom (macOS)¶
launchctl bootstrap failed: 5: Input/output error
5: I/O error from launchctl usually means the plist is
syntactically-valid but references a binary launchctl can’t execute
(permissions or SIP).
Fix (macOS)¶
# 1. Verify the plist exists and the binary path is correct
cat ~/Library/LaunchAgents/com.jbcom.radioactive-ralph.<slug>.plist
# 2. Verify the binary is executable
ls -l $(which radioactive_ralph)
# 3. Re-install
radioactive_ralph service uninstall
radioactive_ralph service install
Symptom (Linux)¶
systemctl --user: Failed to connect to bus: No such file or directory
Fix (Linux)¶
systemd --user requires a user session bus. Under SSH on a headless
box:
export XDG_RUNTIME_DIR=/run/user/$UID
loginctl enable-linger $USER # keep the user bus alive across sessions
Then re-run service install.
Symptom (Windows)¶
service install failed: access denied
Fix (Windows)¶
Registering an SCM service requires admin privileges. Open PowerShell
as administrator and re-run radioactive_ralph service install.
Worker stuck in running¶
Symptom¶
radioactive_ralph plan tasks <plan> --status running shows a task
claimed a long time ago with no progress, no events in
plan history, and the variant subprocess is gone.
What it means¶
The runtime’s reaper hasn’t swept it yet. The task has a claimed-by session that’s dead; the reaper releases claims on sessions whose heartbeat is stale.
Fix¶
# Force-requeue the stuck task (resets claim, back to ready)
radioactive_ralph plan requeue <plan> <task>
If the same task keeps getting stuck:
# Mark it failed and look at the run history
radioactive_ralph plan fail <plan> <task>
radioactive_ralph plan history <plan> <task>
plandag schema mismatch¶
Symptom¶
error: plandag: open: schema version 5 is newer than this binary's supported 4
What it means¶
You downgraded the binary. The plandag SQLite file has a newer user_version than the running binary knows how to read.
Fix¶
Re-install the newer binary (recommended)
Or nuke the state and re-init (if you don’t care about plan history):
rm -rf $XDG_STATE_HOME/radioactive-ralph/<repo-hash>/plans.db radioactive_ralph init
Fixit advisor writes a fallback plan¶
Symptom¶
ralph: fixit emitted a fallback plan — operator intervention required
What it means¶
Stage 4 (Claude analysis) or Stage 5 (validation) failed twice. Fixit
wrote a diagnostic plan (status: fallback) instead of guessing.
Fix¶
Open .radioactive-ralph/plans/<topic>-advisor.md and read the
“Methodology” section — it includes the raw Claude output and the
validation errors. Typical causes:
Claude returned non-JSON (rare with the opus tier; usually a timeout)
Operator constraints conflict (e.g.
--descriptionforbids all variants)Repo state makes every variant disqualified (e.g. operator is on a release branch)
Adjust the inputs and re-run run --variant fixit --advise.
When everything else fails¶
Capture state and open an issue:
radioactive_ralph doctor --verbose > /tmp/ralph-doctor.txt
radioactive_ralph status --json > /tmp/ralph-status.json
journalctl --user -u radioactive-ralph-<slug> --since -1h > /tmp/ralph-journal.txt
(Or the platform equivalent — macOS Console, Windows Event Viewer.)
File at https://github.com/jbcom/radioactive-ralph/issues with those three files attached.