fix(plugin): make mailbox-update skill robust to common failure modes
All checks were successful
CI (Node) / build-test (push) Successful in 9s
All checks were successful
CI (Node) / build-test (push) Successful in 9s
The previous skill assumed a happy-path environment and silently broke on real installs. Hardened the flow: - Always pass --@kuns:registry on every npm call, so the upgrade survives an unreadable user .npmrc (e.g. roaming HOME on a network share that npm can't access). - Treat /health as the ground truth, not `claude-mailbox status`. Status only reflects the autostart wrapper and silently lies when the inner process crashed at boot. - Capture state before changing anything (CLI version, autostart state, port, daemon /health version) so failures can roll back to "do nothing — old daemon keeps running". - Detect the "daemon reachable but autostart NotInstalled" case (manual foreground serve) and refuse to swap binaries from under it. - After restart, poll /health for up to 10s and verify the live daemon's version matches LATEST; on timeout, dump the platform- appropriate daemon log tail instead of just reporting "Stopped". - Reorder: install first, then stop+start. Downtime is paid only after the new binary is verified on disk.
This commit is contained in:
@@ -5,52 +5,106 @@ allowed-tools: Bash
|
|||||||
|
|
||||||
You are running the **Claude-Mailbox update** command. Update the `@kuns/claude-mailbox` npm package and restart the daemon. Never run `sudo` automatically — if elevation is needed, stop and ask.
|
You are running the **Claude-Mailbox update** command. Update the `@kuns/claude-mailbox` npm package and restart the daemon. Never run `sudo` automatically — if elevation is needed, stop and ask.
|
||||||
|
|
||||||
## Step 1 — current version
|
Throughout, treat the **daemon's `/health` endpoint** as the ground truth for "is the daemon running and on what version", not `claude-mailbox status` (which only reflects the autostart wrapper). Always use the registry override flag `--@kuns:registry=https://git.kuns.dev/api/packages/releases/npm/` on every `npm` call, so the upgrade works even when the user's `.npmrc` is unreachable (e.g. roaming `$HOME` on a network share).
|
||||||
|
|
||||||
Run: `claude-mailbox --version`
|
## Step 1 — current state (must run before anything is changed)
|
||||||
|
|
||||||
- Exit 0 → record the version string as `CURRENT`.
|
Run, in order, and remember each result:
|
||||||
- Non-zero → tell the user the daemon binary is not installed yet. Suggest `/claude-mailbox:mailbox-doctor` to do a full setup and stop.
|
|
||||||
|
1. `claude-mailbox --version`
|
||||||
|
- Exit 0 → `CURRENT_CLI = <stdout trimmed>`.
|
||||||
|
- Non-zero → stop. The binary isn't installed; suggest `/claude-mailbox:mailbox-doctor` and exit.
|
||||||
|
2. `claude-mailbox status`
|
||||||
|
- Record as `AUTOSTART_STATE ∈ { Running, Stopped, NotInstalled }`.
|
||||||
|
3. Read the configured port. Try `~/.claude-mailbox/mailbox.json`; if absent or no `port` field, use **37849**. Call this `PORT`.
|
||||||
|
4. Probe `curl -sf -m 2 http://127.0.0.1:$PORT/health`.
|
||||||
|
- On success, parse JSON → `CURRENT_HEALTH_VERSION = .version`. Set `DAEMON_REACHABLE = true`.
|
||||||
|
- On failure → `CURRENT_HEALTH_VERSION = null`, `DAEMON_REACHABLE = false`.
|
||||||
|
|
||||||
|
Note any inconsistencies (e.g. `AUTOSTART_STATE = NotInstalled` but `DAEMON_REACHABLE = true` means a manually-started foreground daemon) — they affect Step 5.
|
||||||
|
|
||||||
## Step 2 — latest published version
|
## Step 2 — latest published version
|
||||||
|
|
||||||
Run: `npm view @kuns/claude-mailbox version`
|
Run: `npm view --@kuns:registry=https://git.kuns.dev/api/packages/releases/npm/ @kuns/claude-mailbox version`
|
||||||
|
|
||||||
If the npm registry config is missing, the call may fail with a 404. Fall back to:
|
If that fails for any reason (network, registry), fall back to:
|
||||||
|
|
||||||
```
|
```
|
||||||
npm view --registry=https://git.kuns.dev/api/packages/releases/npm/ @kuns/claude-mailbox version
|
npm view --registry=https://git.kuns.dev/api/packages/releases/npm/ @kuns/claude-mailbox version
|
||||||
```
|
```
|
||||||
|
|
||||||
Record the result as `LATEST`.
|
Record the result as `LATEST`. If both calls fail, stop and report the network/registry error.
|
||||||
|
|
||||||
## Step 3 — compare
|
## Step 3 — compare
|
||||||
|
|
||||||
- If `CURRENT === LATEST`: print "Already up to date (vX.Y.Z)." and stop. Do not run any further steps.
|
- `CURRENT_CLI === LATEST` AND `CURRENT_HEALTH_VERSION === LATEST` (or `DAEMON_REACHABLE = false`) → print "Already up to date (vLATEST)." and stop.
|
||||||
- Otherwise: tell the user `CURRENT` → `LATEST` and ask for confirmation before proceeding.
|
- `CURRENT_CLI === LATEST` but `CURRENT_HEALTH_VERSION !== LATEST` → the CLI is fresh but the running daemon is the old binary. Tell the user "Binary already at LATEST but the running daemon is still v$CURRENT_HEALTH_VERSION — restart needed." Then jump to Step 5 (no npm install).
|
||||||
|
- Otherwise → tell the user `CURRENT_CLI` → `LATEST` and ask for confirmation before proceeding.
|
||||||
|
|
||||||
## Step 4 — perform the update
|
Also warn before confirmation if `AUTOSTART_STATE = NotInstalled` AND `DAEMON_REACHABLE = false`:
|
||||||
|
|
||||||
On user confirmation, run these in order. Stop on the first failure and report it:
|
> Heads-up: autostart is not installed and no daemon is reachable on port $PORT. After the upgrade I can install the new binary, but I won't be able to start the daemon automatically — you'd need `/claude-mailbox:mailbox-doctor` to wire up autostart, or run `claude-mailbox serve` manually. Proceed anyway?
|
||||||
|
|
||||||
1. `claude-mailbox stop`
|
## Step 4 — install the new package
|
||||||
2. `npm install -g @kuns/claude-mailbox@latest`
|
|
||||||
- On Linux/macOS this may fail with EACCES. **Do not run sudo automatically.** Ask the user how they want to proceed (e.g., `sudo npm install -g …`, or switch to a user-scoped Node setup with nvm/fnm).
|
|
||||||
3. `claude-mailbox start`
|
|
||||||
4. `claude-mailbox --version` to verify the upgrade landed.
|
|
||||||
5. `claude-mailbox status` to verify the daemon is `Running`.
|
|
||||||
|
|
||||||
## Step 5 — summary
|
On user confirmation:
|
||||||
|
|
||||||
|
```
|
||||||
|
npm install -g @kuns/claude-mailbox@latest --@kuns:registry=https://git.kuns.dev/api/packages/releases/npm/
|
||||||
|
```
|
||||||
|
|
||||||
|
- The scope override is mandatory — do not omit it, even if `npm config get @kuns:registry` looks right. It costs nothing and protects against unreachable user-level `.npmrc`.
|
||||||
|
- On Linux/macOS the install may fail with EACCES. **Do not run sudo automatically.** Stop and ask how the user wants to proceed (e.g. `sudo`, switch to nvm/fnm).
|
||||||
|
- On any other failure, stop and report. Do **not** touch the daemon — leaving the old daemon running is the safe rollback.
|
||||||
|
|
||||||
|
After install, run `claude-mailbox --version` and confirm it now reports `LATEST`. If not (PATH shadowing, stale `which`), stop and report — the daemon is still on the old version, which is fine to keep running.
|
||||||
|
|
||||||
|
## Step 5 — restart the daemon
|
||||||
|
|
||||||
|
Now swap the daemon over to the new binary.
|
||||||
|
|
||||||
|
1. Stop the existing daemon if anything is running:
|
||||||
|
- If `AUTOSTART_STATE = Running` → `claude-mailbox stop` and wait up to 5s for `/health` on `PORT` to start failing (poll once per second).
|
||||||
|
- If `DAEMON_REACHABLE = true` but `AUTOSTART_STATE = NotInstalled` → a foreground/manual daemon is running. Tell the user:
|
||||||
|
> A daemon is reachable on port $PORT but autostart is not installed, so I can't stop it. Stop the manual `claude-mailbox serve` process yourself, then re-run this command to finish the restart.
|
||||||
|
Then stop here.
|
||||||
|
- Otherwise nothing to stop.
|
||||||
|
2. Start the daemon, picking the path that matches `AUTOSTART_STATE`:
|
||||||
|
- `Running` or `Stopped` (i.e. autostart is installed) → `claude-mailbox start`.
|
||||||
|
- `NotInstalled` → skip the start. After the loop below times out, tell the user to run `/claude-mailbox:mailbox-doctor` to install autostart, then exit.
|
||||||
|
3. Poll `curl -sf -m 1 http://127.0.0.1:$PORT/health` up to **10 times with 1s sleeps**. Stop polling as soon as one returns JSON with `"status":"ok"`.
|
||||||
|
4. Outcome:
|
||||||
|
- Health came up AND `version === LATEST` → ✓ proceed to Step 6.
|
||||||
|
- Health came up but `version !== LATEST` → the wrapper started an *old* binary somewhere on PATH. Dump `which claude-mailbox` / `where claude-mailbox` and stop with that info.
|
||||||
|
- Health did not come up → dump the most recent daemon log to help diagnose. Try, in order, the first one that exists:
|
||||||
|
- Windows Scheduled Task / fallback: `%LOCALAPPDATA%\ClaudeMailbox\logs\daemon.log` (tail the last 40 lines)
|
||||||
|
- Windows Service variant: `Get-WinEvent -ProviderName ClaudeMailbox -MaxEvents 20` via `powershell -NoProfile -Command "..."`
|
||||||
|
- macOS launchd: `~/Library/Logs/ClaudeMailbox/daemon.log` (last 40 lines)
|
||||||
|
- Linux systemd-user: `journalctl --user -u claude-mailbox -n 40 --no-pager`
|
||||||
|
If none exist or all are empty, just say "No daemon logs found; try `claude-mailbox serve` in a terminal to see the error directly."
|
||||||
|
Stop with that diagnostic; do not pretend the update succeeded.
|
||||||
|
|
||||||
|
## Step 6 — summary
|
||||||
|
|
||||||
Print exactly this block:
|
Print exactly this block:
|
||||||
|
|
||||||
```
|
```
|
||||||
Claude-Mailbox update
|
Claude-Mailbox update
|
||||||
previous version: <CURRENT>
|
previous version: <CURRENT_CLI>
|
||||||
new version: <whatever --version now reports>
|
new version: <claude-mailbox --version output>
|
||||||
daemon: Running | Stopped | NotInstalled
|
daemon health: ok (v<version from /health>) | unreachable
|
||||||
pending messages survived: <count of messages still in inbox via `claude-mailbox list`, if applicable>
|
daemon autostart: Running | Stopped | NotInstalled
|
||||||
|
pending messages: <total pending across all mailboxes — sum of pendingForYou from `claude-mailbox list`>
|
||||||
```
|
```
|
||||||
|
|
||||||
If the new version matches `LATEST` and daemon is `Running`, end with: "Update complete."
|
End with one of:
|
||||||
Otherwise, end with the first thing that went wrong.
|
|
||||||
|
- New CLI version matches `LATEST` AND `/health` returns `version === LATEST` → **"Update complete."**
|
||||||
|
- Anything else → **"Update incomplete: <the first concrete failure from Step 4 or 5>."**
|
||||||
|
|
||||||
|
## Operating notes
|
||||||
|
|
||||||
|
- **Always use the scope override flag** (`--@kuns:registry=...`) on every npm call. The user's `.npmrc` may be on a network drive that npm can't read.
|
||||||
|
- **Never rely on `claude-mailbox status` alone** to decide "the daemon is fine". Always cross-check with a `/health` probe — the status command only reflects whether the autostart task is in a Running state and doesn't notice if the process inside crashed at boot.
|
||||||
|
- **Never run `npm install` without first locking in the current state.** If the install fails, the safe rollback is to do nothing — the old daemon is still running.
|
||||||
|
- **Never `claude-mailbox stop` before the install succeeds.** Downtime is paid only after we know the new binary is on disk.
|
||||||
|
|||||||
Reference in New Issue
Block a user