ClaudeDo/docs/mailbox-proposal.md
mika kuns 615c1da665 docs: add planning UX spec/plan and prompts/mailbox proposals
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 09:37:32 +02:00


Task Mailbox — Push Messages Into Running Sessions

Status: proposal
Context: the user runs parallel Claude sessions (e.g. backend + frontend) and wants to push messages into a session while it's busy inside a subagent. A shared folder works for one-offs; this turns it into a first-class ClaudeDo feature so every future parallel-session project gets it for free.

Problem

Claude CLI processes one turn at a time. While a subagent (or any long tool) runs, no new user input can be injected. The harness offers no mid-execution interrupt. The workable window is between tool calls — so we need a cheap "inbox check" the agent can poll at natural checkpoints, plus a UI affordance and a cross-session sender.

Design

1. Data

New table task_messages:

| col | type | notes |
| --- | --- | --- |
| id | INTEGER PK | |
| task_id | TEXT FK → tasks.id | recipient |
| sender | TEXT | 'user' or 'task:<id>' (for cross-session) |
| body | TEXT | markdown |
| created_at | TEXT | ISO |
| delivered_at | TEXT NULL | set when inbox pulls it |

EF Core migration + repository. Async, CancellationToken, matches existing conventions.
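As a rough illustration of the table above, here is a minimal sketch in Python with SQLite. It is not the EF Core migration itself; the stand-in `tasks` table, the `NOT NULL` constraints, the index name, and the `open_db` helper are assumptions made for the example.

```python
import sqlite3

# Hypothetical DDL mirroring the proposed task_messages columns.
SCHEMA = """
CREATE TABLE IF NOT EXISTS tasks (
    id TEXT PRIMARY KEY                               -- stand-in for the existing tasks table
);
CREATE TABLE IF NOT EXISTS task_messages (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    task_id      TEXT NOT NULL REFERENCES tasks(id),  -- recipient
    sender       TEXT NOT NULL,                       -- 'user' or 'task:<id>'
    body         TEXT NOT NULL,                       -- markdown
    created_at   TEXT NOT NULL,                       -- ISO timestamp
    delivered_at TEXT                                 -- NULL until the inbox pulls it
);
-- Assumed index so "pending for this task" lookups stay cheap.
CREATE INDEX IF NOT EXISTS ix_task_messages_pending
    ON task_messages (task_id, delivered_at);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the sketch schema if it is missing."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn
```

A NULL `delivered_at` doubles as the "pending" flag, so no separate status column is needed.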

2. Worker MCP tools (extend existing mcp__claudedo__* server)

  • check_inbox(task_id) → returns undelivered messages for this task and marks them delivered. Safe to call repeatedly: once messages are delivered, later calls return only newer ones. Empty array if nothing pending.
  • send_to_task(task_id, body) → inserts a row. Callable from any session — this is how the frontend session tells the backend session something.
  • inbox_status(task_id) → returns { pending: int } for a cheap "is there anything?" poll.

All three run in-process in the Worker and go through the existing repository layer.
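The intended semantics of the three tools can be sketched as follows. This is a Python/SQLite illustration under assumptions, not the Worker's actual MCP code: the `conn` parameter, the `sender` default, and the single-writer assumption in `check_inbox` are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


def open_db() -> sqlite3.Connection:
    """In-memory table with the proposed columns (constraints trimmed for brevity)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_messages ("
        "id INTEGER PRIMARY KEY, task_id TEXT NOT NULL, sender TEXT NOT NULL, "
        "body TEXT NOT NULL, created_at TEXT NOT NULL, delivered_at TEXT)"
    )
    return conn


def send_to_task(conn, task_id: str, body: str, sender: str = "user") -> int:
    """Insert one pending message and return its id."""
    with conn:
        cur = conn.execute(
            "INSERT INTO task_messages (task_id, sender, body, created_at) "
            "VALUES (?, ?, ?, ?)",
            (task_id, sender, body, _now()),
        )
    return cur.lastrowid


def inbox_status(conn, task_id: str) -> dict:
    """Cheap poll: count undelivered messages without touching them."""
    (pending,) = conn.execute(
        "SELECT COUNT(*) FROM task_messages "
        "WHERE task_id = ? AND delivered_at IS NULL",
        (task_id,),
    ).fetchone()
    return {"pending": pending}


def check_inbox(conn, task_id: str) -> list[dict]:
    """Return undelivered messages in insertion order and mark them delivered.

    Assumes a single writer; a multi-process Worker would need to make the
    read and the update atomic.
    """
    rows = conn.execute(
        "SELECT id, sender, body, created_at FROM task_messages "
        "WHERE task_id = ? AND delivered_at IS NULL ORDER BY id",
        (task_id,),
    ).fetchall()
    with conn:
        conn.executemany(
            "UPDATE task_messages SET delivered_at = ? WHERE id = ?",
            [(_now(), row[0]) for row in rows],
        )
    return [dict(zip(("id", "sender", "body", "created_at"), row)) for row in rows]
```

Calling `check_inbox` twice in a row returns the messages once and then an empty list, which is the property the agent-side polling relies on.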

3. SignalR additions on WorkerHub

Server methods (UI → Worker):

  • SendTaskMessage(taskId, body) — UI calls this; worker inserts the row and fires TaskMessageQueued.

Client events (Worker → UI):

  • TaskMessageQueued(taskId, pendingCount) — so the UI can show an unread badge.
  • TaskMessageDelivered(taskId, pendingCount) — when the agent pulls it, badge clears.
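To make the event ordering concrete, here is a small in-memory sketch in Python. It stands in for the hub rather than using SignalR; the `MailboxEvents` class, the `pull` method, and the choice to skip `TaskMessageDelivered` when nothing was pending are assumptions for illustration.

```python
from collections import defaultdict
from typing import Callable

# (event_name, task_id, pending_count); stands in for a SignalR client callback.
Listener = Callable[[str, str, int], None]


class MailboxEvents:
    """In-memory stand-in for the hub: tracks pending messages and emits events."""

    def __init__(self) -> None:
        self._pending: dict[str, list[str]] = defaultdict(list)
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def _emit(self, event: str, task_id: str) -> None:
        count = len(self._pending[task_id])
        for listener in self._listeners:
            listener(event, task_id, count)

    def send_task_message(self, task_id: str, body: str) -> None:
        """UI -> Worker: queue the message, then notify with the new count."""
        self._pending[task_id].append(body)
        self._emit("TaskMessageQueued", task_id)

    def pull(self, task_id: str) -> list[str]:
        """Agent side: drain the inbox; notify only if something was delivered."""
        messages = self._pending.pop(task_id, [])
        if messages:
            self._emit("TaskMessageDelivered", task_id)
        return messages
```

Sending `pendingCount` with both events lets the UI set the badge directly from the payload instead of keeping its own counter.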

4. UI

On every Running task row + detail pane:

  • "Send to session" textarea + Enter to submit → SendTaskMessage.
  • Unread badge showing pendingCount.
  • Read-only message timeline (who sent what, when delivered).

5. Agent-side poll discipline

Two complementary mechanisms so it's robust whether or not the agent remembers:

a) CLAUDE.md instruction (seeded by worker into each worktree's CLAUDE.md):

After every subagent completes and before starting the next step, call mcp__claudedo__check_inbox. Treat returned messages as user input with priority over the current plan.

b) PostToolUse hook on Agent (written into the worktree's .claude/settings.json by the Worker when it creates the tree):

  • Runs mcp__claudedo__inbox_status via a tiny CLI shim the worker ships.
  • If pending > 0, the hook emits a system reminder: "Inbox has N pending messages — call mcp__claudedo__check_inbox now."
  • Keeps the burden off the agent's memory. Belt + suspenders.
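A possible shape for the shim logic and the hook entry, sketched in Python. Several parts are assumptions: the exit code used to surface the reminder, the settings layout (written here in the form Claude Code's hooks settings are commonly documented, to be verified against the hooks reference), and the shim command name passed in by the caller.

```python
import json
from typing import Optional


def reminder_for(pending: int) -> Optional[str]:
    """Return the reminder text, or None when the inbox is empty."""
    if pending <= 0:
        return None
    noun = "message" if pending == 1 else "messages"
    return f"Inbox has {pending} pending {noun}: call mcp__claudedo__check_inbox now."


def run_hook(status: dict) -> tuple[int, str]:
    """Map an inbox_status result to (exit_code, text_for_the_agent).

    Assumption: the harness surfaces hook output to the agent on a nonzero
    exit code; 2 is used here as a placeholder for that signal.
    """
    text = reminder_for(int(status.get("pending", 0)))
    return (2, text) if text else (0, "")


def settings_fragment(shim_command: str) -> str:
    """Build the hook entry the Worker might write to .claude/settings.json.

    Assumption: the layout follows Claude Code's hooks settings as commonly
    documented; the matcher name "Agent" is taken from the proposal text.
    """
    config = {
        "hooks": {
            "PostToolUse": [
                {
                    "matcher": "Agent",
                    "hooks": [{"type": "command", "command": shim_command}],
                }
            ]
        }
    }
    return json.dumps(config, indent=2)
```

Keeping `reminder_for` a pure function makes the shim easy to unit-test without a running session.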

6. Cross-session pattern

Backend session and frontend session are just two tasks with known IDs. Either can call send_to_task(other_id, body) via the MCP server. No shared folder needed — the DB is already the shared channel.

To make this ergonomic:

  • A "linked tasks" concept: tag two tasks as peers at creation time. The Worker exposes send_to_peer(body) as sugar around send_to_task so neither session needs to hardcode the other's UUID.
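The peer lookup could look roughly like this. It is a Python sketch with an in-memory peer map; the `PeerMailbox` class and the explicit `current_task_id` argument are assumptions, since in the proposal the Worker would presumably know the calling session's task ID from context.

```python
class PeerMailbox:
    """Resolves a task's linked peer so callers never handle the peer's ID."""

    def __init__(self) -> None:
        self._peer_of: dict[str, str] = {}
        # (recipient, sender, body); stands in for rows in task_messages.
        self.sent: list[tuple[str, str, str]] = []

    def link(self, task_a: str, task_b: str) -> None:
        """Tag two tasks as peers; the link is symmetric."""
        self._peer_of[task_a] = task_b
        self._peer_of[task_b] = task_a

    def send_to_task(self, task_id: str, body: str, sender: str = "user") -> None:
        self.sent.append((task_id, sender, body))

    def send_to_peer(self, current_task_id: str, body: str) -> None:
        """Sugar around send_to_task: look up the peer, then send as 'task:<id>'."""
        peer = self._peer_of.get(current_task_id)
        if peer is None:
            raise LookupError(f"task {current_task_id!r} has no linked peer")
        self.send_to_task(peer, body, sender=f"task:{current_task_id}")
```

Failing loudly when no peer is linked seems safer than silently dropping the message, though the proposal does not specify this behavior.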

Limits (honest)

  • Messages arrive between tool calls, not mid-tool. A 20-minute subagent still blocks 20 minutes. Splitting work into shorter subagents is still the right discipline.
  • If the agent ignores the CLAUDE.md instruction, the hook catches it after the next Agent tool call, but we can't force immediate consumption.
  • -p (print) mode with stdin prompt is one-shot and can't be extended. This design targets interactive sessions (Planning Sessions already use this mode). For queued -p runs, the mailbox is effectively a post-run instruction carrier.

Why this is the repeatable scaffold

Once this lands in ClaudeDo, the workflow becomes:

  1. Create two linked tasks (backend, frontend) with working_dir set.
  2. Start each — each gets its own worktree, its own Planning Session terminal, its own inbox with check_inbox + send_to_peer.
  3. Push messages from the UI or from the other session. No per-project scaffolding, no custom hooks, no shared folder.

Every future parallel-session project inherits the mailbox.

Build order (suggested)

  1. Migration + repo + model. Tests first.
  2. MCP tools (check_inbox, send_to_task, inbox_status) + unit tests.
  3. SignalR method + events + UI textarea/badge.
  4. Worker writes CLAUDE.md addendum + .claude/settings.json hook into each new worktree.
  5. Linked-tasks sugar (send_to_peer).
  6. Manual verification: queue a long subagent, send a message, confirm it's picked up at the next tool boundary.

Open questions

  • Should messages be deleted or soft-kept after delivery? Leaning soft-kept for the timeline UI.
  • Priority / interrupt semantics — do we want a "high priority" flag that the agent should surface immediately vs. batch?
  • Should send_to_peer also work when the peer is Queued (i.e. not yet running)? Probably yes — deliver on start.