docs: add planning UX spec/plan and prompts/mailbox proposals

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mika kuns
2026-04-25 09:37:32 +02:00
parent e192285f5d
commit 615c1da665
4 changed files with 1190 additions and 0 deletions

98
docs/mailbox-proposal.md Normal file

@@ -0,0 +1,98 @@
# Task Mailbox — Push Messages Into Running Sessions
**Status:** proposal
**Context:** the user runs parallel Claude sessions (e.g. backend + frontend) and wants to push messages into a session while it's busy inside a subagent. A shared folder works for one-offs; this turns it into a first-class ClaudeDo feature so every future parallel-session project gets it for free.
## Problem
Claude CLI processes one turn at a time. While a subagent (or any long tool) runs, no new user input can be injected. The harness offers no mid-execution interrupt. The workable window is *between* tool calls — so we need a cheap "inbox check" the agent can poll at natural checkpoints, plus a UI affordance and a cross-session sender.
## Design
### 1. Data
New table `task_messages`:
| col | type | notes |
|---|---|---|
| `id` | INTEGER PK | |
| `task_id` | TEXT FK → tasks.id | recipient |
| `sender` | TEXT | `'user'` \| `'task:<id>'` (for cross-session) |
| `body` | TEXT | markdown |
| `created_at` | TEXT | ISO 8601 timestamp |
| `delivered_at` | TEXT NULL | ISO 8601 timestamp; set when `check_inbox` returns the message |
Implemented as an EF Core migration plus a repository. Repository methods are async and accept a `CancellationToken`, matching existing conventions.
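As a rough illustration of the table above, here is a minimal sketch in SQLite DDL, driven from Python's `sqlite3` module. The real schema would come from the EF Core migration; the stub `tasks` table, the `NOT NULL` constraints, and the index name `ix_task_messages_pending` are assumptions made for the example.

```python
import sqlite3

# Assumed DDL for illustration only; the actual schema is defined by the EF Core migration.
SCHEMA = """
CREATE TABLE tasks (id TEXT PRIMARY KEY);  -- stub standing in for the existing tasks table
CREATE TABLE task_messages (
    id           INTEGER PRIMARY KEY,
    task_id      TEXT NOT NULL REFERENCES tasks(id),  -- recipient
    sender       TEXT NOT NULL,                       -- 'user' or 'task:<id>'
    body         TEXT NOT NULL,                       -- markdown
    created_at   TEXT NOT NULL,                       -- ISO 8601
    delivered_at TEXT                                 -- NULL until the inbox pulls it
);
-- Hypothetical index so "pending for this task" lookups stay cheap.
CREATE INDEX ix_task_messages_pending ON task_messages (task_id, delivered_at);
"""


def open_db() -> sqlite3.Connection:
    """Open an in-memory database with the sketch schema and FK checks enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

A freshly inserted row has `delivered_at` set to `NULL`, which is what the inbox queries filter on.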
### 2. Worker MCP tools (extend existing `mcp__claudedo__*` server)
- **`check_inbox(task_id)`** → returns undelivered messages for this task and marks them delivered. Safe to call repeatedly: once messages are delivered, later calls return an empty array until something new arrives.
- **`send_to_task(task_id, body)`** → inserts a row. Callable from *any* session — this is how the frontend session tells the backend session something.
- **`inbox_status(task_id)`** → `{ pending: int }` for a cheap "is there anything?" poll.
All three run in-proc in the Worker and go through the existing repository layer.
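The intended semantics of the three tools can be sketched with an in-memory stand-in for the repository. This is not the Worker's implementation: the `Mailbox` and `Message` classes, the `sender` default, and the shape of the returned dictionaries are assumptions chosen to keep the example self-contained.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _now() -> str:
    """Current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Message:
    id: int
    task_id: str
    sender: str
    body: str
    created_at: str
    delivered_at: Optional[str] = None


@dataclass
class Mailbox:
    """Hypothetical in-memory stand-in for the task_messages repository."""
    messages: list = field(default_factory=list)

    def _pending(self, task_id: str) -> list:
        return [m for m in self.messages
                if m.task_id == task_id and m.delivered_at is None]

    def send_to_task(self, task_id: str, body: str, sender: str = "user") -> int:
        """Insert a row addressed to task_id and return its id."""
        msg = Message(len(self.messages) + 1, task_id, sender, body, _now())
        self.messages.append(msg)
        return msg.id

    def inbox_status(self, task_id: str) -> dict:
        """Cheap poll: count pending messages without delivering them."""
        return {"pending": len(self._pending(task_id))}

    def check_inbox(self, task_id: str) -> list:
        """Return undelivered messages and mark them delivered."""
        pending = self._pending(task_id)
        stamp = _now()
        for m in pending:
            m.delivered_at = stamp
        return [{"id": m.id, "sender": m.sender, "body": m.body} for m in pending]
```

In this sketch `inbox_status` never changes state, while `check_inbox` drains the inbox, so a second `check_inbox` call with no new messages returns an empty list.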
### 3. SignalR additions on `WorkerHub`
Server methods (UI → Worker):
- `SendTaskMessage(taskId, body)` — UI calls this; worker inserts the row and fires `TaskMessageQueued`.
Client events (Worker → UI):
- `TaskMessageQueued(taskId, pendingCount)` — so the UI can show an unread badge.
- `TaskMessageDelivered(taskId, pendingCount)` — when the agent pulls it, badge clears.
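To show how the two events relate to the pending count, here is a small sketch with the SignalR hub replaced by a plain callback. `HubSketch`, its method names, and the decision to skip `TaskMessageDelivered` when nothing was pending are all assumptions for illustration; the real hub would be C# on `WorkerHub`.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# (event_name, task_id, pending_count); stands in for a SignalR client call.
Emit = Callable[[str, str, int], None]


class HubSketch:
    """Hypothetical model of the event flow, not the actual WorkerHub."""

    def __init__(self, emit: Emit) -> None:
        self._emit = emit
        self._pending: DefaultDict[str, List[str]] = defaultdict(list)

    def send_task_message(self, task_id: str, body: str) -> None:
        """UI -> Worker: queue the message, then announce the new pending count."""
        self._pending[task_id].append(body)
        self._emit("TaskMessageQueued", task_id, len(self._pending[task_id]))

    def deliver(self, task_id: str) -> List[str]:
        """Agent pulls its inbox: drain it and tell the UI the badge can clear."""
        bodies, self._pending[task_id] = self._pending[task_id], []
        if bodies:  # assumption: no event when there was nothing to deliver
            self._emit("TaskMessageDelivered", task_id, 0)
        return bodies
```

The UI only needs the latest `pendingCount` from either event to render the badge, so it never has to query the table itself.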
### 4. UI
On every `Running` task row + detail pane:
- "Send to session" textarea + Enter to submit → `SendTaskMessage`.
- Unread badge showing `pendingCount`.
- Read-only message timeline (who sent what, when delivered).
### 5. Agent-side poll discipline
Two complementary mechanisms so it's robust whether or not the agent remembers:
**a) CLAUDE.md instruction** (seeded by worker into each worktree's `CLAUDE.md`):
> After every subagent completes and before starting the next step, call `mcp__claudedo__check_inbox`. Treat returned messages as user input with priority over the current plan.
**b) PostToolUse hook on `Agent`** (written into the worktree's `.claude/settings.json` by the Worker when it creates the tree):
- Runs `mcp__claudedo__inbox_status` via a tiny CLI shim the worker ships.
- If `pending > 0`, the hook emits a system reminder: "Inbox has N pending messages — call `mcp__claudedo__check_inbox` now."
- Keeps the burden off the agent's memory. Belt + suspenders.
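A minimal sketch of the two pieces the Worker would generate for mechanism (b): the reminder text and the settings fragment. The settings shape (a `hooks` object with `PostToolUse` entries holding a `matcher` and a list of `command` hooks) is an assumption that should be verified against the Claude Code hooks documentation, and the shim command `claudedo-inbox status` is a hypothetical name.

```python
import json
from typing import Optional


def reminder_for(pending: int) -> Optional[str]:
    """Text the hook would surface, or None when the inbox is empty."""
    if pending <= 0:
        return None
    noun = "message" if pending == 1 else "messages"
    return f"Inbox has {pending} pending {noun}. Call `mcp__claudedo__check_inbox` now."


def hook_settings(shim_command: str) -> dict:
    """Assumed shape of the .claude/settings.json fragment; verify before relying on it."""
    return {
        "hooks": {
            "PostToolUse": [
                {
                    "matcher": "Agent",  # tool name taken from the proposal text
                    "hooks": [{"type": "command", "command": shim_command}],
                }
            ]
        }
    }


if __name__ == "__main__":
    # "claudedo-inbox status" is a hypothetical shim name, not a shipped CLI.
    print(json.dumps(hook_settings("claudedo-inbox status"), indent=2))
```

Keeping the reminder logic in a pure function makes it easy to unit-test separately from whatever mechanism finally delivers the text to the agent.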
### 6. Cross-session pattern
Backend session and frontend session are just two tasks with known IDs. Either can call `send_to_task(other_id, body)` via the MCP server. No shared folder needed — the DB is already the shared channel.
To make this ergonomic:
- A "linked tasks" concept: tag two tasks as peers at creation time. The Worker exposes `send_to_peer(body)` as sugar around `send_to_task` so neither session needs to hardcode the other's UUID.
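The peer sugar can be sketched as a symmetric lookup in front of `send_to_task`. The `PeerLinks` class, the callback signature `(task_id, body, sender)`, and the `LookupError` for unlinked tasks are assumptions for this example, not part of the proposal.

```python
from typing import Callable, Dict

# Assumed signature of the underlying tool: (task_id, body, sender) -> None.
SendToTask = Callable[[str, str, str], None]


class PeerLinks:
    """Hypothetical registry of tasks tagged as peers at creation time."""

    def __init__(self) -> None:
        self._peer: Dict[str, str] = {}

    def link(self, a: str, b: str) -> None:
        """Record the link in both directions so either side can resolve the other."""
        self._peer[a] = b
        self._peer[b] = a

    def peer_of(self, task_id: str) -> str:
        try:
            return self._peer[task_id]
        except KeyError:
            raise LookupError(f"task {task_id!r} has no linked peer") from None


def send_to_peer(links: PeerLinks, send_to_task: SendToTask,
                 current_task_id: str, body: str) -> str:
    """Resolve the caller's peer and forward the message via send_to_task."""
    peer_id = links.peer_of(current_task_id)
    send_to_task(peer_id, body, f"task:{current_task_id}")
    return peer_id
```

Because the Worker already knows which task is calling, the session only supplies `body`; the sender is filled in as `task:<id>` to match the `sender` column.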
## Limits (honest)
- Messages arrive *between* tool calls, not mid-tool. A 20-minute subagent still blocks 20 minutes. Splitting work into shorter subagents is still the right discipline.
- If the agent ignores the CLAUDE.md instruction, the hook catches it after the next `Agent` tool call, but we can't force immediate consumption.
- `-p` (print) mode with stdin prompt is one-shot and can't be extended. This design targets *interactive* sessions (Planning Sessions already use this mode). For queued `-p` runs, the mailbox is effectively a post-run instruction carrier.
## Why this is the repeatable scaffold ("Grundgerüst")
Once this lands in ClaudeDo, the workflow becomes:
1. Create two linked tasks (`backend`, `frontend`) with `working_dir` set.
2. Start each — each gets its own worktree, its own Planning Session terminal, its own inbox with `check_inbox` + `send_to_peer`.
3. Push messages from the UI or from the other session. No per-project scaffolding, no custom hooks, no shared folder.
Every future parallel-session project inherits the mailbox.
## Build order (suggested)
1. Migration + repo + model. Tests first.
2. MCP tools (`check_inbox`, `send_to_task`, `inbox_status`) + unit tests.
3. SignalR method + events + UI textarea/badge.
4. Worker writes CLAUDE.md addendum + `.claude/settings.json` hook into each new worktree.
5. Linked-tasks sugar (`send_to_peer`).
6. Manual verification: queue a long subagent, send a message, confirm it's picked up at the next tool boundary.
## Open questions
- Should messages be deleted or soft-kept after delivery? Leaning soft-kept for the timeline UI.
- Priority / interrupt semantics — do we want a "high priority" flag that the agent should surface immediately vs. batch?
- Should `send_to_peer` also work when the peer is `Queued` (i.e. not yet running)? Probably yes — deliver on start.