docs(worker): add external MCP UI-parity spec and plan

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mika kuns
2026-05-30 13:42:24 +02:00
parent 26c4e5771b
commit 99dc08488b
2 changed files with 1095 additions and 0 deletions


@@ -0,0 +1,125 @@
# External MCP — UI Parity for Start & Observe
**Date:** 2026-05-30
**Status:** Approved (design)
## Goal
Expand the always-on **External MCP server** (`ExternalMcpService`, exposed on
`cfg.ExternalMcpPort` under `/mcp`) so an external Claude session can **start and
observe** ClaudeDo work sessions end-to-end, reaching parity with the desktop UI
for those two concerns.
The server's purpose is deliberately scoped: **help the user start sessions and
observe them.** It is *not* a git/worktree console — branch merging, worktree
resets, and multi-turn continuation are things Claude does *inside* a task, so
they stay out of the tool surface.
## Scope
### In scope
**START — set up and launch a session**
- *(existing)* `AddTask`, `UpdateTask`, `UpdateTaskStatus` (Idle/Queued), `RunTaskNow`, `CancelTask`, `DeleteTask`
- **List management** — create / rename / delete lists; set working dir + default commit type
- **List & task config** — per-list defaults and per-task overrides for `model`, `system_prompt`, `agent_path`
- **Agents (read-only)** — list agent files and refresh, so Claude can choose a valid `agent_path`
- **Reset failed task** — discard the failed worktree and reset the row to Idle (the retry path)
**OBSERVE**
- *(existing)* `ListTaskLists`, `ListTasks`, `GetTask`
- **Run history** — read `task_runs` for a task (session id, tokens, turns, result, structured output, error)
- **Logs** — fetch a task's (or run's) log output
- **App settings (read-only)** — read worker app settings
### Out of scope (explicitly excluded)
- **Tags** — already removed from the system (migration `20260519044715_RemoveTags`); only the stale doc reference in `src/ClaudeDo.Worker/CLAUDE.md` needs deleting.
- **Multi-turn continue** (`--resume`) — Claude's own concern inside a task.
- **Worktree ops** — merge, merge targets, cleanup-finished, reset-all, force-remove, set-state.
- **Start planning session** — not needed via MCP.
- **App settings writes** — risky (e.g. flips permission mode); read-only only.
- **Agent file create/edit/delete** — not part of "starting a session".
## Approach (chosen: A)
**Reuse existing worker services; split the growing tool surface into focused
`[McpServerToolType]` classes.** No business logic is duplicated — each new tool
injects the same service the SignalR hub already uses, so MCP behavior stays
identical to the UI.
Adding ~12 tools to the single `ExternalMcpService` would push it past 600 lines
across eight unrelated jobs. Instead, organize tools by category, mirroring the
existing `External/` + `Planning/` layout:
| Class (new, in `External/`) | Tools | Backing service |
|---|---|---|
| `ExternalMcpService` *(existing, unchanged scope)* | task CRUD + run/cancel/status | `TaskRepository`, `QueueService`, `ITaskStateService` |
| `ListMcpTools` | `CreateList`, `RenameList`, `DeleteList`, `SetListWorkingDir` (name/dir/commitType) | `ListRepository` |
| `ConfigMcpTools` | `GetListConfig`, `SetListConfig`, `SetTaskConfig` (model/system_prompt/agent_path) | `ListRepository`, `TaskRepository.UpdateAgentSettingsAsync` |
| `RunHistoryMcpTools` | `ListRuns`, `GetRun`, `GetTaskLog` | `TaskRunRepository`, log file read |
| `AgentMcpTools` | `ListAgents`, `RefreshAgents` | `AgentFileService.ScanAsync` |
| `LifecycleMcpTools` | `ResetFailedTask` | `TaskResetService.ResetAsync` |
| `AppSettingsMcpTools` | `GetAppSettings` | `AppSettingsRepository.GetAsync` |
(Exact class grouping may be tuned during planning, but each class stays small
and single-purpose.)
## Architecture & wiring
The external MCP server is a **separate `WebApplication`** built in
`Program.cs` (≈ lines 188–217) with its own DI container, distinct from the main
SignalR app. Shared singletons (`HubBroadcaster`, `QueueService`,
`ITaskStateService`, db factory, `WorkerConfig`) are injected by instance so both
apps act on the same runtime state.
Each new tool class must be:
1. Registered in the **external** builder (`externalBuilder.Services.AddScoped<…>()`),
alongside any newly required services (`TaskRunRepository`, `AgentFileService`,
`TaskResetService` + their dependencies).
2. Registered as tools via additional `.WithTools<T>()` calls on the external
`AddMcpServer()` chain.
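
The two registration steps above can be sketched as follows. This is a minimal illustration rather than the actual `Program.cs`: it assumes the `ModelContextProtocol.AspNetCore` NuGet package, and `IResetService` / `NoOpResetService` are hypothetical stand-ins for `TaskResetService`, whose real signature is not shown in this spec.

```csharp
using System.ComponentModel;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using ModelContextProtocol.Server;

var externalBuilder = WebApplication.CreateBuilder(args);

// Step 1: register the tool class and the services it needs in the EXTERNAL container.
externalBuilder.Services.AddScoped<IResetService, NoOpResetService>(); // hypothetical service
externalBuilder.Services.AddScoped<LifecycleMcpTools>();

// Step 2: expose the class as MCP tools via an extra WithTools<T>() call.
externalBuilder.Services
    .AddMcpServer()
    .WithHttpTransport()
    .WithTools<LifecycleMcpTools>();

var externalApp = externalBuilder.Build();
externalApp.MapMcp("/mcp"); // same route the spec names for the external server
externalApp.Run();

// Hypothetical abstraction standing in for TaskResetService.
public interface IResetService
{
    Task ResetAsync(string taskId, CancellationToken ct);
}

// Placeholder implementation so the sketch compiles on its own.
public sealed class NoOpResetService : IResetService
{
    public Task ResetAsync(string taskId, CancellationToken ct) => Task.CompletedTask;
}

// A small, single-purpose tool class that only delegates to the injected service.
[McpServerToolType]
public sealed class LifecycleMcpTools(IResetService reset)
{
    [McpServerTool, Description("Discard a failed task's worktree and reset it to Idle.")]
    public async Task<string> ResetFailedTask(string taskId, CancellationToken ct)
    {
        await reset.ResetAsync(taskId, ct);
        return $"Task {taskId} reset to Idle.";
    }
}
```

The tool class holds no logic of its own, which is what keeps MCP behavior aligned with the hub: both paths end in the same service call.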
No change to auth: the existing `ExternalMcpAuthMiddleware` (optional
`X-ClaudeDo-Key`, loopback-only otherwise) covers all tools uniformly. No
per-tool gating — the surface is read/observe + start, with the one borderline
write (`ResetFailedTask`) being a normal retry affordance.
## Data flow
- **Start:** Claude calls e.g. `CreateList` → `SetListConfig` → `AddTask(queueImmediately: true)`. Writes go through `ListRepository` / `TaskStateService`, which wake the queue and broadcast `ListUpdated` / `TaskUpdated` so the UI reflects changes live.
- **Observe:** Claude calls `ListTasks` / `GetTask` → `ListRuns` / `GetRun` → `GetTaskLog`. Pure reads from `TaskRepository` / `TaskRunRepository` and the log file at `TaskRunEntity.LogPath`.
- **Mutations broadcast** the same SignalR events the hub raises, keeping the desktop UI in sync.
## DTOs
- `RunDto` — projection of `TaskRunEntity`: `Id`, `RunNumber`, `SessionId`, `IsRetry`, `ResultMarkdown`, `StructuredOutputJson`, `ErrorMarkdown`, `ExitCode`, `TurnCount`, `TokensIn`, `TokensOut`, `StartedAt`, `FinishedAt`.
- `AgentDto` — from `AgentInfo` (`Name`, `Description`, `Path`).
- `ListConfigDto`: `Model`, `SystemPrompt`, `AgentPath` (reuse the shape already used by the hub).
- App-settings read reuses the existing `AppSettingsDto` shape (read-only subset is fine).
- Log fetch returns the file contents as a string (with a size cap / tail option decided in planning).
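
Since the size cap / tail option for log fetches is still open, here is one possible shape for it, offered as a sketch only. `LogReader`, `ReadTail`, and the `maxBytes` parameter are hypothetical names; the spec does not prescribe them. The helper returns at most the last `maxBytes` bytes of the file, decoded as UTF-8.

```csharp
using System;
using System.IO;
using System.Text;

public static class LogReader
{
    // Returns the last maxBytes bytes of the file as a UTF-8 string.
    // Note: the cut may land inside a multi-byte character, in which case the
    // decoder emits a replacement character at the start of the result.
    public static string ReadTail(string path, int maxBytes)
    {
        if (maxBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxBytes));
        if (!File.Exists(path))
            throw new InvalidOperationException($"Log file not found: {path}");

        // FileShare.ReadWrite lets us read while another process is still appending.
        using var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        long length = stream.Length;
        long start = Math.Max(0, length - maxBytes);
        stream.Seek(start, SeekOrigin.Begin);

        var buffer = new byte[length - start];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break; // file was truncated underneath us
            read += n;
        }
        return Encoding.UTF8.GetString(buffer, 0, read);
    }
}
```

A byte-based cap is simple and bounds the response size; a line-based tail would be friendlier to read but needs a backward scan for newlines.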
## Error handling
Follow the existing `ExternalMcpService` convention: throw
`InvalidOperationException` with a clear message for not-found / invalid-input /
illegal-state (e.g. "List {id} not found", "Cannot reset a non-failed task").
Reuse the guard patterns already present (required-field checks, status checks).
`ResetFailedTask` must refuse non-`Failed` tasks.
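
The guard convention described above might look like the sketch below. `WorkStatus`, `TaskRow`, and `ResetGuards` are hypothetical illustration types, not the worker's real entities; only the exception type and the message style come from this spec.

```csharp
using System;

// Hypothetical status enum limited to the states this spec names.
public enum WorkStatus { Idle, Queued, Failed }

// Hypothetical minimal task row for illustration.
public sealed record TaskRow(string Id, WorkStatus Status);

public static class ResetGuards
{
    // Validates input, existence, and state, in that order, and returns the task if resettable.
    public static TaskRow EnsureResettable(TaskRow? task, string taskId)
    {
        if (string.IsNullOrWhiteSpace(taskId))
            throw new InvalidOperationException("Task id is required");
        if (task is null)
            throw new InvalidOperationException($"Task {taskId} not found");
        if (task.Status != WorkStatus.Failed)
            throw new InvalidOperationException("Cannot reset a non-failed task");
        return task;
    }
}
```

Running the checks before touching the worktree means a refused call leaves both the row and the disk state unchanged.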
## Testing
Extend `tests/ClaudeDo.Worker.Tests/External/ExternalMcpServiceTests.cs` (and add
sibling test files per new tool class) using the existing real-SQLite + real-git
integration pattern:
- List CRUD round-trips; rename/delete propagate; delete blocked/handled sensibly.
- List + task config set/get round-trips; clearing all three fields removes list config (matches hub behavior).
- Run history reads return correct projections; `GetTaskLog` returns file contents and errors cleanly when no log exists.
- `ResetFailedTask` succeeds on a Failed task and refuses other statuses.
- Agent listing reflects files on disk after refresh.
- App-settings read returns current values.
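
The config round-trip case (clearing all three fields removes the list config) can be expressed as a small pure rule that a test could target directly. This is a sketch under assumptions: `ListConfigRules.Normalize` is a hypothetical helper, the record below only mirrors the three fields named in this spec, and treating blank strings as "cleared" is an assumption that should be checked against the hub's actual behavior.

```csharp
using System;

// Mirrors the three fields listed for ListConfigDto; the real shape lives with the hub.
public sealed record ListConfigDto(string? Model, string? SystemPrompt, string? AgentPath);

public static class ListConfigRules
{
    // Returns null when every field is empty (meaning: delete the stored config),
    // otherwise a copy with blank fields normalized to null.
    public static ListConfigDto? Normalize(ListConfigDto input)
    {
        static string? Clean(string? s) => string.IsNullOrWhiteSpace(s) ? null : s.Trim();

        var cleaned = new ListConfigDto(
            Clean(input.Model),
            Clean(input.SystemPrompt),
            Clean(input.AgentPath));

        return cleaned is { Model: null, SystemPrompt: null, AgentPath: null }
            ? null
            : cleaned;
    }
}
```

With the rule isolated like this, the integration test only has to confirm that a `null` result leads to the config row being removed.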
## Doc cleanup (part of this work)
- `src/ClaudeDo.Worker/CLAUDE.md` — remove the stale `SetTaskTags` / `ListTags` /
"AddTask (with tags)" claim; replace the External MCP tool inventory with the
new surface.