# External MCP — UI Parity for Start & Observe
**Date:** 2026-05-30
**Status:** Approved (design)
## Goal
Expand the always-on **External MCP server** (`ExternalMcpService`, exposed on
`cfg.ExternalMcpPort` under `/mcp`) so an external Claude session can **start and
observe** ClaudeDo work sessions end-to-end, reaching parity with the desktop UI
for those two concerns.
The server's purpose is deliberately scoped: **help the user start sessions and
observe them.** It is *not* a git/worktree console — branch merging, worktree
resets, and multi-turn continuation are things Claude does *inside* a task, so
they stay out of the tool surface.
## Scope
### In scope
**START — set up and launch a session**
- *(existing)* `AddTask`, `UpdateTask`, `UpdateTaskStatus` (Idle/Queued), `RunTaskNow`, `CancelTask`, `DeleteTask`
- **List management** — create / rename / delete lists; set working dir + default commit type
- **List & task config** — per-list defaults and per-task overrides for `model`, `system_prompt`, `agent_path`
- **Agents (read-only)** — list agent files and refresh, so Claude can choose a valid `agent_path`
- **Reset failed task** — discard the failed worktree and reset the row to Idle (the retry path)
**OBSERVE**
- *(existing)* `ListTaskLists`, `ListTasks`, `GetTask`
- **Run history** — read `task_runs` for a task (session id, tokens, turns, result, structured output, error)
- **Logs** — fetch a task's (or run's) log output
- **App settings (read-only)** — read worker app settings
### Out of scope (explicitly excluded)
- **Tags** — already removed from the system (migration `20260519044715_RemoveTags`); only the stale doc reference in `src/ClaudeDo.Worker/CLAUDE.md` needs deleting.
- **Multi-turn continue** (`--resume`) — Claude's own concern inside a task.
- **Worktree ops** — merge, merge targets, cleanup-finished, reset-all, force-remove, set-state.
- **Start planning session** — not needed via MCP.
- **App settings writes** — risky (e.g. a write could flip the permission mode), so app settings stay read-only.
- **Agent file create/edit/delete** — not part of "starting a session".
## Approach (chosen: A)
**Reuse existing worker services; split the growing tool surface into focused
`[McpServerToolType]` classes.** No business logic is duplicated — each new tool
injects the same service the SignalR hub already uses, so MCP behavior stays
identical to the UI.
Adding ~12 tools to the single `ExternalMcpService` would push it past 600 lines
across eight unrelated jobs. Instead, organize tools by category, mirroring the
existing `External/` + `Planning/` layout:
| Class (new, in `External/`) | Tools | Backing service |
|---|---|---|
| `ExternalMcpService` *(existing, unchanged scope)* | task CRUD + run/cancel/status | `TaskRepository`, `QueueService`, `ITaskStateService` |
| `ListMcpTools` | `CreateList`, `RenameList`, `DeleteList`, `SetListWorkingDir` (name/dir/commitType) | `ListRepository` |
| `ConfigMcpTools` | `GetListConfig`, `SetListConfig`, `SetTaskConfig` (model/system_prompt/agent_path) | `ListRepository`, `TaskRepository.UpdateAgentSettingsAsync` |
| `RunHistoryMcpTools` | `ListRuns`, `GetRun`, `GetTaskLog` | `TaskRunRepository`, log file read |
| `AgentMcpTools` | `ListAgents`, `RefreshAgents` | `AgentFileService.ScanAsync` |
| `LifecycleMcpTools` | `ResetFailedTask` | `TaskResetService.ResetAsync` |
| `AppSettingsMcpTools` | `GetAppSettings` | `AppSettingsRepository.GetAsync` |
(Exact class grouping may be tuned during planning, but each class stays small
and single-purpose.)
## Architecture & wiring
The external MCP server is a **separate `WebApplication`** built in
`Program.cs` (≈ lines 188–217) with its own DI container, distinct from the main
SignalR app. Shared singletons (`HubBroadcaster`, `QueueService`,
`ITaskStateService`, db factory, `WorkerConfig`) are injected by instance so both
apps act on the same runtime state.
Each new tool class must be:
1. Registered in the **external** builder (`externalBuilder.Services.AddScoped<…>()`),
alongside any newly required services (`TaskRunRepository`, `AgentFileService`,
`TaskResetService` + their dependencies).
2. Registered as tools via additional `.WithTools<T>()` calls on the external
`AddMcpServer()` chain.
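
The two registration steps above might look roughly like the sketch below. It assumes the `ModelContextProtocol.AspNetCore` NuGet package and an HTTP transport. `externalBuilder`, `ListMcpTools`, `ListRepository`, and the `/mcp` path come from this spec. The tool body, the in-memory repository stub, the service lifetimes, and the literal port are placeholders for illustration, not the real implementation.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using ModelContextProtocol.Server;

var externalBuilder = WebApplication.CreateBuilder(args);

// Step 1: register the tool class and the services it needs in the EXTERNAL container.
externalBuilder.Services.AddSingleton<ListRepository>(); // stub below; real lifetime may differ
externalBuilder.Services.AddScoped<ListMcpTools>();

// Step 2: add the class to the external MCP server's tool set.
externalBuilder.Services
    .AddMcpServer()
    .WithHttpTransport()
    .WithTools<ListMcpTools>();

var externalApp = externalBuilder.Build();
externalApp.MapMcp("/mcp");
externalApp.Run("http://127.0.0.1:5180"); // placeholder port; the real app reads cfg.ExternalMcpPort

// Placeholder stand-in for the real repository (not thread-safe, illustration only).
public sealed class ListRepository
{
    private readonly List<string> _names = new();

    public int Add(string name)
    {
        _names.Add(name);
        return _names.Count; // 1-based id of the new list
    }
}

[McpServerToolType]
public sealed class ListMcpTools
{
    private readonly ListRepository _lists;

    public ListMcpTools(ListRepository lists) => _lists = lists;

    [McpServerTool, Description("Create a task list and return its id.")]
    public int CreateList(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new InvalidOperationException("List name is required.");
        return _lists.Add(name);
    }
}
```

Each further tool class would add one `AddScoped<…>()` line and one `.WithTools<T>()` call in the same places.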
No change to auth: the existing `ExternalMcpAuthMiddleware` (optional
`X-ClaudeDo-Key`, loopback-only otherwise) covers all tools uniformly. No
per-tool gating — the surface is read/observe + start, with the one borderline
write (`ResetFailedTask`) being a normal retry affordance.
## Data flow
- **Start:** Claude calls e.g. `CreateList` → `SetListConfig` → `AddTask(queueImmediately: true)`. Writes go through `ListRepository` / `TaskStateService`, which wake the queue and broadcast `ListUpdated` / `TaskUpdated` so the UI reflects changes live.
- **Observe:** Claude calls `ListTasks` / `GetTask` → `ListRuns` / `GetRun` → `GetTaskLog`. Pure reads from `TaskRepository` / `TaskRunRepository` and the log file at `TaskRunEntity.LogPath`.
- **Mutations broadcast** the same SignalR events the hub raises, keeping the desktop UI in sync.
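
As one possible shape for the log read in the Observe flow, the sketch below returns only the tail of a log file. `LogReader`, `ReadTail`, and the 64 KiB default are hypothetical; the actual size cap and tail behavior are left to planning. It uses only the .NET base class library and throws `InvalidOperationException` when no log exists.

```csharp
using System;
using System.IO;
using System.Text;

public static class LogReader
{
    // Returns at most maxBytes from the end of the log file, decoded as UTF-8.
    // If the cut lands inside a multi-byte character, the first character of the
    // result may decode as U+FFFD; a real implementation might skip to the next line.
    public static string ReadTail(string logPath, int maxBytes = 64 * 1024)
    {
        if (maxBytes <= 0)
            throw new InvalidOperationException("maxBytes must be positive.");
        if (string.IsNullOrEmpty(logPath) || !File.Exists(logPath))
            throw new InvalidOperationException("No log available for this run.");

        // FileShare.ReadWrite allows reading while a running task is still appending.
        using var stream = new FileStream(
            logPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

        long start = Math.Max(0, stream.Length - maxBytes);
        stream.Seek(start, SeekOrigin.Begin);

        var buffer = new byte[(int)(stream.Length - start)];
        int read = 0;
        while (read < buffer.Length)
        {
            int n = stream.Read(buffer, read, buffer.Length - read);
            if (n == 0) break; // file ended earlier than expected
            read += n;
        }
        return Encoding.UTF8.GetString(buffer, 0, read);
    }
}
```

A `GetTaskLog` tool could pass `TaskRunEntity.LogPath` straight into such a helper, so a very large log never has to be returned in full.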
## DTOs
- `RunDto` — projection of `TaskRunEntity`: `Id`, `RunNumber`, `SessionId`, `IsRetry`, `ResultMarkdown`, `StructuredOutputJson`, `ErrorMarkdown`, `ExitCode`, `TurnCount`, `TokensIn`, `TokensOut`, `StartedAt`, `FinishedAt`.
- `AgentDto` — from `AgentInfo` (`Name`, `Description`, `Path`).
- `ListConfigDto` — `Model`, `SystemPrompt`, `AgentPath` (reuse the shape already used by the hub).
- App-settings read reuses the existing `AppSettingsDto` shape (read-only subset is fine).
- Log fetch returns the file contents as a string (with a size cap / tail option decided in planning).
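
To make the `RunDto` projection concrete, here is a minimal sketch. The property names are the ones listed above, plus `LogPath` from the data-flow section. The property types, the nullability, and the `From` helper are assumptions, since this spec does not define them.

```csharp
#nullable enable
using System;

// Assumed entity shape: names come from this spec, types are guesses.
public sealed class TaskRunEntity
{
    public string Id { get; set; } = "";
    public int RunNumber { get; set; }
    public string? SessionId { get; set; }
    public bool IsRetry { get; set; }
    public string? ResultMarkdown { get; set; }
    public string? StructuredOutputJson { get; set; }
    public string? ErrorMarkdown { get; set; }
    public int? ExitCode { get; set; }
    public int? TurnCount { get; set; }
    public int? TokensIn { get; set; }
    public int? TokensOut { get; set; }
    public DateTime? StartedAt { get; set; }
    public DateTime? FinishedAt { get; set; }
    public string? LogPath { get; set; }
}

public sealed record RunDto(
    string Id, int RunNumber, string? SessionId, bool IsRetry,
    string? ResultMarkdown, string? StructuredOutputJson, string? ErrorMarkdown,
    int? ExitCode, int? TurnCount, int? TokensIn, int? TokensOut,
    DateTime? StartedAt, DateTime? FinishedAt)
{
    // Projects the entity onto the wire shape. LogPath is not part of the DTO;
    // log contents are fetched separately through GetTaskLog.
    public static RunDto From(TaskRunEntity e) => new(
        e.Id, e.RunNumber, e.SessionId, e.IsRetry,
        e.ResultMarkdown, e.StructuredOutputJson, e.ErrorMarkdown,
        e.ExitCode, e.TurnCount, e.TokensIn, e.TokensOut,
        e.StartedAt, e.FinishedAt);
}
```

Keeping the projection in one static method means `ListRuns` and `GetRun` would return the same shape without repeating the field list.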
## Error handling
Follow the existing `ExternalMcpService` convention: throw
`InvalidOperationException` with a clear message for not-found / invalid-input /
illegal-state (e.g. "List {id} not found", "Cannot reset a non-failed task").
Reuse the guard patterns already present (required-field checks, status checks).
`ResetFailedTask` must refuse non-`Failed` tasks.
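
The guard for `ResetFailedTask` could be sketched as follows. `TaskItemStatus`, `TaskRow`, and `ResetGuards` are hypothetical names; the real task entity and status enum are not shown in this spec, and only `Idle`, `Queued`, and `Failed` are statuses it actually mentions.

```csharp
#nullable enable
using System;

// Hypothetical status enum; Running and Done are assumed members.
public enum TaskItemStatus { Idle, Queued, Running, Done, Failed }

// Hypothetical minimal task row.
public sealed class TaskRow
{
    public string Id { get; set; } = "";
    public TaskItemStatus Status { get; set; }
}

public static class ResetGuards
{
    // Returns the task when a reset is allowed; otherwise throws
    // InvalidOperationException with a message the MCP client can show as-is.
    public static TaskRow RequireFailed(TaskRow? task, string taskId)
    {
        if (string.IsNullOrWhiteSpace(taskId))
            throw new InvalidOperationException("Task id is required.");
        if (task is null)
            throw new InvalidOperationException($"Task {taskId} not found");
        if (task.Status != TaskItemStatus.Failed)
            throw new InvalidOperationException("Cannot reset a non-failed task");
        return task;
    }
}
```

A tool method would call the guard first and hand the returned row to `TaskResetService.ResetAsync` only when no exception was thrown.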
## Testing
Extend `tests/ClaudeDo.Worker.Tests/External/ExternalMcpServiceTests.cs` (and add
sibling test files per new tool class) using the existing real-SQLite + real-git
integration pattern:
- List CRUD round-trips; rename/delete propagate; delete blocked/handled sensibly.
- List + task config set/get round-trips; clearing all three fields removes list config (matches hub behavior).
- Run history reads return correct projections; `GetTaskLog` returns file contents and errors cleanly when no log exists.
- `ResetFailedTask` succeeds on a Failed task and refuses other statuses.
- Agent listing reflects files on disk after refresh.
- App-settings read returns current values.
## Doc cleanup (part of this work)
- `src/ClaudeDo.Worker/CLAUDE.md` — remove the stale `SetTaskTags` / `ListTags` /
"AddTask (with tags)" claim; replace the External MCP tool inventory with the
new surface.