mika kuns 99dc08488b docs(worker): add external MCP UI-parity spec and plan

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-30 13:42:24 +02:00

External MCP — UI Parity for Start & Observe

Date: 2026-05-30
Status: Approved (design)

Goal

Expand the always-on External MCP server (ExternalMcpService, exposed on cfg.ExternalMcpPort under /mcp) so an external Claude session can start and observe ClaudeDo work sessions end-to-end, reaching parity with the desktop UI for those two concerns.

The server's purpose is deliberately scoped: help the user start sessions and observe them. It is not a git/worktree console — branch merging, worktree resets, and multi-turn continuation are things Claude does inside a task, so they stay out of the tool surface.

Scope

In scope

START — set up and launch a session

(existing) AddTask, UpdateTask, UpdateTaskStatus (Idle/Queued), RunTaskNow, CancelTask, DeleteTask
List management — create / rename / delete lists; set working dir + default commit type
List & task config — per-list defaults and per-task overrides for model, system_prompt, agent_path
Agents (read-only) — list agent files and refresh, so Claude can choose a valid agent_path
Reset failed task — discard the failed worktree and reset the row to Idle (the retry path)

OBSERVE

(existing) ListTaskLists, ListTasks, GetTask
Run history — read task_runs for a task (session id, tokens, turns, result, structured output, error)
Logs — fetch a task's (or run's) log output
App settings (read-only) — read worker app settings

Out of scope (explicitly excluded)

Tags — already removed from the system (migration 20260519044715_RemoveTags); only the stale doc reference in src/ClaudeDo.Worker/CLAUDE.md needs deleting.
Multi-turn continue (--resume) — Claude's own concern inside a task.
Worktree ops — merge, merge targets, cleanup-finished, reset-all, force-remove, set-state.
Start planning session — not needed via MCP.
App settings writes — risky (e.g. flips permission mode); read-only only.
Agent file create/edit/delete — not part of "starting a session".

Approach (chosen: A)

Reuse existing worker services; split the growing tool surface into focused [McpServerToolType] classes. No business logic is duplicated — each new tool injects the same service the SignalR hub already uses, so MCP behavior stays identical to the UI.

Adding ~12 tools to the single ExternalMcpService would push it past 600 lines across eight unrelated jobs. Instead, organize tools by category, mirroring the existing External/ + Planning/ layout:

| Class (new, in `External/`) | Tools | Backing service |
| --- | --- | --- |
| `ExternalMcpService` (existing, unchanged scope) | task CRUD + run/cancel/status | `TaskRepository`, `QueueService`, `ITaskStateService` |
| `ListMcpTools` | `CreateList`, `RenameList`, `DeleteList`, `SetListWorkingDir` (name/dir/commitType) | `ListRepository` |
| `ConfigMcpTools` | `GetListConfig`, `SetListConfig`, `SetTaskConfig` (model/system_prompt/agent_path) | `ListRepository`, `TaskRepository.UpdateAgentSettingsAsync` |
| `RunHistoryMcpTools` | `ListRuns`, `GetRun`, `GetTaskLog` | `TaskRunRepository`, log file read |
| `AgentMcpTools` | `ListAgents`, `RefreshAgents` | `AgentFileService.ScanAsync` |
| `LifecycleMcpTools` | `ResetFailedTask` | `TaskResetService.ResetAsync` |
| `AppSettingsMcpTools` | `GetAppSettings` | `AppSettingsRepository.GetAsync` |

(Exact class grouping may be tuned during planning, but each class stays small and single-purpose.)

Architecture & wiring

The external MCP server is a separate WebApplication built in Program.cs (≈ lines 188–217) with its own DI container, distinct from the main SignalR app. Shared singletons (HubBroadcaster, QueueService, ITaskStateService, db factory, WorkerConfig) are injected by instance so both apps act on the same runtime state.

Each new tool class must be:

Registered in the external builder (externalBuilder.Services.AddScoped<…>()), alongside any newly required services (TaskRunRepository, AgentFileService, TaskResetService + their dependencies).
Registered as tools via additional .WithTools<T>() calls on the external AddMcpServer() chain.
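
The two registration steps above could look roughly like the following for one new tool class. This is a minimal sketch, not the actual Program.cs: it assumes the ModelContextProtocol.AspNetCore NuGet package with its HTTP transport, and InMemoryListStore is a hypothetical stand-in for ListRepository, whose real members this spec does not show. Port binding to cfg.ExternalMcpPort and the shared singletons are omitted.

```csharp
// Program.cs sketch for the external app only. Assumes the
// ModelContextProtocol.AspNetCore NuGet package and the ASP.NET Core Web SDK.
using System;
using System.Collections.Concurrent;
using System.ComponentModel;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using ModelContextProtocol.Server;

var externalBuilder = WebApplication.CreateBuilder(args);

// Hypothetical stand-in for ListRepository and its dependencies.
// In the real app, shared singletons would be added by instance here instead.
externalBuilder.Services.AddSingleton<InMemoryListStore>();

// Step 1: register the tool class in the external container.
externalBuilder.Services.AddScoped<ListMcpTools>();

// Step 2: expose it through the external AddMcpServer() chain.
externalBuilder.Services
    .AddMcpServer()
    .WithHttpTransport()        // assumption: HTTP transport, mapped under /mcp below
    .WithTools<ListMcpTools>();

var externalApp = externalBuilder.Build();
externalApp.MapMcp("/mcp");
externalApp.Run();

// Hypothetical store; it only tracks list names by id.
public sealed class InMemoryListStore
{
    private readonly ConcurrentDictionary<string, string> _names = new();

    public string Create(string name)
    {
        var id = Guid.NewGuid().ToString("N");
        _names[id] = name;
        return id;
    }

    public bool Rename(string id, string newName) =>
        _names.TryGetValue(id, out var oldName) && _names.TryUpdate(id, newName, oldName);
}

[McpServerToolType]
public sealed class ListMcpTools(InMemoryListStore lists)
{
    [McpServerTool, Description("Create a task list and return its id.")]
    public string CreateList(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new InvalidOperationException("List name is required");
        return lists.Create(name.Trim());
    }

    [McpServerTool, Description("Rename an existing task list and return the new name.")]
    public string RenameList(string listId, string newName)
    {
        if (string.IsNullOrWhiteSpace(newName))
            throw new InvalidOperationException("List name is required");
        if (!lists.Rename(listId, newName.Trim()))
            throw new InvalidOperationException($"List {listId} not found");
        return newName.Trim();
    }
}
```

Each further tool class would add one AddScoped line and one WithTools line in the same places, which keeps the wiring cost of a new category small.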

No change to auth: the existing ExternalMcpAuthMiddleware (optional X-ClaudeDo-Key, loopback-only otherwise) covers all tools uniformly. No per-tool gating — the surface is read/observe + start, with the one borderline write (ResetFailedTask) being a normal retry affordance.
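
For orientation, the access rule that every tool inherits can be written as a single predicate. The sketch below is one reading of "optional X-ClaudeDo-Key, loopback-only otherwise" and is not taken from ExternalMcpAuthMiddleware itself: it assumes that a configured key must be presented by every caller, and that without a configured key only loopback connections are accepted. The class and method names are hypothetical.

```csharp
using System.Net;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; the real middleware may structure this differently.
public static class ExternalMcpAuthSketch
{
    public static bool IsAllowed(string? configuredKey, string? presentedKey, IPAddress? remoteAddress)
    {
        if (!string.IsNullOrEmpty(configuredKey))
        {
            // Assumption: once a key is configured, every caller must present it.
            if (presentedKey is null) return false;

            // Constant-time comparison; returns false when the lengths differ.
            return CryptographicOperations.FixedTimeEquals(
                Encoding.UTF8.GetBytes(configuredKey),
                Encoding.UTF8.GetBytes(presentedKey));
        }

        // Assumption: with no key configured, only loopback callers get through.
        return remoteAddress is not null && IPAddress.IsLoopback(remoteAddress);
    }
}
```

Because the check never looks at the tool name, adding tool classes does not change it, which is the sense in which the new tools need no per-tool gating.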

Data flow

Start: Claude calls e.g. CreateList → SetListConfig → AddTask(queueImmediately: true). Writes go through ListRepository / TaskStateService, which wake the queue and broadcast ListUpdated / TaskUpdated so the UI reflects changes live.
Observe: Claude calls ListTasks / GetTask → ListRuns / GetRun → GetTaskLog. Pure reads from TaskRepository / TaskRunRepository and the log file at TaskRunEntity.LogPath.
Mutations broadcast the same SignalR events the hub raises, keeping the desktop UI in sync.

DTOs

RunDto — projection of TaskRunEntity: Id, RunNumber, SessionId, IsRetry, ResultMarkdown, StructuredOutputJson, ErrorMarkdown, ExitCode, TurnCount, TokensIn, TokensOut, StartedAt, FinishedAt.
AgentDto — from AgentInfo (Name, Description, Path).
ListConfigDto — Model, SystemPrompt, AgentPath (reuse the shape already used by the hub).
App-settings read reuses the existing AppSettingsDto shape (read-only subset is fine).
Log fetch returns the file contents as a string (with a size cap / tail option decided in planning).
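
As an illustration of the RunDto projection and a capped log read, here is a minimal sketch. The property types, the TaskRunEntity stand-in, the 64 KiB default cap, and the helper names are all assumptions; the spec leaves the cap and tail option to planning.

```csharp
using System;
using System.IO;
using System.Text;

// Stand-in with the fields named in the spec; the types are assumptions.
public sealed record TaskRunEntity(
    string Id, int RunNumber, string? SessionId, bool IsRetry,
    string? ResultMarkdown, string? StructuredOutputJson, string? ErrorMarkdown,
    int? ExitCode, int? TurnCount, int? TokensIn, int? TokensOut,
    DateTimeOffset? StartedAt, DateTimeOffset? FinishedAt, string? LogPath);

// The DTO drops LogPath so callers fetch logs through the log tool instead.
public sealed record RunDto(
    string Id, int RunNumber, string? SessionId, bool IsRetry,
    string? ResultMarkdown, string? StructuredOutputJson, string? ErrorMarkdown,
    int? ExitCode, int? TurnCount, int? TokensIn, int? TokensOut,
    DateTimeOffset? StartedAt, DateTimeOffset? FinishedAt);

public static class RunProjection
{
    public static RunDto ToDto(TaskRunEntity r) => new(
        r.Id, r.RunNumber, r.SessionId, r.IsRetry,
        r.ResultMarkdown, r.StructuredOutputJson, r.ErrorMarkdown,
        r.ExitCode, r.TurnCount, r.TokensIn, r.TokensOut,
        r.StartedAt, r.FinishedAt);

    // Returns at most maxBytes from the end of the log file, decoded as UTF-8.
    // A cut that lands inside a multi-byte character yields a replacement
    // character at the start of the result.
    public static string ReadLogTail(string? logPath, int maxBytes = 64 * 1024)
    {
        if (maxBytes <= 0)
            throw new InvalidOperationException("maxBytes must be positive");
        if (string.IsNullOrEmpty(logPath) || !File.Exists(logPath))
            throw new InvalidOperationException("No log file exists for this run");

        // FileShare.ReadWrite lets the read succeed while a run is still writing.
        using var stream = new FileStream(logPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
        stream.Seek(Math.Max(0, stream.Length - maxBytes), SeekOrigin.Begin);
        using var reader = new StreamReader(stream, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}
```

Returning the tail rather than the head favors the most recent output, which is usually what an observer of a running or failed task wants first.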

Error handling

Follow the existing ExternalMcpService convention: throw InvalidOperationException with a clear message for not-found / invalid-input / illegal-state (e.g. "List {id} not found", "Cannot reset a non-failed task"). Reuse the guard patterns already present (required-field checks, status checks). ResetFailedTask must refuse non-Failed tasks.
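
The convention above can be sketched for ResetFailedTask as follows. This is an in-memory illustration under stated assumptions: WorkStatus and TaskRow are hypothetical stand-ins (only Idle, Queued, and Failed are named in this spec), and the real tool would delegate to TaskResetService.ResetAsync rather than set the status itself.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical status set; only Idle, Queued, and Failed appear in the spec.
public enum WorkStatus { Idle, Queued, Running, Done, Failed }

// Hypothetical stand-in for a task row.
public sealed class TaskRow
{
    public required string Id { get; init; }
    public WorkStatus Status { get; set; }
}

public static class ResetGuards
{
    // Required-field and not-found checks, both reported as InvalidOperationException.
    public static TaskRow RequireTask(IReadOnlyDictionary<string, TaskRow> tasks, string taskId)
    {
        if (string.IsNullOrWhiteSpace(taskId))
            throw new InvalidOperationException("Task id is required");
        if (!tasks.TryGetValue(taskId, out var task))
            throw new InvalidOperationException($"Task {taskId} not found");
        return task;
    }

    public static void ResetFailedTask(IReadOnlyDictionary<string, TaskRow> tasks, string taskId)
    {
        var task = RequireTask(tasks, taskId);

        // Illegal-state check: only Failed tasks may be reset.
        if (task.Status != WorkStatus.Failed)
            throw new InvalidOperationException("Cannot reset a non-failed task");

        // The real service would also discard the failed worktree here.
        task.Status = WorkStatus.Idle;
    }
}
```

Running the guards before any side effect means a refused call leaves both the row and the worktree untouched.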

Testing

Extend tests/ClaudeDo.Worker.Tests/External/ExternalMcpServiceTests.cs (and add sibling test files per new tool class) using the existing real-SQLite + real-git integration pattern:

List CRUD round-trips; rename/delete propagate; delete blocked/handled sensibly.
List + task config set/get round-trips; clearing all three fields removes list config (matches hub behavior).
Run history reads return correct projections; GetTaskLog returns file contents and errors cleanly when no log exists.
ResetFailedTask succeeds on a Failed task and refuses other statuses.
Agent listing reflects files on disk after refresh.
App-settings read returns current values.
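
The config round-trip case, including the rule that clearing all three fields removes the list config, can be pinned down with a small in-memory model before it is tested against real SQLite. The store below is hypothetical and only mirrors the behavior described in this section; the field types of ListConfigDto are assumptions.

```csharp
using System.Collections.Generic;

// Shape named in the DTOs section; nullable strings are an assumption.
public sealed record ListConfigDto(string? Model, string? SystemPrompt, string? AgentPath);

// Hypothetical in-memory model of the list config behavior.
public sealed class InMemoryListConfigStore
{
    private readonly Dictionary<string, ListConfigDto> _configs = new();

    public void Set(string listId, ListConfigDto config)
    {
        bool allEmpty =
            string.IsNullOrWhiteSpace(config.Model) &&
            string.IsNullOrWhiteSpace(config.SystemPrompt) &&
            string.IsNullOrWhiteSpace(config.AgentPath);

        // Clearing every field removes the config instead of storing an empty one.
        if (allEmpty) _configs.Remove(listId);
        else _configs[listId] = config;
    }

    // Returns null when the list has no config.
    public ListConfigDto? Get(string listId) =>
        _configs.TryGetValue(listId, out var config) ? config : null;
}
```

An integration test would make the same two assertions through SetListConfig and GetListConfig: a set value reads back unchanged, and a set with all fields empty reads back as no config.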

Doc cleanup (part of this work)

src/ClaudeDo.Worker/CLAUDE.md — remove the stale SetTaskTags / ListTags / "AddTask (with tags)" claim; replace the External MCP tool inventory with the new surface.
