# External MCP — UI Parity for Start & Observe

**Date:** 2026-05-30
**Status:** Approved (design)

## Goal

Expand the always-on **External MCP server** (`ExternalMcpService`, exposed on
`cfg.ExternalMcpPort` under `/mcp`) so an external Claude session can **start and
observe** ClaudeDo work sessions end-to-end, reaching parity with the desktop UI
for those two concerns.

The server's purpose is deliberately scoped: **help the user start sessions and
observe them.** It is *not* a git/worktree console — branch merging, worktree
resets, and multi-turn continuation are things Claude does *inside* a task, so
they stay out of the tool surface.

## Scope

### In scope

**START — set up and launch a session**
- *(existing)* `AddTask`, `UpdateTask`, `UpdateTaskStatus` (Idle/Queued), `RunTaskNow`, `CancelTask`, `DeleteTask`
- **List management** — create / rename / delete lists; set working dir + default commit type
- **List & task config** — per-list defaults and per-task overrides for `model`, `system_prompt`, `agent_path`
- **Agents (read-only)** — list agent files and refresh, so Claude can choose a valid `agent_path`
- **Reset failed task** — discard the failed worktree and reset the row to Idle (the retry path)

**OBSERVE**
- *(existing)* `ListTaskLists`, `ListTasks`, `GetTask`
- **Run history** — read `task_runs` for a task (session id, tokens, turns, result, structured output, error)
- **Logs** — fetch a task's (or run's) log output
- **App settings (read-only)** — read worker app settings

### Out of scope (explicitly excluded)
- **Tags** — already removed from the system (migration `20260519044715_RemoveTags`); only the stale doc reference in `src/ClaudeDo.Worker/CLAUDE.md` needs deleting.
- **Multi-turn continue** (`--resume`) — Claude's own concern inside a task.
- **Worktree ops** — merge, merge targets, cleanup-finished, reset-all, force-remove, set-state.
- **Start planning session** — not needed via MCP.
- **App settings writes** — risky (e.g. a write could flip the permission mode), so app settings stay read-only.
- **Agent file create/edit/delete** — not part of "starting a session".

## Approach (chosen: A)

**Reuse existing worker services; split the growing tool surface into focused
`[McpServerToolType]` classes.** No business logic is duplicated — each new tool
injects the same service the SignalR hub already uses, so MCP behavior stays
identical to the UI.

Adding ~12 tools to the single `ExternalMcpService` would push it past 600 lines
across eight unrelated jobs. Instead, organize tools by category, mirroring the
existing `External/` + `Planning/` layout:

| Class (in `External/`; new unless noted) | Tools | Backing service |
|---|---|---|
| `ExternalMcpService` *(existing, unchanged scope)* | task CRUD + run/cancel/status | `TaskRepository`, `QueueService`, `ITaskStateService` |
| `ListMcpTools` | `CreateList`, `RenameList`, `DeleteList`, `SetListWorkingDir` (name/dir/commitType) | `ListRepository` |
| `ConfigMcpTools` | `GetListConfig`, `SetListConfig`, `SetTaskConfig` (model/system_prompt/agent_path) | `ListRepository`, `TaskRepository.UpdateAgentSettingsAsync` |
| `RunHistoryMcpTools` | `ListRuns`, `GetRun`, `GetTaskLog` | `TaskRunRepository`, log file read |
| `AgentMcpTools` | `ListAgents`, `RefreshAgents` | `AgentFileService.ScanAsync` |
| `LifecycleMcpTools` | `ResetFailedTask` | `TaskResetService.ResetAsync` |
| `AppSettingsMcpTools` | `GetAppSettings` | `AppSettingsRepository.GetAsync` |

(Exact class grouping may be tuned during planning, but each class stays small
and single-purpose.)

## Architecture & wiring

The external MCP server is a **separate `WebApplication`** built in
`Program.cs` (≈ lines 188–217) with its own DI container, distinct from the main
SignalR app. Shared singletons (`HubBroadcaster`, `QueueService`,
`ITaskStateService`, db factory, `WorkerConfig`) are injected by instance so both
apps act on the same runtime state.

Each new tool class must be:
1. Registered in the **external** builder (`externalBuilder.Services.AddScoped<…>()`),
   alongside any newly required services (`TaskRunRepository`, `AgentFileService`,
   `TaskResetService` + their dependencies).
2. Registered as tools via additional `.WithTools<T>()` calls on the external
   `AddMcpServer()` chain.
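
As a rough illustration of the two steps above, the wiring for one new class might look like the sketch below. It assumes the external app uses the official C# MCP SDK's ASP.NET Core integration (`ModelContextProtocol.AspNetCore`) with its HTTP transport. The stub `TaskResetService`, its `ResetAsync(string)` signature, and the tool's return value are placeholders rather than the real worker types. Port binding to `cfg.ExternalMcpPort` and the auth middleware are left out.

```csharp
using System.ComponentModel;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;
using ModelContextProtocol.Server;

var externalBuilder = WebApplication.CreateBuilder(args);

// Step 1: register the backing service and the tool class in the EXTERNAL container.
externalBuilder.Services.AddScoped<TaskResetService>();
externalBuilder.Services.AddScoped<LifecycleMcpTools>();

// Step 2: expose the class as MCP tools on the external AddMcpServer() chain.
externalBuilder.Services
    .AddMcpServer()
    .WithHttpTransport()   // assumption: HTTP transport, matching the /mcp endpoint
    .WithTools<LifecycleMcpTools>();

var externalApp = externalBuilder.Build();
externalApp.MapMcp("/mcp");
externalApp.Run();

// Placeholder for the real worker service; this signature is hypothetical.
public sealed class TaskResetService
{
    public Task ResetAsync(string taskId) => Task.CompletedTask;
}

[McpServerToolType]
public sealed class LifecycleMcpTools
{
    private readonly TaskResetService _reset;

    public LifecycleMcpTools(TaskResetService reset) => _reset = reset;

    [McpServerTool, Description("Discard a failed task's worktree and reset it to Idle.")]
    public async Task<string> ResetFailedTask(
        [Description("Id of the failed task")] string taskId)
    {
        await _reset.ResetAsync(taskId);
        return $"Task {taskId} reset to Idle";
    }
}
```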

No change to auth: the existing `ExternalMcpAuthMiddleware` (optional
`X-ClaudeDo-Key`, loopback-only otherwise) covers all tools uniformly. No
per-tool gating — the surface is read/observe + start, with the one borderline
write (`ResetFailedTask`) being a normal retry affordance.

## Data flow

- **Start:** Claude calls e.g. `CreateList` → `SetListConfig` → `AddTask(queueImmediately: true)`. Writes go through `ListRepository` / `ITaskStateService`, which wake the queue and broadcast `ListUpdated` / `TaskUpdated` so the UI reflects changes live.
- **Observe:** Claude calls `ListTasks` / `GetTask` → `ListRuns` / `GetRun` → `GetTaskLog`. Pure reads from `TaskRepository` / `TaskRunRepository` and the log file at `TaskRunEntity.LogPath`.
- **Mutations broadcast** the same SignalR events the hub raises, keeping the desktop UI in sync.
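
The last read on the Observe path, fetching the log file at `TaskRunEntity.LogPath`, could look roughly like the sketch below. The `LogReader` class, the `ReadLogTail` name, the error message, and the 64 KiB default cap are illustrative assumptions; the actual cap and tail behavior are still to be decided in planning.

```csharp
#nullable enable
using System;
using System.IO;
using System.Text;

public static class LogReader
{
    // Returns at most the last maxBytes bytes of the log, decoded as UTF-8.
    public static string ReadLogTail(string? logPath, int maxBytes = 64 * 1024)
    {
        if (maxBytes <= 0)
            throw new ArgumentOutOfRangeException(nameof(maxBytes));
        if (string.IsNullOrWhiteSpace(logPath) || !File.Exists(logPath))
            throw new InvalidOperationException("No log found for this run");

        // FileShare.ReadWrite lets the read succeed while a run is still appending.
        using var stream = new FileStream(
            logPath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);

        // Skip ahead so that only the tail is read when the file exceeds the cap.
        var start = Math.Max(0, stream.Length - maxBytes);
        stream.Seek(start, SeekOrigin.Begin);

        using var reader = new StreamReader(stream, Encoding.UTF8);
        return reader.ReadToEnd();
    }
}
```

Seeking by byte offset can land in the middle of a multi-byte UTF-8 character, in which case the first decoded character of the tail may be a replacement character. A production version might trim to the next line break.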

## DTOs

- `RunDto` — projection of `TaskRunEntity`: `Id`, `RunNumber`, `SessionId`, `IsRetry`, `ResultMarkdown`, `StructuredOutputJson`, `ErrorMarkdown`, `ExitCode`, `TurnCount`, `TokensIn`, `TokensOut`, `StartedAt`, `FinishedAt`.
- `AgentDto` — from `AgentInfo` (`Name`, `Description`, `Path`).
- `ListConfigDto` — `Model`, `SystemPrompt`, `AgentPath` (reuse the shape already used by the hub).
- App-settings read reuses the existing `AppSettingsDto` shape (read-only subset is fine).
- Log fetch returns the file contents as a string (with a size cap / tail option decided in planning).
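
A minimal sketch of the `RunDto` projection follows. The field names come from the list above, but every property type is a guess, and `TaskRunEntitySketch` is a stand-in for the real `TaskRunEntity`, not its actual definition.

```csharp
#nullable enable
using System;

// Stand-in for the real TaskRunEntity; property types here are assumptions.
public sealed class TaskRunEntitySketch
{
    public string Id { get; init; } = "";
    public int RunNumber { get; init; }
    public string? SessionId { get; init; }
    public bool IsRetry { get; init; }
    public string? ResultMarkdown { get; init; }
    public string? StructuredOutputJson { get; init; }
    public string? ErrorMarkdown { get; init; }
    public int? ExitCode { get; init; }
    public int? TurnCount { get; init; }
    public int? TokensIn { get; init; }
    public int? TokensOut { get; init; }
    public DateTime? StartedAt { get; init; }
    public DateTime? FinishedAt { get; init; }
    public string? LogPath { get; init; }   // not part of RunDto's field list
}

public sealed record RunDto(
    string Id, int RunNumber, string? SessionId, bool IsRetry,
    string? ResultMarkdown, string? StructuredOutputJson, string? ErrorMarkdown,
    int? ExitCode, int? TurnCount, int? TokensIn, int? TokensOut,
    DateTime? StartedAt, DateTime? FinishedAt)
{
    // Copies only the fields named in the DTO list; LogPath is left out.
    public static RunDto From(TaskRunEntitySketch run) => new(
        run.Id, run.RunNumber, run.SessionId, run.IsRetry,
        run.ResultMarkdown, run.StructuredOutputJson, run.ErrorMarkdown,
        run.ExitCode, run.TurnCount, run.TokensIn, run.TokensOut,
        run.StartedAt, run.FinishedAt);
}
```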

## Error handling

Follow the existing `ExternalMcpService` convention: throw
`InvalidOperationException` with a clear message for not-found / invalid-input /
illegal-state (e.g. "List {id} not found", "Cannot reset a non-failed task").
Reuse the guard patterns already present (required-field checks, status checks).
`ResetFailedTask` must refuse non-`Failed` tasks.
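
The guard order described above (required field, then existence, then status) can be sketched as below. `SketchTask`, `SketchTaskStatus`, `ResetGuards`, and the lookup delegate are hypothetical stand-ins for the real entity and `TaskRepository`, and only the "Cannot reset a non-failed task" message is taken from this spec.

```csharp
#nullable enable
using System;

// Hypothetical stand-ins for the real task entity and its status enum.
public enum SketchTaskStatus { Idle, Queued, Failed }

public sealed record SketchTask(string Id, SketchTaskStatus Status);

public static class ResetGuards
{
    // Returns the task only if it exists and is Failed; otherwise throws
    // InvalidOperationException with a message the MCP caller can act on.
    public static SketchTask RequireFailedTask(
        string? taskId, Func<string, SketchTask?> find)
    {
        // 1. Required-field check.
        if (string.IsNullOrWhiteSpace(taskId))
            throw new InvalidOperationException("taskId is required");

        // 2. Not-found check.
        var task = find(taskId)
            ?? throw new InvalidOperationException($"Task {taskId} not found");

        // 3. Illegal-state check: only Failed tasks may be reset.
        if (task.Status != SketchTaskStatus.Failed)
            throw new InvalidOperationException("Cannot reset a non-failed task");

        return task;
    }
}
```

Keeping the checks in one helper would let `ResetFailedTask` stay a thin wrapper around `TaskResetService.ResetAsync`, with all refusals surfacing as the same exception type the existing tools use.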

## Testing

Extend `tests/ClaudeDo.Worker.Tests/External/ExternalMcpServiceTests.cs` (and add
sibling test files per new tool class) using the existing real-SQLite + real-git
integration pattern:
- List CRUD round-trips; rename/delete propagate; delete blocked/handled sensibly.
- List + task config set/get round-trips; clearing all three fields removes list config (matches hub behavior).
- Run history reads return correct projections; `GetTaskLog` returns file contents and errors cleanly when no log exists.
- `ResetFailedTask` succeeds on a Failed task and refuses other statuses.
- Agent listing reflects files on disk after refresh.
- App-settings read returns current values.

## Doc cleanup (part of this work)

- `src/ClaudeDo.Worker/CLAUDE.md` — remove the stale `SetTaskTags` / `ListTags` /
  "AddTask (with tags)" references, and replace the External MCP tool inventory
  with the new surface.