docs: spec + plan for per-task model override via MCP

mika kuns
2026-06-09 22:05:01 +02:00
parent 49046310ef
commit 51ef488d2f
2 changed files with 135 additions and 0 deletions

@@ -0,0 +1,55 @@
# Plan: Per-task model override via MCP + cheapest-model prompt guidance
Spec: `docs/superpowers/specs/2026-06-09-per-task-model-override-design.md`
TDD, one focused commit per task. Build with `-c Release` per project; run
`ClaudeDo.Worker.Tests` (and `Data.Tests` if touched).
## Task 1 — ModelRegistry: cost ordering + alias validation
- Add `ByCostAscending = ["haiku","sonnet","opus"]`.
- Add `string? NormalizeAlias(string? model)`: trim; null/blank → null;
case-insensitive match against `Aliases` → canonical lowercase; else throw
`ArgumentException($"Unknown model '{model}'. Allowed: {string.Join(", ", Aliases)}.")`.
- Tests (Data.Tests): "sonnet"/"OPUS"/" haiku " → normalized; ""/null/" " →
null; "gpt4" → throws.
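
The behaviour listed for Task 1 could look roughly like the sketch below. It is a minimal stand-in, not the real `ClaudeDo.Data` class: only the members named in this plan are shown, and the lookup via `FirstOrDefault` is an assumption about how the match might be done.

```csharp
#nullable enable
using System;
using System.Linq;

// Hypothetical stand-in for ModelRegistry; the real class may hold more members.
public static class ModelRegistry
{
    // Existing alias list, in the order quoted in the spec's Background section.
    public static readonly string[] Aliases = { "sonnet", "opus", "haiku" };

    // New in Task 1: cheapest model first.
    public static readonly string[] ByCostAscending = { "haiku", "sonnet", "opus" };

    public static string? NormalizeAlias(string? model)
    {
        // null or blank means "inherit": no per-task override is stored.
        if (string.IsNullOrWhiteSpace(model)) return null;

        var trimmed = model.Trim();

        // Case-insensitive match; return the canonical lowercase alias from the list.
        var match = Aliases.FirstOrDefault(
            a => string.Equals(a, trimmed, StringComparison.OrdinalIgnoreCase));

        if (match is null)
            throw new ArgumentException(
                $"Unknown model '{model}'. Allowed: {string.Join(", ", Aliases)}.");

        return match;
    }
}
```

With this sketch, `NormalizeAlias("OPUS")` and `NormalizeAlias(" haiku ")` return the lowercase alias, blank input returns null, and `"gpt4"` throws, which matches the test cases listed above.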
## Task 2 — CreateChildAsync accepts model
- `TaskRepository.CreateChildAsync`: add `string? model = null` (before the
trailing `CancellationToken ct = default`); set
`child.Model = ModelRegistry.NormalizeAlias(model)`.
- Update the two existing callers to compile (named pass-through added in
Tasks 3 and 4; keep default null here).
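
Why the existing callers need touching: inserting an optional parameter before the trailing `CancellationToken` breaks any call that passed the token positionally. The sketch below illustrates this with in-memory stand-ins. The `title` parameter, the constructor, and the injected normalizer delegate are assumptions for illustration; the real `TaskRepository.CreateChildAsync` takes other arguments and persists to the database.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified entity; the real TaskEntity has more fields.
public sealed class TaskEntity
{
    public string Title { get; set; } = "";
    public string? Model { get; set; }
}

// Hypothetical in-memory repository; nothing is persisted here.
public sealed class TaskRepository
{
    // Stand-in for ModelRegistry.NormalizeAlias from Task 1.
    private readonly Func<string?, string?> _normalizeAlias;

    public TaskRepository(Func<string?, string?> normalizeAlias)
        => _normalizeAlias = normalizeAlias;

    // 'model' sits before the trailing CancellationToken, so a caller that used to
    // pass the token positionally must now write 'ct: token'.
    public Task<TaskEntity> CreateChildAsync(
        string title,
        string? model = null,
        CancellationToken ct = default)
    {
        ct.ThrowIfCancellationRequested();

        // Validation happens here, so every child-creation path shares it.
        var child = new TaskEntity { Title = title, Model = _normalizeAlias(model) };
        return Task.FromResult(child);
    }
}
```

A caller that keeps the default writes `CreateChildAsync("Fix typo", ct: token)` and gets a child whose `Model` is null, provided the normalizer maps null to null as in Task 1.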
## Task 3 — Planning + improvement MCP tools forward model
- `PlanningMcpService.CreateChildTask`: add `string? model` param after
`commitType`; pass to `CreateChildAsync`. Extend `[Description]` to document
the model arg (haiku/sonnet/opus; cheapest capable).
- `TaskRunMcpService.SuggestImprovement`: add `string? model` param after
`description`; pass to `CreateChildAsync`. Extend `[Description]`.
- Tests: each tool persists the model; invalid value throws.
## Task 4 — External AddTask forwards model
- `ExternalMcpService.AddTask`: add `string? model = null` param (before the
trailing `CancellationToken`); `entity.Model = ModelRegistry.NormalizeAlias(model)`.
Extend `[Description]`.
- Test: AddTask persists model; invalid value rejected.
## Task 5 — Prompt guidance
- `PromptFiles.PlanningSystemDefault`: add a short paragraph — assign each
subtask the cheapest model that does it well, with ordering haiku < sonnet <
opus and the heuristic; pass it as `CreateChildTask(model=...)`.
- `PromptFiles.SystemDefault` Out-of-scope section: when filing via
`SuggestImprovement`, pass the cheapest capable `model`.
- `PromptFiles.ImprovementChildDefault`: one-line minimality reminder.
- No test (static prompt text); verify build only.
## Task 6 — Verify
- Build App + Worker `-c Release`; run Worker.Tests + Data.Tests.
- Update `ClaudeDo.Worker/CLAUDE.md` (ConfigMcpTools/creation-tool notes) and
`ClaudeDo.Data/CLAUDE.md` (ModelRegistry) if needed.

@@ -0,0 +1,80 @@
# Per-task model override via MCP + cheapest-model prompt guidance
Date: 2026-06-09
## Goal
Let Claude pick the model for each task it generates (planning subtasks,
improvement follow-ups, external task creation) directly at creation time via
MCP, and instruct Claude — in the relevant prompts — to choose the *cheapest*
model that can do the job well.
## Background
- `TaskEntity.Model` (nullable) already exists and is resolved
task → list-config → global default in `TaskRunner.ResolveConfigAsync`, then
passed to the CLI as `--model` by `ClaudeArgsBuilder`.
- Today the model can only be set *after* creation via `set_task_config`
(`ConfigMcpTools.SetTaskConfig`). The creation tools (`CreateChildTask`,
`SuggestImprovement`, `AddTask`) accept no model, so assigning one is a
two-call dance.
- `ModelRegistry.Aliases = ["sonnet","opus","haiku"]`; no cost ordering or
validation helper exists.
No schema change is required — only plumbing a `model` argument through the
creation paths plus prompt edits.
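
The resolution order in the first Background bullet (task, then list config, then global default) amounts to a first-non-null lookup. A minimal sketch, assuming each level is available as a plain string; the helper name is hypothetical and the real logic in `TaskRunner.ResolveConfigAsync` may be structured differently:

```csharp
#nullable enable

// Hypothetical helper, for illustration only.
public static class ModelResolution
{
    // First non-null value wins: task override, then list config, then global default.
    public static string Resolve(string? taskModel, string? listModel, string globalDefault)
        => taskModel ?? listModel ?? globalDefault;
}
```

This is why a null `TaskEntity.Model` means "inherit": the task level simply drops out of the chain.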
## Decisions
- **Validation:** strict alias-only. `model` must be one of haiku/sonnet/opus
(case-insensitive); blank/null means "inherit" (no override); anything else
throws an MCP error so Claude self-corrects immediately rather than the task
failing later at CLI runtime.
- **`AddSubtask` is out of scope:** it creates a `SubtaskEntity` (a checklist
step), which is never independently executed — a model there is a no-op.
- **Improvement-child prompt:** the child's model is fixed at filing time and
it cannot re-pick, so only a one-line "this is an intentionally small/cheap
unit — stay minimal" reminder is added. The real model-choice instruction
lives in the main system prompt's SuggestImprovement guidance.
## Cost ordering & heuristic (single source: `ModelRegistry.ByCostAscending`)
`haiku < sonnet < opus`
- **haiku** — trivial/mechanical: doc tweaks, simple renames, small localized edits.
- **sonnet** — normal coding work (default).
- **opus** — complex architecture, cross-cutting changes, hard debugging.
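
One way to read the ordering as a single source of truth: given the set of models judged capable of a task, the cheapest is the first entry of the ordering that appears in that set. The sketch below is illustrative only. `CheapestOf` is a hypothetical helper that this spec does not propose adding, and the array mirrors the planned `ModelRegistry.ByCostAscending`.

```csharp
#nullable enable
using System;

// Hypothetical illustration; not part of the planned ModelRegistry API.
public static class ModelCost
{
    // Mirrors the proposed ModelRegistry.ByCostAscending: cheapest first.
    public static readonly string[] ByCostAscending = { "haiku", "sonnet", "opus" };

    // Returns the cheapest alias among the capable ones, or null if none is known.
    public static string? CheapestOf(params string[] capable)
    {
        foreach (var alias in ByCostAscending)
        {
            if (Array.Exists(capable,
                    c => string.Equals(c, alias, StringComparison.OrdinalIgnoreCase)))
                return alias;
        }
        return null;
    }
}
```

In practice the choice is made by Claude following the prompt guidance, not by code; the sketch only makes the ordering concrete.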
## Changes
1. **`ClaudeDo.Data/Models/ModelRegistry.cs`**
- `ByCostAscending = ["haiku","sonnet","opus"]`.
- `string? NormalizeAlias(string? model)` — trim; null/blank → null;
case-insensitive match → canonical lowercase alias; else throw
`ArgumentException` with the allowed list.
2. **`TaskRepository.CreateChildAsync`** — add optional `string? model = null`;
set `child.Model = ModelRegistry.NormalizeAlias(model)`. Single choke-point
for both child-creation MCP tools.
3. **MCP creation tools** (add `model` param, document in `[Description]`):
- `PlanningMcpService.CreateChildTask` → forward to `CreateChildAsync`.
- `TaskRunMcpService.SuggestImprovement` → forward to `CreateChildAsync`.
- `ExternalMcpService.AddTask` → `NormalizeAlias` then set `entity.Model`.
4. **Prompts (`PromptFiles.cs`)**
- `PlanningSystemDefault` — instruct the planner to pass each
`CreateChildTask` the cheapest capable model (with the ordering/heuristic).
- `SystemDefault` (Out-of-scope improvements) — when filing via
`SuggestImprovement`, pass the cheapest capable `model`.
- `ImprovementChildDefault` — one-line minimality reminder.
5. **Tests** (no real CLI):
- `NormalizeAlias`: valid aliases (any case), blank/null → null, unknown → throws.
- `CreateChildTask` / `SuggestImprovement` / `AddTask` persist the model;
invalid model is rejected.
## Out of scope
- No DB migration. No locale changes (prompts and MCP descriptions are not
localized). No UI changes (existing per-task model display already covers it).