# Reusable Child Tasks + Agent Improvement Loop — Design

Date: 2026-06-04

## Goal

Let an executing task agent offload out-of-scope improvements it spots into
**child tasks** that run automatically, so ClaudeDo can drive a self-improvement
loop. Generalize the parent/child machinery that planning uses today into a
reusable subsystem not bound to planning.

Example: while implementing task X, Claude notices "this module should really be
refactored, but that's out of scope". Instead of scope-creeping, it calls a tool
that files the refactor as a child of X. The child runs on its own; once all of
X's children finish, X surfaces for review with its whole tree visible.

This builds on the bundled-prompts overhaul (`system.md` gains one instruction to
use the offload tool). It is otherwise independent.

## Lifecycle

A new task status `WaitingForChildren` is added.

```
Running            → WaitingForReview    standalone success, no children  (existing)
Running            → WaitingForChildren  standalone success, ≥1 child     (new)
Running            → Done                planning child success           (existing)
WaitingForChildren → WaitingForReview    all children terminal            (new)
WaitingForChildren → Cancelled           cancel                           (new)
```

- Improvement-children are created `Idle` **during** the parent's run and stay
  unqueued until the parent's own run finishes, which keeps the parent and a
  child from working in the same repo concurrently.
- When the parent's run succeeds and it has ≥1 non-terminal child, the parent goes
  to `WaitingForChildren` and its children are enqueued. They then run under the
  normal queue, governed by max-parallel; they are independent, not a forced
  sequential chain like planning.
- Children run automatically and reach `Done` on success without their own review
  gate (a per-child review would stall the loop). Each child still produces its
  own worktree/commit; those worktrees are surfaced under the parent for merge.
- Children emit `CLAUDEDO_BLOCKED:` markers like any run (see the prompt-overhaul
  spec). Each child's collected problems roll up onto the **parent's** review card,
  so a parent in `WaitingForReview` shows "child N reported a problem" alongside
  its own roadblocks.
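
The routing described by the table and bullets above can be sketched as a small decision function. This is an illustrative Python sketch, not the project's C# code: the enum member names, `status_after_successful_run`, and `children_to_enqueue` are hypothetical, and it assumes any child (planning or improvement) goes straight to `Done` on success, as the bullets state.

```python
from enum import Enum


class TaskStatus(Enum):
    IDLE = "Idle"
    RUNNING = "Running"
    WAITING_FOR_CHILDREN = "WaitingForChildren"
    WAITING_FOR_REVIEW = "WaitingForReview"
    DONE = "Done"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Statuses this design treats as terminal for a child.
TERMINAL = {TaskStatus.DONE, TaskStatus.FAILED, TaskStatus.CANCELLED}


def status_after_successful_run(is_child: bool, child_statuses: list[TaskStatus]) -> TaskStatus:
    """Pick the status a task moves to when its own run succeeds (hypothetical helper)."""
    if is_child:
        # Children skip the review gate.
        return TaskStatus.DONE
    has_open_child = any(s not in TERMINAL for s in child_statuses)
    if has_open_child:
        return TaskStatus.WAITING_FOR_CHILDREN
    return TaskStatus.WAITING_FOR_REVIEW


def children_to_enqueue(child_statuses: dict[str, TaskStatus]) -> list[str]:
    """Children were created Idle during the parent's run; they are queued only now."""
    return [child_id for child_id, s in child_statuses.items() if s is TaskStatus.IDLE]
```

A parent with one `Idle` child would move to `WaitingForChildren` and hand that child's id to the queue; a parent with no children keeps the existing path to `WaitingForReview`.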

## Worktree topology & merge

The correctness rule that makes this work:

- **Children base off the parent's worktree HEAD, not the list's base branch.**
  The parent's code work lives only on `claudedo/{parentId}` until merged, so a
  child refactoring code the parent just wrote must branch from the parent's HEAD
  to see it. (Planning children base off the target branch because a planning
  parent writes no code; improvement parents do, hence the difference.) The
  per-run worktree setup takes the base commit from the parent task's recorded
  worktree HEAD when `ParentTaskId` is set and the parent is a non-planning task.
- **Fan-out:** all children branch off the same parent HEAD and run independently
  (parallel allowed). Parent-dependency is always satisfied; sibling overlaps
  surface later as merge conflicts.
- **Merge reuses the planning orchestrator,** generalized into a shared
  "tree merge": build an integration branch off the target, then sequentially
  `merge --no-ff` the **parent's own branch** followed by each child branch,
  pausing on conflict (continue / abort), exactly as `PlanningMergeOrchestrator`
  / `PlanningAggregator` do today. Approving the parent triggers this one guided
  flow, merging parent + all children in as few steps as possible. Because
  children descend from the parent HEAD, the parent's commits are shared ancestors
  and merge cleanly ahead of the children.
- The parent advances to `WaitingForReview` once **all** children are terminal
  (`Done`, `Failed`, or `Cancelled`), so a failed child can't wedge the
  parent forever. Failed/cancelled children are flagged on the review card.

Planning parents keep their existing behavior (parent → `Done` when its chain
finishes); they do not use `WaitingForChildren`.
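
The base-commit rule and the merge order from this section can be sketched as two pure functions. This is a Python illustration under stated assumptions, not the project's C# implementation: `TaskInfo`, `base_ref_for`, `tree_merge_plan`, and the integration branch name are hypothetical; child branches are assumed to follow the same `claudedo/{id}` naming as the parent; and falling back to the list base when no parent HEAD is recorded is an assumption the design does not spell out. Conflict pausing is not modeled; the function only returns the git argument lists in order.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskInfo:
    task_id: str
    parent_task_id: Optional[str] = None
    is_planning: bool = False
    worktree_head: Optional[str] = None  # commit recorded for the task's worktree


def base_ref_for(task: TaskInfo, parent: Optional[TaskInfo], list_base_branch: str) -> str:
    """Choose the ref a task's worktree is created from."""
    if (
        task.parent_task_id is not None
        and parent is not None
        and not parent.is_planning
        and parent.worktree_head is not None
    ):
        # Improvement child: branch from the parent's recorded HEAD so it sees the parent's code.
        return parent.worktree_head
    # Top-level tasks and planning children (assumed fallback when no parent HEAD is recorded).
    return list_base_branch


def tree_merge_plan(target: str, integration: str, parent: TaskInfo, child_ids: list[str]) -> list[list[str]]:
    """Integration branch off the target, then the parent's branch, then each child branch."""
    plan = [["git", "switch", "-c", integration, target]]
    for task_id in [parent.task_id, *child_ids]:
        # Branch naming for children is assumed to match the parent's claudedo/{id} pattern.
        plan.append(["git", "merge", "--no-ff", f"claudedo/{task_id}"])
    return plan
```

Merging the parent's branch first means each later child merge only has to bring in the commits that child added on top of the parent HEAD.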

## Consolidating the child subsystem

Today child handling is planning-coupled. Generalize:

- **`TaskRepository.CreateChildAsync`:** drop the `parent.PlanningPhase != None`
  guard. A child can attach to any existing parent. (Planning callers are
  unaffected; their parents have a planning phase.) The child sets
  `ParentTaskId = parentId`; the caller decides `CreatedBy`.
- **Child-completion coordinator:** generalize planning's
  `OnChildFinishedAsync` / `TryCompleteParentAsync` into a single component that,
  on any child reaching a terminal state, checks the parent and applies a
  **completion policy**:
  - *planning parent* → finalize/Done (existing chain advancement stays in the
    planning layer: unblock the next chained child).
  - *improvement parent* (in `WaitingForChildren`, all children terminal) →
    `WaitingForReview`.
- `TaskStateService` remains the sole writer of `Status` and owns the new
  transitions (`SubmitForChildrenAsync`, the `WaitingForChildren → WaitingForReview`
  advance).
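
One way to picture the completion policy is a function that the coordinator calls whenever a child reaches a terminal state. The Python sketch below is illustrative only: `parent_status_after_child_finished` is a hypothetical name, statuses are plain strings, and it simplifies the planning case by assuming the chain counts as finished once every child is terminal, which the design does not state.

```python
from typing import Optional

TERMINAL = {"Done", "Failed", "Cancelled"}


def parent_status_after_child_finished(
    parent_is_planning: bool, parent_status: str, child_statuses: list[str]
) -> Optional[str]:
    """Return the parent's new status, or None when the parent should stay as it is."""
    if not all(status in TERMINAL for status in child_statuses):
        return None  # some sibling is still queued or running
    if parent_is_planning:
        # Assumed simplification: the planning chain is finished when all children are terminal.
        return "Done"
    if parent_status == "WaitingForChildren":
        return "WaitingForReview"
    return None  # e.g. the parent's own run has not finished yet
```

Returning a target status instead of writing it keeps `TaskStateService` as the only component that changes `Status`.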

## The offload tool

A narrow MCP tool exposed only to task runs (not the general external surface):

```
SuggestImprovement(title, description) → { childTaskId }
```

- The **server** stamps everything; the agent cannot choose the parent, the
  status, or queue anything directly:
  - `ParentTaskId = <calling task id>`
  - `CreatedBy = <calling task id>` (unambiguous "agent-suggested improvement"
    marker, distinct from `null` user/planning tasks and `"mcp"` external tasks)
  - `Status = Idle`, same `ListId` as the parent.
- **One layer deep:** the tool rejects the call if the calling task already has a
  `ParentTaskId` (a child cannot spawn children).
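
The stamping rules and the one-layer-deep guard above could look roughly like this on the server side. This is a Python sketch with an in-memory dictionary standing in for the task store; `Task`, `NestedChildError`, and `suggest_improvement` are hypothetical names, and the id format is an assumption. The calling task id is passed in by the server, never by the model.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    list_id: str
    title: str
    description: str = ""
    status: str = "Idle"
    parent_task_id: Optional[str] = None
    created_by: Optional[str] = None


class NestedChildError(Exception):
    """Raised when a child task tries to file its own child."""


def suggest_improvement(
    tasks: dict[str, Task], calling_task_id: str, title: str, description: str
) -> dict[str, str]:
    caller = tasks[calling_task_id]
    if caller.parent_task_id is not None:
        # One layer deep: a child cannot spawn children.
        raise NestedChildError("a child task cannot spawn children")
    child = Task(
        id=uuid.uuid4().hex,        # id format is an assumption
        list_id=caller.list_id,     # same list as the parent
        title=title,
        description=description,
        status="Idle",              # created Idle; queued only after the parent's run
        parent_task_id=caller.id,   # stamped by the server
        created_by=caller.id,       # marks the task as agent-suggested
    )
    tasks[child.id] = child
    return {"childTaskId": child.id}
```

The agent supplies only `title` and `description`; every other field comes from the caller's record.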

### Knowing the caller's identity

The always-on external `claudedo` MCP is shared and can't tell which task is
calling. So task runs get a **per-run MCP identity**, mirroring planning's
per-session token:

- `TaskRunner` mints a per-run token and writes a run-scoped `.mcp.json` (or
  reuses the global server with a token header) so the offload tool resolves
  token → calling task id server-side. A `TaskRunMcpContextAccessor` exposes the
  current task id to the tool, the same way `PlanningMcpContextAccessor` does.
- This is the reliable path for both correct provenance and the one-layer-deep
  guard, because the id is never supplied by the model.

`system.md` gains a short instruction (from the prompt-overhaul spec):

```markdown
## Out-of-scope improvements
If you notice worthwhile work that is genuinely outside this task's scope
(a refactor, a follow-up, tech debt), do NOT do it here. File it with
SuggestImprovement(title, description) and stay focused on the task at hand.
```
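
The per-run identity described in this subsection amounts to a server-side map from token to task id. A minimal Python sketch follows, assuming an in-memory registry: `RunTokenRegistry` and its methods are hypothetical, and revoking the token when the run ends is an assumption, not something the design specifies.

```python
import secrets
from typing import Optional


class RunTokenRegistry:
    """Maps per-run tokens to the task id that owns the run."""

    def __init__(self) -> None:
        self._task_by_token: dict[str, str] = {}

    def mint(self, task_id: str) -> str:
        """Create an unguessable token for one run of the given task."""
        token = secrets.token_urlsafe(32)
        self._task_by_token[token] = task_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        """Return the calling task id, or None for an unknown token."""
        return self._task_by_token.get(token)

    def revoke(self, token: str) -> None:
        """Forget the token, for example when the run ends (assumed lifecycle)."""
        self._task_by_token.pop(token, None)
```

The offload tool would call `resolve` with the token from the request and reject the call when it returns `None`.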

## UI

- **Collapsible tree:** children group under their parent (by `ParentTaskId`).
  Improvement-children are visually marked as agent-suggested (via
  `CreatedBy == parentId`).
- **New status chip** for `WaitingForChildren` (e.g. amber "waiting on N
  improvements") with its own color in `StatusColorConverter`.
- **Review card** for a parent in `WaitingForReview` lists child outcomes
  (done/failed) and their rolled-up `CLAUDEDO_BLOCKED` problems, and drives the
  shared tree-merge (parent + children) via the planning-style sequential flow
  with conflict pause/continue/abort.
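
The roll-up shown on the review card could be assembled along these lines. The Python sketch below is illustrative: `ChildResult`, `blocked_problems`, and `review_card_rows` are hypothetical, and it assumes each marker sits on its own output line as `CLAUDEDO_BLOCKED: <problem text>`. The real marker format is defined in the prompt-overhaul spec and may differ.

```python
from dataclasses import dataclass, field

BLOCKED_PREFIX = "CLAUDEDO_BLOCKED:"


@dataclass
class ChildResult:
    title: str
    status: str  # "Done", "Failed", or "Cancelled"
    output_lines: list[str] = field(default_factory=list)


def blocked_problems(output_lines: list[str]) -> list[str]:
    """Collect the text after each marker (assumed format: one marker per line)."""
    return [
        line.strip()[len(BLOCKED_PREFIX):].strip()
        for line in output_lines
        if line.strip().startswith(BLOCKED_PREFIX)
    ]


def review_card_rows(children: list[ChildResult]) -> list[str]:
    """One outcome row per child, followed by that child's reported problems."""
    rows: list[str] = []
    for index, child in enumerate(children, start=1):
        rows.append(f"child {index} ({child.title}): {child.status}")
        for problem in blocked_problems(child.output_lines):
            rows.append(f"child {index} reported a problem: {problem}")
    return rows
```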

## Data / migration

- Add `WaitingForChildren` to the `TaskStatus` enum and its EF `ValueConverter`.
  No new columns: `ParentTaskId` and `CreatedBy` already exist. No backfill
  needed (no existing rows use the new value).

## Touch points

- `src/ClaudeDo.Data/Models/TaskStatus` (enum) + `TaskEntityConfiguration`: new value.
- `src/ClaudeDo.Data/Repositories/TaskRepository.cs`: generalize `CreateChildAsync`.
- `src/ClaudeDo.Worker/State/TaskStateService.cs`: `WaitingForChildren` transitions.
- `src/ClaudeDo.Worker/Runner/TaskRunner.cs`: route to `WaitingForChildren` when
  children exist; enqueue children on parent finish; mint per-run MCP token.
- New: child-completion coordinator (generalized from planning) + the offload tool
  (e.g. `TaskRunMcpService.SuggestImprovement`) + `TaskRunMcpContextAccessor` +
  token auth (mirrors `PlanningTokenAuth`).
- `src/ClaudeDo.Worker/Planning/*`: refactor planning to consume the shared
  child-completion coordinator and the shared tree-merge; keep chain-specific
  advancement local. Generalize `PlanningMergeOrchestrator` / `PlanningAggregator`
  into a reusable tree-merge that also folds in the parent's own branch.
- Worktree setup (`TaskRunner` / `WorktreeManager`): base an improvement-child's
  worktree on the parent task's recorded worktree HEAD instead of the list base.
- UI: tree grouping, `WaitingForChildren` chip/color, parent review card with
  child outcomes + rolled-up roadblocks + the merge flow.
- Tests: offload tool stamps parent/createdBy + rejects nested calls;
  parent → `WaitingForChildren` → `WaitingForReview` lifecycle; child worktree
  bases off parent HEAD; tree-merge folds parent + children; planning regression
  (still reaches Done).

## Open questions for review

1. **Failed child:** parent still advances to `WaitingForReview` with the failure
   flagged (default), vs. parent → `Failed` if any child failed.

## Out of scope

- Multi-level nesting (only one layer deep by design).
- Per-list "disable improvement offload" toggle (could come later; the tool is
  always available to top-level runs for now).
- Changes to how planning sets up its sequential chain.