
mika kuns 84fd2c11a0 docs: child base off parent HEAD, shared planning-style tree merge

Children fan out from the parent's worktree HEAD and merge via a
generalized planning orchestrator (parent branch + children); child
roadblocks roll up to the parent review card.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-06-04 13:45:54 +02:00


Reusable Child Tasks + Agent Improvement Loop — Design

Date: 2026-06-04

Goal

Let an executing task agent offload out-of-scope improvements it spots into child tasks that run automatically, so ClaudeDo can drive a self-improvement loop. Generalize the parent/child machinery that planning uses today into a reusable subsystem not bound to planning.

Example: while implementing task X, Claude notices "this module should really be refactored, but that's out of scope" — instead of scope-creeping, it calls a tool that files the refactor as a child of X. The child runs on its own; once all of X's children finish, X surfaces for review with its whole tree visible.

This builds on the bundled-prompts overhaul (system.md gains one instruction to use the offload tool). It is otherwise independent.

Lifecycle

A new task status WaitingForChildren is added.

Running → WaitingForReview   standalone success, no children   (existing)
Running → WaitingForChildren standalone success, ≥1 child       (new)
Running → Done               planning child success            (existing)
WaitingForChildren → WaitingForReview   all children terminal  (new)
WaitingForChildren → Cancelled          cancel                 (new)

Improvement-children are created Idle during the parent's run and stay unqueued until the parent's own run finishes — this avoids the parent and a child working the same repo concurrently.
When the parent's run succeeds and it has ≥1 non-terminal child, the parent goes to WaitingForChildren and its children are enqueued (they then run under the normal queue, governed by max-parallel — they are independent, not a forced sequential chain like planning).
Children run automatically and reach Done on success without their own review gate (a per-child review would stall the loop). Each child still produces its own worktree/commit; those worktrees are surfaced under the parent for merge.
Children emit CLAUDEDO_BLOCKED: markers like any run (see the prompt-overhaul spec). Each child's collected problems roll up onto the parent's review card, so a parent in WaitingForReview shows "child N reported a problem" alongside its own roadblocks.
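The routing decision at the end of a successful run can be sketched as below. This is a minimal illustration of the transition table above, not the actual TaskStateService code; the Python names (`TaskStatus`, `status_after_successful_run`, `is_child`) are hypothetical, and the sketch assumes a child is identified simply by having a parent.

```python
from __future__ import annotations

from enum import Enum


class TaskStatus(Enum):
    IDLE = "Idle"
    RUNNING = "Running"
    WAITING_FOR_CHILDREN = "WaitingForChildren"
    WAITING_FOR_REVIEW = "WaitingForReview"
    DONE = "Done"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Terminal states as listed in this spec: Done, Failed, Cancelled.
TERMINAL = {TaskStatus.DONE, TaskStatus.FAILED, TaskStatus.CANCELLED}


def status_after_successful_run(
    is_child: bool, child_statuses: list[TaskStatus]
) -> TaskStatus:
    """Pick the status a task moves to when its own run succeeds."""
    if is_child:
        # Children (planning or improvement) skip the review gate.
        return TaskStatus.DONE
    # Any child that still has work to do holds the parent back.
    if any(s not in TERMINAL for s in child_statuses):
        return TaskStatus.WAITING_FOR_CHILDREN
    return TaskStatus.WAITING_FOR_REVIEW
```

For example, a standalone task that filed one improvement (still Idle) during its run would be routed to WaitingForChildren, and the caller would then enqueue that child.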

Worktree topology & merge

The correctness rule that makes this work:

Children base off the parent's worktree HEAD, not the list's base branch. The parent's code work lives only on claudedo/{parentId} until merged, so a child refactoring code the parent just wrote must branch from the parent's HEAD to see it. (Planning children base off the target branch because a planning parent writes no code — improvement parents do, hence the difference.) The per-run worktree setup takes the base commit from the parent task's recorded worktree HEAD when ParentTaskId is set and the parent is a non-planning task.
Fan-out: all children branch off the same parent HEAD and run independently (parallel allowed). Each child's dependency on the parent's code is therefore always satisfied; overlapping edits between siblings surface later as merge conflicts.
Merge reuses the planning orchestrator, generalized into a shared "tree merge": build an integration branch off the target, then sequentially merge --no-ff the parent's own branch followed by each child branch, pausing on conflict (continue / abort), exactly as PlanningMergeOrchestrator / PlanningAggregator do today. Approving the parent triggers this one guided flow, merging parent + all children in as few steps as possible. Because children descend from the parent HEAD, the parent's commits are shared ancestors and merge cleanly ahead of the children.
The parent advances to WaitingForReview once all children are terminal — counting Done, Failed, and Cancelled, so a failed child can't wedge the parent forever. Failed/cancelled children are flagged on the review card.
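The base-commit rule and the merge ordering described above could look roughly like this. It is a sketch under assumptions: the `Task` shape and function names are hypothetical, the parent's worktree HEAD is assumed to be recorded already, and child branches are assumed to follow the same `claudedo/{id}` naming the spec gives for the parent.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    parent_task_id: Optional[str] = None
    is_planning: bool = False
    worktree_head: Optional[str] = None  # commit recorded for the task's worktree


def resolve_base_commit(task: Task, parent: Optional[Task], list_base: str) -> str:
    """Choose the commit a task's worktree should branch from."""
    if parent is not None and task.parent_task_id == parent.id and not parent.is_planning:
        # Improvement child: it must see the code the parent just wrote.
        if parent.worktree_head is None:
            raise ValueError("parent has no recorded worktree HEAD")
        return parent.worktree_head
    # Standalone tasks and planning children keep using the list's base.
    return list_base


def tree_merge_order(parent: Task, children: list[Task]) -> list[str]:
    """Branches to merge --no-ff into the integration branch, in order."""
    # Parent first, so its commits are already present when children merge.
    return [f"claudedo/{parent.id}"] + [f"claudedo/{c.id}" for c in children]
```

The caller would walk the returned list one branch at a time and pause on the first conflict, which matches the continue / abort flow described for the planning orchestrator.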

Planning parents keep their existing behavior (parent → Done when its chain finishes); they do not use WaitingForChildren.

Consolidating the child subsystem

Today child handling is planning-coupled. Generalize:

TaskRepository.CreateChildAsync — drop the parent.PlanningPhase != None guard. A child can attach to any existing parent. (Planning callers are unaffected; their parents have a planning phase.) The child sets ParentTaskId = parentId; the caller decides CreatedBy.
Child-completion coordinator — generalize planning's OnChildFinishedAsync / TryCompleteParentAsync into a single component that, on any child reaching a terminal state, checks the parent and applies a completion policy:
- planning parent → finalize/Done (existing chain advancement stays in the planning layer: unblock the next chained child).
- improvement parent (in WaitingForChildren, all children terminal) → WaitingForReview.
TaskStateService remains the sole writer of Status and owns the new transitions (SubmitForChildrenAsync, the WaitingForChildren → WaitingForReview advance).
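A completion policy of this kind might be sketched as follows. The function and enum names are hypothetical, and the planning branch is deliberately simplified: the real finalize rules and chain advancement stay in the planning layer, as stated above.

```python
from __future__ import annotations

from enum import Enum
from typing import Optional


class Status(Enum):
    RUNNING = "Running"
    WAITING_FOR_CHILDREN = "WaitingForChildren"
    WAITING_FOR_REVIEW = "WaitingForReview"
    DONE = "Done"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


TERMINAL = {Status.DONE, Status.FAILED, Status.CANCELLED}


def on_child_finished(
    parent_status: Status,
    parent_is_planning: bool,
    child_statuses: list[Status],
) -> Optional[Status]:
    """Return the parent's next status, or None if the parent should not move."""
    if not all(s in TERMINAL for s in child_statuses):
        return None  # some child is still pending or running
    if parent_is_planning:
        return Status.DONE  # simplified stand-in for planning's finalize step
    if parent_status is Status.WAITING_FOR_CHILDREN:
        # Failed and cancelled children count as terminal, so they cannot wedge the parent.
        return Status.WAITING_FOR_REVIEW
    return None
```

Returning the target status rather than writing it keeps the single-writer rule intact: the coordinator decides, and TaskStateService applies the change.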

The offload tool

A narrow MCP tool exposed only to task runs (not the general external surface):

SuggestImprovement(title, description) → { childTaskId }

The server stamps everything — the agent cannot choose the parent, the status, or queue anything directly:
- ParentTaskId = <calling task id>
- CreatedBy = <calling task id> (unambiguous "agent-suggested improvement" marker — distinct from null user/planning tasks and "mcp" external tasks)
- Status = Idle, same ListId as the parent.
One layer deep: the tool rejects the call if the calling task already has a ParentTaskId (a child cannot spawn children).
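The server-side stamping can be illustrated with a small in-memory sketch. Everything here apart from the field names taken from this spec (ParentTaskId, CreatedBy, Status, ListId, childTaskId) is hypothetical, including the `Task` shape, the dictionary store, and the error type.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Task:
    id: str
    list_id: str
    title: str
    description: str = ""
    status: str = "Idle"
    parent_task_id: Optional[str] = None
    created_by: Optional[str] = None


class NestedOffloadError(Exception):
    """Raised when a child task tries to file its own children."""


def suggest_improvement(
    tasks: dict[str, Task], calling_task_id: str, title: str, description: str
) -> dict[str, str]:
    # calling_task_id comes from the server's per-run context, never from the model.
    caller = tasks[calling_task_id]
    if caller.parent_task_id is not None:
        raise NestedOffloadError("a child task cannot spawn children")
    child = Task(
        id=uuid.uuid4().hex,
        list_id=caller.list_id,      # same list as the parent
        title=title,
        description=description,
        status="Idle",               # stays unqueued until the parent's run finishes
        parent_task_id=caller.id,
        created_by=caller.id,        # marks the task as agent-suggested
    )
    tasks[child.id] = child
    return {"childTaskId": child.id}
```

The agent only supplies `title` and `description`; every other field is derived from the caller's record, which is what keeps provenance trustworthy.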

Knowing the caller's identity

The always-on external claudedo MCP is shared and can't tell which task is calling. So task runs get a per-run MCP identity, mirroring planning's per-session token:

TaskRunner mints a per-run token and writes a run-scoped .mcp.json (or reuses the global server with a token header) so the offload tool resolves token → calling task id server-side. A TaskRunMcpContextAccessor exposes the current task id to the tool, the same way PlanningMcpContextAccessor does.
This is the reliable path for both correct provenance and the one-layer-deep guard — the id is never supplied by the model.
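One way to picture the token-to-task resolution is a small registry that the runner writes to and the tool reads from. This is an illustrative sketch only; `RunTokenRegistry` and its methods are hypothetical and do not describe how PlanningTokenAuth is actually built.

```python
from __future__ import annotations

import secrets
from typing import Optional


class RunTokenRegistry:
    """Maps opaque per-run tokens to the task id that owns the run."""

    def __init__(self) -> None:
        self._task_by_token: dict[str, str] = {}

    def mint(self, task_id: str) -> str:
        # Unguessable token, so a run cannot impersonate another task.
        token = secrets.token_urlsafe(32)
        self._task_by_token[token] = task_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        # Unknown or revoked tokens resolve to None; the tool should reject them.
        return self._task_by_token.get(token)

    def revoke(self, token: str) -> None:
        # Called when the run ends so stale tokens stop working.
        self._task_by_token.pop(token, None)
```

In this picture the runner would mint a token when a run starts, hand it to the run's MCP configuration, and revoke it when the run ends.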

system.md gains a short instruction (from the prompt-overhaul spec):

## Out-of-scope improvements
If you notice worthwhile work that is genuinely outside this task's scope
(a refactor, a follow-up, tech debt), do NOT do it here. File it with
SuggestImprovement(title, description) and stay focused on the task at hand.

UI

Collapsible tree: children group under their parent (by ParentTaskId). Improvement-children are visually marked as agent-suggested (via CreatedBy == parentId).
New status chip for WaitingForChildren (e.g. amber "waiting on N improvements") with its own color in StatusColorConverter.
Review card for a parent in WaitingForReview lists child outcomes (done/failed) and their rolled-up CLAUDEDO_BLOCKED problems, and drives the shared tree-merge (parent + children) via the planning-style sequential flow with conflict pause/continue/abort.
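The grouping and the agent-suggested marker can be derived from the two existing columns alone. The sketch below assumes tasks arrive as plain dictionaries with hypothetical keys `id`, `parent_task_id`, and `created_by`; the real view models will differ.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any


def build_task_tree(tasks: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Group children under their parent and flag agent-suggested ones."""
    children_by_parent: dict[str, list[dict[str, Any]]] = defaultdict(list)
    roots: list[dict[str, Any]] = []
    for task in tasks:
        if task["parent_task_id"] is None:
            roots.append(task)
        else:
            children_by_parent[task["parent_task_id"]].append(task)

    tree = []
    for root in roots:
        tree.append({
            "task": root,
            "children": [
                {
                    "task": child,
                    # CreatedBy == parent id marks an agent-suggested improvement.
                    "agent_suggested": child["created_by"] == root["id"],
                }
                for child in children_by_parent[root["id"]]
            ],
        })
    return tree
```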

Data / migration

Add WaitingForChildren to the TaskStatus enum and its EF ValueConverter. No new columns — ParentTaskId and CreatedBy already exist. No backfill needed (no existing rows use the new value).

Touch points

src/ClaudeDo.Data/Models/TaskStatus (enum) + TaskEntityConfiguration — new value.
src/ClaudeDo.Data/Repositories/TaskRepository.cs — generalize CreateChildAsync.
src/ClaudeDo.Worker/State/TaskStateService.cs — WaitingForChildren transitions.
src/ClaudeDo.Worker/Runner/TaskRunner.cs — route to WaitingForChildren when children exist; enqueue children on parent finish; mint per-run MCP token.
New: child-completion coordinator (generalized from planning) + the offload tool (e.g. TaskRunMcpService.SuggestImprovement) + TaskRunMcpContextAccessor + token auth (mirrors PlanningTokenAuth).
src/ClaudeDo.Worker/Planning/* — refactor planning to consume the shared child-completion coordinator and the shared tree-merge; keep chain-specific advancement local. Generalize PlanningMergeOrchestrator / PlanningAggregator into a reusable tree-merge that also folds in the parent's own branch.
Worktree setup (TaskRunner / WorktreeManager) — base an improvement-child's worktree on the parent task's recorded worktree HEAD instead of the list base.
UI — tree grouping, WaitingForChildren chip/color, parent review card with child outcomes + rolled-up roadblocks + the merge flow.
Tests — offload tool stamps parent/createdBy + rejects nested calls; parent → WaitingForChildren → WaitingForReview lifecycle; child worktree bases off parent HEAD; tree-merge folds parent + children; planning regression (still reaches Done).

Open questions for review

Failed child: parent still advances to WaitingForReview with the failure flagged (default), vs. parent → Failed if any child failed.

Out of scope

Multi-level nesting (only one layer deep by design).
Per-list "disable improvement offload" toggle (could come later; the tool is always available to top-level runs for now).
Changes to how planning sets up its sequential chain.
