Agent offloads out-of-scope work via SuggestImprovement; children run automatically; new WaitingForChildren state; generalize planning's parent/child machinery. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Reusable Child Tasks + Agent Improvement Loop — Design
Date: 2026-06-04
Goal
Let an executing task agent offload out-of-scope improvements it spots into child tasks that run automatically, so ClaudeDo can drive a self-improvement loop. Generalize the parent/child machinery that planning uses today into a reusable subsystem not bound to planning.
Example: while implementing task X, Claude notices "this module should really be refactored, but that's out of scope" — instead of scope-creeping, it calls a tool that files the refactor as a child of X. The child runs on its own; once all of X's children finish, X surfaces for review with its whole tree visible.
This builds on the bundled-prompts overhaul (system.md gains one instruction to
use the offload tool). It is otherwise independent.
Lifecycle
A new task status WaitingForChildren is added.
Running → WaitingForReview: standalone success, no children (existing)
Running → WaitingForChildren: standalone success, ≥1 child (new)
Running → Done: planning child success (existing)
WaitingForChildren → WaitingForReview: all children terminal (new)
WaitingForChildren → Cancelled: cancel (new)
- Improvement-children are created Idle during the parent's run and stay unqueued until the parent's own run finishes. This avoids the parent and a child working the same repo concurrently.
- When the parent's run succeeds and it has ≥1 non-terminal child, the parent goes to WaitingForChildren and its children are enqueued. They then run under the normal queue, governed by max-parallel; they are independent, not a forced sequential chain like planning.
- Children run automatically and reach Done on success without their own review gate (a per-child review would stall the loop). Each child still produces its own worktree/commit; those worktrees are surfaced under the parent for merge.
- The parent advances to WaitingForReview once all children are terminal (Done, Failed, and Cancelled all count), so a failed child can't wedge the parent forever. Failed/cancelled children are flagged on the review card.
Planning parents keep their existing behavior (parent → Done when its chain
finishes); they do not use WaitingForChildren.
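The transitions above can be modeled as two small pure functions. This is a minimal sketch in Python rather than the project's own code: the function names and the `has_parent` flag are hypothetical, and sending a parent whose children are all already terminal straight to WaitingForReview is an assumption the design does not spell out.

```python
from enum import Enum
from typing import List


class TaskStatus(Enum):
    IDLE = "Idle"
    RUNNING = "Running"
    WAITING_FOR_CHILDREN = "WaitingForChildren"
    WAITING_FOR_REVIEW = "WaitingForReview"
    DONE = "Done"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Statuses that count as finished when deciding whether a parent may advance.
TERMINAL = {TaskStatus.DONE, TaskStatus.FAILED, TaskStatus.CANCELLED}


def status_after_successful_run(has_parent: bool,
                                child_statuses: List[TaskStatus]) -> TaskStatus:
    """Status a task moves to when its own run succeeds."""
    if has_parent:
        # Children (planning or improvement) skip the review gate.
        return TaskStatus.DONE
    if any(s not in TERMINAL for s in child_statuses):
        # New path: at least one child still has work to do.
        return TaskStatus.WAITING_FOR_CHILDREN
    # Existing path; also taken (by assumption) when every child is terminal.
    return TaskStatus.WAITING_FOR_REVIEW


def status_after_child_change(parent_status: TaskStatus,
                              child_statuses: List[TaskStatus]) -> TaskStatus:
    """Re-evaluate an improvement parent after one of its children changes."""
    if (parent_status is TaskStatus.WAITING_FOR_CHILDREN
            and all(s in TERMINAL for s in child_statuses)):
        return TaskStatus.WAITING_FOR_REVIEW
    return parent_status
```

Because Failed and Cancelled are in the terminal set, a parent with one failed and one finished child still advances to review instead of waiting indefinitely.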
Consolidating the child subsystem
Today child handling is planning-coupled. Generalize:
- TaskRepository.CreateChildAsync: drop the parent.PlanningPhase != None guard. A child can attach to any existing parent. (Planning callers are unaffected; their parents have a planning phase.) The child sets ParentTaskId = parentId; the caller decides CreatedBy.
- Child-completion coordinator: generalize planning's OnChildFinishedAsync/TryCompleteParentAsync into a single component that, on any child reaching a terminal state, checks the parent and applies a completion policy:
  - planning parent → finalize/Done (existing chain advancement stays in the planning layer: unblock the next chained child).
  - improvement parent (in WaitingForChildren, all children terminal) → WaitingForReview.
- TaskStateService remains the sole writer of Status and owns the new transitions (SubmitForChildrenAsync, the WaitingForChildren → WaitingForReview advance).
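The completion policy described above could look roughly like the following. This is a hedged sketch in Python, not the C# component: `Task`, `is_planning_parent`, and `decide_parent_status` are hypothetical names, and the function only returns a decision so that a separate state service (TaskStateService in the design) stays the sole writer of Status.

```python
from dataclasses import dataclass
from typing import Dict, Optional

TERMINAL = {"Done", "Failed", "Cancelled"}


@dataclass
class Task:
    id: str
    status: str
    parent_task_id: Optional[str] = None
    is_planning_parent: bool = False  # hypothetical stand-in for PlanningPhase != None


def decide_parent_status(tasks: Dict[str, Task], child_id: str) -> Optional[str]:
    """Return the status the parent should move to, or None for no change.

    Called when a child reaches a terminal state. It never writes Status
    itself; the caller hands the decision to the state service.
    """
    child = tasks[child_id]
    if child.parent_task_id is None or child.status not in TERMINAL:
        return None
    parent = tasks[child.parent_task_id]
    siblings = [t for t in tasks.values() if t.parent_task_id == parent.id]
    if any(t.status not in TERMINAL for t in siblings):
        return None  # some child is still pending or running
    if parent.is_planning_parent:
        return "Done"  # planning policy: finalize the parent
    if parent.status == "WaitingForChildren":
        return "WaitingForReview"  # improvement policy: surface for review
    return None
```

Keeping the decision separate from the write means planning and improvement parents share one trigger point while each transition still goes through a single owner.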
The offload tool
A narrow MCP tool exposed only to task runs (not the general external surface):
SuggestImprovement(title, description) → { childTaskId }
- The server stamps everything; the agent cannot choose the parent, the status, or queue anything directly:
  - ParentTaskId = <calling task id>
  - CreatedBy = <calling task id> (unambiguous "agent-suggested improvement" marker, distinct from null user/planning tasks and "mcp" external tasks)
  - Status = Idle, same ListId as the parent.
- One layer deep: the tool rejects the call if the calling task already has a ParentTaskId (a child cannot spawn children).
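A sketch of the server-side handler, under stated assumptions: it is written in Python for illustration, the in-memory `tasks` dict stands in for the repository, and `NestedOffloadError` plus the id generation are hypothetical. The point it illustrates is that the model supplies only the title and description, while the caller's id arrives from the server's own context.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Task:
    id: str
    list_id: str
    title: str
    description: str = ""
    status: str = "Idle"
    parent_task_id: Optional[str] = None
    created_by: Optional[str] = None


class NestedOffloadError(Exception):
    """Raised when a child task tries to spawn its own child."""


def suggest_improvement(tasks: Dict[str, Task], calling_task_id: str,
                        title: str, description: str) -> Dict[str, str]:
    # calling_task_id is resolved server-side, never taken from the model.
    caller = tasks[calling_task_id]
    if caller.parent_task_id is not None:
        raise NestedOffloadError("a child task cannot spawn children")
    child = Task(
        id=uuid.uuid4().hex,          # hypothetical id scheme
        list_id=caller.list_id,       # same list as the parent
        title=title,
        description=description,
        status="Idle",                # created unqueued
        parent_task_id=caller.id,     # stamped by the server
        created_by=caller.id,         # marks it as agent-suggested
    )
    tasks[child.id] = child
    return {"childTaskId": child.id}
```

A call from a top-level task returns the new child's id; the same call from that child raises, which enforces the one-layer-deep rule.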
Knowing the caller's identity
The always-on external claudedo MCP is shared and can't tell which task is
calling. So task runs get a per-run MCP identity, mirroring planning's
per-session token:
- TaskRunner mints a per-run token and writes a run-scoped .mcp.json (or reuses the global server with a token header) so the offload tool resolves token → calling task id server-side. A TaskRunMcpContextAccessor exposes the current task id to the tool, the same way PlanningMcpContextAccessor does.
- This is the reliable path for both correct provenance and the one-layer-deep guard: the id is never supplied by the model.
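The token → task id resolution can be pictured as a small registry. This Python sketch is illustrative only: `RunTokenRegistry` and its methods are hypothetical names, and revoking the token when the run ends is an assumption rather than something the design states.

```python
import secrets
from typing import Dict, Optional


class RunTokenRegistry:
    """Maps opaque per-run tokens to the task id that owns the run."""

    def __init__(self) -> None:
        self._task_by_token: Dict[str, str] = {}

    def mint(self, task_id: str) -> str:
        # Unguessable token, so a run cannot impersonate another task.
        token = secrets.token_urlsafe(32)
        self._task_by_token[token] = task_id
        return token

    def resolve(self, token: str) -> Optional[str]:
        # Returns None for unknown or revoked tokens.
        return self._task_by_token.get(token)

    def revoke(self, token: str) -> None:
        # Assumed cleanup step once the run has finished.
        self._task_by_token.pop(token, None)
```

The runner would hand the minted token to the run's MCP configuration, and the tool would call `resolve` on each request to learn which task is calling.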
system.md gains a short instruction (from the prompt-overhaul spec):
## Out-of-scope improvements
If you notice worthwhile work that is genuinely outside this task's scope
(a refactor, a follow-up, tech debt), do NOT do it here. File it with
SuggestImprovement(title, description) and stay focused on the task at hand.
UI
- Collapsible tree: children group under their parent (by ParentTaskId). Improvement-children are visually marked as agent-suggested (via CreatedBy == parentId).
- New status chip for WaitingForChildren (e.g. amber "waiting on N improvements") with its own color in StatusColorConverter.
- Review card for a parent in WaitingForReview lists child outcomes (done/failed) and exposes their worktrees for merge.
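The grouping and the agent-suggested marker both derive from the two existing columns. A minimal Python sketch, with `TaskRow`, `group_by_parent`, and `is_agent_suggested` as hypothetical names:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TaskRow:
    id: str
    parent_task_id: Optional[str] = None
    created_by: Optional[str] = None


def group_by_parent(rows: List[TaskRow]) -> Dict[str, List[TaskRow]]:
    """Collect children under their parent's id for the collapsible tree."""
    groups: Dict[str, List[TaskRow]] = defaultdict(list)
    for row in rows:
        if row.parent_task_id is not None:
            groups[row.parent_task_id].append(row)
    return dict(groups)


def is_agent_suggested(row: TaskRow) -> bool:
    """True when the child was filed by its own parent's run."""
    return row.parent_task_id is not None and row.created_by == row.parent_task_id
```

The length of a parent's group also gives the N for a "waiting on N improvements" chip, if the chip counts all children rather than only unfinished ones.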
Data / migration
- Add WaitingForChildren to the TaskStatus enum and its EF ValueConverter. No new columns: ParentTaskId and CreatedBy already exist. No backfill needed (no existing rows use the new value).
Touch points
- src/ClaudeDo.Data/Models/TaskStatus (enum) + TaskEntityConfiguration: new value.
- src/ClaudeDo.Data/Repositories/TaskRepository.cs: generalize CreateChildAsync.
- src/ClaudeDo.Worker/State/TaskStateService.cs: WaitingForChildren transitions.
- src/ClaudeDo.Worker/Runner/TaskRunner.cs: route to WaitingForChildren when children exist; enqueue children on parent finish; mint per-run MCP token.
- New: child-completion coordinator (generalized from planning) + the offload tool (e.g. TaskRunMcpService.SuggestImprovement) + TaskRunMcpContextAccessor + token auth (mirrors PlanningTokenAuth).
- src/ClaudeDo.Worker/Planning/*: refactor planning to consume the shared child-completion coordinator; keep chain-specific advancement local.
- UI: tree grouping, WaitingForChildren chip/color, parent review card with child outcomes.
- Tests: offload tool stamps parent/createdBy + rejects nested calls; parent → WaitingForChildren → WaitingForReview lifecycle; planning regression (still reaches Done).
Open questions for review
- Child review/merge: children reach Done without review and leave worktrees for manual merge. Is reviewing the whole tree at the parent enough, or do you want per-child merge controls in the parent's review card? (Default: surface child worktrees under the parent.)
- Failed child: parent still advances to WaitingForReview with the failure flagged (default), vs. parent → Failed if any child failed.
Out of scope
- Multi-level nesting (only one layer deep by design).
- Per-list "disable improvement offload" toggle (could come later; the tool is always available to top-level runs for now).
- Changes to how planning sets up its sequential chain.