Files

mika kuns 3e072fae66 docs(spec): waiting-for-review task state design

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-06-01 17:03:14 +02:00

8.1 KiB

Raw Blame History

Waiting for Review — Task State — Design

Date: 2026-06-01 Status: Approved (brainstorming) Scope: ClaudeDo.Data (TaskEntity, EF config + migration), ClaudeDo.Worker (TaskStateService, TaskRunner, QueueService, WorkerHub, ExternalMcpService), ClaudeDo.Ui (StatusColorConverter, TaskRowViewModel, views), CLAUDE.md docs

Problem

A successful task run currently transitions straight to Done and is considered complete. There is no gate for a human (or another agent) to review the result before it is accepted. We want review to be a mandatory step: after a successful run a task waits for an explicit approval, and a reviewer can send it back with feedback for another turn.

Goals

Add a WaitingForReview lifecycle state that a task enters automatically after a successful run.
Reviewer can approve (→ Done), reject-and-re-run (→ Queued, resuming the same Claude session with required feedback), reject-and-park (→ Idle), or cancel (→ Cancelled).
Reject-and-re-run reuses the existing session-resume mechanism so the agent continues with full context.
Both the desktop UI and the external MCP surface can perform review actions.

Non-Goals

No change to the failure path: a failed run still goes straight to Failed, never to WaitingForReview.
No change to planning-phase finalization. A planning parent that generates child tasks keeps its current behavior and does not route through review. Only ordinary executable runs (Running → success) are affected.
No change to worktree state flow (Active | Merged | Discarded | Kept).
No change to the in-run auto-retry-on-failure behavior; only the final successful completion routes to review.

Design

1. State machine

Changed/added transitions in bold:

From	To	Trigger
Idle	Queued	enqueue (unchanged)
Queued	Running	queue picker claim (unchanged)
Running	WaitingForReview	successful run (was → Done)
Running	Failed	failed run (unchanged)
Running	Cancelled	cancel during run (unchanged)
WaitingForReview	Done	approve
WaitingForReview	Queued	reject + required feedback → resume re-run
WaitingForReview	Idle	reject → park for manual edit
WaitingForReview	Cancelled	abandon an almost-done task
Done \| Failed \| Cancelled	Idle	reset (unchanged)

2. Data model

ClaudeDo.Data:

TaskStatus enum (Models/TaskEntity.cs): add WaitingForReview after Running.
EF string converter (Configuration/TaskEntityConfiguration.cs): map WaitingForReview ⇄ "waiting_for_review" (TEXT column, no schema constraint to change).
New nullable column ReviewFeedback : string? on TaskEntity. Holds the reviewer's rejection comment until the re-run consumes it, then it is cleared. Persisted so it survives a worker restart and is visible to the UI.
One EF migration: add the review_feedback column. No backfill — the new status value and column are only written going forward.

3. Worker — status transitions (`State/TaskStateService.cs`)

TaskStateService remains the sole owner of status writes. New/changed methods:

SubmitForReviewAsync(taskId) — Running → WaitingForReview. Sets FinishedAt and Result exactly as CompleteAsync does today. Called by TaskRunner on success instead of CompleteAsync. (CompleteAsync is retained for the approve path.)
ApproveReviewAsync(taskId) — WaitingForReview → Done.
RejectToQueueAsync(taskId, feedback) — WaitingForReview → Queued. Rejects empty/whitespace feedback with a failed TransitionResult. Stores feedback in ReviewFeedback. Wakes the queue.
RejectToIdleAsync(taskId) — WaitingForReview → Idle. Parks for manual editing; leaves Result intact, clears ReviewFeedback.
CancelAsync — extend the allowed source states to include WaitingForReview.

Each transition broadcasts TaskUpdated as today. Invalid source states return a failed TransitionResult (no throw), matching existing convention.

4. Resume-aware re-run (`Queue/QueueService.cs`)

The queue picker still atomically claims a Queued, unblocked task (UPDATE … SET status='running' … RETURNING *). The RETURNING row already carries ReviewFeedback. After a successful claim, QueueService branches:

ReviewFeedback set + latest run has a SessionId → TaskRunner.ContinueAsync(task, feedback) — --resume {sessionId} with feedback as the next-turn prompt.
ReviewFeedback set, no prior SessionId (edge case) → TaskRunner.RunAsync with the feedback appended to the task prompt, so the comment is not lost.
No ReviewFeedback → normal TaskRunner.RunAsync (fresh session).

ReviewFeedback is cleared once consumed (single UPDATE), so a later re-run does not re-apply stale feedback.

5. External MCP surface (`External/ExternalMcpService.cs`)

New tool review_task(taskId, decision, feedback?), decision ∈ {approve, reject_rerun, reject_park, cancel}. feedback is required when decision = reject_rerun (validation error otherwise). Maps onto the TaskStateService methods in §3. This lets automation / other agents act as reviewers.
get_task_status_values — add WaitingForReview with a description covering the four exit actions.
list_tasks status-filter parsing and validation message — include WaitingForReview.
get_task lifecycle description text — update to Idle → Queued → Running → WaitingForReview → Done | Failed | Cancelled.
update_task_status stays restricted to Idle and Queued; all review decisions go through review_task (keeps the "set status freely" affordance and the review affordance distinct).

6. Worker hub (`Hub/WorkerHub.cs` + `Hub/HubBroadcaster.cs`)

New hub methods called by the UI, each delegating to TaskStateService:

ApproveReview(taskId)
RejectReviewToQueue(taskId, feedback)
RejectReviewToIdle(taskId)

Cancel already exists. No new broadcast events — TaskUpdated covers it.

7. UI (`ClaudeDo.Ui`)

Converters/StatusColorConverter.cs: add a waiting_for_review case. Snap to an existing color token from the scale; final visual pass is left to the user (per project convention — centralize/tokenize, user does the visual pass).
ViewModels/Islands/TaskRowViewModel.cs: add IsWaitingForReview computed property and commands Approve, RejectRerun, RejectPark, Cancel (the last reuses the existing cancel command). Commands are enabled only when Status == WaitingForReview.
Reject-Rerun opens a small flyout/dialog with a required multi-line feedback text box; on confirm it calls RejectReviewToQueue(taskId, feedback).
Wire the commands to the new SignalR client methods.

8. Docs

Update the status flow in:

root CLAUDE.md — "Task status flow" line.
src/ClaudeDo.Data/CLAUDE.md — TaskEntity status list.
src/ClaudeDo.Worker/CLAUDE.md — status-model transition table.

Testing

ClaudeDo.Worker.Tests (real SQLite + real git, existing harness):

SubmitForReviewAsync: a successful run lands in WaitingForReview, not Done.
ApproveReviewAsync: WaitingForReview → Done.
RejectToQueueAsync: empty feedback rejected; valid feedback stored in ReviewFeedback and status → Queued.
RejectToIdleAsync: → Idle, Result preserved, ReviewFeedback cleared.
CancelAsync from WaitingForReview → Cancelled.
Invalid source states (e.g. approve from Idle) return a failed TransitionResult.
Resume-aware re-run: a task with ReviewFeedback + a prior SessionId, when claimed, resumes the session with the feedback as the prompt and clears ReviewFeedback.
review_task MCP tool: each decision maps to the correct transition; reject_rerun without feedback errors.

Open questions

None outstanding. Planning-task exclusion (Non-Goals) is the one assumption to verify against the planning-finalization code path during implementation; if planning finalization shares CompleteAsync, route only the executable-run success site through SubmitForReviewAsync.

8.1 KiB Raw Blame History