docs(spec): waiting-for-review task state design

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 17:03:14 +02:00
parent 4a36fbe5e0
commit 3e072fae66
1 changed files with 123 additions and 0 deletions
--- a/docs/superpowers/specs/2026-06-01-waiting-for-review-state-design.md
+++ b/docs/superpowers/specs/2026-06-01-waiting-for-review-state-design.md
@@ -0,0 +1,123 @@
 # Waiting for Review — Task State — Design
 **Date:** 2026-06-01
 **Status:** Approved (brainstorming)
 **Scope:** `ClaudeDo.Data` (TaskEntity, EF config + migration), `ClaudeDo.Worker` (TaskStateService, TaskRunner, QueueService, WorkerHub, ExternalMcpService), `ClaudeDo.Ui` (StatusColorConverter, TaskRowViewModel, views), CLAUDE.md docs
 ## Problem
 A successful task run currently transitions straight to `Done` and is considered complete. There is no gate for a human (or another agent) to review the result before it is accepted. We want review to be a mandatory step: after a successful run a task waits for an explicit approval, and a reviewer can send it back with feedback for another turn.
 ## Goals
 - Add a `WaitingForReview` lifecycle state that a task enters automatically after a **successful** run.
 - Reviewer can **approve** (→ `Done`), **reject-and-re-run** (→ `Queued`, resuming the same Claude session with required feedback), **reject-and-park** (→ `Idle`), or **cancel** (→ `Cancelled`).
 - Reject-and-re-run reuses the existing session-resume mechanism so the agent continues with full context.
 - Both the desktop UI and the external MCP surface can perform review actions.
 ## Non-Goals
 - No change to the failure path: a **failed** run still goes straight to `Failed`, never to `WaitingForReview`.
 - No change to planning-phase finalization. A planning parent that generates child tasks keeps its current behavior and does **not** route through review. Only ordinary executable runs (`Running` → success) are affected.
 - No change to worktree state flow (`Active | Merged | Discarded | Kept`).
 - No change to the in-run auto-retry-on-failure behavior; only the *final* successful completion routes to review.
 ## Design
 ### 1. State machine
 Changed/added transitions in **bold**:
 | From | To | Trigger |
 |---|---|---|
 | Idle | Queued | enqueue (unchanged) |
 | Queued | Running | queue picker claim (unchanged) |
 | Running | **WaitingForReview** | **successful run (was → Done)** |
 | Running | Failed | failed run (unchanged) |
 | Running | Cancelled | cancel during run (unchanged) |
 | **WaitingForReview** | **Done** | **approve** |
 | **WaitingForReview** | **Queued** | **reject + required feedback → resume re-run** |
 | **WaitingForReview** | **Idle** | **reject → park for manual edit** |
 | **WaitingForReview** | **Cancelled** | **abandon an almost-done task** |
 | Done \| Failed \| Cancelled | Idle | reset (unchanged) |
 ### 2. Data model
 `ClaudeDo.Data`:
 - `TaskStatus` enum (`Models/TaskEntity.cs`): add `WaitingForReview` after `Running`.
 - EF string converter (`Configuration/TaskEntityConfiguration.cs`): map `WaitingForReview` ⇄ `"waiting_for_review"` (TEXT column, no schema constraint to change).
 - New nullable column **`ReviewFeedback : string?`** on `TaskEntity`. Holds the reviewer's rejection comment until the re-run consumes it, then it is cleared. Persisted so it survives a worker restart and is visible to the UI.
 - One EF migration: add the `review_feedback` column. No backfill — the new status value and column are only written going forward.
 ### 3. Worker — status transitions (`State/TaskStateService.cs`)
 `TaskStateService` remains the sole owner of status writes. New/changed methods:
 - `SubmitForReviewAsync(taskId)` — `Running` → `WaitingForReview`. Sets `FinishedAt` and `Result` exactly as `CompleteAsync` does today. Called by `TaskRunner` on success **instead of** `CompleteAsync`. (`CompleteAsync` is retained for the approve path.)
 - `ApproveReviewAsync(taskId)` — `WaitingForReview` → `Done`.
 - `RejectToQueueAsync(taskId, feedback)` — `WaitingForReview` → `Queued`. Rejects empty/whitespace feedback with a failed `TransitionResult`. Stores `feedback` in `ReviewFeedback`. Wakes the queue.
 - `RejectToIdleAsync(taskId)` — `WaitingForReview` → `Idle`. Parks for manual editing; leaves `Result` intact, clears `ReviewFeedback`.
 - `CancelAsync` — extend the allowed source states to include `WaitingForReview`.
 Each transition broadcasts `TaskUpdated` as today. Invalid source states return a failed `TransitionResult` (no throw), matching existing convention.
 ### 4. Resume-aware re-run (`Queue/QueueService.cs`)
 The queue picker still atomically claims a `Queued`, unblocked task (`UPDATE … SET status='running' … RETURNING *`). The `RETURNING` row already carries `ReviewFeedback`. After a successful claim, `QueueService` branches:
 1. **`ReviewFeedback` set + latest run has a `SessionId`** → `TaskRunner.ContinueAsync(task, feedback)` — `--resume {sessionId}` with `feedback` as the next-turn prompt.
 2. **`ReviewFeedback` set, no prior `SessionId`** (edge case) → `TaskRunner.RunAsync` with the feedback appended to the task prompt, so the comment is not lost.
 3. **No `ReviewFeedback`** → normal `TaskRunner.RunAsync` (fresh session).
 `ReviewFeedback` is cleared once consumed (single UPDATE), so a later re-run does not re-apply stale feedback.
 ### 5. External MCP surface (`External/ExternalMcpService.cs`)
 - New tool **`review_task(taskId, decision, feedback?)`**, `decision ∈ {approve, reject_rerun, reject_park, cancel}`. `feedback` is required when `decision = reject_rerun` (validation error otherwise). Maps onto the `TaskStateService` methods in §3. This lets automation / other agents act as reviewers.
 - `get_task_status_values` — add `WaitingForReview` with a description covering the four exit actions.
 - `list_tasks` status-filter parsing and validation message — include `WaitingForReview`.
 - `get_task` lifecycle description text — update to `Idle → Queued → Running → WaitingForReview → Done | Failed | Cancelled`.
 - `update_task_status` stays restricted to `Idle` and `Queued`; all review decisions go through `review_task` (keeps the "set status freely" affordance and the review affordance distinct).
 ### 6. Worker hub (`Hub/WorkerHub.cs` + `Hub/HubBroadcaster.cs`)
 New hub methods called by the UI, each delegating to `TaskStateService`:
 - `ApproveReview(taskId)`
 - `RejectReviewToQueue(taskId, feedback)`
 - `RejectReviewToIdle(taskId)`
 Cancel already exists. No new broadcast events — `TaskUpdated` covers it.
 ### 7. UI (`ClaudeDo.Ui`)
 - `Converters/StatusColorConverter.cs`: add a `waiting_for_review` case. Snap to an existing color token from the scale; final visual pass is left to the user (per project convention — centralize/tokenize, user does the visual pass).
 - `ViewModels/Islands/TaskRowViewModel.cs`: add `IsWaitingForReview` computed property and commands **Approve**, **RejectRerun**, **RejectPark**, **Cancel** (the last reuses the existing cancel command). Commands are enabled only when `Status == WaitingForReview`.
 - Reject-Rerun opens a small flyout/dialog with a required multi-line feedback text box; on confirm it calls `RejectReviewToQueue(taskId, feedback)`.
 - Wire the commands to the new SignalR client methods.
 ### 8. Docs
 Update the status flow in:
 - root `CLAUDE.md` — "Task status flow" line.
 - `src/ClaudeDo.Data/CLAUDE.md` — TaskEntity status list.
 - `src/ClaudeDo.Worker/CLAUDE.md` — status-model transition table.
 ## Testing
 `ClaudeDo.Worker.Tests` (real SQLite + real git, existing harness):
 - `SubmitForReviewAsync`: a successful run lands in `WaitingForReview`, not `Done`.
 - `ApproveReviewAsync`: `WaitingForReview` → `Done`.
 - `RejectToQueueAsync`: empty feedback rejected; valid feedback stored in `ReviewFeedback` and status → `Queued`.
 - `RejectToIdleAsync`: → `Idle`, `Result` preserved, `ReviewFeedback` cleared.
 - `CancelAsync` from `WaitingForReview` → `Cancelled`.
 - Invalid source states (e.g. approve from `Idle`) return a failed `TransitionResult`.
 - Resume-aware re-run: a task with `ReviewFeedback` + a prior `SessionId`, when claimed, resumes the session with the feedback as the prompt and clears `ReviewFeedback`.
 - `review_task` MCP tool: each decision maps to the correct transition; `reject_rerun` without feedback errors.
 ## Open questions
 None outstanding. Planning-task exclusion (Non-Goals) is the one assumption to verify against the planning-finalization code path during implementation; if planning finalization shares `CompleteAsync`, route only the executable-run success site through `SubmitForReviewAsync`.