diff --git a/docs/superpowers/specs/2026-06-01-waiting-for-review-state-design.md b/docs/superpowers/specs/2026-06-01-waiting-for-review-state-design.md new file mode 100644 index 0000000..d7e81a3 --- /dev/null +++ b/docs/superpowers/specs/2026-06-01-waiting-for-review-state-design.md @@ -0,0 +1,123 @@ +# Waiting for Review — Task State — Design + +**Date:** 2026-06-01 +**Status:** Approved (brainstorming) +**Scope:** `ClaudeDo.Data` (TaskEntity, EF config + migration), `ClaudeDo.Worker` (TaskStateService, TaskRunner, QueueService, WorkerHub, ExternalMcpService), `ClaudeDo.Ui` (StatusColorConverter, TaskRowViewModel, views), CLAUDE.md docs + +## Problem + +A successful task run currently transitions straight to `Done` and is considered complete. There is no gate for a human (or another agent) to review the result before it is accepted. We want review to be a mandatory step: after a successful run a task waits for an explicit approval, and a reviewer can send it back with feedback for another turn. + +## Goals + +- Add a `WaitingForReview` lifecycle state that a task enters automatically after a **successful** run. +- Reviewer can **approve** (→ `Done`), **reject-and-re-run** (→ `Queued`, resuming the same Claude session with required feedback), **reject-and-park** (→ `Idle`), or **cancel** (→ `Cancelled`). +- Reject-and-re-run reuses the existing session-resume mechanism so the agent continues with full context. +- Both the desktop UI and the external MCP surface can perform review actions. + +## Non-Goals + +- No change to the failure path: a **failed** run still goes straight to `Failed`, never to `WaitingForReview`. +- No change to planning-phase finalization. A planning parent that generates child tasks keeps its current behavior and does **not** route through review. Only ordinary executable runs (`Running` → success) are affected. +- No change to worktree state flow (`Active | Merged | Discarded | Kept`). +- No change to the in-run auto-retry-on-failure behavior; only the *final* successful completion routes to review. + +## Design + +### 1. State machine + +Changed/added transitions in **bold**: + +| From | To | Trigger | +|---|---|---| +| Idle | Queued | enqueue (unchanged) | +| Queued | Running | queue picker claim (unchanged) | +| Running | **WaitingForReview** | **successful run (was → Done)** | +| Running | Failed | failed run (unchanged) | +| Running | Cancelled | cancel during run (unchanged) | +| **WaitingForReview** | **Done** | **approve** | +| **WaitingForReview** | **Queued** | **reject + required feedback → resume re-run** | +| **WaitingForReview** | **Idle** | **reject → park for manual edit** | +| **WaitingForReview** | **Cancelled** | **abandon an almost-done task** | +| Done \| Failed \| Cancelled | Idle | reset (unchanged) | + +### 2. Data model + +`ClaudeDo.Data`: + +- `TaskStatus` enum (`Models/TaskEntity.cs`): add `WaitingForReview` after `Running`. +- EF string converter (`Configuration/TaskEntityConfiguration.cs`): map `WaitingForReview` ⇄ `"waiting_for_review"` (TEXT column, no schema constraint to change). +- New nullable column **`ReviewFeedback : string?`** on `TaskEntity`. Holds the reviewer's rejection comment until the re-run consumes it, then it is cleared. Persisted so it survives a worker restart and is visible to the UI. +- One EF migration: add the `review_feedback` column. No backfill — the new status value and column are only written going forward. + +### 3. Worker — status transitions (`State/TaskStateService.cs`) + +`TaskStateService` remains the sole owner of status writes. New/changed methods: + +- `SubmitForReviewAsync(taskId)` — `Running` → `WaitingForReview`. Sets `FinishedAt` and `Result` exactly as `CompleteAsync` does today. Called by `TaskRunner` on success **instead of** `CompleteAsync`. (`CompleteAsync` is retained for the approve path.) +- `ApproveReviewAsync(taskId)` — `WaitingForReview` → `Done`. +- `RejectToQueueAsync(taskId, feedback)` — `WaitingForReview` → `Queued`. Rejects empty/whitespace feedback with a failed `TransitionResult`. Stores `feedback` in `ReviewFeedback`. Wakes the queue. +- `RejectToIdleAsync(taskId)` — `WaitingForReview` → `Idle`. Parks for manual editing; leaves `Result` intact, clears `ReviewFeedback`. +- `CancelAsync` — extend the allowed source states to include `WaitingForReview`. + +Each transition broadcasts `TaskUpdated` as today. Invalid source states return a failed `TransitionResult` (no throw), matching existing convention. + +### 4. Resume-aware re-run (`Queue/QueueService.cs`) + +The queue picker still atomically claims a `Queued`, unblocked task (`UPDATE … SET status='running' … RETURNING *`). The `RETURNING` row already carries `ReviewFeedback`. After a successful claim, `QueueService` branches: + +1. **`ReviewFeedback` set + latest run has a `SessionId`** → `TaskRunner.ContinueAsync(task, feedback)` — `--resume {sessionId}` with `feedback` as the next-turn prompt. +2. **`ReviewFeedback` set, no prior `SessionId`** (edge case) → `TaskRunner.RunAsync` with the feedback appended to the task prompt, so the comment is not lost. +3. **No `ReviewFeedback`** → normal `TaskRunner.RunAsync` (fresh session). + +`ReviewFeedback` is cleared once consumed (single UPDATE), so a later re-run does not re-apply stale feedback. + +### 5. External MCP surface (`External/ExternalMcpService.cs`) + +- New tool **`review_task(taskId, decision, feedback?)`**, `decision ∈ {approve, reject_rerun, reject_park, cancel}`. `feedback` is required when `decision = reject_rerun` (validation error otherwise). Maps onto the `TaskStateService` methods in §3. This lets automation / other agents act as reviewers. +- `get_task_status_values` — add `WaitingForReview` with a description covering the four exit actions. +- `list_tasks` status-filter parsing and validation message — include `WaitingForReview`. +- `get_task` lifecycle description text — update to `Idle → Queued → Running → WaitingForReview → Done | Failed | Cancelled`. +- `update_task_status` stays restricted to `Idle` and `Queued`; all review decisions go through `review_task` (keeps the "set status freely" affordance and the review affordance distinct). + +### 6. Worker hub (`Hub/WorkerHub.cs` + `Hub/HubBroadcaster.cs`) + +New hub methods called by the UI, each delegating to `TaskStateService`: + +- `ApproveReview(taskId)` +- `RejectReviewToQueue(taskId, feedback)` +- `RejectReviewToIdle(taskId)` + +Cancel already exists. No new broadcast events — `TaskUpdated` covers it. + +### 7. UI (`ClaudeDo.Ui`) + +- `Converters/StatusColorConverter.cs`: add a `waiting_for_review` case. Snap to an existing color token from the scale; final visual pass is left to the user (per project convention — centralize/tokenize, user does the visual pass). +- `ViewModels/Islands/TaskRowViewModel.cs`: add `IsWaitingForReview` computed property and commands **Approve**, **RejectRerun**, **RejectPark**, **Cancel** (the last reuses the existing cancel command). Commands are enabled only when `Status == WaitingForReview`. +- Reject-Rerun opens a small flyout/dialog with a required multi-line feedback text box; on confirm it calls `RejectReviewToQueue(taskId, feedback)`. +- Wire the commands to the new SignalR client methods. + +### 8. Docs + +Update the status flow in: + +- root `CLAUDE.md` — "Task status flow" line. +- `src/ClaudeDo.Data/CLAUDE.md` — TaskEntity status list. +- `src/ClaudeDo.Worker/CLAUDE.md` — status-model transition table. + +## Testing + +`ClaudeDo.Worker.Tests` (real SQLite + real git, existing harness): + +- `SubmitForReviewAsync`: a successful run lands in `WaitingForReview`, not `Done`. +- `ApproveReviewAsync`: `WaitingForReview` → `Done`. +- `RejectToQueueAsync`: empty feedback rejected; valid feedback stored in `ReviewFeedback` and status → `Queued`. +- `RejectToIdleAsync`: → `Idle`, `Result` preserved, `ReviewFeedback` cleared. +- `CancelAsync` from `WaitingForReview` → `Cancelled`. +- Invalid source states (e.g. approve from `Idle`) return a failed `TransitionResult`. +- Resume-aware re-run: a task with `ReviewFeedback` + a prior `SessionId`, when claimed, resumes the session with the feedback as the prompt and clears `ReviewFeedback`. +- `review_task` MCP tool: each decision maps to the correct transition; `reject_rerun` without feedback errors. + +## Open questions + +None outstanding. Planning-task exclusion (Non-Goals) is the one assumption to verify against the planning-finalization code path during implementation; if planning finalization shares `CompleteAsync`, route only the executable-run success site through `SubmitForReviewAsync`.