docs(spec): waiting-for-review task state design
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
@@ -0,0 +1,123 @@
|
|||||||
|
# Waiting for Review — Task State — Design
|
||||||
|
|
||||||
|
**Date:** 2026-06-01
|
||||||
|
**Status:** Approved (brainstorming)
|
||||||
|
**Scope:** `ClaudeDo.Data` (TaskEntity, EF config + migration), `ClaudeDo.Worker` (TaskStateService, TaskRunner, QueueService, WorkerHub, ExternalMcpService), `ClaudeDo.Ui` (StatusColorConverter, TaskRowViewModel, views), CLAUDE.md docs
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
A successful task run currently transitions straight to `Done` and is considered complete. There is no gate for a human (or another agent) to review the result before it is accepted. We want review to be a mandatory step: after a successful run a task waits for an explicit approval, and a reviewer can send it back with feedback for another turn.
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
- Add a `WaitingForReview` lifecycle state that a task enters automatically after a **successful** run.
|
||||||
|
- Reviewer can **approve** (→ `Done`), **reject-and-re-run** (→ `Queued`, resuming the same Claude session with required feedback), **reject-and-park** (→ `Idle`), or **cancel** (→ `Cancelled`).
|
||||||
|
- Reject-and-re-run reuses the existing session-resume mechanism so the agent continues with full context.
|
||||||
|
- Both the desktop UI and the external MCP surface can perform review actions.
|
||||||
|
|
||||||
|
## Non-Goals
|
||||||
|
|
||||||
|
- No change to the failure path: a **failed** run still goes straight to `Failed`, never to `WaitingForReview`.
|
||||||
|
- No change to planning-phase finalization. A planning parent that generates child tasks keeps its current behavior and does **not** route through review. Only ordinary executable runs (`Running` → success) are affected.
|
||||||
|
- No change to worktree state flow (`Active | Merged | Discarded | Kept`).
|
||||||
|
- No change to the in-run auto-retry-on-failure behavior; only the *final* successful completion routes to review.
|
||||||
|
|
||||||
|
## Design
|
||||||
|
|
||||||
|
### 1. State machine
|
||||||
|
|
||||||
|
Changed/added transitions in **bold**:
|
||||||
|
|
||||||
|
| From | To | Trigger |
|
||||||
|
|---|---|---|
|
||||||
|
| Idle | Queued | enqueue (unchanged) |
|
||||||
|
| Queued | Running | queue picker claim (unchanged) |
|
||||||
|
| Running | **WaitingForReview** | **successful run (was → Done)** |
|
||||||
|
| Running | Failed | failed run (unchanged) |
|
||||||
|
| Running | Cancelled | cancel during run (unchanged) |
|
||||||
|
| **WaitingForReview** | **Done** | **approve** |
|
||||||
|
| **WaitingForReview** | **Queued** | **reject + required feedback → resume re-run** |
|
||||||
|
| **WaitingForReview** | **Idle** | **reject → park for manual edit** |
|
||||||
|
| **WaitingForReview** | **Cancelled** | **abandon an almost-done task** |
|
||||||
|
| Done \| Failed \| Cancelled | Idle | reset (unchanged) |
|
||||||
|
|
||||||
|
### 2. Data model
|
||||||
|
|
||||||
|
`ClaudeDo.Data`:
|
||||||
|
|
||||||
|
- `TaskStatus` enum (`Models/TaskEntity.cs`): add `WaitingForReview` after `Running`.
|
||||||
|
- EF string converter (`Configuration/TaskEntityConfiguration.cs`): map `WaitingForReview` ⇄ `"waiting_for_review"` (TEXT column, no schema constraint to change).
|
||||||
|
- New nullable column **`ReviewFeedback : string?`** on `TaskEntity`. Holds the reviewer's rejection comment until the re-run consumes it, then it is cleared. Persisted so it survives a worker restart and is visible to the UI.
|
||||||
|
- One EF migration: add the `review_feedback` column. No backfill — the new status value and column are only written going forward.
|
||||||
|
|
||||||
|
### 3. Worker — status transitions (`State/TaskStateService.cs`)
|
||||||
|
|
||||||
|
`TaskStateService` remains the sole owner of status writes. New/changed methods:
|
||||||
|
|
||||||
|
- `SubmitForReviewAsync(taskId)` — `Running` → `WaitingForReview`. Sets `FinishedAt` and `Result` exactly as `CompleteAsync` does today. Called by `TaskRunner` on success **instead of** `CompleteAsync`. (`CompleteAsync` is retained for the approve path.)
|
||||||
|
- `ApproveReviewAsync(taskId)` — `WaitingForReview` → `Done`.
|
||||||
|
- `RejectToQueueAsync(taskId, feedback)` — `WaitingForReview` → `Queued`. Rejects empty/whitespace feedback with a failed `TransitionResult`. Stores `feedback` in `ReviewFeedback`. Wakes the queue.
|
||||||
|
- `RejectToIdleAsync(taskId)` — `WaitingForReview` → `Idle`. Parks for manual editing; leaves `Result` intact, clears `ReviewFeedback`.
|
||||||
|
- `CancelAsync` — extend the allowed source states to include `WaitingForReview`.
|
||||||
|
|
||||||
|
Each transition broadcasts `TaskUpdated` as today. Invalid source states return a failed `TransitionResult` (no throw), matching existing convention.
|
||||||
|
|
||||||
|
### 4. Resume-aware re-run (`Queue/QueueService.cs`)
|
||||||
|
|
||||||
|
The queue picker still atomically claims a `Queued`, unblocked task (`UPDATE … SET status='running' … RETURNING *`). The `RETURNING` row already carries `ReviewFeedback`. After a successful claim, `QueueService` branches:
|
||||||
|
|
||||||
|
1. **`ReviewFeedback` set + latest run has a `SessionId`** → `TaskRunner.ContinueAsync(task, feedback)` — `--resume {sessionId}` with `feedback` as the next-turn prompt.
|
||||||
|
2. **`ReviewFeedback` set, no prior `SessionId`** (edge case) → `TaskRunner.RunAsync` with the feedback appended to the task prompt, so the comment is not lost.
|
||||||
|
3. **No `ReviewFeedback`** → normal `TaskRunner.RunAsync` (fresh session).
|
||||||
|
|
||||||
|
`ReviewFeedback` is cleared once consumed (single UPDATE), so a later re-run does not re-apply stale feedback.
|
||||||
|
|
||||||
|
### 5. External MCP surface (`External/ExternalMcpService.cs`)
|
||||||
|
|
||||||
|
- New tool **`review_task(taskId, decision, feedback?)`**, `decision ∈ {approve, reject_rerun, reject_park, cancel}`. `feedback` is required when `decision = reject_rerun` (validation error otherwise). Maps onto the `TaskStateService` methods in §3. This lets automation / other agents act as reviewers.
|
||||||
|
- `get_task_status_values` — add `WaitingForReview` with a description covering the four exit actions.
|
||||||
|
- `list_tasks` status-filter parsing and validation message — include `WaitingForReview`.
|
||||||
|
- `get_task` lifecycle description text — update to `Idle → Queued → Running → WaitingForReview → Done | Failed | Cancelled`.
|
||||||
|
- `update_task_status` stays restricted to `Idle` and `Queued`; all review decisions go through `review_task` (keeps the "set status freely" affordance and the review affordance distinct).
|
||||||
|
|
||||||
|
### 6. Worker hub (`Hub/WorkerHub.cs` + `Hub/HubBroadcaster.cs`)
|
||||||
|
|
||||||
|
New hub methods called by the UI, each delegating to `TaskStateService`:
|
||||||
|
|
||||||
|
- `ApproveReview(taskId)`
|
||||||
|
- `RejectReviewToQueue(taskId, feedback)`
|
||||||
|
- `RejectReviewToIdle(taskId)`
|
||||||
|
|
||||||
|
Cancel already exists. No new broadcast events — `TaskUpdated` covers it.
|
||||||
|
|
||||||
|
### 7. UI (`ClaudeDo.Ui`)
|
||||||
|
|
||||||
|
- `Converters/StatusColorConverter.cs`: add a `waiting_for_review` case. Snap to an existing color token from the scale; final visual pass is left to the user (per project convention — centralize/tokenize, user does the visual pass).
|
||||||
|
- `ViewModels/Islands/TaskRowViewModel.cs`: add `IsWaitingForReview` computed property and commands **Approve**, **RejectRerun**, **RejectPark**, **Cancel** (the last reuses the existing cancel command). Commands are enabled only when `Status == WaitingForReview`.
|
||||||
|
- Reject-Rerun opens a small flyout/dialog with a required multi-line feedback text box; on confirm it calls `RejectReviewToQueue(taskId, feedback)`.
|
||||||
|
- Wire the commands to the new SignalR client methods.
|
||||||
|
|
||||||
|
### 8. Docs
|
||||||
|
|
||||||
|
Update the status flow in:
|
||||||
|
|
||||||
|
- root `CLAUDE.md` — "Task status flow" line.
|
||||||
|
- `src/ClaudeDo.Data/CLAUDE.md` — TaskEntity status list.
|
||||||
|
- `src/ClaudeDo.Worker/CLAUDE.md` — status-model transition table.
|
||||||
|
|
||||||
|
## Testing
|
||||||
|
|
||||||
|
`ClaudeDo.Worker.Tests` (real SQLite + real git, existing harness):
|
||||||
|
|
||||||
|
- `SubmitForReviewAsync`: a successful run lands in `WaitingForReview`, not `Done`.
|
||||||
|
- `ApproveReviewAsync`: `WaitingForReview` → `Done`.
|
||||||
|
- `RejectToQueueAsync`: empty feedback rejected; valid feedback stored in `ReviewFeedback` and status → `Queued`.
|
||||||
|
- `RejectToIdleAsync`: → `Idle`, `Result` preserved, `ReviewFeedback` cleared.
|
||||||
|
- `CancelAsync` from `WaitingForReview` → `Cancelled`.
|
||||||
|
- Invalid source states (e.g. approve from `Idle`) return a failed `TransitionResult`.
|
||||||
|
- Resume-aware re-run: a task with `ReviewFeedback` + a prior `SessionId`, when claimed, resumes the session with the feedback as the prompt and clears `ReviewFeedback`.
|
||||||
|
- `review_task` MCP tool: each decision maps to the correct transition; `reject_rerun` without feedback errors.
|
||||||
|
|
||||||
|
## Open questions
|
||||||
|
|
||||||
|
None outstanding. Planning-task exclusion (Non-Goals) is the one assumption to verify against the planning-finalization code path during implementation; if planning finalization shares `CompleteAsync`, route only the executable-run success site through `SubmitForReviewAsync`.
|
||||||
Reference in New Issue
Block a user