docs(superpowers): add session prompts for worker state consolidation slices 2-6
Self-contained prompts to paste into fresh sessions, one per remaining slice. Each prompt includes scope, allowed transitions, caller-migration list, test expectations, and the conventional-commit message to use.
This commit is contained in:
@@ -0,0 +1,225 @@
|
||||
# Session Prompts — Worker State & Queue Consolidation Slices 2–6
|
||||
|
||||
Paste-ready prompts for each remaining slice. Run **one slice per session** so the diff stays reviewable and tests stay green between commits. Spec lives at `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` — reference it when the prompt asks.
|
||||
|
||||
**Common ground rules** (carry across all slices):
|
||||
|
||||
- Direct on `main`, one commit per slice, conventional commit messages.
|
||||
- Build green (`dotnet build src/ClaudeDo.Worker/ClaudeDo.Worker.csproj` + Data + Ui) before commit.
|
||||
- Pre-existing test errors (TaskRunner/WorkerHub constructor drift in 4 test files) are **not** in scope to fix — they exist on `main` already. New compile errors my changes introduce ARE in scope.
|
||||
- No drive-by refactors outside the slice's stated scope.
|
||||
- New files must follow existing naming/folder conventions; legacy enum values stay until Slice 6.
|
||||
- After each slice, update `~/.claude/projects/C--Private-ClaudeDo/memory/` if I learn something durable about the codebase.
|
||||
|
||||
---
|
||||
|
||||
## Slice 2 — `TaskStateService` (centralized state machine)
|
||||
|
||||
**Prompt to paste into a fresh session:**
|
||||
|
||||
> Slice 2 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (sections 2 and 8). Slice 1 already landed (commit 7b737e6) — `TaskStatus` has `Idle`/`Cancelled`, `PlanningPhase` enum exists, `BlockedByTaskId` field exists. Legacy enum values still around.
|
||||
>
|
||||
> **Goal:** introduce `Worker/State/ITaskStateService` + `TaskStateService` as the single component that mutates `Status`, `PlanningPhase`, `BlockedByTaskId`. Migrate every existing caller. Mark repo `Mark*Async` helpers `internal`.
|
||||
>
|
||||
> **Public surface (verbatim from spec):**
|
||||
> ```csharp
|
||||
> Task<TransitionResult> EnqueueAsync(string taskId, CancellationToken ct);
|
||||
> Task<TransitionResult> StartRunningAsync(string taskId, DateTime startedAt, CancellationToken ct);
|
||||
> Task<TransitionResult> CompleteAsync(string taskId, DateTime finishedAt, string? result, CancellationToken ct);
|
||||
> Task<TransitionResult> FailAsync(string taskId, DateTime finishedAt, string? error, CancellationToken ct);
|
||||
> Task<TransitionResult> CancelAsync(string taskId, DateTime finishedAt, CancellationToken ct);
|
||||
> Task<TransitionResult> ResetToIdleAsync(string taskId, CancellationToken ct);
|
||||
> Task<TransitionResult> StartPlanningAsync(string parentId, CancellationToken ct);
|
||||
> Task<TransitionResult> FinalizePlanningAsync(string parentId, CancellationToken ct);
|
||||
> Task<TransitionResult> BlockOnAsync(string taskId, string predecessorTaskId, CancellationToken ct);
|
||||
> Task<TransitionResult> UnblockAsync(string taskId, CancellationToken ct);
|
||||
> Task<int> RecoverStaleRunningAsync(string reason, CancellationToken ct);
|
||||
> ```
|
||||
>
|
||||
> **Allowed transition table:** see spec §2. Reject invalid transitions with `TransitionResult(false, "<reason>")` — no exceptions. Each transition is one atomic `ExecuteUpdate` with `WHERE Status = <expected>` for TOCTOU-freedom.
|
||||
>
|
||||
> **Side effects after successful DB write** (do these inside the service so callers don't need to remember):
|
||||
> - On any `→ Queued`: call `_queue.WakeQueue()` directly for now (Slice 3 will replace with `IQueueWaker`). Inject `QueueService` lazily via `Func<QueueService>` to break the DI cycle if needed.
|
||||
> - On any successful transition: `_broadcaster.TaskUpdated(taskId)`.
|
||||
> - On `Done`/`Failed`/`Cancelled` for a child task: invoke `_chain.OnChildFinishedAsync(taskId, finalStatus, ct)`. If it returns a next-task-id, call `UnblockAsync` on it. Then run `_repo.TryCompleteParentAsync(parentId, ct)`.
|
||||
>
|
||||
> **Important:** `BlockOnAsync` and `UnblockAsync` should write `BlockedByTaskId` directly. `EnqueueAsync` for a Planning child should keep `BlockedByTaskId` null when it's the head of the chain. The chain coordinator will compose these calls in Slice 4 — for now just expose the API.
|
||||
>
|
||||
> **Caller migration (mechanical — preserve current behavior):**
|
||||
> - `TaskRunner.HandleSuccess` → replace `taskRepo.MarkDoneAsync` + `TryCompleteParentAsync` + `_chain.OnChildFinishedAsync` block with a single `_state.CompleteAsync(taskId, finishedAt, result, CancellationToken.None)`.
|
||||
> - `TaskRunner.HandleFailure` → `_state.FailAsync(taskId, finishedAt, errorMarkdown, CancellationToken.None)`.
|
||||
> - `TaskRunner.MarkFailed` (early-fail path) → same.
|
||||
> - `TaskRunner.RunAsync` start of run → `_state.StartRunningAsync(taskId, startedAt, ct)`.
|
||||
> - `StaleTaskRecovery.StartAsync` → `_state.RecoverStaleRunningAsync("worker restart", ct)`.
|
||||
> - `TaskResetService.ResetAsync` → `_state.ResetToIdleAsync(taskId, ct)` for the status flip; service keeps owning worktree cleanup.
|
||||
> - `PlanningSessionManager.StartAsync` (the `SetPlanningStartedAsync` call) → `_state.StartPlanningAsync(parentId, ct)`. The manager still owns token/session-dir setup; only the status flip moves.
|
||||
> - `PlanningChainCoordinator.OnChildFinishedAsync` (the `next.Status = TaskStatus.Queued` write) → keep its existing logic but use `_state.UnblockAsync(next.Id, ct)` for the actual write. The Slice 4 rewrite finishes the rest.
|
||||
> - `ExternalMcpService.UpdateTaskStatus` (status flip in the Queued case) → `_state.EnqueueAsync(taskId, ct)`. The Manual case stays as-is until Slice 6 since `Manual` is still a valid legacy value.
|
||||
>
|
||||
> **Repo helpers to mark `internal`:** `MarkRunningAsync`, `MarkDoneAsync`, `MarkFailedAsync`, `FlipAllRunningToFailedAsync`. Verify nothing outside `ClaudeDo.Worker.State` calls them after migration. (`Worker.Tests` may need `InternalsVisibleTo` — add it if so.)
|
||||
>
|
||||
> **DI wiring:** register `TaskStateService` as Singleton in `Program.cs` for both the main app and the external-MCP app. The service holds no per-request state.
|
||||
>
|
||||
> **Tests:** new file `tests/ClaudeDo.Worker.Tests/State/TaskStateServiceTests.cs`. At minimum:
|
||||
> - Happy path for each transition (verify DB state + side-effect mocks invoked).
|
||||
> - Reject path for each invalid transition (verify result + DB unchanged).
|
||||
> - Concurrency: two parallel `StartRunningAsync` for the same `Queued` task → exactly one returns `Ok=true`.
|
||||
> - Mock or fake the broadcaster, queue, and chain-coordinator dependencies. Use real SQLite for the DB (existing test pattern).
|
||||
>
|
||||
> Build all projects, run the worker test project (the 4 pre-existing constructor-drift errors are out of scope — but my changes shouldn't add new errors), commit as `refactor(worker/state): introduce TaskStateService and route mutations through it`.
|
||||
|
||||
---
|
||||
|
||||
## Slice 3 — `IQueueWaker` + `IQueuePicker`
|
||||
|
||||
**Prompt to paste into a fresh session:**
|
||||
|
||||
> Slice 3 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 3). Slices 1 and 2 already landed.
|
||||
>
|
||||
> **Goal:** extract queue-wake and queue-pick from `QueueService` and `TaskRepository` into dedicated single-responsibility components. Make wakes automatic.
|
||||
>
|
||||
> **New components in `Worker/Queue/`:**
|
||||
> - `IQueueWaker` (interface, `void Wake()`). Backed by `QueueWaker` singleton holding the existing `SemaphoreSlim`. Inject into `TaskStateService` (replaces the direct `QueueService` ref from Slice 2) and into `QueueService` itself.
|
||||
> - `IQueuePicker` with `Task<TaskEntity?> ClaimNextAsync(DateTime now, CancellationToken ct)`. Implementation `QueuePicker` moves the raw SQL out of `TaskRepository.GetNextQueuedAgentTaskAsync` and **adds a `blocked_by_task_id IS NULL` filter to the WHERE clause**. Order stays `sort_order ASC, created_at ASC` (verify the existing query — add ORDER BY if missing). Atomic `UPDATE … RETURNING` flips `Queued → Running` and writes `started_at`.
|
||||
>
|
||||
> **Caller updates:**
|
||||
> - `TaskStateService` swaps its `Func<QueueService>` for `IQueueWaker`. The `→ Queued` side-effect now calls `_waker.Wake()`.
|
||||
> - `QueueService.ExecuteAsync` calls `_picker.ClaimNextAsync` instead of `_taskRepo.GetNextQueuedAgentTaskAsync`. The slot-claim, broadcaster, and `WakeQueue()` after slot release stay where they are.
|
||||
> - `WorkerHub.WakeQueue()` and `ExternalMcpService.WakeQueue` calls in app code → remove the explicit invocations. The state-service triggers waking automatically. **Keep** the SignalR/MCP endpoint that exposes `WakeQueue()` for diagnostics/manual use — that one delegates to `_waker.Wake()`.
|
||||
> - `TaskRepository.GetNextQueuedAgentTaskAsync` becomes a thin shim that forwards to `IQueuePicker` for any remaining tests, OR delete it and update tests to use the picker. Prefer delete if tests are easy to migrate.
|
||||
>
|
||||
> **Tests:** new `tests/ClaudeDo.Worker.Tests/Queue/QueuePickerTests.cs`:
|
||||
> - Skipped: `BlockedByTaskId` set; missing agent tag; `scheduled_for > now`; status not Queued.
|
||||
> - Picked: correct order (`sort_order, created_at`).
|
||||
> - Atomic claim: two parallel pickers → exactly one row returned non-null, the other null.
|
||||
>
|
||||
> Update existing `TaskRepositoryTests.GetNextQueuedAgentTaskAsync_*` tests if they exercised the removed method.
|
||||
>
|
||||
> Build, test, commit as `refactor(worker/queue): split queue waker and picker, auto-wake on enqueue`.
|
||||
|
||||
---
|
||||
|
||||
## Slice 4 — Planning flow consolidation (kills the original bug)
|
||||
|
||||
**Prompt to paste into a fresh session:**
|
||||
|
||||
> Slice 4 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 4). Slices 1–3 already landed. **This slice eliminates the original "queue never picks up planning tasks" bug structurally.**
|
||||
>
|
||||
> **Goal:** one path through planning. Delete the dual-flow problem.
|
||||
>
|
||||
> **Changes:**
|
||||
> - **Delete** `TaskRepository.FinalizePlanningAsync` entirely. Also delete its tests in `TaskRepositoryPlanningTests.cs`.
|
||||
> - **Rewrite** `PlanningSessionManager.FinalizeAsync(taskId, queueAgentTasks, ct)`:
|
||||
> 1. `_state.FinalizePlanningAsync(parentId, ct)` (sets parent `PlanningPhase=Finalized`, `Status=Idle`).
|
||||
> 2. If `queueAgentTasks` is true, call the new `_chainCoordinator.SetupChainAsync(parentId, ct)`.
|
||||
> 3. Existing worktree-cleanup + session-dir-deletion remains.
|
||||
> 4. Return the count of children that ended up in the chain.
|
||||
> - **Rename** `PlanningChainCoordinator.QueueSubtasksSequentiallyAsync` → `SetupChainAsync`. Make it `internal`. New behavior:
|
||||
> - Eligibility check: children must be in `Status=Idle` (was `Manual` or `Planned` legacy values — keep tolerating those for one slice via OR).
|
||||
> - Auto-attach `agent` tag to all children (already in WIP — keep that behavior).
|
||||
> - For first child: `_state.EnqueueAsync(child[0].Id, ct)` (no BlockedBy, head of chain).
|
||||
> - For rest: `_state.EnqueueAsync(child[i].Id, ct)` followed immediately by `_state.BlockOnAsync(child[i].Id, child[i-1].Id, ct)`. (Or: add a single `EnqueueBlockedAsync` helper to TaskStateService if call-site clutter bothers you.)
|
||||
> - **Update** `PlanningChainCoordinator.OnChildFinishedAsync`: replace status-via-LINQ logic with: query for the next child where `BlockedByTaskId == childTaskId`, call `_state.UnblockAsync` on it. Drop the `Waiting` lookup entirely.
|
||||
> - Audit `Status == TaskStatus.Waiting` in UI/tests — replace with `Status == Queued && BlockedByTaskId != null`. (UI changes confirmed against `TaskRowViewModel`, `TasksIslandViewModel` from Slice 1's WIP.)
|
||||
>
|
||||
> **Regression test:** new `tests/ClaudeDo.Worker.Tests/Planning/PlanningEndToEndTests.cs` (or extend existing) — `Active` parent + 3 drafts → call `FinalizeAsync(queueAgentTasks: true)` → assert within 200 ms the first child has `Status=Running` (queue picker claimed it) without anyone calling `WakeQueue()` manually. This was the bug the user originally reported.
|
||||
>
|
||||
> **Update** `PlanningMcpService.EditableStatuses` — replace `Waiting` with `Queued` (since blocked tasks are now `Queued + BlockedByTaskId`). Verify the MCP tool still gates on `parent.PlanningPhase == Active` (legacy: `parent.Status == Planning`).
|
||||
>
|
||||
> Build, test, commit as `feat(planning): consolidate finalize+chain via TaskStateService, fix queue pickup`.
|
||||
|
||||
---
|
||||
|
||||
## Slice 5 — `OverrideSlotService` + folder reorg
|
||||
|
||||
**Prompt to paste into a fresh session:**
|
||||
|
||||
> Slice 5 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 5). Slices 1–4 already landed.
|
||||
>
|
||||
> **Goal:** split the override slot out of QueueService and reorganize `Worker/Services/` into domain folders.
|
||||
>
|
||||
> **`OverrideSlotService` (new in `Worker/Queue/`):**
|
||||
> - Owns the `_overrideSlot` field, `RunNow(taskId)`, `ContinueTask(taskId, followUpPrompt)`, and the override-slot piece of `CancelTask`.
|
||||
> - Status mutations go through `TaskStateService.StartRunningAsync` (non-atomic claim is fine; serialized by slot lock).
|
||||
> - `QueueService.CancelTask` delegates to `OverrideSlotService.TryCancel` first, falls back to its own queue slot.
|
||||
> - WorkerHub's `RunNow`/`ContinueTask`/`CancelTask` SignalR endpoints route to the new service via `OverrideSlotService` when applicable; keep the signatures stable.
|
||||
>
|
||||
> **Folder reorg** (use `git mv`, don't copy/delete):
|
||||
> ```
|
||||
> Worker/State/ ← ITaskStateService.cs, TaskStateService.cs, TransitionResult.cs (already exist; no move needed if already there)
|
||||
> Worker/Queue/ ← IQueueWaker.cs, QueueWaker.cs, IQueuePicker.cs, QueuePicker.cs, QueueService.cs, OverrideSlotService.cs, QueueSlotState.cs
|
||||
> Worker/Lifecycle/ ← StaleTaskRecovery.cs, TaskResetService.cs, TaskMergeService.cs
|
||||
> Worker/Worktrees/ ← WorktreeMaintenanceService.cs
|
||||
> Worker/Agents/ ← AgentFileService.cs, DefaultAgentSeeder.cs
|
||||
> Worker/Runner/ ← unchanged
|
||||
> Worker/Planning/ ← unchanged
|
||||
> Worker/External/ ← unchanged
|
||||
> Worker/Hub/ ← unchanged
|
||||
> ```
|
||||
>
|
||||
> Update namespaces to match folders (existing convention: namespace == folder path under `ClaudeDo.Worker`). Delete the old `Worker/Services/` folder once empty.
|
||||
>
|
||||
> Update DI registrations in `Program.cs` (both apps) — most calls just need `using` updates. `OverrideSlotService` is a new singleton.
|
||||
>
|
||||
> Update test `using` statements to follow.
|
||||
>
|
||||
> Build, test, commit as `refactor(worker): extract OverrideSlotService and reorganize Worker/Services into domain folders`.
|
||||
|
||||
---
|
||||
|
||||
## Slice 6 — Cleanup, legacy retirement, docs
|
||||
|
||||
**Prompt to paste into a fresh session:**
|
||||
|
||||
> Slice 6 (final) of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 6 + slice plan). Slices 1–5 already landed.
|
||||
>
|
||||
> **Goal:** retire legacy enum values, backfill DB rows, update docs.
|
||||
>
|
||||
> **EF migration `RetireLegacyTaskStatus`:**
|
||||
> ```sql
|
||||
> UPDATE tasks SET status='idle' WHERE status IN ('manual', 'draft');
|
||||
> UPDATE tasks SET status='idle', planning_phase='active' WHERE status='planning';
|
||||
> UPDATE tasks SET status='idle', planning_phase='finalized' WHERE status='planned';
|
||||
>
|
||||
> -- Waiting → Queued + blocked_by from sort_order:
|
||||
> WITH ordered AS (
|
||||
> SELECT id,
|
||||
> LAG(id) OVER (PARTITION BY parent_task_id ORDER BY sort_order, created_at) AS prev_id
|
||||
> FROM tasks WHERE status='waiting'
|
||||
> )
|
||||
> UPDATE tasks
|
||||
> SET status='queued',
|
||||
> blocked_by_task_id=(SELECT prev_id FROM ordered WHERE ordered.id=tasks.id)
|
||||
> WHERE id IN (SELECT id FROM ordered);
|
||||
> ```
|
||||
> Use `migrationBuilder.Sql(...)` for these. Down() is best-effort: `Cancelled` → `Failed`, `(idle, finalized)` → `planned`, `(idle, active)` → `planning`, `queued + blocked_by_task_id != null` → `waiting`. Document lossiness in a comment.
|
||||
>
|
||||
> **Code changes:**
|
||||
> - Remove legacy values from `TaskStatus` enum: `Manual, Planning, Planned, Draft, Waiting`.
|
||||
> - Strip the legacy branches from `TaskEntityConfiguration.StatusToString`/`StatusFromString`.
|
||||
> - Default for `TaskEntity.Status` is `TaskStatus.Idle` (already correct after Slice 1's revert).
|
||||
> - Audit + remap every remaining caller — they should already use new values from Slices 2–4, but search for any leftover `TaskStatus.Manual` etc. in:
|
||||
> - tests (~10 files seed status — flip to `Idle`/`Queued`/etc.)
|
||||
> - UI (`TaskRowViewModel.IsPlanningParent`, `IsDraft`, `CanOpenPlanningSession`, status maps — replace with `PlanningPhase` checks where appropriate)
|
||||
> - any leftover guards in MCP/services
|
||||
> - Mark `Mark*Async` repo helpers as `internal` if not already (Slice 2 should have done this — verify).
|
||||
>
|
||||
> **Docs to update:**
|
||||
> - `src/ClaudeDo.Worker/CLAUDE.md` — new folder structure, new state-service flow, new wake mechanics, removal of legacy values.
|
||||
> - `src/ClaudeDo.Data/CLAUDE.md` — TaskEntity new fields (`PlanningPhase`, `BlockedByTaskId`), retired legacy enum values, new tag-attach behavior.
|
||||
> - `docs/plan.md` — update status flow section.
|
||||
> - `docs/open.md` — close the "queue doesn't pick up planning tasks" item if it's tracked there; add any follow-ups discovered along the way.
|
||||
> - Memory: update `~/.claude/projects/C--Private-ClaudeDo/memory/` with a new entry summarizing the new architecture (state-service + queue split + planning chain via blocked-by).
|
||||
>
|
||||
> **Sanity tests** — full test run. The 4 pre-existing constructor-drift errors should still be the only failures. If new ones surfaced from missed legacy-value remappings, fix them before commit.
|
||||
>
|
||||
> Build, full test run, commit as `refactor(data): retire legacy TaskStatus values and backfill existing rows`.
|
||||
|
||||
---
|
||||
|
||||
## After Slice 6
|
||||
|
||||
- All 6 slices on `main`.
|
||||
- The original bug ("queue doesn't pick up planning tasks") is structurally impossible.
|
||||
- Worker has clear domain folders, single state-mutator, single queue-picker.
|
||||
- Spec doc + this prompt file can be deleted or moved to `docs/superpowers/done/`.
|
||||
Reference in New Issue
Block a user