# Session Prompts — Worker State & Queue Consolidation Slices 2–6 Paste-ready prompts for each remaining slice. Run **one slice per session** so the diff stays reviewable and tests stay green between commits. Spec lives at `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` — reference it when the prompt asks. **Common ground rules** (carry across all slices): - Direct on `main`, one commit per slice, conventional commit messages. - Build green (`dotnet build src/ClaudeDo.Worker/ClaudeDo.Worker.csproj` + Data + Ui) before commit. - Pre-existing test errors (TaskRunner/WorkerHub constructor drift in 4 test files) are **not** in scope to fix — they exist on `main` already. New compile errors my changes introduce ARE in scope. - No drive-by refactors outside the slice's stated scope. - New files must follow existing naming/folder conventions; legacy enum values stay until Slice 6. - After each slice, update `~/.claude/projects/C--Private-ClaudeDo/memory/` if I learn something durable about the codebase. --- ## Slice 2 — `TaskStateService` (centralized state machine) **Prompt to paste into a fresh session:** > Slice 2 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (sections 2 and 8). Slice 1 already landed (commit 7b737e6) — `TaskStatus` has `Idle`/`Cancelled`, `PlanningPhase` enum exists, `BlockedByTaskId` field exists. Legacy enum values still around. > > **Goal:** introduce `Worker/State/ITaskStateService` + `TaskStateService` as the single component that mutates `Status`, `PlanningPhase`, `BlockedByTaskId`. Migrate every existing caller. Mark repo `Mark*Async` helpers `internal`. > > **Public surface (verbatim from spec):** > ```csharp > Task EnqueueAsync(string taskId, CancellationToken ct); > Task StartRunningAsync(string taskId, DateTime startedAt, CancellationToken ct); > Task CompleteAsync(string taskId, DateTime finishedAt, string? result, CancellationToken ct); > Task FailAsync(string taskId, DateTime finishedAt, string? error, CancellationToken ct); > Task CancelAsync(string taskId, DateTime finishedAt, CancellationToken ct); > Task ResetToIdleAsync(string taskId, CancellationToken ct); > Task StartPlanningAsync(string parentId, CancellationToken ct); > Task FinalizePlanningAsync(string parentId, CancellationToken ct); > Task BlockOnAsync(string taskId, string predecessorTaskId, CancellationToken ct); > Task UnblockAsync(string taskId, CancellationToken ct); > Task RecoverStaleRunningAsync(string reason, CancellationToken ct); > ``` > > **Allowed transition table:** see spec §2. Reject invalid transitions with `TransitionResult(false, "")` — no exceptions. Each transition is one atomic `ExecuteUpdate` with `WHERE Status = ` for TOCTOU-freedom. > > **Side effects after successful DB write** (do these inside the service so callers don't need to remember): > - On any `→ Queued`: call `_queue.WakeQueue()` directly for now (Slice 3 will replace with `IQueueWaker`). Inject `QueueService` lazily via `Func` to break the DI cycle if needed. > - On any successful transition: `_broadcaster.TaskUpdated(taskId)`. > - On `Done`/`Failed`/`Cancelled` for a child task: invoke `_chain.OnChildFinishedAsync(taskId, finalStatus, ct)`. If it returns a next-task-id, call `UnblockAsync` on it. Then run `_repo.TryCompleteParentAsync(parentId, ct)`. > > **Important:** `BlockOnAsync` and `UnblockAsync` should write `BlockedByTaskId` directly. `EnqueueAsync` for a Planning child should keep `BlockedByTaskId` null when it's the head of the chain. The chain coordinator will compose these calls in Slice 4 — for now just expose the API. > > **Caller migration (mechanical — preserve current behavior):** > - `TaskRunner.HandleSuccess` → replace `taskRepo.MarkDoneAsync` + `TryCompleteParentAsync` + `_chain.OnChildFinishedAsync` block with a single `_state.CompleteAsync(taskId, finishedAt, result, CancellationToken.None)`. > - `TaskRunner.HandleFailure` → `_state.FailAsync(taskId, finishedAt, errorMarkdown, CancellationToken.None)`. > - `TaskRunner.MarkFailed` (early-fail path) → same. > - `TaskRunner.RunAsync` start of run → `_state.StartRunningAsync(taskId, startedAt, ct)`. > - `StaleTaskRecovery.StartAsync` → `_state.RecoverStaleRunningAsync("worker restart", ct)`. > - `TaskResetService.ResetAsync` → `_state.ResetToIdleAsync(taskId, ct)` for the status flip; service keeps owning worktree cleanup. > - `PlanningSessionManager.StartAsync` (the `SetPlanningStartedAsync` call) → `_state.StartPlanningAsync(parentId, ct)`. The manager still owns token/session-dir setup; only the status flip moves. > - `PlanningChainCoordinator.OnChildFinishedAsync` (the `next.Status = TaskStatus.Queued` write) → keep its existing logic but use `_state.UnblockAsync(next.Id, ct)` for the actual write. The Slice 4 rewrite finishes the rest. > - `ExternalMcpService.UpdateTaskStatus` (status flip in the Queued case) → `_state.EnqueueAsync(taskId, ct)`. The Manual case stays as-is until Slice 6 since `Manual` is still a valid legacy value. > > **Repo helpers to mark `internal`:** `MarkRunningAsync`, `MarkDoneAsync`, `MarkFailedAsync`, `FlipAllRunningToFailedAsync`. Verify nothing outside `ClaudeDo.Worker.State` calls them after migration. (`Worker.Tests` may need `InternalsVisibleTo` — add it if so.) > > **DI wiring:** register `TaskStateService` as Singleton in `Program.cs` for both the main app and the external-MCP app. The service holds no per-request state. > > **Tests:** new file `tests/ClaudeDo.Worker.Tests/State/TaskStateServiceTests.cs`. At minimum: > - Happy path for each transition (verify DB state + side-effect mocks invoked). > - Reject path for each invalid transition (verify result + DB unchanged). > - Concurrency: two parallel `StartRunningAsync` for the same `Queued` task → exactly one returns `Ok=true`. > - Mock or fake the broadcaster, queue, and chain-coordinator dependencies. Use real SQLite for the DB (existing test pattern). > > Build all projects, run the worker test project (the 4 pre-existing constructor-drift errors are out of scope — but my changes shouldn't add new errors), commit as `refactor(worker/state): introduce TaskStateService and route mutations through it`. --- ## Slice 3 — `IQueueWaker` + `IQueuePicker` **Prompt to paste into a fresh session:** > Slice 3 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 3). Slices 1 and 2 already landed. > > **Goal:** extract queue-wake and queue-pick from `QueueService` and `TaskRepository` into dedicated single-responsibility components. Make wakes automatic. > > **New components in `Worker/Queue/`:** > - `IQueueWaker` (interface, `void Wake()`). Backed by `QueueWaker` singleton holding the existing `SemaphoreSlim`. Inject into `TaskStateService` (replaces the direct `QueueService` ref from Slice 2) and into `QueueService` itself. > - `IQueuePicker` with `Task ClaimNextAsync(DateTime now, CancellationToken ct)`. Implementation `QueuePicker` moves the raw SQL out of `TaskRepository.GetNextQueuedAgentTaskAsync` and **adds a `blocked_by_task_id IS NULL` filter to the WHERE clause**. Order stays `sort_order ASC, created_at ASC` (verify the existing query — add ORDER BY if missing). Atomic `UPDATE … RETURNING` flips `Queued → Running` and writes `started_at`. > > **Caller updates:** > - `TaskStateService` swaps its `Func` for `IQueueWaker`. The `→ Queued` side-effect now calls `_waker.Wake()`. > - `QueueService.ExecuteAsync` calls `_picker.ClaimNextAsync` instead of `_taskRepo.GetNextQueuedAgentTaskAsync`. The slot-claim, broadcaster, and `WakeQueue()` after slot release stay where they are. > - `WorkerHub.WakeQueue()` and `ExternalMcpService.WakeQueue` calls in app code → remove the explicit invocations. The state-service triggers waking automatically. **Keep** the SignalR/MCP endpoint that exposes `WakeQueue()` for diagnostics/manual use — that one delegates to `_waker.Wake()`. > - `TaskRepository.GetNextQueuedAgentTaskAsync` becomes a thin shim that forwards to `IQueuePicker` for any remaining tests, OR delete it and update tests to use the picker. Prefer delete if tests are easy to migrate. > > **Tests:** new `tests/ClaudeDo.Worker.Tests/Queue/QueuePickerTests.cs`: > - Skipped: `BlockedByTaskId` set; missing agent tag; `scheduled_for > now`; status not Queued. > - Picked: correct order (`sort_order, created_at`). > - Atomic claim: two parallel pickers → exactly one row returned non-null, the other null. > > Update existing `TaskRepositoryTests.GetNextQueuedAgentTaskAsync_*` tests if they exercised the removed method. > > Build, test, commit as `refactor(worker/queue): split queue waker and picker, auto-wake on enqueue`. --- ## Slice 4 — Planning flow consolidation (kills the original bug) **Prompt to paste into a fresh session:** > Slice 4 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 4). Slices 1–3 already landed. **This slice eliminates the original "queue never picks up planning tasks" bug structurally.** > > **Goal:** one path through planning. Delete the dual-flow problem. > > **Changes:** > - **Delete** `TaskRepository.FinalizePlanningAsync` entirely. Also delete its tests in `TaskRepositoryPlanningTests.cs`. > - **Rewrite** `PlanningSessionManager.FinalizeAsync(taskId, queueAgentTasks, ct)`: > 1. `_state.FinalizePlanningAsync(parentId, ct)` (sets parent `PlanningPhase=Finalized`, `Status=Idle`). > 2. If `queueAgentTasks` is true, call the new `_chainCoordinator.SetupChainAsync(parentId, ct)`. > 3. Existing worktree-cleanup + session-dir-deletion remains. > 4. Return the count of children that ended up in the chain. > - **Rename** `PlanningChainCoordinator.QueueSubtasksSequentiallyAsync` → `SetupChainAsync`. Make it `internal`. New behavior: > - Eligibility check: children must be in `Status=Idle` (was `Manual` or `Planned` legacy values — keep tolerating those for one slice via OR). > - Auto-attach `agent` tag to all children (already in WIP — keep that behavior). > - For first child: `_state.EnqueueAsync(child[0].Id, ct)` (no BlockedBy, head of chain). > - For rest: `_state.EnqueueAsync(child[i].Id, ct)` followed immediately by `_state.BlockOnAsync(child[i].Id, child[i-1].Id, ct)`. (Or: add a single `EnqueueBlockedAsync` helper to TaskStateService if call-site clutter bothers you.) > - **Update** `PlanningChainCoordinator.OnChildFinishedAsync`: replace status-via-LINQ logic with: query for the next child where `BlockedByTaskId == childTaskId`, call `_state.UnblockAsync` on it. Drop the `Waiting` lookup entirely. > - Audit `Status == TaskStatus.Waiting` in UI/tests — replace with `Status == Queued && BlockedByTaskId != null`. (UI changes confirmed against `TaskRowViewModel`, `TasksIslandViewModel` from Slice 1's WIP.) > > **Regression test:** new `tests/ClaudeDo.Worker.Tests/Planning/PlanningEndToEndTests.cs` (or extend existing) — `Active` parent + 3 drafts → call `FinalizeAsync(queueAgentTasks: true)` → assert within 200 ms the first child has `Status=Running` (queue picker claimed it) without anyone calling `WakeQueue()` manually. This was the bug the user originally reported. > > **Update** `PlanningMcpService.EditableStatuses` — replace `Waiting` with `Queued` (since blocked tasks are now `Queued + BlockedByTaskId`). Verify the MCP tool still gates on `parent.PlanningPhase == Active` (legacy: `parent.Status == Planning`). > > Build, test, commit as `feat(planning): consolidate finalize+chain via TaskStateService, fix queue pickup`. --- ## Slice 5 — `OverrideSlotService` + folder reorg **Prompt to paste into a fresh session:** > Slice 5 of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 5). Slices 1–4 already landed. > > **Goal:** split the override slot out of QueueService and reorganize `Worker/Services/` into domain folders. > > **`OverrideSlotService` (new in `Worker/Queue/`):** > - Owns the `_overrideSlot` field, `RunNow(taskId)`, `ContinueTask(taskId, followUpPrompt)`, and the override-slot piece of `CancelTask`. > - Status mutations go through `TaskStateService.StartRunningAsync` (non-atomic claim is fine; serialized by slot lock). > - `QueueService.CancelTask` delegates to `OverrideSlotService.TryCancel` first, falls back to its own queue slot. > - WorkerHub's `RunNow`/`ContinueTask`/`CancelTask` SignalR endpoints route to the new service via `OverrideSlotService` when applicable; keep the signatures stable. > > **Folder reorg** (use `git mv`, don't copy/delete): > ``` > Worker/State/ ← ITaskStateService.cs, TaskStateService.cs, TransitionResult.cs (already exist; no move needed if already there) > Worker/Queue/ ← IQueueWaker.cs, QueueWaker.cs, IQueuePicker.cs, QueuePicker.cs, QueueService.cs, OverrideSlotService.cs, QueueSlotState.cs > Worker/Lifecycle/ ← StaleTaskRecovery.cs, TaskResetService.cs, TaskMergeService.cs > Worker/Worktrees/ ← WorktreeMaintenanceService.cs > Worker/Agents/ ← AgentFileService.cs, DefaultAgentSeeder.cs > Worker/Runner/ ← unchanged > Worker/Planning/ ← unchanged > Worker/External/ ← unchanged > Worker/Hub/ ← unchanged > ``` > > Update namespaces to match folders (existing convention: namespace == folder path under `ClaudeDo.Worker`). Delete the old `Worker/Services/` folder once empty. > > Update DI registrations in `Program.cs` (both apps) — most calls just need `using` updates. `OverrideSlotService` is a new singleton. > > Update test `using` statements to follow. > > Build, test, commit as `refactor(worker): extract OverrideSlotService and reorganize Worker/Services into domain folders`. --- ## Slice 6 — Cleanup, legacy retirement, docs **Prompt to paste into a fresh session:** > Slice 6 (final) of the worker state consolidation refactor. Spec: `docs/superpowers/specs/2026-04-27-worker-state-and-queue-consolidation-design.md` (section 6 + slice plan). Slices 1–5 already landed. > > **Goal:** retire legacy enum values, backfill DB rows, update docs. > > **EF migration `RetireLegacyTaskStatus`:** > ```sql > UPDATE tasks SET status='idle' WHERE status IN ('manual', 'draft'); > UPDATE tasks SET status='idle', planning_phase='active' WHERE status='planning'; > UPDATE tasks SET status='idle', planning_phase='finalized' WHERE status='planned'; > > -- Waiting → Queued + blocked_by from sort_order: > WITH ordered AS ( > SELECT id, > LAG(id) OVER (PARTITION BY parent_task_id ORDER BY sort_order, created_at) AS prev_id > FROM tasks WHERE status='waiting' > ) > UPDATE tasks > SET status='queued', > blocked_by_task_id=(SELECT prev_id FROM ordered WHERE ordered.id=tasks.id) > WHERE id IN (SELECT id FROM ordered); > ``` > Use `migrationBuilder.Sql(...)` for these. Down() is best-effort: `Cancelled` → `Failed`, `(idle, finalized)` → `planned`, `(idle, active)` → `planning`, `queued + blocked_by_task_id != null` → `waiting`. Document lossiness in a comment. > > **Code changes:** > - Remove legacy values from `TaskStatus` enum: `Manual, Planning, Planned, Draft, Waiting`. > - Strip the legacy branches from `TaskEntityConfiguration.StatusToString`/`StatusFromString`. > - Default for `TaskEntity.Status` is `TaskStatus.Idle` (already correct after Slice 1's revert). > - Audit + remap every remaining caller — they should already use new values from Slices 2–4, but search for any leftover `TaskStatus.Manual` etc. in: > - tests (~10 files seed status — flip to `Idle`/`Queued`/etc.) > - UI (`TaskRowViewModel.IsPlanningParent`, `IsDraft`, `CanOpenPlanningSession`, status maps — replace with `PlanningPhase` checks where appropriate) > - any leftover guards in MCP/services > - Mark `Mark*Async` repo helpers as `internal` if not already (Slice 2 should have done this — verify). > > **Docs to update:** > - `src/ClaudeDo.Worker/CLAUDE.md` — new folder structure, new state-service flow, new wake mechanics, removal of legacy values. > - `src/ClaudeDo.Data/CLAUDE.md` — TaskEntity new fields (`PlanningPhase`, `BlockedByTaskId`), retired legacy enum values, new tag-attach behavior. > - `docs/plan.md` — update status flow section. > - `docs/open.md` — close the "queue doesn't pick up planning tasks" item if it's tracked there; add any follow-ups discovered along the way. > - Memory: update `~/.claude/projects/C--Private-ClaudeDo/memory/` with a new entry summarizing the new architecture (state-service + queue split + planning chain via blocked-by). > > **Sanity tests** — full test run. The 4 pre-existing constructor-drift errors should still be the only failures. If new ones surfaced from missed legacy-value remappings, fix them before commit. > > Build, full test run, commit as `refactor(data): retire legacy TaskStatus values and backfill existing rows`. --- ## After Slice 6 - All 6 slices on `main`. - The original bug ("queue doesn't pick up planning tasks") is structurally impossible. - Worker has clear domain folders, single state-mutator, single queue-picker. - Spec doc + this prompt file can be deleted or moved to `docs/superpowers/done/`.