Worker CLI Modernization

Date: 2026-04-14
Status: Approved
Scope: ClaudeDo.Worker (CLI invocation, execution tracking, per-task configuration, multi-turn support)

Problem

The Worker currently invokes the Claude CLI with hardcoded flags (-p --output-format stream-json --verbose --dangerously-skip-permissions). Model, system prompt, and agent cannot be configured per list or per task. Execution is single-shot, with no retry or follow-up capability. Results are stored as a single markdown blob on the tasks row, without structured metadata, token usage, or turn count.

Goals

  1. Per-list configuration (model, system prompt, agent file) with per-task overrides
  2. Execution history — each CLI invocation tracked as its own task_runs row
  3. Multi-turn support — manual continue and auto-retry via --resume
  4. Structured output alongside markdown via --json-schema
  5. Agent file management — filesystem-based .md agents with UI to browse/create/edit
  6. Richer stream parsing — token usage, turn count, session ID, retry events

Non-Goals (Deferred)

  • --bare mode (forces API key; user relies on OAuth/keychain auth)
  • --allowedTools / permission modes (keep --dangerously-skip-permissions)
  • Schema migration framework (use IF NOT EXISTS / INSERT OR IGNORE for additive changes)

1. Schema Changes

1.1 New table: list_config

One-to-one with lists. Stores per-list defaults for CLI invocation.

CREATE TABLE IF NOT EXISTS list_config (
    list_id       TEXT PRIMARY KEY REFERENCES lists(id) ON DELETE CASCADE,
    model         TEXT NULL,       -- 'opus-4-6' | 'sonnet-4-6' | 'haiku-4-5'
    system_prompt TEXT NULL,       -- appended via --append-system-prompt
    agent_path    TEXT NULL        -- path to agent .md file, passed via --agents
);

1.2 New columns on tasks

Per-task overrides. All nullable — NULL means "use list default".

ALTER TABLE tasks ADD COLUMN model         TEXT NULL;
ALTER TABLE tasks ADD COLUMN system_prompt  TEXT NULL;
ALTER TABLE tasks ADD COLUMN agent_path     TEXT NULL;

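The schema script relies on IF NOT EXISTS and is re-applied on every startup, but SQLite's ALTER TABLE ... ADD COLUMN has no IF NOT EXISTS form. Each ADD COLUMN statement is therefore wrapped in its own try/catch: when the column is already present, SQLite fails with a "duplicate column name" error, which is safe to ignore.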

1.3 New table: task_runs

One row per CLI invocation. Supports multi-turn and retry tracking.

CREATE TABLE IF NOT EXISTS task_runs (
    id                 TEXT PRIMARY KEY,
    task_id            TEXT NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
    run_number         INTEGER NOT NULL,   -- 1, 2, 3... sequential per task
    session_id         TEXT NULL,          -- Claude CLI session ID (for --resume)
    is_retry           INTEGER NOT NULL DEFAULT 0,  -- 0 = normal/continue, 1 = auto-retry
    prompt             TEXT NOT NULL,      -- the prompt sent for this run
    result_markdown    TEXT NULL,          -- free-form result from 'result' field
    structured_output  TEXT NULL,          -- JSON from 'structured_output' field
    error_markdown     TEXT NULL,          -- error output on failure
    exit_code          INTEGER NULL,       -- CLI exit code
    turn_count         INTEGER NULL,       -- number of agent loop turns
    tokens_in          INTEGER NULL,       -- total input tokens
    tokens_out         INTEGER NULL,       -- total output tokens
    log_path           TEXT NULL,          -- NDJSON log file for this run
    started_at         TIMESTAMP NULL,
    finished_at        TIMESTAMP NULL
);

CREATE INDEX IF NOT EXISTS idx_task_runs_task_id ON task_runs(task_id);

1.4 Denormalized fields on tasks

Keep existing result, log_path, started_at, finished_at on the tasks table. After each run completes, update them with the latest run's values. This preserves backward compatibility for UI queries that read tasks directly.

1.5 Model validation

Valid model values: opus-4-6, sonnet-4-6, haiku-4-5. Validated at the application layer (repository/service), not via SQL CHECK constraint, to allow easy future additions.
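
The application-layer check described above could look roughly like the following. This is a minimal sketch, not the Worker's actual code: the `ModelValidator` class and its method names are hypothetical, and only the three model strings come from this spec.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Usage: NULL passes through untouched, unknown names are rejected.
Console.WriteLine(ModelValidator.IsValid("sonnet-4-6")); // True
Console.WriteLine(ModelValidator.IsValid(null));         // True
Console.WriteLine(ModelValidator.IsValid("gpt-4"));      // False

// Hypothetical helper; the spec only states that validation lives in the
// repository/service layer rather than in a SQL CHECK constraint.
public static class ModelValidator
{
    // The three values listed in section 1.5. Adding a model means adding an entry here.
    private static readonly HashSet<string> Allowed =
        new(StringComparer.Ordinal) { "opus-4-6", "sonnet-4-6", "haiku-4-5" };

    // NULL is valid: it means "use the list default" (or omit --model entirely).
    public static bool IsValid(string? model) =>
        model is null || Allowed.Contains(model);

    // Throws before a bad value can reach list_config.model or tasks.model.
    public static void EnsureValid(string? model)
    {
        if (!IsValid(model))
            throw new ArgumentException($"Unknown model '{model}'.", nameof(model));
    }
}
```

Keeping the set in code means a new model needs no schema change, which is the stated reason for avoiding a CHECK constraint.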


2. Agent File Management

2.1 Directory

Agents live in ~/.todo-app/agents/. The directory is created on Worker startup if absent.

2.2 File format

Standard Claude agent markdown with YAML frontmatter:

---
name: .NET Developer
description: Senior .NET developer focused on clean architecture
---

You are a senior .NET developer. Follow existing project patterns...

2.3 AgentFileService

New service in ClaudeDo.Worker (not a repository — operates on filesystem, not DB):

| Method | Description |
| --- | --- |
| ScanAsync() | Returns List<AgentInfo>; parses frontmatter for name/description from all *.md in the agents dir |
| ReadAsync(string path) | Full file content |
| WriteAsync(string path, string content) | Create or overwrite |
| DeleteAsync(string path) | Remove file |

2.4 AgentInfo DTO

public sealed record AgentInfo(string Name, string Description, string Path);
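
To illustrate how ScanAsync() might turn one agent file into an AgentInfo, here is a sketch of the frontmatter step. It assumes the frontmatter only contains single-line `key: value` pairs, as in the example in 2.2, and is not a full YAML parser. `AgentFrontmatter` is a hypothetical name, and the fallback to the file name when `name` is missing is an assumption rather than something this spec prescribes.

```csharp
#nullable enable
using System;
using System.IO;

var sample =
    "---\nname: .NET Developer\ndescription: Senior .NET developer focused on clean architecture\n---\n\n" +
    "You are a senior .NET developer.";
var info = AgentFrontmatter.Parse(sample, "agents/dotnet-developer.md");
Console.WriteLine($"{info.Name} | {info.Description}");

public sealed record AgentInfo(string Name, string Description, string Path);

// Hypothetical helper for the parsing step inside AgentFileService.ScanAsync().
public static class AgentFrontmatter
{
    public static AgentInfo Parse(string content, string path)
    {
        // ASSUMPTION: fall back to the file name when no usable 'name' key exists.
        string name = System.IO.Path.GetFileNameWithoutExtension(path);
        string description = "";

        using var reader = new StringReader(content);
        // Frontmatter is only recognized when the very first line is '---'.
        if (reader.ReadLine()?.Trim() == "---")
        {
            string? line;
            while ((line = reader.ReadLine()) is not null && line.Trim() != "---")
            {
                int colon = line.IndexOf(':');
                if (colon <= 0) continue; // not a 'key: value' line

                string key = line[..colon].Trim();
                string value = line[(colon + 1)..].Trim();
                if (key == "name" && value.Length > 0) name = value;
                else if (key == "description") description = value;
            }
        }
        return new AgentInfo(name, description, path);
    }
}
```

Quoted values, multi-line values, and nested keys are not handled here; a real implementation may prefer a YAML library if agent files start using them.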

2.5 Discovery

  • Worker scans on startup and exposes agents via a new SignalR method GetAgents().
  • UI calls GetAgents() to populate dropdowns.
  • A RefreshAgents() hub method triggers a re-scan (for after UI creates/edits a file).

3. CLI Invocation Changes

3.1 Current invocation

claude -p --output-format stream-json --verbose --dangerously-skip-permissions

The prompt is written to stdin. The invocation is single-shot, with no configuration and no structured output.

3.2 New invocation

Built dynamically per run by ClaudeArgsBuilder:

claude -p
  --output-format stream-json
  --verbose
  --dangerously-skip-permissions
  --model <resolved-model>                    # if set
  --append-system-prompt <resolved-prompt>    # if set
  --agents '[{"file":"<resolved-agent-path>"}]' # if set
  --json-schema <schema-json>                 # always
  --resume <session-id>                       # only for multi-turn/retry

3.3 Config resolution

resolved_model  = task.model  ?? list_config.model  ?? null (omit --model)
resolved_prompt = task.system_prompt ?? list_config.system_prompt ?? null (omit --append-system-prompt)
resolved_agent  = task.agent_path ?? list_config.agent_path ?? null (omit --agents)
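
The resolution rules above map directly onto C#'s null-coalescing operator. The sketch below uses compact stand-in records (`TaskOverrides`, `ListConfig`) instead of the real entities, and `ConfigResolver` is a hypothetical name; only `ClaudeRunConfig` mirrors a type defined in this spec.

```csharp
#nullable enable
using System;

var list = new ListConfig(Model: "sonnet-4-6", SystemPrompt: "Be terse.", AgentPath: null);
var task = new TaskOverrides(Model: "opus-4-6", SystemPrompt: null, AgentPath: null);

var resolved = ConfigResolver.Resolve(task, list);
Console.WriteLine($"{resolved.Model} / {resolved.SystemPrompt} / {resolved.AgentPath ?? "(none)"}");
// prints "opus-4-6 / Be terse. / (none)"

// Stand-ins for the per-task columns (1.2) and the list_config row (1.1).
public sealed record TaskOverrides(string? Model, string? SystemPrompt, string? AgentPath);
public sealed record ListConfig(string? Model, string? SystemPrompt, string? AgentPath);

// Same shape as the ClaudeRunConfig record in 3.5.
public sealed record ClaudeRunConfig(
    string? Model, string? SystemPrompt, string? AgentPath, string? ResumeSessionId);

public static class ConfigResolver
{
    // Task override wins, then the list default; a null result means "omit the flag".
    // 'list' may be null because a list does not need to have a list_config row.
    public static ClaudeRunConfig Resolve(
        TaskOverrides task, ListConfig? list, string? resumeSessionId = null) =>
        new(
            Model: task.Model ?? list?.Model,
            SystemPrompt: task.SystemPrompt ?? list?.SystemPrompt,
            AgentPath: task.AgentPath ?? list?.AgentPath,
            ResumeSessionId: resumeSessionId);
}
```

Each field is resolved independently, so a task can override the model while still inheriting the list's system prompt.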

3.4 Structured output schema

Passed via --json-schema on every invocation:

{
  "type": "object",
  "properties": {
    "summary": { "type": "string" },
    "files_changed": {
      "type": "array",
      "items": { "type": "string" }
    },
    "commit_type": { "type": "string" }
  },
  "required": ["summary"]
}

The CLI returns this in the structured_output field of the JSON result event. The markdown result remains in the result field.

3.5 ClaudeArgsBuilder

New class, single responsibility for argument construction:

public sealed class ClaudeArgsBuilder
{
    // Returns the full argument string for ProcessStartInfo.Arguments
    public string Build(ClaudeRunConfig config);
}

public sealed record ClaudeRunConfig(
    string? Model,
    string? SystemPrompt,
    string? AgentPath,
    string? ResumeSessionId
);

Testable in isolation — no process spawning, just string building.
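
As an illustration of the builder's logic, the following sketch assembles the flags from 3.2. Two deliberate deviations should be read as assumptions: it returns a token list intended for `ProcessStartInfo.ArgumentList` instead of a single string (which sidesteps shell-style quoting of the JSON payloads), and it takes the schema as a parameter. The `--agents` payload shape is copied from 3.2 as written and has not been checked against the CLI.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

var tokens = ClaudeArgs.BuildTokens(
    new ClaudeRunConfig("sonnet-4-6", null, null, "abc-123"),
    schemaJson: "{\"type\":\"object\"}");
Console.WriteLine(string.Join(" ", tokens));

public sealed record ClaudeRunConfig(
    string? Model, string? SystemPrompt, string? AgentPath, string? ResumeSessionId);

// Hypothetical variant of ClaudeArgsBuilder that yields tokens instead of one string.
public static class ClaudeArgs
{
    public static IReadOnlyList<string> BuildTokens(ClaudeRunConfig config, string schemaJson)
    {
        // Fixed flags carried over from the current invocation (3.1).
        var tokens = new List<string>
        {
            "-p", "--output-format", "stream-json", "--verbose", "--dangerously-skip-permissions",
        };

        // Emits "flag value" only when the value is set; null or empty omits the flag.
        void AddIfSet(string flag, string? value)
        {
            if (!string.IsNullOrEmpty(value))
            {
                tokens.Add(flag);
                tokens.Add(value);
            }
        }

        AddIfSet("--model", config.Model);
        AddIfSet("--append-system-prompt", config.SystemPrompt);

        if (!string.IsNullOrEmpty(config.AgentPath))
        {
            // Payload shape taken from section 3.2; the serializer escapes the path.
            tokens.Add("--agents");
            tokens.Add(JsonSerializer.Serialize(new[] { new { file = config.AgentPath } }));
        }

        // The schema is passed on every invocation.
        tokens.Add("--json-schema");
        tokens.Add(schemaJson);

        // Only present for multi-turn and retry runs.
        AddIfSet("--resume", config.ResumeSessionId);
        return tokens;
    }
}
```

A caller would copy the tokens into `ProcessStartInfo.ArgumentList` one by one. If the string-returning signature is kept, the same list can be joined after applying platform-appropriate quoting.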


4. Stream Parsing

4.1 StreamAnalyzer (replaces MessageParser)

Processes each NDJSON line and accumulates metrics:

| Responsibility | How |
| --- | --- |
| Extract result markdown | Look for type: "result", read .result field |
| Extract structured output | Same event, read .structured_output field |
| Extract session ID | Read .session_id from the result event |
| Count turns | Count events where .type == "assistant" |
| Accumulate tokens | Sum .usage.input_tokens and .usage.output_tokens from each turn |
| Track retries | Count system/api_retry events (informational logging) |
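
The responsibilities above can be sketched with System.Text.Json as follows. The exact event shapes are assumptions: the sketch looks for usage under `.message.usage` and falls back to a top-level `.usage`, and it treats "system/api_retry" as `type == "system"` with `subtype == "api_retry"`. Both should be verified against real CLI output. The class name and the malformed-line counter are hypothetical.

```csharp
#nullable enable
using System;
using System.Text.Json;

var analyzer = new StreamAnalyzerSketch();
analyzer.ProcessLine("{\"type\":\"assistant\",\"message\":{\"usage\":{\"input_tokens\":10,\"output_tokens\":5}}}");
analyzer.ProcessLine("not json");
analyzer.ProcessLine("{\"type\":\"result\",\"result\":\"Done.\",\"session_id\":\"s-1\",\"structured_output\":{\"summary\":\"ok\"}}");
Console.WriteLine($"{analyzer.TurnCount} turn(s), {analyzer.TokensIn}/{analyzer.TokensOut} tokens, session {analyzer.SessionId}");

public sealed class StreamAnalyzerSketch
{
    public string? ResultMarkdown { get; private set; }
    public string? StructuredOutputJson { get; private set; }
    public string? SessionId { get; private set; }
    public int TurnCount { get; private set; }
    public int TokensIn { get; private set; }
    public int TokensOut { get; private set; }
    public int ApiRetryCount { get; private set; }
    public int MalformedLineCount { get; private set; } // hypothetical diagnostic counter

    // Called once per NDJSON line; malformed lines are counted and skipped.
    public void ProcessLine(string line)
    {
        if (string.IsNullOrWhiteSpace(line)) return;

        JsonDocument doc;
        try { doc = JsonDocument.Parse(line); }
        catch (JsonException) { MalformedLineCount++; return; }

        using (doc)
        {
            JsonElement root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return;

            switch (GetString(root, "type"))
            {
                case "assistant":
                    TurnCount++;
                    AddUsage(root);
                    break;
                case "result":
                    ResultMarkdown = GetString(root, "result");
                    SessionId = GetString(root, "session_id");
                    if (root.TryGetProperty("structured_output", out JsonElement so)
                        && so.ValueKind != JsonValueKind.Null)
                        StructuredOutputJson = so.GetRawText(); // keep the raw JSON text
                    break;
                case "system" when GetString(root, "subtype") == "api_retry":
                    ApiRetryCount++; // ASSUMPTION: retries arrive as system events with this subtype
                    break;
            }
        }
    }

    // ASSUMPTION: usage sits under .message.usage, with top-level .usage as a fallback.
    private void AddUsage(JsonElement evt)
    {
        JsonElement usage = default;
        bool found =
            (evt.TryGetProperty("message", out JsonElement msg)
                && msg.ValueKind == JsonValueKind.Object
                && msg.TryGetProperty("usage", out usage))
            || evt.TryGetProperty("usage", out usage);
        if (!found || usage.ValueKind != JsonValueKind.Object) return;

        TokensIn += GetInt(usage, "input_tokens");
        TokensOut += GetInt(usage, "output_tokens");
    }

    private static string? GetString(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out JsonElement v) && v.ValueKind == JsonValueKind.String
            ? v.GetString() : null;

    private static int GetInt(JsonElement obj, string name) =>
        obj.TryGetProperty(name, out JsonElement v) && v.ValueKind == JsonValueKind.Number
            && v.TryGetInt32(out int n) ? n : 0;
}
```

Swallowing malformed lines keeps one corrupt line from aborting a run, and the counter leaves a trace for the "malformed input" tests listed in 10.1.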

4.2 StreamResult

public sealed class StreamResult
{
    public string? ResultMarkdown { get; set; }
    public string? StructuredOutputJson { get; set; }
    public string? SessionId { get; set; }
    public int TurnCount { get; set; }
    public int TokensIn { get; set; }
    public int TokensOut { get; set; }
    public int ApiRetryCount { get; set; }
}

4.3 Extended RunResult

public sealed class RunResult
{
    public required int ExitCode { get; init; }
    public string? ResultMarkdown { get; init; }
    public string? ErrorMarkdown { get; init; }
    public string? StructuredOutputJson { get; init; }
    public string? SessionId { get; init; }
    public int TurnCount { get; init; }
    public int TokensIn { get; init; }
    public int TokensOut { get; init; }

    public bool IsSuccess => ExitCode == 0 && ResultMarkdown is not null;
}

5. Multi-Turn & Auto-Retry

5.1 Execution flow

Task queued
  -> Run 1 (run_number=1, is_retry=0)
     -> Resolve config (list defaults + task overrides)
     -> Build CLI args (no --resume on first run)
     -> Spawn claude, stream output, parse via StreamAnalyzer
     -> Create task_runs row with all metrics
     -> Update denormalized tasks fields

     If failure (exit_code != 0):
       -> Auto-retry: Run 2 (run_number=2, is_retry=1)
          -> Prompt: "The previous attempt failed with:\n\n{error_markdown}\n\nTry again and fix the issues."
          -> Uses --resume <session_id> from Run 1
          -> Same worktree, same config
          -> Create new task_runs row
          -> If still fails: mark task Failed, stop

     If success (exit_code == 0):
       -> Auto-commit in worktree if changes
       -> Mark task Done

User triggers "Continue" on finished/failed task:
  -> New run (run_number=N+1, is_retry=0)
  -> User-provided follow-up prompt
  -> Uses --resume <session_id> from last run
  -> Task status -> Running -> Done/Failed

5.2 Rules

  • Max 1 auto-retry per task execution (no retry loops)
  • Auto-retry reuses the session via --resume (full context of prior failure)
  • Manual continue works on both Done and Failed tasks
  • Each run gets its own log file: {task_id}_run{N}.ndjson
  • Worktree commit happens only after a successful run
  • If Run 1 has no session_id (edge case: CLI crashed before producing one), skip auto-retry
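
The rules above reduce to a few pure functions, which keeps them easy to unit test. This is a sketch under the assumption that "failure" means a non-zero exit code, as in the flow in 5.1; `RetryPolicy` and its method names are hypothetical.

```csharp
#nullable enable
using System;

Console.WriteLine(RetryPolicy.ShouldAutoRetry(exitCode: 1, sessionId: "s-1", isRetry: false)); // True
Console.WriteLine(RetryPolicy.ShouldAutoRetry(exitCode: 1, sessionId: null, isRetry: false));  // False
Console.WriteLine(RetryPolicy.ShouldAutoRetry(exitCode: 1, sessionId: "s-1", isRetry: true));  // False
Console.WriteLine(RetryPolicy.LogFileName("t-42", 2));                                         // t-42_run2.ndjson

public static class RetryPolicy
{
    // Retry only when the run failed, was not itself a retry (max 1 auto-retry),
    // and produced a session ID that --resume can pick up.
    public static bool ShouldAutoRetry(int exitCode, string? sessionId, bool isRetry) =>
        exitCode != 0 && !isRetry && !string.IsNullOrEmpty(sessionId);

    // Retry prompt text as given in section 5.1.
    public static string BuildRetryPrompt(string? errorMarkdown) =>
        $"The previous attempt failed with:\n\n{errorMarkdown}\n\nTry again and fix the issues.";

    // Per-run log file name: {task_id}_run{N}.ndjson
    public static string LogFileName(string taskId, int runNumber) =>
        $"{taskId}_run{runNumber}.ndjson";
}
```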

5.3 Continue via SignalR

New hub method: ContinueTask(string taskId, string followUpPrompt) -> returns string runId

Validation:

  • Task must exist
  • Task must not be currently running
  • Previous run must have a session_id
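
The three validation checks could be expressed as a small function that the hub method calls before queuing the run. This is only a sketch: the `ContinueCheck` enum and `ContinueValidator` are hypothetical, and how each outcome maps to an exception or hub error is left open.

```csharp
#nullable enable
using System;

Console.WriteLine(ContinueValidator.Validate(taskExists: true, isRunning: false, lastSessionId: "s-1")); // Ok
Console.WriteLine(ContinueValidator.Validate(taskExists: true, isRunning: true, lastSessionId: "s-1"));  // TaskRunning
Console.WriteLine(ContinueValidator.Validate(taskExists: true, isRunning: false, lastSessionId: null));  // NoSession

public enum ContinueCheck { Ok, TaskNotFound, TaskRunning, NoSession }

public static class ContinueValidator
{
    // Checks run in the order listed in 5.3; the first failing rule wins.
    public static ContinueCheck Validate(bool taskExists, bool isRunning, string? lastSessionId)
    {
        if (!taskExists) return ContinueCheck.TaskNotFound;
        if (isRunning) return ContinueCheck.TaskRunning;
        if (string.IsNullOrEmpty(lastSessionId)) return ContinueCheck.NoSession;
        return ContinueCheck.Ok;
    }
}
```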

6. TaskRunner Refactoring

6.1 Current flow (TaskRunner.RunAsync)

  1. Load list, create worktree/sandbox, mark running
  2. Build prompt from title + description
  3. Call _claude.RunAsync(prompt, dir, logPath, taskId, callback, ct)
  4. Handle result: commit on success, mark done/failed

6.2 New flow

public async Task RunAsync(TaskEntity task, string slot, CancellationToken ct)
{
    // 1. Load list + list_config
    // 2. Resolve config (merge list_config + task overrides)
    // 3. Create worktree/sandbox (unchanged)
    // 4. Execute run (see RunOnceAsync below)
    // 5. If failed and no prior retry: auto-retry
    // 6. Final status update
}

public async Task ContinueAsync(string taskId, string followUpPrompt, string slot, CancellationToken ct)
{
    // 1. Load task, last run (for session_id)
    // 2. Mark task running
    // 3. Execute run with --resume
    // 4. Commit if success + worktree
    // 5. Final status update
}

private async Task<RunResult> RunOnceAsync(
    TaskEntity task, string slot, string runDir, ClaudeRunConfig config,
    int runNumber, bool isRetry, string prompt, CancellationToken ct)
{
    // 1. Create task_runs row (started_at = now)
    // 2. Build log path: {task_id}_run{runNumber}.ndjson
    // 3. Build CLI args via ClaudeArgsBuilder
    // 4. Spawn ClaudeProcess
    // 5. Stream lines to LogWriter + StreamAnalyzer + HubBroadcaster
    // 6. Build RunResult from StreamAnalyzer
    // 7. Update task_runs row (finished_at, metrics, result)
    // 8. Update denormalized tasks fields
    // 9. Return RunResult
}

6.3 ClaudeProcess changes

Simplified — receives pre-built args, no longer constructs its own:

public async Task<RunResult> RunAsync(
    string arguments,           // pre-built by ClaudeArgsBuilder
    string prompt,              // written to stdin
    string workingDirectory,
    Func<string, Task> onStdoutLine,
    CancellationToken ct)

The StreamAnalyzer instance is owned by the caller (TaskRunner), not ClaudeProcess. ClaudeProcess just feeds lines via the callback.
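
The "just feeds lines via the callback" part can be sketched independently of process spawning. In the sketch below a StringReader stands in for the child process's stdout, and `LinePump` is a hypothetical helper name; the real class would pass `process.StandardOutput` instead.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

var seen = new List<string>();
using var fakeStdout = new StringReader("{\"type\":\"assistant\"}\n{\"type\":\"result\"}\n");

await LinePump.PumpAsync(
    fakeStdout,
    line => { seen.Add(line); return Task.CompletedTask; },
    CancellationToken.None);

Console.WriteLine(seen.Count); // 2

public static class LinePump
{
    // Reads stdout line by line and hands each line to the caller-owned callback.
    // The callback is where TaskRunner would fan out to the log writer, the
    // stream analyzer, and the hub broadcaster.
    public static async Task PumpAsync(
        TextReader stdout, Func<string, Task> onStdoutLine, CancellationToken ct)
    {
        while (true)
        {
            // Checked between lines; a blocked read is not interrupted by this.
            ct.ThrowIfCancellationRequested();

            string? line = await stdout.ReadLineAsync();
            if (line is null) break; // end of stream: the process closed stdout

            await onStdoutLine(line);
        }
    }
}
```

Because the callback is awaited per line, a slow consumer applies backpressure to reading, and lines reach the analyzer in the order the CLI emitted them.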


7. Repository Changes

7.1 New: TaskRunRepository

| Method | Description |
| --- | --- |
| AddAsync(TaskRunEntity) | Insert new run |
| UpdateAsync(TaskRunEntity) | Update after completion |
| GetByTaskIdAsync(string taskId) | All runs for a task, ordered by run_number |
| GetLatestByTaskIdAsync(string taskId) | Most recent run (for session_id lookup) |
| GetByIdAsync(string runId) | Single run |

7.2 Extended: ListRepository

| Method | Description |
| --- | --- |
| GetConfigAsync(string listId) | Returns ListConfigEntity? |
| SetConfigAsync(ListConfigEntity) | Upsert via INSERT OR REPLACE |

7.3 New models

public sealed class TaskRunEntity
{
    public required string Id { get; init; }
    public required string TaskId { get; init; }
    public required int RunNumber { get; init; }
    public string? SessionId { get; set; }
    public required bool IsRetry { get; init; }
    public required string Prompt { get; init; }
    public string? ResultMarkdown { get; set; }
    public string? StructuredOutputJson { get; set; }
    public string? ErrorMarkdown { get; set; }
    public int? ExitCode { get; set; }
    public int? TurnCount { get; set; }
    public int? TokensIn { get; set; }
    public int? TokensOut { get; set; }
    public string? LogPath { get; set; }
    public DateTime? StartedAt { get; set; }
    public DateTime? FinishedAt { get; set; }
}

public sealed class ListConfigEntity
{
    public required string ListId { get; init; }
    public string? Model { get; set; }
    public string? SystemPrompt { get; set; }
    public string? AgentPath { get; set; }
}

8. SignalR Hub Changes

8.1 New server methods

| Method | Description |
| --- | --- |
| ContinueTask(string taskId, string followUpPrompt) | Trigger follow-up run. Returns string runId. Throws if the task is running or has no session. |
| GetAgents() | Returns List<AgentInfo> from AgentFileService scan |
| RefreshAgents() | Re-scan agents directory |

8.2 Updated broadcasts

| Event | Change |
| --- | --- |
| TaskStarted(slot, taskId, runId, runNumber, startedAt) | Added runId, runNumber |
| TaskFinished(slot, taskId, runId, status, finishedAt) | Added runId |
| TaskMessage(taskId, runId, ndjsonLine) | Added runId |
| RunCreated(taskId, runId, runNumber, isRetry) | New: signals that a retry or continue run started |

8.3 Unchanged

Ping, GetActive, CancelTask, WakeQueue, WorktreeUpdated, TaskUpdated — no changes.


9. File Structure (New/Changed)

src/ClaudeDo.Worker/
  Runner/
    ClaudeArgsBuilder.cs          NEW — CLI argument construction
    StreamAnalyzer.cs             NEW — replaces MessageParser
    StreamResult.cs               NEW — accumulated stream metrics
    RunResult.cs                  CHANGED — extended with tokens, turns, session_id
    ClaudeProcess.cs              CHANGED — simplified, takes pre-built args
    TaskRunner.cs                 CHANGED — retry/continue logic, config resolution
    MessageParser.cs              DELETED — replaced by StreamAnalyzer
  Services/
    AgentFileService.cs           NEW — filesystem agent management

src/ClaudeDo.Data/
  Models/
    TaskRunEntity.cs              NEW
    ListConfigEntity.cs           NEW
    AgentInfo.cs                  NEW — DTO (name, description, path)
  Repositories/
    TaskRunRepository.cs          NEW
    ListRepository.cs             CHANGED — GetConfigAsync, SetConfigAsync

schema/
  schema.sql                      CHANGED — list_config table, task_runs table, tasks columns

10. Testing Strategy

10.1 Unit tests (new)

| Test class | Covers |
| --- | --- |
| ClaudeArgsBuilderTests | Arg construction with all config combos, omitted flags for null values |
| StreamAnalyzerTests | Turn counting, token accumulation, result extraction, session_id, retry events, malformed input |
| AgentFileServiceTests | Scan, frontmatter parsing, read/write/delete, missing directory handling |

10.2 Unit tests (updated)

| Test class | Changes |
| --- | --- |
| TaskRunnerTests | New: auto-retry flow, continue flow, config resolution |
| QueueServiceTests | New: continue task routing |

10.3 Integration tests (new)

| Test class | Covers |
| --- | --- |
| TaskRunRepositoryTests | CRUD, ordering, latest-by-task queries |
| ListRepositoryConfigTests | GetConfig, SetConfig upsert behavior |

10.4 Existing tests (MessageParserTests)

Removed along with MessageParser. Equivalent coverage moves to StreamAnalyzerTests.