2 Commits

Author SHA1 Message Date
mika kuns
9236ca6d45 docs(data): add EF Core migration implementation plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:32:18 +02:00
mika kuns
9e1f1370bb docs(data): add EF Core migration design spec
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 08:20:33 +02:00
2 changed files with 1975 additions and 0 deletions

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,253 @@
# EF Core Migration Design
Replace the raw ADO.NET / Microsoft.Data.Sqlite data layer with Entity Framework Core and LINQ queries.
## Motivation
- Developer ergonomics: raw SQL is tedious to write and maintain; LINQ enables faster iteration.
- Maintainability: the ad-hoc migration approach (ALTER TABLE with error-code catching) and manual DBNull/enum mapping are a liability as the schema grows. EF Core provides proper migration versioning, value converters, and change tracking.
## Decision Summary
| Decision | Choice |
|---|---|
| Approach | Big bang — rewrite all 6 repositories at once |
| Migration strategy | Fresh start — EF Core owns the schema, drop schema.sql |
| DbContext sharing | Single shared `ClaudeDoDbContext` in ClaudeDo.Data |
| Configuration style | Fluent API only, clean POCO models |
| Atomic queue claim | Kept as `FromSqlRaw` — not expressible in LINQ |
---
## 1. DbContext and Entity Configuration
### ClaudeDoDbContext
A single `ClaudeDoDbContext` in `ClaudeDo.Data` with DbSets for all entities:
```csharp
public class ClaudeDoDbContext : DbContext
{
public DbSet<TaskEntity> Tasks => Set<TaskEntity>();
public DbSet<ListEntity> Lists => Set<ListEntity>();
public DbSet<TagEntity> Tags => Set<TagEntity>();
public DbSet<ListConfigEntity> ListConfigs => Set<ListConfigEntity>();
public DbSet<WorktreeEntity> Worktrees => Set<WorktreeEntity>();
public DbSet<TaskRunEntity> TaskRuns => Set<TaskRunEntity>();
public DbSet<SubtaskEntity> Subtasks => Set<SubtaskEntity>();
}
```
### Entity-to-Table Mapping
| Entity | Table | Key | Notes |
|---|---|---|---|
| `TaskEntity` | `tasks` | `Id` (TEXT) | Nav to List, Tags, Worktree, Runs, Subtasks |
| `ListEntity` | `lists` | `Id` (TEXT) | Nav to Tasks, Tags, Config |
| `TagEntity` | `tags` | `Id` (INTEGER auto) | Nav to Lists, Tasks (both M:N) |
| `ListConfigEntity` | `list_config` | `ListId` (TEXT) | 1:1 owned by List |
| `WorktreeEntity` | `worktrees` | `TaskId` (TEXT) | 1:1 owned by Task |
| `TaskRunEntity` | `task_runs` | `Id` (TEXT) | FK to Task |
| `SubtaskEntity` | `subtasks` | `Id` (TEXT) | FK to Task |
### Navigation Properties Added to Models
```csharp
// TaskEntity gains:
public ListEntity List { get; set; }
public WorktreeEntity? Worktree { get; set; }
public ICollection<TagEntity> Tags { get; set; }
public ICollection<TaskRunEntity> Runs { get; set; }
public ICollection<SubtaskEntity> Subtasks { get; set; }
// ListEntity gains:
public ListConfigEntity? Config { get; set; }
public ICollection<TaskEntity> Tasks { get; set; }
public ICollection<TagEntity> Tags { get; set; }
// TagEntity gains:
public ICollection<ListEntity> Lists { get; set; }
public ICollection<TaskEntity> Tasks { get; set; }
```
### Enum Handling
EF Core `ValueConverter<TEnum, string>` for `TaskStatus` and `WorktreeState`, storing the same lowercase strings (`"manual"`, `"active"`, etc.) for database compatibility. The `ToDb`/`FromDb` methods in repositories are removed.
### Junction Tables
`list_tags` and `task_tags` are configured as implicit join tables via `.UsingEntity()` in Fluent API — no explicit junction entity classes needed.
### Fluent Configuration
Each entity gets its own `IEntityTypeConfiguration<T>` class in a `Configuration/` folder within `ClaudeDo.Data`.
---
## 2. Migration Strategy
### Fresh Start
- `schema.sql` and `SchemaInitializer` are deleted.
- An initial EF Core migration (`InitialCreate`) is generated from the DbContext model, producing the full schema (all 8 tables, indexes, foreign keys, check constraints).
- EF's `__EFMigrationsHistory` table tracks applied migrations.
### Startup
Both App and Worker call `context.Database.Migrate()` at startup instead of `SchemaInitializer.Apply()`. This is idempotent.
### Existing Database Compatibility
For users who already have a database created by `schema.sql`, the initial migration must handle the schema already existing. On startup, if the `lists` table exists but `__EFMigrationsHistory` does not, insert the initial migration record into `__EFMigrationsHistory` so EF skips it.
### Seed Data
The `"agent"` and `"manual"` tags move into `OnModelCreating` via `HasData()`:
```csharp
modelBuilder.Entity<TagEntity>().HasData(
new TagEntity { Id = 1, Name = "agent" },
new TagEntity { Id = 2, Name = "manual" });
```
### Ad-hoc Migrations Removed
The 3 manual `ALTER TABLE` statements (model, system_prompt, agent_path on tasks) become part of the initial migration since they're already in the model. The manual `ApplyMigrations()` method is deleted.
---
## 3. Repository Rewrite
All 6 repositories are rewritten to use `ClaudeDoDbContext` and LINQ.
### Per-Repository Changes
| Repository | After EF Core |
|---|---|
| `TagRepository` | LINQ queries. `GetOrCreateAsync` uses `FirstOrDefaultAsync` + `Add` + `SaveChangesAsync`. Static `SqliteConnection` overload removed. |
| `SubtaskRepository` | Straightforward LINQ CRUD, `.OrderBy(s => s.OrderNum)`. |
| `WorktreeRepository` | LINQ CRUD. State update becomes property set + `SaveChangesAsync`. |
| `ListRepository` | LINQ CRUD. Tag management via `.Tags` navigation property. Config upsert via `List.Config` navigation. |
| `TaskRunRepository` | LINQ CRUD. Latest = `.OrderByDescending(r => r.RunNumber).FirstOrDefaultAsync()`. |
| `TaskRepository` | See special cases below. |
### TaskRepository Special Cases
**Atomic queue claim** (`GetNextQueuedAgentTaskAsync`): kept as `FromSqlRaw` / `ExecuteSqlRawAsync`. The `UPDATE ... WHERE id = (SELECT ...) RETURNING` is not expressible in LINQ and the atomicity guarantee matters.
**Effective tags** (`GetEffectiveTagsAsync`): LINQ via navigation properties:
```csharp
var taskTags = context.Tasks
.Where(t => t.Id == taskId)
.SelectMany(t => t.Tags);
var listTags = context.Tasks
.Where(t => t.Id == taskId)
.SelectMany(t => t.List.Tags);
return await taskTags.Union(listTags).Distinct().ToListAsync(ct);
```
**FlipAllRunningToFailed**: EF Core 7+ bulk update:
```csharp
await context.Tasks
.Where(t => t.Status == TaskStatus.Running)
.ExecuteUpdateAsync(s => s.SetProperty(t => t.Status, TaskStatus.Failed), ct);
```
**Status transitions** (`MarkRunningAsync`, `MarkDoneAsync`, `MarkFailedAsync`): property updates + `SaveChangesAsync`.
### Removed Code
- `SqliteConnectionFactory.cs`
- `SchemaInitializer.cs`
- `schema/schema.sql`
- All `ToDb`/`FromDb` enum mapping methods
- All manual `DBNull.Value` handling
- `BindTask` helper methods
---
## 4. Package Changes and DI Registration
### ClaudeDo.Data.csproj
- Remove: `Microsoft.Data.Sqlite`
- Remove: embedded resource for `schema.sql`
- Add: `Microsoft.EntityFrameworkCore.Sqlite`
- Add: `Microsoft.EntityFrameworkCore.Design` (`PrivateAssets="all"`)
### ClaudeDo.Worker.Tests.csproj
- Remove: `Microsoft.Data.Sqlite`
- Add: `Microsoft.EntityFrameworkCore.Sqlite`
### App DI (Program.cs)
```csharp
// Replace SqliteConnectionFactory + singleton repos with:
sc.AddDbContextFactory<ClaudeDoDbContext>(opt =>
opt.UseSqlite($"Data Source={dbPath}"));
sc.AddScoped<ClaudeDoDbContext>(sp =>
sp.GetRequiredService<IDbContextFactory<ClaudeDoDbContext>>().CreateDbContext());
sc.AddScoped<ListRepository>();
sc.AddScoped<TaskRepository>();
sc.AddScoped<SubtaskRepository>();
sc.AddScoped<TagRepository>();
sc.AddScoped<WorktreeRepository>();
sc.AddScoped<TaskRunRepository>();
// Migrate at startup:
using var initScope = services.CreateScope();
initScope.ServiceProvider.GetRequiredService<ClaudeDoDbContext>().Database.Migrate();
```
ViewModels are singletons that currently take repositories as constructor parameters. Since repositories become scoped, ViewModels switch to taking `IDbContextFactory<ClaudeDoDbContext>` and create a fresh context (+ repositories) per operation. Each ViewModel method that touches data does: `using var context = _factory.CreateDbContext();` then constructs or resolves the needed repository with that context. This mirrors the current connection-per-call pattern.
### Worker DI (Program.cs)
```csharp
builder.Services.AddDbContext<ClaudeDoDbContext>(opt =>
opt.UseSqlite($"Data Source={cfg.DbPath}"));
builder.Services.AddScoped<ListRepository>();
builder.Services.AddScoped<TaskRepository>();
builder.Services.AddScoped<SubtaskRepository>();
builder.Services.AddScoped<TagRepository>();
builder.Services.AddScoped<WorktreeRepository>();
builder.Services.AddScoped<TaskRunRepository>();
// Migrate at startup after build:
using var scope = app.Services.CreateScope();
scope.ServiceProvider.GetRequiredService<ClaudeDoDbContext>().Database.Migrate();
```
Worker has request scopes via SignalR hub invocations, so scoped registration works naturally.
---
## 5. Test Infrastructure
### DbFixture
`DbFixture` is rewritten as an EF Core fixture:
- Creates a temp SQLite file per test class.
- Builds `DbContextOptions<ClaudeDoDbContext>` with `UseSqlite`.
- Calls `context.Database.Migrate()` to apply the schema (also tests that migrations work).
- Exposes a `CreateContext()` method so each test gets a fresh context instance (avoids change-tracker bleed).
Tests construct repositories by passing in a fresh context from the fixture.
No mocking — tests keep hitting real SQLite, same philosophy as today.
---
## 6. Risk and Mitigation
| Risk | Mitigation |
|---|---|
| Big-bang rewrite touches nearly every file in ClaudeDo.Data | Existing tests are the safety net — all must pass after migration |
| Existing databases with schema from schema.sql | Compatibility shim: detect existing tables, mark initial migration as applied |
| Atomic queue claim semantics change | Kept as raw SQL via `FromSqlRaw` |
| Scoped lifetime vs. singleton ViewModels | `IDbContextFactory` provides on-demand contexts |
| EF change tracker overhead vs. raw ADO.NET | Negligible for this workload size; use `AsNoTracking()` for read-only queries |