Step key parsing in execution_service uses naive split(":") and silently truncates colon-containing step names

W1a (#1292), W1b (#1295), LockfileRepository prequel (#1298), W2 (#231 — closes swamp-club#201), W3 (#252 — closes the rebundle-loop bug class), W4 (#1XXX — closes swamp-club#214 and swamp-club#270), and W5 (#290 — closes the V8 module-graph aliasing class) collectively close all three structural concerns the audit identified. The loader-side rearchitecture is complete.

But the diagnostic surface for users is still primitive. `swamp doctor extensions` today reports basic invalidation-guard pass/fail. Users hitting weird states (stale rows, terminal-failure states, orphaned bundles) have no way to inspect the catalog without raw SQL queries. Several recent bugs (#200, #201, #310 territory) had users resorting to `rm -rf .swamp/_extension_catalog.db` as a recovery path because the existing diagnostic surface didn't show them what was wrong or how to repair it.

W6 closes this loop. It renders the post-rearchitecture aggregate state to users in a structured, actionable form, and provides a repair surface for catalog garbage collection (folding in the swamp-club#267 GC scope).

Full architectural context: `design/extension-rearchitecture.md` ("W6 — `swamp doctor extensions` aggregate-state rendering" section) — referenced from #211.

Scope

W6 has two phases. The carve question — whether W6 absorbs #267's extension-layer GC repair scope — is a pre-work decision (see below).

Phase 1 — Aggregate-state rendering (always in scope)

Extend `swamp doctor extensions` to render the catalog's aggregate state in user-friendly form:

Per-aggregate summary: name (e.g., `@local/[email protected]` or `@hivemq/[email protected]`), origin (local, source-mounted, pulled), source count, RowState distribution
Per-source detail (verbose mode): source path, RowState tag, fingerprint (truncated), bundle path
RowState legend: render all 7 tags (`Indexed | Bundled | ValidationFailed | BundleBuildFailed | EntryPointUnreadable | OrphanedBundleOnly | Tombstoned`) with brief descriptions
Orphan detection:
- Catalog rows whose `source_path` doesn't exist on disk
- Bundle files in `.swamp/-bundles/` not referenced by any catalog row
- Aggregates in terminal states (Tombstoned, OrphanedBundleOnly) that linger after the lifecycle reason for that state has passed
Output modes: `log` (human-readable, follows terminal-output skill conventions) and `json` (structured per the existing two-mode CLI convention)
Counts and rollups: "X total extensions across N kinds, M sources in healthy states, P orphan rows, Q orphan bundle files"
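The per-aggregate summary and rollup described above can be sketched as a pure function over catalog rows. This is a minimal illustration, not the catalog's actual API: the `CatalogRow` shape, the `summarize` and `rollup` names, and the choice of `Indexed` and `Bundled` as the "healthy" states are all assumptions made for the example. Only the seven RowState tags come from the list above.

```typescript
// The seven RowState tags listed above.
const ROW_STATES = [
  "Indexed",
  "Bundled",
  "ValidationFailed",
  "BundleBuildFailed",
  "EntryPointUnreadable",
  "OrphanedBundleOnly",
  "Tombstoned",
] as const;
type RowState = (typeof ROW_STATES)[number];

// Hypothetical row shape; the real catalog columns may differ.
interface CatalogRow {
  aggregate: string;
  origin: "local" | "source-mounted" | "pulled";
  sourcePath: string;
  state: RowState;
}

interface AggregateSummary {
  name: string;
  origin: CatalogRow["origin"];
  sourceCount: number;
  states: Record<RowState, number>;
}

// Assumption: only these two tags count as "healthy" in the rollup.
const HEALTHY: readonly RowState[] = ["Indexed", "Bundled"];

// Group rows by aggregate name and count sources per RowState tag.
function summarize(rows: CatalogRow[]): AggregateSummary[] {
  const byName = new Map<string, AggregateSummary>();
  for (const row of rows) {
    let summary = byName.get(row.aggregate);
    if (summary === undefined) {
      const states = {} as Record<RowState, number>;
      for (const tag of ROW_STATES) states[tag] = 0;
      summary = { name: row.aggregate, origin: row.origin, sourceCount: 0, states };
      byName.set(row.aggregate, summary);
    }
    summary.sourceCount += 1;
    summary.states[row.state] += 1;
  }
  // Sort by name so output is stable across runs.
  return Array.from(byName.values()).sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0
  );
}

// One-line rollup in the spirit of the "X total extensions ..." line.
function rollup(summaries: AggregateSummary[]): string {
  const sources = summaries.reduce((n, s) => n + s.sourceCount, 0);
  const healthy = summaries.reduce(
    (n, s) => n + HEALTHY.reduce((m, tag) => m + s.states[tag], 0),
    0,
  );
  return `${summaries.length} aggregates, ${sources} sources, ${healthy} in healthy states`;
}
```

Keeping the summary as plain data means the `log` and `json` renderers can both consume the same structure, which should make output-mode parity easier to test.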

Phase 2 — Repair surface (folds in #267 if decided)

Add `swamp doctor extensions repair` (or `--apply` flag) for catalog and bundle file garbage collection:

`--dry-run` is the DEFAULT mode (mirrors `git clean -n` semantics). Lists what would be deleted with rationale per row.
`--apply` performs the cleanup operations.
Catalog row pruning: delete rows in `Tombstoned` state or terminal states where neither source nor bundle exists on disk.
Bundle file eviction: delete files in `.swamp/-bundles/` that aren't referenced by any `bundle_path` in the catalog.
Structured logging: every cleanup operation logged with what + why for scripting visibility.

The repair surface absorbs #267 entirely if Phase 2 is in scope. See pre-work decision #1.
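One way to keep dry-run and apply in agreement is to compute a repair plan first and have both modes consume it. The sketch below is hedged accordingly: `CatalogRow`, `RepairOp`, `planRepair`, and `describe` are hypothetical names, and the file paths are placeholders. It makes three assumptions: the terminal set is `Tombstoned` plus `OrphanedBundleOnly`, `Tombstoned` rows are always pruned, and a bundle counts as referenced only if a surviving row points at it.

```typescript
type RowState =
  | "Indexed"
  | "Bundled"
  | "ValidationFailed"
  | "BundleBuildFailed"
  | "EntryPointUnreadable"
  | "OrphanedBundleOnly"
  | "Tombstoned";

// Hypothetical row shape; only source_path and bundle_path are named in the issue.
interface CatalogRow {
  id: number;
  state: RowState;
  sourcePath: string;
  bundlePath: string | null;
}

interface RepairOp {
  kind: "prune-row" | "evict-bundle";
  target: string;
  reason: string;
}

// Assumption: the terminal set is the pair named under orphan detection.
const TERMINAL: readonly RowState[] = ["Tombstoned", "OrphanedBundleOnly"];

// Compute what a repair would delete, without touching anything.
function planRepair(
  rows: CatalogRow[],
  existingSources: string[],
  bundleFiles: string[],
): RepairOp[] {
  const sources = new Set(existingSources);
  const bundles = new Set(bundleFiles);
  const ops: RepairOp[] = [];
  const surviving: CatalogRow[] = [];

  for (const row of rows) {
    const hasSource = sources.has(row.sourcePath);
    const hasBundle = row.bundlePath !== null && bundles.has(row.bundlePath);
    if (row.state === "Tombstoned") {
      ops.push({ kind: "prune-row", target: `row ${row.id}`, reason: "row is Tombstoned" });
    } else if (TERMINAL.indexOf(row.state) !== -1 && !hasSource && !hasBundle) {
      ops.push({
        kind: "prune-row",
        target: `row ${row.id}`,
        reason: `${row.state} with neither source nor bundle on disk`,
      });
    } else {
      surviving.push(row); // non-terminal rows such as Indexed always land here
    }
  }

  // Assumption: only rows that survive pruning keep a bundle alive.
  const referenced = new Set<string>();
  for (const row of surviving) {
    if (row.bundlePath !== null) referenced.add(row.bundlePath);
  }
  for (const file of bundleFiles) {
    if (!referenced.has(file)) {
      ops.push({
        kind: "evict-bundle",
        target: file,
        reason: "not referenced by any surviving catalog row",
      });
    }
  }
  return ops;
}

// Render the plan with what + why; dry-run is the default wording.
function describe(ops: RepairOp[], apply = false): string[] {
  const prefix = apply ? "deleted" : "would delete";
  return ops.map((op) => `${prefix} ${op.target}: ${op.reason}`);
}
```

Counting references from surviving rows only is a design choice for the sketch. It means a bundle owned solely by a pruned row is evicted in the same pass, so a second run finds nothing left to do.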

Pre-work decisions to pin in the PR description

1. Scope: render-only or render + repair? Recommend: render + repair (absorb #267 fully). Reasoning: `doctor extensions` is the natural home for the repair UX (per #267's own framing), and shipping rendering without repair leaves users with the same "see the problem but can't fix it" gap that exists today. The combined scope is workstream-shaped, not 2-day-shaped. If splitting is preferred, Phase 1 ships first; Phase 2 becomes the next workstream after W6 settles.
2. Output format precedent: follow the terminal-output skill for log mode. JSON mode follows the existing CLI two-mode convention. Don't invent new patterns.
3. Bundle file naming convention verification: per #267's pre-work, the eviction policy depends on whether bundle files are content-addressed (stale fingerprint versions accumulate as orphans) or overwrite-on-rebundle (only true orphans accumulate). Confirm in the audit phase before designing eviction rules.
4. Repair safety model: dry-run as default, explicit `--apply` to act. Recommend: no interactive confirmation prompts for v1; the dry-run-by-default convention is the safety. Users who want belt-and-suspenders can pipe the dry-run output into `less` before applying. Mirror `git clean -n` / `-i` semantics if interactive mode is wanted later.
5. Granularity of rendering: per-aggregate by default, per-source via `--verbose` or `--detail`. Avoid showing 1000-row tables by default; show summary + drill-down on demand.
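The summary-by-default, drill-down-on-demand granularity could look roughly like the sketch below. The `Aggregate` and `SourceDetail` shapes, the `renderLog` name, the line layout, and the 8-character fingerprint truncation are all assumptions for illustration. The real layout should follow the terminal-output skill rather than this format.

```typescript
// Hypothetical shapes for the data handed to the renderer.
interface SourceDetail {
  path: string;
  state: string;
  fingerprint: string;
  bundlePath: string;
}

interface Aggregate {
  name: string;
  origin: string;
  sources: SourceDetail[];
}

// One line per aggregate by default; per-source lines only when verbose.
function renderLog(aggregates: Aggregate[], verbose = false): string[] {
  const lines: string[] = [];
  for (const agg of aggregates) {
    // Count sources per state, preserving first-seen order.
    const counts = new Map<string, number>();
    for (const s of agg.sources) {
      counts.set(s.state, (counts.get(s.state) ?? 0) + 1);
    }
    const dist = Array.from(counts.entries())
      .map(([state, n]) => `${state}=${n}`)
      .join(" ");
    lines.push(`${agg.name} (${agg.origin}) sources=${agg.sources.length} ${dist}`);

    if (verbose) {
      for (const s of agg.sources) {
        // Assumption: 8 characters is enough to identify a fingerprint by eye.
        const fp = s.fingerprint.slice(0, 8);
        lines.push(`  ${s.path} [${s.state}] fp=${fp} bundle=${s.bundlePath}`);
      }
    }
  }
  return lines;
}
```

With this split, a catalog of 1000 rows across 40 aggregates prints 40 lines by default and the full per-source listing only when asked.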

Out of scope (deferred to other workstreams or never)

Workflow runner mid-step extension refresh — separate concern; W5 made it safe via per-fingerprint URLs but the workflow runner architecture doesn't currently re-import between steps. If wanted as a feature, file as its own tracker.
`allExtensionMethodsAttached` removal (swamp-club#318) — separate registration-path consolidation work; W6 doesn't touch this.
`sourceToRow` mtime user surface (swamp-club#271) — if W6's rendering wants to show mtime, that's fine; the underlying mtime threading is already done.
`getCatalogStore()` cleanup (the surviving `legacyStore`-shaped accessor pattern from W4 review) — could be tracked separately if cleanup is desired.
Per-fingerprint import URL changes — W5's territory.
Catalog schema changes — none needed; W6 reads existing state.
Reconcile semantic changes — W3's territory.

Success criteria

`swamp doctor extensions` renders aggregate state in both `log` and `json` modes covering all 7 RowState tags, with per-aggregate summary + verbose drill-down.
Orphan detection surfaces catalog-only rows, bundle-only files, and lingering Tombstoned aggregates.
`--dry-run` (default) lists every cleanup-eligible row + file with reason; `--apply` performs cleanup atomically; idempotent on re-run.
swamp-club#267 closes (absorbed into Phase 2 if Phase 2 is in scope).
Diagnostic shift: users hitting weird catalog states have `swamp doctor extensions` as the first-line repair path. The "delete `.swamp/_extension_catalog.db` and pray" anti-pattern from #200 stays closed.
All existing tests pass on Linux + macOS (Windows not a merge gate per W-series precedent).
Auto-ship-on-merge readiness verified via diversity-matrix soak.
Forward-only revert posture documented.

Suggested test additions

Rendering per RowState tag: parameterized test that seeds a catalog with one row per RowState, runs `swamp doctor extensions --json`, asserts each state is correctly classified and rendered.
Orphan detection: seed catalog rows whose source paths don't exist; seed bundle files not referenced by any catalog row; assert both classes surface in the rendering.
Output mode parity: `log` mode and `json` mode contain the same information (modulo formatting). Use the output-parity assertion pattern from W2's DuplicateTypeError test.
`--dry-run` safety: seed cleanup-eligible state, run `--dry-run`, assert no catalog or filesystem changes after the run.
`--apply` idempotence: run `--apply` twice; second run reports zero operations (state already clean).
Cross-platform paths: path rendering uses canonical forms; use `assertPathEquals` from `path_test_helpers.ts`.
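The `--dry-run` safety and `--apply` idempotence tests above share a shape that can be captured in two small helpers. The sketch below uses an in-memory stand-in rather than the real catalog or filesystem. `RepoState`, `Repair`, `assertDryRunIsPure`, `assertApplyIsIdempotent`, and the toy `pruneTombstoned` function are hypothetical names introduced only to show the pattern.

```typescript
// Hypothetical in-memory stand-in for the catalog plus bundle directory.
interface RepoState {
  rows: { id: number; state: string }[];
  bundleFiles: string[];
}

// A repair function returns the operations it performed (or would perform).
type Repair = (state: RepoState, opts: { apply: boolean }) => string[];

// Dry-run safety: state must be byte-for-byte identical after the run.
function assertDryRunIsPure(repair: Repair, state: RepoState): void {
  const before = JSON.stringify(state);
  repair(state, { apply: false });
  if (JSON.stringify(state) !== before) {
    throw new Error("dry run mutated state");
  }
}

// Apply idempotence: the second apply must report zero operations.
function assertApplyIsIdempotent(repair: Repair, state: RepoState): void {
  repair(state, { apply: true });
  const second = repair(state, { apply: true });
  if (second.length !== 0) {
    throw new Error(`second apply reported ${second.length} operations`);
  }
}

// Toy repair used only to exercise the helpers: prunes Tombstoned rows.
const pruneTombstoned: Repair = (state, opts) => {
  const doomed = state.rows.filter((r) => r.state === "Tombstoned");
  if (opts.apply) {
    state.rows = state.rows.filter((r) => r.state !== "Tombstoned");
  }
  return doomed.map((r) => `prune row ${r.id}`);
};
```

In a real test the snapshot would cover both the catalog rows and a directory listing, since the dry-run guarantee covers the catalog and the filesystem alike.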

Auto-ship-on-merge constraint

Standard gates plus W6-specific considerations:

CI green (all new + existing tests + type-check + lint + fmt)
Author smoke on real repo: run `doctor extensions` against actual installed extensions; verify output makes sense; run `--dry-run` then `--apply` on a repo with known cleanup-eligible state.
Reviewer smoke on different real repo: same checks.
Diversity-matrix soak (~12-24 hours since W6 is presentation-layer, narrower blast radius than W4):
- Linux + macOS
- Repos with varied state (clean, with Tombstoned rows, with terminal-failure states, with orphan bundles)
- Verify output is readable and accurate across states
No user-visible regression for users who don't run `doctor extensions`
Repair operations verified to not affect repos that don't run `--apply`
Forward-only revert posture documented

Push-back encouraged

If the design doesn't fit the ground, surface before implementation. Specific watch list:

Repair scope larger than expected. If the bundle-file eviction logic discovers complexity (content-addressed naming, cross-platform path quirks), Phase 2 may warrant its own follow-up workstream rather than folding into W6. Apply the LockfileRepository-prequel threshold pattern: pre-commit to size thresholds before audit.
Rendering volume on large catalogs. A repo with 500+ rows might overwhelm log-mode output by default. The summary-by-default + drill-down-on-demand pattern should handle this, but verify against a real large-repo scenario.
Output format design decisions. Color usage, table formatting, tree rendering — design choices that should follow the terminal-output skill, not invent new conventions. If the skill doesn't cover something W6 needs, surface for a design conversation rather than picking ad-hoc.
Interaction with the existing `doctor extensions` invalidation-guard reporting. The existing command reports invalidation-guard pass/fail. W6 extends this with aggregate-state rendering. Confirm the two reports compose cleanly — invalidation-guard report stays at the top, aggregate state below it. Don't replace the existing output silently.

The two most expensive misses to watch for

`--apply` deletes more than intended. A bug in the cleanup logic could delete rows that should have been preserved. Catch by: dry-run-by-default convention + comprehensive regression tests asserting that `Indexed` rows are NEVER touched, only terminal states are eligible for deletion.
Phase 2 absorbing #267 silently grows scope past the W6 budget. Bundle file eviction has more edge cases than catalog row pruning (content-addressed naming, cross-platform paths, possible race with concurrent swamp processes). If audit reveals Phase 2 is ≥ ~600 LOC on its own, consider splitting into W6a (rendering) and W6b (repair).

References

Predecessors: #211 (W1 tracking), #223 (W1b), #231 (W2), #252 (W3), #269 (W4), #290 (W5)
Folded in: #267 (extension layer garbage collection — superseded by W6's Phase 2 if absorbed)
Possibly addressable: swamp-uat#200 (UAT scenario for bundle-cache self-recovery — could be a natural surface for verifying repair-after-corruption)
Design doc: `design/extension-rearchitecture.md`

02 Bog Flow

Shipped

5/11/2026, 9:45:53 PM

03 Sludge Pulse

stack72 assigned stack72 5/11/2026, 5:11:14 PM
