01 Issue
Feature · Shipped · Swamp CLI
Assignees: stack72

#241 performance degrades significantly with large SQLite catalog

Opened by webframp · 5/5/2026 · Shipped 5/5/2026

Observation

Workflow execution rate drops significantly as .swamp/data/_catalog.db grows:

Catalog Size   WAL Size   Events/min   Runs/min
~10MB          ~5MB       ~50          ~3.3
~60MB          ~20MB      ~35          ~2.3
~106MB         ~40MB      ~17          ~1.1

At 2.2GB total .swamp/ size with ~106MB catalog database, workflow runs take 3-4x longer than when the repo was fresh.
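As a quick sanity check on the figures above, the slowdown relative to the ~10MB row can be computed directly from the table. This is only arithmetic on the approximate numbers reported in this issue; the function name `slowdown` is made up for illustration.

```python
# Figures copied from the table above: (catalog MB, events/min, runs/min).
samples = [
    (10, 50, 3.3),
    (60, 35, 2.3),
    (106, 17, 1.1),
]


def slowdown(rows):
    """Return (catalog_mb, events_slowdown, runs_slowdown) relative to the first row."""
    base_events, base_runs = rows[0][1], rows[0][2]
    return [
        (mb, round(base_events / events, 2), round(base_runs / runs, 2))
        for mb, events, runs in rows
    ]


for mb, events_x, runs_x in slowdown(samples):
    print(f"~{mb}MB catalog: events {events_x}x slower, runs {runs_x}x slower")
```

On these rounded inputs the ~106MB row comes out at roughly 3x on both metrics, which sits at the lower end of the 3-4x estimate.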

Root Cause Hypothesis

Each workflow run performs many SQLite reads/writes to the catalog (one per step output, plus report generation). With a large catalog and concurrent access (5 parallel workflow runs), WAL-mode SQLite contention becomes the bottleneck.
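One way to gather evidence for or against this hypothesis is to inspect the catalog file with standard SQLite pragmas. The sketch below uses Python's stdlib `sqlite3` against a throwaway database; it assumes nothing about swamp internals, and `catalog_stats`, `demo_stats`, and the `entries` table are hypothetical names. Against a real repo, the path would be the `.swamp/data/_catalog.db` file mentioned above.

```python
import os
import sqlite3
import tempfile


def catalog_stats(db_path):
    """Collect size and journal information for a SQLite file."""
    wal_path = db_path + "-wal"  # SQLite names the WAL file <db>-wal
    # Measure the WAL first: closing the last connection may checkpoint and remove it.
    wal_bytes = os.path.getsize(wal_path) if os.path.exists(wal_path) else 0
    conn = sqlite3.connect(db_path)
    try:
        def pragma(name):
            return conn.execute(f"PRAGMA {name}").fetchone()[0]

        page_size = pragma("page_size")
        stats = {
            "journal_mode": pragma("journal_mode"),
            "page_size": page_size,
            "db_bytes": pragma("page_count") * page_size,
            # Free pages are space that VACUUM could reclaim.
            "free_bytes": pragma("freelist_count") * page_size,
            "wal_bytes": wal_bytes,
        }
    finally:
        conn.close()
    return stats


def demo_stats():
    """Build a small WAL-mode database and report its stats (illustration only)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_catalog.db")
        writer = sqlite3.connect(path)
        writer.execute("PRAGMA journal_mode=WAL").fetchall()
        writer.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, payload TEXT)")
        writer.executemany(
            "INSERT INTO entries (payload) VALUES (?)", [("x" * 200,)] * 1000
        )
        writer.commit()
        stats = catalog_stats(path)  # writer still open, so the WAL persists
        writer.close()
    return stats


print(demo_stats())
```

A large `wal_bytes` relative to `db_bytes`, or a large `free_bytes`, would suggest that checkpoints are being held back or that purged data is not being reclaimed. Neither figure proves contention on its own.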

Suggestions

  1. WAL checkpoint on gc — swamp data gc could VACUUM the catalog after purging old versions
  2. Catalog partitioning — separate catalog per model or per time window
  3. Incremental indexing — ensure catalog queries use indexed columns (modelName, tags, createdAt)
  4. Connection pooling — if concurrent runs each open separate connections, pool them
  5. Periodic auto-compact — offer swamp datastore compact that checkpoints WAL and rebuilds indexes
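Suggestions 1, 3, and 5 can be sketched with plain SQLite statements. Note that a WAL checkpoint and VACUUM are separate operations: the checkpoint moves WAL frames into the main file, while VACUUM rebuilds the file and reclaims free pages. The code below is a minimal sketch using Python's stdlib `sqlite3`, not swamp's implementation. `compact_catalog`, `uses_index`, the `catalog_entries` table, and the index name are hypothetical; only the `modelName` and `createdAt` column names come from suggestion 3.

```python
import os
import sqlite3
import tempfile


def compact_catalog(db_path):
    """Checkpoint the WAL, VACUUM, checkpoint again; return (bytes_before, bytes_after)."""
    # isolation_level=None gives autocommit; VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        before = os.path.getsize(db_path)
        conn.execute("VACUUM")
        # In WAL mode VACUUM writes through the WAL, so flush it once more.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
    finally:
        conn.close()
    return before, os.path.getsize(db_path)


def uses_index(conn, sql, params=()):
    """Return True if SQLite's query plan for `sql` mentions an index."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return any("INDEX" in row[3] for row in rows)  # row[3] is the plan detail text


def demo():
    """Fill a throwaway catalog, purge most rows, then compact it."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_catalog.db")
        conn = sqlite3.connect(path)
        conn.execute("PRAGMA journal_mode=WAL").fetchall()
        conn.execute(
            "CREATE TABLE catalog_entries ("
            "id INTEGER PRIMARY KEY, modelName TEXT, createdAt TEXT, payload TEXT)"
        )
        conn.execute("CREATE INDEX idx_entries_model ON catalog_entries (modelName)")
        conn.executemany(
            "INSERT INTO catalog_entries (modelName, createdAt, payload) VALUES (?, ?, ?)",
            [(f"model-{i % 10}", "2026-05-05", "x" * 200) for i in range(5000)],
        )
        conn.execute("DELETE FROM catalog_entries WHERE id > 500")  # simulate a gc purge
        conn.commit()
        indexed = uses_index(
            conn, "SELECT id FROM catalog_entries WHERE modelName = ?", ("model-1",)
        )
        unindexed = uses_index(
            conn, "SELECT id FROM catalog_entries WHERE payload = ?", ("x",)
        )
        conn.close()
        before, after = compact_catalog(path)
    return before, after, indexed, unindexed


print(demo())
```

A TRUNCATE checkpoint cannot complete while other connections hold open read transactions, so with 5 parallel workflow runs a compaction step like this would most likely need to run when the repo is idle.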

Impact

Long-running automation repos (CI/CD, monitoring loops) will hit this. A repo doing 100+ workflow runs per day will see meaningful slowdown within a week.

Environment

  • swamp version: 20260504.233645.0-sha.430c1535
  • Repo: ~4,500+ workflow runs, 2.2GB .swamp directory, 106MB catalog.db + 40MB WAL
02 Bog Flow
✓ OPEN · ✓ TRIAGED · ✓ IN PROGRESS · ✓ SHIPPED
+ 1 MORE · ASSIGNED · + 7 MORE · REVIEW · + 3 MORE · PR_MERGED · SHIPPED

Shipped

5/5/2026, 6:19:50 PM


03 Sludge Pulse
stack72 assigned stack72 · 5/5/2026, 4:55:29 PM
