01 Issue
Feature · Triaged · Swamp CLI
Assignees: stack72

Workflow-level workspace for docker driver: stateful multi-step workflows

Opened by stack72 · 4/11/2026

Problem

The docker driver currently runs each step in an isolated container. This is great for provisioning-style workflows where steps are independent, but it makes CI-style workflows painful: these share filesystem state (checkouts, installed dependencies, build artifacts) across multiple steps.

Concrete example: we're building a multi-model-eval workflow that mirrors a GitHub Actions workflow for evaluating skill triggers across multiple LLMs. The workflow structure is:

  1. checkout — clone the swamp repo
  2. setup-npm — run npm install in evals/promptfoo/
  3. run-evals — run 4 parallel evals (forEach over models) against the checkout
  4. cleanup — remove the checkout

All four parallel evals in step 3 need to read the same checkout and share the same node_modules tree. In raw mode this "just works" because everything runs on the host filesystem. In docker mode, each step is a fresh container with no shared state, so we have to:

  • Explicitly configure a volume mount in driverConfig.volumes
  • Use identical host and container paths (e.g., /tmp/swamp-eval-workspace:/tmp/swamp-eval-workspace) so the same path string is valid in both raw and docker modes
  • Store the clone path as a data artifact so downstream steps can look it up via data.latest('swamp-repo', 'repository').attributes.path
  • Add a dedicated setup-npm job that runs once before the parallel evals to populate the shared volume with node_modules (otherwise 4 parallel npm install runs race against each other)
  • Add a cleanup job to remove the checkout when done
  • Worry about host/container path parity, volume lifecycle, and npm cache persistence
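
The path-parity workaround in the list above can be sketched as a small helper. This is only an illustration: `parityMount` and `StepMount` are hypothetical names, not part of swamp, and the sketch assumes a `driverConfig.volumes` entry takes the `host:container` string form shown in the example above.

```typescript
// Hypothetical helper, not part of swamp's API.
interface StepMount {
  volume: string; // value for one driverConfig.volumes entry (assumed "host:container" form)
  workDir: string; // path string handed to steps in both raw and docker modes
}

// Build a mount whose host and container paths are identical, so the same
// path string resolves to the same directory in raw mode and in docker mode.
function parityMount(workspacePath: string): StepMount {
  if (!workspacePath.startsWith("/")) {
    throw new Error(`workspace path must be absolute: ${workspacePath}`);
  }
  return { volume: `${workspacePath}:${workspacePath}`, workDir: workspacePath };
}
```

Every workflow that needs shared state currently has to carry some equivalent of this by hand, plus the setup and cleanup jobs around it.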

Every CI-shaped workflow that uses the docker driver will have to reinvent this pattern. It's a lot of ceremony for something that should be "give me a shared working directory."

What we'd like

A first-class "workspace" concept for workflows. One of the following, in order of preference:

Option A: Workflow-level workspace primitive

Each workflow run gets an automatically-provisioned working directory that's mounted into every step's container at a stable path. Lifecycle is tied to the workflow run — created on start, cleaned up on end (unless --preserve-workspace is passed for debugging). Referenced in CEL as workspace.path.

workspace:
  enabled: true
  persistent: false   # optional: survive between runs for caching

jobs:
  - name: checkout
    steps:
      - name: clone
        task:
          type: model_method
          modelIdOrName: swamp-repo
          methodName: clone
          inputs:
            # workDir defaults to workspace.path

This eliminates:

  • Manual driverConfig.volumes config
  • Host/container path parity hacks
  • Manual cleanup jobs
  • Storing workdir paths as data artifacts purely for path propagation
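
For discussion, the lifecycle described for Option A (created on start, cleaned up on end unless preserved) could look roughly like the sketch below. `withWorkspace` and `WorkspaceOptions` are hypothetical names, and the `preserve` field only stands in for the proposed --preserve-workspace flag; mounting the directory into containers is out of scope here.

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical options mirroring the proposal; names are illustrative only.
interface WorkspaceOptions {
  preserve?: boolean; // stands in for the proposed --preserve-workspace flag
}

// Create a per-run directory, hand its path to the run body, and remove it
// afterwards (even if the body throws) unless the caller asked to keep it.
function withWorkspace<T>(
  run: (workspacePath: string) => T,
  opts: WorkspaceOptions = {},
): T {
  const workspacePath = mkdtempSync(join(tmpdir(), "swamp-workspace-"));
  try {
    return run(workspacePath);
  } finally {
    if (!opts.preserve) {
      rmSync(workspacePath, { recursive: true, force: true });
    }
  }
}

// Usage: the body sees a directory that exists only for the duration of the run.
const existedDuringRun = withWorkspace((path) => existsSync(path));
```

Tying cleanup to a `finally` block is what would let the manual cleanup job disappear from the workflow definition.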

Option B: Session-mode docker driver

Instead of one container per step, one container per workflow run. Steps execute as sub-operations inside the same long-lived container. State naturally persists without volume mounts. Parallel steps run as concurrent operations within the container.

driver: docker
driverConfig:
  mode: session    # vs. "per-step" (current default)
  image: ghcr.io/systeminit/swamp-eval-runner:latest

This matches how GitHub Actions / GitLab CI actually work (one runner hosts the entire job) and matches what users intuitively expect from "CI in a container." Per-step mode stays available for the current use cases.
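
One possible way to realize session mode with the plain docker CLI is a long-lived container that steps `docker exec` into. The sketch below only builds the command lines; `planSession` and `SessionPlan` are hypothetical, and it assumes the image provides `sh` and a `sleep` that accepts `infinity`. How swamp's driver would actually implement this is not described in this issue.

```typescript
// Hypothetical plan shape; swamp's real driver internals are not shown in this issue.
interface SessionPlan {
  start: string[]; // argv that starts the long-lived container
  steps: string[][]; // one argv per workflow step
  stop: string[]; // argv that removes the container
}

function planSession(runId: string, image: string, stepCommands: string[]): SessionPlan {
  const name = `swamp-session-${runId}`;
  return {
    // One detached container for the whole workflow run.
    start: ["docker", "run", "-d", "--name", name, image, "sleep", "infinity"],
    // Each step executes inside that container, so filesystem state carries over.
    steps: stepCommands.map((cmd) => ["docker", "exec", name, "sh", "-c", cmd]),
    // Force-remove the container when the run ends.
    stop: ["docker", "rm", "-f", name],
  };
}
```

Parallel steps would then be concurrent `docker exec` calls against the same container, which is where the shared state (and any contention over it) comes from.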

Option C: Step output files as implicit inputs

A lighter-weight version: a step can declare "this file or directory is my output," and swamp makes it available at the same path in downstream steps' containers. Similar to GHA's upload-artifact/download-artifact but automatic based on step dependencies.

steps:
  - name: install-deps
    task: ...
    outputs:
      - path: node_modules
        makeAvailableTo: [run-evals]
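
To make the resolution rule concrete, here is a minimal sketch of how the declared outputs could be turned into the set of paths a downstream step receives. The `StepSpec` and `StepOutput` types mirror the YAML above; `inputsFor` is a hypothetical function, and how the files would actually be transferred between containers is left open.

```typescript
// Types mirror the proposed YAML; they are illustrative, not swamp's schema.
interface StepOutput {
  path: string;
  makeAvailableTo: string[];
}

interface StepSpec {
  name: string;
  outputs?: StepOutput[];
}

// Collect every output path that some step has explicitly shared with `stepName`.
function inputsFor(stepName: string, steps: StepSpec[]): string[] {
  const paths: string[] = [];
  for (const step of steps) {
    for (const output of step.outputs ?? []) {
      if (output.makeAvailableTo.includes(stepName)) {
        paths.push(output.path);
      }
    }
  }
  return paths;
}
```

Because a path is shared only when a step names its consumer, nothing reaches a container that did not opt in.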

Why this matters

  • CI is a first-class use case. swamp is pitching itself as a general automation framework. Multi-step CI workflows with shared state are one of the most common automation patterns.
  • The current workarounds leak driver-specific concerns into workflow YAML. Users have to know that docker steps are isolated containers and plan around it. A workspace primitive abstracts this away.
  • The workarounds don't compose. If another workflow wants a similar pattern, it has to re-solve volume mounts, path parity, setup steps, and cleanup from scratch. That's a sign we're missing an abstraction.
  • It unblocks parallelism. Right now we can run 4 parallel eval steps, but only after carefully engineering around shared state. A workspace primitive or session mode makes parallelism the default, not a puzzle.

Concrete reference

The full multi-model-eval workflow and extension code are in this repo at:

  • workflows/workflow-8a88a569-4620-431c-9028-643df0118c72.yaml
  • extensions/models/ci_git.ts
  • extensions/models/ci_promptfoo_eval.ts
  • extensions/reports/ci_eval_analysis.ts
  • extensions/reports/ci_eval_result.ts

It's a complete, working example of the workarounds described above, in case it's useful to look at when designing the primitive.

02 Bog Flow

Triaged

4/11/2026, 10:09:26 PM


03 Sludge Pulse
stack72 assigned stack72 · 4/11/2026, 10:07:19 PM

bixu commented 4/21/2026, 1:05:00 PM

I'm concerned about the security implications of Option B and especially Option A. Option C is a bit more work but feels safer (explicit opt-in).
