01 Issue
Feature · Open · Swamp CLI
Assignees: None

Relationships

#840 Expose run/job/step identifiers as SWAMP_* env vars + CEL values, and template placement selectors (extends #331's run.id)

Opened by swamp_lord · 6/26/2026

Problem

#331 shipped run.id in the CEL templating context — the workflow-run UUID — which unblocked run-scoped resource KEYS. Three gaps remain for CI-style and self-provisioning workflows:

  1. CEL-only, not env. run.id is reachable when expanding a step's inputs, but a model method's body (and any subprocess or container it spawns) has no ambient way to learn the run it belongs to. The only path today is to thread run.id in as an explicit input arg to every method, then re-thread it into every spawned docker run -e … / child process. GitHub Actions solves this by injecting a fixed GITHUB_* env set into every step's environment, so any tool the step shells out to can self-identify without the author plumbing it.

  2. Only the run, nothing finer. There is no job id, step id, fan-out (forEach) item coordinate, retry/attempt counter, or "which worker executed this step." Correlating the artifacts of a fan-out needs at least run + job + item-index to tell parallel branches apart.

  3. CEL works in inputs but NOT in placement label selectors. ${{ … }} is evaluated when expanding a step's task.inputs, but a placed step's labels: selector is NOT templated — a run: ${{ run.id }} selector passes through VERBATIM and matches no worker (No connected worker matches labels …run=${{ run.id }}), while swamp workflow validate accepts it silently. So even though the worker side CAN advertise a run-scoped label (it's a model-method input, where CEL works), the selector side can't match on it — making per-run placement scoping inexpressible. (Verified on build 20260623; corrects an earlier draft of this issue that assumed run.id was usable in a placement selector.)
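Gap 3 can be illustrated with a minimal sketch of exact-match label selection. The `matchesLabels` function, the label shapes, and the UUID are hypothetical stand-ins, not swamp's actual matcher or real data. The point is only that a selector value left as the literal text `${{ run.id }}` can never equal a worker's concrete UUID label.

```typescript
// Hypothetical stand-in for placement matching: every selector label must
// equal the worker's advertised label exactly. Not swamp's real matcher.
type Labels = Record<string, string>;

function matchesLabels(selector: Labels, worker: Labels): boolean {
  return Object.keys(selector).every((key) => worker[key] === selector[key]);
}

// Worker side: the label was CEL-expanded because it is a model-method input.
// (Illustrative UUID only.)
const workerLabels: Labels = { run: "3f2c9a1e-0000-4000-8000-000000000001" };

// Selector side today: the template passes through verbatim.
const verbatimSelector: Labels = { run: "${{ run.id }}" };

// Selector side if it were templated with the same run id.
const expandedSelector: Labels = { run: "3f2c9a1e-0000-4000-8000-000000000001" };

const matchesToday = matchesLabels(verbatimSelector, workerLabels);
const matchesIfTemplated = matchesLabels(expandedSelector, workerLabels);
```

Under these assumptions `matchesToday` is false and `matchesIfTemplated` is true, which mirrors the "No connected worker matches labels" failure described in gap 3.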

Concretely: we run a workflow that provisions its own pool of resource-capped, containerized remote-execution workers. Each phase provisions N workers, fans work out across them, then tears them down, all through a model method that launches sibling containers with docker run. To scope placement deterministically, and to label each container so that docker ps --filter label=… and log correlation work, every container needs a stable id tying it to (a) the run and (b) the job/phase that spawned it. We CAN stamp the worker side with run.id (it's a CEL input on the provision method), but we can neither (a) match it on the placement selector (gap 3) nor (b) have the spawning method self-identify without threading run.id through as an input (gap 1), and there is nothing for the job/step/item axis at all (gap 2). We're forced to derive a synthetic scope from inputs and thread it everywhere, which is exactly the boilerplate #331 set out to remove for resource keys.

Proposed Solution

Expose a small, GitHub-Actions-inspired family of identifiers on BOTH surfaces: the CEL templating context AND the per-step process environment under a SWAMP_ prefix. Each identifier is then usable from ${{ … }} expressions and from Deno.env.get(…) / shell $SWAMP_* alike. A step's spawned children inherit the env, so the contract reaches one level deeper than CEL can. In addition, evaluate placement-label selectors in the same CEL pass that step inputs already get, so these ids are usable on the matching side too (gap 3).
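A minimal sketch of what templating selector values could look like. It substitutes `${{ … }}` using a plain dotted-path lookup, which is a deliberate simplification standing in for the real CEL evaluator. `lookup`, `expandTemplate`, `expandSelector`, and the context shape are hypothetical names, not swamp APIs.

```typescript
// Hypothetical sketch: expand ${{ path }} in selector label values.
// A dotted-path lookup stands in for the CEL evaluator; real CEL is richer.
type Context = { [key: string]: string | number | Context };

function lookup(ctx: Context, path: string): string {
  let current: string | number | Context = ctx;
  for (const part of path.split(".")) {
    if (typeof current !== "object" || !(part in current)) {
      throw new Error(`unknown template variable: ${path}`);
    }
    current = current[part];
  }
  if (typeof current === "object") {
    throw new Error(`template variable is not a scalar: ${path}`);
  }
  return String(current);
}

function expandTemplate(text: string, ctx: Context): string {
  const pattern = /\$\{\{\s*([A-Za-z_][\w.]*)\s*\}\}/g;
  return text.replace(pattern, (_match: string, path: string) => lookup(ctx, path));
}

// Apply the same expansion to every label value of a placement selector.
function expandSelector(
  labels: Record<string, string>,
  ctx: Context,
): Record<string, string> {
  const expanded: Record<string, string> = {};
  for (const key of Object.keys(labels)) {
    expanded[key] = expandTemplate(labels[key], ctx);
  }
  return expanded;
}
```

Throwing on an unknown variable, rather than leaving the template in place, is a design choice in this sketch: it surfaces a typo at expansion time instead of producing a selector that silently matches nothing.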

Proposed set (GH Actions analogue shown for reference; names are a strawman):

| CEL | env var | GH Actions analogue | notes |
|---|---|---|---|
| run.id | SWAMP_RUN_ID | GITHUB_RUN_ID | already in CEL via #331; add the env var |
| run.attempt | SWAMP_RUN_ATTEMPT | GITHUB_RUN_ATTEMPT | increments on retry / resume |
| run.workflowName | SWAMP_WORKFLOW_NAME | GITHUB_WORKFLOW | |
| run.workflowId | SWAMP_WORKFLOW_ID | GITHUB_WORKFLOW_REF | the definition UUID |
| run.actor | SWAMP_ACTOR | GITHUB_ACTOR | principal that launched the run (audit / auth) |
| run.trigger | SWAMP_EVENT_NAME | GITHUB_EVENT_NAME | manual / scheduled / resumed |
| job.id | SWAMP_JOB_ID | | UUID of this job within the run |
| job.name | SWAMP_JOB_NAME | GITHUB_JOB | |
| step.id | SWAMP_STEP_ID | GITHUB_ACTION | |
| step.name | SWAMP_STEP_NAME | | |
| item.index / item.key | SWAMP_ITEM_INDEX / SWAMP_ITEM_KEY | matrix context | forEach fan-out coordinate |
| worker.id | SWAMP_WORKER_ID | RUNNER_NAME | which remote worker executed the step (placement debugging) |

run.id is the load-bearing value; the minimum that unblocks us is SWAMP_RUN_ID as an env var, job.name / item.index for fan-out correlation, and CEL-templated placement selectors. The rest is the natural CI surface and can land incrementally. Happy to converge on swamp's vocabulary (run vs execution, etc.).
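The table above could be encoded as a single mapping from CEL paths to env var names, so the two surfaces cannot drift apart. This is a sketch that assumes the strawman names from the table and a hypothetical flat `StepIdentity` record keyed by CEL path. Identifiers that are not set (for example item.index outside a forEach) are simply omitted from the environment.

```typescript
// Strawman mapping from the table: CEL path -> SWAMP_* env var name.
const ENV_NAMES: Record<string, string> = {
  "run.id": "SWAMP_RUN_ID",
  "run.attempt": "SWAMP_RUN_ATTEMPT",
  "run.workflowName": "SWAMP_WORKFLOW_NAME",
  "run.workflowId": "SWAMP_WORKFLOW_ID",
  "run.actor": "SWAMP_ACTOR",
  "run.trigger": "SWAMP_EVENT_NAME",
  "job.id": "SWAMP_JOB_ID",
  "job.name": "SWAMP_JOB_NAME",
  "step.id": "SWAMP_STEP_ID",
  "step.name": "SWAMP_STEP_NAME",
  "item.index": "SWAMP_ITEM_INDEX",
  "item.key": "SWAMP_ITEM_KEY",
  "worker.id": "SWAMP_WORKER_ID",
};

// Hypothetical input: identifiers keyed by CEL path; absent ones are undefined.
type StepIdentity = { [celPath: string]: string | number | undefined };

function toSwampEnv(identity: StepIdentity): Record<string, string> {
  const env: Record<string, string> = {};
  for (const celPath of Object.keys(ENV_NAMES)) {
    const value = identity[celPath];
    if (value !== undefined) {
      env[ENV_NAMES[celPath]] = String(value); // env values are always strings
    }
  }
  return env;
}
```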

Example — the self-provisioning worker case SHOULD collapse to this (gap 3 must be closed first; today the selector is not templated):

# DESIRED: placement selector matches ONLY this run's workers — a prior run's
# leaked/zombie worker carries a different run.id and is never selected.
# Does NOT work today: selector labels aren't CEL-evaluated (gap 3).
labels:
  run: ${{ run.id }}

// the provision method stamps each spawned container with no input threading
// (needs gap 1 — SWAMP_RUN_ID in the env):
//   docker run --label run=$SWAMP_RUN_ID --label job=$SWAMP_JOB_NAME …
// and `docker ps --filter label=run=<uuid>` then lists exactly that run's fleet.
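On the method side, the commented docker run line could be built from the ambient environment rather than from threaded inputs. This sketch assumes the proposed (not yet existing) SWAMP_RUN_ID and SWAMP_JOB_NAME variables. `dockerLabelArgs` and `dockerPsFilterArgs` are hypothetical helpers, and the environment is passed in as a plain record so the functions do not depend on any particular runtime.

```typescript
// Hypothetical helper: build `docker run` label arguments from SWAMP_* vars.
// SWAMP_RUN_ID / SWAMP_JOB_NAME are the proposed names, not shipped behaviour.
function dockerLabelArgs(env: Record<string, string | undefined>): string[] {
  const runId = env["SWAMP_RUN_ID"];
  if (!runId) {
    // Fail loudly: an unlabelled container cannot be tied back to its run.
    throw new Error("SWAMP_RUN_ID is not set");
  }
  const args = ["--label", `run=${runId}`];
  const jobName = env["SWAMP_JOB_NAME"];
  if (jobName) {
    args.push("--label", `job=${jobName}`);
  }
  return args;
}

// Matching filter arguments for `docker ps`, selecting one run's containers.
function dockerPsFilterArgs(runId: string): string[] {
  return ["--filter", `label=run=${runId}`];
}
```

In a Deno method body this might be called as `dockerLabelArgs(Deno.env.toObject())`, with the result spliced into the argument list of the spawned docker run.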

Affected Components

  • CEL evaluator / templating — extend variable resolution with job, step, item, worker, and the extra run.* fields (run.id already wired by #331); and run placement-label selector values through the same evaluator (gap 3).
  • Workflow runner / step executor: inject the SWAMP_* env vars into each step's process environment at the same point inputs are bound, and forward them in the dispatch envelope so a step executed on a remote worker sees the same $SWAMP_* values as one executed on loopback.
  • swamp workflow validate — at minimum, flag a literal ${{ … }} left unexpanded in a selector instead of accepting it (today it passes, then silently matches nothing at runtime).
  • Docs — add the SWAMP_* env table + new CEL bindings wherever inputs.* / self.* / run.id are listed.
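The validate check in the list above could be as small as a scan for template delimiters in selector values. This sketch assumes validation sees selector labels as a plain string map before any expansion. `findUnexpandedSelectors` and the message text are hypothetical, not existing swamp workflow validate output.

```typescript
// Hypothetical lint: report selector labels that still contain ${{ ... }}.
// Relevant only while selectors are not templated (gap 3).
function findUnexpandedSelectors(labels: Record<string, string>): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(labels)) {
    if (/\$\{\{.*?\}\}/.test(labels[key])) {
      problems.push(
        `label "${key}" contains an unexpanded template: ${labels[key]}`,
      );
    }
  }
  return problems;
}
```

Returning a list of messages instead of throwing on the first hit lets a validator report every offending label in one pass.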

High-Level Approach

These values already exist in the run / job / step records the engine walks — this is exposure, not new state. The env-var half is a dictionary merge into the step's spawn environment; the CEL half is the same extension #331 made for run.id, widened to the job / step / item records already in scope at expansion time, and applied to selector labels as well. The one piece of genuine plumbing is forwarding the env to remote workers so the contract holds regardless of where a step lands.
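The dictionary merge and the remote forwarding described above could look like the sketch below. The `DispatchEnvelope` shape, the function names, and the precedence rule (engine-provided SWAMP_* values override any inherited variable of the same name) are all assumptions for illustration, not swamp's actual dispatch format.

```typescript
type Env = Record<string, string>;

// Hypothetical envelope sent to a remote worker; only `swampEnv` matters here.
interface DispatchEnvelope {
  stepName: string;
  swampEnv: Env;
}

// Merge identifiers into the spawn environment. Engine-provided SWAMP_* values
// win over inherited ones, so a step never runs with a stale run id.
function buildSpawnEnv(inherited: Env, swampEnv: Env): Env {
  return { ...inherited, ...swampEnv };
}

// Loopback execution: merge directly into the local environment.
function localSpawnEnv(localEnv: Env, swampEnv: Env): Env {
  return buildSpawnEnv(localEnv, swampEnv);
}

// Remote execution: the worker merges the forwarded values into its own
// environment, so the step sees the same SWAMP_* values in both cases.
function remoteSpawnEnv(workerEnv: Env, envelope: DispatchEnvelope): Env {
  return buildSpawnEnv(workerEnv, envelope.swampEnv);
}
```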

Why It Matters

Self-provisioning, fan-out, and CI-driven workflows all need to correlate what a run spawns — containers, child processes, intermediate resources, logs — back to the run (and job / item) that owns them, AND to place work only on the resources that run provisioned. #331 unblocked run-scoped resource KEYS via CEL; this extends the same idea to the EXECUTION ENVIRONMENT (spawned processes self-identify with zero input threading), to FINER granularity (job / step / fan-out item), and to the PLACEMENT SELECTOR (so a run can target only its own workers). It mirrors the GITHUB_* contract every CI tool already expects, which makes swamp workflows legible to anyone arriving from Actions or GitLab CI.

  • Builds directly on #331 (run.id in CEL, shipped).
  • #519 (persistent, queryable workflow runs) — same run-identity surface, complementary.
02 Bog Flow
OPEN · TRIAGED · IN PROGRESS · SHIPPED

Open

6/26/2026, 6:12:34 PM

No activity in this phase yet.
