THE WORKFLOW EXECUTION MODEL

Why DAGs, not scripts

A Swamp workflow is a directed acyclic graph of steps, not a sequential script. Steps declare their dependencies; the engine resolves topological order and runs independent steps concurrently. The design decision is to express the actual dependency structure rather than force an artificial sequence.

The alternative would be a linear script where step two waits for step one even when the two have no data relationship. That serializes work that could run in parallel, and it forces the author to manually reason about what can safely be reordered. A DAG makes the dependencies explicit: if step B does not depend on step A, the engine runs them at the same time without the author having to think about it. The cost is that the author must declare dependencies rather than rely on line order, but that declaration is also documentation -- it says exactly why one step must follow another.
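As a rough illustration of how declared dependencies turn into an execution order, the sketch below groups steps into "waves" using Python's standard graphlib module. It is not Swamp's scheduler: the step names are hypothetical, and a real engine may start a step as soon as its own dependencies finish rather than waiting for a whole wave.

```python
from graphlib import TopologicalSorter


def execution_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group steps into waves; every step in a wave has all dependencies met."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps that could run concurrently
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock dependents
    return waves


# Hypothetical steps: each key maps to the set of steps it depends on.
steps = {
    "build": set(),
    "lint": set(),
    "test": {"build"},
    "deploy": {"test", "lint"},
}

waves = execution_waves(steps)
print(waves)  # [['build', 'lint'], ['test'], ['deploy']]
```

Here build and lint share no dependency, so they land in the same wave without the author ordering them by hand.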

Jobs and steps

A workflow contains jobs; each job contains steps. Jobs are the unit of concurrency control -- you set a concurrency limit on a job to cap how many of its steps run simultaneously. Steps are the unit of dependency within a job.

Use multiple jobs for logically distinct phases (provision infrastructure, run tests, deploy), and multiple steps within a job for operations in one phase that may depend on each other. The distinction matters because jobs carry their own concurrency settings, so a compute-heavy test phase can be throttled independently of a lightweight notification phase.

Dependency resolution and trigger conditions

A step declares its dependencies on other steps via depends_on. By default, a step runs when its dependencies have succeeded. Two other triggers change that behavior: the failed trigger runs the step only when a dependency failed, and the completed trigger runs it regardless of outcome.

This is how Swamp handles errors declaratively. A notification step triggered on failed sends an alert without wrapping anything in try/catch. A cleanup step triggered on completed tears down temporary resources whether the workflow succeeded or not. The workflow author states the conditions under which each step should run, and the engine evaluates those conditions against the actual outcomes. The alternative -- imperative error handling scattered through step logic -- conflates what a step does with when it should run.
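The trigger rules above can be sketched as a small predicate over dependency outcomes. This is an illustration, not Swamp's implementation. It assumes the default trigger can be labeled "succeeded" (a hypothetical name) and that failed fires when any one dependency failed.

```python
from enum import Enum


class Outcome(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"


def should_run(trigger: str, dep_outcomes: list[Outcome]) -> bool:
    """Decide whether a step runs once all of its dependencies have finished."""
    if trigger == "succeeded":  # hypothetical label for the default behavior
        return all(o is Outcome.SUCCEEDED for o in dep_outcomes)
    if trigger == "failed":  # assumption: any single failed dependency is enough
        return any(o is Outcome.FAILED for o in dep_outcomes)
    if trigger == "completed":  # runs regardless of outcome
        return True
    raise ValueError(f"unknown trigger: {trigger}")


outcomes = [Outcome.SUCCEEDED, Outcome.FAILED]
print(should_run("succeeded", outcomes))  # False: the deploy step is skipped
print(should_run("failed", outcomes))     # True: the alert step fires
print(should_run("completed", outcomes))  # True: the cleanup step always runs
```

The step body never inspects these outcomes itself; the condition sits next to the step rather than inside it.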

Data flow through workflows

Steps produce data outputs that downstream steps reference via CEL expressions. When step A runs a model method, the engine writes the resulting data to the data layer. Step B can then reference that output through data.latest("model-name", "dataName").attributes.field in its own inputs. The expression resolves at execution time, not at workflow parse time, so the value is always current.
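To make the "resolves at execution time" point concrete, here is a minimal sketch that stands in for the data layer with an in-memory store and for the CEL expression with a Python lambda. The DataLayer class, the model name "vpc", and the data name "network" are all hypothetical; only the shape of the latest(...).attributes lookup comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class DataRecord:
    version: int
    attributes: dict


class DataLayer:
    """Hypothetical in-memory stand-in for a versioned data layer."""

    def __init__(self):
        self._records = {}

    def write(self, model: str, name: str, attributes: dict) -> DataRecord:
        versions = self._records.setdefault((model, name), [])
        record = DataRecord(version=len(versions) + 1, attributes=dict(attributes))
        versions.append(record)
        return record

    def latest(self, model: str, name: str) -> DataRecord:
        return self._records[(model, name)][-1]


data = DataLayer()

# Step B's input is declared before step A has written anything.
# Nothing is looked up until the lambda is called.
step_b_input = lambda: data.latest("vpc", "network").attributes["cidr"]

# Step A runs (twice, for illustration) and writes two versions.
data.write("vpc", "network", {"cidr": "10.0.0.0/16"})
data.write("vpc", "network", {"cidr": "10.1.0.0/16"})

# Step B executes: the reference resolves now, against the newest version.
resolved = step_b_input()
print(resolved)  # 10.1.0.0/16
```

Had the reference been resolved when the workflow was parsed, the lookup would have failed or returned stale data, because step A had not run yet.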

This data flow is what makes workflows more than task sequencing. Each step's output is typed, versioned, and queryable through the same data layer that How Swamp Works describes. For workflows that run across environments, dataOutputOverrides redirect where a step writes its data, and vary dimensions isolate outputs by environment or region so that a staging run does not collide with production data.

See the workflows reference for the full YAML schema governing data output configuration.

forEach iteration

A step can iterate over a list using forEach. Each iteration runs concurrently, up to the job's concurrency limit, and each produces its own independent data output, status, and logs.

The reason forEach is a workflow primitive rather than a loop inside the model is visibility. When a model loops internally, the workflow engine sees one step that either succeeded or failed. When forEach drives the iteration, the engine tracks each iteration as a distinct unit: iteration three failed while the other nine succeeded, and its logs explain why. The tradeoff is that forEach only works when iterations are independent -- if iteration B depends on the result of iteration A, the work belongs inside the model, not in forEach.
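The visibility argument can be sketched with a thread pool whose size plays the role of the job's concurrency limit. This is an assumption-laden illustration, not Swamp's engine: run_for_each, provision, and the region names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_for_each(items, operation, concurrency_limit):
    """Run operation once per item, recording a separate result for each."""

    def run_one(item):
        try:
            return {"item": item, "status": "succeeded", "output": operation(item)}
        except Exception as exc:  # one failure must not hide the other results
            return {"item": item, "status": "failed", "error": str(exc)}

    # max_workers caps how many iterations run at the same time.
    with ThreadPoolExecutor(max_workers=concurrency_limit) as pool:
        return list(pool.map(run_one, items))  # results keep the input order


def provision(region):
    # Hypothetical operation that fails for one region.
    if region == "eu-west-1":
        raise RuntimeError("quota exceeded")
    return f"provisioned {region}"


results = run_for_each(
    ["us-east-1", "eu-west-1", "ap-south-1"], provision, concurrency_limit=2
)
failed = [r["item"] for r in results if r["status"] == "failed"]
print(failed)  # ['eu-west-1']
```

Because each iteration reports its own status and error, the failing region is identifiable without reading through a single combined log.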

Manual approval gates

A manual_approval step suspends the workflow and waits for human sign-off before downstream steps proceed. Approval and resume are deliberately separate operations: the approver and the person who resumes the run may be different people, and resume can accept new inputs that did not exist when the workflow started.

The design rationale and mechanics of suspension are covered in Understanding Workflow Suspension.

Triggers and scheduling

Workflows can include a trigger block with a cron schedule. When swamp serve is running, it evaluates these triggers and executes matching workflows on schedule. The decision to keep triggers in the workflow definition rather than in a separate scheduling system is about cohesion: "what to run" and "when to run it" live in the same file, so reviewing a workflow shows the complete picture.

External events can also start workflow runs. Webhooks allow outside systems to trigger execution in response to events like repository pushes or deployment notifications. See the webhooks reference for how event-triggered execution works.