
REMOTE EXECUTION

Swamp workflows run steps. By default, every step runs in the same process that started the workflow — the orchestrator. This is simple and sufficient when all the work happens on one machine. But when steps need to run on other machines — different architectures, different networks, closer to the data they act on — the orchestrator needs a way to send work elsewhere while retaining control over everything that matters. That is the remote execution model.

Orchestrator and worker roles

The orchestrator is the machine that owns the durable world. It holds the datastores, resolves vault secrets, stores model definitions, and runs workflows. When a step needs to execute remotely, the orchestrator does not open a connection to a worker. Instead, the worker dials home, and the orchestrator assigns work over the connection the worker opened.

Workers initiate all connections. Each worker opens an outbound WebSocket control socket to the orchestrator for receiving step assignments and reporting results. A separate HTTP/2 data plane handles bulk transfers — step inputs, output artifacts, streamed logs. Because the worker connects outward, the orchestrator never needs to reach into the worker's network. This matters in environments where workers sit behind NATs, firewalls, or ephemeral cloud networking: the worker only needs to know where the orchestrator is, not the other way around.
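The split between the two channels can be pictured with a small type sketch. The message shapes and the `channelFor` helper below are hypothetical, chosen only to illustrate which traffic the text assigns to the control socket and which to the data plane; Swamp's actual wire format is not described here.

```typescript
// Hypothetical message shapes; not Swamp's actual wire format.
type ControlMessage =
  | { kind: "assign"; stepId: string } // orchestrator -> worker
  | { kind: "result"; stepId: string; ok: boolean }; // worker -> orchestrator

// Bulk transfers named in the text above.
type BulkTransfer = "step-input" | "output-artifact" | "log-stream";

// Small control messages use the WebSocket; bulk payloads use HTTP/2.
function channelFor(
  payload: ControlMessage | BulkTransfer,
): "control-socket" | "data-plane" {
  return typeof payload === "string" ? "data-plane" : "control-socket";
}
```

Both channels are opened by the worker, so nothing in this sketch requires the orchestrator to know a worker's address.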

A worker is stateless compute. It receives a step to execute, runs it, streams results back, and moves on. It holds no credentials, no definitions, no persistent data. If a worker disappears — its container is reclaimed, its spot instance is terminated — the orchestrator notices the dropped connection and can reassign the step. Nothing is lost because nothing lived on the worker in the first place.
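The reassignment behavior described above can be modeled with a small in-memory table. This is a sketch under assumptions: `AssignmentTable` and its methods are hypothetical names, not Swamp's scheduler, and the real orchestrator may requeue or select steps differently.

```typescript
// Hypothetical in-memory model of step reassignment; not Swamp's scheduler.
class AssignmentTable {
  private pending: string[] = []; // step ids waiting for a worker
  private inFlight: Record<string, string> = {}; // step id -> worker id

  enqueue(stepId: string): void {
    this.pending.push(stepId);
  }

  // Hand the oldest pending step to a worker, if there is one.
  assignNext(workerId: string): string | undefined {
    const stepId = this.pending.shift();
    if (stepId !== undefined) this.inFlight[stepId] = workerId;
    return stepId;
  }

  // A dropped connection returns that worker's in-flight steps to the queue.
  onDisconnect(workerId: string): string[] {
    const requeued = Object.keys(this.inFlight).filter(
      (stepId) => this.inFlight[stepId] === workerId,
    );
    for (const stepId of requeued) {
      delete this.inFlight[stepId];
      this.pending.push(stepId);
    }
    return requeued;
  }
}
```

All of the bookkeeping lives on the orchestrator side, which is why losing a worker costs only the time already spent on the step.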

The orchestrator, by contrast, is durable. It persists run state to disk, stores step outputs in its datastores, and manages the vault. A restarted orchestrator recovers its runs; a restarted worker just reconnects and asks for more work.

Executors: local and remote

The mechanism that decides where a step runs is the executor. An executor is a named target for step execution. Swamp ships two built-in executors:

Local loopback is the default. It runs steps inside the orchestrator process itself. There is no network overhead, no serialization boundary — the step executes as if the workflow engine called the model method directly. This is what happens when no executor is specified, and it is what most single-machine workflows use.

Remote worker sends the step to a connected worker for execution. The orchestrator serializes the step's inputs, transmits them over the data plane, and waits for results to stream back. From the workflow's perspective, the step behaves identically — it produces the same outputs, writes the same data — but the compute happened elsewhere.

A remote worker can use any runtime internally; the orchestrator does not care how the worker executes the step, only that it reports results through the control socket. See the remote execution reference for the specifics.
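One way to picture the two executors is as implementations of a single interface. The sketch below is illustrative only: `Executor`, `StepFn`, and both objects are hypothetical, and a JSON round-trip stands in for the network so the example stays self-contained.

```typescript
// Hypothetical types for illustration; Swamp's executor API may differ.
type StepInputs = Record<string, unknown>;
type StepOutputs = Record<string, unknown>;
type StepFn = (inputs: StepInputs) => StepOutputs;

interface Executor {
  name: string;
  run(step: StepFn, inputs: StepInputs): StepOutputs;
}

// Local loopback: call the step directly, with no serialization boundary.
const localLoopback: Executor = {
  name: "local",
  run: (step, inputs) => step(inputs),
};

// Remote worker: inputs and outputs cross a serialization boundary.
// The JSON round-trips below simulate the network; no real transport is used.
const remoteWorker: Executor = {
  name: "remote",
  run: (step, inputs) => {
    const wireInputs = JSON.stringify(inputs); // orchestrator serializes
    const wireOutputs = JSON.stringify(step(JSON.parse(wireInputs))); // worker runs the step
    return JSON.parse(wireOutputs); // orchestrator reads the results
  },
};

// A toy step: the outputs are the same whichever executor runs it.
const double: StepFn = (inputs) => ({ value: Number(inputs["value"]) * 2 });
```

Calling `localLoopback.run(double, { value: 21 })` and `remoteWorker.run(double, { value: 21 })` yields equal outputs, which mirrors the claim that a step behaves identically wherever it runs.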

Capability proxying

The design principle behind stateless workers is that every capability lives on the orchestrator. Workers do not hold credentials. They do not cache definitions. They do not maintain their own datastores. Instead, every access that a step needs — reading from a datastore, resolving a vault secret, looking up a model definition — is proxied back through the orchestrator.

When a step running on a worker calls context.queryData(), that call does not hit a local datastore. It is serialized, sent back to the orchestrator over the data plane, executed against the orchestrator's datastore, and the result is returned to the worker. The same applies to vault resolution: when a step references a secret, the orchestrator resolves it from its vault provider and sends back the plaintext value over the encrypted connection. The worker never sees the vault configuration, never knows which provider backs it, and never persists the secret.
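The proxying described above can be sketched as a worker-side context that forwards every call to the orchestrator. Only `queryData` is named in the text; `resolveSecret`, `OrchestratorStub`, `WorkerContext`, and the request shapes are assumptions made for illustration, and a direct function call stands in for the network.

```typescript
// Hypothetical request shapes; not Swamp's actual protocol.
type ProxyRequest =
  | { op: "queryData"; query: string }
  | { op: "resolveSecret"; name: string };

// Stand-in for the orchestrator, which owns the datastore and the vault.
class OrchestratorStub {
  private datastore: Record<string, string[]>;
  private vault: Record<string, string>;

  constructor(datastore: Record<string, string[]>, vault: Record<string, string>) {
    this.datastore = datastore;
    this.vault = vault;
  }

  handle(wire: string): string {
    const req = JSON.parse(wire) as ProxyRequest;
    if (req.op === "queryData") {
      return JSON.stringify(this.datastore[req.query] ?? []);
    }
    return JSON.stringify(this.vault[req.name] ?? null);
  }
}

// Worker-side context: it holds no data, only a way to reach the orchestrator.
class WorkerContext {
  private send: (wire: string) => string;

  constructor(send: (wire: string) => string) {
    this.send = send;
  }

  queryData(query: string): string[] {
    return JSON.parse(this.send(JSON.stringify({ op: "queryData", query })));
  }

  resolveSecret(name: string): string | null {
    return JSON.parse(this.send(JSON.stringify({ op: "resolveSecret", name })));
  }
}

const orchestrator = new OrchestratorStub(
  { "recent-builds": ["b-1", "b-2"] },
  { API_TOKEN: "s3cr3t" },
);
const context = new WorkerContext((wire) => orchestrator.handle(wire));
```

In this sketch the worker object never stores the datastore contents or the vault map; each call crosses the `send` boundary and returns only the requested value.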

This proxy model has three consequences worth understanding.

Workers are disposable. Because a worker holds no state and no credentials, it can be created and destroyed freely. Autoscalers can add workers under load and remove them when idle. A compromised worker reveals nothing beyond the secrets it was actively using for in-flight steps — and those secrets are transient, resolved per-step, not cached.

The orchestrator is the security boundary. All access control decisions happen at the orchestrator. A worker cannot request a secret its step does not reference. It cannot query data outside the scope the orchestrator grants. The orchestrator mediates every interaction, which means the trust model is simple: trust the orchestrator, verify the workers through enrollment tokens.

Latency is a trade-off. Every proxied call adds a network round-trip. For steps that make many small datastore queries, the overhead can add up. This is an intentional trade-off: centralized control is favored over distributed performance. For most workloads, the round-trips are negligible compared to the step's own execution time. For latency-sensitive patterns, the security reference discusses what is and is not proxied.
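The second consequence, that a worker cannot request a secret its step does not reference, could be enforced on the orchestrator with a check like the following. This is a minimal sketch: `StepGrant`, `resolveForStep`, and the idea of a per-step list of referenced secrets are assumptions, since the text does not say how Swamp tracks scope.

```typescript
// Hypothetical scope record; how Swamp tracks referenced secrets is not specified.
interface StepGrant {
  stepId: string;
  referencedSecrets: string[];
}

// Runs on the orchestrator: resolve a secret only if the step references it.
function resolveForStep(
  grant: StepGrant,
  vault: Record<string, string>,
  requested: string,
): string {
  if (grant.referencedSecrets.indexOf(requested) === -1) {
    throw new Error(`step ${grant.stepId} does not reference secret ${requested}`);
  }
  const value = vault[requested];
  if (value === undefined) {
    throw new Error(`secret ${requested} not found`);
  }
  return value;
}

const grant: StepGrant = { stepId: "deploy", referencedSecrets: ["API_TOKEN"] };
const vault: Record<string, string> = { API_TOKEN: "s3cr3t", DB_PASSWORD: "hunter2" };
```

Here `resolveForStep(grant, vault, "API_TOKEN")` returns the value, while a request for `DB_PASSWORD` throws even though the vault contains it, because the decision is made against the step's grant rather than the worker's request.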

How the pieces connect

The orchestrator starts and listens for worker connections. Workers connect, authenticate via enrollment tokens, and advertise their availability. When a workflow reaches a step assigned to a remote executor, the orchestrator selects a connected worker, transmits the step, and waits for results.
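The connect, authenticate, and select sequence above can be sketched as a small registry. Everything here is hypothetical: `WorkerRegistry`, the token strings, and the first-available selection policy are assumptions, and the enrollment tokens reference remains the authority on how tokens are actually issued and checked.

```typescript
// Hypothetical registry; token validation and selection policy are assumptions.
class WorkerRegistry {
  private validTokens: string[];
  private available: string[] = [];

  constructor(validTokens: string[]) {
    this.validTokens = validTokens;
  }

  // A worker connects, presents its enrollment token, and advertises availability.
  connect(workerId: string, token: string): boolean {
    if (this.validTokens.indexOf(token) === -1) return false;
    this.available.push(workerId);
    return true;
  }

  // Pick a connected worker for a remote step; here, simply the first available.
  select(): string | undefined {
    return this.available.shift();
  }
}
```

A worker that presents an unknown token is never added to the available list, so `select()` can only return workers that passed enrollment.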

From the workflow author's perspective, remote execution changes where a step runs but not how it behaves. Data chaining, vault references, and output recording all work identically. The workflow does not need to know whether a step ran locally or on a machine across the network — the orchestrator handles that distinction.

The remote execution tutorial walks through setting up an orchestrator and connecting a worker. The worker commands reference covers the commands involved. The enrollment tokens reference explains how workers authenticate, and the security reference details the trust model and what the proxy boundary protects.