WORKING WITH DATA
In this tutorial, we will create two models, wire one's output into the other
using a data.latest() expression, run them through a workflow, then query and
inspect the versioned data they produce.
What we will build
We are going to set up two command/shell models. The first collects system information. The second reads the first's output and produces a summary. We will connect them in a workflow, run it, then explore what ended up in The Swamp — listing data, reading specific artifacts, querying across models, and looking at version history.
Prerequisites
- Swamp installed (Hello World covers installation)
- A terminal open in an empty directory
Initialize the repository
First, we create a fresh Swamp repo:
$ mkdir data-tutorial
$ cd data-tutorial
$ swamp repo init --tool none
You will see the Swamp banner followed by:
info repo·init Initialized swamp repository at "..." (tools: "none")
Create the first model
We create a model called system-info that runs uname -a:
$ swamp model create command/shell system-info
Created: system-info (command/shell)
Path: .../models/command/shell/....yaml
Methods:
execute - Execute the shell command and capture stdout, stderr, and exit code
Inputs:
run (string) *required
...
Now we edit the definition to set the command. Open the YAML file shown in the
Path line above — or use swamp model edit system-info — and set the run
global argument:
globalArguments:
  run: "uname -a"
Run the first model
$ swamp model method run system-info execute
You will see a progress tree, then output like:
info model·method·run·system-info·execute Executing method "execute"
info model·method·run·system-info·execute Darwin Mac.localdomain 25.3.0 ...
info model·method·run·system-info·execute Method "execute" completed on "system-info"
The model ran uname -a and stored the result. Let us look at what it produced:
$ swamp data list system-info
Data for system-info (command/shell)
file (1 item):
log v1 text/plain 146B ...
resource (1 item):
result v1 application/json 251B ...
report (2 items):
report-swamp-method-summary v1 text/markdown 471B ...
report-swamp-method-summary-json v1 application/json 2.6KB ...
Notice three kinds of data: a file (the raw log output), a resource (the
structured JSON result), and report entries (summaries generated
automatically). We will work with the result resource.
Read the data
$ swamp data get system-info result
Data: result (v1)
Model: system-info (command/shell)
Content: application/json, 251B
Lifetime: infinite | GC: 10
Tags: type=resource, specName=result, modelName=system-info
Owner: model-method (...)
Created: ...
Path: .swamp/data/command/shell/.../result/1/raw
{
  "exitCode": 0,
  "executedAt": "...",
  "command": "uname -a",
  "durationMs": 7,
  "stdout": "Darwin Mac.localdomain 25.3.0 ...",
  "stderr": ""
}
Notice the metadata header above the JSON content: it shows the version (v1),
content type, lifetime, tags, and where the file lives on disk. The JSON
underneath is the actual data the model produced.
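To make the shape of that JSON concrete, here is a minimal Python sketch of how a script might interpret a command/shell result. The payload is copied from the output above, with elided values left as "..."; the function name summarise_result is hypothetical and not part of Swamp.

```python
import json

# Payload copied from the tutorial output; "..." marks values the tutorial elides.
RAW_RESULT = """
{
  "exitCode": 0,
  "executedAt": "...",
  "command": "uname -a",
  "durationMs": 7,
  "stdout": "Darwin Mac.localdomain 25.3.0 ...",
  "stderr": ""
}
"""


def summarise_result(raw: str) -> str:
    """Return a one-line status for a command/shell result payload (hypothetical helper)."""
    result = json.loads(raw)
    # A zero exit code is the usual shell convention for success.
    status = "ok" if result["exitCode"] == 0 else f"failed ({result['exitCode']})"
    return f"{result['command']}: {status} in {result['durationMs']}ms"


print(summarise_result(RAW_RESULT))  # prints: uname -a: ok in 7ms
```

In practice the same function could be fed the contents of the raw file named in the Path line, assuming that file holds exactly the JSON shown.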
Create a model that reads another model's data
Now we create a second model that reads from system-info. This is where
data.latest() comes in:
$ swamp model create command/shell summariser
Open the new definition file (swamp model edit summariser) and set the run
global argument to a CEL expression that reads the first model's output:
globalArguments:
  run: "echo \"System: ${{ data.latest('system-info', 'result').attributes.stdout }}\""
The ${{ }} marks a CEL expression. data.latest('system-info', 'result')
fetches the latest version of system-info's result data. .attributes.stdout
reads the stdout field from it.
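As a mental model only, the lookup can be pictured as "pick the highest version of a named piece of data, then read one field". The Python sketch below imitates that with an in-memory dictionary. It is not how Swamp is implemented; the store layout, the DataVersion class, and the sample stdout values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataVersion:
    version: int
    attributes: dict


# Hypothetical in-memory store keyed by (model name, data name).
STORE = {
    ("system-info", "result"): [
        DataVersion(1, {"stdout": "first run"}),
        DataVersion(2, {"stdout": "second run"}),
    ]
}


def latest(model_name: str, data_name: str) -> DataVersion:
    """Return the entry with the highest version number (illustrative stand-in)."""
    return max(STORE[(model_name, data_name)], key=lambda v: v.version)


def render_run_argument() -> str:
    """Build the shell command the way the expression in the definition would."""
    stdout = latest("system-info", "result").attributes["stdout"]
    return f'echo "System: {stdout}"'


print(render_run_argument())  # prints: echo "System: second run"
```

The point of the sketch is the ordering: the expression is resolved to a plain string first, and only then is there a command to run.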
Now run it:
$ swamp model method run summariser execute
info model·method·run·summariser·execute Evaluating expressions
info model·method·run·summariser·execute Executing method "execute"
info model·method·run·summariser·execute System: Darwin Mac.localdomain 25.3.0 ...
info model·method·run·summariser·execute Method "execute" completed on "summariser"
Notice the "Evaluating expressions" line: Swamp resolved the data.latest()
expression before running the command. The summariser read system-info's
output from The Swamp without either model knowing about the other.
Connect them in a workflow
We have been running models one at a time. Now we wire them into a workflow so they run together:
$ swamp workflow create info-pipeline
Created: info-pipeline
Path: .../workflows/workflow-....yaml
Open the workflow file (swamp workflow edit info-pipeline) and replace its
contents with:
id: <keep the id from the generated file>
name: info-pipeline
description: Gather system info and summarise it
jobs:
  - name: main
    description: Run both models in sequence
    steps:
      - name: gather
        description: Collect system information
        task:
          type: model_method
          modelIdOrName: system-info
          methodName: execute
      - name: summarise
        description: Summarise the gathered data
        depends_on:
          - gather
        task:
          type: model_method
          modelIdOrName: summariser
          methodName: execute
Notice depends_on: [gather] on the summarise step: it waits for the gather
step to complete before starting.
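The ordering that depends_on implies can be sketched with a topological sort. The Python example below uses the standard library's graphlib to order the two steps from this workflow and to show how a cyclic dependency would be detected. It illustrates the general technique, not Swamp's scheduler; the execution_order helper is hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Each step maps to the steps it depends on, mirroring depends_on in the workflow file.
STEPS = {
    "gather": [],
    "summarise": ["gather"],
}


def execution_order(steps: dict) -> list:
    """Return step names so that every step comes after its dependencies."""
    return list(TopologicalSorter(steps).static_order())


print(execution_order(STEPS))  # prints: ['gather', 'summarise']

# A cycle (each step waiting on the other) has no valid order.
try:
    execution_order({"gather": ["summarise"], "summarise": ["gather"]})
except CycleError:
    print("cyclic step dependencies")
```

A validator that rejects cyclic dependencies is checking for this second case: there is no order in which both steps could run.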
Validate the workflow:
$ swamp workflow validate info-pipeline
Validating: info-pipeline
✓ Schema validation
✓ Unique job names
✓ Unique step names in job 'main'
✓ Valid job dependency references
✓ Valid step dependency references in job 'main'
✓ No cyclic job dependencies
✓ No cyclic step dependencies in job 'main'
✓ Step inputs for 'gather' in job 'main' (system-info.execute)
✓ Step inputs for 'summarise' in job 'main' (summariser.execute)
...
Summary: 11 passed
Result: PASSED
Now run it:
$ swamp workflow run info-pipeline
info workflow·run·info-pipeline Starting workflow
info workflow·run·info-pipeline·main Job started
info workflow·run·info-pipeline·main·gather Step started
info workflow·run·info-pipeline·main·gather Step completed
info workflow·run·info-pipeline·main·summarise Step started
info workflow·run·info-pipeline·main·summarise Step completed
info workflow·run·info-pipeline·main Job completed
info workflow·run·info-pipeline Workflow "succeeded"
Both models ran in sequence. We can see the workflow-scoped data:
$ swamp data list --workflow info-pipeline
Data for workflow info-pipeline (run ...)
file (2 items):
log v3 system-info main.gather 146B
log v2 summariser main.summarise 154B
resource (2 items):
result v3 system-info main.gather 251B
result v2 summariser main.summarise 405B
report (6 items):
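...
Notice the version numbers: result is at v3 for system-info because it has been
run three times at this point (twice manually, once through the workflow). Each
run adds a version, so if you ran system-info manually only once, as in the
steps above, you will see v2 instead.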
Query across models
So far we have looked at data one model at a time. swamp data query searches
across everything in The Swamp using CEL predicates:
$ swamp data query 'tags.type == "resource"'
┌────────┬─────────────┬──────────┬──────────┬─────────┬──────┐
│ name │ modelName │ specName │ dataType │ version │ size │
├────────┼─────────────┼──────────┼──────────┼─────────┼──────┤
│ result │ system-info │ result │ resource │ ... │ 251B │
├────────┼─────────────┼──────────┼──────────┼─────────┼──────┤
│ result │ summariser │ result │ resource │ ... │ 405B │
└────────┴─────────────┴──────────┴──────────┴─────────┴──────┘
2 results
The predicate tags.type == "resource" matched every resource across both
models. We could also query by model name, content type, or any tag.
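Conceptually, a predicate query is a filter applied to every data record. The Python sketch below mimics that with a list of dictionaries and a lambda standing in for the CEL predicate. The record layout is an assumption based on the tags shown earlier (type, specName, modelName), and the type=file tag on the log entries is a guess rather than output from Swamp.

```python
# Hypothetical records; only the "resource" tags are confirmed by the tutorial output.
RECORDS = [
    {"name": "log", "tags": {"type": "file", "modelName": "system-info"}},
    {"name": "result", "tags": {"type": "resource", "modelName": "system-info"}},
    {"name": "log", "tags": {"type": "file", "modelName": "summariser"}},
    {"name": "result", "tags": {"type": "resource", "modelName": "summariser"}},
]


def query(records: list, predicate) -> list:
    """Return every record for which the predicate holds."""
    return [record for record in records if predicate(record)]


# Stand-in for the CEL predicate: tags.type == "resource"
resources = query(RECORDS, lambda r: r["tags"]["type"] == "resource")
print([r["tags"]["modelName"] for r in resources])  # prints: ['system-info', 'summariser']
```

Narrowing the query to one model is then just a second condition in the predicate, for example also requiring r["tags"]["modelName"] == "summariser".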
View version history
Every time a model runs, it creates a new version of its data. Let us see system-info's version history:
$ swamp data versions system-info result
You will see output like:
{
  "dataName": "result",
  "modelName": "system-info",
  "versions": [
    {
      "version": 3,
      "createdAt": "...",
      "size": 251,
      "isLatest": true
    },
    {
      "version": 2,
      "createdAt": "...",
      "size": 251,
      "isLatest": false
    },
    {
      "version": 1,
      "createdAt": "...",
      "size": 251,
      "isLatest": false
    }
  ],
  "total": 3
}
Three versions, one for each time system-info ran. data.latest() always
reads the most recent version. Older versions stay available until garbage
collection removes them.
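If we wanted to process this history in a script, the JSON is straightforward to walk. The Python sketch below parses a copy of the output above (timestamps left as "...") and separates the latest version from the older ones. The helper names are hypothetical.

```python
import json

# Copied from the tutorial output; "..." marks the elided timestamps.
HISTORY_JSON = """
{
  "dataName": "result",
  "modelName": "system-info",
  "versions": [
    {"version": 3, "createdAt": "...", "size": 251, "isLatest": true},
    {"version": 2, "createdAt": "...", "size": 251, "isLatest": false},
    {"version": 1, "createdAt": "...", "size": 251, "isLatest": false}
  ],
  "total": 3
}
"""


def latest_version(history: dict) -> int:
    """Return the version number flagged as latest."""
    for entry in history["versions"]:
        if entry["isLatest"]:
            return entry["version"]
    raise ValueError("no version is marked as latest")


def older_versions(history: dict) -> list:
    """Return the remaining version numbers in ascending order."""
    return sorted(e["version"] for e in history["versions"] if not e["isLatest"])


history = json.loads(HISTORY_JSON)
print(latest_version(history))  # prints: 3
print(older_versions(history))  # prints: [1, 2]
```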
What we built
We created two models, connected them with a data.latest() expression so one
reads the other's output, ran them through a workflow, and explored the data
they produced — listing artifacts, reading content, querying across models with
CEL predicates, and inspecting version history.