01 Issue
Feature · Open · Swamp CLI
Assignees: None

#637 Expose per-run memory/CPU metrics for method & workflow executions

Opened by john · 6/12/2026

Problem statement

There is no visibility into a run's resource consumption. A long, high-fan-out method can climb to the V8 heap limit and crash with an out-of-memory (OOM) error without warning. Operators cannot see RSS or heap growth, peak memory, or CPU per method or subprocess, so the cost of a fan-out is invisible until the process dies.

This is the direct lesson from bug #636: an 11.5-hour sync_from_s3 run marched steadily to ~4GB and OOM-crashed, and there was no per-run signal to catch it early. Throughout debugging we were reduced to externally counting bucket objects (which itself got rate-limited) because the run exposed no memory/CPU telemetry.

Proposed solution

Sample and record per-run (and per-subprocess) resource metrics:

  • Memory: peak RSS, plus current and peak V8 heap used.
  • CPU: CPU time and utilization per method and per subprocess.
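
As a rough illustration of the metrics listed above, the sketch below folds periodic samples into running peaks and a CPU utilization figure. The type and class names (ResourceSample, RunMetrics, RunMetricsTracker) are hypothetical and not part of Swamp. It assumes a Node-compatible runtime, where a sample could be filled from process.memoryUsage() (rss, heapUsed) and process.cpuUsage() (user + system, in microseconds); the tracker itself takes plain numbers so it stays independent of the runtime.

```typescript
// Hypothetical sample shape; field names are illustrative only.
interface ResourceSample {
  atMs: number;          // wall-clock time of the sample, in milliseconds
  rssBytes: number;      // resident set size
  heapUsedBytes: number; // V8 heap used
  cpuMicros: number;     // cumulative user + system CPU time, in microseconds
}

interface RunMetrics {
  peakRssBytes: number;
  heapUsedBytes: number;     // most recent value
  peakHeapUsedBytes: number;
  cpuMicros: number;         // CPU time consumed since the first sample
  cpuUtilization: number;    // CPU time divided by wall time since the first sample
}

class RunMetricsTracker {
  private first: ResourceSample | undefined;
  private metrics: RunMetrics = {
    peakRssBytes: 0,
    heapUsedBytes: 0,
    peakHeapUsedBytes: 0,
    cpuMicros: 0,
    cpuUtilization: 0,
  };

  // Fold one sample into the running metrics and return the updated view.
  record(sample: ResourceSample): RunMetrics {
    const first = this.first ?? sample;
    this.first = first;
    const wallMicros = (sample.atMs - first.atMs) * 1000;
    const cpuMicros = sample.cpuMicros - first.cpuMicros;
    this.metrics = {
      peakRssBytes: Math.max(this.metrics.peakRssBytes, sample.rssBytes),
      heapUsedBytes: sample.heapUsedBytes,
      peakHeapUsedBytes: Math.max(this.metrics.peakHeapUsedBytes, sample.heapUsedBytes),
      cpuMicros,
      cpuUtilization: wallMicros > 0 ? cpuMicros / wallMicros : 0,
    };
    return this.metrics;
  }
}

// Usage with made-up numbers: two samples taken one second apart.
const tracker = new RunMetricsTracker();
tracker.record({ atMs: 0, rssBytes: 300, heapUsedBytes: 120, cpuMicros: 0 });
console.log(tracker.record({ atMs: 1000, rssBytes: 250, heapUsedBytes: 150, cpuMicros: 400000 }));
```

One tracker per method or subprocess would give the per-unit attribution asked for here; how subprocess samples get collected is left open.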

Surface them in two places:

  1. Live in progress output (e.g. alongside the existing periodic Sync progress heartbeat lines), so growth is visible during the run.
  2. Persisted in the run record, so completed/failed runs retain their peak-memory and CPU footprint for later inspection and capacity planning.
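
A minimal sketch of the two surfaces described above, assuming a metrics snapshot is already available. The key names in the progress suffix and the run-record field names are hypothetical; the actual heartbeat format and run-record schema are not specified in this issue.

```typescript
// Hypothetical metrics snapshot; not an existing Swamp schema.
interface MetricsSnapshot {
  peakRssBytes: number;
  heapUsedBytes: number;
  peakHeapUsedBytes: number;
  cpuMicros: number; // CPU time consumed by the run, in microseconds
}

function toMiB(bytes: number): string {
  return `${(bytes / (1024 * 1024)).toFixed(1)} MiB`;
}

// Suffix that could be appended to a periodic progress line.
function formatMetricsSuffix(m: MetricsSnapshot): string {
  const cpuSeconds = (m.cpuMicros / 1000000).toFixed(1);
  return [
    `rss_peak=${toMiB(m.peakRssBytes)}`,
    `heap=${toMiB(m.heapUsedBytes)}`,
    `heap_peak=${toMiB(m.peakHeapUsedBytes)}`,
    `cpu=${cpuSeconds}s`,
  ].join(" ");
}

// Fields that could be merged into a persisted run record on completion or failure.
function toRunRecordFields(m: MetricsSnapshot): Record<string, number> {
  return {
    peakRssBytes: m.peakRssBytes,
    peakHeapUsedBytes: m.peakHeapUsedBytes,
    cpuSeconds: m.cpuMicros / 1000000,
  };
}

// Usage with made-up numbers.
const snapshot: MetricsSnapshot = {
  peakRssBytes: 2 * 1024 * 1024 * 1024,
  heapUsedBytes: 512 * 1024 * 1024,
  peakHeapUsedBytes: 1024 * 1024 * 1024,
  cpuMicros: 90500000,
};
console.log(formatMetricsSuffix(snapshot));
console.log(JSON.stringify(toRunRecordFields(snapshot)));
```

Storing raw bytes and seconds in the record, and formatting only for display, keeps the persisted values usable for later capacity planning.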

Optionally add a configurable soft threshold that emits a warning as a run approaches the hard heap limit (pairs with the heap-configurability ask in #636).
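
The soft threshold could look roughly like the sketch below. The function name makeHeapWatcher and the default ratio of 0.8 are assumptions made for illustration. In a Node-compatible runtime the hard limit could be read from v8.getHeapStatistics().heap_size_limit and the current usage from process.memoryUsage().heapUsed; here both are passed in as plain numbers.

```typescript
// Hypothetical warning payload; field names are illustrative only.
interface HeapWarning {
  usedBytes: number;
  limitBytes: number;
  ratio: number; // usedBytes / limitBytes at the time of the warning
}

// Returns a checker that yields a warning the first time usage reaches the
// soft threshold, and null on every other call (one warning per run).
function makeHeapWatcher(
  limitBytes: number,
  softRatio: number = 0.8, // assumed default, meant to be configurable
): (heapUsedBytes: number) => HeapWarning | null {
  if (!(limitBytes > 0)) {
    throw new RangeError("limitBytes must be positive");
  }
  if (!(softRatio > 0 && softRatio <= 1)) {
    throw new RangeError("softRatio must be in (0, 1]");
  }
  let warned = false;
  return (heapUsedBytes: number): HeapWarning | null => {
    const ratio = heapUsedBytes / limitBytes;
    if (warned || ratio < softRatio) {
      return null;
    }
    warned = true;
    return { usedBytes: heapUsedBytes, limitBytes, ratio };
  };
}

// Usage with made-up numbers: a 4000-byte "limit" and the default threshold.
const checkHeap = makeHeapWatcher(4000);
for (const used of [1000, 2500, 3600, 3900]) {
  const warning = checkHeap(used);
  if (warning !== null) {
    console.warn(`heap at ${(warning.ratio * 100).toFixed(0)}% of limit`);
  }
}
```

Warning only once keeps the heartbeat output quiet; a variant that re-arms after usage falls back below the threshold is equally plausible.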

Alternatives considered

  • External OS-level monitoring (e.g. watching the process RSS from outside): does not attribute usage to a specific method/run/subprocess, cannot distinguish V8 heap from RSS, and is impractical for ad-hoc or CI runs.
  • Status quo (no metrics): fan-out cost stays invisible until OOM; this is what bug #636 demonstrates.
02 Bog Flow
OPEN · TRIAGED · IN PROGRESS · SHIPPED

Open

6/12/2026, 9:24:05 AM

No activity in this phase yet.
