Runtimes and nodes
TL;DR
- A runtime is a compute reservation a deployment runs in: a pool of machines the workspace is entitled to deploy onto. It carries the runtime contract — runtime class, capacity, lifetime, per-deployment ceiling — that a deployment inherits.
- A node is one machine inside a runtime. The platform schedules containers across nodes automatically; users do not select nodes by hand.
- A flavor is the sized template a node was provisioned from (CPU / GPU / RAM / disk). The flavor catalog lists what sizes exist and what is in stock.
- Runtimes come in two shapes: cloud-backed (the platform provisions nodes from a flavor SKU on demand) and static (the runtime is attached to pre-registered customer hardware — on-premise or air-gapped).
- A runtime's runtime contract — whether it's locked, its node count, its expiration, its per-deployment ceiling, and its runtime class — is what makes runtime selection a deliberate match against the workload rather than a blind deploy.
What runtimes and nodes are
A runtime is the unit a deployment targets, and a node is a machine inside it. Pipelogic models the two separately so that the choice a user makes (which runtime) is decoupled from the placement the platform performs (which node). A single "where does my workload run" concept would fold together details that matter to the user — GPU class, per-instance memory, tenancy model, whether the workload runs on managed cloud or on the customer's own hardware — and details the scheduler should own.
The runtime is what users select: a named pool with its own lock state, runtime class, lifetime, and entitlement. The node is what runs inside it: a machine the platform schedules containers onto. The split lets users reason about the unit they care about (runtime) without reasoning about the unit they do not (a specific machine), and lets the scheduler place containers without surfacing placement as a user choice.
The runtime class on a runtime is the load-bearing detail for many workflows. shared runtimes host multiple tenants and are the default for development and small workloads. dedicated runtimes pin compute to one workspace, for when shared-pool placement is too uncertain. on_premise and air_gapped runtimes host nodes the customer enrolled — the workspace's own hardware, inside the customer's network, attached to the platform without compute leaving the customer's perimeter. The same backend deploys onto any of these classes with no code changes, because backends are decoupled from runtime.
Mental model
Workspace
│
│ owns
▼
Runtime (cloud-backed SKU or static set of nodes)
│
│ contains
▼
Node (a machine: VM / bare metal / cloud instance)
│
│ runs
▼
container (one vertex of the deployed backend graph)
The user selects the runtime; the platform selects the node. A deployment is the pairing of one backend with one runtime, and the scheduler places the deployment's containers across whichever nodes are healthy and have headroom. From the user's side, "where does my workload run" is answered by naming a runtime; the platform handles the rest.
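The ownership chain above can be sketched as a small data model. This is an illustrative toy, not the platform's schema: the class names, the `place` helper, the container names, and the least-loaded placement rule are all assumptions made for the example. The real scheduler weighs lock state, capabilities, load, and resource needs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One machine inside a runtime (hypothetical model)."""
    name: str
    containers: List[str] = field(default_factory=list)


@dataclass
class Runtime:
    """The unit a user selects; it contains nodes."""
    name: str
    nodes: List[Node] = field(default_factory=list)


def place(runtime: Runtime, container: str) -> Node:
    """Stand-in scheduler: put the container on the least-loaded node.

    The caller names only the runtime; the node is chosen here.
    """
    if not runtime.nodes:
        raise ValueError(f"runtime {runtime.name!r} has no nodes")
    node = min(runtime.nodes, key=lambda n: len(n.containers))
    node.containers.append(container)
    return node


staging = Runtime("staging", [Node("node-a"), Node("node-b")])
placements = {c: place(staging, c).name for c in ["decode", "detect", "encode"]}
print(placements)
```

The caller never passes a node: placement is a return value, not an argument, which mirrors the split between what the user selects and what the platform selects.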
See Deployments for the runtime pairing on top of this fabric.
Cloud-backed versus static runtimes
Use this framing whenever the question is "where should this workload's compute come from".
A cloud-backed runtime is the default shape. The runtime is identified by a flavor SKU; the platform provisions nodes from that SKU at deploy time and sizes the compute to match the workload. The runtime appears in the workspace's listing with its current node count and its per-deployment ceiling, and deployments land on it with no prior provisioning step.
A static runtime is the customer-hardware shape. The runtime is attached to pre-registered nodes: machines the customer enrolled into their workspace, typically inside their own network, often air-gapped from the public internet. The runtime class (on_premise or air_gapped) is inherited from the nodes. Static runtimes are for workloads that must run on the customer's own hardware for compliance, latency, or data-locality reasons. The same backend runs on the managed cloud and on the customer's hardware, on the same platform, with no code changes between them.
The enrollment surface for static runtimes is available on plans that include them; the relevant ppl node enroll … commands appear when the entitlement is present. Pinning dedicated cloud capacity follows the same pattern — that surface appears when the plan permits it. Browsing cloud flavors with ppl node flavors is universally available; the live catalog lists what is in stock and which runtime classes each flavor maps to.
Picking a runtime
Use this framing every time you reach for ppl backend deploy.
Runtime selection is a one-shot list-and-filter. The workspace sees a set of runtimes; each one advertises whether it is locked (a locked runtime takes no new deployments), its node count (whether headroom exists), its deployment_count (current usage against capacity), its runtime class (to match the workload's compliance and performance needs), and its expires_at (when the runtime itself will be reclaimed, for time-limited runtimes).
ppl runtime list
ppl runtime list --query=staging
If the only available runtime is full or its runtime class does not match the workload, the deploy is rejected before any container starts. This is intentional: the platform does not place a deployment into a runtime that cannot host it, so "the deploy succeeded" also means "the runtime expectation was met". Accepting the deploy and letting it fail at runtime would surface the failure too late to be useful.
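The list-and-filter step can be sketched in a few lines. This is a minimal model under stated assumptions, not the platform's picker: the `RuntimeInfo` record, the `capacity` field, and the exact rejection rules are hypothetical, chosen to mirror the attributes a runtime advertises in the listing.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class RuntimeInfo:
    """Hypothetical view of one row in the runtime listing."""
    name: str
    locked: bool
    node_count: int
    deployment_count: int
    capacity: int  # assumed: maximum concurrent deployments
    runtime_class: str
    expires_at: Optional[datetime] = None


def deployable(rt: RuntimeInfo, wanted_class: str, now: datetime) -> bool:
    """Return True only if the runtime could host a new deployment."""
    if rt.locked or rt.node_count < 1:
        return False
    if rt.deployment_count >= rt.capacity:
        return False
    if rt.runtime_class != wanted_class:
        return False
    # A runtime that has already reached expires_at is not a candidate.
    return rt.expires_at is None or rt.expires_at > now


def pick(runtimes: List[RuntimeInfo], wanted_class: str, now: datetime) -> RuntimeInfo:
    """Reject up front rather than accept a deploy that cannot be hosted."""
    for rt in runtimes:
        if deployable(rt, wanted_class, now):
            return rt
    raise LookupError("no deployable runtime available")


now = datetime.now(timezone.utc)
runtimes = [
    RuntimeInfo("shared-dev", True, 2, 0, 4, "shared"),      # locked
    RuntimeInfo("shared-full", False, 2, 4, 4, "shared"),    # at capacity
    RuntimeInfo("dedicated-a", False, 3, 1, 8, "dedicated"),
]
print(pick(runtimes, "dedicated", now).name)
```

Raising before anything is placed is the point of the sketch: a successful return means every advertised expectation was checked first.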
The deployment pairing is then a single call: ppl backend deploy --backend <bid> --runtime <cid>. For single-backend deploys, --runtime is optional: when omitted, the workspace's configured default runtime takes precedence, and the platform's picker selects a deployable runtime otherwise. Use the explicit form when the workload needs a specific runtime, when batching multiple backends (where --runtime is required), or when the default does not fit the run at hand.
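The resolution order for --runtime can be written out as a short function. This is one reading of the rules above, not the CLI's implementation: the function name, its parameters, and the error text are hypothetical.

```python
from typing import Callable, Optional


def resolve_runtime(
    explicit: Optional[str],
    workspace_default: Optional[str],
    picker: Callable[[], str],
    batch: bool = False,
) -> str:
    """Resolve the target runtime: explicit, then default, then picker."""
    if explicit is not None:
        return explicit
    if batch:
        # Batching multiple backends requires an explicit runtime.
        raise ValueError("--runtime is required when batching multiple backends")
    if workspace_default is not None:
        return workspace_default
    return picker()


def fallback_picker() -> str:
    """Placeholder for the platform's picker."""
    return "rt-picked"


print(resolve_runtime(None, "rt-default", fallback_picker))
```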
See Deploy and monitor for the operational loop after the runtime is chosen.
Why nodes are scheduler-managed
Use this framing whenever you wonder "can I pick which node runs this".
Nodes are visible (ppl node list returns the list) but they are not selectable for deployment placement. The scheduler decides where each container lands based on whether the node is locked, its capabilities, its current load, and the deployment's resource needs. That decision is not user-controllable, for two reasons.
First, placement is an optimization problem the scheduler resolves with information the user does not have. A workload with two containers might be co-located on one node (for shared-memory transport) or spread across nodes (for fault tolerance) depending on details that vary per run. Pinning containers to specific nodes degrades placement on average and adds operational work for no benefit.
Second, node identity is ephemeral. Cloud-backed nodes are rotated, drained for maintenance, and replaced when their flavor catalog changes. Pinning a deployment to a specific node would expose those operations, which are meant to be invisible. The user's contract is with the runtime, which has a stable identity and runtime class; the platform manages the nodes underneath it.
Observability into placement is still exposed. ppl container list --deployment <did> shows which node each container landed on, so the team can correlate container state with node state and tell whether an issue is node-local (one node misbehaving) or systemic (every node showing the same problem).
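The node-local versus systemic distinction can be sketched as a grouping over container placements. The tuple shape, the labels, and the thresholds below are assumptions for illustration; the real output of the container listing is not modelled here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def diagnose(containers: List[Tuple[str, str, bool]]) -> Tuple[str, List[str]]:
    """Classify failures from (container_id, node, healthy) records.

    Returns a label plus the sorted list of nodes hosting an unhealthy
    container. With a single node the two cases cannot be told apart.
    """
    by_node: Dict[str, List[bool]] = defaultdict(list)
    for _container_id, node, healthy in containers:
        by_node[node].append(healthy)

    bad_nodes = sorted(n for n, states in by_node.items() if not all(states))
    if not bad_nodes:
        return ("healthy", [])
    if len(by_node) == 1:
        return ("inconclusive", bad_nodes)
    if len(bad_nodes) == len(by_node):
        return ("systemic", bad_nodes)
    return ("node-local", bad_nodes)


observed = [
    ("c1", "node-a", True),
    ("c2", "node-b", False),
    ("c3", "node-b", False),
]
print(diagnose(observed))
```

A result of node-local points at the listed machines; systemic suggests looking at the backend or the runtime as a whole instead.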
Runtime lifecycle and operations
A runtime has its own lifecycle, separate from any deployment that runs on it. A runtime accepts deployments while it is unlocked and has a healthy node; an operator can lock it (frozen for maintenance, no new deployments accepted); and it is reclaimed at its expires_at. Cloud-backed runtimes expire on the platform's schedule; static runtimes expire when the customer detaches their nodes.
Each deployment that lands on a runtime has its own ceiling — the deployment_timeout advertised on the runtime. By default deployments auto-extend up to that ceiling; passing --fixed-duration <dur> at deploy time pins a hard end and disables auto-extension. Pin a fixed duration when the workload has a known wall-clock budget (a demo window, a scheduled batch); leave auto-extend in place for deployments that should run as long as the runtime does.
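The two modes can be compared with a little date arithmetic. This sketch assumes the ceiling is measured from the deployment's start and that a fixed duration is used as given; whether a fixed duration may exceed the runtime's ceiling is not stated above, so the function does not enforce it. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def latest_end(
    started_at: datetime,
    deployment_timeout: timedelta,
    fixed_duration: Optional[timedelta] = None,
) -> datetime:
    """Return the latest moment the deployment can still be running."""
    if fixed_duration is not None:
        # Hard end: auto-extension is disabled.
        return started_at + fixed_duration
    # Auto-extend: the deployment can run up to the runtime's ceiling.
    return started_at + deployment_timeout


start = datetime(2024, 1, 1, 12, 0)
ceiling = timedelta(hours=8)
print(latest_end(start, ceiling))
print(latest_end(start, ceiling, fixed_duration=timedelta(hours=2)))
```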
Common failure shapes
Most runtime-related failures cluster into a small set of shapes:
- no deployable runtime available: every reachable runtime is locked, full, or has no healthy node. The fix is to pick a different runtime with headroom or to free capacity.
- runtime locked: the runtime is frozen for maintenance. The fix is to wait for it to unfreeze or to use a different runtime for the run.
- Reserved runtime not paid or reservation expired: a dedicated reserved runtime's window has lapsed. The fix is to renew the reservation or to deploy on a different runtime.
- Deployment expired before the run finished: the runtime's deployment_timeout is shorter than the run needed. The fix is to pin --fixed-duration to a longer window, or to deploy on a runtime with a longer ceiling.
- unknown flavor from deploy: the flavor name is not in the current catalog (cloud-backed runtimes only). The fix is to resolve a current flavor with ppl node flavors and pick one that exists.
Broader failure patterns live in Common failures.
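For teams that wrap deploys in scripts, the quoted error strings can be turned into a small symptom-to-fix lookup. This is a sketch only: the substring matching and the `suggest_fix` helper are assumptions, and the exact wording of real error output may differ from the strings listed above.

```python
from typing import Optional

# Symptom substrings taken from the failure shapes above, mapped to the fix.
FIXES = {
    "no deployable runtime available":
        "pick a different runtime with headroom or free capacity",
    "runtime locked":
        "wait for the runtime to unfreeze or use a different runtime",
    "unknown flavor":
        "resolve a current flavor with `ppl node flavors` and pick one that exists",
}


def suggest_fix(message: str) -> Optional[str]:
    """Return the documented fix for a recognised symptom, else None."""
    lowered = message.lower()
    for symptom, fix in FIXES.items():
        if symptom in lowered:
            return fix
    return None


print(suggest_fix("deploy failed: runtime locked"))
```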
Where this fits
Runtimes and nodes are the runtime layer of the platform. They sit underneath deployments — every deployment lands on exactly one runtime, every container in the deployment lands on exactly one node — and they expose enough contract for the team to reason about capacity, compliance, and placement without exposing the scheduling internals the platform owns.
The split between runtime (user-visible runtime contract) and node (platform-scheduled compute) is what lets the same backend deploy with no code changes onto a shared cloud runtime and onto a customer-enrolled static runtime. That portability follows from keeping backends decoupled from runtime.
Related
- Deployments — the backend-plus-runtime pairing on top of this fabric.
- Backends — the typed graph that gets deployed.
- Leases — runtimes are also what leases target.
- Deploy and monitor — the operational loop after the runtime is chosen.
- Common failures — symptom → fix lookup.