Leases

TL;DR

  • A lease is a database-transaction analogue for cloud resources. Open a lease, stamp every resource you create with --lease <id>, then close the lease with commit (keep everything) or rollback (discard everything).
  • Two states per resource, period: workspace-owned (permanent) or lease-owned (temporary). A workspace-owned row can never enter a lease; the only entry path is --lease <id> at create time.
  • The TTL is the safety net. If the closing call never happens, the platform auto-rolls-back the lease at expiry, so abandoned runs do not leak compute.
  • Leases are the agent sandbox: an LLM (or a human) experiments freely inside a lease — graph shapes, parameter swaps, deploy and tear down — while workspace-owned state stays untouched and cleanup happens automatically.
  • Leases double as a predictable batch surface: spin up a backend that processes a dataset, capture the generated artifacts, promote the ones worth keeping, and roll back the rest. The pinned inputs, pinned runtime, and addressable outputs make the batch reproducible.
  • Most users never touch the verbs directly. ppl lease run --test=<manifest> orchestrates the full lifecycle for the common case (live test from a manifest).

What a lease actually is

A lease scopes the resources a short-lived run creates — a candidate prerelease, an uploaded fixture, a backend wrapping them, a deployment on a runtime — so that all of it can be discarded together when the run finishes, with no leaked compute, storage, or orphaned graph rows.

The alternative is to track each created resource and delete them one by one at the end of the run, which makes cleanup the last step. A crash mid-run, a CI runner shutting down, or an agent ending its run before that step executes leaves cleanup half-done, and the leftover resource is the one that keeps consuming compute.

The lease makes the cleanup boundary a first-class object. The lease is a scope: open it, attach resources at the moment they come into existence by stamping --lease <id> on their create call, close it with one decision. Commit makes everything in the lease permanent and workspace-owned. Rollback removes everything in the lease as a single operation. If the close never happens, the TTL closes it for you, with rollback as the default, because keeping unknown state around is the dangerous option.

A lease has the same shape as a database transaction: the commit-or-rollback choice matches COMMIT/ROLLBACK in SQL, the rollback on TTL expiry matches the abort a SQL database performs when a connection drops, per-row promotion resembles a savepoint-style partial commit, and there is no implicit attach because the only entry path is --lease <id> at create time. The transaction semantics carry over directly to cloud resources.
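For readers who want the SQL side of the analogy made concrete, here is a minimal sketch using Python's standard sqlite3 module. It only shows the database behaviour the lease is modelled on; the resources table and the row names are hypothetical, and nothing here calls the platform.

```python
import sqlite3

# Autocommit mode, so BEGIN / COMMIT / ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE resources (name TEXT)")


def count_rows():
    return conn.execute("SELECT COUNT(*) FROM resources").fetchone()[0]


# Run 1: rows inserted inside the transaction vanish on ROLLBACK.
conn.execute("BEGIN")
conn.execute("INSERT INTO resources VALUES ('ephemeral-backend')")
conn.execute("INSERT INTO resources VALUES ('ephemeral-deployment')")
conn.execute("ROLLBACK")
after_rollback = count_rows()

# Run 2: a row inserted inside the transaction becomes permanent on COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO resources VALUES ('kept-artifact')")
conn.execute("COMMIT")
after_commit = count_rows()

print(after_rollback, after_commit)  # 0 1
```

The lease plays the role of the BEGIN ... COMMIT/ROLLBACK bracket, and --lease <id> plays the role of running the INSERT inside that bracket.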

Mental model

   ppl lease create   ── BEGIN

       │  ppl lease run / ppl file link --lease <lid>
       │       ── INSERT, lease-owned
       │       (version, fixture uploads, ephemeral
       │        backend + deployment, all stamped with
       │        the lease id at the moment they exist)

       ├── ppl lease commit   <lid>   ── COMMIT
       │       → every owned row becomes workspace-owned,
       │         lease closes

       ├── ppl lease rollback <lid>   ── ROLLBACK
       │       → every owned row deleted, lease closes

       └── TTL elapses without close  ── ABORT-on-disconnect
               → auto-rollback, lease closes

The contract is short: a row is stamped with the lease id at the moment it is created, and that is the only way a row enters a lease. There is no "attach this existing row to a lease later" operation, and that absence is deliberate. A row is either in the lease from the moment it exists, or it is workspace-owned forever. There is no third state, no race condition, and no half-stamped row to reason about.
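The contract above can be sketched as a toy in-memory model. This is an illustration only, not the platform's implementation: ToyPlatform, its method names, and the id format are all hypothetical. The point it shows is that ownership is fixed inside create and that the model has no method for attaching an existing row to a lease.

```python
import itertools


class ToyPlatform:
    """In-memory stand-in for the platform (hypothetical, for illustration)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.open_leases = set()
        self.owner = {}  # resource id -> lease id, or None if workspace-owned

    def lease_create(self):
        lid = f"lease-{next(self._ids)}"
        self.open_leases.add(lid)
        return lid

    def create(self, kind, lease=None):
        # Ownership is decided here, at create time, and nowhere else.
        if lease is not None and lease not in self.open_leases:
            raise ValueError(f"{lease} is not an open lease")
        rid = f"{kind}-{next(self._ids)}"
        self.owner[rid] = lease
        return rid

    def lease_commit(self, lid):
        self.open_leases.remove(lid)
        for rid in [r for r, o in self.owner.items() if o == lid]:
            self.owner[rid] = None  # now workspace-owned

    def lease_rollback(self, lid):
        self.open_leases.remove(lid)
        self.owner = {r: o for r, o in self.owner.items() if o != lid}


p = ToyPlatform()
permanent = p.create("fixture")            # no lease: workspace-owned
lid = p.lease_create()
p.create("backend", lease=lid)             # stamped at create time
p.create("deployment", lease=lid)
p.lease_rollback(lid)
remaining = sorted(p.owner)
print(remaining)  # ['fixture-1']
```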

See The lease lifecycle for the operational walkthrough and Deployments for the deployment that a lease typically holds.

Two patterns leases unlock

Use this framing to understand why leases exist as a first-class primitive rather than a CLI cleanup script.

The first pattern is predictable batch computation with captured outputs. A common shape is: feed a dataset through a backend graph, let it write derived artifacts (a vector index, a feature table, a set of cropped images, a model checkpoint), keep the artifacts, discard everything else. Without leases that flow leaks: the test backend, the test deployment, and the temporary fixtures all stay behind, and locating them later means searching through workspace state by hand. With leases the whole apparatus is one stamped scope: deploy inside the lease, run the batch, capture the outputs by name, promote the kept files out of the lease, and roll back the rest. Because the inputs and the runtime are pinned and the outputs are addressable, the batch is reproducible, and the lease keeps its boundary unambiguous.

The second pattern is the agent sandbox. The CLI agent surface has destructive operations removed — there is no delete-everything verb in agent mode — which leaves the agent with a concrete need: iterate on a graph design, try parameter combinations, deploy a candidate, discard it, and start over, without leaving orphaned backends and deployments behind. The lease covers that need. The agent opens a lease, works inside it, and rolls back when finished. If the agent ends its run before closing the lease, the TTL closes it. The destructive operation the agent surface does not expose is the one the lease provides safely, scoped to exactly the resources the agent created during this run. Workspace-owned state stays outside the lease's reach; the lease is the scoped sandbox around it.

The lease design exists because of these two patterns. The test-harness convenience is real but secondary; the underlying contract is to give every ephemeral run a transaction boundary so cleanup is automatic and outputs are first-class.

The four lifecycle verbs

Use this as the catalog of "what the lease itself can do".

The lease lifecycle has exactly four verbs, and they map directly to a database transaction.

ppl lease create opens the lease. It takes the runtime the eventual deployment will land on, an optional TTL, and an optional label, and it returns the lease ID. Subsequent create calls pass --lease <id> so that the resources they create belong to the lease.

ppl lease promote <lid> --kind <k> --id <id> is the savepoint-style per-row commit. It takes a single lease-owned resource and makes it workspace-owned immediately, so it survives whatever happens to the rest of the lease. This is the right shape for keeping one artifact while the rest of the lease stays throwaway.

ppl lease commit <lid> closes the lease keeping everything: every still-owned row becomes workspace-owned in one atomic operation.

ppl lease rollback <lid> closes the lease discarding everything: every still-owned row is destroyed in one atomic operation. Anything promoted earlier survives because it is already workspace-owned.
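The interaction between promote and rollback can be sketched with two sets. This is a toy model under stated assumptions, not the platform's code: the row names and the promote and rollback functions are hypothetical stand-ins for the verbs described above.

```python
workspace = set()
lease_rows = {"vector-index", "test-backend", "test-deployment"}


def promote(rid):
    # Per-row commit: the row leaves the lease and is workspace-owned now.
    lease_rows.remove(rid)
    workspace.add(rid)


def rollback():
    # Close discarding everything that is still lease-owned.
    destroyed = sorted(lease_rows)
    lease_rows.clear()
    return destroyed


promote("vector-index")
destroyed = rollback()
print(sorted(workspace), destroyed)
# ['vector-index'] ['test-backend', 'test-deployment']
```

Because the promoted row is no longer in the lease's set when rollback runs, rollback cannot reach it; that is the whole mechanism behind keeping one artifact from an otherwise throwaway lease.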

The complementary operational verbs are narrower in scope. wait streams the deployment lifecycle and exits with a meaningful code (0 for ready, 1 for failed or timeout). debug returns a structured snapshot of the lease's backend, deployment, and per-container state, and is the first call to make when a lease fails. set-result records a structured result payload that survives the close, for post-mortem. collect-outputs tears down the deployment with file-save semantics and returns the captured file_ids. list and delete are housekeeping.
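A script that drives these verbs might branch on the wait exit code as in the sketch below. The argument shapes are an assumption: this page does not show how wait and debug are invoked, so the sketch guesses that both live under ppl lease and take the lease ID as a positional argument. The wait_then_debug helper and its run parameter are hypothetical.

```python
import subprocess


def wait_then_debug(lease_id, run=subprocess.run):
    """Return True if the lease's deployment became ready.

    Assumption: `ppl lease wait <lid>` and `ppl lease debug <lid>` accept
    the lease id positionally. Verify against the CLI help before use.
    """
    waited = run(["ppl", "lease", "wait", lease_id])
    if waited.returncode == 0:  # 0 = ready
        return True
    # 1 = failed or timeout: debug is the first call to make.
    run(["ppl", "lease", "debug", lease_id])
    return False
```

The run parameter exists so the function can be exercised with a fake runner in tests, without the CLI installed.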

See The lease lifecycle for the verb-level walkthrough.

States per resource

A resource on the platform has exactly two ownership states, and the lease design depends on that simplicity.

Workspace-owned is the permanent state. The resource was created without --lease, was created with --lease and then promoted, or was created with --lease in a lease that was later committed. It belongs to the workspace until something explicitly deletes it through the destructive-operations contract.

Lease-owned is the temporary state. The resource was created with --lease <id> and is still inside that open lease. When the lease closes, the resource either becomes workspace-owned (commit) or is destroyed (rollback). The TTL bounds the lease-owned state so it cannot persist indefinitely; abandoned leases get rolled back automatically.

There is no third state. There is no "lease-attached-but-retain" half-state. There is no adopt verb that pulls workspace-owned rows into a lease after the fact. The minimalism is the point — it makes "is this resource ephemeral or permanent" a one-bit question with an unambiguous answer at every point in time.

TTL as the safety net, explicit close as the default

A lease ends in one of two ways. The TTL is the passive close: the platform automatically rolls back any lease whose TTL has passed. The explicit commit or rollback is the active close: it applies the keep-or-discard decision immediately rather than waiting for the clock.

The right default is the explicit close. It makes the decision deliberate, it keeps the lease list small, and it keeps an accurate record of why each row exists. The TTL covers the runs that ended before the close, and those runs should be the minority, not the norm.

Rollback is the TTL default, rather than commit, because the dangerous mistake is keeping unknown state. When a run is abandoned without an explicit close, the platform does not treat its output as worth keeping; the keep decision is the operator's, and its absence resolves to rollback.
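The expiry rule can be sketched as a sweep over open leases. This is a toy model, not a description of how the platform schedules expiry: ToyLease, sweep, and the numeric timestamps are hypothetical, and the clock is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLease:
    expires_at: float                      # absolute deadline, in seconds
    rows: set = field(default_factory=set)
    closed: bool = False


def sweep(leases, now):
    """Roll back every open lease whose TTL has passed; return destroyed rows."""
    destroyed = []
    for lease in leases:
        if not lease.closed and now >= lease.expires_at:
            # No keep decision was recorded, so the close resolves to rollback.
            destroyed.extend(sorted(lease.rows))
            lease.rows.clear()
            lease.closed = True
    return destroyed


abandoned = ToyLease(expires_at=100.0, rows={"backend", "deployment"})
active = ToyLease(expires_at=500.0, rows={"fixture"})

destroyed = sweep([abandoned, active], now=120.0)
print(destroyed, abandoned.closed, active.closed)
# ['backend', 'deployment'] True False
```

Note that the sweep has no commit branch at all: keeping state requires an explicit call made before the deadline.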

Where this fits

Leases address the gap between "I created some resources for a run" and "all of them are reliably cleaned up afterwards" — the gap where orphaned state accumulates over a project's lifetime. By giving that gap a name, a transaction shape, and a safety net, the platform makes cleanup a contract the runtime enforces rather than a step each run has to remember. The cost is one extra concept (the lease); in return, automation, agents, batch jobs, and live tests all share the same cleanup discipline by default.
