Components

TL;DR

A component is the platform's unit of capability: a typed, versioned, containerized piece of code that does one job — read an image, run a detector, normalize an audio frame, write to a queue. It is reusable across many backends and owned by one team.
A component declares what it consumes, what it produces, and how it can be configured — typed input streams, typed output streams, typed parameters, optional file slots. That declaration is the contract every other layer of the platform reads.
Components have immutable releases. Each publish creates a prerelease; promotion turns a prerelease into a released version. Backends pin specific releases, so promotion does not change any running deployment.
A component owns behavior. A backend owns composition. A deployment owns runtime. Keeping the three primitives separate is what lets a single capability power many products without forking the code.
Most components are stateless functions: input in, output out. State, virtual streams, and custom pull policies are advanced topics for the cases that need them; the basic shape covers most real components.

What a component actually is

A component is the smallest reusable unit of capability the platform models. Concretely, it is a typed function packaged as a containerized worker, with a manifest declaring what it consumes, what it produces, and how it can be configured. That declaration — the component.yml — is the contract every other layer reads: backends use it to type-check graph wiring before deploy, the catalog uses it for discovery, the runtime uses it to determine what the worker's container expects.
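As a rough illustration of what "the manifest is the contract" means, the sketch below models a manifest as plain data and runs one sanity check over it. The field names mirror the manifest keys shown in the mental-model diagram on this page; the Manifest and Stream classes, the validate helper, and the "python"/"cpp" language strings are assumptions for illustration, not the platform's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Stream:
    name: str
    type: str  # a type expression such as "Image" or "[BoundingBox]"


@dataclass(frozen=True)
class Manifest:
    """Hypothetical in-memory view of a component.yml (field names assumed)."""
    language: str
    inputs: Tuple[Stream, ...]
    outputs: Tuple[Stream, ...]
    config: Dict[str, str] = field(default_factory=dict)  # parameter name -> type
    files: Tuple[str, ...] = ()                            # optional file slots
    tags: Tuple[str, ...] = ()


def validate(manifest: Manifest) -> List[str]:
    """Return a list of problems; an empty list means the declaration is coherent."""
    problems = []
    if manifest.language not in ("python", "cpp"):  # assumed spellings
        problems.append(f"unsupported language: {manifest.language}")
    names = [s.name for s in manifest.inputs + manifest.outputs]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate stream name: {name}")
    return problems


detector = Manifest(
    language="python",
    inputs=(Stream("frame", "Image"),),
    outputs=(Stream("boxes", "[BoundingBox]"),),
    config={"threshold": "Double"},
    files=("weights",),
    tags=("vision",),
)
```

Everything a backend, catalog, or runtime would need to know about the detector is in that one value, which is why the declaration can be read without running the code.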

The granularity is deliberate. A bare function is too small to operate independently: no manifest, no container, no versioning. A whole pipeline is too large to reuse across products. A component sits at the level where one job maps to one unit that can be pinned, validated, reused, swapped, and operated on its own. The typed contract and immutable release model are what make that unit composable rather than just isolated.

The default contract is that components are stateless at the graph boundary. The function consumes typed inputs and produces typed outputs; everything else (model weights, configuration, optional state, side-channel events) is machinery the component declares when it needs it. That default keeps most components small and lets the scheduler parallelize them freely. The opt-ins exist for the cases that need them, and each one constrains how the runtime can schedule the vertex. That scheduling cost is intentional: it discourages opting in unless the behavior requires it, which keeps components simple.

Mental model

  ┌──── component release (immutable) ─────┐
  │                                        │
  │   component.yml      src/              │
  │   ┌─────────────┐    ┌─────────────┐   │
  │   │ language    │    │ Python or   │   │
  │   │ inputs[]    │    │ C++ code    │   │
  │   │ outputs[]   │    │             │   │
  │   │ config      │    │ implements  │   │
  │   │ files       │    │ the typed   │   │
  │   │ tags        │    │ function    │   │
  │   └─────────────┘    └─────────────┘   │
  │                                        │
  └────────────────────┬───────────────────┘
                       │  pinned by
                       ▼
                 backend vertex

The release is the immutable artifact. A backend pins it as a vertex, binds its parameters and files, and wires its streams. The same release can sit as a vertex in many backends; each backend carries its own bindings and its own graph context. Promotion marks a new version as released; pinning is what isolates a backend from that change: a backend that pinned release_v3 keeps running v3 after v4 becomes the default.

See Backends for the graph the release plugs into, and Types for the typed-stream contract its inputs and outputs satisfy.

What "typed function" actually buys you

Use this framing the first time you ask "what does the component contract guarantee?"

The declared typed contract is small in the manifest — a few lines per stream — and load-bearing everywhere else. The backend's type inference uses it to reject wiring an Image output into a [BoundingBox] input. The CLI uses it to render meaningful error messages when a parameter is missing. The catalog uses it to filter components by the modalities they handle. The runtime uses it to determine what container resources the component needs at start.

The component author writes the contract once; every other party — graph authors, downstream consumers, ops teams, later maintainers — reads it. Declaring the contract moves wiring and parameter mistakes to edit time, where the backend's type inference catches them, instead of deferring them to a failed deploy. That is the practical reason the manifest is mandatory rather than optional.

The typed values themselves come from one of three places: pipelang atomic types (Double, String, Int64), composite types built from them ([BoundingBox], Maybe<String>, (Image, [Double])), or named types from the platform's type registry (Image, Tensor, AudioFrame, BoundingBox). The named types carry domain semantics: an Image is not just a tensor of bytes, it is the typed value every image-handling component agrees on, with shape, channel order, and metadata that downstream consumers read directly rather than re-deriving.
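To make the wiring check concrete, here is a minimal sketch of how composite types could be modeled and compared. The Named, ListOf, Maybe, and TupleOf classes and the check_wiring and show helpers are hypothetical, and the rule used (an output type must be structurally equal to the input type) is a simplifying assumption; the platform's real inference may be more permissive.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Named:          # atomic or registry type: Double, String, Image, ...
    name: str


@dataclass(frozen=True)
class ListOf:         # [T]
    item: "Type"


@dataclass(frozen=True)
class Maybe:          # Maybe<T>
    item: "Type"


@dataclass(frozen=True)
class TupleOf:        # (A, B, ...)
    items: Tuple["Type", ...]


Type = Union[Named, ListOf, Maybe, TupleOf]


def show(t: Type) -> str:
    """Render a type in the notation used on this page."""
    if isinstance(t, Named):
        return t.name
    if isinstance(t, ListOf):
        return f"[{show(t.item)}]"
    if isinstance(t, Maybe):
        return f"Maybe<{show(t.item)}>"
    return "(" + ", ".join(show(i) for i in t.items) + ")"


def check_wiring(output: Type, input_: Type) -> None:
    """Assumed rule: an edge is valid only if both ends have the same type."""
    if output != input_:  # frozen dataclasses compare structurally
        raise TypeError(f"cannot wire {show(output)} into {show(input_)}")


boxes = ListOf(Named("BoundingBox"))
check_wiring(boxes, boxes)             # accepted
try:
    check_wiring(Named("Image"), boxes)
except TypeError as err:
    rejection = str(err)               # "cannot wire Image into [BoundingBox]"
```

Because the check only needs the declared types, it can run while the graph is being edited, before any container starts.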

See Named types for the registry that backs Image, Tensor, and the rest.

Stateless by default, stateful when behavior needs history

Use the default; reach for state only when the next output genuinely depends on previous ones.

The default component is a stateless function: the runtime can call it many times concurrently for the same vertex, because no call depends on another. That lets the scheduler fan a single vertex out across CPU cores under load, or run independent invocations in parallel without coordinating ordering.
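The property the scheduler relies on can be shown with ordinary Python. In this sketch, normalize is a hypothetical stateless component body, and a thread pool stands in for the platform's scheduler, which is an assumption for illustration only. Because no call reads anything another call wrote, running the calls concurrently gives the same results as running them one by one.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def normalize(frame: List[float]) -> List[float]:
    """Scale a frame so its largest absolute sample is 1.0; depends only on its input."""
    peak = max((abs(sample) for sample in frame), default=0.0)
    if peak == 0.0:
        return list(frame)
    return [sample / peak for sample in frame]


frames = [[0.5, -1.0, 0.25], [2.0, 4.0], [0.0, 0.0]]

# Concurrent invocations of the same function, as a scheduler might issue them.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(normalize, frames))

# One-at-a-time invocations for comparison.
serial = [normalize(frame) for frame in frames]
```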

Declaring state changes that contract. The component asks the platform to persist memory between calls and to invoke the function one call at a time per vertex, so the state one call returns is visible to the call that follows it. The vertex becomes history-dependent: trackers, rolling windows, session maps, conversation memory, debouncing, and "emit only after N seconds of buffered audio" all need state to behave correctly. A stateful vertex is serial per replica, and that cost is a deliberate reason to keep state out of components that do not need it.

What stays outside persisted state is the per-process machinery: model handles, database clients, compiled regexes, caches that can be regenerated on startup. Those belong in module globals or setup hooks, not in the platform's persisted state. Persisted state is for behavioral memory — the values whose loss between calls would change what the next call emits. Everything else is process-local and lives accordingly.
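The split between behavioral memory and process-local machinery can be sketched as follows. The step signature, which takes the previous state and returns the next state alongside the output, is an assumption chosen for illustration and not the platform's actual state API; run_serial stands in for the runtime invoking a stateful vertex one call at a time.

```python
import re
from typing import List, Optional, Tuple

# Process-local machinery: cheap to rebuild on startup, so it lives in a module global.
LABEL_RE = re.compile(r"label=(\w+)")

# Behavioral memory: the last label emitted. Losing it would change the next output.
State = Optional[str]


def step(state: State, message: str) -> Tuple[State, Optional[str]]:
    """Debounce labels: emit a label only when it differs from the last one emitted."""
    match = LABEL_RE.search(message)
    if match is None:
        return state, None
    label = match.group(1)
    if label == state:
        return state, None
    return label, label


def run_serial(messages: List[str], state: State = None) -> Tuple[State, List[str]]:
    """Feed each call the state returned by the previous call, strictly in order."""
    outputs = []
    for message in messages:
        state, out = step(state, message)
        if out is not None:
            outputs.append(out)
    return state, outputs


final_state, emitted = run_serial(
    ["label=cat", "label=cat", "noise", "label=dog", "label=cat"]
)
```

Reordering or parallelizing these calls would change which labels are emitted, which is why a stateful vertex has to run serially.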

Side effects live behind virtual streams

Use this when the component needs to talk to the world outside the graph: HTTP, WebSocket, devices, SDK callbacks, file writes with retry semantics.

Most components consume from graph inputs, run a deterministic function, and emit to graph outputs. Some components have a harder job: they own an HTTP listener that receives traffic asynchronously, a camera thread that produces frames independently of graph timing, an LLM SDK that streams tokens back through a callback, a file writer that needs retry handling. Threading those side effects directly into the graph-facing function makes it impossible for the scheduler to determine when the function is safe to invoke.

Virtual streams isolate that work. Virtual input is a component-local queue populated by code outside the normal graph-facing tick: an HTTP listener, a device thread, a streaming SDK callback. The component can include virtual input in its pull request, so those external events still flow through the same scheduler-visible function as normal stream messages. Virtual output is the inverse: a queue of side-effect work items the component returns from a tick, which a bridge thread executes asynchronously and can feed results back through virtual input.

The discipline that makes virtual streams work is that the graph-facing function stays scheduler-visible. Side effects live behind the queues; the function reads and writes them; the bridge performs the IO. That separation is what lets a component talk to the outside world while the scheduler retains its model of when the function runs.
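The queue-and-bridge arrangement can be approximated with the standard library. In this sketch the queue names, the tick signature, and the run_tick driver are all hypothetical stand-ins for what the runtime provides; appending to a list stands in for real IO. The point is only that tick touches data, never the outside world.

```python
import queue
import threading
from typing import List, Tuple

virtual_in = queue.Queue()    # events produced outside the graph-facing tick
virtual_out = queue.Queue()   # side-effect work items waiting for the bridge
written = []                  # stand-in for a real file or network target


def tick(graph_messages: List[str], virtual_events: List[str]) -> Tuple[List[str], List[str]]:
    """Graph-facing function: no IO, returns graph outputs plus work items."""
    outputs = [f"ack:{event}" for event in virtual_events]
    work_items = [f"write:{message}" for message in graph_messages]
    return outputs, work_items


def bridge() -> None:
    """Bridge thread: performs the side effects and feeds results back in."""
    while True:
        item = virtual_out.get()
        if item is None:                  # shutdown sentinel
            virtual_out.task_done()
            return
        written.append(item)              # the actual side effect
        virtual_in.put(f"done:{item}")    # result re-enters through virtual input
        virtual_out.task_done()


def drain(q: queue.Queue) -> List[str]:
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


def run_tick(graph_messages: List[str]) -> List[str]:
    """Hypothetical driver: hands pending virtual input to tick, queues its work items."""
    outputs, work_items = tick(graph_messages, drain(virtual_in))
    for item in work_items:
        virtual_out.put(item)
    return outputs


worker = threading.Thread(target=bridge, daemon=True)
worker.start()
first = run_tick(["a"])    # no virtual events yet; queues one write
virtual_out.join()         # wait until the bridge has performed it
second = run_tick([])      # the completion event arrives as virtual input
virtual_out.put(None)
worker.join()
```

The write happens on the bridge thread between the two ticks, and its result only becomes visible to the component when the next tick is invoked.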

Releases are immutable; promotion is deliberate

Use this mental model whenever you ask "why did my backend not pick up the new code?"

Each publish creates a prerelease: a private validated build the caller can deploy in test backends. Promotion is the separate, deliberate step that turns a prerelease into a released version: the image goes to the registry, the tags declared in component.yml move (latest, default), and the version becomes deployable by everyone in the workspace's audience. The version ID does not change; only its state does.

Backends pin specific release IDs. Promoting a prerelease does not change any deployed backend's pinned version: the backend keeps running whatever it had pinned at its last deploy. Rolling a backend forward to a newer release is an explicit vertex mutation followed by a redeploy. That separation is why promoting a broken version cannot, by itself, take down a running backend. The running backend is pinned; the new release is a new addressable artifact in the catalog.
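The lifecycle above fits in a small state model. The Catalog and Backend classes below, their method names, and the release_vN ID format are illustrative assumptions rather than the platform's API; the sketch only encodes the rules stated here: publish creates a prerelease, promote changes state and moves tags without changing the ID, and a pin moves only when someone moves it.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Catalog:
    states: Dict[str, str] = field(default_factory=dict)  # release id -> state
    tags: Dict[str, str] = field(default_factory=dict)    # tag -> release id
    next_number: int = 1

    def publish(self) -> str:
        """Every publish creates a new prerelease with its own ID."""
        release_id = f"release_v{self.next_number}"
        self.next_number += 1
        self.states[release_id] = "prerelease"
        return release_id

    def promote(self, release_id: str, tags: Tuple[str, ...] = ("latest", "default")) -> None:
        """Promotion changes the state and moves tags; the ID stays the same."""
        if self.states.get(release_id) != "prerelease":
            raise ValueError(f"{release_id} is not a prerelease")
        self.states[release_id] = "released"
        for tag in tags:
            self.tags[tag] = release_id


@dataclass
class Backend:
    pins: Dict[str, str] = field(default_factory=dict)    # vertex name -> release id

    def pin(self, vertex: str, release_id: str) -> None:
        """Explicit vertex mutation; the only way a pin changes."""
        self.pins[vertex] = release_id


catalog = Catalog()
for _ in range(3):
    catalog.promote(catalog.publish())      # release_v1 .. release_v3

backend = Backend()
backend.pin("detector", catalog.tags["default"])   # pins release_v3

v4 = catalog.publish()
catalog.promote(v4)                         # default now points at release_v4
pinned_after_promotion = backend.pins["detector"]
```

After the promotion, the default tag points at release_v4 while the backend still pins release_v3; calling backend.pin("detector", v4) and redeploying is the explicit roll-forward.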

See Publish semantics for the full publish + promote loop.

What a component owns and what it does not

A component owns its identity (display name plus workspace-scoped slug), its language (Python or C++), its declared input and output stream types, its typed config_schema, its optional file_schema for model and data slots, its optional generated_file_schema for outputs it writes back as workspace files, and its implementation source under src/. It also owns its discovery metadata — the catalog category and modality tags, the README and AGENT.yml that travel with the release.

It does not own anything graph-shaped. The backend it appears in, the vertex bindings of its parameters in that backend, the file IDs bound into its slots, the endpoint aliases its outputs are exposed through, the deployment that runs it, the runtime that hosts the deployment — all of those are backend-and-deployment concerns. The component is the capability; everything else is composition and runtime.

That separation is the whole reason a single component can power many products. A detector component is the same in a warehouse-safety backend, a retail-traffic backend, and a clinical-imaging backend; each backend wires it differently, binds different parameters, deploys it on different compute. The component does not need to know any of that.
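The ownership boundary can be pictured as two kinds of record. In the sketch below, Release holds only what the component owns and Vertex holds what a backend adds; both classes, the parameter name, and the file IDs are hypothetical values invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Release:
    """What the component owns: identity and an immutable version."""
    component: str
    release_id: str


@dataclass
class Vertex:
    """What a backend owns: the pin plus its own parameter and file bindings."""
    release: Release
    params: Dict[str, float]
    files: Dict[str, str]   # file slot -> bound file ID


detector = Release("detector", "release_v3")

backends = {
    "warehouse-safety": Vertex(detector, {"threshold": 0.8}, {"weights": "file_forklifts"}),
    "retail-traffic": Vertex(detector, {"threshold": 0.5}, {"weights": "file_shoppers"}),
    "clinical-imaging": Vertex(detector, {"threshold": 0.95}, {"weights": "file_lesions"}),
}

shared_releases = {vertex.release for vertex in backends.values()}
```

All three backends reference the same frozen Release value; only the bindings around it differ, so nothing about a backend leaks into the component.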

Where this fits

Components are the platform's load-bearing primitive for what code does. Backends compose them; deployments run them; the catalog discovers them; the type system constrains how they wire together; the release model keeps their behavior reproducible over time. Every other concept on the platform exists to make components reusable, composable, operable, or discoverable, so that a single capability serves many products instead of being reimplemented as bespoke glue for each one.

The discipline the platform asks of component authors is small: declare the typed contract accurately, default to stateless, reach for state and virtual streams only when the basic shape cannot express the contract, and treat releases as immutable. In return, the platform provides graph type-checking, catalog discovery, runtime scheduling, deployment, observability, and proof loops. The author writes the function; everything around it is the platform's responsibility.

Backends — wire released components into a typed graph.
Types — the contract that catches bad wiring before runtime.
Named types — the registry behind Image, Tensor, AudioFrame.
Publish semantics — the publish and promote loop in detail.
File schema — declare the files a component consumes and produces.
Build systems — how the platform turns source into a validated image.
Solutions — component + backend + deployment + surface, end-to-end.
Quickstart — the route picker for your first task.