Component contract
TL;DR
- Every component ships with a component.yml next to its source.
- The file declares: language, build system, typed inputs and outputs, configuration parameters, file dependencies, optional HTTP endpoints.
- Everything downstream — type checker, app builder, deploys, agent catalog — reads this file. Get it right and the rest of the platform validates your component for free.
- The page you are reading is the mental model. For the field-by-field schema with every flag and edge case, fetch the reference from the CLI:
ppl docs get component-api/component-contract
What component.yml actually is
The file is the contract between the component author and the rest of the platform. It is the only authoritative description of the component: what it accepts on its inputs, what it emits on its outputs, what it can be configured with, what it needs at deploy time. Other components, the visual builder, and the agent catalog all read the same file — there is no separate registration step.
Top-level shape
A complete component.yml covers seven categories of metadata. You almost never use all of them.
| Category | Lives under | Purpose |
|---|---|---|
| Identity | name, language, platform, tags | Display name, source language, target architecture, catalog labels. |
| Build | build_system, install, xmake_packages | Curated image pair the container builds against; build- vs deploy-time install. |
| Discovery | categories, modalities, neighbors, alternatives | Hints the catalog uses to surface the component and suggest siblings. |
| Type contract | worker.input_type / output_type (or plural) | Pipelang type expressions that gate every wire. |
| Runtime parameters | worker.config_schema | Knobs the operator sets at deploy time; one entry per parameter. |
| File and model deps | worker.file_schema, worker.cache | Files the platform provisions on the node; model caches that survive deploys. |
| HTTP endpoints | http | Optional. Declares ingress/egress over HTTP, WebSocket, SSE, or WebRTC. |
The smallest legal file is identity + build + a one-line worker: block:
name: "Echo"
language: py
platform: linux/amd64
build_system: 2
worker:
  input_type: "String"
  output_type: "String"
From there you grow it as the component grows: add config_schema when you need parameters, add file_schema when you need a model on disk, add http: when the component is an ingress or egress for the outside world.
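As a rough illustration of the smallest legal shape, the sketch below checks a parsed component.yml (already loaded into a Python dict) for the keys used in the Echo example. The helper name `missing_minimal_keys` and the idea that exactly these keys are required are assumptions made for illustration; the platform's own validator is the authority, and the plural type fields are not handled here.

```python
# Hypothetical helper, not part of the ppl CLI. The key names are taken from
# the Echo example; treating them as the required set is an assumption.
REQUIRED_TOP_LEVEL = ("name", "language", "platform", "build_system", "worker")


def missing_minimal_keys(contract: dict) -> list:
    """Return the keys a minimal contract still lacks, in a stable order."""
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in contract]
    worker = contract.get("worker")
    if not isinstance(worker, dict):
        worker = {}
    # Only the singular spellings are checked; plural forms are out of scope.
    for field in ("input_type", "output_type"):
        if field not in worker:
            missing.append("worker." + field)
    return missing


echo = {
    "name": "Echo",
    "language": "py",
    "platform": "linux/amd64",
    "build_system": 2,
    "worker": {"input_type": "String", "output_type": "String"},
}
print(missing_minimal_keys(echo))  # prints []
```

A check like this is only a quick local sanity pass before handing the file to the platform.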
Type expressions in one breath
input_type / output_type carry a Pipelang type expression. The forms you will use most often:
- Atomic: "Int32", "Double", "String", "Bool", "Bytes", …
- Named: "Image", "AudioFrame", "Tensor", "BoundingBox", …
- List: "[BoundingBox]".
- Tuple: "(Image, String)".
- Record: "{x: Double, y: Double}".
- Union: "Image | DepthImage".
- Generic: "Polygon<Double>".
Lowercase identifiers (t, frame) are type variables; uppercase identifiers are concrete types. For the full grammar — disambiguation rules, pack expansions, optional fields — see /type-api/type-syntax. For the catalog of registered named types, see /type-api/catalog.
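To make the forms above easier to tell apart at a glance, here is a small sketch that classifies an expression by its outermost syntax only. It is a surface heuristic written for this page, not the Pipelang parser; the function name `top_level_form` is hypothetical, and anything beyond the forms listed above (pack expansions, optional fields) is out of scope.

```python
def top_level_form(expr: str) -> str:
    """Classify a type expression by its outermost syntax only."""
    text = expr.strip()
    depth = 0
    in_string = False
    for ch in text:
        if ch == '"':
            in_string = not in_string
        elif in_string:
            continue
        elif ch in "[({<":
            depth += 1
        elif ch in "])}>":
            depth -= 1
        elif ch == "|" and depth == 0:
            # A bar outside every bracket separates union members.
            return "union"
    if text.startswith("[") and text.endswith("]"):
        return "list"
    if text.startswith("(") and text.endswith(")"):
        return "tuple"
    if text.startswith("{") and text.endswith("}"):
        return "record"
    if text.endswith(">") and "<" in text:
        # Angle brackets: a generic such as Polygon<Double>.
        return "generic"
    if text[:1].islower():
        return "type variable"
    # Atomic and named types look the same on the surface.
    return "atomic or named"


for example in ("Int32", "[BoundingBox]", "(Image, String)",
                "{x: Double, y: Double}", "Image | DepthImage",
                "Polygon<Double>", "frame"):
    print(example, "->", top_level_form(example))
```

The bracket-depth counter is what keeps a bar nested inside brackets from being read as a top-level union.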
Configuration parameters
worker.config_schema declares the knobs an operator sets at deploy time — one entry per parameter:
worker:
  config_schema:
    confidence_threshold:
      type: Double
      default: 0.5
    color_model:
      type: String<"BGR" | "RGB">
      default: "BGR"
    api_token:
      type: Maybe<String>
      secret: true
Each parameter carries a type (any type expression), an optional default, and three optional flags:
- mutable: when true, the value can change on a live deployment; the default, false, locks it at deploy time.
- secret: when true, the value is a workspace-secret reference, so the raw value never lands in the graph. Only String, Maybe<String>, and [String] can be secret.
- description: free text describing the parameter.
A parameter's type can be refined — narrowed by a predicate the platform enforces. String<"BGR" | "RGB"> accepts only those two strings; Int32<0..=255> bounds a range; Int64<%8> requires a multiple of eight; String<email> checks a format. A refined value is rejected at change-parameter if it falls outside the constraint, so the component does not have to re-validate it — prefer a String<…> enum over type: String plus a hand-written allowed-value list. For a closed set of whole types rather than values, use oneof[T1, T2]. See /type-api/type-syntax for the full set.
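To illustrate what a refinement buys you, the sketch below mimics three of the predicates mentioned above in plain Python. It is an assumption-laden illustration of the idea, not the platform's checker: the function name `satisfies` is hypothetical, the range form is assumed to be inclusive at both ends, and format refinements such as String<email> are not covered.

```python
import re


def satisfies(refinement: str, value) -> bool:
    """Check a value against one refinement body (the text between < and >)."""
    body = refinement.strip()
    # Enumeration of string literals, e.g. "BGR" | "RGB"
    if body.startswith('"'):
        allowed = re.findall(r'"([^"]*)"', body)
        return value in allowed
    # Range, e.g. 0..=255 (assumed inclusive on both ends)
    match = re.fullmatch(r"(-?\d+)\.\.=(-?\d+)", body)
    if match:
        low, high = int(match.group(1)), int(match.group(2))
        return low <= value <= high
    # Multiple-of constraint, e.g. %8
    match = re.fullmatch(r"%(\d+)", body)
    if match:
        return value % int(match.group(1)) == 0
    raise ValueError(f"unsupported refinement: {body!r}")


print(satisfies('"BGR" | "RGB"', "RGB"))  # prints True
print(satisfies("0..=255", 256))          # prints False
print(satisfies("%8", 64))                # prints True
```

Because the platform enforces the declared refinement, a component never needs to carry code like this itself; the sketch only shows the kind of check that moves out of the component.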
Do not redeclare a config_key from file_schema here — the platform synthesises that parameter from the file binding, and duplicating it fails validation.
File and model dependencies
worker.file_schema declares the Files the platform provisions on the node before the component runs — each slot fixes a file_type, a config_key, and an optional component target. The full slot reference, including the config_key duplicate-parameter rule and the component target behaviour, is in File schema; the accepted file_type values are in the file type catalog.
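The duplicate-parameter rule can be pictured with a short sketch. It assumes file_schema is a mapping from slot name to a dict holding file_type and config_key, which this page does not confirm; the helper name `duplicated_config_keys` and the slot and file_type values below are placeholders, not real catalog entries.

```python
def duplicated_config_keys(worker: dict) -> list:
    """List config_schema parameters that collide with a file slot's config_key."""
    declared = set(worker.get("config_schema", {}))
    # Assumed shape: slot name -> {"file_type": ..., "config_key": ...}
    slots = worker.get("file_schema", {})
    bound = {slot["config_key"] for slot in slots.values() if "config_key" in slot}
    return sorted(declared & bound)


worker = {
    "config_schema": {
        "confidence_threshold": {"type": "Double", "default": 0.5},
        "model_path": {"type": "String"},  # collides with the slot below
    },
    "file_schema": {
        "weights": {"file_type": "placeholder", "config_key": "model_path"},
    },
}
print(duplicated_config_keys(worker))  # prints ['model_path']
```

Dropping model_path from config_schema resolves the collision, since the platform synthesises that parameter from the file binding.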
Generated files
worker.generated_file_schema declares the Files a component produces at runtime, each with a name, a single file_type, and a config_key the platform fills with the path to write to. The produced file becomes available to downstream consumers and to the operator after the run. See File schema.
Model and artifact caching
cache keeps large model artifacts on the node across deploys so identical inputs reuse a pull instead of re-downloading. Each cache is a named list of rules:
worker:
  cache:
    capybara:
      - ids: model_cfg            # config keys whose values seed the cache lookup
        revision: model_revision  # optional: config key holding the artifact revision
        when:                     # optional: only cache when these config values match
          backend: gpu
        allow_local_paths: false  # optional: allow paths outside the managed cache dir
The cache key is derived from the values of the ids config keys plus the optional revision: different config values resolve to different cache entries, so swapping a model name or revision pulls fresh while the old entry stays warm. when restricts caching to deployments whose config matches (for example, only cache when backend: gpu), and allow_local_paths opts into artifacts that live outside the managed cache directory. This is the mechanism behind HuggingFace, docaligner, and similar model loaders.
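The lookup described above can be sketched as follows. This is an illustration of the idea only: how the platform really derives its cache key is not documented on this page, so the SHA-256 digest, the function names, and the sample config values are all assumptions.

```python
import hashlib
import json
from typing import Optional


def rule_applies(rule: dict, config: dict) -> bool:
    """A rule with `when` applies only if every listed config value matches."""
    return all(config.get(key) == want for key, want in rule.get("when", {}).items())


def cache_key(rule: dict, config: dict) -> Optional[str]:
    """Derive a stable key from the ids values plus the optional revision."""
    if not rule_applies(rule, config):
        return None  # this deployment is not cached under the rule
    ids = rule["ids"]
    id_keys = [ids] if isinstance(ids, str) else list(ids)
    parts = {"ids": [config[key] for key in id_keys]}
    if "revision" in rule:
        parts["revision"] = config.get(rule["revision"])
    payload = json.dumps(parts, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


rule = {"ids": "model_cfg", "revision": "model_revision", "when": {"backend": "gpu"}}
gpu_a = {"model_cfg": "example/model-a", "model_revision": "v1", "backend": "gpu"}
gpu_b = {"model_cfg": "example/model-b", "model_revision": "v1", "backend": "gpu"}
cpu_a = {"model_cfg": "example/model-a", "model_revision": "v1", "backend": "cpu"}

print(cache_key(rule, gpu_a) != cache_key(rule, gpu_b))  # prints True
print(cache_key(rule, cpu_a))                            # prints None
```

Two deployments with the same model name and revision map to the same key and can share one pull, while changing either value lands in a separate entry.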
HTTP and WebSocket endpoints
A component that talks to the outside world declares http endpoints. The platform handles TLS, auth, and the public URL — the component only listens on the declared port:
http:
  image-input:
    port: 9000
    kind: ingress            # ingress | egress
    transports: [http, ws]   # subset of http | ws | sse | webrtc | multipart
    media: [video, audio]    # media kinds the endpoint carries
    format: binary           # binary | json | text
    config_param: transport  # a config key picks the active protocol at runtime
    config_param_map:
      http: [http]
      websocket: [ws]
      both: [http, ws]
Use the singular transport: / method: fields for a fixed single-protocol endpoint; use the plural transports: with config_param and config_param_map when the operator picks the protocol at deploy time. The endpoint name (image-input) is what ppl backend forward targets.
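How config_param and config_param_map interact can be pictured with a small resolver. This is a sketch under assumptions: the function name `active_transports` is hypothetical, and the behaviour when the map names an undeclared transport (raising an error here) is a guess rather than documented platform behaviour.

```python
def active_transports(endpoint: dict, config: dict) -> list:
    """Return the transports an endpoint would serve for a given config."""
    # Fixed single-protocol endpoint: the singular field is all there is.
    if "transport" in endpoint:
        return [endpoint["transport"]]
    declared = endpoint.get("transports", [])
    param = endpoint.get("config_param")
    if param is None:
        return list(declared)
    choice = config[param]
    selected = endpoint["config_param_map"][choice]
    # Guard against a map entry naming a transport the endpoint never declared.
    unknown = [t for t in selected if t not in declared]
    if unknown:
        raise ValueError(f"{choice!r} selects undeclared transports: {unknown}")
    return list(selected)


image_input = {
    "port": 9000,
    "kind": "ingress",
    "transports": ["http", "ws"],
    "config_param": "transport",
    "config_param_map": {"http": ["http"], "websocket": ["ws"], "both": ["http", "ws"]},
}
print(active_transports(image_input, {"transport": "websocket"}))  # prints ['ws']
print(active_transports(image_input, {"transport": "both"}))       # prints ['http', 'ws']
```

The operator-facing value (websocket) and the transport identifier (ws) are deliberately separate, which is why the map exists at all.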
Build environment
build_system selects the curated compile-and-runtime image the container builds against — list the registry with ppl component builders (build_system_base is required only for build_system: custom). install chooses when dependencies install: node defers the requirements.txt install to deploy time on each node, anything else installs at build time. xmake_packages adds C++ requires, and depends_on lists sibling component slugs the component needs at runtime.
Where this fits
component.yml is the contract layer between code you wrote and the platform that runs it. It is read at compile time (validation, type checking), at release time (catalog entry, schema introspection), and at deploy time (parameter binding, file provisioning). Get the shape right once; never touch it again until the contract changes.
For the full schema
The website covers the model. For the table of every field, every default, every edge case, every validation rule, every example by category, and the validation failure modes, fetch the reference from the CLI:
ppl docs get component-api/component-contract
Related
- /concepts/components: the component as a platform primitive.
- /type-api/type-syntax: the full grammar for input_type / output_type.
- /type-api/catalog: catalog of built-in named types.
- /file-api/file-types: valid file_type slots.
- /concepts/build-systems: the build_system registry.
- /concepts/install-modes: install: node vs build-time install.