Build systems

TL;DR

Every component declares build_system: <key> in component.yml. The key picks a curated pair of images: a build image that compiles the source and a runtime image that runs it in production. No hand-written Dockerfiles for the common cases.
The runtime image already contains pipelogic plus its core dependencies (numpy, opencv, pyyaml, protobuf, pika, the C++ ML SDK). The component author writes only what is component-specific — additional Python packages in requirements.txt, additional C++ packages via xmake_packages.
ppl component publish runs the build on a remote build cluster, not on the author's machine. No local Docker, no local CUDA drivers, no local toolchain. The build either succeeds remotely (and produces a publishable artifact) or fails remotely (with the same error every other author would see).
Pin everything component-specific exactly with ==. Unpinned dependencies resolve to whatever the registry returns on build day, so a build that worked yesterday can resolve a different tree tomorrow.
The curated keys are a catalog the platform maintains. New stacks are added when they are widely useful; one-off needs are served by xmake_packages or Dockerfile-base, with a fully custom Dockerfile (built on a curated build_system_base) available on plans that entitle the custom build tier.

Mental model — `build_system` key drives the whole build

   component.yml                   ppl component publish
   ┌───────────────────────────┐   ┌───────────────────────────────────┐
   │ language: py              │   │ Build stage   (build_system img)  │
   │ build_system:             │──▶│   pip wheel <requirements.txt>    │
   │   2-cuda12.8-torch2.8-    │   │   xmake against xmake_packages    │
   │   onnxrtgpu1.22           │   ├───────────────────────────────────┤
   │ requirements.txt:         │   │ Runtime stage (build_system img)  │
   │   transformers==4.44.2    │   │   install wheels / copy binary    │
   │   huggingface-hub==0.24.6 │   │   image already has pipelogic +   │
   │                           │   │   numpy + opencv + pyyaml + …     │
   └───────────────────────────┘   └───────────────────────────────────┘
                                                     │
                                                     ▼
                                               worker image

The platform owns the curated images, keeps them up to date, and guarantees a working stack of system libraries, language runtimes, and ML frameworks behind each key. The component author owns the source and the per-component requirements.txt (or xmake_packages).

Why curated build images instead of arbitrary Dockerfiles

The build_system key serves two purposes: it streamlines component creation — the author never writes a Dockerfile for the common case — and it avoids image bloat across the whole catalog. A curated base is built once and reused; if every component used a divergent base, each would ship gigabytes of duplicated layers.

The size dimension is concrete. A GPU stack — CUDA, cuDNN, PyTorch, ONNX Runtime — is several gigabytes before any component code is added. When components share a base, that stack is built once: the registry stores the shared layers a single time, and a node that already pulled one component has them cached for the next. When every component picks a different base, nothing is shared. Each component ships its own multi-gigabyte copy of nearly-identical system libraries, the registry stores it all, and every node re-pulls it. A catalog of a few dozen components becomes hundreds of gigabytes of mostly-duplicated layers.
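As a back-of-the-envelope sketch of that claim, the snippet below compares registry storage for a shared base against fully divergent bases under a simple layer model. The figures (a 6 GB GPU base, 0.2 GB of component-specific layers, 36 components) are illustrative assumptions, not measurements from the platform.

```python
def registry_storage_gb(components: int, base_gb: float,
                        component_gb: float, shared_base: bool) -> float:
    """Total registry storage under a simple layer model.

    shared_base=True:  the base layers are stored once, plus one small
                       component layer per component.
    shared_base=False: every component stores its own copy of the base.
    """
    if shared_base:
        return base_gb + components * component_gb
    return components * (base_gb + component_gb)


# Assumed, illustrative sizes: 36 components, 6 GB base, 0.2 GB per component.
shared = registry_storage_gb(36, 6.0, 0.2, shared_base=True)
divergent = registry_storage_gb(36, 6.0, 0.2, shared_base=False)

print(f"shared base:     {shared:.1f} GB")     # 13.2 GB
print(f"divergent bases: {divergent:.1f} GB")  # 223.2 GB
```

Under these assumptions the divergent catalog stores roughly seventeen times as much data, almost all of it repeated copies of the base.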

Divergent bases compound the problem in other ways: they pin different CUDA versions, and a single security patch means editing every Dockerfile by hand.

A curated catalog of build_system keys addresses both the duplication and the drift. Each key fixes a tested combination of CUDA version, framework version, and supporting libraries, patched on the platform's schedule rather than per component. Because the key resolves to one shared base image, every component that picks it reuses the same layers instead of duplicating them. The author picks a key; requirements.txt then carries only what is component-specific: the model loader, the transformer version, the image-processing library. The shared layer is built once, behind the key.

In exchange for picking from the catalog rather than writing arbitrary base images, authors get reproducible builds across team members, security updates that land behind one key rather than across dozens of forked Dockerfiles, predictable image sizes, and a platform that maintains the CUDA / framework compatibility matrix on the author's behalf.

Walkthrough — pick a key, pin deps, publish

# component.yml: Python Component on a GPU stack
language: py
build_system: 2-cuda12.8-torch2.8-onnxrtgpu1.22
worker:
  input_type: Image
  output_type: "[BoundingBox]"

# requirements.txt — exact versions only; NEVER re-pin pipelogic / numpy / opencv / pyyaml / protobuf / pika
transformers==4.44.2
huggingface-hub==0.24.6
pillow==11.3.0

# validate the build remotely without creating a version row
ppl component publish --dry-run
# publish a prerelease (build runs on the remote build cluster, no local Docker)
ppl component publish -m "add foo support"
# flip the prerelease into a released version (publishes worker image, applies tags)
ppl component promote

ppl component publish uploads the source tree + component.yml + sibling Dockerfiles to the build cluster; the build runs there. No local Docker required.

Reference snapshot

Common Python keys (`language: py`)

`build_system`	Pre-installed in runtime	Pick when
`2`	Python 3.10, numpy, pipelogic	Pure-Python Component, no ML deps.
`2-opencv4.11`	Above + opencv-python-headless 4.11	Image processing, no ML.
`2-torch2.8`	Above + torch 2.8 (CPU)	Torch CPU inference.
`2-torch2.8-vision`	Above + torch 2.8 + torchvision 0.23 (CPU)	Torch + torchvision, CPU.
`2-cuda12.6`	CUDA 12.6 + cuDNN 9 + opencv	CUDA 12.6 without a framework.
`2-cuda12.6-torch2.8-onnxrtgpu1.22`	CUDA 12.6 + torch (cu126) + onnxruntime-gpu + opencv	Torch + ONNX GPU on CUDA 12.6.
`2-cuda12.8`	CUDA 12.8 + cuDNN 9 + opencv	CUDA without a framework.
`2-cuda12.8-onnxrtgpu1.22`	CUDA + cuDNN 9 + onnxruntime-gpu	ONNX inference on GPU.
`2-cuda12.8-torch2.8`	CUDA + torch + torchvision + opencv	Torch GPU inference.
`2-cuda12.8-torch2.8-onnxrtgpu1.22`	CUDA + torch + onnxruntime-gpu	Mixed Torch + ONNX GPU.
`2-cuda12.8-torch2.8-onnxrtgpu1.22-roboflow`	Above + rfdetr, inference (Roboflow stack)	Roboflow-stack Components.
`2-cuda12.8-torch2.8-ultralytics`	Above + ultralytics + torchvision	YOLO / Ultralytics.

Common C++ keys (`language: cpp`)

`build_system`	What	Pick when
`2`	C++ toolchain, no ML libs	Pure C++ utility Component.
`2-ml`	Above + C++ ML SDK + OpenCV + FFmpeg	C++ Component doing image / signal / video work.
`2-opencv4.11`	C++ toolchain + OpenCV + C++ ML SDK headers	C++ Component needing OpenCV.
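
The keys in both tables appear to follow a pattern: a leading generation number, then hyphen-separated tokens that are either a name with a version (cuda12.8, torch2.8) or a bare flavor (vision, ml, roboflow). The sketch below splits a key on that assumption. The grammar is inferred from the rows above rather than taken from a published spec, and parse_build_system is a hypothetical helper, not part of ppl.

```python
import re

# A token is a lowercase name optionally followed by a dotted version,
# e.g. "cuda12.8" -> ("cuda", "12.8"), "roboflow" -> ("roboflow", None).
_TOKEN = re.compile(r"^([a-z]+)(\d+(?:\.\d+)*)?$")


def parse_build_system(key: str) -> dict:
    """Split a build_system key into generation, pinned versions, and flavors."""
    generation, *tokens = key.split("-")
    parsed = {"generation": generation, "pinned": {}, "flavors": []}
    for token in tokens:
        match = _TOKEN.match(token)
        if match is None:
            raise ValueError(f"unrecognized token {token!r} in {key!r}")
        name, version = match.groups()
        if version is None:
            parsed["flavors"].append(name)
        else:
            parsed["pinned"][name] = version
    return parsed


print(parse_build_system("2-cuda12.8-torch2.8-onnxrtgpu1.22-roboflow"))
```

A helper like this is useful mainly for tooling that wants to report which CUDA or framework version a component is built against; ppl component builders remains the authoritative list of valid keys.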

Picking the lightest key

Bigger keys ship bigger images and longer pulls. If a component only needs OpenCV, pick 2-opencv4.11, not the full GPU ML stack. When unsure, start with 2-opencv4.11 (CPU image work) or 2-cuda12.8-torch2.8-onnxrtgpu1.22 (GPU ML).
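
One way to make the "lightest key" rule mechanical is to walk the catalog from lightest to heaviest and stop at the first key that provides everything the component needs. The sketch below does that for a subset of the Python keys. The feature sets are a hand-transcribed reading of the table (with "Above +" expanded), and using table order as a stand-in for image weight is an assumption; lightest_key is a hypothetical helper.

```python
from typing import Optional, Set

# (key, features provided), ordered as in the table above.
# Transcribed by hand; the live catalog is authoritative.
CATALOG = [
    ("2", {"numpy"}),
    ("2-opencv4.11", {"numpy", "opencv"}),
    ("2-torch2.8", {"numpy", "opencv", "torch"}),
    ("2-cuda12.8", {"cuda", "opencv"}),
    ("2-cuda12.8-onnxrtgpu1.22", {"cuda", "onnxruntime-gpu"}),
    ("2-cuda12.8-torch2.8", {"cuda", "torch", "torchvision", "opencv"}),
    ("2-cuda12.8-torch2.8-onnxrtgpu1.22", {"cuda", "torch", "onnxruntime-gpu"}),
]


def lightest_key(needs: Set[str]) -> Optional[str]:
    """Return the first catalog key whose features cover `needs`, else None."""
    for key, provides in CATALOG:
        if needs <= provides:  # subset test: every need is provided
            return key
    return None


print(lightest_key({"opencv"}))                    # 2-opencv4.11
print(lightest_key({"cuda", "torch"}))             # 2-cuda12.8-torch2.8
print(lightest_key({"torch", "onnxruntime-gpu"}))  # 2-cuda12.8-torch2.8-onnxrtgpu1.22
```

A None result means no listed key covers the need, which is the cue to look at the escalation options rather than to pick the largest key.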

How dependency pinning interacts with the curated layer

This section covers how a pip install step coexists with what the runtime image already provides.

The runtime image ships pipelogic and its core dependencies pre-installed. That set includes numpy, opencv-python / opencv-python-headless, pyyaml, protobuf, pika, and a handful of others depending on the key. Those are not yours to re-pin. If requirements.txt lists numpy==1.26.4, the build installs a parallel numpy on top of the one the runtime already has; at component startup, Python's import machinery resolves whichever is on the path first, and the result is either an ABI conflict (immediate crash) or a silent version mismatch (subtle bugs).
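
The "whichever is on the path first" behavior can be reproduced with a stand-in module. The sketch below creates two directories that each contain a module named dupmod, puts both on sys.path, and imports it. The directory roles and the dupmod name are hypothetical; this shows only the lookup order, not the ABI crash a real duplicate numpy can cause.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def first_on_path_wins() -> str:
    """Import a module that exists in two sys.path entries; return which copy loaded."""
    with tempfile.TemporaryDirectory() as runtime_dir, \
            tempfile.TemporaryDirectory() as extra_dir:
        # Two copies of the same module name, standing in for the runtime's
        # package and a re-pinned copy installed by the component.
        Path(runtime_dir, "dupmod.py").write_text('VERSION = "runtime"\n')
        Path(extra_dir, "dupmod.py").write_text('VERSION = "re-pinned"\n')

        # extra_dir is placed ahead of runtime_dir on the search path.
        sys.path[:0] = [extra_dir, runtime_dir]
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("dupmod")
            return module.VERSION
        finally:
            # Leave the interpreter state as we found it.
            sys.path.remove(extra_dir)
            sys.path.remove(runtime_dir)
            sys.modules.pop("dupmod", None)


print(first_on_path_wins())  # re-pinned
```

The copy in the earlier path entry is loaded and the other is silently ignored, which is why a duplicate install produces a version mismatch instead of an obvious error.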

The rule is: everything the runtime ships, the component leaves alone. Everything the component genuinely needs that the runtime does not ship, the component pins with == to a specific version. The pip install step then installs exactly the additional packages, on top of the existing layer, without overlap.

The pinning discipline is not optional. An unpinned transformers in requirements.txt resolves to whatever PyPI returns at build time (4.44.0 today, 4.45.0 next week, 5.0.0 once the major bump lands), so a build can resolve a different dependency tree from one day to the next. Exact == pinning means the build either reproduces what worked or fails cleanly with a "version not found" error.
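
Both rules (leave runtime packages alone, pin everything else with ==) can be checked before publishing. The sketch below is a minimal linter under stated assumptions: RUNTIME_PROVIDED lists only the packages named in this section and is not exhaustive, lint_requirements is a hypothetical helper rather than a ppl feature, and it handles only plain name==version lines (option lines such as -r or --index-url would be flagged as unpinned).

```python
import re

# Packages this section says the runtime image already ships (assumed subset).
RUNTIME_PROVIDED = {
    "pipelogic", "numpy", "opencv-python", "opencv-python-headless",
    "pyyaml", "protobuf", "pika",
}

# name==version, with optional extras such as name[extra]==1.2.3.
# Wildcards (==4.*) and === are deliberately rejected.
_EXACT_PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)(\[[^\]]+\])?==([^\s=*]+)$")
_NAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*")


def lint_requirements(text: str) -> list:
    """Return a list of human-readable problems found in requirements text."""
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = _NAME.match(line)
        # Normalize the project name: lowercase, runs of - _ . become "-".
        name = re.sub(r"[-_.]+", "-", match.group(0)).lower() if match else line
        if name in RUNTIME_PROVIDED:
            problems.append(f"line {lineno}: {name} is provided by the runtime image; remove it")
        elif not _EXACT_PIN.match(line):
            problems.append(f"line {lineno}: {line!r} is not pinned with ==")
    return problems


sample = """\
transformers==4.44.2
huggingface-hub==0.24.6
numpy==1.26.4   # re-pins a runtime package
pillow>=11
requests
"""
for problem in lint_requirements(sample):
    print(problem)
```

Run against the sample, the linter accepts the first two lines and reports lines 3, 4, and 5.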

When the curated keys are not enough

This section covers the options when a team's needs do not fit any single key.

In rough order of preference:

xmake_packages (C++). For C++ components that need extra libraries, declare them in component.yml under xmake_packages instead of forcing a new builder. The xmake registry covers most cases; simple packages are string entries, packages that need configuration are object entries with a require line. The build picks them up and links them in.

xmake_packages:
  - fast_float
  - require: arrow 7.0.0
    package: arrow
    configs:
      parquet: true
      snappy: true
      zstd: true
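
To make the two entry shapes concrete, the sketch below normalizes string entries and object entries into one uniform record. It illustrates the data shape only and is not the platform's loader: normalize_xmake_packages is a hypothetical helper, and defaulting package to the first word of require is an assumption.

```python
def normalize_xmake_packages(entries: list) -> list:
    """Turn mixed string/object entries into uniform dicts."""
    normalized = []
    for entry in entries:
        if isinstance(entry, str):
            # Simple package: the string is the require line itself.
            normalized.append(
                {"require": entry, "package": entry.split()[0], "configs": {}}
            )
        elif isinstance(entry, dict) and "require" in entry:
            require = entry["require"]
            normalized.append({
                "require": require,
                # Assumed default: package name is the first word of require.
                "package": entry.get("package", require.split()[0]),
                "configs": dict(entry.get("configs", {})),
            })
        else:
            raise ValueError(f"unsupported xmake_packages entry: {entry!r}")
    return normalized


# The xmake_packages example above, as it would look after YAML parsing.
xmake_packages = [
    "fast_float",
    {"require": "arrow 7.0.0", "package": "arrow",
     "configs": {"parquet": True, "snappy": True, "zstd": True}},
]

for record in normalize_xmake_packages(xmake_packages):
    print(record)
```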

Dockerfile-base for system packages. For components that need extra apt packages (a system library, a font, a codec, a tool), drop a sibling Dockerfile-base with the apt-install hook. The build starts from the curated base image, layers Dockerfile-base on top, then runs pip install. System packages only: Python dependencies stay in requirements.txt, where the pinning discipline applies.

# Dockerfile-base
RUN apt-get update && apt-get install -y --no-install-recommends \
      libsndfile1 \
 && rm -rf /var/lib/apt/lists/*

build_system: custom (paired with build_system_base). For the component that needs to own its Dockerfile outright — an extra build stage, a compiler toolchain, a system-level build step the curated keys do not cover. It is not a build from nothing: build_system_base must name a curated catalog key, the custom Dockerfile builds FROM that base, and the base is what supplies pipelogic and the core runtime. Because the base is a catalog key, a Dockerfile that references it correctly can still be migrated by the platform when that base is patched, and the base's layers stay shared. The author's cost is owning the build steps on top and wiring the platform's component shim by hand. Available on plans that entitle the custom build tier; a handful of first-party components (FFmpeg and GStreamer ingest, FAISS, RNNoise, Gaussian-splatting) build this way.

The escalation ladder exists because each step adds operational debt. xmake_packages is cheap. Dockerfile-base adds an apt layer. custom hands the team its own Dockerfile: it is still anchored on a catalog build_system_base, but the build steps on top are theirs to maintain. The right answer is almost always the first option on the ladder that meets the need.

Run it

ppl component builders                 # live catalog of build_system keys
ppl component publish --dry-run        # remote build validation, no version row
ppl component publish -m "<msg>"       # publish a prerelease
ppl component promote                  # flip prerelease → released

Where this fits

Build systems are the platform's curated layer between the component source the author writes and the image that runs in a deployed container. The platform owns that layer — the curated images, the supported framework matrices, the security updates, the layer caches. The author writes the component-specific bits; the curated layer covers everything beneath them.

Components — the unit being built.
Install modes — install: node defers package installation to deploy time.
Models — pairing build stacks with serving services and model artifacts.
Publish semantics — the publish + promote loop the build feeds.