Python footguns

Pipelogic's Python component API is small, but the platform has opinions about how process() may behave. The patterns on this page are the ones that build and ship cleanly but then fail at runtime, usually because the return shape, the stateful contract, or the import layout does not match what the runtime expects.

Read this once before debugging a component that builds fine but produces no output.

Things that bit other authors first, organized by where the trap is.

Configuration

Don't pin packages the runtime base already ships

The runtime base image ships numpy, pyyaml, protobuf, pika, and pipelogic. Pinning any of them in requirements.txt does not silently misbehave — the build system's requirements.txt validator rejects the upload with a field error telling you to remove the pin. If you genuinely need a different version, pick a heavier build_system rather than re-pinning.

# WRONG — rejected at validation time
numpy==2.0.0

# Right — the runtime base supplies numpy

Pin every dependency exactly

Use ==1.2.3 for every entry. The validator rejects >=, ~=, a wildcard like ==1.2.*, a URL/VCS reference, and a bare name with no specifier at all. Loose pins make builds non-reproducible and quietly pull in new CVEs on rebuild.
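
As a local pre-flight check, the pinning rules can be approximated with a few lines of standard-library Python. This is a sketch, not the platform's validator: the function name check_requirements and the regular expression are assumptions, and the set of base packages is copied from the list earlier on this page.

```python
import re

# Packages this page says the runtime base image already ships.
BASE_PACKAGES = {"numpy", "pyyaml", "protobuf", "pika", "pipelogic"}

# name==version with no wildcard: an approximation, not the real validator's grammar.
EXACT_PIN = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)==([A-Za-z0-9][A-Za-z0-9.+!_-]*)$")

def check_requirements(text):
    """Return a list of (line, problem) pairs; an empty list means no findings."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        match = EXACT_PIN.match(line)
        if match is None:
            problems.append((line, "not an exact == pin"))
        elif match.group(1).lower().replace("_", "-") in BASE_PACKAGES:
            problems.append((line, "already shipped by the runtime base"))
    return problems

sample = """
requests==2.32.3
numpy==2.0.0
scipy>=1.10
pillow==10.*
git+https://example.com/repo.git
tqdm
"""
for line, problem in check_requirements(sample):
    print(f"{line}: {problem}")
```

Only the requests line passes; the other five are reported. Running a check like this before upload may save a round trip, but the validator's verdict is the one that counts.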

Pick the right build_system

build_system is a required field with no default. Its value selects the runtime base image your component is built on; importing a library that base doesn't ship (e.g. torch under a CPU-only base) fails at runtime. Values seen across the component registry include:

build_system                                  Typical use
2                                             Lean Python runtime
2-ml                                          Common ML runtime
2-opencv4.11                                  OpenCV-heavy components
2-cuda12.6 / 2-cuda12.8-onnxrtgpu1.22         CUDA / ONNX Runtime GPU
2-cuda12.8-torch2.8-onnxrtgpu1.22             PyTorch + ONNX Runtime
2-cuda12.8-torch2.8-ultralytics               Ultralytics stack
2-cuda12.8-torch2.8-onnxrtgpu1.22-roboflow    Roboflow inference
custom                                        Bring-your-own base image

Match the value to the libraries you import. See concepts/build-systems for the full matrix.

Types

Numpy access from Bytes / List: know which call copies

import numpy as np

arr = pipe_bytes.unsafe_numpy(dtype=np.uint8, shape=(h, w, 3))  # zero-copy, READ-ONLY view
arr = np.asarray(pipe_bytes)                                    # copy-on-write
arr = pipe_bytes.safe_numpy()                                   # copy-on-write
arr = np.array(pipe_bytes, copy=True)                           # always an independent copy

unsafe_numpy() returns a read-only, zero-copy view that is only valid while the source Bytes/List object is in scope: don't stash it past the call, and don't write through it. np.asarray(...) and .safe_numpy() are copy-on-write: they are safe to hold while the source is in scope, and they share the source buffer until the first write, which forces a private copy. Use np.array(..., copy=True) when you need a value that is fully independent of the input.
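
The difference between a read-only view and an independent copy can be seen with plain numpy. This sketch uses np.frombuffer on a Python bytes object as a stand-in for the zero-copy view; it does not call the Pipelogic API, and the tiny 2 by 2 frame is made up for illustration.

```python
import numpy as np

h, w = 2, 2
raw = bytes(range(h * w * 3))          # stand-in for the payload of a Bytes value

# Zero-copy view over an immutable buffer: numpy marks it read-only.
view = np.frombuffer(raw, dtype=np.uint8).reshape(h, w, 3)
print(view.flags.writeable)            # False

try:
    view[0, 0, 0] = 255                # writing through the view is refused
except ValueError as exc:
    print("write rejected:", exc)

# An explicit copy owns its memory and can be changed freely.
owned = np.array(view, copy=True)
owned[0, 0, 0] = 255
print(owned[0, 0, 0], view[0, 0, 0])   # 255 0
print(np.shares_memory(view, owned))   # False
```

If a component needs to draw on or normalize a frame in place, the explicit copy is the form that cannot disturb the input.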

Default int is Int64, default float is Double

from pipelogic.types import List

L = List([1, 2, 3])              # element type: Int64
L = List([1.0, 2.0])             # element type: Double
L = List([1, 2, 3], '[Int32]')   # explicit Int32

Don't construct dicts for typed values

The high-level wrappers exist for a reason — they handle the named-type registration and field layout for you:

# WRONG — manual dict construction breaks when type fields drift
return {"width": w, "height": h, "data": arr.tobytes(), "format": ...}

# Right — let the wrapper do it
return Image(arr, color_space=ColorSpace.BGR)

The same applies to BoundingBox, Tensor, AudioFrame, Mask, Landmark, etc. Prefer the wrappers in pipelogic.cv and pipelogic.types — they handle field layout and named-type registration for you. Drop down to raw dicts only when you have a reason the wrappers don't cover.

Components

Virtual input is opt-in by parameter name

To receive virtual-input messages, name the parameter virtual_input on your component function. The runtime detects the name and switches the component into virtual-input mode:

def from_vin(virtual_input):
    return virtual_input[0]

run(from_vin)

Returning a tuple from a single-output component wraps it as one output

If component.yml declares one output_type and your function returns (a, b), the runtime serializes the tuple as your one output — it does not auto-split. To return multiple outputs, declare them in component.yml:

worker:
  output_types:
    - BoundingBox
    - Image

then return (boxes, image) in declared order.
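
The rule can be modeled in a few lines. The route function below is a toy written for this page, not the runtime's dispatcher, and the error it raises on a count mismatch is an assumption (the real behavior in that case is not described here).

```python
def route(result, output_types):
    """Toy model of the single-output versus multi-output rule."""
    if len(output_types) == 1:
        # One declared output: the return value is that output, tuple or not.
        return [result]
    if not isinstance(result, tuple) or len(result) != len(output_types):
        # Assumption: the toy rejects a count mismatch.
        raise ValueError(f"expected a tuple of {len(output_types)} values")
    return list(result)

boxes, image = ["box-1", "box-2"], "image-bytes"

# One declared output: a single value that is the whole tuple.
print(route((boxes, image), ["BoundingBox"]))

# Two declared outputs: one value per output, in declared order.
print(route((boxes, image), ["BoundingBox", "Image"]))
```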

Stateful components must return a dict

A component becomes stateful when you pass initial_state= to run. From then on the function receives a state argument and must return both keys:

def stateful(x, state):
    return {"output": x * 2, "state": state + 1}    # required keys

run(stateful, initial_state=0)

Omitting state from the returned dict raises ValueError("state must be provided when stateful is True").
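
A stateful function can be exercised without the platform by threading the state through by hand. The drive helper below is a hypothetical local harness, not part of pipelogic; it only mirrors the contract described above, including the error message for a missing state key.

```python
def drive(fn, inputs, initial_state):
    """Feed inputs one by one, passing each returned state into the next call."""
    state = initial_state
    outputs = []
    for x in inputs:
        result = fn(x, state)
        if not isinstance(result, dict) or "state" not in result:
            raise ValueError("state must be provided when stateful is True")
        outputs.append(result["output"])
        state = result["state"]
    return outputs, state

def running_total(x, state):
    total = state + x
    return {"output": total, "state": total}   # both keys, every call

outputs, final_state = drive(running_total, [1, 2, 3], initial_state=0)
print(outputs, final_state)   # [1, 3, 6] 6
```

A harness like this makes it easy to catch a code path that returns only the output, for example an early return on bad input.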

Virtual-output components must return a virtual_output list

Enable the virtual output by passing use_virtual_output=True to run. The function then returns a dict whose virtual_output key is a list (possibly empty); each item is emitted on the virtual-output stream ahead of the normal outputs:

def with_virtual_out(x):
    return {"output": x, "virtual_output": [item1, item2]}

run(with_virtual_out, use_virtual_output=True)

Omitting the virtual_output key raises an error, so return an empty list when there is nothing to emit.
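
One way to keep the return shape stable is to build the list on every call and leave it empty when there is nothing to emit. In this sketch check_virtual_result and with_alerts are hypothetical names, and the exception types in the checker are assumptions rather than what the runtime raises.

```python
def check_virtual_result(result):
    """Hypothetical pre-flight check for the virtual-output return shape."""
    if not isinstance(result, dict):
        raise TypeError("expected a dict with 'output' and 'virtual_output'")
    if "virtual_output" not in result:
        raise KeyError("virtual_output")
    if not isinstance(result["virtual_output"], list):
        raise TypeError("virtual_output must be a list, possibly empty")
    return result

def with_alerts(x, limit=10):
    # Build the list every time; an empty list means nothing to emit this call.
    alerts = [f"value {x} exceeds {limit}"] if x > limit else []
    return {"output": x, "virtual_output": alerts}

print(check_virtual_result(with_alerts(3)))
# {'output': 3, 'virtual_output': []}
print(check_virtual_result(with_alerts(42)))
# {'output': 42, 'virtual_output': ['value 42 exceeds 10']}
```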

CV wrappers

Stereo audio is not auto-mixed to mono

AudioFrame.numpy() returns the original (samples, channels) float32 shape. Many audio models expect mono — call .mono() explicitly:

audio_arr = audio.mono()              # 1-D float32, channels averaged

.mono() stays float32; if a model wants int16 PCM use audio.to_int16(mono=True).
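
What "channels averaged" means for the array shape can be shown with plain numpy. This sketch does not call AudioFrame; the three-sample stereo array is invented for illustration.

```python
import numpy as np

# Three stereo samples in (samples, channels) layout, float32.
stereo = np.array([[1.0, 0.0], [0.5, 0.5], [-1.0, -0.5]], dtype=np.float32)

mono = stereo.mean(axis=1)        # average the two channels per sample
print(stereo.shape, mono.shape)   # (3, 2) (3,)
print(mono.dtype)                 # float32
print(mono.tolist())              # [0.5, 0.5, -0.75]
```

A model that expects a 1-D waveform will typically fail or misread the 2-D stereo array, which is why the explicit conversion matters.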

Image color space is enforced — don't lie

Image(arr, color_space=ColorSpace.RGB) advertises RGB on the wire. Downstream consumers that need a specific color space (e.g. a chain that calls Image.to_gray()) compute the conversion from the declared color space. If you pass BGR data labeled as RGB, every downstream conversion is wrong.
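
The size of the error is easy to see on a single pixel. This sketch assumes the common ITU-R BT.601 luma weights for the gray conversion; whether Pipelogic's wrapper uses exactly these weights is an assumption, and to_gray here is a local helper, not Image.to_gray().

```python
# ITU-R BT.601 luma weights, commonly used for RGB-to-gray conversion.
WEIGHTS = {"R": 0.299, "G": 0.587, "B": 0.114}

def to_gray(pixel, declared_order):
    """Gray value of one 3-channel pixel, trusting the declared channel order."""
    return sum(WEIGHTS[ch] * value for ch, value in zip(declared_order, pixel))

pure_red_bgr = (0, 0, 255)                   # a pure red pixel stored as B, G, R

correct = to_gray(pure_red_bgr, "BGR")       # red weighted as red
mislabeled = to_gray(pure_red_bgr, "RGB")    # red weighted as blue
print(f"{correct:.1f} {mislabeled:.1f}")     # 76.2 29.1
```

The mislabeled pixel comes out at well under half its correct brightness, and nothing downstream can detect the mistake from the data alone.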

image.numpy() returns the current color space

If the input was BGR, image.numpy() is BGR. If you want a specific space, call to_bgr()/to_rgb()/to_gray(): they always return the requested space (and copy when conversion is needed).

Image.resize((h, w)) takes height first, then width

OpenCV is the opposite — cv2.resize(img, (w, h)). Pipelogic's wrapper takes (height, width) to match numpy shape conventions.
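
When both conventions appear in the same component, a tiny helper keeps the swap in one place. cv2_size is a hypothetical name; the sketch relies only on the fact that numpy shapes are (height, width[, channels]) and that cv2.resize takes its target size as (width, height).

```python
def cv2_size(shape):
    """Turn a numpy-style shape (h, w[, c]) into the (w, h) pair cv2.resize expects."""
    height, width = shape[:2]
    return (width, height)

frame_shape = (480, 640, 3)        # numpy order: height, width, channels
print(frame_shape[:2])             # (480, 640)
print(cv2_size(frame_shape))       # (640, 480)
```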

File and model paths

find_model_file requires exactly one match

find_model_file searches a directory recursively for .pt, .onnx, .safetensors, or .bin files (override with extensions=). Zero matches raises FileNotFoundError; more than one raises ValueError. That's intentional — it forces you to be explicit about which weights file is "the model". For multi-file checkpoints, point at a parent directory and write your own loader.
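
The exactly-one-match rule is simple enough to reproduce when you need a variant of it. The function below is a hypothetical re-implementation using pathlib, written from the description above; it is not the library's source, and its error messages are invented.

```python
import tempfile
from pathlib import Path

DEFAULT_EXTENSIONS = (".pt", ".onnx", ".safetensors", ".bin")

def find_single_model_file(root, extensions=DEFAULT_EXTENSIONS):
    """Return the only model file under root, or raise if there are zero or several."""
    matches = sorted(
        p for p in Path(root).rglob("*") if p.is_file() and p.suffix in extensions
    )
    if not matches:
        raise FileNotFoundError(f"no model file under {root}")
    if len(matches) > 1:
        names = ", ".join(p.name for p in matches)
        raise ValueError(f"ambiguous model files under {root}: {names}")
    return matches[0]

with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "checkpoints"
    weights.mkdir()
    (weights / "model.onnx").write_bytes(b"")
    found_name = find_single_model_file(tmp).name
    print(found_name)                          # model.onnx

    (weights / "model.pt").write_bytes(b"")    # a second candidate makes it ambiguous
    try:
        find_single_model_file(tmp)
    except ValueError as exc:
        print("rejected:", exc)
```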

ensure_local_dir respects offline mode

ensure_local_dir returns a local directory path as-is and otherwise snapshot-downloads the repo. Set HF_HUB_OFFLINE=1 (or TRANSFORMERS_OFFLINE=1) in your container env and the download switches to local_files_only — it serves the build-time-prefetched cache and never reaches the network.
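
The described behavior can be sketched as follows. resolve_dir and offline_requested are hypothetical names, the check for the literal value "1" follows the text above, and the sketch assumes the download goes through huggingface_hub's snapshot_download, which this page does not state explicitly.

```python
import os
from pathlib import Path

def offline_requested(env=os.environ):
    """True when either offline switch named on this page is set to 1."""
    return env.get("HF_HUB_OFFLINE") == "1" or env.get("TRANSFORMERS_OFFLINE") == "1"

def resolve_dir(path_or_repo, env=os.environ):
    """Return a local directory as-is, otherwise fetch or look up a snapshot."""
    if Path(path_or_repo).is_dir():
        return str(path_or_repo)
    # Imported lazily so the local-directory branch works without the package.
    from huggingface_hub import snapshot_download
    return snapshot_download(
        repo_id=path_or_repo,
        local_files_only=offline_requested(env),   # cache only, no network
    )

print(offline_requested({"HF_HUB_OFFLINE": "1"}))   # True
print(offline_requested({}))                        # False
print(resolve_dir("."))                             # .
```

In offline mode a repo that was not prefetched at build time cannot be resolved, so the failure surfaces at startup rather than as a silent network call.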

Init and startup

Don't import pipelogic lazily

pipelogic connects to the running backend at import time. If you delay the import, your component won't be able to attach to its streams. Put from pipelogic.worker import run at the top of main.py.

Mutable configuration

Mutable params only update when you drain config.sync()

Parameters declared mutable in component.yml do not change config attributes on their own. You have to call config.sync() once per tick — it drains the pending update queue, applies the new values onto config, and returns the set of changed keys. A component that reads config.threshold but never calls config.sync() will keep seeing the value it started with.

from pipelogic.worker import config, run

def process(x):
    changed = config.sync()          # drain once per tick
    if "threshold" in changed:
        ...                          # react to the new value
    return x
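
The drain semantics are easier to reason about with a toy model. ToyConfig below is a stand-in written for this page, not the pipelogic implementation; push simulates an update arriving from the platform.

```python
class ToyConfig:
    """Stand-in for a mutable config object with a pending-update queue."""

    def __init__(self, **values):
        self.__dict__.update(values)
        self._pending = []                 # updates received but not yet applied

    def push(self, key, value):
        self._pending.append((key, value))

    def sync(self):
        """Apply every queued update and return the set of keys that changed."""
        changed = set()
        while self._pending:
            key, value = self._pending.pop(0)
            if getattr(self, key, None) != value:
                setattr(self, key, value)
                changed.add(key)
        return changed

toy_config = ToyConfig(threshold=0.5)
toy_config.push("threshold", 0.8)

print(toy_config.threshold)    # 0.5, the update is still queued
print(toy_config.sync())       # {'threshold'}
print(toy_config.threshold)    # 0.8
print(toy_config.sync())       # set(), nothing left to drain
```

Because a second sync() in the same tick returns an empty set, drain once and pass the result to every consumer that needs it.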

Hand the changed-key set to HotSwapModel.apply yourself

HotSwapModel reloads a model backend when a watched config key changes, but it deliberately never calls config.sync() itself — so several consumers in one component can react to the same update batch. You drain config.sync() and pass its result in:

def process(x):
    swapper.apply(config.sync())     # you drain, HotSwapModel reacts
    return swapper.backend.run(x)

If the rebuild (or the pre-validation download for a new model id) fails, the previous backend stays live and a warning is logged — the component keeps serving with the last known-good model.
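
The keep-the-last-good-backend pattern looks roughly like this. ToySwapper is a simplified illustration, not HotSwapModel: its constructor arguments, the dict used as config, and the build function are all hypothetical.

```python
import logging

log = logging.getLogger("toy_swapper")

class ToySwapper:
    """Rebuild a backend when a watched key changes; keep the old one on failure."""

    def __init__(self, build, settings, key):
        self._build, self._settings, self._key = build, settings, key
        self.backend = build(settings[key])

    def apply(self, changed):
        if self._key not in changed:
            return False                       # nothing relevant changed
        try:
            candidate = self._build(self._settings[self._key])
        except Exception as exc:
            log.warning("rebuild failed, keeping previous backend: %s", exc)
            return False
        self.backend = candidate               # swap only after a successful build
        return True

def build(model_id):
    if model_id == "broken":
        raise RuntimeError("download failed")
    return f"backend<{model_id}>"

settings = {"model_id": "v1"}
swapper = ToySwapper(build, settings, "model_id")

settings["model_id"] = "broken"
swapper.apply({"model_id"})
print(swapper.backend)        # backend<v1>

settings["model_id"] = "v2"
swapper.apply({"model_id"})
print(swapper.backend)        # backend<v2>
```

Building the candidate before assigning it is what keeps the component serving during a failed swap.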
