File upload and backend binding

TL;DR

  • A file is a typed workspace artifact that a component consumes at runtime. Each upload is typed (onnx, mar, gguf, weights, csv, image, parquet, …), addressable by file_id, and reusable across many backends.
  • A component declares file slots in its component.yml (file_schema). A backend binds a specific uploaded file into a specific slot on a specific vertex with ppl backend add-file. The binding is part of the backend graph — reproducible, undoable, redeployable.
  • The file and the binding are separate. Uploading a model produces a workspace artifact; the artifact runs only after it is bound into a backend vertex and the backend is deployed. The same file can be bound into many backends and many vertices simultaneously.
  • A file's optional config metadata can set vertex parameter defaults at bind time. That is the pattern for an artifact that carries its own confidence threshold, labels, or preprocessing config: the file becomes self-describing, and the backend graph picks up those defaults automatically.
  • The flow is done when the bound file works in the runtime, not when the upload succeeds. Bind is followed by a fixture run, and the fixture run is the part that confirms behavior.

What a file is

A file is a typed, named workspace artifact a component consumes at runtime — uploaded once, bound by reference into a backend vertex, reused across backends. The primitive, the mental model, and why files are separate from components live in Files. This page is the hands-on path: upload, attach config, bind, prove.

Uploading

Use this step when the file is not yet in the workspace.

Upload pushes a local file (or directory) into the workspace file store and registers it with a type, a display name, and a README. The non-interactive shape is the one to use in scripts and agent flows:

ppl file upload <path> --type <type> --name "<display>" --readme @./README.md
ppl file upload <path> --type <type> --name "<display>" --readme @./README.md --config ./file.config.yml

--type is the load-bearing flag: it determines what kind of artifact the upload is and which file_schema slots accept the resulting file. A wrong type at upload means the bind step rejects it later, or the bind succeeds and the runtime reaches an incompatible loader. The platform exposes a broad type vocabulary: model artifacts (model, onnx, safetensors, checkpoint, weights, tokenizer, lora, gguf, mar), data fixtures (csv, json, jsonl, image, audio, video, parquet, arrow, numpy, archive, binary), and config/doc types (yaml, toml, text, markdown, openapi, jsonschema, config).
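The vocabulary above can be captured as a lookup table for a pre-upload sanity check. This is a minimal Python sketch, not part of the ppl CLI: the table mirrors only the types named in this section (the platform may accept others), and type_category is a hypothetical helper.

```python
# Type names copied from the list above; the platform's full vocabulary may be larger.
FILE_TYPES = {
    "model artifact": {"model", "onnx", "safetensors", "checkpoint", "weights",
                       "tokenizer", "lora", "gguf", "mar"},
    "data fixture": {"csv", "json", "jsonl", "image", "audio", "video", "parquet",
                     "arrow", "numpy", "archive", "binary"},
    "config/doc": {"yaml", "toml", "text", "markdown", "openapi", "jsonschema", "config"},
}


def type_category(file_type: str) -> str:
    """Return the category a --type value belongs to, or raise if it is not listed."""
    for category, names in FILE_TYPES.items():
        if file_type in names:
            return category
    raise ValueError(f"unknown file type: {file_type!r}")
```

Catching a misspelled type locally is cheaper than discovering it when the bind rejects the file.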

The Triton case is worth calling out because it is the most common upload mistake. --type model expects a Triton model repository directory passed as a directory path, not a pre-tarred archive; see /file-api/file-types for the layout and the directory-not-archive rule. The bind validates only the file type, not the repository layout, so the failure surfaces only when the runtime loads the model.
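Because this mistake is not caught at bind time, a local pre-flight check can catch it before upload. The sketch below is an assumption-laden illustration: check_model_upload_path and the suffix list are hypothetical, and it checks only the directory-not-archive rule, not the repository layout itself.

```python
import os

# Common archive suffixes; an illustrative list, not one defined by the platform.
ARCHIVE_SUFFIXES = (".tar", ".tar.gz", ".tgz", ".zip")


def check_model_upload_path(path: str) -> None:
    """Raise ValueError unless path is a directory, as --type model expects."""
    if path.lower().endswith(ARCHIVE_SUFFIXES):
        raise ValueError(f"{path} looks like an archive; pass the repository directory itself")
    if not os.path.isdir(path):
        raise ValueError(f"{path} is not a directory")
```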

A description is required at upload: pass it inline with --readme, or from a file with --readme @<path>. A display name comes from --name. Passing both on the command line keeps the flow non-interactive — an agent flow that omits the description stalls on an editor prompt it cannot answer.
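For scripts and agent flows, the non-interactive requirement can be enforced before the CLI is ever invoked. The following Python sketch builds the argument list using only the flags shown in this section; build_upload_command is a hypothetical wrapper, not a platform API.

```python
from typing import List, Optional


def build_upload_command(path: str, file_type: str, name: str,
                         readme_file: str,
                         config_file: Optional[str] = None) -> List[str]:
    """Build the argv for a non-interactive `ppl file upload`."""
    if not name or not readme_file:
        # Omitting either would leave the CLI waiting on a prompt.
        raise ValueError("name and readme are both required to stay non-interactive")
    argv = ["ppl", "file", "upload", path,
            "--type", file_type,
            "--name", name,
            "--readme", f"@{readme_file}"]
    if config_file:
        argv += ["--config", config_file]
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True); keeping it as a list avoids shell quoting problems with display names that contain spaces.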

Attaching config to a file

Use this when the artifact should carry its own runtime defaults.

A file can have a --config <yaml> document attached at upload. The config has two effects: its tags are searchable metadata in the workspace file listing, and its config_schema defaults flow into matching vertex parameters when the file is later bound to a backend vertex.

tags:
  - detector
  - warehouse

config_schema:
  confidence_threshold:
    type: Double
    default: 0.35
  labels:
    type: String
    default: warehouse_labels

The config_schema defaults match by name: every entry whose name matches a parameter on the consuming component's vertex replaces that vertex parameter's value at bind time. This is what makes a model artifact self-describing: the artifact and its preferred runtime settings travel together, and the backend graph picks them up automatically. Binding a file whose config sets confidence_threshold to 0.35 leaves the backend running at 0.35 without a separate parameter step.
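The match-by-name rule can be modeled in a few lines. This Python sketch is an illustration of the described behavior, not the platform's implementation: apply_file_defaults is hypothetical, the vertex parameters are invented for the example, and it assumes entries with no matching vertex parameter are simply ignored.

```python
from typing import Any, Dict


def apply_file_defaults(vertex_params: Dict[str, Any],
                        config_schema: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return vertex parameters after a bind: matching names take the file's default."""
    merged = dict(vertex_params)  # leave the caller's dict untouched
    for name, entry in config_schema.items():
        if name in merged and "default" in entry:
            merged[name] = entry["default"]
    return merged


config_schema = {
    "confidence_threshold": {"type": "Double", "default": 0.35},
    "labels": {"type": "String", "default": "warehouse_labels"},
}
# Hypothetical vertex: it has confidence_threshold but no labels parameter.
vertex_params = {"confidence_threshold": 0.5, "max_detections": 100}
bound = apply_file_defaults(vertex_params, config_schema)
```

Here confidence_threshold takes the file's value, max_detections is untouched, and labels has nothing to match on this vertex.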

The side-effect is also load-bearing. A vertex parameter previously set with ppl backend change-parameter is replaced when a file with a matching config_schema entry is bound. The artifact's value wins for the parameters the artifact owns. The rule: anything in config_schema belongs to the file, and anything else stays under change-parameter control on the graph.
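Since a bind can silently replace a hand-set value, it may help to list what would change before binding. The sketch below is a hypothetical pre-bind diff written in Python; preview_overrides is not a ppl command, and it assumes the same match-by-name behavior described above.

```python
from typing import Any, Dict, Tuple


def preview_overrides(vertex_params: Dict[str, Any],
                      config_schema: Dict[str, Dict[str, Any]]) -> Dict[str, Tuple[Any, Any]]:
    """Map each parameter a bind would change to (current value, file default)."""
    changes: Dict[str, Tuple[Any, Any]] = {}
    for name, entry in config_schema.items():
        if name in vertex_params and "default" in entry:
            if vertex_params[name] != entry["default"]:
                changes[name] = (vertex_params[name], entry["default"])
    return changes
```

An empty result means the bind leaves every current parameter value as it is.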

Binding to a vertex

Use this step once the file is in the workspace and the backend vertex declares a slot for it.

Binding attaches a file_id to a specific file_schema slot on a specific vertex of a specific backend:

ppl backend add-file <backend_id> --vertex <vertex_id> --key <file_schema_key> --file <file_id>

--key is the slot name declared by the consuming component (for example --key model for a file_schema.model declaration). The platform validates that the file type matches what the slot accepts, so a mismatched type is rejected at bind rather than at runtime.
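A script can mirror the same type check locally and then assemble the bind command. This is a hedged Python sketch: both helpers are hypothetical, and representing a slot's accepted types as a plain collection of names is an assumption, since the file_schema format is not shown on this page.

```python
from typing import Iterable, List


def slot_accepts(file_type: str, accepted_types: Iterable[str]) -> bool:
    """Local mirror of the bind-time check: is this file type allowed in the slot?"""
    return file_type in set(accepted_types)


def build_add_file_command(backend_id: str, vertex_id: str,
                           key: str, file_id: str) -> List[str]:
    """Build the argv for `ppl backend add-file` using the flags shown above."""
    return ["ppl", "backend", "add-file", backend_id,
            "--vertex", vertex_id, "--key", key, "--file", file_id]
```

The platform's own validation remains authoritative; the local check only shortens the feedback loop.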

The bind is a single graph operation in the event-sourced backend log: it adds an entry, applies any config_schema defaults that came with the file, and becomes part of the backend's reproducible history. The next deploy or redeploy picks up the new binding. Replacing one bound file with another is the same operation pointed at a different file_id, and the prior binding is undoable through the backend operation log.
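The event-sourced idea can be illustrated with a toy log. This Python sketch is a simplified model, not the platform's log format: the entry fields and replay function are hypothetical, and undo is modeled as replaying the log without its last entry, which may differ from how the platform implements it.

```python
from typing import Dict, List, Tuple

Slot = Tuple[str, str]  # (vertex_id, file_schema key)


def replay(log: List[dict]) -> Dict[Slot, str]:
    """Rebuild the current bindings by applying operations in order."""
    bindings: Dict[Slot, str] = {}
    for op in log:
        if op["op"] == "add-file":
            # A later bind to the same slot replaces the earlier one.
            bindings[(op["vertex"], op["key"])] = op["file"]
    return bindings


log = [
    {"op": "add-file", "vertex": "v1", "key": "model", "file": "file_a"},
    {"op": "add-file", "vertex": "v1", "key": "model", "file": "file_b"},
]
current = replay(log)       # slot v1/model is bound to file_b
undone = replay(log[:-1])   # without the last entry, file_a is bound again
```

Because state is derived from the ordered log, the same sequence of operations always reproduces the same bindings.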

See Backend operations for how this lands in the operation log, and Deploy and monitor for what redeploys after a binding change.

Proving the binding

Use this step every time. Binding is not done until the runtime confirms behavior.

A bound file means the platform validated that the artifact's type is acceptable for the slot. It does not mean the artifact produces correct output. A detector can bind and then load the wrong label set; a tokenizer can bind and then mismatch the model's expected vocabulary; a Triton repository can bind and then reference a backend the container does not have installed. Each of those failures surfaces only when the runtime loads and exercises the file.

The proof step is a fixture run through the deployed backend. The cheap form is a live-backend test against a small representative input; the thorough form is the version-comparison proof loop described in Prove behavior. What matters is that the bound file produces the expected semantic output, not that the bind call succeeded.
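The cheap form of the proof step can be sketched as a small comparison loop. This Python example is illustrative only: prove_binding is hypothetical, and run_backend stands in for whatever call sends an input to the deployed backend and returns its output, which this page does not specify.

```python
from typing import Any, Callable, Dict, List


def prove_binding(run_backend: Callable[[Any], Any],
                  fixtures: List[Dict[str, Any]]) -> List[str]:
    """Run each fixture through the backend and describe every mismatch."""
    failures: List[str] = []
    for fixture in fixtures:
        actual = run_backend(fixture["input"])
        if actual != fixture["expected"]:
            failures.append(
                f"{fixture['name']}: expected {fixture['expected']!r}, got {actual!r}")
    return failures
```

An empty list is the signal that the bound file behaves as expected on the fixtures; exact equality is the simplest comparison, and numeric outputs may need a tolerance instead.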
