Files

A File in Pipelogic is data, not code. A component is a typed, versioned, containerized capability; a File is the model weights, the dataset, the fixture, the config document, or the archive that the component loads when it runs. The two are deliberately separate primitives, and the backend graph is where they meet.

What a File actually is

A File is a typed, named, addressable object in the workspace. It is uploaded once, stored once, versioned on its own schedule, and reused by reference rather than copied into a component image. One typed File concept covers a multi-gigabyte model, a small tokenizer, a CSV fixture, a Triton model repository, and a YAML config document — there is no separate "model file" and "data file" machinery, only a file_type that says what the artifact is and which slots will accept it.

The defining property is that the File and the binding are separate. Uploading a model produces a workspace artifact and nothing more; the artifact runs only after it is bound into a specific slot on a specific backend vertex and that backend is deployed. The same File can be bound into many backends and many vertices at once, and swapping it for another is a graph mutation — re-point the binding — not a component rebuild.
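
The separation can be pictured with a toy model. The sketch below is not the Pipelogic API: the `File` and `Backend` classes, the `bind` method, and the IDs and vertex names are all hypothetical, chosen only to show that a binding stores a reference and that swapping a model touches the graph alone.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class File:
    """An uploaded workspace artifact (hypothetical model, not the platform's schema)."""
    file_id: str
    name: str
    file_type: str


@dataclass
class Backend:
    """A backend graph reduced to its bindings: (vertex, slot) -> file_id."""
    bindings: dict[tuple[str, str], str] = field(default_factory=dict)

    def bind(self, vertex: str, slot: str, file: File) -> None:
        # A binding stores a reference to the File, never a copy of its bytes.
        self.bindings[(vertex, slot)] = file.file_id


# Uploading produces workspace artifacts and nothing more.
detector_v1 = File("f-001", "detector.onnx", "onnx")
detector_v2 = File("f-002", "detector-v2.onnx", "onnx")

# The same File can be bound into several backends and vertices at once.
staging, production = Backend(), Backend()
staging.bind("vertex_a", "model", detector_v1)
production.bind("vertex_a", "model", detector_v1)
production.bind("vertex_b", "model", detector_v1)

# Swapping the model re-points one binding; no component is rebuilt.
staging.bind("vertex_a", "model", detector_v2)

print(staging.bindings[("vertex_a", "model")])
print(production.bindings[("vertex_a", "model")])
```

After the swap, only the staging binding points at the second File; the production bindings and both uploaded Files are unchanged.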

Why files are their own primitive

Models, fixtures, and configs have lifecycles that do not line up with component lifecycles. A component release is immutable; a model artifact is swapped on its own schedule; a fixture set evolves independently. Folding any of those into the component would bloat the image and force a republish for every artifact change. Keeping the File separate costs one extra step: upload, then bind, instead of shipping the artifact inside the code. In return, the same model can be compared across sibling vertices in one backend, the same fixture can be reused across regression suites, and the same tokenizer can sit alongside many models, all without re-bundling a component.

The three reference pages

Everything else about files splits into three concerns, each with its own page:

How a component asks for a file. A component declares the slots it needs under worker.file_schema, and the files it produces under worker.generated_file_schema. That contract is in File schema.

Which file is allowed in which slot. Every slot and every upload carries a file_type. The platform refuses a bind unless the uploaded File's type satisfies the slot. The full value list is in File types.
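
As a rough illustration of the bind-time check, the sketch below assumes that a slot lists the file_type values it accepts and that "satisfies" means simple membership. Both are assumptions, and the real rule may be richer. `Slot`, `File`, `BindError`, and `check_bind` are hypothetical names, not platform identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A slot a component declares (hypothetical shape)."""
    name: str
    accepts: frozenset[str]  # assumption: a slot lists the file_type values it takes


@dataclass(frozen=True)
class File:
    """An uploaded File, reduced to the fields the check needs."""
    file_id: str
    file_type: str


class BindError(Exception):
    """Raised when a File's type does not satisfy the slot."""


def check_bind(slot: Slot, file: File) -> None:
    # Assumption: "satisfies" means simple membership; the real rule may be richer.
    if file.file_type not in slot.accepts:
        raise BindError(
            f"slot {slot.name!r} accepts {sorted(slot.accepts)}, "
            f"got file_type {file.file_type!r}"
        )


model_slot = Slot("model", frozenset({"onnx"}))
detector = File("f-001", "onnx")
tokenizer = File("f-010", "tokenizer")

check_bind(model_slot, detector)  # type matches, so the bind would proceed

try:
    check_bind(model_slot, tokenizer)
except BindError as err:
    print(f"bind refused: {err}")
```

The check runs on the declared types alone, so under these assumptions a mismatch is caught when the bind is attempted, before any deploy.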

How a file gets uploaded and attached. Upload registers the artifact and its type; binding attaches it to a vertex slot and can carry config defaults into the graph. The mechanics, including the common mismatch errors, are in File binding.

Mental model

workspace files                       backend graph
───────────────                       ─────────────
                                      ┌────────────┐
   detector.onnx ──(add-file)──▶      │  vertex A  │
   (file_id, type=onnx)               │  ┌───────┐ │
                                      │  │ model │◀┼── detector.onnx
   tokenizer.json ──(add-file)──▶     │  └───────┘ │
   (file_id, type=tokenizer)          │  ┌───────┐ │
                                      │  │ tokn  │◀┼── tokenizer.json
                                      │  └───────┘ │
                                      └────────────┘

The two halves stay independent. An uploaded File waits in the workspace until something binds it; the binding takes effect at deploy time. This is why a file-backed backend is reproducible: the binding is an entry in the event-sourced operation log, the File it points at is immutable, and redeploying replays the same wiring.
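
The reproducibility argument can be sketched as a replay over an append-only log. This is a minimal model under stated assumptions: the `BindOp` record, the `replay` function, and the file IDs are hypothetical, and the real operation log presumably records more than bind operations.

```python
from typing import NamedTuple


class BindOp(NamedTuple):
    """One logged bind operation (hypothetical record shape)."""
    vertex: str
    slot: str
    file_id: str


def replay(log: list[BindOp]) -> dict[tuple[str, str], str]:
    """Rebuild the wiring by applying logged operations in order."""
    wiring: dict[tuple[str, str], str] = {}
    for op in log:
        # A later bind to the same (vertex, slot) re-points the binding.
        wiring[(op.vertex, op.slot)] = op.file_id
    return wiring


log = [
    BindOp("vertex_a", "model", "f-001"),
    BindOp("vertex_a", "tokn", "f-010"),
    BindOp("vertex_a", "model", "f-002"),  # the model is swapped later
]

# Replaying the same log always yields the same wiring.
assert replay(log) == replay(log)

# Replaying a prefix recovers the wiring as it stood before the swap.
print(replay(log[:2])[("vertex_a", "model")])
print(replay(log)[("vertex_a", "model")])
```

Because the file IDs in the log refer to immutable Files, the replayed wiring identifies the same bytes on every deploy.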
