Build an object detector
An object detector takes an image and returns a list of bounding boxes — each one a class label, a confidence score, and a rectangle in pixel coordinates.
Pipeml's detection mixins handle preprocessing (resize, pad, normalize), Triton invocation, and postprocessing (NMS, rescaling), so the component is mostly composition. The same skeleton fits any object-detection model.
1. The Params struct
struct Params : public detect::params::Format,
                public detect::params::InputLayer,
                public detect::params::Normal,
                public detect::params::Pad,
                public detect::params::Resize,
                public detect::params::ObjectDetection,
                public detect::params::DetectOutput,
                public detect::params::Rect,
                public triton::Params {
 private:
  Params(const std::map<std::string, Object<Any>>& config)
      : detect::params::Format(config),
        detect::params::InputLayer(config),
        detect::params::Normal(config),
        detect::params::Pad(config),
        detect::params::Resize(config),
        detect::params::ObjectDetection(config),
        detect::params::DetectOutput(config),
        detect::params::Rect(config),
        triton::Params(config) {
    top_detections = read_config(config, "top_detections").as<UInt64>();
  }

 public:
  size_t top_detections;

  Params() : Params(parse_config()) {}

  std::vector<std::unique_ptr<infer::Tensor>> preprocess(const cv::Mat& img) const {
    std::vector<std::unique_ptr<infer::Tensor>> answer;
    answer.emplace_back(detect::params::InputLayer::preprocess(
        detect::params::Normal::preprocess(
            detect::params::Format::preprocess(detect::params::Pad::preprocess(
                detect::params::Resize::preprocess(img))))));
    return answer;
  }
};
Each parent contributes a slice of config_schema and a preprocess() step. The composed preprocess runs them in order: Resize → Pad → Format (color conversion) → Normal (mean/std) → InputLayer (channel order, batch dim).
You can swap mixins out: skip Pad for variable-size models, swap Format for a custom color path, etc. The composition order in preprocess() is just function chaining — adjust it to match your model's preprocessing.
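For example, a model that tolerates aspect-ratio distortion can drop Pad entirely. The variant below is a hypothetical sketch, not a library type — constructors are elided; they mirror the ones above minus the Pad(config) link:

struct NoPadParams : public detect::params::Format,
                     public detect::params::InputLayer,
                     public detect::params::Normal,
                     public detect::params::Resize,
                     public detect::params::ObjectDetection,
                     public detect::params::DetectOutput,
                     public detect::params::Rect,
                     public triton::Params {
  // Constructors mirror Params above, without the Pad(config) link.

  std::vector<std::unique_ptr<infer::Tensor>> preprocess(const cv::Mat& img) const {
    std::vector<std::unique_ptr<infer::Tensor>> answer;
    // Same chain with the Pad step removed:
    // Resize -> Format -> Normal -> InputLayer.
    answer.emplace_back(detect::params::InputLayer::preprocess(
        detect::params::Normal::preprocess(
            detect::params::Format::preprocess(
                detect::params::Resize::preprocess(img)))));
    return answer;
  }
};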
The custom field top_detections is read directly from config (no mixin needed).
2. Postprocess
std::vector<ocv::BoundingBox>
postprocess(const std::vector<std::unique_ptr<infer::Tensor>>& outputs,
            const cv::Mat& img, const Params& params) {
  float x_scale, y_scale, scale_factor;
  std::tie(x_scale, y_scale, scale_factor) =
      params.get_scale_factors(cv::Size(img.cols, img.rows));
  uint64_t num_detections =
      std::min<uint64_t>(params.top_detections, params.detections_size(outputs));
  std::vector<ocv::BoundingBox> output_values;
  for (uint64_t i = 0; i < num_detections; ++i) {
    auto detected_class = params.get_detected_class(outputs, i);
    if (detected_class.confidence() <= 0.0 || detected_class.confidence() > 1.0)
      continue;
    output_values.push_back(ocv::BoundingBox(
        std::move(detected_class),
        params.extract_rectangle(outputs[params.output_detections_order], i,
                                 cv::Size(img.cols, img.rows),
                                 x_scale, y_scale, scale_factor)));
  }
  return output_values;
}
Three things are happening:
- params::Rect::get_scale_factors() computes the inverse of the preprocess scaling so we can map model output back to image coordinates.
- params::DetectOutput knows which output tensor holds boxes vs. classes vs. confidences; get_detected_class(outputs, i) reads detection i from the right tensor.
- params::DetectOutput::extract_rectangle(...) reads the bounding-box coordinates and applies the scale factors.
The confidence guard <= 0.0 || > 1.0 filters padding entries that some detectors emit.
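For intuition, here is roughly what that inverse mapping does. This is a sketch only, assuming a plain per-axis resize with no padding; the real arithmetic lives inside params::Rect and also accounts for pad offsets when Pad is active:

#include <opencv2/core.hpp>

// Illustrative inverse mapping: x_scale and y_scale are the ratios of
// source-image size to model-input size, so multiplying model-space
// coordinates by them recovers pixel coordinates in the original image.
cv::Rect2d to_image_space(const cv::Rect2d& box, float x_scale, float y_scale) {
  return {box.x * x_scale, box.y * y_scale,
          box.width * x_scale, box.height * y_scale};
}

extract_rectangle() performs this kind of mapping for you; the sketch only shows why the scale factors are computed once per image rather than once per box.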
3. Run loop
PIPELOGIC_MAIN() {
  const Params params;
  Type in_type = dynamic_type_v<ocv::types::Image>;
  Type out_type = dynamic_type_v<List<ocv::types::BoundingBox>>;

  auto model = std::make_shared<triton::InferenceModel>(params);

  auto pre = [&params](const cv::Mat& img) {
    return params.preprocess(img);
  };
  auto post = [&params](const auto& outputs, const cv::Mat& img) {
    return postprocess(outputs, img, params);
  };

  auto detector = std::make_shared<detect::ObjectDetector>(
      model,
      std::vector<std::string>{model->inputs().at(0).name},
      params.output_layer,
      pre, post, params);

  run([detector](Message input_image) -> Message {
    ocv::Image img = std::move(input_image);
    Object<List<ocv::types::BoundingBox>> answer;
    if (!img.empty()) {
      auto bbs = detector->find_objects(img.mat());
      for (auto&& it : bbs) {
        if (!cv::Rect2d{it.rectangle()}.empty()) {
          answer.push_back(Object<ocv::types::BoundingBox>{std::move(it)});
        }
      }
    }
    return answer;
  });
  return EXIT_SUCCESS;
}
Key moves:
- The Triton input name is queried from the model itself (model->inputs().at(0).name), so it never needs to be hardcoded in component.yml.
- Output names come from params.output_layer, a params::DetectOutput member populated from the output_layer: [String] config field.
- The runtime worker is a lambda that takes a Message (the typed pipelang input) and returns one (the typed output). ocv::Image consumes the Message by move-construction; the Object<List<...>> accumulator is the output.
- Empty image → empty list (defensive).
- Empty result rectangles are filtered (some Triton models pad with zero boxes).
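That last filter leans on plain OpenCV semantics: cv::Rect_::empty() returns true whenever width or height is non-positive, which is exactly what a zero-padded detection looks like. A minimal check:

#include <opencv2/core.hpp>
#include <cassert>

int main() {
  assert(cv::Rect2d(0, 0, 0, 0).empty());      // zero-box padding: filtered out
  assert(!cv::Rect2d(10, 5, 64, 128).empty()); // real detection: kept
  return 0;
}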
4. component.yml
The matching component.yml declares all the config fields the mixins read:
name: "Detect Objects (Triton)"
language: cpp
platform: linux/amd64
build_system: 2-ml
tags: ["latest", "default"]
worker:
input_type: "Image"
output_type: "[BoundingBox]"
file_schema:
model:
file_type: "model"
config_key: "model_name"
component: "triton"
is_optional: false
config_schema:
# params::Format
color_model:
type: "Image.RGB | Image.BGR | Image.GRAY"
# params::Resize
input_height:
type: UInt64
input_width:
type: UInt64
# params::Pad
pad_value:
type: Double
pad_mode:
type: String
# params::Normal
mean:
type: "(Double, Double, Double)"
std:
type: "(Double, Double, Double)"
# params::InputLayer
int_input_type:
type: Bool
add_batch_layer:
type: Bool
change_channel_order:
type: String
# params::ObjectDetection
object_confidence_threshold:
type: Double
iou_threshold:
type: Double
# params::DetectOutput
output_layer:
type: "[String]"
# custom
top_detections:
type: UInt64
default: 100
(A real component.yml has more parameters; this is the minimum each mixin needs.)
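For orientation, one hypothetical set of values for a 640×640 RGB letterbox model might look like this. The numbers, the pad_mode string, and the list syntax for tuples are all illustrative, not library defaults; check your deployment's conventions:

color_model: "Image.RGB"
input_height: 640
input_width: 640
pad_value: 114.0          # grey letterbox fill, a common YOLO-style choice
pad_mode: "letterbox"
mean: [0.0, 0.0, 0.0]
std: [255.0, 255.0, 255.0]
int_input_type: false
add_batch_layer: true
change_channel_order: "CHW"
object_confidence_threshold: 0.25
iou_threshold: 0.45
output_layer: ["output0"]
top_detections: 100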
To adapt for your detector
- Pick which params::* mixins your preprocessing needs. Keep the inheritance order the same; reorder the calls in preprocess() if the model requires a different chain.
- Implement postprocess() to read your model's specific output layout. params::DetectOutput plus params::Rect cover the most common case (boxes/classes/scores in three tensors); a sketch for a single packed tensor follows this list.
- Add custom config fields by reading them in the Params ctor with read_config(config, "...").
- Update component.yml to match the union of all mixin-required fields plus your custom ones.
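As an example of the second point: if your model emits a single packed [N, 6] tensor of (x1, y1, x2, y2, score, class) rows, the decode is a hand-rolled loop. Everything below is a sketch — the raw float pointer and the Detection struct are stand-ins for whatever infer::Tensor and ocv::BoundingBox actually expose:

#include <opencv2/core.hpp>
#include <cstdint>
#include <vector>

// Hypothetical decoded detection; convert to ocv::BoundingBox per its API.
struct Detection {
  int64_t class_id;
  float confidence;
  cv::Rect2d rect;  // in source-image pixel coordinates
};

// `data` points at num_rows rows of (x1, y1, x2, y2, score, class_id).
std::vector<Detection> decode_packed(const float* data, uint64_t num_rows,
                                     float x_scale, float y_scale) {
  std::vector<Detection> out;
  for (uint64_t i = 0; i < num_rows; ++i) {
    const float* row = data + i * 6;
    if (row[4] <= 0.0f) continue;  // zero-padded filler row
    out.push_back({static_cast<int64_t>(row[5]), row[4],
                   {row[0] * x_scale, row[1] * y_scale,
                    (row[2] - row[0]) * x_scale,
                    (row[3] - row[1]) * y_scale}});
  }
  return out;
}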
What's next
- How to write a classification worker
- How to write a segmentation worker
- How to write a pose worker
- API: detect — full mixin reference.