detect

The detection layer of pipeml. A composable mixin system for building object-detection, classification, segmentation, pose, and bodypart components.

The params::* mixin system

Every detection component declares a Params struct that inherits from a set of params::* mixins. Each mixin pulls a piece of config_schema from component.yml and contributes a step to the preprocess/postprocess chain. Compose them in the order:

Resize → Pad → Format → Normal → InputLayer → ObjectDetection → DetectOutput → Rect

Plus triton::Params for Triton-served models.
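Concretely, each mixin reads named fields out of the component's config_schema. A hypothetical component.yml fragment, with key names taken from the mixin descriptions below (the authoritative schema is whatever each component's config_schema declares):

```yaml
color_model: RGB              # params::Format
input_height: 640             # params::Resize
input_width: 640
pad_value: 114                # params::Pad
pad_mode: middle_middle
mean: [0.485, 0.456, 0.406]   # params::Normal
std: [0.229, 0.224, 0.225]
int_input_type: false         # params::InputLayer
add_batch_layer: true
change_channel_order: channels_height_width
class_names: [person, car]    # params::ObjectDetection
confidence_threshold: 0.25
iou_threshold: 0.45
boxes_output_name: boxes      # params::DetectOutput
scores_output_name: scores
labels_output_name: labels
top_detections: 100           # component-specific field (see the example below)
```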

params::Format

Reads color_model (Union of Image.RGB | Image.BGR | Image.GRAY). Applies the appropriate cv::cvtColor conversion to the input image during preprocess.

params::Resize

Reads input_height, input_width. Resizes the input matrix; preserves aspect ratio when paired with params::Pad.

params::Pad

Reads pad_value, pad_mode (one of top_left, middle_middle, etc.). Letterboxes after Resize so the model gets the exact (input_height, input_width) it expects.
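The letterbox arithmetic can be sketched as follows. This is illustrative, not the library's code; for brevity it models only two pad modes via a boolean, while the real pad_mode accepts more placements:

```cpp
#include <algorithm>

// Scale the source image to fit (input_w, input_h) while preserving aspect
// ratio, then compute where the resized image sits inside the padded canvas.
struct Letterbox {
  double scale;           // uniform resize factor applied by Resize
  int resized_w, resized_h;
  int pad_left, pad_top;  // offset of the resized image inside the canvas
};

Letterbox letterbox(int src_w, int src_h, int input_w, int input_h,
                    bool middle_middle) {
  double scale = std::min(double(input_w) / src_w, double(input_h) / src_h);
  int rw = int(src_w * scale), rh = int(src_h * scale);
  int px = input_w - rw, py = input_h - rh;
  // top_left puts all padding on the right/bottom; middle_middle centers it.
  int left = middle_middle ? px / 2 : 0;
  int top  = middle_middle ? py / 2 : 0;
  return {scale, rw, rh, left, top};
}
```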

params::Normal

Reads mean: (Double, Double, Double), std: (Double, Double, Double). Applies (x - mean) / std per channel.
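A minimal sketch of that per-channel step on an interleaved HWC float buffer; the real mixin operates on cv::Mat, but the arithmetic is the same:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Apply (x - mean) / std per channel to H*W*3 interleaved pixels in place.
void normalize_hwc(std::vector<float>& pixels,
                   const std::array<float, 3>& mean,
                   const std::array<float, 3>& std_dev) {
  for (std::size_t i = 0; i < pixels.size(); ++i) {
    const std::size_t c = i % 3;  // channel index within the interleaved triple
    pixels[i] = (pixels[i] - mean[c]) / std_dev[c];
  }
}
```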

params::InputLayer

Reads int_input_type (Bool), add_batch_layer (Bool), change_channel_order (one of channels_height_width, height_width_channels, height_width). Reshapes the tensor for the model's expected input layout.
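The height_width_channels → channels_height_width reorder that change_channel_order selects amounts to a tensor transpose. An illustrative standalone version (not the library's code):

```cpp
#include <cstddef>
#include <vector>

// Reorder an interleaved HWC buffer into planar CHW layout.
std::vector<float> hwc_to_chw(const std::vector<float>& hwc,
                              std::size_t h, std::size_t w, std::size_t c) {
  std::vector<float> chw(hwc.size());
  for (std::size_t y = 0; y < h; ++y)
    for (std::size_t x = 0; x < w; ++x)
      for (std::size_t k = 0; k < c; ++k)
        chw[k * h * w + y * w + x] = hwc[(y * w + x) * c + k];
  return chw;
}
```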

params::ObjectDetection

Reads class_names: [String], confidence_threshold: Double, iou_threshold: Double, plus model-specific fields. Drives NMS and class filtering.

params::DetectOutput

Reads the indices/names of output tensors (boxes_output_name, scores_output_name, labels_output_name, etc.). Tells the postprocessor where to read predictions from in the model's output tensor list.

params::Rect

Provides get_scale_factors(cv::Size) that returns (x_scale, y_scale, scale_factor) for converting normalized model output back to pixel coordinates of the original image.
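The return convention of get_scale_factors isn't shown here, but the underlying arithmetic is undoing the letterbox: subtract the pad offset, then divide by the resize scale. A sketch with hypothetical pad_left/pad_top/scale names:

```cpp
// Map a box from model-input (letterboxed) coordinates back to pixel
// coordinates of the original image. Illustrative only; field and parameter
// names are assumptions, not the library's API.
struct Box { float x1, y1, x2, y2; };

Box to_original(const Box& b, float pad_left, float pad_top, float scale) {
  // Undo the letterbox offset first, then the uniform resize.
  return {(b.x1 - pad_left) / scale, (b.y1 - pad_top) / scale,
          (b.x2 - pad_left) / scale, (b.y2 - pad_top) / scale};
}
```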

Composing them

The canonical pattern from detect-objects/src/main.cpp:

struct Params : public detect::params::Format,
                public detect::params::InputLayer,
                public detect::params::Normal,
                public detect::params::Pad,
                public detect::params::Resize,
                public detect::params::ObjectDetection,
                public detect::params::DetectOutput,
                public detect::params::Rect,
                public triton::Params {

private:
  Params(const std::map<std::string, Object<Any>>& config)
      : detect::params::Format(config),
        detect::params::InputLayer(config),
        detect::params::Normal(config),
        detect::params::Pad(config),
        detect::params::Resize(config),
        detect::params::ObjectDetection(config),
        detect::params::DetectOutput(config),
        detect::params::Rect(config),
        triton::Params(config) {
    top_detections = read_config(config, "top_detections").as<UInt64>();
  }

public:
  size_t top_detections;

  Params() : Params(parse_config()) {}

  std::vector<std::unique_ptr<infer::Tensor>>
  preprocess(const cv::Mat& img) const {
    std::vector<std::unique_ptr<infer::Tensor>> answer;
    answer.emplace_back(detect::params::InputLayer::preprocess(
        detect::params::Normal::preprocess(
            detect::params::Format::preprocess(detect::params::Pad::preprocess(
                detect::params::Resize::preprocess(img))))));
    return answer;
  }
};

Each mixin's preprocess() is composed by hand, innermost call first: Resize → Pad → Format → Normal → InputLayer. Add or remove mixins to match your model; for example, omit Pad if your model accepts variable-size input.

Domain wrappers

Detector<Input, Output> (umbrella)

Header: pipelogic/detect/detect.hpp. The base class:

template <typename Input, typename Output>
class Detector {
  Detector(std::shared_ptr<infer::Model>,
           std::vector<std::string> input_names,
           std::vector<std::string> output_names,
           std::function<std::vector<std::unique_ptr<infer::Tensor>>(const Input&)> preprocess,
           std::function<Output(const std::vector<std::unique_ptr<infer::Tensor>>&, const Input&)> postprocess);

  Output raw_detect(const Input&);   // preprocess → infer → postprocess
};
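The shape of raw_detect is just function composition over std::function members. A stripped-down, self-contained analogue (MiniDetector and its type parameters are stand-ins, not the library's types; only the composition pattern is taken from the header above):

```cpp
#include <functional>
#include <utility>
#include <vector>

// preprocess → infer → postprocess, with the postprocessor also seeing the
// original input (as Detector's postprocess signature above does).
template <typename Input, typename Output, typename Tensors>
class MiniDetector {
public:
  MiniDetector(std::function<Tensors(const Input&)> pre,
               std::function<Tensors(const Tensors&)> infer,
               std::function<Output(const Tensors&, const Input&)> post)
      : pre_(std::move(pre)), infer_(std::move(infer)), post_(std::move(post)) {}

  Output raw_detect(const Input& in) const {
    Tensors t = infer_(pre_(in));  // preprocess, then run the model
    return post_(t, in);           // postprocess raw outputs + original input
  }

private:
  std::function<Tensors(const Input&)> pre_;
  std::function<Tensors(const Tensors&)> infer_;
  std::function<Output(const Tensors&, const Input&)> post_;
};
```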

ObjectDetector

Header: pipelogic/detect/object.hpp.

class ObjectDetector : Detector<cv::Mat, std::vector<ocv::BoundingBox>> {
public:
  ObjectDetector(std::shared_ptr<infer::Model> model,
                 std::vector<std::string> input_names,
                 std::vector<std::string> output_names,
                 PreprocessFn preprocess,
                 PostprocessFn postprocess,
                 params::ObjectDetection params = {});

  std::vector<ocv::BoundingBox> find_objects(const cv::Mat& img);
};

Plus NMS helpers:

std::vector<ocv::BoundingBox>
object_non_maximum_suppression(
    const std::vector<ocv::BoundingBox>& boxes,
    const std::set<uint64_t>& classes_of_interest,
    float iou_threshold);

std::vector<ocv::BoundingBox>
object_nms_equivalence_classes(
    const std::vector<ocv::BoundingBox>& boxes,
    const std::set<std::set<uint64_t>>& equivalence_classes,
    float iou_threshold);

std::vector<ocv::BoundingBox>
object_nms_collectively(
    const std::vector<ocv::BoundingBox>& boxes,
    const std::set<uint64_t>& classes_of_interest,
    float iou_threshold);

std::vector<ocv::BoundingBox>
get_bboxes_from_data(const cv::Mat& img,
                     const std::vector<float>& boxes,
                     const std::vector<int>& labels,
                     const std::vector<float>& scores);

ImageClassifier

Header: pipelogic/detect/classify.hpp.

class ImageClassifier : Detector<cv::Mat, std::vector<ocv::DetectedClass>> {
public:
  ImageClassifier(model, input_names, output_names, pre, post, params::Classify = {});
  std::vector<ocv::DetectedClass> classify(const cv::Mat& img);
};

Returns top-K detected classes (with confidences) sorted by confidence.
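The top-K step a classifier postprocess performs can be sketched as follows (illustrative; the real postprocess builds ocv::DetectedClass values rather than index/score pairs):

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Pair each class index with its confidence, then keep the K highest-scoring
// pairs in descending order of confidence.
std::vector<std::pair<std::size_t, float>>
top_k(const std::vector<float>& confidences, std::size_t k) {
  std::vector<std::pair<std::size_t, float>> ranked;
  for (std::size_t i = 0; i < confidences.size(); ++i)
    ranked.emplace_back(i, confidences[i]);
  k = std::min(k, ranked.size());
  std::partial_sort(ranked.begin(), ranked.begin() + k, ranked.end(),
                    [](const auto& a, const auto& b) { return a.second > b.second; });
  ranked.resize(k);
  return ranked;
}
```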

PanopticSegmentor

Header: pipelogic/detect/segment.hpp.

class PanopticSegmentor : Detector<cv::Mat, std::vector<ocv::Segmentation>> {
public:
  PanopticSegmentor(model, input_names, output_names, pre, post, params::Segment = {});
  std::vector<ocv::Segmentation> segment(const cv::Mat& img);
};

Each Segmentation carries a class label plus a boolean mask.

PoseEstimator

Header: pipelogic/detect/pose.hpp.

class PoseEstimator : Detector<cv::Mat, std::vector<std::vector<ocv::Landmark>>> {
public:
  PoseEstimator(model, input_names, output_names, pre, post, params::Pose = {});
  std::vector<std::vector<ocv::Landmark>> find_keypoints(const cv::Mat& img);
};

Returns one inner vector per detected pose, with keypoints in the order defined by the pose family (Landmarks2d::Human17, Human22, Human26, Animal17, Hand21).

Bodypart

Header: pipelogic/detect/bodypart.hpp. Crops bodyparts from an image given a Landmarks2d::Human22 and emits per-part confidence scores.
