# Build a pose estimator
A pose estimator takes an image plus the bounding boxes of detected people and returns a set of landmarks — keypoint locations like shoulders, hips, and knees — for each person.
Pipeml ships a mixin that handles preprocessing, model invocation, and decoding, so the component is mostly wiring: declare the inputs and output, plug in the mixin, and run.
## 1. Params
```cpp
class Params : public detect::params::GeneralPoseEstimator,
               public triton::Params {
  // Private delegating constructor: each parent parses its own keys
  // from the same config map.
  Params(const std::map<std::string, Object<Any>>& config)
      : detect::params::GeneralPoseEstimator(config),
        triton::Params(config) {}

 public:
  Params() : Params(parse_config()) {}
};
```
A single mixin handles all the pose-specific config (preprocess, postprocess, joint count, top-k, etc.).
## 2. Run loop
```cpp
PIPELOGIC_MAIN() {
  const Params params;
  auto model = std::make_shared<triton::InferenceModel>(params);

  // GeneralPoseEstimator expects 1 or 2 outputs depending on the model.
  const auto& ou = model->outputs();
  std::vector<std::string> outputs{ou.at(0).name};
  if (ou.size() > 1) outputs.push_back(ou.at(1).name);

  detect::GeneralPoseEstimator detector(
      model, params, model->inputs().at(0).name, outputs);

  run([&detector](Message image, Message pbboxes) -> Message {
    ocv::Image img = std::move(image);

    // Unwrap the incoming [BoundingBox] message into the plain vector
    // the detector API expects.
    auto bbox_list = pbboxes.as<List<ocv::types::BoundingBox>>();
    std::vector<ocv::BoundingBox> bboxes;
    bboxes.reserve(bbox_list.size());
    for (int i = 0; i < bbox_list.size(); ++i) {
      bboxes.push_back(ocv::BoundingBox{bbox_list.data(i)});
    }

    auto landmarks_collection = detector.find_landmarks(img.mat(), bboxes);

    // Re-wrap the results as [[Landmark]]: one inner list per pose.
    Object<List<List<ocv::types::Landmark>>> answer;
    for (auto& landmarks : landmarks_collection) {  // non-const: we move out
      Object<List<ocv::types::Landmark>> lds;
      for (auto&& ld : landmarks) {
        lds.push_back(Object<ocv::types::Landmark>{std::move(ld)});
      }
      answer.push_back(std::move(lds));
    }
    return answer;
  });
  return EXIT_SUCCESS;
}
```
Notes:
- The worker takes two `Message` arguments. The pipelogic runtime supplies them in the order declared by `component.yml`'s `worker.input_types`.
- The bbox list is consumed via `pbboxes.as<List<ocv::types::BoundingBox>>()`, then unwrapped into a `std::vector<ocv::BoundingBox>` for the detector API.
- The output type is `[[Landmark]]`: an outer list of pose instances, each an inner list of joints in the order defined by the pose family (e.g., COCO 17-point order for `Landmarks2d::Human17`).
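The inner lists follow the pose family's fixed joint ordering. For reference, the standard COCO 17-keypoint order that `Landmarks2d::Human17` follows is sketched below; the string names are the conventional COCO ones, not identifiers from this framework:

```cpp
#include <array>

// Standard COCO 17-keypoint order (conventional names; the framework
// may expose its own enum for these indices).
constexpr std::array<const char*, 17> kCoco17JointOrder = {
    "nose",         "left_eye",       "right_eye",      "left_ear",
    "right_ear",    "left_shoulder",  "right_shoulder", "left_elbow",
    "right_elbow",  "left_wrist",     "right_wrist",    "left_hip",
    "right_hip",    "left_knee",      "right_knee",     "left_ankle",
    "right_ankle"};
```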
## 3. component.yml
name: "Detect Landmarks"
language: cpp
platform: linux/amd64
build_system: 2-ml
tags: ["latest", "default"]
worker:
input_types:
- "Image"
- "[BoundingBox]"
output_type: "[[Landmark]]"
file_schema:
model:
file_type: "model"
config_key: "model_name"
component: "triton"
is_optional: false
config_schema:
# detect::params::GeneralPoseEstimator
color_model:
type: "Image.RGB | Image.BGR | Image.GRAY"
input_height:
type: UInt64
input_width:
type: UInt64
mean:
type: "(Double, Double, Double)"
std:
type: "(Double, Double, Double)"
num_joints:
type: UInt64
pose_family:
type: String # e.g. "Human17", "Human22"
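For reference, a config that satisfies this schema could look like the sketch below. The values are illustrative (a common top-down pose input size and ImageNet normalization stats), not defaults, and the list syntax for the `mean`/`std` tuples is an assumption:

```yaml
model_name: "pose_model"   # resolved through file_schema above
color_model: "Image.RGB"
input_height: 256
input_width: 192
mean: [0.485, 0.456, 0.406]
std: [0.229, 0.224, 0.225]
num_joints: 17
pose_family: "Human17"
```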
## Pose families
The `Landmarks2d::*` named types fix the joint count and ordering:

| Type | Joints | Source |
|---|---|---|
| `Landmarks2d::Human17` | 17 | COCO Body |
| `Landmarks2d::Human22` | 22 | COCO + MPII |
| `Landmarks2d::Human26` | 26 | Halpe |
| `Landmarks2d::Animal17` | 17 | AP-10K |
| `Landmarks2d::Hand21` | 21 | COCO-WholeBody Hand |
The `label-*-landmarks` components in the workers catalog convert raw `[Landmark]` lists into the typed `Landmarks2d::*` shape that downstream pose-aware components consume (`detect-bodypart`, `filter-pose-bbox`, etc.).
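Until that conversion runs, nothing guarantees a raw pose carries the joint count its family promises. A minimal sketch of the length check the conversion implies (a hypothetical helper, not framework API):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical guard, not part of the framework: verify that a raw
// pose has exactly the joint count its Landmarks2d family promises,
// e.g. check_joint_count(raw, 17) before treating it as Human17.
inline void check_joint_count(const std::vector<ocv::types::Landmark>& raw,
                              std::size_t expected) {
  if (raw.size() != expected) {
    throw std::invalid_argument("pose joint count does not match family");
  }
}
```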