Detect Objects (Triton)

2 VersionsDetect

Triton-served object detector. The worker preprocesses each {{type:Image}} (letterbox, colour cast, normalise, channel/batch reshape) and sends it through the Triton checkpoint registered as {{param:model}}; raw detections are decoded, NMS-suppressed, optionally rescaled, and emitted as {{type:[BoundingBox]}}.

How it fits

{{type:Image}} -> {{component:detect_objects_triton}} -> {{type:[BoundingBox]}}
                          |
                          +-- preprocess: letterbox to {{param:shortest_edge}} / {{param:longest_edge}} (or {{param:pad_height}} / {{param:pad_width}})
                          +-- colour cast to {{param:color_model}} -> linear `(x * {{param:rescale_factor}} + {{param:rescale_offset}})` -> normalise via {{param:mean}} / {{param:std}}
                          +-- gRPC inference against {{param:model}}, reading {{param:output_score}} / {{param:output_label}} / {{param:output_detections}} / {{param:num_detections}}
                          +-- decode -> NMS at {{param:object_nms_threshold}} respecting {{param:classes_nms_equivalence}} -> top-{{param:top_detections}} -> rescale by {{param:object_scale_factor}}

Pick this when the detector is packaged as a Triton model-repository entry and the deployment runs a triton sidecar. For HuggingFace / Optimum / Ultralytics checkpoints without Triton prefer {{component:detect_objects_huggingface}}; for an in-process YOLO checkpoint prefer {{component:detect_objects_ultralytics_yolo}}; for TorchServe-hosted models see {{component:detect_objects_torchserve}}.

Typical backends

Object tracking dashboard: {{component:input_camera}} -> {{component:detect_objects_triton}} -> {{component:track_bounding_boxes_hungarian_algorithm}} -> {{component:visualize}} -> {{component:output_browser_stream}}.
Streaming source: {{component:input_video_url_ffmpeg}} -> {{component:detect_objects_triton}} -> {{component:detect_zone_transition}} -> {{component:send_object_counts_mqtt}}.
Recorded video analytics: {{component:input_video_file}} -> {{component:detect_objects_triton}} -> {{component:track_bounding_boxes_hungarian_algorithm}}.
Per-box segmentation refine: {{component:input_image_file}} -> {{component:detect_objects_triton}} -> {{component:segment_image_triton}} -> {{component:visualize}}.
Person to pose: {{component:input_camera}} -> {{component:detect_objects_triton}} -> {{component:locate_body_part}} -> {{component:visualize}}.
Distance watch: {{component:input_camera}} -> {{component:detect_objects_triton}} -> {{component:track_bounding_boxes_hungarian_algorithm}} -> {{component:check_object_distance}}.

Caveats

{{param:mean}}, {{param:std}}, {{param:color_model}}, {{param:change_channel_order}}, {{param:rescale_factor}}, {{param:rescale_offset}} must mirror the training config of {{param:model}}; any drift silently produces low-confidence garbage with no error.
Tensor-name mismatches across {{param:output_score}}, {{param:output_label}}, {{param:output_detections}}, and {{param:num_detections}} relative to the Triton signature yield an EMPTY output without an error.
{{param:bbox_x0_y0_x1_y1_order}}, {{param:center_to_corners}}, and {{param:logits}} pick how the raw detection tensor is parsed — mismatches against the served head silently produce wrong coordinates or unscaled scores. There is NO auto-detection of the wrong setting.
{{param:num_classes}} must match the model's class count exactly; too small drops trailing classes silently, too large reads past the tensor end and produces garbage.
{{param:image_scaling}} true letterboxes to {{param:pad_height}} / {{param:pad_width}} (when both are set) or to the {{param:shortest_edge}} / {{param:longest_edge}} clamp; false skips the resize entirely and feeds the network at the input resolution.
{{param:add_batch_layer}} and {{param:int_input_type}} pin the input tensor shape and dtype to the {{param:model}} signature; a mismatch aborts the first inference with a Triton shape error.
{{param:pad_color}} is interpreted as (R, G, B) regardless of {{param:color_model}}; mixing it with a BGR-targeting model dyes the letterbox border the wrong colour and biases detections near the frame edge.
{{param:object_confidence_threshold}} is the base cutoff. {{param:class_confidence_thresholds}} provides per-class overrides; classes absent from the override list keep the base cutoff. {{param:classes}} is a separate allow-list — when non-empty only the listed class ids reach the output regardless of cutoff.
{{param:object_nms_threshold}} is an IoU threshold applied AFTER class filtering; {{param:classes_nms_equivalence}} groups treat listed class ids as the same class during NMS, useful for hierarchical or alias-mapped label sets.
{{param:object_scale_factor}} multiplies the final (x1, y1, x2, y2) rectangle around its centre BEFORE clipping to the image. Values other than 1.0 deliberately inflate or shrink the detected boxes.
{{param:top_detections}} clamps the final list size after NMS — overly aggressive values drop valid detections silently.
{{param:resize_detections_height}} / {{param:resize_detections_width}} rescale the final boxes to a target frame size; 0 keeps them in the original input frame.
All config keys are captured ONCE at startup; runtime changes have NO effect and require a redeploy.
Triton must load {{param:model}} before the first inference; first-frame latency includes the cold load.

Versions

142fc7fadefaultlinux/amd64
Automated release
5/9/2026