vision3d.ops#

Geometric operators for 3D data.

Functions

  • batched_nms_3d(boxes, scores, idxs, ...) – Class-aware 3D NMS: runs nms_3d() independently per class.

  • box3d_convert(boxes, in_fmt, out_fmt) – Convert 3D bounding boxes from in_fmt to out_fmt.

  • box3d_corners(boxes, format) – Compute the 8 world-space corners of 3D bounding boxes.

  • box3d_iou(boxes1, boxes2, format) – Compute the pairwise intersection-over-union of 3D boxes1 and boxes2.

  • box3d_overlap(boxes1, boxes2, format) – Check 3D overlap between two sets of oriented bounding boxes.

  • nms_3d(boxes, scores, iou_threshold, format) – Greedy, class-agnostic non-maximum suppression on 3D bounding boxes.

  • points_in_boxes_3d(points, boxes, format) – Compute a boolean mask indicating which points fall inside which boxes.

  • points_in_boxes_3d_indices(points, boxes, format) – Return per-point box assignment.

  • project_to_image(points_3d, extrinsics, ...) – Project 3D points in lidar frame to pixel coordinates.

vision3d.ops.batched_nms_3d(boxes, scores, idxs, iou_threshold, format)[source]#

Class-aware 3D NMS: runs nms_3d() independently per class.

Parameters:
  • boxes (Tensor) – [N, K] boxes to perform NMS on. K depends on format.

  • scores (Tensor) – [N] prediction confidences.

  • idxs (Tensor) – [N] integer class indices; NMS is run independently within each class.

  • iou_threshold (float) – Discard any box whose IoU with a higher-scoring kept box of the same class is strictly greater than this value.

  • format (BoundingBox3DFormat) – Format of boxes.
Returns:

int64 tensor of indices into boxes that survived, sorted in decreasing order of score.

Return type:

Tensor
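
A minimal numpy sketch of the per-class strategy, assuming axis-aligned XYZXYZ boxes (xmin, ymin, zmin, xmax, ymax, zmax); the helper names and the axis-aligned IoU stand-in are illustrative, not the library's implementation:

```python
import numpy as np

def iou_3d_aligned(a, b):
    """Pairwise IoU of axis-aligned XYZXYZ boxes a [N, 6] and b [M, 6]."""
    lo = np.maximum(a[:, None, :3], b[None, :, :3])
    hi = np.minimum(a[:, None, 3:], b[None, :, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)
    vol_a = np.prod(a[:, 3:] - a[:, :3], axis=-1)
    vol_b = np.prod(b[:, 3:] - b[:, :3], axis=-1)
    return inter / (vol_a[:, None] + vol_b[None, :] - inter)

def nms_3d_sketch(boxes, scores, iou_threshold):
    """Greedy class-agnostic NMS; returns surviving indices, best first."""
    order = np.argsort(-scores)
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        ious = iou_3d_aligned(boxes[i:i + 1], boxes[order[1:]])[0]
        order = order[1:][ious <= iou_threshold]
    return np.array(keep, dtype=np.int64)

def batched_nms_3d_sketch(boxes, scores, idxs, iou_threshold):
    """Run NMS within each class, then merge and re-sort by score."""
    keep = []
    for c in np.unique(idxs):
        members = np.flatnonzero(idxs == c)
        kept = nms_3d_sketch(boxes[members], scores[members], iou_threshold)
        keep.append(members[kept])
    keep = np.concatenate(keep)
    return keep[np.argsort(-scores[keep])]
```

Note that two identical boxes with different class indices both survive, whereas with the same class index the lower-scoring one is suppressed.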

vision3d.ops.box3d_convert(boxes, in_fmt, out_fmt)[source]#

Convert 3D bounding boxes from in_fmt to out_fmt.

Only the lossless XYZXYZ <-> XYZLWH conversion is supported. All other conversions would discard or fabricate rotation angles.

Parameters:
  • boxes (Tensor) – Boxes [..., K] in in_fmt, where K depends on in_fmt.

  • in_fmt (BoundingBox3DFormat) – Current format of boxes.

  • out_fmt (BoundingBox3DFormat) – Target format.
Returns:

Converted boxes with the same leading dimensions.

Return type:

Tensor
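
The lossless conversion is just a min/max <-> center/size change of variables. A numpy sketch, assuming XYZXYZ is (xmin, ymin, zmin, xmax, ymax, zmax) and XYZLWH is (cx, cy, cz, l, w, h):

```python
import numpy as np

def xyzxyz_to_xyzlwh(boxes):
    """Min/max corners -> center and size; shape [..., 6] preserved."""
    lo, hi = boxes[..., :3], boxes[..., 3:]
    return np.concatenate([(lo + hi) / 2, hi - lo], axis=-1)

def xyzlwh_to_xyzxyz(boxes):
    """Center and size -> min/max corners; exact inverse of the above."""
    ctr, size = boxes[..., :3], boxes[..., 3:]
    return np.concatenate([ctr - size / 2, ctr + size / 2], axis=-1)
```

The round trip recovers the input exactly, which is what makes this pair of conversions lossless.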

vision3d.ops.box3d_corners(boxes, format)[source]#

Compute the 8 world-space corners of 3D bounding boxes.

Supports all rotation formats including full 9-DOF (yaw, pitch, roll).

Corner ordering:

4 -------- 5       z  x
|\         |\      |  /
| 7 -------| 6     | /
| |        | |     |/
0 |--------1 |     +------ y
 \|         \|
  3 -------- 2

Bottom face (z-): {0, 1, 2, 3}. Top face (z+): {4, 5, 6, 7}.

Parameters:
  • boxes (Tensor) – 3D bounding boxes [N, K] where K depends on format.

  • format (BoundingBox3DFormat) – Format of boxes.
Returns:

Corner coordinates [N, 8, 3].

Return type:

Tensor
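
For the single-yaw case, the corners follow by scaling eight signed unit offsets by the half-extents, rotating, and translating. A numpy sketch assuming (cx, cy, cz, l, w, h, yaw) boxes with yaw about z; the within-face corner ordering below is one consistent bottom-first choice and should be checked against the diagram above:

```python
import numpy as np

def box3d_corners_yaw(boxes):
    """Corners [N, 8, 3] of (cx, cy, cz, l, w, h, yaw) boxes."""
    ctr, dims, yaw = boxes[:, :3], boxes[:, 3:6], boxes[:, 6]
    # Signed unit offsets: bottom face (z-) first, then top face (z+).
    signs = np.array([[-1, -1, -1], [1, -1, -1], [1, 1, -1], [-1, 1, -1],
                      [-1, -1,  1], [1, -1,  1], [1, 1,  1], [-1, 1,  1]],
                     dtype=float)
    offsets = signs[None] * (dims[:, None] / 2)          # [N, 8, 3]
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.zeros((len(boxes), 3, 3))
    rot[:, 0, 0], rot[:, 0, 1] = c, -s                   # yaw rotation
    rot[:, 1, 0], rot[:, 1, 1] = s, c                    # about the z axis
    rot[:, 2, 2] = 1.0
    # result[n, k, :] = rot[n] @ offsets[n, k, :] + ctr[n]
    return ctr[:, None] + np.einsum('nij,nkj->nki', rot, offsets)
```

The full 9-DOF case differs only in how the rotation matrix is built (yaw, pitch, roll composed instead of yaw alone).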

vision3d.ops.box3d_iou(boxes1, boxes2, format)[source]#

Compute the pairwise intersection-over-union of 3D boxes1 and boxes2.

iou = inter / (vol1 + vol2 - inter), where inter is the volume of the intersection polyhedron and vol1, vol2 are the volumes of the two input boxes.

The same algorithm handles every supported box format — including full 9-DOF orientation (XYZLWHYPR) — because the clipping step operates on the 8 box corners regardless of how they were produced.

Note: This function is not differentiable.

Parameters:
  • boxes1 (Tensor) – First set of boxes [N, K] where K depends on format.

  • boxes2 (Tensor) – Second set of boxes [M, K].

  • format (BoundingBox3DFormat) – Format of both box sets.
Returns:

[N, M] matrix of IoU values in [0, 1].

Return type:

Tensor
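
The general clipped-polyhedron volume is involved, but in the axis-aligned XYZXYZ special case the intersection is itself a box and the formula collapses to a few lines. An illustrative numpy sketch (not the library's general implementation):

```python
import numpy as np

def box3d_iou_aligned(boxes1, boxes2):
    """Pairwise IoU [N, M] of axis-aligned XYZXYZ boxes."""
    lo = np.maximum(boxes1[:, None, :3], boxes2[None, :, :3])
    hi = np.minimum(boxes1[:, None, 3:], boxes2[None, :, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)   # intersection volume
    vol1 = np.prod(boxes1[:, 3:] - boxes1[:, :3], axis=-1)
    vol2 = np.prod(boxes2[:, 3:] - boxes2[:, :3], axis=-1)
    return inter / (vol1[:, None] + vol2[None, :] - inter)
```

For example, two 2x2x2 cubes offset by 1 along x intersect in a 1x2x2 slab, giving iou = 4 / (8 + 8 - 4) = 1/3.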

vision3d.ops.box3d_overlap(boxes1, boxes2, format)[source]#

Check 3D overlap between two sets of oriented bounding boxes.

Uses the Separating Axis Theorem (SAT) with 15 potential separating axes (3 face normals per box + 9 edge cross products).

Parameters:
  • boxes1 (Tensor) – First set of boxes [N, K] where K depends on format.

  • boxes2 (Tensor) – Second set of boxes [M, K].

  • format (BoundingBox3DFormat) – Format of both box sets.
Returns:

Boolean matrix [N, M] where True indicates overlap.

Return type:

Tensor
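
A minimal numpy sketch of the SAT test for a single pair of boxes (the library function is batched over [N, M]), assuming each box is given by a center, half-extents, and a 3x3 rotation matrix whose columns are the box's local axes:

```python
import numpy as np

def sat_overlap(c1, h1, R1, c2, h2, R2, eps=1e-8):
    """True if two oriented boxes overlap.
    c: center [3], h: half-extents [3], R: local axes as columns [3, 3]."""
    t = c2 - c1
    # 15 candidate axes: 3 face normals per box + 9 edge cross products.
    axes = [R1[:, i] for i in range(3)] + [R2[:, j] for j in range(3)]
    axes += [np.cross(R1[:, i], R2[:, j]) for i in range(3) for j in range(3)]
    for axis in axes:
        n = np.linalg.norm(axis)
        if n < eps:                 # parallel edges -> degenerate axis, skip
            continue
        axis = axis / n
        r1 = np.sum(h1 * np.abs(axis @ R1))   # projection radius of box 1
        r2 = np.sum(h2 * np.abs(axis @ R2))   # projection radius of box 2
        if abs(axis @ t) > r1 + r2:           # separating axis found
            return False
    return True                                # no separating axis -> overlap
```

If the projected center distance exceeds the sum of the projection radii on any candidate axis, the boxes are disjoint; otherwise they overlap.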

vision3d.ops.nms_3d(boxes, scores, iou_threshold, format)[source]#

Greedy, class-agnostic non-maximum suppression on 3D bounding boxes.

Iteratively removes lower-scoring boxes whose IoU with a higher-scoring box exceeds iou_threshold.

Parameters:
  • boxes (Tensor) – [N, K] boxes to perform NMS on. K depends on format.

  • scores (Tensor) – [N] prediction confidences.

  • iou_threshold (float) – Discard any box whose IoU with a higher-scoring kept box is strictly greater than this value.

  • format (BoundingBox3DFormat) – Format of boxes.

Returns:

int64 tensor of indices into boxes that survived, sorted in decreasing order of score.

Return type:

Tensor
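
The greedy loop itself is short once a pairwise IoU is available. A numpy sketch using axis-aligned XYZXYZ IoU as a stand-in for the format-aware IoU the library uses:

```python
import numpy as np

def iou_row(box, boxes):
    """IoU of one axis-aligned XYZXYZ box [6] against boxes [M, 6]."""
    lo = np.maximum(box[:3], boxes[:, :3])
    hi = np.minimum(box[3:], boxes[:, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)
    vol = np.prod(box[3:] - box[:3])
    vols = np.prod(boxes[:, 3:] - boxes[:, :3], axis=-1)
    return inter / (vol + vols - inter)

def nms_3d_sketch(boxes, scores, iou_threshold):
    order = np.argsort(-scores)              # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)                       # keep the best remaining box
        rest = order[1:]
        # Drop boxes whose IoU with the kept box strictly exceeds the threshold.
        order = rest[iou_row(boxes[i], boxes[rest]) <= iou_threshold]
    return np.array(keep, dtype=np.int64)
```

Because survivors are appended in visiting order of the score-sorted list, the returned indices are already sorted by decreasing score.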

vision3d.ops.points_in_boxes_3d(points, boxes, format)[source]#

Compute a boolean mask indicating which points fall inside which boxes.

Supports all rotation formats including full 9-DOF (yaw, pitch, roll).

Parameters:
  • points (Tensor) – Point cloud coordinates [N, 3+C]. Only the first 3 columns (x, y, z) are used.

  • boxes (Tensor) – 3D bounding boxes [M, K] where K depends on format.

  • format (BoundingBox3DFormat) – Format of the bounding boxes.

Returns:

Boolean tensor [N, M] where entry (i, j) is True if point i is inside box j.

Return type:

Tensor
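
The underlying test is "transform the point into the box frame, compare against the half-extents". A numpy sketch for the single-yaw case, assuming (cx, cy, cz, l, w, h, yaw) boxes:

```python
import numpy as np

def points_in_boxes_3d_sketch(points, boxes):
    """Boolean [N, M] mask for points [N, 3] vs (cx, cy, cz, l, w, h, yaw) boxes."""
    d = points[:, None, :] - boxes[None, :, :3]          # offsets [N, M, 3]
    c, s = np.cos(boxes[:, 6]), np.sin(boxes[:, 6])
    # Rotate each offset by -yaw into the box's local frame.
    local = np.stack([ c * d[..., 0] + s * d[..., 1],
                      -s * d[..., 0] + c * d[..., 1],
                       d[..., 2]], axis=-1)
    # Inside iff every local coordinate is within the half-extent.
    return np.all(np.abs(local) <= boxes[None, :, 3:6] / 2, axis=-1)
```

The 9-DOF case is identical except that the full inverse rotation (yaw, pitch, roll) is applied instead of -yaw alone.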

vision3d.ops.points_in_boxes_3d_indices(points, boxes, format)[source]#

Return per-point box assignment.

If a point is inside multiple boxes, the first (lowest index) box wins.

Parameters:
  • points (Tensor) – Point cloud coordinates [N, 3+C].

  • boxes (Tensor) – 3D bounding boxes [M, K].

  • format (BoundingBox3DFormat) – Format of the bounding boxes.

Returns:

Integer tensor [N] with the index of the box each point belongs to, or -1 if the point is not in any box.

Return type:

Tensor
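
Given the mask from points_in_boxes_3d(), the assignment is an argmax with a -1 fallback, since argmax over a boolean row returns the first True. A numpy sketch:

```python
import numpy as np

def points_in_boxes_3d_indices_sketch(mask):
    """mask: boolean [N, M] containment mask. Returns int64 [N] assignments."""
    first = mask.argmax(axis=1)                    # first True per row
    # Rows with no containing box get -1 instead of the spurious argmax 0.
    return np.where(mask.any(axis=1), first, -1).astype(np.int64)
```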

vision3d.ops.project_to_image(points_3d, extrinsics, intrinsics)[source]#

Project 3D points in lidar frame to pixel coordinates.

Parameters:
  • points_3d (Tensor) – Points in lidar frame [N, 3].

  • extrinsics (Tensor) – Lidar-to-camera transformation [4, 4].

  • intrinsics (Tensor) – Camera intrinsic matrix [3, 3].

Returns:

(uv, depth) where uv is the pixel coordinates [N, 2] (u, v) and depth is the camera-frame depth [N].

Return type:

tuple[Tensor, Tensor]
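
The projection is the standard pipeline: homogeneous lidar-to-camera transform, intrinsics multiply, perspective divide. A numpy sketch, assuming all points lie in front of the camera (positive depth):

```python
import numpy as np

def project_to_image_sketch(points_3d, extrinsics, intrinsics):
    """points_3d [N, 3] in lidar frame -> (uv [N, 2], depth [N])."""
    ones = np.ones((len(points_3d), 1))
    pts_h = np.concatenate([points_3d, ones], axis=1)   # homogeneous [N, 4]
    cam = (extrinsics @ pts_h.T).T[:, :3]               # camera frame [N, 3]
    depth = cam[:, 2]                                   # z along optical axis
    uvw = (intrinsics @ cam.T).T                        # [N, 3]
    uv = uvw[:, :2] / depth[:, None]                    # perspective divide
    return uv, depth
```

Points at or behind the camera plane (depth <= 0) produce meaningless pixel coordinates and should be filtered by the caller using the returned depth.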