vision3d.ops#

Geometric operators for 3D data.

Functions

  • batched_nms_3d(boxes, scores, idxs, ...) – Class-aware 3D NMS: runs nms_3d() independently per class.

  • box3d_convert(boxes, in_fmt, out_fmt) – Convert 3D bounding boxes from in_fmt to out_fmt.

  • box3d_corners(boxes, format) – Compute the 8 world-space corners of 3D bounding boxes.

  • box3d_iou(boxes1, boxes2, format) – Compute the pairwise intersection-over-union of 3D boxes1 and boxes2.

  • box3d_overlap(boxes1, boxes2, format) – Check 3D overlap between two sets of oriented bounding boxes.

  • nms_3d(boxes, scores, iou_threshold, format) – Greedy, class-agnostic non-maximum suppression on 3D bounding boxes.

  • points_in_boxes_3d(points, boxes, format) – Compute a boolean mask indicating which points fall inside which boxes.

  • points_in_boxes_3d_indices(points, boxes, format) – Return per-point box assignment.

  • project_to_image(points_3d, extrinsics, ...) – Project 3D points in lidar frame to pixel coordinates.

vision3d.ops.batched_nms_3d(boxes, scores, idxs, iou_threshold, format)[source]#

Class-aware 3D NMS: runs nms_3d() independently per class.

Parameters:
  • boxes (Tensor) – [N, K] boxes to perform NMS on. K depends on format.

  • scores (Tensor) – [N] prediction confidences.

  • idxs (Tensor) – [N] integer class indices; NMS is run independently within each class.

  • iou_threshold (float) – Discard any box whose IoU with a higher-scoring kept box of the same class is strictly greater than this value.

  • format (BoundingBox3DFormat) – Format of boxes.
Returns:

int64 tensor of indices into boxes that survived, sorted in decreasing order of score.

Return type:

Tensor
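
A minimal numpy sketch of the per-class strategy, assuming axis-aligned XYZXYZ boxes (xmin, ymin, zmin, xmax, ymax, zmax); the helper names and the axis-aligned IoU stand-in are illustrative, not the library's implementation:

```python
import numpy as np

def iou_3d_aligned(a, b):
    """Pairwise IoU of axis-aligned XYZXYZ boxes a [N, 6] and b [M, 6]."""
    lo = np.maximum(a[:, None, :3], b[None, :, :3])
    hi = np.minimum(a[:, None, 3:], b[None, :, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)
    vol_a = np.prod(a[:, 3:] - a[:, :3], axis=-1)
    vol_b = np.prod(b[:, 3:] - b[:, :3], axis=-1)
    return inter / (vol_a[:, None] + vol_b[None, :] - inter)

def nms_3d_sketch(boxes, scores, iou_threshold):
    """Greedy class-agnostic NMS; returns surviving indices, best first."""
    order = np.argsort(-scores)
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        ious = iou_3d_aligned(boxes[i:i + 1], boxes[order[1:]])[0]
        order = order[1:][ious <= iou_threshold]
    return np.array(keep, dtype=np.int64)

def batched_nms_3d_sketch(boxes, scores, idxs, iou_threshold):
    """Run NMS within each class, then merge and re-sort by score."""
    keep = []
    for c in np.unique(idxs):
        members = np.flatnonzero(idxs == c)
        kept = nms_3d_sketch(boxes[members], scores[members], iou_threshold)
        keep.append(members[kept])
    keep = np.concatenate(keep)
    return keep[np.argsort(-scores[keep])]
```

Note that two identical boxes with different class indices both survive, whereas with the same class index the lower-scoring one is suppressed.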

vision3d.ops.box3d_convert(boxes, in_fmt, out_fmt)[source]#

Convert 3D bounding boxes from in_fmt to out_fmt.

Only the lossless XYZXYZ <-> XYZLWH conversion is supported. All other conversions would discard or fabricate rotation angles.

Parameters:
  • boxes (Tensor) – Boxes [..., K] in in_fmt, where K depends on in_fmt.

  • in_fmt (BoundingBox3DFormat) – Current format of boxes.

  • out_fmt (BoundingBox3DFormat) – Target format.
Returns:

Converted boxes with the same leading dimensions.

Return type:

Tensor
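
The lossless conversion is just a min/max <-> center/size change of variables. A numpy sketch, assuming XYZXYZ is (xmin, ymin, zmin, xmax, ymax, zmax) and XYZLWH is (cx, cy, cz, l, w, h):

```python
import numpy as np

def xyzxyz_to_xyzlwh(boxes):
    """Min/max corners -> center and size; shape [..., 6] preserved."""
    lo, hi = boxes[..., :3], boxes[..., 3:]
    return np.concatenate([(lo + hi) / 2, hi - lo], axis=-1)

def xyzlwh_to_xyzxyz(boxes):
    """Center and size -> min/max corners; exact inverse of the above."""
    ctr, size = boxes[..., :3], boxes[..., 3:]
    return np.concatenate([ctr - size / 2, ctr + size / 2], axis=-1)
```

The round trip recovers the input exactly, which is what makes this pair of conversions lossless.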

vision3d.ops.box3d_corners(boxes, format)[source]#

Compute the 8 world-space corners of 3D bounding boxes.

Supports all rotation formats including full 9-DOF (yaw, pitch, roll).

Corner ordering:

4 -------- 5       z  x
|\         |\      |  /
| 7 -------| 6     | /
| |        | |     |/
0 |--------1 |     +------ y
 \|         \|
  3 -------- 2

Bottom face (z-): {0, 1, 2, 3}. Top face (z+): {4, 5, 6, 7}.

Parameters:
  • boxes (Tensor) – 3D bounding boxes [N, K] where K depends on format.

  • format (BoundingBox3DFormat) – Format of boxes.
Returns:

Corner coordinates [N, 8, 3].

Return type:

Tensor
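
For the single-yaw case, the corners follow by scaling eight signed unit offsets by the half-extents, rotating, and translating. A numpy sketch assuming (cx, cy, cz, l, w, h, yaw) boxes with yaw about z; the within-face corner ordering below is one consistent bottom-first choice and should be checked against the diagram above:

```python
import numpy as np

def box3d_corners_yaw(boxes):
    """Corners [N, 8, 3] of (cx, cy, cz, l, w, h, yaw) boxes."""
    ctr, dims, yaw = boxes[:, :3], boxes[:, 3:6], boxes[:, 6]
    # Signed unit offsets: bottom face (z-) first, then top face (z+).
    signs = np.array([[-1, -1, -1], [1, -1, -1], [1, 1, -1], [-1, 1, -1],
                      [-1, -1,  1], [1, -1,  1], [1, 1,  1], [-1, 1,  1]],
                     dtype=float)
    offsets = signs[None] * (dims[:, None] / 2)          # [N, 8, 3]
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.zeros((len(boxes), 3, 3))
    rot[:, 0, 0], rot[:, 0, 1] = c, -s                   # yaw rotation
    rot[:, 1, 0], rot[:, 1, 1] = s, c                    # about the z axis
    rot[:, 2, 2] = 1.0
    # result[n, k, :] = rot[n] @ offsets[n, k, :] + ctr[n]
    return ctr[:, None] + np.einsum('nij,nkj->nki', rot, offsets)
```

The full 9-DOF case differs only in how the rotation matrix is built (yaw, pitch, roll composed instead of yaw alone).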

vision3d.ops.box3d_iou(boxes1, boxes2, format)[source]#

Compute the pairwise intersection-over-union of 3D boxes1 and boxes2.

iou = inter / (vol1 + vol2 - inter), where inter is the volume of the intersection polyhedron and vol1, vol2 are the volumes of the two input boxes.

The same algorithm handles every supported box format — including full 9-DOF orientation (XYZLWHYPR) — because the clipping step operates on the 8 box corners regardless of how they were produced.

Note: This function is not differentiable.

Parameters:
  • boxes1 (Tensor) – First set of boxes [N, K] where K depends on format.

  • boxes2 (Tensor) – Second set of boxes [M, K].

  • format (BoundingBox3DFormat) – Format of both box sets.
Returns:

[N, M] matrix of IoU values in [0, 1].

Return type:

Tensor
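
The general clipped-polyhedron volume is involved, but in the axis-aligned XYZXYZ special case the intersection is itself a box and the formula collapses to a few lines. An illustrative numpy sketch (not the library's general implementation):

```python
import numpy as np

def box3d_iou_aligned(boxes1, boxes2):
    """Pairwise IoU [N, M] of axis-aligned XYZXYZ boxes."""
    lo = np.maximum(boxes1[:, None, :3], boxes2[None, :, :3])
    hi = np.minimum(boxes1[:, None, 3:], boxes2[None, :, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)   # intersection volume
    vol1 = np.prod(boxes1[:, 3:] - boxes1[:, :3], axis=-1)
    vol2 = np.prod(boxes2[:, 3:] - boxes2[:, :3], axis=-1)
    return inter / (vol1[:, None] + vol2[None, :] - inter)
```

For example, two 2x2x2 cubes offset by 1 along x intersect in a 1x2x2 slab, giving iou = 4 / (8 + 8 - 4) = 1/3.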

vision3d.ops.box3d_overlap(boxes1, boxes2, format)[source]#

Check 3D overlap between two sets of oriented bounding boxes.

Uses the Separating Axis Theorem (SAT) with 15 potential separating axes (3 face normals per box + 9 edge cross products).

Parameters:
  • boxes1 (Tensor) – First set of boxes [N, K] where K depends on format.

  • boxes2 (Tensor) – Second set of boxes [M, K].

  • format (BoundingBox3DFormat) – Format of both box sets.
Returns:

Boolean matrix [N, M] where True indicates overlap.

Return type:

Tensor
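
A minimal numpy sketch of the SAT test for a single pair of boxes (the library function is batched over [N, M]), assuming each box is given by a center, half-extents, and a 3x3 rotation matrix whose columns are the box's local axes:

```python
import numpy as np

def sat_overlap(c1, h1, R1, c2, h2, R2, eps=1e-8):
    """True if two oriented boxes overlap.
    c: center [3], h: half-extents [3], R: local axes as columns [3, 3]."""
    t = c2 - c1
    # 15 candidate axes: 3 face normals per box + 9 edge cross products.
    axes = [R1[:, i] for i in range(3)] + [R2[:, j] for j in range(3)]
    axes += [np.cross(R1[:, i], R2[:, j]) for i in range(3) for j in range(3)]
    for axis in axes:
        n = np.linalg.norm(axis)
        if n < eps:                 # parallel edges -> degenerate axis, skip
            continue
        axis = axis / n
        r1 = np.sum(h1 * np.abs(axis @ R1))   # projection radius of box 1
        r2 = np.sum(h2 * np.abs(axis @ R2))   # projection radius of box 2
        if abs(axis @ t) > r1 + r2:           # separating axis found
            return False
    return True                                # no separating axis -> overlap
```

If the projected center distance exceeds the sum of the projection radii on any candidate axis, the boxes are disjoint; otherwise they overlap.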

vision3d.ops.nms_3d(boxes, scores, iou_threshold, format)[source]#

Greedy, class-agnostic non-maximum suppression on 3D bounding boxes.

Iteratively removes lower-scoring boxes whose IoU with a higher-scoring box exceeds iou_threshold.

Parameters:
  • boxes (Tensor) – [N, K] boxes to perform NMS on. K depends on format.

  • scores (Tensor) – [N] prediction confidences.

  • iou_threshold (float) – Discard any box whose IoU with a higher-scoring kept box is strictly greater than this value.

  • format (BoundingBox3DFormat) – Format of boxes.

Returns:

int64 tensor of indices into boxes that survived, sorted in decreasing order of score.

Return type:

Tensor
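
The greedy loop itself is short once a pairwise IoU is available. A numpy sketch using axis-aligned XYZXYZ IoU as a stand-in for the format-aware IoU the library uses:

```python
import numpy as np

def iou_row(box, boxes):
    """IoU of one axis-aligned XYZXYZ box [6] against boxes [M, 6]."""
    lo = np.maximum(box[:3], boxes[:, :3])
    hi = np.minimum(box[3:], boxes[:, 3:])
    inter = np.prod(np.clip(hi - lo, 0, None), axis=-1)
    vol = np.prod(box[3:] - box[:3])
    vols = np.prod(boxes[:, 3:] - boxes[:, :3], axis=-1)
    return inter / (vol + vols - inter)

def nms_3d_sketch(boxes, scores, iou_threshold):
    order = np.argsort(-scores)              # highest score first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)                       # keep the best remaining box
        rest = order[1:]
        # Drop boxes whose IoU with the kept box strictly exceeds the threshold.
        order = rest[iou_row(boxes[i], boxes[rest]) <= iou_threshold]
    return np.array(keep, dtype=np.int64)
```

Because survivors are appended in visiting order of the score-sorted list, the returned indices are already sorted by decreasing score.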

vision3d.ops.points_in_boxes_3d(points, boxes, format)[source]#

Compute a boolean mask indicating which points fall inside which boxes.

Supports all rotation formats including full 9-DOF (yaw, pitch, roll).

Parameters:
  • points (Tensor) – Point cloud coordinates [N, 3+C]. Only the first 3 columns (x, y, z) are used.

  • boxes (Tensor) – 3D bounding boxes [M, K] where K depends on format.

  • format (BoundingBox3DFormat) – Format of the bounding boxes.

Returns:

Boolean tensor [N, M] where entry (i, j) is True if point i is inside box j.

Return type:

Tensor
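
The underlying test is "transform the point into the box frame, compare against the half-extents". A numpy sketch for the single-yaw case, assuming (cx, cy, cz, l, w, h, yaw) boxes:

```python
import numpy as np

def points_in_boxes_3d_sketch(points, boxes):
    """Boolean [N, M] mask for points [N, 3] vs (cx, cy, cz, l, w, h, yaw) boxes."""
    d = points[:, None, :] - boxes[None, :, :3]          # offsets [N, M, 3]
    c, s = np.cos(boxes[:, 6]), np.sin(boxes[:, 6])
    # Rotate each offset by -yaw into the box's local frame.
    local = np.stack([ c * d[..., 0] + s * d[..., 1],
                      -s * d[..., 0] + c * d[..., 1],
                       d[..., 2]], axis=-1)
    # Inside iff every local coordinate is within the half-extent.
    return np.all(np.abs(local) <= boxes[None, :, 3:6] / 2, axis=-1)
```

The 9-DOF case is identical except that the full inverse rotation (yaw, pitch, roll) is applied instead of -yaw alone.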

vision3d.ops.points_in_boxes_3d_indices(points, boxes, format)[source]#

Return per-point box assignment.

If a point is inside multiple boxes, the first (lowest index) box wins.

Parameters:
  • points (Tensor) – Point cloud coordinates [N, 3+C].

  • boxes (Tensor) – 3D bounding boxes [M, K].

  • format (BoundingBox3DFormat) – Format of the bounding boxes.

Returns:

Integer tensor [N] with the index of the box each point belongs to, or -1 if the point is not in any box.

Return type:

Tensor
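
Given the mask from points_in_boxes_3d(), the assignment is an argmax with a -1 fallback, since argmax over a boolean row returns the first True. A numpy sketch:

```python
import numpy as np

def points_in_boxes_3d_indices_sketch(mask):
    """mask: boolean [N, M] containment mask. Returns int64 [N] assignments."""
    first = mask.argmax(axis=1)                    # first True per row
    # Rows with no containing box get -1 instead of the spurious argmax 0.
    return np.where(mask.any(axis=1), first, -1).astype(np.int64)
```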

vision3d.ops.project_to_image(points_3d, extrinsics, intrinsics)[source]#

Project 3D points in lidar frame to pixel coordinates.

Parameters:
  • points_3d (Tensor) – Points in lidar frame [N, 3].

  • extrinsics (Tensor) – Lidar-to-camera transformation [4, 4].

  • intrinsics (Tensor) – Camera intrinsic matrix [3, 3].

Returns:

(uv, depth) where uv is the pixel coordinates [N, 2] (u, v) and depth is the camera-frame depth [N].

Return type:

tuple[Tensor, Tensor]
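
The projection is the standard pipeline: homogeneous lidar-to-camera transform, intrinsics multiply, perspective divide. A numpy sketch, assuming all points lie in front of the camera (positive depth):

```python
import numpy as np

def project_to_image_sketch(points_3d, extrinsics, intrinsics):
    """points_3d [N, 3] in lidar frame -> (uv [N, 2], depth [N])."""
    ones = np.ones((len(points_3d), 1))
    pts_h = np.concatenate([points_3d, ones], axis=1)   # homogeneous [N, 4]
    cam = (extrinsics @ pts_h.T).T[:, :3]               # camera frame [N, 3]
    depth = cam[:, 2]                                   # z along optical axis
    uvw = (intrinsics @ cam.T).T                        # [N, 3]
    uv = uvw[:, :2] / depth[:, None]                    # perspective divide
    return uv, depth
```

Points at or behind the camera plane (depth <= 0) produce meaningless pixel coordinates and should be filtered by the caller using the returned depth.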