vision3d.transforms.functional

Functional form of the 3D transforms in vision3d.transforms.

Functions

center_crop_camera_intrinsics(inpt, output_size)

Update CameraIntrinsics for a center crop of the corresponding image.

crop_camera_intrinsics(inpt, top, left, ...)

Update CameraIntrinsics for a crop of the corresponding image.

flip_3d(inpt, *, axis)

Flip a tensor along a 3D spatial axis.

flip_3d_bounding_boxes(boxes, *, format, axis)

Flip 3D bounding boxes along axis.

flip_3d_point_cloud(points, *, axis)

Flip point cloud coordinates along axis.

horizontal_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a horizontal image flip.

horizontal_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a horizontal image flip.

horizontal_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a horizontal flip of the corresponding image.

horizontal_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a horizontal image flip.

jitter_points(inpt, *, noise)

Dispatcher entry point for point jittering.

jitter_points_point_cloud(points, *, noise)

Add noise to point xyz coordinates.

pad_camera_intrinsics(inpt, padding, **kwargs)

Update CameraIntrinsics for a pad of the corresponding image.

register_kernel(functional, tv_tensor_cls, *)

Register a kernel for a functional and TVTensor type.

resize_camera_intrinsics(inpt, size[, max_size])

Update CameraIntrinsics for a resize of the corresponding image.

resized_crop_camera_intrinsics(inpt, top, ...)

Update CameraIntrinsics for a crop followed by a resize.

rotate_3d(inpt, *, rotation_matrix)

Rotate a tensor by a 3x3 rotation matrix.

rotate_3d_bounding_boxes(boxes, *, format, ...)

Rotate 3D bounding boxes by rotation_matrix.

rotate_3d_camera_extrinsics(extrinsics, *, ...)

Update camera extrinsics after rotating the lidar frame.

rotate_3d_point_cloud(points, *, rotation_matrix)

Rotate point cloud coordinates by rotation_matrix.

sample_points(inpt, *, indices)

Dispatcher entry point for point sampling.

sample_points_point_cloud(points, *, indices)

Select points by index.

scale_3d(inpt, *, factor)

Scale a tensor by a uniform factor.

scale_3d_bounding_boxes(boxes, *, format, factor)

Scale 3D bounding boxes by factor.

scale_3d_camera_extrinsics(extrinsics, *, factor)

Update camera extrinsics after scaling the lidar frame.

scale_3d_point_cloud(points, *, factor)

Scale point cloud coordinates by factor.

shuffle_points(inpt, *, perm)

Dispatcher entry point for point shuffling.

shuffle_points_point_cloud(points, *, perm)

Permute point order.

translate_3d(inpt, *, offset)

Translate a tensor by a 3D offset.

translate_3d_bounding_boxes(boxes, *, ...)

Translate 3D bounding boxes by offset.

translate_3d_camera_extrinsics(extrinsics, ...)

Update camera extrinsics after translating the lidar frame.

translate_3d_point_cloud(points, *, offset)

Translate point cloud coordinates by offset.

vertical_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a vertical image flip.

vertical_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a vertical image flip.

vertical_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a vertical flip of the corresponding image.

vertical_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a vertical image flip.

vision3d.transforms.functional.center_crop_camera_intrinsics(inpt, output_size)

Update CameraIntrinsics for a center crop of the corresponding image.

Parameters:
  • inpt (CameraIntrinsics) – The intrinsics to update.

  • output_size (list[int]) – Target (h, w) after the center crop.

Returns:

Updated intrinsics with image_size set to output_size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.crop_camera_intrinsics(inpt, top, left, height, width)

Update CameraIntrinsics for a crop of the corresponding image.

Shifts the principal point so projection through the updated intrinsics matches projection through the original intrinsics on the cropped image.

Parameters:
  • inpt (CameraIntrinsics) – The intrinsics to update.

  • top (int) – Top edge of the crop in pixels.

  • left (int) – Left edge of the crop in pixels.

  • height (int) – Crop height in pixels.

  • width (int) – Crop width in pixels.

Returns:

Updated intrinsics with image_size set to (height, width).

Return type:

CameraIntrinsics
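
The principal-point shift described above can be sketched in plain Python. This is an illustration only: the `Intrinsics` record, `project`, and `crop_intrinsics` are hypothetical stand-ins (not the library's CameraIntrinsics class or kernel), and a pinhole model with u = fx*x/z + skew*y/z + cx and v = fy*y/z + cy is assumed.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Intrinsics:
    """Hypothetical stand-in for CameraIntrinsics (illustration only)."""
    fx: float
    fy: float
    cx: float
    cy: float
    skew: float
    image_size: tuple  # (height, width)


def project(k, point):
    """Project a camera-frame point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    return (k.fx * x / z + k.skew * y / z + k.cx, k.fy * y / z + k.cy)


def crop_intrinsics(k, top, left, height, width):
    """Shift the principal point by the crop origin and update image_size."""
    return replace(k, cx=k.cx - left, cy=k.cy - top, image_size=(height, width))


k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0, skew=0.0, image_size=(480, 640))
cropped = crop_intrinsics(k, top=40, left=100, height=200, width=300)

point = (0.2, -0.1, 2.0)
u, v = project(k, point)
u_c, v_c = project(cropped, point)
# The pixel in the cropped image is the original pixel minus the crop origin.
```

Focal lengths and skew are left untouched because a crop only moves the pixel origin.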

vision3d.transforms.functional.flip_3d(inpt, *, axis)

Flip a tensor along a 3D spatial axis.

This is the dispatcher entry point. Type-specific kernels, such as flip_3d_point_cloud and flip_3d_bounding_boxes, are registered for it.

Parameters:
  • inpt (Tensor) – Input tensor.

  • axis (str) – One of "x", "y", "z".

Returns:

Flipped tensor.

Return type:

Tensor

vision3d.transforms.functional.flip_3d_bounding_boxes(boxes, *, format, axis)

Flip 3D bounding boxes along axis.

Parameters:
  • boxes (Tensor) – Bounding box tensor [..., K].

  • format (BoundingBox3DFormat) – Format of the boxes.

  • axis (str) – One of "x", "y", "z".

Returns:

Flipped bounding boxes with the same shape.

Return type:

Tensor

vision3d.transforms.functional.flip_3d_point_cloud(points, *, axis)

Flip point cloud coordinates along axis.

Parameters:
  • points (Tensor) – Point cloud tensor [..., 3+C].

  • axis (str) – One of "x", "y", "z".

Returns:

Flipped point cloud with the same shape.

Return type:

Tensor
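
A minimal sketch of a point-cloud flip, under two assumptions that the text above does not confirm: flipping along an axis negates that coordinate of every point, and feature columns after xyz are left unchanged. Nested lists stand in for tensors, and `flip_points` is a hypothetical helper, not the library kernel.

```python
AXIS_INDEX = {"x": 0, "y": 1, "z": 2}


def flip_points(points, axis):
    """Negate one spatial coordinate of every point; extra features are kept."""
    i = AXIS_INDEX[axis]
    return [row[:i] + [-row[i]] + row[i + 1:] for row in points]


# Two points with layout (x, y, z, intensity).
points = [[1.0, 2.0, 3.0, 0.5], [-4.0, 5.0, 6.0, 0.9]]
flipped = flip_points(points, "y")
```

Under these assumptions, applying the same flip twice returns the original points.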

vision3d.transforms.functional.horizontal_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a horizontal image flip.

Reflects the source frame’s Y axis following the fixed world-axis convention for a horizontal flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (BoundingBoxes3D) – The boxes to flip.

Returns:

The flipped boxes with the same format.

Return type:

BoundingBoxes3D

vision3d.transforms.functional.horizontal_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a horizontal image flip.

Reflects the source frame about its Y axis (paired with a camera-frame X reflection) so the source-to-camera mapping stays consistent with the horizontally flipped image.

Parameters:

inpt (CameraExtrinsics) – The extrinsics to update.

Returns:

Updated extrinsics with the same shape.

Return type:

CameraExtrinsics
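
One way to read the pairing described above is E' = F_cam @ E @ F_src, where F_src reflects the source frame's Y axis and F_cam reflects the camera frame's X axis. The sketch below checks that reading numerically with 4x4 nested lists. The helper names and the exact matrix form are assumptions for illustration, not the library's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def diag(values):
    """Build a square diagonal matrix from a list of diagonal entries."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


FLIP_SOURCE_Y = diag([1.0, -1.0, 1.0, 1.0])  # reflect the source frame's Y axis
FLIP_CAMERA_X = diag([-1.0, 1.0, 1.0, 1.0])  # reflect the camera frame's X axis


def horizontal_flip_extrinsics(e):
    """Sketch: E' = F_cam @ E @ F_src for a 4x4 source-to-camera matrix."""
    return matmul(FLIP_CAMERA_X, matmul(e, FLIP_SOURCE_Y))


# Example extrinsic: 90-degree rotation about Z plus a translation.
E = [[0.0, -1.0, 0.0, 0.5],
     [1.0, 0.0, 0.0, -0.2],
     [0.0, 0.0, 1.0, 1.0],
     [0.0, 0.0, 0.0, 1.0]]
p = [[1.0], [2.0], [3.0], [1.0]]  # homogeneous source-frame point

p_cam = matmul(E, p)
p_flipped = matmul(FLIP_SOURCE_Y, p)  # the point after the source-frame flip
p_cam_new = matmul(horizontal_flip_extrinsics(E), p_flipped)
# p_cam_new equals p_cam with its X coordinate negated.
```

Because the flipped point and the flipped extrinsics cancel the source-frame reflection, only the camera-frame X reflection remains, independent of the camera pose.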

vision3d.transforms.functional.horizontal_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a horizontal flip of the corresponding image.

Mirrors the principal point about the image’s vertical center line and negates the skew so projection through the updated intrinsics matches projection through the original intrinsics on the flipped image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.

Returns:

Updated intrinsics with the same image_size.

Return type:

CameraIntrinsics
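
The following sketch illustrates why the skew is negated along with the mirrored principal point. It assumes a pinhole model with u = fx*x/z + skew*y/z + cx, a flip convention of u' = width - u (some pipelines use width - 1 - u instead), and that the camera-frame X reflection is supplied by the paired extrinsics kernel. The tuple layout and function names are hypothetical.

```python
def project(k, point):
    """Pinhole projection; k is (fx, fy, cx, cy, skew)."""
    fx, fy, cx, cy, skew = k
    x, y, z = point
    return (fx * x / z + skew * y / z + cx, fy * y / z + cy)


def horizontal_flip_intrinsics(k, width):
    """Mirror cx about the vertical center line and negate the skew."""
    fx, fy, cx, cy, skew = k
    return (fx, fy, width - cx, cy, -skew)


width = 640
k = (500.0, 480.0, 300.0, 250.0, 2.0)
point = (0.4, -0.2, 2.0)

u, v = project(k, point)
# The camera-frame X reflection comes from the paired extrinsics update.
mirrored = (-point[0], point[1], point[2])
u_f, v_f = project(horizontal_flip_intrinsics(k, width), mirrored)
# u_f matches width - u, and v_f matches v.
```

With x negated, the fx term changes sign on its own; the skew term multiplies y, which is not reflected, so the skew itself has to flip sign for the whole row to mirror.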

vision3d.transforms.functional.horizontal_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a horizontal image flip.

Reflects the source frame’s Y axis following the fixed world-axis convention for a horizontal flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (PointCloud3D) – The point cloud to flip.

Returns:

The flipped point cloud.

Return type:

PointCloud3D

vision3d.transforms.functional.jitter_points(inpt, *, noise)

Dispatcher entry point for point jittering.

Parameters:
  • inpt (Tensor) – Input tensor.

  • noise (Tensor) – Additive noise [N, 3].

Returns:

The input unchanged for non-point types (passthrough).

Return type:

Tensor

vision3d.transforms.functional.jitter_points_point_cloud(points, *, noise)

Add noise to point xyz coordinates.

Parameters:
  • points (Tensor) – Point cloud [N, 3+C].

  • noise (Tensor) – Additive noise [N, 3].

Returns:

Jittered point cloud with the same shape. Non-xyz features are unchanged.

Return type:

Tensor
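
As a small illustration of the shapes involved, the sketch below adds an [N, 3] noise array to the xyz columns of an [N, 3+C] point cloud and leaves the remaining columns alone. Nested lists stand in for tensors, and `jitter_xyz` is a hypothetical helper rather than the library kernel.

```python
def jitter_xyz(points, noise):
    """Add per-point xyz noise; columns after the first three are untouched."""
    return [
        [p + n for p, n in zip(row[:3], delta)] + row[3:]
        for row, delta in zip(points, noise)
    ]


points = [[1.0, 2.0, 3.0, 0.5], [4.0, 5.0, 6.0, 0.9]]  # [N, 3+C] with C = 1
noise = [[0.25, 0.0, -0.5], [0.0, 0.5, 0.25]]          # [N, 3]
jittered = jitter_xyz(points, noise)
```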

vision3d.transforms.functional.pad_camera_intrinsics(inpt, padding, **kwargs)

Update CameraIntrinsics for a pad of the corresponding image.

Shifts the principal point by the top-left pad and grows image_size to include the padded borders.

Returns:

Updated intrinsics with the padded image_size.

Return type:

CameraIntrinsics
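
A minimal sketch of the padding update, assuming `padding` is given as [left, top, right, bottom] in pixels. That ordering is an assumption for illustration; the accepted forms of `padding` are not documented here. `pad_intrinsics` is a hypothetical helper operating on bare numbers.

```python
def pad_intrinsics(cx, cy, image_size, padding):
    """Shift the principal point by the left/top pad and grow image_size.

    padding is assumed to be [left, top, right, bottom] in pixels, and
    image_size is (height, width).
    """
    left, top, right, bottom = padding
    height, width = image_size
    return cx + left, cy + top, (height + top + bottom, width + left + right)


new_cx, new_cy, new_size = pad_intrinsics(320.0, 240.0, (480, 640), [10, 20, 30, 40])
```

Only the left and top pads move the principal point, because the right and bottom pads do not change where existing pixels sit relative to the image origin.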

vision3d.transforms.functional.register_kernel(functional, tv_tensor_cls, *, tv_tensor_wrapper=True)

Register a kernel for a functional and TVTensor type.

Parameters:
  • functional (Callable[[...], Any]) – The functional to register a kernel for.

  • tv_tensor_cls (type[TVTensor]) – The TVTensor subclass this kernel handles.

  • tv_tensor_wrapper (bool) – If True (default), the kernel receives an unwrapped pure tensor and the output is automatically re-wrapped. If False, the kernel receives the full TVTensor and must handle re-wrapping itself.

Returns:

Decorator that registers the kernel.

Return type:

Callable[[Callable[[…], Any]], Callable[[…], Any]]
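
The decorator pattern described above can be illustrated with a toy registry. Everything in this sketch is hypothetical (`_KERNELS`, `Tagged`, `register`, `double`); it mimics the unwrap, call, re-wrap behavior of `tv_tensor_wrapper=True` with plain lists and is not how the library stores or dispatches kernels.

```python
_KERNELS = {}  # hypothetical registry: {functional: {type: kernel}}


class Tagged(list):
    """Toy wrapper type standing in for a TVTensor subclass."""


def register(functional, cls, *, wrapper=True):
    """Return a decorator that registers a kernel for (functional, cls)."""
    def decorator(kernel):
        def wrapped(inpt, *args, **kwargs):
            # Unwrap to the plain type, run the kernel, then re-wrap the output.
            return cls(kernel(list(inpt), *args, **kwargs))
        _KERNELS.setdefault(functional, {})[cls] = wrapped if wrapper else kernel
        return kernel
    return decorator


def double(inpt):
    """Toy dispatcher: use the kernel registered for type(inpt), else pass through."""
    kernel = _KERNELS.get(double, {}).get(type(inpt))
    return kernel(inpt) if kernel is not None else inpt


@register(double, Tagged)
def double_tagged(values):
    return [2 * v for v in values]


result = double(Tagged([1, 2, 3]))   # dispatched to double_tagged, re-wrapped
passthrough = double("unregistered")  # no kernel for str, returned as-is
```

In this sketch the decorator returns the original kernel, so `double_tagged` stays callable directly on plain lists.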

vision3d.transforms.functional.resize_camera_intrinsics(inpt, size, max_size=None, **kwargs)

Update CameraIntrinsics for a resize of the corresponding image.

Scales the focal lengths, skew, and principal point so projection through the updated intrinsics matches projection through the original intrinsics on the resized image.

Returns:

Updated intrinsics with the new image_size.

Return type:

CameraIntrinsics
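
The scaling rule can be sketched as follows, assuming pixel coordinates scale directly with the image size (no half-pixel offset) and a pinhole model with u = fx*x/z + skew*y/z + cx. How `size` and `max_size` resolve to the final image size is not covered; the sketch takes the old and new (height, width) as given. The tuple layout and function names are hypothetical.

```python
def project(k, point):
    """Pinhole projection; k is (fx, fy, cx, cy, skew)."""
    fx, fy, cx, cy, skew = k
    x, y, z = point
    return (fx * x / z + skew * y / z + cx, fy * y / z + cy)


def resize_intrinsics(k, old_size, new_size):
    """Scale intrinsics for a resize; sizes are (height, width).

    Quantities in the u row (fx, skew, cx) scale with the width ratio,
    quantities in the v row (fy, cy) with the height ratio.
    """
    fx, fy, cx, cy, skew = k
    sy = new_size[0] / old_size[0]
    sx = new_size[1] / old_size[1]
    return (fx * sx, fy * sy, cx * sx, cy * sy, skew * sx)


k = (500.0, 480.0, 320.0, 240.0, 2.0)
resized = resize_intrinsics(k, old_size=(480, 640), new_size=(240, 960))

point = (0.4, -0.2, 2.0)
u, v = project(k, point)
u_r, v_r = project(resized, point)
# u_r matches 1.5 * u and v_r matches 0.5 * v for this size change.
```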

vision3d.transforms.functional.resized_crop_camera_intrinsics(inpt, top, left, height, width, size, **kwargs)

Update CameraIntrinsics for a crop followed by a resize.

Returns:

Updated intrinsics with image_size set to size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.rotate_3d(inpt, *, rotation_matrix)

Rotate a tensor by a 3x3 rotation matrix.

Dispatcher entry point. Type-specific kernels, such as rotate_3d_point_cloud, rotate_3d_bounding_boxes, and rotate_3d_camera_extrinsics, are registered for it.

Parameters:
  • inpt (Tensor) – Input tensor.

  • rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated tensor.

Return type:

Tensor

vision3d.transforms.functional.rotate_3d_bounding_boxes(boxes, *, format, rotation_matrix)

Rotate 3D bounding boxes by rotation_matrix.

Only rotated formats are supported:

  • XYZLWHY: only Z-axis rotations (pure yaw).

  • XYZLWHYPR: arbitrary rotations.

Axis-aligned formats (XYZXYZ, XYZLWH) cannot represent rotation and will raise NotImplementedError.

Parameters:
  • boxes (Tensor) – Bounding box tensor [..., K].

  • format (BoundingBox3DFormat) – Format of the boxes.

  • rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated bounding boxes with the same shape.

Raises:

NotImplementedError – If format is an axis-aligned format (XYZXYZ or XYZLWH).

Return type:

Tensor
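
To illustrate why a yaw-only format is limited to Z-axis rotations, here is a minimal sketch for a single box. It assumes the XYZLWHY layout is (x, y, z, length, width, height, yaw) with yaw measured about Z, which is not confirmed above. `rotate_box_yaw` is a hypothetical helper, and its ValueError is a choice of this sketch rather than documented library behavior.

```python
import math


def rotate_box_yaw(box, rotation_matrix):
    """Rotate an (x, y, z, l, w, h, yaw) box by a pure Z-axis rotation."""
    r = rotation_matrix
    # A pure yaw rotation leaves the Z axis fixed.
    if abs(r[2][2] - 1.0) > 1e-6 or abs(r[0][2]) > 1e-6 or abs(r[1][2]) > 1e-6:
        raise ValueError("only Z-axis rotations can be represented by a yaw angle")
    x, y, z, l, w, h, yaw = box
    angle = math.atan2(r[1][0], r[0][0])  # rotation angle about Z
    new_x = r[0][0] * x + r[0][1] * y
    new_y = r[1][0] * x + r[1][1] * y
    # Dimensions are intrinsic to the box; only center and heading change.
    return [new_x, new_y, z, l, w, h, yaw + angle]


theta = math.pi / 2
rz = [[math.cos(theta), -math.sin(theta), 0.0],
      [math.sin(theta), math.cos(theta), 0.0],
      [0.0, 0.0, 1.0]]
rotated = rotate_box_yaw([2.0, 0.0, 1.0, 4.0, 2.0, 1.5, 0.1], rz)
```

The sketch does not wrap the resulting yaw into a fixed interval; whether the library does is not stated here.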

vision3d.transforms.functional.rotate_3d_camera_extrinsics(extrinsics, *, rotation_matrix)

Update camera extrinsics after rotating the lidar frame.

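The lidar-to-camera extrinsic E satisfies p_cam = E @ p_lidar. After rotating the lidar frame by R, points become p' = R @ p, so the updated extrinsic is E' = E @ R_inv, which keeps p_cam = E' @ p'. Here R_inv is the inverse of R, expanded to a 4x4 homogeneous matrix to match E.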

Parameters:
  • extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].

  • rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor
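
The identity E' = E @ R_inv can be checked numerically. The sketch below uses nested lists in place of tensors and embeds the 3x3 rotation in a 4x4 homogeneous matrix so it can multiply E; the helper names are hypothetical and the example matrices are arbitrary.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def to_homogeneous(r):
    """Embed a 3x3 matrix in the top-left block of a 4x4 identity."""
    return [list(row) + [0.0] for row in r] + [[0.0, 0.0, 0.0, 1.0]]


def rotate_extrinsics(e, r):
    """Sketch of E' = E @ R_inv; a rotation's inverse is its transpose."""
    r_inv = [[r[j][i] for j in range(3)] for i in range(3)]
    return matmul(e, to_homogeneous(r_inv))


theta = math.radians(30.0)
R = [[math.cos(theta), -math.sin(theta), 0.0],
     [math.sin(theta), math.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
E = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 0.0, -1.0, 0.1],
     [0.0, 1.0, 0.0, -0.3],
     [0.0, 0.0, 0.0, 1.0]]
p = [[1.0], [2.0], [3.0], [1.0]]  # homogeneous lidar-frame point

p_cam = matmul(E, p)
p_rotated = matmul(to_homogeneous(R), p)               # p' = R @ p
p_cam_new = matmul(rotate_extrinsics(E, R), p_rotated)  # E' @ p'
# p_cam_new agrees with p_cam up to floating-point error.
```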

vision3d.transforms.functional.rotate_3d_point_cloud(points, *, rotation_matrix)

Rotate point cloud coordinates by rotation_matrix.

Parameters:
  • points (Tensor) – Point cloud tensor [..., 3+C].

  • rotation_matrix (Tensor) – [3, 3] rotation matrix.

Returns:

Rotated point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.sample_points(inpt, *, indices)

Dispatcher entry point for point sampling.

Parameters:
  • inpt (Tensor) – Input tensor.

  • indices (Tensor) – Selection indices [M].

Returns:

The input unchanged for non-point types (passthrough).

Return type:

Tensor

vision3d.transforms.functional.sample_points_point_cloud(points, *, indices)

Select points by index.

Parameters:
  • points (Tensor) – Point cloud [N, 3+C].

  • indices (Tensor) – Selection indices [M]. May contain repeats for oversampling.

Returns:

Selected point cloud [M, 3+C].

Return type:

Tensor
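
Index-based selection with repeats can be sketched as follows. Nested lists stand in for tensors, and `select_points` is a hypothetical helper rather than the library kernel.

```python
def select_points(points, indices):
    """Gather rows by index; repeated indices duplicate the point."""
    return [list(points[i]) for i in indices]


points = [[0.0, 0.0, 0.0, 0.1], [1.0, 1.0, 1.0, 0.2], [2.0, 2.0, 2.0, 0.3]]
indices = [2, 0, 2, 1, 2]  # repeats oversample a 3-point cloud to 5 points
sampled = select_points(points, indices)
```

The output has one row per index, so M can be smaller than N (subsampling) or larger (oversampling).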

vision3d.transforms.functional.scale_3d(inpt, *, factor)

Scale a tensor by a uniform factor.

Dispatcher entry point. Type-specific kernels, such as scale_3d_point_cloud, scale_3d_bounding_boxes, and scale_3d_camera_extrinsics, are registered for it.

Parameters:
  • inpt (Tensor) – Input tensor.

  • factor (float) – Scale factor.

Returns:

Scaled tensor.

Return type:

Tensor

vision3d.transforms.functional.scale_3d_bounding_boxes(boxes, *, format, factor)

Scale 3D bounding boxes by factor.

Scales both position and dimensions. Rotation angles are unchanged.

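Parameters:
  • boxes (Tensor) – Bounding box tensor [..., K].

  • format (BoundingBox3DFormat) – Format of the boxes.

  • factor (float) – Scale factor.
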
Returns:

Scaled bounding boxes with the same shape.

Return type:

Tensor
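
For a single box, the rule "scale position and dimensions, leave angles alone" might look like this. The layout (x, y, z, length, width, height, yaw) is an assumption about the XYZLWHY format, and `scale_box` is a hypothetical helper.

```python
def scale_box(box, factor):
    """Scale the center and dimensions of an (x, y, z, l, w, h, yaw) box."""
    # The first six entries are lengths; anything after them is an angle.
    return [value * factor for value in box[:6]] + list(box[6:])


scaled = scale_box([1.0, -2.0, 0.5, 4.0, 2.0, 1.5, 0.3], 2.0)
```

Angles are dimensionless, so a uniform scale of the scene does not change a box's heading.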

vision3d.transforms.functional.scale_3d_camera_extrinsics(extrinsics, *, factor)

Update camera extrinsics after scaling the lidar frame.

Parameters:
  • extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].

  • factor (float) – Scale factor applied to the lidar frame.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor

vision3d.transforms.functional.scale_3d_point_cloud(points, *, factor)

Scale point cloud coordinates by factor.

Parameters:
  • points (Tensor) – Point cloud tensor [..., 3+C].

  • factor (float) – Scale factor.

Returns:

Scaled point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.shuffle_points(inpt, *, perm)

Dispatcher entry point for point shuffling.

Parameters:
  • inpt (Tensor) – Input tensor.

  • perm (Tensor) – Permutation indices [N].

Returns:

The input unchanged for non-point types (passthrough).

Return type:

Tensor

vision3d.transforms.functional.shuffle_points_point_cloud(points, *, perm)

Permute point order.

Parameters:
  • points (Tensor) – Point cloud [N, 3+C].

  • perm (Tensor) – Permutation indices [N].

Returns:

Permuted point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.translate_3d(inpt, *, offset)

Translate a tensor by a 3D offset.

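Dispatcher entry point. Type-specific kernels, such as translate_3d_point_cloud, translate_3d_bounding_boxes, and translate_3d_camera_extrinsics, are registered for it.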

Parameters:
  • inpt (Tensor) – Input tensor.

  • offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated tensor.

Return type:

Tensor

vision3d.transforms.functional.translate_3d_bounding_boxes(boxes, *, format, offset)

Translate 3D bounding boxes by offset.

Parameters:
  • boxes (Tensor) – Bounding box tensor [..., K].

  • format (BoundingBox3DFormat) – Format of the boxes.

  • offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated bounding boxes with the same shape.

Return type:

Tensor

vision3d.transforms.functional.translate_3d_camera_extrinsics(extrinsics, *, offset)

Update camera extrinsics after translating the lidar frame.

The lidar-to-camera extrinsic translation changes because the lidar origin moved by offset in the lidar frame.

Parameters:
  • extrinsics (Tensor) – Extrinsic matrices [..., 4, 4].

  • offset (Tensor) – Translation [3] as (tx, ty, tz) in lidar frame.

Returns:

Updated extrinsics with the same shape.

Return type:

Tensor
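
By analogy with the rotation case, if translated points are p' = p + offset and camera-frame coordinates are to stay fixed, then E' = E @ T(-offset), where T(-offset) is the homogeneous translation by the negated offset. Equivalently, the translation column becomes t' = t - R_E @ offset. The sketch below checks this numerically. It assumes the library preserves camera-frame coordinates in this way, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def translate_extrinsics(e, offset):
    """Sketch: E' = E @ T(-offset), so that E' @ (p + offset) equals E @ p."""
    inverse_shift = [[1.0, 0.0, 0.0, -offset[0]],
                     [0.0, 1.0, 0.0, -offset[1]],
                     [0.0, 0.0, 1.0, -offset[2]],
                     [0.0, 0.0, 0.0, 1.0]]
    return matmul(e, inverse_shift)


E = [[1.0, 0.0, 0.0, 0.5],
     [0.0, 0.0, -1.0, 0.1],
     [0.0, 1.0, 0.0, -0.3],
     [0.0, 0.0, 0.0, 1.0]]
offset = (1.0, -2.0, 0.5)
p = [[1.0], [2.0], [3.0], [1.0]]  # homogeneous lidar-frame point
p_shifted = [[p[i][0] + offset[i]] for i in range(3)] + [[1.0]]

p_cam = matmul(E, p)
E_new = translate_extrinsics(E, offset)
p_cam_new = matmul(E_new, p_shifted)
# p_cam_new agrees with p_cam; only the last column of E changed.
```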

vision3d.transforms.functional.translate_3d_point_cloud(points, *, offset)

Translate point cloud coordinates by offset.

Parameters:
  • points (Tensor) – Point cloud tensor [..., 3+C].

  • offset (Tensor) – Translation [3] as (tx, ty, tz).

Returns:

Translated point cloud with the same shape.

Return type:

Tensor

vision3d.transforms.functional.vertical_flip_bounding_boxes_3d(inpt)

Flip BoundingBoxes3D to match a vertical image flip.

Reflects the source frame’s Z axis following the fixed world-axis convention for a vertical flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (BoundingBoxes3D) – The boxes to flip.

Returns:

The flipped boxes with the same format.

Return type:

BoundingBoxes3D

vision3d.transforms.functional.vertical_flip_camera_extrinsics(inpt)

Update CameraExtrinsics for a vertical image flip.

Reflects the source frame about its Z axis (paired with a camera-frame Y reflection) so the source-to-camera mapping stays consistent with the vertically flipped image.

Parameters:

inpt (CameraExtrinsics) – The extrinsics to update.

Returns:

Updated extrinsics with the same shape.

Return type:

CameraExtrinsics

vision3d.transforms.functional.vertical_flip_camera_intrinsics(inpt)

Update CameraIntrinsics for a vertical flip of the corresponding image.

Mirrors the principal point about the image’s horizontal center line and negates the skew so projection through the updated intrinsics matches projection through the original intrinsics on the flipped image.

Parameters:

inpt (CameraIntrinsics) – The intrinsics to update.

Returns:

Updated intrinsics with the same image_size.

Return type:

CameraIntrinsics

vision3d.transforms.functional.vertical_flip_point_cloud_3d(inpt)

Flip a PointCloud3D to match a vertical image flip.

Reflects the source frame’s Z axis following the fixed world-axis convention for a vertical flip. The paired extrinsics kernel applies the matching camera-frame reflection, so projection stays consistent for any camera pose.

Parameters:

inpt (PointCloud3D) – The point cloud to flip.

Returns:

The flipped point cloud.

Return type:

PointCloud3D