vision3d.tensors#
TVTensor subclasses with 3D semantics.
Functions

- wrap – Convert a Tensor into the same TVTensor subclass as like.

Classes

- BoundingBox3DFormat – Coordinate format of a 3D bounding box.
- BoundingBoxes3D – Tensor subclass for 3D bounding boxes with shape [N, K].
- CameraExtrinsics – Tensor subclass for camera extrinsic matrices with shape [N, 4, 4].
- CameraImages – Image subclass for multi-camera images with shape [N, C, H, W].
- CameraIntrinsics – Tensor subclass for camera intrinsic matrices with shape [N, 3, 3].
- PointCloud3D – Tensor subclass for 3D point clouds with shape [N, 3+C].
- class vision3d.tensors.BoundingBox3DFormat(*values)[source]#
Bases: Enum

Coordinate format of a 3D bounding box.

Dimension convention:

- l (length) = extent along X (forward axis, dx)
- w (width) = extent along Y (lateral axis, dy)
- h (height) = extent along Z (vertical axis, dz)

Rotation uses intrinsic Tait-Bryan ZY’X’’ angles (yaw, pitch, roll) in radians (-pi, +pi).

Available formats are:

- XYZXYZ: axis-aligned box via two opposite corners; x1, y1, z1 (min corner), x2, y2, z2 (max corner). 6 values.
- XYZLWH: center position and axis-aligned extents; cx, cy, cz (center), l, w, h (dx, dy, dz). 6 values.
- XYZLWHY: center, extents, and yaw; cx, cy, cz, l, w, h, yaw. 7 values.
- XYZLWHYPR: center, extents, and full Euler rotation; cx, cy, cz, l, w, h, yaw, pitch, roll. 9 values.
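The relationship between the two axis-aligned layouts can be sketched with plain torch (no vision3d calls; the box values below are made up for illustration):

```python
import torch

# One axis-aligned box as two opposite corners (XYZXYZ layout):
# x1, y1, z1 (min corner), x2, y2, z2 (max corner).
xyzxyz = torch.tensor([[0.0, 0.0, 0.0, 4.0, 2.0, 1.5]])

mins, maxs = xyzxyz[:, :3], xyzxyz[:, 3:]
center = (mins + maxs) / 2        # cx, cy, cz
extents = maxs - mins             # l, w, h  (dx, dy, dz)
xyzlwh = torch.cat([center, extents], dim=1)
# XYZLWH layout: cx, cy, cz, l, w, h
```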
- class vision3d.tensors.BoundingBoxes3D(data, *, format, dtype=None, device=None, requires_grad=None)[source]#
Bases: TVTensor

Tensor subclass for 3D bounding boxes with shape [N, K], where N is the number of bounding boxes and K depends on the format: 6 for axis-aligned (XYZXYZ, XYZLWH), 7 for yaw-only (XYZLWHY), or 9 for full 9-DOF (XYZLWHYPR). Rotation angles are Euler angles in radians (-pi to +pi).
- Parameters:
  - data (Any) – Any data that can be turned into a tensor with torch.as_tensor().
  - format (BoundingBox3DFormat, str) – Format of the 3D bounding box.
  - dtype (torch.dtype, optional) – Desired data type of the bounding box. If omitted, will be inferred from data.
  - device (torch.device, optional) – Desired device of the bounding box. If omitted and data is a Tensor, the device is taken from it. Otherwise, the bounding box is constructed on the CPU.
  - requires_grad (bool, optional) – Whether autograd should record operations on the bounding box. If omitted and data is a Tensor, the value is taken from it. Otherwise, defaults to False.
- Return type:
  BoundingBoxes3D
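For the yaw-only XYZLWHY layout, rotating a box about the vertical Z axis rotates its center and offsets its yaw angle. A minimal sketch in plain torch, with made-up box values (not using the vision3d transform API):

```python
import math
import torch

# One XYZLWHY box: cx, cy, cz, l, w, h, yaw.
box = torch.tensor([1.0, 0.0, 0.5, 4.0, 2.0, 1.5, 0.0])
theta = math.pi / 2  # rotate the scene 90 degrees about Z

c, s = math.cos(theta), math.sin(theta)
rz = torch.tensor([[c, -s, 0.0],
                   [s, c, 0.0],
                   [0.0, 0.0, 1.0]])
center = rz @ box[:3]                                        # rotate the center
yaw = (box[6] + theta + math.pi) % (2 * math.pi) - math.pi   # wrap into (-pi, pi]
rotated = torch.cat([center, box[3:6], yaw.reshape(1)])      # extents are unchanged
```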
- class vision3d.tensors.CameraExtrinsics(data, *, dtype=None, device=None, requires_grad=None)[source]#
Bases: TVTensor

Tensor subclass for camera extrinsic matrices with shape [N, 4, 4]. Each matrix transforms a point from the dataset’s source frame to the camera frame. The source frame is dataset-defined: lidar for lidar-equipped datasets (e.g. KITTI, nuScenes), ego/world for camera-only datasets.
3D spatial transforms (flip, rotate, etc.) update these matrices to keep the source-to-camera mapping consistent after the source frame changes.
- Parameters:
  - data (Any) – Any data that can be turned into a tensor with torch.as_tensor().
  - dtype (torch.dtype, optional) – Desired data type.
  - device (torch.device, optional) – Desired device.
  - requires_grad (bool, optional) – Whether autograd should record operations.
- Return type:
  CameraExtrinsics
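Applying one such 4x4 matrix to a source-frame point uses homogeneous coordinates. A plain-torch sketch; the translation-only matrix below is illustrative, not from any real calibration:

```python
import torch

# Illustrative extrinsic: shift the source frame down by 1.5 m.
extrinsic = torch.eye(4)
extrinsic[:3, 3] = torch.tensor([0.0, 0.0, -1.5])

point_src = torch.tensor([2.0, 0.0, 1.5, 1.0])  # (x, y, z, 1) in the source frame
point_cam = extrinsic @ point_src               # the same point in the camera frame
```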
- class vision3d.tensors.CameraImages(data, *, dtype=None, device=None, requires_grad=None)[source]#
Bases: Image

Image subclass for multi-camera images with shape [N, C, H, W].

Inherits from torchvision.tv_tensors.Image, so every torchvision v2 image transform dispatches automatically. For 3D spatial transforms (flip, rotate, etc.) this type passes through unchanged.
- Parameters:
  - data (Any) – Any data that can be turned into a tensor with torch.as_tensor().
  - dtype (torch.dtype, optional) – Desired data type.
  - device (torch.device, optional) – Desired device.
  - requires_grad (bool, optional) – Whether autograd should record operations.
- Return type:
  CameraImages
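As a rough illustration (plain torch, made-up sizes), an [N, C, H, W] batch stacks N camera views; an image-space transform such as a horizontal flip acts on the last (width) dimension, while, per the contract above, 3D spatial transforms would leave the pixels untouched:

```python
import torch

# Two camera views, 3 channels, 2x2 pixels each.
views = torch.arange(2 * 3 * 2 * 2, dtype=torch.float32).reshape(2, 3, 2, 2)
flipped = torch.flip(views, dims=[-1])  # mirror each view left-right
```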
- class vision3d.tensors.CameraIntrinsics(data, *, image_size, dtype=None, device=None, requires_grad=None)[source]#
Bases: TVTensor

Tensor subclass for camera intrinsic matrices with shape [N, 3, 3]. Each matrix maps from camera-frame 3D coordinates to pixel coordinates.
Image-space transforms (resize, crop, etc.) update these matrices. 3D spatial transforms pass them through unchanged.
- Parameters:
  - data (Any) – Any data that can be turned into a tensor with torch.as_tensor().
  - image_size (tuple) – Height and width of the corresponding images as (h, w). Required for geometric transforms (resize) that need to compute scale factors.
  - dtype (torch.dtype, optional) – Desired data type.
  - device (torch.device, optional) – Desired device.
  - requires_grad (bool, optional) – Whether autograd should record operations.
- Return type:
  CameraIntrinsics
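A plain-torch sketch of both behaviors described above: projecting a camera-frame point to pixels with one 3x3 matrix, then rescaling that matrix for a resize. The focal lengths and principal point below are made-up illustrative values:

```python
import torch

K = torch.tensor([[500.0, 0.0, 320.0],   # fx, 0, cx
                  [0.0, 500.0, 240.0],   # 0, fy, cy
                  [0.0, 0.0, 1.0]])

p_cam = torch.tensor([0.2, -0.1, 2.0])   # (x, y, z) in the camera frame
uvw = K @ p_cam
u, v = (uvw[:2] / uvw[2]).tolist()       # pixel coordinates after perspective divide

# Resizing (h, w) -> (h/2, w/2) halves the x row (fx, cx) and y row (fy, cy):
scale = torch.tensor([[0.5], [0.5], [1.0]])
K_resized = K * scale
```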
- class vision3d.tensors.PointCloud3D(data, *, dtype=None, device=None, requires_grad=None)[source]#
Bases: TVTensor

Tensor subclass for 3D point clouds with shape [N, 3+C]. The first 3 columns are (x, y, z) coordinates. Additional columns are per-point features (e.g. intensity, color, normals).

- Parameters:
  - data (Any) – Any data that can be turned into a tensor with torch.as_tensor().
  - dtype (torch.dtype, optional) – Desired data type. If omitted, will be inferred from data.
  - device (torch.device, optional) – Desired device. If omitted and data is a Tensor, the device is taken from it. Otherwise, the point cloud is constructed on the CPU.
  - requires_grad (bool, optional) – Whether autograd should record operations. If omitted and data is a Tensor, the value is taken from it. Otherwise, defaults to False.
- Return type:
  PointCloud3D
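The [N, 3+C] layout can be sketched in plain torch (made-up points, one intensity feature column): geometric transforms act on the first three columns, while the feature columns ride along unchanged.

```python
import torch

# Two points, each with (x, y, z) plus one feature column (intensity).
points = torch.tensor([[1.0, 2.0, 0.1, 0.9],
                       [3.0, -1.0, 0.2, 0.4]])

xyz = points[:, :3]        # coordinates
features = points[:, 3:]   # per-point features

# Example geometric op: translate up by 1 m; features pass through unchanged.
shifted = torch.cat([xyz + torch.tensor([0.0, 0.0, 1.0]), features], dim=1)
```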
- vision3d.tensors.wrap(wrappee, *, like, **kwargs)[source]#
Convert a Tensor into the same TVTensor subclass as like.

If like is a BoundingBoxes3D, the format of like is assigned to wrappee unless overridden via kwargs.

- Parameters:
  - wrappee (Tensor) – The tensor to convert.
  - like (TVTensor) – The reference. wrappee will be converted into the same subclass as like.
  - kwargs (Any) – Can contain "format" if like is a BoundingBoxes3D. Ignored otherwise.
- Returns:
  A TVTensor of the same subclass as like.
- Return type:
  TVTensor
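The core mechanics can be approximated with torch.Tensor.as_subclass; the Boxes class below is a hypothetical stand-in for a real TVTensor subclass, not part of vision3d:

```python
import torch

class Boxes(torch.Tensor):
    """Hypothetical stand-in for a TVTensor subclass, for illustration only."""

like = torch.zeros(2, 7).as_subclass(Boxes)   # the reference TVTensor
plain = torch.ones(2, 7)                      # e.g. the result of some computation
# wrap(plain, like=like) essentially restores the subclass of `like`:
rewrapped = plain.as_subclass(type(like))
```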