vision3d.transforms#
3D data augmentation transforms.
Classes

- CopyPaste3D – Batch-level 3D copy-paste data augmentation.
- PointJitter – Add Gaussian noise to point xyz coordinates with probability p.
- PointSample – Subsample (or oversample with replacement) to exactly n points.
- PointShuffle – Randomly permute point order with probability p.
- RandomFlip3D – Flip inputs along a 3D axis with probability p.
- RandomRotate3D – Rotate inputs around an axis by a random angle with probability p.
- RandomScale3D – Scale inputs by a random uniform factor with probability p.
- RandomTransform – Base class for transforms applied with probability p.
- RandomTranslate3D – Translate inputs by a random 3D offset with probability p.
- RangeFilter3D – Drop points and boxes outside an axis-aligned 3D region.
- Transform – Base class for vision3d transforms.
- class vision3d.transforms.CopyPaste3D(target_counts, min_points=5, max_database_size=None, p=1.0)[source]#
Bases: Transform

Batch-level 3D copy-paste data augmentation.

Maintains a lazy object database that grows as batches pass through. For each sample, pastes additional objects from the database to reach a target count per class. Objects are pasted at their original scene position from the source frame.

Operates on collated batches (tuple_of_inputs, tuple_of_targets), not individual samples. Each instance should be used with only one dataset to avoid cross-contamination. CopyPaste3D must be the first transform in any pipeline, before any 3D spatial transform (RandomFlip3D, RandomRotate3D, RandomScale3D, RandomTranslate3D). Pasted objects are extracted and re-inserted in the source-frame geometry of the scene they came from. If a scene transform has already mutated the frame, the pasted objects will disagree with the rest of the scene and the resulting boxes/points will be inconsistent.

- Parameters:
  - target_counts (dict[int, int]) – Dict mapping integer class label to desired object count per sample. E.g. {0: 15, 1: 10}.
  - min_points (int) – Minimum number of points an extracted object must have to be stored in the database. Default: 5.
  - max_database_size (int | None) – Maximum entries per class. None means unlimited. Default: None.
  - p (float) – Probability of applying the augmentation. Default: 1.0.
- forward(*inputs)[source]#
Apply copy-paste augmentation to a collated batch.
Accepts any pytree structure containing PointCloud3D, BoundingBoxes3D, and optionally camera tensors and plain-tensor labels.
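A minimal sketch of the database-and-paste bookkeeping described above (an illustrative re-implementation, not the vision3d code; the `CopyPasteDB` class and its object representation are assumptions). Objects from earlier batches are stored per class label, then later samples are topped up until each class reaches its target count:

```python
# Illustrative sketch of copy-paste bookkeeping (NOT the vision3d
# implementation). Objects are (label, points) pairs; geometry handling
# and box re-insertion are omitted for brevity.
import random

class CopyPasteDB:
    def __init__(self, target_counts, min_points=5, max_database_size=None):
        self.target_counts = target_counts
        self.min_points = min_points
        self.max_size = max_database_size
        self.db = {c: [] for c in target_counts}  # label -> stored objects

    def store(self, obj):
        # Store an extracted object if it has enough points and the
        # per-class database is not full.
        label, points = obj
        if label in self.db and len(points) >= self.min_points:
            if self.max_size is None or len(self.db[label]) < self.max_size:
                self.db[label].append(obj)

    def paste(self, sample_objects):
        # Count objects per class already in the sample, then paste from
        # the database until each class reaches its target count.
        counts = {c: 0 for c in self.target_counts}
        for label, _ in sample_objects:
            if label in counts:
                counts[label] += 1
        pasted = list(sample_objects)
        for label, target in self.target_counts.items():
            need = target - counts[label]
            pool = self.db[label]
            for _ in range(min(need, len(pool))):
                pasted.append(random.choice(pool))
        return pasted
```

Note how "lazy" falls out of this design: the database starts empty and only ever grows from frames the pipeline has already seen, so early samples receive fewer pasted objects than the targets request.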
- class vision3d.transforms.PointJitter(sigma=0.01, p=0.5)[source]#
Bases: RandomTransform

Add Gaussian noise to point xyz coordinates with probability p.

- Parameters:
  - sigma (float) – Standard deviation of the Gaussian noise. Default: 0.01.
  - p (float) – Probability of applying. Default: 0.5.
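The jitter behavior can be sketched in a few lines of numpy (an illustrative stand-in, not the vision3d kernel):

```python
# Illustrative sketch of point jitter (NOT the vision3d kernel): add
# zero-mean Gaussian noise to xyz coordinates with probability p.
import numpy as np

def point_jitter(points, sigma=0.01, p=0.5, rng=None):
    rng = rng or np.random.default_rng()
    if rng.random() >= p:  # skip with probability 1 - p
        return points
    noise = rng.normal(0.0, sigma, size=points.shape)
    return points + noise
```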
- class vision3d.transforms.PointSample(n)[source]#
Bases: Transform

Subsample (or oversample with replacement) to exactly n points.

If the point cloud has more than n points, a random subset is selected. If fewer, points are sampled with replacement to reach n.

- Parameters:
n (int) – Target number of points.
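The sample-to-exactly-n behavior described above can be sketched as (illustrative, not the vision3d kernel):

```python
# Illustrative sketch (NOT the vision3d kernel): subsample without
# replacement when there are too many points, oversample with
# replacement when there are too few, so the output always has n rows.
import numpy as np

def point_sample(points, n, rng=None):
    rng = rng or np.random.default_rng()
    if len(points) >= n:
        idx = rng.choice(len(points), size=n, replace=False)
    else:
        idx = rng.choice(len(points), size=n, replace=True)
    return points[idx]
```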
- class vision3d.transforms.PointShuffle(p=0.5)[source]#
Bases: RandomTransform

Randomly permute point order with probability p.

- Parameters:
  - p (float) – Probability of applying. Default: 0.5.
- class vision3d.transforms.RandomFlip3D(axis='x', p=0.5)[source]#
Bases: RandomTransform

Flip inputs along a 3D axis with probability p.

Dispatches to type-specific kernels for BoundingBoxes3D and PointCloud3D. Camera data (images, intrinsics, extrinsics) passes through unchanged.

- Parameters:
  - axis (str) – Axis along which to flip. Default: 'x'.
  - p (float) – Probability of applying. Default: 0.5.
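Flip kernels for points and boxes can be sketched as follows (illustrative only, not the vision3d kernels; the yaw convention shown is an assumption and varies between box formats):

```python
# Illustrative flip kernels (NOT the vision3d implementation). Flipping
# along an axis negates that coordinate; for boxes the heading angle
# must be mirrored as well (simplified yaw convention assumed here).
import numpy as np

AXIS_INDEX = {"x": 0, "y": 1, "z": 2}

def flip_points(points, axis="x"):
    flipped = points.copy()
    flipped[:, AXIS_INDEX[axis]] *= -1.0
    return flipped

def flip_boxes(centers, yaws, axis="x"):
    centers = centers.copy()
    centers[:, AXIS_INDEX[axis]] *= -1.0
    # Mirror the heading: assumed convention with yaw measured from +x.
    yaws = np.pi - yaws if axis == "x" else -yaws
    return centers, yaws
```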
- class vision3d.transforms.RandomRotate3D(angle_range=math.pi / 4, axis=(0.0, 0.0, 1.0), p=0.5)[source]#
Bases: RandomTransform

Rotate inputs around an axis by a random angle with probability p.

Dispatches to type-specific kernels for PointCloud3D, BoundingBoxes3D, and CameraExtrinsics.

- Parameters:
  - angle_range (float) – Bound on the magnitude of the random rotation angle, in radians. Default: math.pi / 4.
  - axis (tuple[float, float, float]) – Rotation axis. Default: (0.0, 0.0, 1.0).
  - p (float) – Probability of applying. Default: 0.5.
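Rotation about an arbitrary axis, as this transform performs on point coordinates, can be sketched with the Rodrigues rotation formula (illustrative, not the vision3d kernel):

```python
# Illustrative rotation kernel (NOT the vision3d implementation): build
# a 3x3 rotation matrix for angle theta about a unit axis via the
# Rodrigues formula R = I + sin(t)K + (1 - cos(t))K^2, then apply it.
import numpy as np

def rotation_matrix(axis, theta):
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])  # cross-product matrix of axis
    return np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)

def rotate_points(points, theta, axis=(0.0, 0.0, 1.0)):
    return points @ rotation_matrix(axis, theta).T
```

With the default axis (0, 0, 1) this reduces to a yaw rotation in the xy-plane, the common case for ground-vehicle point clouds.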
- class vision3d.transforms.RandomScale3D(scale_range=(0.95, 1.05), p=0.5)[source]#
Bases: RandomTransform

Scale inputs by a random uniform factor with probability p.

Dispatches to type-specific kernels for PointCloud3D, BoundingBoxes3D, and CameraExtrinsics.

- Parameters:
  - scale_range (tuple[float, float]) – Range from which the scale factor is drawn uniformly. Default: (0.95, 1.05).
  - p (float) – Probability of applying. Default: 0.5.
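A sketch of the scaling step (illustrative only; how vision3d represents boxes is an assumption here): one factor is drawn per call and applied consistently to points and box geometry so the scene stays self-consistent.

```python
# Illustrative scale kernel (NOT the vision3d implementation): a single
# uniform factor scales point coordinates, box centers, and box sizes.
import numpy as np

def random_scale(points, box_centers, box_sizes,
                 scale_range=(0.95, 1.05), rng=None):
    rng = rng or np.random.default_rng()
    s = rng.uniform(*scale_range)
    return points * s, box_centers * s, box_sizes * s
```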
- class vision3d.transforms.RandomTransform(p=0.5)[source]#
Bases: Transform

Base class for transforms applied with probability p.

- Parameters:
p (float)
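The probability gate this base class provides can be sketched as follows (an assumed structure for illustration, not the actual vision3d base class):

```python
# Illustrative sketch of a probability-gated transform base class
# (assumed structure, NOT the actual vision3d RandomTransform).
import random

class RandomTransform:
    def __init__(self, p=0.5):
        self.p = p

    def transform(self, x):
        # Subclasses implement the actual transformation here.
        raise NotImplementedError

    def __call__(self, x):
        if random.random() < self.p:  # apply with probability p
            return self.transform(x)
        return x  # otherwise pass inputs through unchanged
```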
- class vision3d.transforms.RandomTranslate3D(translation_range=0.5, p=0.5)[source]#
Bases: RandomTransform

Translate inputs by a random 3D offset with probability p.

Dispatches to type-specific kernels for PointCloud3D, BoundingBoxes3D, and CameraExtrinsics.

- Parameters:
  - translation_range (float) – Bound on the random offset sampled for each axis. Default: 0.5.
  - p (float) – Probability of applying. Default: 0.5.
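A sketch of the translation step (illustrative; the per-axis sampling is an assumption): one offset is drawn per call and added to points and box centers alike, leaving box sizes and headings untouched.

```python
# Illustrative translate kernel (NOT the vision3d implementation): draw
# one random 3D offset and add it to points and box centers.
import numpy as np

def random_translate(points, box_centers, translation_range=0.5, rng=None):
    rng = rng or np.random.default_rng()
    offset = rng.uniform(-translation_range, translation_range, size=3)
    return points + offset, box_centers + offset
```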
- class vision3d.transforms.RangeFilter3D(point_cloud_range)[source]#
Bases: Transform

Drop points and boxes outside an axis-aligned 3D region.

Points are filtered by their xyz coordinates; boxes are filtered by their center (format-agnostic). Labels in targets are filtered in sync with boxes.

Must be applied after spatial augmentations (rotate / scale / translate can push data out of the sensor range) and before the model sees the data.
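The filtering rule above can be sketched in numpy (illustrative, not the vision3d kernel; the (xmin, ymin, zmin, xmax, ymax, zmax) range layout is an assumption):

```python
# Illustrative range filter (NOT the vision3d implementation): keep
# points whose xyz lies inside the region, keep boxes whose center
# does, and filter labels with the same box mask.
import numpy as np

def range_filter(points, box_centers, labels, point_cloud_range):
    # Assumed layout: (xmin, ymin, zmin, xmax, ymax, zmax).
    lo = np.asarray(point_cloud_range[:3], dtype=float)
    hi = np.asarray(point_cloud_range[3:], dtype=float)
    point_mask = np.all((points >= lo) & (points <= hi), axis=1)
    box_mask = np.all((box_centers >= lo) & (box_centers <= hi), axis=1)
    return points[point_mask], box_centers[box_mask], labels[box_mask]
```

Filtering boxes by center only is what makes the rule format-agnostic: it needs no knowledge of box extents or heading conventions, at the cost of keeping boxes that straddle the boundary.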
- class vision3d.transforms.Transform[source]#
Bases: Module

Base class for vision3d transforms.

Only TVTensor subclasses (e.g. BoundingBoxes3D, PointCloud3D) are transformed. Plain tensors (labels, scores, etc.) pass through unchanged.

Subclasses should override transform() and use _call_kernel to dispatch to the correct kernel for each input type.
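The dispatch pattern described above can be sketched with stand-in classes (an assumed structure for illustration, not the actual vision3d base class or TVTensor types):

```python
# Illustrative sketch of type-based kernel dispatch (assumed structure,
# NOT the actual vision3d Transform base class). A recognized tensor
# subclass gets its kernel; everything else passes through unchanged.
class PointCloud:  # stand-in for a TVTensor subclass
    def __init__(self, xyz):
        self.xyz = xyz

class Scale3D:
    def __init__(self, factor):
        self.factor = factor

    def _call_kernel(self, x):
        # Dispatch on input type.
        if isinstance(x, PointCloud):
            return PointCloud([c * self.factor for c in x.xyz])
        return x  # plain values (labels, scores, ...) pass through

    def __call__(self, *inputs):
        return tuple(self._call_kernel(x) for x in inputs)
```

This mirrors the torchvision transforms-v2 style: transforms accept heterogeneous inputs and only act on the types they know how to handle, so labels and other metadata travel through the pipeline untouched.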