"""Defines data related constants.
While the datasets can hold arbitrary data types and formats, this file
provides some constants that are used to define a common data format which is
helpful to use for better data transformation.
"""
from dataclasses import dataclass
from enum import Enum
# A custom value to distinguish instance IDs from category IDs; it must be
# greater than the number of categories. For a pixel in the panoptic result
# map: panoptic_id = instance_id * INSTANCE_OFFSET + category_id
INSTANCE_OFFSET = 1000
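As a sketch of how this offset is used (the `encode_panoptic`/`decode_panoptic` helpers below are illustrative and not part of this module), per-pixel panoptic IDs can be packed and unpacked like so:

```python
import numpy as np

INSTANCE_OFFSET = 1000  # must exceed the number of categories


def encode_panoptic(
    instance_ids: np.ndarray, category_ids: np.ndarray
) -> np.ndarray:
    """Pack per-pixel instance and category IDs into panoptic IDs."""
    return instance_ids * INSTANCE_OFFSET + category_ids


def decode_panoptic(
    panoptic_ids: np.ndarray,
) -> tuple[np.ndarray, np.ndarray]:
    """Split panoptic IDs back into (instance_ids, category_ids)."""
    return panoptic_ids // INSTANCE_OFFSET, panoptic_ids % INSTANCE_OFFSET
```

The round trip is lossless as long as every category ID stays below `INSTANCE_OFFSET`.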
class AxisMode(Enum):
"""Enum for choosing among different coordinate frame conventions.
ROS: The coordinate frame aligns with the right hand rule:
- x axis points forward.
- y axis points left.
- z axis points up.
See also: https://www.ros.org/reps/rep-0103.html#axis-orientation
OpenCV: The coordinate frame aligns with a camera coordinate system:
- x axis points right.
- y axis points down.
- z axis points forward.
See also: https://docs.opencv.org/3.4/d9/d0c/group__calib3d.html
LiDAR: The coordinate frame aligns with a LiDAR coordinate system:
- x axis points right.
- y axis points forward.
- z axis points up.
See also: https://www.nuscenes.org/nuscenes#data-collection
"""
ROS = 0
OPENCV = 1
LIDAR = 2
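To illustrate how these conventions relate (the rotation matrix below is derived from the axis definitions in the docstring and is not part of this module), points expressed in ROS axes can be re-expressed in OpenCV axes with a fixed rotation:

```python
import numpy as np

# Rotation mapping ROS axes (x forward, y left, z up) to OpenCV axes
# (x right, y down, z forward): each row is an OpenCV axis written in
# ROS coordinates.
ROS_TO_OPENCV = np.array(
    [
        [0.0, -1.0, 0.0],  # OpenCV x (right)   = -ROS y (left)
        [0.0, 0.0, -1.0],  # OpenCV y (down)    = -ROS z (up)
        [1.0, 0.0, 0.0],  # OpenCV z (forward) =  ROS x (forward)
    ]
)

# A point straight ahead of the sensor in ROS coordinates...
points_ros = np.array([[1.0, 0.0, 0.0]])
# ...lands on the forward (z) axis in OpenCV coordinates.
points_opencv = points_ros @ ROS_TO_OPENCV.T
```

Since the matrix is orthonormal, its transpose performs the inverse (OpenCV to ROS) conversion.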
@dataclass
class CommonKeys:
"""Common supported keys for DictData.
While DictData can hold arbitrary keys of data, we define a common set of
keys where we expect a pre-defined format to enable the usage of common
data pre-processing operations among different datasets.
General Info:
- sample_names (str): Name of the sample.
If the dataset contains videos:
- sequence_names (str): The name of the sequence.
- frame_ids (int): The temporal frame index of the sample.
Image Based Inputs:
- images (NDArrayF32): Image of shape [1, H, W, C].
- input_hw (Tuple[int, int]): Shape of image in (height, width) after
transformations.
- original_images (NDArrayF32): Original image of shape [1, H, W, C].
- original_hw (Tuple[int, int]): Shape of original image in
(height, width).
Image Classification:
- categories (NDArrayI64): Class labels of shape [1, ].
2D Object Detection:
- boxes2d (NDArrayF32): 2D bounding boxes of shape [N, 4] in xyxy
format.
- boxes2d_classes (NDArrayI64): Classes of 2D bounding boxes of shape
[N,].
- boxes2d_names (List[str]): Names of 2D bounding box classes, same
order as `boxes2d_classes`.
2D Object Tracking:
- boxes2d_track_ids (NDArrayI64): Tracking IDs of 2D bounding boxes of
shape [N,].
Segmentation:
- masks (NDArrayUI8): Segmentation masks of shape [N, H, W].
- seg_masks (NDArrayUI8): Semantic segmentation masks [H, W].
- instance_masks (NDArrayUI8): Instance segmentation masks of shape
[N, H, W].
- panoptic_masks (NDArrayI64): Panoptic segmentation masks [H, W].
Depth Estimation:
- depth_maps (NDArrayF32): Depth maps of shape [H, W].
Optical Flow:
- optical_flows (NDArrayF32): Optical flow maps of shape [H, W, 2].
Sensor Calibration:
- intrinsics (NDArrayF32): Intrinsic sensor calibration. Shape [3, 3].
- extrinsics (NDArrayF32): Extrinsic sensor calibration, transformation
of sensor to world coordinate frame. Shape [4, 4].
- axis_mode (AxisMode): Coordinate convention of the current sensor.
- timestamp (int): Sensor timestamp in Unix format.
3D Point Cloud Data:
- points3d (NDArrayF32): 3D pointcloud data, assumed to be [N, 3] and
in sensor frame.
- colors3d (NDArrayF32): Associated color values for each point [N, 3].
3D Point Cloud Annotations:
- semantics3d (NDArrayI64): Semantic classes of 3D points [N, 1].
- instances3d (NDArrayI64): Instance IDs of 3D points [N, 1].
3D Object Detection:
- boxes3d (NDArrayF32): 3D bounding boxes of shape [N, 10], each
consists of center (XYZ), dimensions (WLH), and orientation
quaternion (WXYZ).
- boxes3d_classes (NDArrayI64): Associated semantic classes of 3D
bounding boxes of shape [N,].
- boxes3d_names (List[str]): Names of 3D bounding box classes, same
order as `boxes3d_classes`.
- boxes3d_track_ids (NDArrayI64): Associated tracking IDs of 3D
bounding boxes of shape [N,].
- boxes3d_velocities (NDArrayF32): Associated velocities of 3D bounding
boxes of shape [N, 3], where each velocity is in the form of
(vx, vy, vz).
"""
# General Info
sample_names = "sample_names"
sequence_names = "sequence_names"
frame_ids = "frame_ids"
# image based inputs
images = "images"
input_hw = "input_hw"
original_images = "original_images"
original_hw = "original_hw"
# Image Classification
categories = "categories"
# 2D Object Detection
boxes2d = "boxes2d"
boxes2d_classes = "boxes2d_classes"
boxes2d_names = "boxes2d_names"
# 2D Object Tracking
boxes2d_track_ids = "boxes2d_track_ids"
# Segmentation
masks = "masks"
seg_masks = "seg_masks"
instance_masks = "instance_masks"
panoptic_masks = "panoptic_masks"
# Depth Estimation
depth_maps = "depth_maps"
# Optical Flow
optical_flows = "optical_flows"
# Sensor Calibration
intrinsics = "intrinsics"
extrinsics = "extrinsics"
axis_mode = "axis_mode"
timestamp = "timestamp"
# 3D Point Cloud Data
points3d = "points3d"
colors3d = "colors3d"
# 3D Point Cloud Annotations
semantics3d = "semantics3d"
instances3d = "instances3d"
# 3D Object Detection
boxes3d = "boxes3d"
boxes3d_classes = "boxes3d_classes"
boxes3d_names = "boxes3d_names"
boxes3d_track_ids = "boxes3d_track_ids"
boxes3d_velocities = "boxes3d_velocities"