Cooperative perception offers several benefits for enhancing the capabilities of autonomous
vehicles and improving road safety. Using roadside sensors in addition to onboard sensors
increases reliability and extends the sensor range. Roadside sensors also provide higher situational awareness for automated vehicles and mitigate occlusions. We propose CoopDet3D, a cooperative multi-modal
fusion model, and TUMTraf-V2X, a perception dataset, for the cooperative 3D object detection
and tracking
task. Our dataset contains 2,000 labeled point clouds and 5,000 labeled images from five
roadside and four onboard sensors. It includes 30k 3D boxes with track IDs and precise GPS
and
IMU data. We labeled eight object categories and covered occlusion scenarios with challenging driving maneuvers, such as traffic violations, near-miss events, overtaking, and U-turns. Through
multiple
experiments, we show that our CoopDet3D camera-LiDAR fusion model achieves an improvement of +14.36 3D mAP over a vehicle-only camera-LiDAR fusion model. Finally, we make our model, dataset,
labeling tool, and development kit publicly available to advance research in connected and automated driving.