

Public large-scale datasets for autonomous driving research
A short list of datasets that focus on semantic understanding of urban streets and sensor integration:
- Sep 2019: Audi released the AEV Autonomous Driving Dataset (A2D2), an open multi-sensor dataset for autonomous driving research.
- Aug 2019: Waymo released Open Datasets, which comprise high-resolution sensor data collected by Waymo self-driving cars in a wide variety of conditions.
- Jul 2019: Lyft released *Level 5 Datasets*, a subset of the autonomous driving data collected by Lyft’s Level 5 team, with high-quality data from camera and LiDAR sensors.
- Mar 2019: Aptiv released *nuScenes Datasets*, which comprise labelled data from a comprehensive autonomous vehicle multi-sensor suite.
- May 2018: Berkeley released the DeepDrive datasets, which include instance segmentation, object detection, drivable areas, and lane markings, as part of the challenges at the CVPR 2018 Workshop on Autonomous Driving that it hosted.
Owner | Dataset | Features | License |
---|---|---|---|
Audi | A2D2 | 2D semantic segmentation; 3D point cloud labels; 3D bounding boxes; unlabelled sensor data | CC BY-ND 4.0¹ |
Waymo | Open Datasets | Labelled camera data; labels for LiDAR; 3D bounding boxes; sensor data | Non-Commercial Use |
Lyft | Level 5 | Labelled camera and LiDAR data; 3D bounding boxes; drivable surface map; spatial semantic map | CC BY-NC-SA 4.0² |
Aptiv | nuScenes | Full sensor suite (LiDAR, RADAR, camera, IMU, GPS); 3D bounding boxes; detailed map information | CC BY-NC-SA 4.0² |
Berkeley | DeepDrive | Labelled camera data, IMU, GPS; 2D bounding boxes; drivable surface map; lane markings | BSD 3-Clause “New” or “Revised”³ |
Summary table of urban autonomous driving datasets
The common goal of these datasets is an in-depth understanding of perception for autonomous vehicles in complex environments such as urban areas.
While some datasets focus on imaging data, others also offer spatial maps, drivable surface maps, lane markings, and more.
Another interesting aspect is licensing. Waymo, Lyft, and Aptiv (Waymo Dataset License Agreement for Non-Commercial Use; CC BY-NC-SA 4.0²) explicitly state that their datasets are intended for research and require a separate license for any commercial use.
Berkeley and Audi, on the other hand, use licenses (BSD 3-Clause³ and CC BY-ND 4.0¹) that permit commercialization of technology developed from the datasets under one condition: attribution and the copyright notices of the original dataset must be left intact. Note that CC BY-ND additionally prohibits redistributing modified versions of the dataset itself.
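The license comparison above can be captured as a small lookup table, which is handy when filtering candidate datasets for a project. The following is only a minimal sketch: the `DatasetLicense` class, the `commercial_use` flag, and the `commercially_usable` helper are hypothetical names introduced here for illustration, and the flag values simply mirror this article's reading of each license rather than a legal assessment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetLicense:
    """One row of the summary table (hypothetical structure for illustration)."""
    owner: str
    dataset: str
    license_name: str
    commercial_use: bool  # this article's reading of the license, not legal advice


# Values transcribed from the summary table above.
LICENSES = [
    DatasetLicense("Audi", "A2D2", "CC BY-ND 4.0", True),
    DatasetLicense("Waymo", "Open Datasets", "Non-Commercial Use", False),
    DatasetLicense("Lyft", "Level 5", "CC BY-NC-SA 4.0", False),
    DatasetLicense("Aptiv", "nuScenes", "CC BY-NC-SA 4.0", False),
    DatasetLicense("Berkeley", "DeepDrive", "BSD 3-Clause", True),
]


def commercially_usable(licenses):
    """Return the dataset names whose license is flagged as allowing commercial use."""
    return [entry.dataset for entry in licenses if entry.commercial_use]


if __name__ == "__main__":
    print(commercially_usable(LICENSES))
```

Always read the full license text before relying on a flag like this; the boolean hides conditions such as attribution and no-derivatives clauses.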
Audi has also uploaded the A2D2 dataset to the Registry of Open Data on AWS, which simplifies the setup for working with it.