This article kicks off our autonomous driving dataset sharing series.
“Eight Series Overview”
The autonomous driving dataset sharing series is a new series launched by Integer Intelligence. In it, we will introduce the open autonomous driving datasets released so far by research institutions and companies, organized into eight topic series:
- Series 1: Object detection datasets
- Series 2: Semantic segmentation datasets
- Series 3: Lane line detection datasets
- Series 4: Optical flow datasets
- Series 5: Stereo datasets
- Series 6: Localization and mapping datasets
- Series 7: Driving behavior datasets
- Series 8: Simulation datasets
This article is the first of three covering object detection datasets.
In the past, researchers built and published relatively small datasets from limited sources, often restricted to camera data. As acquisition equipment has improved, so have autonomous driving datasets. Take Google's self-driving car as an example: it carries a 64-beam laser rangefinder on the roof, allowing the vehicle to combine laser measurements with high-resolution maps into different kinds of scene data so that it can avoid obstacles and obey traffic rules while driving autonomously. PandaSet, nuScenes, BLVD, and other datasets likewise rely on LiDAR sensors.
Besides the well-known datasets released abroad, Baidu, Huawei, DiDi, and Xi'an Jiaotong University have launched domestic autonomous driving datasets such as ApolloScape, SODA10M, D²-City, and BLVD, providing important research material for the progress of autonomous driving technology in China.
This article covers the following 14 datasets:
01 “Waymo Open Dataset”
- Publisher: Waymo
- Download it at waymo.com/open/
- Release date: Perception data set in 2019, motion data set in 2021
- Size: 1.82 TB
- Summary: At the time of its release, the Waymo dataset was the largest and most diverse autonomous driving dataset available. Compared with earlier datasets, Waymo greatly improved sensor quality and dataset size, and its number of scenes is three times that of nuScenes (a frame-loading sketch follows the feature lists)
- Perception Dataset
- 1,950 driving segments, each containing 20 seconds of continuous driving footage
- Four label classes: vehicles, pedestrians, cyclists, and traffic signs
- 12.6 million 3D boxes and 11.8 million 2D boxes
- Sensor data: 1 mid-range LiDAR, 4 short-range LiDARs, and 5 cameras
- Collection areas cover downtown and suburban Phoenix (Arizona), Kirkland (Washington), Mountain View and San Francisco (California), among others, under a range of driving conditions including day, night, dawn, dusk, rain, and sunshine
- Motion Dataset
- 574 hours of data; 103,354 segments with map data
- Vehicle, pedestrian, and cyclist labels; each object is annotated with a 2D bounding box
- Mined behaviors and scenarios for behavior-prediction research, including turns, lane merges, lane changes, and intersections
- Locations include: San Francisco, Phoenix, Mountain View, Los Angeles, Detroit and Seattle
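The perception dataset is distributed as TFRecord files of serialized `Frame` protos. Below is a minimal sketch of iterating over one such file, assuming the official waymo-open-dataset Python package (its pip builds are versioned against specific TensorFlow releases) and TensorFlow are installed; the file path is a placeholder.

```python
# Minimal sketch: iterate over frames in one Waymo perception TFRecord.
# Assumes the waymo-open-dataset package and TensorFlow are installed;
# the file name below is a placeholder.
import tensorflow as tf
from waymo_open_dataset import dataset_pb2 as open_dataset

FILENAME = "segment-XXXXXXXX_with_camera_labels.tfrecord"  # placeholder path

dataset = tf.data.TFRecordDataset(FILENAME, compression_type="")
for i, record in enumerate(dataset):
    frame = open_dataset.Frame()
    frame.ParseFromString(bytearray(record.numpy()))  # each record is a serialized Frame proto
    print(f"frame {i}: {len(frame.images)} camera images, "
          f"{len(frame.laser_labels)} 3D laser labels")
    if i >= 4:  # only peek at the first few frames
        break
```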
02 “PandaSet”
- Publisher: Hesai Technology & Scale AI
- Download: scale.com/resources/d…
- Release time: 2019
- Size: 16.0 GB
- Summary: PandaSet is open for both scientific research and commercial use. For the first time, both a mechanical spinning LiDAR and an image-grade forward-facing LiDAR were used for data acquisition, and point cloud segmentation results are provided
- Features
- 48,000+ camera images
- 16,000 LiDAR point cloud sweeps (more than 100 scenes of 8 seconds each)
- 28 annotation classes for each scene
- 37 semantic segmentation labels for most scenes
- Sensors: 1 mechanical spinning LiDAR, 1 solid-state LiDAR, 5 wide-angle cameras, 1 long-focus camera, and on-board GPS/IMU
03 “nuScenes”
- Published by: Motional, a driverless technology company
- Download: scale.com/open-datase…
- Paper: arxiv.org/abs/1903.11…
- Release time: 2019
- Size: 547.98 GB
- Summary: nuScenes is one of the most widely used public datasets in autonomous driving, and it serves as the most authoritative benchmark for camera-only 3D object detection. Inspired by KITTI, it was the first dataset to include a full sensor suite. It comprises 1,000 complex driving scenes recorded in Boston and Singapore. Commercial use is prohibited (a minimal devkit example follows the feature list)
- Features
- Full sensor suite: 1 LiDAR, 5 radars, 6 cameras, GPS, and IMU
- 1,000 scenes of 20 seconds each (850 for model training, 150 for model testing)
- 40,000 keyframes, 1.4 million camera images, 390,000 LiDAR sweeps, and 1.4 million radar sweeps
- 1.4 million 3D annotation boxes across 23 object classes
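The official nuscenes-devkit (installable with `pip install nuscenes-devkit`) exposes the scene, sample, and annotation tables directly. A minimal sketch, assuming the v1.0-mini split has been downloaded; the dataroot below is a placeholder path:

```python
# Minimal sketch: list one scene and inspect the annotations of its first keyframe
# using the official nuscenes-devkit. Dataroot is a placeholder path.
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version="v1.0-mini", dataroot="/data/sets/nuscenes", verbose=True)

scene = nusc.scene[0]                                      # first of the loaded scenes
sample = nusc.get("sample", scene["first_sample_token"])   # first annotated keyframe (2 Hz)

# Each keyframe carries the 3D boxes annotated at that timestamp.
for ann_token in sample["anns"][:5]:
    ann = nusc.get("sample_annotation", ann_token)
    print(ann["category_name"], ann["translation"], ann["size"])
```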
04 “Lyft Level 5”
- Published by: Lyft, a transportation network company
- Download: level-5.global/register/
- Paper: arxiv.org/pdf/2006.14…
- Release date: Lyft-Perception dataset in 2019, Lyft-Prediction dataset in 2020
- Lyft-Perception
- Summary: Lyft's self-driving cars carry an in-house sensor suite that captures raw sensor data of surrounding cars, pedestrians, traffic lights, and more
- Features
- Over 55,000 frames with manual 3D annotations
- 1.3 million 3D annotations
- 30,000 LiDAR point clouds
- 350 scenes of 60 to 90 minutes each
- Lyft-Prediction
- Summary: This dataset records the motion of cars, cyclists, pedestrians, and other traffic agents encountered by the autonomous fleet. The records are derived from raw LiDAR, camera, and radar data and are well suited to training motion prediction models
- Features
- 1,000 hours of driving
- 170,000 scenes: each scene lasts about 25 seconds and includes traffic lights, aerial maps, sidewalks, and more
- 2,575 km of data collected on public roads
- 15,242 annotated elements, including a high-definition semantic map and a high-resolution aerial view of the area
05 “H3D-HRI-US”
- Publisher: Honda Research Institute
- Download it at usa.honda-ri.com//H3D
- Paper: arxiv.org/abs/1903.01…
- Release time: 2019
- Summary: A large-scale, full-surround 3D multi-object detection and tracking dataset collected with a 3D LiDAR scanner; it is available only to university researchers
- Features
- 360-degree LiDAR dataset
- 160 crowded and complex traffic scenes
- 27,721 frames and 1,071,302 3D annotation boxes
- Manual annotations for 8 common object classes in autonomous driving scenes
- Sensors: 3 HD cameras, 1 LiDAR, and GPS/IMU
06 “Boxy Vehicle Detection Dataset”
- Publisher: Bosch
- Download it at boxy-dataset.com/boxy/
- Paper: openaccess.thecvf.com/content_ICC…
- Release time: 2019
- Size: 1.1 TB
- Summary: A large vehicle detection dataset, notable for its high 5-megapixel image resolution; it does not provide 3D point clouds or urban road traffic data
- Features
- 2.2 million high-resolution images (1.1 TB in total)
- 5-megapixel resolution
- 1,990,806 vehicle annotations, including 2D bounding boxes and 2.5D boxes
- Conditions include sunny, rainy, dawn, daytime, and evening scenes
- Covers both congested and free-flowing freeway scenarios
07 “BLVD”
- Published by: Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University
- Download it at github.com/VCCIV/BLVD/
- Paper: arxiv.org/pdf/1903.06…
- Release time: 2019
- Introduction: The world's first 5D driving scene understanding dataset. BLVD aims to provide a unified validation platform for tasks such as dynamic 4D tracking (speed, distance, horizontal angle, and vertical angle), 5D interactive event recognition (4D plus interactive behavior), and intention prediction. The data were collected by Xi'an Jiaotong University's Kuafu autonomous vehicle
- Features
- 654 sequences containing 120,000 frames, annotated with 5D semantics over the entire sequences
- 249,129 3D object boxes and 4,902 valid trackable individuals
- About 214,900 tracking points in total
- 6,004 valid fragments for 5D interactive event recognition and 4,900 targets for 5D intention prediction
- Rich scenes: cities and highways, day and night
- Multiple object types: pedestrians, vehicles, and riders (both cyclists and motorcyclists)
- Sensors: 1 Velodyne HDL-64E 3D LiDAR, GPS/IMU, and 2 high-resolution multi-view cameras
08 “SODA10M Dataset”
- Published by: Huawei Noah’s Ark Laboratory & Sun Yat-sen University
- Download: soda-2d.github.io
- Paper: arxiv.org/pdf/2106.11…
- Release time: 2021
- Size: 5.6GB (labeled data), 2TB (unlabeled data)
- Summary: A 2D benchmark dataset for semi-/self-supervised learning, consisting of 10 million diverse unlabeled road scene images and 20,000 labeled images collected in 32 cities
- Features
- 10 million unlabeled images and 20,000 labeled images, captured every 10 seconds with mobile phones or dashcams (1080p or higher)
- Six main human and vehicle categories: pedestrian, cyclist, car, truck, tram, and tricycle
- Covers 32 cities across China
- Diverse scene coverage: sunny/cloudy/rainy weather; city streets/highways/country roads/residential areas; day/night/dawn/dusk
- The horizon stays near the center of the image, and occlusion from the collecting vehicle covers no more than 15% of the frame
09 “D²-City Dataset”
- Published by: Didi
- Download: www.scidb.cn/en/detail?d…
- Release date: 2019
- Size: 131.21 GB
- Introduction: D²-City is a large-scale driving video dataset. Compared with existing datasets, it stands out for its diversity: the data were collected from DiDi's operating vehicles in five Chinese cities and cover varied weather, road, and traffic conditions
- Features
- More than 10,000 videos recorded at HD (720p) or full HD (1080p) resolution; the raw data are provided as 30-second clips at 25 fps
- About 1,000 videos carry 2D bounding box and tracking annotations for all 12 object categories, including cars, vans, buses, trucks, pedestrians, motorcycles, bicycles, open and closed tricycles, forklifts, and obstacles
- Rich scenes: varied weather, road, and traffic conditions, including highly complex and diverse traffic scenes such as poor lighting, rain and fog, road congestion, and low image resolution
10 “ApolloScape Dataset”
- Publisher: Baidu
- Download address: apolloscape.auto/scene.html
- Release time: 2018-2020
- Summary: Baidu's ApolloScape datasets cover trajectory prediction, 3D LiDAR object detection and tracking, scene parsing, lane semantic segmentation, 3D car instance segmentation, stereo, and inpainting, among others
- Features
- Scene parsing data: the full ApolloScape release contains hundreds of thousands of frames of high-resolution (3384 x 2710) images with per-pixel semantic segmentation annotations
- Lane semantic segmentation: over 110,000 frames of high-quality pixel-level semantic segmentation data
- 3D object detection and tracking dataset: collected in Beijing, China under various lighting conditions and traffic densities
11 “BDD100K”
- Publisher: Berkeley AI Research (BAIR) Lab, University of California, Berkeley
- Download address: bdd-data.berkeley.edu/
- Paper: arxiv.org/pdf/1805.04…
- Release time: 2018
- Size: 57.45 GB
- Summary: BDD100K has attracted wide attention for the diversity of its data, which was crowdsourced from tens of thousands of drivers in New York, the San Francisco Bay Area, and other regions. BAIR researchers sample keyframes from the videos and provide annotations for those keyframes
- Features
- 100,000 HD videos totaling over 1,100 hours of driving, each clip roughly 40 seconds long at 720p and 30 fps
- The videos also contain GPS location information, IMU data, and timestamps
- Covers six weather conditions (clear, partly cloudy, overcast, rainy, snowy, and foggy); day and night; and driving scenes including urban roads, tunnels, highways, residential areas, parking lots, and gas stations
- Keyframes are sampled at the 10th second of each video
- Annotation types include image tagging, lane line annotation, drivable area annotation, road object detection, semantic segmentation, instance segmentation, and multi-object detection and tracking (a label-parsing sketch follows this list)
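As an illustration of the detection labels, the sketch below counts 2D boxes per category in one of the released label JSON files. The file name and the exact schema (a list of frames, each with `name`, `attributes`, and a `labels` list carrying `category` and `box2d`) are assumptions based on the commonly distributed BDD100K detection labels; adjust them to the files you actually download.

```python
# Minimal sketch: count 2D detection boxes per category in a BDD100K label file.
# File name and JSON schema are assumptions; adapt them to your download.
import json
from collections import Counter

with open("bdd100k_labels_images_val.json") as f:
    frames = json.load(f)  # one entry per annotated keyframe

box_counts = Counter()
for frame in frames:
    for label in frame.get("labels", []):
        if "box2d" in label:  # keep only 2D box annotations
            box_counts[label["category"]] += 1

print(box_counts.most_common())
print("weather of first frame:", frames[0]["attributes"]["weather"])
```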
12 “KITTI”
- Published by: Karlsruhe Institute of Technology (KIT) and Toyota Technological Institute at Chicago (TTIC)
- Download address: www.cvlibs.net/datasets/ki…
- Paper: arxiv.org/abs/1803.09…
- Release date: 2011
- Summary: KITTI is one of the most important datasets in autonomous driving. It is used mainly for perception and prediction research, and it also touches on localization and SLAM. The dataset is used to evaluate computer vision techniques such as stereo, optical flow, visual odometry, 3D object detection, and 3D tracking in on-vehicle environments (a label-file parsing sketch follows the feature list)
- Features
- Benchmarks include KITTI-Stereo, KITTI-Flow, KITTI-SceneFlow, KITTI-Depth, KITTI-Odometry, KITTI-Object, KITTI-Tracking, KITTI-Road, and KITTI-Semantics
- Stereo image and optical flow pairs: 389
- 39.2 km of visual odometry sequences and more than 200K 3D object annotations, sampled and synchronized at 10 Hz
- 3D object detection categories: cars, vans, trucks, pedestrians, cyclists, trams, and others
- Scenes include city roads, country roads, and highways
- Sensors: 1 64-beam 3D LiDAR, 2 grayscale cameras, 2 color cameras, and 4 optical lenses
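KITTI's object labels are plain text, one object per line, with a fixed field order (class, truncation, occlusion, observation angle, 2D box, 3D dimensions, 3D location in camera coordinates, yaw). A minimal parser sketch; the label path is a placeholder following the standard training/label_2 layout:

```python
# Minimal sketch: parse one KITTI object-detection label file.
# Field order per line:
#   type truncated occluded alpha  x1 y1 x2 y2  h w l  x y z  rotation_y [score]
from pathlib import Path

def read_kitti_labels(label_file):
    objects = []
    for line in Path(label_file).read_text().splitlines():
        f = line.split()
        objects.append({
            "type": f[0],                               # Car, Van, Truck, Pedestrian, ...
            "bbox_2d": [float(v) for v in f[4:8]],      # left, top, right, bottom (pixels)
            "dimensions": [float(v) for v in f[8:11]],  # height, width, length (metres)
            "location": [float(v) for v in f[11:14]],   # x, y, z in camera coordinates (metres)
            "rotation_y": float(f[14]),                 # yaw around the camera Y axis (radians)
        })
    return objects

# Placeholder path following the standard KITTI directory layout.
for obj in read_kitti_labels("training/label_2/000000.txt"):
    print(obj["type"], obj["bbox_2d"])
```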
13 “CityPersons”
- Publisher: Max Planck Institute for Informatics
- Download it at www.cityscapes-dataset.com/login/
- Paper: arxiv.org/abs/1702.05…
- Release time: 2017
- Summary: CityPersons is a subset of Cityscapes in which pedestrians are annotated with 2D bounding boxes. It is more diverse and richer than earlier pedestrian datasets such as INRIA, ETH, TUD-Brussels, and Daimler, with recordings from Germany, France, and Switzerland
- Features
- Fine-grained labels: pedestrian (walking, running, standing), rider (cyclist, motorcyclist), sitting person, and others (unusual poses such as stretching)
- Besides real people, depictions of people on posters and sculptures, and reflections of people in mirrors and windows, are also marked
- The dataset covers 27 different cities, 3 seasons, and varied weather conditions
- The dataset contains about 35,000 person annotations, with an average of seven annotations per image
14 “TUD-Brussels Pedestrian & TUD-MotionPairs”
- Publisher: Max Planck Institute for Informatics
- Download: www.mpi-inf.mpg.de/departments…
- Paper: www.mpi-inf.mpg.de/fileadmin/i…
- Release date: 2010
- Introduction: In early 2010, the Max Planck Institute released these pedestrian datasets for what was then a challenging task: detecting pedestrians from a moving car, across multiple viewpoints, using appearance and motion cues
- TUD-Brussels Pedestrian
- Data collected from a driving car in central Brussels
- 508 image pairs at 640×480 resolution
- 1,326 pedestrian annotations
- TUD-MotionPairs
- 1,092 image pairs with 1,776 pedestrian annotations
- 192 additional image pairs forming the negative set
- Multi-view images recorded in urban pedestrian areas
“Contact us”
We hope that, through our professional data processing capabilities, we can serve more than 1,000 AI enterprises over the next three years and become their data partner. We look forward to further communication with you, the reader, and warmly welcome you to contact us to explore possible cooperation. Our contact information is as follows:
Contact person: Mr. Qi 13456872274
For more details, please visit our official website: www.molardata.com