Free LiDAR point cloud for self-driving cars

Published on December 9, 2020 • 3 min read

3D Labeling • Automation • LiDAR • Object Detection

Scale AI has released a new LiDAR point cloud dataset to help accelerate autonomous driving research.

Point Cloud Data labelling

Data labelling, also called data annotation, tagging, or classification, is the process of attaching labels to the items in a dataset. The quality of this process is essential for supervised machine learning algorithms, which learn patterns from labelled data and then try to predict labels by identifying the same patterns in unlabelled datasets.
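To make the idea of learning from labelled data concrete, here is a minimal sketch of a nearest-centroid classifier in plain Python. It is not how production perception models are built. The toy points and the helper names `fit_centroids` and `predict` are made up for illustration, and only the class names echo labels mentioned in this article.

```python
from collections import defaultdict
from math import dist

def fit_centroids(points, labels):
    """Learn one centroid (mean position) per label from labelled points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }

def predict(centroids, point):
    """Assign the label whose centroid is closest to the unlabelled point."""
    return min(centroids, key=lambda label: dist(centroids[label], point))

# Toy labelled data: (x, y, z) positions in metres, invented for this sketch.
labelled_points = [
    (0.0, 0.0, 0.1), (0.5, 0.2, 0.0),   # low, flat points
    (5.0, 5.0, 1.5), (5.2, 4.8, 1.7),   # higher points clustered together
]
labels = ["road", "road", "car", "car"]

centroids = fit_centroids(labelled_points, labels)
prediction = predict(centroids, (4.9, 5.1, 1.4))  # an unlabelled point
print(prediction)
```

If the labels in `labels` were wrong, the centroids and every later prediction would be wrong too, which is why annotation quality matters so much.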

Data labelling

For self-driving car applications, we usually avoid explicitly programming how the system should make decisions. Instead, we feed deep learning (DL) models labelled data to learn from. Indeed, DL models can keep improving with more data, seemingly without limit. However, to get a well-functioning model, large amounts of data are not enough: you also need high-quality data annotation.

LiDAR point cloud

With this in mind, Scale AI aims at delivering training data for AI applications such as self-driving cars, mapping, AR/VR, and robotics. Scale CEO and co-founder Alexandr Wang told TechCrunch in a recent interview: “Machine learning is definitely a garbage in, garbage out kind of framework — you really need high-quality data to be able to power these algorithms. It’s why we built Scale and it’s also why we’re using this data set today to help drive forward the industry with an open-source perspective.”

Example of an annotated point cloud for lane and boundary detection applications. Image credits: Scale AI

In collaboration with the LiDAR manufacturer Hesai, the company released a new dataset called PandaSet that can be used for training machine learning models, for example for autonomous driving challenges. The dataset is free, licensed for both academic and commercial use, and includes data collected using Hesai’s forward-facing solid-state PandarGT LiDAR as well as a mechanical spinning LiDAR known as Pandar64.

The vehicle is equipped with wide-angle cameras, a long-focus camera, one mechanical spinning LiDAR (Pandar64), and one solid-state LiDAR (PandarGT). Image Credits: Scale AI

The data was collected while driving through urban areas in San Francisco and Silicon Valley, before officials issued COVID-19 stay-at-home orders in the area (according to the company).

The dataset features

  • 48,000 camera images
  • 16,000 LiDAR sweeps
  • 100+ scenes of 8 s each
  • 28 annotation classes
  • 37 semantic segmentation labels
  • Full sensor suite: 1x mechanical LiDAR, 1x solid-state LiDAR, 6x cameras, On-board GPS/IMU

It is freely downloadable at this link.
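The headline figures listed above can be cross-checked with some quick arithmetic. The sketch below uses the round numbers from the list and assumes that images are split evenly across the six cameras, that the sweep count covers both LiDARs equally, and that there are exactly 100 scenes. The resulting frame rate is therefore an inference from those assumptions, not a published specification.

```python
# Round figures taken from the feature list; real counts may differ slightly.
camera_images = 48_000
lidar_sweeps = 16_000
scenes = 100            # listed as "100+", so treated here as a lower bound
scene_seconds = 8
cameras = 6
lidars = 2

# Assumption: data is spread evenly over sensors and scenes.
frames_per_camera = camera_images // cameras          # images per camera
sweeps_per_lidar = lidar_sweeps // lidars             # sweeps per LiDAR
frames_per_scene = frames_per_camera / scenes         # frames in one scene
implied_rate_hz = frames_per_scene / scene_seconds    # frames per second

print(frames_per_camera, sweeps_per_lidar, frames_per_scene, implied_rate_hz)
```

Under these assumptions, each sensor contributes about 80 frames per 8-second scene, which is consistent with synchronized capture at roughly 10 Hz.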

PandaSet includes 3D bounding boxes for 28 object classes and a rich set of class attributes related to activity, visibility, location, and pose. The dataset also includes point cloud segmentation with 37 semantic labels. These include smoke, car exhaust, vegetation, and driveable surface within complex urban environments filled with cars, bikes, traffic lights, and pedestrians.
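To illustrate how these two annotation types relate to the raw points, here is a minimal sketch in plain Python. The `Box3D` class, its field names, and the toy data are hypothetical and do not reflect PandaSet's actual file format. The sketch only shows the general idea: a 3D bounding box is a rotated cuboid that groups points into an object, while semantic segmentation assigns one label to every point.

```python
from collections import Counter
from dataclasses import dataclass
from math import cos, sin

@dataclass
class Box3D:
    """Hypothetical cuboid annotation: centre, size, and heading about z."""
    cx: float
    cy: float
    cz: float
    length: float  # extent along the box's own x axis
    width: float   # extent along the box's own y axis
    height: float  # extent along z
    yaw: float     # heading in radians, counter-clockwise

def contains(box, point):
    """Return True if the (x, y, z) point lies inside the rotated cuboid."""
    dx, dy, dz = point[0] - box.cx, point[1] - box.cy, point[2] - box.cz
    # Rotate the offset by -yaw to express it in the box's own frame.
    local_x = cos(box.yaw) * dx + sin(box.yaw) * dy
    local_y = -sin(box.yaw) * dx + cos(box.yaw) * dy
    return (abs(local_x) <= box.length / 2
            and abs(local_y) <= box.width / 2
            and abs(dz) <= box.height / 2)

# Toy point cloud with one semantic label per point (invented values).
points = [(10.0, 2.0, 0.8), (11.5, 2.0, 0.8), (10.0, 4.0, 0.8), (3.0, 0.0, 0.0)]
point_labels = ["car", "car", "vegetation", "driveable surface"]

# One cuboid annotation around the two "car" points.
car_box = Box3D(cx=10.0, cy=2.0, cz=0.8, length=4.0, width=2.0, height=1.6, yaw=0.0)

points_in_box = [p for p in points if contains(car_box, p)]
label_counts = Counter(point_labels)
print(len(points_in_box), dict(label_counts))
```

Counting points per box or per label in this way is a simple first check on annotation density before training a model.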

Some modalities of the open-source dataset. Image Credits: Scale AI

While other great open-source autonomous vehicle datasets exist, this is a new effort to license datasets without any restrictions.

Open-source self-driving car datasets

Below, I gathered four other high-quality datasets that should be useful for your machine learning and self-driving car projects. These can then be used with the 3D Geodata Academy courses, including 3D Reconstructor, 3D Segmentor, and VR/AR Creator. You can start point cloud processing now with this.

Originally published in Towards Data Science

