Self-Driving Cars That Recognize Free Space Can Better Detect Objects

Researchers at Carnegie Mellon University (CMU) have demonstrated that they can significantly improve detection accuracy in self-driving cars by helping the vehicle recognize what it does not see.

Self-driving vehicles use 3D data from lidar to represent objects as point clouds and then try to match those point clouds to a library of 3D object representations. The problem, according to Peiyun Hu, a Ph.D. student in CMU’s Robotics Institute, is that the data from the vehicle’s lidar isn’t truly 3D: the sensor can’t see the occluded parts of an object, and current algorithms don’t reason about such occlusions.

New CMU research shows that what a self-driving car doesn’t see (in green) is as important to navigation as what it actually sees (in red). Courtesy of Carnegie Mellon University.

“Perception systems need to know their unknowns,” Hu said.

Hu’s work enables a self-driving car’s perception systems to consider visibility as it reasons about what its sensors are seeing. In fact, reasoning about visibility is already used when companies build digital maps.

“Map-building fundamentally reasons about what’s empty space and what’s occupied,” said Deva Ramanan, an associate professor of robotics and director of the CMU Argo AI Center for Autonomous Vehicle Research. “But that doesn’t always occur for live, on-the-fly processing of obstacles moving at traffic speeds.”
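The distinction Ramanan draws between empty and occupied space can be illustrated with a toy ray-casting sketch. This is not the CMU team's implementation. It is a minimal 2D example that assumes a single sensor at the grid center and a handful of lidar returns. The function name `visibility_grid`, the three cell labels, and the grid parameters are hypothetical, chosen only for illustration. Each ray marks the cells it passes through as free and the cell where it ends as occupied, and everything behind the hit stays unknown.

```python
import math

# Hypothetical cell labels for this sketch.
UNKNOWN, FREE, OCCUPIED = 0, 1, 2


def visibility_grid(points, size=21, cell=1.0, origin=(0.0, 0.0)):
    """Label cells of a size x size grid centered on the sensor.

    points: list of (x, y) lidar returns in meters, in the sensor frame.
    Returns grid[row][col], where row follows y and col follows x.
    """
    half = size // 2
    grid = [[UNKNOWN] * size for _ in range(size)]

    def to_cell(x, y):
        # Round to the nearest cell, then shift so the sensor sits mid-grid.
        col = int(math.floor((x - origin[0]) / cell + 0.5)) + half
        row = int(math.floor((y - origin[1]) / cell + 0.5)) + half
        return row, col

    def inside(row, col):
        return 0 <= row < size and 0 <= col < size

    for px, py in points:
        dist = math.hypot(px - origin[0], py - origin[1])
        # Sample the ray at half-cell spacing (a simple approximation,
        # not an exact voxel traversal).
        steps = max(1, int(dist / (cell * 0.5)))
        end = to_cell(px, py)
        for i in range(steps):
            t = i / steps
            x = origin[0] + t * (px - origin[0])
            y = origin[1] + t * (py - origin[1])
            row, col = to_cell(x, y)
            # Space the ray crossed before the hit is observed to be empty.
            if (row, col) != end and inside(row, col) and grid[row][col] != OCCUPIED:
                grid[row][col] = FREE
        # The return itself marks an occupied cell.
        if inside(*end):
            grid[end[0]][end[1]] = OCCUPIED
    return grid


if __name__ == "__main__":
    grid = visibility_grid([(5.0, 0.0), (0.0, -3.0)])
    symbols = {UNKNOWN: "?", FREE: ".", OCCUPIED: "#"}
    for row in grid:
        print("".join(symbols[c] for c in row))
```

In the printed grid, cells behind each `#` remain `?`. That is the information a detector ignores when it treats the point cloud as a complete 3D picture.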

Hu and his colleagues’ research takes cues from map-making techniques to help the system reason about visibility when trying to recognize objects. When tested against a standard benchmark, the CMU method outperformed the previous top-performing technique, improving detection by 10.7% for cars, 5.3% for pedestrians, 7.4% for trucks, 18.4% for buses, and 16.7% for trailers.

One reason previous systems may not have taken visibility into account is a concern about computation time. But Hu and his team found that was not a problem: Their method takes just 24 ms to run. For comparison, each sweep of the lidar takes 100 ms.

The research was presented at the Computer Vision and Pattern Recognition conference.