Graphene-Based Tracking System Could Streamline Autonomous Vision
A real-time 3D tracking system developed at the University of Michigan may one day replace lidar and cameras in autonomous technologies. The system combines transparent graphene-based light detectors and advanced neural networks to sense and image scenes in three dimensions.
The graphene photodetectors, developed by Zhaohui Zhong, associate professor of electrical and computer engineering, were modified to absorb only about 10% of the light to which they are exposed, which makes them nearly transparent. Due to graphene’s high sensitivity to light, that percentage is sufficient to generate images that can be reconstructed through computational imaging.
“One way to make transparent detectors is to make the absorbing material thin, usually below tens of nanometers,” Dehui Zhang, a doctoral student in electrical and computer engineering, told Photonics Media. “The trade-off is less absorption and weaker response.”
A graphene-based transparent photodetector array (acting as two layers of sensors in a camera) measures the focal stack images of a point object simulated by focusing a green laser beam onto a small spot in front of the lens. Courtesy of Robert Coelius/Michigan Engineering, Communications, and Marketing.
Traditional image sensors based on silicon and III-V materials, Zhang explained, target high absorption to maximize response, and are therefore not transparent.
“We solve the problem by using atomically thin 2D materials to fabricate phototransistors,” Zhang continued. “The photoconductive gain amplifies the small signal from the weak absorption, which enables a large response and high transparency simultaneously.”
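As a rough back-of-the-envelope illustration of that trade-off (the formula is the textbook photoconductor relation, not taken from the paper, and all the numbers are hypothetical), photoconductive gain can offset a small absorbed fraction:

```python
# Illustrative photoconductor estimate (hypothetical values, not from the study):
# photocurrent I_ph = q * eta * G * (P / (h * nu)), where eta is the absorbed
# fraction, G the photoconductive gain, and P the incident optical power.
PLANCK_H = 6.626e-34   # Planck constant, J*s
CHARGE_Q = 1.602e-19   # elementary charge, C
LIGHT_C = 3.0e8        # speed of light, m/s

def photocurrent(power_w, wavelength_m, absorbed_fraction, gain):
    photon_energy = PLANCK_H * LIGHT_C / wavelength_m            # J per photon
    absorbed_photon_rate = absorbed_fraction * power_w / photon_energy
    return CHARGE_Q * gain * absorbed_photon_rate                # amperes

# A detector absorbing only ~10% of 1 uW of 532 nm light gives ~43 nA without
# gain; a hypothetical gain of 1e4 boosts that to a sub-milliamp signal.
weak = photocurrent(1e-6, 532e-9, 0.10, 1)
amplified = photocurrent(1e-6, 532e-9, 0.10, 1e4)
print(f"without gain: {weak:.2e} A, with gain: {amplified:.2e} A")
```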
The photodetectors in the new design are stacked behind one another, creating a compact system with each layer focused at a different focal plane to enable 3D imaging.
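An illustrative calculation (assuming each layer absorbs the roughly 10% quoted above and neglecting reflection and other losses) shows why such transparency matters for stacking:

```python
# Illustrative only: fraction of light reaching each layer of a stack when
# every layer absorbs ~10% (reflection and other losses ignored).
absorb = 0.10
for layer in range(1, 5):
    reaching = (1 - absorb) ** (layer - 1)   # light left after earlier layers
    print(f"layer {layer}: {reaching:.0%} of incident light arrives")
# layer 1: 100%, layer 2: 90%, layer 3: 81%, layer 4: 73%
```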
“The in-depth combination of graphene nanodevices and machine learning algorithms can lead to fascinating opportunities in both science and technology,” Zhang said. “Our system combines computational power efficiency, fast tracking speed, compact hardware, and a lower cost compared with several other solutions.”
In addition to 3D imaging, the researchers used the system for real-time motion tracking, which is critical to a variety of autonomous robot applications. To do this, they had to find a way to determine the position and orientation of an object being tracked. Typically this is done with lidar or light-field cameras, both of which have significant limitations, the researchers said. Other approaches use metamaterials or multiple cameras.
“The speed can be fast given the small data size of focal stack images and the short inference time of neural networks using GPU,” Zhang told Photonics Media.
The technology has certain advantages over other modalities, each of which has its own strengths and weaknesses, Zhang said. Lidar, for example, requires active illumination, which adds power consumption, complexity, and safety concerns. Light-field imaging captures a great amount of detail, but it can be slow because of the large amount of data that has to be processed. Stereo cameras, another approach, can be bulky because multiple cameras are required.
The team found that the technology could be used for motion tracking with the addition of deep learning algorithms. Doctoral student Zhen Xu helped to bridge the gap between the two fields. Xu built the optical setup and worked with the team to enable a network to decipher positional information.
The network is designed to search for specific objects within the scene and then to focus only on the object of interest, such as a pedestrian or an object moving into a driver’s lane on the highway. The technology works particularly well for stable systems, such as in automated manufacturing or in certain medical contexts.
“The algorithm was implemented and trained in PyTorch by feeding the network training samples,” Zhang said. “In the case of 3D point object tracking, each training sample consists of an input focal stack and the corresponding 3D coordinates of the point object being imaged.”
The samples were collected either experimentally, using the transparent graphene detector in a single exposure, Zhang said, or with a CMOS camera using multiple exposures, each at a distinct focus position.
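A minimal sketch of that training setup, assuming a focal stack of two 4 × 4 detector planes (to match the 16-pixel arrays described below) and a small fully connected regressor; the layer sizes, optimizer, and synthetic data here are placeholders rather than the authors’ architecture:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: a focal stack of 2 detector planes, each 4 x 4 pixels,
# regressed to the (x, y, z) coordinates of a point object.
N_PLANES, H, W = 2, 4, 4

class FocalStackRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),                      # (N, 2, 4, 4) -> (N, 32)
            nn.Linear(N_PLANES * H * W, 64),
            nn.ReLU(),
            nn.Linear(64, 3),                  # predicted (x, y, z)
        )

    def forward(self, focal_stack):
        return self.net(focal_stack)

# Placeholder training data; in the study, samples come from the graphene
# detector (single exposure) or a refocused CMOS camera (multiple exposures).
stacks = torch.rand(256, N_PLANES, H, W)
coords = torch.rand(256, 3)

model = FocalStackRegressor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(20):
    optimizer.zero_grad()
    loss = loss_fn(model(stacks), coords)   # regression loss on 3D position
    loss.backward()
    optimizer.step()
```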
In a demonstration, the technology successfully tracked a beam of light and a ladybug using a stack of two 16-pixel graphene photodetector arrays. The researchers also demonstrated that the method is scalable; they believe that as few as 4000 pixels would suffice for some practical applications, and that 400 × 600-pixel arrays would serve many more.
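A hypothetical per-frame tracking loop is sketched below; it assumes the FocalStackRegressor class and shapes from the previous sketch are in scope, and it times inference on a GPU when one is available, echoing the speed argument Zhang makes:

```python
import time
import torch

# Illustrative tracking loop (hypothetical): feed each newly captured focal
# stack through the trained regressor and time the per-frame inference.
# Assumes FocalStackRegressor and N_PLANES, H, W from the previous sketch.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = FocalStackRegressor().to(device).eval()

trajectory = []
with torch.no_grad():
    for frame in range(100):
        stack = torch.rand(1, N_PLANES, H, W, device=device)  # stand-in capture
        t0 = time.perf_counter()
        xyz = model(stack)
        if device == "cuda":
            torch.cuda.synchronize()           # wait for GPU kernels to finish
        dt_ms = (time.perf_counter() - t0) * 1e3
        trajectory.append(xyz.squeeze(0).tolist())

print(f"last position estimate: {trajectory[-1]}, last frame: {dt_ms:.2f} ms")
```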
The research was published in Nature Communications (www.doi.org/10.1038/s41467-021-22696-x).