“The basic principle for the multiplication is a simple transmission measurement of a straight waveguide with a phase-change material on top,” Johannes Feldmann, lead author on the paper, told Photonics Media.
PCMs are commonly used with DVDs or Blu-ray Discs as optical data storage elements.
“PCMs as nonvolatile memory elements have the additional advantage that no energy is needed to keep their phase-state, meaning that in the case of the trained neural network with fixed weights,” Feldmann said, “no energy is required to preserve the matrix once it is programmed.”
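The principle Feldmann describes can be illustrated with a minimal sketch. It assumes an idealized model in which each PCM cell passes a fixed fraction of the incoming optical power (a transmission between 0 and 1) and a detector sums the light arriving on each output. The function name `pcm_matvec` and the numbers are hypothetical and are not taken from the paper.

```python
def pcm_matvec(transmission, input_powers):
    """Idealized matrix-vector product: each weight is a PCM transmission.

    transmission: list of rows, each value in [0, 1] (assumed cell model)
    input_powers: optical power encoding each input value
    """
    for row in transmission:
        if len(row) != len(input_powers):
            raise ValueError("row length must match number of inputs")
        if any(not 0.0 <= t <= 1.0 for t in row):
            raise ValueError("a passive cell cannot transmit more than it receives")
    # Multiplication: power times transmission. Accumulation: sum at the detector.
    return [sum(t * p for t, p in zip(row, input_powers)) for row in transmission]


# Illustrative values only: a 2 x 2 weight matrix stored in the PCM cells.
weights = [[0.5, 0.25],
           [1.0, 0.0]]
inputs = [2.0, 4.0]
outputs = pcm_matvec(weights, inputs)
print(outputs)
```

Because the weights in this model are simply stored transmission states, nothing in the sketch consumes energy to hold the matrix between calls, which mirrors the nonvolatility Feldmann points to.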
Because different wavelengths of light do not interact with one another, frequency combs were an attractive light source for achieving parallel calculations. The researchers used a chip-based frequency comb developed at École Polytechnique Fédérale de Lausanne (EPFL) as a light source to carry out matrix multiplications on multiple data sets in parallel.
“Our study is the first to apply frequency combs in the field of artificial neural networks,” said Tobias Kippenberg, a professor at EPFL and pioneer in the development of frequency combs. “The frequency comb provides a variety of optical wavelengths that are processed independently of one another in the same photonic chip.”
This enables wavelength multiplexing: highly parallel data processing by calculating on all wavelengths simultaneously.
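A rough sketch of wavelength multiplexing follows. It assumes each comb line carries its own input vector and that all lines pass through the same weight matrix without interacting. The labels `lambda_1` and `lambda_2`, the function name, and the values are placeholders for illustration, not the wavelengths or data used in the experiment.

```python
def multiplexed_matvec(transmission, vectors_by_wavelength):
    """Apply one weight matrix to several input vectors, one per wavelength.

    Each wavelength is treated independently, modeling the assumption that
    different colors of light do not interact inside the chip.
    """
    return {
        wavelength: [sum(t * p for t, p in zip(row, vector)) for row in transmission]
        for wavelength, vector in vectors_by_wavelength.items()
    }


# Illustrative values only.
weights = [[0.5, 0.5],
           [1.0, 0.0]]
comb_inputs = {
    "lambda_1": [1.0, 3.0],
    "lambda_2": [2.0, 2.0],
}
results = multiplexed_matvec(weights, comb_inputs)
print(results)
```

In software the dictionary comprehension visits the wavelengths one after another. In the photonic chip described here, all of them would pass through the matrix in the same timestep.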
For the experiment, the physicists used a convolutional neural network for the recognition of handwritten numbers.
“The convolutional operation between input data and one or more filters (which can be a highlighting of edges in a photo, for example) can be transferred very well to our matrix architecture,” Feldmann said. “Exploiting light for signal transference enables the processor to perform parallel data processing through wavelength multiplexing, which leads to a higher computing density and many matrix multiplications being carried out in just one timestep. In contrast to traditional electronics, which usually work in the GHz range, optical modulation can reach speeds in the 50- to 100-GHz range.
“As the matrix vector multiplication is done in a passive transmission measurement, the speed is only limited by the photodetectors and the modulation of the input power,” Feldmann said. “Modulation and detection have been shown in the paper to work up to at least 14 GHz.”
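The mapping of a convolution onto a matrix product that Feldmann mentions can be sketched as follows. This is a generic software illustration, not the authors' implementation: every image patch is flattened into an input vector and every filter into a row of weights, so that each output value is one multiply-accumulate result. The helper names, the 3 × 3 image, and the edge filter are hypothetical. The filter uses signed weights for clarity, whereas a physical transmission lies between 0 and 1, so real hardware would need some additional encoding that is not modeled here.

```python
def image_patches(image, k):
    """Flatten every k x k patch of a 2D image into a vector (row-major order)."""
    rows, cols = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(k) for j in range(k)]
        for r in range(rows - k + 1)
        for c in range(cols - k + 1)
    ]


def convolve_as_matmul(image, kernels):
    """Compute a CNN-style convolution (no kernel flip) as matrix-vector products."""
    k = len(kernels[0])
    # Each filter becomes one row of the weight matrix.
    weight_rows = [[v for row in kernel for v in row] for kernel in kernels]
    return [
        [sum(w * x for w, x in zip(weights, patch)) for weights in weight_rows]
        for patch in image_patches(image, k)
    ]


# Illustrative data: a bright column on the right and a vertical-edge filter.
image = [[0, 0, 1],
         [0, 0, 1],
         [0, 0, 1]]
edge_filter = [[-1, 1],
               [-1, 1]]
feature_map = convolve_as_matmul(image, [edge_filter])
print(feature_map)
```

Each patch yields one output per filter, so adding filters only adds rows to the weight matrix, and patches could in principle be sent in parallel on different wavelengths.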
The prototype demonstrated matrices of up to 9 × 4 at a speed of 14 GHz, with four vectors processed in parallel, which allowed it to achieve 2 TOPS (2 trillion operations per second).
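The 2-TOPS figure can be reproduced with simple arithmetic. The sketch assumes that each multiply-accumulate counts as one operation, a counting convention the article does not state explicitly. The variable names are illustrative.

```python
# Figures quoted for the prototype.
matrix_rows, matrix_cols = 9, 4
parallel_vectors = 4          # input vectors processed per timestep
modulation_rate_hz = 14e9     # 14 GHz

# Assumption: one multiply-accumulate (MAC) is counted as one operation.
macs_per_timestep = matrix_rows * matrix_cols * parallel_vectors
ops_per_second = macs_per_timestep * modulation_rate_hz
tops = ops_per_second / 1e12

print(f"{macs_per_timestep} MACs per timestep, about {tops:.3f} TOPS")
```

Under that assumption the product is 144 operations per timestep, or roughly 2 trillion per second at 14 GHz.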
“With reasonable scaling assumptions (larger matrices, more vectors, faster operation) petaOPS/mm² can be obtained,” Feldmann said.
“The compute density for state-of-the-art AI processors is less than 1 TOPS/mm².”
The technology has a wide range of potential applications, particularly in AI. Larger neural networks allow for more accurate, previously unattainable forecasts and more precise data analysis. Photonic processors could, for example, support the evaluation of large quantities of data in medical diagnostics, such as the high-resolution 3D data produced by special imaging methods.
Additional applications include self-driving vehicles, which require the rapid evaluation of sensor data, and IT infrastructures such as cloud computing.
The Universities of Oxford and Exeter, the University of Pittsburgh, and the IBM research laboratory in Zurich, Switzerland, contributed to the work.
The research was published in Nature (www.doi.org/10.1038/s41586-020-03070-1).