AI Software Uses Programmatic Imaging to Train Vision Systems

PRINCETON, N.J., July 12, 2023 — Princeton University researchers have developed a software system that aims to overcome limits to existing generative AI systems and quickly create image sets to prepare machines for nearly any visual setting. The system, called Infinigen, creates natural-looking objects and environments in three dimensions.

Though AI presents an opportunity for creating the massive sets of images necessary to train autonomous cars and other machines to see their environments, current generative AI systems have shortcomings that can limit their use. Infinigen is a procedural generator, meaning that it creates content based on automated, human-designed algorithms rather than labor-intensive manual data entry or the neural networks that power modern AI. In this way, the new program generates myriad 3D objects using only randomized mathematical rules.
Princeton researchers have developed an open-source software system that generates an infinite number of photorealistic scenes of the natural world, an advance that could improve the training of autonomous cars and other robots. Courtesy of Princeton University.

Infinigen’s mathematical approach allows it to create labeled visual data, which is needed to train computer vision systems including those deployed on home robots and autonomous cars. Because Infinigen generates every image programmatically — it creates a 3D world first, populates it with objects, and places a camera to take a picture — Infinigen can automatically provide detailed labels about each image including the category and location of each object.
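The build-then-photograph pipeline described above can be illustrated with a toy sketch. This is not Infinigen's code or API: the category list, the `SceneObject` type, the function names, and the pinhole camera parameters are all assumptions made for illustration. It only shows why labels come for free when the scene is constructed programmatically, since the generator already knows what each object is and where it sits.

```python
import random
from dataclasses import dataclass

# Hypothetical object categories, chosen only for this illustration.
CATEGORIES = ["tree", "rock", "bush"]


@dataclass
class SceneObject:
    category: str
    position: tuple  # (x, y, z) in camera coordinates; z is distance in front of the camera


def generate_scene(seed, n_objects=5):
    """Populate a toy 3D world using seeded random rules."""
    rng = random.Random(seed)
    return [
        SceneObject(
            category=rng.choice(CATEGORIES),
            position=(rng.uniform(-5, 5), rng.uniform(-2, 2), rng.uniform(2, 20)),
        )
        for _ in range(n_objects)
    ]


def project(point, focal=500.0, width=640, height=480):
    """Project a 3D point to pixel coordinates with a simple pinhole camera."""
    x, y, z = point
    return focal * x / z + width / 2, focal * y / z + height / 2


def label_scene(objects, width=640, height=480):
    """Return one label per object that falls inside the image frame."""
    labels = []
    for obj in objects:
        u, v = project(obj.position, width=width, height=height)
        if 0 <= u < width and 0 <= v < height:
            labels.append(
                {
                    "category": obj.category,
                    "pixel": (int(u), int(v)),
                    "depth": obj.position[2],
                }
            )
    return labels


if __name__ == "__main__":
    for label in label_scene(generate_scene(seed=0)):
        print(label)
```

Because the scene is built from a seed, the same seed reproduces the same objects and the same labels, while a new seed yields a new labeled example.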

The resulting labeled images can be used to train a robot to recognize and locate objects given only an image as input. Such labeled visual data would not be possible with existing AI image generators, according to Jia Deng, an associate professor of computer science at Princeton and senior author of a study that details the software system. This is because those programs generate images using a deep neural network that does not allow the extraction of labels, Deng said.

In addition, Infinigen’s users have detailed control of the system’s settings, such as the precise lighting and viewing angle, and they can fine-tune the system to make images more useful as training data.

Besides generating virtual worlds populated by digital objects with natural shapes, sizes, textures, and colors, Infinigen’s capabilities extend to synthetic representations of natural phenomena including fire, clouds, rain, and snow. By vastly expanding the menu of 3D-rendered objects and landscapes, Infinigen also boosts machines’ ability to perform 3D reconstructions, from just 2D pixels, of the complex spaces they will operate within. While moving away from real-world images to synthetic images to develop cars and robots that will move in the real world might seem counterintuitive, real image data sets have key limitations, Deng said.

For example, the computers that guide robots and smart cars do not perceive images and other visual objects as humans do. An image that looks three-dimensional to a human is just a two-dimensional collection of pixels to a computer. To allow robots to perceive an image in 3D, the image needs to be paired with reference information called “3D ground truth,” which records the actual 3D structure behind the pixels. This information is difficult to obtain for existing 2D images.
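As a rough illustration of why synthetic scenes make this easier, the sketch below computes an exact depth value for every pixel of a scene containing a single sphere. It assumes that "3D ground truth" can be represented as a per-pixel depth map, which is one common form; the function name, the sphere scene, and the camera parameters are hypothetical and are not taken from Infinigen.

```python
import math


def sphere_depth_map(center, radius, width=9, height=7, focal=8.0):
    """Return a height x width grid of depths (z distance) to a sphere.

    The camera sits at the origin and looks along +z. Pixels whose ray
    misses the sphere are marked None.
    """
    cx, cy, cz = center
    depth = []
    for row in range(height):
        line = []
        for col in range(width):
            # Ray direction through the center of this pixel.
            dx = (col + 0.5 - width / 2) / focal
            dy = (row + 0.5 - height / 2) / focal
            dz = 1.0
            # Solve |t * d - c|^2 = r^2 for the nearest intersection t.
            a = dx * dx + dy * dy + dz * dz
            b = -2.0 * (dx * cx + dy * cy + dz * cz)
            c = cx * cx + cy * cy + cz * cz - radius * radius
            disc = b * b - 4.0 * a * c
            if disc < 0:
                line.append(None)  # ray misses the sphere
            else:
                t = (-b - math.sqrt(disc)) / (2.0 * a)
                line.append(t * dz if t > 0 else None)
        depth.append(line)
    return depth


if __name__ == "__main__":
    for line in sphere_depth_map(center=(0.0, 0.0, 10.0), radius=2.0):
        print(" ".join("  .  " if d is None else f"{d:5.2f}" for d in line))
```

The depth values here are exact because the geometry is known in advance. For a photograph of a real scene, the same information would have to be measured or estimated.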

An array of Infinigen-generated trees showing the variation and control users have over their images. Courtesy of Princeton University.

According to Deng, the developers expect the system to be a useful resource for augmented and virtual reality, and for additive manufacturing.

A study describing Infinigen was presented at the 2023 Conference on Computer Vision and Pattern Recognition (www.doi.org/10.48550/arXiv.2306.09310).

Published: July 2023

Glossary

computer vision: Computer vision enables computers to interpret and make decisions based on visual data, such as images and videos. It involves the development of algorithms, techniques, and systems that enable machines to gain an understanding of the visual world, similar to how humans perceive and interpret visual information. Key aspects and tasks within computer vision include: Image recognition: Identifying and categorizing objects, scenes, or patterns within images. This involves training...
machine vision: Machine vision, also known as computer vision or computer sight, refers to the technology that enables machines, typically computers, to interpret and understand visual information from the world, much like the human visual system. It involves the development and application of algorithms and systems that allow machines to acquire, process, analyze, and make decisions based on visual data. Key aspects of machine vision include: Image acquisition: Machine vision systems use various...
image: In optics, an image is the reconstruction of light rays from a source or object when light from that source or object is passed through a system of optics and onto an image forming plane. Light rays passing through an optical system tend to either converge (real image) or diverge (virtual image) to a plane (also called the image plane) in which a visual reproduction of the object is formed. This reconstructed pictorial representation of the object is called an image.
artificial intelligence: The ability of a machine to perform certain complex functions normally associated with human intelligence, such as judgment, pattern recognition, understanding, learning, planning, and problem solving.