Algorithm Adds 3rd Dimension to Standard Video

By exploiting the graphics-rendering software that powers sports video games, researchers at the Massachusetts Institute of Technology (MIT) and the Qatar Computing Research Institute (QCRI) have developed a system that automatically converts 2D video of soccer games into a 3D version that can be played on commercial 3D televisions and other special-purpose displays.

Today's video games generally store very detailed 3D maps of the virtual environment that players navigate. When the player initiates a move, the game adjusts the map accordingly and, on the fly, generates a 2D projection of the 3D scene that corresponds to a particular viewing angle.
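The projection step described above can be illustrated with a simple pinhole camera model. This is a minimal sketch rather than the rendering code of any particular game; the function name `project_point` and the `focal_length` parameter are assumptions made for illustration.

```python
def project_point(point, focal_length=1.0):
    """Project a 3D camera-space point (x, y, z) onto a 2D image plane."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Perspective divide: the farther away a point is, the closer it
    # lands to the centre of the image.
    return (focal_length * x / z, focal_length * y / z)


# The same (x, y) offset seen at two different depths.
near = project_point((1.0, 2.0, 2.0))
far = project_point((1.0, 2.0, 4.0))
```

The depth value `z` does not survive the projection, and that missing value is what a 2D-to-3D conversion has to estimate for each part of the frame.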


To create a 3D video from 2D source material, the researchers essentially ran this process in reverse. They set the realistic soccer video game "FIFA13," published by Electronic Arts, to play over and over again, and used Microsoft's video-game analysis tool PIX to continuously store screen shots of the action. For each screen shot, they extracted the corresponding 3D map.

Using a standard algorithm for gauging the difference between two images, they winnowed out most of the screen shots, keeping those that best captured the range of possible viewing angles and player configurations that the game presented; the total number of screen shots ran to the tens of thousands. Then they stored each screen shot and the associated 3D map in a database.
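One way to picture the winnowing step is a greedy filter that keeps a screen shot only if it differs enough from every screen shot already kept. The article does not name the difference measure or the selection procedure the researchers used, so the mean absolute pixel difference, the `threshold` value, and the function names below are all assumptions.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equal-sized grayscale
    images, each given as a flat list of pixel values."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def winnow(shots, threshold):
    """Return the indices of screen shots that differ from every
    previously kept shot by more than `threshold` (an assumed rule)."""
    kept = []
    for idx, shot in enumerate(shots):
        if all(mean_abs_diff(shot, shots[k]) > threshold for k in kept):
            kept.append(idx)
    return kept


# Four tiny stand-in "screen shots" of four pixels each.
shots = [
    [0, 0, 0, 0],
    [1, 0, 0, 1],      # nearly identical to shot 0
    [200, 200, 0, 0],
    [200, 198, 0, 0],  # nearly identical to shot 2
]
kept = winnow(shots, threshold=10)
```

Here shots 1 and 3 are discarded as near-duplicates, leaving shots 0 and 2 to represent the two distinct views.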

For every frame of 2D video of an actual soccer game, the system identifies in the database the 10 best corresponding screen shots. It decomposes those images, looking for the best matches between smaller regions of the video feed and smaller regions of the screen shots. It then superimposes the depth information from the screen shots on the corresponding sections of the video feed and, finally, stitches the pieces back together.
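The lookup and depth-transfer steps can be sketched as follows. This is a simplified illustration under several assumptions: images are small 2D lists of grayscale values, similarity is a sum of absolute differences, regions are fixed-size blocks compared at the same position in the frame and the screen shot, and the final stitching simply copies blocks side by side. The researchers' actual matching and stitching methods are not detailed in the article, and every name below is hypothetical.

```python
def diff(a, b):
    """Sum of absolute differences between two equal-sized 2D images."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def block(img, r, c, size):
    """Cut the size-by-size block whose top-left corner is (r, c)."""
    return [row[c:c + size] for row in img[r:r + size]]


def estimate_depth(frame, database, k=10, size=2):
    """database: list of (screen_shot, depth_map) pairs with the same
    dimensions as `frame`."""
    # Step 1: shortlist the k screen shots most similar to the whole frame.
    candidates = sorted(database, key=lambda entry: diff(frame, entry[0]))[:k]
    height, width = len(frame), len(frame[0])
    depth = [[0.0] * width for _ in range(height)]
    for r in range(0, height, size):
        for c in range(0, width, size):
            target = block(frame, r, c, size)
            # Step 2: pick the candidate whose block matches this region best.
            _, best_depth = min(
                candidates,
                key=lambda entry: diff(target, block(entry[0], r, c, size)),
            )
            # Step 3: copy that candidate's depth values into the output.
            for dr, row in enumerate(block(best_depth, r, c, size)):
                depth[r + dr][c:c + len(row)] = row
    return depth


shot_a = [[10, 10, 0, 0], [10, 10, 0, 0]]
depth_a = [[1, 1, 1, 1], [1, 1, 1, 1]]
shot_b = [[0, 0, 90, 90], [0, 0, 90, 90]]
depth_b = [[5, 5, 5, 5], [5, 5, 5, 5]]
frame = [[10, 10, 90, 90], [10, 10, 90, 90]]

result = estimate_depth(frame, [(shot_a, depth_a), (shot_b, depth_b)], k=2)
```

In this toy case the left half of the frame takes its depth from `shot_a` and the right half from `shot_b`, which shows why matching region by region can do better than relying on a single whole-image match.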

The result is a 3D effect with no visual artifacts. In a user study, the majority of subjects gave the 3D effect a rating of 5, or excellent, on a five-point scale; the average score was between 4 and 5.

Currently the system takes about a third of a second to process a frame of video. But successive frames could all be processed in parallel, meaning a broadcast delay of a second or two might provide an adequate buffer to permit conversion on the fly. The researchers are working to further reduce conversion time.
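A rough calculation shows what parallel processing would have to deliver. The article does not state the video frame rate, so the 30 frames per second used below is an assumption, as is the function name; the one-third-second processing time comes from the figure quoted above.

```python
import math
from fractions import Fraction


def workers_needed(seconds_per_frame, frames_per_second):
    """Minimum number of frames that must be in flight at once for
    processing to keep pace with the incoming video."""
    return math.ceil(seconds_per_frame * frames_per_second)


# One third of a second per frame, at an assumed 30 frames per second.
workers = workers_needed(Fraction(1, 3), 30)
```

Under these assumptions, about 10 frames would need to be processed simultaneously, and each frame would emerge roughly a third of a second after it arrived, which fits within the one-to-two-second broadcast delay mentioned above.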

The research was presented at the Association for Computing Machinery’s 2015 Multimedia conference.
