Human vision takes in the rough outlines of objects before examining the details. This improves the speed and accuracy of our image recognition, a trait that could be essential to our survival. Machine vision, however, typically performs the task in the opposite manner, carefully comparing the details of each image section with the images in its data bank, slowly and methodically eliminating areas until the device finds a match.

In an attempt to change the way computers see the world, researchers at Boston College have developed a technique that models computer vision on the human visual system. They say that their process enables machines to identify fleeting images with nearly twice the speed and 10 times the accuracy of previous methods, which could have a major impact on the fields of action and object recognition, surveillance, wide-base stereomicroscopy and three-dimensional shape reconstruction.

Researchers took video footage of several objects, including a teddy bear and a fish. To locate the object in each image, they applied both a traditional “greedy” method and their experimental method. The overlay of lines and dots demonstrates the points the programs determined to be part of the object. (The objects are pictured on the left, and in the insets. The “greedy” method is shown in the center images, and the researchers’ method is on the right.) Images courtesy of Hao Jiang.

The investigators, Hao Jiang and Stella X. Yu, realized that the true challenge lay in identifying objects that move and thus change in scale and orientation. Previous vision systems had to compare every slight change with an image database, essentially repeating the locating process for every frame of a video. The researchers have developed a series of linear algorithms that use an approximation of the desired target (an exemplar given to the system) to quickly scan the search area, then identify the changing image by updating trust search regions.
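The general idea of scanning a search area coarsely and then narrowing the search region can be illustrated with a toy example. The sketch below is not the researchers' linear formulation, and it ignores changes in scale and orientation. It is a simple stand-in that assumes a grayscale frame, a fixed-size exemplar, and a sum-of-squared-differences cost, and the names `ssd` and `shrinking_region_search` are hypothetical. It evaluates the exemplar on a sparse grid, shrinks the region around the best candidate, halves the grid spacing and repeats.

```python
import numpy as np

def ssd(image, template, top, left):
    """Sum of squared differences between the template and one image window."""
    h, w = template.shape
    window = image[top:top + h, left:left + w]
    return float(np.sum((window - template) ** 2))

def shrinking_region_search(image, template, step=8):
    """Find the template's top-left corner by scanning a coarse grid, then
    repeatedly shrinking the search region and halving the grid step.
    Illustrative only: assumes the cost varies smoothly with position."""
    h, w = template.shape
    max_top, max_left = image.shape[0] - h, image.shape[1] - w
    # Start with the whole frame as the search region.
    top_lo, top_hi, left_lo, left_hi = 0, max_top, 0, max_left
    while True:
        candidates = [(t, l)
                      for t in range(top_lo, top_hi + 1, step)
                      for l in range(left_lo, left_hi + 1, step)]
        best = min(candidates, key=lambda p: ssd(image, template, *p))
        if step == 1:
            return best
        # Shrink the region to one grid cell on each side of the best candidate.
        top_lo, top_hi = max(0, best[0] - step), min(max_top, best[0] + step)
        left_lo, left_hi = max(0, best[1] - step), min(max_left, best[1] + step)
        step //= 2

# Synthetic frame (an assumption for the demo): one smooth blob on a dark background.
rows, cols = np.mgrid[0:64, 0:64]
frame = np.exp(-((rows - 40) ** 2 + (cols - 25) ** 2) / (2 * 6.0 ** 2))
exemplar = frame[32:48, 17:33].copy()
print(shrinking_region_search(frame, exemplar))
```

On this synthetic frame the search recovers the exemplar's position, (32, 17), while evaluating far fewer windows than an exhaustive scan of every position would. A scheme like this can miss the target when the cost surface is not smooth, which is one reason it should be read only as an illustration of a shrinking search region.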
The researchers' process focuses on a mathematically generated template of the image, which enables the system to track deformations of the object. It then records each of these changed templates, adding them to its database. According to Jiang, the method detects the object in each image (each video frame). This, he said, is “a more robust way to track [the] target object because it avoids [the] drifting problem [that is found] in traditional tracking systems.” The researchers say that the process enables their program to maintain spatial consistency as the target moves and that it reduces the number of necessary variables from millions to a few hundred. This simplification produced the increase in matching speed.

After testing their software on a variety of images, the scientists determined that their algorithms detected the correct object about 95 percent of the time, while previously used “greedy” search-and-match methods had detection rates of around 50 percent. The investigators are researching ways to further improve the speed of the software, with the intention of enabling it to perform real-time image analysis.

Rebecca C. Jernigan
rebecca.jernigan@laurin.com