Although there have been many advances in machine vision, most relatively simple robots still cannot maneuver around objects at high speed because they cannot quickly judge their distance from those objects. To tackle this problem, researchers from Stanford University have developed an algorithm that does what many said was impossible: it allows robots to estimate distances from a single still image. The algorithm was developed by a team led by computer science Assistant Professor Andrew Ng and was presented at the Neural Information Processing Systems Conference held in Vancouver this week.
Autonomous robot navigation is not new. With the proper investment in hardware (sensor arrays, cameras, radar, GPS, etc.), robots can navigate fairly effectively at high speeds, as illustrated by the number of robots that successfully finished DARPA’s Grand Challenge this year. As a matter of fact, Stanley, the Volkswagen that ultimately won the race, was developed by a team from Stanford. The problem is that these robots are expensive and must be physically large enough to accommodate all of their navigational hardware. Enter Ng and his team of graduate students. With their algorithm, robots that are too small to carry multiple sensors, or that must be built cheaply, can navigate with a single video camera.
“Many people have said that depth estimation from a single monocular image is impossible,” says Ng. “I think this work shows that in practical problems, monocular depth estimation not only works well, but can also be very useful.”
The algorithm itself mimics the way that humans infer depth from photos. The software is designed to recognize the same clues that we do: objects in the foreground appear sharper than objects farther away, lines appear to converge at a distance, and objects that are far away look hazy. These clues can be converted into properties that the software can analyze: texture, edges, and haze, respectively. The algorithm examines individual quadrants of the picture, first on their own and then in comparison to one another, to determine depth.
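To make the idea of per-region clues concrete, here is a minimal sketch in Python. It is not the Stanford team's code: the function name `patch_cues`, the grid layout, and the three stand-in measurements (intensity variance for texture, mean horizontal difference for edges, intensity range as a rough haze indicator) are all assumptions chosen for illustration. It only computes the clues for each region; turning them into actual distances is not covered here.

```python
from statistics import pvariance


def patch_cues(gray, rows, cols):
    """Split a grayscale image (list of rows of numbers) into a rows x cols
    grid and compute three crude, hypothetical cues for each region."""
    height, width = len(gray), len(gray[0])
    cues = []
    for r in range(rows):
        row_cues = []
        for c in range(cols):
            # Pixel bounds of this region.
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            pixels = [gray[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            # Absolute differences between horizontal neighbours in the region.
            diffs = [
                abs(gray[y][x + 1] - gray[y][x])
                for y in range(y0, y1)
                for x in range(x0, x1 - 1)
            ]
            row_cues.append({
                # Texture stand-in: how much the intensities vary.
                "texture": pvariance(pixels),
                # Edge stand-in: average strength of horizontal changes.
                "edges": sum(diffs) / len(diffs) if diffs else 0.0,
                # Haze stand-in (assumption): a small intensity range
                # suggests a washed-out, hazy region.
                "contrast": max(pixels) - min(pixels),
            })
        cues.append(row_cues)
    return cues


# Tiny 4x4 example: a busy top-left region, flat everywhere else.
image = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
cues = patch_cues(image, rows=2, cols=2)
```

In this toy image the top-left region scores higher on all three measurements than the flat regions, which is the kind of difference between regions that the article describes the algorithm comparing.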
The algorithm has been used to determine distances with an error of about 35%, which still gives a robot capturing images at 10 fps and traveling at 20 miles an hour plenty of time to avoid most objects. In addition, this single-camera vision algorithm can see objects up to ten times farther away than regular stereo vision systems.
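The figures above can be checked with some quick arithmetic. The sketch below is only an illustration: the function names are made up, the 30-meter obstacle is an arbitrary example, and it assumes the 35% error applies as a simple fraction of the estimated distance, which the article does not specify.

```python
MPH_TO_METERS_PER_SECOND = 1609.344 / 3600.0  # one mile is 1609.344 m


def meters_per_frame(speed_mph, fps):
    """Distance the robot covers between two consecutive frames."""
    return speed_mph * MPH_TO_METERS_PER_SECOND / fps


def frames_available(estimated_m, speed_mph, fps, relative_error=0.35):
    """Whole frames captured before reaching an obstacle, taking the
    pessimistic case where the obstacle is closer than estimated
    (assumption: error is a fraction of the estimate)."""
    pessimistic_m = estimated_m * (1.0 - relative_error)
    return int(pessimistic_m // meters_per_frame(speed_mph, fps))


step = meters_per_frame(20, 10)         # a little under 0.9 m per frame
frames = frames_available(30, 20, 10)   # obstacle estimated at 30 m
```

At 20 miles an hour and 10 fps the robot moves a little under a meter between frames, so even if an obstacle estimated at 30 meters is really 35% closer, the robot still captures about twenty more frames before reaching it.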
This is only the beginning for Ng, though. “The difficulty of getting visual depth perception to work at large distances has been a major barrier to getting robots to move and to navigate at high speeds,” he says. “I’d like to build an aircraft that can fly through a forest, flying under the tree canopy and dodging around trees.”
Read the press release at the Stanford University web site: “Robots can travel more safely with new software”