Robots and self-driving cars share one very large challenge: how to navigate the world. Artificial intelligence typically approaches that task as a mapping problem, constructing a precise overview of the geometry of a scene before a robot or a car moves across that terrain.
There may be a simpler way.
In a paper posted on arXiv Wednesday, scholars at the University of California at Berkeley describe a wheeled robot that is able to travel kilometers over suburban terrain.
The droid sticks to paths and dodges previously unseen obstacles. Crucially, it does not map its environment, as other approaches have done, such as autonomous driving AI programs.
Instead, it relies on heuristics picked up from thirty hours of video of previous runs, plus some overhead maps of the terrain, to build a rough schematic of how the waypoints along the route relate to one another, without a full map.
The research, titled "ViKiNG: Vision-Based Kilometer-Scale Navigation with Geographic Hints," is authored by Ph.D. candidate Dhruv Shah and UC Berkeley assistant professor Sergey Levine.
Levine has worked for years on bringing AI to robotics in collaboration with Google. Many of the key discoveries of that work were related by Levine last year in a paper titled "How to train your robot." That paper focused on discoveries in what's called "reinforcement learning," a form of deep learning AI where neural networks are trained to advance in stages toward a goal.
The latest work by Shah and Levine, called ViKiNG, has important connections to RL.
ViKiNG builds upon a previous system, called "RECON," standing for "Rapid Exploration Controllers for Outcome-driven Navigation," introduced by Shah and Levine last year.
RECON was trained by having the wheeled robot, a Jackal unmanned ground vehicle made by Clearpath Robotics, travel on "random walks" through multiple environments such as parking lots and fields over the course of 18 months, collecting hours of video via mounted RGB cameras, LiDAR and GPS.
RECON learned what are called "navigational priors" using a convolutional network that compressed and then uncompressed the image data, a form of "information bottleneck," an approach to dealing with signals introduced in 2000 by Naftali Tishby and colleagues.
For RECON, that approach develops the software's ability to form a good representation of its visual surroundings by compressing images down and then recalling what is salient. In the test phase, RECON is shown an image of a goal, a particular building, say, and has to figure out on the fly how to navigate to that new place.
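The core idea of a bottleneck — squeeze the input through a narrow code, then reconstruct it, so the code is forced to keep only what is salient — can be illustrated with a toy example. The sketch below is not RECON's convolutional architecture; it uses a simple linear (PCA-style) codec purely to show how reconstruction quality depends on how much the bottleneck keeps.

```python
import numpy as np

# Toy illustration of an information bottleneck: compress
# high-dimensional "images" to a small latent code, then
# reconstruct. A linear (PCA-style) codec stands in here for
# the convolutional encoder/decoder RECON actually uses.
rng = np.random.default_rng(0)
images = rng.normal(size=(200, 64))          # 200 flattened 8x8 frames
images -= images.mean(axis=0)                # center the data

# Fit the codec: the top-k principal directions act as the encoder.
_, _, vt = np.linalg.svd(images, full_matrices=False)

def reconstruction_error(k):
    basis = vt[:k]                           # encoder/decoder weights
    latent = images @ basis.T                # compress: 64 -> k numbers
    recon = latent @ basis                   # decompress back to 64
    return float(np.mean((images - recon) ** 2))

# A wider bottleneck retains more of what is salient in the input.
errors = {k: reconstruction_error(k) for k in (4, 16, 64)}
assert errors[64] < errors[16] < errors[4]
```

The trade-off the assertion demonstrates — narrower codes lose more detail — is what forces a learned bottleneck to prioritize the features that matter for its task.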
RECON constructs a graph of steps along a path to that goal, a kind of improvised map. Using those techniques, the Jackal robot was able to navigate up to 80 meters toward a goal in new surroundings it had never encountered before. It was able to do so in instances where every other existing approach to robot navigation failed to reach a goal.
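Once landmarks are connected into a graph, reaching the goal reduces to classical graph search over that improvised map. The sketch below is a hypothetical illustration of that planning step, using Dijkstra's algorithm over made-up landmark names and edge costs — not the authors' code.

```python
import heapq

# Hypothetical waypoint graph, in the spirit of RECON's improvised
# map: nodes are landmarks, edge weights are estimated traversal costs.
graph = {
    "start":    [("lot", 1.0), ("field", 2.5)],
    "lot":      [("field", 1.0), ("building", 3.0)],
    "field":    [("building", 1.5)],
    "building": [],
}

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm: cheapest chain of waypoints to the goal."""
    frontier = [(0.0, start, [start])]
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph[node]:
            heapq.heappush(frontier, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

cost, path = shortest_path(graph, "start", "building")
# start -> lot -> field -> building at total cost 3.5
```

The hard part in RECON is not the search itself but learning which landmarks connect and at what cost — the graph is built from visual experience rather than given in advance.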
In ViKiNG, Shah and Levine extend RECON in one specific way: hints. They give the Jackal's software either overhead satellite images of the new terrain or overhead maps.
As Shah and Levine write, "In contrast to RECON, which performs an uninformed search, ViKiNG incorporates geographic hints in the form of approximate GPS coordinates and overhead maps.
"This enables ViKiNG to reach faraway goals, up to 25× further away than the furthest goal reported by RECON, and to reach goals up to 15× faster than RECON when exploring a novel environment."
The ViKiNG program augments the camera observations collected during the random walks with an additional 12 hours of video from "teleoperated" trips, in which a human guided the Jackal along paths such as sidewalks or hiking trails, to build up those prior examples. The neural network used to crunch all that training data is fairly humdrum: the familiar MobileNet convolutional neural network.
This time, equipped with ViKiNG, the Jackal goes well beyond RECON's 80 meters, traveling from start to a destination almost 3 kilometers away, or nearly two miles.
In videos featured on a blog page for the project, Shah and Levine show how the Jackal with ViKiNG can figure out how to route around previously unknown obstacles, such as a parked vehicle blocking its path. A companion video explains the work, which you can view at the bottom of this post.
RECON explicitly employed elements of reinforcement learning, and ViKiNG likewise borrows from RL. Asked about the connection, Levine told ZDNet in an email, "I would characterize ViKiNG as a kind of reinforcement learning method with a higher level planner sitting on top of it."
The key, Levine explained, is the combination of low-level learned control approaches for moment-by-moment navigation, and higher-level planning, which is akin to RL.
As Levine described it,
The explicit high-level planning provides the ability to handle very long horizons, so a good way to look at the method is using model-free [RL] techniques to handle the low-level problem of local navigation (e.g., how to drive around a tree) with planning for the high-level problem of how to plot a path to a distant goal. I think that's actually a really natural fit -- much like a person driving a car might not think very carefully about every single turn they make, but would do some explicit planning in their mind to determine which route to take to their destination, perhaps reasoning about landmarks as the "nodes" in the plan.
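The hierarchy Levine describes — a deliberate planner choosing the route, a reactive learned controller handling each local step — can be sketched in miniature. Everything below is a made-up, one-dimensional stand-in: the "low-level policy" is a placeholder for a learned model-free controller, and the landmark numbers are arbitrary.

```python
# Hypothetical sketch of hierarchical navigation: a high-level
# planner orders landmarks into a coarse route; a low-level policy
# (standing in for the learned model-free controller) moves the
# robot step by step toward each subgoal.

def low_level_policy(position, subgoal):
    """Placeholder for a learned controller: take one local step."""
    return position + (1 if subgoal > position else -1)

def high_level_plan(landmarks, goal):
    """Order intermediate landmarks into a coarse route to the goal."""
    return sorted(l for l in landmarks if l <= goal) + [goal]

position = 0
route = high_level_plan([3, 7], goal=10)   # coarse plan: 3 -> 7 -> 10
for subgoal in route:
    while position != subgoal:             # local navigation per leg
        position = low_level_policy(position, subgoal)
assert position == 10
```

The division of labor mirrors Levine's driving analogy: the route (which landmarks, in what order) is planned explicitly, while each leg between landmarks is handled reactively without long-horizon reasoning.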
Levine believes the work is highly relevant to more complex navigation tasks such as autonomously driven vehicles. ViKiNG, he said, is a start toward a "sidewalk delivery robot."
"But autonomous driving or other tasks with higher stakes (or even real sidewalk delivery that has to deal with dense traffic) has to have additional mechanisms to deal with safety and constraints, which the current approach doesn't directly handle just yet," said Levine.
Levine offered that additional work on safety could include explicit instructions from humans acting as "co-pilots" to direct robots away from harm, as well as imitation of existing policies that would instill some safeguards.
However, dealing with a vehicle moving at high speed amid unseen elements such as jaywalking pedestrians will take a lot more research, said Levine.
"Of course, providing rigorous safety guarantees for such systems is a major open problem, and I do think a lot more work is required there to make this safe and secure enough for full-scale autonomous vehicles," Levine told ZDNet.