Scientists have developed a new technique that allows robots to navigate autonomously in different environments and carry out the actions a soldier would expect from a teammate on the battlefield. The technique, developed by researchers at the US Army Research Laboratory (ARL) and Carnegie Mellon University, helps quickly teach robots novel behaviours with minimal human oversight.
“If a robot acts as a teammate, tasks can be accomplished faster and more situational awareness can be obtained,” said Maggie Wigness of ARL. “Further, robot teammates can be used as an initial investigator for potentially dangerous scenarios, thereby keeping Soldiers further from harm,” she added.
To achieve this, Wigness said the robot must be able to use its learned intelligence to perceive, reason and make decisions. “This research focuses on how robot intelligence can be learned from a few human example demonstrations,” Wigness said. “The learning process is fast and requires minimal human demonstration, making it an ideal learning technique for on-the-fly learning in the field when mission requirements change,” she said.
Researchers focused their initial investigation on learning robot traversal behaviours with respect to the robot’s visual perception of terrain and objects in the environment. More specifically, the robot was taught how to navigate from various points in the environment while staying near the edge of a road, and also how to traverse covertly using buildings as cover.
According to the researchers, the most appropriate learned traversal behaviour can be activated during robot operation to suit the mission task at hand. The behaviours are learned using inverse optimal control, also commonly called inverse reinforcement learning, a class of machine learning that seeks to recover a reward function from a known optimal policy.
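To make the idea concrete, here is a minimal Python sketch of recovering terrain preferences from a single demonstrated path. It is an illustration only, not the ARL and Carnegie Mellon implementation: the 3x5 map, the terrain labels `R` (road) and `G` (grass), the function names, the step size and the perceptron-style update are all assumptions made for this example. It learns a per-terrain cost (the negative of a reward) by repeatedly planning with the current costs, then making terrain the planner overused more expensive and terrain the demonstration favoured cheaper.

```python
import heapq

# Hypothetical 3x5 terrain map: 'R' = road, 'G' = grass (labels are illustrative).
GRID = [
    "RGGGR",
    "RGGGR",
    "RRRRR",
]
START, GOAL = (0, 0), (0, 4)

# Assumed human demonstration: stay on the road instead of crossing the grass.
DEMO = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (2, 3), (2, 4), (1, 4), (0, 4)]


def plan(grid, cost, start, goal):
    """Dijkstra on a 4-connected grid; entering a cell costs cost[terrain]."""
    rows, cols = len(grid), len(grid[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        dist, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if dist > best[cell]:
            continue  # stale queue entry
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[grid[nr][nc]]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    parent[(nr, nc)] = cell
                    heapq.heappush(frontier, (new_dist, (nr, nc)))
    # Walk parent links back from the goal to rebuild the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]


def terrain_counts(grid, path):
    """Count the cells of each terrain type that the path enters (start excluded)."""
    counts = {}
    for r, c in path[1:]:
        counts[grid[r][c]] = counts.get(grid[r][c], 0) + 1
    return counts


def learn_costs(grid, demo, start, goal, step=0.1, max_iters=50):
    """Adjust per-terrain costs until the planner reproduces the demonstration."""
    cost = {terrain: 1.0 for row in grid for terrain in row}
    demo_counts = terrain_counts(grid, demo)
    for _ in range(max_iters):
        planned = plan(grid, cost, start, goal)
        if planned == demo:
            break
        plan_counts = terrain_counts(grid, planned)
        for terrain in cost:
            # Raise the cost of terrain the planner used more than the demo did,
            # lower it for terrain the demo used more; keep every cost positive.
            diff = plan_counts.get(terrain, 0) - demo_counts.get(terrain, 0)
            cost[terrain] = max(0.01, cost[terrain] + step * diff)
    return cost


learned = learn_costs(GRID, DEMO, START, GOAL)
print(learned)
```

On this toy map one update is enough: road becomes cheaper than grass, and the planner then reproduces the demonstrated detour along the road instead of cutting straight across the grass.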
In this case, a human demonstrates the optimal policy by driving a robot along a trajectory that best represents the behaviour to be learned. These example trajectories are then related to visual features of the terrain and objects, such as grass, roads and buildings, to learn a reward function with respect to these environment features. While similar research exists in the field of robotics, ARL says its focus sets the work apart.
“The challenges and operating scenarios that we focus on here at ARL are extremely unique compared to other research being performed,” Wigness said. “We seek to create intelligent robotic systems that reliably operate in warfighter environments, meaning the scene is highly unstructured, possibly noisy, and we need to do this given relatively little a priori knowledge of the current state of the environment,” she said.
The research is crucial for the future battlefield, where soldiers will be able to rely on robots with more confidence to assist them in executing missions.