Scientists have developed a computer system that can discover and identify real-world objects using the same method of visual learning that humans use. The system is an advance in a type of technology called “computer vision,” which enables computers to read and identify visual images, said researchers from the University of California, Los Angeles (UCLA) in the US. It could be an important step towards general artificial intelligence (AI) systems: computers that learn on their own, are intuitive, make decisions based on reasoning and interact with humans in a much more human-like way.
Although current AI computer vision systems are increasingly powerful and capable, they are task-specific, meaning that their ability to identify what they see is limited by how much they have been trained and programmed by humans. Even today’s best computer vision systems cannot create a full picture of an object after seeing only certain parts of it, and they can be fooled when the object appears in an unfamiliar setting.
Engineers are aiming to build computer systems with those abilities, just as humans can understand that they are looking at a dog even if the animal is hiding behind a chair and only the paws and tail are visible. Humans can also easily intuit where the dog’s head and the rest of its body are, but that ability still eludes most artificial intelligence systems, researchers said. Current computer vision systems are not designed to learn on their own. They must be trained on exactly what to learn, usually by reviewing thousands of images in which the objects they are trying to identify are labelled for them.
Computers also can’t explain their rationale for determining what the object in a photo represents: AI-based systems don’t build an internal picture or a common-sense model of learned objects the way humans do, researchers said. The new method, described in the journal PNAS, shows a way around those shortcomings. The approach is made up of three broad steps.
First, the system breaks up an image into small chunks, which the researchers call “viewlets.” Second, the computer learns how those viewlets fit together to form the object in question. Finally, it looks at what other objects are in the surrounding area, and whether or not information about those objects is relevant to describing and identifying the primary object.
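To make the three steps concrete, here is a deliberately simplified Python sketch. It is not the researchers' method: the grid-of-numbers "image", the fixed patch size, the pairwise-offset table and the co-occurrence threshold are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def extract_viewlets(image, size):
    """Step 1 (toy version): cut a 2D grid into non-overlapping size x size patches."""
    viewlets = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            patch = tuple(tuple(image[r + i][c:c + size]) for i in range(size))
            viewlets.append(((r, c), patch))
    return viewlets


def learn_arrangement(viewlets):
    """Step 2 (toy version): count the relative offset between every pair of patches."""
    offsets = Counter()
    for ((r1, c1), p1), ((r2, c2), p2) in combinations(viewlets, 2):
        offsets[(p1, p2, r2 - r1, c2 - c1)] += 1
    return offsets


def relevant_context(primary, nearby, cooccurrence, threshold):
    """Step 3 (toy version): keep nearby objects seen with the primary object often enough."""
    return [obj for obj in nearby if cooccurrence.get((primary, obj), 0) >= threshold]


if __name__ == "__main__":
    # A 4x4 "image" of made-up pixel values, split into 2x2 patches.
    image = [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4],
    ]
    patches = extract_viewlets(image, 2)
    arrangement = learn_arrangement(patches)
    # Hypothetical co-occurrence counts: how often each pair appeared together.
    counts = {("dog", "leash"): 8, ("dog", "car"): 1}
    print(len(patches), sum(arrangement.values()))
    print(relevant_context("dog", ["leash", "car"], counts, threshold=5))
```

A real system would work on learned visual features rather than raw pixel patches, but the sketch shows how the three stages feed one another: parts first, then their spatial relationships, then the surrounding context.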
To help the new system “learn” more like humans, the engineers decided to immerse it in an internet replica of the environment humans live in. “Fortunately, the internet provides two things that help a brain-inspired computer vision system learn in the same way that humans do,” said Vwani Roychowdhury, a professor at UCLA. “One is a wealth of images and videos that depict the same types of objects. The second is that those objects are shown from many perspectives (obscured, bird’s-eye, up-close), and they are placed in all different kinds of environments,” Roychowdhury said.