Friday 15 June 2018

Scientific American/Bret Stetka: That Vision Thing: New AI System Can Imagine What It Hasn't Seen

Scientific American

Computing
That Vision Thing: New AI System Can Imagine What It Hasn't Seen

Scientists have developed a machine-learning system that can teach itself to visualize a three-dimensional scene from unobserved angles

    By Bret Stetka on June 15, 2018

An artist's interpretation of an AI system able to "visualize" objects in a scene from perspectives it has not yet observed—an advance detailed in the paper "Neural Scene Representation and Rendering," published June 15 in Science. Credit: DeepMind

“Before we work on artificial intelligence, why don’t we do something about natural stupidity?” computer scientist Steve Polyak once joked. The latter might be a tall order. But AI, it appears, just took one small step for robotkind.

New research published June 15 in Science reports that for the first time scientists have developed a machine-learning system that can observe a particular scene from multiple angles and predict what it will look like from a new, never-before-observed angle. With further development the technology could lead to more autonomous robots in industrial and manufacturing settings.

Much in the way we can scan a friend’s apartment from one side of the living room and have a pretty good sense of what it looks like from the other, the new technology can do just that for a scene within a three-dimensional computer image.

Devised by researchers at artificial intelligence company DeepMind—acquired by Google in 2014—the new system can “learn” the three-dimensional layout of a space without any human supervision. The Generative Query Network, or GQN, as its developers call it, is first trained by observing simple computer-generated scenes containing different lighting and object arrangements. It can then be exposed to multiple images of a new environment and accurately predict what it looks like from any angle within it. Unlike the hyperconnected perceptual regions of the human brain, the system learns and processes properties like shape, size and color separately and then assimilates the data into a cohesive “view” of a space. “Humans and other animals have a rich understanding of the visual world in terms of objects, geometry, lighting, etcetera,” says Ali Eslami, lead author on the new paper and a research scientist at DeepMind. “This capability is developed through a combination of innate knowledge and unsupervised learning. Our motivation behind this research is to understand how we can build computer systems that learn to interpret the visual world in a similar manner.”

Credit: DeepMind

Machine learning as a field has barreled forward in recent years. And GQN technology builds on many past systems, including the numerous “deep-learning” models based on neural networks inspired by the human brain. Deep learning is a form of machine learning in which a computer “learns” from exposure to an image or other data to, say, detect the various features that make an object a cat or a spoon. It does so after observing many images of scenes labeled to identify these objects. GQN utilizes deep learning to build a form of computerized “vision” that enables navigation through complex scenes. What’s unique about it compared with many other systems is its ability to learn on its own purely from observation and without human supervision. It analyzes unlabeled objects and the space they occupy in one scene and then applies what it has learned to another. “This gives GQN increased flexibility and frees us from having to create a large collection of models for every object in the world,” Eslami says. In other words it can recognize a novel object based on prior exposure to a different object using characteristics like shape and color.
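The overall structure described above—encode each observed view together with its camera position, combine the encodings into one scene summary, then “render” that summary from a new viewpoint—can be sketched in a few lines of toy Python. Everything here is hypothetical and illustrative: the function names are invented, the “networks” are tiny random, untrained weight matrices, and the real GQN uses large neural networks trained on millions of computer-generated scenes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for GQN's two trained networks.
W_repr = rng.normal(size=(5, 8))   # "representation network" weights
W_gen = rng.normal(size=(10, 3))   # "generation network" weights

def represent(image_vec, viewpoint):
    """Encode one observation (a flattened image plus camera pose) as a vector."""
    return np.tanh(np.concatenate([image_vec, viewpoint]) @ W_repr)

def scene_representation(observations):
    """Sum the per-view codes: order-independent, works for any number of views."""
    return sum(represent(img, vp) for img, vp in observations)

def render(scene_repr, query_viewpoint):
    """Predict the image seen from a viewpoint that was never observed."""
    return np.tanh(np.concatenate([scene_repr, query_viewpoint]) @ W_gen)

# Two observed views of the same toy "scene": 3-pixel images with 2-D poses.
obs = [
    (np.array([0.9, 0.1, 0.4]), np.array([0.0, 1.0])),
    (np.array([0.2, 0.8, 0.5]), np.array([1.0, 0.0])),
]
scene = scene_representation(obs)
predicted = render(scene, np.array([0.5, 0.5]))  # query a never-observed angle
```

One design point the sketch does capture faithfully: because the per-view codes are simply summed, the scene summary does not depend on the order in which views arrive, and more views simply refine the same representation.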

For now the new system has only been designed to work with computer-generated scenes, not to control a robot’s actions in the real world. But Eslami and his colleagues plan to continue developing the GQN with more complex geometry and situations, hoping that one day fully autonomous robotic understanding of a scene could support any number of industrial applications. Robots could theoretically be trained on one task and redeployed on another without extensive reprogramming. GQN could bring down manufacturing costs, increase production speed and streamline assembly of just about anything cobbled together by robots. “This work is interesting and exciting,” says Joshua Tenenbaum, a professor of cognitive science and computation at the Massachusetts Institute of Technology, although he cautions the technology has a way to go before it sees any practical use. “In my view, this research is still rather far from direct applications,” he notes. “From a strictly practical engineering point of view, the problems it solves can currently be solved as well or better by other means, which are less dependent on pure learning-based methods.”

Tenenbaum, who was not involved with the project, adds, “In the long term this work could help advance the state of robotic perception and control, leading to systems that are more adaptive and autonomous than today's AI technologies.”

As AI advances to the point where machines take on qualities previously exclusive to humans, there are of course dystopian concerns: namely that we will cultivate our own demise at the hands of a smarter, more powerful population of cyber beings, whatever form they may take. And as German philosopher Thomas Metzinger has cautioned for years, creating certain mental states in machines could result in those machines experiencing pain and suffering.

Tenenbaum is not worried. “Any fear of developing computers that are ‘smarter’ than us, in the practically accessible future, are unfounded,” he says. “The system presented here is a noteworthy advance over previous inverse-graphics systems, but it is far from capturing the perception abilities that even young children possess. It also requires vast quantities of training data, which children do not, suggesting that its learning abilities are not nearly as powerful as those of human beings.”

Computer science founding father Alan Turing once said a computer could only be called intelligent if it could deceive a person into believing it was human. Any true success on Turing’s test would require a machine that exhibits general intelligence—one that can do calculus, tie shoes and cook supper, all the things that humans do—a goal that is still nothing more than a futurist fantasy for now.
ABOUT THE AUTHOR(S)

Bret Stetka

Bret Stetka is an editorial director at Medscape (a subsidiary of WebMD) and a frequent contributor to Scientific American. His writing has appeared in Wired and online for the Atlantic and NPR.