Developing Perceptive Machines that See and Reason Like Humans

The National Science Foundation has awarded computer scientist Subhransu Maji at the University of Massachusetts Amherst its Faculty Early Career Development (CAREER) award, a five-year, $545,586 grant, to support his work in computer vision and artificial intelligence (AI).

“My main research aims are to teach machines to reason, learn, take intelligent decisions and make sense of the tremendous amount of information we can gather from visual data,” says Maji. “It’s a hard problem but an incredibly useful one if you can master it. I believe it is tremendously important for robots to have a way of understanding the visual world.”

Machines now are being asked to analyze and “understand” visual data from many sources such as consumer and satellite cameras, video cameras, 3D systems and depth sensors such as those used in navigation and calculating distance, Maji notes. “Every autonomous car has this, for example,” he says, “but these systems have a long way to go to be able to see and reason like a human in order to navigate in challenging environments.”

At present, Maji collaborates with ecologists who use weather radar to extract biological information while screening out weather-related data, an approach that can be a useful tool for understanding bird migration, for example.

There are more casual applications, as well. “You can now take a picture of a bird and have a computer tell you which one of the 700 species found in North America it is, to about 80 percent accuracy. It’s not as good as a human expert, but it’s a very useful tool for many of us,” he notes.

“There’s a group in Denmark that has built a model that can recognize 6,000 or 7,000 different kinds of fungi. I’ve heard of other groups who do food recognition, shoes and fashion. It’s fun to see what people come up with.” Maji co-organizes fine-grained visual categorization (FGVC) workshops where teams from academia and industry collaborate on building data sets and models to tackle such problems.

His larger project integrates research, teaching and outreach to address challenges such as computer vision system deployment in areas like social media, health care, robotics and ecology, among others, Maji says. He plans to develop computer architectures that are “substantially more accurate and capable of extracting detailed information from perceptual data across different modalities,” he notes, with an emphasis on computer vision systems that can reason about data in ways interpretable by humans.

“These are things that we are able to do, but machines can’t yet,” he notes. Maji plans to develop a new class of graph-based and factorized architectures for machines to accomplish 3D shape and spatio-temporal analysis that “provide better tradeoffs between computational cost, memory overhead and accuracy, than existing models.”

One reason we would like machines to reason about the world the way we do is that we will then be able to collaborate with them, and humans don’t always want the robot to behave in the same way all the time, Maji says. “Things change, and the notion of explaining your behavior and having the machine learn from the user is something I worry about. Humans are really good at understanding from small amounts of data, and we can explain each other’s behavior and interpret it, but machines are in the infancy of doing this.”

His research will support a number of Ph.D., M.S. and undergraduate students. His collaborators include computer scientist Dan Sheldon in the Dark Ecology project, fellow computer vision researcher Erik Learned-Miller, and computer graphics researchers Rui Wang and Evangelos Kalogerakis in the College of Information and Computer Sciences. Maji says, “If computer vision becomes good enough and fast enough, it will change the way we behave in areas like consuming content and shopping.”

The CAREER grant is the NSF’s highest award in support of junior faculty who exemplify the role of teacher-scholars through outstanding research, excellent education and the integration of both.

