The hidden geometry of learning: Neural networks think alike

March 27th, 2024

This story is by Ian Scheffler. Read more at Penn Engineering Today.

New research by Penn engineers illuminates the inner workings of neural networks, opening the possibility of developing hyper-efficient algorithms that could classify images in a fraction of the time.

Penn Engineers have uncovered an unexpected pattern in how neural networks—the systems leading today’s AI revolution—learn, suggesting an answer to one of the most important unanswered questions in AI: why these methods work so well.

Inspired by biological neurons, neural networks are computer programs that take in data and train themselves by repeatedly making small modifications to the weights, or parameters, that govern their output, much like neurons adjusting their connections to one another. The final result is a trained model that allows the network to make predictions on data it has not seen before. Neural networks are used today in essentially all fields of science and engineering, from medicine to cosmology, identifying potentially diseased cells and discovering new galaxies.
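To make that training process concrete, the sketch below (purely illustrative, not from the paper) shows a tiny network nudging its weights step by step to reduce its error on toy data; the data, layer sizes, and learning rate are all arbitrary assumptions.

```python
# Minimal sketch of neural-network training: a small network repeatedly
# makes small adjustments to its weights to reduce its prediction error.
# All sizes and numbers here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 2-D points labeled by which side of a curve they fall on.
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] > 0).astype(float).reshape(-1, 1)

# One hidden layer of 16 units with randomly initialized weights.
W1 = rng.normal(scale=0.5, size=(2, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5  # learning rate: how large each small adjustment is
for step in range(2000):
    # Forward pass: compute the network's current predictions.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Cross-entropy loss measures how wrong the predictions are.
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

    # Backward pass: gradients tell each weight which way to move.
    dz2 = (p - y) / len(X)
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T * (1 - h ** 2)
    dW1 = X.T @ dh
    db1 = dh.sum(axis=0)

    # The "small modifications": nudge every weight against its gradient.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

    if step % 500 == 0:
        print(f"step {step}: loss {loss:.3f}")

# After training, the network can predict labels for points it never saw.
```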

In a new paper published in the Proceedings of the National Academy of Sciences (PNAS), Pratik Chaudhari, assistant professor in electrical and systems engineering (ESE) and core faculty at the General Robotics, Automation, Sensing and Perception (GRASP) Lab, and co-authors show that neural networks, no matter their design, size or training recipe, follow the same route from ignorance to truth when presented with images to classify. 

“Suppose the task is to identify pictures of cats and dogs,” says Chaudhari. “You might use the whiskers to classify them, while another person might use the shape of the ears—you would presume that different networks would use the pixels in the images in different ways, and some networks certainly achieve better results than others, but there is a very strong commonality in how they all learn. This is what makes the result so surprising.”

The result not only illuminates the inner workings of neural networks, but also gestures toward the possibility of developing hyper-efficient algorithms that could classify images in a fraction of the time, at a fraction of the cost. Indeed, one of the highest costs associated with AI is the immense computational power required to train neural networks. “These results suggest that there may exist new ways to train them,” says Chaudhari.