
Spring 2023 GRASP SFI: Michael Chang, University of California, Berkeley, “Neural Software Abstractions: Learning Abstractions for Automatically Modeling and Manipulating Systems”

March 22 @ 3:00 pm - 4:00 pm

This was a hybrid event with in-person attendance in Levine 307 and virtual attendance…


The way neural networks are studied and built today bears a striking resemblance to how electronic circuits were studied and built 100 years ago: back then we manually designed electronic circuits for specific tasks, while now we train neural circuits for specific tasks. The retrieval-augmented transformer, for example, is the deep-learning analogue of the von Neumann architecture of the 1940s. In this talk, I explore the question: what did it take to scale electronic circuits to the modern software stack we have today, and how can we apply a similar approach to scaling our neural circuits to create the future of learning software? I will discuss four abstractions humans invented for scaling electronic circuits into programmed software — the digital abstraction, data abstraction, function abstraction, and problem abstraction. I will deconstruct the underlying principles that made these abstractions powerful, and show how we can apply these principles to design neural networks that exhibit many of the same properties and benefits we get from discreteness, variables, reusable computations, and hierarchical organization, without explicitly defining these abstractions ourselves. Just as programmed software enabled humans to manually model and manipulate systems, this line of work, which I call “neural software abstractions,” aims to build learning software for automatically modeling and manipulating systems. I conclude by situating neural software abstractions in the broader context of the relationship between AI and humans, arguing for research on both neural software abstractions and adaptive human-computer interfaces as two complementary directions toward building AI that enables humans to do more with less.


Michael Chang


Michael is a graduating PhD student at UC Berkeley advised by Sergey Levine and Tom Griffiths. Before his PhD he was an undergraduate researcher in Josh Tenenbaum’s group at MIT. He has also interned at DeepMind, Meta, Jürgen Schmidhuber’s group, and Honglak Lee’s group. His research goals are twofold: (1) to enable machines to construct their own abstractions for automatically modeling and manipulating systems and (2) to develop adaptive human-computer interfaces that bridge the gap between modalities that humans understand (e.g. language, diagrams) and modalities that computers understand (i.e. code).




Levine 307
3330 Walnut St
Philadelphia, PA 19104 United States