Machine learning algorithms excel primarily in settings where an engineer can first reduce the problem to a particular function (e.g. an image classifier), and then collect a substantial number of labeled input-output pairs for that function. In stark contrast, humans are capable of learning from streams of raw sensory data with minimal external instruction. In this talk, I will argue that, in order to build intelligent systems that are as capable as humans, machine learning models should not be trained in the context of one particular application. Instead, we should design systems that are versatile, that can learn in unstructured settings without detailed human-provided labels, and that can accomplish many tasks, all while processing high-dimensional sensory inputs. To do so, these systems must be able to actively explore and experiment, collecting data themselves rather than relying on human annotation.
My talk will focus on two key aspects of this goal: versatility and self-supervision. I will first show how we can move away from hand-designed, task-specific representations of a robot's environment by enabling the robot to learn high-capacity models, such as deep networks, that represent complex skills from raw pixels. Further, I will present an algorithm that learns deep models that can be rapidly adapted to different objects, new visual concepts, or varying environments, leading to versatile behaviors in the real world. Beyond such versatility, a hallmark of human intelligence is self-supervised learning. I will discuss how we can allow a robot to learn by "playing" with objects in its environment without any human supervision. From this experience, the robot can acquire a visual predictive model of the world that can be used to maneuver many different objects to varying positions. In all settings, our experiments on simulated and real robot platforms demonstrate the ability to scale to complex, vision-based skills with novel objects.
Chelsea Finn is currently a research scientist at Google Brain and a post-doc at UC Berkeley, and will join the faculty at Stanford in 2019. She is interested in how learning algorithms can enable machines to develop more general notions of intelligence, allowing them to learn a variety of complex sensorimotor skills in real-world settings. During her PhD at UC Berkeley, she developed deep learning algorithms for concurrently learning visual perception and control in robotic manipulation skills, inverse reinforcement learning methods for scalable acquisition of nonlinear reward functions, and meta-learning algorithms that enable fast, few-shot adaptation in both visual perception and deep reinforcement learning. She received her Bachelor's degree in EECS from MIT. Her research has been recognized through an NSF graduate fellowship, a Facebook fellowship, and the C.V. Ramamoorthy Distinguished Research Award, and her work has been covered by various media outlets, including the New York Times, Wired, and Bloomberg.
For links to papers, videos, and open-sourced code and data, see: https://people.eecs.berkeley.edu/~cbfinn/