Humans demonstrate a remarkable ability to generalize their knowledge and skills to new, unseen scenarios. One of the primary reasons is that they continually learn by acting in the environment and adapting to novel circumstances. This is in sharp contrast to current machine learning algorithms, which are incredibly narrow: they perform only the tasks they are explicitly trained for. The reason is their reliance on human labels, which forces training to be done once, ahead of time, rather than continuously throughout the life of the agent.
In this talk, I will present our initial efforts toward formulating artificial agents that can continually learn to perceive and perform complex sensorimotor tasks across a stream of new scenarios. I will focus on two key aspects: representation and action. Our embodied agent can 1) learn to represent its sensory input via self-supervision, and 2) map its learned representations to motor outputs via curiosity. This combination should equip the agent with the skills required to plan actions for solving complex tasks. I will discuss results on self-supervised representation learning for visual input; an agent learning to play video games driven purely by curiosity; and a real robot learning to manipulate ropes, navigate office environments via self-supervision, and perform pick-and-place interactions to learn about objects.
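To make the curiosity idea concrete, a common formulation (a sketch of the general technique, not necessarily the exact method presented in the talk) treats curiosity as an intrinsic reward equal to the prediction error of a learned forward model: the agent predicts the next state's features from the current features and action, and is rewarded where its prediction fails. The dimensions, linear model, and learning rate below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT = 8  # hypothetical feature dimensionality of the learned representation
ACT = 2   # hypothetical action dimensionality

# Linear forward model (an illustrative stand-in for a neural network):
# phi(s_{t+1}) is predicted as W @ [phi(s_t); a_t]
W = rng.normal(scale=0.1, size=(FEAT, FEAT + ACT))

def intrinsic_reward(phi_s, a, phi_s_next):
    """Curiosity reward = squared prediction error of the forward model."""
    pred = W @ np.concatenate([phi_s, a])
    return 0.5 * float(np.sum((pred - phi_s_next) ** 2))

def update_forward_model(phi_s, a, phi_s_next, lr=1e-2):
    """One gradient step on the squared prediction error."""
    global W
    x = np.concatenate([phi_s, a])
    err = W @ x - phi_s_next
    W -= lr * np.outer(err, x)

# A transition the model has seen many times becomes predictable,
# so the curiosity reward for it decays; novel transitions stay rewarding.
phi_s = rng.normal(size=FEAT)
a = rng.normal(size=ACT)
phi_next = rng.normal(size=FEAT)

r_before = intrinsic_reward(phi_s, a, phi_next)
for _ in range(200):
    update_forward_model(phi_s, a, phi_next)
r_after = intrinsic_reward(phi_s, a, phi_next)
```

Because the reward shrinks exactly where the agent's model improves, maximizing it pushes the agent toward states it cannot yet predict, which is what drives exploration without any human-provided labels.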