Abstract: The field of reinforcement learning is concerned with the problem of
learning efficient behavior from experience. In real life
applications, gathering this experience is time-consuming and possibly
costly, so it is critical to derive algorithms that can learn
effective behavior with bounds on the experience necessary to do so.
This talk presents our successful efforts to create such algorithms
within a framework known as “PAC-MDP”. I’ll summarize the framework,
our algorithms, their formal validations, and their empirical
evaluations in robotic and videogame testbeds.