BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//GRASP Lab - ECPv6.3.7//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:GRASP Lab
X-ORIGINAL-URL:https://www.grasp.upenn.edu
X-WR-CALDESC:Events for GRASP Lab
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20190310T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20191103T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20190222T110000
DTEND;TZID=America/New_York:20190222T120000
DTSTAMP:20240426T031943
CREATED:20210120T155505Z
LAST-MODIFIED:20210120T155506Z
UID:9684-1550833200-1550836800@www.grasp.upenn.edu
SUMMARY:CIS/GRASP Seminar: Wen Sun\, CMU\, “Towards Generalization and Efficiency in Reinforcement Learning”
DESCRIPTION:ABSTRACT\nIn classic supervised machine learning\, a learning agent behaves as a passive observer: it receives examples from an external environment over which it has no control and then makes predictions. Reinforcement Learning (RL)\, on the other hand\, is fundamentally interactive: an autonomous agent must learn how to behave in an unknown and possibly hostile environment by actively interacting with that environment to collect useful feedback. One central challenge in RL is how to explore an unknown environment and collect useful feedback efficiently. Most recent practical RL success stories rely on random exploration\, which requires a large number of interactions with the environment before the agent can learn anything useful. The theoretical RL literature has developed more sophisticated algorithms for efficient learning. However\, the sample complexity of these algorithms must scale exponentially with key parameters of the underlying system\, such as the dimensionality of the state vector\, which prohibits direct application of these theoretically elegant RL algorithms to large-scale applications. Without any further assumptions\, RL is hard\, both in practice and in theory.\n\nIn this work\, we improve generalization and efficiency on RL problems by introducing extra sources of help and additional assumptions. The first contribution of this work is improving RL sample efficiency via Imitation Learning (IL). Imitation Learning reduces policy improvement to classic supervised learning. We study\, both in theory and in practice\, how one can imitate experts to reduce sample complexity compared to RL approaches. The second contribution of this work is exploiting the underlying structure of RL problems via model-based learning approaches. While there exist efficient model-based RL approaches specialized for specific RL problems (e.g.\, tabular MDPs\, Linear Quadratic Systems)\, we develop a unified model-based algorithm that generalizes a large number of RL problems that were often studied independently in the literature. We also revisit\, from a theoretical perspective\, the long-standing debate on whether model-based RL is more efficient than model-free RL\, and demonstrate that model-based RL can be exponentially more sample efficient than model-free RL. To the best of our knowledge\, this is the first result that separates general model-based and model-free approaches.
URL:https://www.grasp.upenn.edu/events/wen-sun/
CATEGORIES:Seminars
END:VEVENT
END:VCALENDAR