Spring 2019 GRASP Seminar Series: Georgia Gkioxari, UC Berkeley, "Beyond 2D Semantic Recognition: 3D representations for object understanding and embodied question-answering"

ABSTRACT
Rapid advances in 2D perception have led to systems that accurately detect objects in real-world images. However, these systems make predictions in 2D, ignoring the 3D structure of the world. Concurrently, advances in 3D shape prediction have mostly focused on synthetic benchmarks and isolated objects. In this talk, I will present our efforts to unify advances in these two areas. In particular, I will present our recent work on augmenting state-of-the-art 2D recognition systems with the ability to infer 3D shapes from real-world images in the wild. I will then turn to embodied question answering, where 3D shape cues are fused with 2D cues in an end-to-end manner and used for both semantic navigation and question answering.

Presenter's biography

Georgia Gkioxari is a research scientist at Facebook AI Research (FAIR). She received a PhD in computer science and electrical engineering from the University of California at Berkeley under the supervision of Jitendra Malik in 2016. Her research interests lie in computer vision, with a focus on object and person recognition from static images and videos. In 2017, Georgia received the Marr Prize at ICCV for "Mask R-CNN."