The key to figuring out “how” to perceive lies in modeling the underlying “structure” of the problem. I propose that, for reasoning about human environments, it is the humans themselves that are the true underlying structure of the problem. This holds not only for tasks that involve humans explicitly (such as human activity detection), but also for tasks in which a human was never observed! In this talk, I will present learning algorithms that model this underlying structure.
Finally, I will present several robotic applications, ranging from single-image-based aerial vehicle navigation to personal robots performing tasks such as unloading items from a dishwasher, loading a fridge, arranging a disorganized room, and carrying out assistive tasks in response to human activities.