There is growing interest in using data-driven methods to scale up the manipulation capabilities of robots for handling a large variety of objects. Many of these methods are oblivious to the notion of objects: they learn monolithic policies over the whole scene in image space. As a result, they do not generalize well to different scenes, viewpoints, and lighting conditions. In addition, such models cannot be combined with other components and constraints without re-training. In this talk, I will present our approach for learning object-centric models trained on 3D depth data. I will show how these models can be combined with one another to accomplish tasks with unseen objects in unseen environments. In particular, I will cover our work on grasping and segmenting unknown objects, obstacle avoidance, and task planning for unknown-object rearrangement.