vision. In this talk I will consider the problem of detecting objects
from a generic category, such as people or cars, in static images. This
is a difficult problem because objects in such categories can vary
greatly in appearance. For example, people wear different clothes and
take a variety of poses, while cars come in various shapes and colors.
We have built an object detection system that addresses this challenge
using mixtures of deformable part models. The system is both highly
efficient and accurate, achieving state-of-the-art results on the
PASCAL object detection benchmarks. We train our models from weakly
labeled data using a discriminative procedure that we call latent SVM.
A latent SVM leads to a non-convex training problem. However, a latent
SVM is semi-convex and the training problem becomes convex once latent
information is specified for the positive examples. This leads to an
iterative algorithm for solving the training problem. I will also
discuss a general framework for object detection using rich visual
grammars.
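
The iterative training idea can be illustrated with a minimal sketch. It
is not the system described in this talk: the toy 2-D features, the
function names (best_latent, train_latent_svm), the plain subgradient
solver, and all parameter values are assumptions chosen for
illustration. Each example is a list of candidate feature vectors, one
per latent value z, and is scored as the maximum over z of beta . phi.

```python
from typing import List, Tuple

Vector = List[float]
Example = List[Vector]  # one candidate feature vector per latent value z


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def best_latent(beta: Vector, candidates: Example) -> Tuple[float, int]:
    """Return (max_z beta . phi(x, z), index of the maximizing z)."""
    scores = [dot(beta, phi) for phi in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return scores[best], best


def train_latent_svm(positives: List[Example], negatives: List[Example],
                     dim: int, C: float = 1.0, rounds: int = 5,
                     steps: int = 200, lr: float = 0.01) -> Vector:
    """Alternate between fixing latent values and solving a convex problem.

    Hypothetical toy solver: the convex step uses plain subgradient
    descent on 0.5 * ||beta||^2 + C * (sum of hinge losses).
    """
    beta = [0.0] * dim
    for _ in range(rounds):
        # Step 1: specify the latent value of every positive example.
        # With these fixed, each positive score is linear in beta.
        fixed = [cands[best_latent(beta, cands)[1]] for cands in positives]

        # Step 2: optimize beta. Negatives keep the max over z, which is
        # a convex function of beta, so the whole objective is convex.
        for _ in range(steps):
            grad = list(beta)  # gradient of the regularizer
            for phi in fixed:
                if dot(beta, phi) < 1.0:  # positive inside the margin
                    grad = [g - C * p for g, p in zip(grad, phi)]
            for cands in negatives:
                score, z = best_latent(beta, cands)
                if score > -1.0:  # negative inside the margin
                    grad = [g + C * p for g, p in zip(grad, cands[z])]
            beta = [b - lr * g for b, g in zip(beta, grad)]
    return beta


# Toy data (made up): two positives and two negatives, each with two
# candidate latent placements described by 2-D feature vectors.
positives = [[[2.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [1.5, 0.5]]]
negatives = [[[-1.0, 0.0], [-2.0, 0.5]], [[-1.5, -0.5], [-1.0, 0.2]]]
beta = train_latent_svm(positives, negatives, dim=2)
```

On this toy data the second positive starts with an uninformative
latent value (the zero vector) and is relabeled in a later round once
beta has been updated, which is the point of alternating the two steps.
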
Joint work with Ross Girshick, David McAllester and Deva Ramanan.