
GRASP Seminar Series: Spring 2006

March 3, 11:00 a.m., Wu & Chen Auditorium

Andrew Zisserman
Oxford University

"Object Recognition Using Bags of Visual Words"

Abstract: There has been much recent research activity - and much recent success - in recognizing particular objects and object categories (such as cars, faces, motorbikes) in images and videos. The success has come from representing objects by sets of local iconic image patches, where each patch may be thought of as a "visual word" for describing part of the object. Surprisingly, object categories can be recognized without including the spatial organization/location of the patches, and these models are referred to as a "bag of words" in analogy with similar models in the statistical text literature.

In the first part of the talk I'll describe an approach to searching for and localizing all the occurrences of an object in a video. The object is represented by a set of visual words that enable recognition to proceed successfully despite changes in viewpoint, illumination and partial occlusion. By pushing this analogy with textual representation, efficient methods from text retrieval can be employed to retrieve shots containing the object in the manner of a Google search of the web. The methods will be demonstrated on several feature-length films.

In the second part, I'll describe how object categories can be learnt from sets of visual words by fitting a probabilistic Latent Semantic Analysis (pLSA) model - a model again borrowed from the statistical text literature.
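
As a rough illustration of the text-retrieval analogy in the first part of the abstract, the sketch below ranks video shots against a query using tf-idf weighting and an inverted index. It is not the speaker's implementation. It assumes local patches have already been quantized into integer visual word IDs, and the shot names, word IDs and function names (build_index, tfidf_vector, search) are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tfidf_vector(words, idf):
    """Return an L2-normalized tf-idf vector (dict: word -> weight)."""
    counts = Counter(words)
    total = len(words)
    vec = {w: (c / total) * idf.get(w, 0.0) for w, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    if norm == 0.0:
        return {}
    return {w: v / norm for w, v in vec.items() if v}


def build_index(shots):
    """shots maps a shot id to the list of visual word ids seen in it."""
    n = len(shots)
    doc_freq = Counter()
    for words in shots.values():
        doc_freq.update(set(words))
    # Words that occur in few shots are more discriminative.
    idf = {w: math.log(n / df) for w, df in doc_freq.items()}

    vectors = {}
    inverted = defaultdict(set)  # word -> shots containing it
    for shot_id, words in shots.items():
        vectors[shot_id] = tfidf_vector(words, idf)
        for w in set(words):
            inverted[w].add(shot_id)
    return idf, vectors, inverted


def search(query_words, idf, vectors, inverted):
    """Rank shots by cosine similarity to the query's tf-idf vector."""
    q = tfidf_vector(query_words, idf)
    # The inverted index restricts scoring to shots sharing a query word.
    candidates = set()
    for w in q:
        candidates |= inverted.get(w, set())
    scores = []
    for shot_id in candidates:
        v = vectors[shot_id]
        score = sum(weight * v.get(w, 0.0) for w, weight in q.items())
        scores.append((shot_id, score))
    return sorted(scores, key=lambda item: -item[1])


# Hypothetical toy data: three shots, each a bag of visual word ids.
shots = {
    "shot_a": [1, 1, 2, 3],
    "shot_b": [4, 5, 5, 6],
    "shot_c": [1, 2, 2, 7],
}
idf, vectors, inverted = build_index(shots)
results = search([1, 2], idf, vectors, inverted)
for shot_id, score in results:
    print(shot_id, round(score, 3))
```

Because only shots that share at least one visual word with the query are scored, the cost of a query depends on the lengths of the relevant inverted lists rather than on the total number of shots, which is the property that makes web-style search fast.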

Biography: Andrew Zisserman is a Professor in the Department of Engineering Science, University of Oxford. He graduated from the University of Cambridge with a degree in Theoretical Physics, and for the last 20 years has carried out research in the area of Computer Vision. He has co-authored and co-edited several books in this area. The most recent, "Multiple View Geometry in Computer Vision" (with Richard Hartley), has now been published as a second edition in paperback and also translated into Chinese. He has been a program co-chair and general co-chair of ICCV. Software from his research group was marketed by the spin-out company 2d3 (www.2d3.com) as a camera tracker for the special effects industry. This was awarded a Technical Emmy in 2002. He has been awarded the IEEE Marr Prize three times.



GRASP Laboratory
Site maintained by graspadm@grasp.cis.upenn.edu
Last update: 24 February, 2006