Keynote Talks

The keynotes at BMVC will be delivered on the mornings of Tuesday September 4 and Wednesday September 5. Further information on scheduling is available in the conference programme.

People in Motion: Pose, Action and Communication

Prof. Stan Sclaroff
Boston University, MA
Tuesday 4th September

This talk will give an overview of some of the research in the Image and Video Computing Group at Boston University related to tracking, analysis, recognition and retrieval of images and video based on humans and their actions. First, efficient methods for inference of human pose will be presented. Linearly augmented tree models are proposed that enable efficient scale and rotation invariant matching. In another approach, articulated pose estimation with loopy graph models is made efficient via a branch-and-bound strategy for finding the globally optimal pose. Second, methods for learning human action models from Web images and video will be presented; the methods require no human intervention other than the action keywords to be used to form text queries to Web image and video search engines. A Multiple Instance Learning framework for exploiting properties of the scene, objects, and humans in video is also proposed. Third, work towards automatic recognition and retrieval of American Sign Language (ASL) in video databases will be presented. The goal is to enable users to search ASL video content simply by video-recording a query sign and relying on computer-based sign-recognition for lookup.

Collaborators in these works include (in alphabetical order): Vassilis Athitsos, Nazlı İkizler-Cinbiş, Hao Jiang, He Kun, Shugao Ma, Carol Neidle, Joan Poole-Nash, Ashwin Thangali, Tai-peng Tian, and others.


Visual Tracking in the 21st Century

Prof. Jiri Matas
Czech Technical University (CTU), Prague
Tuesday 4th September

Visual tracking is an old area that has recently seen a surge in activity. The interest has been fueled by progress in related fields such as detection, segmentation and optical flow, as well as by application-driven demand and the increase in available computing power.

The published tracking methods differ in many aspects, such as speed, the complexity of the model of the tracked entity, the (geometric) transformations assumed, the mode of operation (causal or non-causal), the ability to adapt and learn, the robustness to occlusion, and the assumptions made about the observer. I will review the datasets used in recent publications and show that the "tracker space" is still wide open, with large areas to be explored.

I will then present three trackers that my collaborators and I have developed: the TLD tracker, the Flock-of-Trackers and the Zero-Shift-Point tracker. They operate at very different points in the speed-robustness-flexibility space, each close to the "convex hull" of published methods. I will focus on an aspect the three trackers share: mechanisms for predicting and handling tracking errors. Such mechanisms contribute to tracker robustness, which will be demonstrated live.