
Multi-view Tracking in Crowded Scenes


Occlusion and limited visibility in crowded scenes make it very difficult to track individual people correctly and consistently, and the problem is particularly hard to tackle with a single camera. We have developed a multi-view approach to tracking people in crowded scenes where people may partially or completely occlude each other. The approach uses multiple views in synergy, so that information from all views is combined to detect objects. To achieve this, we develop a planar homography constraint that resolves occlusions and robustly determines the locations on the ground plane corresponding to the feet of the people. To find tracks, we obtain feet regions over a window of frames and stack them to create a space-time volume. Feet regions belonging to the same person form contiguous spatio-temporal regions, which are clustered using a graph cuts segmentation approach. Figure 1 below shows the various steps in our algorithm.
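The two main steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: each view supplies a per-pixel foreground likelihood map, each homography maps ground-plane coordinates to pixel (x, y) coordinates in one view, the views are fused by multiplying likelihoods, and plain connected-component labeling of the space-time volume stands in for the graph cuts segmentation. The function names (apply_homography, fuse_ground_occupancy, label_space_time) and the threshold parameter are hypothetical.

```python
from collections import deque

import numpy as np


def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]


def fuse_ground_occupancy(fg_maps, homographies, ground_pts):
    """Score ground-plane points by multiplying per-view foreground likelihoods.

    Assumption: a ground point is likely a foot location only if its
    projection lands on foreground in every view, so the score is a product.
    Points projecting outside a view contribute zero.
    """
    ground_pts = np.asarray(ground_pts, dtype=float)
    score = np.ones(len(ground_pts))
    for fg, H in zip(fg_maps, homographies):
        xy = np.rint(apply_homography(H, ground_pts)).astype(int)
        rows, cols = fg.shape
        inside = (
            (xy[:, 0] >= 0) & (xy[:, 0] < cols)
            & (xy[:, 1] >= 0) & (xy[:, 1] < rows)
        )
        vals = np.zeros(len(ground_pts))
        vals[inside] = fg[xy[inside, 1], xy[inside, 0]]
        score *= vals
    return score


def label_space_time(volume, threshold=0.5):
    """Label contiguous regions of a (frames, rows, cols) occupancy volume.

    Assumption: 26-connected component labeling is used here as a simple
    stand-in for graph cuts segmentation. Returns (labels, region_count).
    """
    occupied = volume > threshold
    labels = np.zeros(volume.shape, dtype=int)
    offsets = [
        (dt, dy, dx)
        for dt in (-1, 0, 1) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dt, dy, dx) != (0, 0, 0)
    ]
    count = 0
    for seed in zip(*np.nonzero(occupied)):
        if labels[seed]:
            continue  # already part of an earlier region
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:  # breadth-first flood fill through space and time
            t, y, x = queue.popleft()
            for dt, dy, dx in offsets:
                n = (t + dt, y + dy, x + dx)
                in_bounds = all(0 <= c < s for c, s in zip(n, volume.shape))
                if in_bounds and occupied[n] and not labels[n]:
                    labels[n] = count
                    queue.append(n)
    return labels, count
```

In this sketch, stacking the fused ground-plane maps from consecutive frames with np.stack gives the volume passed to label_space_time, and each resulting label corresponds to one candidate track. Connected components will merge two people whose feet regions touch, which is one reason a proper segmentation method such as graph cuts is preferable in practice.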

Figure 1: Tracking results on a sequence captured with four cameras.

Figure 2: Tracking results on a sequence captured with four cameras.

Data Set


Related Publications

Saad M. Khan and Mubarak Shah, A Multiview Approach to Tracking People in Crowded Scenes using a Planar Homography Constraint, 9th European Conference on Computer Vision, Graz, Austria, 2006.

Saad M. Khan and Mubarak Shah, Tracking Multiple Occluding People by Localizing on Multiple Scene Planes, IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 31, Issue 3, pp. 505-519, March 2009.