Scene Understanding by Statistical Modeling of Motion Patterns
Paper: Scene Understanding by Statistical Modeling of Motion Patterns
Contact: Imran Saleemi, Lance Hartung, Mubarak Shah
Introduction
We present a novel method for the discovery and statistical representation of motion patterns in a scene observed by a static camera. Related methods that learn patterns of activity rely on trajectories obtained from object detection and tracking systems, which are unreliable in complex scenes of crowded motion. We propose a mixture model representation of salient patterns of optical flow, and present an algorithm for learning these patterns from dense optical flow in a hierarchical, unsupervised fashion. Using low-level cues of noisy optical flow, K-means is employed to initialize a Gaussian mixture model for temporally segmented clips of video. The components of this mixture are then filtered, and instances of motion patterns are computed using a simple motion model by linking components across space and time. Motion patterns are then initialized, and membership of instances in different motion patterns is established using the KL divergence between mixture distributions of pattern instances. Finally, a pixel-level representation of motion patterns is proposed by deriving the conditional expectation of optical flow. Results of extensive experiments are presented for multiple surveillance sequences containing numerous patterns involving both pedestrian and vehicular traffic.
 Video Sequence:
 Static camera
 Structured scene
 High density crowds
 Multiple flows
 Goal:
 Learn patterns of motion
 Statistical distribution
 Applications:
 Anomaly detection, prior motion model, persistent tracking
Figure: Examples of scenes to be analyzed and desirable patterns
 Compute optical flow
 Define
 A single Gaussian approximates a motion blob
 Gaussian component estimation
 Temporal quantization
 K-means clustering in 4D space
 No optimization
 Insensitive to choice of K
 Numerous, low variance clusters
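
The component estimation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each flow vector of a clip is a 4-tuple of position and flow (x, y, u, v), uses a deterministic spread initialization, and summarizes each cluster with a diagonal variance instead of a full covariance. The names `kmeans` and `init_components` are hypothetical.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means on tuples; returns (centers, labels)."""
    n = len(points)
    # Assumption: deterministic init, k points spread evenly through the list.
    centers = [points[(i * n) // k] for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: nearest center in Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return centers, labels

def init_components(points, k):
    """Turn each K-means cluster into one Gaussian component (diagonal variance)."""
    _, labels = kmeans(points, k)
    components = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue
        mean = tuple(sum(d) / len(members) for d in zip(*members))
        var = tuple(sum((v - m) ** 2 for v in d) / len(members)
                    for d, m in zip(zip(*members), mean))
        components.append({"weight": len(members) / len(points),
                           "mean": mean, "var": var})
    return components
```

Because the clusters only seed the mixture and no further optimization is run, a generous K simply yields more, tighter components.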
 Component Filtering

 Optical flow is noisy
 Filter high directional variance components
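
One way to picture the filtering step is shown below. The outline does not say how directional variance is measured, so this sketch assumes circular variance (one minus the length of the mean resultant vector) over the flow directions assigned to a component. The threshold `max_var` and the `"angles"` field are hypothetical.

```python
import math

def circular_variance(angles):
    """Circular variance in [0, 1]: 0 for identical directions, near 1 for scattered ones."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return 1.0 - math.hypot(c, s)

def filter_components(components, max_var=0.3):
    """Keep components whose member flow directions are coherent.

    Assumption: each component is a dict with an "angles" list (radians).
    """
    return [comp for comp in components
            if circular_variance(comp["angles"]) <= max_var]
```

Using a circular statistic avoids the wrap-around problem of treating angles near 0 and 2π as far apart.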

 Pattern Instance Estimation
 Sequences of components form spatiotemporal worms (instances)
 Pattern instances are temporally bounded
 A pattern itself is periodic
 Intercomponent Transition
 Pattern instance occurs over several clips
 Two components i and j form an instance if,
 i and j are temporally proximal,
 j is ‘reachable’ from i
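
The two conditions above can be illustrated with a small sketch. It assumes a constant-velocity motion model: component i is pushed forward by its mean flow, and j counts as reachable when it lies close to the predicted position. The Gaussian-shaped score, the field names (`"pos"`, `"vel"`, `"clip"`), and the parameters `sigma`, `max_gap`, and `min_score` are all assumptions made for illustration, not values from the source.

```python
import math

def transition_score(comp_i, comp_j, dt=1.0, sigma=10.0):
    """Score in (0, 1] for how well j matches i's predicted position after dt clips."""
    px = comp_i["pos"][0] + dt * comp_i["vel"][0]
    py = comp_i["pos"][1] + dt * comp_i["vel"][1]
    d2 = (comp_j["pos"][0] - px) ** 2 + (comp_j["pos"][1] - py) ** 2
    return math.exp(-0.5 * d2 / sigma ** 2)

def forms_instance(comp_i, comp_j, max_gap=1, min_score=0.5):
    """True if j is temporally proximal to i and reachable from i."""
    gap = comp_j["clip"] - comp_i["clip"]
    if not (0 < gap <= max_gap):  # temporal proximity: j follows i closely
        return False
    return transition_score(comp_i, comp_j, dt=gap) >= min_score
```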
 Instance Learning

 Define a planar graph G = (V, E)
 V = {components from all video clips}
 E = {probability value if temporally proximal}
 Weakly connected component analysis on G
 Connected components are pattern instances
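
The grouping of components into instances can be sketched with a union-find structure. Treating every retained edge as undirected is what makes the result the weakly connected components of G. The edge format `(i, j, probability)` and the cutoff `min_prob` are assumptions for this sketch; the source only states that edges carry a probability value.

```python
def connected_components(n, edges, min_prob=0.5):
    """Group vertices 0..n-1 into connected components.

    edges: iterable of (i, j, probability); edges below min_prob are ignored.
    Returns a sorted list of sorted vertex lists, one per pattern instance.
    """
    parent = list(range(n))

    def find(a):
        # Find the root of a, halving the path as we go.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i, j, p in edges:
        if p >= min_prob:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for v in range(n):
        groups.setdefault(find(v), []).append(v)
    return sorted(groups.values())
```

For example, with five components and strong links 0-1 and 1-2 only, components 0, 1, and 2 form one instance while 3 and 4 remain singletons.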

 Figure: Left: One instance each from 4 patterns. Right: More instances for each of the 4 patterns.

 Motion Patterns

 Multiple Instances per pattern
 Each instance is a Gaussian mixture
 KL divergence defines similarity between instances
 Approximate with Monte Carlo sampling
 Graph connected component analysis
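
The KL divergence between two Gaussian mixtures has no closed form, which is why sampling is used. A minimal sketch of the Monte Carlo estimate follows: draw samples from p and average log p(x) - log q(x). For brevity it assumes diagonal covariances, with each mixture given as a list of `(weight, mean, var)` tuples; that representation, the sample count, and the function names are illustrative assumptions.

```python
import math
import random

def gmm_logpdf(x, gmm):
    """Log density of x under a diagonal-covariance Gaussian mixture."""
    terms = []
    for w, mean, var in gmm:
        ll = math.log(w)
        for xi, m, v in zip(x, mean, var):
            ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(ll)
    mx = max(terms)  # log-sum-exp for numerical stability
    return mx + math.log(sum(math.exp(t - mx) for t in terms))

def gmm_sample(gmm, rng):
    """Draw one sample: pick a component by weight, then sample each dimension."""
    _, mean, var = rng.choices(gmm, weights=[c[0] for c in gmm])[0]
    return tuple(rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var))

def kl_monte_carlo(p, q, n=2000, seed=0):
    """Estimate KL(p || q) as the sample mean of log p(x) - log q(x), x ~ p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = gmm_sample(p, rng)
        total += gmm_logpdf(x, p) - gmm_logpdf(x, q)
    return total / n
```

KL divergence is asymmetric, so a similarity between two instances would typically combine both directions, for example by summing `kl_monte_carlo(p, q)` and `kl_monte_carlo(q, p)`; whether the source symmetrizes it is not stated here.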

 Conditional Expectation of flow

 Compute conditional expected orientation / magnitude given a pixel
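
For reference, the standard conditional expectation under a Gaussian mixture is written out below. The notation is an assumption introduced here: $\mathbf{x}$ is the pixel location, $\mathbf{f}$ the flow variables, and component $k$ has weight $w_k$, mean $(\boldsymbol{\mu}_k^{x}, \boldsymbol{\mu}_k^{f})$, and a covariance partitioned into blocks $\Sigma_k^{xx}$, $\Sigma_k^{fx}$, $\Sigma_k^{ff}$. The source's own derivation may differ in detail, particularly in how the circular nature of orientation is handled.

```latex
\begin{align}
  \gamma_k(\mathbf{x}) &=
    \frac{w_k \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_k^{x},\, \Sigma_k^{xx}\right)}
         {\sum_{j} w_j \, \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_j^{x},\, \Sigma_j^{xx}\right)}, \\
  \mathbb{E}\left[\mathbf{f} \mid \mathbf{x}\right] &=
    \sum_{k} \gamma_k(\mathbf{x})
    \left( \boldsymbol{\mu}_k^{f}
         + \Sigma_k^{fx} \left(\Sigma_k^{xx}\right)^{-1}
           \left(\mathbf{x} - \boldsymbol{\mu}_k^{x}\right) \right).
\end{align}
```

Each component contributes its own linear prediction of the flow at the pixel, weighted by how responsible that component is for the pixel's location.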

Imran Saleemi, Lance Hartung, and Mubarak Shah, Scene Understanding by Statistical Modeling of Motion Patterns, IEEE Conference on Computer Vision and Pattern Recognition 2010, San Francisco, CA.