
Online Detection and Classification of Moving Objects Using Progressively Improving Detectors


Major limitations of classification algorithms such as AdaBoost, SVMs, or Naïve Bayes include:

  1. They require a large amount of labeled training data.
  2. The parameters of the classifier are fixed at the end of the training phase, i.e., these classifiers cannot attune themselves to particular detection scenarios after deployment.

To overcome the above-mentioned shortcomings, the algorithm proposed in this project has the following properties: i) it requires minimal training data, ii) it automatically selects training examples online after deployment for continuous improvement, and iii) it achieves near real-time performance (4-5 frames/sec). The algorithm is based on a boosted classification framework in which separate views (features) of the data are used to collect training examples online through co-training, while the combined view (all features) is used to make the classification decisions. Background modeling is used to prune away stationary regions and speed up the classification process. We use global feature representations for robustness.
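As an illustration of the pruning step, a minimal running-average background model can be sketched in NumPy; the update rate `alpha` and the difference threshold are assumed values for illustration, not the ones used in the project.

```python
import numpy as np

def moving_mask(frame, bg, alpha=0.05, thresh=25.0):
    """Flag pixels that differ from a running-average background model,
    then slowly adapt the model (a simple stand-in for the project's
    background modeling step; alpha and thresh are illustrative)."""
    mask = np.abs(frame.astype(float) - bg) > thresh
    bg = (1.0 - alpha) * bg + alpha * frame   # exponential moving average
    return mask, bg

# Only regions flagged by the mask are passed on to the classifier.
bg = np.zeros((30, 30))
frame = np.zeros((30, 30))
frame[10:20, 10:20] = 255.0                   # a "moving" bright patch
mask, bg = moving_mask(frame, bg)
```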

Feature Extraction 

Features for classification are derived from Principal Component Analysis (PCA) of the appearance templates of the training examples. For each object class ci (excluding background), an appearance subspace, represented by a d x mi projection matrix Si, is constructed. mi is chosen such that the retained eigenvectors capture 99% of the variance of the respective subspace. Appearance features for the base learners are obtained by projecting a training example r into the appearance subspace of each object class. For two object classes, the feature vector v of an example is

v = [S_1^T r, S_2^T r]^T,

a vector of length m1 + m2.
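The subspace construction can be sketched in NumPy as follows; the 99% variance criterion and the concatenated projection follow the text, while the random templates and the helper names are placeholders.

```python
import numpy as np

def build_subspace(templates, var_keep=0.99):
    """PCA appearance subspace for one class.
    templates: (n, d) matrix of vectorized appearance templates."""
    mu = templates.mean(axis=0)
    X = templates - mu
    # Right singular vectors of the centered data are the eigenvectors of
    # the covariance matrix; squared singular values give the variance.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    m_i = int(np.searchsorted(ratio, var_keep)) + 1   # keep 99% of variance
    return mu, Vt[:m_i]                               # S_i stored as (m_i, d)

def feature_vector(r, subspaces):
    """Project example r into every class subspace and concatenate:
    v = [S_1^T r, S_2^T r]^T."""
    return np.concatenate([S @ (r - mu) for mu, S in subspaces])

rng = np.random.default_rng(0)
person = rng.normal(size=(50, 900))    # 50 flattened 30x30 templates per class
vehicle = rng.normal(size=(50, 900))
subspaces = [build_subspace(person), build_subspace(vehicle)]
v = feature_vector(rng.normal(size=900), subspaces)   # length m_1 + m_2
```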

Initial Training

The Bayes classifier is used as the base (weak) classifier for boosting. Each feature vector component vq, where q ranges over 1, ..., m1+m2 (for two object classes plus the background class), is used to learn a class-conditional pdf for each class. The classification decision of the qth base classifier hq is ci if

p(vq | ci) P(ci) > p(vq | cj) P(cj) for all j ≠ i.
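A base classifier of this form can be sketched as a one-dimensional Bayes rule on a single coefficient; the Gaussian class-conditional used here is an assumption (the text only states that a pdf is learned per class).

```python
import numpy as np

class GaussianBayesBase:
    """Weak learner: Bayes rule on one feature coefficient, with an assumed
    Gaussian class-conditional pdf per class and optional boosting weights."""
    def __init__(self, q):
        self.q = q                       # index of the coefficient v_q
    def fit(self, V, y, w=None):
        x = V[:, self.q]
        n = len(y)
        w = np.ones(n) / n if w is None else w / w.sum()
        self.classes = np.unique(y)
        self.prior, self.mu, self.sig = {}, {}, {}
        for c in self.classes:
            wc, xc = w[y == c], x[y == c]
            self.prior[c] = wc.sum()     # weighted class prior P(c)
            self.mu[c] = np.average(xc, weights=wc)
            self.sig[c] = np.sqrt(np.average((xc - self.mu[c])**2, weights=wc)) + 1e-9
        return self
    def predict(self, V):
        x = np.atleast_2d(V)[:, self.q]
        # pick the class maximizing p(v_q | c) P(c)
        post = np.stack([self.prior[c] *
                         np.exp(-0.5 * ((x - self.mu[c]) / self.sig[c])**2) / self.sig[c]
                         for c in self.classes])
        return self.classes[post.argmax(axis=0)]
```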

AdaBoost.M1 (Freund and Schapire, 1996) is used to learn the strong classifier from the initial training data and the base classifiers.
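A compact sketch of the AdaBoost.M1 loop follows; the simple threshold stumps stand in for the per-coefficient Bayes learners, and the round count is illustrative.

```python
import copy
import numpy as np

class Stump:
    """Threshold on one coordinate; a stand-in for the per-coefficient
    Bayes base classifiers described above."""
    def __init__(self, q):
        self.q = q
    def fit(self, V, y, w):
        x = V[:, self.q]
        self.thr = np.average(x, weights=w)
        self.flip = np.sum(w * ((x > self.thr).astype(int) != y)) > 0.5
        return self
    def predict(self, V):
        pred = (np.atleast_2d(V)[:, self.q] > self.thr).astype(int)
        return 1 - pred if self.flip else pred

def adaboost_m1(V, y, learners, rounds=10):
    """AdaBoost.M1 (Freund & Schapire, 1996): each round, refit the base
    learners on the weighted data, keep the one with lowest weighted error,
    and down-weight the examples it classified correctly."""
    n = len(y)
    w = np.ones(n) / n
    ensemble = []
    for _ in range(rounds):
        fitted = [h.fit(V, y, w) for h in learners]
        errs = [np.sum(w * (h.predict(V) != y)) for h in fitted]
        j = int(np.argmin(errs))
        if errs[j] >= 0.5:                  # M1 stops at chance level
            break
        beta = max(errs[j], 1e-12) / (1 - errs[j])
        correct = fitted[j].predict(V) == y
        w *= np.where(correct, beta, 1.0)   # shrink weights of correct examples
        w /= w.sum()
        ensemble.append((np.log(1 / beta), copy.deepcopy(fitted[j])))
    return ensemble

def strong_predict(ensemble, V):
    """Weighted vote of the selected base classifiers (binary case)."""
    score = sum(a * (2 * h.predict(V) - 1) for a, h in ensemble)
    return (score > 0).astype(int)
```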

Row 1: Top 3 eigenvectors of the person appearance subspace. Row 2: Top 3 eigenvectors of the vehicle appearance subspace

Histograms of feature coefficients from the appearance subspaces


The Online Co-training Framework

The boosting mechanism selects the least correlated base classifiers, which is an ideal property for co-training. The examples confidently labeled by one classifier are used to train the other classifier. For this step, only the examples lying close to the decision boundary of the boosted classifier are useful, as classifying such examples correctly will improve the classification performance. In other words, we employ examples with small margins for online training.

The implementation steps of the online co-training framework are:

  1. Determine confidence thresholds for each base classifier using a validation data set.
  2. For class ci and the jth base classifier hj, set the confidence threshold T_{j,ci}^{base} to the highest probability achieved by a negative example (an example not belonging to ci).
  3. Compute margin thresholds T_{j,ci}^{margin} from the validation data set.
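Steps 1-2 can be sketched as follows; the array shapes and helper names are assumptions made for illustration.

```python
import numpy as np

def confidence_thresholds(probs_val, y_val, c):
    """T^base_{j,c}: for each base classifier j, the highest probability it
    assigned to class c on a validation example NOT belonging to class c.
    probs_val: (n_val, n_base) probabilities for class c."""
    return probs_val[y_val != c].max(axis=0)

def confident_votes(probs_test, thresholds):
    """For each test example, the fraction of base classifiers whose
    probability for class c exceeds their confidence threshold."""
    return (probs_test > thresholds).mean(axis=1)
```

An example is then a co-training candidate when `confident_votes` exceeds the fraction used in the test phase (10% in the text).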

During the test phase, an example is selected for training if i) more than 10% of the base classifiers confidently predict its label, or ii) its margin is smaller than the computed margin threshold. Once an example has been labeled by the co-training mechanism, the online boosting algorithm of [Oza and Russell, 2002] is used to update the base classifiers and the boosting coefficients.
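The online update can be sketched with the Poisson-weighted scheme of Oza and Russell; the minimal nearest-mean base learner below is an illustrative stand-in, not the project's Bayes learner.

```python
import numpy as np

class OnlineNearestMean:
    """Minimal online base learner (illustrative stand-in): running per-class
    mean of one feature coefficient, with nearest-mean prediction."""
    def __init__(self, q):
        self.q = q
        self.s = {0: 0.0, 1: 0.0}
        self.n = {0: 1e-9, 1: 1e-9}
        self.lam_c, self.lam_w = 1e-9, 1e-9   # correct/wrong weight accumulators
    def partial_fit(self, x, y):
        self.s[y] += x[self.q]
        self.n[y] += 1
    def predict(self, x):
        dist = {c: abs(x[self.q] - self.s[c] / self.n[c]) for c in (0, 1)}
        return min(dist, key=dist.get)

def online_boost_update(ensemble, x, y, rng):
    """One online-boosting step [Oza and Russell]: each base learner trains
    on the example k ~ Poisson(lam) times, and lam grows for the learners
    downstream of a mistake."""
    lam = 1.0
    for h in ensemble:
        for _ in range(rng.poisson(lam)):
            h.partial_fit(x, y)
        if h.predict(x) == y:
            h.lam_c += lam
            lam *= (h.lam_c + h.lam_w) / (2 * h.lam_c)   # shrink lam
        else:
            h.lam_w += lam
            lam *= (h.lam_c + h.lam_w) / (2 * h.lam_w)   # grow lam
```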
Information flow for the real-time object classification system


Initial Training: 50 examples of each class; all examples were scaled to 30×30 and vectorized. Validation set: 20 images per class. Testing is performed on three sequences.

Example Detections 
Change in performance over time for sequences 1, 2, and 3. Performance was measured over two-minute intervals; 150 to 200 detection decisions were typically made in each interval.
Performance vs. the number of co-trained examples for the three sequences. Relatively few examples are required to improve the detection rates, since these examples come from the same scene in which the classifier is being evaluated.



  1. PowerPoint Presentation [41MB]
  2. Poster