Tracking Faces in Video Surveillance: Which Is the Best Method?

By: M. Ali Akber Dewan, Éric Granger, Fabio Roli, Robert Sabourin, Gian Luca Marcialis


M. Ali Akber Dewan
M. Ali Akber Dewan was a postdoctoral fellow at École de technologie supérieure de Montréal (ÉTS) from 2012 to 2014 in the Livia laboratory. He specializes in face tracking and recognition in video surveillance.

Éric Granger
Éric Granger is a professor in the Automated Manufacturing Engineering Department at ÉTS. His research interests are adaptive and intelligent systems, video surveillance, computer and network security, and face recognition.

Fabio Roli
Fabio Roli is a professor of computer engineering and Director of the Pattern Recognition and Application Lab (PRA) at the University of Cagliari, Italy. His research focuses on the design of pattern recognition systems.

Robert Sabourin
Robert Sabourin is a professor in the Automated Manufacturing Engineering Department at ÉTS. His research includes pattern recognition and inspection, neural networks, machine learning, genetic programming and bank cheque processing.

Gian Luca Marcialis
Dr. Gian Luca Marcialis is currently Assistant Professor at the University of Cagliari and a member of the PRA lab. His research interests are in the fusion of multiple classifiers for biometric person recognition.

Header picture from Paul Sastrasinh's website, no usage restriction.

Given the current demand for security and surveillance technologies, decision support systems for video surveillance are being considered by many public safety organizations for enhanced situation analysis. In many applications, automated face recognition is increasingly employed to alert a human operator to the presence of individuals of interest appearing in either live (real-time monitoring) or archived (post-event analysis) videos.

Picture 1: Source [Img1]

In practice, face recognition in video surveillance (FRiVS) is challenging because accurate responses are required for faces captured under semi-constrained (e.g., inspection lane, portal and checkpoint entry) and unconstrained (e.g., cluttered free-flow scene at an airport or casino) conditions. In recent years, face tracking has become an important tool for recognition.

Techniques for face tracking in video surveillance should be robust to changes in pose, expression and illumination, as well as to occlusion in cluttered scenes. Given these challenges, trackers based on Adaptive Appearance Modelling (AAM) typically improve the estimation of the target's state because they initiate and update an internal face model for each individual as facial appearance changes.

This article presents an empirical performance comparison of three state-of-the-art trackers based on AAM:

  1. Tracking Learning Detection (TLD) [1]
  2. Incremental Visual Tracking (IVT) [2]
  3. Discriminative Sparse Coding based Tracking (DSCT) [3].

Figure 1: A framework for face tracking using adaptive appearance modelling tracker. Source [Img2]

These trackers are compared for face tracking with video surveillance applications in mind. They are evaluated according to area overlap error, tracking error and time complexity, using ChokePoint videos collected in uncontrolled video-surveillance environments, where individuals walk through portals.
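As a rough illustration of the two accuracy measures, the sketch below computes an overlap error and a tracking error for a pair of bounding boxes. It assumes that the area overlap error is one minus the Pascal VOC intersection-over-union, and that the tracking error is the Euclidean distance between box centres in pixels. The article does not spell out either definition, so both are assumptions, and the function names and the (x, y, width, height) box format are hypothetical.

```python
import math

def overlap_error(pred, truth):
    """Assumed AOE: 1 - intersection-over-union of two (x, y, w, h) boxes."""
    px, py, pw, ph = pred
    tx, ty, tw, th = truth
    # Width and height of the intersection rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(px + pw, tx + tw) - max(px, tx))
    ih = max(0.0, min(py + ph, ty + th) - max(py, ty))
    inter = iw * ih
    union = pw * ph + tw * th - inter
    return 1.0 - inter / union if union > 0 else 1.0

def center_error(pred, truth):
    """Assumed TE: Euclidean distance between the two box centres, in pixels."""
    pcx, pcy = pred[0] + pred[2] / 2, pred[1] + pred[3] / 2
    tcx, tcy = truth[0] + truth[2] / 2, truth[1] + truth[3] / 2
    return math.hypot(pcx - tcx, pcy - tcy)
```

Averaging each measure over all frames of a track would give one point per tracker and sequence; whether the study aggregates results this way is not stated here.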

1. Tracking Learning Detection (TLD)

Method

The face model is represented by a collection of the target and non-target patches observed so far. The main framework consists of three components:

  1. The tracking component uses a median-flow tracker to find the face correspondence between frames;
  2. The detection component employs a three-layer cascaded classifier to select the patch most similar to the target face model;
  3. The learning component employs a P-expert and an N-expert to select target and non-target patches, which are used to update the face model.
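The role of the two experts in the learning component can be sketched as a labelling rule. This is a deliberately simplified illustration, not the TLD implementation: it assumes that patches overlapping strongly with the tracked box become target examples, that patches far from it become non-target examples, and that everything in between is ignored. The thresholds and function names are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def pn_label(tracked_box, detections, pos_thresh=0.6, neg_thresh=0.2):
    """Split detector responses into target and non-target training boxes.

    pos_thresh and neg_thresh are illustrative values, not taken from the article.
    """
    positives, negatives = [], []
    for box in detections:
        overlap = iou(tracked_box, box)
        if overlap >= pos_thresh:      # P-expert role: consistent with the track
            positives.append(box)
        elif overlap <= neg_thresh:    # N-expert role: clearly somewhere else
            negatives.append(box)
    return positives, negatives
```

Patches cropped from the returned boxes would then be added to the collection of target and non-target patches that forms the face model.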

Strengths

  • Tracks consistently as long as the appearance does not change much from previous observations;
  • TLD learns the appearance of the target with respect to non-target samples, and can thus automatically recover the track when the target reappears.

Weaknesses

  • The face model is less adaptive, and TLD is highly vulnerable to drift in cluttered scenes;
  • Performs an exhaustive search for faces, which increases processing time;
  • Track failure may occur if an object similar in appearance to the target appears in the scene.

2. Incremental Visual Tracking (IVT)

Method

  • Facial models are represented in a low-dimensional subspace;
  • Incremental updates are performed using the sequential Karhunen–Loève algorithm;
  • Face correspondence is found with a particle filter over affine motion parameters, using Euclidean and Mahalanobis distances for data association.
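The incremental subspace update can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the IVT reference code: it recomputes a full SVD on a small stacked matrix instead of using the more efficient QR-based update, it applies the forgetting factor to both the singular values and the effective sample count, and all function and parameter names are hypothetical. Face patches are assumed to be vectorized into the columns of a matrix.

```python
import numpy as np

def init_subspace(X, k):
    """Build an initial subspace from X (d x n, one vectorized patch per column)."""
    mu = X.mean(axis=1, keepdims=True)
    U, S, _ = np.linalg.svd(X - mu, full_matrices=False)
    return U[:, :k], S[:k], mu, X.shape[1]

def update_subspace(U, S, mu, n, B, k, forget=1.0):
    """Fold a new batch B (d x m) into the subspace (U, S, mu, n).

    forget < 1 down-weights older observations; forget = 1 keeps them all.
    """
    m = B.shape[1]
    mu_B = B.mean(axis=1, keepdims=True)
    n_eff = forget * n
    mu_new = (n_eff * mu + m * mu_B) / (n_eff + m)
    # Extra column that accounts for the shift between the old and new means.
    extra = np.sqrt(n_eff * m / (n_eff + m)) * (mu_B - mu)
    stacked = np.hstack([forget * U * S, B - mu_B, extra])
    U_new, S_new, _ = np.linalg.svd(stacked, full_matrices=False)
    return U_new[:, :k], S_new[:k], mu_new, n_eff + m

def reconstruction_error(U, mu, x):
    """Euclidean distance from observation x to the subspace."""
    r = x - mu.ravel()
    return float(np.linalg.norm(r - U @ (U.T @ r)))
```

In a tracker, each particle's candidate patch would be scored with a distance such as `reconstruction_error`, and the best-scoring patches would be collected into the next batch `B`. With `forget=1.0` and enough retained components, the update reproduces the batch result on all data seen so far.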

Strengths

  • The eigenspace-based face representation is robust to pose changes and clutter;
  • On-line learning incrementally updates facial models according to changes in the scene.

Weaknesses

  • Susceptible to drift, as it can gradually adapt to non-target regions during updates;
  • Lacks mechanisms for detecting and correcting drift, as it does not incorporate global constraints.

 

 

Picture 2: Face tracking provided by IVT, TLD, and DSCT on selected frames used for the study. Source [Img2]

3. Discriminative Sparse Coding based Tracking (DSCT)

Method

  • Sparse codes are used to represent the face model;
  • Two observation models are used as a face model:
    1. A static model computed from the first-frame observation;
    2. A dynamic model computed by accumulating observations over the several most recent frames;
  • Candidate regions are compared first with the dynamic (adaptive) observation model and then with the static one.
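The interplay of the two observation models can be illustrated with a small sketch. It makes several assumptions that do not come from the article: plain cosine similarity on feature vectors stands in for the sparse-code comparison, the dynamic model is the mean of a sliding window of recent observations, and the two comparisons are combined as a shortlist step followed by a final choice. The class name and parameters are hypothetical.

```python
from collections import deque
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

class TwoModelMatcher:
    def __init__(self, first_obs, window=5):
        # Static model: fixed to the first-frame observation.
        self.static = np.asarray(first_obs, dtype=float)
        # Dynamic model: built from the most recent observations.
        self.recent = deque([self.static], maxlen=window)

    def dynamic(self):
        return np.mean(list(self.recent), axis=0)

    def select(self, candidates, keep=3):
        """Return the index of the chosen candidate feature vector."""
        cands = [np.asarray(c, dtype=float) for c in candidates]
        dyn = self.dynamic()
        # Stage 1: shortlist the candidates closest to the dynamic model.
        order = sorted(range(len(cands)),
                       key=lambda i: cosine(cands[i], dyn), reverse=True)[:keep]
        # Stage 2: pick the shortlisted candidate closest to the static model.
        return max(order, key=lambda i: cosine(cands[i], self.static))

    def update(self, obs):
        """Add the accepted observation to the dynamic model."""
        self.recent.append(np.asarray(obs, dtype=float))
```

Because the final choice is anchored to the first frame, a sketch like this tracks well only while the face still resembles its initial appearance.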

Strengths

  • Tracks well as long as the appearance of a face does not change much from the first frame.

Weaknesses

  • The track fails if a future state differs drastically from the observation in the first frame;
  • The dynamic model often fails, which in turn causes the static model to malfunction;
  • The feature vector is high-dimensional, which results in a high computational cost.

 

Figure 2: Tracking Error (TE) versus Pascal VOC Overlap Error (AOE). Source [Img2]

Experimental results

Results indicate that:

  • IVT outperforms the others in its ability to accurately track faces in the presence of occlusion, and under variations in pose, scale and lighting;
  • Further characterization of IVT indicates that using a small batch size and forgetting factor during updates provides better tracking accuracy when the capture conditions of a face track change;
  • When conditions change more gradually, IVT benefits from assessing facial quality before updating face models;
  • The low discriminative power of TLD and the computational complexity of DSCT are the main limitations of these two methods.

To improve the performance of IVT:

  • A quality assessment process can be used to construct and validate a reliable face model;
  • Parameters (e.g., batch size, forgetting factor) can be dynamically optimized based on face capture conditions;
  • Contextual information can be exploited to improve multi-face tracking.

Additional Information

For more information on this study, we invite you to read the following research article (PDF):

M. Ali Akber Dewan, E. Granger, F. Roli, R. Sabourin, and G. L. Marcialis (2014). A Comparison of Adaptive Appearance Methods for Tracking Faces in Video Surveillance. International Conference on Imaging for Crime Detection and Prevention (ICDP 2013), Kingston University London, United Kingdom.

To get more information about the Laboratory for Imagery, Vision and Artificial Intelligence (Livia), use this link. Livia is looking for students for many research projects.

 

 

M. Ali Akber Dewan

Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory 

Éric Granger

Program : Automated Manufacturing Engineering 

Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory 

Robert Sabourin

Program : Automated Manufacturing Engineering 

Research chair : ETS Research Chair on Adaptive and Evolutive Surveillance Systems in Dynamic Environments 

Research laboratories : LIVIA – Imaging, Vision and Artificial Intelligence Laboratory 
