Saliency Map Estimation

Contact person: Dmitriy S. Vatolin


The goal is to design an algorithm that automatically constructs saliency maps for real-life video.

  • Most existing methods are designed for still images
  • Saliency construction algorithms usually handle only specific cases; different video sequences require different methods

Potential applications:

  • Content-aware compression
  • Content-aware quality estimation
  • Autofocus

Eye-tracking saliency from the TU Delft database.

Methods for constructing saliency maps


Automatic method selection

We estimate the probability that each method is suitable from features extracted from binarized saliency maps, using a Relevance Vector Machine [3]
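A minimal sketch of what such feature extraction from a binarized map could look like is given below. The feature set (salient-area fraction, normalized centroid, spatial spread) is an assumption for illustration, not the exact set used in this work; the resulting vectors would then be fed to the RVM classifier.

```python
import numpy as np

def binary_map_features(bmap):
    """Example feature vector from a binarized saliency map.

    The features (area fraction, normalized centroid, spatial spread)
    are illustrative -- not the exact set used in this project.
    """
    h, w = bmap.shape
    area = float(bmap.mean())              # fraction of salient pixels
    ys, xs = np.nonzero(bmap)
    if ys.size == 0:                       # empty map: centered defaults
        return np.array([0.0, 0.5, 0.5, 0.0])
    cy, cx = ys.mean() / h, xs.mean() / w  # normalized centroid
    spread = (ys.std() / h + xs.std() / w) / 2
    return np.array([area, cy, cx, spread])
```

Since ready-made RVM implementations are rare in mainstream libraries, any probabilistic classifier (e.g., logistic regression) could stand in for experimentation on such features.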


Method selection results


Temporal smoothing

We use our key-frame-based depth propagation to smooth the results over time. The following scheme illustrates our strategy
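In code, a toy version of key-frame-based temporal smoothing might look like the sketch below: each per-frame saliency map is blended with a linear interpolation of the two nearest key-frame maps. The actual method propagates maps via the depth-propagation machinery mentioned above; the blending weight and interpolation here are assumptions for illustration.

```python
import numpy as np

def keyframe_smooth(sal_maps, keyframes, blend=0.5):
    """Sketch of key-frame based temporal smoothing of saliency maps.

    Blends each frame's raw map with a linear interpolation of the two
    nearest key-frame maps. Illustrative only -- the project's actual
    scheme uses key-frame depth propagation.
    """
    out = []
    for t, sal in enumerate(sal_maps):
        # Nearest key-frames on each side of frame t.
        left = max((k for k in keyframes if k <= t), default=keyframes[0])
        right = min((k for k in keyframes if k >= t), default=keyframes[-1])
        if left == right:
            anchor = sal_maps[left]
        else:
            a = (t - left) / (right - left)
            anchor = (1 - a) * sal_maps[left] + a * sal_maps[right]
        out.append((1 - blend) * sal + blend * anchor)
    return out
```

The effect is that an isolated flicker in one frame's map is pulled halfway toward what the surrounding key-frames predict, suppressing temporal noise.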



The links below contain our results on the test set and a few illustrations of how the methods work:

  1. Large set
  2. Small set

Problems & future work

  • New methods development
There are saliency cues we have not covered yet: scene geometry, point of focus, text detection, and eye detection. We also plan to improve the existing methods.
  • Improvement of the feature extraction for machine learning
In the machine learning part, we see the main challenge as designing a robust feature extraction algorithm. We also plan to use a feature set extracted directly from the grayscale saliency map, without binarization.
  • Application of obtained saliency maps for real-life problems
Content-aware compression and content-aware cropping seem the most realistic to us.


  1. S. Goferman, L. Zelnik-Manor, and A. Tal, “Context-Aware Saliency Detection,” CVPR, 2010, pp. 2376–2383.
  2. Chenlei Guo and Liming Zhang, “A Novel Multiresolution Spatiotemporal Saliency Detection Model and Its Applications in Image and Video Compression,” IEEE Transactions on Image Processing, vol. 19, no. 1, 2010, pp. 185–198.
  3. M.E. Tipping, “Sparse Bayesian Learning and the Relevance Vector Machine,” Journal of Machine Learning Research, vol. 1, 2001, pp. 211–244.



    This work is partially supported by the Intel/Cisco Video-Aware Wireless Network (VAWN) Program.