Visual processing in foveated systems

Studies of primate visual systems reveal a compromise that simultaneously provides a wide field of view, high spatial resolution in the region of interest, and a compact output that is fast to process. The photoreceptors in the retina are densest in the central region, the fovea, and become progressively sparser toward the periphery.

 

The projection of the photoreceptor array onto the primary visual cortex can be well described by a logarithmic-polar (log-polar) distribution mapped in an approximately uniform way onto a roughly rectangular surface (the cortex). Log-polar imaging is a well-established paradigm for simplifying a wide range of computational problems in active vision, since it simultaneously provides a wide field of view, high spatial resolution in the fovea, and a significant data reduction.
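The mapping described above can be illustrated with a minimal sketch. It assumes a nearest-pixel sampling scheme, a fovea radius `rho0`, and a ring growth factor `a` chosen so that the outermost ring touches the image border; the function name and all parameters are hypothetical and are not taken from the library listed further down.

```python
import math

def logpolar_nearest(image, n_rings, n_sectors, rho0=1.0):
    """Sample a grayscale image (list of rows) onto an n_rings x n_sectors cortical grid.

    Illustrative nearest-pixel sketch: ring i has radius rho0 * a**i,
    sector j has angle 2*pi*j / n_sectors. Requires n_rings >= 2.
    """
    h, w = len(image), len(image[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    r_max = min(cx, cy)
    # Growth factor chosen so that the last ring reaches r_max (assumption).
    a = (r_max / rho0) ** (1.0 / (n_rings - 1))
    cortical = [[0] * n_sectors for _ in range(n_rings)]
    for i in range(n_rings):
        r = rho0 * a ** i  # radius grows exponentially with the ring index
        for j in range(n_sectors):
            theta = 2.0 * math.pi * j / n_sectors
            # Nearest Cartesian pixel, clamped to the image bounds.
            x = min(max(int(round(cx + r * math.cos(theta))), 0), w - 1)
            y = min(max(int(round(cy + r * math.sin(theta))), 0), h - 1)
            cortical[i][j] = image[y][x]
    return cortical
```

With a 65 x 65 input (4225 pixels) and a 6 x 8 cortical grid, the output holds 48 samples: rings close to the center are densely spaced, while the outer rings cover the periphery coarsely.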

These features are well suited to visual systems that continuously interact with the environment by purposefully moving the eyes to bring objects of interest into the foveas.

Related paper:

F. Solari, M. Chessa, S.P. Sabatini. Design strategies for direct multi-scale and multi-orientation feature extraction in the log-polar domain. Pattern Recognition Letters 33(1), pp. 41-51. [doi]

We have addressed the problem of multi-orientation and multi-scale filtering in the log-polar domain. To this end, we carried out a systematic analysis of the relationships between the parameters of the discrete log-polar mapping and those of a bank of Gabor filters.
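As a rough illustration of what a bank of Gabor filters looks like, the sketch below builds quadrature (even and odd) kernels at several orientations. The kernel size, `sigma`, peak frequency `f0`, and number of orientations are placeholder values chosen for the example, not the design values derived in the paper.

```python
import math

def gabor_kernel(size, sigma, f0, theta):
    """Return the (even, odd) parts of a complex Gabor kernel as lists of rows.

    size  : odd kernel side length in pixels
    sigma : standard deviation of the Gaussian envelope
    f0    : peak radial frequency in cycles/pixel
    theta : orientation of the carrier in radians
    """
    half = size // 2
    even, odd = [], []
    for y in range(-half, half + 1):
        row_even, row_odd = [], []
        for x in range(-half, half + 1):
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            phase = 2.0 * math.pi * f0 * (x * math.cos(theta) + y * math.sin(theta))
            row_even.append(envelope * math.cos(phase))
            row_odd.append(envelope * math.sin(phase))
        even.append(row_even)
        odd.append(row_odd)
    return even, odd

def gabor_bank(size=11, sigma=2.0, f0=0.25, n_orient=8):
    """Kernels at n_orient orientations evenly spaced over [0, pi)."""
    return [gabor_kernel(size, sigma, f0, k * math.pi / n_orient)
            for k in range(n_orient)]
```

Because the retinal area covered by one cortical pixel grows with eccentricity, a fixed-size kernel applied to the cortical image spans a larger retinal region in the periphery than in the fovea. This is why the filter parameters and the mapping parameters have to be chosen together.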

The analysis has been validated by applying the distributed phase-based approach for the computation of vector binocular disparity, based on a bank of Gabor filters (click here for more details), to log-polar stereo pairs. The distributed algorithm for computing the 2D vector disparity can be applied directly to cortical images, since it does not require an explicit search for correspondences between the left and right images along the epipolar lines. As a result, there is no need to account for the fact that straight lines in the Cartesian domain become curves in log-polar space.

The ability to exploit a space-variant representation efficiently is of great importance in the development of active systems capable of interacting with the environment. Precise processing of the visual signal is possible in the foveal area, where the feature errors are small enough to allow a fine exploration of the object of interest. At the same time, the coarse computation of the features in the peripheral area provides enough information to detect new salient regions and to shift the focus of attention there.
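The idea of estimating disparity from filter phase, without searching for correspondences, can be sketched in one dimension. This is a deliberately simplified single-channel illustration and not the distributed 2D algorithm referred to above: it assumes a single 1D Gabor filter with hypothetical parameters `sigma` and `f0`, and it recovers a shift from the phase difference between the left and right responses at the same position.

```python
import cmath
import math

def gabor_response(signal, center, sigma, f0):
    """Complex response of a 1D Gabor filter centered at index `center`."""
    half = int(4 * sigma)  # truncate the Gaussian envelope at 4 sigma
    acc = 0j
    for k in range(-half, half + 1):
        idx = center + k
        if 0 <= idx < len(signal):
            envelope = math.exp(-(k * k) / (2.0 * sigma * sigma))
            acc += signal[idx] * envelope * cmath.exp(-2j * math.pi * f0 * k)
    return acc

def phase_disparity(left, right, center, sigma=4.0, f0=0.125):
    """Disparity in pixels from the left/right phase difference.

    Valid only while the true shift stays within half a period of f0,
    because the phase difference is wrapped into (-pi, pi].
    """
    q_left = gabor_response(left, center, sigma, f0)
    q_right = gabor_response(right, center, sigma, f0)
    dphi = cmath.phase(q_left * q_right.conjugate())
    return dphi / (2.0 * math.pi * f0)

# Usage: the right signal is the left one shifted by 2 samples.
f = 0.125
left = [math.cos(2.0 * math.pi * f * i) for i in range(64)]
right = [math.cos(2.0 * math.pi * f * (i - 2)) for i in range(64)]
estimate = phase_disparity(left, right, center=32)
```

Since the estimate comes from a local phase comparison at a single position, no line in the image has to be scanned, which is the property that makes this family of methods convenient on cortical images.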

Datasets and software libraries

The datasets and the software libraries used in our papers can be downloaded and used by Computer Vision researchers. We grant permission to use and publish all images and ground truth maps on this website. However, if you use our datasets, we request that you cite the appropriate paper.

  • Software libraries
    • C++ library implementing the Nearest Pixel, Bilinear Interpolation, Overlapping Circular RFs, and Adjacent RFs techniques. The library is embedded into the OpenCV release (click here to go directly to the source code). Related paper:
      Manuela Chessa, Silvio P. Sabatini, Fabio Solari, Fabio Tatti. (2011) A quantitative comparison of speed and reliability for log-polar mapping techniques. 8th International Conference on Computer Vision Systems, ICVS 2011.
  • Datasets (More details about the generation of the stereo pairs and of the ground truth map here)
    • Synthetic frontoparallel plane. Click here to download the dataset.
    • Toy. Click here to download the dataset.
    • Desktop. Click here to download the dataset.
For further information, feel free to contact Fabio Solari or Manuela Chessa.