Perception of Visual Motion

The measurement of visual motion is one of the most important functions of the visual system. This measurement is useful for detecting prey and predators, tracking mates, navigating, controlling eye movements, and so on. We have been interested in the biological measurement of local and global motion. Local motion is the velocity vector that one can measure at essentially any point and any instant in time. Global motion is the flow field obtained by combining many velocity vectors over space and over time.

In recent years, our focus has been on global motion. Measuring global motion is important because it gives information about the motion of larger structures in a scene. For instance, if one were moving towards an object, its image would expand on the retina. In other words, the velocity vectors of this object would be interdependent, having a coherent structure in space. Normally, this structure would be such that the vectors near the center (the so-called focus of expansion) would be small, whereas vectors far from the center would be large. One can quantify this gradient of vector sizes with a single parameter called the rate of expansion. Humans are so sensitive to this parameter that they lose the ability to measure the magnitude of the individual vectors. However, the human measurement of the rate of expansion is made in relatively small regions of space. If, for example, one shows a subject an expansion in which all vectors are equal in magnitude, the subject sees a non-rigid expansion, perceiving the center as expanding faster than the periphery. Psychophysical measurements show that each sub-region is perceived as expanding relatively independently of the others; that is, the rate of expansion is estimated correctly and independently for each sub-region. The same holds for global motions other than expansion. One can decompose the motion of rigid patches of objects into translation, rotation, expansion (or contraction), and shear.
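The decomposition into translation, rotation, expansion, and shear can be illustrated with a minimal sketch. It assumes, as a standard first-order approximation that is not spelled out in the text, that the flow in a small patch is v(x) = t + A x, where t is the translation and A is a 2x2 velocity-gradient matrix. The function name `decompose_flow` and the component labels are hypothetical, chosen for illustration only.

```python
def decompose_flow(a11, a12, a21, a22):
    """Split a 2x2 velocity-gradient matrix into expansion, rotation and shear.

    Assumption: within a small patch the flow is v(x) = t + A x,
    with A = [[a11, a12], [a21, a22]]. The translation t is not part of A.
    """
    return {
        # Half the divergence: the rate of expansion (negative means contraction).
        "expansion": (a11 + a22) / 2.0,
        # Half the curl: the angular velocity of rotation (counterclockwise positive).
        "rotation": (a21 - a12) / 2.0,
        # The two shear (deformation) components.
        "shear_axial": (a11 - a22) / 2.0,
        "shear_diagonal": (a12 + a21) / 2.0,
    }


# Hypothetical patch that expands at 0.5 per second while rotating at 0.25 rad/s.
components = decompose_flow(0.5, -0.25, 0.25, 0.5)
```

With these conventions, a pure expansion v = k x yields an expansion component of k, and a pure rotation with angular velocity w yields a rotation component of w.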


Example of Stimulus Used to Study Psychophysical Measurement of Rate of Expansion.
In this example, all vectors have the same magnitude. However, subjects see a faster expansion near the center than in peripheral regions of the optic flow (non-rigid expansion). If a subject attends to the demarcated ring, then the perceived rate of expansion is correct, that is, equal to the speed divided by the distance to the center. Hence, it appears that the subject perceives multiple rates of expansion, all of them correct. This indicates that the brain estimates rates of expansion through relatively local computations.
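The relation stated above (rate of expansion equals speed divided by distance to the center) can be checked with a short sketch. The speed of 2 units per second and the list of radii are hypothetical values chosen for illustration, not parameters of the actual stimulus.

```python
def local_expansion_rate(speed, radius):
    """Rate of expansion (per second) of a radial flow at a given distance from the center."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    return speed / radius


# Hypothetical stimulus: every dot moves outward at the same speed.
SPEED = 2.0
radii = [1.0, 2.0, 4.0, 8.0]

# Constant-speed field: the local rate falls off with distance (non-rigid expansion).
constant_speed_rates = [local_expansion_rate(SPEED, r) for r in radii]

# Rigid expansion for comparison: speed grows in proportion to radius,
# so the rate is the same everywhere.
RIGID_RATE = 0.5
rigid_rates = [local_expansion_rate(RIGID_RATE * r, r) for r in radii]
```

Here `constant_speed_rates` decreases from the center outward, which matches the percept of a faster expansion near the center, whereas `rigid_rates` is uniform.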

Besides being organized into spatially coherent patterns, motion flows are also coherent in time: local velocity vectors do not change direction abruptly. We have been looking both computationally and psychophysically at how humans exploit such motion coherences. It turns out that humans exploit temporal coherence to improve the measurement of motion signals. We showed that this improvement could be implemented by a neural version of the Kalman filter. (Engineers use this filter to track objects such as satellites.) As for spatial coherence, we showed that the nervous system can measure the parameters of global flows quantitatively, such as the angular velocity of rotation.
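To make the role of temporal coherence concrete, here is a minimal sketch of a textbook scalar Kalman filter applied to noisy velocity measurements. It is not the neural implementation referred to above. It assumes a random-walk model in which the velocity changes only slightly from one time step to the next, and the function name `kalman_track`, the noise variances, and the simulated data are all hypothetical.

```python
import random


def kalman_track(measurements, process_var, meas_var, init_estimate=0.0, init_var=10.0):
    """Filter a sequence of noisy scalar velocity measurements.

    Assumption: the true velocity follows a random walk with variance
    process_var per step, and each measurement adds noise of variance meas_var.
    """
    estimate, var = init_estimate, init_var
    estimates = []
    for z in measurements:
        # Predict: the velocity is expected to persist, but uncertainty grows a little.
        var += process_var
        # Update: blend the prediction and the new measurement by their reliabilities.
        gain = var / (var + meas_var)
        estimate += gain * (z - estimate)
        var *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Simulated data: a constant true velocity observed through unit-variance noise.
rng = random.Random(0)
TRUE_VELOCITY = 3.0
noisy = [TRUE_VELOCITY + rng.gauss(0.0, 1.0) for _ in range(200)]
filtered = kalman_track(noisy, process_var=1e-4, meas_var=1.0)


def mean_squared_error(values):
    return sum((v - TRUE_VELOCITY) ** 2 for v in values) / len(values)


# Compare errors over the second half, after the filter has settled.
raw_error = mean_squared_error(noisy[100:])
filtered_error = mean_squared_error(filtered[100:])
```

Because the filter pools evidence over time, `filtered_error` is smaller than `raw_error` whenever the underlying motion is temporally coherent, as it is in this simulation.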

We also proposed a theoretical framework for understanding how the brain measures all the components of global motion simultaneously. This framework computes the probability that particular motion models fit small regions of space (and time). These models could be translation, expansion, rotation, and shear. However, they could also include new models that do not arise in everyday activity (such as those needed when one practices specialized forms of sport) or so-called non-parametric models. This latter class of models allows the segmentation of scenes in novel situations, that is, situations that subjects have never seen before.
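The idea of computing the probability that each motion model fits a small region can be sketched with a toy example. This is an illustration under strong assumptions, not the framework itself: it considers only three parametric models (translation, expansion and rotation about the patch center), assumes Gaussian noise with a known standard deviation and equal prior probabilities, ignores differences in model complexity, and omits non-parametric models. All function names are hypothetical.

```python
import math


def fit_translation(points, flows):
    # Best-fitting uniform translation: the mean flow vector.
    n = len(flows)
    tx = sum(vx for vx, _ in flows) / n
    ty = sum(vy for _, vy in flows) / n
    return [(tx, ty)] * n


def fit_expansion(points, flows):
    # Least-squares rate k for the model v = k * (x, y).
    num = sum(x * vx + y * vy for (x, y), (vx, vy) in zip(points, flows))
    den = sum(x * x + y * y for x, y in points)
    k = num / den
    return [(k * x, k * y) for x, y in points]


def fit_rotation(points, flows):
    # Least-squares angular velocity w for the model v = w * (-y, x).
    num = sum(-y * vx + x * vy for (x, y), (vx, vy) in zip(points, flows))
    den = sum(x * x + y * y for x, y in points)
    w = num / den
    return [(-w * y, w * x) for x, y in points]


def model_probabilities(points, flows, noise_sd=0.1):
    """Probability of each model given the flow in one patch (equal priors assumed)."""
    models = {"translation": fit_translation, "expansion": fit_expansion, "rotation": fit_rotation}
    sse = {}
    for name, fit in models.items():
        predicted = fit(points, flows)
        sse[name] = sum((vx - px) ** 2 + (vy - py) ** 2
                        for (vx, vy), (px, py) in zip(flows, predicted))
    best = min(sse.values())  # subtract the best error for numerical stability
    weights = {name: math.exp(-(s - best) / (2.0 * noise_sd ** 2)) for name, s in sse.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical patch: a 3x3 grid of points rotating at 0.5 rad/s about its center.
points = [(float(x), float(y)) for x in (-1, 0, 1) for y in (-1, 0, 1)]
flows = [(-0.5 * y, 0.5 * x) for x, y in points]
probs = model_probabilities(points, flows)
best_model = max(probs, key=probs.get)
```

Running the same computation on every small region of an image, and keeping the most probable model or models for each, would give a crude version of the segmentation described above.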


Competition between Parametric and Non-parametric Motion Models.
Different models, parametric and non-parametric, compete to explain the visual data. The overall result is the segmentation into sub-regions and the assignment of one or more motion models to each sub-region.