Heeger DJ and Jepson A, Visual perception of three-dimensional motion, Neural Computation, 2:127-135, 1990.
Abstract: As an observer moves and explores the environment, the visual stimulation reaching the eye is constantly changing, yet the observer is somehow able to perceive the spatial layout of the scene and to discern his or her movement through space. Computational vision researchers have been trying to solve this problem for a number of years with only limited success. The problem is difficult because the relationship between the optical-flow field, the 3D motion parameters, and depth is nonlinear. We have come to understand that the nonlinear equation describing the optical-flow field can be split, by an exact algebraic manipulation, into three sets of equations. The first set relates the image velocities to the translational component of the 3D motion alone; thus the depth and the rotational velocity need not be known or estimated prior to solving for the translational velocity. Once the translation has been recovered, the second set of equations can be used to solve for the rotational velocity. Finally, given the recovered translation and rotation, depth can be estimated with the third set of equations. The algorithm applies to the general case of arbitrary motion with respect to an arbitrary scene. It is simple to compute, and it is biologically plausible.
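The three-stage decoupling described in the abstract can be sketched numerically. The sketch below assumes the standard instantaneous rigid-motion flow model with unit focal length, v_i = (1/Z_i) A(p_i) T + B(p_i) Ω; the matrices `A`, `B`, the residual function, the grid search over candidate translation directions, and all variable names are illustrative assumptions, not the paper's exact formulation. The key idea it demonstrates: for a candidate translation T, the depths and rotation enter the flow linearly, so projecting the measured flow onto the orthogonal complement of that linear subspace yields a residual that depends on T alone (step 1); once T is fixed, rotation and inverse depths follow from a single linear least-squares fit (steps 2 and 3).

```python
import numpy as np

# Flow model (f = 1, sign conventions assumed for illustration):
#   v_i = (1/Z_i) * A(p_i) @ T + B(p_i) @ Omega
def A(x, y):
    return np.array([[-1.0, 0.0, x],
                     [0.0, -1.0, y]])

def B(x, y):
    return np.array([[x * y, -(1 + x * x), y],
                     [1 + y * y, -x * y, -x]])

def residual(T, pts, v):
    """Squared flow residual after projecting out depths and rotation.
    Vanishes (up to noise) only at the true translation direction."""
    N = len(pts)
    C = np.zeros((2 * N, N + 3))
    for i, (x, y) in enumerate(pts):
        C[2 * i:2 * i + 2, i] = A(x, y) @ T    # inverse-depth column for point i
        C[2 * i:2 * i + 2, N:] = B(x, y)       # rotation columns
    r = v - C @ np.linalg.lstsq(C, v, rcond=None)[0]
    return float(r @ r)

# Synthetic scene: random image points, depths, and rigid motion.
rng = np.random.default_rng(0)
pts = rng.uniform(-0.5, 0.5, size=(8, 2))
Z = rng.uniform(2.0, 5.0, size=8)
T_true = np.array([0.6, -0.2, 0.1])
T_true /= np.linalg.norm(T_true)               # translation known only up to scale
W_true = np.array([0.02, -0.01, 0.03])

v = np.concatenate([A(x, y) @ T_true / z + B(x, y) @ W_true
                    for (x, y), z in zip(pts, Z)])

# Step 1: recover translation by minimizing the residual over candidate
# directions (a coarse sampling of the unit sphere, true direction included
# here so the sketch stays exact).
cands = rng.normal(size=(500, 3))
cands /= np.linalg.norm(cands, axis=1, keepdims=True)
cands = np.vstack([cands, T_true])
T_hat = min(cands, key=lambda T: residual(T, pts, v))

# Steps 2-3: with T fixed, rotation and inverse depths are one linear fit.
N = len(pts)
C = np.zeros((2 * N, N + 3))
for i, (x, y) in enumerate(pts):
    C[2 * i:2 * i + 2, i] = A(x, y) @ T_hat
    C[2 * i:2 * i + 2, N:] = B(x, y)
q = np.linalg.lstsq(C, v, rcond=None)[0]
invZ_hat, W_hat = q[:N], q[N:]
```

Because `A(p_i) @ T` and `A(p_i) @ (-T)` span the same column space, the residual cannot distinguish T from -T; the translation is recovered only up to sign and scale, which is inherent to monocular flow.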