Two objects of the same size at different distances subtend
different visual angles.
Two objects at different distances that subtend the same visual angle have
different physical sizes.
With only one eye open, you still see with a sense of depth, but
there is inherent ambiguity between size and distance. What cues does
the visual system
use? In class we reviewed a large set of such cues: relative size,
occlusion, cast shadows, shading, dynamic shadows (shadow motion),
aerial perspective, linear perspective, texture perspective, and height
within the image. Most of these are based on the concept illustrated
above: the size of the retinal image of an object is proportional to
the object's size, but inversely proportional to the distance to the
object.
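The proportionality described above can be checked numerically. This is a minimal sketch, assuming an object viewed head-on and centered on the line of sight; the function name and the example sizes and distances are illustrative, not from the lecture.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of the given size
    viewed head-on at the given distance (same units for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Same size, different distances: different visual angles.
near = visual_angle_deg(size=1.0, distance=10.0)
far = visual_angle_deg(size=1.0, distance=20.0)

# Different sizes at distances in the same ratio: same visual angle.
small_near = visual_angle_deg(size=1.0, distance=10.0)
large_far = visual_angle_deg(size=2.0, distance=20.0)

print(f"same size:  {near:.2f} deg vs {far:.2f} deg")
print(f"same angle: {small_near:.2f} deg vs {large_far:.2f} deg")
```

For small angles, doubling the distance roughly halves the visual angle, which is why retinal size alone cannot separate size from distance.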
Texture made up of little circular texture elements on a curved surface
Texture provides 3 cues about shape/distance:
Shading: crater vs mound
Brightness of a surface depends on its orientation with respect to the light source. The visual system assumes that the light comes from above. Brighter patches appear to be tilted up facing the light.
Shading and contour
The interpretation of shape from shading interacts with the interpretation of shape from contours. These two images have the same shading, but different bounding contours, and you see different shapes.
Railroad tracks
Linear perspective is another monocular depth cue. The distance
between the
rails is constant in the 3D scene but gets smaller and smaller in the
image.
This is a cue for distance. The visual system uses this to compare the
sizes of objects.
The two lines are the same length but the one on top appears bigger
because it
is seen as being further away and the visual system is compensating for
the
perspective. This compensation for distance in interpreting size is
known as "size constancy".
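Size constancy can be thought of as running the visual-angle relation backwards: given a retinal size and an assumed distance, infer a physical size. The sketch below is only an illustration of that idea, not a model of what the visual system computes; the function name and the numbers are assumptions.

```python
import math

def inferred_size(visual_angle_deg, assumed_distance):
    """Physical size that would produce the given visual angle
    if the object were at the assumed distance."""
    return 2 * assumed_distance * math.tan(math.radians(visual_angle_deg) / 2)

# Two lines with the same retinal size (same visual angle).
angle = 2.0  # degrees, illustrative

# The lower line is interpreted as near, the upper line as far away.
lower_line = inferred_size(angle, assumed_distance=5.0)
upper_line = inferred_size(angle, assumed_distance=10.0)

print(f"lower: {lower_line:.3f}, upper: {upper_line:.3f}")
```

With the same visual angle, the line assumed to be twice as far away is inferred to be twice as long, which matches the direction of the illusion.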
Ames Room
The man on the left is actually almost twice as far away from the observer as the man on the right. However, when the room is viewed through the peephole, the actual distances cannot be seen. Since you perceive the two people to be at the same distance from you, the one who has the larger visual angle appears larger.
Size constancy: Hallway
The visual system compensates for perspective in making judgements about size. It is striking that we are so unaware of this. We have a tendency to interpret shape and size in 3D - often unaware of 2D size.
Shepard Tables
A central premise of object perception is that we see objects in a
three-dimensional world. If there is an opportunity to interpret a
drawing or an image as a
three-dimensional object, we do. The two table tops above have
precisely the same
two-dimensional shape on the page, except for a rigid rotation. Nobody
believes
this when they first look at the illusion. The illusion shows that we
don't
see the two-dimensional shape drawn on the page, but instead we see the
three-dimensional
shape of the object in space.
When we fixate an object, we typically accommodate to the object,
i.e., change the power of the lens in our eyes to bring that object
into focus. The accommodative effort is a weak cue to depth. Once we've
accommodated to that distance, objects that are much closer or further
from us than that distance are out of focus on our retina. Thus, blur
is a cue that objects are at a different distance than the
accommodative distance, although the cue is ambiguous as to whether the
objects are closer or more distant. Even weaker still as depth cues
(although theoretically useful) are the image distortions resulting
from astigmatism (the cornea isn't a perfect sphere) and chromatic
aberration (when yellow light is in focus, blue light is out of focus,
from a given distance to the object).
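The sign ambiguity of blur can be made concrete with a small calculation. The sketch below is an illustration under a thin-lens, small-angle approximation, not something taken from the lecture: defocus is measured in diopters (a difference of inverse distances in meters), and the angular diameter of the blur circle is approximated as pupil diameter times defocus. The 4 mm pupil and the function names are assumptions.

```python
import math

def defocus_diopters(object_distance_m, focus_distance_m):
    """Signed defocus in diopters. Positive means the object is nearer
    than the distance the eye is accommodated to."""
    return 1.0 / object_distance_m - 1.0 / focus_distance_m

def blur_circle_deg(object_distance_m, focus_distance_m, pupil_diameter_m=0.004):
    """Approximate angular diameter of the blur circle, in degrees.
    Small-angle approximation: blur (radians) = pupil diameter * |defocus|.
    The 4 mm pupil is an assumed, illustrative value."""
    defocus = abs(defocus_diopters(object_distance_m, focus_distance_m))
    return math.degrees(pupil_diameter_m * defocus)

# Eye accommodated to 1 m. An object at 0.5 m is 1 diopter too near;
# a very distant object is (almost exactly) 1 diopter too far.
near_blur = blur_circle_deg(0.5, 1.0)
far_blur = blur_circle_deg(1e9, 1.0)

print(f"near object blur: {near_blur:.3f} deg")
print(f"far object blur:  {far_blur:.3f} deg")
```

The two objects produce essentially the same amount of blur even though one is nearer and one is farther than the accommodative distance, so the amount of blur by itself does not say which.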
The next set of cues involves movement on the retinal image. There
are two cases. If the observer moves through a stationary environment,
the resulting movement is called motion
parallax (discussed briefly in the previous chapter). Objects
will move at different speeds on your retina (for a particular speed of
observer movement and choice of fixation object) depending on their
distance from the observer. In the second case, the observer is
stationary but the object is in motion (e.g., it is rotating and/or
moving in a straight path relative to the observer). The resulting
retinal velocities will depend on the relative distance of each object
feature from the observer resulting in the kinetic depth effect (the
calculation of depth here is called structure-from-motion).
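The dependence of retinal speed on distance in motion parallax can be sketched with a simple approximation. Assume the observer translates sideways at a constant speed while keeping the eyes on a fixation point straight ahead, and consider objects near the same line of sight at the moment that line is perpendicular to the direction of travel. Under those assumptions (which are mine, not the lecture's), the angular velocity of an object relative to the fixated point is approximately speed * (1/distance - 1/fixation distance).

```python
def relative_retinal_velocity(observer_speed, object_distance, fixation_distance):
    """Approximate angular velocity (rad/s) of an object relative to the
    fixated point, for sideways observer motion.
    Positive: the image moves against the observer's direction of travel.
    Negative: the image moves with the observer's direction of travel."""
    return observer_speed * (1.0 / object_distance - 1.0 / fixation_distance)

speed = 1.0      # m/s, illustrative
fixation = 2.0   # m, illustrative

for distance in (1.0, 2.0, 4.0):
    v = relative_retinal_velocity(speed, distance, fixation)
    print(f"object at {distance} m: {v:+.2f} rad/s")
```

Objects nearer than fixation and objects farther than fixation move in opposite directions on the retina, and the speed grows with the separation in depth from the fixation point.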
Close one eye, and hold up your two index fingers, one fairly close
to your face and one as far as you can reach. Fixate the more distant
finger, and alternately view the scene with your left eye and then your
right. As you can see, the distance between the two fingers is
different in your left eye than in your right eye; their relative
positions in the two retinae are disparate.
Illustration of binocular disparity
Binocular disparity is defined as the difference in the
location of a feature between the right eye's and left eye's image. The
amount of disparity depends on the depth (i.e., the difference between
the distance to the object and the distance to the point of fixation),
and hence it
is a cue that the visual system uses to infer depth. Wheatstone (1838)
was the first to figure this out. Before that, people were confused:
they thought that
having two eyes posed a problem because they couldn't figure out how
you could
see only one image when viewing the world with two eyes. Wheatstone
correctly
pointed out the advantage of having two eyes to see objects in 3D
depth. However, the disparity also depends on the distance to the
fixation point, so disparities must be further interpreted using
estimates of the fixation distance.
Horopter, crossed- and uncrossed disparity
Horopter: imaginary 3D surface in the room in front of you
that includes the object you are fixating on and all other points in 3D
space that
project to corresponding positions in the two retinae. The
above picture is very misleading, however, because the geometric
horopter (the set of points with zero disparity) is a circle that
includes the fixation point and the optical centers of the two eyes (in
the above picture, the labeled horopter should be much more curved, and
curve back to pass through the two lenses).
Uncrossed disparity: An object farther away from you than the
horopter
has uncrossed disparities. You'd have to uncross (diverge) your eyes to
fixate
on it. It lies further to the right from the right eye's viewpoint than
from the left eye's viewpoint.
Crossed disparity: An object closer than the horopter has
crossed disparities. You'd have to cross (converge) your eyes to fixate
on it. It is further to the left
from the right eye's perspective.
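The relation between depth, fixation distance, and the sign of disparity can be sketched for the simple case of objects on the midline, straight ahead of the observer. Here disparity is computed as the difference between the vergence angle needed to fixate the object and the vergence angle at the current fixation point. The 6.5 cm interocular distance, the function names, and the sign convention (positive for crossed) are assumptions made for illustration.

```python
import math

INTEROCULAR_M = 0.065  # assumed typical distance between the two eyes

def vergence_deg(distance_m):
    """Angle between the two lines of sight when both eyes point at a
    midline target at the given distance."""
    return math.degrees(2 * math.atan(INTEROCULAR_M / (2 * distance_m)))

def disparity_deg(object_m, fixation_m):
    """Disparity of a midline object relative to the fixation point.
    Assumed sign convention: positive = crossed, negative = uncrossed."""
    return vergence_deg(object_m) - vergence_deg(fixation_m)

def disparity_type(object_m, fixation_m):
    d = disparity_deg(object_m, fixation_m)
    if d > 0:
        return "crossed"
    if d < 0:
        return "uncrossed"
    return "zero (on the horopter)"

print(disparity_type(0.4, 0.5))  # object nearer than fixation
print(disparity_type(0.6, 0.5))  # object farther than fixation

# The same 10 cm depth step at two different fixation distances.
print(f"{disparity_deg(0.4, 0.5):.3f} deg when fixating at 0.5 m")
print(f"{disparity_deg(1.9, 2.0):.3f} deg when fixating at 2.0 m")
```

The last two lines illustrate why disparity has to be interpreted together with an estimate of fixation distance: the same physical depth difference produces a much smaller disparity when the fixation point is farther away.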
Wheatstone stereoscope
Stereoscope: One way to view stereo image pairs is to use a mirror stereoscope. If you put your face in front of a pair of angled mirrors, and put two slightly different pictures off to the sides, your left eye will see the left picture (E') and your right eye will view the right-hand picture (E).
Stereogram: A pair of images (such as E/E' above) that are viewed using a stereoscope (or a red-green anaglyph). The two images in a stereogram are slightly different, with features in one image shifted to slightly different positions in the other image. The shifts mimic differences which ordinarily would exist between the views of genuine 3D objects.
Red-green anaglyph and stereo glasses
There are lots of ways to make and view stereograms. The basic concept is to present slightly different images to the two eyes. One way is to superimpose two half images, one in red and one in green. Viewed through red-green glasses, one eye sees the red image and the other eye sees the green image.
Stereograms have been part of popular culture in each generation since Wheatstone. Brewster stereoscopes (different design, but same in concept) were popular around 1900 with photographed stereo pairs. 3D movies were viewed with red-green anaglyph glasses in the 1950s. Current 3D movies are usually viewed with polarized glasses instead of red-green so the movies can be in color. Another recent technique is the "magic eye" autostereogram.
Random-dot stereogram: The random-dot stereogram was invented
by Bela Julesz, a perceptual psychologist who was very influential over
the past 30 years. In the example below, with anaglyph glasses you
would see a square-shaped surface floating in depth in front of a
background. Both the foreground square and the background have little
dots painted on them in random locations.
Random dot stereogram example
This has important consequences. It indicates that you
How to make a random-dot stereogram
To construct a random-dot stereogram, you first place a bunch of
dots randomly in an image. Then make two copies of it. In one
copy shift a central square region to the left and in
the other copy shift the same central square region to the right. This
leaves
holes in each of the images (left over from where the square shifted
from).
Fill the holes with new random dots. Why do you see it in
3D?
The shift mimics differences which ordinarily exist between the views
of
genuine 3D objects. The extra dots (X and Y above) correspond to those
parts of the background that one eye can see, but which are occluded
from the view of the other eye by the foreground square.
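The construction steps above can be sketched in a few lines. This is a minimal illustration using plain lists of 0s and 1s as images; the image size, square size, shift, and dot density are arbitrary choices, and the function name is hypothetical. The square is shifted right in the left eye's image and left in the right eye's image, which corresponds to crossed disparity (a square floating in front of the background).

```python
import random

def make_random_dot_stereogram(size=40, square=16, shift=2, density=0.5, seed=0):
    """Return (left, right) half-images as lists of rows of 0/1 dots."""
    rng = random.Random(seed)

    def dot():
        return 1 if rng.random() < density else 0

    # Step 1: place dots randomly in an image.
    base = [[dot() for _ in range(size)] for _ in range(size)]
    # Step 2: make two copies of it.
    left = [row[:] for row in base]
    right = [row[:] for row in base]

    lo = (size - square) // 2
    hi = lo + square
    for y in range(lo, hi):
        patch = base[y][lo:hi]
        # Step 3: shift the central square in opposite directions.
        left[y][lo + shift:hi + shift] = patch
        right[y][lo - shift:hi - shift] = patch
        # Step 4: fill the holes left behind with new random dots.
        for x in range(lo, lo + shift):
            left[y][x] = dot()
        for x in range(hi - shift, hi):
            right[y][x] = dot()
    return left, right

left, right = make_random_dot_stereogram()
for l_row, r_row in zip(left[18:22], right[18:22]):
    print("".join(".#"[v] for v in l_row), " ", "".join(".#"[v] for v in r_row))
```

Neither half-image contains a visible square on its own. The square exists only in the relationship between the two images: its dots are displaced by twice the shift between the left and right half-images, while the background dots are not displaced at all.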
How does the visual system see depth in a random-dot stereogram? One hypothesis is that the visual system matches up features of similar shape, size, contrast, etc. to estimate disparity. But there can be lots of potential matches. In principle, each dot present in one row of one half-image could have a large number of matches in the other half-image.
The problem of resolving this ambiguity is known as the problem of global stereopsis because the brain must find the correct overall (global) set of matches. It can't just try to find a mate for each feature independently. Global stereopsis is not just an issue for random-dot stereograms. Natural scenes (e.g., tree with leaves, carpet, etc.) have similar features. The visual system "solves" the global stereopsis problem by using additional constraints. For example, nearby points in the image are usually at nearby positions in depth, hence have nearly the same disparity.
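The matching ambiguity, and the way a neighborhood constraint removes it, can be illustrated on a single row of random dots. The sketch below is a toy example, not a model of how the visual system solves the problem: the row length, disparity, search range, and window size are arbitrary assumptions. Matching one dot at a time leaves several candidate disparities, whereas requiring a whole window of neighboring dots to share one disparity singles out the correct match.

```python
import random

rng = random.Random(1)
n, true_disparity, max_disparity, half_window = 200, 3, 7, 15

# One row of the left half-image, and the same row displaced by
# true_disparity positions in the right half-image.
left_row = [rng.randint(0, 1) for _ in range(n)]
right_row = left_row[true_disparity:] + [rng.randint(0, 1) for _ in range(true_disparity)]

def local_candidates(x):
    """Disparities at which the single dot at x finds an identical dot."""
    return [d for d in range(max_disparity + 1) if right_row[x - d] == left_row[x]]

def windowed_disparity(x):
    """Disparity that maximizes agreement over a window of neighbors,
    i.e. assuming nearby points share nearly the same disparity."""
    def agreement(d):
        return sum(left_row[x + k] == right_row[x + k - d]
                   for k in range(-half_window, half_window + 1))
    return max(range(max_disparity + 1), key=agreement)

positions = range(max_disparity + half_window, n - half_window - true_disparity)
mean_candidates = sum(len(local_candidates(x)) for x in positions) / len(positions)
recovered = [windowed_disparity(x) for x in positions]

print(f"average candidate matches per dot: {mean_candidates:.1f}")
print(f"windowed estimates all correct: {all(d == true_disparity for d in recovered)}")
```

Window-based matching is only one simple way to encode the smoothness constraint. It breaks down near depth edges, where neighboring points do not share a disparity.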
Autostereogram: The autostereogram is also known as a "magic eye" stimulus. The trick is to display slightly different images to the two eyes. The autostereogram works by having repetitive patterns. To see depth in an autostereogram, you need to either cross or diverge your eyes so that they fixate separately on two different repeats of a repetitive pattern. In this way, you effectively get two different images to the two eyes. A simple example is the wallpaper illusion. If you view vertically striped wallpaper and fixate one eye on one stripe and the other eye on another stripe, the stripes appear to pop out in depth in front of the wall. This is hard to do because you need to fixate on a point that is effectively in front of the picture while focusing/accommodating on the picture itself. Note that the depth you perceive when you cross-fuse an autostereogram is the opposite of the depth you perceive when you diverge-fuse it.
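The geometry of the wallpaper illusion can be worked out with similar triangles. If the two eyes fixate stripes that are one period apart, their lines of sight intersect either in front of the wall (cross-fusing) or behind it (diverge-fusing). The sketch below computes the distance of that intersection under simplifying assumptions of my own: the eyes look straight at a flat wall, and the interocular distance is 6.5 cm. The function name and the example numbers are illustrative.

```python
def fused_plane_distance(wall_distance, stripe_period, interocular=0.065, crossed=True):
    """Distance at which the two lines of sight intersect when each eye
    fixates a different stripe, one period apart. All lengths in meters.
    crossed=True: the left eye fixates the stripe on the right.
    crossed=False: the left eye fixates the stripe on the left."""
    if not crossed and stripe_period >= interocular:
        raise ValueError("diverge-fusing needs a period smaller than the interocular distance")
    offset = stripe_period if crossed else -stripe_period
    return wall_distance * interocular / (interocular + offset)

wall = 1.0      # m, illustrative
period = 0.02   # m, illustrative

in_front = fused_plane_distance(wall, period, crossed=True)
behind = fused_plane_distance(wall, period, crossed=False)
print(f"cross-fused:   stripes appear at about {in_front:.2f} m")
print(f"diverge-fused: stripes appear at about {behind:.2f} m")
```

The two ways of fusing place the pattern on opposite sides of the wall, consistent with the depth reversal between cross-fusing and diverge-fusing.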
Fusion: You have two eyes, and hence two visual fields. A big question in the early 19th century (before Wheatstone): we have two eyes, so why don't we always see two views of the world? Answer: the two images are combined in the brain to yield a single unified perceptual experience.
Panum's fusional area is the range of disparities, or equivalently the range of depths in 3D space on either side of the horopter, over which the visual system can successfully fuse the two views. If disparity is small enough, within Panum's fusional area, then the visual system succeeds in fusing the two views. If disparity is too large, then the neurons in the brain cannot cope with it to create single vision and you get either diplopia, suppression, or binocular rivalry.
Diplopia (double vision): Look at a distant object with both eyes open. While fixating that object, put your index finger about 6 inches in front of your face. You will see two index fingers (one from the left eye's image and one from the right).
Suppression: This is what normally happens when the retinal disparity is too big (outside of Panum's fusional area). One eye's view dominates. That one is perceived. And the other eye's view is suppressed from awareness.
Binocular rivalry is a phenomenon we experience when the two eyes' views are very different from one another. One eye's view dominates for several seconds and is then replaced by that of the other eye. For example, if a horizontal grating is presented to one eye and a vertical grating to the other eye, one might first see the horizontal grating for a few seconds, then a mixture, then the vertical grating for a few seconds, etc. The phenomenon of binocular rivalry is of particular interest in studying consciousness/visual awareness because the physical stimuli (the two gratings) do not change, yet the conscious percept changes dramatically over time. Moreover, we have no conscious control over the percept; you cannot by force of will cause the percept to switch from one to the other.
There are two ways to have single vision: (1) small disparity yields fusion and stereopsis, (2) large disparity often causes one eye's view to be suppressed. Binocular rivalry is a special case of suppression in which the suppression switches back and forth between the two views.
Note that you can have stereopsis in part of the visual field, diplopia in another part of the visual field and rivalry/suppression in yet another part of the visual field, all simultaneously.
Stereoblindness: 10% of people are stereoblind. Some are totally stereoblind; some are blind only to either crossed or uncrossed disparities. Some stereoblindness is caused by strabismus (wandering eye). If strabismus is not treated/fixed at a very early age (infancy), binocular vision never develops properly. Some people with strabismus end up with amblyopia (sometimes called lazy eye). Amblyopia is a cortical blindness: it is a general term for a visual deficit that has nothing to do with the optics or structure of the eye and retina. In amblyopia, the brain basically ignores inputs from one eye. Other people with untreated strabismus end up as alternate fixators who can see with either eye, but never use them both at the same time. That is, they first look at you with their left eye (while the right eye is diverged), then switch and look at you with their right eye (while the left eye is diverged). In either case, there is no binocular vision and no stereopsis.
Stereovision in the brain: If you record with a
micro-electrode from a V1 neuron
while an animal views oriented lines presented separately to the two
eyes and vary the disparity, some neurons are selective for particular
disparities.
Disparity tuning of V1 neuron
This neuron does not respond at all when a line is shown to one eye at a time. To get a response, the line must be presented simultaneously to both eyes, it must have the correct orientation, direction of motion and the correct binocular disparity, in this case a disparity of about 1/2 deg of visual angle.
Distribution of disparity tunings
Binocular rivalry in the brain: Logothetis and colleagues recorded from neurons in the inferior temporal lobe, an area of the brain believed to be involved in recognition (see lecture notes on recognition). Neurons in IT are very selective for stimulus patterns. Monkeys were trained to report their percepts during rivalry. At the same time, Logothetis recorded the responses of IT neurons. The neurons tracked the alternations in the monkey's reported percept during rivalry even though the physical stimulus never changed.