Basilar membrane not homogeneous
Schematic diagram of the uncurled cochlea. The basilar membrane is not a homogeneous piece of tissue. It varies in thickness and elasticity as it curls from the oval window out to the helicotrema. The effect of this is that different parts of the basilar membrane respond more strongly to some sounds than others. For sinusoidal (pure tone) sound, each point on the basilar membrane oscillates up and down at the same frequency as the sound. What differs from point to point is the size of the oscillation.
"Snapshot" of basilar membrane
Displacement of the basilar membrane at one instant in time. Some points are displaced up and others down.
Traveling wave over time
Over time, the different points on the membrane move up and down. The entire motion that occurs on the basilar membrane in response to a sound stimulus is called a traveling wave. Each point moves up and down sinusoidally; different points move up and down slightly delayed (out of phase) with one another, yielding the traveling wave. The wave begins at the oval window, rises to a crescendo somewhere along the basilar membrane, and finally falls off with the energy being absorbed around the helicotrema. The dotted lines indicate the envelope of the membrane motion, that is, the maximum excursion of each bit of membrane throughout the duration of the traveling wave.
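The description above can be turned into a rough Python sketch: every point oscillates sinusoidally at the tone frequency, with an amplitude set by an envelope and a lag that grows with distance from the oval window. The Gaussian envelope shape, the peak location, the width, and the delay value are all assumptions chosen for illustration; none of them are measurements.

```python
import math

def envelope(x, peak_x=0.4, width=0.15):
    """Maximum excursion at position x (0 = oval window, 1 = helicotrema).
    The Gaussian shape and default values are illustrative assumptions."""
    return math.exp(-((x - peak_x) / width) ** 2)

def displacement(x, t, freq_hz=1000.0, delay_s=0.002):
    """Membrane displacement at position x and time t (seconds).
    delay_s is a hypothetical travel time over the full membrane length."""
    lag = delay_s * x  # points farther along the membrane respond later
    return envelope(x) * math.sin(2 * math.pi * freq_hz * (t - lag))

# A "snapshot": displacement of 11 points along the membrane at one instant.
snapshot = [displacement(i / 10, t=0.001) for i in range(11)]
```

Evaluating displacement at a fixed x over time gives a sinusoid at the tone frequency; evaluating it at a fixed t over x gives a snapshot in which some points are up and others are down.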
Envelope for several frequencies
Each point along the basilar membrane oscillates a different amount, depending on the frequency of the sound. Points near the oval window, at the start, oscillate the largest amount in response to high frequency tones. Points near the helicotrema oscillate by the largest amount when the tone is a low frequency tone.
Cochlear tuning curve
This graph shows the location of peak excursion for different tone frequencies. These measurements were made on post mortem human ears. This simply summarizes what I've already said. The location of the biggest oscillation depends in a systematic way on the frequency of the tone.
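One commonly used summary of this kind of place-frequency relationship is Greenwood's function. The sketch below uses it only as an illustration: the formula and the human constants (165.4, 2.1, 0.88) are not taken from these notes, and the function names are hypothetical.

```python
import math

# Greenwood's constants for the human cochlea (assumed here; not from these notes).
A, a, k = 165.4, 2.1, 0.88

def best_frequency(x_from_apex):
    """Frequency (Hz) giving peak excursion at x, where x is the fraction of
    membrane length measured from the helicotrema (0) to the oval window (1)."""
    return A * (10 ** (a * x_from_apex) - k)

def place_of_peak(freq_hz):
    """Inverse map: fractional position (from the helicotrema) of the peak."""
    return math.log10(freq_hz / A + k) / a
```

Under these assumptions, low frequencies map to positions near the helicotrema and high frequencies to positions near the oval window, which is the systematic dependence the graph shows.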
The reason for the appearance of the traveling wave along the basilar membrane is that the stimulus begins with a push at the oval window, which forces the part of the cochlea nearest the oval window to begin oscillating, and then it takes time for that oscillation to propagate down the length of the cochlea. The reason that the traveling wave peaks at one location is that the different points of the basilar membrane oscillate by different amounts - different amplitudes - in response to different tone frequencies.
Hair cell and nerve terminals
Each auditory nerve fiber is connected to a small number of hair cells, near one another, on the basilar membrane. The nerve fiber's response is governed, therefore, by the motion of a small region of the basilar membrane. And the basilar membrane in any small region undergoes a large motion only in response to a limited range of frequencies.
Auditory nerve freq tuning
This graph plots the sensitivity of a single fiber of the 8th nerve
to pure tones of different frequencies. The horizontal axis plots the frequency
of the input stimulus. The vertical axis plots the threshold stimulus intensity,
the minimum sound pressure level (in dB) needed to evoke a response. Notice
that the neuron is mainly responsive over a narrow range of frequencies,
near 5000 Hz, that is called the neuron's characteristic frequency.
This nerve fiber must be connected to the section of the basilar membrane
near the oval window because it is tuned to high frequencies. A neuron
tuned to low frequencies will be attached near the helicotrema.
Freq tunings of 3 auditory nerve fibers
Different 8th nerve fibers attach to different portions of the basilar membrane. The information about different frequency regions on the basilar membrane is distributed among many different neurons. No individual neuron is responsible for all of the auditory information.
Summary: The representation of information on the basilar membrane and in the 8th nerve is very different from the representation at the tympanic membrane. The tympanic membrane displaces to a 100 Hz tone, a 1000 Hz tone, and to a 10,000 Hz tone. It responds to all auditory stimuli, come what may, and faithfully reproduces the changing air pressure by a displacement. By watching any part of the tympanic membrane, we can discriminate between these stimuli.
By the time we have reached the cochlea and beyond, the physical signal is no longer represented by a single mechanism. Rather, we now have forty thousand mechanisms - the 8th nerve fibers coming from the cochlea to the brain - each of which encodes a different portion of the stimulus. And each component is deaf to most of the range of auditory frequencies. This decomposition of the response into different neural channels is very similar to what we saw with the swinging pendulum. There are lots of motions that cause no response whatsoever in the long pendulum. From its point of view, it is as though nothing is happening.
Ohm's Acoustic Law: When two tuning forks are struck at the same time, and these two tuning forks are of different sizes and therefore produce different frequencies, the ear is capable of separating out the sounds from the two tuning forks. Our ability to hear the pitches of two different frequencies presented simultaneously is called Ohm's Acoustic Law.
Audio demo: CD track 1
You hear a complex sound consisting of 20 different pure tones. When the relative amplitudes of all 20 tones remain steady, we tend to hear holistically, that is, we focus on the whole sound and pay little or no attention to the components. When we listen analytically, however, we hear the different components separately. With practice, you can learn to listen analytically to separate out the tones, as Ohm pointed out 100 years ago. This demonstration helps you hear each of the pure-tone components by turning the individual tones off and on one at a time. What is going on in the ear that allows you to hear these different pitches?
Place Code Theory: Helmholtz's theory of pitch is based on observations of the anatomy of the ear. Most important theory of hearing for 100 years.
Place code
Sensation of a low frequency pitch derives exclusively from the motion of a particular group of hair cells, while the sensation of a high pitch derives from the motion of a different group of hair cells. Each sensation is perfectly identified with the action of an anatomical location along the basilar membrane. The place code theory is given that name because it identifies each pitch with a particular place along the basilar membrane. Assumes that any excitation of that particular place gives rise to a specific pitch.
Temporal Code Theory: According to temporal code theory, the location of activity along the basilar membrane is irrelevant.
Temporal code (Goldstein Fig 11.40)
Rather, pitch is coded by the firing rates of nerve cells in the 8th nerve. In principle, this makes a lot of sense. A low frequency tone causes slow waves of motion in the basilar membrane and that might give rise to slow firing rates in the 8th nerve. A high frequency tone causes fast waves of motion in the basilar membrane and that might give rise to fast firing rates.
However, there's a problem with temporal code. The ear is sensitive to frequencies from about 20 Hz up to 20,000 Hz. But a single nerve cell cannot signal at a rate of 20,000 Hz. Therefore, a temporal code account of detecting the pitch of a 20,000 Hz tone seems impossible, because no nerve cell can conduct that many impulses per second. And, in fact, Hallowell Davis, in the 1930s, showed that the maximum response rate of auditory neurons in the cat is about 1000 action potentials per second.
Cochlear Microphonic: Discovery that cast doubt on Helmholtz's place code and supports temporal code. Discovered by Wever. The cochlear microphonic is a small electrical signal that can be measured by an electrode placed near the hair cells of the cochlea. We now know that the cochlear microphonic arises from the sum of electrical potentials in the hair cells of the cochlea. Mimics the form of the sound pressure waves that arrive at the ear. Low frequency tones result in low frequency modulations of the cochlear microphonic electrical signal. High freq tones result in high freq modulations of the electrical signal. Combination (sum) of high plus low freq tones results in sum of high plus low freq modulations in cochlear microphonic electrical signal. In fact, the cochlear microphonic is a shift-invariant linear system that obeys the scalar, additivity, and shift-invariance rules.
Volley Principle
Volley Principle: Reconciles the fact that the cochlear microphonic mimics the sound pressure waves with the implausibility of the temporal code. Wever suggested that while one neuron alone could not carry the temporal code for a 20,000 Hz tone, 20 neurons, with staggered firing rates, could. Each neuron would respond on average to every 20th cycle of the pure tone, and the pooled neural responses would jointly contain the information that a 20,000 Hz tone was being presented.
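The arithmetic behind the volley principle can be checked with a small Python sketch. It assumes an idealized strict rotation, in which each neuron fires on exactly every 20th cycle; Wever's proposal only requires this on average, so the rotation and the function name here are simplifications for illustration.

```python
def volley_trains(n_cycles, n_neurons=20):
    """Assign each tone cycle to one neuron in rotation.
    Returns, per neuron, the list of cycle indices on which it fires."""
    trains = [[] for _ in range(n_neurons)]
    for cycle in range(n_cycles):
        trains[cycle % n_neurons].append(cycle)
    return trains

tone_hz = 20_000
trains = volley_trains(n_cycles=tone_hz)            # one second of the tone
per_neuron_rate = [len(train) for train in trains]  # spikes per second, per neuron
pooled = sorted(c for train in trains for c in train)
pooled_rate = len(pooled)                           # spikes per second, all neurons
```

In this idealization each neuron fires 1000 times per second, within the maximum rate Davis measured, while the pooled spike train contains one spike on every cycle of the 20,000 Hz tone.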
Phase Locking
Phase Locking: Empirical observation that supports the volley principle. When 8th nerve neurons fire action potentials, they tend to respond at times corresponding to a peak in the sound pressure waveform, i.e., when the basilar membrane moves up. The result of this is that there are a bunch of neurons firing near the peak of each and every cycle of a pure tone. No individual neuron can respond to every cycle of a sound signal, so there must be different neurons firing on successive cycles. Nonetheless, when they do respond they tend to fire together.
Why is phase locking important? What you need (for temporal code theory, and to explain the cochlear microphonic) is for the neural activity to look just like the sound pressure waveform. The response (across the whole population of hair cells/8th nerve fibers) must follow each rise and fall of sound pressure level in the sound signal.
Wever's temporal code theory (based on the volley principle) was a clear rejection of Helmholtz's Place Code Theory, and it was backed up by compelling data (cochlear microphonic and phase locking). Wever said that the particular neuron that was signalling was not important, but instead, the way in which the neurons signalled together contained the information as to the pitch of the sound.
How might you test these two alternative hypotheses? Discussion...
White's Cochlear Implants: Professor John White of the Electrical Engineering Department at Stanford did some experiments that directly addressed these 2 alternative hypotheses. The ultimate goal of his research was to produce cochlear implants to make up for some kinds of hearing loss. There are many diseases, including one fairly common one called Ménière's (rhymes with "bunny ears") syndrome, that can poison and destroy the hair cells in the inner ear, while leaving the auditory nerve and the rest of the auditory system intact. What we would like to do for these patients is to send a signal directly to the auditory nerve that will effectively substitute for the signal that the auditory nerve would be receiving were the system fully intact.
Overview of pacemaker
It is in fact possible to intercede in the nervous system by injecting signals that are just like the ones the nervous system itself might have injected. The classic example of such a success is the cardiac pacemaker. Basically, the way the pacemaker works is to sense if the appropriate nerve in the heart is putting out the electrical command for a contraction of the heart muscle. If the natural pulse is detected, the pacemaker takes no action since things are working normally. If no signal is detected the pacemaker intercedes and injects an electrical signal into the heart muscles. This signal causes the heart to contract, effectively pacing the rate of heart beat and replacing the natural signal.
White's electrodes
White's early experiments with cochlear implants were designed to test the place and temporal code theories of pitch perception. White implanted four electrodes located at different positions along the basilar membrane. He tested the two theories by delivering different types of electrical stimuli to his observer and asking the observer to estimate the pitch of the signal delivered by the prosthetic device. He varied the signal in two ways. First, he varied which of the four electrodes was used for stimulation. By measuring the dependence of pitch on which electrode was being stimulated he could test the place code theory. Second, White varied the rate of the electrical stimulation, stimulating either with a low frequency series of electrical pulses through one of the electrodes, or with a high frequency series of pulses through the same electrode. By measuring the dependence of pitch on the frequency of electrical stimulation he could test the temporal code theory.
White's data
As it turns out, both mechanisms play a role in pitch perception. As the stimulating frequency is increased, the subject tends to report a higher pitch. This continues over a significant range, up to a maximum of about 300 Hz. Beyond that point, the rate of stimulation on the electrode does not seem to influence the subject's judgement. The perceived pitch also depends on which electrode was doing the stimulating, i.e., the place that is being stimulated is also important information.
In fact, this makes sense. Place coding is weak below 1 kHz because of the broad pattern of oscillation of the basilar membrane at low frequencies (look back at the figure near the beginning of the lecture showing the envelope of basilar membrane motion for low frequencies). Temporal code works best at low frequencies because fibers can phase lock most easily for low frequencies.
Caveat: the patient reported that these electrical stimulations did not sound particularly like tones, but rather they sounded like a noisy kind of buzzing. The buzzing could appear to be at different pitches. But it was, nonetheless, a buzz rather than a clear tone with a distinct pitch.
Construct a sound that is made by adding pure tones with frequencies 400, 800, 1200, and so on. The 400 Hz component is called the base or fundamental frequency of the tone complex, and the other frequencies are called the higher harmonics. Most sound sources (your vocal tract, musical instruments) produce sounds like this. The higher harmonics come along for the ride.
Present the tone complex pictured in (a), then present a pure tone, and ask the observer to set the frequency of the pure tone so that its pitch matches that of the tone complex, an example of a matching experiment. The perceived pitch of this tone complex is very much the same as that of a pure tone at the fundamental frequency (400 Hz in this example).
Now take the same tone complex and remove (subtract out) the 400 Hz component, as pictured in (c). The lowest frequency is now 800 Hz, so you might think that perceived pitch of this new tone complex would match that of an 800 Hz pure tone. Surprisingly, observers still match the complex with a pure tone of 400 Hz.
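One property of the stimulus that survives removal of the fundamental can be checked numerically: the summed waveform still repeats every 1/400 s, even though it contains no 400 Hz component. The Python sketch below verifies this for a complex of 800, 1200, and 1600 Hz. The equal amplitudes, sine phases, sample spacing, and function names are assumptions made for illustration, and the check says nothing about how the auditory system arrives at the pitch.

```python
import math

def tone_complex(t, freqs_hz):
    """Sum of equal-amplitude sine tones at time t (seconds)."""
    return sum(math.sin(2 * math.pi * f * t) for f in freqs_hz)

def repeats_with_period(freqs_hz, period_s, n_samples=200, dt=1e-5, tol=1e-6):
    """True if the waveform at t and at t + period_s agree at every sample."""
    return all(
        abs(tone_complex(k * dt, freqs_hz) - tone_complex(k * dt + period_s, freqs_hz)) < tol
        for k in range(n_samples)
    )

missing_fundamental = [800, 1200, 1600]  # the 400 Hz component removed
same_as_400 = repeats_with_period(missing_fundamental, 1 / 400)
same_as_800 = repeats_with_period(missing_fundamental, 1 / 800)
```

Here same_as_400 is True and same_as_800 is False: a shift of 1/800 s moves the 1200 Hz component by one and a half cycles, so the waveform does not repeat at the period of its lowest component.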
Virtual pitch audio demo: CD track 37
This is a challenge to Helmholtz's place theory because the tone complex does not contain any energy that would stimulate the auditory nerve at the point where a tone of 400 Hz would stimulate the nerve. If pitch is encoded by position alone, then how can these two yield the same pitch? This is also a challenge to Wever's volley theory, because there is no energy (or oscillation) in the tone complex at 400 Hz, i.e., there is no 400 Hz component in the cochlear microphonic. Both the place theory and the temporal code/volley theory play roles in pitch perception. However, neither theory provides a complete explanation of pitch perception. Even though it is a seemingly simple perceptual attribute, pitch is not currently fully understood.
Virtual pitch audio demo: CD track 38
If you shift the tone complex to higher frequencies (e.g., from 400, 800, 1200,... up to 500, 900, 1300,...) it is perceived at a slightly higher pitch. Note that this manipulation is a bit odd in that the tones of the new complex are no longer exact harmonics of any fundamental. The auditory system accepts them as "nearly harmonic" and identifies/assigns a virtual pitch.
Shepard pitch illusion: CD track 52
Roger Shepard and others have taken advantage of residual pitch to produce an auditory illusion that gives the sensation of a sound that continuously changes in pitch, rising or falling forever.
Shepard pitch Fourier spectra
The intensity of each component fits within an amplitude envelope that tapers off at very high and low frequencies. Move all the freq components up over time, but constrained to fit within the amplitude envelope, so that the low frequency tones gradually increase in amplitude and the high frequency tones gradually decrease in amplitude, as they all shift up in frequency. When one tone falls off the top (note that its amplitude has been reduced to zero by then), add a new one down at the bottom (initially with zero amplitude, but gradually increasing).
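The construction can be sketched in Python as a function that returns the frequency and amplitude of each component for a given amount of upward shift. The octave spacing of the components, the raised-cosine envelope, the number of components, and the base frequency are assumptions made for this sketch; the notes specify only that the envelope tapers to zero at both ends.

```python
import math

def shepard_components(shift, n_components=8, base_hz=27.5):
    """Return (frequency_hz, amplitude) pairs for one moment of the illusion.
    shift is how far, in octaves, every component has moved up; it wraps at 1."""
    s = shift % 1.0
    components = []
    for k in range(n_components):
        position = (k + s) / n_components                    # 0 = bottom of envelope, 1 = top
        freq_hz = base_hz * 2 ** (k + s)                     # octave-spaced (assumed)
        amp = 0.5 * (1 - math.cos(2 * math.pi * position))   # zero at both ends
        components.append((freq_hz, amp))
    return components
```

After a full octave of shift the component list is identical to the starting list: the top tone has faded to zero and a new zero-amplitude tone has entered at the bottom, so the same stimulus can be cycled indefinitely.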
Analytic vs. synthetic hearing and virtual pitch: CD track 48
As mentioned above (back when talking about Ohm's Acoustic Law), our auditory system has the ability to listen to complex sounds in different modes. When we listen analytically, we hear the different frequency components separately; when we listen synthetically or holistically, we focus on the whole sound and pay little attention to its components. In this demonstration, a two-tone complex of 800 and 1000 Hz is followed by one of 750 and 1000 Hz. If you listen analytically, you hear the lower frequency tone go down in pitch from 800 to 750. If you listen holistically, you hear the virtual pitch of the missing fundamental go up in pitch from 200 to 250.
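The virtual pitches quoted above can be checked with a short Python sketch. For components that are exact whole-number harmonics, the missing fundamental is the greatest common divisor of the component frequencies. This is only an arithmetic shortcut with a hypothetical function name, not a model of hearing, and it does not apply to the "nearly harmonic" shifted complexes discussed earlier.

```python
from functools import reduce
from math import gcd

def implied_fundamental(freqs_hz):
    """Greatest common divisor of whole-number component frequencies (Hz)."""
    return reduce(gcd, freqs_hz)

first_complex = implied_fundamental([800, 1000])    # 200
second_complex = implied_fundamental([750, 1000])   # 250
```

The lower component falls from 800 to 750 Hz while the implied fundamental rises from 200 to 250 Hz, which is why the two listening modes give opposite impressions of the direction of the pitch change.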