Perception Lecture Notes: Loudness Perception and Critical Bands

Professor David Heeger

What you should know from this lecture

Loudness

The range of sound intensity is enormous.

Note that the scale on the x-axis is in decibels (dB), and recall that 20 dB is a factor of 10 in amplitude, so a rock band (120 dB) is 100,000 times greater in amplitude than rustling leaves (20 dB). Loudness, like pitch, is a perceptual (not a physical) quantity. When we measure the range in psychological units (the sone scale, established by Stevens and others via magnitude estimation), we find that a rock band sounds about 1000 times louder than rustling leaves.
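The decibel arithmetic in this paragraph can be checked with a short sketch. The function names are illustrative and not part of the lecture; the formulas are the standard decibel definitions (20 dB per factor of 10 in amplitude, 10 dB per factor of 10 in intensity).

```python
def amplitude_ratio(db_difference):
    """Ratio of sound pressure amplitudes for a level difference in dB."""
    return 10 ** (db_difference / 20)


def intensity_ratio(db_difference):
    """Ratio of sound intensities (power) for a level difference in dB."""
    return 10 ** (db_difference / 10)


# Levels quoted in the text.
rock_band_db = 120
rustling_leaves_db = 20

difference_db = rock_band_db - rustling_leaves_db  # 100 dB
print("amplitude ratio:", amplitude_ratio(difference_db))
print("intensity ratio:", intensity_ratio(difference_db))
```

The 100 dB difference is a factor of about 100,000 in amplitude but about 10,000,000,000 in intensity, which is why it matters whether a ratio refers to pressure or to power.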

Equal loudness curves

Two sounds that have the same physical sound pressure level (but different frequencies) are often perceived to have different loudnesses. Each of the curves in this graph represents auditory stimuli that sound equally loud (as determined in a loudness matching experiment). The bottom curve is the threshold of hearing; each point along that curve represents the physical amplitude and frequency of a barely audible pure tone.

Firing rate hypothesis. The rate of firing in the auditory nerve might determine perceived loudness. For weak tones, the basilar membrane is displaced little, hair cells are not pushed very far, and there are few spikes in the auditory nerve fibers.

 

The rising (high pressure) phase of each cycle of the sound signal evokes bursts of spikes in a collection of auditory nerve fibers. The amplitude of the sound determines the number of spikes per burst.  Low amplitude sounds evoke few spikes while high amplitude sounds evoke more spikes.  Were this hypothesis correct, we could (using White's cochlear implant apparatus) control the perceived loudness simply by injecting patterns of current into the neuron that cause it to respond at a more rapid firing rate.

Number of neurons hypothesis. More neurons fire to a louder sound. As a traveling wave passes down the basilar membrane, each point of the membrane oscillates at the frequency of the tone. When the sound is weak, displacements are generally quite small and only a small region of the basilar membrane moves sufficiently to evoke any responses. Increase the sound intensity and the membrane is displaced by a larger amount at each point. This is particularly true for the part of the basilar membrane that is most sensitive to the tone we are playing. But there is also a region nearby that did not respond to the weak stimulus, but that does respond if the intensity is increased.

In the auditory nerve, this means that additional nearby auditory nerve fibers (those at position 2 in the figure) will be recruited when the sound intensity is increased.
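The recruitment idea can be illustrated with a toy sketch. Everything numeric here is an assumption made for illustration and is not from the lecture: the bump-shaped excitation profile, its width, the response threshold, the 35 mm membrane length, and the amplitude units.

```python
import math


def excitation(position_mm, peak_mm, amplitude, spread_mm=1.0):
    """Toy membrane displacement: a bump centered on the most sensitive
    place, scaled by the sound amplitude (assumed shape, not measured)."""
    return amplitude * math.exp(-((position_mm - peak_mm) / spread_mm) ** 2)


def count_responding(amplitude, threshold=1.0, peak_mm=15.0):
    """Count sampled positions whose displacement reaches the threshold."""
    positions = [i * 0.1 for i in range(351)]  # 0 to 35 mm in 0.1 mm steps
    return sum(
        1 for x in positions if excitation(x, peak_mm, amplitude) >= threshold
    )


for amplitude in (0.5, 1.5, 5.0, 50.0):
    print(amplitude, count_responding(amplitude))
```

In this sketch a sound that is too weak excites no positions, and each increase in amplitude pushes more of the neighboring positions over threshold, which is the recruitment the number of neurons hypothesis relies on.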

The physics, physiology, and anatomy do not define the perceptual code: Even though we have an understanding of the physics of sound, the response (motions) of the basilar membrane, and the response (firing rates) of the auditory nerve fibers, this information alone cannot tell us what the mechanism of loudness perception is. We are in the same position with respect to loudness perception as we were with respect to the perception of pitch. Both the firing rate hypothesis and the number of neurons hypothesis could plausibly serve as the mechanism for our perception of loudness. One needs to do behavioral/perceptual experiments to determine which hypothesis holds. In fact, we now know that both mechanisms play a role in loudness perception.

Critical Bands

Experiments that tested the firing rate and number of neurons hypotheses also led to an important discovery about how the auditory nerve signals are pooled/combined. I'll just describe the stimuli and summarize the conclusions.

Band-limited noise: Any sound can be decomposed as a sum of pure tone sinusoids. And we can represent the pure tone components of a sound by drawing a graph of its Fourier spectrum. The Fourier spectra of band-limited noise stimuli look like this:

Band-limited noise stimuli have equal energy at all frequencies within some region, and no energy outside of that region. One can describe this stimulus with three values: the center frequency, the bandwidth, and the total energy.

Band-limited noise on the basilar membrane. If we change only the center frequency, we shift the position along the basilar membrane that is excited. If we change only the bandwidth, we change how much of the basilar membrane we are exciting. This allows us to test the number of neurons hypothesis. If we change only the total energy, we change the amount of the displacement at each excited position. This allows us to test the firing rate hypothesis.
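A minimal sketch of such a stimulus, assuming it is approximated by a finite set of equal-amplitude tones with random phases spread evenly across the band. The function names, the number of components, the sample rate, and the use of mean power as a stand-in for "total energy" are all assumptions made for illustration.

```python
import math
import random


def band_limited_noise_spectrum(center_hz, bandwidth_hz, total_power,
                                n_components=50, seed=0):
    """Return (frequency, amplitude, phase) triples for a flat band of tones."""
    rng = random.Random(seed)
    low_hz = center_hz - bandwidth_hz / 2
    step_hz = bandwidth_hz / (n_components - 1)
    # A sinusoid of amplitude A has mean power A**2 / 2, so every component
    # gets an equal share of the requested total power.
    amplitude = math.sqrt(2 * total_power / n_components)
    return [
        (low_hz + i * step_hz, amplitude, rng.uniform(0, 2 * math.pi))
        for i in range(n_components)
    ]


def synthesize(spectrum, sample_rate=16000, duration_s=0.05):
    """Sum the tones into a waveform (a list of samples)."""
    n_samples = round(sample_rate * duration_s)
    return [
        sum(a * math.sin(2 * math.pi * f * n / sample_rate + p)
            for f, a, p in spectrum)
        for n in range(n_samples)
    ]


spectrum = band_limited_noise_spectrum(center_hz=1000, bandwidth_hz=20,
                                       total_power=1.0)
waveform = synthesize(spectrum)
print(len(spectrum), "components,", len(waveform), "samples")
```

Changing only `center_hz` moves the band, changing only `bandwidth_hz` widens it while each component gets weaker, and changing only `total_power` scales every component, matching the three manipulations described above.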

Zwicker's loudness matching experiment. The test stimulus was band-limited noise centered at, let's say, 1000 Hz, with a bandwidth of 20 Hz. He had subjects adjust the intensity of a 1000 Hz pure tone until it sounded as loud as the band-limited noise. Then, he increased the bandwidth of the noise, while decreasing the intensity of each pure tone frequency component so that the total energy was unchanged, and repeated the loudness matching judgment with the new bandwidth.
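The equal-energy constraint can be made concrete with a short sketch. It assumes an idealized flat spectrum, so the energy per hertz is simply the total energy divided by the bandwidth; the function names, the energy units, and the list of bandwidths are illustrative and not taken from Zwicker's procedure.

```python
import math


def spectral_density(total_energy, bandwidth_hz):
    """Energy per Hz for a flat band with the given total energy."""
    return total_energy / bandwidth_hz


def level_change_db(old_bandwidth_hz, new_bandwidth_hz):
    """Change in per-Hz level (dB) when bandwidth changes at fixed total energy."""
    return 10 * math.log10(old_bandwidth_hz / new_bandwidth_hz)


total_energy = 1.0  # arbitrary units (assumption)
for bandwidth_hz in (20, 40, 80, 160, 320):
    density = spectral_density(total_energy, bandwidth_hz)
    change = level_change_db(20, bandwidth_hz)
    print(bandwidth_hz, "Hz:", density, "per Hz,", round(change, 2), "dB re 20 Hz band")
```

Each doubling of the bandwidth lowers the level of every frequency component by about 3 dB, while the product of density and bandwidth (the total energy) stays fixed.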

In the demonstration we listened to in class, you heard the comparison sound eight times, each time followed by a test sound of increasing bandwidth (but equal total energy). When you compared the loudness of the comparison and test sounds, the first few test sounds were as loud as the comparison, and then the perceived loudness of the test sound increased. Thus, there is a range of bandwidths for which the perceived loudnesses are equal, i.e., over which the firing rate of each neuron and the number of neurons trade off against one another perfectly. Then there is a range of bandwidths for which the perceived loudness increases with the increase in the number of neurons (even though the firing rate of each neuron is still decreasing).

Critical bands. From these experiments Zwicker determined that there was a region over which the cochlea adds up the energy that it is receiving. Within this critical region, sounds of equal total energy have equal loudness. As soon as the sounds that we present are spread out over a larger frequency range, however, the sound with the larger bandwidth sounds louder.
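One way to see why this pattern could arise is a toy model, sketched here under strong assumptions that are not part of the lecture: energy is summed within each critical band, each band's sum is compressed by a power law (an exponent of 0.3 is assumed), and the compressed values are added across bands. The number of bands is treated as a continuous quantity for simplicity. This is not Zwicker's actual loudness model.

```python
def toy_loudness(total_energy, bandwidth_hz,
                 critical_bandwidth_hz=150.0, exponent=0.3):
    """Toy loudness of flat band-limited noise (arbitrary units).

    Energy is added up within each critical band, compressed by a power
    law, and then summed across the bands that the noise covers.
    """
    # Noise narrower than one critical band still occupies one band.
    n_bands = max(1.0, bandwidth_hz / critical_bandwidth_hz)
    energy_per_band = total_energy / n_bands
    return n_bands * energy_per_band ** exponent


total_energy = 1.0  # arbitrary units (assumption)
for bandwidth_hz in (20, 50, 150, 300, 600):
    print(bandwidth_hz, "Hz:", round(toy_loudness(total_energy, bandwidth_hz), 3))
```

In this sketch the loudness is flat for bandwidths up to the assumed critical bandwidth and grows beyond it, even though the total energy never changes. The growth comes from the compression: spreading the same energy over more bands costs each band less than the extra bands add.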

The critical band corresponds to a pooling along the basilar membrane: the width in terms of frequency corresponds to an estimate of the physical length, along the membrane, over which auditory nerve signals are pooled. For a center frequency of 1000 Hz the critical bandwidth is 150 Hz; that corresponds to a 1.3 mm stretch along the basilar membrane. Likewise, for a center frequency of 8000 Hz the critical bandwidth is 800 Hz, but this frequency range also corresponds to 1.3 mm along the basilar membrane. This is a remarkable regularity that relates the anatomy of the cochlea, the physiology of the auditory system, and the perception of loudness, and it was inferred entirely through perceptual/psychophysical (matching) experiments.

Summary: a neural code for both pitch and loudness. Pitch depends on both a place code and a temporal code.  Loudness depends on firing rates and the number of neurons firing.  How do these four neural codes co-exist?

This figure shows four sounds: (1) low frequency, low intensity, (2) low frequency, high intensity, (3) high frequency, low intensity, (4) high frequency, high intensity. Responses are shown for two bundles of auditory nerve fibers connected to two different positions along the basilar membrane. Each bundle corresponds to a critical band. Each sound evokes a uniquely different pattern of firing in the auditory nerve. Pitch is determined by place code (which position exhibits the larger firing rate) and by temporal code (firing in bursts that phase lock to the stimulus frequency). Loudness is determined by firing rate (more spikes per burst for louder sounds) and by the number of neurons (at high intensity, you get some spikes from both positions).
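The four response patterns can be written out as a toy table and read back with a simple decoder. The spike counts below are invented for illustration (the figure's actual values are not reproduced here), the names are hypothetical, and the temporal code (phase locking of the bursts) is left out of this sketch.

```python
# Hypothetical spikes per burst in two fiber bundles, one bundle per
# basilar membrane position. All numbers are made up for illustration.
responses = {
    "low frequency, low intensity": {"low_place": 2, "high_place": 0},
    "low frequency, high intensity": {"low_place": 6, "high_place": 2},
    "high frequency, low intensity": {"low_place": 0, "high_place": 2},
    "high frequency, high intensity": {"low_place": 2, "high_place": 6},
}


def decode(pattern):
    """Read out place (pitch cue) plus two loudness cues from one pattern."""
    place = max(pattern, key=pattern.get)          # bundle firing the most
    total_spikes = sum(pattern.values())           # firing rate cue
    active_bundles = sum(1 for s in pattern.values() if s > 0)  # recruitment cue
    return place, total_spikes, active_bundles


for sound, pattern in responses.items():
    print(sound, "->", decode(pattern))
```

Each of the four sounds yields a different readout in this sketch: the place with the larger response separates low from high frequency, while the total spike count and the number of active bundles separate low from high intensity.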


Copyright © 2006, Department of Psychology, New York University
David Heeger