Empirical Derivation of Acoustic Grouping Cues from Natural Sound Statistics

McDermott, J H; Ellis, D P W; Simoncelli, E P


Published in the 34th MidWinter Meeting, Assoc. for Research in Otolaryngology, Feb 2011.

The ear typically receives mixtures of sounds, from which listeners perceive individual sound sources of interest. Many sets of sound signals are physically consistent with the mixture that enters the ear, and the brain must use its knowledge of natural sounds to infer which set actually occurred in the world. Certain sound properties have long been viewed as grouping cues: sound components with common onsets or offsets, for instance, seem likely to be due to the same source, and are heard as such. Natural sounds could potentially contain many such cues, some of which might not be intuitively obvious. Here we propose an empirical framework for investigating the cues that could underlie sound segregation.

Our approach stems from the observation that the signals in incorrect mixture decompositions will themselves tend to be partial mixtures of the true sources. Grouping cues might thus be sound properties that have different values for individual sources compared to mixtures. Using databases of natural sound source recordings, we can evaluate sound statistics of individual sources and their mixtures, and search for statistics that should be useful for segregation.
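The idea of comparing a statistic across sources and mixtures can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: Laplacian noise stands in for recorded sources, excess kurtosis stands in for a candidate statistic, and the hypothetical `d_prime` helper is just one simple way to summarize how well a statistic separates the two classes.

```python
import random
import statistics


def excess_kurtosis(samples):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    fourth = statistics.fmean((x - mean) ** 4 for x in samples)
    return fourth / var ** 2 - 3.0


def toy_source(rng, n):
    """Laplacian noise (difference of two exponentials): a sparse toy source."""
    return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]


def d_prime(a, b):
    """Difference of the class means in units of the pooled standard deviation."""
    pooled_sd = ((statistics.pvariance(a) + statistics.pvariance(b)) / 2) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled_sd


rng = random.Random(0)
n_samples, n_excerpts = 1000, 100  # illustrative sizes, not from the study

source_stats, mixture_stats = [], []
for _ in range(n_excerpts):
    s1 = toy_source(rng, n_samples)
    s2 = toy_source(rng, n_samples)
    mixture = [a + b for a, b in zip(s1, s2)]
    # Evaluate the same statistic on a single source and on a mixture.
    source_stats.append(excess_kurtosis(s1))
    mixture_stats.append(excess_kurtosis(mixture))

separation = d_prime(source_stats, mixture_stats)
print("mean kurtosis, sources :", statistics.fmean(source_stats))
print("mean kurtosis, mixtures:", statistics.fmean(mixture_stats))
print("separation (d-prime)   :", separation)
```

In this toy setting, summing two independent sparse signals yields a less sparse signal, so the statistic tends to be lower for mixtures than for single sources. A statistic that separates the two classes in this way is a candidate grouping cue in the sense described above.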

We processed thousands of speech excerpts and their mixtures with an auditory model filter bank. From these filter responses we measured a large set of simple statistics that we have shown to be perceptually relevant in the analysis and synthesis of natural sound textures (McDermott & Simoncelli, 2011). The statistics included marginal statistics, capturing sparsity and modulation power, and correlations between filter responses. We found that most statistics, some of which relate to conventional grouping cues, helped to discriminate sources from mixtures. The results suggest that acoustic grouping cues are more diverse than has previously been suspected, and point the way to new perceptual experiments and machine algorithms for sound segregation.
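A rough sketch of this kind of measurement is given below. It is not the auditory model used in the study. As stated assumptions, ideal FFT band masks stand in for the filter bank, gated Gaussian noise stands in for speech, and the band edges, sample rate, and function names are hypothetical. Only two of the statistic families are shown: a marginal sparsity measure (coefficient of variation of each subband envelope) and correlations between subband envelopes.

```python
import numpy as np


def subband_envelopes(signal, sample_rate, band_edges_hz):
    """Return one amplitude envelope per band, using ideal FFT band masks."""
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(len(signal), d=1.0 / sample_rate)
    envelopes = []
    for low, high in band_edges_hz:
        # Keep only the positive frequencies of the band and double them,
        # so the inverse FFT is the analytic signal of the subband.
        mask = (freqs >= low) & (freqs < high)
        analytic = np.fft.ifft(2.0 * spectrum * mask)
        envelopes.append(np.abs(analytic))
    return np.array(envelopes)


def envelope_statistics(envelopes):
    """Marginal sparsity per band and correlations between bands."""
    mean = envelopes.mean(axis=1)
    std = envelopes.std(axis=1)
    return {
        "sparsity": std / mean,                 # coefficient of variation
        "correlation": np.corrcoef(envelopes),  # band-by-band matrix
    }


def toy_source(rng, n_samples, block=400, p_on=0.2):
    """Gaussian noise gated on and off in blocks: a crude bursty source.

    n_samples is assumed to be a multiple of block.
    """
    gate = np.repeat(rng.random(n_samples // block) < p_on, block)
    return gate.astype(float) * rng.standard_normal(n_samples)


sample_rate = 8000                                           # Hz, illustrative
bands = [(200, 400), (400, 800), (800, 1600), (1600, 3200)]  # Hz, illustrative
rng = np.random.default_rng(0)

source_a = toy_source(rng, 80000)
source_b = toy_source(rng, 80000)
mixture = source_a + source_b

stats_source = envelope_statistics(subband_envelopes(source_a, sample_rate, bands))
stats_mixture = envelope_statistics(subband_envelopes(mixture, sample_rate, bands))

print("sparsity, source :", stats_source["sparsity"])
print("sparsity, mixture:", stats_mixture["sparsity"])
```

With these toy signals the mixture envelopes are less sparse than the single-source envelopes, because the bursts of the two sources rarely coincide. A closer approximation would replace the ideal band masks with filters modelled on cochlear frequency selectivity and would add modulation-band power to the measured statistics.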
