Sensory signals are transduced at high resolution, but their structure must be stored in a more compact format. We hypothesize that time-averaged sound statistics form a perceptual code used by the auditory system to summarize the acoustic structure of natural sounds. To explore this hypothesis, we generated synthetic sound stimuli using a recently developed model of auditory texture representation (McDermott & Simoncelli, 2011). The synthesis procedure shapes samples of noise to have the same summary statistics as a target sound (despite having a completely different acoustic waveform), matching the moments and pairwise correlations of simulated cochlear channel envelopes and their modulation bands. We conducted experiments with excerpts of five-second signals synthesized to have the same statistics as a real-world sound texture (rain, fire, etc.). The summary statistics varied across excerpts when the excerpts were short, but converged to the statistics of the full-length signal as excerpt length increased. We first presented listeners with excerpts from two signals with distinct statistics and asked them to judge whether the excerpts arose from the same sound source. Performance improved with excerpt duration, presumably because the summary statistics that distinguish sound types are estimated more accurately from longer excerpts. We then presented excerpts of two different exemplars synthesized from the same summary statistics and asked listeners to judge whether the excerpts were identical. Performance in this task was good for short excerpts but paradoxically declined with duration, even though longer sounds contain more information to support discrimination. The results can be explained by supposing that listeners represent sounds with time-averaged statistics, such that two sounds become difficult to discriminate as their summary statistics converge to the same values. Such statistical representations produce good categorical discrimination, but limit the ability to discern fine-grained temporal detail.
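
The convergence of excerpt statistics with duration can be illustrated with a minimal sketch. This is not the texture model of McDermott & Simoncelli (2011): the "envelope" here is a toy stand-in (rectified, smoothed Gaussian noise), only the mean is measured, and the names `toy_envelope` and `spread_of_excerpt_means` are hypothetical. Under those assumptions, the sketch measures how much a time-averaged statistic varies across randomly placed excerpts of different lengths.

```python
import random
import statistics


def toy_envelope(n_samples, rng, alpha=0.2):
    """Toy stand-in for one channel envelope (assumption, not the
    cochlear model): rectified Gaussian noise, exponentially smoothed."""
    env, state = [], 0.0
    for _ in range(n_samples):
        state = (1.0 - alpha) * state + alpha * abs(rng.gauss(0.0, 1.0))
        env.append(state)
    return env


def spread_of_excerpt_means(envelope, length, n_excerpts, rng):
    """Standard deviation, across random excerpts, of the excerpt mean."""
    means = []
    for _ in range(n_excerpts):
        start = rng.randrange(0, len(envelope) - length + 1)
        means.append(statistics.fmean(envelope[start:start + length]))
    return statistics.pstdev(means)


rng = random.Random(0)  # fixed seed so the run is repeatable
envelope = toy_envelope(100_000, rng)

# Excerpt lengths are in samples; the values are arbitrary illustrations.
spreads = {
    length: spread_of_excerpt_means(envelope, length, 200, rng)
    for length in (100, 1_000, 10_000)
}

for length, spread in spreads.items():
    print(f"excerpt length {length:>6}: spread of means = {spread:.4f}")
```

In this toy setting the spread shrinks as the excerpts lengthen, which mirrors the account given above: two long excerpts drawn from the same statistics yield nearly identical time-averaged values, leaving a statistics-based listener little with which to tell them apart.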