Using metameric stimuli to test a model of neural populations in V2

J Freeman and E P Simoncelli

Presented at the Annual Meeting of the Society for Neuroscience, Nov 2011.

Visual patterns are represented in a sequence of cortical areas known as the ventral stream. We develop a functional model of ventral area V2 by leveraging known facts about the afferent input from V1 and the organization of ventral-stream receptive fields -- specifically, the fact that they grow in size with eccentricity. Our model begins by computing half-squared and squared responses of oriented linear filters, simulating V1 receptive fields. We then compute products of various pairs of these V1 responses and pool them over regions the size of V2 receptive fields (as estimated from macaque single-unit physiology). These pooling regions grow linearly with eccentricity, and the large pooling regions in the periphery lead to a substantial loss of visual information. If we assume that observers cannot access this lost information, then two images matched in their model responses should be indistinguishable -- that is, they should be "metamers".
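A minimal sketch of this computation is given below, assuming a Gabor-like oriented filter bank and Gaussian pooling windows whose width grows in proportion to eccentricity; the filter design, the ring-and-wedge layout of the windows, the scaling constant, and the function names (oriented_filter, pooling_windows, model_responses) are illustrative assumptions, not the authors' implementation.

    # Sketch of the model stages: oriented filtering, half-squared and squared
    # nonlinearities, products of pairs of responses, and pooling over windows
    # whose size grows linearly with eccentricity. All parameters are assumed.
    import numpy as np
    from scipy.signal import convolve2d

    def oriented_filter(size, theta):
        """Hypothetical Gabor-like linear filter standing in for a V1 receptive field."""
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        u = x * np.cos(theta) + y * np.sin(theta)
        envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * (size / 6.0) ** 2))
        return envelope * np.sin(2.0 * np.pi * u / (size / 2.0))

    def pooling_windows(shape, fovea, scaling=0.5, n_angles=12, n_rings=8):
        """Gaussian pooling windows whose width grows linearly with eccentricity."""
        H, W = shape
        yy, xx = np.mgrid[0:H, 0:W]
        windows = []
        for r in np.geomspace(4.0, max(H, W) / 2.0, num=n_rings):  # ring radii (pixels)
            for a in np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False):
                cy, cx = fovea[0] + r * np.sin(a), fovea[1] + r * np.cos(a)
                sigma = scaling * r                                # width proportional to eccentricity
                w = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
                windows.append(w / w.sum())
        return windows

    def model_responses(image, windows, n_orientations=4, filter_size=15):
        """V1-like responses -> products of pairs -> averages within each pooling window."""
        v1 = []
        for k in range(n_orientations):
            r = convolve2d(image, oriented_filter(filter_size, np.pi * k / n_orientations),
                           mode='same', boundary='symm')
            v1.append(np.maximum(r, 0.0) ** 2)   # half-squared response
            v1.append(r ** 2)                    # squared (energy) response
        stats = []
        for a in range(len(v1)):
            for b in range(a, len(v1)):          # products of pairs of V1 responses
                prod = v1[a] * v1[b]
                stats.extend(float(np.sum(prod * w)) for w in windows)
        return np.array(stats)

For an image img with fixation at its center, model_responses(img, pooling_windows(img.shape, fovea=(img.shape[0] // 2, img.shape[1] // 2))) would return the vector of pooled statistics that two metameric images are required to share.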

We test this hypothesis by computing model responses to full-field natural images and generating synthetic images that produce identical model responses but are otherwise random. Using a two-alternative discrimination task, we show that these images are metameric to human observers as long as the pooling regions are matched to those of area V2. As a control, we also show that stimuli synthesized from an analogous model of V1 are metameric when the pooling regions are matched to those of area V1. We further show that the metamericity of the V2 stimuli is unaffected by either extending presentation time or directing endogenous attention toward the image locations with the largest differences. Although both manipulations increase the probability that observers can discriminate the stimuli, they do not change the receptive-field size required for metamericity, suggesting that the information loss is an inherent property of the system that cannot be overcome by bottom-up or top-down improvements in signal-to-noise ratio.
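The synthesis step can be viewed, roughly, as minimizing the squared mismatch between the pooled model responses of a candidate image and those of the target natural image. The sketch below assumes that framing; its function names (synthesize_metamer, model_fn), the white-noise initialization, and the choice of a generic derivative-free optimizer are illustrative placeholders rather than the procedure used in the experiments.

    # Sketch of metamer synthesis: adjust a random-noise image until its model
    # responses match those of the target natural image. model_fn is a callable
    # like the model_responses sketch above; the optimizer choice is a stand-in.
    import numpy as np
    from scipy.optimize import minimize

    def synthesize_metamer(target_image, windows, model_fn, n_iter=100, seed=0):
        """Search for an otherwise-random image with the same pooled model responses."""
        target_stats = model_fn(target_image, windows)
        rng = np.random.default_rng(seed)
        x0 = rng.standard_normal(target_image.size)        # white-noise starting point

        def mismatch(x):
            stats = model_fn(x.reshape(target_image.shape), windows)
            return float(np.sum((stats - target_stats) ** 2))

        # A generic derivative-free optimizer keeps the sketch short; a practical
        # implementation would follow the analytic (or automatic) gradient of the
        # model responses, which scales far better to image-sized problems.
        result = minimize(mismatch, x0, method='Powell', options={'maxiter': n_iter})
        return result.x.reshape(target_image.shape)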

Finally, the phenomenon of "visual crowding", in which peripherally presented objects become unrecognizable when surrounded by other objects, is believed to arise from eccentricity-dependent pooling of visual features (e.g., Balas et al., 2009). Our model provides an explicit full-field implementation of this proposal, and we show that it accurately predicts the dependence of these effects on eccentricity and spacing. More generally, the model provides a framework for assessing the impact of such peripheral deficits on everyday visual tasks.
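To make the spacing prediction concrete: because pooling-region width grows linearly with eccentricity, a flanker closer to the target than roughly the pooling width at that eccentricity falls into the same region, so the predicted critical spacing scales linearly with eccentricity. The toy functions and the scaling constant below are hypothetical placeholders, not values from the study.

    # Toy illustration (not from the paper): with pooling regions whose width is
    # proportional to eccentricity, crowding is predicted whenever target-flanker
    # spacing falls below that width. The scaling constant is a placeholder.
    def predicted_critical_spacing(eccentricity_deg, scaling=0.5):
        """Spacing (deg) below which flankers share the target's pooling region."""
        return scaling * eccentricity_deg

    def is_crowded(eccentricity_deg, spacing_deg, scaling=0.5):
        """True when the flankers are predicted to be pooled together with the target."""
        return spacing_deg < predicted_critical_spacing(eccentricity_deg, scaling)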

