Learning a texture model for representing cortical area V2

N Parthasarathy and E P Simoncelli

Published in Computational and Systems Neuroscience (CoSyNe), Feb 2020.

This paper has been superseded by:
Self-supervised learning of a biologically-inspired visual texture model
N Parthasarathy and E P Simoncelli.
arXiv.org e-prints, Technical Report 2006.16976, Jun 2020.


Neurons in primary visual cortex (V1) respond selectively to oriented, edge-like features, and this selectivity can be captured using oriented filters. Such filters can be learned from natural images using unsupervised techniques (e.g., ICA or sparse coding), which provide a theoretical explanation for the experimental observations. Cortical area V2 is less well understood, but recent work shows that most V2 neurons respond selectively to 'visual texture'. Moreover, the responses of V2 populations can be used to distinguish different textures from each other, and to distinguish textures from their spectrally matched counterparts. Although several models have been proposed for V2, these are not learned from natural images, and they have not been shown to provide accurate predictions of V2 population responses to texture images. To address these shortcomings, we develop a parametric functional model for V2, and choose its parameters by optimizing a novel self-supervised objective function over a dataset of homogeneous texture images. The model consists of a V1 layer implemented as a set of fixed convolutional filters followed by rectification (simple cells) and pooling (complex cells). These responses provide input to a V2 stage consisting of a set of learned convolutional filters followed by the same canonical simple- and complex-cell transformations. We optimize the filters in the V2 stage according to an objective function that seeks to simultaneously: 1) minimize the variability of V2 responses within individual homogeneous texture images, 2) maximize the variability of these responses across all images, and 3) minimize the error in locally reconstructing V1 responses from V2 responses. We find that the trained model captures the observed texture selectivities of V2 neurons. Additionally, the model responses provide much better linear predictivity of V2 population responses than is achieved with current state-of-the-art CNNs.
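The sketch below illustrates the two-stage architecture and three-term objective described in the abstract, written in PyTorch. It is a minimal illustration, not the authors' implementation: all filter counts, kernel sizes, the ReLU/average-pooling choices for the simple/complex-cell stages, the decoder used for the V1-reconstruction term, and the loss weights are assumptions introduced here.

```python
# Minimal sketch of the V1/V2 model and self-supervised objective.
# All architectural details (channel counts, kernel sizes, pooling,
# decoder, loss weights) are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class CanonicalStage(nn.Module):
    """Convolution -> rectification (simple cells) -> pooling (complex cells)."""

    def __init__(self, in_channels, out_channels, kernel_size, fixed=False):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2)
        if fixed:  # the V1 stage uses fixed (e.g., oriented) filters
            for p in self.conv.parameters():
                p.requires_grad = False

    def forward(self, x):
        simple = F.relu(self.conv(x))       # rectified "simple cell" responses
        return F.avg_pool2d(simple, 2)      # pooled "complex cell" responses


class V1V2Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.v1 = CanonicalStage(1, 32, 9, fixed=True)    # fixed V1 filter bank
        self.v2 = CanonicalStage(32, 64, 5, fixed=False)  # learned V2 filters
        # decoder for the local V1-reconstruction term (an assumed form)
        self.decoder = nn.Conv2d(64, 32, 5, padding=2)

    def forward(self, x):
        v1 = self.v1(x)
        v2 = self.v2(v1)
        return v1, v2


def texture_objective(v1, v2, decoder, w=(1.0, 1.0, 1.0)):
    """Three terms: minimize within-image variability of V2 responses,
    maximize across-image variability, minimize V1 reconstruction error."""
    # 1) spatial variance of V2 responses within each texture image
    within = v2.var(dim=(2, 3)).mean()
    # 2) variance of per-image mean responses across the batch (negated)
    across = v2.mean(dim=(2, 3)).var(dim=0).mean()
    # 3) error in locally reconstructing V1 responses from V2 responses
    v1_hat = F.interpolate(decoder(v2), size=v1.shape[-2:])
    recon = F.mse_loss(v1_hat, v1)
    return w[0] * within - w[1] * across + w[2] * recon


# Usage: optimize only the V2 and decoder parameters on texture patches.
model = V1V2Model()
opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-3)
imgs = torch.randn(8, 1, 64, 64)  # stand-in for homogeneous texture patches
v1, v2 = model(imgs)
loss = texture_objective(v1, v2, model.decoder)
loss.backward()
opt.step()
```

Note the sign structure of the objective: the across-image variance term is subtracted so that gradient descent simultaneously drives responses to be uniform within a texture yet discriminative across textures, while the reconstruction term prevents the trivial constant solution.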