Published in Computational and Systems Neuroscience (CoSyNe), Mar 2024.
It has long been debated whether biological vision is optimized for predicting incoming sensory stimuli or for discriminating patterns and objects. Are the two views really incompatible? In this work we propose "straightening" as a simple, biologically plausible learning objective that achieves both goals at once. The objective is motivated by recent experimental findings that neural representations evolve over time along straighter temporal trajectories than their initial photoreceptor encoding, measured both at the V1 population level (Hénaff et al. 2021) and in terms of perceptual discriminability (Hénaff et al. 2019). Such a representation automatically fulfills the first goal: for straight representations, prediction reduces to linearly extrapolating past responses. Recent results show that robust training leads to straighter representations (Toosi & Issa 2023; Harrington et al. 2023). Here, we show that the converse also holds: straightening makes a discrimination model more immune to noise. In particular, we show that optimizing a network for straightening yields representations that not only achieve near state-of-the-art object recognition performance on several datasets, but are also more robust to two forms of degradation: (1) independent white noise injected into each channel throughout the network, mimicking neural noise; and (2) adversarial perturbations, known to yield perceptual differences between artificial and biological systems. Our results suggest that straightness and robustness are two sides of the same coin. The dual successes of straightening as a learning mechanism provide an important step toward bridging the predictive and discriminative perspectives on vision.
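The two quantities at the heart of the abstract can be made concrete. Below is a minimal NumPy sketch, assuming the discrete-curvature definition used in the Hénaff et al. line of work (the angle between successive difference vectors of a temporal trajectory of responses); the function names are illustrative, not part of the authors' codebase. It also shows why straightness fulfills the predictive goal: for a perfectly straight trajectory, extending the last displacement vector predicts the next response exactly.

```python
import numpy as np

def mean_curvature(traj):
    """Mean discrete curvature of a trajectory with shape (T, D):
    the angle (radians) between successive difference vectors.
    A perfectly straight trajectory has curvature 0."""
    v = np.diff(traj, axis=0)                        # (T-1, D) displacements
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # unit directions
    cos = np.clip(np.sum(v[:-1] * v[1:], axis=1), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

def linear_extrapolation(traj):
    """Predict the next point by extending the last displacement."""
    return traj[-1] + (traj[-1] - traj[-2])

# A straight trajectory: curvature is ~0 and extrapolation is exact.
t = np.linspace(0.0, 1.0, 10)[:, None]
straight = t * np.array([[1.0, 2.0, 3.0]])
print(mean_curvature(straight))                  # ~0
pred = linear_extrapolation(straight[:-1])
print(np.allclose(pred, straight[-1]))           # True
```

A straightening objective in this spirit would minimize `mean_curvature` over a network's responses to natural video frames; the abstract's claim is that doing so also buys noise and adversarial robustness.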