Brouwer GJ & Heeger DJ. Decoding and reconstructing color from responses in human visual cortex. Journal of Neuroscience, 29:13992-14003, 2009.

Abstract: How is color represented by spatially distributed patterns of activity in visual cortex? Functional magnetic resonance imaging responses to several stimulus colors were analyzed with multivariate techniques: conventional pattern classification, a forward model of idealized color tuning, and principal component analysis (PCA). Stimulus color was accurately decoded from activity in V1, V2, V3, V4, and VO1 but not LO1, LO2, V3A/B, or MT+. The conventional classifier and forward model yielded similar accuracies, but the forward model (unlike the classifier) also reliably reconstructed novel stimulus colors not used to train (specify parameters of) the model. The mean responses, averaged across voxels in each visual area, were not reliably distinguishable for the different stimulus colors. Hence, each stimulus color was associated with a unique spatially distributed pattern of activity, presumably reflecting the color selectivity of cortical neurons. Using PCA, a color space was derived from the covariation, across voxels, in the responses to different colors. In V4 and VO1, the first two principal component scores (main source of variation) of the responses revealed a progression through perceptual color space, with perceptually similar colors evoking the most similar responses. This was not the case for any of the other visual cortical areas, including V1, although decoding was most accurate in V1. This dissociation implies a transformation from the color representation in V1 to reflect perceptual color space in V4 and VO1.

Moradi F & Heeger DJ. Inter-ocular contrast normalization in human visual cortex. Journal of Vision, 9(3):13, 1-22, 2009.

Abstract: The brain combines visual information from the two eyes and forms a coherent percept, even when inputs to the eyes are different. However, it is not clear how inputs from the two eyes are combined in visual cortex. We measured fMRI responses to single gratings presented monocularly, or pairs of gratings presented monocularly or dichoptically with several combinations of contrasts. Gratings had either the same orientation or orthogonal orientations (i.e., plaids). Observers performed a demanding task at fixation to minimize top-down modulation of the stimulus-evoked responses. Dichoptic presentation of compatible gratings (same orientation) evoked greater activity than monocular presentation of a single grating only when contrast was low (<10%). A model that assumes linear summation of activity from each eye failed to explain binocular responses at 10% contrast or higher. However, a model with binocular contrast normalization, such that activity from each eye reduced the gain for the other eye, fitted the results very well. Dichoptic presentation of orthogonal gratings evoked greater activity than monocular presentation of a single grating for all contrasts. However, activity evoked by dichoptic plaids was equal to that evoked by monocular plaids. Introducing an onset asynchrony (stimulating one eye 500 ms before the other, which under attentive vision results in flash suppression) had no impact on the results; the responses to dichoptic and monocular plaids were again equal. We conclude that when attention is diverted, inter-ocular suppression in V1 can be explained by a normalization model in which the mutual suppression between orthogonal orientations does not depend on the eye of origin, nor on the onset times, and cross-orientation suppression is weaker than inter-ocular (same orientation) suppression.

Kang M-S, Heeger DJ, Blake R. Periodic perturbations producing phase-locked fluctuations in visual perception. Journal of Vision, 9(2):8, 1-22, 2009.

Abstract: This paper describes a novel psychophysical and analytical technique, called periodic perturbation, for creating and characterizing perceptual waves associated with transitions in visibility of a stimulus during binocular rivalry and during binocular fusion. Observers tracked rivalry within a small, central region of spatially extended rival targets while small, brief increments in contrast ("triggers") were presented repetitively in antiphase within different regions of the two rival targets. Appropriately timed triggers produced entrainment of rivalry alternations within the central region, with the optimal timing dependent on an observer's native alternation rate. The latency between trigger and state switch increased with the distance between the location of the trigger and the central region being monitored, providing evidence for traveling waves of dominance. Traveling waves produced by periodic perturbation exhibited the same characteristics as those generated using a less efficient, more demanding discrete trial technique. We used periodic perturbation to reveal a novel relation between the dynamics associated with the spontaneous perceptual alternations and the speed of traveling waves across observers. In addition, we found evidence for traveling waves even when the events triggering them were initiated within regions of the visual field where binocular vision was stable, in the absence of binocular rivalry, implying that perceptual organization generally depends on spatio-temporal context.

Reynolds JH & Heeger DJ. The normalization model of attention. Neuron, 61:168-185, 2009.

Abstract: Attention has been found to have a wide variety of effects on the responses of neurons in visual cortex. We describe a model of attention that exhibits each of these different forms of attentional modulation, depending on the stimulus conditions and the spread (or selectivity) of the attention field in the model. The model helps reconcile proposals that have been taken to represent alternative theories of attention. We argue that the variety and complexity of the results reported in the literature emerge from the variety of empirical protocols that were used, such that the results observed in any one experiment depended on the stimulus conditions and the subject's attentional strategy, a notion that we define precisely in terms of the attention field in the model, but that has not typically been completely under experimental control.

Offen S, Schluppeck D, Heeger DJ. The role of early visual cortex in visual short-term memory and visual attention. Vision Research, 49:1352-1362, 2009.

Abstract: We measured cortical activity with functional magnetic resonance imaging to probe the involvement of early visual cortex in visual short-term memory and visual attention. In four experimental tasks, human subjects viewed two visual stimuli separated by a variable delay period. The tasks placed differential demands on short-term memory and attention, but the stimuli were visually identical until after the delay period. Early visual cortex exhibited sustained responses throughout the delay when subjects performed attention-demanding tasks, but delay-period activity was not distinguishable from zero when subjects performed a task that required short-term memory. This dissociation reveals different computational mechanisms underlying the two processes.

Dinstein I, Gardner JL, Jazayeri M, Heeger DJ. Executed and observed movements have different distributed representations in human aIPS. Journal of Neuroscience, 28:11231-11239, 2008.

Abstract: How similar are the representations of executed and observed hand movements in the human brain? We used functional magnetic resonance imaging (fMRI) and multivariate pattern classification analysis to compare spatial distributions of cortical activity in response to several observed and executed movements. Subjects played the rock-paper-scissors game against a videotaped opponent, freely choosing their movement on each trial and observing the opponent's hand movement after a short delay. The identities of executed movements were correctly classified from fMRI responses in several areas of motor cortex, observed movements were classified from responses in visual cortex, and both observed and executed movements were classified from responses in either left or right anterior intraparietal sulcus (aIPS). We interpret above chance classification as evidence for reproducible, distributed patterns of cortical activity that were unique for execution and/or observation of each movement. Responses in aIPS enabled accurate classification of movement identity within each modality (visual or motor), but did not enable accurate classification across modalities (i.e., decoding observed movements from a classifier trained on executed movements and vice versa). These results support theories regarding the central role of aIPS in the perception and execution of movements. However, the spatial pattern of activity for a particular observed movement was distinctly different from that for the same movement when executed, suggesting that observed and executed movements are mostly represented by distinctly different subpopulations of neurons in aIPS.

Donner TH, Sagi D, Bonneh Y, Heeger DJ. Opposite neural signatures of motion-induced blindness in human dorsal and ventral visual cortex. Journal of Neuroscience, 28:10298-10310, 2008.

Abstract: Motion-induced blindness (MIB) is a visual phenomenon in which a salient static target spontaneously fluctuates in and out of visual awareness when surrounded by a moving mask pattern. It has been hypothesized that MIB reflects an antagonistic interplay between cortical representations of the static target and moving mask. Here, we report evidence for such antagonism between human ventral and dorsal visual cortex during MIB. Functional magnetic resonance imaging (fMRI) responses in ventral visual area V4 decreased with the subjective disappearance of the target. These response decreases were specific for the cortical subregion corresponding retinotopically to the target, occurred early in time with respect to the perceptual report, and could not be explained by shifts of attention in reaction to target disappearance. At the same time, responses increased in mask-specific subregions in dorsal visual areas in and around the intraparietal sulcus. These opposite responses in ventral and dorsal visual areas occurred only during subjective target disappearance, not when the target was physically removed. Perceptual reports of target disappearance were furthermore associated with a "global" modulation of activity, which was delayed in time, and evident throughout early visual cortex, for both subjective target disappearance and physical target removal. We conclude that awareness of the target is tightly linked to the strength of its representation in ventral visual cortex, and that the mask representation in dorsal visual cortex plays a crucial role in the spontaneous suppression of the target representation during MIB.

Hasson U, Landesman O, Knappmeyer B, Vallines I, Rubin N, Heeger DJ. Neurocinematics: The neuroscience of films. Projections: The Journal for Movies and Mind, 2:1-26, 2008.

Abstract: While the recognition that films can impose a tight grip on viewers' minds dates back to the early days of cinema, until recently there was no way to record the mental states of viewers while watching a film. In this paper, we describe a new method for assessing the effect of a given film on viewers' brain activity. Brain activity was measured using functional magnetic resonance imaging (fMRI) during free viewing of films, and inter-subject correlation analysis (ISC, Figure 1) was used to assess similarities in the spatiotemporal responses across viewers' brains during movie watching. Our results demonstrate that some films can exert considerable control over brain activity (Figure 2) and eye movements (Figure 3). However, this was not the case for all types of motion picture sequences (Figure 4), and the level of control over viewers' brain activity differed as a function of movie content (Figure 5), editing (Figure 6), and directing style (Figure 7). We propose that ISC may be useful to film studies by providing a quantitative neuroscientific assessment (Figures 8 and 9) of the impact of different styles of filmmaking upon viewers' brains, and a valuable method for the film industry to better assess its products. Finally, we suggest that this method brings together two separate, largely unrelated disciplines, cognitive neuroscience and film studies, and may open the way for a new interdisciplinary field of "neurocinematic" studies.
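Illustrative note: the inter-subject correlation idea mentioned in the abstract above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the analysis used in the paper: it assumes a leave-one-out scheme (each viewer's time course is correlated with the mean time course of the remaining viewers) applied to a single region, and the function names `pearson` and `leave_one_out_isc` and the toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def leave_one_out_isc(timecourses):
    """Correlate each viewer's time course with the mean of all other viewers.

    Assumption: one time course per viewer for a single region; the
    leave-one-out scheme is an illustrative choice, not taken from the paper.
    """
    n_time = len(timecourses[0])
    iscs = []
    for i, own in enumerate(timecourses):
        others = [tc for j, tc in enumerate(timecourses) if j != i]
        mean_others = [sum(tc[t] for tc in others) / len(others)
                       for t in range(n_time)]
        iscs.append(pearson(own, mean_others))
    return iscs

# Toy data (made up): three viewers whose responses are shifted or scaled
# copies of one shared, stimulus-driven time course.
shared = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 1.0, 0.5]
viewers = [
    [s + 0.1 for s in shared],
    [2.0 * s for s in shared],
    [s - 0.3 for s in shared],
]
isc = leave_one_out_isc(viewers)
print(isc)
```

Because every toy viewer is a linear transform of the same shared signal, each correlation comes out at (numerically) 1; adding viewer-specific noise would lower the values, which is the sense in which a film that "controls" brain activity less would yield lower ISC.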

Nir Y, Dinstein I, Malach R, Heeger DJ. BOLD and spiking activity - a comment on Viswanathan and Freeman. Nature Neuroscience, 11:523, 2008.

Summary: Viswanathan and Freeman (Nat Neurosci, 2007) claim that oxygen concentration and by inference blood oxygen level-dependent (BOLD) fMRI reflect synaptic activity more than spiking activity. This is a fundamental and controversial issue in fMRI research, so this claim, if incorrect, may erroneously bias the interpretation of a large body of data. The authors simultaneously recorded multi-unit activity (MUA), local field potentials (LFP), and tissue oxygen concentration in primary visual cortex of anesthetized cats stimulated with moving gratings. During high temporal frequency stimulation, when thalamic inputs are active but few cortical neurons respond, oxygen signals were observed without MUA. Hence, they concluded that oxygen responses reflect synaptic inputs more than spiking. However, careful inspection of their results leads to the opposite conclusion and supports a tight coupling between oxygen signals and local cortical spiking.

Gardner JL, Merriam EP, Movshon JA, Heeger DJ. Maps of visual space in human occipital cortex are retinotopic, not spatiotopic. Journal of Neuroscience, 28:3988-3999, 2008.

Abstract: We experience the visual world as phenomenally invariant to eye position, but almost all cortical maps of visual space in monkeys use a retinotopic reference frame - that is, the cortical representation of a point in the visual world is different across eye positions. It was recently reported that human cortical area MT (unlike monkey MT) represents stimuli in a reference frame linked to the position of stimuli in space, a spatiotopic reference frame. We used visuotopic mapping with BOLD fMRI signals to define 12 human visual cortical areas, and then determined whether the reference frame in each area was spatiotopic or retinotopic. We found that all 12 areas, including MT, represented stimuli in a retinotopic reference frame. Although there were patches of cortex in and around these visual areas that were ostensibly spatiotopic, none of these patches exhibited reliable stimulus-evoked responses. We conclude that the early, visuotopically-organized visual cortical areas in the human brain (like their counterparts in the monkey brain) represent stimuli in a retinotopic reference frame.

Hasson U, Yang E, Vallines I, Heeger DJ, Rubin N. A hierarchy of temporal receptive windows in human cortex. Journal of Neuroscience, 28:2539-2550, 2008.

Abstract: Real-world events unfold at different time scales and, therefore, cognitive and neuronal processes must likewise occur at different time scales. We present a novel procedure that identifies brain regions responsive to sensory information accumulated over different time scales. We measured functional magnetic resonance imaging activity while observers viewed silent films presented forward, backward, or piecewise-scrambled in time. Early visual areas (e.g., primary visual cortex and MT complex responsive to visual motion) exhibited high response reliability regardless of disruptions in temporal structure. In contrast, the reliability of responses in several higher brain areas, including the superior temporal sulcus (STS), precuneus, posterior lateral sulcus (LS), temporal parietal junction (TPJ), and frontal eye field (FEF), was affected by information accumulated over longer time scales. These regions showed highly reproducible responses for repeated forward, but not for backward or piecewise-scrambled presentations. Moreover, these regions exhibited marked differences in temporal characteristics, with LS, TPJ, and FEF responses depending on information accumulated over longer durations (36 s) than STS and precuneus (12 s). We conclude that, similar to the known cortical hierarchy of spatial receptive fields, there is a hierarchy of progressively longer temporal receptive windows in the human brain.

Dinstein I, Thomas C, Behrmann M, Heeger DJ. A mirror up to nature. Current Biology, 18:R13-18, 2008.

Abstract: Mirror neurons were first documented in the macaque monkey a little over ten years ago. Their discovery has led to the formulation of several theories about their function in humans, including suggestions that mirror neurons are involved in understanding the meaning and intentions of observed actions, learning by imitation, feeling empathy, formation of a "theory of mind", and even the development of language. Hypotheses have also been made about the consequences of mirror neuron dysfunction; foremost among these is the notion that such a dysfunction during development leads to many of the social and cognitive symptoms associated with the autism spectrum disorders (ASDs). Yet, despite a decade of prolific research on these appealing theories, there is little evidence to support them. In this essay, we review the current state of "mirror system" research, point to several weaknesses in the field, and offer suggestions for how better to study these remarkably interesting neurons in both neurotypical and autistic individuals.

Lee SH, Blake R, Heeger DJ. Hierarchy of responses underlying binocular rivalry. Nature Neuroscience, 10:1048-1054, 2007.

Abstract: During binocular rivalry, physical stimulation is dissociated from conscious visual awareness. Human brain imaging reveals a tight linkage between neural events in human primary visual cortex (V1) and dynamics of perceptual waves during transitions in dominance during binocular rivalry. Here, we report results from experiments in which observers’ attention was diverted from the rival stimuli, implying that: 1) competition between two rival stimuli involves neural circuits in V1, 2) attention is crucial for the consequences of this neural competition to advance to higher visual areas and promote perceptual waves.

Dinstein I, Hasson U, Rubin N, Heeger DJ. Brain areas selective for both observed and executed movements. Journal of Neurophysiology, 98:1415-1427, 2007.

Abstract: When observing a particular movement a subset of movement-selective visual and visuomotor neurons are active in the observer's brain forming a representation of the observed movement. Similarly, when executing a movement a subset of movement-selective motor and visuomotor neurons are active forming a representation of the executed movement. In this study we used an fMRI-adaptation protocol to assess cortical response selectivity to observed and executed movements simultaneously. Subjects freely played the rock-paper-scissors game against a videotaped opponent, sometimes repeatedly observing or executing the same movement on subsequent trials. Numerous brain areas exhibited adaptation (repetition suppression) during either repeated observations or repeated executions of the same movement. A subset of areas exhibited an overlap of both effects, containing neurons with selective responses for both executed and observed movements. We describe the function of these unique movement representation areas in the context of the human mirror system, which is expected to respond selectively to both observed and executed movements.

Yang Z, Heeger DJ, Seidemann E. Rapid and precise retinotopic mapping of visual cortex obtained by voltage sensitive dye imaging in the behaving monkey. Journal of Neurophysiology, 98:1002-1014, 2007.

Abstract: Retinotopy is a fundamental organizing principle of the visual cortex. Over the years, a variety of techniques have been used to examine it. None of these techniques, however, provides a way to characterize retinotopy rapidly, at the sub-millimeter range, in alert, behaving subjects. Voltage sensitive dye imaging (VSDI) can be used to monitor neuronal population activity at high spatial and temporal resolutions. Here we present a VSDI protocol for rapid and precise retinotopic mapping in the behaving monkey. Two monkeys performed a fixation task while thin visual stimuli swept periodically at a high speed in one of two possible directions through a small region of visual space. Because visual space is represented systematically across the cortical surface, each moving stimulus produced a traveling wave of activity in the cortex that could be precisely measured with VSDI. The time at which the peak of the traveling wave reached each location in the cortex linked this location with its retinotopic representation. We obtained detailed retinotopic maps from a region of ~1 cm2 over the dorsal portion of areas V1 and V2. Retinotopy obtained during less than 4 minutes of imaging had a spatial precision of 0.11-0.19 mm, was consistent across experiments, and reliably predicted the locations of the response to small localized stimuli. The ability to rapidly obtain precise retinotopic maps in behaving monkeys opens the door for detailed analysis of the relationship between spatiotemporal dynamics of population responses in the visual cortex and perceptually guided behavior.

Levy I, Schluppeck D, Heeger DJ, & Glimcher PW. Specificity of human cortical areas for reaches and saccades. Journal of Neuroscience, 27:4687-4696, 2007.

Abstract: Electrophysiological studies in monkeys have identified effector-related regions in the posterior parietal cortex (PPC). The lateral intraparietal area (LIP), for example, responds preferentially for saccades whereas the parietal reach region (PRR) responds preferentially for arm movements. However, the degree of effector selectivity actually observed is limited; each area contains neurons selective for the non-preferred effector, and many neurons in both areas respond for both effectors. We used fMRI to assess the degree of effector preference at the population level, focusing on topographically organized regions in the human PPC (V7, IPS1 and IPS2). An event-related design adapted from monkey experiments was employed. In each trial, an effector cue preceded the appearance of a spatial target, after which a go-signal instructed subjects to produce the specified movement with the specified effector. Our results show that the degree of effector specificity is limited in many cortical areas, and transitions gradually from saccade to reach preference as one moves through the hierarchy of areas in the occipital, parietal, and frontal cortices. Saccade preference was observed in visual cortex, including early areas and V7. IPS1 exhibited balanced activation to saccades and reaches, whereas IPS2 showed a weak but significant preference for reaches. In frontal cortex, areas near the central sulcus showed a clear and absolute preference for reaches while the Frontal Eye Field (FEF) showed little or no effector selectivity. Although these results contradict many theoretical conclusions about effector specificity, they are compatible with the complex picture arising from electrophysiological studies and also with previous imaging studies that reported largely overlapping saccade and arm related activation. The results are also compatible with theories of efficient coding in cortex.

Montaser-Kouhsari L, Landy MS, Heeger DJ, & Larsson J. Orientation-selective adaptation to illusory contours in human visual cortex. Journal of Neuroscience, 27:2186-2195, 2007.

Abstract: Humans can perceive illusory or subjective contours in the absence of any real physical boundaries. We used an adaptation protocol to look for orientation-selective neural responses to illusory contours defined by phase-shifted abutting line gratings in the human visual cortex. We measured fMRI responses to illusory-contour test stimuli after adapting to an illusory-contour adapter stimulus that was oriented parallel or orthogonal to the test stimulus. We found orientation-selective adaptation to illusory contours in early (V1 and V2) and higher-tier visual areas (V3, hV4, VO1, V3A/B, V7, LO1, LO2). That is, fMRI responses were smaller for test stimuli parallel to the adapter than for test stimuli orthogonal to the adapter. In two control experiments using spatially jittered and phase-randomized stimuli, we demonstrated that this adaptation was not just in response to differences in the distribution of spectral power in the stimuli. Orientation-selective adaptation to illusory contours increased from early to higher-tier visual areas. Thus, both early and higher-tier visual areas contain neurons selective for the orientation of this type of illusory contour.

Olman CA, Inati S, & Heeger DJ. The effect of large veins on spatial localization with GE BOLD at 3T: displacement, not blurring. NeuroImage, 34:1126-1135, 2007.

Abstract: We used two different methods of region of interest (ROI) definition to investigate the spatial accuracy of functional magnetic resonance imaging (fMRI) at low and high spatial resolution. The “single-condition localizer” consisted of block alternation between a target stimulus and a mean gray background. The “differential localizer” consisted of block alternation between the target stimulus and another stimulus that filled the complement of the visual field. A separate series of scans, in which the target stimulus was presented briefly with long inter-stimulus intervals, was used to measure the hemodynamic impulse response function (HIRF). As expected, the differential localizer defined more restricted ROIs that better matched the predicted cortical representation of the target stimulus. However, at low resolution (3mm isotropic) many voxels that responded positively to the target stimulus in the differential protocol responded negatively to the target stimulus in the single-condition localizer and in the HIRF measurements. The localization errors were attributed to voxels near large veins, which were identified based on low mean intensity and high variance.  At high resolution (1.2mm isotropic), the effects of large veins were present, but affected a smaller number of voxels. Thus, the use of differential localizers does not necessarily result in a more accurate indication of the underlying neural activity. Localization errors are reduced at higher spatial resolutions and can be eliminated by identification and removal of voxels dominated by large veins.

Silver MA, Ress D, & Heeger DJ. Neural correlates of sustained spatial attention in human early visual cortex. Journal of Neurophysiology, 97:229-237, 2007.

Abstract: Attention is thought to enhance perceptual performance at attended locations due to top-down attention signals that modulate activity in visual cortex. Here, we show that activity in early visual cortex is sustained during maintenance of attention in the absence of visual stimulation. We used functional magnetic resonance imaging (fMRI) to measure activity in visual cortex while human subjects performed a visual detection task in which a variable-duration delay period preceded target presentation. Portions of cortical areas V1, V2, and V3 representing the attended part of the visual field exhibited sustained increases in activity throughout the delay period. Portions of these cortical areas representing peripheral, unattended parts of the visual field displayed sustained decreases in activity. The data were well-fit by a model that assumed the sustained neural activity was constant in amplitude over a time period equal to that of the actual delay period for each trial. These results demonstrate that sustained attention responses are present in early visual cortex (including primary visual cortex) in the absence of a visual stimulus, and that these responses correlate with the allocation of visuospatial attention in both the spatial and temporal domains.

Larsson J & Heeger DJ. Two retinotopic visual areas in human lateral occipital cortex. Journal of Neuroscience, 26:13128-13142, 2006.

Abstract: We describe two visual field maps, LO1 and LO2, in human lateral occipital cortex between dorsal V3 and V5/MT+. Each map contained a topographic representation of the contralateral visual hemifield. The eccentricity representations were shared with V1/V2/V3. The polar angle representation in LO1 extended from the lower vertical meridian (at the boundary with dorsal V3) through the horizontal to the upper vertical meridian (at the boundary with LO2). The polar angle representation in LO2 was the mirror-reversal of that in LO1. LO1 and LO2 overlapped with the posterior part of the object-selective lateral occipital complex and the kinetic occipital region (KO). The retinotopy and functional properties of LO1 and LO2 suggest that they correspond to two new human visual areas, which lack exact homologues in macaque visual cortex. The topography, stimulus selectivity and anatomical location of LO1 and LO2 indicate that they integrate shape information from multiple visual submodalities in retinotopic coordinates.

Liu T, Heeger DJ, Carrasco M. Neural correlates of the visual vertical meridian asymmetry. Journal of Vision, 6:1294-1306, 2006.

Abstract: Human visual performance is better below than above fixation along the vertical meridian - vertical meridian asymmetry (VMA). Here we used fMRI to investigate the neural correlates of the VMA. We presented stimuli of two possible sizes and spatial frequencies on the horizontal and vertical meridians and analyzed the fMRI data in subregions of early visual cortex (V1/V2) that corresponded retinotopically to the stimulus locations. Asymmetries in both the spatial extent and amplitude of the fMRI measurements correlated with the behavioral VMA. These results demonstrate that the VMA has a neural basis at the earliest stages of cortical visual processing, and sensory responses of large populations of neurons in visual cortex.

Schluppeck D, Curtis CE, Glimcher PW, & Heeger DJ. Sustained activity in topographic areas of human posterior parietal cortex during memory-guided saccades. Journal of Neuroscience, 26:5098-5108, 2006.

Abstract: In a previous study, we identified three cortical areas in human posterior parietal cortex that exhibited topographic responses during memory-guided saccades [visual area 7 (V7), intraparietal sulcus 1 (IPS1), and IPS2], which are candidate homologs of macaque parietal areas such as the lateral intraparietal area and parietal reach region. Here, we show that these areas exhibit sustained delay-period activity, a critical physiological signature of areas in macaque parietal cortex. By varying delay duration, we disambiguated delay-period activity from sensory and motor responses. Mean time courses in the parietal areas were well fit by a linear model comprising three components representing responses to (1) the visual target, (2) the delay period, and (3) the eye movement interval. We estimated the contributions of each component: the response amplitude during the delay period was substantially smaller than the response to the transient visual target. All three parietal regions showed comparable delay-period response amplitudes, with a trend toward larger responses from V7 to IPS1 and IPS2. Responses to the cue and during the delay period showed clear lateralization with larger responses to trials in which the target was placed in the contralateral visual field, suggesting that both of these components contributed to the topography we measured.
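Illustrative note: the three-component linear model described in the abstract above can be sketched as an ordinary least-squares fit. This is a rough sketch under simplifying assumptions, not the paper's procedure: the regressors are plain boxcars for a single trial with no hemodynamic convolution, the amplitudes and timings are made up, and the helper names `solve` and `fit_components` are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_components(y, regressors):
    """Least-squares amplitudes from the normal equations (X^T X) beta = X^T y."""
    k = len(regressors)
    XtX = [[sum(a * b for a, b in zip(regressors[i], regressors[j]))
            for j in range(k)] for i in range(k)]
    Xty = [sum(a * b for a, b in zip(regressors[i], y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical single-trial design with 12 time points (assumed timings).
n = 12
target = [1.0 if t < 2 else 0.0 for t in range(n)]         # visual target
delay = [1.0 if 2 <= t < 9 else 0.0 for t in range(n)]     # delay period
saccade = [1.0 if 9 <= t < 11 else 0.0 for t in range(n)]  # eye movement

# Synthetic, noise-free time course built from made-up amplitudes.
true_amps = [1.0, 0.25, 0.8]
y = [true_amps[0] * a + true_amps[1] * b + true_amps[2] * c
     for a, b, c in zip(target, delay, saccade)]

amps = fit_components(y, [target, delay, saccade])
print(amps)
```

With noise-free synthetic data the fit recovers the made-up amplitudes. In a real analysis, trials with different delay durations would be fit jointly, which is what allows the delay-period component to be separated from the target and eye-movement responses.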

Larsson J, Landy MS, & Heeger DJ. Orientation-selective adaptation to first- and second-order patterns in human visual cortex. Journal of Neurophysiology, 95:862-881, 2006.

Abstract: Second-order textures – patterns that cannot be detected by mechanisms sensitive only to luminance changes – are ubiquitous in visual scenes, but the neuronal mechanisms mediating perception of such stimuli are not well understood. We used an adaptation protocol to measure neural activity in the human brain selective for the orientation of second-order textures. FMRI responses were measured in three subjects to presentations of first- and second-order probe gratings after adapting to a high-contrast first- or second-order grating that was either parallel or orthogonal to the probe gratings. First-order (LM) stimuli were generated by modulating the stimulus luminance. Second-order stimuli were generated by modulating the contrast (CM) or orientation (OM) of a first-order carrier. We used four combinations of adapter and probe stimuli: LM:LM, CM:CM, OM:OM, and LM:OM. The fourth condition tested for cross-modal adaptation with first-order adapter and second-order probe stimuli. Attention was diverted from the stimulus by a demanding task at fixation. Both first- and second-order stimuli elicited orientation-selective adaptation in multiple cortical visual areas, including V1, V2, V3, V3A/B, a newly identified visual area anterior to dorsal V3 which we have termed LO1, hV4, and VO1. For first-order stimuli (condition LM:LM), the adaptation was no larger in extrastriate areas than in V1, implying that the orientation-selective first-order (luminance) adaptation originated in V1. For second-order stimuli (conditions CM:CM and OM:OM), the magnitude of adaptation, relative to the absolute response magnitude, was significantly larger in VO1 (and for condition CM:CM, also in V3A/B and LO1) than in V1, suggesting that second-order stimulus orientation was extracted by additional processing after V1. There was little difference in the amplitude of adaptation between the second-order conditions. 
No consistent effect of adaptation was found in the cross-modal condition LM:OM, in agreement with psychophysical evidence for weak interactions between first- and second-order stimuli and computational models of separate mechanisms for first- and second-order visual processing.

Silver MA, Ress D, & Heeger DJ, Topographic maps of visual spatial attention in human parietal cortex, J Neurophysiol 94(2):1358-71, 2005.

Abstract: Functional magnetic resonance imaging (fMRI) was used to measure activity in human parietal cortex during performance of a visual detection task in which the focus of attention systematically traversed the visual field. Critically, the stimuli were identical on all trials (except for slight contrast changes in a fully randomized selection of the target locations) whereas only the cued location varied. Traveling waves of activity were observed in posterior parietal cortex consistent with shifts in covert attention in the absence of eye movements. The temporal phase of the fMRI signal in each voxel indicated the corresponding visual field location. Visualization of the distribution of temporal phases on a flattened representation of parietal cortex revealed at least two distinct topographically organized cortical areas within the intraparietal sulcus (IPS), each representing the contralateral visual field. Two cortical areas were proposed based on this topographic organization, which we refer to as IPS1 and IPS2 to indicate their locations within the IPS. This nomenclature is neutral with respect to possible homologies with well-established cortical areas in the monkey brain. The two proposed cortical areas exhibited relatively little response to passive visual stimulation in comparison with early visual areas. These results provide evidence for multiple topographic maps in human parietal cortex.

Schluppeck D, Glimcher P, & Heeger DJ, Topographic organization for delayed saccades in human posterior parietal cortex, J Neurophysiol 94(2):1372-84, 2005.

Abstract: Posterior parietal cortex (PPC) is thought to play a critical role in decision making, sensory attention, motor intention, and/or working memory. Research on the PPC in non-human primates has focused on the lateral intraparietal area (LIP) in the intraparietal sulcus (IPS). Neurons in LIP respond after the onset of visual targets, just before saccades to those targets, and during the delay period in between. To study the function of posterior parietal cortex in humans, it will be crucial to have a routine and reliable method for localizing specific parietal areas in individual subjects. Here, we show that human PPC contains at least two topographically organized regions, which are candidates for the human homologue of LIP. We mapped the topographic organization of human PPC for delayed (memory guided) saccades using fMRI. Subjects were instructed to fixate centrally while a peripheral target was briefly presented. After a further 3-s delay, subjects made a saccade to the remembered target location followed by a saccade back to fixation and a 1-s inter-trial interval. Targets appeared at successive locations "around the clock" (same eccentricity, approximately 30 degrees angular steps), to produce a traveling wave of activity in areas that are topographically organized. PPC exhibited topographic organization for delayed saccades. We defined two areas in each hemisphere that contained topographic maps of the contra-lateral visual field. These two areas were immediately rostral to V7 as defined by standard retinotopic mapping. The two areas were separated from each other and from V7 by reversals in visual field orientation. However, we leave open the possibility that these two areas will be further subdivided in future studies. Our results demonstrate that topographic maps tile the cortex continuously from V1 well into PPC.

Lee SH, Blake R, & Heeger DJ, Traveling waves of activity in primary visual cortex during binocular rivalry, Nature Neuroscience, 8:22-23, 2005.

Abstract: When the two eyes view large dissimilar patterns that induce binocular rivalry, alternating waves of visibility are experienced, as one pattern sweeps the other out of conscious awareness.  Here we show tight linkage between dynamics of perceptual waves during rivalry and neural events in human primary visual cortex (V1).

Neri P, Bridge H & Heeger DJ, Stereoscopic processing of absolute and relative disparity in human visual cortex, Journal of Neurophysiology, 92:1880-1891, 2004.

Abstract: Stereoscopic vision relies mainly on relative depth differences between objects, rather than on their absolute distance in depth from where the eyes fixate. However, relative disparities are computed from absolute disparities, and it is not known where these two stages are represented in the human brain. Using functional magnetic resonance imaging (fMRI), we assessed absolute and relative disparity selectivity with stereoscopic stimuli consisting of pairs of transparent planes in depth in which the absolute and relative disparity signals could be independently manipulated (at a local spatial scale).  In experiment 1, relative disparity was kept constant, while absolute disparity was varied in half the blocks of trials (“mixed” blocks) and kept constant in the remaining half (“same” blocks), alternating between blocks. Because neuronal responses undergo adaptation and reduce their firing rate following repeated presentation of an effective stimulus, the fMRI signal reflecting activity of units selective for absolute disparity is expected to be smaller during “same” blocks as compared to “mixed” ones. Experiment 2 similarly manipulated relative disparity rather than absolute disparity. The results from both experiments were consistent with adaptation with differential effects across visual areas such that 1) dorsal areas (V3a, MT+/V5, V7) showed more adaptation to absolute than to relative disparity; 2) ventral areas (hV4, V8/V4) showed an equal adaptation to both; 3) early visual areas (V1, V2, V3) showed a small effect in both experiments. These results indicate that processing in dorsal areas may rely mostly on information about absolute disparities, while ventral areas split neural resources between the two types of stereoscopic information so as to maintain an important representation of relative disparity.

Zenger-Landolt B & Heeger DJ, Response suppression in V1 agrees with psychophysics of surround masking, Journal of Neuroscience, 23:6884-6893, 2003.

Abstract: When a target stimulus is embedded in a high contrast surround, the target appears reduced in contrast and is harder to detect, and neural responses in visual cortex are suppressed. We used functional magnetic resonance imaging (fMRI) and psychophysics to quantitatively compare these physiological and perceptual effects. Observers performed a contrast discrimination task on a contrast-reversing sinusoidal target grating. The target was either presented in isolation or embedded in a high-contrast surround. While observers performed the task, we also measured fMRI responses as a function of target contrast, both with and without a surround. We found that the surround substantially increased the psychophysical thresholds while reducing fMRI responses. The two data sets were compared, on the basis of the assumption that a fixed response difference is required for correct discrimination, and we found that the psychophysics accounted for 96.5% of the variance in the measured V1 responses. The suppression in visual areas V2 and V3 was stronger, too strong to agree with psychophysics. The good quantitative agreement between psychophysical thresholds and V1 responses suggests V1 as a plausible candidate for mediating surround masking.

Ress D & Heeger DJ, Neuronal correlates of perception in early visual cortex, Nature Neuroscience, 6:414-420, 2003.

Abstract: We used functional magnetic resonance imaging (fMRI) to measure activity in human early visual cortex (areas V1, V2 and V3) during a challenging contrast-detection task. Subjects attempted to detect the presence of slight contrast increments added to two kinds of background patterns. Behavioral responses were recorded so that the corresponding cortical activity could be grouped into the usual signal detection categories: hits, false alarms, misses and correct rejects. For both kinds of background patterns, the measured cortical activity was retinotopically specific. Hits and false alarms were associated with significantly more cortical activity than were correct rejects and misses. That false alarms evoked more activity than misses indicates that activity in early visual cortex corresponded to the subjects' percepts, rather than to the physically presented stimulus.

Carandini M, Heeger DJ, & Senn W, A synaptic explanation of suppression in visual cortex, Journal of Neuroscience, 22:10053–10065, 2002.

Abstract: The responses of neurons in the primary visual cortex (V1) are suppressed by mask stimuli that do not elicit responses if presented alone. This suppression is widely believed to be mediated by intracortical inhibition. As an alternative, we propose that it can be explained by thalamocortical synaptic depression. This explanation correctly predicts that suppression is monocular, immune to cortical adaptation, and occurs for mask stimuli that elicit responses in the thalamus but not in the cortex. Depression also explains other phenomena previously ascribed to intracortical inhibition. It explains why responses saturate at high stimulus contrast, whereas selectivity for orientation and spatial frequency is invariant with contrast. It explains why transient responses to flashed stimuli are nonlinear, whereas spatial summation is primarily linear. These results suggest that the very first synapses into the cortex, and not the cortical network, may account for important response properties of V1 neurons.

Huk A, Dougherty RF, & Heeger DJ, Retinotopy and functional subdivision of human areas MT and MST, Journal of Neuroscience, 22:7195-7205, 2002.

Abstract: We performed a series of functional magnetic resonance imaging experiments to divide the human MT+ complex into subregions that may be identified as homologs to a pair of macaque motion-responsive visual areas: the middle temporal area (MT) and the medial superior temporal area (MST). Using stimuli designed to tease apart differences in retinotopic organization and receptive field size, we established a double dissociation between two distinct MT+ subregions in 8 of the 10 hemispheres studied. The first subregion exhibited retinotopic organization but did not respond to peripheral ipsilateral stimulation, indicative of smaller receptive fields. Conversely, the second subregion within MT+ did not demonstrate retinotopic organization but did respond to peripheral stimuli in both the ipsilateral and contralateral visual hemifields, indicative of larger receptive fields. We tentatively identify these subregions as the human homologues of macaque MT and MST, respectively. Putative human MT and MST were typically located on the posterior/ventral and anterior/dorsal banks of a dorsal/posterior limb of the inferior temporal sulcus, similar to their relative positions in the macaque superior temporal sulcus.

Neri P & Heeger DJ, Spatiotemporal mechanisms for detecting and identifying image features in human vision, Nature Neuroscience, 5:812-816, 2002.

Abstract: The human visual system constantly selects salient features in the environment for further attention, processing and identification. Models of feature detection often assume that salient features are selected on the basis of contrast energy (local variance in intensity in the visual stimulus). This hypothesis, however, has not been tested directly. We used psychophysical reverse correlation to study how humans detect and identify basic image features (bars and short line segments). Subjects detected a briefly-flashed 'target bar' that was embedded in 'noise bars' that randomly changed in intensity over space and time. By studying how the intensity of the noise bars affected performance, we were able to dissociate two processing stages: an early 'detection' stage, whereby only locations of high contrast energy in the image were selected, and an identification stage (~100 ms later) during which subjects used image intensity at selected locations to determine whether the target was bright or dark.

Heeger DJ & Ress D, What does fMRI tell us about neuronal activity?, Nature Reviews Neuroscience, 3:142-151, 2002.

Abstract: In recent years, cognitive neuroscientists have taken great advantage of functional magnetic resonance imaging (fMRI) as a non-invasive method of measuring neuronal activity in the human brain. But what exactly does fMRI tell us? We know that its signals arise from changes in local haemodynamics that, in turn, result from alterations in neuronal activity, but exactly how neuronal activity, haemodynamics and fMRI signals are related is unclear. It has been assumed that the fMRI signal is proportional to the local average neuronal activity, but many factors can influence the relationship between the two. A clearer understanding of how neuronal activity influences the fMRI signal is needed if we are correctly to interpret functional imaging data.

Huk AC & Heeger DJ, Pattern-motion responses in human visual cortex, Nature Neuroscience, 5:72-75, 2001.

Abstract: Physiological models of visual motion processing posit that 'pattern-motion cells' represent the direction of moving objects independent of their particular spatial pattern. We performed fMRI experiments to identify pattern-motion cells in the human brain, and to test the hypothesis that the activity of these neurons is linked to the perception of coherent motion. A protocol employing moving 'plaid' stimuli allowed us to separate pattern-motion responses from other types of motion-related activity within the same brain structures, and revealed strong pattern-motion selectivity in human visual area MT+. Reducing the perceptual coherence of the plaids yielded a corresponding decrease in pattern-motion responsivity, providing evidence that percepts of coherent motion are closely linked to the activity of pattern-motion cells.

Huk AC, Ress D, & Heeger DJ, Neuronal basis of the motion aftereffect reconsidered. Neuron, 32:161–172, 2001.

Abstract: Several recent fMRI studies have reported response increases in human MT+ correlated with perception of the motion aftereffect (MAE). However, MT+ responses can be strongly affected by attention, and subjects may naturally attend more strongly during the MAE than during controls without MAE. We found that requiring subjects to attend to the motion of the stimulus on both MAE and control trials produced equal levels of MT+ response, suggesting that attention may be a major confound in the interpretation of previous fMRI MAE experiments; in our data, attention appears to account for the entire effect. After eliminating this confound, we sought to measure direction-selective motion adaptation in human visual cortex. We observed that adaptation produced a direction-selective imbalance in MT+ responses (as well as earlier visual areas including V1), and yielded a corresponding psychophysical asymmetry in speed discrimination thresholds. These findings provide physiological evidence of a population-level response imbalance related to the MAE, and quantify the relative proportions of direction-selective neurons in human cortical visual areas.

Backus BT, Fleet DJ, Parker AJ, & Heeger DJ, Human cortical activity correlates with stereoscopic depth perception. J Neurophysiol, 86:2054-2068, 2001.

Abstract: Stereoscopic depth perception is based on binocular disparities. Although neurons in primary visual cortex (V1) are selective for binocular disparity, their responses do not explicitly code perceived depth. The stereoscopic pathway must therefore include additional processing beyond V1. We used functional magnetic resonance imaging (fMRI) to examine stereo processing in V1 and other areas of visual cortex. We created stereoscopic stimuli that portrayed two planes of dots in depth, placed symmetrically about the plane of fixation, or else asymmetrically with both planes either nearer or farther than fixation. The interplane disparity was varied parametrically to determine the stereoacuity threshold (the smallest detectable disparity) and the upper depth limit (largest detectable disparity). fMRI was then used to quantify cortical activity across the entire range of detectable interplane disparities. Measured cortical activity covaried with psychophysical measures of stereoscopic depth perception. Activity increased as the interplane disparity increased above the stereoacuity threshold and dropped as interplane disparity approached the upper depth limit. From the fMRI data and an assumption that V1 encodes absolute retinal disparity, we predicted that the mean response of V1 neurons should be a bimodal function of disparity. A post hoc analysis of electrophysiological recordings of single neurons in macaques revealed that, although the average firing rate was a bimodal function of disparity (as predicted), the precise shape of the function cannot fully explain the fMRI data. Although there was widespread activity within the extrastriate cortex (consistent with electrophysiological recordings of single neurons), area V3A showed remarkable sensitivity to stereoscopic stimuli, suggesting that neurons in V3A may play a special role in the stereo pathway.

Simoncelli EP & Heeger DJ, Representing retinal image speed in visual cortex. Nature Neuroscience, 4:461-462, 2001.

Abstract: Speed preferences in MT neurons are found to be unaffected by changes in stimulus pattern, supporting the hypothesis that these neurons represent retinal image velocities.

Polonsky A, Blake R, Braun J, & Heeger DJ, Neuronal activity in human primary visual cortex correlates with perception during binocular rivalry. Nature Neuroscience, 3:1153-1159, 2000.

Abstract: During binocular rivalry two incompatible monocular images compete for perceptual dominance, with one pattern temporarily suppressed from conscious awareness. We measured fMRI signals in early visual cortex while subjects viewed rival dichoptic images of two different contrasts; the contrast difference served as a “tag” for the representations of the two monocular images. Activity in primary visual cortex (V1) increased when subjects perceived the higher contrast pattern and decreased when subjects perceived the lower contrast pattern. These fluctuations in V1 activity during rivalry were about 55% as large as those evoked by alternately presenting the two monocular images without rivalry. The rivalry-related fluctuations in V1 activity were roughly equal to those observed in extrastriate visual areas (V2, V3, V3a, and V4v). These results challenge the view that binocular rivalry primarily takes place later in the cortical visual pathways.

Ress D, Backus BT, & Heeger DJ, Activity in primary visual cortex predicts performance in a visual detection task. Nature Neuroscience, 3:940-945, 2000.

Abstract: Visual attention can affect both neural activity and behavior in humans. To quantify possible links between the two, we measured activity in early visual cortex (V1, V2 and V3) during a challenging pattern detection task. Activity was dominated by a large response that was independent of the presence or absence of the stimulus pattern. The measured activity quantitatively predicted the subject's pattern detection performance: when activity was greater, the subject was more likely to correctly discern the presence or absence of the pattern. This stimulus independent activity had several characteristics of visual attention, suggesting that attentional mechanisms modulate activity in early visual cortex, and that this attention related activity strongly influences performance.

Heeger DJ, Huk AC, Geisler WS, & Albrecht DG, Spikes versus BOLD: what does neuroimaging tell us about neuronal activity? Nature Neuroscience, 3:631-633, 2000.

Abstract: By demonstrating that fMRI responses in human MT+ increase linearly with motion coherence and comparing these responses with slopes of single-neuron firing rates in monkey MT, a new paper provides the best evidence so far that fMRI responses are proportional to firing rates.

Xing J, & Heeger DJ, Measurement and modeling of center-surround suppression and enhancement. Vision Research, 41:571-583, 2001.

Abstract: The apparent contrast of a central stimulus is affected by the presence of surrounding stimuli. For some stimulus conditions, the apparent contrast is suppressed and for other conditions the apparent contrast is enhanced. This report is intended to offer a coherent description of the stimulus factors that influence suppression and enhancement. Using a contrast-matching protocol, we measured the contrast dependence of center-surround interactions by systematically varying the suprathreshold contrasts of the central and surround gratings. Different spatial configurations of the surround stimuli were studied. Our results confirmed previous findings that (1) a surround stimulus could produce either contrast enhancement or contrast suppression depending on the balance of the central and surround contrasts; (2) suppression varied with the width of the surround stimulus and was strongly orientation-specific; and (3) enhancement was less sensitive to changes in surround configurations (in particular, enhancement did not depend on the colinearity of the central and surround gratings). Based on the experimental data, we developed a computational model to account for center-surround suppression and enhancement.

Xing J, & Heeger DJ, Center-surround interactions in foveal and peripheral vision. Vision Research, 40:3065-3072, 2000.

Abstract: The perceived contrast of a central stimulus can be decreased (surround suppression) or increased (surround facilitation) by the presence of surround stimuli. In this report we examined center-surround interactions in foveal and peripheral vision using contrast-matching tasks. We found that: (1) surround suppression became markedly stronger as the center-surround stimulus was moved toward the periphery; (2) surround facilitation diminished in the periphery; and (3) the suppression in the periphery was less orientation- and frequency-specific than that in the fovea, so that significant suppression was induced even when the central and surround gratings had very different orientations and spatial frequencies. The different center-surround interactions in the fovea and periphery cannot be accounted for by cortical magnification, suggesting that center-surround interactions in the fovea and periphery are incommensurable and play different functional roles in human image processing.

Huk AC & Heeger DJ, Task-related modulation of visual cortex. J Neurophysiol, 83:3525–3536, 2000.

Abstract: We performed a series of experiments to quantify the effects of task performance on cortical activity in early visual areas. Functional magnetic resonance imaging (fMRI) was used to measure cortical activity in several cortical visual areas including primary visual cortex (V1) and the MT complex (MT+) as subjects performed a variety of threshold-level visual psychophysical tasks. Performing speed, direction, and contrast discrimination tasks produced strong modulations of cortical activity. For example, one experiment tested for selective modulations of MT+ activity as subjects alternated between performing contrast and speed discrimination tasks. MT+ responses modulated in phase with the periods of time during which subjects performed the speed discrimination task; that is, MT+ activity was higher during speed discrimination than during contrast discrimination. Task related modulations were consistent across repeated measurements in each subject; however, significant individual differences were observed between subjects. Together, the results suggest 1) that specific changes in the cognitive/behavioral state of a subject can exert selective and reliable modulations of cortical activity in early visual cortex, even in V1; 2) that there are significant individual differences in these modulations; and 3) that visual areas and pathways that are highly sensitive to small changes in a given stimulus feature (such as contrast or speed) are selectively modulated during discrimination judgments on that feature. Increasing the gain of the relevant neuronal signals in this way may improve their signal-to-noise to help optimize task performance.

Nestares O & Heeger DJ, Robust multiresolution alignment of MRI brain volumes, Magnetic Resonance in Medicine, 43:705-715, 2000.

Abstract: An algorithm for the automatic alignment of MRI volumes of the human brain was developed, based on techniques adopted from the computer vision literature for image motion estimation. Most image registration techniques rely on the assumption that corresponding voxels in the two volumes have equal intensity, which is not true for MRI volumes acquired with different coils and/or pulse sequences. Intensity normalization and contrast equalization were used to minimize the differences between the intensities of the two volumes. However, these preprocessing steps do not correct perfectly for the image differences when using different coils and/or pulse sequences. Hence, the alignment algorithm relies on robust estimation, which automatically ignores voxels where the intensities are sufficiently different in the two volumes. A multiresolution pyramid implementation enables the algorithm to estimate large displacements. The resulting algorithm is used routinely to align MRI volumes acquired using different protocols (3D SPGR and 2D fast spin echo) and different coils (surface and head) to subvoxel accuracy (better than 1 mm).

Heeger DJ, Boynton GM, Demb JB, Seidemann E, & Newsome WT, Motion Opponency in Visual Cortex, Journal of Neuroscience, 19:7162-7174, 1999.

Abstract: Perceptual studies suggest that visual motion perception is mediated by opponent mechanisms that correspond to mutually suppressive populations of neurons sensitive to motions in opposite directions.  We tested for a neuronal correlate of motion opponency using functional magnetic resonance imaging to measure brain activity in human visual cortex.  There was strong motion opponency in a secondary visual cortical area known as the human MT complex (MT+), but there was little evidence of motion opponency in primary visual cortex (V1).  To determine whether the level of opponency in human MT+ and monkey MT are comparable, a variant of these experiments was performed using multi-unit electrophysiological recording in areas MT and MST of the macaque monkey brain.  While there was substantial variability in the degree of opponency between recording sites, the monkey and human data were qualitatively similar on average.  These results provide further evidence that: 1) direction selective signals underlie human MT+ responses, 2) neuronal signals in human MT+ support visual motion perception, 3) human MT+ may be homologous to macaque monkey MT along with adjacent motion sensitive brain areas, and 4) that fMRI measurements are correlated with average spiking activity.

Heeger DJ, Linking Visual Perception with Human Brain Activity, Current Opinion in Neurobiology, 9:474-479, 1999.

Abstract: The past year has seen great advances in the use of functional magnetic resonance imaging (fMRI) to study the functional organization of human visual cortex, to measure the neuronal correlates of visual perception, and to test computational theories of vision.  This paper reviews quantitative fMRI methods and summarizes some recent results that illustrate the promise of the approach.

Gandhi SP, Heeger DJ, and Boynton GM, Spatial Attention Affects Brain Activity in Human Primary Visual Cortex, Proc Natl Acad Sci USA, 96:3314-3319, 1999.

Abstract: Functional magnetic resonance imaging (fMRI) was used to test if instructing subjects to attend to one or another location in a visual scene would affect neural activity in human primary visual cortex (V1). Stimuli were moving gratings restricted to a pair of peripheral, circular apertures, positioned to the right and to the left of a central fixation point. Subjects were trained to perform a motion discrimination task, attending (without moving their eyes) at any moment in time to one of the two stimulus apertures. FMRI responses were recorded while subjects were cued to alternate their attention between the two apertures. V1 responses in each hemisphere modulated with the alternation of the cue; responses were greater when the subject attended to the stimuli in the contralateral hemifield. The attentional modulation of the brain activity was about 25 percent of that evoked by alternating the stimulus with a uniform field.

Demb JB, Boynton GM, and Heeger DJ, Functional Magnetic Resonance Imaging of Early Visual Pathways in Dyslexia, J Neurosci, 18:6939-6951, 1998.

Abstract: We measured brain activity, perceptual thresholds and reading performance in a group of dyslexic and normal readers to test the hypothesis that dyslexia is associated with an abnormality in the magnocellular (M) pathway of the early visual system. Functional magnetic resonance imaging (fMRI) was used to measure brain activity in conditions designed to preferentially stimulate the M pathway. Speed discrimination thresholds, that measure the minimal increase in stimulus speed that is just noticeable, were acquired in a paradigm modeled after a previous study of M pathway lesioned monkeys. Dyslexics showed reduced brain activity compared to controls both in primary visual cortex (V1) and in several extrastriate areas, including area MT+ that is believed to receive a predominant M pathway input. There was a strong three-way correlation between brain activity, speed discrimination thresholds, and reading speed. Subjects with higher V1 and MT+ responses had lower perceptual thresholds (better performance) and were faster readers. These results support the hypothesis for an M pathway abnormality in dyslexia and imply strong relationships between the integrity of the M pathway, visual motion perception, and reading ability.

Boynton GM, Demb JB, Glover GH, and Heeger DJ, Neural Basis of Contrast Discrimination, Vision Research, 39:257-269, 1999.

Abstract: Psychophysical contrast increment thresholds were compared with neuronal responses, measured using functional magnetic resonance imaging (fMRI), to test the hypothesis that pattern discrimination judgments are limited by neuronal signals in early visual cortical areas. FMRI was used to measure human brain activity as a function of stimulus contrast, in each of several identifiable visual cortical areas. Contrast increment thresholds were measured for the same stimuli across a range of baseline contrasts. FMRI responses in visual areas V1, V2d, and V3d were found to be consistent with the psychophysical judgments, i.e., a contrast increment was detected when the fMRI responses in each of these brain areas increased by a criterion amount. Thus, the pooled activity of large numbers of neurons can reasonably well predict behavioral performance.

Demb JB, Boynton GM, Best M, and Heeger DJ, Psychophysical evidence for a magnocellular pathway deficit in dyslexia, Vision Research, 38:1555-1560, 1998.

Abstract: The relationship between reading ability and psychophysical performance was examined to test the hypothesis that dyslexia is associated with a deficit in the magnocellular (M) pathway. Speed discrimination thresholds and contrast detection thresholds were measured under conditions (low mean luminance, low spatial frequency, high temporal frequency) for which psychophysical performance presumably depends on M pathway integrity. Dyslexic subjects had higher psychophysical thresholds than controls in both the speed discrimination and contrast detection tasks, but only the differences in speed thresholds were statistically significant. In addition, there was a strong correlation between individual differences in speed thresholds and reading rates. These results support the hypothesis for an M pathway abnormality in dyslexia, and suggest that motion discrimination may be a better indicator of dyslexia than is contrast sensitivity.

Carandini M, Heeger DJ, and Movshon JA, Linearity and Gain Control in V1 Simple Cells, in Cerebral Cortex, vol. 13: Models of Cortical Circuits, 1999.

Simoncelli EP and Heeger DJ, A Model of Neuronal Responses in Visual Area MT. Vision Research, 38:743-761, 1998.

Abstract: Electrophysiological studies indicate that neurons in the Middle Temporal (MT) area of the primate brain are selective for the velocity of visual stimuli. This paper describes a computational model of MT physiology, in which local image velocities are represented via the distribution of MT neuronal responses. The computation is performed in two stages, corresponding to neurons in cortical areas V1 and MT. Each stage computes a weighted linear sum of inputs, followed by rectification and divisive normalization. V1 receptive field weights are designed for orientation and direction selectivity. MT receptive field weights are designed for velocity (both speed and direction) selectivity. The paper includes computational simulations accounting for a wide range of physiological data, and describes experiments that could be used to further test and refine the model.

Black M, Sapiro G, Marimont D, and Heeger DJ, Robust anisotropic diffusion, IEEE Transactions on Image Processing, 7:421-432, 1998.

Abstract: Relations between anisotropic diffusion and robust statistics are described in this paper. Specifically, we show that anisotropic diffusion can be seen as a robust estimation procedure that estimates a piecewise smooth image from a noisy input image. The “edge-stopping” function in the anisotropic diffusion equation is closely related to the error norm and influence function in the robust estimation framework. This connection leads to a new “edge-stopping” function based on Tukey’s biweight robust estimator that preserves sharper boundaries than previous formulations and improves the automatic stopping of the diffusion. The robust statistical interpretation also provides a means for detecting the boundaries (edges) between the piecewise smooth regions in an image that has been smoothed with anisotropic diffusion. Additionally, we derive a relationship between anisotropic diffusion and regularization with line processes. Adding constraints on the spatial organization of the line processes allows us to develop new anisotropic diffusion equations that result in a qualitative improvement in the continuity of edges.
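As a rough illustration of the edge-stopping idea in the abstract above, the following minimal sketch applies a Tukey-biweight weight inside a 1-D explicit diffusion loop. The function names, the 1-D setting, the step size, and the omission of any normalization constant are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np

def tukey_edge_stop(x, sigma):
    """Tukey-biweight edge-stopping weight (constant factor omitted):
    falls smoothly from 1 at x = 0 to exactly 0 for |x| >= sigma."""
    x = np.asarray(x, dtype=float)
    w = (1.0 - (x / sigma) ** 2) ** 2
    return np.where(np.abs(x) <= sigma, w, 0.0)

def diffuse_1d(signal, sigma, step=0.25, iterations=200):
    """Explicit 1-D diffusion; flow between neighbours is gated by the weight."""
    u = np.asarray(signal, dtype=float).copy()
    for _ in range(iterations):
        d = np.diff(u)                        # u[i+1] - u[i]
        flow = tukey_edge_stop(d, sigma) * d  # zero across large jumps
        u[:-1] += step * flow                 # left neighbour gains
        u[1:] -= step * flow                  # right neighbour loses the same amount
    return u

# Hypothetical test signal: a unit step edge plus small Gaussian noise.
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(50), np.ones(50)])
noisy = clean + 0.05 * rng.standard_normal(clean.size)
smoothed = diffuse_1d(noisy, sigma=0.5)
```

Because the weight is exactly zero for differences larger than `sigma`, nothing flows across the step edge, while the small noise differences on either side are averaged away.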

Demb JB, Boynton GM, and Heeger DJ, Brain activity in visual cortex predicts individual differences in reading performance, Proc Natl Acad Sci USA, 94:13363-13366, 1997.

Abstract: The relationship between brain activity and reading performance was examined to test the hypothesis that dyslexia involves a deficit in a specific visual pathway known as the magnocellular (M) pathway. Functional magnetic resonance imaging (fMRI) was used to measure brain activity in dyslexic and control subjects in conditions designed to preferentially stimulate the M pathway. Dyslexics showed reduced activity compared to controls both in primary visual cortex (V1) and in a secondary cortical visual area (MT+) that is believed to receive a strong M pathway input. Most importantly, significant correlations were found between individual differences in reading rate and brain activity. These results support the hypothesis for an M pathway abnormality in dyslexia and imply a strong relationship between the integrity of the M pathway and reading ability.

Carandini M, Heeger DJ, and Movshon JA, Linearity and normalization of simple cells of the macaque primary visual cortex, J Neurosci, 17:8621-8644, 1997.

Abstract: Simple cells in the primary visual cortex often appear to compute a weighted sum of the light intensity distribution of the visual stimuli that fall on their receptive fields. A linear model of these cells has the advantage of simplicity and captures a number of basic aspects of cell function. It, however, fails to account for important response nonlinearities, such as the decrease in response gain and latency observed at high contrasts and the effects of masking by stimuli that fail to elicit responses when presented alone. To account for these nonlinearities we have proposed a normalization model, which extends the linear model to include mutual shunting inhibition among a large number of cortical cells. Shunting inhibition is divisive, and its effect in the model is to normalize the linear responses by a measure of stimulus energy. To test this model we performed extracellular recordings of simple cells in the primary visual cortex of anesthetized macaques. We presented large stimulus sets consisting of (1) drifting gratings of various orientations and spatiotemporal frequencies; (2) plaids composed of two drifting gratings; and (3) gratings masked by full-screen spatiotemporal white noise. We derived expressions for the model predictions and fitted them to the physiological data. Our results support the normalization model, which accounts for both the linear and the nonlinear properties of the cells. An alternative model, in which the linear responses are subject to a compressive nonlinearity, did not perform nearly as well.

Tolhurst DJ and Heeger DJ, Contrast normalization and a linear model for the directional selectivity of simple cells in cat striate cortex, Visual Neuroscience, 14:19-26, 1997.

Abstract: Previous tests of the linearity of spatiotemporal summation in cat simple cells have compared the responses to moving sinusoidal gratings and to gratings whose contrast was modulated sinusoidally in time. In particular, since a moving grating can be expressed as a sum of modulated gratings, the response to a moving grating should be predictable (assuming linearity) from the responses to modulated gratings. However, these simple linear predictions have shown varying degrees of failure (e.g. Reid et al., 1987, 1991), depending on the directional selectivity of the neurons (Tolhurst & Dean, 1991). We demonstrate here that the failures of these linear predictions are, in fact, explained by the contrast normalization model of Heeger (1993). We concentrate on the ratio of the measured to predicted moving grating responses. In the context of the contrast normalization model, calculating this ratio turns out to be particularly appropriate, since the ratio is independent of the precise details of the linear front-end mechanisms ultimately responsible for directional selectivity. Hence, the contrast normalization model can be compared quantitatively with this ratio measure, by varying only one free parameter. When account is taken both of the expansive output nonlinearity and of contrast normalization, the directional selectivity of simple cells seems to be dependent only on linear spatiotemporal filtering.

Tolhurst DJ and Heeger DJ, Comparison of contrast normalization and threshold models of the responses of simple cells in cat striate cortex, Visual Neuroscience, 14:293-309, 1997.

Abstract: In almost every study of the linearity of spatiotemporal summation in simple cells of the cat's visual cortex, there have been systematic mismatches between the experimental observations and the predictions of the linear theory. These mismatches have generally been explained by supposing that the initial spatiotemporal summation stage is strictly linear, but that the following output stage of the simple cell is subject to some contrast dependent nonlinearity. Two main models of the output nonlinearity have been proposed: the threshold model (e.g. Tolhurst & Dean, 1987) and the contrast normalization model (e.g. Heeger, 1992a,b). In this paper, the two models are fitted rigorously to a variety of previously published neurophysiological data, in order to determine whether one model is a better explanation of the data. We reexamine data on the interaction between two bar stimuli presented in different parts of the receptive field; on the relationship between the receptive field map and the inverse Fourier transform of the spatial frequency tuning curve; on the dependence of response amplitude and phase on the spatial phase of stationary gratings; on the relationships between the responses to moving and modulated gratings; and on the suppressive action of gratings moving in a neuron's nonpreferred direction. In many situations, the predictions of the two models are similar, but the contrast normalization model usually fits the data slightly better than the threshold model, and it is easier to apply the equations of the normalization model. More importantly, the normalization model is naturally able to account very well for the details and subtlety of the results in experiments where the total contrast energy of the stimuli changes; some of these phenomena are completely beyond the scope of the threshold model. Rigorous application of the models' equations has revealed some situations where neither model fits quite well enough, and we must suppose, therefore, that there are some subtle nonlinearities still to be characterized.

Fleet DJ, Wagner H, and Heeger DJ, Modelling Binocular Neurons in the Primary Visual Cortex, in Computational and Biological Mechanisms of Visual Coding, M. Jenkin and L. Harris, eds, Cambridge University Press, p. 103-130, 1997.

Abstract: This chapter presents a formal description and analysis of a binocular energy model of disparity selectivity. According to this model, disparity selectivity results from a combination of position-shifts and/or phase-shifts. Our theoretical analysis suggests how one might perform an experiment to estimate the relative contributions of phase and position shifts to the disparity selectivity of binocular neurons, based on their responses to drifting sinusoidal grating stimuli of different spatial frequencies and disparities.

We also show that for drifting grating stimuli, the binocular energy response (with phase and/or position shifts) is a sinusoidal function of disparity, consistent with the physiology of neurons in primary visual cortex (area 17) of the cat. However, Freeman and Ohzawa (1990) also found that the depth of modulation in the sinusoidal disparity tuning curves was remarkably invariant to interocular contrast differences. This is inconsistent with the binocular energy model.

As a consequence we propose a modified binocular energy model that incorporates two stages of divisive normalization. The first normalization stage is monocular, preceding the combination of signals from the two eyes. The second normalization stage is binocular. Our simulation results demonstrate that the normalized binocular energy model provides the required stability of the depth of response modulation. Simulations also demonstrate that the model's monocular and binocular contrast response curves are consistent with those of neurons in primary visual cortex.
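The behaviour of the unnormalized binocular energy model described in this abstract can be sketched numerically. The example below is only an illustration under stated assumptions: 1-D Gabor receptive fields, a position-shifted grating in the right eye, and arbitrary units and parameter values, none of which come from the chapter. It does not implement the two normalization stages.

```python
import numpy as np

x = np.linspace(-4.0, 4.0, 801)   # retinal position (arbitrary units, assumed)
freq = 1.0                        # spatial frequency of grating and filters
envelope = np.exp(-x ** 2 / (2 * 0.7 ** 2))
gabor_even = envelope * np.cos(2 * np.pi * freq * x)
gabor_odd = envelope * np.sin(2 * np.pi * freq * x)

def binocular_energy(disparity, contrast_left=1.0, contrast_right=1.0, phase=0.3):
    """Sum left- and right-eye linear responses, then square and add the
    outputs of the quadrature (even/odd) pair."""
    left = contrast_left * np.cos(2 * np.pi * freq * x + phase)
    right = contrast_right * np.cos(2 * np.pi * freq * (x - disparity) + phase)
    even = gabor_even @ left + gabor_even @ right
    odd = gabor_odd @ left + gabor_odd @ right
    return even ** 2 + odd ** 2

def modulation_depth(curve):
    """(max - min) / (max + min) of a disparity tuning curve."""
    return (curve.max() - curve.min()) / (curve.max() + curve.min())

disparities = np.linspace(0.0, 1.0, 41)   # one full grating period
equal = np.array([binocular_energy(d) for d in disparities])
unequal = np.array([binocular_energy(d, contrast_right=0.25) for d in disparities])

print(modulation_depth(equal), modulation_depth(unequal))
```

In this sketch the tuning curve is a raised cosine of disparity, and its modulation depth drops when the two eyes see different contrasts. That contrast dependence is the property of the plain energy model that the abstract describes as inconsistent with the physiological data.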

Fleet DJ & Heeger DJ. Embedding invisible information in color images, in Proceedings of International Conference on Image Processing, 1997.

Abstract: We describe a method for embedding information in color images. A model of human color vision is used to ensure that the embedded signal is invisible. Sinusoidal signals are embedded so that they can be detected (decoded) without the use of the original image. The sinusoids act as a grid, providing a coordinate frame on the image. We use the grid to automatically scale and align (deskew) images that have been printed and then scanned.

Nestares O and Heeger DJ, Modeling the Apparent Frequency Specific Suppression in Simple Cell Responses, Vision Research, 37:1535-1543, 1997.

Abstract: Simple cells in cat striate cortex are selective for spatial frequency. It is widely believed that this selectivity arises simply because of the way in which the neurons sum inputs from the lateral geniculate nucleus. Alternate models, however, advocate the need for frequency-specific inhibitory mechanisms to refine the spatial frequency selectivity. Indeed, simple cell responses are often suppressed by superimposing stimuli with spatial frequencies that flank the neuron's preferred spatial frequency.

In this article, we compare two models of simple cell responses head-to-head. One of these models, the flanking-suppression model, includes an inhibitory mechanism that is specific to frequencies that flank the neuron's preferred spatial frequency. The other model, the nonspecific-suppression model, includes a suppressive mechanism that is very broadly tuned for spatial frequency. Both models also include a rectification nonlinearity and both may include an additional accelerating (e.g., squaring) output nonlinearity. We demonstrate that both models can be consistent with the apparent flanking suppression. However, based on other experimental results, we argue that the nonspecific-suppression model is more plausible. We conclude that the suppression is probably broadly tuned for spatial frequency and that the apparent flanking suppression is actually due to distortions introduced by an accelerating output nonlinearity.

Tian TY, Tomasi C, and Heeger DJ, Comparison of Approaches to Egomotion Computation, Proceedings of Computer Vision and Pattern Recognition, 1996.

Abstract: We evaluated six algorithms for computing egomotion from image velocities. We established benchmarks for quantifying bias and sensitivity to noise, and for quantifying the convergence properties of those algorithms that require numerical search. Our simulation results reveal some interesting and surprising results. First, it is often written in the literature that the egomotion problem is difficult because translation (e.g., along the X-axis) and rotation (e.g., about the Y-axis) produce similar image velocities. We found, to the contrary, that the bias and sensitivity of our six algorithms are totally invariant with respect to the axis of rotation. Second, it is also believed by some that fixating helps to make the egomotion problem easier. We found, to the contrary, that fixating does not help when the noise is independent of the image velocities. Fixation does help if the noise is proportional to speed, but this is only for the trivial reason that the speeds are slower under fixation. Third, it is widely believed that increasing the field of view will yield better performance. We found, to the contrary, that this is not necessarily true.

Boynton GM, Engel SA, Glover GH, and Heeger DJ, Linear Systems Analysis of fMRI in Human V1, J Neurosci, 16:4207-4221, 1996.

Abstract: The linear transform model of functional magnetic resonance imaging (fMRI) hypothesizes that fMRI responses are proportional to local average neural activity, averaged over a period of time. This article reports results from three empirical tests that support this hypothesis. First, fMRI responses in human primary visual cortex (V1) depend separably on stimulus timing and stimulus contrast. Secondly, responses to long duration stimuli can be predicted from responses to shorter duration stimuli. Thirdly, the noise in the fMRI data is independent of stimulus contrast and temporal period. Although these tests cannot prove the correctness of the linear transform model, they might have been used to reject the model. Since the linear transform model is consistent with our data, we proceeded to estimate the temporal fMRI impulse response function and the underlying (presumably neural) contrast-response function of human V1.
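The superposition test mentioned in this abstract (predicting long-duration responses from short-duration ones) follows directly from linearity, as the sketch below illustrates. The gamma-shaped impulse response, its parameter values, the sampling interval, and all function names are placeholders chosen for illustration, not the fitted values from the paper.

```python
import numpy as np

DT = 0.5  # sampling interval in seconds (assumed)

def impulse_response(n=3, tau=1.5, delay=2.0, length=30.0):
    """Gamma-shaped temporal impulse response with placeholder parameters."""
    t = np.arange(0.0, length, DT)
    s = np.clip(t - delay, 0.0, None)
    h = (s / tau) ** (n - 1) * np.exp(-s / tau)
    return h / h.sum()

def boxcar(duration, amplitude=1.0, total=60.0):
    """Idealized neural activity: constant while the stimulus is on."""
    t = np.arange(0.0, total, DT)
    return amplitude * (t < duration)

def predict_fmri(neural):
    """Linear transform model: neural activity convolved with the impulse response."""
    return np.convolve(neural, impulse_response())[: neural.size]

short = predict_fmri(boxcar(6.0))         # response to a 6 s stimulus
long_direct = predict_fmri(boxcar(12.0))  # response to a 12 s stimulus

# Predict the 12 s response as the 6 s response plus a copy delayed by 6 s.
shift = int(round(6.0 / DT))
long_predicted = short + np.concatenate([np.zeros(shift), short[:-shift]])

# Separability: doubling the neural amplitude only rescales the time course.
doubled = predict_fmri(boxcar(6.0, amplitude=2.0))
```

Under the model the two 12 s curves agree exactly. In an experiment, a systematic mismatch between the measured long-duration response and the shifted-and-summed short-duration responses would count as evidence against linearity.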

Fleet DJ, Wagner H, and Heeger DJ, Encoding of Binocular Disparity: Energy Models, Position Shifts and Phase Shifts, Vision Research, 36:1839-1858, 1996.

Abstract: Neurophysiological data support two models for the disparity selectivity of binocular simple and complex cells in the primary visual cortex. These involve binocular combinations of monocular receptive fields that are shifted in retinal position (the position-shift model) or in phase (the phase-shift model) between the two eyes. This article presents a theoretical analysis of these two models. We describe the quantitative behaviour of these model neurons, along with proposals for how one might measure the relative contributions of phase- and position-shifts towards the disparity selectivity of binocular cells. The analysis also reveals ambiguities in the disparity encoding that are inherent in these model neurons, suggesting a need for a second stage of processing; we propose that pooling the binocular responses across orientations and scales (spatial frequency) is capable of producing an unambiguous representation of disparity.

Heeger DJ, Simoncelli EP, and Movshon JA, Computational Models of Cortical Visual Processing, Proc Natl Acad Sci USA, 93:623-627, 1996.

Abstract: The visual responses of neurons in the cerebral cortex were first adequately characterized in the 1960s by D. H. Hubel and T. N. Wiesel [(1962) J. Physiol. (London) 160, 106-154; (1968) J. Physiol. (London) 195, 215-243] using qualitative analyses based on simple geometric visual targets. Over the past 30 years, it has become common to consider the properties of these neurons by attempting to make formal descriptions of the transformations they execute on the visual image. Most such models have their roots in linear-systems approaches pioneered in the retina by C. Enroth-Cugell and J. R. Robson [(1966) J. Physiol. (London) 187, 517-552], but it is clear that purely linear models of cortical neurons are inadequate. We present two related models: one designed to account for the responses of simple cells in primary visual cortex (V1) and one designed to account for the responses of pattern direction selective cells in MT (or V5), an extrastriate visual area thought to be involved in the analysis of visual motion. These models share a common structure that operates in the same way on different kinds of input, and instantiate the widely held view that computational strategies are similar throughout the cerebral cortex. Implementations of these models for Macintosh microcomputers are available and can be used to explore the models' properties.

Heeger AJ, Heeger DJ, Langen J, and Yang T, Image Enhancement using Polymer Grid Triode Arrays, Science, 270:1642-1644, 1995.

Abstract: An array of polymer grid triodes with common grid functions as a "plastic retina" which provides local contrast gain control for image enhancement. The polymer grid triode array can be implemented on the focal plane to process the analog data directly from a photodetector array. Alternatively, the array of polymer grid triodes can be utilized after analog to digital conversion and integrated directly into a display.

Heeger DJ and Bergen JR, Pyramid Based Texture Analysis/Synthesis, Computer Graphics Proceedings, p. 229-238, 1995.

Abstract: This paper describes a method for synthesizing images that match the texture appearance of a given digitized sample. This synthesis is completely automatic and requires only the "target" texture as input. It allows generation of as much texture as desired so that any object can be covered. It can be used to produce solid textures for creating textured 3-d objects without the distortions inherent in texture mapping. It can also be used to synthesize texture mixtures, images that look a bit like each of several digitized samples. The approach is based on a model of human texture perception, and has potential to be a practically useful tool for graphics applications.

Heeger DJ, The Representation of Visual Stimuli in Primary Visual Cortex, Current Directions in Psychological Science, 3:159-163, 1994.

Carandini M, and Heeger DJ, Summation and Division by Neurons in Visual Cortex. Science, 264:1333-1336, 1994.

Abstract: Recordings from monkey primary visual cortex (V1) were used to test a model for the visually-driven responses of simple cells. According to the model, simple cells compute a linear sum of the responses of lateral geniculate nucleus (LGN) neurons. In addition, each simple cell's linear response is divided by the pooled activity of a large number of other simple cells. The cell membrane performs both operations; synaptic currents are summed and then divided by the total membrane conductance. Current and conductance are decoupled (by a complementary arrangement of excitation and inhibition) so that current depends only on the LGN inputs and conductance depends only on the cortical inputs. Closed form expressions were derived for fitting and interpreting physiological data. The model accurately predicted responses to drifting grating stimuli of various contrasts, orientations, and spatiotemporal frequencies.

Teo P and Heeger DJ, Perceptual Image Distortion, Proceedings of SPIE, volume 2179, p. 127-141, 1994.

Abstract: In this paper, we present a perceptual distortion measure that predicts image integrity far better than mean-squared error. This perceptual distortion measure is based on a model of human visual processing that fits empirical measurements of: (1) the response properties of neurons in the primary visual cortex, and (2) the psychophysics of spatial pattern detection. We also illustrate the usefulness of the model in predicting perceptual distortion in real images.

Teo P and Heeger DJ, Perceptual Image Distortion, First IEEE International Conference on Image Processing, vol 2, pp 982-986, November 1994.

Abstract: In this paper, we present a perceptual distortion measure that predicts image integrity far better than mean-squared error. This perceptual distortion measure is based on a model of human visual processing that fits empirical measurements of the psychophysics of spatial pattern detection. The model of human visual processing proposed involves two major components: a steerable pyramid transform and contrast normalization. We also illustrate the usefulness of the model in predicting perceptual distortion in real images.

Heeger DJ, Modeling simple cell direction selectivity with normalized, half-squared, linear operators, Journal of Neurophysiology, 70:1885-1898, 1993.

Summary: 1. A longstanding view of simple cells is that they sum their inputs linearly. However, the linear model falls short of a complete account of simple-cell direction selectivity. We have developed a nonlinear model of simple-cell responses (hereafter referred to as the normalization model) to explain a larger body of physiological data.

2. The normalization model consists of an underlying linear stage along with two additional nonlinear stages. The first is a half-squaring nonlinearity; half-squaring is half-wave rectification followed by squaring. The second is a divisive normalization nonlinearity in which each model cell is suppressed by the pooled activity of a large number of cells.

3. By comparing responses with counterphase (flickering) gratings and drifting gratings, researchers have demonstrated that there is a nonlinear contribution to simple-cell responses. Specifically they found 1) that the linear prediction from counterphase grating responses underestimates a direction index computed from drifting grating responses, 2) that the linear prediction correctly estimates responses to gratings drifting in the preferred direction, and 3) that the linear prediction overestimates responses to gratings drifting in the nonpreferred direction.

4. We have simulated model cell responses and derived mathematical expressions to demonstrate that the normalization model accounts for this empirical data. Specifically the model behaves as follows. 1) The linear prediction from counterphase data underestimates the direction index computed from drifting grating responses. 2) The linear prediction from counterphase data overestimates the response to gratings drifting in the nonpreferred direction. The discrepancy between the linear prediction and the actual response is greater when using higher contrast stimuli. 3) For an appropriate choice of contrast, the linear prediction from counterphase data correctly estimates the response to gratings drifting in the preferred direction. For higher contrasts the linear prediction overestimates the actual response, and for lower contrasts the linear prediction underestimates the actual response.

5. In addition, the normalization model is qualitatively consistent with data on the dynamics of simple-cell responses. Tolhurst et al. found that simple cells respond with an initial transient burst of activity when a stimulus first appears. The normalization model behaves similarly; it takes some time after a stimulus first appears before the model cells are fully normalized. We derived the dynamics of the model and found that the transient burst of activity in model cells depends in a particular way on stimulus contrast. The burst is short for high-contrast stimuli and longer for low-contrast stimuli.
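The two nonlinear stages named in this summary, half-squaring followed by divisive normalization, can be sketched in a few lines. The population of linear responses, the semi-saturation constant `sigma`, and the choice of pooling over the whole population are hypothetical values picked for illustration. This is a steady-state sketch and does not model the response dynamics.

```python
import numpy as np

def half_square(x):
    """Half-wave rectification followed by squaring."""
    return np.maximum(x, 0.0) ** 2

def normalized_responses(linear, sigma=0.1):
    """Divide each half-squared response by sigma^2 plus the pooled
    half-squared activity of the population (sigma is an assumed constant)."""
    energy = half_square(np.asarray(linear, dtype=float))
    return energy / (sigma ** 2 + energy.sum())

# Hypothetical population: each cell's linear response scales with contrast.
tuning = np.array([1.0, 0.6, 0.2, -0.4, -1.0, 0.3])
contrasts = np.array([0.05, 0.1, 0.2, 0.4, 0.8])

responses = np.array([normalized_responses(c * tuning) for c in contrasts])
preferred = responses[:, 0]                 # best-driven cell across contrasts
ratio = responses[:, 1] / responses[:, 0]   # relative response of a second cell

print(preferred)
print(ratio)
```

In this sketch the best-driven cell's response grows with contrast but saturates, while the ratio between any two cells' responses stays fixed across contrast, because every cell is divided by the same pooled signal.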

Chichilnisky EJ, Heeger DJ, and Wandell BA, Functional segregation of color and motion perception examined in motion nulling, Vision Research, 15:2113-2125, 1993.

Abstract: We examine two hypotheses about the functional segregation of color and motion perception, using a motion nulling task. The most common interpretation of functional segregation, that motion perception depends only on one of the three dimensions of color, is rejected. We propose and test an alternative formulation of functional segregation: that motion perception depends on a univariate motion signal driven by all three color dimensions, and that the motion signal is determined by the product of the stimulus contrast and a term that depends only on the relative cone excitations. Two predictions of this model are confirmed. First, motion nulling is transitive: when two stimuli null a third, they also null one another. Second, motion nulling is homogeneous: if two stimuli null one another, they continue to null one another when their contrasts are scaled equally. We describe how to apply our formulation of functional segregation to other behavioral and physiological measurements.

Heeger DJ and Simoncelli EP, Model of visual motion sensing, in Spatial Vision in Humans and Robots, Harris L and Jenkin M, eds, Cambridge University Press, p. 367-392, 1993.

Abstract: A number of researchers have proposed models of early motion sensing based on direction-selective, spatiotemporal linear operators. Others have formalized the problem of measuring optical flow in terms of the spatial and temporal derivatives of stimulus intensity. Recently, the spatiotemporal filter models and the gradient-based methods have been placed into a common framework. In this chapter, we review that framework and we extend it to develop a new model for the computation and representation of velocity information in the visual system. We use the model to simulate psychophysical data on perceived velocity of sine-grating plaid patterns, and to simulate physiological data on responses of simple cells in primary (striate) visual cortex.

Jepson A and Heeger DJ, Linear subspace methods for recovering translation direction, In Spatial Vision in Humans and Robots, Harris L and Jenkin M, eds, Cambridge University Press, p. 39-62, 1993.

Heeger DJ, Normalization of cell responses in cat striate cortex, Visual Neuroscience, 9:181-198, 1992.

Abstract: Simple cells in striate cortex have been depicted as halfwave-rectified linear operators. Complex cells have been depicted as energy mechanisms, constructed from the squared sum of the outputs of quadrature pairs of linear operators. However, the linear/energy model falls short of a complete explanation of striate cell responses. In this paper, I present a modified version of the linear/energy model in which striate cells mutually inhibit one another, effectively normalizing their responses with respect to stimulus contrast. This paper reviews experimental measurements of striate cell responses, and shows that the new model explains a significantly larger body of physiological data.

Heeger DJ, Half-squaring in responses of cat striate cells, Visual Neuroscience, 9:427-443, 1992.

Abstract: Simple cells in striate cortex have been depicted as rectified linear operators, and complex cells have been depicted as energy mechanisms (constructed from the squared sums of linear operator outputs). This paper discusses two essential hypotheses of the linear/energy model: (1) that a cell's selectivity is due to an underlying (spatiotemporal and binocular) linear stage; and (2) that a cell's firing rate depends on the squared output of the underlying linear stage. This paper reviews physiological measurements of cat striate cell responses, and concludes that both of these hypotheses are supported by the data.

Simoncelli EP, Freeman W, Adelson EH, and Heeger DJ, Shiftable multi-scale transforms, IEEE Transactions on Information Theory, 38:587-607, 1992.

Abstract: Orthogonal wavelet transforms have recently become a popular representation for multi-scale signal and image analysis. One of the major drawbacks of these representations is their lack of translation invariance: the content of wavelet subbands is unstable under translations of the input signal. Wavelet transforms are also unstable with respect to dilations of the input signal, and in two dimensions, rotations of the input signal. We formalize these problems by defining a type of translation invariance that we call "shiftability". In the spatial domain, shiftability corresponds to a lack of aliasing; thus, the conditions under which the property holds are specified by the sampling theorem. Shiftability may also be considered in the context of other domains, particularly orientation and scale. We explore "jointly shiftable" transforms that are simultaneously shiftable in more than one domain. Two examples of jointly shiftable transforms are designed and implemented: a one-dimensional transform that is jointly shiftable in position and scale, and a two-dimensional transform that is jointly shiftable in position and orientation. We demonstrate the usefulness of these image representations for scale-space analysis, stereo disparity measurement, and image enhancement.

Heeger DJ and Jepson A, Subspace methods for recovering rigid motion I: Algorithm and implementation, International Journal of Computer Vision, 7:95-117, 1992.

Abstract: As an observer moves and explores the environment, the visual stimulation in his/her eye is constantly changing. Somehow he/she is able to perceive the spatial layout of the scene, and to discern his/her movement through space. Computational vision researchers have been trying to solve this problem for a number of years with only limited success. It is a difficult problem to solve because the optical flow field is nonlinearly related to the 3D motion and depth parameters.

In this paper, we show that the nonlinear equation describing the optical flow field can be split by an exact algebraic manipulation to form three sets of equations. The first set relates the flow field to only the translational component of 3D motion. Thus, depth and rotation need not be known or estimated prior to solving for translation. Once the translation has been recovered, the second set of equations can be used to solve for rotation. Finally, depth can be estimated with the third set of equations, given the recovered translation and rotation.

The algorithm applies to the general case of arbitrary motion with respect to an arbitrary scene. It is simple to compute, and it is plausible biologically. The results in this paper demonstrate the potential of our new approach, and show that it performs favorably when compared with two other well known algorithms.
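
The idea of solving for translation without first knowing depth or rotation can be sketched as follows. This is a hedged illustration, not the paper's algorithm as published. It assumes the standard instantaneous flow equation with unit focal length, flow = p A(x, y) T + B(x, y) Omega, where p is inverse depth; the sign conventions, the function names, and the use of a generic least-squares solve are all assumptions. For a candidate translation, the flow is linear in the unknown inverse depths and rotation, so the least-squares residual depends on the candidate translation alone.

```python
import numpy as np

def A_matrix(x, y):
    """Translational part of the flow equation (unit focal length, assumed convention)."""
    return np.array([[-1.0, 0.0, x], [0.0, -1.0, y]])

def B_matrix(x, y):
    """Rotational part of the flow equation (same assumed convention)."""
    return np.array([[x * y, -(1.0 + x * x), y],
                     [1.0 + y * y, -x * y, -x]])

def flow_field(points, inv_depth, T, omega):
    """Stack the 2-D image velocities at all points into one vector."""
    return np.concatenate([p * (A_matrix(x, y) @ T) + B_matrix(x, y) @ omega
                           for (x, y), p in zip(points, inv_depth)])

def residual(points, flow, T_candidate):
    """Squared error left after fitting inverse depths and rotation for this T."""
    n = len(points)
    C = np.zeros((2 * n, n + 3))
    for i, (x, y) in enumerate(points):
        C[2 * i:2 * i + 2, i] = A_matrix(x, y) @ T_candidate  # one depth column per point
        C[2 * i:2 * i + 2, n:] = B_matrix(x, y)               # three shared rotation columns
    q, *_ = np.linalg.lstsq(C, flow, rcond=None)
    return float(np.sum((flow - C @ q) ** 2))

def estimate_translation(points, flow, candidates):
    """Pick the candidate translation direction with the smallest residual."""
    return min(candidates, key=lambda T: residual(points, flow, np.asarray(T, dtype=float)))
```

Only the direction of translation is recoverable, since scaling T by any nonzero factor leaves the residual unchanged. The sketch folds the rotation and depth steps into one least-squares solve for brevity, whereas the abstract describes them as separate second and third sets of equations.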

Simoncelli EP, Adelson EH, & Heeger DJ, Probability distributions of optical flow, in Proceedings of Computer Vision and Pattern Recognition, p. 310-315, 1991.

Abstract: Gradient methods are widely used in the computation of optical flow. We discuss extensions of these methods which compute probability distributions of optical flow. The use of distributions allows representation of the uncertainties inherent in the optical flow computation, facilitating the combination with information from other sources. We compute distributed optical flow for a synthetic image sequence and demonstrate that the probabilistic model accounts for the errors in the flow estimates. We also compute distributed optical flow for a real image sequence.
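
A gradient method that returns a distribution rather than a single velocity can be sketched under simplifying assumptions. The version below is not the paper's noise model: it assumes one translational velocity per patch, independent Gaussian noise on the temporal derivative, and a broad zero-mean Gaussian prior on velocity. The function name and the parameters noise_var and prior_var are hypothetical.

```python
import numpy as np

def flow_distribution(frame_prev, frame_next, noise_var=1.0, prior_var=1e6):
    """Return the Gaussian mean and covariance of patch velocity (vx, vy)."""
    mid = 0.5 * (frame_prev + frame_next)   # approximate frame halfway in time
    Iy, Ix = np.gradient(mid)               # axis 0 is y (rows), axis 1 is x (columns)
    It = frame_next - frame_prev            # temporal derivative at the midpoint
    inner = (slice(1, -1), slice(1, -1))    # drop borders, where np.gradient is one-sided
    G = np.stack([Ix[inner].ravel(), Iy[inner].ravel()], axis=1)
    gt = It[inner].ravel()
    # Gradient constraint G v = -It with Gaussian noise, plus a Gaussian prior on v
    precision = G.T @ G / noise_var + np.eye(2) / prior_var
    cov = np.linalg.inv(precision)
    mean = -cov @ (G.T @ gt) / noise_var
    return mean, cov
```

For a pattern that varies in only one direction, the covariance in this sketch becomes very large along the unconstrained direction, which is one way a distribution can represent the uncertainty that a single flow vector hides.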

Freeman W, Adelson EH, and Heeger DJ, Motion without movement, Computer Graphics, 25:27-30, 1991.

Abstract: We describe a technique for displaying patterns that appear to move continuously without changing their positions. The method uses a quadrature pair of oriented filters to vary the local phase, giving the sensation of motion. We have used this technique in various computer graphic and scientific visualization applications.
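
The phase-variation idea can be sketched in one dimension. This is an illustration under assumptions, not the paper's implementation: it uses a Gabor quadrature pair directly as the displayed pattern, and the names envelope, quadrature_pair and frame, as well as the parameter values, are hypothetical. Weighting the even and odd members by the cosine and sine of a time-varying phase shifts the carrier while the envelope stays where it is.

```python
import math

def envelope(x, sigma=6.0):
    """Gaussian window that fixes the position of the pattern."""
    return math.exp(-x * x / (2.0 * sigma * sigma))

def quadrature_pair(xs, freq=0.1):
    """Even (cosine-phase) and odd (sine-phase) patterns sharing one envelope."""
    even = [envelope(x) * math.cos(2.0 * math.pi * freq * x) for x in xs]
    odd = [envelope(x) * math.sin(2.0 * math.pi * freq * x) for x in xs]
    return even, odd

def frame(even, odd, phase):
    """One animation frame: the carrier is shifted by `phase`, the envelope is not."""
    return [math.cos(phase) * e + math.sin(phase) * o for e, o in zip(even, odd)]
```

Stepping phase from 0 to 2*pi and repeating yields a carrier that appears to drift continuously in one direction, while the local energy of the pattern, and hence its apparent position, does not change.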

Heeger DJ, Nonlinear model of neural responses in cat visual cortex, in Computational Models of Visual Processing, Landy M and Movshon JA, eds, p. 119-133. MIT Press, 1991.

Heeger DJ and Jepson A, Visual perception of three-dimensional motion, Neural Computation, 2:127-135, 1990.

Abstract: As an observer moves and explores the environment, the visual stimulation in his eye is constantly changing. Somehow he is able to perceive the spatial layout of the scene, and to discern his movement through space. Computational vision researchers have been trying to solve this problem for a number of years with only limited success. It is a difficult problem to solve because the relationship between the optical-flow field, the 3D motion-parameters, and depth is nonlinear. We have come to understand that this nonlinear equation describing the optical-flow field can be split by an exact algebraic manipulation to form three sets of equations. The first set relates the image velocities to the translational component of the 3D motion alone. Thus, the depth and the rotational velocity need not be known or estimated prior to solving for the translational velocity. Once the translation has been recovered, the second set of equations can be used to solve for the rotational velocity. Finally, depth can be estimated with the third set of equations, given the recovered translation and rotation. The algorithm applies to the general case of arbitrary motion with respect to an arbitrary scene. It is simple to compute, and it is plausible biologically.

Heeger DJ, Optical flow using spatiotemporal filters, International Journal of Computer Vision, 1:270-302, 1988.

Abstract: A model is presented, consonant with current views regarding the neurophysiology and psychophysics of motion perception, that combines the outputs of a set of spatiotemporal motion-energy filters to estimate image velocity. A parallel implementation computes a distributed representation of image velocity. A measure of image-flow uncertainty is formulated; preliminary results indicate that this uncertainty measure may be used to recognize ambiguity due to the aperture problem. The model appears to deal with the aperture problem as well as the human visual system since it extracts the correct velocity for some patterns that have large differences in contrast at different spatial orientations.

Heeger D, Model for the extraction of image flow, Journal of the Optical Society of America A, 4:1455-1471, 1987.

Abstract: A model is presented, consonant with current views regarding the neurophysiology and psychophysics of motion perception, that combines the outputs of a set of spatiotemporal motion-energy filters to extract optic flow. The output velocity is encoded as the peak in a distribution of velocity-tuned units that behave much like cells of the middle temporal area of the primate brain. The model appears to deal with the aperture problem as well as the human visual system since it extracts the correct velocity for patterns that have large differences in contrast at different spatial orientations, and it simulates psychophysical data on the coherence of sine-grating plaid patterns.