Title: Decoding conjunctions of direction-of-motion and binocular disparity from human visual cortex
Abstract
Motion and binocular disparity are two features in our environment that share a common correspondence problem. Decades of psychophysical research dedicated to understanding stereopsis suggest that these features interact early in human visual processing to disambiguate depth. Single-unit recordings in the monkey also provide evidence for the joint encoding of motion and disparity across much of the dorsal visual stream. Here we used fMRI and multivariate pattern analysis to examine where in the human brain conjunctions of motion and disparity are encoded. Subjects sequentially viewed two stimuli that could be distinguished only by their conjunctions of motion and disparity. Specifically, each stimulus contained the same feature information (leftward and rightward motion, and crossed and uncrossed disparity) but differed exclusively in the way these features were paired. Our results revealed that a linear classifier could accurately decode which stimulus a subject was viewing based on voxel activation patterns throughout the dorsal visual areas and as early as V2. This decoding success was conditional on some voxels being individually sensitive to the unique conjunctions comprising each stimulus; thus, a classifier could not rely on independent information about motion and binocular disparity to distinguish these conjunctions. This study expands on evidence that disparity and motion interact at many levels of human visual processing, particularly within the dorsal stream. It also lends support to the idea that stereopsis is subserved by early mechanisms also tuned to direction of motion.

Introduction

Motion and binocular disparity are two features in our visual environment that are commonly used by the brain to recover the three-dimensional quality of a scene.
In fact, the encoding of these two features relies on solving a common ‘correspondence problem’: a matching of object position over time (motion) or across the two eyes (binocular disparity). Thus, it has been suggested that extracting these two features from retinal images may rely on a shared neural substrate (Anstis and Harris, 1974; Bradley et al., 1995; Graham and Rogers, 1982; Graham and Rogers, 1984; Nawrot and Blake, 1991). Psychophysical evidence suggests that conjunctions of motion and disparity are encoded within inseparable neural ‘units’ prior to the computation of depth. For instance, experiments have demonstrated that a combination of these cues facilitates depth perception (van Ee and Anderson, 2001; Johnston et al., 1994; Bradshaw and Cumming, 1997). Furthermore, the well-known ‘motion after-effect’ has been shown to be contingent on the binocular disparities at which an adapting and test stimulus are presented (Anstis and Harris, 1974; Smith, 1976; Nawrot and Blake, 1989; Verstraten et al., 1994). Similarly, disparity after-effects are contingent on motion direction (Nawrot and Blake, 1989). Whilst such findings are best explained by the adaptation of neurons coding for specific conjunctions of disparity and motion, the question of where in the human visual cortex these cells may arise remains unresolved.

Perhaps the strongest evidence for conjunction coding of motion and disparity has been provided by physiological experiments in non-human primates. These studies indicate that the initial sensory integration of motion and disparity occurs at, or prior to, area MT+ (Poggio and Talbot, 1981; Maunsell and Van Essen, 1983; DeAngelis and Newsome, 1999; Anzai et al., 2001; Pack et al., 2003; DeAngelis and Uka, 2003; Grunewald and Skoumbourdis, 2004).
However, only a few studies, confined to macaque MT, actually present evidence for true conjunction coding (Roy et al., 1992; Bradley et al., 1995; Dodd et al., 2001; DeAngelis and Newsome, 2004). For instance, whilst many neurons in macaque MT exhibit separable selective responses to motion and disparity information, a proportion also exhibit selectivity for motion which is ‘unfixed’ and can be modulated by disparity (Roy et al., 1992; DeAngelis and Newsome, 2004).

Whilst research in the macaque brain supports the psychophysical evidence for conjunction coding of motion and disparity, no study has directly examined where in the human visual cortex these conjunctions are encoded. fMRI in combination with multivariate pattern classification has provided evidence for motion and binocular disparity selectivity throughout the visual cortex (Kamitani and Tong, 2006; Preston et al., 2008). In addition, a recent fMRI adaptation experiment (Smith and Wall, 2008) showed that area MT responded differentially to motion when it was presented at different disparities. However, since voxels containing separate clusters of neurons selective for specific motion directions and specific binocular disparities could explain these findings, it remains unclear whether single functional units (i.e. voxels) respond to specific pairings of these two features.

In the current study we examined whether fMRI Blood Oxygen Level Dependent (BOLD) signals measured in human visual cortex could accurately discriminate stimuli differing exclusively by their specific pairing of two motion directions and two binocular disparities. A similar approach has previously been employed to examine conjunction coding of colour-motion and colour-form in the human visual cortex (Seymour et al., 2009; 2010). Here we distinguished conjunction coding of disparity and motion (i.e.
supra-linear responses to distinct motion-disparity pairings) from joint selectivity, where a voxel could simply respond when either or both features were present (i.e. resulting in additive separable responses to the two features). Our results provide evidence that disparity and motion information is integrated early in human visual processing and is represented at many levels of the visual hierarchy, particularly within the dorsal stream.

Methods

Five participants (3 male, 2 female) took part in the study, including both authors. All had normal vision and stereoacuity (tested in a separate session within the scanner setup). Each subject was familiarized with the task during one preliminary psychophysics session outside of the scanner.

Stimuli

Basic stimulus parameters

A dichoptic stimulus display set-up was designed for use in a Philips 3T MRI scanner with standard back-projection (projector: Dell 5100MP, display resolution: 1024 × 768 pixels). This closely followed the design of Schurger (2009). A cardboard divider connecting the viewing mirror and the projector screen (running through the bore of the magnet) allowed each eye’s image to be viewed separately. Two square stimulus frames were projected onto the screen (one on either side of the vertical midline) and remained there for the entire scan session in order to help the subject maintain fusion of the two eyes’ images. Each eye’s image also had its own set of nonius lines displayed at fixation (subtending 0.7 deg) so that, when fused, the two images were perceived as a single square frame subtending 15 deg visual angle with a fixation cross at the centre. To stabilise fixation further, subjects additionally fused a white fixation point (presented to the left eye) with a surrounding black ring of a bull’s-eye (presented to the right eye).
Throughout the entire scan session each subject wore custom-cut prism lenses to adjust the viewing angle of each eye’s image and make fusion comfortable. The screen was viewed from a distance of 167 cm. Stimuli were presented using PsychToolbox 3.0.8 (Brainard, 1997; Pelli, 1997).

Specific conjunction stimuli

Since our experiment aimed to establish evidence for voxels tuned to specific conjunctions of motion and disparity, we relied on the use of multivariate pattern classifiers to distinguish between fMRI activation patterns associated with viewing of two specific conjunction stimuli. Both stimuli were composed of the same two directions of motion and two binocular disparity cues but differed exclusively by the way these cues were paired (i.e. their conjunctions). Hence a classifier could not rely on activation associated with disparity-specific or motion-specific responses to distinguish the two stimuli. True conjunction coding would require a non-linear (or unique) response to the combined features (i.e. one differing from the sum of the responses to each feature).

All stimuli were displayed on a grey background (37 cd/m²) within a square fusion frame. Each specific conjunction stimulus subtended 15 deg and consisted of translating black dots (size: 0.17 deg radius, speed: 2 deg/sec) presented at disparities of +0.17 deg (crossed disparity) and −0.17 deg (uncrossed disparity). To ensure that depth was defined purely by binocular disparity and unaided by monocular cues, we removed all interocularly unmatched dots (half-occlusions) from each eye’s image (Brooks and Gillam, 2006). Figure 1 shows a schematic of the stimulus display for conjunction stimuli A and B. Note that both conjunction stimuli contained the same basic motion and disparity information (i.e. leftward motion, rightward motion, crossed disparity and uncrossed disparity).
The only difference between the two conditions was the way these features were paired. For example, where leftward motion was paired with crossed disparity in condition A, it was paired with uncrossed disparity in condition B, and so on.
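
The logic of this design can be illustrated with a small simulation. The sketch below is not the analysis pipeline used in the study; it is a toy model under stated assumptions: each simulated voxel's response to a stimulus is taken to be the sum of its responses to the two motion-disparity pairings the stimulus contains, trial noise is Gaussian, and the classifier is a simple mean-difference linear rule. The function names (`stimulus_patterns`, `decode_accuracy`) and all parameter values are hypothetical. When voxel responses are purely additive in motion and disparity, stimuli A and B evoke identical mean patterns and decoding should stay near chance; only when an interaction (conjunction) term is present can the two be separated.

```python
import numpy as np


def stimulus_patterns(n_voxels, interaction_scale, seed=0):
    """Mean voxel patterns for conjunction stimuli A and B (toy model).

    Assumption: a voxel's response to one motion-disparity pairing is
    motion term + disparity term + interaction term, and its response to
    a stimulus is the sum over the two pairings the stimulus contains.
    """
    rng = np.random.default_rng(seed)
    motion = rng.normal(size=(n_voxels, 2))      # columns: leftward, rightward
    disparity = rng.normal(size=(n_voxels, 2))   # columns: crossed, uncrossed
    interaction = interaction_scale * rng.normal(size=(n_voxels, 2, 2))
    # pair[v, m, d]: response of voxel v to motion m paired with disparity d
    pair = motion[:, :, None] + disparity[:, None, :] + interaction
    stim_a = pair[:, 0, 0] + pair[:, 1, 1]  # left+crossed, right+uncrossed
    stim_b = pair[:, 0, 1] + pair[:, 1, 0]  # left+uncrossed, right+crossed
    return stim_a, stim_b


def decode_accuracy(stim_a, stim_b, n_trials=200, noise=1.0, seed=1):
    """Train and test a mean-difference linear classifier on noisy trials."""
    rng = np.random.default_rng(seed)

    def trials(mean):
        return mean + noise * rng.normal(size=(n_trials, mean.size))

    train_a, train_b = trials(stim_a), trials(stim_b)
    test_a, test_b = trials(stim_a), trials(stim_b)
    mean_a, mean_b = train_a.mean(axis=0), train_b.mean(axis=0)
    weights = mean_a - mean_b                 # linear decision direction
    bias = -0.5 * weights @ (mean_a + mean_b)  # boundary at the midpoint
    hits_a = (test_a @ weights + bias) > 0
    hits_b = (test_b @ weights + bias) < 0
    return (hits_a.sum() + hits_b.sum()) / (2 * n_trials)


# Additive (separable) voxels: A and B share the same mean pattern.
additive_a, additive_b = stimulus_patterns(n_voxels=50, interaction_scale=0.0)
# Conjunction-sensitive voxels: the interaction term breaks the symmetry.
conj_a, conj_b = stimulus_patterns(n_voxels=50, interaction_scale=1.0)

print("additive voxels:   ", decode_accuracy(additive_a, additive_b))
print("conjunction voxels:", decode_accuracy(conj_a, conj_b))
```

In the additive case the motion and disparity terms cancel exactly when the two stimulus patterns are subtracted, which is why above-chance decoding in this design is taken as evidence for conjunction-sensitive responses.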