Viewers can rapidly extract a holistic semantic representation of a real-world scene within a single eye fixation, an ability called scene gist recognition and operationally defined here as recognizing an image’s basic-level scene category. However, it is unknown how scene gist recognition unfolds over both time and space, that is, within a fixation and across the visual field. Thus, in three...