In Chapter 3 we introduced the imagery debate, which concerns two different accounts of the architectural properties of mental images. One account, known as the depictive theory (Kosslyn, 1980, 1994; Kosslyn, Thompson, & Ganis, 2006), argues that we experience the visual properties of mental images because the format of these images is quasi-pictorial, and that they literally depict visual information.
The other account, propositional theory, proposes that images are not depictive, but instead describe visual properties using a logical or propositional representation (Pylyshyn, 1973, 1979b, 1981a, 2003b). It argues that the privileged properties of mental images proposed by Kosslyn and his colleagues are actually the result of the intentional fallacy: the spatial properties that Kosslyn assigns to the format of images should more properly be assigned to their contents.
The primary support for the depictive theory has come from relative complexity evidence collected from experiments on image scanning (Kosslyn, 1980) and mental rotation (Shepard & Cooper, 1982). This evidence generally shows a linear relationship between the time required to complete a task and a spatial property of an image transformation. For instance, as the distance between two locations on an image increases, so too does the time required to scan attention from one location to the other. Similarly, as the amount of rotation that must be applied to an image increases, so too does the time required to judge whether the image is the same as or different from another. Proponents of propositional theory have criticized these results by demonstrating that they are cognitively penetrable (Pylyshyn, 2003c): a change in tacit information eliminates the linear relationship between time and image transformation, which would not be possible if the depictive properties of mental images were primitive.
If a process such as image scanning is cognitively penetrable, then this means that subjects have the choice not to take the time to scan attention across the image. But this raises a further question: “Why should people persist on using this method when scanning entirely in their imagination where the laws of physics and the principles of spatial scanning do not apply (since there is no real space)?” (Pylyshyn, 2003b, p. 309). Pylyshyn’s theory of visual cognition provides a possible answer to this question that is intriguing, because it appeals to a key proposal of the embodied approach: cognitive scaffolding.
Pylyshyn’s scaffolding approach to mental imagery was inspired by a general research paradigm that investigated whether visual processing and mental imagery shared mechanisms. In such studies, subjects superimpose a mental image over other information that is presented visually, in order to see whether the different sources of information can interact, for instance by producing a visual illusion (Bernbaum & Chung, 1981; Finke & Schmidt, 1977; Goryo, Robinson, & Wilson, 1984; Ohkuma, 1986). This inspired what Pylyshyn (2007) called the index projection hypothesis. This hypothesis brings Pylyshyn’s theory of visual cognition into contact with embodied cognitive science, because it invokes cognitive scaffolding via the visual world.
According to the index projection hypothesis, mental images are scaffolded by visual indices that are assigned to real world (i.e., to visually present) entities. For instance, consider Pylyshyn’s (2003b) application of the index projection hypothesis to the mental map paradigm used to study image scanning:
If, for example, you imagine the map used to study mental scanning superimposed over one of the walls in the room you are in, you can use the visual features of the wall to anchor various objects in the imagined map. In this case, the increase in time it takes to access information from loci that are further apart is easily explained since the ‘images,’ or, more neutrally, ‘thoughts’ of these objects are actually located further apart. (Pylyshyn, 2003b, p. 376, p. 374)
In other words, the spatial properties revealed in mental scanning studies are not due to mental images per se, but instead arise from “the real spatial nature of the sensory world onto which they are ‘projected’” (p. 374).
If the index projection hypothesis is valid, then how does it account for mental scanning results when no external world is visible? Pylyshyn argued that in such conditions, the linear relationship between distance on an image and the time to scan it may not exist. For instance, evidence indicates that when no external information is visible, smooth attentional scanning may not be possible (Pylyshyn & Cohen, 1999). As well, the exploration of mental images is accompanied by eye movements similar to those that occur when a real scene is explored (Brandt & Stark, 1997). Pylyshyn (2007) pointed out that this result is exactly what would be predicted by the index projection hypothesis, because the eye movements would be directed to real world entities that have been assigned visual indices.
The cognitive scaffolding of mental images may not merely concern their manipulation, but might also be involved when images are created. There is a long history of the use of mental images in the art of memory (Yates, 1966). One important technique is the ancient method of loci, in which mental imagery is used to remember a sequence of ideas (e.g., ideas to be presented in a speech).
The memory portion of the Rhetorica ad Herennium, an anonymous text that originated in Rome circa 86 BC and reached Europe by the Middle Ages, teaches the method of loci as follows. A well-known building is used as a “wax tablet” onto which memories are to be “written.” As one mentally moves, in order, through the rooms of the building, one places an image representing some idea or content in each locus, that is, in each imagined room. During recall, one mentally walks through the building again, and “sees” the image stored in each room. “The result will be that, reminded by the images, we can repeat orally what we have committed to the loci, proceeding in either direction from any locus we please” (Yates, 1966, p. 7).
In order for the method of loci to be effective, a great deal of effort must be used to initially create the loci to be used to store memories (Yates, 1966). Ancient rules of memory taught students the most effective way to do this. According to the Rhetorica ad Herennium, each fifth locus should be given a distinguishing mark. A locus should not be too similar to the others, in order to avoid confusion via resemblance. Each locus should be of moderate size and should not be brightly lit, and the intervals between loci should also be moderate (about thirty feet). Yates (1966, p. 8) was struck by “the astonishing visual precision which [the classical rules of memory] imply. In a classically trained memory the space between the loci can be measured, the lighting of the loci is allowed for.”
How was such a detailed set of memory loci to be remembered? The student of memory was taught to use what we would now call cognitive scaffolding. They should lay down a set of loci by going to an actual building, and by literally moving through it from locus to locus, carefully committing each place to memory as they worked (Yates, 1966). Students were advised to visit secluded buildings in order to avoid having their memorization distracted by passing crowds. The Phoenix, a memory manual published by Peter of Ravenna in 1491, recommended visiting unfrequented churches for this reason. These classical rules for the art of memory “summon up a vision of a forgotten social habit. Who is that man moving slowly in the lonely building, stopping at intervals with an intent face? He is a rhetoric student forming a set of memory loci” (Yates, 1966, p. 8).
According to the index projection hypothesis, “by anchoring a small number of imagined objects to real objects in the world, the imaginal world inherits much of the geometry of the real world” (Pylyshyn, 2003b, p. 378). The classical art of memory, the method of loci, invokes a similar notion of scaffolding, attempting not only to inherit the real world’s geometry, but to also inherit its permanence.