9.3: The Imagery Debate

Imagine yourself back on vacation again. You are walking along the beach, projecting images of white benzene molecules onto the horizon. All at once you realize that there are two real little white dots beneath your projection. Curiously, you walk towards them until your visual field is filled by two serious-looking but fiercely debating scientists. As they notice your presence, they invite you to take a seat and listen to the still unresolved imagery debate.

Today’s imagery debate is shaped mainly by two opposing theories: Zenon Pylyshyn’s (left) propositional theory on the one hand, and Stephen Kosslyn’s (right) spatial representation theory of imagery processing on the other.

Theory of propositional representation

The theory of propositional representation was proposed by Zenon Pylyshyn in 1973. He described the mental image as an epiphenomenon that accompanies the process of imagery but is not part of it. Mental images do not show us exactly how the mind works; they only show us that something is happening, just like the display of a compact disc player. Its flashing lights indicate that something is happening, and we may even be able to infer what is happening, but the display does not show us how the processes inside the player work. Even if the display were broken, the compact disc player would still continue to play music.

Representation

The basic idea of propositional representation is that relationships between objects are represented by symbols and not by spatial mental images of the scene. For example, a bottle under a table would be represented by a formula made of symbols like UNDER(BOTTLE,TABLE). The term proposition is borrowed from the domains of logic and linguistics and means the smallest possible unit of information. Each proposition can be either true or false.

If there is a sentence like "Debby donated a big amount of money to Greenpeace, an organization which protects the environment", it can be recapitulated by the propositions "Debby donated money to Greenpeace", "The amount of money was big" and "Greenpeace protects the environment". The truth value of the whole sentence depends on the truth values of its constituents. Hence, if one of the propositions is false, so is the whole sentence.
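
The dependence of the sentence's truth value on its constituents can be sketched as a logical conjunction. This is only a minimal illustration, not a formalism taken from the theory itself: the names `Proposition` and `sentence_is_true` are hypothetical, and the truth values are assumed for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposition:
    """A single statement that is either true or false."""
    text: str
    is_true: bool


def sentence_is_true(propositions):
    """A sentence holds only if every constituent proposition holds."""
    return all(p.is_true for p in propositions)


# Truth values are assumed purely for illustration.
sentence = [
    Proposition("Debby donated money to Greenpeace", True),
    Proposition("The amount of money was big", True),
    Proposition("Greenpeace protects the environment", True),
]
print(sentence_is_true(sentence))

# Making one constituent false makes the whole sentence false.
sentence[1] = Proposition("The amount of money was big", False)
print(sentence_is_true(sentence))
```

With all three constituents true the sentence counts as true; once any one of them is false, so is the sentence.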

Propositional networks

This model does not imply that a person remembers the sentence or its individual propositions in their exact literal wording. Rather, it is assumed that the information is stored in memory in a propositional network.

In Figure 1, each circle represents a single proposition. Because some components are connected to more than one proposition, the propositions form a network. Propositional networks can also be hierarchical, if a component of a proposition is not a single object but a proposition itself. An example of a hierarchical propositional network describing the sentence "John believes that Anna will pass her exam" is illustrated in Figure 2.

Complex objects and schemas

Even complex objects can be generated and described by propositional representation. A complex object like a ship would consist of a structure of nodes which represent the ship's properties and the relationships between these properties.

Almost all humans have concepts of commonly known objects like ships or houses in their mind. These concepts are abstractions of complex propositional networks and are called schemas. For example, our concept of a house includes propositions like:

Houses have rooms.
Houses can be made from wood.
Houses have walls.
Houses have windows.
...

Listing all of these propositions does not show the structure of the relationships between them. Instead, a concept can be arranged in a schema consisting of a list of attributes and values which describe the properties of the object. Attributes describe possible forms of categorisation, while values represent the actual value for each attribute. The schema representation of a house looks like this:

House
Category: building
Material: stone, wood 
Contains: rooms
Function: shelter for humans
Shape: rectangular
...

The hierarchical structure of schemas is organised in categories. For example, "house" belongs to the category "building" (which of course has its own schema) and contains all attributes and values of the parent schema plus its own specific attributes and values. This way of organising objects in our environment into hierarchical models enables us to recognise objects we have never seen before, because they can often be related to categories we already know.
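
The inheritance of attributes from a parent schema can be sketched with nested dictionaries. This is a rough illustration only: the function `resolve_schema` is hypothetical, the "house" values are taken from the listing above, and the contents of the "building" schema are assumed, since the text does not spell them out.

```python
def resolve_schema(name, schemas):
    """Merge a schema with everything inherited from its parent categories."""
    schema = schemas[name]
    parent = schema.get("Category")
    # Recurse only if the parent category has a schema of its own.
    inherited = resolve_schema(parent, schemas) if parent in schemas else {}
    # A schema's own attributes override inherited ones with the same name.
    return {**inherited, **schema}


schemas = {
    # Assumed parent schema; these attributes are invented for illustration.
    "building": {"Made by": "humans", "Location": "fixed"},
    # Attributes and values from the schema listing in the text.
    "house": {
        "Category": "building",
        "Material": ["stone", "wood"],
        "Contains": "rooms",
        "Function": "shelter for humans",
        "Shape": "rectangular",
    },
}

house = resolve_schema("house", schemas)
for attribute, value in house.items():
    print(f"{attribute}: {value}")
```

The resolved "house" schema carries both its own attributes and those of "building", which is the sense in which a new object can be understood through a category that is already known.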

Experimental support

In an experiment performed by Wiseman and Neisser in 1974, people were shown a picture which, at first sight, seemed to consist of random black and white shapes. After some time, some subjects realised that there was a dalmatian dog in it. The results show that people who recognised the dog remembered the picture better than people who did not. A possible explanation is that the picture is stored in memory not as a picture, but as a proposition.

In an experiment by Weisberg in 1969, subjects had to memorise sentences like "Children who are slow eat bread that is cold". The subjects were then given a word from the sentence by the experimenter and asked to name the first word from the sentence that came to mind. Almost all subjects associated the word "children" with the given word "slow", although the word "bread" is closer to "slow" in the sentence than the word "children" is. An explanation for this is that the sentence is stored in memory as the three propositions "Children are slow", "Children eat bread" and "Bread is cold". The subjects associated the word "children" with the given word "slow" because both belong to one proposition, while "bread" and "slow" belong to different ones. Similar evidence was found in another experiment by Ratcliff and McKoon in 1978.
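
The propositional explanation of Weisberg's result can be sketched as a lookup over shared propositions. This is a simplified illustration under stated assumptions: each proposition is reduced to the set of nouns and adjectives it links (the verb is left out), and the function name `associates` is hypothetical.

```python
# Each proposition is modelled as the set of content words it links together.
propositions = [
    {"children", "slow"},   # "Children are slow"
    {"children", "bread"},  # "Children eat bread"
    {"bread", "cold"},      # "Bread is cold"
]


def associates(cue, propositions):
    """Return the words that share at least one proposition with the cue."""
    linked = set()
    for proposition in propositions:
        if cue in proposition:
            linked |= proposition - {cue}
    return linked


print(associates("slow", propositions))
print(associates("bread", propositions))
```

In this sketch "slow" is linked only to "children", because no proposition contains both "slow" and "bread", even though those two words stand closer together in the sentence.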

Theory of spatial representation

Stephen Kosslyn's theory, which opposes Pylyshyn's propositional approach, holds that images are not represented by propositions alone. He looked for evidence of a spatial representation system that constructs mental, analogue, three-dimensional models.

The primary role of this system is to organize spatial information in a general form that can be accessed by either perceptual or linguistic mechanisms. It also provides coordinate frameworks to describe object locations, thus creating a model of a perceived or described environment. The advantage of a coordinate representation is that it is directly analogous to the structure of real space and captures all possible relations between objects encoded in the coordinate space. These frameworks also reflect differences in the salience of objects and locations consistent with the properties of the environment, as well as the ways in which people interact with it. Thus, the representations created are models of physical and functional aspects of the environment.

Encoding

What, then, can be said about the primary components of cognitive spatial representation? Certainly, the distinction between the external world and our internal view of it is essential, and it is helpful to explore the relationship between the two further from a process-oriented perspective.

The classical approach assumes a complex internal representation in the mind that is constructed through a series of specific perceived stimuli, and that these stimuli generate specific internal responses. Research dealing specifically with geographic-scale space has worked from the perspective that the macro-scale physical environment is extremely complex and essentially beyond the control of the individual. This research, such as that of Lynch and of Golledge (1987) and his colleagues, has shown that a complex set of behavioural responses is generated by correspondingly complex external stimuli, which are themselves interrelated. Moreover, the results of this research offer a view of our geographic knowledge as a highly interrelated external/internal system. Using landmarks encountered within the external landscape as navigational cues is the clearest example of this interrelationship.

The rationale is as follows: We gain information about our external environment from different kinds of perceptual experience: by navigating through and interacting directly with geographic space, as well as by reading maps and through language, photographs and other communication media. Within all of these different types of experience, we encounter elements within the external world that act as symbols. These symbols, whether a landmark within the real landscape, a word or phrase, a line on a map or a building in a photograph, trigger our internal knowledge representation and generate appropriate responses. In other words, elements that we encounter within our environment act as external knowledge stores.

[Figure: Spatial representation]

Each external symbol has meaning that is acquired through the sum of the individual perceiver's previous experience. That meaning is imparted by both the specific cultural context of that individual and by the specific meaning intended by the generator of that symbol. Of course, there are many elements within the natural environment not "generated" by anyone, but that nevertheless are imparted with very powerful meaning by cultures (e.g., the sun, moon and stars). Man-made elements within the environment, including elements such as buildings, are often specifically designed to act as symbols as at least part of their function. The sheer size of downtown office buildings, the pillars of a bank facade and church spires pointing skyward are designed to evoke an impression of power, stability or holiness, respectively.

These external symbols are themselves interrelated, and specific groupings of symbols may constitute self-contained external models of geographic space. Maps and landscape photographs are certainly clear examples of this. Elements of differing form (e.g., maps and text) can also be interrelated. These various external models of geographic space correspond to external memory. From the perspective just described, the total sum of any individual's knowledge is contained in a multiplicity of internal and external representations that function as a single, interactive whole. The representation as a whole can therefore be characterised as a synergistic, self-organising and highly dynamic network.

Experimental support

Interaction

Experiments on imagery were carried out as early as 1910 by Perky. She investigated whether imagery and perception interact, using a simple setup. Subjects were told to project an image of a common object, such as a ship, onto a wall. Without their knowledge, a faint picture was back-projected so that it subtly shone through the wall. They then had to describe their image, or were questioned about, for example, the orientation or the colour of the ship. In Perky's experiment, none of the 20 subjects recognised that their descriptions did not arise from their own minds but were entirely influenced by the picture shown to them.

Image Scanning

Another seminal line of research in this field was Kosslyn's image-scanning experiments in the 1970s. Returning to the example of the mental representation of a ship, he observed a linear relationship when the mental focus moved from one part of the ship to another. The reaction time of the subjects increased with the distance between the two parts, which indicates that we actually create a mental picture of scenes while trying to solve small cognitive tasks. Interestingly, this ability can also be observed in congenitally blind people, as Marmor and Zaback (1976) found. Presuming that the underlying processes are the same as in sighted subjects, it could be concluded that there is a deeper encoding system that has access to more than just visual input.

Mental Rotation Task

Other advocates of the spatial representation theory, Shepard and Metzler, developed the mental rotation task in 1971. Two objects are presented to a participant at different angles, and the participant's task is to decide whether the objects are identical or not. The results show that the reaction time increases linearly with the rotation angle between the objects. The participants mentally rotate the objects in order to match them to one another. This process is called mental rotation; the method of inferring mental processes from reaction times is known as "mental chronometry".

Together with Paivio's memory research, this experiment was crucial in establishing the importance of imagery within cognitive psychology, because it showed the similarity of imagery to the processes of perception. For a mental rotation of 40° the subjects needed two seconds on average, whereas for a 140° rotation the reaction time increased to four seconds. The additional 100° of rotation thus took an additional two seconds, from which it can be concluded that people in general have a mental object rotation rate of about 50° per second.
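
The rotation rate quoted above follows from the slope between the two measurements, not from dividing a single angle by its reaction time. The short sketch below works through the arithmetic; the function names are hypothetical, and the remaining time it computes is simply what a strictly linear relationship would imply, not a figure reported in the text.

```python
def rotation_rate(angle_a, time_a, angle_b, time_b):
    """Degrees rotated per second, from the slope between two measurements."""
    return (angle_b - angle_a) / (time_b - time_a)


def non_rotation_time(angle, time, rate):
    """Part of the reaction time not spent rotating, assuming linearity."""
    return time - angle / rate


# Values from the text: 40 degrees took 2 s, 140 degrees took 4 s.
rate = rotation_rate(40, 2.0, 140, 4.0)  # (140 - 40) / (4 - 2)
print(rate)  # 50.0

# Under the linearity assumption, the rest of the reaction time is constant.
print(non_rotation_time(40, 2.0, rate))
print(non_rotation_time(140, 4.0, rate))
```

Dividing 40° by 2 s or 140° by 4 s would give two different rates, which is why the difference between the two conditions is the relevant quantity.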

Spatial Frameworks

Although most research on mental models has focussed on text comprehension, researchers generally believe that mental models are perceptually based. Indeed, people have been found to use spatial frameworks like those created for texts to retrieve spatial information about observed scenes (Bryant, 1991). Thus, people create the same sorts of spatial memory representations whether they read about an environment or see it themselves.

Size and the visual field

If an object is observed from different distances, it is harder to perceive details when the object is far away, because it fills only a small part of the visual field. In 1973, Kosslyn conducted an experiment to find out whether this is also true for mental images, in order to show the similarity between spatial representation and the perception of the real environment. He told participants to imagine objects which are far away and objects which are near. After asking the participants about details, he inferred that details can be observed better if the object is near and fills the visual field. He also told the participants to imagine animals of different sizes next to one another, for example an elephant and a rabbit. The elephant filled much more of the visual field than the rabbit, and it turned out that the participants were able to answer questions about the elephant more rapidly than questions about the rabbit. After that, the participants had to imagine the small animal beside an even smaller animal, like a fly. This time, the rabbit filled the bigger part of the visual field, and again questions about the bigger animal were answered faster. The result of Kosslyn's experiments is that people can observe more details of an object if it fills a bigger part of their mental visual field. This provides evidence that mental images are represented spatially.

Discussion

Since the 1970s, many experiments have greatly enriched our knowledge about imagery and memory, driven by the two opposing points of view in the imagery debate. The seesaw of apparent support was marked by many clever ideas. The following exchange is an example of how productive such controversies can be.

In 1978, Kosslyn extended his image-scanning experiment from objects to real distances represented on maps. In the picture you see our island with all the places you encountered in this chapter. Try to imagine how far away from each other they are. This is exactly the experiment performed by Kosslyn. Again, he successfully predicted a linear dependency between reaction time and spatial distance in support of his model.

In the same year, Pylyshyn answered with what is called the "tacit-knowledge explanation", so named because he supposed that the participants draw on knowledge about the world without noticing it. On this account, the map is decomposed into nodes with edges between them, and the increase in time is caused by the larger number of nodes that must be visited before the goal node is reached.
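
The node-based account can be sketched as a search over a graph in which the predicted time grows with the number of links traversed, without any metric distances being stored. This is only an illustration of the idea: the function `links_traversed`, the place names and the connections between them are all invented for the example, and breadth-first search is an assumed stand-in for whatever process moves from node to node.

```python
from collections import deque


def links_traversed(graph, start, goal):
    """Count the links on the shortest path from start to goal."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None  # goal cannot be reached from start


# Hypothetical map: places and links are invented for illustration.
island = {
    "beach": ["hut"],
    "hut": ["beach", "lake"],
    "lake": ["hut", "rock"],
    "rock": ["lake"],
}

print(links_traversed(island, "beach", "hut"))   # 1
print(links_traversed(island, "beach", "rock"))  # 3
```

If places that are farther apart on the map also tend to be separated by more nodes, such a network would produce longer times for longer distances, which is why the scanning result alone does not decide between the two theories.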

Only four years later, Finke and Pinker published a counter-model. Picture (1) shows a surface with four dots, which was presented to the subjects. After two seconds, it was replaced by picture (2), which showed an arrow. The subjects had to decide whether the arrow pointed at the position of one of the former dots. They reacted more slowly the farther the arrow was from a dot. Finke and Pinker concluded that, within two seconds, the distances could only have been stored in a spatial representation of the surface.

To sum up, it is commonly believed that imagery and perception share certain features but also differ in some respects. For example, perception is a bottom-up process that originates with an image on the retina, whereas imagery is a top-down mechanism which originates when activity is generated in higher visual centres without an actual stimulus. Another distinction is that perception occurs automatically and remains relatively stable, whereas imagery requires effort and is fragile. Because psychological discussion has failed to single out one correct theory, the debate has now moved to neuroscience, whose methods have improved considerably over the last three decades.