Hermann von Helmholtz was not aware of problems of visual underdetermination of the form illustrated in Figures 8-1 and 8-2. However, he was aware that visual sensors could be seriously misled. One example that he considered at length (Helmholtz & Southall, 1962a, 1962b) was the mechanical stimulation of the eye (e.g., slight pressure on the eyeball made by a blunt point), which produced a sensation of light (a pressure-image or phosphene) even though a light stimulus was not present. From this he proposed a general rule for determining the “ideas of vision”:
Such objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism, the eyes being used under ordinary normal conditions. (Helmholtz & Southall, 1962b, p. 2)
Helmholtz’s studies of such phenomena forced him to explain the processes by which such a rule could be realized. He first noted that the visual system does not have direct access to the distal world; its primary data are instead retinal activity. He concluded that inference must be involved to transform retinal activity into visual experience. “It is obvious that we can never emerge from the world of our sensations to the apperception of an external world, except by inferring from the changing sensation that external objects are the causes of this change” (Helmholtz & Southall, 1962b, p. 33). This theory allowed Helmholtz to explain visual illusions as the result of mistaken reasoning rather than as the product of malfunctions in the visual apparatus: “It is rather simply an illusion in the judgment of the material presented to the senses, resulting in a false idea of it” (p. 4).
Helmholtz argued that the accuracy of visual inferences is due to an agent’s constant exploration of, and experimentation with, the world, determining how actions in the world such as changing viewpoints alter visual experience.
Spontaneously and by our own power, we vary some of the conditions under which the object has been perceived. We know that the changes thus produced in the way that objects look depend solely on the movements we have executed. Thus we obtain a different series of apperceptions of the same object, by which we can be convinced with experimental certainty that they are simply apperceptions and that it is the common cause of them all. (Helmholtz & Southall, 1962b, p. 31)
Helmholtz argued that the only difference between visual inference and logical reasoning was that the former was unconscious while the latter was not, describing “the psychic acts of ordinary perception as unconscious conclusions” (Helmholtz & Southall, 1962b, p. 4). Consciousness aside, seeing and reasoning were processes of the same kind: “There can be no doubt as to the similarity between the results of such unconscious conclusions and those of conscious conclusions” (p. 4).
A century after Helmholtz, researchers were well aware of the problem of underdetermination with respect to vision. They held that the problem arises because certain information is missing from the proximal stimulus, so that additional processing is required to supply it. With the rise of cognitivism in the 1950s, researchers proposed a top-down, or theory-driven, account of perception in which general knowledge of the world was used to disambiguate the proximal stimulus (Bruner, 1957, 1992; Bruner, Postman, & Rodrigues, 1951; Gregory, 1970, 1978; Rock, 1983). This approach directly descended from Helmholtz’s discussion of unconscious conclusions because it equated visual perception with cognition.
One of the principal characteristics of perceiving [categorization] is a characteristic of cognition generally. There is no reason to assume that the laws governing inferences of this kind are discontinuous as one moves from perceptual to more conceptual activities. (Bruner, 1957, p. 124)
The cognitive account of perception that Jerome Bruner originated in the 1950s came to be known as the New Look. According to the New Look, higher-order cognitive processes could permit beliefs, expectations, and general knowledge of the world to provide additional information for disambiguation of the underdetermining proximal stimulus. “We not only believe what we see: to some extent we see what we believe” (Gregory, 1970, p. 15). Hundreds of studies provided experimental evidence that perceptual experience was determined in large part by a perceiver’s beliefs or expectations. (For one review of this literature see Pylyshyn, 2003b.) Given the central role of cognitivism since the inception of the New Look, it is not surprising that this type of theory has dominated the modern literature.
The belief that perception is thoroughly contaminated by such cognitive factors as expectations, judgments, beliefs, and so on, became the received wisdom in much of psychology, with virtually all contemporary elementary texts in human information processing and vision taking that point of view for granted. (Pylyshyn, 2003b, p. 56)
To illustrate the New Look, consider a situation in which I see a small, black and white, irregularly shaped, moving object. This visual information is not sufficient to uniquely specify what in the world I am observing. To deal with this problem, I use general reasoning processes to disambiguate the situation. Imagine that I am inside my home. I know that I own a black and white cat, I believe that the cat is indoors, and I expect that I will see this cat in the house. Thus I experience this visual stimulus as “seeing my cat Phoebe.” In a different context, different expectations exist. For instance, if I am outside the house on the street, then the same proximal stimulus will be disambiguated with different expectations: “I see my neighbour’s black and white dog Shadow.” If I am walking in the forest down by the creek, then I may use different beliefs to “see a skunk.”
It would seem that a higher agency of the mind, call it the executive agency, has available to it the proximal input, which it can scan, and it then behaves in a manner very like a thinking organism in selecting this or that aspect of the stimulus as representing the outer object or event in the world. (Rock, 1983, p. 39)
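The cat, dog, and skunk illustration can be loosely sketched in code. What follows is only a toy model under stated assumptions, not an implementation of any New Look theory: the context labels, the candidate objects, and the expectation weights are all hypothetical values chosen to mirror the example above. The same ambiguous stimulus is compatible with every candidate, and the "executive agency" simply selects whichever candidate is most strongly expected in the current context.

```python
# Toy sketch of New Look-style, top-down disambiguation.
# All names and numbers here are hypothetical, chosen for illustration only.

# Distal objects that are all compatible with the same proximal stimulus
# (a small, black and white, irregularly shaped, moving object).
CANDIDATES = ("cat", "dog", "skunk")

# Hypothetical strength of the perceiver's expectations in each context.
EXPECTATIONS = {
    "home": {"cat": 0.90, "dog": 0.08, "skunk": 0.02},
    "street": {"cat": 0.25, "dog": 0.70, "skunk": 0.05},
    "forest": {"cat": 0.10, "dog": 0.20, "skunk": 0.70},
}


def interpret(context: str) -> str:
    """Return the candidate most strongly expected in the given context."""
    expectations = EXPECTATIONS[context]
    return max(CANDIDATES, key=lambda candidate: expectations[candidate])


if __name__ == "__main__":
    for context in EXPECTATIONS:
        print(context, "->", interpret(context))
```

In this sketch the stimulus contributes nothing that distinguishes the candidates, so the interpretation is fixed entirely by context-dependent expectations.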
The New Look in perception is a prototypical example of classical cognitive science. If visual perception is another type of cognitive processing, then it is governed by the same laws as are reasoning and problem solving. In short, a crucial consequence of the New Look is that visual perception is rational, in the sense that vision’s success is measured in terms of the truth value of the representations it produces.
For instance, Richard Gregory (1970, p. 29, italics added) remarked that “it is surely remarkable that out of the infinity of possibilities the perceptual brain generally hits on just about the best one.” Gregory (1978, p. 13, italics added) also equated visual perception to problem solving, describing it as “a dynamic searching for the best interpretation of the available data.” The cognitive nature of perceptual processing allows,
past experience and anticipation of the future to play a large part in augmenting sensory information, so that we do not perceive the world merely from the sensory information available at any given time, but rather we use this information to test hypotheses of what lies before us. Perception becomes a matter of suggesting and testing hypotheses. (Gregory, 1978, p. 221)
In all of these examples, perception is described as a process that delivers representational contents that are most (semantically) consistent with visual sensations and other intentional contents, such as beliefs and desires.
The problem with the New Look is this rational view of perception. Because of its emphasis on top-down influences, the New Look lacks an account of links between the world and vision that are causal and independent of beliefs. If all of our perceptual experience were belief dependent, then we would never see anything that we did not expect to see. This would not contribute to our survival, which often depends upon noticing and reacting to surprising circumstances in the environment.
Pylyshyn’s (2003b, 2007) hybrid theory of visual cognition rests upon the assumption that there exists a cognitively impenetrable visual architecture that is separate from general cognition. This architecture is data-driven in nature, governed by causal influences from the visual world and insulated from beliefs and expectations. Such systems can solve problems of underdetermination without requiring assumptions of rationality, as discussed in the next section.