Skip to main content
Social Sci LibreTexts

4.2: Seeing

  • Page ID
    12119
  • Learning Objectives

    1. Identify the key structures of the eye and the role they play in vision.
    2. Summarize how the eye and the visual cortex work together to sense and perceive the visual stimuli in the environment, including processing colors, shape, depth, and motion.

    Whereas other animals rely primarily on hearing, smell, or touch to understand the world around them, human beings rely in large part on vision. A large part of our cerebral cortex is devoted to seeing, and we have substantial visual skills. Seeing begins when light falls on the eyes, initiating the process of transduction. Once this visual information reaches the visual cortex, it is processed by a variety of neurons that detect colors, shapes, and motion, and that create meaningful perceptions out of the incoming stimuli.

    The air around us is filled with a sea of electromagnetic energy; pulses of energy waves that can carry information from place to place. As you can see in Figure 4.6 “The Electromagnetic Spectrum”, electromagnetic waves vary in their wavelengththe distance between one wave peak and the next wave peak, with the shortest gamma waves being only a fraction of a millimeter in length and the longest radio waves being hundreds of kilometers long. Humans are blind to almost all of this energy—our eyes detect only the range from about 400 to 700 billionths of a meter, the part of the electromagnetic spectrum known as the visible spectrum.

    Figure 4.6 The Electromagnetic Spectrum

    b4eaddac5823123ca0aa5e6abe24da3d.jpg

    Only a small fraction of the electromagnetic energy that surrounds us (the visible spectrum) is detectable by the human eye.

    The Sensing Eye and the Perceiving Visual Cortex

    As you can see in Figure 4.7 “Anatomy of the Human Eye”, light enters the eye through the cornea, a clear covering that protects the eye and begins to focus the incoming light. The light then passes through the pupil, a small opening in the center of the eye. The pupil is surrounded by the iris, the colored part of the eye that controls the size of the pupil by constricting or dilating in response to light intensity. When we enter a dark movie theater on a sunny day, for instance, muscles in the iris open the pupil and allow more light to enter. Complete adaptation to the dark may take up to 20 minutes.

    Behind the pupil is the lens, a structure that focuses the incoming light on theretina, the layer of tissue at the back of the eye that contains photoreceptor cells. As our eyes move from near objects to distant objects, a process known as visual accommodation occurs. Visual accommodation is the process of changing the curvature of the lens to keep the light entering the eye focused on the retina. Rays from the top of the image strike the bottom of the retina and vice versa, and rays from the left side of the image strike the right part of the retina and vice versa, causing the image on the retina to be upside down and backward. Furthermore, the image projected on the retina is flat, and yet our final perception of the image will be three dimensional.

    Figure 4.7 Anatomy of the Human Eye

    f8f7d23e8173735d0702cb3e9159503d.jpg

    Light enters the eye through the transparent cornea, passing through the pupil at the center of the iris. The lens adjusts to focus the light on the retina, where it appears upside down and backward. Receptor cells on the retina send information via the optic nerve to the visual cortex.

    Accommodation is not always perfect, and in some cases the light that is hitting the retina is a bit out of focus. As you can see in Figure 4.8 “Normal, Nearsighted, and Farsighted Eyes”, if the focus is in front of the retina, we say that the person is nearsighted, and when the focus is behind the retina we say that the person is farsighted. Eyeglasses and contact lenses correct this problem by adding another lens in front of the eye, and laser eye surgery corrects the problem by reshaping the eye’s own lens.

    Figure 4.8 Normal, Nearsighted, and Farsighted Eyes

    f2834b8767c2db01234fe11247e1682b.jpg

    For people with normal vision (left), the lens properly focuses incoming light on the retina. For people who are nearsighted (center), images from far objects focus too far in front of the retina, whereas for people who are farsighted (right), images from near objects focus too far behind the retina. Eyeglasses solve the problem by adding a secondary, corrective, lens.

    The retina contains layers of neurons specialized to respond to light (see Figure 4.9 “The Retina With Its Specialized Cells”). As light falls on the retina, it first activates receptor cells known as rods and cones. The activation of these cells then spreads to the bipolar cells and then to the ganglion cells, which gather together and converge, like the strands of a rope, forming the optic nerve. The optic nerve is a collection of millions of ganglion neurons that sends vast amounts of visual information, via the thalamus, to the brain. Because the retina and the optic nerve are active processors and analyzers of visual information, it is not inappropriate to think of these structures as an extension of the brain itself.

    Figure 4.9 The Retina With Its Specialized Cells

    e618be902e39dc0da00285ca2f422454.jpg

    When light falls on the retina, it creates a photochemical reaction in the rods and cones at the back of the retina. The reactions then continue to the bipolar cells, the ganglion cells, and eventually to the optic nerve.

    Rods are visual neurons that specialize in detecting black, white, and gray colors. There are about 120 million rods in each eye. The rods do not provide a lot of detail about the images we see, but because they are highly sensitive to shorter-waved (darker) and weak light, they help us see in dim light, for instance, at night. Because the rods are located primarily around the edges of the retina, they are particularly active in peripheral vision (when you need to see something at night, try looking away from what you want to see). Cones are visual neurons that are specialized in detecting fine detail and colors. The 5 million or so cones in each eye enable us to see in color, but they operate best in bright light. The cones are located primarily in and around the fovea, which is the central point of the retina.

    To demonstrate the difference between rods and cones in attention to detail, choose a word in this text and focus on it. Do you notice that the words a few inches to the side seem more blurred? This is because the word you are focusing on strikes the detail-oriented cones, while the words surrounding it strike the less-detail-oriented rods, which are located on the periphery.

    Figure 4.10 Mona Lisa’s Smile

    782ef5dd4e91a075ecd12c4626dc570a.jpg

    Margaret Livingstone (2002) found an interesting effect that demonstrates the different processing capacities of the eye’s rods and cones—namely, that the Mona Lisa’s smile, which is widely referred to as “elusive,” is perceived differently depending on how one looks at the painting. Because Leonardo da Vinci painted the smile in low-detail brush strokes, these details are better perceived by our peripheral vision (the rods) than by the cones. Livingstone found that people rated the Mona Lisa as more cheerful when they were instructed to focus on her eyes than they did when they were asked to look directly at her mouth. As Livingstone put it, “She smiles until you look at her mouth, and then it fades, like a dim star that disappears when you look directly at it.”

    As you can see in Figure 4.11 “Pathway of Visual Images Through the Thalamus and Into the Visual Cortex”, the sensory information received by the retina is relayed through the thalamus to corresponding areas in the visual cortex, which is located in the occipital lobe at the back of the brain. Although the principle of contralateral control might lead you to expect that the left eye would send information to the right brain hemisphere and vice versa, nature is smarter than that. In fact, the left and right eyes each send information to both the left and the right hemisphere, and the visual cortex processes each of the cues separately and in parallel. This is an adaptational advantage to an organism that loses sight in one eye, because even if only one eye is functional, both hemispheres will still receive input from it.

    Figure 4.11 Pathway of Visual Images Through the Thalamus and Into the Visual Cortex

    640912d55e32911ce242b2e29943780c.jpg

    The left and right eyes each send information to both the left and the right brain hemisphere.

    The visual cortex is made up of specialized neurons that turn the sensations they receive from the optic nerve into meaningful images. Because there are no photoreceptor cells at the place where the optic nerve leaves the retina, a hole or blind spot in our vision is created (see Figure 4.12 “Blind Spot Demonstration”). When both of our eyes are open, we don’t experience a problem because our eyes are constantly moving, and one eye makes up for what the other eye misses. But the visual system is also designed to deal with this problem if only one eye is open—the visual cortex simply fills in the small hole in our vision with similar patterns from the surrounding areas, and we never notice the difference. The ability of the visual system to cope with the blind spot is another example of how sensation and perception work together to create meaningful experience.

    Figure 4.12 Blind Spot Demonstration

    c7831c0a985ce9921723f00288fd20b0.jpg

    You can get an idea of the extent of your blind spot (the place where the optic nerve leaves the retina) by trying this demonstration. Close your left eye and stare with your right eye at the cross in the diagram. You should be able to see the elephant image to the right (don’t look at it, just notice that it is there). If you can’t see the elephant, move closer or farther away until you can. Now slowly move so that you are closer to the image while you keep looking at the cross. At one distance (probably a foot or so), the elephant will completely disappear from view because its image has fallen on the blind spot.

    Perception is created in part through the simultaneous action of thousands of feature detector neuronsspecialized neurons, located in the visual cortex, that respond to the strength, angles, shapes, edges, and movements of a visual stimulus (Kelsey, 1997; Livingstone & Hubel, 1988). The feature detectors work in parallel, each performing a specialized function. When faced with a red square, for instance, the parallel line feature detectors, the horizontal line feature detectors, and the red color feature detectors all become activated. This activation is then passed on to other parts of the visual cortex where other neurons compare the information supplied by the feature detectors with images stored in memory. Suddenly, in a flash of recognition, the many neurons fire together, creating the single image of the red square that we experience (Rodriguez et al., 1999).

    Figure 4.13 The Necker Cube

    483f6b6c12ac7534c6d4d9dba817118b.jpg

    The Necker cube is an example of how the visual system creates perceptions out of sensations. We do not see a series of lines, but rather a cube. Which cube we see varies depending on the momentary outcome of perceptual processes in the visual cortex.

    Some feature detectors are tuned to selectively respond to particularly important objects, for instance, faces, smiles, and other parts of the body (Downing, Jiang, Shuman, & Kanwisher, 2001; Haxby et al., 2001). When researchers disrupted face recognition areas of the cortex using the magnetic pulses of transcranial magnetic stimulation (TMS), people were temporarily unable to recognize faces, and yet they were still able to recognize houses (McKone, Kanwisher, & Duchaine, 2007; Pitcher, Walsh, Yovel, & Duchaine, 2007).

    Perceiving Color

    It has been estimated that the human visual system can detect and discriminate among 7 million color variations (Geldard, 1972), but these variations are all created by the combinations of the three primary colors: red, green, and blue. The shade of a color, known as hue, is conveyed by the wavelength of the light that enters the eye (we see shorter wavelengths as more blue and longer wavelengths as more red), and we detect brightness from the intensity or height of the wave (bigger or more intense waves are perceived as brighter).

    Figure 4.14 Low- and High-Frequency Sine Waves and Low- and High-Intensity Sine Waves and Their Corresponding Colors

    ace0c50114b5b55efbad729899a07099.jpg

    Light waves with shorter frequencies are perceived as more blue than red; light waves with higher intensity are seen as brighter.

    In his important research on color vision, Hermann von Helmholtz (1821–1894) theorized that color is perceived because the cones in the retina come in three types. One type of cone reacts primarily to blue light (short wavelengths), another reacts primarily to green light (medium wavelengths), and a third reacts primarily to red light (long wavelengths). The visual cortex then detects and compares the strength of the signals from each of the three types of cones, creating the experience of color. According to this Young-Helmholtz trichromatic color theory, what color we see depends on the mix of the signals from the three types of cones. If the brain is receiving primarily red and blue signals, for instance, it will perceive purple; if it is receiving primarily red and green signals it will perceive yellow; and if it is receiving messages from all three types of cones it will perceive white.

    The different functions of the three types of cones are apparent in people who experience color blindnessthe inability to detect either green and/or red colors. About 1 in 50 people, mostly men, lack functioning in the red- or green-sensitive cones, leaving them only able to experience either one or two colors (Figure 4.15).

    Figure 4.15

    edac27a4100f44ddac6dd84485fadb79.jpg

    People with normal color vision can see the number 42 in the first image and the number 12 in the second (they are vague but apparent). However, people who are color blind cannot see the numbers at all.

    The trichromatic color theory cannot explain all of human vision, however. For one, although the color purple does appear to us as a mixing of red and blue, yellow does not appear to be a mix of red and green. And people with color blindness, who cannot see either green or red, nevertheless can still see yellow. An alternative approach to the Young-Helmholtz theory, known as the opponent-process color theory, proposes that we analyze sensory information not in terms of three colors but rather in three sets of “opponent colors”: red-green, yellow-blue, and white-black. Evidence for the opponent-process theory comes from the fact that some neurons in the retina and in the visual cortex are excited by one color (e.g., red) but inhibited by another color (e.g., green).

    One example of opponent processing occurs in the experience of an afterimage. If you stare at the flag on the left side of Figure 4.16 “U.S. Flag” for about 30 seconds (the longer you look, the better the effect), and then move your eyes to the blank area to the right of it, you will see the afterimage. When we stare at the green stripes, our green receptors habituate and begin to process less strongly, whereas the red receptors remain at full strength. When we switch our gaze, we see primarily the red part of the opponent process. Similar processes create blue after yellow and white after black.

    Figure 4.16 U.S. Flag

    67a9a926daa610a2f90ab5d508770678.jpg

    The presence of an afterimage is best explained by the opponent-process theory of color perception. Stare at the flag for a few seconds, and then move your gaze to the blank space next to it. Do you see the afterimage?

    The tricolor and the opponent-process mechanisms work together to produce color vision. When light rays enter the eye, the red, blue, and green cones on the retina respond in different degrees, and send different strength signals of red, blue, and green through the optic nerve. The color signals are then processed both by the ganglion cells and by the neurons in the visual cortex (Gegenfurtner & Kiper, 2003).

    Perceiving Form

    One of the important processes required in vision is the perception of form. German psychologists in the 1930s and 1940s, including Max Wertheimer (1880–1943), Kurt Koffka (1886–1941), and Wolfgang Köhler (1887–1967), argued that we create forms out of their component sensations based on the idea of the gestalt, a meaningfully organized whole. The idea of the gestalt is that the “whole is more than the sum of its parts.” Some examples of how gestalt principles lead us to see more than what is actually there are summarized in Table 4.1 “Summary of Gestalt Principles of Form Perception”.

    Table 4.1 Summary of Gestalt Principles of Form Perception

    Principle Description Example Image
    Figure and ground We structure input such that we always see a figure (image) against a ground (background). At right, you may see a vase or you may see two faces, but in either case, you will organize the image as a figure against a ground.

    Figure 4.1

    d100af2425fc65bd44d660e20df82bee.jpg

    Similarity Stimuli that are similar to each other tend to be grouped together. You are more likely to see three similar columns among the XYX characters at right than you are to see four rows.

    Figure 4.1

    99698f39231a8216ea794b0cbb3749d4.jpg

    Proximity We tend to group nearby figures together. Do you see four or eight images at right? Principles of proximity suggest that you might see only four.

    Figure 4.1

    b363b29d182282b10681e34e9cf5ffe7.jpg

    Continuity We tend to perceive stimuli in smooth, continuous ways rather than in more discontinuous ways. At right, most people see a line of dots that moves from the lower left to the upper right, rather than a line that moves from the left and then suddenly turns down. The principle of continuity leads us to see most lines as following the smoothest possible path.

    Figure 4.1

    be9842547ed5489f1b124614ee57d0b6.jpg

    Closure We tend to fill in gaps in an incomplete image to create a complete, whole object. Closure leads us to see a single spherical object at right rather than a set of unrelated cones.

    Figure 4.1

    fe704ab1f4b03f5281118eb5fba6ca91.jpg