5.14: The Extended Mind




In preceding pages of this chapter, a number of interrelated topics that are central to embodied cognitive science have been introduced: situation and embodiment, feedback between agents and environments, stigmergic control of behavior, affordances and enactive perception, and cognitive scaffolding. These topics show that embodied cognitive science places much more emphasis on body and world, and on sense and action, than do other “flavours” of cognitive science.

This change in emphasis can have profound effects on our definitions of mind or self (Bateson, 1972). For example, consider this famous passage from anthropologist Gregory Bateson:

But what about ‘me’? Suppose I am a blind man, and I use a stick. I go tap, tap, tap. Where do I start? Is my mental system bounded at the handle of the stick? Is it bounded by my skin? (Bateson, 1972, p. 465)

The embodied approach’s emphasis on agents embedded in their environments leads to a radical and controversial answer to Bateson’s questions, in the form of the extended mind (Clark, 1997, 1999, 2003, 2008; Clark & Chalmers, 1998; Menary, 2008, 2010; Noë, 2009; Rupert, 2009; Wilson, 2004, 2005). According to the extended mind hypothesis, the mind and its information processing are not separated from the world by the skull. Instead, the mind interacts with the world in such a way that information processing is both part of the brain and part of the world—the boundary between the mind and the world is blurred, or has disappeared.

Where is the mind located? The traditional view—typified by the classical approach introduced in Chapter 3—is that thinking is inside the individual, and that sensing and acting involve the world outside. However, if cognition is scaffolded, then some thinking has moved from inside the head to outside in the world. “It is the human brain plus these chunks of external scaffolding that finally constitutes the smart, rational inference engine we call mind” (Clark, 1997, p. 180). As a result, Clark (1997) described the mind as a leaky organ, because it has spread from inside our head to include whatever is used as external scaffolding.

The extended mind hypothesis has enormous implications for the cognitive sciences. The debate between classical and connectionist cognitive science does not turn on this issue, because both approaches are essentially representational. That is, both approaches tacitly endorse the classical sandwich; while they have strong disagreements about the nature of representational processes in the filling of the sandwich, neither of these approaches views the mind as being extended. Embodied cognitive scientists who endorse the extended mind hypothesis thus appear to be moving in a direction that strongly separates the embodied approach from the other two. It is small comfort to know that all cognitive scientists might agree that they are in the business of studying the mind, when they can’t agree upon what minds are.

For this reason, the extended mind hypothesis has increasingly been a source of intense philosophical analysis and criticism (Adams & Aizawa, 2008; Menary, 2010; Robbins & Aydede, 2009). Adams and Aizawa (2008) are strongly critical of the extended mind hypothesis because they believe that it makes no serious attempt to define the “mark of the cognitive,” that is, the principled differences between cognitive and non-cognitive processing:

If just any sort of information processing is cognitive processing, then it is not hard to find cognitive processing in notebooks, computers and other tools. The problem is that this theory of the cognitive is wildly implausible and evidently not what cognitive psychologists intend. A wristwatch is an information processor, but not a cognitive agent. What the advocates of extended cognition need, but, we argue, do not have, is a plausible theory of the difference between the cognitive and the non-cognitive that does justice to the subject matter of cognitive psychology. (Adams & Aizawa, 2008, p. 11)

A variety of other critiques can be found in contributions to Robbins and Aydede’s (2009) Cambridge Handbook of Situated Cognition. Prinz made a pointed argument that the extended mind has nothing to contribute to the study of consciousness. Rupert noted how the notion of innateness poses numerous problems for the extended mind. Warneken and Tomasello examined cultural scaffolding, but they eventually adopted a position in which these cultural tools have been internalized by agents. Finally, Bechtel presented a coherent argument from the philosophy of biology that there is good reason for the skull to serve as the boundary between the world and the mind. Clearly, the adoption of extendedness by situated researchers is far from universal.

Although the debate about its plausibility remains unresolved, the extended mind hypothesis is growing in popularity in embodied cognitive science. Let us briefly turn to another implication that this hypothesis has for the practice of cognitive science.

The extended mind hypothesis is frequently applied to single cognitive agents. However, this hypothesis also opens the door to co-operative or public cognition in which a group of agents is embedded in a shared environment (Hutchins, 1995). In this situation, more than one cognitive agent can manipulate the world that is being used to support the information processing of other group members.

Hutchins (1995) provided one example of public cognition in his description of how a team of individuals is responsible for navigating a ship. He argued that “organized groups may have cognitive properties that differ from those of the individuals who constitute the group” (p. 228). For instance, in many cases it is very difficult to translate the heuristics used by a solo navigator into a procedure that can be implemented by a navigation team.

Collective intelligence—also called swarm intelligence or co-operative computing—is also of growing importance in robotics. Entomologists used the concept of the superorganism (Wheeler, 1911) to explain how entire colonies could produce more complex results (such as elaborate nests) than one would predict from knowing the capabilities of individual colony members. Swarm intelligence is an interesting evolution of the idea of the superorganism; it involves a collective of agents operating in a shared environment. Importantly, a swarm’s components are only involved in local interactions with each other, resulting in many advantages (Balch & Parker, 2002; Sharkey, 2006).

For instance, a computing swarm is scalable—it may comprise varying numbers of agents, because the same control structure (i.e., local interactions) is used regardless of how many agents are in the swarm. For the same reason, a computing swarm is flexible: agents can be added or removed from the swarm without reorganizing the entire system. The scalability and flexibility of a swarm make it robust, as it can continue to compute when some of its component agents no longer function properly. Notice how these advantages of a swarm of agents are analogous to the advantages of connectionist networks over classical models, as discussed in Chapter 4.

Nonlinearity is also a key ingredient of swarm intelligence. For a swarm to be considered intelligent, the whole must be greater than the sum of its parts. This idea has been used to identify the presence of swarm intelligence by relating the amount of work done by a collective to the number of agents in the collection (Beni & Wang, 1991). If the relationship between work accomplished and number of agents is linear, then the swarm is not considered to be intelligent. However, if the relationship is nonlinear—for instance, exponentially increasing—then swarm intelligence is present. The nonlinear relationship between work and numbers may itself be mediated by other nonlinear relationships. For example, Dawson, Dupuis, and Wilson (2010) found that in collections of simple LEGO robots, the presence of additional robots influenced robot paths in an arena in such a way that a sorting task was accomplished far more efficiently.
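The linear-versus-nonlinear criterion described above can be illustrated with a short sketch. It is not the procedure of Beni and Wang (1991); it simply assumes that a straight line is fitted to (number of agents, work accomplished) pairs by least squares, and that the relationship is called nonlinear when the worst misfit exceeds an arbitrary tolerance. The function names, the 5 percent tolerance, and the data in the usage lines are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through the points; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def looks_nonlinear(agents, work, tolerance=0.05):
    """True if the best-fitting line misses some point by more than
    `tolerance` times the largest amount of work observed.
    The tolerance is an arbitrary, illustrative threshold."""
    slope, intercept = fit_line(agents, work)
    worst = max(abs(w - (slope * a + intercept)) for a, w in zip(agents, work))
    return worst / max(abs(w) for w in work) > tolerance


# Made-up measurements: work doubles with each added agent in the second case.
agents = [1, 2, 3, 4, 5]
print(looks_nonlinear(agents, [2, 4, 6, 8, 10]))   # proportional growth
print(looks_nonlinear(agents, [2, 4, 8, 16, 32]))  # exponential growth
```

Under these assumptions the first collective would not count as showing swarm intelligence, while the second would.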

While early studies of robot collectives concerned small groups of homogeneous robots (Gerkey & Mataric, 2004), researchers are now more interested in complex collectives consisting of different types of machines for performing diverse tasks at varying locations or times (Balch & Parker, 2002; Schultz & Parker, 2002). This leads to the problem of coordinating the varying actions of diverse collective members (Gerkey & Mataric, 2002, 2004; Mataric, 1998). One general approach to solving this coordination problem is intentional co-operation (Balch & Parker, 2002; Parker, 1998, 2001), which uses direct communication amongst robots to prevent unnecessary duplication (or competition) between robot actions. However, intentional co-operation comes with its own set of problems. For instance, communication between robots is costly, particularly as more robots are added to a communicating team (Kube & Zhang, 1994). In addition, because communication makes the functions carried out by individual team members more specialized, the robustness of the robot collective is jeopardized (Kube & Bonabeau, 2000). Is it possible for a robot collective to coordinate its component activities, and solve interesting problems, in the absence of direct communication?

The embodied approach has generated a plausible answer to this question via stigmergy (Kube & Bonabeau, 2000). Kube and Bonabeau (2000) demonstrated that the actions of a large collective of robots could be stigmergically coordinated so that the collective could push a box to a goal location in an arena. Robots used a variety of sensors to detect (and avoid) other robots, locate the box, and locate the goal location. A subsumption architecture was employed to instantiate a fairly simple set of sense-act reflexes. For instance, if a robot detected that it was in contact with the box and could see the goal, then box-pushing behavior was initiated. If it was in contact with the box but could not see the goal, then other movements were triggered, resulting in the robot finding contact with the box at a different position.
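The sense-act reflexes just described can be sketched as a priority-ordered list of rules. This is an illustrative simplification rather than Kube and Bonabeau's (2000) controller: the percept fields, the action names, and the ordering of the rules are assumptions, and a true subsumption architecture would run its layers concurrently, with higher layers suppressing lower ones.

```python
from dataclasses import dataclass


@dataclass
class Percept:
    """What one robot senses at one moment (hypothetical fields)."""
    touching_box: bool = False
    sees_goal: bool = False
    robot_nearby: bool = False
    sees_box: bool = False


def select_action(p):
    """Return the action of the first reflex whose condition holds.
    The rule order is an assumed priority, not taken from the source."""
    if p.touching_box and p.sees_goal:
        return "push_box"           # in contact and goal visible: push
    if p.touching_box:
        return "reposition_on_box"  # in contact, goal hidden: find a new contact point
    if p.robot_nearby:
        return "avoid_robot"        # keep clear of other robots
    if p.sees_box:
        return "approach_box"       # head for the box
    return "wander"                 # otherwise explore the arena


print(select_action(Percept(touching_box=True, sees_goal=True)))
print(select_action(Percept(touching_box=True)))
```

Note that no rule refers to a message from another robot; each reflex depends only on what the robot itself currently senses.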

This subsumption architecture caused robots to seek the box, push it towards the goal, and do so co-operatively by avoiding other robots. Furthermore, when robot activities altered the environment, this produced corresponding changes in behavior of other robots. For instance, a robot pushing the box might lose sight of the goal because of box movement, and it would therefore leave the box and use its other exploratory behaviors to come back to the box and push it from a different location. “Cooperation in some tasks is possible without direct communication” (Kube & Bonabeau, 2000, p. 100). Importantly, the solution to the box-pushing problem required such co-operation, because the box being manipulated was too heavy to be moved by a small number of robots!
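Why a heavy box forces co-operation can be shown with a toy model. The sketch below is not a reconstruction of the original experiment: it assumes a one-dimensional arena, a fixed number of robots in contact with the box on every time step, and a hypothetical threshold (`min_pushers`) below which the box does not move at all.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Shared environment; robots coordinate only through this state."""
    box_position: int = 0
    goal_position: int = 5


def step(world, robots_pushing, min_pushers=3):
    """Move the box one cell toward the goal if enough robots push at once."""
    if world.box_position < world.goal_position and robots_pushing >= min_pushers:
        world.box_position += 1
    return world.box_position


def run(robots_pushing, min_pushers=3, max_steps=20):
    """Return the final box position after `max_steps` time steps."""
    world = World()
    for _ in range(max_steps):
        step(world, robots_pushing, min_pushers)
    return world.box_position


print(run(robots_pushing=2))  # too few robots: the box never leaves its start
print(run(robots_pushing=3))  # enough robots: the box reaches the goal
```

In this toy setting the work accomplished is zero below the threshold and complete above it, so the relationship between work and number of robots is sharply nonlinear.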

The box-pushing research of Kube and Bonabeau (2000) is an example of stigmergic processing that occurs when two or more individuals collaborate on a task using a shared environment. Hutchins (1995) brought attention to less obvious examples of public cognition that exploit specialized environmental tools. Such scaffolding devices cannot be dissociated from culture or history. For example, Hutchins noted that navigation depends upon centuries-old mathematics of chart projections, not to mention millennia-old number systems.

These observations caused Hutchins (1995) to propose an extension of Simon’s (1969) parable of the ant. Hutchins argued that rather than watching an individual ant on the beach, we should arrive at a beach after a storm and watch generations of ants at work. As the ant colony matures, the ants will appear smarter, because their behaviors are more efficient. But this is because,

the environment is not the same. Generations of ants have left their marks on the beach, and now a dumb ant has been made to appear smart through its simple interaction with the residua of the history of its ancestor’s actions. (Hutchins, 1995, p. 169)

Hutchins’ (1995) suggestion mirrored concerns raised by Scribner’s studies of mind in action. She observed that the diversity of problem solutions generated by dairy workers, for example, was due in part to social scaffolding.

We need a greater understanding of the ways in which the institutional setting, norms and values of the work group and, more broadly, cultural understandings of labor contribute to the reorganization of work tasks in a given community. (Scribner & Tobach, 1997, p. 373)

Furthermore, Scribner pointed out that the traditional methods used by classical researchers to study cognition were not suited for increasing this kind of understanding. The extended mind hypothesis leads not only to questions about the nature of mind, but also to questions about the methods used to study mentality.