2.11: Architectures against Homunculi

Last updated
Save as PDF

Page ID: 35709

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

We have described an algorithm for calculating an input-output mapping as a sequence of operations or behaviours. This description is misleading, though, because the notion of sequence gives the impression of a linear ordering of steps. However, we would not expect most algorithms to be linearly organized. For instance, connectionist cognitive scientists would argue that more than one step in an algorithm can be carried out at the same time (Feldman & Ballard, 1982). As well, most algorithms of interest to classical cognitive scientists would likely exhibit a markedly hierarchical organization (Miller, Galanter, & Pribram, 1960; Simon, 1969). In this section, I use the notion of hierarchical organization to motivate the need for an algorithm to be supported by an architecture.

What does it mean for an algorithm to be hierarchical in nature? To answer this question, let us again consider the situation in which behavioural measurements are being used to reverse engineer a calculating black box. Initial experiments could suggest that an input-output mapping is accomplished by an algorithm that involves three steps (Step 1 → Step 2 → Step 3). However, later studies could also indicate that each of these steps might themselves be accomplished by sub-algorithms.

For instance, it might be found that Step 1 is accomplished by its own fourstep sub-algorithm (Step a → Step b → Step c → Step d). Even later it could be discovered that one of these sub-algorithms is itself the product of another sub-subalgorithm. Such hierarchical organization is common practice in the development of algorithms for digital computers, where most programs are organized systems of functions, subfunctions, and sub-subfunctions. It is also a common characteristic of cognitive theories (Cummins, 1983).

The hierarchical organization of algorithms can pose a problem, though, if an algorithmic account is designed to explain a calculating device. Consider our example where Step 1 of the black box’s algorithm is explained by being hierarchically decomposed into the sub-algorithm “Step a → Step b → Step c → Step d.” On closer examination, it seems that nothing has really been explained at all. Instead, we have replaced Step 1 with a sequence of four new steps, each of which requires further explanation. If each of these further explanations is of the same type as the one to account for Step 1, then this will in turn produce even more steps requiring explanation. There seems to be no end to this infinite proliferation of algorithmic steps that are appearing in our account of the calculating device.

This situation is known as Ryle’s regress. The philosopher Gilbert Ryle raised it as a problem with the use of mentalistic terms in explanations of intelligence:

Must we then say that for the hero’s reflections how to act to be intelligent he must first reflect how best to reflect to act? The endlessness of this implied regress shows that the application of the criterion of appropriateness does not entail the occurrence of a process of considering this criterion. (Ryle, 1949, p. 31)

Ryle’s regress occurs when we explain outer intelligence by appealing to inner intelligence.

Ryle’s regress is also known as the homunculus problem, where a homunculus is an intelligent inner agent. The homunculus problem arises when one explains outer intelligence by appealing to what is in essence an inner homunculus. For instance, Fodor noted the obvious problems with a homuncular explanation of how one ties their shoes:

And indeed there would be something wrong with an explanation that said, ‘This is the way we tie our shoes: we notify a little man in our head who does it for us.’ This account invites the question: ‘How does the little man do it?’ but, ex hypothesis, provides no conceptual mechanisms for answering such questions. (Fodor, 1968a, p. 628)

Indeed, if one proceeds to answer the invited question by appealing to another homunculus within the “little man,” then the result is an infinite proliferation of homunculi.

To solve Ryle’s regress an algorithm must be analyzed into steps that do not require further decomposition in order to be explained. This means when some function is decomposed into a set of subfunctions, it is critical that each of the subfunctions be simpler than the overall function that they work together to produce (Cummins, 1983; Dennett, 1978; Fodor, 1968a). Dennett (1978, p. 123) noted that “homunculi are bogeymen only if they duplicate entire the talents they are rung in to explain.” Similarly, Fodor (1968a, p. 629) pointed out that “we refine a psychological theory by replacing global little men by less global little men, each of whom has fewer unanalyzed behaviors to perform than did his predecessors.”

If the functions produced in a first pass of analysis require further decomposition in order to be themselves explained, then the subfunctions that are produced must again be even simpler. At some point, the functions become so simple—the homunculi become so stupid—that they can be replaced by machines. This is because at this level all they do is answer “yes” or “no” to some straightforward question. “One discharges fancy homunculi from one’s scheme by organizing armies of such idiots to do the work” (Dennett, 1978, p. 124).

The set of subfunctions that exist at this final level of decomposition belongs to what computer scientists call the device’s architecture (Blaauw & Brooks, 1997; Brooks, 1962; Dasgupta, 1989). The architecture defines what basic abilities are built into the device. For a calculating device, the architecture would specify three different types of components: the basic operations of the device, the objects to which these operations are applied, and the control scheme that decides which operation to carry 48 Chapter 2 out at any given time (Newell, 1980; Simon, 1969). To detail the architecture is to specify “what operations are primitive, how memory is organized and accessed, what sequences are allowed, what limitations exist on the passing of arguments and on the capacities of various buffers, and so on” (Pylyshyn, 1984, p. 92).

What is the relationship between an algorithm and its architecture? In general, the architecture provides the programming language in which an algorithm is written. “Specifying the functional architecture of a system is like providing a manual that defines some programming language. Indeed, defining a programming language is equivalent to specifying the functional architecture of a virtual machine” (Pylyshyn, 1984, p. 92).

This means that algorithms and architectures share many properties. Foremost of these is that they are both described as operations, behaviours, or functions, and not in terms of physical makeup. An algorithm is a set of functions that work together to accomplish a task; an architectural component is one of the simplest functions—a primitive operation—from which algorithms are composed. In order to escape Ryle’s regress, one does not have to replace an architectural function with its physical account. Instead, one simply has to be sure that such a replacement is available if one wanted to explain how the architectural component works. It is no accident that Pylyshyn (1984) uses the phrase functional architecture in the quote given above.

Why do we insist that the architecture is functional? Why don’t we appeal directly to the physical mechanisms that bring an architecture into being? Both of these questions are answered by recognizing that multiple physical realizations are possible for any functional architecture. For instance, simple logic gates are clearly the functional architecture of modern computers. But we saw earlier that functionally equivalent versions of these gates could be built out of wires and switches, vacuum tubes, semiconductors, or hydraulic valves.

To exit Ryle’s regress, we have to discharge an algorithm’s homunculi. We can do this by identifying the algorithm’s programming language—by saying what its architecture is. Importantly, this does not require us to say how, or from what physical stuff, the architecture is made! “Whether you build a computer out of transistors, hydraulic valves, or a chemistry set, the principles on which it operates are much the same” (Hillis, 1998, p. 10).