4.15: Architectural Connectionism - An Overview

In the last several sections, we have been concerned with interpreting the internal structure of multilayered artificial neural networks. While some have claimed that all that can be found within brains and networks is goo (Mozer & Smolensky, 1989), the preceding examples have shown that detailed interpretations of internal network structure are both possible and informative. These interpretations reveal algorithmic-level details about how artificial neural networks use their hidden units to mediate mappings from inputs to outputs.

If the goal of connectionist cognitive science is to make new representational discoveries, then it should be practiced as a form of synthetic psychology (Braitenberg, 1984; Dawson, 2004), one that incorporates both synthesis and analysis and that involves both forward engineering and reverse engineering.

The analytic aspect of connectionist cognitive science involves peering inside a network in order to determine how its internal structure represents solutions to problems. The preceding pages of this chapter have provided several examples of this approach, which seems identical to the reverse engineering practiced by classical cognitive scientists.

The reverse engineering phase of connectionist cognitive science is also linked to classical cognitive science, in the sense that the results of these analyses are likely to provide the questions that drive algorithmic-level investigations. Once a novel representational format is discovered in a network, a key issue is to determine whether it also characterizes human or animal cognition. One would expect that when connectionist cognitive scientists evaluate their representational discoveries, they should do so by gathering the same kind of relative complexity, intermediate state, and error evidence that classical cognitive scientists gather when seeking strong equivalence.

Before one can reverse engineer a network, one must create it. And if the goal of such a network is to discover surprising representational regularities, then it should be created by minimizing representational assumptions as much as possible. One takes the building blocks available in a particular connectionist architecture, creates a network from them, encodes a problem for this network in some way, and attempts to train the network to map inputs to outputs.

This synthetic phase of research involves exploring different network structures (e.g., different design decisions about numbers of hidden units, or types of activation functions) and different approaches to encoding inputs and outputs. The idea is to give the network as many degrees of freedom as possible to discover representational regularities that have not been imposed or predicted by the researcher. These decisions all involve the architectural level of investigation.

One issue, though, is that networks are greedy, in the sense that they will exploit whatever resources are available to them. As a result, fairly idiosyncratic and specialized detectors are likely to be found if too many hidden units are provided to the network, and the network’s performance may not transfer well when presented with novel stimuli. To deal with this, one must impose constraints by looking for the simplest network that will reliably learn the mapping of interest. The idea here is that such a network might be the one most likely to discover a representation general enough to transfer the network’s ability to new patterns.
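The search for the simplest network that reliably learns a mapping can be sketched in code. The example below is only an illustration under stated assumptions: the XOR mapping, the sigmoid units, the learning rate, the number of epochs, and the helper names `learns_mapping` and `smallest_reliable_size` are all hypothetical choices, not details taken from the text. "Reliably" is read here as "learns the mapping from every one of several random starting points."

```python
import math
import random

# XOR: a toy mapping chosen for illustration (an assumption, not from the text).
PATTERNS = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def sigmoid(x):
    # Clamp the net input so math.exp cannot overflow.
    x = max(-60.0, min(60.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def learns_mapping(n_hidden, seed, epochs=3000, rate=0.5):
    """Train a one-hidden-layer network with backpropagation.

    Returns True if every pattern ends on the correct side of 0.5.
    """
    rng = random.Random(seed)
    # Each hidden unit has two input weights plus a bias.
    hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    # The output unit has one weight per hidden unit plus a bias (last entry).
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in hidden]
        y = sigmoid(sum(o * a for o, a in zip(output, h)) + output[-1])
        return h, y

    for _ in range(epochs):
        for x, target in PATTERNS:
            h, y = forward(x)
            delta_out = (target - y) * y * (1.0 - y)
            for j in range(n_hidden):
                # The hidden error uses the output weight before it is updated.
                delta_h = delta_out * output[j] * h[j] * (1.0 - h[j])
                output[j] += rate * delta_out * h[j]
                hidden[j][0] += rate * delta_h * x[0]
                hidden[j][1] += rate * delta_h * x[1]
                hidden[j][2] += rate * delta_h
            output[-1] += rate * delta_out

    return all((forward(x)[1] > 0.5) == (t > 0.5) for x, t in PATTERNS)


def smallest_reliable_size(sizes=(1, 2, 3, 4), seeds=range(5)):
    """Return the smallest hidden-layer size that learns the mapping
    for every seed, or None if no candidate size does."""
    for n in sizes:
        if all(learns_mapping(n, s) for s in seeds):
            return n
    return None
```

Calling `smallest_reliable_size()` tries the candidate sizes in increasing order and stops at the first one that succeeds for every seed. Which size that turns out to be depends on the training settings assumed above, so the result should be read as a property of this particular sketch rather than of the problem itself.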

Importantly, when one makes architectural decisions to seek the simplest network capable of solving a problem, one sometimes discovers that the required network is merely a perceptron that does not employ any hidden units. In the remaining sections of this chapter I provide some examples of simple networks that are capable of performing interesting tasks. In section 4.15 I describe the relevance of perceptrons to modern theories of associative learning. In section 4.16 I present a perceptron model of the reorientation task. In section 4.17 I interpret the structure of a perceptron that learns a seemingly complicated progression of musical chords.
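To make concrete what a network without hidden units looks like, the following minimal sketch trains a single threshold output unit connected directly to its inputs. The logical AND and XOR mappings, the step activation, the perceptron learning rule, and the names `train_perceptron` and `predict` are assumptions made for illustration; they are not the models discussed in the sections that follow.

```python
def train_perceptron(patterns, max_epochs=100, rate=1):
    """Train one threshold unit with the perceptron learning rule.

    Returns (weights, bias, converged), where converged is True if an
    entire pass through the patterns produced no errors.
    """
    n_inputs = len(patterns[0][0])
    weights = [0] * n_inputs
    bias = 0
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in patterns:
            net = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if net > 0 else 0
            error = target - output
            if error != 0:
                errors += 1
                # Nudge each weight toward the target in proportion to its input.
                weights = [w + rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += rate * error
        if errors == 0:
            return weights, bias, True
    return weights, bias, False


def predict(weights, bias, inputs):
    """Return the unit's 0/1 response to one input pattern."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net > 0 else 0


# Two toy mappings (hypothetical examples): AND is linearly separable, XOR is not.
AND_PATTERNS = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_weights, and_bias, and_converged = train_perceptron(AND_PATTERNS)
xor_weights, xor_bias, xor_converged = train_perceptron(XOR_PATTERNS)
```

The unit converges on AND, so no hidden units are needed for that mapping. It cannot converge on XOR, because no single linear boundary separates the two response classes; that is the kind of mapping for which hidden units become necessary.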