12.2: Knowledge Representation in the Brain
Concepts and Categories
Concepts are mental representations, and they are essential for many cognitive functions, including memory, reasoning and using and understanding language. One function of concepts is the categorisation of knowledge, which has been studied intensely. In this chapter we focus on this function of concepts.
Imagine you wake up every single morning and start wondering about all the things you have never seen before. Think about how you would feel if an unknown car parked in front of your house. You have seen thousands of cars but since you have never seen this specific car in this particular position, you would not be able to provide yourself with any explanation. Since we are able to find an explanation, the questions we need to ask ourselves are: How are we able to abstract from prior knowledge and why do we not start all over again if we are confronted with a slightly new situation? The answer is easy: We categorise knowledge. Categorisation is the process by which things are placed into groups called categories.
Categories have been called “pointers to knowledge”. You can imagine a category as a box in which similar objects are grouped and which is labeled with common properties and other general information about the category. Our brain does not only memorise specific examples of members of a category, but also stores general information that all members have in common and which therefore defines the category. Coming back to the car example, this means that our brain does not only store what your car, your neighbors’ car and your friends’ car look like, but also provides us with the general information that most cars have four wheels, need to be fueled and so on. Because categorisation lets us recognise new objects as members of a category and so immediately get a general picture of a scene, it saves us much of the time and energy that we would otherwise have to spend investigating new objects. It helps us to focus on the important details in our environment and enables us to draw the correct inferences. To see this, imagine yourself standing at the side of a road, wanting to cross it. A car approaches from the left. The only thing you need to know about this car is the general information provided by the category: it will run you over if you don't wait until it has passed. You don't need to care about the car's color, number of doors and so on. If you were not able to immediately assign the car to the category "car" and infer the necessity to step back, you would get hit, because you would still be busy examining the details of that specific and unknown car. Categorisation has therefore proved very helpful for survival during evolution, and it allows us to navigate our environment quickly and efficiently.
Definitional Approach
Take a look at the following picture! You will see four different kinds of cars. They differ in shape, color and other features, nonetheless you are probably sure that they are all cars.
What makes us so convinced about the identity of these objects? Maybe we can try to find a definition that describes all of these cars. Do all of them have four wheels? No, some have only three. Do all cars run on petrol? No, that is not true for all cars either. Apparently we will fail to come up with a definition. The reason for this failure is that a definition has to generalise over all members. That might work for geometrical objects, but obviously not for natural things. Members of a natural category do not all share exactly the same features, which makes it problematic to find an appropriate definition. There are, however, similarities between members of one category, so what about this similarity? The famous philosopher and linguist Ludwig Wittgenstein asked himself this question and claimed to have found a solution. He developed the idea of family resemblance: members of a category resemble each other in several ways. For example, cars differ in shape, color and many other properties, but every car somehow resembles other cars. The following two approaches determine categories by similarity.
Prototype Approach
The prototype approach was proposed by Rosch in 1973. A prototype is an average of all members of a particular category; it is not an actual, existing member of the category. Even very different features of members within one category can be explained by this approach. Different degrees of prototypicality represent differences among category members. Members which resemble the prototype very strongly are high-prototypical. Members which differ from the prototype in many ways are low-prototypical. There seem to be connections to the idea of family resemblance, and indeed some experiments showed that high prototypicality and high family resemblance are strongly connected. The typicality effect describes the fact that high-prototypical members are recognised faster as members of a category. For example, participants had to decide whether statements like “A penguin is a bird.” or “A sparrow is a bird.” are true. Their decisions were much faster for “sparrow”, a high-prototypical member of the category “bird”, than for an atypical member such as “penguin”. Participants also tend to prefer prototypical members of a category when asked to list objects of a category. In the bird example, they list “sparrow” rather than “penguin”, which is a quite intuitive result. In addition, high-prototypical objects are more strongly affected by priming than low-prototypical ones.
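The idea of a prototype as an average, and of typicality as closeness to that average, can be illustrated with a minimal sketch. The three features and all feature values below are invented for illustration, and Euclidean distance is only one assumed way of measuring resemblance; none of this is taken from Rosch's experiments.

```python
from math import dist

# Hypothetical feature vectors: (flies, sings, small_body), each between 0 and 1.
birds = {
    "sparrow":   (1.0, 1.0, 1.0),
    "robin":     (1.0, 1.0, 1.0),
    "blackbird": (1.0, 1.0, 1.0),
    "eagle":     (1.0, 0.0, 0.0),
    "penguin":   (0.0, 0.0, 0.0),
}

def prototype(members):
    """Average each feature over all members of the category."""
    vectors = list(members.values())
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def typicality_ranking(members):
    """Order members from most to least typical (closest to the prototype first)."""
    proto = prototype(members)
    return sorted(members, key=lambda name: dist(members[name], proto))

bird_prototype = prototype(birds)
ranking = typicality_ranking(birds)
print(bird_prototype)
print(ranking)
```

With these invented values the prototype does not coincide with any stored bird, and "sparrow" ends up at the front of the ranking while "penguin" ends up at the back, which mirrors the distinction between high-prototypical and low-prototypical members.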
Exemplar Approach
The typicality effect can also be explained by a third approach, which is concerned with exemplars. Like a prototype, an exemplar is a very typical member of the category. The difference is that exemplars are actual members of a category that a person has encountered in the past. The approach still involves comparing an object to a standard, but here the standard consists of many stored examples, each one called an exemplar, rather than a single average.
Again we can show the typicality effect: objects that are similar to many examples we have encountered are classified faster than objects that are similar to few examples. You have seen a sparrow more often in your life than a penguin, so you should recognise the sparrow faster.
For both the prototype and the exemplar approach there are experiments whose results support that approach. Some people claim that the exemplar approach has fewer problems with variable categories and with atypical cases within categories. For example, the category “games” is quite difficult to capture with the prototype approach: how would you find an average case for all games, such as football, golf and chess? The reason the exemplar approach copes better could be that “real” category members are used and that all the information about the individual exemplars, which can be useful when encountering other members later, is stored. Another point on which the approaches can be compared is how well they work for categories of different sizes. The exemplar approach seems to work better for smaller categories, and prototypes do better for larger categories.
Some researchers concluded that people may use both approaches: when we initially learn about a category, we average the exemplars we have seen into a prototype. It would hinder early learning if we had to take into account right away which exceptions a category has. As we get to know some of these exemplars in more detail, the exemplar information becomes stronger.
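The exemplar idea can be sketched in the same way: instead of averaging, every encountered instance is kept, and a new object is compared with all of them. The stored instances, the feature values and the exponential similarity function below are assumptions made for illustration, not a model taken from the text.

```python
from math import dist, exp

# Hypothetical memory of encountered birds: many sparrow-like ones, one penguin-like one.
# Features: (flies, sings, small_body); all values are invented.
stored_birds = [
    (1.0, 1.0, 1.0), (1.0, 0.9, 1.0), (0.9, 1.0, 0.9), (1.0, 1.0, 0.8),
    (0.0, 0.0, 0.0),
]

def similarity(a, b, sensitivity=2.0):
    """Similarity decays exponentially with distance (an assumed form)."""
    return exp(-sensitivity * dist(a, b))

def summed_similarity(item, exemplars):
    """Evidence for category membership: total similarity to all stored exemplars."""
    return sum(similarity(item, e) for e in exemplars)

sparrow_evidence = summed_similarity((1.0, 1.0, 0.9), stored_birds)
penguin_evidence = summed_similarity((0.0, 0.1, 0.0), stored_birds)
print(sparrow_evidence, penguin_evidence)
```

Because the sparrow-like object resembles many stored exemplars and the penguin-like object resembles only one, the first receives more evidence, which is one way to read the typicality effect in exemplar terms.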
“We know generally what cats are (the prototype), but we know specifically our own cat the best (an exemplar).” (Minda & Smith, 2001)
Hierarchical Organization of Categories
Now that we know about the different approaches of how we go about forming categories, let us look at the structure of a category and the relationship between categories. The basic idea is that larger categories can be split up into more specific and smaller ones.
Rosch stated that this process creates three levels of categorization: the superordinate level (for example “animal”), the basic level (for example “dog”) and the subordinate level (for example “retriever”).
It is interesting that a lot of information is lost when moving from the basic up to the superordinate level, whereas comparatively little is gained when moving from the basic down to the subordinate level. Scientists wanted to find out whether one of these levels is preferred over the others. They asked participants to name presented objects as quickly as possible. The result was that the subjects tended to use the basic-level name, which includes the optimal amount of stored information. Therefore a picture of a retriever would be named “dog” rather than “animal” or “retriever”. It is important to note that the levels are different for each person, depending on factors such as expertise and culture.
One factor which influences our categorization is knowledge itself. Experts pay more attention to specific features of objects in their area than non-experts do. For example, when shown pictures of birds, bird experts tend to give the subordinate name (blackbird, sparrow), while non-experts just say "bird". The basic level in an expert's area of interest is lower than the basic level of a layperson. Therefore people's knowledge and experience affect categorization.
Another factor is culture. Imagine a people living in close contact with their natural environment, who therefore have a greater knowledge about plants and the like than, for example, students in Germany. If you ask the latter what they see in nature, they use the basic level ‘tree’, and if you give the same task to the people closer to nature, they will tend to answer in terms of lower-level concepts such as ‘oak tree’.
Representation of Categories in the Brain
There is evidence that some areas in the brain are selective for different categories, but it is not very probable that there is a corresponding brain area for each category. Results of neurophysiological research point to a kind of double dissociation for living and non-living things. Evidence has been found in fMRI studies that they are indeed represented in different brain areas. It is important to note that there is nevertheless much overlap between the brain areas activated by different categories. Moreover, at the level of single neurons there is a connection to mental categories, too. There seem to exist neurons which respond better to objects of a particular category, so-called “category-specific neurons”. These neurons fire in response not only to one object but to many objects within one category. This leads to the idea that many neurons probably fire when a person recognises a particular object, and that the combined firing patterns of these neurons may represent the object.
Semantic Networks
The "Semantic Network approach" proposes that concepts of the mind are arranged in networks, in other words, in a functional storage system for the "meanings" of words. Of course, the concept of a semantic net is very flexible. In a graphical illustration of such a semantic net, concepts of our mental dictionary are represented by nodes, each of which in this way represents a piece of knowledge about our world.
The properties of a concept could be placed, or "stored", next to a node representing that concept. Links between the nodes indicate the relationship between the objects. The links can not only show that there is a relationship, they can also indicate the kind of relation by their length, for example.
Every concept in the net is in a dynamic relationship with other concepts, which may have prototypically similar characteristics or functions.
Collins and Quillian's Model
Semantic Network according to Collins and Quillian with nodes, links, concept names and properties.
One of the first scientists who thought about structural models of human memory that could be run on a computer was Ross Quillian (1967). Together with Allan Collins, he developed the Semantic Network with related categories and with a hierarchical organisation.
In the picture on the right-hand side, Collins and Quillian's network with added properties at each node is shown. As already mentioned, the skeleton nodes are interconnected by links. At the nodes, concept names are added. As in the section "Hierarchical Organization of Categories", general concepts are at the top and more particular ones at the bottom. By looking at the concept "car", one gets the information that a car has four wheels, has an engine and has windows, and furthermore that it moves around, needs fuel and is manmade.
These pieces of information must be stored somewhere. It would take too much space if every detail had to be stored at every level. So the information about cars in general is stored at the basic level, and further information about specific cars, e.g. a BMW, is stored at the lower level, where you do not need to store the fact that the BMW also has four wheels if you already know that it is a car. This way of storing shared properties at a higher-level node is called cognitive economy.
To avoid redundancy, Collins and Quillian conceived of this as an information inheritance principle. Information that is shared by several concepts is stored at the highest parent node to which it applies. All child nodes below that node can then also access the information about these properties. However, there are exceptions. Sometimes a special car has three wheels rather than four. This specific property is stored at the child node.
The logical structure of the network is convincing, since it can be shown that the time needed to retrieve a concept correlates with the distances in the network. This correlation is supported by the sentence-verification technique. In experiments, participants had to respond to statements about concepts with "yes" or "no". It did indeed take longer to say "yes" if the nodes bearing the concepts were further apart.
The phenomenon that adjacent concepts are activated is called spreading activation. These concepts are far more easily accessed in memory; they are "primed". This was studied and supported by David Meyer and Roger Schvaneveldt (1971) with a lexical-decision task. Participants had to decide whether both items of a pair were words or whether the pair contained a non-word. They were faster at confirming real word pairs if the concepts of the two words were close together in the assumed network.
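The inheritance principle and the role of link distance can be made concrete with a minimal sketch of a hierarchical network. The class name `Node`, the node "vehicle", the property names and the placement of properties on particular nodes are hypothetical choices for illustration; they are not read off Collins and Quillian's figure.

```python
class Node:
    """A concept node with locally stored properties and a link to its parent."""

    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.properties = properties

    def lookup(self, prop):
        """Search upward through the parents; return (value, links_traversed).

        Returns (None, links_traversed) if no node on the path stores the property.
        """
        node, links = self, 0
        while node is not None:
            if prop in node.properties:
                return node.properties[prop], links
            node, links = node.parent, links + 1
        return None, links

# Shared properties are stored once, at the highest node they apply to.
vehicle = Node("vehicle", moves_around=True, manmade=True)
car = Node("car", parent=vehicle, wheels=4, has_engine=True)
bmw = Node("BMW", parent=car, brand="BMW")
# An exception overrides the inherited value at the child node.
three_wheeler = Node("three-wheeler", parent=car, wheels=3)

print(bmw.lookup("wheels"))            # inherited from "car"
print(three_wheeler.lookup("wheels"))  # stored locally
print(bmw.lookup("manmade"))           # inherited from "vehicle"
```

In this sketch the number of links traversed stands in for retrieval time: a property stored two levels up takes more steps to reach than one stored at the node itself, which is the kind of relationship the sentence-verification experiments were designed to test.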
While having the ability to explain many questions, the model has some flaws.
The typicality effect is one of them. It is known that "reaction times for more typical members of a category are faster than for less typical members" (MITECS). This contradicts the assumption of Collins and Quillian's model that the distance in the net is responsible for reaction time. It was also determined experimentally that some properties are stored at specific nodes rather than only at the highest applicable node, which calls cognitive economy into question. Furthermore, there are examples of faster concept retrieval although the distances in the network are longer.
These points led to another version of the Semantic Network approach: the Collins and Loftus model.
Collins and Loftus Model
Collins and Loftus (1975) tried to overcome these problems by using shorter or longer links depending on relatedness, and by adding interconnections between concepts that were formerly not directly linked. The former hierarchical structure was also replaced by a more individual structure that reflects a person's experience, to name just a few of the extensions. As shown in the picture on the right, the new model represents interpersonal differences, such as those acquired during a human's lifespan. They manifest themselves in the layout and in the various lengths of the links between the same concepts.
An example: The concept "vehicle" is connected to car, truck or bus by short links, and to fire engine or ambulance with longer links.
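A network with links of different lengths, and activation that spreads along them, might be sketched as follows. The numeric link lengths, the extra concept "red", the decay factor and the rule that activation falls off with the shortest path length are all assumptions for illustration; the model itself does not specify such numbers.

```python
import heapq

# Hypothetical link lengths (shorter = more closely related); values are invented.
links = {
    ("vehicle", "car"): 1.0,
    ("vehicle", "truck"): 1.0,
    ("vehicle", "bus"): 1.0,
    ("vehicle", "fire engine"): 3.0,
    ("vehicle", "ambulance"): 3.0,
    ("fire engine", "red"): 1.0,
}

def neighbours(concept):
    """Yield (other_concept, length) for every link touching the concept."""
    for (a, b), length in links.items():
        if a == concept:
            yield b, length
        elif b == concept:
            yield a, length

def spread_activation(start, decay=0.5):
    """Activation reaching each concept: decay ** (shortest path length from start)."""
    distance = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, concept = heapq.heappop(queue)
        if d > distance[concept]:
            continue  # stale queue entry, a shorter path was already found
        for other, length in neighbours(concept):
            if d + length < distance.get(other, float("inf")):
                distance[other] = d + length
                heapq.heappush(queue, (d + length, other))
    return {concept: decay ** d for concept, d in distance.items()}

activation = spread_activation("vehicle")
for concept, level in sorted(activation.items(), key=lambda item: -item[1]):
    print(f"{concept}: {level}")
```

Under these assumptions "car", "truck" and "bus" are primed more strongly by "vehicle" than "fire engine" or "ambulance". Changing the invented link lengths changes the result, which also illustrates why the flexibility of such a model is hard to test.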
After these enhancements, the model is so powerful that some researchers criticised it for being too flexible. In their opinion, the model is no longer a scientific theory, because it cannot be falsified. Furthermore, we do not know how long these links actually are in a given person. How could they be measured, and can they be measured at all?
Connectionist Approach
Every concept in a semantic net is in a dynamic relationship with other concepts, which can have prototypically similar characteristics or functions. The neural networks in the brain are organised similarly. Furthermore, it is useful to include the features of “spreading activation” and “parallel distributed activity” in such a semantic net in order to account for the complexity of our environment.
Basic Principles of Connectionism
The connectionists did this by modeling their networks after neural networks in the nervous system. Every node of the diagram represents a neuron-like processing unit. These units can be divided into three subgroups: input units, which become activated by stimulation from the environment; hidden units, which receive signals from input units and pass them on to output units; and output units, which show a pattern of activation that represents the initial stimulus. Excitatory and inhibitory connections between units, just like synapses in the brain, allow the input to be analyzed and evaluated. To compute the outcome of such a system, a certain “weight” is attached to each connection, which determines how strongly an incoming signal activates the next unit.
It needs to be emphasized that connectionist networks are not models of how the nervous system works. The approach of connectionist networks is a hypothetical approach to represent categories in network patterns. Another name for the connectionist approach is Parallel Distributed Processing approach, for short PDP, since processing takes place in parallel lines and the output is distributed across many units.
Operation of Connectionist Networks
First a stimulus is presented to the input units. Then the links pass the signal on to the hidden units, which distribute the signal to the output units via further links. In the first trial, the output units show a wrong pattern. After many repetitions, the pattern is finally correct. This is achieved by back propagation: the error signals are sent back to the hidden units and the signals are reprocessed. During these repeated trials, the “weights” of the connections are gradually adjusted according to the error signals until the correct output pattern is produced. After having achieved a correct pattern for one stimulus, the system is ready to learn a new concept.
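The training cycle described above (forward pass, error signal, weight adjustment, repetition) can be sketched with a very small network in plain Python. The network size, learning rate, number of repetitions, sigmoid activation and the toy stimulus-to-pattern pairs are all assumptions chosen to keep the example short; it is a sketch of back propagation in general, not of any particular published model.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train(patterns, n_hidden=3, epochs=2000, rate=0.5, seed=0):
    """Train a tiny input-hidden-output network; return the summed error per epoch."""
    rng = random.Random(seed)
    n_in, n_out = len(patterns[0][0]), len(patterns[0][1])
    # Weights start out random; the last entry in each row is a bias weight.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_output = [[rng.uniform(-1, 1) for _ in range(n_hidden + 1)] for _ in range(n_out)]
    errors = []
    for _ in range(epochs):
        total = 0.0
        for inputs, targets in patterns:
            # Forward pass: input units -> hidden units -> output units.
            x = list(inputs) + [1.0]
            hidden = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
            h = hidden + [1.0]
            outputs = [sigmoid(sum(w * v for w, v in zip(row, h))) for row in w_output]
            # Error signal at the output units.
            d_out = [(t - o) * o * (1 - o) for t, o in zip(targets, outputs)]
            # Error signal sent back to the hidden units.
            d_hid = [hidden[j] * (1 - hidden[j])
                     * sum(d_out[k] * w_output[k][j] for k in range(n_out))
                     for j in range(n_hidden)]
            # Adjust the weights in proportion to the error signals.
            for k in range(n_out):
                for j in range(n_hidden + 1):
                    w_output[k][j] += rate * d_out[k] * h[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += rate * d_hid[j] * x[i]
            total += sum((t - o) ** 2 for t, o in zip(targets, outputs))
        errors.append(total)
    return errors

# Hypothetical stimulus -> target-pattern pairs (logical OR of two input units).
patterns = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (1,))]
errors = train(patterns)
print("error in first repetition:", errors[0])
print("error in last repetition: ", errors[-1])
```

Comparing the first and last entries of `errors` shows the output pattern moving from wrong towards correct as the weights are adjusted over many repetitions.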
Evaluating Connectionism
The PDP approach is important for knowledge representation studies. It is far from perfect, but it is being developed further. The process of learning enables the system to make generalizations, because similar concepts create similar patterns. After learning one car, the system can recognize similar patterns as other cars, or may even predict what other cars look like. Furthermore, the system is protected against total breakdown. Damage to single units will not cause the whole system to fail, but will only disrupt those patterns that use the damaged units. This is called graceful degradation, and it is often found in patients with brain lesions. These two arguments lead to a third: PDP networks are organized similarly to the human brain. Some effective computer programs have been developed on this basis that were able to predict the consequences of human brain damage.
On the other hand, the connectionist approach is not without problems. Previously learned concepts can be overwritten by new concepts. In addition, PDP cannot explain processes more complex than learning concepts. Nor can it explain the phenomenon of rapid learning, which does not require extensive training. It is assumed that rapid learning takes place in the hippocampus, and that conceptual and gradual learning is located in the cortex.
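Generalization and graceful degradation both follow from the fact that a concept is a pattern spread over many units. The sketch below is deliberately simplified: it uses hand-written activation patterns and nearest-pattern matching rather than a trained network, and the patterns, the unit count and the function names are invented for illustration.

```python
# Hypothetical activation patterns over eight units; the values are invented.
stored_patterns = {
    "car":   [1, 1, 0, 0, 1, 0, 1, 0],
    "truck": [1, 1, 0, 0, 1, 1, 0, 0],
    "dog":   [0, 0, 1, 1, 0, 1, 0, 1],
}

def overlap(a, b):
    """Number of units on which two patterns agree."""
    return sum(1 for x, y in zip(a, b) if x == y)

def recognise(activity):
    """Return the stored concept whose pattern best matches the activity."""
    return max(stored_patterns, key=lambda name: overlap(stored_patterns[name], activity))

def lesion(pattern, damaged_units):
    """Silence the given units, as if they had been damaged."""
    return [0 if i in damaged_units else v for i, v in enumerate(pattern)]

damaged_car = lesion(stored_patterns["car"], {0, 1})
new_car = [1, 1, 0, 0, 1, 0, 1, 1]  # a car-like pattern that was never stored

print(recognise(damaged_car))
print(recognise(new_car))
```

With two of the eight units silenced, the remaining activity is still closest to the "car" pattern, and a slightly different car-like pattern is also matched to "car". Both outcomes depend on the invented patterns, but they show why distributed patterns degrade gradually instead of failing all at once.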
In conclusion, the PDP approach can explain some features of knowledge representation very well but fails for some complex processes.
Mental Representation
There are different theories on how living beings, especially humans, encode information as knowledge. We may think of diverse mental representations of the same object. When we read the written word "car", we call this a discrete symbol. It matches all imaginable cars and is therefore not bound to a particular vehicle. It is an abstract, or amodal, representation. This is different if we see a picture of a car instead. It might be a red sports car. Now we speak of a non-discrete symbol, an imaginable picture that appears in front of our inner eye and that fits only certain cars of sufficiently similar appearance.
Propositional Approach
The Propositional Approach is one possible way to model mental representations in the human brain. It works with discrete symbols which are strongly connected with one another. The usage of discrete symbols necessitates clear definitions of each symbol, as well as information about the syntactic rules and the context dependencies in which the symbols may be used. The symbol "car" is only comprehensible to people who understand English, have seen a car before and therefore know what a car is. The Propositional Approach is an explicit way to explain mental representation.
Definitions of propositions differ between fields of research and are still under discussion. One possibility is the following: “Traditionally in philosophy a distinction is made between sentences and the ideas underlying those sentences, called propositions. A single proposition may be expressed by an almost unlimited number of sentences. Propositions are not atomic, however; they may be broken down into atomic concepts called ‘Concepts’.”
In addition, mental propositions deal with the storage, retrieval and interconnection of information as knowledge in the human brain. There is an ongoing debate about whether the brain really works with propositions, or whether it processes information into and out of knowledge in another way, or perhaps in more than one way.
Imagery Approach
One possible alternative to the Propositional Approach is the Imagery Approach. Since the representation of knowledge is understood here as the storage of images as we see them, it is also called the analogical or perceptual approach. In contrast to the Propositional Approach, it works with non-discrete symbols and is modality specific. It is an implicit approach to mental representation. The picture of the sports car implicitly includes seats of some kind. If it is additionally mentioned that they are off-white, the image changes to a more specific one. How two non-discrete symbols are combined is not as predetermined as it is for discrete symbols. The picture of the off-white seats may exist without the red car around it, just as the red car existed before without the off-white seats. The Imagery and the Propositional Approaches are also discussed in chapter 8.