Welcome to the Symposium on Connectionism and Psychology.
The agenda of the symposium is shown above. Each speaker's talk will last about 25 minutes, with about 5 minutes after each talk for questions from the audience. Because of the long duration and topical breadth of this symposium, no formal group discussion among the speakers is scheduled. However, you are invited to address questions to more than one speaker during the question periods, and the speakers are invited to discuss questions initially answered by other speakers.
I hope that you will come away from this symposium impressed by the ability of connectionist models to address detailed psychological data in a variety of domains, from vision to language. Just as importantly, I hope this symposium emphasizes the psychological and neural principles that are embodied in the models and that carry the models' explanatory force. Across the variety of topical applications, I hope you will see a unity of explanatory principles and of their formal expression in connectionist models.
In organizing this symposium, it was very difficult to select only a few speakers from a burgeoning array of researchers worldwide. Entire conferences could be devoted just to the topic of this morning's symposium. No doubt there are many people in the audience today who could also be giving talks in this symposium. Nevertheless, our speakers represent applications of connectionist models to a broad range of topics in cognitive psychology, and I am very pleased indeed that the other speakers have agreed to be here today. I am eager to hear their talks.
This research has been supported by an NIMH FIRST Award, an Indiana University Summer Faculty Fellowship, the Indiana University Cognitive Science Program, and an NIH Biomedical Research Support Grant.
Outline of presentation:
The first part of my presentation today will describe how psychological principles can be incorporated into connectionist models of human learning. I will describe several psychological principles, such as dimensional attention, illustrating each with an empirical effect that the principle accounts for, and further buttressing each principle by giving it a formal expression in a connectionist model that fits the data.
The principles are expressions of a common theme that I will discuss in the second part of my presentation, in which I will draw some general conclusions for connectionist learning. These conclusions are (a) Vanilla regression engines are not enough to simulate human learning. By a ``vanilla regression engine'' I mean a learning algorithm that ``looks at'' the entire stimulus pattern and attempts gradually to extract statistical regularities. Backpropagation is the best-known case of such a vanilla regression engine, but many other algorithms fall into this class. Humans, by contrast, operate under important constraints beyond merely having their predictions eventually match the environment: people must learn quickly, with limited working memory, often from only a few occurrences, and without damaging previously learned knowledge. The flip side of this conclusion is (b) People learn about what they attend to, and what they attend to protects and builds on what they already know. Vanilla regression mechanisms do not incorporate such shifts of attention to protect and utilize what is already learned. This is the ``take-home message'' that I will reiterate throughout the talk.
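To make this concrete, here is a minimal sketch, in Python, of what I mean by a vanilla regression engine: a single-layer network trained by gradient descent on squared error, grinding down its prediction error over many exposures to the entire stimulus pattern. The function and variable names are my own illustrative choices, not any particular published model.

    import numpy as np

    def train_vanilla_regression(stimuli, targets, lr=0.1, epochs=100):
        # Gradient descent on squared error over the whole stimulus pattern.
        # Every input dimension participates equally in learning; nothing is
        # selectively attended, and nothing previously learned is protected.
        n_in, n_out = stimuli.shape[1], targets.shape[1]
        w = np.zeros((n_in, n_out))
        for _ in range(epochs):
            for x, t in zip(stimuli, targets):
                y = x @ w                      # prediction from the entire pattern
                w += lr * np.outer(x, t - y)   # delta rule: gradual error reduction
        return w

Backpropagation generalizes this scheme to multiple layers, but it retains the same character: slow, unselective reduction of prediction error.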
Let's first consider the psychological principle that people can learn to attend to relevant dimensions when this is useful. That is, people can improve the efficiency of learning by ignoring irrelevant features or dimensions and attending to relevant ones.
As an example of this ability, I will briefly review the filtration vs. condensation experiment I described in a 1993 article in Connection Science.
[... review of Exp.1, filtration vs. condensation, from Kruschke (1993), Connection Science ...]
What I wished to highlight in this demonstration was that people can shift attention to particular features or dimensions when this is useful for accelerating learning. Some connectionist learning algorithms do not incorporate this characteristic, but they should if they are to model human learning.
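To show how dimensional attention can be given formal expression, here is a Python sketch in the spirit of an ALCOVE-style exemplar network: similarity to stored exemplars is computed with learned attention strengths on each dimension, and the attention strengths themselves descend the error gradient, growing on relevant dimensions and shrinking on irrelevant ones. The parameter values and names here are illustrative assumptions, not the published parameterization.

    import numpy as np

    def exemplar_activations(x, exemplars, attention, c=1.0):
        # Similarity of stimulus x to each stored exemplar, with each
        # dimension's mismatch scaled by its attention strength
        # (city-block metric).
        dists = np.abs(exemplars - x) @ attention
        return np.exp(-c * dists)

    def trial(x, target, exemplars, attention, w, lr_w=0.2, lr_a=0.2, c=1.0):
        a = exemplar_activations(x, exemplars, attention, c)
        err = target - w @ a
        # Negative error gradient w.r.t. attention: descent shifts attention
        # away from dimensions whose mismatches cause error.
        da = -c * np.abs(exemplars - x).T @ ((w.T @ err) * a)
        attention = np.maximum(0.0, attention + lr_a * da)
        w += lr_w * np.outer(err, a)           # associative learning
        return attention, w

In a filtration structure, attention can concentrate on the single relevant dimension and learning speeds up accordingly; in a condensation structure no such concentration is possible, and learning stays slow.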
People not only learn to attend to relevant dimensions; they also learn to attend to relevant representational forms. This can be illustrated by considering a situation in which much of the stimulus space can be correctly classified by a simple rule, but in which there are a few scattered exceptions to the rule. The challenge for a person learning such a structure is to take advantage of the rule, for rapid learning and generalization, while simultaneously learning the exceptions properly. The learner must acquire not only the correct classifications, but also knowledge of when to attend to rules and when to attend to exemplars in memory.
Let me illustrate this concretely with work done in collaboration with a graduate student named Michael Erickson. We have reported aspects of this work at the Cognitive Science Conferences of 1994 and 1996, and in an article in preparation.
[... review of Exp.1 of Erickson & Kruschke (1996), Cognitive Science Society Conference ...]
The central message of this demonstration was that people can shift attention toward or away from particular representational formats when this is useful for accelerating new learning and protecting previous learning within representational formats. In particular, people can learn to shift attention away from rules in order to learn exceptional cases. This protects prior learning of the rule and enhances learning of the exception.
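Formally, this competition can be expressed as a gated mixture of modules: a rule module and an exemplar module each propose a classification, and a learned gate allocates attention between them, stimulus by stimulus. Here is a simplified Python sketch of the gating idea; the modules shown are deliberately minimal stand-ins of my own devising, not the full model we reported.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def gated_classification(x, rule_w, exemplars, ex_w, attention, gate_w, c=1.0):
        # Rule module: a simple linear boundary on the stimulus dimensions.
        p_rule = sigmoid(rule_w @ x)
        # Exemplar module: attention-weighted similarity to stored exemplars.
        sims = np.exp(-c * (np.abs(exemplars - x) @ attention))
        p_exem = sigmoid(ex_w @ sims)
        # Learned gate: how much attention the rule module receives
        # for this particular stimulus.
        g = sigmoid(gate_w @ x)
        return g * p_rule + (1.0 - g) * p_exem

Because the gate, like the other weights, descends the error gradient, exceptional stimuli come to route their own processing to the exemplar module, and the rule weights are thereby protected from corruption by the exceptions.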
Not only can people shift attention to preserve associations involving particular features or representations, they can also preserve prior learning of the categories themselves, rapidly shifting the correspondence between internal categories and overt responses. I will illustrate this with some research in press at Connection Science.
[... review of Kruschke (1996), Connection Science ...]
The main point of this example has been that people are very good at preserving and building upon prior learning whenever they can, in particular when a new situation merely re-maps previously learned categories to new responses. Connectionist models of human learning should also incorporate these characteristics.
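Architecturally, the idea can be sketched as a network with a slowly learned category layer and a small, quickly relearnable mapping from categories to responses: when the task merely re-assigns responses, only the mapping layer needs to change. The layer names and learning rates below are my own illustrative assumptions, not the published model.

    import numpy as np

    def respond(x, cat_w, map_w):
        # Stimulus -> learned category layer -> response layer.
        cats = np.tanh(cat_w @ x)      # slowly learned category representation
        return map_w @ cats            # quickly relearnable response mapping

    def remap_trial(x, target, cat_w, map_w, lr_map=0.5):
        # After responses are re-assigned, adapt only the category-to-response
        # mapping, leaving prior stimulus-to-category learning intact.
        cats = np.tanh(cat_w @ x)
        err = target - map_w @ cats
        map_w += lr_map * np.outer(err, cats)
        return map_w

Because the category layer is untouched, relearning after a re-mapping is fast, just as it is for human learners.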
As a final, and most important, example of how people preserve and build upon prior learning, I will describe rapid attention shifts in the inverse base rate effect.
[... review of Kruschke (1996), JEP:LMC ...]
The central message of this example was that people can rapidly shift attention across features on a single trial, in order to protect prior learning and facilitate new learning on that trial. Connectionist models of human learning should incorporate this characteristic!
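Schematically, the mechanism runs in two phases within each trial, sketched below in Python with my own simplifications: first attention is rapidly shifted, in many small steps within the one trial, away from features that generate error; only then are the associative weights updated, using the shifted attention.

    import numpy as np

    def trial_with_rapid_shift(x, target, w, attention,
                               n_shift_steps=10, lr_att=0.5, lr_w=0.1):
        # Phase 1: rapidly shift attention to reduce error on this trial.
        # Shifting attention away from features already committed to other
        # outcomes protects their prior associations from being unlearned.
        for _ in range(n_shift_steps):
            err = target - w @ (attention * x)
            attention = np.maximum(0.0, attention + lr_att * (w.T @ err) * x)
        # Phase 2: associative learning with the shifted attention in place.
        err = target - w @ (attention * x)
        w += lr_w * np.outer(err, attention * x)
        return attention, w

In the inverse base-rate effect, for example, attention shifts away from the feature already committed to the common outcome, so the rare outcome's distinctive feature carries the burden of the new learning.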
Now to some general conclusions: First, vanilla regression engines are not enough to model human learning. Again, by a ``vanilla regression engine'' I mean a statistical algorithm that unselectively tries to learn about the entire stimulus pattern, gradually improving its predictions. Such generic algorithms might describe any organism's cognition. Backpropagation, for example, might be used to model learning in organisms from insects to humans. Yet no amount of talking to a mosquito will get it to talk back. Mere exposure to structured input is not, generally, enough to produce human-like learning.
I am not claiming that a regression engine such as backpropagation, combined with attention shifting, is sufficient to model human learning. Rather, the claim is that a variety of forms of attention shifting are necessary to model human learning.
This general theme has many precedents in the psychological literature: in animal learning research, in developmental psychology, in human memory research, and so on. What is new here, I think, is the specific realization of this theme in the variety of connectionist architectures I have shown you, along with the general reminder to connectionist modelers that attention shifting is critical in human learning.
Human learning operates under constraints of limited time and limited capacity. Unlike generic regression engines, people do not just absorb all aspects of a stimulus indiscriminately. Such an approach to learning and memory is very costly in terms of retroactive interference with prior learning and proactive interference with new learning. Instead, human learning incorporates mechanisms to protect prior learning and facilitate new learning. In summary, people learn about what they attend to, and what they attend to protects and builds on what they already know.