Connectionist researchers freely admit that it is extremely difficult to determine how connectionist networks accomplish the task that they have been taught. "One thing that connectionist networks have in common with brains is that if you open them up and peer inside, all you can see is a big pile of goo" (Mozer & Smolensky, 1989, p.3). Similarly, Seidenberg (1993, p.229) states that "if the purpose of simulation modeling is to clarify existing theoretical constructs, connectionism looks like exactly the wrong way to go. Connectionist models do not clarify theoretical ideas, they obscure them." As a result, connectionists are reluctant to analyse the internal structure of their networks in an attempt to determine what algorithm lies within.
Unfortunately, this reluctance has raised serious doubts concerning the ability of connectionists to provide fruitful theories about cognitive processing. Because researchers rarely understand the internal workings of PDP models, McCloskey (1991) suggested that "connectionist networks should not be viewed as theories of human cognitive functions, or as simulations of theories, or even as demonstrations of specific theoretical points" (p.387).
Much of the research at the Biological Computation Project at the University of Alberta aims to rectify this situation. We are focusing our research efforts on the interpretation of a particular type of PDP architecture--networks of value units--in an attempt to maximize their contribution to cognitive science. A value unit network is an extension of the standard backpropagation network that uses a nonmonotonic (i.e., Gaussian) activation function in its hidden and output layers instead of a monotonic (e.g., logistic) function (see Dawson & Schopflocher, 1992). Value unit networks have several advantages over standard backpropagation networks, including faster learning of linearly inseparable classes, better generalization, and better "scaling up" properties (e.g., Dawson & Schopflocher, 1992; Dawson & Shamanski, 1994; Medler & Dawson, 1994). Furthermore, value units have a characteristic that makes them very straightforward to interpret.
Characterizing Hidden Units with Jittered Density Plots
Consider using a set of patterns to train a network of value units. After training, one could again present each pattern to the network and record the activity that each pattern produced in each hidden unit. This amounts to "wiretapping" each hidden unit while the stimulus set is being presented. One could use this information to create a jittered density plot for each hidden unit (e.g., Figure 1).
The jittered density plots in Figure 1 illustrate the result of wiretapping a network that consisted of four different hidden units. In each of the density plots, a single dot is used to represent the activity in that hidden unit produced by presenting one pattern to the network. The density plots in Figure 1 were produced in a network trained on 8124 different input patterns; consequently, each density plot is composed of 8124 different dots. The horizontal position of each plotted point along the x-axis represents the activation produced by one of the training patterns, whereas the y-axis represents a random jittering introduced to prevent points from overlapping (Chambers, Cleveland, Kleiner, & Tukey, 1983, pp. 19-21).
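To make the wiretapping procedure concrete, the following minimal Python sketch builds jittered density plots from an array of recorded hidden unit activations. The array name, the random placeholder data, and the use of NumPy and Matplotlib are our own illustrative assumptions, not part of the original simulations.

import numpy as np
import matplotlib.pyplot as plt

def jittered_density_plot(activations, ax, title):
    # One dot per training pattern: x = hidden unit activation,
    # y = uniform random jitter so overlapping points stay visible.
    jitter = np.random.uniform(0.0, 1.0, size=len(activations))
    ax.scatter(activations, jitter, s=1)
    ax.set_xlim(0.0, 1.0)   # value unit activations lie between 0 and 1
    ax.set_yticks([])       # the vertical position carries no information
    ax.set_xlabel("activation")
    ax.set_title(title)

# 'hidden_acts' stands in for the (8124 x 4) array of activations recorded
# by wiretapping the trained network; random data is used here only so the
# sketch runs on its own.
hidden_acts = np.random.rand(8124, 4)
fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for h, ax in enumerate(axes):
    jittered_density_plot(hidden_acts[:, h], ax, "Hidden Unit %d" % h)
plt.tight_layout()
plt.show()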
Berkeley, Dawson, Medler, Schopflocher, and Hornsby (1995) found that in many cases, the density plots for hidden value units are highly structured. Density plots of value unit activations typically reveal distinct "bands," as is the case for the four density plots in Figure 1. (Berkeley et al., 1995, did not find that this structure was typical of density plots for standard backpropagation networks using the logistic activation function.)
This "banding" provides an important method for interpreting the kinds of features to which a hidden unit is sensitive. Each band in such a density plot supports a coherent interpretation: each pattern that falls into a band is characterized by definite features--a specific feature or set of features. These definite features can be quickly identified by calculating simple descriptive statistics. To illustrate the utility of identifying definite features, let us consider an example problem.
The Mushroom Problem
Providing a biological classification of a mushroom (i.e., correctly identifying a mushroom's genus and species) requires paying attention to a wide variety of features, some that are apparent from a casual inspection, others that are apparent only after microscopic examination. Box 1 provides an algorithm for correctly classifying all mushrooms within Schlimmer's (1987) data set.
Schlimmer's data set consisted of hypothetical descriptions of 23 different species of mushroom in the Agaricus and Lepiota family (see Lincoff, 1981, pp. 500-525). Each mushroom was described by a set of 21 different features. Multiple featural descriptions of one species of mushroom were possible because one species might be found in several different habitats, have more than one possible odor, etc. The total data set consisted of 8124 different instances. Of these, 4208 patterns corresponded to edible mushrooms; the remaining 3916 training patterns corresponded to inedible mushrooms (i.e., mushrooms that were definitely poisonous, or were of unknown edibility and therefore not recommended).
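The paper does not specify how these nominal features were coded as network inputs. One plausible scheme, assumed purely for illustration in the Python sketch below, is a one-of-N (unary) encoding of each feature's possible values; the dictionary is abbreviated to two features to keep the sketch short.

# Hypothetical one-of-N (unary) coding of the nominal mushroom features.
FEATURE_VALUES = {
    "odor": ["almond", "anise", "creosote", "fishy", "foul",
             "musty", "none", "pungent", "spicy"],
    "gill-size": ["broad", "narrow"],
    # ... remaining features of Schlimmer's (1987) data set ...
}

def encode(mushroom):
    # Map a dictionary of feature -> value onto a flat 0/1 input vector.
    bits = []
    for feature, values in FEATURE_VALUES.items():
        bits.extend(1.0 if mushroom[feature] == v else 0.0 for v in values)
    return bits

print(encode({"odor": "almond", "gill-size": "narrow"}))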
The purpose of the current experiment was to determine whether an artificial neural network could learn to correctly identify a mushroom as edible or not. In particular, we were interested in seeing whether--after the network converged--we could determine the rules that it used to classify mushrooms.
The network that we trained had four hidden value units, and one output value unit. The network was trained to generate a response of "1" to an edible mushroom, and a response of "0" to a poisonous mushroom. Initial connection weights for the network were randomly selected from the range [-1.0,1.0]. Biases for each hidden unit and for the output unit were set to 0, and were not modified during training. The network was trained using Dawson and Schopflocher's (1992) modified generalized delta rule, with a learning rate of 0.01 and no momentum. Network convergence (i.e., a hit on every pattern) was achieved after 108 sweeps through the training set.
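A minimal Python sketch of such a network follows, assuming the Gaussian value unit activation G(net) = exp(-pi(net - mu)^2) described by Dawson and Schopflocher (1992) with mu fixed at 0. For simplicity the sketch uses plain gradient descent on squared error rather than their modified generalized delta rule, and the input width of 117 units is only a placeholder, since the feature encoding is not specified here.

import numpy as np

def gaussian(net, mu=0.0):
    # Value unit activation: a narrowly tuned, nonmonotonic function of
    # the net input (after Dawson & Schopflocher, 1992).
    return np.exp(-np.pi * (net - mu) ** 2)

def gaussian_deriv(net, mu=0.0):
    return -2.0 * np.pi * (net - mu) * gaussian(net, mu)

rng = np.random.default_rng(0)
n_in, n_hid = 117, 4                          # 117 inputs is a placeholder
W1 = rng.uniform(-1.0, 1.0, (n_in, n_hid))    # weights drawn from [-1.0, 1.0]
W2 = rng.uniform(-1.0, 1.0, (n_hid, 1))
lrate = 0.01                                  # learning rate, no momentum

def train_step(x, t):
    # One plain gradient-descent step on squared error; the original study
    # used Dawson and Schopflocher's modified generalized delta rule, which
    # this sketch does not reproduce.  Biases are fixed at zero, as in the
    # reported network.
    net_h = x @ W1
    h = gaussian(net_h)            # hidden value unit activations
    net_o = h @ W2
    o = gaussian(net_o)            # output value unit activation
    err = t - o
    delta_o = err * gaussian_deriv(net_o)
    delta_h = (delta_o @ W2.T) * gaussian_deriv(net_h)
    W2 += lrate * np.outer(h, delta_o)
    W1 += lrate * np.outer(x, delta_h)
    return 0.5 * float(err[0]) ** 2

# Hypothetical usage: x is a 0/1 input vector of length n_in, and the target
# t is 1.0 for an edible mushroom or 0.0 for a poisonous one.
loss = train_step(np.zeros(n_in), 1.0)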
Interpretation of the Network
The density plots for the four hidden units in the trained network are shown in Figure 1. As can be seen, all four hidden units reveal distinct banding. But are these bands interpretable?
Following the practice of Berkeley et al. (1995), descriptive statistics (i.e., means, standard deviations, and correlations) were computed for the set of features that defined the mushrooms that fell into each band identified in Figure 1. Almost all of these bands revealed a rich set of definite features (for a detailed list, see Dawson, Medler, & Berkeley, 1995).
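As an illustration of how such statistics pick out definite features, the Python sketch below (our own, not code from the original analysis) assumes a binary 0/1 feature encoding and treats a feature as definite for a band when its standard deviation across the band's patterns is zero; the band boundaries and variable names are placeholders.

import numpy as np

def band_members(activations, lower, upper):
    # Indices of the training patterns whose activation in one hidden
    # unit falls inside the band [lower, upper] read off a density plot.
    return np.where((activations >= lower) & (activations <= upper))[0]

def definite_features(patterns, members, names):
    # Features that take a constant value across every pattern in the
    # band (standard deviation of zero) are the band's definite features.
    band = patterns[members]
    means = band.mean(axis=0)
    stds = band.std(axis=0)
    return [(names[i], means[i]) for i in range(band.shape[1]) if stds[i] == 0.0]

# Hypothetical usage, with band boundaries read off a plot by eye:
# members = band_members(hidden_acts[:, 2], 0.65, 0.75)
# for name, value in definite_features(inputs, members, feature_names):
#     print(name, "=", value, "for every pattern in this band")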
For example, consider Band H of Hidden Unit 2. The following description is true of all the mushrooms that fall into this band: Free gill attachment, narrow gill size, tapering stalk shape, stalk surface above ring silky or smooth, stalk surface below ring silky or smooth, stalk color above ring is pink or white, stalk color below ring is pink or white, white veil color, one ring, and a population of "several". If it has bruises, then odor is almond or anise, crowded gill spacing, and pendant ring type. If it does not have bruises, then odor is foul or musty, close gill spacing, and evanescent ring type. If odor is almond or foul, then spore print color is brown or green or orange. If odor is anise or musty, then spore print color is buff or chocolate or purple or white. Leaves or path or urban or woods habitat. This level of detail is typical of the definite features for all of the bands identified in Figure 1.
An examination of the definite features that were identified indicated that Hidden Unit 0 sends a strong signal when a member of a specific class of edible mushrooms is encountered. All of the mushrooms that fall into Band C have almond or anise odor, which is characteristic of an edible mushroom. Some of the mushrooms that fall into Band D of Hidden Unit 0 are edible in virtue of having almond or anise odor. The others that fall into this band have no odor, but are edible because they pass all five questions in the algorithm detailed in Box 1. The mushrooms that fall into Band B are nearly identical to this latter set, with one exception--the Band B mushrooms have bruises. As a result, they pass the first four steps of the algorithm, but fail (i.e., are identified as poisonous) in the fifth step.
Hidden Unit 1 detects inedible mushrooms based on odor. In particular, all of the mushrooms in Bands B and C of Hidden Unit 1 have foul odor, which is diagnostic of a poisonous mushroom.
Hidden Unit 2 captures different classes of mushrooms related to each other in terms of bruises, gill attachment, and number of rings (almost all the mushrooms that produce high activity in this unit have no bruises, free gill attachment, and one ring). The classes diverge from one another with respect to combinations of other features, primarily odor and ring type. As a result, this unit does not pick out features that are definitive only of poisonous (or edible) mushrooms. For example, Band J in Hidden Unit 2 picks out edible mushrooms, while Band K picks out some mushrooms that are poisonous (i.e., they have foul or musty odors) and some mushrooms that are edible (i.e., they have almond or anise odors).
Hidden Unit 3 generates strong activity when it detects the presence of a combination of features that are characteristic of inedible mushrooms. Specifically, all of the mushrooms that fall in Bands B and C are inedible, and have free gill attachment, close gill spacing, a stalk color above the ring that is not yellow, a white veil color, a spore print color that is not yellow, and a pendant ring type if they have no bruises, or a nonpendant ring type if they have bruises.
How does the network use the features encoded in the hidden unit activity bands to determine whether a mushroom is edible or not? For the most part, three of the hidden units in the network (hidden units 1, 2, and 3) detect features that are characteristic of poisonous mushrooms. As a result, the network defines a very large class of edible mushrooms as those that fail to produce activity in its hidden units; 4064 (96.58%) of the edible mushrooms produce bands of [0-A, 1-A, 2-A, 3-A], where 0-A is read "Band A of Hidden Unit 0". These mushrooms are classified as edible because the network fails to find any poisonous features in them.
While this approach works for most of the mushrooms, it fails for a minority of them. A small percentage of edible mushrooms fall into the same (high activity) bands in Hidden Unit 2 as do a number of poisonous mushrooms. As a result, a different approach must be used to identify the edibility of these mushrooms. The network pools the activation of Hidden Units 0 and 2 to solve this problem, essentially by intersecting the features detected by the two units with an AND operation. The high activity bands of Hidden Unit 2 capture groups of related mushrooms, but include both poisonous and edible mushrooms. The intersection of the members of the high activity bands from both units defines a set of edible mushrooms--this intersection picks out the edible mushrooms from the bands in Hidden Unit 2 that also contain poisonous mushrooms. Consequently, the three other "rules" needed to identify the remaining 144 edible mushrooms are [0-C, 1-A, 2-H, 3-A], [0-D, 1-A, 2-J, 3-A], and [0-D, 1-A, 2-K, 3-A]; each of these rules uniquely identifies 48 mushrooms (1.14% of the edible mushrooms), as summarized in the sketch below.
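Expressed as a lookup table, the interpretation amounts to the following Python sketch. The band letters follow the [0-A, 1-A, 2-A, 3-A] notation above; the band boundaries needed to assign a pattern to its bands are not reproduced here, and treating every other band combination as poisonous is our reading of the interpretation rather than code from the original study.

# The four band-combination "rules" for edible mushrooms described above,
# written as a lookup keyed by (band of Hidden Unit 0, 1, 2, 3).
EDIBLE_RULES = {
    ("A", "A", "A", "A"),   # no poisonous features detected (4064 mushrooms)
    ("C", "A", "H", "A"),   # each remaining rule covers 48 edible mushrooms
    ("D", "A", "J", "A"),
    ("D", "A", "K", "A"),
}

def classify(band_labels):
    # band_labels is a tuple of band letters, one per hidden unit, found by
    # locating each hidden unit's activation within the bands of Figure 1.
    return "edible" if tuple(band_labels) in EDIBLE_RULES else "poisonous"

print(classify(("A", "A", "A", "A")))   # edible
print(classify(("A", "B", "A", "A")))   # poisonous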
Conclusion
Connectionists must interpret the internal structure of their networks if PDP models are to contribute to cognitive science (e.g., McCloskey, 1991). In this paper, we have provided a very brief overview of a technique for interpreting value unit networks (for more detailed accounts, see Berkeley et al., 1995; Dawson, Medler, & Berkeley, 1995). Our hope is that this kind of technique will allow cognitive scientists to generate detailed algorithmic accounts of PDP models of cognitive processes, and thus allow these models to inform other empirical investigations of cognition (e.g., by suggesting experiments to be performed by cognitive psychologists).
Author's Notes
This research was supported by an NSERC research grant awarded to MRWD, and by a Province of Alberta Graduate Fellowship awarded to DAM. Address correspondence to Dr. Michael Dawson, Biological Computation Project, Department of Psychology, University of Alberta, Edmonton, AB, T6G 2E9. Phone: 403-492-5175. E-mail: mike@psych.ualberta.ca.