Qnet 2000

Learning Modes

An important concept to understand about the training process is that there are two distinct learning modes. One is generalized learning, where the network model develops relationships that it uses to formulate output predictions. The other is “memorization,” where the network simply maps input sets to output sets. Generalized learning can be exemplified by learning the principles of addition from a small set of number pairs (e.g., 2+2, 3+5): once the general concept is understood, the sum of any two numbers can be determined even though that particular pair was never studied. Memorization learning can be illustrated by learning national capitals. Learning the capitals of 20 countries does nothing to help with the capital of a country that was not studied. Memorization learning is only useful for the learned set. The sample problem RANDOM included with Qnet shows an example of memorization learning. While memorization can be a legitimate learning goal for some model types, generalized learning is desired for the vast majority of data modeling problems. Differentiating between the learning modes will allow you to optimize model training and better determine the effectiveness of a model prior to use.
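The contrast between the two modes can be sketched in a few lines of Python. This is only an illustration of the addition example above, not a description of how Qnet works internally: the lookup table stands in for memorization, and a two-weight linear model fitted by simple gradient descent stands in for generalized learning. All names (`memorizing_model`, `fit_linear`) and the sample pairs are hypothetical.

```python
# Training pairs and their sums (illustrative data only).
train_pairs = [(2, 2), (3, 5), (1, 4), (6, 0)]
train_sums = [a + b for a, b in train_pairs]

# Memorization: a lookup table that only knows the pairs it has seen.
memorized = dict(zip(train_pairs, train_sums))

def memorizing_model(a, b):
    """Return the stored sum, or None for a pair that was never studied."""
    return memorized.get((a, b))

# Generalization: learn weights w1, w2 so that w1*a + w2*b matches the targets.
def fit_linear(pairs, targets, lr=0.01, epochs=2000):
    """Fit a two-weight linear model by per-case gradient descent."""
    w1, w2 = 0.0, 0.0
    for _ in range(epochs):
        for (a, b), t in zip(pairs, targets):
            err = (w1 * a + w2 * b) - t
            w1 -= lr * err * a
            w2 -= lr * err * b
    return w1, w2

w1, w2 = fit_linear(train_pairs, train_sums)

def generalizing_model(a, b):
    """Predict a sum from the learned relationship."""
    return w1 * a + w2 * b

print(memorizing_model(2, 2))    # a studied pair: the stored answer
print(memorizing_model(7, 9))    # an unseen pair: no answer available
print(generalizing_model(7, 9))  # an unseen pair: close to 16
```

The lookup table answers only the four studied pairs, while the fitted model recovers weights near 1 and 1 and can therefore sum pairs it never saw.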

Qnet’s training algorithms attempt to drive the network’s response error for the training set to a minimum value. The error value monitored during Qnet training is the root-mean-square (RMS) error between the network’s output response and the training targets (equivalent to the standard deviation of the errors when their mean is zero). When the training set's error is descending during the training process, one or both of the learning types discussed above are taking place. Unfortunately, there is no way to determine which type of learning is taking place by monitoring the training set error alone. To determine the type of learning, a test set (or overtraining set) must be employed. Qnet allows the test set to be monitored interactively during training. This set of data is not used to train the network. Instead, the error in the network response to it is monitored to determine how the network responds to patterns outside the training set. If both the training and test set errors are declining, generalized learning predominates: the network is learning to generalize the relationships between the inputs and outputs. If the test set error increases while the training set error declines, then memorization is the predominant learning mode. Overtraining occurs when the test set error has reached a minimum and continues to increase thereafter. Continuing to train a network past this minimum can actually hurt the predictive capabilities of the model being developed, because generalization is lost.
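The monitoring rule described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Qnet's own code: the function names (`rms_error`, `learning_mode`, `best_test_iteration`) are hypothetical, and the error histories are made-up numbers rather than output from a Qnet run.

```python
import math

def rms_error(outputs, targets):
    """Root-mean-square error between network outputs and targets."""
    if not outputs or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equal in length")
    squared = [(o - t) ** 2 for o, t in zip(outputs, targets)]
    return math.sqrt(sum(squared) / len(squared))

def learning_mode(train_errors, test_errors, i):
    """Classify the step from check i-1 to check i using both error curves."""
    train_down = train_errors[i] < train_errors[i - 1]
    test_down = test_errors[i] < test_errors[i - 1]
    test_up = test_errors[i] > test_errors[i - 1]
    if train_down and test_down:
        return "generalizing"
    if train_down and test_up:
        return "memorizing"
    return "undetermined"

def best_test_iteration(test_errors):
    """Index of the minimum test set error; training past it risks overtraining."""
    return min(range(len(test_errors)), key=test_errors.__getitem__)

# Hypothetical RMS error histories, one value per monitoring check.
train_history = [0.30, 0.20, 0.15, 0.12, 0.10, 0.09]
test_history = [0.32, 0.24, 0.20, 0.19, 0.21, 0.24]

print(learning_mode(train_history, test_history, 2))  # both errors falling
print(learning_mode(train_history, test_history, 5))  # test error rising
print(best_test_iteration(test_history))              # check with the lowest test error
```

In this made-up history the training error falls at every check, so only the test curve reveals that the later checks are dominated by memorization.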

It should be noted that determining the learning mode and overtraining status by monitoring the training and test set errors assumes that the test set is a representative subset of the training cases. This may not always be true. For problems where the test set is some limited or organized subset of the training set cases, the minimum test set error may simply indicate the point at which the network has best modeled that particular subset. To function as a true overtraining indicator, the test set cases must be a truly random and broad sample of the training cases. To help prevent this problem, Qnet may be instructed to select the test cases randomly from the training set. While this does not guarantee a perfect set for testing, there is a high probability that an ample number of distinct case types will be present when large test sets are employed.
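Random selection of test cases can be sketched as follows. This is an illustration only and does not reproduce Qnet's selection procedure; the function name `split_random_test_set`, the `test_fraction` and `seed` parameters, and the 20% default are all assumptions made for the example.

```python
import random

def split_random_test_set(cases, test_fraction=0.2, seed=None):
    """Randomly hold out a fraction of the cases as a test set.

    Returns (training_cases, test_cases); the original order is preserved
    within each list and no case appears in both.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_test = max(1, round(len(cases) * test_fraction))
    test_indices = set(rng.sample(range(len(cases)), n_test))
    training = [c for i, c in enumerate(cases) if i not in test_indices]
    test = [c for i, c in enumerate(cases) if i in test_indices]
    return training, test

# Usage with 100 placeholder cases; a fixed seed makes the split repeatable.
cases = list(range(100))
training_cases, test_cases = split_random_test_set(cases, test_fraction=0.2, seed=0)
print(len(training_cases), len(test_cases))
```

Drawing the indices at random, rather than taking the first or last block of cases, avoids a test set that covers only one organized portion of the data.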