Qnet 2000

Data Normalization

Backpropagation neural networks require that all training targets be normalized between 0 and 1 for training. This is because an output node’s signal is restricted to a 0 to 1 range. Qnet also requires input normalization to improve training characteristics. Qnet can perform the data normalization automatically and hide the normalization details from the user. If this option is selected during training setup, all data for the nodes in the input layer and/or training targets for the output layer will be normalized between the limits of 0.15 and 0.85. If new training patterns are added to the training set in subsequent sessions, the data will be re-normalized as necessary. (Note: This may make the network weights slightly out of sync with the training data.) Even if the training data is already between 0 and 1, normalization may still be desirable. For example, if all target data were between 0.01 and 0.02, it would be better to normalize the data over a wider range so that the network can resolve and predict the targets more easily.
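
The scaling described above can be sketched in a few lines of Python. This is only an illustration: it assumes a simple linear (min-max) mapping onto the 0.15 to 0.85 limits, which this section does not spell out, and the function names fit_limits and normalize are hypothetical, not part of Qnet.

```python
# Illustrative sketch only. Assumes a linear min-max mapping onto 0.15..0.85;
# Qnet's exact internal formula is not given in this section.

def fit_limits(values):
    """Return the (min, max) of the training data for one node."""
    return min(values), max(values)

def normalize(value, data_min, data_max, low=0.15, high=0.85):
    """Map value linearly so that data_min -> low and data_max -> high."""
    if data_max == data_min:
        # Degenerate case (constant data): place it mid-range (an assumption).
        return (low + high) / 2.0
    return low + (value - data_min) * (high - low) / (data_max - data_min)

# Targets squeezed into a narrow band, as in the 0.01 to 0.02 example.
targets = [0.01, 0.012, 0.015, 0.02]
t_min, t_max = fit_limits(targets)
scaled = [normalize(t, t_min, t_max) for t in targets]

for raw, s in zip(targets, scaled):
    print(f"{raw:.3f} -> {s:.3f}")
```

After scaling, the smallest target sits at the lower limit and the largest at the upper limit, so the network sees the targets spread over most of the usable output range instead of a 0.01-wide band.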

When Qnet’s automatic normalization is used, the entire normalization process becomes invisible to the user. All network inputs and outputs will be returned to their original scale when plotted, printed or saved to a file. For recall mode, input node data is normalized to the same limits used for network training. Network outputs will be automatically scaled back to the proper range if automatic normalization was selected for the training targets. If new input data is being presented to the network, there is always a chance that the new normalized data will fall outside the 0 to 1 limits. This may not be a problem. However, when inputs to a network are significantly different from the data ranges that were used during training, the model’s accuracy may be questionable.
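
As a rough sketch of the recall-mode behavior described above, the limits found during training can be stored and reused, and the inverse mapping applied to the network outputs. The linear mapping, the function names and the TRAIN_MIN and TRAIN_MAX values below are all assumptions made for illustration; they are not taken from Qnet.

```python
# Illustrative sketch only. The linear mapping and all names are assumptions.

LOW, HIGH = 0.15, 0.85

# Hypothetical limits recorded for one node when the network was trained.
TRAIN_MIN, TRAIN_MAX = 20.0, 60.0

def normalize(value, data_min=TRAIN_MIN, data_max=TRAIN_MAX):
    """Scale a raw value using the limits stored at training time."""
    return LOW + (value - data_min) * (HIGH - LOW) / (data_max - data_min)

def denormalize(scaled, data_min=TRAIN_MIN, data_max=TRAIN_MAX):
    """Inverse of normalize: return a scaled signal to its original units."""
    return data_min + (scaled - LOW) * (data_max - data_min) / (HIGH - LOW)

# A recall value inside the training range survives the round trip.
signal = normalize(40.0)
restored = denormalize(signal)
print(signal, restored)

# A recall value far beyond the training range scales to more than 1.
far_signal = normalize(100.0)
print(far_signal)
```

The important point is that recall data is never rescaled by its own minimum and maximum. Reusing the training limits keeps a given raw value mapped to the same signal the network saw during training.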

In the loan application example, Qnet’s automatic normalization would be used for the network inputs (the output is in a binary form and does not require normalization). Assume that the number of dependents ranged from 1 to 10 for the training set. If a loan applicant applies with 18 dependents (this is only an example), could the network accurately predict whether the person should qualify for the loan? The model will produce an answer, but the results may be unreliable.
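
To make the dependents example concrete, the sketch below scales 18 dependents against a training range of 1 to 10 and flags the value as lying outside the data the network was trained on. It assumes a linear mapping onto the 0.15 to 0.85 limits, and check_input is a hypothetical helper, not a Qnet function.

```python
# Illustrative sketch only. Assumes a linear mapping onto 0.15..0.85.

LOW, HIGH = 0.15, 0.85
DEP_MIN, DEP_MAX = 1, 10  # range of "number of dependents" in the training set

def normalize(value, data_min, data_max):
    """Scale a raw value using the training-set limits."""
    return LOW + (value - data_min) * (HIGH - LOW) / (data_max - data_min)

def check_input(value, data_min, data_max):
    """Return the scaled value and whether it lies inside the training range."""
    scaled = normalize(value, data_min, data_max)
    in_training_range = data_min <= value <= data_max
    return scaled, in_training_range

scaled, in_range = check_input(18, DEP_MIN, DEP_MAX)
if not in_range:
    print(f"18 dependents scales to {scaled:.2f}; "
          "outside the training range, so treat the prediction with caution")
```

Under these assumptions, 18 dependents scales to roughly 1.47, well above both the 0.85 training limit and the 0 to 1 range. The network will still return an output for such an input, which is why a check like this is useful before trusting the result.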