Qnet 2000

Training Setup - Network Design

The Network Design dialog is used to specify or alter the neural network's design and setup parameters: layer and node quantities, transfer functions and network connections. Dialog options include:

Problem Name- Enter an identifying name for the network.

Number of Network Layers- Enter the number of layers for the network. Include the input layer, all hidden layers and the output layer. The minimum value is 3 and the maximum value is 10 (i.e., 1-8 hidden layers). The following points should be considered when deciding on the number of hidden layers to use:

1)With more hidden layers, complex input/output relationships can be better modeled by the network with fewer total hidden nodes.

2)More hidden layers slow training by increasing the total number of iterations required to learn. Training time per iteration increases with the use of more hidden layers and nodes.

3)Using 3 to 6 total layers (1 to 4 hidden layers) is sufficient for the vast majority of neural models.

Number of Input Nodes- Enter the number of input nodes for the network. The number of nodes must correspond to the number of inputs in the network model. For example, if 10 inputs were used to model 3 outputs, the number of input nodes would be 10.

Number of Output Nodes- Enter the number of output nodes for the network. The number of nodes must correspond to the number of outputs in the network model. For example, if 10 inputs were used to model 3 outputs, the number of output nodes would be 3.

Number of Hidden Nodes per Layer- Enter the number of hidden nodes for each hidden layer of the network. The number of entries should be [NUMBER OF LAYERS - 2]. The order is from the first hidden layer after the input layer to the last hidden layer before the output layer. Network designs tend to be better when the number of hidden nodes is matched to the size of the problem being modeled. The following points should be considered when choosing the number of hidden nodes in each hidden layer:

1)Fewer nodes are needed per hidden layer when more hidden layers are used.

2)Highly complex relationships between the inputs and outputs will require a greater number of hidden nodes and/or hidden layers for accurate results.

3)A larger number of hidden nodes is best used when there is a large number of training cases. When the number of training cases is small compared to the network size, memorization will become a problem.

4)When gaussian or secant transfer functions are employed, a greater number of hidden nodes will usually improve model accuracy.

5)When using multiple hidden layers, specifying a constant number of nodes per layer or specifying a design that gradually decreases the number of hidden nodes in successive layers often produces the best results.

Specifying too many hidden nodes can result in poor models that tend to memorize the training set rather than learn relationships. Specifying too few hidden nodes will result in models that can’t learn the training data adequately. Some experimentation with the network construction may be required to determine the configuration that offers the best learning characteristics. Often, a large envelope of similar network constructions exists that will produce very similar results. For new models, it is best to start with smaller, simpler designs to validate the model prior to optimizing the network size. Qnet’s node analyzer plot can be useful in determining when layers are over- or under-specified.
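The limits described above (3 to 10 total layers, one hidden-node entry per hidden layer) can be checked before a design is entered. The following is a minimal sketch only and is not part of Qnet. The function names validate_design and connection_count are hypothetical, and the connection count assumes a fully connected network with bias terms excluded.

```python
from typing import List

MIN_LAYERS, MAX_LAYERS = 3, 10  # limits stated for "Number of Network Layers"


def validate_design(n_layers: int, n_inputs: int, n_outputs: int,
                    hidden_nodes: List[int]) -> List[int]:
    """Return the node count of every layer, input to output.

    Hypothetical helper: raises ValueError when the values break the
    rules stated in the dialog description.
    """
    if not MIN_LAYERS <= n_layers <= MAX_LAYERS:
        raise ValueError(f"layers must be {MIN_LAYERS}-{MAX_LAYERS}, got {n_layers}")
    if len(hidden_nodes) != n_layers - 2:
        raise ValueError(
            f"expected {n_layers - 2} hidden-node entries, got {len(hidden_nodes)}")
    sizes = [n_inputs, *hidden_nodes, n_outputs]
    if min(sizes) < 1:
        raise ValueError("every layer needs at least one node")
    return sizes


def connection_count(sizes: List[int]) -> int:
    """Connections in a fully connected design (assumption: biases excluded)."""
    return sum(a * b for a, b in zip(sizes, sizes[1:]))


# Example: 10 inputs, 3 outputs, two hidden layers of 8 and 5 nodes.
sizes = validate_design(n_layers=4, n_inputs=10, n_outputs=3, hidden_nodes=[8, 5])
print(sizes, connection_count(sizes))
```

Comparing the connection count with the number of training cases gives a rough indication of whether the network is large relative to the data, which is the situation in which memorization becomes a concern.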

Transfer Functions: Select the transfer function for the hidden or output layers.

Sigmoid (default) - Selecting the sigmoid button for a given layer assigns the sigmoid transfer function to all nodes in that network layer. A sigmoid transfer function is the default backpropagation transfer function used by each node in the network (except input nodes). The sigmoid function is represented by the mathematical relationship 1/(1+e^(-x)). It serves to normalize a node’s output response to a value between 0 and 1. The sigmoid function acts as an output gate that can be opened (1) or closed (0). Since the function is continuous, it is also possible for the gate to be partially opened (i.e. somewhere between 0 and 1). Models incorporating sigmoid transfer functions often show good generalized learning characteristics and yield models with excellent accuracy. Use of sigmoid transfer functions can also lead to longer training times.
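The gate behavior of the sigmoid can be illustrated numerically. The sketch below is not Qnet code. It simply evaluates 1/(1+e^(-x)) for a few inputs, using a form that avoids overflow for large negative x.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z is in (0, 1) and cannot overflow
    return z / (1.0 + z)


# Strongly negative input: gate nearly closed. Zero: half open.
# Strongly positive input: gate nearly open.
for x in (-6.0, 0.0, 6.0):
    print(f"x = {x:+.1f}  ->  {sigmoid(x):.4f}")
```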

Gaussian- Selecting the gaussian button for a given layer assigns the gaussian transfer function to all nodes in that network layer. The gaussian transfer function significantly alters the learning dynamics of a neural network model. Where the sigmoid function acts as a gate (opened, closed or somewhere in-between) for a node’s output response, the gaussian function acts as a probabilistic output controller. Like the sigmoid function, the output response is normalized between 0 and 1, but the gaussian transfer function is more likely to produce the “in-between state”. It would be far less likely, for example, for the node’s output gate to be fully opened (i.e. an output of 1). Given a set of inputs to a node, the output will normally be some type of partial response (the output gate will open partially). Gaussian-based networks tend to learn more quickly than their sigmoid counterparts, but can be prone to memorization.

Tanh- The hyperbolic tangent function is a counterpart to the sigmoid transfer function. While the hyperbolic tangent is similar to the sigmoid, it can exhibit different learning dynamics during training. It can accelerate learning for some models, but it also may not achieve the same accuracy as sigmoid-based models. Experimenting with transfer functions for each individual model is the only conclusive method to determine if any of the non-sigmoid transfer functions will offer good learning and accuracy characteristics.

Sech- The hyperbolic secant is similar to the Gaussian in both numerical and conceptual function. Experimenting with this Gaussian counterpart is the only conclusive method to determine if it offers better learning and accuracy characteristics.
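The similarity between the Gaussian and the hyperbolic secant, and the contrast with the hyperbolic tangent, can be seen by evaluating the standard textbook forms side by side. This is an illustrative sketch only. The exact scaling Qnet applies to each function is not stated here, so the Gaussian form exp(-x^2) in particular is an assumption.

```python
import math


def gaussian(x: float) -> float:
    """Assumed Gaussian form exp(-x^2): equals 1 only at x = 0."""
    return math.exp(-x * x)


def sech(x: float) -> float:
    """Hyperbolic secant 1 / cosh(x): also peaks at 1 for x = 0."""
    return 1.0 / math.cosh(x)


# tanh is taken directly from the standard library (math.tanh).
print(f"{'x':>5} {'gaussian':>9} {'sech':>9} {'tanh':>9}")
for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"{x:5.1f} {gaussian(x):9.4f} {sech(x):9.4f} {math.tanh(x):9.4f}")
```

Both bell-shaped functions give a full response only when the input is exactly zero and a partial response everywhere else, which matches the “in-between state” described for the Gaussian. The hyperbolic tangent is S-shaped like the sigmoid.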

Connections- The “Connections...” button invokes the connection editor for the given layer. Use this option to alter the network connections from the default fully connected configuration.

The connections in the network are shown in the left list box. The connections removed from the network are shown in the right list box. The connection notation L1, L2,... stands for Layer 1, Layer 2, etc. The notation N1, N2,... stands for Node 1, Node 2, etc. All connections between any two layers are represented by this notation and displayed in the lists. Simply select the desired connections and use the remove and add buttons to transfer the selected connections between lists. Select multiple connections by using the Shift and/or Ctrl keys with the mouse pointer. Holding down the Shift key allows you to select multiple sequential connections. Holding down the Ctrl key allows you to select multiple connections individually. After editing the connections between layers, select OK to register these changes. Selecting Cancel will cause changes to be discarded.
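The two list boxes can be thought of as two sets of connections, with the remove and add buttons moving entries between them. The sketch below models that idea only and does not reflect Qnet's internal representation. The tuple layout, the helper names and the exact label format are assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical layout: (source layer, source node, target layer, target node)
Connection = Tuple[int, int, int, int]


def full_connections(sizes: List[int]) -> Set[Connection]:
    """Every connection of a fully connected network, numbered from 1."""
    conns: Set[Connection] = set()
    for layer, (n_src, n_dst) in enumerate(zip(sizes, sizes[1:]), start=1):
        for i in range(1, n_src + 1):
            for j in range(1, n_dst + 1):
                conns.add((layer, i, layer + 1, j))
    return conns


def label(c: Connection) -> str:
    """Render a connection using the L/N notation (display format assumed)."""
    return f"L{c[0]} N{c[1]} -> L{c[2]} N{c[3]}"


def move(selected: Set[Connection], source: Set[Connection],
         target: Set[Connection]) -> None:
    """Move the selected connections from one list to the other, in place."""
    found = selected & source  # ignore selections not present in the source
    source -= found
    target |= found


connected = full_connections([3, 2, 1])  # left list: default, fully connected
removed: Set[Connection] = set()         # right list: starts empty

move({(1, 1, 2, 2), (2, 2, 3, 1)}, connected, removed)  # "remove" two entries
move({(1, 1, 2, 2)}, removed, connected)                # "add" one back
print(sorted(label(c) for c in removed))
```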

View Network- The “View Network” button will display the current design of your network.