Qnet 2000

Transfer Functions

The sigmoid, gaussian, hyperbolic tangent and hyperbolic secant transfer functions normalize the output signal generated by each node.

A node’s transfer function controls the output signal strength for the node (except in the input layer, which uses the inputs themselves). These functions set the output signal strength between 0.0 and 1.0. The input to the transfer function is the dot product of the node’s input signals and the node’s weight vector. Qnet gives you the option of selecting four transfer functions: the sigmoid, gaussian, hyperbolic tangent and hyperbolic secant. The functions are selectable on a layer-by-layer basis in Qnet, so networks can be created that incorporate multiple types. The figure shows the behavior of each function.
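The calculation described above can be sketched in a few lines of Python. This is only an illustration, not Qnet's internal code: the name node_output and the sample inputs and weights are hypothetical, and the sigmoid is used as the transfer function because it is Qnet's default.

```python
import math

def sigmoid(x):
    """Standard sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def node_output(inputs, weights, transfer):
    """Hypothetical helper: pass the dot product of inputs and weights
    through the chosen transfer function."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(x * w for x, w in zip(inputs, weights))  # dot product
    return transfer(net)

# Made-up signals and weights, for illustration only.
inputs = [0.5, -1.0, 0.25]
weights = [0.8, 0.2, -0.4]
out = node_output(inputs, weights, sigmoid)
print(out)
```

Passing the transfer function as an argument mirrors the idea that the same node calculation can be paired with any of the four functions.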

The sigmoid function is Qnet’s default transfer function and the most widely used function for backpropagation neural networks. It is represented by the mathematical relationship

f(x) = 1 / (1 + e^{-x})

The sigmoid function acts as an output gate that can be opened (1) or closed (0). Since the function is continuous, it is also possible for the gate to be partially opened (i.e. somewhere between 0 and 1). Models incorporating sigmoid transfer functions often generalize better and yield improved accuracy, but they can also take longer to train.
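The gate behavior can be seen by evaluating the sigmoid at a few points. The sketch below is an illustration rather than Qnet's implementation. The two-branch form is an added assumption: it avoids overflow in math.exp for large negative inputs.

```python
import math

def sigmoid(x):
    """Sigmoid 1 / (1 + e^(-x)), written so that math.exp never overflows."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z is in (0, 1)
    return z / (1.0 + z)

# Large negative input: gate nearly closed. Zero: half open.
# Large positive input: gate nearly open.
for x in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(f"x = {x:6.1f}  output = {sigmoid(x):.5f}")
```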

The gaussian transfer function significantly alters the learning dynamics of a neural network model. Where the sigmoid function acts as a gate (open, closed or somewhere in between) for a node’s output response, the gaussian function acts like a probabilistic output controller. Like the sigmoid function, the output response is normalized between 0 and 1, but the gaussian transfer function is more likely to produce the “in-between state”. It would be far less likely, for example, for the node’s output gate to be fully opened (i.e. an output of 1). Given a set of inputs to a node, the output will normally be some type of partial response. That is, the output gate will open partially. Gaussian-based networks tend to learn more quickly than their sigmoid counterparts, but can be prone to memorization.
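The following sketch contrasts the two functions. This section does not give the exact gaussian form that Qnet uses, so e^(-x^2) is assumed here purely for illustration. The 0.95 threshold for a "nearly open" gate and the evenly spaced grid of inputs are also arbitrary choices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gaussian(x):
    """Assumed form e^(-x^2); peaks at 1 only when x is exactly 0."""
    return math.exp(-x * x)

# Evenly spaced inputs from -5.0 to 5.0 in steps of 0.1.
xs = [i / 10 for i in range(-50, 51)]

# Count how many inputs leave the gate nearly fully open (output > 0.95).
nearly_open_sigmoid = sum(1 for x in xs if sigmoid(x) > 0.95)
nearly_open_gaussian = sum(1 for x in xs if gaussian(x) > 0.95)

print("sigmoid :", nearly_open_sigmoid, "of", len(xs))
print("gaussian:", nearly_open_gaussian, "of", len(xs))
```

Under these assumptions the gaussian reaches the nearly open state for a much narrower band of inputs than the sigmoid, which matches the partial-response behavior described above.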

The hyperbolic counterparts to the sigmoid and gaussian functions are the hyperbolic tangent and hyperbolic secant functions. The hyperbolic tangent is similar to the sigmoid but can exhibit different learning dynamics during training. It can accelerate learning for some models and can also affect predictive accuracy. Experimenting with transfer functions for each individual model is the only conclusive way to determine whether any of the non-sigmoid transfer functions will offer both good learning and accuracy characteristics.
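The similarity between the hyperbolic tangent and the sigmoid can be made concrete with the identity tanh(x) = 2 * sigmoid(2x) - 1. The sketch below uses the standard mathematical definitions. Note that the standard tanh ranges from -1 to 1, and any rescaling Qnet may apply is not described in this section, so none is assumed. The sech helper is a hypothetical name, written to avoid overflow for large inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sech(x):
    """Hyperbolic secant, 1 / cosh(x), computed from e^(-|x|) so that
    large |x| underflows to 0.0 instead of overflowing."""
    e = math.exp(-abs(x))
    return 2.0 * e / (1.0 + e * e)

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    from_sigmoid = 2.0 * sigmoid(2.0 * x) - 1.0  # same value as tanh(x)
    print(f"x = {x:5.1f}  tanh = {math.tanh(x): .5f}  "
          f"2*sigmoid(2x)-1 = {from_sigmoid: .5f}  sech = {sech(x):.5f}")
```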

For most modeling tasks, a sigmoid-based model should serve at least as the baseline against which other results are measured. A general rule of thumb is that the sigmoid will produce the most accurate model but will be slower to learn. If you intend to train similar models frequently and training speed is critical, different combinations of transfer functions, including hybrid networks, are worth investigating to find faster-training models that exhibit acceptable accuracy.
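To show what a hybrid network means in practice, here is a minimal forward-pass sketch in which each layer names its own transfer function. It is not Qnet's interface: the forward function, the TRANSFER table, the assumed gaussian form e^(-x^2) and all weights are hypothetical, and no training is performed.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gaussian(x):
    return math.exp(-x * x)  # assumed form, for illustration only

# Hypothetical lookup table mapping a layer's setting to its function.
TRANSFER = {"sigmoid": sigmoid, "gaussian": gaussian, "tanh": math.tanh}

def forward(inputs, layers):
    """Propagate inputs through layers given as (weight_rows, transfer_name).
    Each weight row holds the weight vector of one node in that layer."""
    signal = list(inputs)  # the input layer passes its inputs through
    for weight_rows, name in layers:
        f = TRANSFER[name]
        signal = [f(sum(w * s for w, s in zip(row, signal)))
                  for row in weight_rows]
    return signal

# Hybrid network: a gaussian hidden layer feeding a sigmoid output layer.
hybrid = [
    ([[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]], "gaussian"),
    ([[0.4, -0.7, 0.9]], "sigmoid"),
]
output = forward([1.0, 0.5], hybrid)
print(output)
```

Changing a layer's transfer name is the only edit needed to try a different combination, which keeps this kind of experiment cheap to set up.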