Recurrent Neural Network (RNN)
Recurrent neural networks differ from traditional feed-forward neural networks: their feed-back connections give them an internal state that can represent context information.
Context is key: RNNs keep information about past inputs for an amount of time that is not fixed a priori, but rather depends on the network's weights and on the input data. A recurrent network whose input is a sequence rather than a fixed vector can therefore transform an input sequence into an output sequence while taking contextual information into account in a flexible way.
The context required must also be learned, however. Recurrent neural networks contain cycles that feed activations from a previous time step back in as inputs, influencing predictions at the current time step. These activations are stored in the internal states of the network, which can in principle hold long-term temporal contextual information. This mechanism allows RNNs to exploit a dynamically changing contextual window over the input sequence history.
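To make the recurrence concrete, here is a minimal sketch of a vanilla (Elman-style) RNN cell in NumPy. The weight names W_xh, W_hh, and b_h and the tanh activation are illustrative assumptions, not taken from the text; the point is only to show how the hidden state from the previous time step feeds back into the update at the current one.

```python
import numpy as np

def rnn_forward(x_seq, W_xh, W_hh, b_h):
    """Run a vanilla (Elman-style) RNN over an input sequence.

    x_seq : (T, input_dim)            the input sequence
    W_xh  : (hidden_dim, input_dim)   input-to-hidden weights
    W_hh  : (hidden_dim, hidden_dim)  recurrent (hidden-to-hidden) weights
    b_h   : (hidden_dim,)             bias
    Returns the sequence of hidden states, shape (T, hidden_dim).
    """
    hidden_dim = W_hh.shape[0]
    h = np.zeros(hidden_dim)          # initial internal state
    states = []
    for x_t in x_seq:
        # The previous state h feeds back into the update: this cycle is
        # what lets the network carry contextual information forward.
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
        states.append(h)
    return np.stack(states)

# Tiny usage example with random weights (hypothetical sizes)
rng = np.random.default_rng(0)
T, input_dim, hidden_dim = 5, 3, 4
x_seq = rng.normal(size=(T, input_dim))
W_xh = rng.normal(scale=0.5, size=(hidden_dim, input_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
print(rnn_forward(x_seq, W_xh, W_hh, b_h).shape)  # (5, 4)
```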
Unfortunately, the range of contextual information that standard RNNs can access is in practice quite limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network's recurrent connections. This shortcoming is referred to in the literature as the vanishing gradient problem.
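The exponential decay or blow-up can be illustrated with a short numerical sketch. The snippet below repeatedly backpropagates a gradient vector through a linear recurrence whose recurrent matrix is rescaled to a chosen spectral norm; the matrix, dimensions, and number of steps are illustrative assumptions, and the tanh Jacobian of a real RNN would only shrink the product further.

```python
import numpy as np

def gradient_norm_after(spectral_norm, steps=50, dim=8, seed=0):
    """Backpropagate a gradient through `steps` recurrent steps of a
    linear recurrence and return its final norm (a simplification of
    backpropagation through time in a vanilla RNN)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(dim, dim))
    # Rescale W so its largest singular value equals spectral_norm.
    W *= spectral_norm / np.linalg.norm(W, 2)
    g = rng.normal(size=dim)          # gradient arriving at the last step
    for _ in range(steps):
        g = W.T @ g                   # one step back through the recurrence
    return np.linalg.norm(g)

print(gradient_norm_after(0.9))   # tiny: the gradient has effectively vanished
print(gradient_norm_after(1.1))   # huge: the gradient has blown up
```

With a spectral norm below 1 the gradient shrinks roughly like 0.9^50 and carries almost no signal about early inputs; above 1 it grows without bound, which is why the usable contextual window of a standard RNN is so limited in practice.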