2.2.3. Long Short-Term Memory (LSTM)

LSTM is based on the RNN architecture and was designed to extend the memory of an RNN [45,46]. This memory can store information over an arbitrary length of time. Three gates, namely the input, output, and forget gates, control the flow of information into and out of the neuron's memory [48–51]. All three gates receive the same input as the input neuron, and each gate has its own activation function [41,48,52].

Figure 4 shows the structure of an LSTM cell at time *t*. Mathematically, the LSTM can be described by the following equations [50–58].

$$\mathbf{f}\_t = g\left(\mathcal{W}\_f \,\mathbf{x}\_t + \mathcal{U}\_f \,\mathbf{h}\_{t-1} + \mathbf{b}\_f\right), \tag{5}$$

$$\mathbf{i}\_t = g\left(\mathcal{W}\_i \,\mathbf{x}\_t + \mathcal{U}\_i \,\mathbf{h}\_{t-1} + \mathbf{b}\_i\right), \tag{6}$$

$$\mathbf{k}\_t = \tanh\left(\mathcal{W}\_k \,\mathbf{x}\_t + \mathcal{U}\_k \,\mathbf{h}\_{t-1} + \mathbf{b}\_k\right), \tag{7}$$

$$\mathbf{c}\_t = \mathbf{f}\_t \odot \mathbf{c}\_{t-1} + \mathbf{i}\_t \odot \mathbf{k}\_t, \tag{8}$$

$$\mathbf{o}\_t = g\left(\mathcal{W}\_o \,\mathbf{x}\_t + \mathcal{U}\_o \,\mathbf{h}\_{t-1} + \mathbf{b}\_o\right), \tag{9}$$

$$\mathbf{h}\_t = \mathbf{o}\_t \odot \tanh\left(\mathbf{c}\_t\right), \tag{10}$$

where $\mathbf{x}\_t$ is the input vector at time *t* and *g* is an activation function (sigmoid, tanh, or ReLU). $\mathcal{W}$ and $\mathcal{U}$ are weight matrices, $\mathbf{b}$ is a bias vector, and $\odot$ denotes element-wise multiplication. $\mathbf{h}\_t$ and $\mathbf{c}\_t$ are the output and cell state vectors at time *t*. The forget gate $\mathbf{f}\_t$ determines how much old information is retained, while the input gate $\mathbf{i}\_t$ determines how much new information is admitted into the cell state [38,49,50,52].
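As a concrete illustration, the following is a minimal NumPy sketch of a single LSTM step following Eqs. (5)–(10). It assumes the sigmoid function for the gate activation *g*; the function and parameter names (`lstm_step`, `W_f`, `U_f`, etc.) are illustrative, not taken from any particular library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step per Eqs. (5)-(10); p holds W_*, U_*, b_* for
    the forget (f), input (i), candidate (k), and output (o) parts."""
    # Eq. (5): forget gate
    f_t = sigmoid(p["W_f"] @ x_t + p["U_f"] @ h_prev + p["b_f"])
    # Eq. (6): input gate
    i_t = sigmoid(p["W_i"] @ x_t + p["U_i"] @ h_prev + p["b_i"])
    # Eq. (7): candidate cell state
    k_t = np.tanh(p["W_k"] @ x_t + p["U_k"] @ h_prev + p["b_k"])
    # Eq. (8): new cell state (element-wise products)
    c_t = f_t * c_prev + i_t * k_t
    # Eq. (9): output gate
    o_t = sigmoid(p["W_o"] @ x_t + p["U_o"] @ h_prev + p["b_o"])
    # Eq. (10): new hidden (output) state
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Example usage with random parameters (input size 3, hidden size 4):
rng = np.random.default_rng(0)
n_in, n_h = 3, 4
p = {}
for gate in "fiko":
    p[f"W_{gate}"] = rng.standard_normal((n_h, n_in))
    p[f"U_{gate}"] = rng.standard_normal((n_h, n_h))
    p[f"b_{gate}"] = np.zeros(n_h)

h, c = np.zeros(n_h), np.zeros(n_h)
h, c = lstm_step(rng.standard_normal(n_in), h, c, p)
```

Note that all four gate computations share the same form; practical implementations typically stack the four weight matrices and compute them in a single matrix product for efficiency.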

**Figure 4.** Long short-term memory (LSTM) cell at time *t*.
