**2. Methods**

#### *2.1. Cascade Neural Network*

The neural network operates in an uncertain environment. In the supervised setting, it is assumed that the teacher and the neural network are both exposed to the same training vector drawn from that environment. By virtue of its built-in knowledge, the teacher can supply the desired response to this training vector, and that desired response represents the optimal action for the neural network. The key property of a neural network is its capacity to improve performance by learning from experience [24]. According to how they learn, neural networks are divided into supervised learning networks and unsupervised learning networks, otherwise termed learning with a teacher and learning without a teacher. Conceptually, the teacher can be thought of as possessing knowledge of the environment, represented by a set of input–output samples [25,26].

The network parameters are adjusted under the combined influence of the training vector and the error signal, where the error signal is defined as the difference between the desired response and the network's actual response. This adjustment is carried out step by step so that the neural network comes to emulate the teacher, with the emulation presumed to be optimal in some statistical sense. In this way, training transfers the teacher's knowledge of the environment to the neural network as fully as possible [27]. Once this is achieved, the teacher can be dispensed with and the neural network can deal with the environment entirely on its own. In a supervised NN model, input vectors and corresponding target vectors are used to update the parameters; before a function can be approximated, input features must be associated with specific output vectors and the information to be processed must be properly identified [28,29].
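The error-correction scheme described above can be sketched for a single linear neuron. This is a minimal illustration, not the paper's cascade network: the update used is the classic delta (LMS) rule, and the learning rate, toy data, and number of passes are assumptions chosen for the example.

```python
import numpy as np

# Hedged sketch of supervised, error-driven learning: the parameter update
# is driven by the error signal, i.e. the gap between the teacher's desired
# response t and the network's actual response y. Data and hyperparameters
# below are illustrative assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # training vectors from the "environment"
true_w = np.array([1.5, -2.0, 0.5])     # the teacher's (hidden) knowledge
t = X @ true_w                          # teacher's desired responses

w = np.zeros(3)                         # network's initial parameters
lr = 0.05                               # learning-rate (step size), assumed

for _ in range(200):                    # step-by-step adjustment
    for x, target in zip(X, t):
        y = x @ w                       # network's actual response
        error = target - y              # error signal
        w += lr * error * x             # error-driven parameter update

# After training, w has moved toward true_w: the teacher's knowledge of the
# environment has been transferred to the network's parameters.
```

On noiseless data like this, the weights converge to the teacher's weights, after which the teacher is no longer needed, mirroring the argument in the text.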

The most famous and typical algorithm for neural network training is error backpropagation, whose main principle is that the error at the hidden neurons is calculated by propagating the error of the output-layer neurons backwards. The traditional backpropagation algorithm consists of two phases: a feedforward pass and a learning pass. In the feedforward pass, input vectors or patterns are presented to the input layer, and each neuron *j* in the hidden layer computes its activation from *netj*, the dot product of the input vector and that neuron's weights, represented in Equation (1):

$$net\_{j} = \sum\_{i=1}^{N\_i} x\_i w\_{ij}^{I-H} + b\_j = \sum\_{i=0}^{N\_i} x\_i w\_{ij}^{I-H} \tag{1}$$

where $N\_i$ is the input vector dimension, and $i$ and $j$ index the neurons in the input layer and in the hidden layer, respectively. The weight between input $i$ and hidden neuron $j$ is $w\_{ij}^{I-H}$. The bias of hidden neuron $j$ is $b\_j$, which is usually absorbed as a weight, $b\_j = w\_{0j}^{I-H}$ with $x\_0 = 1$. Substituting $net\_j$ into the activation function $\varphi\_1$ yields $\theta\_j$. Each neuron in the output layer then computes $net\_k$, the dot product of $\theta\_j$ and the output-layer neuron weights, as represented in Equations (2) and (3).

$$\theta\_{j} = \varphi\_1(net\_{j}) \tag{2}$$

$$net\_k = \sum\_{j=1}^{N\_H} \theta\_j w\_{jk}^{H-O} \tag{3}$$

Here, $N\_H$ is the number of neurons in the hidden layer and $k$ indexes the neurons in the output layer. The weight between hidden neuron $j$ and output neuron $k$ is $w\_{jk}^{H-O}$. Substituting $net\_k$ into the activation function $\varphi\_2$ gives the output $y\_k$, represented in Equation (4).

$$y\_k = \varphi\_2(net\_k) \tag{4}$$

$$y\_k = \varphi\_2 \left( \sum\_{j=1}^{N\_H} \theta\_j w\_{jk}^{H-O} \right) \tag{5}$$

$$y\_k = \varphi\_2 \left( \sum\_{j=0}^{N\_H} w\_{jk}^{H-O} \, \varphi\_1 \left( \sum\_{i=0}^{N\_i} x\_i w\_{ij}^{I-H} \right) \right) \tag{6}$$
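The feedforward pass of Equations (1)–(6) can be sketched as follows. The logistic activation for both $\varphi\_1$ and $\varphi\_2$, the layer sizes, and the random weights are assumptions for illustration; the bias is absorbed as weight $w\_{0j}^{I-H}$ acting on $x\_0 = 1$, as in Equation (1).

```python
import numpy as np

def phi(z):
    # Logistic activation; assumed here for both phi_1 and phi_2.
    return 1.0 / (1.0 + np.exp(-z))

N_i, N_H, N_O = 4, 5, 2                 # layer sizes (illustrative)
rng = np.random.default_rng(0)
x = rng.normal(size=N_i)                # input vector x_1 .. x_{N_i}

# Bias absorbed as row 0 of the weight matrix, acting on x_0 = 1.
W_IH = rng.normal(size=(N_i + 1, N_H))  # w_{ij}^{I-H}, rows i = 0 .. N_i
W_HO = rng.normal(size=(N_H, N_O))      # w_{jk}^{H-O}

x_aug = np.concatenate(([1.0], x))      # prepend x_0 = 1
net_j = x_aug @ W_IH                    # Equation (1)
theta = phi(net_j)                      # Equation (2)
net_k = theta @ W_HO                    # Equation (3)
y = phi(net_k)                          # Equation (4)

# Equation (6): the same outputs from one composed expression.
y_composed = phi(phi(x_aug @ W_IH) @ W_HO)
assert np.allclose(y, y_composed)
```

The final assertion checks that composing the two layers in a single expression, as in Equation (6), reproduces the step-by-step outputs.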

As Equations (4)–(6) show, the entire collection of weights is updated during the learning phase so that $y\_k$ approaches the target output value $t\_k$, by propagating the error $E\_r$ of the output-layer neurons backwards. Although a variety of error functions are available, the squared error is commonly used, represented in Equation (7), where $N\_O$ is the number of neurons in the output layer.

$$E\_r = \sum\_{k=1}^{N\_O} (t\_k - y\_k)^2 \tag{7}$$
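As a small numerical check of Equation (7), with hypothetical targets and outputs:

```python
import numpy as np

# Equation (7): squared error summed over the N_O output neurons.
# The target and output values below are illustrative assumptions.
t = np.array([1.0, 0.0])                # target outputs t_k
y = np.array([0.8, 0.3])                # network outputs y_k

E_r = np.sum((t - y) ** 2)              # (1 - 0.8)^2 + (0 - 0.3)^2 = 0.13
```

Driving this quantity toward zero over all training patterns is what the backward pass's weight updates accomplish.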
