*3.2. Pre-Simulation*

Using the settings in Section 3.1, we verified the established model. First, we tested whether the neural network is suitable for simulating the ARMA model, in order to verify the correctness of the derivation in Section 3.2. Second, we applied the proposed method to order estimation and discussed the special cases that arise.

Figure 4 shows the MSE loss for a time series that follows the ARMA (3, 2) model. Figure 4a,b show the result of training this series on the neural networks corresponding to ARMA (3, 2) and ARMA (1, 2), respectively. We use this example to illustrate how the neural network is used to judge the model. As shown in Figure 4a, when the time series is passed through the correct model, the MSE decreases steadily and, after a sufficient number of epochs, reaches the target value ($10^{-7}$). In Figure 4b, when the time series is passed through the mismatched model, the MSE reaches its best value ($10^{-3}$) at epoch 33 and does not drop again for ten consecutive epochs. The reason is that the correct neural network model can approach an analytical solution after sufficient iterations. For a wrong neural network model, the MSE often stops dropping after reaching a critical point, whereas the correct model attains a satisfactory MSE. This is the basis on which we judge the order of the ARMA model through the MSE.
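The judgment described above can be sketched as a simple rule applied to the per-epoch MSE curve. This is only an illustration: the function name `judge_training` is hypothetical, and treating the target ($10^{-7}$) and the ten-epoch stall as explicit `target` and `patience` parameters is an assumption drawn from the values reported for Figure 4, not a stopping criterion stated by the method itself.

```python
def judge_training(mse_per_epoch, target=1e-7, patience=10):
    """Classify an MSE curve as 'converged', 'stalled', or 'unfinished'.

    Hypothetical helper: `target` and `patience` mirror the values quoted
    for Figure 4 and are assumptions, not settings taken from the paper.
    Returns (verdict, best_epoch, best_mse), with epochs counted from 1.
    """
    best_mse = float("inf")
    best_epoch = 0
    stalled = 0  # consecutive epochs without improvement
    for epoch, mse in enumerate(mse_per_epoch, start=1):
        if mse < best_mse:
            best_mse, best_epoch, stalled = mse, epoch, 0
        else:
            stalled += 1
        if best_mse <= target:
            return "converged", best_epoch, best_mse
        if stalled >= patience:
            return "stalled", best_epoch, best_mse
    return "unfinished", best_epoch, best_mse


# Synthetic curves (not measured data) imitating the two panels of Figure 4.
matched = [1e-1, 1e-3, 1e-5, 1e-8]
mismatched = [1.0 / (i + 1) for i in range(32)] + [1e-3] * 11
print(judge_training(matched))
print(judge_training(mismatched))
```

A curve judged "stalled" at a comparatively large MSE suggests a mismatched order, while a curve judged "converged" is consistent with the correct one.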

**Figure 4.** Training neural network of ARMA (3, 2): (**a**) ARMA (3, 2), (**b**) ARMA (1, 2).

Next, we needed to determine the order of the ARMA model through the BPNN. Figures 1 and 4 show that the values of *p* and *q* that define the best input layer of the BPNN correspond to the best order of the ARMA model. We therefore expected the neural network's MSE loss to be smallest for the correct model order. Accordingly, for each time series we computed the MSE under 30 candidate neural network structures.
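A minimal sketch of this scan over candidate structures is given below. The names `scan_orders` and `fit_mse` are hypothetical, and the 6 × 5 grid of (*p*, *q*) values is only one assumed way of obtaining 30 structures; the text does not specify the grid. The callable `fit_mse` stands in for training the BPNN with a given input layer and returning its final MSE.

```python
import numpy as np


def scan_orders(series, fit_mse, p_values, q_values):
    """Evaluate fit_mse(series, p, q) on a grid of candidate orders.

    `fit_mse` is a placeholder for "train the BPNN whose input layer
    corresponds to (p, q) and return its final MSE".
    Returns the MSE table and the (p, q) pair with the smallest MSE.
    """
    p_values, q_values = list(p_values), list(q_values)
    mse = np.empty((len(p_values), len(q_values)))
    for i, p in enumerate(p_values):
        for j, q in enumerate(q_values):
            mse[i, j] = fit_mse(series, p, q)
    i_best, j_best = np.unravel_index(np.argmin(mse), mse.shape)
    return mse, (p_values[int(i_best)], q_values[int(j_best)])


# Toy stand-in for BPNN training (an assumption, not the real network):
# only the matched ARMA (3, 2) structure reaches the small target MSE.
def toy_fit_mse(series, p, q):
    return 1e-7 if (p, q) == (3, 2) else 1e-3


table, best = scan_orders(None, toy_fit_mse, range(1, 7), range(1, 6))
print(table.shape, best)
```

In practice the smallest MSE alone may not single out one order, so the full table is returned for further inspection.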

Figure 5 shows the MSE results for two different time series. In Figure 5a, the MSE shows an obvious downward trend as the order increases, and the red circle marks the true model order. Once this critical point is reached, the MSE no longer changes significantly as the order increases, because higher-order neural network models can also represent lower-order dynamics. During the calculations, we found that the MSE can exhibit the special case shown in Figure 5b, which simulates an ARMA (2, 1) series. Any of the three points marked by red circles in Figure 5b could be the order, since the MSE is relatively small at all three. Therefore, in addition to comparing the MSE, we introduce the gradient to determine the order of the model when there are multiple critical points: the critical point at which the descending gradient is largest while the MSE is small is taken as the best value within this set. Figure 5 also shows that asymmetric ARMA structures are more prone to ambiguous results, because the descending gradient of the MSE is gentler near the correct values.
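One possible reading of this gradient rule is sketched below. It rests on several assumptions that the text does not fix: the "descending gradient" is taken as the drop in $\log_{10}$ MSE relative to the neighbours with one fewer AR or MA term, a point counts as having a "small" MSE if it lies within a factor `tol` of the minimum, and the name `pick_order_by_drop` is hypothetical.

```python
import numpy as np


def pick_order_by_drop(mse, p_values, q_values, tol=10.0):
    """Choose among low-MSE candidates the one with the steepest descent.

    Assumed interpretation, not the paper's exact rule:
    - candidates are grid points whose MSE is within a factor `tol`
      of the smallest MSE (all MSE values must be positive);
    - the descent at (p, q) is the smaller of the log10-MSE drops from
      (p-1, q) and (p, q-1), so both neighbours must be clearly worse.
    """
    log_mse = np.log10(np.asarray(mse, dtype=float))
    is_small = log_mse <= log_mse.min() + np.log10(tol)
    best_order, best_drop = None, -np.inf
    for i, j in zip(*np.nonzero(is_small)):
        drops = []
        if i > 0:
            drops.append(log_mse[i - 1, j] - log_mse[i, j])
        if j > 0:
            drops.append(log_mse[i, j - 1] - log_mse[i, j])
        drop = min(drops) if drops else 0.0
        if drop > best_drop:
            best_order, best_drop = (p_values[int(i)], q_values[int(j)]), drop
    return best_order, float(best_drop)


# Synthetic plateau (not measured data): every order at or above
# ARMA (2, 1) reaches 1e-7, every other order stays at 1e-3.
ps, qs = [1, 2, 3, 4], [0, 1, 2, 3]
table = np.array([[1e-7 if (p >= 2 and q >= 1) else 1e-3 for q in qs] for p in ps])
print(pick_order_by_drop(table, ps, qs))
```

Taking the smaller of the two drops penalises over-specified orders, whose MSE barely changes relative to the true order, which is why the plateau in this synthetic table collapses to a single point.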

**Figure 5.** MSE of BPNN: (**a**) ARMA (2, 2), (**b**) ARMA (2, 1).
