*4.3. Parameter Optimization*

For optimization and comparison purposes, the number of hidden units, the number of LSTM layers, the batch size, and the epoch size were all varied [29,30]:

• Hidden units size: 4, 8, 16, 32, 64, 128, 256.

• Number of LSTM layers: 1, 2, 3, 4, 5, 6, 7.

• Batch size: 3, 6, 12, 24, 48, 96.

• Epoch size: 100, 500, 1000, 2000, 5000, 10,000, 20,000.
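The tuning procedure in this section varies one parameter at a time around a fixed baseline rather than searching the full grid. A minimal sketch of such a one-at-a-time sweep in Python (the function and variable names are illustrative, not from the paper's code):

```python
# One-at-a-time parameter sweep: hold a baseline fixed and vary a single
# parameter over its candidate values, as done in the subsections below.
GRID = {
    "hidden_units": [4, 8, 16, 32, 64, 128, 256],
    "lstm_layers":  [1, 2, 3, 4, 5, 6, 7],
    "batch_size":   [3, 6, 12, 24, 48, 96],
    "epochs":       [100, 500, 1000, 2000, 5000, 10_000, 20_000],
}

# Baseline used before each parameter is optimized (illustrative).
BASELINE = {"hidden_units": 16, "lstm_layers": 1, "batch_size": 24, "epochs": 2000}

def one_at_a_time(grid, baseline):
    """Yield (parameter name, configuration) for each candidate value,
    keeping every other parameter at its baseline value."""
    for name, values in grid.items():
        for v in values:
            cfg = dict(baseline)
            cfg[name] = v
            yield name, cfg

configs = list(one_at_a_time(GRID, BASELINE))
# 7 + 7 + 6 + 7 = 27 configurations, far fewer than the 7*7*6*7 = 2058
# configurations a full grid search would require.
```

Each configuration would then be trained and its mean square error recorded, producing the box plots analyzed below.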

 • Selection of the number of LSTM layers

As a baseline, the number of hidden units is set to 16, the batch size to 24, and the epoch size to 2000, all chosen arbitrarily. With these parameters fixed, only the number of LSTM layers is varied: 1, 2, 3, 4, 5, 6, 7. The box plot of the mean square error during model training is shown in Figure 14.

In each box plot, the top and bottom whiskers mark the maximum and minimum values, the upper and lower edges of the box mark the upper and lower quartiles, and the orange line marks the median. Comparing the seven box plots, increasing the number of layers has only a minor impact on the mean square error of model training [33]. When the number of layers is 5, 6, or 7, the error of the LSTM model stabilizes at a fixed value after only a short period of training; as shown in Figure 14, these box plots have many outliers (large deviations, drawn as black circles), and the median, upper quartile, and lower quartile overlap. In terms of model performance, however, more LSTM layers make the model slower to run and more complex, which degrades the quality of the results [34,35]. The loss on the test set is positively correlated with that on the training set and is smallest when the number of layers is 2. As a result, two LSTM layers are best for this model.
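The box-plot quantities described above (median, quartiles, and outliers beyond the whiskers) can be computed directly; a small sketch using the common 1.5 × IQR outlier rule (the MSE sample values are invented purely for illustration):

```python
from statistics import quantiles

def box_stats(errors):
    """Quartiles and outliers as drawn in a standard box plot: points
    more than 1.5 * IQR beyond the box are flagged as outliers."""
    q1, median, q3 = quantiles(errors, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "outliers": [e for e in errors if not lo <= e <= hi],
    }

# Illustrative per-run training MSE values; 9.0 would appear as a black circle.
stats = box_stats([2.1, 2.2, 2.2, 2.3, 2.4, 9.0])
```

The overlapping quartiles observed for 5, 6, and 7 layers correspond to the case where `q1`, `median`, and `q3` are nearly equal while the outlier list is long.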

**Figure 14.** The mean square error with different numbers of LSTM layers. (**a**) The training error with different number of LSTM layers. (**b**) The test error with different number of LSTM layers.

• Selection of the hidden units size

To determine the number of hidden units, we keep the batch size and epoch size unchanged and run the LSTM model with different hidden unit sizes, i.e., 4, 8, 16, 32, 64, 128, 256. The box plot of the mean square error is shown in Figure 15. In terms of error size and final training effect, 128 hidden units is the best choice for training the data: the majority of the mean square error values fall below 25, and the loss on the test set is the smallest.
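The cost of a larger hidden size can be seen from the standard LSTM parameter count: each of the four gates has input weights, recurrent weights, and a bias. A short sketch, assuming a single input feature and the two-layer architecture selected above (the helper names are illustrative):

```python
def lstm_params(input_dim, units):
    """Trainable parameters in one LSTM layer:
    4 gates * (input weights + recurrent weights + bias)."""
    return 4 * (units * input_dim + units * units + units)

def two_layer_params(input_dim, units):
    """Two stacked layers: layer 2 receives layer 1's output as input."""
    return lstm_params(input_dim, units) + lstm_params(units, units)

# Parameter count grows roughly quadratically with the hidden size:
for h in [4, 8, 16, 32, 64, 128, 256]:
    print(h, two_layer_params(1, h))
```

Doubling the hidden size from 128 to 256 roughly quadruples the weight count, which is one reason the search stops where the test error is already smallest.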

**Figure 15.** The mean square error with different hidden units size. (**a**) The training error with different hidden units size. (**b**) The test error with different hidden units size.

• Selection of the batch size

With two layers of LSTM and 128 hidden units, the batch size is varied over 3, 6, 12, 24, 48, and 96. The box plot is shown in Figure 16. The batch size is the number of samples fed into the model at once; it divides the original data set into mini-batches that are trained on independently. This speeds up training while consuming less memory [36], and to some extent mini-batch training also helps prevent overfitting [37]. An appropriate batch size should therefore be chosen when building the model. When the batch size is 24, the resulting mean square error set has the smallest minimum and median, as well as the smallest test error.
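Splitting the training set into mini-batches, as described above, can be sketched as follows (the sample count of 96 is illustrative):

```python
def batches(data, batch_size):
    """Split the data set into consecutive mini-batches; one parameter
    update is performed per batch, so smaller batches mean more updates
    per epoch but noisier gradients."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

samples = list(range(96))              # 96 illustrative training samples
mini_batches = batches(samples, 24)    # 4 batches of 24 at batch size 24
```

A batch size that does not divide the data set evenly simply leaves a smaller final batch, e.g. `batches(list(range(10)), 3)` yields batches of sizes 3, 3, 3, 1.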

**Figure 16.** The mean square error with different batch sizes. (**a**) The training error with different batch sizes. (**b**) The test error with different batch sizes.

• Selection of the epoch size

Two layers of LSTM with 128 hidden units and a batch size of 24 are selected, and the epoch size is varied over 100, 500, 1000, 2000, 5000, 10,000, and 20,000. Figure 17 shows the box plot of the mean square error during model training.

The epoch size is the number of passes the learning algorithm makes over the entire training data set; in each epoch, every sample in the training data set has an opportunity to update the internal model parameters [38]. In theory, more training passes give a better fit and a lower error. In practice, however, overfitting occurs when the epoch size exceeds a certain threshold, causing the training outcome to deteriorate [39]. The epoch sizes of 100, 500, 1000, 2000, 5000, 10,000, and 20,000 are compared in Figure 17. The error decreases rapidly and approaches zero as the epoch size increases from 100 to 10,000. When the epoch size increases to 20,000, the error is still small, but it is larger than at 10,000, indicating overfitting. Therefore, the model with an epoch size of 10,000 performs best.
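The selection rule described above, pick the epoch size with the lowest test error before it rises again, can be sketched as follows; the error values here are invented purely to illustrate the U-shaped overfitting pattern:

```python
def best_setting(sizes, val_errors):
    """Return the epoch size whose validation/test error is lowest."""
    return min(zip(sizes, val_errors), key=lambda pair: pair[1])[0]

epoch_sizes = [100, 500, 1000, 2000, 5000, 10_000, 20_000]
val_mse     = [30.0, 12.0, 6.0, 3.0, 1.2, 0.8, 1.1]  # rises again at 20,000
chosen = best_setting(epoch_sizes, val_mse)           # overfitting past 10,000
```

In practice, early stopping automates this: training halts once the validation error has not improved for a fixed number of epochs.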

Figure 18 shows the training and prediction results after optimizing the model parameters, while Figure 19 shows the loss after optimization. The best parameters for the LSTM model are listed in Table 4. The LSTM model predicts the resting state of cattle well: the prediction largely follows the periodic changes in cattle state with only a small error. The digital twin model for cattle has therefore been established and optimized.

**Figure 18.** The training and prediction after optimizing model parameters.

**Figure 19.** The training loss after optimizing model parameters.

**Table 4.** The best parameters for the LSTM model.


#### **5. Results and Analysis**

Figure 20 depicts the LSTM model's training and prediction for different sexes, breeds, and states. This demonstrates the applicability of the model, which can be used to predict various states of different cattle.

The trend of the results predicted by this LSTM model is nearly identical to the actual data. The model for Brahman males performs relatively poorly, which can be attributed to their relatively random rest state, poor cycle regularity, and other external environmental factors. It is possible that increasing the size of the data set would improve the predictions. Overall, the LSTM-based model of the cattle state cycle is accurate and effective, and it can accurately predict the dynamic trend of the next cattle state cycle.

In this way, the digital twin model can effectively predict the future time budget of cattle, which is conducive to efficient cattle breeding. By predicting the future behaviour of cattle in advance, appropriate preventive measures can be prepared.

**Figure 20.** Applicability of the model. (**a**) Brahman Male cattle (Rest). (**b**) Angus Female cattle (Rest). (**c**) Brahman Female cattle (Pant).
