Article

A Method to Predict CO2 Mass Concentration in Sheep Barns Based on the RF-PSO-LSTM Model

1
College of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China
2
Xinjiang Production and Construction Corps Key Laboratory of Modern Agricultural Machinery, Shihezi 832003, China
3
Industrial Technology Research Institute of Xinjiang Production and Construction Corps, Shihezi 832000, China
4
College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
*
Author to whom correspondence should be addressed.
Animals 2023, 13(8), 1322; https://doi.org/10.3390/ani13081322
Submission received: 6 February 2023 / Revised: 8 April 2023 / Accepted: 10 April 2023 / Published: 12 April 2023
(This article belongs to the Section Animal System and Management)


Simple Summary

With the change in meat sheep breeding from traditional farming to large-scale, intensified modern breeding practices, the environmental air quality in sheep barns has gradually started to receive more attention. CO2 concentration is an important environmental indicator in the ambient air of sheep sheds; when excess CO2 accumulates, it can lead to chronic hypoxia, lethargy, loss of appetite, weakness, and stress in sheep, which seriously endangers their healthy growth. Therefore, an accurate understanding of the trend of CO2 concentration changes in sheep housing and the precise regulation of the breeding environment are essential to ensure the welfare of sheep. Inspired by developments in deep learning technology in recent years, we propose a method to predict CO2 mass concentration in sheep barns based on the RF-PSO-LSTM model. The experimental results show that our proposed model has a root mean square error (RMSE) of 75.422 μg·m−3, a mean absolute error (MAE) of 51.839 μg·m−3, and a coefficient of determination (R2) of 0.992. The values predicted by the model closely track the measured data from the sheep barn, indicating good prediction performance. Our proposed method can provide a reference for the prediction and regulation of ambient air quality in meat sheep barns.

Abstract

In large-scale meat sheep farming, high CO2 concentrations in sheep sheds can lead to stress and harm the healthy growth of meat sheep, so a timely and accurate understanding of the trend of CO2 concentration and early regulation are essential to ensure the environmental safety of sheep sheds and the welfare of meat sheep. In order to accurately understand and regulate CO2 concentrations in sheep barns, we propose a prediction method based on the RF-PSO-LSTM model. The approach we propose has four main parts. First, to address the problems of data packet loss, distortion, singular values, and differences in the magnitude of the ambient air quality data collected from sheep sheds, we performed data preprocessing using mean smoothing, linear interpolation, and data normalization. Second, to address the problems of many types of ambient air quality parameters in sheep barns and possible redundancy or overlapping information, we used a random forests algorithm (RF) to screen and rank the features affecting CO2 mass concentration and selected the top four features (light intensity, air relative humidity, air temperature, and PM2.5 mass concentration) as the input of the model to eliminate redundant information among the variables. Then, to address the problem of manually debugging the hyperparameters of the long short-term memory model (LSTM), which is time consuming and labor intensive, as well as potentially subjective, we used a particle swarm optimization (PSO) algorithm to obtain the optimal combination of parameters, avoiding the disadvantages of selecting hyperparameters based on subjective experience. Finally, we trained the LSTM model using the optimized parameters obtained by the PSO algorithm to obtain the proposed model in this paper. The experimental results show that our proposed model has a root mean square error (RMSE) of 75.422 μg·m−3, a mean absolute error (MAE) of 51.839 μg·m−3, and a coefficient of determination (R2) of 0.992. 
The model prediction curve is close to the real curve and has a good prediction effect, which can be useful for the accurate prediction and regulation of CO2 concentration in sheep barns in large-scale meat sheep farming.

1. Introduction

China has the largest number of sheep and goats in the world and is the largest producer and consumer of meat sheep [1,2]. In order to meet the huge market demand for healthy meat sheep, transforming and upgrading meat sheep farming from the traditional free-range model to a modern model in terms of scale and intensification are inevitable [3,4]. However, the environmental air quality in sheep barns can easily deteriorate under large-scale and intensive farming, and when environmental regulation is not timely, it can threaten the normal growth and breeding of meat sheep by inducing disease outbreaks and even causing mass mortality [5,6].
Housing for sheep with a good breeding environment is the basis for disease prevention and control, considering the genetic and nutritional advantages of meat sheep [7]. Factors affecting the environmental air quality of sheep housing mainly include temperature, humidity, wind speed, and harmful gases. The gases in sheep barns mainly include O2, CO2, NH3, and H2S, among which CO2 is the main greenhouse gas. The CO2 in sheep barns is mainly produced by respiration and fecal decomposition, and its emission is influenced by the growth stage, body weight, exercise habits, and ventilation rate of the sheep [8,9,10,11]. A CO2 mass concentration within the normal range is not harmful to the health of sheep. When the CO2 mass concentration in sheep sheds is too high, however, the oxygen content is relatively insufficient, and sheep that live in this environment for a long time will suffer from chronic hypoxia, mental depression, loss of appetite, delayed weight gain, weakness, reduced production level, stress, and susceptibility to infectious diseases, which seriously impairs their welfare [12,13]. Therefore, it is important to study methods of predicting CO2 concentrations in the sheep barn breeding environments of large-scale meat sheep farms so that the trend of CO2 changes can be accurately grasped and air quality precisely regulated. Such methods are of great research value for reducing the impact of environmental stress on meat sheep growth and reproduction, preventing the occurrence of diseases and epidemics, and guaranteeing the welfare of sheep.
Research has been conducted on predicting CO2 mass concentration based on traditional machine learning methods, and some results have been obtained in predicting CO2 concentrations in pig houses [14], composting environments [15], and building construction environments [16,17], and with regard to urban carbon emissions [18,19], crop CO2 emissions [20], and ambient air pollution [21,22]. Although these CO2 mass concentration prediction models can express the trends of internal changes of CO2 in the environment and achieve certain prediction results, they require large amounts of valid data as experimental support, which creates a large and tedious workload. In addition, they can have problems, such as a long training time, slow convergence speed, susceptibility to falling into a local optimum, and poor model generalization ability, which make it difficult to meet the requirements for the timely and accurate prediction and regulation of CO2 mass concentration in sheep barns of large-scale meat sheep farms [23,24,25].
In recent years, with the rapid development of artificial intelligence and deep learning technology, researchers have applied deep learning techniques to a wide range of real-world problems [26,27,28,29,30,31,32,33,34,35,36,37]. Deep learning techniques have been used in crop detection [28,29], data prediction for agricultural management processes [30,31,32], crop disease detection and classification [33,34], and animal behavior recognition [35,36,37].
CO2 mass concentrations in sheep barns of large-scale meat sheep farms can be collected online as time series and nonlinear data, and the LSTM model, one of the typical methods of deep learning, can be used to mine future data change trends by extracting historical time series data features, which allows it to achieve certain results in time series data prediction tasks [38,39,40,41,42,43,44,45]. Wang et al. improved the LSTM model’s prediction performance by adding an adaptive attention module, which allowed the model to obtain more critical information from time series data and achieve an accurate prediction of the remaining service life of lithium–ion batteries [38]. Lin et al. first used LSTM to obtain the long time series relationship of heart rate data, then used BiLSTM to obtain the forward and backward correlation information of the data, and finally combined that with the attention mechanism to achieve an accurate prediction of heart rate [39]. Wu et al. constructed a hybrid model using a combination of LSTM and kinetic models to achieve an accurate prediction of drought occurrence [40]. Zhang et al. combined convolutional neural network (CNN) and LSTM models to construct a hybrid model (CNN-LSTM), and then combined that model with the spatiotemporal characteristics of the soil temperature field (STF) to predict the outlet temperature of energy piles [41]. Wang et al. used several machine learning methods combined with LSTM models to achieve a fast and accurate estimation of winter wheat yield over large areas based on remote sensing data [42]. In summary, considering the potential of LSTM models for complex time series data prediction tasks, in this paper we use LSTM models for the prediction of CO2 mass concentration in sheep barns.
To address the problem that it is difficult to predict and regulate CO2 mass concentrations in sheep sheds of large-scale meat sheep farms in a timely and accurate manner, we proposed a prediction method based on the RF-PSO-LSTM model. First, to address the possible problems of data packet loss, distortion, or singular values in the ambient air quality data collected online from sheep sheds, we used the mean smoothing and linear interpolation methods to repair the problematic data and obtained a high-quality dataset. Second, to address the differences in units and magnitudes of the obtained ambient air quality data and to facilitate the study of correlations in the ambient air quality data of meat sheep barns, we normalized the data using a standardized processor. Then, because a large variety of ambient air quality parameters with possible redundancy or information overlap would not only result in a complex prediction network structure but also tend to lead to high computational complexity and low execution efficiency, we used the RF algorithm to screen and evaluate the features that are important for ambient CO2 concentration, reducing structural complexity and computational effort. In addition, because the prediction results of LSTM models are sensitive to hyperparameters and the manual setting of hyperparameters is time consuming and subjective, we used the PSO algorithm to find the optimal hyperparameter combination. Finally, the prediction results were compared with the actual collected data to test the effectiveness of the method in this paper.

2. Materials and Methods

2.1. Data Source

2.1.1. Test Area

The experimental area for this study was the Xinao livestock meat sheep breeding base in Lanzhouwan Town, Manas County, Xinjiang Uygur Autonomous Region (44.27° N, 86.10° E). This large-scale breeding base mainly focuses on Suffolk sheep. The total area of the meat sheep barn is 2424.31 m2, which includes a main area (middle, daily rest area), a shaded area (north side), and an activity area (south side), with openable and closeable passages between each area.
In summer, sheep barns are sheltered from the heat by natural ventilation and shaded areas. In winter, the main area of the sheep barn is closed for breeding, and ventilation fans are used to maintain air circulation. The test data were collected in the main area of the barn, which was a standard semi-enclosed sheep barn with an area of 442.89 m2 (33.3 m in length, 13.3 m in width). Sheep sheds were designed according to the Code of Management for the Construction of Livestock and Poultry Breeding Communities. The walls were made of brick and concrete, the top surface was made of steel plates, and the floor was made of mud.
The test subjects were Suffolk meat sheep, about 300 in the test barn, fed manually at regular intervals, once in the morning and once in the afternoon, with free watering and manual manure removal.
Sensors were selected to be installed under the center beam of the main area after conducting a site visit and analyzing the environment. The CO2 concentration sensor and total suspended particulate concentration sensor were placed 2.4 m from the ground, and the other sensors were 3.0–3.1 m from the ground; the sensor installation position is shown in Figure 1.

2.1.2. Data Acquisition

In order to obtain real-time ambient air quality data of the sheep barn and ensure the consistency of samples in different seasons and time periods, ambient air quality monitoring equipment produced by Guangzhou Hairui Information Technology Co., Ltd., Guangzhou, China was used. The monitoring equipment included the following: light intensity sensor, temperature sensor, relative humidity sensor, CO2 concentration sensor, PM2.5 concentration sensor, PM10 concentration sensor, noise sensor, total suspended particulate (TSP) matter concentration sensor, H2S concentration sensor, and IoT transmission network and hubs. The response time of the monitoring equipment was less than or equal to 30 s, the repeatability was within ±2%, the linearity error was within ±2%, and the zero-point drift was within ±1%; the specific parameters are shown in Table 1.
The inter-integrated circuit (I2C) interface listed in Table 1 is a serial communication protocol commonly used to connect chips to peripherals such as sensors. It was developed by Philips and is widely used for communication between various microcontrollers and digital signal processors. Modbus is a simple and widely used communication protocol used to transfer data in industrial automation systems. It is used to connect controllers or processors to other devices (e.g., sensors, actuators). Pulse width modulation (PWM) is a commonly used technique for carrying an analog quantity on a digital line: the measured value is encoded in the duty cycle of a pulse train, which can then be read and processed by microprocessors or other digital systems.
By using the installed monitoring equipment, we could obtain real-time ambient air quality data in the sheep barn and transmit the data to the data center of the IoT monitoring platform through communication technology.
We selected ambient air quality data from the sheep barn for the period 11 February to 25 March 2021, with a data collection interval of 10 min. The data included light intensity, temperature, relative humidity, CO2 mass concentration, PM2.5 mass concentration, PM10 mass concentration, noise, TSP mass concentration, and H2S mass concentration, with a total of 6160 sets of valid sample data. Some of the raw data collected are shown in Table 2.

2.1.3. Data Preprocessing

In the process of collecting sheep barn air quality data online, there can be problems such as external electromagnetic interference, degraded performance of aging sensors, and circuit failures. These problems can cause deviations during data acquisition and packet losses during transmission, resulting in missing, distorted, or singular values in the collected data. To reduce the impact of these problems on prediction performance, we repaired the problematic data using the mean smoothing and linear interpolation methods and obtained a high-quality dataset.
Since the obtained ambient air quality data differed in units, the data were normalized using a standardized processor in order to facilitate the study of correlations in ambient air quality data from a meat sheep barn in the winter. We preprocessed the data of the 6160 sets of valid samples and divided the preprocessed data into training, validation, and test sets in chronological order with a ratio of 7:2:1.
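The preprocessing steps described above can be sketched in plain Python. This is a minimal illustration and not the authors' code: the function names, the use of `None` to mark lost samples, the three-point smoothing window, and the z-score form of the "standardized processor" are all assumptions made for the example.

```python
from statistics import mean, pstdev


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between the nearest valid neighbours."""
    filled = list(values)
    valid = [i for i, v in enumerate(filled) if v is not None]
    if not valid:
        raise ValueError("series contains no valid samples")
    for i, v in enumerate(filled):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:        # gap at the start: copy the first valid value
            filled[i] = values[right]
        elif right is None:     # gap at the end: copy the last valid value
            filled[i] = values[left]
        else:                   # interior gap: weight by distance to each neighbour
            w = (i - left) / (right - left)
            filled[i] = values[left] + w * (values[right] - values[left])
    return filled


def mean_smooth(values, window=3):
    """Centred moving average; the window shrinks at the series boundaries."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


def standardize(values):
    """Z-score scaling to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def chronological_split(rows, ratios=(7, 2, 1)):
    """Split rows in time order into training, validation, and test sets."""
    total = sum(ratios)
    n_train = len(rows) * ratios[0] // total
    n_val = len(rows) * ratios[1] // total
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]
```

With 6160 samples and a 7:2:1 ratio, this split yields 4312 training, 1232 validation, and 616 test samples. In practice the scaling statistics are usually computed on the training set only and then reused for the other two sets, so that no information from later data leaks into training.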

2.2. Predictive Model Construction

2.2.1. Random Forest Feature Importance Ranking

The CO2 mass concentration in the sheep barn is influenced by the interactions among several air quality parameters, and the mechanism of interaction is complex. Due to the large variety in parameters as well as possible redundancies and information overlap, if all air quality parameters are directly input into the prediction model, it will not only result in a complex network structure, but also easily lead to low prediction accuracy, poor reliability, and high computational complexity of the model. Therefore, it is necessary to eliminate the multicollinearity among air quality parameters, screen out the features that have important effects on CO2 mass concentration, remove the features of lower importance, reduce the model input, optimize the model’s network structure, and improve the model’s prediction performance.
Random forests (RF) are ensemble learning algorithms with decision trees as the base learners. RF not only solve the important feature-screening problem, but also have many advantages, such as simple structure, good training effects, easy implementation, and low computing cost. Given their good performance in screening important influencing features, we chose to use RF for important feature screening and evaluation of sheep barn CO2 mass concentration to increase the accuracy of model prediction [46].
An RF calculates feature importance mainly by averaging the contribution of each feature across the decision trees and then comparing the features to determine their importance. The error rate of out-of-bag (OOB) data is usually used as an evaluation indicator to aid in screening, as shown in Equation (1):
$$\mathrm{FIM}_i = \frac{\mathrm{errOOB}_2 - \mathrm{errOOB}_1}{N} \tag{1}$$
In Equation (1), FIM is the feature importance score, i is the index of the feature, N is the number of decision trees in the random forest, errOOB1 is the out-of-bag error on the original data, and errOOB2 is the out-of-bag error after noise has been added to feature i.
We used the sklearn.ensemble module in Python to call the RF function to calculate the importance of each feature.
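The arithmetic behind Equation (1) and the subsequent top-k selection can be illustrated with a short sketch. It assumes that errOOB1 and errOOB2 are accumulated over the N trees and supplied as per-tree lists, which is one common reading of the formula. The function names and all numbers are hypothetical and are not taken from the experiments.

```python
def feature_importance(err_oob1, err_oob2):
    """Equation (1) for a single feature.

    err_oob1[k]: OOB error of tree k on the untouched OOB samples.
    err_oob2[k]: OOB error of tree k after noise is added to the feature.
    ASSUMPTION: the per-tree differences are summed before dividing by N.
    """
    if len(err_oob1) != len(err_oob2) or not err_oob1:
        raise ValueError("need one error pair per tree")
    n_trees = len(err_oob1)
    return sum(e2 - e1 for e1, e2 in zip(err_oob1, err_oob2)) / n_trees


def top_k_features(scores, k=4):
    """Return the names of the k features with the largest importance scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Made-up OOB errors for three trees and two hypothetical features.
scores = {
    "feature_a": feature_importance([0.10, 0.12, 0.11], [0.30, 0.35, 0.31]),
    "feature_b": feature_importance([0.10, 0.12, 0.11], [0.11, 0.12, 0.13]),
}
ranking = top_k_features(scores, k=1)
```

A feature whose perturbation barely changes the OOB error receives a score near zero and drops to the bottom of the ranking, which is the basis for discarding it as a model input.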

2.2.2. LSTM Model

Long short-term memory (LSTM) [47] is a further improvement on recurrent neural network (RNN). The LSTM network structure is shown in Figure 2. The LSTM model not only has the advantages of RNN in analyzing short time series, but also selects historical states that have a significant impact on the present as input by setting up a gating mechanism. This operation increases the screening of past states by the LSTM model, selectively utilizes and stores information, achieves information protection and control, and solves the problems of gradient disappearance and gradient explosion that occur in the backpropagation process of RNN training in long time sequences. Therefore, the LSTM model is more suitable for the long time series prediction problem than the RNN model.
The LSTM network structure is composed of input, hidden, and output layers. The input and hidden layers consist of a series of cyclically connected memory units. A memory unit is usually composed of one or more self-connected cells, and it also has input, output, and forgetting gates.
The LSTM model workflow has four main steps. First, the information to be forgotten in the memory cell at the previous moment is determined by multiplying the previous cell state Ct−1 by the forgetting gate ft. This step uses ft to remove the information in Ct−1 that needs to be forgotten so that only the useful information is kept. Second, the new supplementary information is obtained by multiplying input gate it by the new candidate information value C̃t, and the current cell state Ct can be obtained by combining the supplementary information with the reserved information; the supplementary information is controlled by the input of the model and the output of the previous cell ht−1. Then, the Ct cell state values are mapped between −1 and 1 by the tanh activation function. The main purpose of using the tanh activation function is to limit the value of cell states to a controlled range to avoid problems such as gradient disappearance or gradient explosion, thus improving the stability and learning efficiency of the network. Finally, the output ht of the current moment is obtained by multiplying the tanh-mapped Ct cell state value by output gate ot; output gate ot has a value between 0 and 1, which determines which information in the cell state will be output, and the output ht can be used as the input ht−1 in the next moment or as the final output of the whole LSTM network, as shown in Equations (2)–(4):
Calculate the forgetting gate and select the information to be forgotten:
$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{2}$$
Calculate the cell state at the current moment:
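$$\begin{aligned} i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \\ \tilde{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \\ C_t &= f_t \cdot C_{t-1} + i_t \cdot \tilde{C}_t \end{aligned} \tag{3}$$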
Calculate the output at the current moment:
$$\begin{aligned} o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \\ h_t &= o_t \cdot \tanh\left(C_t\right) \end{aligned} \tag{4}$$
In Equations (2)–(4), it is the input gate, ot is the output gate, ft is the forgetting gate, xt is the input value at time t, and σ is the sigmoid activation function. Wi is the input gate weight, Wf is the forgetting gate weight, Wo is the output gate weight, WC is the new candidate information weight, bi is the input gate bias, bf is the forgetting gate bias, bo is the output gate bias, and bC is the new candidate information bias. Weight W and bias b are the optimal values obtained by self-learning optimization during network training. C̃t is the new candidate information value at moment t, Ct is the cell state value at moment t, ht−1 is the output of the model at moment t − 1, and ht is the output of the model at moment t.
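To make the gate equations concrete, the following sketch evaluates a single LSTM time step with plain Python lists. It is an illustration of the standard LSTM cell only: the function names and the `params` dictionary layout are assumptions for this example, and the models in this paper were built with TensorFlow rather than with code like this.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W · v + b for a weight matrix W (list of rows) and a bias vector b."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    params maps "f", "i", "c", "o" to (W, b) pairs; each W has one row per
    hidden unit and len(h_prev) + len(x_t) columns.
    """
    concat = list(h_prev) + list(x_t)                              # [h_{t-1}, x_t]
    f_t = [sigmoid(z) for z in affine(*params["f"], concat)]       # forgetting gate
    i_t = [sigmoid(z) for z in affine(*params["i"], concat)]       # input gate
    c_tilde = [math.tanh(z) for z in affine(*params["c"], concat)]  # candidate value
    o_t = [sigmoid(z) for z in affine(*params["o"], concat)]       # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

With all weights and biases set to zero, every gate evaluates to 0.5 and the candidate value to 0, so the new cell state is simply half of the previous one. This is a convenient sanity check when experimenting with the equations.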

2.2.3. Particle Swarm Optimization

The number of neurons in each layer, the dropout probability, and the batch size hyperparameters in the LSTM model have a strong influence on the prediction results of CO2 mass concentration in sheep sheds. In engineering applications, the hyperparameters are usually set by personal experience or by continuous manual debugging; however, personal experience is subjective and manual debugging is time consuming and labor intensive. Therefore, we need an optimization algorithm to optimize the hyperparameters of the LSTM model.
Particle swarm optimization (PSO) [48], a population-based intelligent optimization algorithm inspired by the foraging behavior of bird flocks, was proposed by James Kennedy and Russell Eberhart. The advantages of the PSO algorithm are its strong global search capability, its small number of parameters, the fact that it does not require gradient information, and especially its real-number encoding, which makes it well suited to real-valued optimization problems.
The PSO algorithm has four main steps for solving optimization problems. First, it randomly initializes a population of particles in the solution space, each of which will have a random initial position and velocity. Second, the value of the objective function is calculated according to the location of each particle, and the globally optimal location is selected as the initial optimal solution for the search. Then, after setting the number of iterations of the particle swarm algorithm, the algorithm will enter an iterative process, updating the velocity and position of each particle at each iteration. Finally, the value of the objective function corresponding to each particle position is evaluated, the local optimal position of each particle and the global optimal position of the whole particle swarm are updated, and the algorithm iterates continuously until the end condition is met or the maximum number of iterations is reached. The algorithm flow is shown in Figure 3.
Suppose that in a D-dimensional target search space, the number of particles in the swarm is N and the current iteration of the algorithm is t. The state of the ith particle at iteration t can then be represented by its position and velocity vectors, as shown in Equation (5):
$$\begin{aligned} x_i^{t} &= \left(x_{i,1}^{t}, x_{i,2}^{t}, \ldots, x_{i,D}^{t}\right) \\ v_i^{t} &= \left(v_{i,1}^{t}, v_{i,2}^{t}, \ldots, v_{i,D}^{t}\right) \end{aligned} \tag{5}$$
The individual optimal solution found by the ith particle up to the current iteration t is Pbest(i)(t), also called the local optimal solution, and the global optimal solution is Gbest(t). In this paper, our objective was to optimize the hyperparameters of the LSTM model in order to obtain a smaller error, which makes this a minimization problem, as shown in Equation (6):
$$G_{\mathrm{best}}^{t} = \min\left\{P_{\mathrm{best}(1)}^{t}, P_{\mathrm{best}(2)}^{t}, \ldots, P_{\mathrm{best}(N)}^{t}\right\} \tag{6}$$
The velocity and position of the ith particle at iteration t + 1 of the algorithm can be calculated by Equation (7):
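$$\begin{aligned} v_{i,D}^{t+1} &= \omega v_{i,D}^{t} + c_1 r_1 \left(P_{\mathrm{best}(i),D}^{t} - x_{i,D}^{t}\right) + c_2 r_2 \left(G_{\mathrm{best},D}^{t} - x_{i,D}^{t}\right) \\ x_{i,D}^{t+1} &= x_{i,D}^{t} + v_{i,D}^{t+1} \end{aligned} \tag{7}$$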
In Equations (5)–(7), x is the position of the particle in the target search space; v is the velocity of the particle; t is the number of iterations; i is the index of the particle; N is the number of particles in the particle swarm; D is the dimension of the target search space; c1 and c2 are the learning factors, which generally take values in the range of [0, 2]; r1 and r2 are random numbers uniformly distributed in the range of [0, 1]; and ω is the inertia weight factor.
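The four PSO steps and the update rule of Equation (7) can be sketched as a small minimizer. This is a generic textbook variant written for illustration: the function name, the default values of ω, c1, and c2, the scale of the random initial velocities, and the clipping of positions to the search bounds are assumptions and do not reproduce the settings in Table 3.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over the box `bounds` = [(low, high), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Step 1: random initial positions and velocities.
    x = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    v = [[rng.uniform(-1.0, 1.0) * 0.1 * (hi - lo) for lo, hi in bounds]
         for _ in range(n_particles)]
    # Step 2: evaluate the objective and pick the initial global best.
    p_best = [p[:] for p in x]
    p_best_val = [objective(p) for p in x]
    g_idx = min(range(n_particles), key=p_best_val.__getitem__)
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]
    history = [g_best_val]

    for _ in range(n_iter):
        for i in range(n_particles):
            # Step 3: velocity and position update, one dimension at a time.
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (p_best[i][d] - x[i][d])
                           + c2 * r2 * (g_best[d] - x[i][d]))
                lo, hi = bounds[d]
                x[i][d] = min(max(x[i][d] + v[i][d], lo), hi)
            # Step 4a: update the particle's own best position.
            val = objective(x[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = x[i][:], val
        # Step 4b: the global best is the best of all individual bests.
        g_idx = min(range(n_particles), key=p_best_val.__getitem__)
        g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]
        history.append(g_best_val)
    return g_best, g_best_val, history
```

For example, calling `pso_minimize(lambda p: sum(c * c for c in p), [(-5.0, 5.0)] * 2)` searches for the minimum of a two-dimensional sphere function, and `history` records the best objective value after each iteration, which never increases.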
The standard initialization weight method was used for LSTM model parameter setting, and the forgetting gate bias parameter was increased to prevent a large loss of information from the previous moment. The PSO algorithm uses mean absolute percentage error (MAPE), shown in Equation (8), as the fitness function to optimize the combination of the number of hidden layer neurons, dropout probability, and batch size parameters of the LSTM model.
$$\mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \tag{8}$$
In Equation (8), n is the sample length of the test set, At is the true state of the CO2 mass concentration at time t, and Ft is the model prediction of the CO2 mass concentration at time t.
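As a small worked example of Equation (8), the function below computes MAPE as a fraction (multiply by 100 for a percentage). The function name and the sample numbers are illustrative assumptions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Each term is the absolute error relative to the true value A_t.
    return sum(abs((a - f) / a) for a, f in zip(actual, predicted)) / len(actual)


# Two samples, each off by 10% of the true value.
example = mape([100.0, 200.0], [110.0, 180.0])
```

Because each term is divided by the true value, the measure is undefined when any true value is zero. That is not a concern for CO2 mass concentration, which is always positive.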

2.2.4. RF-PSO-LSTM Prediction Model

In order to improve the performance of the sheep barn CO2 mass concentration prediction model, we combined the RF algorithm, the PSO algorithm, and the LSTM model to construct the RF-PSO-LSTM prediction model. The method flow of this process is shown in Figure 4.
The workflow of our proposed prediction model involved four main steps. First, we remediated and standardized the ambient air quality data of the sheep barn. Second, we used RF to filter out the features with important effects on the CO2 mass concentration, removed the features with lower effects, reduced the input of the LSTM model, optimized the prediction model network structure, and improved the model prediction performance. Then, we optimized the number of neurons, dropout probability, and batch size hyperparameters of the LSTM model using PSO to obtain the optimal combination of hyperparameters. Finally, we used the PSO-optimized LSTM model to predict the CO2 mass concentration in the sheep barn.
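The link between the PSO step and the LSTM step is the fitness evaluation: each particle is a point in a three-dimensional space that has to be turned into a valid hyperparameter combination before a model can be trained and scored. A minimal sketch of that decoding is given below. The search ranges, the function names, and the `train_and_score` callback (standing in for "train an LSTM with these settings and return its MAPE") are all hypothetical.

```python
# Hypothetical search ranges for (neurons, dropout probability, batch size).
HYPOTHETICAL_BOUNDS = [(16, 256), (0.0, 0.5), (16, 128)]


def decode_particle(position, bounds):
    """Map a continuous 3-D particle position to LSTM hyperparameters."""
    clipped = [min(max(p, lo), hi) for p, (lo, hi) in zip(position, bounds)]
    return {
        "neurons": int(round(clipped[0])),     # layer width must be an integer
        "dropout": float(clipped[1]),          # probability stays continuous
        "batch_size": int(round(clipped[2])),  # batch size must be an integer
    }


def make_fitness(train_and_score, bounds):
    """Wrap a user-supplied training routine as a PSO fitness function."""
    def fitness(position):
        return train_and_score(**decode_particle(position, bounds))
    return fitness
```

Rounding is needed because PSO moves particles through a continuous space, whereas the number of neurons and the batch size are integers.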

2.3. Model Performance Evaluation Metrics

In order to evaluate the prediction effect and accuracy of the CO2 mass concentration prediction model, we selected root mean square error (RMSE), mean absolute error (MAE), and the coefficient of determination (R2), as shown in Equations (9)–(11):
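$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2} \tag{9}$$
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right| \tag{10}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{N} \left(y_i - \bar{y}\right)^2} \tag{11}$$
In Equations (9)–(11), $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean value of the dependent variable in the test set, and N is the number of samples in the test set.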
The smaller the values of RMSE and MAE, the smaller the error between the predicted and actual values of the model and the higher the accuracy. For a model that predicts at least as well as the test-set mean, the value of R2 ranges from 0 to 1, and the closer the value is to 1, the better the reliability of the model prediction results.
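The three metrics can be computed directly from their definitions, as in the following sketch. The function names and the four-point example are illustrative and are not data from the experiments.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between actual and predicted values."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between actual and predicted values."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_mean) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


# Toy example: three exact predictions and one that is off by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
```

For this toy example the single error of 2 gives RMSE = 1.0 and MAE = 0.5, which shows how RMSE penalizes one large error more heavily than MAE does.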

2.4. Model Test Platform

In this paper, the models were trained, validated, and tested on the same computer. The computer ran the Windows 10 operating system and was equipped with an NVIDIA GeForce RTX 3060 GPU, an AMD Ryzen 7 5800H processor, and 16 GB of RAM. All models in this study were built using Python, with the LSTM model based on the TensorFlow deep learning framework and developed in PyCharm.

3. Results and Discussion

3.1. PSO Algorithm Parameter Setting

The parameters of the PSO algorithm in this paper were set as shown in Table 3. The LSTM model used the mean square error (MSE) loss function to calculate the loss value of the model during training, and the optimizer was Adam. The number of neurons, dropout probability, and batch size hyperparameters of the LSTM model were obtained by the PSO algorithm for finding the best results.

3.2. Determination of LSTM Model Structure

The structure of the model needed to be determined when predicting time series using the LSTM model, consisting of input, hidden, and output layers. The input layer is the key to the data transfer of the model and is the first part of the whole model to make predictions. The output layer outputs the results of the model’s prediction of CO2 mass concentration and is the final link in the overall model to make predictions. LSTM models usually have only one input layer and one output layer. The structure of the LSTM model mainly lies in the different hidden layers; therefore, we need to determine the number of hidden layers in the model.
With a small number of hidden layers, the model may not be able to fully learn the relationships in the data, and the fitting ability of the data may be insufficient; as a result, the model may not achieve the expected results for CO2 mass concentration prediction. Too many hidden layers will lead to overfitting the model to the data, resulting in a poor generalization ability; additionally, the model will have more parameters and a complex structure. In summary, we need to set the number of hidden layers reasonably.
In this experiment, we selected the number of hidden layers as 1–5, and we used the RMSE, MAE, R2, and model parameters as the evaluation metrics, with a time step set to 20 by default. The test results are shown in Table 4.
From Table 4, we can see that the model has the lowest number of parameters when there is one hidden layer. When there are five hidden layers, the model has the highest number of model parameters.
With one hidden layer, the RMSE of the model was 123.959 μg·m−3, the MAE was 95.315 μg·m−3, the R2 was 0.978, and the parameter size was 32,251. In this case, the model may not be able to fully learn the complex action relationships in the data due to the low number of hidden layers.
With two hidden layers, the RMSE was 108.177 μg·m−3, the MAE was 83.187 μg·m−3, the R2 was 0.984, and the parameter size was 52,451; these were the best metric values among the models with one to five hidden layers.
Compared with the model with one hidden layer, the RMSE of the model with two hidden layers decreased by 15.782 μg·m−3, the MAE decreased by 12.128 μg·m−3, and the R2 increased by 0.006, showing that the model with two hidden layers could adequately learn the connections in the data and have fewer errors in predicting the CO2 mass concentration. Models with three to five hidden layers may have a relatively complex structure due to more parameters, which leads to larger errors in the prediction results.
The experiments show that the model with two hidden layers had the best prediction effect and, compared with the deeper models, less structural complexity, less computation, faster training, and faster running speed. Thus, the model structure with two hidden layers was chosen in this study.
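The parameter counts discussed above can be checked with a short calculation. The following is a minimal sketch, assuming a standard LSTM layer (four gates, each with input weights, recurrent weights, and a bias) followed by a one-unit dense output. The values of 9 input features and 50 units per layer are assumptions inferred from the counts in Table 4; they are not stated in this section, and the function names are hypothetical.

```python
def lstm_layer_params(n_inputs: int, n_units: int) -> int:
    """Trainable parameters of one standard LSTM layer.

    Each of the four gates has input weights, recurrent weights, and a bias.
    """
    return 4 * (n_inputs + n_units + 1) * n_units


def stacked_lstm_params(n_features: int, n_units: int, n_hidden_layers: int) -> int:
    """Parameters of: LSTM input layer -> n hidden LSTM layers -> 1-unit dense output."""
    total = lstm_layer_params(n_features, n_units)                  # input layer
    total += n_hidden_layers * lstm_layer_params(n_units, n_units)  # hidden layers
    total += n_units + 1                                            # dense weights + bias
    return total


# Assumed values (not given in the text): 9 input features, 50 units per layer.
for n_hidden in range(1, 6):
    count = stacked_lstm_params(n_features=9, n_units=50, n_hidden_layers=n_hidden)
    print(n_hidden, count)
```

Under these assumed values, the sketch gives 32,251 parameters for one hidden layer and 113,051 for five, with each additional hidden layer contributing 20,200 parameters, which matches the counts reported in Table 4.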

3.3. Optimal Time Step

We used the LSTM model for time series prediction, which requires the feature data to be organized into windows whose length is given by a time step. The time step is a very important parameter for LSTM models because it determines the size of each input sample and the amount of data required for the model during training, validation, and testing. The size of the time step directly affects the performance of model training and prediction; thus, we needed to set a reasonable value for this parameter to ensure good model performance.
In our experiments, we used the grid search method over the time steps T ∈ {1, 20, 40, 60, 80, 100} [49,50,51]. We used the RMSE, MAE, and R2 as the evaluation indexes of the model to select the optimal time step. The experimental results are shown in Table 5.
The values of the performance metrics of the LSTM model when the time step is set to T ∈ {1, 20, 40, 60, 80, 100} are shown in Table 5. Comparing T = 1 with T = 20, it can be seen that the model performs better with T = 20. The reason for this is that a time step of T = 1 produces samples containing a single feature data point, which is not a time series over a time span and cannot express the relationship between continuous feature data points.
When the time step is T ∈ {20, 40, 60}, it can be seen that the MAE of the LSTM model gradually decreases (from 83.187 to 81.800 μg·m−3), whereas the RMSE and R2 change only slightly. The LSTM model with T = 60 has the lowest MAE because the feature data produced by this time step can better represent the relationship between continuous feature data points.
When the time step is T ∈ {60, 80, 100}, it can be seen that the overall prediction error of the LSTM model increases and the performance decreases as the time step grows. Further, a larger time step leaves fewer data points for training; thus, we believe that the model is not sufficiently trained, which is consistent with the conclusion reached in [51].
In summary, the LSTM model has the lowest MAE, with an RMSE and R2 close to the best values, when the time step is T = 60. Therefore, in this paper, the time step of the model was chosen as T = 60, and subsequent experiments were conducted on this basis.
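The effect of the time step on the training data can be illustrated with a sliding-window split. The following is a minimal sketch for a single-variable series, assuming one-step-ahead prediction; the function name `make_windows` and the toy readings are hypothetical, and the multi-feature inputs used in this paper would add a feature dimension to each window.

```python
def make_windows(series, time_step):
    """Split a sequence into (window, next value) pairs.

    Each sample holds `time_step` consecutive observations; the target is the
    observation that immediately follows the window.
    """
    if time_step < 1:
        raise ValueError("time_step must be at least 1")
    windows, targets = [], []
    for start in range(len(series) - time_step):
        windows.append(list(series[start:start + time_step]))
        targets.append(series[start + time_step])
    return windows, targets


# Illustrative CO2 readings (toy values for demonstration only).
readings = [1300, 1285, 1315, 1330, 1420, 1425]

for T in (1, 3):
    X, y = make_windows(readings, T)
    print(f"T={T}: {len(X)} samples, first window={X[0]}, first target={y[0]}")
```

A series of length n yields n − T samples, so a larger time step gives each sample more temporal context but leaves fewer samples for training, which is the trade-off described above.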

3.4. Feature Importance Ranking and Filtering

We collected a total of nine categories of environmental quality parameters in the sheep barn using the IoT. CO2 mass concentration is influenced by a variety of parameters, some of which are strongly correlated with it; we refer to these parameters as important features.
To filter the important features, we used the RF algorithm to compute importance scores for the eight other parameters and ranked them in descending order of importance: light intensity, air relative humidity, air temperature, PM2.5 mass concentration, PM10 mass concentration, noise, TSP mass concentration, and H2S mass concentration; the scores are shown in Table 6.
To verify the effectiveness of the RF ranking, we input different numbers of parameters into the model, adding them in order of importance, and obtained the MAE variation curve shown in Figure 5.
As seen in Figure 5, with one feature parameter, the input dimension of the model is the smallest, but the model fits poorly and the MAE is the largest. With three feature parameters, the input dimension is still small, but the model does not yet fit the data fully; the fitting effect is average, and the MAE can be reduced further.
With four feature parameters, the MAE of the model is further reduced, the model fits better, and the input dimension is smaller. With five to eight feature parameters, the MAE is not much different from that of the model with four feature parameters, and the fitting effect is similar, but the input dimension increases.
In summary, with four feature parameters, the model fits the data well, the MAE is relatively low, and the input dimension remains reasonably small.
In order to reduce the input dimension, optimize the network structure, and reduce the computational complexity of the model, we selected the top four parameters (light intensity, air relative humidity, air temperature, and PM2.5 mass concentration) as the prediction model inputs.
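The selection step can be sketched from the importance scores alone. The following minimal example uses the scores reported in Table 6; the helper names `top_k_features` and `cumulative_share` are hypothetical, and the sketch does not reproduce the RF training that produced the scores.

```python
# Importance scores as reported in Table 6.
importance = {
    "light intensity": 0.750228,
    "air relative humidity": 0.114946,
    "air temperature": 0.056363,
    "PM2.5 mass concentration": 0.027768,
    "PM10 mass concentration": 0.018287,
    "noise": 0.013143,
    "TSP mass concentration": 0.011485,
    "H2S mass concentration": 0.007780,
}


def top_k_features(scores, k):
    """Return the k (name, score) pairs with the highest importance scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def cumulative_share(scores, k):
    """Fraction of the total importance carried by the top-k features."""
    total = sum(scores.values())
    return sum(score for _, score in top_k_features(scores, k)) / total


selected = [name for name, _ in top_k_features(importance, 4)]
print(selected)
print(round(cumulative_share(importance, 4), 6))
```

With the scores in Table 6, the top four features account for about 94.9% of the total importance, which is consistent with the small change in MAE when the remaining four features are added.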

3.5. PSO Results for Hyperparameter Search

After determining the LSTM model structure, optimal time step, and important features, we used the PSO algorithm to search for the optimal number of neurons, dropout probability, and batch size for the LSTM model. The search yielded 64 neurons in the input layer, 128 neurons in hidden layer 1, 32 neurons in hidden layer 2, a dropout probability of 0.1, and a batch size of 32. We then trained the LSTM model with these hyperparameters; the changes in the training loss are shown in Figure 6.
Figure 6 shows that the initial values for the training loss and validation loss of the model were 0.0213 and 0.0057, respectively. Although the initial loss of the model was high, the value rapidly decreased as the training proceeded, because the model updates the weights during the backpropagation process, gradually improving the fitting ability.
When the network training exceeded 260 epochs, the training loss value gradually stabilized between 0.0008 and 0.0009, and the validation loss value gradually stabilized between 0.0006 and 0.0007. The convergence of the loss values of the overall model shows only slight oscillations, indicating the completion of network model training.
The final RF-PSO-LSTM model in this paper was obtained after the model training was completed; Figure 7 shows the prediction effect of the model for CO2 mass concentration in a sheep barn. As can be seen in Figure 7a, the overall trend of our model’s prediction of CO2 mass concentration was similar to the actual CO2 mass concentration. Our model predicted the peak at the same point at which the sheep house CO2 mass concentration reached the peak. This demonstrates the ability of our proposed model to act as an early warning when the CO2 mass concentration in a sheep barn reaches a certain level, safeguarding the welfare of meat sheep to some extent.
Figure 7b further shows the difference between the CO2 mass concentration predicted by our model and the actual CO2 mass concentration in the sheep barn. It can be seen that although the predicted concentration is very close to the actual concentration, the predicted curve is not as smooth as the measured one. We attribute this to the fact that the environmental parameters of the sheep barn change relatively slowly and interact slowly, so the actual CO2 mass concentration in the sheep barn changes relatively smoothly.
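The hyperparameter search described at the start of this section can be sketched with a basic PSO loop. The following is a minimal illustration, not the implementation used in this paper: the defaults for w, c1, c2, r1, r2, the number of particles, and the number of iterations follow Table 3, whereas the toy objective, the bounds, the zero initial velocities, and the clipping of positions to the bounds are assumptions made for the example. Standard PSO usually draws r1 and r2 at random in [0, 1] on every update; the fixed values here simply mirror Table 3. In the real search, the objective would be the validation error of an LSTM trained with the candidate hyperparameters.

```python
import random


def pso(objective, bounds, n_particles=50, n_iter=100,
        w=0.5, c1=1.3, c2=1.4, r1=0.6, r2=0.8, seed=0):
    """Minimize `objective` over a box given by `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]          # personal best values
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Velocity update: inertia + cognitive term + social term.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_objective(position):
    """Hypothetical stand-in for the validation error of a trained model."""
    target = (0.1, 0.5, 0.3)
    return sum((x - t) ** 2 for x, t in zip(position, target))


best_position, best_value = pso(toy_objective, bounds=[(0.0, 1.0)] * 3)
print(best_position, best_value)
```

Integer-valued hyperparameters such as the number of neurons or the batch size would need to be rounded or mapped to a discrete set before each evaluation; how this was handled is not described in this section.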

3.6. Comparative Analysis of Hyperparameter Predictions

In order to verify the effectiveness of the hyperparameter search results of the PSO algorithm for the LSTM model, we set different hyperparameters for the LSTM model for comparison tests. The model evaluation metrics were the RMSE, MAE, and R2, and the experimental results are shown in Table 7.
To determine the effectiveness of the PSO algorithm for the batch size hyperparameter search of the LSTM model, we changed only the batch size of the model, setting the RF-LSTM_1 model to a batch size of 64 and the RF-LSTM_2 model to a batch size of 128. It can be seen from Table 7 that the RF-PSO-LSTM model with a batch size of 32 has the lowest RMSE and MAE values, indicating that this model has the least error in predicting the CO2 mass concentration in the sheep shed.
To determine the effectiveness of the PSO algorithm for the dropout hyperparameter search of the LSTM model, we changed only the dropout value, setting the RF-LSTM_3 model to a dropout value of 0.2 and the RF-LSTM_4 model to a dropout value of 0.3. Table 7 shows that as the dropout value increased, the prediction error grew: the RF-LSTM_4 model had a larger error than the RF-LSTM_3 model in predicting the CO2 mass concentration in the sheep shed. The RF-PSO-LSTM model with a dropout value of 0.1 predicted the CO2 mass concentration with the smallest error.
To determine the effectiveness of the PSO algorithm for the neuron-count hyperparameter search of the LSTM model, we changed only the number of neurons in the input layer, hidden layer 1, or hidden layer 2 of the model. Specifically, we set up the RF-LSTM_5–11 models for comparison with the RF-PSO-LSTM model proposed in this paper. The test indicators in Table 7 show that neither more nor fewer neurons necessarily improves performance. The number of neurons in each layer of the model needs to be reasonably configured to maximize the performance of the model.

3.7. Comparative Analysis of Model Predictions

In order to verify the difference between our proposed RF-PSO-LSTM model and other models in predicting the CO2 mass concentration in sheep sheds, we used the gradient boosting regression tree (GBRT) algorithm, the light gradient-boosting machine (LightGBM) algorithm, the support vector regression (SVR) algorithm, and the random forest regression (RFR) algorithm models for the experimental analysis. The results are shown in Table 8.
The RFR, SVR, GBRT, and LightGBM models were obtained by training with all the features. The RF-RFR, RF-SVR, RF-GBRT, and RF-LightGBM models were obtained by training with the four features filtered by the RF algorithm.
It can be seen from Table 8 that compared with that of the RFR model, the RMSE of the RF-RFR model increased by 4.471 μg·m−3, the MAE decreased by 6.861 μg·m−3, and the R2 decreased by 0.002. Compared with the SVR model, the RMSE of the RF-SVR model decreased by 43.488 μg·m−3, the MAE decreased by 43.174 μg·m−3, and the R2 increased by 0.065. Compared with the GBRT model, the RMSE of the RF-GBRT model decreased by 4.475 μg·m−3, the MAE decreased by 2.176 μg·m−3, and the R2 increased by 0.004. Compared with the LightGBM model, the RMSE of the RF-LightGBM model decreased by 8.332 μg·m−3, the MAE decreased by 7.127 μg·m−3, and the R2 increased by 0.006.
We found that the models that first use the RF algorithm to filter the important features and then train on the filtered features have a lower MAE than the corresponding models trained on all features, and a higher R2 in three of the four cases (the exception being RFR). A low RMSE and MAE indicate a small error in predicting CO2 mass concentration, and a high R2 indicates high reliability in predicting CO2 mass concentration.
Among the compared models trained on the filtered features, the RF-RFR model performed best, with an RMSE of 220.844 μg·m−3, an MAE of 138.994 μg·m−3, and an R2 of 0.937 for the CO2 mass concentration in sheep sheds.
The differences between our proposed model and the RF-RFR model can be seen in Table 8. Specifically, compared with RF-RFR, the RMSE of our model decreased by 145.422 μg·m−3, the MAE decreased by 87.155 μg·m−3, and the R2 increased by 0.055. Our proposed model had a better performance than the other models in predicting the CO2 mass concentration in sheep barns.
In summary, the RF-PSO-LSTM prediction model has a higher accuracy and a better fit, which are beneficial for single time series prediction with better real-time performance. Our model can be used for predicting sheep barn CO2 mass concentrations at large-scale meat sheep farms, providing a strong decision basis for early warning while improving the welfare of sheep.
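For reference, the three metrics used in the comparisons above can be computed as follows. This is a minimal sketch using the standard definitions of RMSE, MAE, and the coefficient of determination; it assumes these are the definitions used in this paper, and the sample values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted values for demonstration only.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(rmse(measured, predicted), mae(measured, predicted), r2(measured, predicted))
```

The RMSE penalizes large individual errors more strongly than the MAE, which is why the two metrics can rank models differently, as they do for RFR and RF-RFR in Table 8.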

4. Conclusions

In precision animal husbandry, the accurate prediction and early warning of CO2 mass concentrations in large-scale sheep barns is an important research topic. The research work in this paper provides a reference for such predictions and early warnings, and the following conclusions were obtained:
(1) The RF algorithm was able to filter out the important features affecting the prediction of CO2 mass concentration in sheep barns and remove features of lower importance, reducing the input to the model and the complexity of the data. The experimental results show that training the model using the filtered important features can improve the prediction performance.
(2) We used the PSO algorithm to find the optimal number of neurons, dropout value, and batch size hyperparameters of the LSTM model and obtain the optimal combination of hyperparameters, avoiding the disadvantages of manual selection of hyperparameters.
(3) The experimental results show that our proposed RF-PSO-LSTM model could effectively predict the trend of CO2 mass concentration in sheep sheds with a higher accuracy than typical prediction models such as RFR, SVR, GBRT, and LightGBM. The prediction results of our model can provide important support for improving the growing environment of meat sheep, which is conducive to improving the welfare of the sheep.
In short, we hope that our model can provide some help and reference for air quality improvements and the prediction of CO2 mass concentration in sheep barns at large-scale meat sheep farms.

Author Contributions

Conceptualization, H.C., L.Y., J.L. and S.L.; data curation, H.C., L.Y. and Y.P.; formal analysis, H.C. and L.Y.; funding acquisition, H.C. and J.N.; investigation, H.C., L.Y., Y.P., J.G. (Jianbing Ge) and Z.L.; methodology, H.C., L.Y. and Q.C.; project administration, H.C., J.N. and J.G. (Jianjun Guo); resources, H.C. and L.Y.; software, H.C., L.Y., K.W., S.Y. and H.Z.; supervision, J.L. and S.L.; validation, H.C., L.Y., J.L., S.L., J.N., J.G. (Jianbing Ge) and J.G. (Jianjun Guo); visualization, H.C., L.Y. and Y.P.; writing—original draft, H.C. and J.L.; writing—review and editing, H.C. and J.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the Shihezi University Innovation and Development Special Project (recipient: Jing Nie, grant no. CXFZ202103), the Shihezi University Achievement Transformation and Technology Promotion Project (recipient: Honglei Cen, grant no. CGZH202103), Post Expert Task of Meat and Sheep System in Agricultural Area of Autonomous Region (recipient: Jie Zhang, grant no. XJNQRY-G-2107), the National Natural Science Foundation of China (recipient: Shuangyin Liu, grant no. 61871475), the Guangzhou Key Research and Development Project (recipient: Shuangyin Liu, grant nos. 202103000033 and 201903010043), the Innovation Team Project of Universities in Guangdong Province (recipient: Jianjun Guo, grant no. 2021KCXTD019), the Guangdong Province Graduate Education Innovation Program Project (recipient: Jianjun Guo, grant nos. 2022XSLT056 and 2022JGXM115), and the Characteristic Innovation Project of Universities in Guangdong Province (recipient: Shuangyin Liu, grant no. KA190578826).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available because they are part of an ongoing study.

Acknowledgments

The authors would like to thank their schools and colleges, as well as the funders of the project. All support and assistance are sincerely appreciated. Additionally, we would like to thank the editor and reviewers of the present paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ma, T.; Deng, K.-D.; Tu, Y.; Zhang, N.-F.; Zhao, Q.-N.; Li, C.-Q.; Jin, H.; Diao, Q.-Y. Recent advances in nutrient requirements of meat-type sheep in China: A review. J. Integr. Agric. 2022, 21, 1–14.
2. Mao, L.; Li, W.; Hao, F.; Yang, L.; Li, J.; Sun, M.; Zhang, W.; Liu, M.; Luo, X.; Cheng, Z. Research Progress on Emerging Viral Pathogens of Small Ruminants in China during the Last Decade. Viruses 2022, 14, 1288.
3. Alday, J.G.; O’Reilly, J.; Rose, R.J.; Marrs, R.H. Long-term effects of sheep-grazing and its removal on vegetation dynamics of British upland grasslands and moorlands; local management cannot overcome large-scale trends. Ecol. Indic. 2022, 139, 108878.
4. Jørgensen, N.; Steinheim, G.; Holand, Ø. Does Scale Matter? Variation in Area Use Across Spatiotemporal Scales of Two Sheep Breeds in Two Contrasting Alpine Environments. Rangel. Ecol. Manag. 2018, 71, 189–195.
5. Wang, L.; Zhang, M.; Li, Y.; Xia, J.; Ma, R. Wearable multi-sensor enabled decision support system for environmental comfort evaluation of mutton sheep farming. Comput. Electron. Agric. 2021, 187, 106302.
6. Zhang, M.; Wang, X.; Feng, H.; Huang, Q.; Xiao, X.; Zhang, X. Wearable Internet of Things enabled precision livestock farming in smart farms: A review of technical solutions for precise perception, biocompatibility, and sustainability monitoring. J. Clean. Prod. 2021, 312, 127712.
7. Stubsjøen, S.; Moe, R.; Mejdell, C.; Tømmerberg, V.; Knappe-Poindecker, M.; Kampen, A.; Granquist, E.; Muri, K. Sheep welfare in different housing systems in South Norway. Small Rumin. Res. 2022, 214, 106740.
8. Zhao, X.; Shi, L.; Lou, S.; Ning, J.; Guo, Y.; Jia, Q.; Hou, F. Sheep Excrement Increases Mass of Greenhouse Gases Emissions from Soil Growing Two Forage Crop and Multi-Cutting Reduces Intensity. Agriculture 2021, 11, 238.
9. Elghandour, M.M.; Antolin-Cera, X.; Salem, A.Z.; Barbabosa-Pliego, A.; Valladares-Carranza, B.; Ugbogu, E.A. Influence of Escherichia coli inclusion and soybean hulls based diets on ruminal biomethane and carbon dioxide productions in sheep. J. Clean. Prod. 2018, 192, 766–774.
10. Pedersen, S.; Blanes-Vidal, V.; Joergensen, H.; Chwalibog, A.; Haeussermann, A.; Heetkamp, M.J.W.; Aarnink, A.J.A. Carbon Dioxide Production in Animal Houses: A Literature Review. Agric. Eng. Int. 2008, 10.
11. Moehn, S.; Bertolo, R.F.P.; Pencharz, P.B.; Ball, R.O. Pattern of carbon dioxide production and retention is similar in adult pigs when fed hourly, but not when fed a single meal. BMC Physiol. 2004, 4, 11.
12. Steiner, A.R.; Flammer, S.A.; Beausoleil, N.J.; Berg, C.; Bettschart-Wolfensberger, R.; Pinillos, R.G.; Golledge, H.D.; Marahrens, M.; Meyer, R.; Schnitzer, T.; et al. Humanely Ending the Life of Animals: Research Priorities to Identify Alternatives to Carbon Dioxide. Animals 2019, 9, 911.
13. Sindhøj, E.; Lindahl, C.; Bark, L. Review: Potential alternatives to high-concentration carbon dioxide stunning of pigs at slaughter. Animal 2021, 15, 100164.
14. Yeo, U.-H.; Jo, S.-K.; Kim, S.-H.; Park, D.-H.; Jeong, D.-Y.; Park, S.-J.; Shin, H.; Kim, R.-W. Applicability of Machine-Learned Regression Models to Estimate Internal Air Temperature and CO2 Concentration of a Pig House. Agronomy 2023, 13, 328.
15. Li, Y.; Li, S.; Sun, X.; Hao, D. Prediction of carbon dioxide production from green waste composting and identification of critical factors using machine learning algorithms. Bioresour. Technol. 2022, 360, 127587.
16. Zhao, J.; Kou, L.; Jiang, Z.; Lu, N.; Wang, B.; Li, Q. A novel evaluation model for carbon dioxide emission in the slurry shield tunnelling. Tunn. Undergr. Space Technol. 2022, 130, 104757.
17. Javanmard, M.E.; Ghaderi, S.; Hoseinzadeh, M. Data mining with 12 machine learning algorithms for predict costs and carbon dioxide emission in integrated energy-water optimization model in buildings. Energy Convers. Manag. 2021, 238, 114153.
18. Qin, J.; Gong, N. The estimation of the carbon dioxide emission and driving factors in China based on machine learning methods. Sustain. Prod. Consum. 2022, 33, 218–229.
19. Bhatt, H.; Davawala, M.; Joshi, T.; Shah, M.; Unnarkat, A. Forecasting and mitigation of global environmental carbon dioxide emission using machine learning techniques. Clean. Chem. Eng. 2023, 5, 100095.
20. Abbasi, N.A.; Hamrani, A.; Madramootoo, C.A.; Zhang, T.; Tan, C.S.; Goyal, M.K. Modelling carbon dioxide emissions under a maize-soy rotation using machine learning. Biosyst. Eng. 2021, 212, 1–18.
21. Kshirsagar, P.R.; Manoharan, H.; Selvarajan, S.; Althubiti, S.A.; Alenezi, F.; Srivastava, G.; Lin, J.C.-W. A Radical Safety Measure for Identifying Environmental Changes Using Machine Learning Algorithms. Electronics 2022, 11, 1950.
22. Hien, N.L.H.; Kor, A.-L. Analysis and Prediction Model of Fuel Consumption and Carbon Dioxide Emissions of Light-Duty Vehicles. Appl. Sci. 2022, 12, 803.
23. Tena-Gago, D.; Golcarenarenji, G.; Martinez-Alpiste, I.; Wang, Q.; Alcaraz-Calero, J.M. Machine-Learning-Based Carbon Dioxide Concentration Prediction for Hybrid Vehicles. Sensors 2023, 23, 1350.
24. Liu, X.; Guo, H. Air quality indicators and AQI prediction coupling long-short term memory (LSTM) and sparrow search algorithm (SSA): A case study of Shanghai. Atmos. Pollut. Res. 2022, 13, 101551.
25. Wang, X.; Yan, C.; Liu, W.; Liu, X. Research on Carbon Emissions Prediction Model of Thermal Power Plant Based on SSA-LSTM Algorithm with Boiler Feed Water Influencing Factors. Sustainability 2022, 14, 15988.
26. Nie, J.; Wang, Y.; Li, Y.; Chao, X. Artificial intelligence and digital twins in sustainable agriculture and forestry: A survey. Turk. J. Agric. For. 2022, 46, 642–661.
27. Nie, J.; Wang, Y.; Li, Y.; Chao, X. Sustainable computing in smart agriculture: Survey and challenges. Turk. J. Agric. For. 2022, 46, 550–566.
28. Li, Y.; Chao, X. ANN-Based Continual Classification in Agriculture. Agriculture 2020, 10, 178.
29. Zhao, H.; Li, J.; Nie, J.; Ge, J.; Yang, S.; Yu, L.; Pu, Y.; Wang, K. Identification Method for Cone Yarn Based on the Improved Faster R-CNN Model. Processes 2022, 10, 634.
30. Wang, N.; Nie, J.; Li, J.; Wang, K.; Ling, S. A compression strategy to accelerate LSTM meta-learning on FPGA. ICT Express 2022, 8, 322–327.
31. Nie, J.; Wang, N.; Li, J.; Wang, K.; Wang, H. Meta-learning prediction of physical and chemical properties of magnetized water and fertilizer based on LSTM. Plant Methods 2021, 17, 119.
32. Nie, J.; Wang, N.; Li, J.; Wang, Y.; Wang, K. Prediction of Liquid Magnetization Series Data in Agriculture Based on Enhanced CGAN. Front. Plant Sci. 2022, 13, 1883.
33. Li, Y.; Nie, J.; Chao, X. Do we really need deep CNN for plant diseases identification? Comput. Electron. Agric. 2020, 178, 105803.
34. Li, Y.; Yang, J. Few-shot cotton pest recognition and terminal realization. Comput. Electron. Agric. 2020, 169, 105240.
35. Yu, L.; Pu, Y.; Cen, H.; Li, J.; Liu, S.; Nie, J.; Ge, J.; Lv, L.; Li, Y.; Xu, Y.; et al. A Lightweight Neural Network-Based Method for Detecting Estrus Behavior in Ewes. Agriculture 2022, 12, 1207.
36. Yin, X.; Wu, D.; Shang, Y.; Jiang, B.; Song, H. Using an EfficientNet-LSTM for the recognition of single Cow’s motion behaviours in a complicated environment. Comput. Electron. Agric. 2020, 177, 105707.
37. Yu, L.; Guo, J.; Pu, Y.; Cen, H.; Li, J.; Liu, S.; Nie, J.; Ge, J.; Yang, S.; Zhao, H.; et al. A Recognition Method of Ewe Estrus Crawling Behavior Based on Multi-Target Detection Layer Neural Network. Animals 2023, 13, 413.
38. Wang, Z.; Liu, N.; Chen, C.; Guo, Y. Adaptive self-attention LSTM for RUL prediction of lithium-ion batteries. Inf. Sci. 2023, 635, 398–413.
39. Lin, H.; Zhang, S.; Li, Q.; Li, Y.; Li, J.; Yang, Y. A new method for heart rate prediction based on LSTM-BiLSTM-Att. Measurement 2023, 207, 112384.
40. Wu, Z.; Yin, H.; He, H.; Li, Y. Dynamic-LSTM hybrid models to improve seasonal drought predictions over China. J. Hydrol. 2022, 615, 128706.
41. Zhang, W.; Zhou, H.; Bao, X.; Cui, H. Outlet water temperature prediction of energy pile based on spatial-temporal feature extraction through CNN–LSTM hybrid model. Energy 2023, 264, 126190.
42. Wang, J.; Si, H.; Gao, Z.; Shi, L. Winter Wheat Yield Prediction Using an LSTM Model from MODIS LAI Products. Agriculture 2022, 12, 1707.
43. Di Già, S.; Papurello, D. Hybrid Models for Indoor Temperature Prediction Using Long Short Term Memory Networks—Case Study Energy Center. Buildings 2022, 12, 933.
44. Wang, Y.; Watanabe, D.; Hirata, E.; Toriumi, S. Real-Time Management of Vessel Carbon Dioxide Emissions Based on Automatic Identification System Database Using Deep Learning. J. Mar. Sci. Eng. 2021, 9, 871.
45. Rezaei, R.; Naderalvojoud, B.; Güllü, G. A Comparative Study of Deep Learning Models on Tropospheric Ozone Forecasting Using Feature Engineering Approach. Atmosphere 2023, 14, 239.
46. Genuer, R.; Poggi, J.; Tuleau-Malot, C. Variable selection using random forests. Pattern Recognit. Lett. 2010, 31, 2225–2236.
47. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
48. Eberhart, R.; Kennedy, J. A New Optimizer Using Particle Swarm Theory. In MHS’95, Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; IEEE: New York, NY, USA, 1995; pp. 39–43.
49. Li, Y.; Zhu, Z.; Kong, D.; Han, H.; Zhao, Y. EA-LSTM: Evolutionary attention-based LSTM for time series prediction. Knowl.-Based Syst. 2019, 181, 104785.
50. Liu, Y.; Wang, Y.; Yang, X.; Zhang, L. Short-term travel time prediction by deep learning: A comparison of different LSTM-DNN models. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Yokohama, Japan, 16–19 October 2017; pp. 1–8.
51. Yin, H.; Jin, D.; Gu, Y.; Park, C.-J.; Han, S.; Yoo, S. STL-ATTLSTM: Vegetable Price Forecasting Using STL and Attention Mechanism-Based LSTM. Agriculture 2020, 10, 612.
Figure 1. Schematic diagram of sensor installation position.
Figure 2. Schematic diagram of LSTM network structure.
Figure 3. Flowchart of PSO algorithm.
Figure 4. Flowchart of RF-PSO-LSTM prediction model.
Figure 5. Average absolute error curve based on number of features.
Figure 6. Change of loss value during training.
Figure 7. Prediction effect of RF-PSO-LSTM model: (a) overall forecast trend; (b) model prediction.
Table 1. Technical parameters of sensors.
| Testing Index | Measurement Range | Accuracy | Communication Protocol |
|---|---|---|---|
| Light intensity (lx) | 0~65,535 | ±5 | IIC |
| Air temperature (°C) | −40~105 | ±0.4 | IIC |
| Air relative humidity (%) | 0~100 | ±5 | IIC |
| Noise (dB) | 30~120 | ±5 | IIC |
| PM2.5 mass concentration (μg·m−3) | 0~999.9 | ±7% | Modbus |
| PM10 mass concentration (μg·m−3) | 0~999.9 | ±7% | Modbus |
| CO2 mass concentration (μg·m−3) | 0~50,000 | ±50 | PWM |
| TSP mass concentration (μg·m−3) | 0~999.9 | ±7% | PWM |
| H2S mass concentration (μg·m−3) | 0~10 | ±3% | PWM |
Table 2. Selected raw data of ambient air quality of sheep barn.
| Testing Index | 11 February 2021 10:12:18 | 11 February 2021 10:22:10 | 11 February 2021 10:32:09 | 11 February 2021 10:42:10 | 11 February 2021 10:52:23 | 11 February 2021 11:02:16 |
|---|---|---|---|---|---|---|
| Light intensity (lx) | 24 | 30 | 39 | 42 | 97 | 122 |
| Air temperature (°C) | 1.5 | 1.5 | 1.5 | 1.6 | 1.6 | 1.7 |
| Air relative humidity (%) | 85.7 | 86.1 | 86.4 | 86.6 | 86.8 | 87.1 |
| Noise (dB) | 32 | 80.8 | 59.4 | 69.8 | 32 | 45.3 |
| PM2.5 mass concentration (μg·m−3) | 13.4 | 14.2 | 12.4 | 12.9 | 12.3 | 11.9 |
| PM10 mass concentration (μg·m−3) | 48.2 | 38.2 | 42.1 | 36.4 | 24.9 | 34.1 |
| CO2 mass concentration (μg·m−3) | 1300 | 1285 | 1315 | 1330 | 1420 | 1425 |
| TSP mass concentration (μg·m−3) | 76.3 | 65 | 67.5 | 61.1 | 46.2 | 57 |
| H2S mass concentration (μg·m−3) | 8.4 | 8.4 | 8.2 | 8.4 | 8.4 | 8.4 |
Table 3. PSO algorithm parameter settings.
| Parameter | Value |
|---|---|
| Inertia weighting factor w | 0.5 |
| Learning factor c1 | 1.3 |
| Learning factor c2 | 1.4 |
| Search space dimension D | 3 |
| r1 | 0.6 |
| r2 | 0.8 |
| Number of particles N | 50 |
| Number of iterations | 100 |
Table 4. Indicators for number of hidden layers in the model.
| Number of Hidden Layers | RMSE (μg·m−3) | MAE (μg·m−3) | R2 | Model Parameters |
|---|---|---|---|---|
| 1 | 123.959 | 95.315 | 0.978 | 32,251 |
| 2 | 108.177 | 83.187 | 0.984 | 52,451 |
| 3 | 127.123 | 97.337 | 0.975 | 72,651 |
| 4 | 143.066 | 109.88 | 0.972 | 92,851 |
| 5 | 165.080 | 125.849 | 0.959 | 113,051 |
Table 5. Indicators at different time steps.
| Time Step | RMSE (μg·m−3) | MAE (μg·m−3) | R2 |
|---|---|---|---|
| 1 | 108.217 | 85.161 | 0.981 |
| 20 | 108.177 | 83.187 | 0.984 |
| 40 | 111.135 | 82.41 | 0.983 |
| 60 | 109.586 | 81.8 | 0.982 |
| 80 | 119.726 | 87.69 | 0.979 |
| 100 | 123.212 | 90.919 | 0.978 |
Table 6. Eight parameter features ranked by importance.
| Order of Importance | Parameter | Importance Score |
|---|---|---|
| 1 | Light intensity (lx) | 0.750228 |
| 2 | Air relative humidity (%) | 0.114946 |
| 3 | Air temperature (°C) | 0.056363 |
| 4 | PM2.5 mass concentration (μg·m−3) | 0.027768 |
| 5 | PM10 mass concentration (μg·m−3) | 0.018287 |
| 6 | Noise (dB) | 0.013143 |
| 7 | TSP mass concentration (μg·m−3) | 0.011485 |
| 8 | H2S mass concentration (μg·m−3) | 0.007780 |
Table 7. Prediction results of LSTM model with different hyperparameters.
| Model Name | Number of Neurons in Input Layer | Number of Neurons in Hidden Layer 1 | Number of Neurons in Hidden Layer 2 | Dropout | Batch Size | RMSE (μg·m−3) | MAE (μg·m−3) | R2 |
|---|---|---|---|---|---|---|---|---|
| RF-PSO-LSTM | 64 | 128 | 32 | 0.1 | 32 | 75.422 | 51.839 | 0.992 |
| RF-LSTM_1 | 64 | 128 | 32 | 0.1 | 64 | 79.321 | 54.592 | 0.991 |
| RF-LSTM_2 | 64 | 128 | 32 | 0.1 | 128 | 79.065 | 57.540 | 0.991 |
| RF-LSTM_3 | 64 | 128 | 32 | 0.2 | 32 | 78.503 | 54.267 | 0.991 |
| RF-LSTM_4 | 64 | 128 | 32 | 0.3 | 32 | 78.864 | 55.039 | 0.991 |
| RF-LSTM_5 | 64 | 128 | 64 | 0.1 | 32 | 78.230 | 54.361 | 0.991 |
| RF-LSTM_6 | 64 | 128 | 128 | 0.1 | 32 | 77.077 | 52.956 | 0.991 |
| RF-LSTM_7 | 64 | 64 | 32 | 0.1 | 32 | 78.250 | 54.089 | 0.991 |
| RF-LSTM_8 | 64 | 256 | 32 | 0.1 | 32 | 77.180 | 53.072 | 0.991 |
| RF-LSTM_9 | 32 | 128 | 32 | 0.1 | 32 | 78.714 | 53.785 | 0.991 |
| RF-LSTM_10 | 128 | 128 | 32 | 0.1 | 32 | 79.167 | 54.053 | 0.991 |
| RF-LSTM_11 | 256 | 128 | 32 | 0.1 | 32 | 77.710 | 52.995 | 0.991 |
Table 8. Comparison of prediction performance of different models.
| Model | RMSE (μg·m−3) | MAE (μg·m−3) | R2 |
|---|---|---|---|
| RFR | 216.373 | 145.855 | 0.939 |
| SVR | 589.336 | 484.475 | 0.545 |
| GBRT | 285.102 | 213.499 | 0.895 |
| LightGBM | 288.001 | 209.788 | 0.891 |
| RF-RFR | 220.844 | 138.994 | 0.937 |
| RF-SVR | 545.848 | 441.301 | 0.610 |
| RF-GBRT | 280.627 | 211.323 | 0.899 |
| RF-LightGBM | 279.669 | 202.661 | 0.897 |
| RF-PSO-LSTM | 75.422 | 51.839 | 0.992 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
