1. Introduction
Poultry are highly sensitive to environmental changes and adapt poorly to adverse conditions [1]. Environmental control is a critical part of intensive poultry farming [2]. In intensive systems, high stocking density leads to problems such as temperature and humidity imbalances in the rearing environment and excessive pollutant emissions, which negatively affect poultry health and production [3,4]. Relative humidity (RH) is one of the key indicators of the poultry-rearing environment, with an optimal range of approximately 50% to 80%. Low RH increases the risk of virus transmission, while high RH accelerates feed mold growth, reduces the evaporative cooling rate of poultry, increases heat stress, and impairs physiological function and egg production performance [5,6,7]. In addition, RH and temperature inside the poultry house are coupled [8], and changes in one affect the other. Compared with temperature, however, RH fluctuates over a wider range and a shorter period, making it more unstable [9]. Therefore, continuous monitoring of RH in the poultry house, combined with the necessary regulation and intervention, is beneficial for ensuring poultry health and improving laying rates [10].
In recent years, various machine learning (ML) and deep learning (DL) time-series prediction models have been applied to environmental monitoring in livestock and poultry breeding. Arulmozhi et al. [11] compared the performance of different ML models in predicting pigsty humidity and found that random forest regression (RFR) performed best. Liu et al. [12] used extreme gradient boosting (XGBoost) to predict and regulate odor concentration in chicken coops, helping maintain a clean environment. Lee et al. [13] utilized a recurrent neural network (RNN) to predict and control the temperature and RH of duck houses. Wang et al. [14] proposed a pigsty ammonia concentration prediction model based on a convolutional neural network (CNN) and a gated recurrent unit (GRU), which can promptly capture the trend of ammonia concentration changes. Environmental data collected by sensors are complex, nonlinear, and affected by irregular noise, so hybrid prediction methods can achieve better predictive performance [15]. Existing hybrid methods mainly involve feature selection, data denoising or decomposition, and the selection and optimization of prediction models. Shen et al. [16] employed empirical mode decomposition (EMD) to decompose environmental parameters and an Elman neural network to predict ammonia concentration in pigsties; data decomposition simplifies a complex time series. Song et al. [17] employed kernel principal component analysis (KPCA) to extract the main component information from multiple environmental factors and established a QPSO-RBF combination prediction algorithm to predict ammonia concentration levels in cowsheds. Yin et al. [18] employed LightGBM and recursive feature elimination (RFE) to screen out environmental factors highly correlated with carbon dioxide in sheep houses and established an SSA-ELM model to predict carbon dioxide concentration. Feature selection reduces model training time, while optimization algorithms reduce the time required to determine the prediction model's initialization parameters. Huang et al. [19] used the wavelet transform (WT) to remove noise from environmental data and a temporal convolutional network (TCN) to predict the pollution index in waterfowl breeding farms, effectively improving data quality.
Although a certain research foundation exists for environmental prediction in animal husbandry, it is insufficient for predicting RH within poultry houses. Current research mainly focuses on one-step-ahead prediction, which estimates the next value from partial past observations [20]. Poultry are sensitive and responsive to environmental changes; however, owing to their biological characteristics, changes in environmental conditions do not immediately affect their egg production and health indicators. The effect takes time to manifest, showing a certain lag. Short-term point prediction is therefore not conducive to resource scheduling and management in intensive poultry farming, let alone to accurate regulation of breeding-period variables and assessment of the whole-life-cycle health status of poultry. Achieving multi-step-ahead RH prediction is thus particularly urgent and necessary. However, as the number of prediction steps increases, the predictive performance of the model inevitably decreases, introducing more errors and risks and making it difficult for regulators to make decisions. Interval prediction can effectively quantify the risk of multi-step point predictions. Unlike point prediction, it constructs prediction intervals (PI) at different confidence levels that are expected to contain future observations. For regulators, this provides more useful information than point prediction and supports decision-making and management [21].
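To make the distinction between one-step-ahead and multi-step-ahead prediction concrete, the sketch below (a minimal, hypothetical example; the window lengths and series are invented for illustration and are not the paper's implementation) frames a univariate RH series as supervised samples with `n_past` past observations as input and `n_ahead` future values as the multi-step target:

```python
import numpy as np

def make_windows(series, n_past, n_ahead):
    """Frame a univariate series as supervised samples:
    n_past observations in, n_ahead future values out."""
    X, Y = [], []
    for i in range(len(series) - n_past - n_ahead + 1):
        X.append(series[i:i + n_past])
        Y.append(series[i + n_past:i + n_past + n_ahead])
    return np.array(X), np.array(Y)

# Toy RH-like sequence of 10 readings
rh = np.arange(10, dtype=float)
X, Y = make_windows(rh, n_past=4, n_ahead=3)
print(X.shape, Y.shape)  # (4, 4) (4, 3)
```

A multi-step model then maps each row of `X` to the corresponding row of `Y`, rather than to a single next value as in one-step-ahead prediction.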
Building on the aforementioned research, this study proposes a comprehensive and practical hybrid medium and long-term prediction model for both point values and interval ranges of RH in intensive poultry farming environments. The main contributions and innovations of this study are as follows:
Exploring methods to enhance the quality of the model's input data. Spearman rank correlation analysis and grey relational analysis (GRA) are used to eliminate redundant environmental factors, and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) combined with permutation entropy is used to denoise the RH data. Feature selection and data denoising eliminate interference from redundant data.
Proposing a deep learning model based on BiGRU and an attention mechanism to achieve effective medium and long-term point prediction of poultry house RH. Compared with common models, the BiGRU-Attention model improves the utilization of multi-dimensional, long-term data, fully extracts the causal relationships between variables and targets, and enhances the accuracy of medium and long-term RH prediction.
Demonstrating measures to reduce decision-making risks caused by point prediction errors. Kernel density estimation (KDE) is used to fit the errors generated by point prediction, and PIs at different confidence levels are calculated to quantify the risk introduced by those errors. This provides regulators with more useful information.
4. Discussion
4.1. Analysis of Model Results in Comparison Based on Feature Selection
After feature selection, the MAE of the FS-BiGRU-Attention and FS-BiGRU models for predicting the future 3 steps decreased by 23.0% and 18.1%, respectively, their RMSE decreased by 25.6% and 32.2%, and their MAPE decreased by 22.8% and 18.1%. As shown in Figure 11, the predictive errors of the other baseline models were also reduced, and their prediction performance improved significantly after feature selection. This indicates that the feature selection method based on Spearman rank correlation analysis and GRA can effectively select environmental factors that are highly correlated with RH and exhibit similar trends. By eliminating redundant environmental factors, feature selection helps models focus on relevant and useful covariates, making it easier to uncover causal relationships between input and output data and thereby improving predictive performance.
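As a rough illustration of the two screening criteria, the sketch below implements a plain Spearman rank correlation (without tie handling, which is adequate for continuous sensor data) and Deng's grey relational grade with resolution coefficient ρ = 0.5. The series names, noise levels, and seed are invented for the example and are not the paper's data:

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation (simple version, no tie handling)."""
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean(); ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

def grey_relational_grades(ref, factors, rho=0.5):
    """Deng's grey relational grade of each factor against the reference series;
    min/max deviations are taken over all factors jointly."""
    norm = lambda s: (s - s.min()) / (s.max() - s.min())
    D = np.abs(np.stack([norm(f) for f in factors]) - norm(ref))  # deviation matrix
    coef = (D.min() + rho * D.max()) / (D + rho * D.max())        # relational coefficients
    return coef.mean(axis=1)                                      # grade per factor

rng = np.random.default_rng(0)
rh = rng.normal(65.0, 5.0, 200)               # pseudo RH reference series
temp = 0.8 * rh + rng.normal(0.0, 2.0, 200)   # strongly related factor
dust = rng.normal(0.0, 1.0, 200)              # unrelated factor
g = grey_relational_grades(rh, [temp, dust])
print(spearman(rh, temp), spearman(rh, dust))
print(g)
```

Factors with both a high rank correlation and a high relational grade against RH would be retained; the others are treated as redundant.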
4.2. Analysis of Model Results in Comparison Based on CEEMDAN Denoising
After data denoising, the MAE of the CEEMDAN-BiGRU-Attention and CEEMDAN-BiGRU models for predicting the future 3 steps decreased by 13.7% and 5.2%, respectively, the RMSE decreased by 15.7% and 20.1%, and the MAPE decreased by 12.5% and 12.4%. As shown in Figure 12, the other baseline models also showed reduced errors and improved predictive performance after data denoising. This demonstrates that the CEEMDAN-based method combined with permutation entropy can effectively remove irregular noise from RH, resulting in a more regular and stable RH curve. After denoising, the predictive models can extract useful information more simply and efficiently, free of interference from redundant information, thereby enhancing robustness and accuracy.
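Permutation entropy, which the denoising step uses to flag noise-dominated components, can be sketched in a few lines. This is a minimal implementation with embedding dimension m and delay τ; the CEEMDAN decomposition itself (available in libraries such as PyEMD) and the paper's actual threshold are not reproduced here:

```python
import math
import numpy as np

def permutation_entropy(x, m=3, tau=1):
    """Normalised permutation entropy: 0 for a fully regular series,
    approaching 1 for white noise."""
    n = len(x) - (m - 1) * tau
    counts = {}
    for i in range(n):
        # ordinal pattern of each length-m embedding window
        pattern = tuple(np.argsort(x[i:i + (m - 1) * tau + 1:tau]))
        counts[pattern] = counts.get(pattern, 0) + 1
    p = np.array(list(counts.values()), dtype=float) / n
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(m)))

rng = np.random.default_rng(1)
pe_noise = permutation_entropy(rng.normal(size=500))  # irregular component
pe_trend = permutation_entropy(np.arange(500.0))      # smooth component
print(pe_noise, pe_trend)
```

Components whose permutation entropy exceeds a chosen threshold can be treated as noise-dominated and discarded before reconstructing the denoised RH series; the threshold itself is a design choice.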
4.3. Analysis of Results Based on the CEEMDAN-FS-BiGRU-Attention Model
To substantiate the superiority of the proposed CEEMDAN-FS-BiGRU-Attention hybrid prediction model, it was compared with multiple models. As shown in Table 5, under the same prediction framework, BiGRU-Attention outperformed the other baseline models in predictive performance. Compared to CEEMDAN-FS-LSTM, CEEMDAN-FS-BiGRU-Attention reduced the MAE, RMSE, and MAPE for predicting the future 3 steps by 29.1%, 26.4%, and 27.0%, respectively.
From the perspective of the prediction framework, feature selection and data denoising fully extract features highly correlated with RH, remove noise, and eliminate the influence of redundant factors, thereby enhancing the quality of the model input. From the perspective of the baseline model, BiGRU can fully exploit the correlation between model inputs and outputs and has good fitting ability. The attention mechanism captures the long-term dependence of the RH sequence and alleviates the loss of important information that occurs when BiGRU processes long sequences. Compared with LSTM, CEEMDAN-FS-BiGRU-Attention reduced the MAE, RMSE, and MAPE for predicting the future 3 steps by 57.7%, 48.2%, and 56.6%, respectively, demonstrating outstanding predictive performance.
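For readers unfamiliar with these building blocks, the sketch below walks through a single (unidirectional) GRU cell and a dot-product attention pooling step in plain NumPy. It is a didactic toy with random weights and invented dimensions, not the trained BiGRU-Attention model:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step (biases omitted for brevity)."""
    z = sigmoid(Wz @ x + Uz @ h)              # update gate: how much state to renew
    r = sigmoid(Wr @ x + Ur @ h)              # reset gate: how much history to use
    h_cand = np.tanh(Wh @ x + Uh @ (r * h))   # candidate hidden state
    return (1.0 - z) * h + z * h_cand

def attention_pool(H, w):
    """Score each hidden state against vector w, softmax, and pool."""
    scores = H @ w
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                      # attention weights sum to 1
    return alpha, alpha @ H                   # weighted context vector

rng = np.random.default_rng(42)
d_in, d_h, T = 4, 8, 6                        # features, hidden size, time steps
W = [rng.normal(scale=0.1, size=(d_h, d_in)) for _ in range(3)]
U = [rng.normal(scale=0.1, size=(d_h, d_h)) for _ in range(3)]
h, H = np.zeros(d_h), []
for x in rng.normal(size=(T, d_in)):          # forward pass over the sequence
    h = gru_cell(x, h, W[0], U[0], W[1], U[1], W[2], U[2])
    H.append(h)
alpha, context = attention_pool(np.stack(H), rng.normal(size=d_h))
print(alpha, context.shape)
```

A BiGRU runs this recurrence in both directions and concatenates the states; the attention weights `alpha` show how the pooled context can emphasize informative time steps instead of relying only on the final hidden state.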
4.4. Comparative Analysis of Interval Prediction Performance
To compare and assess the interval prediction performance of different models, this study conducted a comparative analysis using BiGRU and BiLSTM as baseline models.
Figure 13 shows the error probability density curves of the different baseline models' validation sets fitted by KDE-Gaussian. The error distribution of CEEMDAN-FS-BiGRU-Attention is relatively concentrated, mainly within [−7, 7], whereas the error distributions of CEEMDAN-FS-BiGRU and CEEMDAN-FS-BiLSTM are more dispersed, mainly within [−9, 9]. This indicates that CEEMDAN-FS-BiGRU-Attention has smaller errors on the validation set and better predictive performance than the other two models.
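The KDE-based interval construction can be sketched as follows: fit a Gaussian-kernel density to the validation-set point-prediction errors, discretize its CDF on a grid, and read off the central (1 − α) error quantiles. The bandwidth rule, grid size, and synthetic errors below are illustrative assumptions, not the paper's exact settings:

```python
import numpy as np

def kde_interval(errors, alpha=0.05, grid_n=2000):
    """Fit a Gaussian-kernel KDE to point-prediction errors and return the
    central (1 - alpha) error interval from the discretised KDE CDF."""
    e = np.asarray(errors, dtype=float)
    h = 1.06 * e.std() * len(e) ** (-1 / 5)       # Silverman's rule-of-thumb bandwidth
    grid = np.linspace(e.min() - 3 * h, e.max() + 3 * h, grid_n)
    # kernel density evaluated on the grid, then normalised discretely
    pdf = np.exp(-0.5 * ((grid[:, None] - e[None, :]) / h) ** 2).sum(axis=1)
    pdf /= pdf.sum()
    cdf = np.cumsum(pdf)
    lo = grid[np.searchsorted(cdf, alpha / 2)]
    hi = grid[np.searchsorted(cdf, 1 - alpha / 2)]
    return lo, hi

rng = np.random.default_rng(7)
errors = rng.normal(0.0, 2.0, 1000)               # pseudo validation-set errors
lo, hi = kde_interval(errors, alpha=0.05)
print(lo, hi)                                     # roughly symmetric about zero here
```

The PI for a point forecast ŷ at confidence level 1 − α is then [ŷ + lo, ŷ + hi].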
To further substantiate the efficacy of the KDE-Gaussian method, we compared its performance in constructing prediction intervals for the future 3 steps with the commonly used normal distribution estimation (NDE) method and the Bootstrap method. NDE is a parametric method that assumes the sample follows a normal distribution, while Bootstrap estimates the error distribution by random resampling. As shown in Table 6, the KDE-Gaussian method has the best and most stable interval prediction performance. Compared with KDE-Gaussian, both NDE and Bootstrap tend to form prediction intervals with a PICP lower than the confidence level. A lower PICP indicates that the intervals do not cover the true RH data well and may lead regulators to incorrect decisions. The KDE-Gaussian method stably outputs suitable prediction intervals, meeting the required confidence level without producing excessively wide intervals, which makes it more reliable and practical.
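The coverage and width criteria behind this comparison can be computed directly. The sketch below uses PICP together with the PI normalized average width (PINAW), a commonly used width measure assumed here for illustration, on invented toy numbers:

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: share of truths inside the interval."""
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """PI Normalised Average Width: mean interval width over the target's range."""
    return float(np.mean(upper - lower) / (y.max() - y.min()))

y = np.array([60.0, 62.0, 65.0, 70.0, 75.0])  # toy true RH values
lower = y - 2.0
upper = y + 2.0
upper[2] = 64.0                               # make one interval miss the truth
print(picp(y, lower, upper), pinaw(y, lower, upper))
```

A reliable method keeps PICP at or above the nominal confidence level while holding PINAW as small as possible; one without the other is not useful to a regulator.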
From the perspective of the baseline models, under the same prediction framework the validation-set errors of the BiGRU-Attention model are lower than those of the other two models; thus, although the PICP of the prediction intervals formed by BiGRU-Attention is lower than that of BiGRU and BiLSTM, it still meets the confidence-level requirement. It is worth noting that, compared with BiGRU and BiLSTM, BiGRU-Attention maintains a narrower interval width while fulfilling the confidence-level requirement, accurately describing the uncertainty of RH variations, which makes it perform better in practical applications.
In conclusion, the prediction intervals formed by the proposed CEEMDAN-FS-BiGRU-Attention-KDE-Gaussian model closely track the trend of the RH sequence, achieving reliable coverage while maintaining a narrow interval width. The model can provide more accurate and useful information for regulators and is suitable for precise prediction and control of RH in poultry houses.
5. Conclusions
This study proposes an effective hybrid point and interval prediction framework for RH, which significantly improves the accuracy and stability of medium and long-term RH prediction. Through comparison with multiple models, CEEMDAN-FS-BiGRU-Attention has been proven to be a reliable and efficient RH prediction model. Additionally, using the KDE-Gaussian method to form prediction intervals based on point prediction error distribution has demonstrated excellent interval prediction performance under different confidence levels and prediction steps.
The specific conclusions are as follows:
(1) Owing to various influences, the RH data collected by sensors inevitably contain noise, which causes random interference in model training and prediction. After data denoising, the MAE of BiGRU-Attention, BiGRU, and BiLSTM for predicting the future 3 steps was reduced by 13.8%, 13.2%, and 5.2%, respectively. This indicates that the data denoising method based on CEEMDAN and permutation entropy effectively removes irregular noise from RH, making it easier for the model to learn useful information while suppressing overfitting.
(2) Environmental factors in poultry houses influence each other. Comprehensive analysis and selection of environmental factors with high correlation and similar trends are important for improving RH prediction accuracy. After feature selection, the MAE of BiGRU-Attention, BiGRU, and BiLSTM for predicting the future 3 steps was reduced by 23.0%, 18.1%, and 22.2%, respectively. This indicates that the feature selection method based on Spearman rank correlation analysis and GRA can select important environmental factors, reduce input dimensionality, and improve prediction accuracy.
(3) Common baseline models in existing research lose important information when sequences are too long, which is not conducive to predicting long time series. The self-attention mechanism is an efficient solution. Compared with BiGRU and BiLSTM, the MAE of BiGRU-Attention for predicting the future 3 steps decreased by 15.6% and 11.3%, respectively, illustrating that the attention mechanism can improve the utilization of past data, suppress the loss of useful information, and effectively improve model prediction performance.
(4) Point prediction outputs only a single value and thus provides relatively little information. Moreover, as the prediction horizon increases, point predictions inevitably fluctuate and incur larger errors, so interval prediction of RH is necessary. Compared with the commonly used PI construction methods NDE and Bootstrap, KDE-Gaussian has better interval construction performance, outputting reliable and narrow prediction intervals. This method can provide more useful information for producers' decisions and warnings.
(5) In terms of the overall prediction framework, the CEEMDAN-FS-BiGRU-Attention model proposed in this paper has the best point prediction performance: the MAE, RMSE, and MAPE for predicting the future 3 steps were reduced by 57.7%, 48.2%, and 56.6%, respectively, compared with LSTM. Moreover, the CEEMDAN-FS-BiGRU-Attention-KDE-Gaussian method forms the most appropriate prediction intervals at different confidence levels.
This study provides guidance for predicting and controlling RH and other environmental factors in livestock breeding from multiple environmental inputs and is of great significance for intelligent breeding. However, some limitations remain, including subjectivity in the feature selection process and the high time cost of parameter optimization. In future work, we will focus on more objective and effective feature selection methods and on heuristic optimization algorithms for initializing model parameters.