Article

A Hybrid Model for Temperature Prediction in a Sheep House

1 Guangzhou Key Laboratory of Agricultural Products Quality & Safety Traceability Information Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
2 College of Information Science and Technology, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
3 Academy of Intelligent Agricultural Engineering Innovations, Zhongkai University of Agriculture and Engineering, Guangzhou 510225, China
* Authors to whom correspondence should be addressed.
These authors contributed equally to this work.
Animals 2022, 12(20), 2806; https://doi.org/10.3390/ani12202806
Submission received: 20 August 2022 / Revised: 10 October 2022 / Accepted: 10 October 2022 / Published: 17 October 2022
(This article belongs to the Section Animal System and Management)

Simple Summary

In intensive sheep farming, temperature is an important indicator of the healthy growth of sheep. The key to ensuring that sheep grow healthily in a stress-free environment is to grasp the trend of sheep-house temperature in time and make adjustments in advance. To solve this problem, we used machine learning to establish a combined prediction model that accurately predicts the temperature of the sheep barn. The results show that the combined prediction model has good stability and high accuracy. In the future, it can also be extended to the prediction of other environmental parameters in other animal houses, such as pig houses and cow houses.

Abstract

A temperature in the sheep house that is too high or too low directly threatens the healthy growth of sheep, so predicting temperature changes and issuing early warnings are important measures for ensuring healthy growth. To address the randomness and reliance on experience in the parameter selection of the traditional single Extreme Gradient Boosting (XGBoost) model, this paper proposes an optimization method based on Principal Component Analysis (PCA) and Particle Swarm Optimization (PSO), and uses the resulting PCA-PSO-XGBoost model to predict the temperature in the sheep house. First, PCA is used to screen the key factors influencing sheep-house temperature, reducing the dimension of the model's input vector. Then, a PSO-XGBoost temperature prediction model is built: the PSO algorithm performs a global search over the main hyperparameters of XGBoost and determines their optimal values through iterative calculation. A simulation experiment was carried out using data from the Xinjiang Manas intensive sheep breeding base. Compared with existing temperature prediction models, the proposed PCA-PSO-XGBoost model achieves a root mean square error (RMSE) of 0.0433, a mean square error (MSE) of 0.0019, a coefficient of determination (R2) of 0.9995, and a mean absolute error (MAE) of 0.0065; its RMSE, MSE, and MAE are improved by 68%, 90%, and 94%, respectively, over the traditional XGBoost model. The experimental results show that the proposed model has higher accuracy and better stability, can effectively provide guidance for monitoring and regulating temperature changes in intensive housing, and can be extended in the future to predicting other environmental parameters of other animal houses, such as pig houses and cow houses.

1. Introduction

In Xinjiang, sheep meat production reached 603,200 tons in 2020 [1]. The Xinjiang sheep industry has been essential in alleviating poverty and in sustaining and stabilizing the income growth of households [2]. The expansion of science and technology, together with novel breeding concepts, has shifted the sheep breeding paradigm from traditional single-family free-range breeding to large-scale, intensive breeding [3]. Temperature is a critical environmental indicator in intensive housing. In winter, the minimum temperature in the sheep house should be above 10 °C, and in summer the temperature should not exceed 30 °C. If the temperature is too high, a sheep's heat dissipation is impaired, affecting feed intake and utilization, and bacteria breed, causing ectoparasitic and respiratory diseases in sheep; if the temperature is too low, lamb health and survival suffer, and the proportion of feed consumed to maintain body temperature increases, negatively affecting sheep production and reproductive performance [4]. However, the temperature in the sheep house is time-varying and lagged, subject to significant interference, and coupled with multiple other factors [5]. Commonly employed linearized models cannot accurately capture the dynamic behavior of the sheep-house environment, and their poor accuracy directly degrades the house's environmental management performance [6]. Given this complex problem, it is critical to use intelligent algorithms to predict the sheep-house temperature accurately, understand the trend of temperature changes over time, and plan preventive and control measures so that the flock grows in a healthy environment.
Many scholars have explored temperature prediction models in recent years and achieved remarkable results for pig houses, greenhouses, room temperature, and water temperature [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. Xie Q et al. developed a new dynamic heat exchange model using EBE to accurately predict temperature changes in pig houses [7]. Ortega verified indoor temperature predictions based on the ARIMA model under outdoor temperature changes; the indoor temperature prediction for the animal area of a conventional weaning piglet house proved robust [8]. Fang Z et al. designed an LSTM-based Seq2Seq model to achieve multi-step-ahead prediction of indoor temperature [9]. Elmaz F et al. used a CNN-LSTM architecture to predict indoor temperature, and the proposed architecture showed excellent stability [10]. Zhou designed a novel convex bidirectional extreme learning machine (CB-ELM) model for greenhouse temperature and humidity prediction, which was shown experimentally to be a viable and effective approach for solar greenhouses [11]. Moon T et al. used transfer learning to adapt a previously trained BiLSTM microclimate prediction model for greenhouses [12]. Shamkhani H et al. modeled air temperature with a generalized regression neural network; according to their findings, the GRNN model is a reliable tool for temperature prediction [13]. Jung D H et al. applied the RNN-LSTM model to obtain more accurate predictions [14]. Cai W et al. proposed a LightGBM-based model to simulate the internal temperature of a greenhouse [15]. Li designed an indoor temperature prediction system based on a time-delay neural network and an Elman neural network (ENN) [16]. M. Martínez-Comesaña et al. used the multi-objective genetic algorithm NSGA-II to optimize an MLP neural network, obtaining the most precise temperature prediction in terms of error and complexity [17]. Li et al. used the Elman neural network in a multi-step prediction model for a closed environment [18]. Xu L et al. applied the IABC-WTMM model to predict water temperature in shrimp culture, overcoming the blindness and limitations of traditional WTMM parameter selection [19]. Renata Graf et al. developed an integrated model for river temperature prediction by coupling WT and ANN; its excellent performance is especially evident under extreme conditions [20]. Alhamid, M.I. et al. correctly predicted the hot-water temperature at a generator's input using a feedforward backpropagation neural network, with satisfactory agreement between predicted and experimental values [21]. Even though the temperature prediction models proposed by these scholars have good accuracy, issues remain in real applications: model training is slow, the optimization algorithm easily falls into a local optimum when searching for the global optimum (i.e., a solution that appears optimal at present but is not globally optimal), and the computational cost is high.
Compared with typical machine learning algorithms, the XGBoost technique exploits the CPU's multi-threading capability and incorporates a regularization term to control the algorithm's complexity and prevent overfitting. The benefits are substantial and can significantly increase model accuracy. XGBoost is a gradient boosting algorithm that combines multiple weak learners into a single strong learner. Because of this enhanced learning capacity, XGBoost is extensively employed in prediction research across various domains [22,23,24,25]. Using SVMD and XGBoost, Wang Y. et al. increased the accuracy of short-term load prediction by developing a short-term load forecasting model for industrial clients [23]. Liu, Y. et al. proposed an XGBoost-based prediction model for odor concentration in chicken houses and verified the effectiveness of the algorithm [24]; Obsie E.Y. et al. built a wild blueberry yield prediction method based on XGBoost, which greatly improved prediction accuracy [25]. As a result, the XGBoost algorithm was chosen in this experiment to investigate temperature prediction in the sheep house, to maximize its benefits on complicated prediction problems and improve forecast accuracy.
The hyperparameters of the XGBoost model directly impact its prediction performance. Models trained with different hyperparameter values often perform considerably differently, making it critical to choose optimal hyperparameters [26]. The particle swarm optimization technique offers rapid search speed, high efficiency, robustness, algorithmic simplicity, and ease of convergence, and is extensively employed in the energy and agriculture domains [27,28,29]. Zhang, Y. et al. used particle swarm optimization to tune the parameters of a back-propagation neural network (BPNN) for forecasting daily global solar radiation, solving the difficulties of parameter uncertainty, gradient descent, local optima, etc. [27]; Huang, Y. et al. used particle swarm optimization to optimize support vector regression (SVR), which can accurately capture power peak loads [28]; Zhu, B. et al. optimized the parameters of an extreme learning machine (ELM) with the particle swarm algorithm, improving the accuracy and running speed of a crop evapotranspiration prediction model [29].
Because the sheep-house breeding environment is complicated and some environmental elements interact to affect temperature, there is redundancy and overlapping information among the environmental parameters. If all factors are included directly in the model, the data may be duplicated and the model overly complex to build; at the same time, the model is prone to overfitting, resulting in suboptimal temperature prediction for the sheep breeding environment parameters. Therefore, using the PCA dimension-reduction technique to filter the original environmental factors eliminates their correlation and prevents the dimension catastrophe produced by overly high feature dimensions, which is critical for improving the model's accuracy.
Building on this prior research, this work combines PCA and PSO with Extreme Gradient Boosting (XGBoost) to overcome the empiricism and unpredictability of XGBoost hyperparameter selection, and proposes a PCA-PSO-XGBoost model for temperature prediction in Xinjiang sheep houses. PCA extracts the main temperature-influencing components, and PSO performs a global search over the XGBoost hyperparameters. The optimized model is applied to the Xinjiang Manas intensive sheep breeding base, and its predictions are compared with those of the conventional BPNN, SVR, XGBoost, PSO-BPNN, PSO-SVR, and PSO-XGBoost models, demonstrating that the model can effectively provide technical support for temperature monitoring and regulation in intensive sheep breeding.

2. Materials and Methods

2.1. Experimental Area

In our study, breeding-group ewe barn No. 5 at the base of Xinjiang Uygur Autonomous Region Manas Xin'ao Animal Husbandry Co., Ltd. (44°27′18″ north latitude, 86°10′47″ east longitude, Changji Hui Autonomous Prefecture, China) was used as the experimental data area. The base is a comprehensive, intensive production facility for Suffolk sheep that integrates breeding, production, and sales. This study selected a semi-enclosed Suffolk sheep house with an area of about 422 m2. The four walls are of brick–concrete construction, made mainly of bricks, steel bars, and concrete; the roof is a steel-plate structure composed mainly of steel beams, columns, trusses, and other components made of section steel and steel plate; and the floor is bare soil. The sheep house is divided into three areas: a main area, a shaded area, and an activity area. In winter, the sheep are concentrated in the main area for closed breeding, which keeps cold wind out of the house, reduces the sheep's heat loss, and prevents frostbite or even death from freezing. Sensors for temperature and humidity, noise, light intensity, PM2.5, PM10, TSP, CO2, NH3, and H2S are installed in the main area to monitor the environmental parameters of the barn online. The ammonia sensor is 3.2 m above the ground, the hydrogen sulfide sensor 3.7 m, the noise sensor 3.1 m, and the other sensors 3.3 m, as shown in Figure 1.

2.2. Experimental Data Preprocessing

When the variables in a sheep-house environment have different magnitudes, the data set must be normalized to eliminate the dimensional differences and improve forecast accuracy [30]. The normalized calculation formula is as follows:
$$X_i' = \frac{X_i - \bar{X}}{X_S}$$
where $\bar{X}$ is the mean, $X_S$ is the standard deviation, and $X_i'$ is the $i$th normalized value.
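As a minimal sketch of this z-score normalization (the temperature values below are illustrative, not from the data set):

```python
import numpy as np

def standardize(x):
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    return (x - x.mean()) / x.std()

# Hypothetical temperature readings (°C)
temps = np.array([12.0, 14.5, 13.2, 15.8, 11.9])
z = standardize(temps)
```

After this transformation the series has zero mean and unit standard deviation, so variables of very different magnitudes become directly comparable as model inputs.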

2.3. Construction of a Combined Forecasting Model for Sheep House

2.3.1. XGBoost Algorithm

Friedman introduced the gradient boosting decision tree technique in 2001 [31]. Gradient descent is used to generate each new tree structure on the basis of the previous trees so as to reduce the objective function [32]. XGBoost stands for Extreme Gradient Boosting.
It is an ensemble tree model for regression and classification. When applied to a regression problem, XGBoost generates new regression trees, and the residuals of the previous model are fitted by the newly generated CART tree. A fully trained model contains K trees in total, and the overall estimate is obtained by aggregating the outputs of all trees [32].
The objective function of XGBoost consists of a loss function and a regularization term; $loss(\cdot)$ denotes the loss error and $\Phi(f)$ denotes the regularization term. To control the complexity of the tree model and prevent overfitting, Chen and Guestrin derived the objective function of XGBoost as follows. The objective function adds a new function $f_m$ through continuous iteration so that the final result is minimized. The objective function at the $m$th iteration can be expressed as:
$$Obj^{(m)} = \sum_{i=1}^{n} loss(y_i, \bar{y}_i) + \sum_{m=1}^{M} \Phi(f_m)$$
In Formula (2), $\sum_{i=1}^{n} loss(y_i, \bar{y}_i)$ measures the error between the predicted value $\bar{y}_i$ and the true value $y_i$; $\sum_{m=1}^{M} \Phi(f_m)$ is the regularization term, representing the sum of the complexity of each submodel, and is used to prevent the model from overfitting.
The calculation formula of the regularization term is:
$$\Phi(f_m) = \gamma T + \frac{1}{2}\lambda \sum_{l=1}^{T} \omega_l^2$$
In Formula (3), $\gamma$ is the regularization parameter for the number of leaf nodes, used to control the number of leaf nodes; $T$ is the number of leaf nodes; $\lambda$ is the regularization parameter for the leaf-node weights, used to control the weights and prevent overfitting; $\omega_l$ is the weight of leaf node $l$ of each tree.
After m iterations, the resulting objective function in Formula (2) can be expressed as:
$$Obj^{(m)} = \sum_{i=1}^{n} loss\left(y_i, \bar{y}_i^{(m-1)} + f_m(x_i)\right) + \Phi(f_m) + Const$$
The objective function obtained by Formula (4) after Taylor series expansion and removal of high-order infinitesimal terms is:
$$Obj^{(m)} = \sum_{i=1}^{n} \left[ loss(y_i, \bar{y}_i^{(m-1)}) + p_i f_m(x_i) + \frac{1}{2} q_i f_m^2(x_i) \right] + \Phi(f_m) + Const$$
In Formula (5), $p_i = \partial_{\bar{y}^{(m-1)}} loss(y_i, \bar{y}^{(m-1)})$ and $q_i = \partial^2_{\bar{y}^{(m-1)}} loss(y_i, \bar{y}^{(m-1)})$ are the first- and second-order derivatives of the loss.
Since the result of the objective function needs to be minimized, the constant term has no influence on the formula. After removing the constant term, the objective function can be simplified as:
$$Obj^{(m)} = \sum_{i=1}^{n} \left[ p_i f_m(x_i) + \frac{1}{2} q_i f_m^2(x_i) \right] + \Phi(f_m) = \sum_{i=1}^{n} \left[ p_i w_{q(x_i)} + \frac{1}{2} q_i w_{q(x_i)}^2 \right] + \gamma T + \frac{1}{2}\lambda \sum_{l=1}^{T} w_l^2 = \sum_{l=1}^{T} \left[ \left( \sum_{i \in I_l} p_i \right) w_l + \frac{1}{2} \left( \sum_{i \in I_l} q_i + \lambda \right) w_l^2 \right] + \gamma T$$
In Formula (6), let $P_l = \sum_{i \in I_l} p_i$ and $Q_l = \sum_{i \in I_l} q_i$; substituting into Equation (6) gives the objective function:
$$Obj^{(m)} = \sum_{l=1}^{T} \left[ P_l w_l + \frac{1}{2}(Q_l + \lambda) w_l^2 \right] + \gamma T$$
Setting the derivative of the objective function with respect to $w_l$ to zero yields $w_l = -\frac{P_l}{Q_l + \lambda}$. Substituting this back into Equation (7) gives the final objective function:
$$Obj^{(m)} = -\frac{1}{2} \sum_{l=1}^{T} \frac{P_l^2}{Q_l + \lambda} + \gamma T$$
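As a small numerical sketch of Equations (7) and (8): given the summed first-order ($P_l$) and second-order ($Q_l$) gradient statistics over one leaf, the optimal leaf weight and the leaf's contribution to the objective follow directly. Function names here are illustrative, not part of the XGBoost library.

```python
def leaf_weight(p_vals, q_vals, lam):
    """Optimal leaf weight: w_l = -P_l / (Q_l + lambda)."""
    return -sum(p_vals) / (sum(q_vals) + lam)

def leaf_objective(p_vals, q_vals, lam, gamma):
    """One leaf's contribution to Obj: -1/2 * P_l^2 / (Q_l + lambda) + gamma."""
    P, Q = sum(p_vals), sum(q_vals)
    return -0.5 * P ** 2 / (Q + lam) + gamma

# Squared-error example: p_i is the residual, q_i = 1 for each sample in the leaf.
w = leaf_weight([0.5, -0.2, 0.3], [1, 1, 1], lam=1.0)      # -0.6 / 4 = -0.15
obj = leaf_objective([0.5, -0.2, 0.3], [1, 1, 1], lam=1.0, gamma=0.1)
```

Note how a larger $\lambda$ shrinks the leaf weight toward zero, which is exactly how the regularization term limits model complexity.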

2.3.2. Particle Swarm Optimization Algorithm

The particle swarm optimization algorithm is a swarm intelligence optimization algorithm based on the social behavior of birds, proposed by Eberhart and Kennedy in 1995 [33]. It is currently used to solve various optimization problems with remarkable results [34]. The basic idea is that each bird in a flock is regarded as a particle, a candidate solution to the problem to be solved. By continuously adapting to the environment and adjusting each individual's velocity and position, the swarm steadily approaches the optimal solution [35] in the solution space. The flow of the particle swarm optimization algorithm is shown in Figure 2. It mainly consists of the following steps, which loop until the termination condition is satisfied:
  • Step 1: Determine the fitness of each particle.
  • Step 2: Continuously update each particle's individual best position $z_j$ and the global best position $z_g$.
  • Step 3: Update each particle's velocity $S_j^T$, which indicates the direction and distance of its next move, and its position $Y_j^T$, which represents its current state.
For the PSO algorithm, it is assumed that there are M particles in the K-dimensional target search space, and these M particles representing candidate solutions form a particle swarm.
The K-dimensional vector $Y_j = [Y_{j1}, Y_{j2}, \ldots, Y_{jK}]^T$ represents the position of the $j$th particle, and $S_j = [S_{j1}, S_{j2}, \ldots, S_{jK}]^T$ its velocity. The particles cooperate and share information through the two extrema updated in each iteration. The fitness value of each particle is calculated; the individual extremum is the best fitness value found by the $j$th particle in the search space, and the global extremum is the best position found by the entire particle swarm among these individual extrema, denoted $z_g = [z_{g1}, z_{g2}, \ldots, z_{gK}]^T$. Each particle updates these two extrema according to Equations (9) and (10), updating its velocity $S_j^T$ and position $Y_j^T$.
$$S_j^{T+1} = \mu S_j^T + \varepsilon_1 (z_j^T - Y_j^T) + \varepsilon_2 (z_g^T - Y_j^T)$$
$$Y_j^{T+1} = Y_j^T + S_j^{T+1}$$
In Formulas (9) and (10), $\mu$ is the inertia weight; $\varepsilon_1$ and $\varepsilon_2$ are given by $c_1 r_1$ and $c_2 r_2$, respectively; $T$ denotes the $T$th iteration; $r_1$ and $r_2$ are random numbers in [0, 1]; $c_1$ is the individual learning factor and $c_2$ the population learning factor.
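The update rules above can be sketched as a minimal particle swarm that minimizes a given fitness function over a box. This is an illustrative implementation under assumed defaults, not the authors' code; the function names and search-box bounds are hypothetical.

```python
import random

def pso(fitness, dim, n_particles=20, iters=60, mu=0.4, c1=2.0, c2=2.0,
        lo=-5.0, hi=5.0, seed=42):
    """Minimize `fitness` over [lo, hi]^dim with Equations (9)-(10)."""
    rng = random.Random(seed)
    Y = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    S = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [y[:] for y in Y]                    # individual extrema z_j
    best_val = [fitness(y) for y in Y]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]      # global extremum z_g

    for _ in range(iters):
        for j in range(n_particles):
            for k in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, Equation (9)
                S[j][k] = (mu * S[j][k]
                           + c1 * r1 * (best_pos[j][k] - Y[j][k])
                           + c2 * r2 * (g_pos[k] - Y[j][k]))
                # Position update, Equation (10), clipped to the search box
                Y[j][k] = min(hi, max(lo, Y[j][k] + S[j][k]))
            val = fitness(Y[j])
            if val < best_val[j]:
                best_pos[j], best_val[j] = Y[j][:], val
                if val < g_val:
                    g_pos, g_val = Y[j][:], val
    return g_pos, g_val

# Sanity check on the 2-D sphere function, whose minimum is 0 at the origin.
pos, val = pso(lambda y: sum(v * v for v in y), dim=2)
```

With the inertia weight 0.4 and learning factors 2.0 used later in the paper, the swarm contracts quickly onto the best position found so far.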

2.3.3. Model Performance Evaluation

The performance of the PCA-PSO-XGBoost temperature prediction model for Xinjiang sheep houses is evaluated, for each model, using Formulas (11) to (14) [36]. The specific evaluation indicators are as follows:
$$MSE = \frac{1}{b} \sum_{j=1}^{b} (a_j - \hat{a}_j)^2$$
$$RMSE = \sqrt{\frac{1}{b} \sum_{j=1}^{b} (a_j - \hat{a}_j)^2}$$
$$MAE = \frac{1}{b} \sum_{j=1}^{b} \left| a_j - \hat{a}_j \right|$$
$$R^2 = \frac{\sum_{j=1}^{b} (\hat{a}_j - \bar{a})^2}{\sum_{j=1}^{b} (a_j - \bar{a})^2}$$
In Formulas (11) to (14), $a_j$ is the actual value of the $j$th data point, $\hat{a}_j$ is the value predicted by the model for the $j$th data point, $\bar{a}$ is the average of the actual values, and $b$ is the total number of data points.
As the preceding formulas show, MSE is the average of the squared discrepancy between the predicted and actual values: the higher the MSE, the greater the discrepancy and the worse the model's fit. MAE is the average of the absolute error: the larger the MAE, the larger the error of the model's predictions. R2 is the baseline for measuring model quality and reflects how well the model fits the original data; its value range is typically [0, 1]. The greater the R2, the stronger the correlation between the model's predicted values and the actual values, with a value near 1 being ideal.
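The four indicators can be transcribed directly from Formulas (11) to (14). This is a plain-Python sketch (the function name and example values are illustrative); in practice the equivalent sklearn metrics would be used.

```python
import math

def regression_metrics(actual, predicted):
    """MSE, RMSE, MAE and R^2 as in Formulas (11)-(14)."""
    b = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / b
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / b
    mean_a = sum(actual) / b
    ss_pred = sum((p - mean_a) ** 2 for p in predicted)   # explained variation
    ss_tot = sum((a - mean_a) ** 2 for a in actual)       # total variation
    r2 = ss_pred / ss_tot
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}

# Hypothetical actual vs. predicted temperatures (°C)
m = regression_metrics([10.0, 12.0, 14.0], [10.1, 11.8, 14.1])
```

A perfect prediction gives MSE = RMSE = MAE = 0 and R2 = 1.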

3. Result and Analysis

3.1. Experimental Environment

Taking the temperature of the ewe house of breeding group No. 5 of Xinjiang Manas Xin'ao Animal Husbandry Co., Ltd. as the research object, data were collected at 10 min intervals using the Internet of Things cloud service platform for sheep-house environmental monitoring developed by Zhongkai Agricultural Engineering College. Two thousand sheep-barn parameter records from 8 February 2021 to 22 February 2021 were used as the experimental data, covering air temperature, air humidity, carbon dioxide concentration, PM2.5 index, PM10 index, light intensity, noise, TSP index, and hydrogen sulfide concentration. The first 1400 time periods formed the training set and the last 600 the test set, used to predict the sheep-house temperature for the next 10 min. The raw temperature data are shown in Figure 3: the ordinate is the sheep-house temperature, and the abscissa is in units of 10 min, corresponding to the 2000 time periods from 17:11 on 8 February to 17:32 on 22 February.
The experimental setup comprised an Intel(R) Core(TM) i7-6400U processor (Santa Clara, CA, USA) with a 2.30 GHz CPU frequency, 16 GB of memory, and the Windows 10 (64-bit) operating system (Microsoft, Redmond, WA, USA). The integrated development environment was Anaconda3, the programming language was Python 3.6 (64-bit), and the experiment used the Keras, Sklearn, and XGBoost packages to implement the PCA-PSO-XGBoost temperature prediction model for the Xinjiang sheep house. Through repeated tests, the PSO parameters were set as follows: population size N = 20, maximum number of iterations T = 20, local search factor C1 = 2, global search factor C2 = 2, and self-weight (inertia) factor w = 0.4. The learning rate and maximum tree depth of the XGBoost model were searched globally by the PSO algorithm: the maximum depth max_depth = 9, the maximum number of trees n_estimators = 1000, and other parameters were left at their default values.
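The settings above can be collected as plain configuration dictionaries (a sketch only; the xgboost estimator itself is not constructed here, and the dictionary names are illustrative):

```python
# PSO settings reported in the text
pso_params = {
    "population_size": 20,   # N
    "max_iterations": 20,    # T
    "c1": 2.0,               # local search factor
    "c2": 2.0,               # global search factor
    "w": 0.4,                # self-weight (inertia) factor
}

# XGBoost settings reported in the text
xgb_params = {
    "max_depth": 9,          # found by the PSO global search
    "n_estimators": 1000,    # maximum number of trees
    # learning_rate is searched by PSO over [0, 1];
    # remaining parameters keep the XGBoost defaults.
}
```

In a full implementation, these would be passed to an XGBoost regressor, with the PSO loop evaluating candidate learning rates against the training set.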

3.2. Experiment Procedure

The experiment organically combined the PCA dimension-reduction algorithm, the PSO optimization algorithm, and the XGBoost model to build a PCA-PSO-XGBoost temperature prediction model for the Xinjiang sheep house with increased accuracy. The data were standardized, and principal component analysis was used to screen the key factors influencing sheep-house temperature in Xinjiang, eliminating redundant information among environmental factors and simplifying the structure of the XGBoost model. The PSO optimization method was then used to optimize the XGBoost model's learning rate, tree depth, and other parameters to enhance accuracy, and the best solution was substituted into the XGBoost model to produce the prediction result. The specific steps are shown in Figure 4:
Step 1:
Standardize the temperature time series data for the Xinjiang sheep houses and split them into test and training sets by time. The first 70% of these 2000 time periods is the training set, while the last 30% is the test set.
Step 2:
Use principal component analysis to screen the key influencing factors of sheep house temperature in Xinjiang, eliminate redundant information among environmental factors, and simplify the model structure.
Step 3:
Initialize the particle swarm and XGBoost model parameter ranges.
Step 4:
Based on the training set, PSO was used to search for the optimal learning rate of XGBoost, yielding a PSO-XGBoost temperature prediction model. The preset range of the XGBoost learning rate was [0, 1]; PSO ran 10 rounds of optimization with 10 particles per round, using R2 as the fitness.
Step 5:
Apply the PCA-PSO-XGBoost model to the Xinjiang sheep house environmental prediction field to monitor and regulate temperature changes in intensive housing.

3.3. Result and Discussions

The temperature of the ewe house of breeding group No. 5 of Xinjiang Manas Xin'ao Animal Husbandry Co., Ltd. is the research object. The data were collected using the Internet of Things cloud service platform for monitoring animal breeding environments developed by Zhongkai Agricultural Engineering College. Between 8 February and 22 February 2021, 2000 time periods of data were gathered at 10 min intervals, including air temperature, air humidity, carbon dioxide concentration, the PM2.5 and PM10 indexes, light intensity, noise, the TSP index, and hydrogen sulfide concentration. These factors were used to forecast the sheep house's temperature for the following ten minutes. Table 1 contains the raw data of the barn's breeding environment.
The sheep-house breeding environment is complex, with various environmental elements interacting to influence the temperature. If all factors are input directly into the model, the model may become too complicated to construct, and the resulting overfitting leads to low prediction accuracy for the sheep breeding environment parameters. PCA and the SPSS statistical analysis tool were therefore used to reduce duplicate information and data correlation; the key influencing factors were extracted by analyzing the sheep-house data and reducing its dimensionality, effectively improving the model's prediction accuracy. Table 2 shows the eigenvalue, variance contribution rate, and cumulative total variance of each principal component obtained from the dimensionality-reduction analysis. Principal components are selected when their eigenvalue is larger than one. As seen in Table 2, the eigenvalues of all three leading components are larger than one, consistent with this extraction criterion, so the first three components are retained as replacements for the original variables.
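The screening step can be sketched as follows: standardize the feature matrix, take the eigendecomposition of its correlation matrix, and keep the components whose eigenvalue exceeds one. This is an illustrative NumPy implementation (function name and synthetic data are assumptions), not the SPSS procedure used in the study.

```python
import numpy as np

def kaiser_pca(X):
    """PCA keeping components with eigenvalue > 1 (the rule used above)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize features
    R = np.cov(Z, rowvar=False)                        # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]                  # largest first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                               # eigenvalue criterion
    scores = Z @ eigvecs[:, keep]                      # reduced-dimension inputs
    contribution = eigvals / eigvals.sum()             # variance contribution rate
    return scores, eigvals, contribution

# Synthetic stand-in for the environmental-factor matrix: 200 records, 5 factors,
# with one strongly correlated pair (mimicking, e.g., PM10 vs. PM2.5).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(200)
scores, eigvals, contribution = kaiser_pca(X)
```

The retained `scores` then replace the original correlated variables as model inputs, which is the role the first three principal components play in Table 2.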
Table 3 presents the factor loading values of every environmental factor for the various main components after rotation using the Kaiser normalized orthogonal rotation technique. The contribution to the first principal component is Total Suspended Particulates (TSP), PM10, PM2.5, and the carbon dioxide concentration and air humidity contribute more to the second principal component.
The air temperature contributes more to the third principal component. Table 4 displays the association between temperature and the other environmental characteristics. Air temperature is inversely related to air humidity and positively related to carbon dioxide concentration, total suspended particulate matter (TSP), PM10, and PM2.5. The relationship between carbon dioxide concentration and air humidity was statistically significant, and a significant positive association existed among TSP, PM10, and PM2.5. Therefore, the key influencing factors screened out are air temperature, carbon dioxide concentration, PM10, PM2.5, air humidity, and total suspended particulate matter (TSP), which sheep-breeding professionals confirm are the primary variables influencing sheep-house temperature.
The predicted results are shown in Figure 5. To demonstrate the effectiveness of the proposed scheme, this research compares seven prediction models on the same dataset: the traditional BPNN, SVR, and XGBoost models; their PSO-optimized counterparts PSO-BPNN, PSO-SVR, and PSO-XGBoost; and the proposed PCA-PSO-XGBoost model, which combines PCA dimensionality reduction with particle swarm optimization. The comparison is shown in Figure 6. Across the three indicators MSE, MAE, and R2, the PCA-PSO-XGBoost prediction model best matches the real data: its total error is minimal and its prediction accuracy is high.
Figure 7 compares the single SVR, single BPNN, and single XGBoost models. While the SVR and BPNN models are generally reliable, the XGBoost model's results are more precise. In the first half, the BPNN prediction curve floats above and below the actual-value curve with a large fluctuation range and an obvious error at the peak. The SVR prediction curve matches the original values in the first half, but its prediction in the second half is poor, with a large error relative to the real values.
BPNN has a strong nonlinear fitting ability and certain advantages in processing unstable data, as it can describe small changes in the input variables. However, BPNN generalizes poorly, is very sensitive to initial weights, converges easily to local minima, often stagnates in flat regions of the error gradient surface, and converges slowly or not at all. SVR is a stable regressor, but it is easily affected by noisy data during training. The air temperature in the sheep barn depends on air humidity, carbon dioxide concentration, PM2.5 index, PM10 index, light intensity, noise, TSP, hydrogen sulfide concentration, and other environmental factors, and this sensitivity to environmental factors gives the SVR model certain limitations in predicting sheep-barn temperature. Additionally, with many input variables, it is hard to precisely characterize and manage the influence of minor input changes on the output because the SVR model lacks a mechanism for handling missing data; this flaw significantly impacts the accuracy of temperature prediction.
XGBoost, by contrast, includes a mechanism within the algorithm for capturing and processing missing values: each weak learner can route missing entries along a learned default direction. In addition, XGBoost uses both first-order and second-order gradient information in its optimization and introduces L1 and L2 regularization terms to control model complexity and reduce over-fitting. It also borrows column sampling from random forests, a random mechanism that further reduces over-fitting and the amount of computation. In summary, the XGBoost algorithm is feasible and fits well for temperature prediction in a sheep house in Xinjiang.
Figure 8 compares the single XGBoost model with the PSO-optimized XGBoost model. The single XGBoost model does not match the original value curve as well as the PSO-optimized model: in the time period 350–550, a long horizontal segment appears in the prediction of the single XGBoost model, while PSO-XGBoost still tracks the true value curve. This shows that the PSO method optimizes the selection of the parameters required by the XGBoost model and improves the search speed, while overcoming the blindness and limitations of traditional manual parameter selection, thereby improving the model's prediction accuracy.
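A minimal particle swarm optimizer of the kind used for this search can be written in a few lines of NumPy. The sketch below is generic: the objective is a toy stand-in for the validation error that PSO-XGBoost would minimize over hyperparameters such as learning rate and tree depth, and the bounds and optimum are invented for the example.

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer over box bounds (illustrative)."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    x = rng.uniform(lo, hi, (n_particles, dim))   # particle positions
    v = np.zeros_like(x)                          # particle velocities
    pbest = x.copy()                              # personal bests
    pbest_val = np.apply_along_axis(f, 1, x)
    g = pbest[pbest_val.argmin()]                 # global best
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, dim))
        # Standard PSO velocity update: inertia + cognitive + social terms.
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        val = np.apply_along_axis(f, 1, x)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], val[improved]
        g = pbest[pbest_val.argmin()]
    return g, pbest_val.min()

# Toy objective standing in for validation error as a function of
# (learning_rate, max_depth); the hypothetical optimum is at (0.1, 4).
obj = lambda p: (p[0] - 0.1) ** 2 + (p[1] - 4.0) ** 2
best, best_val = pso_minimize(obj, bounds=[(0.01, 0.3), (1, 10)])
print(best, best_val)
```

In the real pipeline, `obj` would retrain XGBoost with the candidate parameters and return a cross-validation error, so each function evaluation is far more expensive than this toy.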
Figure 9 shows the comparison of the PSO-SVR, PSO-BPNN, and PSO-XGBoost models. It can be seen intuitively from the figure that the predictions of the PSO-XGBoost model fit the original value curve. Although the PSO-BPNN and PSO-SVR curves predict well in the first half, they show small fluctuations of varying degrees in the second half. Both BPNN and SVR need multiple parameters to be configured and many training runs to achieve good predictions, and PSO itself can easily fall into a local optimum during the search, so it cannot efficiently compensate for the flaws of those underlying algorithms. The XGBoost model, by contrast, has relatively simple parameter settings, strong robustness, a good fitting effect, and high accuracy. As a result, under identical conditions, the PSO-XGBoost model outperforms the PSO-BPNN and PSO-SVR models.
Figure 10 compares the PSO-XGBoost model with the PSO-XGBoost model after PCA dimensionality reduction. As the figure shows, the PSO-XGBoost model fluctuates slightly in the time period 500–550, whereas the PCA-reduced model is more consistent with the original value curve. This is because the environmental index data of the sheep house have been processed by PCA dimensionality reduction: the environmental parameters with the greatest impact on temperature are screened out, and the noisy data that interfere with prediction performance are eliminated. As a result, the primary determinants fed into the prediction model are more concentrated, yielding the highest prediction accuracy. The PCA-PSO-XGBoost model presented in this study provides the best prediction effect. At the same time, it better fits the nonlinear relationship between the environmental factors of sheep-house breeding and the temperature, which has important guiding significance for sheep breeding.
Figure 11 shows the PSO-XGBoost structure after PCA dimensionality reduction and simplification, reflected in the reduced number of input variables. Reducing the nine variables of the original input model to six lowers the computational overhead of the algorithm and effectively improves the prediction accuracy of the model.
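The PCA screening step can be sketched with scikit-learn. The data below are synthetic stand-ins for the nine sheep-house variables; only the procedure (standardize, fit PCA, keep components with eigenvalues above 1, as in Table 2) mirrors the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
# Synthetic stand-in: 9 observed variables driven by 3 latent factors plus noise.
base = rng.normal(size=(500, 3))
X = base @ rng.normal(size=(3, 9)) + 0.3 * rng.normal(size=(500, 9))

Xs = StandardScaler().fit_transform(X)  # PCA on standardized data
pca = PCA().fit(Xs)
eigenvalues = pca.explained_variance_
kept = eigenvalues > 1.0  # Kaiser criterion, as applied in Table 2
print("kept components:", kept.sum())
print("cumulative variance of top 3:", pca.explained_variance_ratio_[:3].sum())
```

The component loadings (`pca.components_`) then play the role of Table 3: variables that load weakly on every retained component (light intensity, noise, and H2S in the paper) are the candidates for removal.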
Table 5 shows that the PCA-PSO-XGBoost prediction model suggested in this article outperforms the other prediction models on every index. A single XGBoost model also outperforms a single BPNN and a single SVR model: its RMSE, MSE, R2, and MAE are 89%, 99%, 70%, and 89% better than those of a single BPNN, and 88%, 99%, 54%, and 82% better than those of a single SVR. This shows that, under small-sample conditions, XGBoost can better mine the law of nonlinear temperature change in the sheep-house environment than BPNN and SVR. The RMSE, MSE, R2, and MAE of PSO-XGBoost are 0.1256, 0.0158, 0.9957, and 0.0898, respectively: improvements of 72%, 92%, 5%, and 62% over PSO-BPNN, and of 82%, 97%, 15%, and 63% over PSO-SVR. Compared with a single XGBoost, the RMSE, MSE, and MAE of PSO-XGBoost are improved by 8%, 16%, and 13%, respectively. Compared with a single BPNN, the RMSE, MSE, R2, and MAE of PSO-BPNN are improved by 64%, 87%, 62%, and 75%, respectively, and the PSO-optimized SVR model improves on the single SVR model by 38%, 62%, 34%, and 57%. This demonstrates that PSO optimization can significantly increase the prediction accuracy of a single XGBoost, BPNN, or SVR. With PCA dimensionality reduction added, the model outputs are more precise still than those of PSO-XGBoost: the RMSE, MSE, R2, and MAE are 0.0433, 0.0019, 0.9995, and 0.0065, respectively, with RMSE, MSE, and MAE improved by 66%, 88%, and 93% over PSO-XGBoost. This shows that using the PCA dimensionality reduction algorithm to screen the key environmental factors affecting sheep-house temperature removes the correlation between environmental factors, simplifies the model structure, and effectively improves the model accuracy.
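For reference, the four evaluation indices used throughout Table 5 can be computed directly from the predictions; the sketch below uses made-up values, not the paper's data.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """RMSE, MSE, R2, and MAE as used to score the models in Table 5."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    rmse = np.sqrt(mse)
    mae = np.mean(np.abs(y_true - y_pred))
    # R2: fraction of the target's variance explained by the predictions.
    r2 = 1 - mse / np.mean((y_true - np.mean(y_true)) ** 2)
    return rmse, mse, r2, mae

# Tiny illustrative check with made-up values:
rmse, mse, r2, mae = regression_metrics([3.0, 2.0, 4.0, 5.0], [2.5, 2.0, 4.0, 5.5])
print(rmse, mse, r2, mae)  # -> 0.3535..., 0.125, 0.9, 0.25
```

The same quantities are available in scikit-learn as `mean_squared_error`, `r2_score`, and `mean_absolute_error`.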
The findings demonstrate that the PCA-PSO-XGBoost model in this paper has greater accuracy and a better learning effect, can fully mine the hidden information within the data, and can more precisely predict how the temperature in the sheep house will change over the following ten minutes. It offers a reliable basis for temperature prediction and early-warning decisions in intensive sheep farming.
To illustrate the applicability of the PCA-PSO-XGBoost model at higher temperatures, this paper additionally validates it on a dataset with a maximum temperature of 22.4 degrees Celsius.
Figure 12 shows the comparison of the original data curve with the prediction curves of the single XGBoost model, the PSO-XGBoost model, and the PCA-PSO-XGBoost model. As shown in the figure, the PSO-XGBoost model after PCA dimensionality reduction follows the original value curve most closely and also fits the original data when predicting the temperature peaks in the 60th, 200th, and 350th time segments. Although the trend of the single XGBoost model is consistent with the original curve, its prediction curve still deviates noticeably from it. The PSO-XGBoost model shows a certain error against the original temperature data in the peak predictions of the 200th and 350th time segments, and its effect is not as good as that of PCA-PSO-XGBoost.
Table 6 shows that, on the high-temperature dataset, the PCA-PSO-XGBoost temperature prediction model proposed in this paper outperforms the XGBoost and PSO-XGBoost prediction models on every metric. The RMSE, MSE, R2, and MAE of PSO-XGBoost are 0.1676, 0.0281, 0.9969, and 0.1495, respectively, which are 87%, 98%, 20%, and 88% better than those of XGBoost. This shows that, on higher-temperature datasets, PSO search optimization of the XGBoost parameters can also significantly improve model accuracy. The RMSE, MSE, R2, and MAE of PCA-PSO-XGBoost are 0.0104, 0.0001, 0.9999, and 0.0036, respectively, and its RMSE, MSE, and MAE values are 94%, 99%, and 98% better than those of PSO-XGBoost, which proves that, on the higher-temperature dataset, removing the influence of redundant variables through PCA dimensionality reduction can effectively improve the accuracy of PSO-XGBoost. These results verify that the PCA-PSO-XGBoost model proposed in this paper also achieves good accuracy in higher-temperature prediction.

4. Conclusions

The sheep-house temperature is time-varying, lagging, strongly disturbed, and multi-coupled. When the measured temperature changes suddenly, the output of the thermal resistance in the temperature sensor is delayed for a period of time, which gives the temperature data a hysteresis. Commonly used linear models cannot effectively capture the dynamic behavior of the sheep-house environment. To solve this problem, this paper selects XGBoost to predict the temperature change, uses the PCA dimensionality reduction algorithm to screen the key environmental factors affecting the sheep-house temperature, and applies the PSO optimization technique to perform a global search for the XGBoost parameters, avoiding the empiricism and randomness of manual XGBoost parameter selection, in order to forecast and assess the temperature of a sheep barn in Xinjiang. Comparing the single BPNN, SVR, and XGBoost models with PSO-BPNN, PSO-SVR, and PSO-XGBoost, the experimental findings demonstrate that:
(1)
By using the PCA dimensionality reduction algorithm, the key environmental factors affecting the temperature of the sheep house can be screened out as air temperature, air humidity, carbon dioxide concentration, PM10, PM2.5, and total suspended particulate matter (TSP), eliminating the correlation between the environmental factors. Irrelevant factors such as light intensity, noise, and H2S can be filtered out of the model inputs, which reduces the number of input variables, lowers the complexity of the model, simplifies its structure, and avoids the curse of dimensionality caused by excessive feature dimensions. All of these benefits are crucial for increasing the model's accuracy. At the same time, this also means that light intensity, noise, and H2S sensors can be dispensed with, saving sensor placement costs and energy consumption.
(2)
Using the PSO optimization algorithm for a global search of the XGBoost model parameters significantly increases model accuracy compared with the training results of a single XGBoost model, avoids the empiricism and randomness of manual parameter selection, and reduces the deviation between the predicted and actual temperature values. The developed PSO-XGBoost model has excellent prediction effectiveness, strong stability, and practical applicability.
(3)
BPNN and SVR need multiple parameters to be configured and many training runs to improve their predictions, while the XGBoost parameter settings are relatively simple, the fitting effect is good, and the model precision is high. Compared with PSO-BPNN and PSO-SVR, the evaluation indicators RMSE, MSE, R2, and MAE of PSO-XGBoost are 0.1256, 0.0158, 0.9957, and 0.0898, respectively. Its higher accuracy indicates greater performance advantages and practical application value.
(4)
In essence, BPNN is a gradient descent algorithm: the optimized objective function is complicated, the convergence speed is sluggish, and it may fall into local extreme values, leading to the failure of network training. The training effect of the SVR model depends heavily on parameter selection, and variation in that selection directly impacts the model's predictive performance and generalizability. Building on conventional boosting, XGBoost employs CPU multi-threading and adds a regularization term that controls model complexity and prevents over-fitting, so it can effectively improve accuracy. In small-sample prediction, a single XGBoost can mine the hidden information in the data better than a single BPNN or SVR and achieves a better fit.
The sheep house temperature prediction model based on PCA-PSO-XGBoost proposed in this paper can practically provide guiding suggestions for monitoring and regulating temperature changes in intensive housing and can be extended to the prediction research of other environmental parameters of other animal houses such as pig houses and cow houses in the future.

Author Contributions

Conceptualization, D.F. and B.Z.; methodology, S.G.H.; software, L.C.; validation, L.X., S.G.H. and B.Z.; formal analysis, L.X.; investigation, S.L.; resources, S.L.; data curation, D.F.; writing—original draft preparation, D.F.; writing—review and editing, S.G.H. and T.L.; visualization, L.X.; supervision, J.G.; project administration, J.G.; funding acquisition, S.L. All authors have read and agreed to the published version of the manuscript.

Funding

This paper was supported partly by the National Natural Science Foundation of China under Grant 61871475, Special Project of Laboratory Construction of Guangzhou Innovation Platform Construction Plan under Grant 201905010006, Guangzhou Key Research and Development Project under Grant 202103000033, 201903010043, Guangdong Science and Technology Planning Project under Grant 2020A1414050060, 2020B0202080002, 2016A020210122, 2015A040405014, Innovation Team Project of Universities in Guangdong Province under Grant 2021KCXTD019, Characteristic Innovation Project of Universities in Guangdong Province under Grant KA190578826, Guangdong Province Enterprise Science and Technology Commissioner Project under Grant GDKTP2021004400, Meizhou City Science and Technology Planning Project under Grant 2021A0305010, Rural Science and Technology Correspondent Project of Zengcheng District, Guangzhou City under Grant 2021B42121631, Educational Science Planning Project of Guangdong Province under Grant 2020GXJK102, 2018GXJK072, Guangdong Province Graduate Education Innovation Program Project under Grant 2022XSLT056, 2022JGXM115.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to company policy.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figure 1. Schematic diagram of the Xinjiang sheep house monitoring.
Figure 2. Flow chart of particle swarm optimization algorithm.
Figure 3. Raw temperature data.
Figure 4. Temperature prediction flow chart based on PCA-PSO-XGBoost.
Figure 5. PCA-PSO-XGBoost prediction results.
Figure 6. Comparison of prediction results.
Figure 7. Comparison of SVR, BPNN and XGBoost.
Figure 8. Comparison of XGBoost and PSO-XGBoost.
Figure 9. Comparison of PSO-BPNN, PSO-SVR, and PSO-XGBoost.
Figure 10. Comparison of PSO-XGBoost and PCA-PSO-XGBoost.
Figure 11. PSO-XGBoost structure after PCA dimensionality reduction and simplification.
Figure 12. Comparison of prediction results in high temperature dataset.
Table 1. Part of experimental original data collected on 8–22 February 2021.

| Time | Air Temperature | Air Humidity | CO2 | PM2.5 | PM10 | Light Intensity | Noise | TSP | H2S |
|---|---|---|---|---|---|---|---|---|---|
| 8 February 2021 17:11:56 | 4.2 | 78 | 1355 | 16.3 | 81.3 | 146 | 31.1 | 120.7 | 5.2 |
| 8 February 2021 17:21:35 | 3.9 | 78.4 | 1330 | 26.5 | 107.1 | 97 | 72.7 | 165.3 | 5.2 |
| … | … | … | … | … | … | … | … | … | … |
| 22 February 2021 17:12:15 | 6.6 | 81.2 | 855 | 12.5 | 24 | 97 | 40.2 | 45.3 | 5.2 |
| 22 February 2021 17:22:15 | 6.4 | 81.4 | 780 | 9.3 | 24.6 | 183 | 34.1 | 42 | 5 |
| 22 February 2021 17:32:16 | 6.3 | 81.4 | 770 | 12.2 | 34.8 | 159 | 31.8 | 58.2 | 1.4 |
Table 2. PCA contribution rates and eigenvalues.

| Component | Initial Eigenvalues: Total | Variance/% | Cumulative/% | Extraction Sums of Squared Loadings: Total | Variance/% | Cumulative/% | Rotation Sums of Squared Loadings: Total | Variance/% | Cumulative/% |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.151 | 35.016 | 35.016 | 3.151 | 35.016 | 35.016 | 2.938 | 32.650 | 32.650 |
| 2 | 2.164 | 24.043 | 59.060 | 2.164 | 24.043 | 59.060 | 2.310 | 25.666 | 58.315 |
| 3 | 1.605 | 17.832 | 76.892 | 1.605 | 17.832 | 76.892 | 1.672 | 18.576 | 76.892 |
| 4 | 0.952 | 10.579 | 87.470 | – | – | – | – | – | – |
| 5 | 0.491 | 5.460 | 92.930 | – | – | – | – | – | – |
| 6 | 0.391 | 4.342 | 97.272 | – | – | – | – | – | – |
| 7 | 0.187 | 2.076 | 99.348 | – | – | – | – | – | – |
| 8 | 0.059 | 0.652 | 100.000 | – | – | – | – | – | – |
| 9 | 0.000 | 0.000 | 100.000 | – | – | – | – | – | – |
Table 3. Component matrix.

| Parameter | Component 1 | Component 2 | Component 3 |
|---|---|---|---|
| TSP | 0.995 | −0.077 | 0.018 |
| PM10 | 0.988 | −0.078 | −0.006 |
| PM2.5 | 0.972 | −0.068 | 0.103 |
| CO2 | 0.030 | 0.901 | 0.188 |
| Air humidity | −0.141 | 0.864 | −0.179 |
| Light intensity | 0.031 | −0.799 | 0.131 |
| Noise | 0.039 | −0.299 | 0.013 |
| Air temperature | 0.016 | −0.047 | 0.904 |
| H2S | −0.062 | 0.062 | −0.871 |
Table 4. Correlation matrix.

| Correlation | Air Temperature | Air Humidity | CO2 | PM2.5 | PM10 | Light Intensity | Noise | TSP | H2S |
|---|---|---|---|---|---|---|---|---|---|
| Air temperature | 1.000 | −0.211 | 0.141 | 0.122 | 0.011 | 0.168 | 0.025 | 0.035 | −0.597 |
| Air humidity | −0.211 | 1.000 | 0.715 | −0.209 | −0.200 | −0.575 | −0.157 | −0.204 | 0.189 |
| CO2 | 0.141 | 0.715 | 1.000 | −0.019 | −0.044 | −0.574 | −0.189 | −0.039 | −0.040 |
| PM2.5 | 0.122 | −0.209 | −0.019 | 1.000 | 0.940 | 0.107 | 0.035 | 0.963 | −0.139 |
| PM10 | 0.011 | −0.200 | −0.044 | 0.940 | 1.000 | 0.094 | 0.068 | 0.997 | −0.068 |
| Light intensity | 0.168 | −0.575 | −0.574 | 0.107 | 0.094 | 1.000 | 0.108 | 0.098 | −0.106 |
| Noise | 0.025 | −0.157 | −0.189 | 0.035 | 0.068 | 0.108 | 1.000 | 0.061 | −0.049 |
| TSP | 0.035 | −0.204 | −0.039 | 0.963 | 0.997 | 0.098 | 0.061 | 1.000 | −0.085 |
| H2S | −0.597 | 0.189 | −0.040 | −0.139 | −0.068 | −0.106 | −0.049 | −0.085 | 1.000 |
Table 5. Experimental results for different prediction models.

| Predict Model | RMSE | MSE | R2 | MAE |
|---|---|---|---|---|
| BPNN | 1.2354 | 1.5261 | 0.5845 | 0.9449 |
| SVR | 1.1422 | 1.3046 | 0.6448 | 0.5718 |
| XGBoost | 0.1373 | 0.0189 | 0.9949 | 0.1034 |
| PSO-SVR | 0.7035 | 0.4950 | 0.8652 | 0.2454 |
| PSO-BPNN | 0.4470 | 0.1998 | 0.9456 | 0.2344 |
| PSO-XGBoost | 0.1256 | 0.0158 | 0.9957 | 0.0898 |
| PCA-PSO-XGBoost | 0.0433 | 0.0019 | 0.9995 | 0.0065 |
Table 6. Experimental results for different prediction models in high temperature dataset.

| Predict Model | RMSE | MSE | R2 | MAE |
|---|---|---|---|---|
| XGBoost | 1.2562 | 1.5782 | 0.8312 | 1.2088 |
| PSO-XGBoost | 0.1676 | 0.0281 | 0.9969 | 0.1495 |
| PCA-PSO-XGBoost | 0.0104 | 0.0001 | 0.9999 | 0.0036 |

Feng, D.; Zhou, B.; Hassan, S.G.; Xu, L.; Liu, T.; Cao, L.; Liu, S.; Guo, J. A Hybrid Model for Temperature Prediction in a Sheep House. Animals 2022, 12, 2806. https://doi.org/10.3390/ani12202806
