1. Introduction
Rapid population growth and climate change have increased the risks to global food security and agricultural production [1]. Information about annual crop production is vital for global and local food security. In particular, measuring and monitoring crop biophysical parameters, including dry biomass, crop height, crop density, and leaf area index (LAI), during the growing season are essential for improving crop growth models and yield estimation [1,2]. Biomass and LAI are two widely used crop parameters in crop monitoring and growth models [3,4]. As input data for crop models, crop biophysical parameters are estimated using direct and indirect methods. Direct methods consist of ground measurements of the plant’s parameters; they are usually destructive, costly, time-consuming, and complicated [5]. In contrast, information extracted from remote sensing data is non-destructive and significantly reduces time and cost. Remote sensing provides vital information on crop growth conditions over agricultural areas owing to its extensive coverage and spatio-temporal resolution [6]. Remote sensing imagery is therefore well suited to accurate crop monitoring.
Both synthetic aperture radar (SAR) and optical data have been used to estimate crop parameters. Numerous studies have investigated the potential of satellite optical data for estimating various crop parameters. Vegetation indices (VIs) extracted from optical bands are widely used to estimate crop parameters and monitor crop conditions. However, when the crop canopy is dense, optical data tend to saturate [7]. In addition, optical data are of little use under cloudy conditions, whereas SAR sensors use microwave wavelengths that can penetrate clouds and haze [1,8,9,10,11,12,13].
SAR sensors can also acquire data day and night, independent of sun illumination, with suitable temporal coverage and sufficient spatial resolution [12,13]. Furthermore, radar backscattering is sensitive to soil and surface parameters and to the state of the crop canopy [14]. These effects, however, vary with sensor parameters (i.e., wavelength, incidence angle, and polarization), target parameters (i.e., canopy structure, water content, soil moisture, and soil roughness), and crop type and growth stage [1,11,15,16,17,18]. Thus, the combination of optical and SAR data has great potential for crop monitoring.
Considerable research has been conducted to estimate various crop parameters using satellite Earth observations, including RADARSAT-2 [1,17,19,20,21], RapidEye [5,19,20,22,23], Sentinel-1 [7,17,24,25,26,27,28], Sentinel-2 [7,25,29,30,31,32], Landsat-5 Thematic Mapper (TM) [33,34], Landsat-7 Enhanced Thematic Mapper Plus (ETM+) [34,35,36], Landsat-8 Operational Land Imager (OLI) [7,17,31,32,35,36,37], Worldview-2/3 [17,27,28,38,39], and MODIS [40,41].
Crop parameter estimation methods in the literature can generally be categorized into three groups: (1) parametric models, (2) non-parametric models, and (3) physically-based models [42]. Parametric models assume an explicit relationship between input and output variables. In contrast, non-parametric models make no assumption about the statistical distribution of the input data. Finally, physically-based models use physical laws, and model variables are frequently obtained from Radiative Transfer Models (RTMs) [42].
The new generation of satellite sensors, coupled with a growing need for big data mining, has increased the need for artificial intelligence (AI) in Earth observation data analysis. Machine learning (ML), a subset of AI, comprises algorithms that learn from training data. ML algorithms can rapidly process large amounts of data and extract useful insight from them. Another advantage of ML algorithms is that no a priori assumption about the data distribution is needed [43]. Non-parametric machine learning algorithms (MLAs), which make no assumption about the statistical distribution of input data, have successfully been applied to remote sensing data for crop biophysical parameter retrieval, yield estimation, and crop mapping. Reisi Gahrouei, et al. [22] used an artificial neural network (ANN) and SVR to estimate the LAI and dry biomass of three crops (soybean, corn, and canola) from high-resolution RapidEye data. Reisi-Gahrouei, et al. [44] also used MLR and ANN to estimate crop biomass using UAVSAR data. Sharifi and Hosseingholizadeh [45] investigated the potential of MLR, relevance vector regression (RVR), and SVR to estimate cereal height and biomass. Zhu, et al. [46] utilized unmanned aerial vehicle (UAV) data to assess the ability of four MLAs, namely MLR, RF, ANN, and SVR, to estimate above-ground biomass (AGB). Luo, et al. [7] utilized MLR and SVR to estimate corn LAI and biomass. Deb, et al. [37] used parametric regression models and SVR to estimate agro-pastoral AGB. The strong generalization capability of ML methods and their robustness to noise when few samples are available make them excellent tools for processing remote sensing data and providing smart solutions in precision agriculture.
In this study, we considered two scenarios to estimate crop parameters. In the first scenario, we used four MLAs (SVR, RF, GB, and XGB) to estimate two crop biophysical parameters. GB and XGB are relatively recent machine learning algorithms that have received little attention in crop parameter estimation. This scenario was performed in three steps: (1) using polarimetric SAR features, (2) using optical VIs, and (3) using the integration of SAR and VI features. These three steps show the potential of radar data, optical data, and their combination for estimating crop parameters. In the second scenario, we used a deep artificial neural network to model the crops’ LAI and dry biomass; deep neural networks have likewise received little attention in crop parameter estimation. We sought the best performance from these algorithms through feature engineering and suitable hyper-parameter tuning. Several features, including VIs extracted from RapidEye spectral reflectance and polarimetric SAR decomposition features extracted from UAVSAR data, were selected as inputs to these algorithms. Moreover, the importance of each feature was investigated. The results were compared, and the best method for estimating crop parameters was determined.
3. Methodology
MLAs have recently been used for classification and regression problems in many areas. In this study, four regression models (RF, GB, XGB, and SVR) were used to estimate crop LAI and dry biomass. The MLAs were implemented using the open-source Python Scikit-learn package, and the deep ANN was implemented using the Keras package. The data were divided into training and test sets: two-thirds (~66.7%) of the data were used to train the models, and the remaining third (~33.3%) was used as test data. First, we calculated the correlation among the SAR features and optical VIs.
Then, RF feature importance was calculated for each crop and crop parameter separately. For each pair of features whose absolute correlation exceeded 0.9, the less important feature was removed. Finally, the remaining features were used to estimate crop parameters. The selected SAR features and VIs were fed into the models separately in the first and second steps, and their combination was used in the third step. The results for the three input sets were compared with each other. Furthermore, a deep artificial neural network based on the selected features was designed and implemented. Grid Search Cross-Validation (GridSearchCV) was used to tune the hyper-parameters of all the ML algorithms.
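The feature screening step can be illustrated with a minimal, library-free Python sketch. It is one plausible reading of the procedure rather than the implementation used in this study: features are visited from most to least important, and a feature is kept only if its absolute Pearson correlation with every feature already kept does not exceed the threshold. The feature names, values, and importance scores below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def drop_correlated(features, importance, threshold=0.9):
    """Keep the more important feature of every highly correlated pair.

    features:   dict mapping feature name -> list of values
    importance: dict mapping feature name -> importance score
    """
    # Visit features from most to least important, so that a less important
    # feature is the one discarded when a pair exceeds the threshold.
    ranked = sorted(features, key=lambda name: importance[name], reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: feat_b is almost a multiple of feat_a.
features = {
    "feat_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "feat_b": [2.1, 3.9, 6.2, 8.0, 9.9],
    "feat_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
importance = {"feat_a": 0.5, "feat_b": 0.3, "feat_c": 0.2}

selected = drop_correlated(features, importance)
print(selected)
```

In this toy case, feat_b is discarded because it is nearly collinear with the more important feat_a, while feat_c is retained.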
3.1. Random Forest Regression
RF is a robust ensemble learning method that is widely used in classification and regression problems. Ensemble learning is the process in which multiple models are produced and combined to solve a particular task. Two common types of ensemble learning are boosting and bagging. Bagging fits several models that are trained independently and combines them, which reduces variance, helps avoid overfitting, and improves the stability and accuracy of the combined model [63]. RF is a successful bagging approach made up of a large number of individual decision trees. Each tree grows independently using a bootstrap sample of the training data [29] and makes its own prediction; the model then combines all predictions to obtain better performance [64].
In contrast to a linear regression model, an RF regressor cannot extrapolate, i.e., it cannot predict values outside the range of the training samples. Various studies have used RF regression and classification models to estimate crop parameters or map croplands [31,40,46,65]. The GridSearchCV parameters used for RF are shown in Table 3.
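Bagging, and the difficulty tree ensembles have in predicting beyond the training samples, can be illustrated with a short library-free Python sketch. It is a simplification and not the Scikit-learn RF used in this study: each "tree" is a one-split regression stump, no random feature subsetting is applied, and the data are a hypothetical linear target.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) by minimising squared error."""
    best = None
    for t in sorted(set(xs))[:-1]:  # thresholds that leave both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # every x in the bootstrap sample is identical
        m = sum(ys) / len(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    t, ml, mr = stump
    return ml if x <= t else mr

def fit_bagged(xs, ys, n_trees=25, seed=0):
    """Train each stump independently on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_bagged(trees, x):
    """Average the predictions of all stumps."""
    return sum(predict_stump(t, x) for t in trees) / len(trees)

xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]           # hypothetical linear target, maximum 20
trees = fit_bagged(xs, ys)

inside = predict_bagged(trees, 5.0)
outside = predict_bagged(trees, 100.0)   # far beyond the training range
print(inside, outside, max(ys))
```

Because every leaf value is an average of training targets, the prediction at x = 100 cannot exceed the largest training target, whereas a linear model fitted to the same data would return 200.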
3.2. Support Vector Regression
The support vector machine (SVM) algorithm, developed by Vapnik and his colleagues [66], is one of the most widely used kernel-based machine learning algorithms and is applied to a variety of problems, especially classification tasks [63]. While maintaining all of the algorithm’s main features, such as the maximal margin, SVM can also be used in regression problems. SVR, first introduced by Drucker, et al. [67], differs from SVM in several minor respects. The output of a regression model is a continuous value with infinitely many possible values, whereas the output of an SVM classifier takes a finite number of values.
In regression, a margin of tolerance (epsilon) is set around the approximation, which is one of the reasons the regression formulation is more complicated than the SVM classifier. SVR gives the flexibility to define how much error is acceptable in the model and finds an appropriate line (or hyperplane in higher dimensions) to fit the data. Points outside the epsilon tube are penalized, whereas points inside the tube, whether above or below the prediction function, receive no penalty. SVR and SVM have been widely used in recent research to estimate crop parameters and map croplands [7,26,31,37,45,46,65,68,69]. The Grid Search parameters used for the SVR model are shown in Table 4.
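The role of the tolerance margin can be made concrete with the epsilon-insensitive loss that underlies SVR. The sketch below only evaluates this loss for given predictions; it does not fit an SVR model, and the value epsilon = 0.1 and the sample values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    return [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]

observed = [1.0, 2.0, 3.0]
predicted = [1.05, 2.5, 2.0]   # absolute residuals: 0.05, 0.5, 1.0

losses = epsilon_insensitive_loss(observed, predicted, epsilon=0.1)
print(losses)
```

The first sample lies inside the tube and contributes no loss regardless of the sign of its residual, while the other two are penalized only by the amount by which they fall outside the tube.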
3.3. Gradient Boosting and Extreme Gradient Boosting
GB regression algorithms were developed by Friedman [70,71]. As noted in Section 3.1, two common types of ensemble learning are boosting and bagging. GB is an extension of the boosting method and, like RF, is used in regression and classification tasks. The GB method is based on minimizing a loss function, and various types of loss functions can be used. Regularization techniques are customarily used to reduce overfitting. GB has rarely been used in crop biomass estimation [31,65]. One of the most attractive gradient boosting implementations is XGB [72], first started by Tianqi Chen as a research project (Tianqi Chen on http://datascience.la/xgboost-workshop-and-meetup-talk-with-tianqi-chen/, accessed on 3 July 2021). It is an ensemble machine learning algorithm that uses a gradient boosting framework. XGB is designed to enhance a machine learning model’s performance, speed, flexibility, and efficiency. The Grid Search parameters used for the GB and XGB algorithms are shown in Table 5.
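The idea of boosting by minimizing a loss function can be sketched in a few lines of library-free Python. This is an illustrative simplification, not the Scikit-learn GB or XGB implementation used in this study: it assumes a squared-error loss, for which the negative gradient is simply the residual, uses one-split stumps as weak learners, and applies a learning rate (shrinkage) as the only form of regularization. The data and the settings n_rounds = 50 and learning_rate = 0.1 are hypothetical.

```python
def fit_stump(xs, targets):
    """Least-squares regression stump: (threshold, left_mean, right_mean)."""
    best = None
    for t in sorted(set(xs))[:-1]:  # thresholds that leave both sides non-empty
        left = [r for x, r in zip(xs, targets) if x <= t]
        right = [r for x, r in zip(xs, targets) if x > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]

def predict_stump(stump, x):
    t, ml, mr = stump
    return ml if x <= t else mr

def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Sequentially fit stumps to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)          # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For the loss 0.5 * (y - p)^2 the negative gradient is the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}

def predict(model, x):
    correction = sum(predict_stump(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction

def mse(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

xs = [float(i) for i in range(1, 11)]
ys = [x ** 2 for x in xs]             # hypothetical non-linear target

model = fit_gradient_boosting(xs, ys)
mse_baseline = mse(ys, [model["base"]] * len(ys))
mse_boosted = mse(ys, [predict(model, x) for x in xs])
print(mse_baseline, mse_boosted)
```

Unlike bagging, where the trees are trained independently, each stump here depends on the errors left by all previous stumps, so the training error decreases round by round.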
3.4. Deep Artificial Neural Network Regression
ANNs are popular machine learning algorithms inspired by the human brain [64]. A simplified model of the brain consists of a large number of basic computing units called neurons. Through these densely connected neurons, highly complex computations can be carried out. An ANN consists of interconnected neurons that learn by adapting and modifying their connection weights [29]. This model typically includes one input layer, more than two hidden layers, and one output layer. In the ANN model, neurons of one layer can be connected to neurons of other layers but not to neurons of the same layer. In a fully connected ANN, each neuron is connected to all neurons in the previous and following layers [73].
In this study, we used a dense, deep ANN. The primary considerations when tuning the hyper-parameters of an ANN are the number of neurons and the number of hidden layers. Several empirical methods can determine the number of neurons in each layer [74,75]. In this study, we determined the number of neurons using Equation (2) [74]:
In this equation, Nn is the number of neurons in each layer, N is the number of input neurons, and m is the number of layers. We examined various activation functions for the deep ANN model, including ReLU, Tanh, Sigmoid, and Linear. The Adam optimization method, an extension of Stochastic Gradient Descent (SGD), was used to update the network’s weights iteratively. Early stopping was used to avoid overfitting, and 20% of the training samples were held out as validation data.
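The early stopping logic can be sketched independently of any deep learning framework. The library-free Python below is an assumption-laden illustration, not the Keras configuration used in this study: the patience value, the validation losses, and the choice of taking the last 20% of the samples as validation data are all hypothetical.

```python
def split_validation(samples, fraction=0.2):
    """Hold out the last `fraction` of the training samples for validation."""
    cut = len(samples) - int(round(len(samples) * fraction))
    return samples[:cut], samples[cut:]

def early_stopping(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch

train_part, val_part = split_validation(list(range(10)))

# Hypothetical validation losses recorded after each epoch.
val_losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.75, 0.6]
stop_epoch, best_epoch = early_stopping(val_losses, patience=3)
print(len(train_part), len(val_part), stop_epoch, best_epoch)
```

With a patience of three epochs, training halts at epoch 5 and the weights from epoch 2 would be restored, so the later improvement at epoch 6 is never reached; a larger patience trades longer training for a lower risk of stopping too early.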
3.5. Evaluation Criteria
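Several criteria were used to evaluate prediction performance, including the root mean square error (RMSE), mean absolute error (MAE), and Pearson correlation coefficient (R). The RMSE is calculated as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(P_i - O_i\right)^2}$$
where $P_i$ and $O_i$ are the predicted and observed values, respectively, and $N$ is the number of data.
In addition, the normalized RMSE (nRMSE) is presented in one figure for clearer visualization. MAE is calculated as follows:
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|P_i - O_i\right|$$
R measures the strength of the linear relationship between predicted and observed data:
$$R = \frac{\sum_{i=1}^{N}\left(P_i - \bar{P}\right)\left(O_i - \bar{O}\right)}{\sqrt{\sum_{i=1}^{N}\left(P_i - \bar{P}\right)^2}\,\sqrt{\sum_{i=1}^{N}\left(O_i - \bar{O}\right)^2}}$$
where $\bar{P}$ and $\bar{O}$ are the means of the predicted and observed values, respectively.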
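For reference, the three evaluation criteria can be computed with a few lines of library-free Python using their standard definitions. The observed and predicted values below are hypothetical and are not taken from the study data; nRMSE is omitted because its normalization is not specified here.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)

# Hypothetical observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
r_value = pearson_r(observed, predicted)
print(rmse_value, mae_value, r_value)
```

RMSE penalizes large errors more heavily than MAE, which is why the two can rank models differently, while R is insensitive to a constant bias in the predictions.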
4. Results and Discussion
Several optical VIs and polarimetric SAR features were extracted from RapidEye and UAVSAR data to explore the potential of these data for estimating crop parameters. The results showed acceptable agreement with previous research by Hosseini, et al. [9] and Reisi Gahrouei, et al. [22]. The impact of optical VIs, UAVSAR polarimetric features, and their integration on the accuracy of retrieving dry biomass and LAI using four machine learning regression models is assessed in the following sections.
4.1. Time Series Analysis of Radar Backscattering
Figure 2 presents the temporal profiles of the three crops. The left axes show the three SAR backscattering coefficients (VV, HH, and HV). The right axes show dry biomass in the left panels and LAI in the right panels. For canola, all three intensities (VV, HH, and HV) generally decrease from 17 June to 14 July 2012. As expected, dry biomass increased during the campaign, while LAI decreased from the start to the middle of the campaign. For corn, all three SAR backscattering coefficients generally rise from 17 June to 14 July 2012, and both dry biomass and LAI increased during the campaign. For soybean, the HV and HH backscattering coefficients rise overall, but the VV behavior is irregular. As with corn, soybean dry biomass and LAI increased during the campaign.
4.2. Correlation and Feature Importance
Correlation coefficients between all features extracted for each crop are shown in Figure 3. For canola, the correlation between DF and DY, OF and OY, and VF and VY is high. The absolute correlation between DF and most of the other radar features is generally higher than 0.9. Overall, the correlation of OF and OY with the other decomposition features is relatively low. The correlation of DF and DY with the other radar features is also high. In contrast, the correlation between radar features and VIs is low, while the correlation among VIs is high in most cases. For corn, apart from A, H, and Alpha, the correlation between SAR features is relatively high; however, the number of radar features with an absolute correlation exceeding 0.9 is negligible. The correlation between DF and DY, OF and OY, and VF and VY is higher than 0.9. Besides, a high correlation can be seen between VIs. For soybean, the correlation between SAR features is relatively low compared to corn and canola.
For corn dry biomass, the most important feature is the MCTI VI; however, four of the five most important features are SAR features. For corn LAI, as for corn dry biomass, the most important feature is MCTI. In contrast to corn dry biomass, however, four of the five most important features are spectral VIs. For canola dry biomass, the most important feature is DF; of the five most important features, three are SAR parameters and the remaining two are spectral VIs. For canola LAI, the most important feature is RVI, and four of the five most important features are SAR parameters. For soybean dry biomass, the most important feature is CL-EDGE, and four of the five most important features are spectral VIs. Finally, for soybean LAI, as for soybean dry biomass, CL-EDGE has the highest importance, and four of the five most important features are spectral VIs.
The details of the RF feature importance are listed in Table 6, where cells are shaded from green (higher importance) to yellow (lower importance). Details of the features used for each crop parameter can be seen in Table 7.
4.3. Sensitivity Analysis
Complete information on the validation and calibration accuracies for retrieving dry biomass and LAI for each crop is shown in Table 8 and Table 9, respectively. The following details refer to the validation data. As it matures, canola builds up appreciable plant material and vegetation water. This considerable water volume might lead to a greater tendency towards saturation of signals from canola canopies, especially for SAR backscatter. For canola dry biomass, the accuracy obtained with the integrated input data was generally better than that obtained with the SAR parameters or spectral VIs separately. The prediction performance of optical VIs was slightly better than that of the SAR features. The best performance was achieved by SVR with RMSE = 26.29 g/m², MAE = 20.72 g/m², and R = 0.95 using a combination of SAR and optical data (Figure 4a). A slight overestimation can be seen in the early growth stage, and SVR underestimated canola dry biomass at advanced development stages. Using spectral VIs alone, the lowest error was provided by the GB method, with an RMSE of 38.87 g/m² and an MAE of 28.34 g/m², considerably higher than the error of the integrated estimation. Using SAR polarimetric features alone, the highest accuracy was delivered by the RF algorithm (RMSE = 46.67 g/m², MAE = 33.45 g/m², and R = 0.83). The results showed no saturation at high or low dry biomass values when using a combination of optical and SAR data.
LAI is indicative of the crop structure and affects both reflectance and backscatter at canopy scales. For canola LAI, the integration of SAR polarimetric data and spectral VIs improved the performance of GB and XGB. However, optical VIs alone outperformed the integration of SAR and optical features in SVR and RF. Besides, the accuracy of LAI estimation with spectral VIs was better than with SAR polarimetric data. As demonstrated by Figure 4b, canola LAI estimated by GB using both SAR and VI features was highly correlated with in situ measured LAI (RMSE = 0.557 m²/m², MAE = 0.399 m²/m², and R = 0.95). However, GB slightly underestimated LAI in the mid-growth stage. With spectral VIs, the highest accuracy was obtained by GB and XGB, with approximately the same RMSE but a better MAE for XGB. Minimal saturation could be seen at high LAI values.
For corn dry biomass, as for canola dry biomass, the integrated features gave higher accuracy. The performance of optical VIs was better than that of SAR polarimetric parameters in RF, GB, and XGB, while in SVR the accuracy of SAR polarimetric features was higher than that of spectral VIs. The predictions delivered by RF, SVR, and GB have the same R, but RF has a lower error than SVR and GB. The highest accuracy was obtained by the RF regression model with RMSE = 57.97 g/m², MAE = 41.15 g/m², and R = 0.96 (Figure 4c). Although only a few ground measurements are available at high dry biomass, observed and estimated dry biomass values are well distributed about the 1:1 line. The best performance with optical VIs was obtained by the GB regression model with RMSE = 69.85 g/m², MAE = 44.37 g/m², and R = 0.94. The highest accuracy with SAR polarimetric data was obtained by the SVR model with RMSE = 60.2 g/m², MAE = 46.94 g/m², and R = 0.94. The RF results show slight saturation at high values of corn dry biomass.
For corn LAI, the best performance was obtained by integrating SAR polarimetric data and spectral VIs. Besides, the estimation accuracy of spectral VIs was slightly worse than that of SAR polarimetric data. The best accuracy was obtained by the GB regression method with RMSE = 0.298 m²/m², MAE = 0.219 m²/m², and R = 0.96 using a combination of SAR and optical VI features (Figure 4d). Early in the season, GB overestimates LAI. This may be due to a more open canopy in the early growth stages, which leaves more soil exposed so that soil properties contribute significantly to reflectance and backscatter. The RMSEs of SVR, RF, and XGB are nearly equal. The best performance for spectral VIs was provided by the XGB model with RMSE = 0.399 m²/m², MAE = 0.273 m²/m², and R = 0.92. The best performance for SAR data was delivered by the XGB model with RMSE = 0.321 m²/m², MAE = 0.221 m²/m², and R = 0.95.
For soybean dry biomass, the integration of SAR and optical data generally performed better. The estimation performance of SAR polarimetric data was slightly better than that of optical data in all cases. The best performance among the MLAs was obtained by the GB regression model with RMSE = 5.00 g/m², MAE = 3.5 g/m², and R = 0.94 (Figure 4e). SVR had the lowest MAE among all algorithms; however, RF had a lower RMSE than SVR. Also, RF and GB had a similar MAE, but RF had the lower RMSE. The ML algorithms showed significant saturation at high dry biomass values (higher than 60 g/m²).
For soybean LAI, as for corn and canola, the best performance was obtained by integrating SAR and optical data. Also, the accuracy of SAR polarimetric data was better than that of spectral VIs. XGB and GB had the same RMSE; however, XGB had a better MAE (RMSE = 0.233 m²/m², MAE = 0.164 m²/m², and R = 0.94; Figure 4f). The highest accuracy using SAR data was obtained by SVR with RMSE = 0.291 m²/m², MAE = 0.209 m²/m², and R = 0.9. Using optical VIs, XGB provided the best results with RMSE = 0.36 m²/m², MAE = 0.247 m²/m², and R = 0.84. The best performance in estimating dry biomass and LAI for each crop is shown in Figure 4.
Figure 5 shows boxplots of the nRMSE of the four MLAs for the three crops. The box shows the quartiles of the distribution, and the whiskers show the rest of the data. Each boxplot comprises the calibration and validation results of one method for the three input data sets (SAR polarimetric features, spectral VIs, and the integration of SAR and optical features). The results showed that the MLAs performed better for canola than for corn and soybean. For corn dry biomass, SVR was more accurate than the other methods. In addition, the median nRMSE of XGB for corn dry biomass and LAI is lower, which means better accuracy.
4.4. The Results of Deep Neural Network
The results of the deep ANN can be seen in Figure 6. In all cases, the deep ANN improved the estimation accuracy. For canola dry biomass and LAI, the model delivered RMSEs of 25.8 g/m² and 0.525 m²/m², respectively (Figure 6a,b), a clear improvement for both parameters. For corn dry biomass and LAI, the deep ANN provided RMSEs of 54.43 g/m² and 0.273 m²/m², respectively (Figure 6c,d), again improving the retrieval accuracy for both. Finally, for soybean dry biomass and LAI, the deep ANN provided RMSEs of 4.95 g/m² and 0.211 m²/m², respectively (Figure 6e,f), a slight improvement for both LAI and dry biomass. Minimal saturation can be seen at high soybean dry biomass (approximately higher than 60 g/m²).
4.5. Discussion
Over the last decades, progress in satellite SAR and optical remote sensing has enabled further research on crop biophysical parameters. MLAs have shown significant potential in many areas, and their use for solving remote sensing problems has grown recently. Crop biophysical parameters are vital for crop monitoring, crop stress assessment, and crop growth modeling, to name but a few. The number of training and test samples, the best value of each hyper-parameter, and many other factors not mentioned here affect model performance, which is why several MLAs need to be compared to determine the best approach for estimating the target parameters. In this study, we focused on the potential of four MLAs to estimate two crop biophysical parameters. XGB is a relatively new method in crop parameter estimation. UAVSAR data provide information throughout the crop growth period. In general, for all three crops, a combination of UAVSAR polarimetric features and spectral VIs performed better in estimating crop biomass and LAI. UAVSAR L-band polarimetric features showed great potential in retrieving soybean dry biomass and LAI. For canola and corn LAI and dry biomass, the estimation accuracy using optical VIs was generally better than that using SAR polarimetric features in each regression model.
Considering other research, Reisi-Gahrouei, et al. [44] achieved an RMSE of 56.55 g/m² and R = 0.72 for canola dry biomass using decomposition features from UAVSAR L-band data. They also achieved an RMSE of 13.48 g/m² and R = 0.82 for soybean dry biomass. In another study, Reisi Gahrouei, et al. [22] achieved 25.22 g/m² for canola, 88.13 g/m² for corn, and 5.91 g/m² for soybean using spectral VIs extracted from RapidEye optical data. Their model delivered RMSEs of 0.59 m²/m² for canola, 0.27 m²/m² for corn, and 0.21 m²/m² for soybean. A combination of UAVSAR L-band data and spectral VIs improved the soybean and canola dry biomass estimation in our study. Mandal, et al. [76] used various methods to estimate wet biomass and PAI of soybean and wheat. They achieved RMSEs between 0.73 and 1.21 g/m² for wheat wet biomass, and their results for wheat PAI were between 0.83 and 1.48 m²/m². Their best soybean wet biomass and PAI results were 0.34 g/m² and 0.72 m²/m², respectively. We achieved RMSEs of 4.95 g/m² and 25.80 g/m² for soybean and canola dry biomass, respectively. Our model also provided RMSEs of 0.211 m²/m² for soybean LAI, 0.273 m²/m² for corn LAI, and 0.525 m²/m² for canola LAI. Our model improved the accuracy of LAI estimation, especially for corn.
5. Conclusions
Biomass and LAI are two critical parameters in crop growth modeling and crop monitoring. This paper assessed the potential of four MLAs to estimate the dry biomass and LAI of three crops: soybean, corn, and canola. In situ measurements were collected during the SMAPVEX-12 campaign over Manitoba, Canada. Several polarimetric features were extracted from UAVSAR data, and various spectral VIs were extracted from RapidEye optical data. The correlation between all features and the RF importance of each feature were calculated. For feature pairs with an absolute correlation above 0.9, the feature with lower importance was removed. The remaining features were incorporated into machine learning regression models. The results showed that the integration of SAR polarimetric features and spectral VIs better estimates dry biomass and LAI. Besides, XGB showed great potential in assessing crop biophysical parameters. For LAI, the best RMSE of the MLAs was 0.557 m²/m² for canola, 0.298 m²/m² for corn, and 0.233 m²/m² for soybean. For dry biomass, the best RMSE of the MLAs was 26.29 g/m² for canola, 57.97 g/m² for corn, and 5.00 g/m² for soybean. In addition, the deep neural network achieved RMSEs of 0.525 m²/m², 0.273 m²/m², and 0.211 m²/m² for canola, corn, and soybean LAI, respectively, and 25.80 g/m², 54.43 g/m², and 4.95 g/m² for canola, corn, and soybean dry biomass, respectively.