Article

Soil Salinity Prediction in an Arid Area Based on Long Time-Series Multispectral Imaging

1 College of Energy and Power Engineering, Lanzhou University of Technology, Lanzhou 730050, China
2 Key Laboratory of Smart Agriculture and Water-Saving Irrigation Equipment, Ministry of Agriculture and Rural Affairs, Lanzhou 730050, China
3 College of Environmental and Energy Engineering, Beijing University of Technology, Beijing 100124, China
* Author to whom correspondence should be addressed.
Agriculture 2024, 14(9), 1539; https://doi.org/10.3390/agriculture14091539
Submission received: 4 August 2024 / Revised: 26 August 2024 / Accepted: 26 August 2024 / Published: 6 September 2024
(This article belongs to the Section Agricultural Soils)

Abstract

Traditional soil salinity measurement methods are generally complex and labor-intensive, which restricts the long-term monitoring of soil salinity, particularly in arid areas. In this study, soil salt content (SSC) data were collected from farms in the Heihe River Basin in Northwest China in three consecutive years (2021, 2022, and 2023). The spectral reflectance and texture features of the sampling sites were extracted from long-term unmanned aerial vehicle (UAV) multispectral images, and improved spectral indices were constructed by replacing the red or near-infrared band with the red edge band. The spectral indices were calculated, and four combinations of sensitive variables were then used to predict soil salt contents. A Pearson correlation analysis was performed to screen 57 sensitive features. In addition, 36 modeling scenarios were built using the Extreme Gradient Boosting (XGBoost, implemented in R 4.3.1), Backpropagation Neural Network (BPNN), and Random Forest (RF) algorithms, and the best-performing algorithm for predicting the soil salt contents of farmland in the Heihe River Basin, in the arid region of Northwest China, was determined. The results showed a higher prediction accuracy for the XGBoost algorithm than for the RF and BPNN algorithms, accurately reflecting the actual soil salt contents in the arid area. The most accurate predictions were obtained in 2023 using the XGBoost algorithm, with coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) ranges of 0.622–0.820, 0.086–0.157, and 0.078–0.134, respectively, whereas the most stable prediction results were obtained using the data collected in 2022.
In terms of the sensitive variable input combinations, the XGBoost algorithm with the spectral index–spectral reflectance–texture feature input combination gave higher prediction accuracies than the other variable combinations in 2022 and 2023. Specifically, the R2, RMSE, and MAE values obtained using the spectral index–spectral reflectance–texture feature input combination were 0.674, 0.133, and 0.086 in 2022 and 0.820, 0.165, and 0.134 in 2023, respectively. Therefore, our results demonstrated that the spectral index–spectral reflectance–texture feature combination was the optimal sensitive variable input for the machine learning algorithms, of which XGBoost was the best-performing model for predicting soil salt contents. The results of this study provide a theoretical basis for the rapid and accurate prediction of soil salinity in arid areas.

1. Introduction

Soil salinization is a dynamic spatiotemporal process of soil change. It consists of salt accumulation at the soil surface through the evaporation of soil water and groundwater, resulting in the large-scale degradation of cultivated land and, consequently, reduced crop yields [1]. Soil salinization has become a serious environmental issue, affecting more than 1.0 × 10^9 hectares of land worldwide [2,3,4]. It seriously restricts agricultural development and economic growth, so rapid and accurate quantification of soil salinity is of great importance for the effective management of saline–alkali land. Traditional methods of soil salinity quantification are time-consuming, labor-intensive, and costly, and are not suitable for the large-scale monitoring of soil salinity in cultivated land [5]. Numerous studies have demonstrated the capability of unmanned aerial vehicle (UAV) remote sensing technology to invert farmland soil salinity. Compared to satellite remote sensing, drones can provide centimeter-level images with a high degree of flexibility, allowing researchers to observe soil salinity changes in small-scale areas in detail [6,7].
Ground field measurements of soil contents combined with remote sensing technologies have been the major methods used to predict soil salinity [8]. Cui [9] used the UAV multispectral platform to construct multivariate screening and multi-depth soil salt inversion methods, demonstrating the great capacity of the combined machine learning–UAV multispectral remote sensing platform in monitoring soil salt contents in farmland. Zhao [10] explored the correlations between soil spectral indices and salt contents in soil surfaces under different vegetation cover types, including bare land and alfalfa land, to construct a soil salt predictive model, introducing the red edge band to extract new spectral information. This spectral information was used to replace the red band, resulting in good inversion results. Jody [11] used a drone equipped with a multispectral camera to predict maize canopy nitrogen (N) contents under different vegetation growth stages and soil characteristics, thereby achieving more efficient field N management. Numerous researchers have also improved the accuracies of inversion models from the perspective of variable parameter selection and modeling analysis. Zhao [12] introduced the Support Vector Machine Recursive Feature Elimination (SVM-RFE) algorithm to select important input variables for salinity inversion at different depths under corn alfalfa cultivation, highlighting the importance of the red edge band in improving spectral indices. The results demonstrated the effectiveness of the SVM-RFE feature selection algorithm in improving the accuracy of the inversion model. Zhu [13] combined UAV-based hyperspectral visible and near-infrared spectroscopy with two feature selection techniques to predict soil salinity, with R2 = 0.799 for the optimal model and R2 = 0.714 for the worst model, further confirming the effectiveness of hyperspectral images in surface soil salinity estimation and mapping. 
Therefore, research on soil salinity prediction modeling remains a hot topic in this field.
Long-term soil salinity data can exhibit strong correlations with the texture characteristics of spectral vegetation canopy images. In recent years, image texture features have gradually been used to monitor crop growth and quantify soil salinity [14,15,16]. Numerous researchers have employed image texture features to construct texture index-based models for estimating vegetation parameters and aboveground biomass [17,18]. Cheng [19] constructed a model for predicting soil moisture contents at different depths in a corn field based on soil spectral indices, vegetation canopy temperatures, and texture features extracted from UAV image data. Zhang [20] combined Sentinel-2 texture features and Sentinel-1 backscatter coefficients to implement three machine learning algorithms using three variable selection algorithms to predict soil salt contents. The Extreme Gradient Boosting (XGBoost) algorithm is an optimized distributed gradient-boosting library based on the second-order Taylor expansion to optimize the objective function, thereby reducing potential overfitting and improving prediction accuracy [21]. Li [22] used the XGBoost algorithm to predict soybean and maize yields in the northwestern part of the United States based on satellite and meteorological data. The results show that the XGBoost algorithm can predict crop yield well. Han [23] constructed inversion models by implementing the XGBoost algorithm combined with a gray relational analysis to select sensitive spectral indices and predict salinity contents at different soil depths, demonstrating the effectiveness of the XGBoost algorithm in screening redundant values of sensitive variables. Nevertheless, there are still limited studies on soil salinity prediction using the XGBoost algorithm. Moreover, saline–alkali lands can exhibit different spectral characteristics over long-term vegetation growing seasons. 
Therefore, it is crucial to explore the effectiveness of multispectral images combined with machine learning algorithms to predict long-term soil salinity contents.
In this context, UAV multispectral images and soil salinity data were collected from the Bianwan Farm area, China, under different types of cover crops in 2021, 2022, and 2023. The specific objectives of this study were as follows: (1) to propose new spectral indices by replacing the red or near-infrared band with the red edge band, and to evaluate the feasibility of their application in soil salinity inversion; (2) to use a Pearson correlation analysis to reveal the main sensitive variables and determine the best combination of variable inputs; and (3) to compare the prediction accuracies of the different machine learning algorithms and determine the most accurate model for long-term soil salinity prediction in the study area. The results of this study provide a useful reference for monitoring soil salinity in the arid areas of Northwest China, as well as a theoretical basis for the rapid and accurate prediction of soil salinity.

2. Materials and Methods

2.1. Study Area

The study area is located in the Bianwan Farm in the Heihe River Basin, Gansu Province, China (Figure 1). The altitude of the study area is about 1390 m, with a total area of about 15.6 km², of which cultivated land covers about 5.4 km². The study area is characterized by large diurnal variations in air temperature. The climate is dry and cold, with low precipitation and high evapotranspiration, resulting in relatively scarce water resources and an uneven spatiotemporal distribution of precipitation. Several crop species are cultivated in the Bianwan Farm, such as alfalfa, corn, and wheat.

2.2. Data Collection and Processing

2.2.1. Unmanned Aerial Vehicle Multispectral Images

In this study, the UAV data of the study area were collected from May to June in the 2021–2023 period, when the alfalfa was in the branching stage. The weather during the collection period was sunny without precipitation events, with an average temperature of 33 °C. The multispectral images were obtained using the DJI Sprite 4 multispectral camera (Shenzhen DJI Innovation Technology Co., Shenzhen, China) (Figure 2a). The camera collected five spectral bands, namely blue, green, red, near-infrared, and red edge (Table 1).

2.2.2. Field Soil Salinity Data

In this study, in situ sampling campaigns were conducted in the alfalfa fields of the study area from May to June over the 2021–2023 period (Figure 2c). The samples were collected while the alfalfa was at the meristematic stage, when vegetation does not completely cover the surface soil, so the spectral indices captured both bare soil and soil covered by vegetation. Specifically, a total of 70 sampling sites were selected from the alfalfa plots every year (0–20 cm depth). About 50 g of soil was collected from each sampling site using a soil drill. In addition, a hand-held Real-Time Kinematic (RTK) receiver (Shenzhen DJI Innovation Technology Co., Shenzhen, China) was employed to locate the sampling sites. Afterward, 30 g of each soil sample was placed in an aluminum box and oven-dried at 105 °C for eight hours. Subsequently, the dried soil samples were ground and sieved through a 2 mm sieve. Then, 150 mL of distilled water was added to the sieved soil samples (a 1:5 soil-to-water ratio) and thoroughly stirred. The mixtures were left to stand for 10 h before measuring the electrical conductivity of the supernatants using a DJS-1C (USA-HACH, Loveland, CO, USA) conductivity meter (Figure 2b). The salt contents (SSC, %) of the soil samples were determined using the following empirical formula: SSC = 0.2882 × EC1:5 + 0.0183, where EC1:5 is the electrical conductivity of the 1:5 soil–water extract [24].
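The empirical conversion above is simple enough to sketch in a few lines. The following Python snippet is only an illustration, not the authors' processing code: the function names are hypothetical, the unit of EC1:5 is not stated in the text and is left to the caller, and the class boundaries reuse the 0.2–0.5% (moderate) and 0.5–1% (severe) ranges quoted later in Section 3.1, with the handling of the exact boundary values being an assumption.

```python
# Sketch of the empirical conversion quoted in the text:
#   SSC (%) = 0.2882 * EC_1:5 + 0.0183
# Function names are hypothetical; the EC unit is not stated in the text.

def ssc_from_ec(ec_1_5: float) -> float:
    """Return soil salt content (%) from the 1:5 soil-water extract conductivity."""
    if ec_1_5 < 0:
        raise ValueError("electrical conductivity cannot be negative")
    return 0.2882 * ec_1_5 + 0.0183


def salinization_class(ssc: float) -> str:
    """Label an SSC value using only the two ranges quoted in Section 3.1.

    Assumption: 0.5% is assigned to the 'severe' class.
    """
    if 0.2 <= ssc < 0.5:
        return "moderate"
    if 0.5 <= ssc <= 1.0:
        return "severe"
    return "outside the ranges quoted in the text"


# Made-up conductivity readings, for illustration only.
example_ec = [0.8, 1.0, 2.5]
example_ssc = [ssc_from_ec(ec) for ec in example_ec]
example_classes = [salinization_class(s) for s in example_ssc]

if __name__ == "__main__":
    for ec, ssc, label in zip(example_ec, example_ssc, example_classes):
        print(f"EC1:5 = {ec:.2f} -> SSC = {ssc:.4f}% ({label})")
```

Because the relation is linear, any uncertainty in the conductivity reading carries over to the SSC estimate scaled by the slope 0.2882.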

2.3. Construction and Screening of Sensitive Variables

2.3.1. Construction of the Spectral Index

The spectral index was constructed in this study using different mathematical combinations of spectral reflectance extracted from the collected remote sensing images. Indeed, spectral indices have been often used in rapid assessment and prediction studies on soil salinization, showing good results [25]. When spectral indices are collected over bare soil, the reflectance data is directly influenced by the soil’s properties, including moisture content, texture, organic matter, and salinity. Since salinity affects soil reflectance (usually leading to higher reflectance in certain spectral bands due to the presence of salts), it can be correlated with soil salinity measurements taken at or near the surface. While soil salinity samples are collected below the surface, spectral data primarily capture surface information, so directly correlating surface spectral indices with subsurface salinity can be complex. However, if there is a known relationship between surface and subsurface salinity (e.g., through capillary rise or the lateral movement of salts), or if subsurface salinity consistently influences surface conditions, a correlation might still be valid. Calibration models or empirical relationships may be developed to estimate subsurface salinity from surface spectral data. When spectral indices are taken over vegetated areas, the data primarily reflect vegetation characteristics. Soil salinity can influence vegetation indirectly by affecting plant health and growth. High salinity levels in the soil often lead to stress in vegetation, which can be detected through changes in spectral indices like the Normalized Difference Vegetation Index (NDVI).
In this study, several spectral indices commonly used in soil salinization detection were selected, including 11 salinity indices (NDSI, Int2, SI, SI1, SI2, SI3, S2, S3, S4, S5, and BI) and six vegetation indices (NDVI, EVI, GLI, GNDVI, MSAVI, and NNIP). Previous work has found that the red edge band contains richer spectral information, which can effectively improve the accuracy of soil salinity predictive models [26]. Therefore, the red or near-infrared band of each index was replaced with the red edge band to obtain improved spectral indices. Together with the traditional spectral indices, a total of 44 spectral indices were obtained (Table 2) [27,28,29,30,31,32].
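As a rough illustration of the band substitution, the sketch below computes NDVI and NDSI in their widely used normalized-difference forms and then swaps the red edge band in for the red or the near-infrared band. The index definitions in Table 2 are not reproduced in this text, so the formulas, the function names, and the reflectance values here are assumptions chosen for illustration.

```python
import math

# Assumed (commonly used) definitions; Table 2 of the article may differ:
#   NDVI = (NIR - R) / (NIR + R)
#   NDSI = (R - NIR) / (R + NIR)

def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denom = a + b
    return (a - b) / denom if denom != 0 else math.nan


def ndvi(nir: float, red: float) -> float:
    return normalized_difference(nir, red)


def ndsi(red: float, nir: float) -> float:
    return normalized_difference(red, nir)


# Made-up surface reflectances for one sampling site (five camera bands).
bands = {"blue": 0.06, "green": 0.10, "red": 0.12, "red_edge": 0.20, "nir": 0.36}

ndvi_standard = ndvi(bands["nir"], bands["red"])
# Improved variants: the red edge band replaces the red or the NIR band.
ndvi_rededge_for_red = ndvi(bands["nir"], bands["red_edge"])
ndvi_rededge_for_nir = ndvi(bands["red_edge"], bands["red"])
ndsi_standard = ndsi(bands["red"], bands["nir"])

if __name__ == "__main__":
    print("NDVI:", round(ndvi_standard, 4))
    print("NDVI, red edge for red:", round(ndvi_rededge_for_red, 4))
    print("NDVI, red edge for NIR:", round(ndvi_rededge_for_nir, 4))
    print("NDSI:", round(ndsi_standard, 4))
```

Applying both substitutions to every index that contains a red or near-infrared term is one way a list of traditional indices can grow into a larger candidate set, which is then narrowed by correlation screening.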

2.3.2. Texture Feature Extraction

The gray level co-occurrence matrix (GLCM) has been extensively used to extract texture features from remote sensing images, giving good classification and inversion results [33,34,35]. However, the matrix itself is generally not used directly to distinguish textures because it contains a large amount of data. Instead, statistics derived from the GLCM are used to extract and classify image texture features. Previous related studies have shown that the red band responds strongly to soil salinization [12]. Therefore, the co-occurrence measures tool of ENVI 5.3.1 was used in this study to extract the mean, variance (Var), homogeneity (Hom), contrast (Con), dissimilarity (Dis), entropy (Ent), second moment (Sec), and correlation (Cor) of the red (R) band of the UAV multispectral images.
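To make the eight statistics concrete, the following sketch builds a symmetric, normalized GLCM for a tiny quantized image and derives the measures from it using their standard textbook definitions. It is not a reproduction of the ENVI tool: the quantization, window size, and shift used in ENVI are not given in the text, and the function names and the 4 × 4 test image are illustrative assumptions.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs for this offset")
    return [[v / total for v in row] for row in counts]


def texture_stats(p):
    """Standard GLCM statistics; p must be symmetric and sum to 1."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)
    var = sum((i - mean) ** 2 * v for i, _, v in cells)
    stats = {
        "mean": mean,
        "variance": var,
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "second_moment": sum(v * v for _, _, v in cells),
    }
    # Correlation is undefined for a constant image (zero variance).
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    stats["correlation"] = cov / var if var > 0 else math.nan
    return stats


# Tiny 4 x 4 image already quantized to four gray levels (illustrative).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
stats = texture_stats(p)

if __name__ == "__main__":
    for name, value in stats.items():
        print(f"{name}: {value:.4f}")
```

In practice such statistics are computed in a moving window around each pixel of the red band, and the value at each sampling site is then extracted as a candidate input variable.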

2.3.3. Variable Filtering Methods

Pearson correlation analysis is used to measure the strength of the relationship between two quantitative variables. The Pearson correlation coefficient (r) ranges from −1 to 1, indicating perfect negative and positive correlations, respectively. In fact, this method has been widely used in variable selection to effectively reduce redundant variables. The Pearson correlation coefficient (r) can be calculated using the following equation (Equation (1)):
$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}} \tag{1}$$
where $X_i$ and $Y_i$ denote the individual measured values of the two variables X and Y, respectively, and $\bar{X}$ and $\bar{Y}$ denote the average values of the variables X and Y, respectively.
In this study, the Pearson correlation analysis method was performed using IBM SPSS Statistics 27 software.
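For readers without SPSS, Equation (1) and a simple correlation-based screening step can be sketched as follows. This is a minimal illustration under stated assumptions: ranking by the absolute value of r is an assumption, the function and feature names are hypothetical, and the numbers are invented.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def rank_features(features, target, top_k):
    """Return the top_k (name, r) pairs ordered by |r| (assumed criterion)."""
    scored = [(name, pearson_r(values, target)) for name, values in features.items()]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top_k]


# Invented values: three candidate variables at five sampling sites.
ssc = [0.2, 0.3, 0.4, 0.5, 0.6]
candidates = {
    "index_a": [1.0, 2.0, 3.0, 4.0, 5.0],   # rises with SSC
    "index_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls with SSC
    "index_c": [1.0, 3.0, 2.0, 3.0, 1.0],   # unrelated to SSC
}
selected = rank_features(candidates, ssc, top_k=2)

if __name__ == "__main__":
    for name, r in selected:
        print(f"{name}: r = {r:+.3f}")
```

Ranking by |r| keeps strongly negative correlations as well as strongly positive ones; a screening based on signed r would discard the former.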

2.4. Model Construction and Accuracy Evaluation

2.4.1. Model Construction

In this study, the selected variables using the feature screening method were used as independent variables, while the observed soil salt contents (SSCs) in the different years (2021–2023) were used as the dependent variable in a soil salt predictive model. This was constructed based on three machine learning algorithms, namely the BPNN, RF, and XGBoost algorithms.
Among them, XGBoost is an effective boosting algorithm based on the gradient boosting decision tree (GBDT). It uses Newton's method to find the extremum of the loss function through a second-order Taylor approximation, and adds a regularization term to the loss function [36]. As a boosting algorithm, XGBoost comprises three elements: a loss function, which measures the difference between the model predictions and the true values; a weak evaluator, generally a decision tree (Classification and Regression Trees or Dropout Additive Regression Trees), whose tree-building process differs between boosting algorithms; and an integration rule, which determines how the outputs of the weak evaluators are combined into the final result. The objective function can be calculated using Equation (2):
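$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{2}$$
where $\hat{y}_i$ denotes the predicted value of sample i; $y_i$ denotes the true value of sample i; $l(y_i, \hat{y}_i)$ denotes the loss function; $\Omega(f_k)$ denotes the regularization term of the k-th tree; n is the number of samples; and K is the number of trees.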
Compared with traditional gradient boosting trees, XGBoost operates to minimize the objective function rather than the loss function. For any tree, the objective function consists of two parts. The first part is the differentiable loss function, used to evaluate the loss or error between predicted and actual values, thereby controlling the empirical risk of the model. The second part is the regularization term, used to avoid overfitting the model [37]. In this study, the XGBoost algorithm was implemented using the “XGBoost” package and other auxiliary algorithm packages in R software (v. 4.3.1).
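The role of the two parts of the objective can be illustrated with a small numerical sketch. The snippet below is written in Python rather than R and does not call the xgboost package; it assumes the regularization form given in the original XGBoost paper, Ω(f) = γT + ½λΣw², and a squared-error loss l = ½(y − ŷ)², neither of which is specified in this text. The function names and data are illustrative.

```python
# Second-order (Newton) view of one boosting step, under assumed forms:
#   loss           l = 0.5 * (y - y_hat)^2  ->  g = y_hat - y,  h = 1
#   regularization Omega(f) = gamma * T + 0.5 * lam * sum(w_j^2)

def grad_hess_squared_error(y_true, y_pred):
    """First and second derivatives of the loss w.r.t. the prediction."""
    g = [p - t for t, p in zip(y_true, y_pred)]
    h = [1.0] * len(y_true)
    return g, h


def optimal_leaf_weight(g, h, lam):
    """Leaf weight minimizing the regularized second-order objective."""
    return -sum(g) / (sum(h) + lam)


def structure_score(g, h, lam):
    """G^2 / (H + lam) for one leaf; larger means a larger objective reduction."""
    return sum(g) ** 2 / (sum(h) + lam)


def split_gain(g, h, goes_left, lam, gamma):
    """Objective reduction from splitting one leaf into two."""
    gl = [gi for gi, left in zip(g, goes_left) if left]
    hl = [hi for hi, left in zip(h, goes_left) if left]
    gr = [gi for gi, left in zip(g, goes_left) if not left]
    hr = [hi for hi, left in zip(h, goes_left) if not left]
    return 0.5 * (structure_score(gl, hl, lam)
                  + structure_score(gr, hr, lam)
                  - structure_score(g, h, lam)) - gamma


# Invented SSC values (%) and a constant initial prediction.
y_true = [0.2, 0.3, 0.7, 0.8]
y_pred = [0.5, 0.5, 0.5, 0.5]
g, h = grad_hess_squared_error(y_true, y_pred)

goes_left = [True, True, False, False]   # e.g. a split on one spectral index
gain = split_gain(g, h, goes_left, lam=1.0, gamma=0.0)
w_left = optimal_leaf_weight(g[:2], h[:2], lam=1.0)
w_right = optimal_leaf_weight(g[2:], h[2:], lam=1.0)

if __name__ == "__main__":
    print(f"gain = {gain:.4f}, w_left = {w_left:+.4f}, w_right = {w_right:+.4f}")
```

Raising `lam` shrinks the leaf weights toward zero and raising `gamma` makes splits harder to accept, which is how the regularization term limits overfitting in this formulation.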
The BPNN (Backpropagation Neural Network) is a model that simulates how neurons in the brain learn. It uses multiple layers of neurons and adjusts weights to minimize prediction errors. This model is effective in handling complex, nonlinear relationships in data. The RF (Random Forest) model is based on decision trees. It creates multiple decision trees and makes predictions by averaging or voting on the results. This approach is robust to noise and can handle high-dimensional data well. Both the BPNN and RF models are widely used in soil salinity prediction research. The BPNN model is often employed for predicting the spatiotemporal variations in soil salinity due to its ability to handle complex nonlinear relationships. The RF model, known for its robustness and ability to manage high-dimensional data, is commonly used to analyze various variables related to soil salinity. The choice between these models depends on the data characteristics and research objectives.

2.4.2. Evaluation of the Model Accuracy

In this study, 70% (50 samples) and 30% (20 samples) of the alfalfa-covered soil samples were used as the training and validation datasets, respectively. The coefficient of determination (R2), root mean squared error (RMSE), and mean absolute error (MAE) were used to assess the prediction accuracy of the implemented model. The closer the R2 is to 1, the higher the prediction accuracy of the model. On the other hand, the smaller the RMSE and MAE values, the smaller the deviation between the predicted and measured values, indicating greater prediction accuracy. The R2, RMSE, and MAE values were calculated in this study using the following equations:
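$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \tag{3}$$
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (\hat{y}_i - y_i)^2}{n}} \times 100\% \tag{4}$$
$$\mathrm{MAE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \hat{y}_i - y_i \right| \tag{5}$$
where $y_i$ and $\hat{y}_i$ denote the measured and predicted soil salt contents, respectively; $\bar{y}$ denotes the average measured soil salt content; and n is the number of samples.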
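The three metrics can be sketched directly from their definitions. The snippet below follows the form of R2 given in the text (explained sum of squares over total sum of squares); many libraries instead report 1 − SS_res/SS_tot, and the two forms generally give different values for an arbitrary set of predictions. The function names and example numbers are illustrative, and the values are returned in the same unit as the SSC data (%).

```python
import math

def r_squared(y_true, y_pred):
    """R2 as defined in the text: explained over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    explained = sum((p - mean_y) ** 2 for p in y_pred)
    total = sum((t - mean_y) ** 2 for t in y_true)
    if total == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    return explained / total


def rmse(y_true, y_pred):
    """Root mean squared error between measured and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error between measured and predicted values."""
    n = len(y_true)
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n


# Invented measured and predicted SSC values (%).
measured = [0.2, 0.4, 0.6]
predicted = [0.3, 0.4, 0.5]

metrics = {
    "R2": r_squared(measured, predicted),
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
}

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name} = {value:.4f}")
```

With only about 20 validation samples per year, each metric is sensitive to individual sites, so small differences between models should be read with that in mind.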

3. Results

3.1. Statistical Analysis of the Soil Salinity Data

A total of 70 sampling sites were selected in the alfalfa cultivation area each year. Soil samples were collected from each sampling site over the 2021–2023 period, giving a total of 210 soil samples (Table 3). According to the results of previous studies, more than 80% of the study area was under moderate salinization conditions, with an SSC range of 0.2–0.5%. The remaining area was under severe salinization conditions, with an SSC range of 0.5–1%. The highest soil salt contents in the study area in 2021, 2022, and 2023 were 0.868, 0.863, and 0.981%, respectively. Under the alfalfa cultivation conditions, the soil salt content data of the modeling, validation, and entire datasets in the different years exhibited similar statistics, which limits the validation error introduced by the data split. Hence, the soil salt content data of the modeling and validation datasets effectively represented the entire dataset.

3.2. Sensitive Variable Screening

The spectral index was constructed in this study using five basic spectral bands extracted from the UAV multispectral images through different mathematical combinations. In addition, the correlations of the measured soil salt contents in the different years under the alfalfa cover with the five basic spectral bands and the extracted texture features were further analyzed in this study (Figure 3). According to the obtained results, some spectral reflectance and R-band texture features had good correlations with the observed soil salt contents. Therefore, we assessed the importance of several spectral index, spectral reflectance, and R-band texture feature combinations, including spectral indices, spectral indices–spectral reflectance, spectral indices–texture features, and spectral indices–spectral reflectance–texture features. The selected variables were used as inputs to evaluate the soil salt content prediction of the different models under different variable combinations.
Pearson correlations of the 11 calculated salt indices, 6 vegetation indices, and their derived improved spectral indices with the observed soil salt contents in the alfalfa fields in different years were further analyzed in this study. Spectral indices with good correlations in each group were used as the input variables into the inversion models. To mitigate the effects of the number of index inputs on the prediction accuracy of the models, the number of screening variables was gradually increased. The highest prediction accuracy of the models was achieved using 12 spectral indices. On the other hand, to mitigate the effect of the number of screening features on the prediction results, the top 12 spectral indices with the highest r values were retained as input variables into the models for each year (Table 4).

3.3. Accuracies of the Machine Learning-Based Soil Salinity Inversion Models

The applicability of the machine learning algorithms in long-term soil salinity prediction using different input variables, including the spectral index, spectral reflectance, and texture features, was further evaluated in this study. The most important input variables were identified using a Pearson correlation analysis before implementing the BPNN, RF, and XGBoost machine learning algorithms for long-term soil salinity prediction (Table 5). According to the obtained results, it was feasible to introduce the red edge band into the models to predict the soil salt contents at different depths under crop coverage at different growth stages.
The results showed differences in the prediction accuracies of the models between the different input combinations. Specifically, the obtained R2 values between the predicted and observed data in 2021 ranged from 0.519 to 0.613, 0.524 to 0.583, 0.612 to 0.684, and 0.541 to 0.654 using the spectral index, spectral index–spectral reflectance, spectral index–texture feature, and spectral index–spectral reflectance–texture feature combinations, respectively. The obtained RMSE values ranged from 0.103 to 0.109, 0.098 to 0.157, 0.083 to 0.117, and 0.074 to 0.132. Meanwhile, the MAE ranges were 0.063 to 0.079, 0.086 to 0.151, 0.086 to 0.129, and 0.083 to 0.108. Therefore, the spectral index–texture feature input combination resulted in the highest prediction accuracy, followed by the spectral index–spectral reflectance–texture feature group. On the other hand, a comparison of the prediction accuracies of the three machine learning algorithms showed that the XGBoost algorithm outperformed the RF and BPNN algorithms. The implementation of the XGBoost algorithm using the spectral index, spectral index–spectral reflectance, spectral index–texture feature, and spectral index–spectral reflectance–texture feature combinations resulted in R2 values of 0.613, 0.583, 0.684, and 0.652, respectively. The RMSE values were 0.103, 0.105, 0.095, and 0.132, whereas the MAE values were 0.079, 0.086, 0.086, and 0.108. The RF algorithm was the second most accurate prediction model. The comparison of the predicted values with the observed soil salt contents in 2022 showed the highest prediction accuracy using the spectral index–spectral reflectance–texture feature input combination, followed by that obtained using the spectral index–texture feature input combination.
In addition, the XGBoost algorithm exhibited the highest prediction accuracies among the three machine learning algorithms, showing R2 values of 0.622, 0.652, 0.653, and 0.674 using the spectral index, spectral index–spectral reflectance, spectral index–texture feature, and spectral index–spectral reflectance–texture feature combinations, respectively, followed by the RF algorithm. In 2023, the spectral index–texture feature and spectral index–spectral reflectance–texture feature input combinations resulted in the highest prediction accuracies, with R2 ranges of 0.574–0.778 and 0.542–0.820, respectively. Similarly, the XGBoost algorithm was better than the RF and BPNN algorithms in predicting the soil salt contents in 2023.
In summary, the most accurate prediction of the soil salt contents in the alfalfa fields using long-term multispectral images was obtained in 2023. In addition, the use of the spectral index–spectral reflectance–texture feature combination as input variables into the XGBoost algorithm was comparatively effective in predicting the soil salt contents in the study area. Under this condition, XGBoost was the best salt prediction model, with the variables ranked in order of importance as R, Int2*, SI1, SI3, SI2*, SI, BI*, G, RedEdge, Hom, Con, and Var.
To further assess the accuracies of the machine learning algorithms in predicting the soil salt contents in the alfalfa fields over the three years, scatter plots and fitting lines of the BPNN, RF, and XGBoost-based prediction results using the different input combinations were drawn (Figure 4). According to the obtained results, the highest fitting degrees between the machine learning-based predicted and actual salt contents in the alfalfa fields were observed in 2023, followed by those in 2022 and 2021. On the other hand, the XGBoost algorithm resulted in the highest fitting degrees between the predicted and observed soil salt contents, followed by the RF and BPNN algorithms. The R2 values obtained using XGBoost, RF, and BPNN ranged from 0.583 to 0.684, 0.622 to 0.674, and 0.622 to 0.820, respectively. The RMSE values of XGBoost, RF, and BPNN ranged from 0.095 to 0.132, 0.082 to 0.133, and 0.086 to 0.167, corresponding to MAE ranges of 0.079–0.108, 0.075–0.086, and 0.078–0.134, respectively. However, the RF algorithm performed better than the XGBoost algorithm when using the spectral index–spectral reflectance–texture feature and spectral index–texture feature input combinations in 2021 and 2023, respectively. Indeed, the implementation of the RF algorithm using the spectral index–spectral reflectance–texture feature input combination resulted in R2, RMSE, and MAE values of 0.654, 0.074, and 0.096, respectively. Meanwhile, the introduction of the spectral index–texture feature input combination into the RF algorithm gave R2, RMSE, and MAE values of 0.647, 0.076, and 0.149, respectively. It is worth noting that the implementation of the BPNN and RF algorithms using the spectral index–spectral reflectance–texture feature and spectral index input combinations in 2022 and 2023, respectively, resulted in prediction accuracies similar to those of the XGBoost algorithm.
In summary, the highest soil salt prediction accuracies under the alfalfa cover were achieved in 2023 using the spectral index–spectral reflectance–texture feature combination as input variables into the XGBoost algorithm.

3.4. Comprehensive Analysis of the Prediction Results

To determine the most accurate machine learning algorithm for predicting soil salt contents in the alfalfa fields, the prediction results of the above-mentioned four input combinations were further comprehensively compared and analyzed (Figure 5). According to the obtained R2 values, the spectral index–spectral reflectance–texture feature input combination resulted in comparatively higher accuracies of the predicted soil salt contents in the alfalfa fields than those obtained using the other input combinations. Indeed, the implementation of the XGBoost, RF, and BPNN algorithms using this input variable combination resulted in average R2 values of 0.715, 0.614, and 0.554, respectively, over the three years, with small RMSE and MAE values. The spectral index–texture feature combination resulted in the second most accurate prediction results, followed, in order, by the spectral index and spectral index–spectral reflectance input combinations. Compared with using the spectral index alone as the model input, adding spectral reflectance and texture features can significantly improve the prediction accuracy of the inversion model.
The prediction performance of the XGBoost algorithm was clearly better than those of the RF and BPNN algorithms. Over the three years, the R2 of the models constructed using the XGBoost algorithm ranged from 0.583 to 0.820, the RMSE ranged from 0.082 to 0.165, and the MAE ranged from 0.075 to 0.134. In addition, the average R2 value obtained using the XGBoost algorithm was substantially higher than those obtained using the RF and BPNN algorithms. Therefore, the XGBoost algorithm demonstrated a higher prediction accuracy than the RF and BPNN algorithms.
The results showed that the overall prediction for 2023 was better than those for 2022 and 2021. The XGBoost-based prediction accuracy in 2023 was substantially higher than those obtained in 2022 and 2021, with an R2 range of 0.622–0.820. However, it is worth noting that the prediction accuracy of the algorithm in 2022 was stable under the four different input combinations, with an R2 range of 0.622–0.674. On the other hand, the prediction accuracy of the RF algorithm in 2021 was better than those obtained in 2022 and 2023, with an R2 range of 0.519–0.654. Compared with the XGBoost and RF algorithms, the BPNN algorithm exhibited poor prediction performance in the three years.

4. Discussion

4.1. Soil Salinity Prediction Based on Spectral Data and Vegetation Texture Features

Soil salinity can change with crop root growth and, in turn, affects crop growth. This explains the sensitivity of the spectral reflectance and texture characteristics of the vegetation canopy to soil salt contents [29,38,39]. Exploring the long-term relationships between observed soil salt contents and crop spectral and texture information can therefore help predict soil salt contents over long periods. In this study, we examined how spectral indices and image texture information influence the prediction accuracy of soil salt contents under alfalfa cultivation in different years, and constructed multivariable soil salt prediction models, thereby providing a theoretical basis for the accurate and rapid acquisition of agricultural soil salt data.
In this study, the spectral index, spectral reflectance, and vegetation canopy texture features were constructed based on UAV multispectral images. These data exhibited good Pearson correlations with the measured soil salt contents, making them good indicators for the quantitative assessment of soil salinization [40,41]. In addition, our results demonstrated the effectiveness of the optimized spectral index–vegetation texture feature combination in predicting soil salinity when compared with traditional spectral index inputs. Screening sensitive variables can effectively remove redundant spectral variables, providing effective data for the accurate prediction of soil salt contents [42,43]. Sun et al. [44] monitored maize lodging using UAV multispectral images, combining texture features and vegetation indices to classify various feature images and extract maize lodging levels. Zhao et al. [45] constructed five model input groups using different variables, such as spectral index and texture features. The highest prediction accuracies of the soil salt contents were achieved using the texture index–vegetation index input combination. Compared with the single vegetation index, the use of the variable input combination in the RF algorithm increased the R2 value by 8.5%, while the R2 of the Extreme Learning Machine (ELM) model increased by 15.8%. Xiang et al. [46] used the Otsu threshold selection method to extract the spectral index and texture features from multispectral images before and after removing the soil background, and constructed a soil salinity prediction model based on four variable inputs, achieving high prediction accuracies. Therefore, the texture features of multispectral images are of great significance in improving the accuracy of predicted soil salt contents.
In this study, adding the spectral reflectance and texture features of the vegetation canopy as model inputs improved prediction accuracy; the spectral index–spectral reflectance–texture feature input combination, used with the XGBoost algorithm, gave the most accurate predicted soil salt contents.
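The Pearson-based screening of sensitive variables discussed in this subsection can be illustrated as follows. This is a sketch only: the data are synthetic, the variable names (`var_a` and so on) and the function `screen_by_pearson` are hypothetical, and the default of keeping twelve variables simply mirrors the twelve variables listed per group in Table 4. The study's exact selection rule is not given in this section.

```python
import numpy as np


def screen_by_pearson(features, target, names, top_k=12):
    """Rank candidate variables by |r| with the target and keep the top_k."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    # Pearson r between each feature column and the target.
    fc = features - features.mean(axis=0)
    tc = target - target.mean()
    r = (fc * tc[:, None]).sum(axis=0) / (
        np.sqrt((fc ** 2).sum(axis=0)) * np.sqrt((tc ** 2).sum())
    )
    # Sort by absolute correlation, largest first.
    order = np.argsort(-np.abs(r))[:top_k]
    return [(names[i], float(r[i])) for i in order]


# Synthetic stand-in data: 70 samples (the sample count in Table 3)
# and five made-up candidate variables.
rng = np.random.default_rng(0)
ssc = rng.uniform(0.2, 0.9, size=70)
X = np.column_stack([
    ssc + rng.normal(0, 0.05, 70),       # strongly related
    -ssc + rng.normal(0, 0.20, 70),      # negatively related, noisier
    rng.normal(0, 1, 70),                # unrelated noise
    ssc ** 2 + rng.normal(0, 0.10, 70),  # nonlinearly related
    rng.normal(0, 1, 70),                # unrelated noise
])
names = ["var_a", "var_b", "var_c", "var_d", "var_e"]
for name, r in screen_by_pearson(X, ssc, names, top_k=3):
    print(f"{name}: r = {r:+.3f}")
```

Ranking by |r| rather than r keeps strongly negative correlations, which is consistent with the |r| ordering noted under Table 4.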

4.2. Application of XGBoost Algorithm in Soil Salinity Prediction

Predictive models are crucial tools for estimating soil salt contents. Selecting high-precision and high-efficiency predictive models is important for achieving accurate prediction of soil salt contents [47]. Zhou et al. [48] used visible and near-infrared spectroscopy to quantitatively predict soil salt contents in the Western United States and Southwest Europe using the Partial Least Squares Regression (PLSR), Cubist regression, RF, and XGBoost models, of which the XGBoost model had the highest prediction accuracy (R2 = 0.71), followed by the RF model. These findings are consistent with our results, which can be attributed to the second-order Taylor expansion of the loss function used by the XGBoost algorithm, which greatly improves the accuracy of the predicted soil salt contents under vegetation coverage on the regional scale. Tian et al. [49] implemented five different machine learning algorithms, including XGBoost, using first-off characteristic parameters extracted from ZY1-02D hyperspectral satellite data to predict coastal soil salinity in Qinzhou Bay, providing a new method for the large-scale estimation of soil salinity. Many scholars have demonstrated that the XGBoost algorithm has higher prediction accuracy and stability than other machine learning algorithms [50,51], which is in line with our findings. Indeed, the XGBoost algorithm outperformed the RF and BPNN algorithms under the four different variable inputs in this study. Specifically, the highest prediction accuracy of the XGBoost algorithm was achieved using the spectral index–spectral reflectance–texture feature data collected in 2023, with R2, RMSE, and MAE values of 0.820, 0.165, and 0.134, respectively. Meanwhile, the highest R2 values obtained using the RF and BPNN algorithms were 0.654 and 0.612, respectively.
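For readers unfamiliar with the second-order Taylor expansion mentioned above, the standard XGBoost formulation approximates the objective at boosting round t as shown below. This is the general form from the XGBoost literature, not a derivation from this study; the symbols (loss l, new tree f_t, regularization term Ω, gradient g_i, and second derivative h_i) are notation assumed here for illustration.

```latex
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n}
  \left[ l\!\left(y_i, \hat{y}_i^{(t-1)}\right)
       + g_i\, f_t(x_i)
       + \tfrac{1}{2}\, h_i\, f_t^{2}(x_i) \right]
  + \Omega(f_t),
\qquad
g_i = \frac{\partial\, l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\!\left(y_i, \hat{y}_i^{(t-1)}\right)}{\partial \big(\hat{y}_i^{(t-1)}\big)^{2}}
```

For a squared-error loss written as l = ½(y_i − ŷ_i)², these terms reduce to g_i = ŷ_i^(t−1) − y_i and h_i = 1.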
Several studies have used traditional machine learning algorithms combined with multi-year measured datasets to determine sensitive variables and predict meteorological and land use data [52,53]. Hence, the combination of sensitive variables and machine learning algorithms is crucial in improving the prediction accuracy of soil salinity.

4.3. Uncertainty Analysis of Current Study

The results of the present study revealed good accuracies in the predicted soil salt contents in the alfalfa fields using multispectral image data. Nevertheless, the potential of implementing machine learning algorithms with multi-source remote sensing data still needs to be explored. Drone-based remote sensing is limited by a small coverage area and short flight time, whereas satellite remote sensing offers a large coverage area and a large amount of data, making it suitable for monitoring and analysis at large or global scales. In particular, UAV data could be combined with Landsat or Sentinel satellite remote sensing data. In addition, other variables, such as soil moisture content and crop growth stage, could be used to construct effective models for predicting soil salinity at watershed scales. The experimental methods and the accuracies of the machine learning algorithms in predicting soil salt contents in this study also need to be confirmed under other vegetation cover types.

5. Conclusions

In this study, the soil salt contents in the alfalfa fields of the Heihe River Basin were assessed from 2021 to 2023. In addition, three machine learning algorithms, namely XGBoost, BPNN, and RF, were implemented to predict the soil salt contents using spectral indices, spectral reflectance, and texture features screened by Pearson correlation. The R2, RMSE, and MAE were used to assess the prediction accuracies of the three algorithms. The following conclusions were drawn:
(1) The spectral index–spectral reflectance–texture feature input combination resulted in the highest accuracies of the machine learning algorithms in predicting the soil salt contents, with comparatively smaller RMSE values. Compared with the other input combinations, it significantly improved the prediction accuracies of the algorithms, highlighting the potential of spectral reflectance and texture features for future studies on soil salt prediction.
(2) The prediction accuracies of the machine learning algorithms were further compared and analyzed using the different input combinations. Prediction accuracy tended to increase as more types of input variables were combined in the soil salt predictive models, with the highest accuracy achieved using the data collected in 2023 (R2 = 0.820). Specifically, the spectral index–spectral reflectance–texture feature input combination gave the best prediction accuracy, followed by the spectral index–texture feature combination. The most accurate predicted soil salt contents were obtained using the XGBoost algorithm.
(3) By comparing the prediction accuracies of the soil salinity predictive models obtained using the three modeling schemes, it was found that the prediction accuracy and stability of the XGBoost algorithm were generally superior to those of the RF and BPNN algorithms. The XGBoost algorithm achieved its highest accuracy using the spectral index–spectral reflectance–texture feature input data acquired in 2023, with R2, RMSE, and MAE values of 0.820, 0.165, and 0.134, respectively. The RF algorithm was the second-best predictive algorithm, followed by the BPNN algorithm.

Author Contributions

Formal analysis, Z.L.; funding acquisition, W.Z. and Z.L.; methodology, Z.L. and W.Z.; project administration, W.Z. and H.L.; software, X.L. and Z.L.; validation, P.Y.; writing—original draft, Z.L.; writing—review and editing, Z.L. and W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (52379042), the Key R & D plan of Gansu Province (23YFFA0019), the Gansu Province East–West Cooperation Project (23CXNA0025), and the Gansu Province Water Resources Science Experimental Research and Technology Promotion Project in 2024 (24GSLK064).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Data not available due to commercial restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gu, S.; Jiang, S.; Li, X.; Zheng, N.; Xia, X. Soil salinity simulation based on electromagnetic induction and deep learning. Soil Till. Res. 2023, 230, 105706.
  2. Fang, B.; Lakshmi, V. Soil moisture at watershed scale: Remote sensing techniques. J. Hydrol. 2014, 516, 258–272.
  3. Hassani, A.; Azapagic, A.; Shokri, N. Global predictions of primary soil salinization under changing climate in the 21st century. Nat. Commun. 2021, 12, 6663.
  4. Deng, M.; Tao, W.; Wang, Q.; Su, L.; Ma, C.; Ning, S. Theory and technical guarantee system construction of modern ecological irrigation district in Northwest China. Trans. Chin. Soc. Agric. Mach. 2022, 53, 1–13.
  5. Ding, J.; Yu, D. Monitoring and evaluating spatial variability of soil salinity in dry and wet seasons in the Werigan–Kuqa Oasis, China, using remote sensing and electromagnetic induction instruments. Geoderma 2014, 235, 316–322.
  6. Lubczynski, M.W.; Leblanc, M.; Batelaan, O. Remote sensing and hydrogeophysics give a new impetus to integrated hydrological models: A review. J. Hydrol. 2024, 633, 130901.
  7. Ivushkin, K.; Bartholomeus, H.; Bregt, A.K.; Pulatov, A.; Franceschini, M.H.; Kramer, H.; Finkers, R. UAV based soil salinity assessment of cropland. Geoderma 2019, 338, 502–512.
  8. Ge, X.; Wang, J.; Ding, J.; Cao, X.; Zhang, Z.; Liu, J.; Li, X. Combining UAV-based hyperspectral imagery and machine learning algorithms for soil moisture content monitoring. PeerJ 2019, 7, 6926–6953.
  9. Cui, J.; Chen, X.; Han, W.; Cui, X.; Ma, W.; Li, G. Estimation of soil salt content at different depths using UAV multi-spectral remote sensing combined with machine learning algorithms. Remote Sens. 2023, 15, 5254.
  10. Zhao, W.; Zhou, C.; Zhou, C.; Ma, H.; Wang, Z. Soil salinity inversion model of oasis in arid area based on UAV multispectral remote sensing. Remote Sens. 2022, 14, 1804.
  11. Jody, Y.; Wang, J.; Brigitte, L. Evaluation of soil properties, topographic metrics, plant height, and unmanned aerial vehicle multispectral imagery using machine learning methods to estimate canopy nitrogen weight in corn. Remote Sens. 2021, 13, 3105.
  12. Zhao, W.; Ma, F.; Ma, H.; Zhou, C. Soil salinity inversion model based on the multispectral images of UAV. Trans. Chin. Soc. Agric. Eng. 2022, 38, 93–101.
  13. Zhu, C.; Ding, J.; Zhang, Z.; Wang, Z. Exploring the potential of UAV hyperspectral image for estimating soil salinity: Effects of optimal band combination algorithm and random forest. Spectrochim. Acta A 2022, 279, 121416.
  14. Li, D.; Gao, G.; Shao, M.A.; Fu, B. Predicting available water of soil from particle-size distribution and bulk density in an oasis–desert transect in northwestern China. J. Hydrol. 2016, 538, 539–550.
  15. Zheng, H.; Ma, J.; Zhou, M.; Li, D.; Yao, X.; Cao, W.; Zhu, Y.; Cheng, T. Enhancing the nitrogen signals of rice canopies across critical growth stages through the integration of textural and spectral information from unmanned aerial vehicle (UAV) multispectral imagery. Remote Sens. 2020, 12, 957.
  16. Zhang, J.; Cheng, T.; Shi, L.; Wang, W.; Niu, Z.; Guo, W.; Ma, X. Combining spectral and texture features of UAV hyperspectral images for leaf nitrogen content monitoring in winter wheat. Int. J. Remote Sens. 2022, 43, 2335–2356.
  17. Zheng, H.; Cheng, T.; Zhou, M.; Li, D.; Yao, X.; Tian, Y.; Cao, W.; Zhu, Y. Improved estimation of rice aboveground biomass combining textural and spectral analysis of UAV imagery. Precis. Agric. 2019, 20, 611–629.
  18. Liu, J.; Bi, H.; Zhu, P.; Sun, Q.; Zhu, J.; Chen, T. Estimating stand Volume of Xylosma racemosum forest based on texture parameters and derivative texture indices of ALOS imagery. Trans. Chin. Soc. Agric. Mach. 2014, 45, 245–254.
  19. Cheng, M.; Jiao, X.; Liu, Y.; Shao, M.; Yu, X.; Bai, Y.; Wang, Z.; Wang, S.; Nuremanguli, T.; Liu, S.; et al. Estimation of soil moisture content under high maize canopy coverage from UAV multimodal data and machine learning. Agric. Water Manag. 2022, 264, 107530.
  20. Zhang, Z.; He, Y.; Yin, H.; Xiang, R.; Chen, J.; Du, R. Synergistic Estimation of Soil Salinity Based on Sentinel-1/2 Improved Polarization Combination Index and Texture Features. Trans. Chin. Soc. Agric. Mach. 2024, 55, 175–185.
  21. Jing, X.; Zou, Q.; Yan, J.; Dong, Y.; Li, B. Remote sensing monitoring of winter wheat stripe rust based on mRMR-XGBoost algorithm. Remote Sens. 2022, 14, 756.
  22. Li, Y.; Zeng, H.; Zhang, M.; Wu, B.; Qin, X. Global de-trending significantly improves the accuracy of XGBoost-based county-level maize and soybean yield prediction in the Midwestern United States. GISci. Remote Sens. 2024, 61, 2349341.
  23. Han, W.; Cui, J.; Cui, X.; Ma, W.; Li, G. Estimation of farmland soil salinity content based on feature optimization and machine learning algorithms. Trans. Chin. Soc. Agric. Mach. 2023, 54, 328–337.
  24. Huang, Q.; Xu, X.; Lv, L.; Ren, D.; Ke, J.; Xiong, Y.; Huo, Z.; Huang, G. Soil salinity distribution based on remote sensing and its effect on crop growth in Hetao Irrigation District. Trans. Chin. Soc. Agric. Eng. 2018, 34, 102–109.
  25. Sedaghat, A.; Shahrestani, M.S.; Noroozi, A.A.; Nosratabad, A.F.; Bayat, H. Developing pedotransfer functions using Sentinel-2 satellite spectral indices and Machine learning for estimating the surface soil moisture. J. Hydrol. 2022, 606, 127423.
  26. Chen, J.; Wang, X.; Zhang, Z.; Han, J.; Yao, Z.; Wei, G. Soil salinization monitoring method based on UAV-Satellite remote sensing scale-up. Trans. Chin. Soc. Agric. Mach. 2019, 50, 161–169.
  27. Ke, Y.; Im, J.; Lee, J.; Gong, H.; Ryu, Y. Characteristics of Landsat 8 OLI-derived NDVI by comparison with multiple satellite sensors and in-situ observations. Remote Sens. Environ. 2015, 164, 298–313.
  28. Broge, N.H.; Leblanc, E. Comparing prediction power and stability of broadband and hyperspectral vegetation indices for estimation of green leaf area index and canopy chlorophyll density. Remote Sens. Environ. 2001, 76, 156–172.
  29. Allbed, A.; Kumar, L.; Aldakheel, Y. Assessing soil salinity using soil salinity and vegetation indices derived from IKONOS high-spatial resolution imageries: Applications in a date palm dominated region. Geoderma 2014, 230, 1–8.
  30. Fu, H.; Wang, W.; Lu, J.; Yue, Y.; Cui, G.; She, W. Estimation of Ramie Physicochemical Property Based on UAV Multi-spectral Remote Sensing and Machine Learning. Trans. Chin. Soc. Agric. Mach. 2023, 54, 194–200+347.
  31. Sulik, J.; Long, D. Spectral considerations for modeling yield of canola. Remote Sens. Environ. 2016, 184, 161–174.
  32. Bendig, J.; Yu, K.; Aasen, H.; Bolten, A.; Bennertz, S.; Broscheit, J.; Bareth, G. Combining UAV-based plant height from crop surface models, visible, and near infrared vegetation indices for biomass monitoring in barley. Int. J. Appl. Earth. Obs. 2015, 39, 79–87.
  33. Guo, Y.; Chen, S.; Wu, Z.; Wang, S.; Robin, C.; Senthilnath, J.; Fu, Y. Integrating spectral and textural information for monitoring the growth of pear trees using optical images from the UAV platform. Remote Sens. 2021, 13, 1795.
  34. Nieschulze, J.; Erasmi, S.; Dietz, J.; Hölscher, D. Satellite-based prediction of rainfall interception by tropical forest stands of a human-dominated landscape in Central Sulawesi, Indonesia. J. Hydrol. 2009, 364, 227–235.
  35. Zhang, D.; Han, X.; Lin, F.; Du, S.; Zhang, G.; Hong, Q. Estimation of winter wheat leaf area index using multi-source UAV image feature fusion. Trans. Chin. Soc. Agric. Eng. 2022, 38, 171–179.
  36. Suaza-Medina, M.E.; Laguna, J.; Béjar, R.; Zarazaga-Soria, F.J.; Lacasta, J. Evaluating the efficiency of NDVI and climatic data in maize harvest prediction using machine learning. Int. J. Digit. Earth 2024, 17, 2359565.
  37. Liu, Y.; Yang, J.; Chen, X.; Yao, J.; Li, L.; Qiu, Y. Moderate-resolution snow depth product retrieval from passive microwave brightness data over Xinjiang using machine learning approach. Int. J. Digit. Earth 2024, 17, 2299208.
  38. Haj-Amor, Z.; Araya, T.; Kim, D.G.; Bouri, S.; Lee, J.; Ghiloufi, W.; Lal, R. Soil salinity and its associated effects on soil microorganisms, greenhouse gas emissions, crop yield, biodiversity and desertification: A review. Sci. Total Environ. 2022, 843, 156946.
  39. Aldabaa, A.A.A.; Weindorf, D.C.; Chakraborty, S. Combination of proximal and remote sensing methods for rapid soil salinity quantification. Geoderma 2015, 239, 34–46.
  40. Zhang, Z.; Wei, G.; Yao, Z.; Tan, C.; Wang, X.; Han, J. Soil Salt Inversion Model Based on UAV Multispectral Remote Sensing. Trans. Chin. Soc. Agric. Mach. 2019, 50, 151–160.
  41. Hu, J.; Peng, J.; Zhou, Y. Quantitative estimation of soil salinity using UAV-borne hyperspectral and satellite multispectral images. Remote Sens. 2019, 11, 736.
  42. Zhao, W.; Duan, W.; Wang, Y.; Zhou, C.; Ma, H. Multispectral Vegetation Water Content Inversion Model Based on Sensitive Variable Filtering. Trans. Chin. Soc. Agric. Mach. 2023, 54, 343–351, 385.
  43. Guo, Y.; Fu, Y.H.; Chen, S.; Bryant, C.R.; Li, X.; Senthilnath, J. Integrating spectral and textural information for identifying the tasseling date of summer maize using UAV based RGB images. Int. J. Appl. Earth Obs. 2021, 102, 102435.
  44. Sun, Q.; Sun, L.; Shu, M.; Gu, X.; Yang, G.; Zhou, L. Monitoring maize lodging grades via unmanned aerial vehicle multispectral image. Plant Phenomics 2019, 2019, 5704154.
  45. Zhao, W.; Ma, F.; Yu, H.; Li, Z. Inversion Model of Salt Content in Alfalfa-Covered Soil Based on a Combination of UAV Spectral and Texture Information. Agriculture 2023, 13, 1530.
  46. Xiang, Y.; Li, W.; Tai, X.; An, J.; Wang, X.; Chen, J. Inversion of Soil Salt Content Based on Texture Feature and Vegetation Index of UAV Remote Sensing Images. Trans. Chin. Soc. Agric. Mach. 2023, 54, 201–210.
  47. Yao, R.; Gao, Q.; Liu, Y.; Li, H.; Yang, J.; Bai, Y.; Zhu, H.; Wang, X.; Xie, W.; Zhang, X. Deep vertical rotary tillage mitigates salinization hazards and shifts microbial community structure in salt-affected anthropogenic-alluvial soil. Soil Till. Res. 2023, 227, 105627.
  48. Zhou, Y.; Chen, S.; Hu, B.; Ji, W.; Li, S.; Hong, Y.; Xu, H.; Wang, N.; Xue, J.; Zhang, X.; et al. Global Soil Salinity Prediction by Open Soil Vis-NIR Spectral Library. Remote Sens. 2022, 14, 5627.
  49. Tian, Y.; Zheng, D.; Zhang, Q.; Lu, F.; Huang, Y.; Tao, J.; Zhang, Y.; Lin, J.; Yao, G.; Yao, Y. Inversion of coastal soil salinity in Qinzhou Bay based on domestic ZY1-02D satellite and machine learning algorithm. Chin. Environ. Sci. 2024, 44, 371–385.
  50. Jiang, D.; Dong, C.; Ma, Z.; Wang, X.; Lin, K.; Yang, F.; Chen, X. Monitoring saltwater intrusion to estuaries based on UAV and satellite imagery with machine learning models. Remote Sens. Environ. 2024, 308, 114198.
  51. Zhu, X.; Chu, J.; Wang, K.; Wu, S.; Yan, W.; Kiefer, C. Prediction of rockhead using a hybrid N-XGBoost machine learning framework. J. Rock Mech. Geotech. 2021, 13, 1231–1245.
  52. Chen, G.; Li, S.; Knibbs, L.D.; Hamm, N.A.; Cao, W.; Li, T.; Guo, J.; Ren, H.; Abramson, M.; Guo, Y. A machine learning method to estimate PM2.5 concentrations across China with remote sensing, meteorological and land use information. Sci. Total Environ. 2018, 636, 52–60.
  53. Lindner, C.; Bromiley, P.A.; Ionita, M.C. Robust and accurate shape model matching using random forest regression-voting. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 1862–1874.
Figure 1. Geographic location of the study area.
Figure 2. Data collection and processing. (a) DJI Phantom 4 collects multi-spectral images. (b) Measuring salt content of soil samples. (c) Alfalfa.
Figure 3. Correlations of the observed soil salt contents in the different years with the spectral bands and texture features.
Figure 4. Comparison of the measured and predicted soil salt contents.
Figure 5. Evaluation of the prediction accuracies of the different machine learning algorithms.
Table 1. Multispectral camera parameters and UAV parameters.
| Band | Wavelength | Parameter | Value |
|---|---|---|---|
| Blue | 450 nm (±16) | Picture resolution | 1600 × 1300 |
| Green | 560 nm (±16) | Flying altitude | 40 m |
| Red | 650 nm (±16) | Route overlap rate | 70% |
| Red Edge | 730 nm (±16) | Side image overlap rate | 65% |
| NIR | 840 nm (±26) | Average speed | 4 m/s |
Table 2. Spectral indexes and related formulas.
| Spectral Index | Formula | Spectral Index | Formula |
|---|---|---|---|
| NDSI | (R − NIR)/(R + NIR) | BI | (R + NIR)^0.5 |
| NDSI-reg | (RedEdge − NIR)/(RedEdge + NIR) | BI-reg | (RedEdge + NIR)^0.5 |
| NDSI* | (R − RedEdge)/(R + RedEdge) | BI* | (R + RedEdge)^0.5 |
| Int2 | (G + R + NIR)/2 | NDVI | (NIR − R)/(NIR + R) |
| Int2-reg | (G + RedEdge + NIR)/2 | NDVI-reg | (NIR − RedEdge)/(NIR + RedEdge) |
| Lnt2* | (G + R + RedEdge)/2 | NDVI* | (RedEdge − R)/(RedEdge + R) |
| SI | (B + R)^0.5 | DVI | NIR − R |
| SI1 | (G × R)^0.5 | DVI-reg | NIR − RedEdge |
| SI2 | (G^2 + R^2 + NIR^2)^0.5 | DVI* | RedEdge − R |
| SI3 | (G^2 + R^2)^0.5 | EVI | 2.5(NIR − R)/(NIR + 6R − 7.5B + 1) |
| SI-reg | (B + RedEdge)^0.5 | EVI-reg | 2.5(NIR − RedEdge)/(NIR + 6RedEdge − 7.5B + 1) |
| SI1-reg | (G × RedEdge)^0.5 | EVI* | 2.5(RedEdge − R)/(RedEdge + 6R − 7.5B + 1) |
| SI2-reg | (G^2 + RedEdge^2 + NIR^2)^0.5 | CVI | (R/B) (G/B) |
| SI3-reg | (G^2 + RedEdge^2)^0.5 | GNDVI | (NIR − G)/(NIR + G) |
| SI2* | (G^2 + R^2 + RedEdge^2)^0.5 | GNDVI* | (RedEdge − G)/(RedEdge + G) |
| S2 | (B − R)/(B + R) | GLI | [(G − R) + (G − B)]/(2 × G) + R + B |
| S3 | (G × R)/B | GLI-reg | [(G − RedEdge) + (G − B)]/(2 × G) + RedEdge + B |
| S4 | (B × R)^0.5 | MSAVI | [2R + 1 − ((2NIR + 1)^2 − 8(NIR − R))^0.5]/2 |
| S5 | (B × R)/G | MSAVI-reg | [2RedEdge + 1 − ((2NIR + 1)^2 − 8(NIR − RedEdge))^0.5]/2 |
| S2-reg | (B − RedEdge)/(B + RedEdge) | MSAVI* | [2R + 1 − ((2RedEdge + 1)^2 − 8(RedEdge − R))^0.5]/2 |
| S3-reg | (G × RedEdge)/B | NNIR | NIR/(NIR + R + G) |
| S4-reg | (B × RedEdge)^0.5 | NNIR-reg | NIR/(NIR + RedEdge + G) |
| S5-reg | (B × RedEdge)/G | NNIR* | NIR/(RedEdge + R + G) |
Note: R, G, B, NIR, and RedEdge in the table represent the reflectance values of the red, green, blue, near-infrared, and red edge bands in the UAV multispectral images. “-reg” and “*” denote the improved spectral indices.
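To make the Table 2 definitions concrete, the sketch below computes a handful of the indices from band reflectances with NumPy. It assumes that the 0.5 attached to formulas such as SI1 denotes an exponent, i.e. a square root; the function name `spectral_indices` and the reflectance values are hypothetical and are not measurements from the study.

```python
import numpy as np


def spectral_indices(r, g, b, nir, red_edge):
    """Compute a few of the Table 2 indices from band reflectances."""
    r, g, b, nir, red_edge = (
        np.asarray(x, dtype=float) for x in (r, g, b, nir, red_edge)
    )
    return {
        "NDSI": (r - nir) / (r + nir),
        # "-reg" variant: the red band is replaced by the red edge band.
        "NDSI-reg": (red_edge - nir) / (red_edge + nir),
        "NDVI": (nir - r) / (nir + r),
        "DVI": nir - r,
        "SI1": np.sqrt(g * r),  # assumed reading of (G x R)^0.5
        "S3": (g * r) / b,
    }


# Hypothetical reflectance values for two pixels, for illustration only.
indices = spectral_indices(
    r=[0.10, 0.20], g=[0.08, 0.15], b=[0.05, 0.10],
    nir=[0.40, 0.30], red_edge=[0.25, 0.25],
)
for name, values in indices.items():
    print(name, np.round(values, 3))
```

The same function works on whole image bands, since NumPy applies the arithmetic element by element to arrays of any shape.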
Table 3. Statistical analysis of soil salt content measured data.
| Year | Dataset | Total | Max/% | Min/% | Mean/% | Standard Deviation/% | Variance/% | Coefficient of Variation/% |
|---|---|---|---|---|---|---|---|---|
| 2021 | Total sample | 70 | 0.868 | 0.338 | 0.528 | 0.152 | 0.023 | 0.288 |
| 2021 | Modeling set | 50 | 0.868 | 0.370 | 0.537 | 0.152 | 0.023 | 0.282 |
| 2021 | Prediction set | 20 | 0.866 | 0.338 | 0.506 | 0.156 | 0.023 | 0.308 |
| 2022 | Total sample | 70 | 0.882 | 0.243 | 0.448 | 0.160 | 0.026 | 0.358 |
| 2022 | Modeling set | 50 | 0.882 | 0.243 | 0.457 | 0.175 | 0.031 | 0.383 |
| 2022 | Prediction set | 20 | 0.863 | 0.365 | 0.424 | 0.112 | 0.013 | 0.264 |
| 2023 | Total sample | 70 | 0.981 | 0.018 | 0.377 | 0.268 | 0.072 | 0.713 |
| 2023 | Modeling set | 50 | 0.981 | 0.086 | 0.422 | 0.280 | 0.078 | 0.664 |
| 2023 | Prediction set | 20 | 0.647 | 0.018 | 0.265 | 0.193 | 0.039 | 0.729 |
Table 4. Spectral variables of different combinations in different years based on PCC method.
| Year | Variable Group | Sensitive Variables |
|---|---|---|
| 2021 | Spectral Index Group | Lnt2*, S4-reg, SI2*, SI1-reg, SI-reg, GLI-reg, S4, S5, GLI, NNIP-reg, SI1, SI3 |
| 2021 | Spectral Index–Spectral Reflectance Group | Lnt2*, S4-reg, SI2*, SI1-reg, SI-reg, GLI-reg, R, S4, S5, NNIP-reg, B, G |
| 2021 | Spectral Index–Texture Information Group | Mean, Lnt2*, S4-reg, SI2*, SI1-reg, SI-reg, GLI-reg, S4, S5, NNIP-reg, Var, Dis |
| 2021 | Spectral Index–Spectral Reflectance–Texture Information Group | Mean, Lnt2*, S4-reg, SI2*, SI1-reg, SI-reg, GLI-reg, R, B, G, Var, Diss |
| 2022 | Spectral Index Group | NNIP-reg, GLI-reg, Lnt2*, SI1-reg, SI2*, SI3-reg, NDSI-reg, NDVI-reg, BI*, S4-reg, SI-reg, EVI-reg |
| 2022 | Spectral Index–Spectral Reflectance Group | NNIP-reg, GLI-reg, Lnt2*, SI1-reg, SI2*, SI3-reg, NDSI-reg, NDVI-reg, BI*, RedEdge, G, R |
| 2022 | Spectral Index–Texture Information Group | NNIP-reg, GLI-reg, Lnt2*, SI1-reg, SI2*, SI3-reg, NDSI-reg, NDVI-reg, BI*, Mean, Ent, Dis |
| 2022 | Spectral Index–Spectral Reflectance–Texture Information Group | NNIP-reg, GLI-reg, Lnt2*, SI1-reg, SI2*, SI3-reg, RedEdge, G, R, Mean, Ent, Dis |
| 2023 | Spectral Index Group | Lnt2*, SI1, SI3, SI2*, SI, BI*, SI1-reg, GLI-reg, GLI, SI3-reg, Int2, SI2 |
| 2023 | Spectral Index–Spectral Reflectance Group | R, Lnt2*, SI1, SI3, SI2*, SI, BI*, SI1-reg, GLI-reg, GLI, G, RedEdge |
| 2023 | Spectral Index–Texture Information Group | Lnt2*, SI1, SI3, SI2*, SI, BI*, SI1-reg, GLI-reg, GLI, Hom, Con, Var |
| 2023 | Spectral Index–Spectral Reflectance–Texture Information Group | R, Lnt2*, SI1, SI3, SI2*, SI, BI*, G, RedEdge, Hom, Con, Var |
Note: The sensitive spectral variables selected for each group in the table are arranged from large to small according to the |r| value.
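Table 5. Inversion model based on soil salt content.
| Year | Sensitive Variable Combination | Model | Modeling Set RC2 | Modeling Set RMSEC/% | Modeling Set MAEC/% | Prediction Set RP2 | Prediction Set RMSEP/% | Prediction Set MAEP/% |
|---|---|---|---|---|---|---|---|---|
| 2021 | Spectral Index | BPNN | 0.631 | 0.091 | 0.084 | 0.536 | 0.109 | 0.063 |
| 2021 | Spectral Index | RF | 0.532 | 0.076 | 0.046 | 0.519 | 0.107 | 0.075 |
| 2021 | Spectral Index | XGBoost | 0.661 | 0.52 | 0.131 | 0.613 | 0.103 | 0.079 |
| 2021 | Spectral Index–Spectral Reflectance | BPNN | 0.613 | 0.092 | 0.075 | 0.562 | 0.098 | 0.151 |
| 2021 | Spectral Index–Spectral Reflectance | RF | 0.596 | 0.105 | 0.072 | 0.524 | 0.157 | 0.129 |
| 2021 | Spectral Index–Spectral Reflectance | XGBoost | 0.757 | 0.059 | 0.048 | 0.583 | 0.105 | 0.086 |
| 2021 | Spectral Index–Texture Information | BPNN | 0.664 | 0.068 | 0.047 | 0.612 | 0.083 | 0.104 |
| 2021 | Spectral Index–Texture Information | RF | 0.686 | 0.091 | 0.092 | 0.631 | 0.117 | 0.129 |
| 2021 | Spectral Index–Texture Information | XGBoost | 0.798 | 0.075 | 0.046 | 0.684 | 0.095 | 0.086 |
| 2021 | Spectral Index–Spectral Reflectance–Texture Information | BPNN | 0.675 | 0.089 | 0.061 | 0.541 | 0.124 | 0.083 |
| 2021 | Spectral Index–Spectral Reflectance–Texture Information | RF | 0.681 | 0.068 | 0.082 | 0.654 | 0.074 | 0.096 |
| 2021 | Spectral Index–Spectral Reflectance–Texture Information | XGBoost | 0.638 | 0.087 | 0.110 | 0.652 | 0.132 | 0.108 |
| 2022 | Spectral Index | BPNN | 0.569 | 0.080 | 0.056 | 0.496 | 0.173 | 0.107 |
| 2022 | Spectral Index | RF | 0.639 | 0.097 | 0.116 | 0.573 | 0.128 | 0.147 |
| 2022 | Spectral Index | XGBoost | 0.781 | 0.022 | 0.018 | 0.622 | 0.082 | 0.081 |
| 2022 | Spectral Index–Spectral Reflectance | BPNN | 0.581 | 0.116 | 0.080 | 0.540 | 0.127 | 0.091 |
| 2022 | Spectral Index–Spectral Reflectance | RF | 0.606 | 0.082 | 0.092 | 0.593 | 0.115 | 0.105 |
| 2022 | Spectral Index–Spectral Reflectance | XGBoost | 0.791 | 0.046 | 0.071 | 0.652 | 0.084 | 0.075 |
| 2022 | Spectral Index–Texture Information | BPNN | 0.593 | 0.098 | 0.078 | 0.544 | 0.112 | 0.090 |
| 2022 | Spectral Index–Texture Information | RF | 0.673 | 0.113 | 0.097 | 0.578 | 0.118 | 0.094 |
| 2022 | Spectral Index–Texture Information | XGBoost | 0.739 | 0.045 | 0.033 | 0.653 | 0.105 | 0.083 |
| 2022 | Spectral Index–Spectral Reflectance–Texture Information | BPNN | 0.638 | 0.115 | 0.093 | 0.571 | 0.126 | 0.103 |
| 2022 | Spectral Index–Spectral Reflectance–Texture Information | RF | 0.621 | 0.114 | 0.091 | 0.550 | 0.126 | 0.134 |
| 2022 | Spectral Index–Spectral Reflectance–Texture Information | XGBoost | 0.741 | 0.046 | 0.037 | 0.674 | 0.133 | 0.086 |
| 2023 | Spectral Index | BPNN | 0.629 | 0.196 | 0.147 | 0.541 | 0.159 | 0.139 |
| 2023 | Spectral Index | RF | 0.653 | 0.083 | 0.064 | 0.554 | 0.151 | 0.121 |
| 2023 | Spectral Index | XGBoost | 0.750 | 0.043 | 0.034 | 0.622 | 0.086 | 0.078 |
| 2023 | Spectral Index–Spectral Reflectance | BPNN | 0.587 | 0.169 | 0.052 | 0.509 | 0.172 | 0.064 |
| 2023 | Spectral Index–Spectral Reflectance | RF | 0.659 | 0.054 | 0.079 | 0.635 | 0.069 | 0.098 |
| 2023 | Spectral Index–Spectral Reflectance | XGBoost | 0.782 | 0.103 | 0.084 | 0.642 | 0.128 | 0.118 |
| 2023 | Spectral Index–Texture Information | BPNN | 0.683 | 0.117 | 0.066 | 0.572 | 0.215 | 0.153 |
| 2023 | Spectral Index–Texture Information | RF | 0.733 | 0.083 | 0.099 | 0.647 | 0.076 | 0.149 |
| 2023 | Spectral Index–Texture Information | XGBoost | 0.861 | 0.089 | 0.067 | 0.778 | 0.167 | 0.134 |
| 2023 | Spectral Index–Spectral Reflectance–Texture Information | BPNN | 0.584 | 0.148 | 0.069 | 0.542 | 0.153 | 0.127 |
| 2023 | Spectral Index–Spectral Reflectance–Texture Information | RF | 0.728 | 0.055 | 0.093 | 0.638 | 0.062 | 0.142 |
| 2023 | Spectral Index–Spectral Reflectance–Texture Information | XGBoost | 0.872 | 0.106 | 0.082 | 0.820 | 0.165 | 0.134 |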
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhao, W.; Li, Z.; Li, H.; Li, X.; Yang, P. Soil Salinity Prediction in an Arid Area Based on Long Time-Series Multispectral Imaging. Agriculture 2024, 14, 1539. https://doi.org/10.3390/agriculture14091539

