Article

Spatial Estimation of Soil Organic Carbon Content Utilizing PlanetScope, Sentinel-2, and Sentinel-1 Data

1 College of Resources and Environment, Southwest University, Chongqing 400716, China
2 College of Computer and Information Science, Southwest University, Chongqing 400716, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(17), 3268; https://doi.org/10.3390/rs16173268
Submission received: 19 July 2024 / Revised: 30 August 2024 / Accepted: 30 August 2024 / Published: 3 September 2024

Abstract

The accurate prediction of soil organic carbon (SOC) is important for agriculture and land management. Methods using remote sensing data are helpful for estimating SOC in bare soils. To overcome the challenge of predicting SOC under vegetation cover, this study extracted spectral, radar, and topographic variables from multi-temporal optical satellite images (high-resolution PlanetScope and medium-resolution Sentinel-2), synthetic aperture radar satellite images (Sentinel-1), and a digital elevation model, respectively, to estimate SOC content in arable soils in the Wuling Mountain region of Southwest China. These variables were modeled at four different spatial resolutions (3 m, 20 m, 30 m, and 80 m) using the eXtreme Gradient Boosting algorithm. The results showed that modeling resolution, the combination of multi-source remote sensing data, and temporal phases all influenced SOC prediction performance. The models generally yielded better results at a medium (20 m) modeling resolution than at fine (3 m) and coarse (80 m) resolutions. The combination of PlanetScope, Sentinel-2, and topography factors gave satisfactory predictions for dry land (R² = 0.673, MAE = 0.107%, RMSE = 0.135%). The addition of Sentinel-1 indicators gave the best predictions for paddy field (R² = 0.699, MAE = 0.114%, RMSE = 0.148%). The R² values of the optimal models for paddy field and dry land improved by 36.0% and 33.4%, respectively, compared with that for the entire study area. The optical images in winter played a dominant role in the prediction of SOC for both paddy field and dry land. This study offers valuable insights into effectively modeling soil properties under vegetation cover at various scales using multi-source and multi-temporal remote sensing data.

1. Introduction

Soil organic carbon (SOC) serves as a significant indicator of soil quality. It constitutes the primary component of organic matter in the soil and plays a critical role in maintaining moisture, improving soil structure, and providing nutrients [1]. In the southwestern region of China, specifically in the Wuling Mountain area, the geographical features are characterized by intricate diversity. The terrain undulates with continuous fluctuations in elevation, while the slopes exhibit a wide range of variations. Thus, the dynamics of SOC are subject to various influencing factors, leading to spatial heterogeneity and complexity in this region [2]. For about 20 years, digital soil mapping (DSM) has been an efficient and convenient technique to estimate the spatial distribution of soil properties using environmental factors. For example, the GlobalSoilMap project aims to provide fine-resolution soil information for the entire world, offering detailed data on various soil properties such as soil organic carbon, pH, and texture [3]. Another notable example is the SoilGrids initiative by ISRIC—World Soil Information, which generates global predictions for key soil properties at a high spatial resolution [4].
Currently, optical remote sensing images have been widely used for predicting soil properties [5,6,7,8]. Many scholars have successfully predicted SOC content by extracting multiple spectral indices from medium- and low-resolution satellite images (e.g., Sentinel-2, MODIS, and Landsat) using machine learning regression models [4,9,10,11]. Compared to the commonly used medium and low-resolution satellite sensors, high-resolution satellite sensors could provide more detailed and accurate surface information which can capture spatial variations of SOC at smaller scales. However, the effectiveness of high-resolution satellite data in mapping soil properties has not been widely explored yet due to high data costs. In recent years, the PlanetScope satellite has garnered attention from researchers due to its 3-meter spatial resolution and daily updated data [12,13,14,15,16]. It has achieved reliable results in fields such as agricultural monitoring, ocean monitoring, and land cover classification [14,15,16,17,18,19,20,21]. However, its potential for SOC prediction has not been fully utilized.
Meanwhile, radar remote sensing data can capture information about the relationship between soil and vegetation, opening up new avenues and prospects for predicting soil properties [22,23]. Since radar utilizes the transmission and reception of echo signals to detect surface and subsurface targets, it can penetrate atmospheric interference and obstacles. This capability allows it to obtain reflected, scattered, or interfered signals, providing a unique advantage in all-weather imaging [24]. Although the information content of a single radar image is limited, its continuous monitoring capability makes it an important tool for SOC prediction [22,23,25].
Many studies have used both optical and radar remote sensing data to predict SOC [26,27]. Their synergistic use not only compensates for the limitations of single-source data but also provides more comprehensive information for SOC prediction [21,22]. For example, Poggio and Gimona combined MODIS, Landsat, Sentinel-1, and Sentinel-2 data to map the SOC content in agricultural fields [27]. The optical data provided vegetation indices and reflectance information, the radar data provided soil moisture and surface roughness information, and the hyperspectral data provided more detailed land cover types and soil characteristic information. They found that the synergy of multiple remote sensing data sources is more effective than using data from a single satellite [27]. In terms of data sources, published works mainly focus on the application of optical remote sensing data, while the use of radar remote sensing images is still in its early stages. To obtain more comprehensive and accurate soil information, further exploration is needed to uncover the potential and utility of multi-source remote sensing data in SOC prediction.
Due to the differences in the original spatial resolution of various satellite remote sensing images, there are certain discrepancies in their ability to capture terrestrial features. High spatial resolution images can provide more detailed information and identify subtle changes in land features, whereas medium to low spatial resolution images capture general changes in surface characteristics, enhancing the model’s sensitivity to large-scale features [6]. Currently, many researchers employ resampling methods to unify different remote sensing data to a consistent spatial resolution for comparative analysis and modeling [6,7,9,28,29]. However, the resampling techniques might lead to a loss or blurring of details from the original images [5] and thus could affect the accurate capture of target objects. Instead, different resolutions ranging from fine to coarse allow for multi-scale data analysis and help determine the optimal modeling resolution [6,28].
Compared to single-temporal remote sensing data, multi-temporal remote sensing data offers significant advantages in terms of comprehensiveness, reliability, and improved identification accuracy for soil property prediction [30,31]. By integrating data from various time points, multi-temporal remote sensing can capture the unique spectral reflectance characteristics at each stage and dynamically track changes in vegetation growth cycles. This allows for a more accurate inference of vegetation features related to SOC, thereby demonstrating distinct and important value in SOC prediction within vegetated areas. Currently, the application of multi-source and multi-temporal remote sensing data in SOC prediction mapping is primarily focused on bare soils [31,32,33,34,35,36,37]. These studies are typically confined to fallow lands, agricultural seedbeds, and sparsely vegetated arid and semi-arid regions, relying on the visibility of bare soil within specific time windows. However, in the areas under vegetation cover, the signal within the pixels of satellite images might be generated jointly by both vegetation and soil, making it challenging to isolate information that is purely attributable to the soil. Since satellite images (such as Sentinel-2, MODIS, and Sentinel-1) can capture vegetation features that may serve as indirect proxy variables for soil properties [38,39], utilizing multi-source remote sensing data provides a more comprehensive and detailed depiction of surface characteristics. Furthermore, multi-temporal remote sensing data can mitigate biases caused by short-term climatic conditions or incidental events affecting single observations by capturing information closely related to SOC input-output processes over different periods [31]. 
Therefore, exploring the potential of multi-source, multi-temporal, and high-resolution remote sensing data to enhance SOC prediction accuracy in vegetated areas represents a highly promising research direction that warrants further investigation.
The land use in the Wuling Mountain area is characterized by diverse types and fragmented plots, with obvious exposed rocks and uneven vegetation coverage. This complexity of ground conditions poses challenges for the interpretation of remote sensing data [2]. Research has shown that land use types have a substantial impact on SOC content [40,41]. Different types of land use involve various biological, chemical, and physical processes that directly or indirectly influence the accumulation and distribution of organic carbon in the soil [42]. Therefore, when constructing a predictive model for SOC, it is necessary to fully consider land use or the spatial patterns of crop rotation [43]. There is currently a substantial amount of research on the impact of land use change on SOC content. However, there is limited research on predicting SOC content under different land use types. Scholars such as Zhou et al. [24] and Zhang et al. [44] have used land use types as variables to predict SOC content, but the results of feature importance ranking indicate that their contribution to the models is minimal. On the other hand, some researchers have utilized soil reflectance spectra to estimate SOC for individual land use types such as forests, croplands, and orchards [22,45]. These studies have confirmed that the prediction results for soil properties under individual land use types are substantially influenced by using soil reflectance spectra. Therefore, we hypothesize that the utilization of multi-source remote sensing to predict SOC for individual land use types has the potential to enhance model accuracy in regions with diverse land use types.
Hence, the objective of this study is to evaluate the effectiveness of multi-source and multi-temporal remote sensing data for predicting SOC across arable land under vegetation cover conditions in the Wuling Mountain region of Southwest China. To this end, we analyzed (i) the utility of the synergistic application of multi-source remote sensing data in predicting SOC; (ii) the feasibility of using multi-temporal and multi-scale remote sensing images for SOC content mapping; and (iii) whether constructing SOC prediction models for individual land use types can improve prediction accuracy. This research will contribute to a deeper understanding of the potential application of multi-source remote sensing data in predicting SOC, providing a scientific basis and technical support for precision agriculture and sustainable soil management. Specifically, the eXtreme Gradient Boosting algorithm (XGBoost) was used to estimate the spatial distribution of SOC because of its high performance, efficiency, and ability to handle imperfect data such as missing values and outliers [22,42,44].

2. Materials and Methods

2.1. Study Area

The research area is located in Qianjiang District, Chongqing, southwestern China (Figure 1). Geographically, it lies between 108°36′0″ and 108°52′0″ E longitude and 29°16′0″ and 29°36′0″ N latitude. It covers an area of about 440 km². The research area is characterized by typical hilly and mountainous terrain and is situated in the core region of the Wuling Mountains. The topography is undulating with continuous mountain ranges, natural slopes ranging from 0 to 80.91 degrees, and elevations ranging from 367 to 1561 m above sea level. The research area has a subtropical humid monsoon climate with four distinct seasons. According to the observation data from Qianjiang District Meteorological Station, the annual average temperature is 15.4 °C, with an annual sunshine duration of 1167 h and an annual rainfall ranging from 1180 to 1280 mm. The relative humidity is around 70%. Due to the complex terrain and high degree of fragmentation in the area, there are various land use types. The main land use types are paddy field and dry land. The paddy field area covers 32.05 km², accounting for 7.28% of the total land area. The dry land area covers 102.21 km², accounting for 23.23% of the total land area (Figure 1c).

2.2. Soil Sampling

From 2017 to 2018, field surveys were conducted within the study area. Soil sampling points were arranged based on the principle of “grid combined with plot”, targeting surface soil samples [46]. For agricultural land, a density of 4–6 points per square kilometer was used. At each designated sampling point, centered on a GPS location, 4–6 sub-sampling points were identified radiating outwards by 30–50 m [46,47,48]. These sub-samples, which shared similar land use and soil types, were then combined into a composite sample. Each sample generally weighed more than 1.5 kg and was stored in a dedicated sample bag. Afterward, the samples were air-dried at room temperature and passed through a 2 mm soil sieve prior to chemical analysis. The determination of SOC was conducted using the oxidation-reduction capacity method [49]. A total of 332 topsoil samples were collected within the study area, comprising 171 samples from paddy field and 161 samples from dry land.

2.3. Environmental Data for Modeling

On the basis of the correlation between the surrounding environment and soil properties, we gathered remote sensing images and terrain features to serve as environmental variables for modeling analysis. These variables were transformed into raster layers with spatial resolutions of 3 m, 20 m, 30 m, and 80 m through bilinear interpolation using ArcGIS 10.8 software [50]. Subsequently, we retrieved the attribute values of the relevant environmental variables from each soil sample point to be used as inputs for modeling [51]. The source and processing of environment variables are shown in Figure 2.
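The resampling and point-extraction steps described above can be illustrated with a minimal pure-Python sketch of bilinear interpolation. It is not the ArcGIS implementation used in the study; the function names `bilinear_sample` and `resample`, the nested-list raster layout, and the cell-centre alignment are assumptions made for illustration only.

```python
import math

def bilinear_sample(grid, x, y):
    """Interpolate a raster (list of rows) at fractional column x and row y."""
    rows, cols = len(grid), len(grid[0])
    # Clamp to the raster extent so points on the border still get a value.
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bottom = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def resample(grid, src_res, dst_res):
    """Resample a raster from cell size src_res to dst_res (same units)."""
    scale = dst_res / src_res
    rows_out = max(1, int(round(len(grid) / scale)))
    cols_out = max(1, int(round(len(grid[0]) / scale)))
    # Sample the source raster at the centre of every output cell.
    return [
        [bilinear_sample(grid, (c + 0.5) * scale - 0.5, (r + 0.5) * scale - 0.5)
         for c in range(cols_out)]
        for r in range(rows_out)
    ]

# Toy 2 x 2 raster at 10 m, coarsened to a single 20 m cell.
toy_raster = [[0.0, 1.0], [2.0, 3.0]]
coarse = resample(toy_raster, 10, 20)
```

The same `bilinear_sample` call can be used to read a covariate value at a soil sample location once its map coordinates have been converted to fractional row and column indices.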

2.3.1. Optical Satellite Images Collection and Processing

(1) Image pre-processing: The optical satellite images used in this study include high-resolution 3 m PlanetScope images and commonly used medium-resolution Sentinel-2 images. Given that the study area is located in a complex and dynamic mountainous environment, cloud cover can obscure crucial regions such as mountain ranges and valleys. These regions are particularly important as they may exhibit significant variations in soil organic carbon content. Thus, selecting a low cloud coverage threshold ensures that the data used are clear and contain comprehensive information, which is essential for subsequent data analysis and modeling. Specifically, the PlanetScope images selected are L3B-level PSB-SD data products with a cloud cover of less than 5%. These images consist of four spectral bands: Blue, Green, Red, and Near-infrared (Table 1). The images capture the four seasons in the study area: spring (nine stitched images taken from 26 to 28 May 2017), summer (10 stitched images captured on 27 July 2017), autumn (nine stitched images taken on 31 October 2017), and winter (11 stitched images captured from 22 to 24 December 2017) (Figure 3). The Sentinel-2 images were obtained from the European Space Agency’s Sentinel Scientific Data Center (https://scihub.copernicus.eu, accessed on 17 June 2024). We specifically selected L1C-level images with a cloud coverage of less than 5%. These images cover four seasons in our study area: spring (4 May 2017), summer (10 July 2017), autumn (31 October 2017), and winter (22 December 2017) (Figure 3). The study area was covered individually by each image. Only the ten bands with higher resolution were used as candidate predictors for modeling, namely Blue (10 m), Green (10 m), Red (10 m), Red edge 1 (20 m), Red edge 2 (20 m), Red edge 3 (20 m), Near-infrared (10 m), Near-infrared Narrow (20 m), Shortwave Infrared 1 (20 m), and Shortwave Infrared 2 (20 m) (Table 1).
Then, we performed geometric correction on all images, followed by radiometric calibration and atmospheric correction of all remote sensing datasets. The atmospheric correction was carried out using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module, which is based on the Moderate Resolution Atmospheric Transmission (MODTRAN 4) radiative transfer code [52].
(2) Spectral variables: The raw bands of optical satellite images and the spectral indices calculated from the bands were used as spectral variables for predicting SOC. The spectral indices included 6 soil radiometric indices, 13 vegetation radiometric indices, and 3 soil salinity indices (Table 2). Multiple studies have shown that high salt environments can limit microbial activity and the action of organic matter degradation enzymes, leading to the gradual accumulation of organic carbon in the soil [53,54,55]. Additionally, the salt content in soil can alter its structure, affecting processes such as water infiltration and oxygen exchange, which in turn impact microbial activity and the transformation and preservation of organic carbon [54]. Therefore, considering the relationship between soil salt content and SOC, we chose indices related to soil salt content as proxies for SOC prediction. Although there are two NIR bands in Sentinel-2 images, the narrower NIRn band (855–875 nm) is more similar to the PlanetScope NIR band (855–875 nm). Therefore, the spectral indices for Sentinel-2 were calculated from the NIRn band.

2.3.2. Radar Satellite Images Collection and Processing

(1) Image pre-processing: Sentinel-1 is the first Earth observation radar satellite designed and developed by the European Space Agency (ESA). It consists of two polar-orbiting satellites, Sentinel-1A and Sentinel-1B, equipped with a C-band synthetic aperture radar (SAR). This satellite offers various imaging modes, including single polarization and dual polarization, enabling the acquisition of 10 m resolution satellite images.
Although radar images are not affected by atmospheric conditions such as cloud cover, the surface conditions (e.g., vegetation status and soil moisture) vary with the seasons. We selected representative time periods to capture these changes. For example, spring and summer are characterized by vigorous vegetation growth, while winter may reflect bare ground conditions. Therefore, we downloaded six scenes of radar images from the European Space Agency’s Copernicus Open Access Hub (https://scihub.copernicus.eu, accessed on 17 June 2024) for the year 2017 (16 January, 21 May, 20 July, 18 September, 24 October, and 11 December). These dates were chosen to encompass major seasonal variations: January and December represent winter; May and July cover the transition between spring and summer as well as peak summer; September and October cover autumn. All images are in Interferometric Wide (IW) mode, Ground Range Detected (GRD) format, with polarization options of Vertical-Horizontal (VH) and Vertical-Vertical (VV). The official SNAP 8.0 software was used to perform orbit correction, thermal noise removal, and radiometric calibration on each satellite image in order to obtain accurate satellite azimuth and velocity signals [56,57]. Subsequently, filtering techniques were applied to reduce the impact of speckle noise [56]. Finally, terrain correction was conducted to compensate for geometric distortions in the SAR data [58]. The VV and VH backscattering coefficients were then converted to decibels (dB).
(2) Radar variables: The Radar Vegetation Index (RVI) and Cross Ratio (CR) were extracted from the processed images and used, together with the backscattering coefficients VV and VH, as radar predictors of SOC (Table 2). Research has demonstrated that RVI and CR are more effective in reducing systematic sampling errors and provide a more stable representation of vegetation attributes than the backscattering coefficients VV and VH [59,60].
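As an illustration of the radar variables described above, the following minimal sketch converts backscatter to dB and computes RVI and CR using the formulas listed in Table 2 (RVI = 4 × VH/(VV + VH), CR = VH/VV). The function names are hypothetical, and the sketch assumes that the indices are computed from linear-power backscatter before the dB conversion; the source does not state which scale was used.

```python
import math

def to_db(sigma0_linear):
    """Convert a linear-power backscatter coefficient to decibels."""
    if sigma0_linear <= 0:
        raise ValueError("backscatter must be positive to take a logarithm")
    return 10.0 * math.log10(sigma0_linear)

def radar_vegetation_index(vv, vh):
    """RVI = 4 * VH / (VV + VH), with VV and VH assumed to be in linear power."""
    return 4.0 * vh / (vv + vh)

def cross_ratio(vv, vh):
    """CR = VH / VV, with VV and VH assumed to be in linear power."""
    return vh / vv

# Toy pixel values (not taken from the study).
vv, vh = 0.4, 0.1
pixel = {
    "VV_dB": to_db(vv),
    "VH_dB": to_db(vh),
    "RVI": radar_vegetation_index(vv, vh),
    "CR": cross_ratio(vv, vh),
}
```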
Table 2. Spectral indices used in this study.

| Type | Index | Calculation formula | Reference |
|---|---|---|---|
| Soil radiometric indices | Brightness Index (BI) | $\sqrt{(\rho_{red}^2 + \rho_{green}^2)/2}$ | [61] |
| | Second Brightness Index (BI2) | $\sqrt{(\rho_{red}^2 + \rho_{green}^2 + \rho_{nir}^2)/3}$ | [61] |
| | Redness Index (RI) | $\rho_{red}^2 / \rho_{green}^3$ | [62] |
| | Color Index (CI) | $(\rho_{red} - \rho_{green}) / (\rho_{red} + \rho_{green})$ | [62] |
| | Hue Index (HI) | $(2\rho_{red} - \rho_{green} - \rho_{blue}) / (\rho_{green} - \rho_{blue})$ | [63] |
| | Saturation Index (SI) | $(\rho_{red} - \rho_{blue}) / (\rho_{red} + \rho_{blue})$ | [63] |
| Vegetation radiometric indices | Soil Adjusted Vegetation Index (SAVI) | $(\rho_{nir} - \rho_{red}) \times (1 + L) / (\rho_{nir} + \rho_{red} + L)$ | [64] |
| | Transformed Soil Adjusted Vegetation Index (TSAVI) | $s \times (\rho_{nir} - s \times \rho_{red} - a) / (s \times \rho_{nir} + \rho_{red} - a \times s + X \times (1 + s^2))$ | [65] |
| | Modified Soil Adjusted Vegetation Index (MSAVI) | $(1 + M) \times (\rho_{nir} - \rho_{red}) / (\rho_{nir} + \rho_{red} + L)$ | [66] |
| | Second Modified Soil Adjusted Vegetation Index (MSAVI2) | $\left(2\rho_{nir} + 1 - \sqrt{(2\rho_{nir} + 1)^2 - 8(\rho_{nir} - \rho_{red})}\right) / 2$ | [67] |
| | Difference Vegetation Index (DVI) | $\rho_{nir} - \rho_{red}$ | [68] |
| | Ratio Vegetation Index (RVI) | $\rho_{nir} / \rho_{red}$ | [69] |
| | Perpendicular Vegetation Index (PVI) | $\sin(b) \times \rho_{nir} - \cos(b) \times \rho_{red}$ | [70] |
| | Weighted Difference Vegetation Index (WDVI) | $\rho_{nir} - s \times \rho_{red}$ | [71] |
| | Infrared Percentage Vegetation Index (IPVI) | $\rho_{nir} / (\rho_{nir} + \rho_{red})$ | [72] |
| | Normalized Difference Vegetation Index (NDVI) | $(\rho_{nir} - \rho_{red}) / (\rho_{nir} + \rho_{red})$ | [73] |
| | Transformed Normalized Difference Vegetation Index (TNDVI) | $\sqrt{(\rho_{nir} - \rho_{red}) / (\rho_{nir} + \rho_{red}) + 0.5}$ | [74] |
| | Atmospherically Resistant Vegetation Index (ARVI) | $(\rho_{nir} - rb) / (\rho_{nir} + rb)$ | [75] |
| | Global Environmental Monitoring Index (GEMI) | $\eta \times (1 - 0.25 \times \eta) - (\rho_{red} - 0.125) / (1 - \rho_{red})$ | [76] |
| Soil salinity indices | Soil salinity index 1 (SSI1) | $\rho_{blue} \times \rho_{red} / \rho_{green}$ | [77] |
| | Soil salinity index 2 (SSI2) | $(\rho_{green} + \rho_{red}) / 2$ | [78] |
| | Soil salinity index 3 (SSI3) | $\sqrt{\rho_{blue} \times \rho_{red}}$ | [79] |
| Radar indices | Radar Vegetation Index (RVI) | $4 \times VH / (VV + VH)$ | [59] |
| | Cross Ratio (CR) | $VH / VV$ | [60] |

Notes: $L$ is a correction factor that varies from 0 (indicating very high vegetation cover) to 1 (indicating very low vegetation cover); $a$ is the soil line intercept; $s$ is the slope of the soil line; $X$ is the adjustment factor to minimize soil noise; $M = 1 - 2 \times s \times \mathrm{NDVI} \times \mathrm{WDVI}$; $b$ denotes the angle between the soil line and the NIR axis, measured in degrees; $\eta = (2 \times (\rho_{nir}^2 - \rho_{red}^2) + 1.5 \times \rho_{nir} + 0.5 \times \rho_{red}) / (\rho_{nir} + \rho_{red} + 0.5)$; $rb = \rho_{blue} - \rho_{red}$. In this paper, the values assigned to $L$, $a$, $s$, $X$, and $b$ are 0.5, 0.5, 0.5, 0.08, and 45, respectively.
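A few of the optical indices in Table 2 can be computed per pixel as in the minimal sketch below. It uses the standard forms of NDVI, SAVI (with L = 0.5, the value stated in the notes), BI, MSAVI2, and TNDVI; the function names and the toy reflectance values are assumptions for illustration and are not taken from the study.

```python
import math

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def savi(nir, red, L=0.5):
    """Soil Adjusted Vegetation Index with correction factor L."""
    return (nir - red) * (1 + L) / (nir + red + L)

def brightness_index(red, green):
    """Brightness Index: root mean square of red and green reflectance."""
    return math.sqrt((red ** 2 + green ** 2) / 2)

def msavi2(nir, red):
    """Second Modified Soil Adjusted Vegetation Index."""
    return (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2

def tndvi(nir, red):
    """Transformed NDVI: square root of NDVI shifted by 0.5."""
    return math.sqrt(ndvi(nir, red) + 0.5)

# Toy surface reflectances for one pixel (hypothetical values).
nir, red, green = 0.4, 0.1, 0.3
indices = {
    "NDVI": ndvi(nir, red),
    "SAVI": savi(nir, red),
    "BI": brightness_index(red, green),
    "MSAVI2": msavi2(nir, red),
    "TNDVI": tndvi(nir, red),
}
```

Applying the same functions to each seasonal image yields one covariate per index and date.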

2.3.3. Topographic and Land Use Data

Topographic variables were added as proxies for estimating SOC because the study area is situated in a hilly and mountainous environment with large undulations. Based on the 12.5 m resolution DEM data generated by the Land Surveys, ArcGIS 10.8 was used to calculate elevation, slope, and aspect as three fixed variables for each model to be used to aid in SOC prediction [50]. These three metrics were selected because they are the most commonly used in terrain analysis and are sufficient to comprehensively reflect basic terrain characteristics. Additionally, our study primarily focuses on the utility analysis of multi-source remote sensing data. By selecting key and representative terrain metrics, we ensure that the research remains focused on the effectiveness of remote sensing data without being diluted by an excessive number of terrain variables.
The land use data were derived from the 1:50,000 Land Use Map of the People’s Republic of China (http://www.mnr.gov.cn, accessed on 6 January 2024), provided by the Ministry of Natural Resources in 2019. This map was first compiled using data from the QianJiang District Land Use Database, which was launched in 2008 and is updated annually by the local Ministry of Natural Resources [80].

2.4. Modeling Process

In this study, covariates used to construct SOC prediction models consist of multi-temporal spectral variables (original bands + spectral indices), multi-temporal radar remote sensing variables, and topographic variables. Six models were constructed in paddy field and dry land, respectively (Table 3).
This study utilizes eXtreme Gradient Boosting (XGBoost) to explore the relationship between SOC and covariates. XGBoost is an improvement upon the Gradient Boosting Decision Tree (GBDT) algorithm, incorporating advancements at both the algorithmic and system design levels [81]. It combines the predictions of a set of weak learners and builds a strong learner through an additive training strategy. Compared to GBDT, XGBoost utilizes second-order derivative information during the optimization process and incorporates regularization terms, which not only enhances the precision of the optimization objective but also constrains model complexity to prevent overfitting [82]. At the algorithmic level, XGBoost incorporates the weighted quantile sketch algorithm, which helps in selecting feature split points during tree generation. This approach allows candidate split points to be pre-built for each feature, reducing computational overhead and improving training speed. At the system design level, XGBoost sorts the input training features and stores them in memory using a block structure, enabling efficient reuse during subsequent iterations. Additionally, XGBoost supports parallel computation during training, further enhancing computational efficiency [83]. The XGBoost model was implemented in Python 3.9 for this study.
The feature importance in XGBoost can be calculated using three measures: “gain”, “frequency”, and “coverage”. Gain describes the contribution of a feature to the improvement achieved by the splits that use it. Frequency counts the number of times a feature is used to split nodes when constructing the trees. Coverage represents the relative number of observations affected by the splits on a feature [84]. A feature that is used more frequently in making crucial decisions in the boosting process will have a higher score [85]. Gain is the primary factor determining the importance of a branching feature, and in this study, it is used to determine feature importance. For example, the calculation of the relative importance of feature j is as follows:
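$$\hat{I}_j^2(T) = \sum_{t=1}^{J-1} i_t^2 \, P(v_t = j) \qquad (1)$$

$$\hat{I}_j^2 = \frac{1}{M} \sum_{m=1}^{M} \hat{I}_j^2(T_m) \qquad (2)$$

where $T$ is a single tree with $J$ branch nodes, $t$ is a node, $i_t^2$ is the improvement of squared error of node $t$, $v_t$ is the feature associated with node $t$, $P(v_t = j)$ equals 1 if node $t$ splits on feature $j$ and 0 otherwise, and $M$ is the number of trees in the forest.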
The importance of feature j in the tree T is calculated through Equation (1), and then the average of the importance of feature j in each tree is calculated through Equation (2) to obtain its final importance in the forest of M trees.
In this study, there are multiple predictor variables used to construct the model. To prevent the uncertainty of the predictive model caused by highly correlated or redundant predictor variables [86], we utilized the feature importance analysis function provided by the XGBoost algorithm. This function evaluates the contribution of each feature in model training and provides a relative ranking of importance. Based on these evaluations, we can select the top-ranked variables as primary features and eliminate variables that have minimal impact on prediction performance. This approach allows us to reduce model complexity while maintaining prediction accuracy.
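The importance calculation in Equations (1) and (2) and the subsequent selection of top-ranked variables can be sketched as follows. This is a toy illustration, not the XGBoost internals: each tree is represented as a hypothetical list of `(split_feature, improvement)` pairs, the feature names are invented, and the cut-off `k` is an assumption since the source does not state how many variables were retained.

```python
def tree_importance(tree, feature):
    """Eq. (1): sum the squared-error improvements of nodes splitting on `feature`."""
    return sum(improvement for split_feature, improvement in tree
               if split_feature == feature)

def forest_importance(trees, feature):
    """Eq. (2): average the per-tree importance over all M trees."""
    return sum(tree_importance(tree, feature) for tree in trees) / len(trees)

def rank_features(trees):
    """Return (feature, importance) pairs sorted from most to least important."""
    features = {split_feature for tree in trees for split_feature, _ in tree}
    scores = {f: forest_importance(trees, f) for f in features}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def select_top_features(trees, k):
    """Keep the k highest-ranked features as model inputs."""
    return [feature for feature, _ in rank_features(trees)[:k]]

# Two toy trees; names and improvements are hypothetical.
toy_trees = [
    [("NDVI_winter", 0.5), ("slope", 0.25), ("NDVI_winter", 0.25)],
    [("elevation", 0.5), ("NDVI_winter", 0.25)],
]
ranking = rank_features(toy_trees)
selected = select_top_features(toy_trees, 2)
```

With a trained XGBoost model, a gain-based ranking of this kind is normally read from the library rather than computed by hand; the sketch only makes the averaging in Equation (2) explicit.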

2.5. Model Validation

In the XGBoost model, we tuned four parameters: learning rate, maximum tree depth, minimum child weight, and the regularization parameter lambda. We then evaluated many possible combinations of these parameters within the model. The parameters are listed in Table 4. We randomly selected 80% of the sample points for each land use type as the calibration dataset, while the remaining 20% served as the validation dataset [27].
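The parameter search and the 80/20 split per land use type can be sketched as below. The candidate values in `param_grid` are placeholders, not the values from Table 4; the keys follow the parameter names of the XGBoost scikit-learn wrapper (`learning_rate`, `max_depth`, `min_child_weight`, `reg_lambda`). The sample dictionaries and the `land_use` field are likewise hypothetical.

```python
import itertools
import random

# Placeholder candidate values; the study's actual values are given in Table 4.
param_grid = {
    "learning_rate": [0.01, 0.1],
    "max_depth": [3, 5],
    "min_child_weight": [1, 3],
    "reg_lambda": [0.1, 1.0],
}

def parameter_combinations(grid):
    """Yield every combination of the candidate values as a dict."""
    names = list(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield dict(zip(names, values))

def split_by_land_use(samples, calibration_fraction=0.8, seed=0):
    """Randomly split samples into calibration/validation within each land use type."""
    rng = random.Random(seed)
    groups = {}
    for sample in samples:
        groups.setdefault(sample["land_use"], []).append(sample)
    calibration, validation = [], []
    for members in groups.values():
        members = members[:]          # copy so the caller's list is not shuffled
        rng.shuffle(members)
        cut = round(len(members) * calibration_fraction)
        calibration += members[:cut]
        validation += members[cut:]
    return calibration, validation

# Same sample counts as the study (171 paddy field, 161 dry land); values are dummies.
samples = ([{"land_use": "paddy", "id": i} for i in range(171)]
           + [{"land_use": "dry", "id": i} for i in range(161)])
combos = list(parameter_combinations(param_grid))
calibration, validation = split_by_land_use(samples)
```

Each dictionary in `combos` would be passed to the model, fitted on `calibration`, and scored on `validation`.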
To assess the predictive ability and robustness of SOC prediction models, we have selected four indicators to evaluate the models: Coefficient of Determination ( R 2 ), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Akaike Information Criterion (AIC). The calculation formulas for these indicators are as follows:
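$$R^2 = 1 - \frac{\sum_{i=1}^{m} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{m} (y_i - \bar{y})^2} \qquad (3)$$

$$MAE = \frac{1}{m} \sum_{i=1}^{m} \left| y_i - \hat{y}_i \right| \qquad (4)$$

$$RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (y_i - \hat{y}_i)^2} \qquad (5)$$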
A I C = 2 × k + m × ln i = 1 m y i y ^ i 2 + 1 1 + R 2
where $m$ is the sample size, $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of $y_i$, and $k$ is the number of features.
R² evaluates the fit of the model: the closer its value is to 1, the better the fit. MAE and RMSE evaluate the robustness of the model: the smaller their values, the lower the prediction error. AIC evaluates the complexity of the model: the lower its value, the more parsimonious the model [22].
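The three error metrics R², MAE, and RMSE can be computed as in the following minimal sketch, which uses their standard definitions. AIC is left out because its exact form in this study is not reproduced here. The function names and the toy values are illustrative only.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Toy observed and predicted SOC values in % (hypothetical).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
scores = {
    "R2": r2_score(observed, predicted),
    "MAE": mae(observed, predicted),
    "RMSE": rmse(observed, predicted),
}
```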

2.6. Statistical Analysis

To analyze the differences in SOC content between paddy field and dry land, a one-way analysis of variance (ANOVA) at a significance level of 0.05 was used, because it tests whether the group means differ significantly [87]. These statistical analyses were performed using the SPSS V.25 software package.
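The study ran this test in SPSS; the sketch below only illustrates how the one-way ANOVA F statistic is formed from between-group and within-group variability. The function name and the toy SOC values are assumptions, and the p-value step is omitted because it requires the F distribution (available, for instance, through `scipy.stats.f_oneway`).

```python
def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of groups."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variability of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Toy SOC values in % for two land use types (hypothetical).
paddy_soc = [1.0, 2.0, 3.0]
dry_soc = [2.0, 3.0, 4.0]
f_statistic = one_way_anova_f([paddy_soc, dry_soc])
```

The resulting F value is compared with the critical value of the F distribution at the 0.05 level for (1, n − 2) degrees of freedom when two groups are tested.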
The workflow of the study is shown in Figure 4.

3. Results

3.1. Descriptive Statistics of SOC

Descriptive statistics for total area, paddy field and dry land SOC are shown in Table 5. In the entire dataset of the study area, the variation range of SOC is between 0.05% and 2.84%, with a mean value of 1.17% and a coefficient of variation of 36.08%. This indicates that there is substantial spatial variability in SOC. The average SOC content in dry land is generally lower than that in paddy field, primarily due to differences in water availability and vegetation cover [88]. Compared to paddy field, dry land lacks irrigation, which leads to faster decomposition of organic matter in the soil. Additionally, the low vegetation cover in dry land during certain periods of time, influenced by cropping cycles, exposes the soil to climatic and environmental changes, further exacerbating the loss of organic carbon from the soil.
Significant differences (p < 0.05) in SOC were observed between paddy field and dry land through one-way analysis of variance and multiple comparisons. These results suggest that land use types have an impact on SOC content to a certain extent.

3.2. Model Evaluation and Comparison

Figure 5 summarizes the predictive performance of different models for paddy field, dry land, and the total area. Additionally, Table A1, Table A2 and Table A3 in Appendix A provide detailed evaluation results for each model on both training and test sets. Because the evaluation metrics on the test set more accurately reflect a model’s performance on unseen data, we primarily reference these metrics when assessing model performance. Furthermore, by comparing metrics between training and test sets, we can evaluate the model’s generalization ability, i.e., its adaptability to new data. The analysis shows that modeling resolution, satellite data selection, and land use all play a crucial role in determining the accuracy and reliability of SOC prediction models.
Figure 5 shows large differences in model performance across modeling resolutions. Overall, the six models generally achieved their highest prediction accuracy at a moderate modeling resolution of 20 m. However, when the resolution was coarsened to 80 m, the accuracy of the models decreased substantially. For instance, for the paddy field, the R² of Model E dropped from 0.647 to 0.125 when moving from a 20 m to an 80 m modeling resolution, while the R² of Model F decreased from 0.699 to 0.123. Similarly, for dry land, the R² of Model E declined from 0.673 to 0.151, and the R² of Model F dropped from 0.571 to 0.058. This suggests that when the raster units of the original images are aggregated to a coarser resolution, detailed information is lost or blurred, which degrades the performance of the model. In addition, compared to finer and coarser modeling resolutions, building prediction models at a moderate resolution can provide more competitive accuracy for soil attributes.
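The loss of detail under coarsening can be illustrated with a toy raster. The sketch below averages non-overlapping blocks of a fine grid; block averaging is an assumption made for illustration (the resampling method is not specified in this section), and the checkerboard raster is hypothetical.

```python
import statistics

def aggregate_mean(grid, factor):
    """Coarsen a square 2-D grid by averaging non-overlapping factor x factor blocks."""
    size = len(grid)
    coarse = []
    for r in range(0, size, factor):
        row = []
        for c in range(0, size, factor):
            block = [grid[i][j] for i in range(r, r + factor) for j in range(c, c + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse

# Hypothetical 4 x 4 fine-resolution raster with a checkerboard pattern of 0s and 1s
fine = [[(i + j) % 2 for j in range(4)] for i in range(4)]
coarse = aggregate_mean(fine, 2)

# Spatial variance before and after coarsening
fine_var = statistics.pvariance([v for row in fine for v in row])
coarse_var = statistics.pvariance([v for row in coarse for v in row])
print(fine_var, coarse_var)
```

Every 2 × 2 block averages to 0.5, so the coarse raster is constant and all of the fine-scale contrast disappears. Real imagery is far less extreme, but the same mechanism reduces the signal available to a model at coarse resolutions.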
Furthermore, we found that the selection and combination of different satellite data substantially affect the prediction accuracy of SOC at the optimal modeling resolution. In the paddy field, Model A and Model B, constructed using high-resolution PlanetScope data, exhibited higher prediction accuracy than Model C and Model D, which used medium-resolution Sentinel-2 data. Conversely, in dry land, Sentinel-2 demonstrated advantages. This indicates that different satellite data have their own strengths, and selecting the appropriate data has a major impact on the prediction accuracy of SOC. We also found that combining multiple data sources is notably more effective than using single satellite data. For instance, in paddy field, the combined use of all available predictors in Model F resulted in the highest prediction accuracy (R² = 0.699). Similarly, in dry land, the combination of PlanetScope, Sentinel-2, and topography predictors in Model E achieved the optimal prediction accuracy (R² = 0.673). Notably, as shown in Appendix A: Table A1, Table A2 and Table A3, the training and test accuracies for these two models are closely aligned. This indicates that they possess strong generalization capabilities and do not overly rely on specific training samples.
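The closeness of training and test accuracy can be checked directly from the appendix values. The following sketch copies the train and test R² of the two optimal models from Tables A1 and A2 and, for contrast, one 80 m model from Table A1; the dictionary layout is only an illustrative choice.

```python
# (train R2, test R2) copied from Tables A1 and A2
scores = {
    "paddy field, Model F, 20 m": (0.729, 0.699),
    "dry land, Model E, 20 m": (0.692, 0.673),
    "paddy field, Model A, 80 m": (0.999, 0.495),
}

# A small train-test gap suggests the model is not memorising the training samples
gaps = {name: round(train - test, 3) for name, (train, test) in scores.items()}
for name, gap in gaps.items():
    print(f"{name}: gap = {gap}")
```

The two optimal models show gaps of 0.030 and 0.019, whereas the 80 m example has a gap of about 0.5, a pattern typical of overfitting.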
Finally, we found that the predictive models built for individual land use types performed better than the model for the entire region. For the optimal models, the prediction accuracy (R²) for paddy field and dry land modeled separately improved by 0.360 and 0.334, respectively, compared to modeling across the entire study area. Furthermore, land use type also affected the utility of radar remote sensing indices. For example, in the paddy field, Model B, Model D, and Model F, which incorporated radar remote sensing indices provided by Sentinel-1, outperformed Model A, Model C, and Model E. In dry land, however, the opposite held: including radar remote sensing indices decreased prediction accuracy to varying degrees.
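The figures in this paragraph can be reproduced from the appendix tables. The sketch below uses the test R² of Models E and F at 20 m from Tables A1 to A3; the relative gain is an additional, illustrative quantity that is not reported in the text.

```python
# Test R2 of Models E and F at the 20 m modeling resolution (Tables A1 to A3)
test_r2 = {
    "paddy field": {"E": 0.647, "F": 0.699},
    "dry land": {"E": 0.673, "F": 0.571},
    "total area": {"E": 0.339, "F": 0.233},
}
baseline = max(test_r2["total area"].values())  # best total-area model

summary = {}
for land_use in ("paddy field", "dry land"):
    best = max(test_r2[land_use].values())
    summary[land_use] = {
        "gain_vs_total": round(best - baseline, 3),                      # absolute R2 gain
        "relative_gain_pct": round((best - baseline) / baseline * 100, 1),
        "radar_effect": round(test_r2[land_use]["F"] - test_r2[land_use]["E"], 3),
    }
print(summary)
```

The absolute gains are 0.360 and 0.334, matching the text. Adding Sentinel-1 (Model F versus Model E) raises R² by 0.052 for paddy field and lowers it by 0.102 for dry land.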
Moreover, we compared XGBoost with Random Forest (RF), Gradient Boosting Decision Trees (GBDT), and Support Vector Regression (SVR). Based on the R², MAE, RMSE, and AIC metrics, XGBoost outperformed the other models. The results for RF, GBDT, and SVR are detailed in Appendix A: Table A4, Table A5 and Table A6.

3.3. Relative Importance of Predictor Variables

Based on the importance scores obtained from the XGBoost algorithm, we analyzed Model E and Model F at the 20 m modeling resolution, which showed better performance. Figure 6 shows the contribution of different satellite data and different temporal spectral variables to the models. Only the top 30 environmental variables by importance are displayed for each model. In both Model E and Model F for the paddy field, the leading feature variables were provided by the PlanetScope satellite. Overall, the PlanetScope predictors contributed 55.37% and 44.93% in these two models, respectively, indicating their crucial role in paddy field. In contrast, Sentinel-2 provided the more prominent feature variables for dry land, with overall contribution rates of 50.06% and 51.05%, indicating that Sentinel-2 predictors play an important role in dry land. The contribution rate of Sentinel-1 was also influenced by land use type: it reached 22.05% for paddy field but only 8.12% for dry land. These results indicate that land use type affects, to some extent, how the model interprets the remote sensing data.
Based on the analysis of multi-temporal remote sensing variables and their contributions to prediction models, we found that the predictive variables derived from winter images play a dominant role in interpretation. For the paddy field, spectral variables from winter images contribute 42.46% and 47.02%, respectively, to the two models, indicating their substantial influence. Similarly, for dry land, winter spectral variables also serve as important explanatory variables for SOC prediction, with contribution rates of 29.86% and 25.96%.
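Contribution rates per sensor or per season can be obtained by summing the importance scores of the individual variables and normalising to 100%. The sketch below shows one way to do this; the feature names, their naming pattern, and the scores are all hypothetical, and the type of XGBoost importance score used in the study is not specified here.

```python
from collections import defaultdict

def contribution_by_group(importances, group_of):
    """Sum importance scores per group and express them as percentages."""
    totals = defaultdict(float)
    for feature, score in importances.items():
        totals[group_of(feature)] += score
    grand_total = sum(totals.values())
    return {group: 100.0 * value / grand_total for group, value in totals.items()}

# Hypothetical feature names of the form "<sensor>_<season>_<variable>"
importances = {
    "PS_winter_NDVI": 0.30,
    "PS_summer_B4": 0.20,
    "S2_winter_B11": 0.25,
    "S1_spring_VV": 0.15,
    "DEM_none_slope": 0.10,
}
by_sensor = contribution_by_group(importances, lambda name: name.split("_")[0])
by_season = contribution_by_group(importances, lambda name: name.split("_")[1])
print(by_sensor)
print(by_season)
```

Grouping the same scores by a different key (sensor or season) gives the two views of the variable contributions discussed in this section.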

3.4. Spatial Prediction

We mapped the spatial distribution of SOC in the study area based on the best model for the paddy field and dry land (Figure 7). The SOC content in the central region is relatively low, while it forms high-value aggregation areas in the western and southeastern parts. It can be observed that there is a certain relationship between SOC content and elevation. In high-altitude areas, the SOC content is higher, whereas it is lower in low-altitude regions. This pattern may be attributed to the indirect influence of topography on SOC input and loss through its close relationship with vegetation cover, temperature, and precipitation [89]. Additionally, soil gravity movement can also be influenced by water flow and cultivation practices [90].

4. Discussion

4.1. Performance of SOC Prediction Models Using Different Combinations of Environmental Variables

Through analysis and comparison, it was found that different combinations of environmental variables substantially affected the accuracy of the SOC prediction model (Figure 5). When it comes to optical remote sensing data, high-resolution remote sensing data can provide more detailed information, but they are not always necessary for accurate spatial analysis [7]. For example, the high-resolution PlanetScope satellite data performs better overall for paddy field compared to the medium-resolution Sentinel-2 data, while the medium-resolution Sentinel-2 performs better for dry land. This could be due to the influences of soil vegetation characteristics, spatial heterogeneity, and spectral mixing effects [91,92]. Firstly, in paddy field, crops are usually densely planted with small spacing between them. Additionally, the unique irrigation system in the paddy field ensures that water is evenly distributed throughout the entire field, resulting in minimal spatial variation in soil moisture. These characteristics often lead to smaller spatial scales in soil and vegetation features in paddy field. High-resolution PlanetScope satellite data are capable of capturing detailed changes in paddy field, allowing for better differentiation of soil texture and vegetation. Additionally, the dense planting method in paddy field creates spatial separation between different crops, which have varying requirements for light exposure and soil conditions. This ultimately affects the growth status of vegetation. High-resolution images have the advantage of capturing this heterogeneity more effectively. Finally, the presence of water in the paddy field may result in spectral mixing effects, where the spectral signals of multiple substances are mixed within a pixel, leading to impure spectral responses observed by sensors [91]. However, Sentinel-2, with its multiple spectral bands, is more susceptible to spectral mixing effects, resulting in lower predictive performance. 
Conversely, these additional bands can provide more abundant remote sensing information in dry land areas, aiding in the accurate capturing of key features such as soil texture and vegetation status. These factors may also contribute to the differences in their contribution rates when used together for both paddy field and dry land. However, overall, the synergistic application of high-resolution PlanetScope and medium-resolution Sentinel-2 remote sensing data outperforms the predictive performance of using either dataset alone, whether for paddy field or dry land. This could be attributed to the combined utilization of spatial and spectral information from both remote sensing data sources, enabling a more comprehensive and accurate capture of soil characteristics for SOC prediction [6]. By combining detailed spatial information and rich spectral information, this approach effectively overcomes the limitations of individual satellite data. It considers both small-scale details and large-scale variations, enhancing the overall comprehensiveness and accuracy of SOC prediction.
According to the research on the synergistic application of optical remote sensing data and radar remote sensing data, it has been found that incorporating radar remote sensing variables improves the performance of prediction models at the optimal modeling resolution for the paddy field. Conversely, for dry land, the predictive performance decreases, which is different from the findings of Wang et al. [22]. The possible reason for this difference may lie in the fact that in paddy field, due to the penetrability and ability to acquire information about the underlying layers of radar waves, more important observational parameters such as soil moisture and soil texture needed for monitoring and predicting paddy field can be obtained, thereby enhancing the interpretability of the model [93,94]. However, the radar remote sensing data have certain limitations in obtaining information on SOC due to the influence of soil moisture and vegetation cover in dry land areas [95]. Dry land surface features are drier and have less interaction with radar waves, resulting in poorer quality radar signals received in these regions. Additionally, there may be numerous man-made structures (such as buildings and roads) in dry land areas, which can cause multiple scattering or interference of radar waves, further complicating the interpretation of the models.
Although different satellite data have their own advantages, the synergistic application of various optical remote sensing data and the combination of optical remote sensing data with radar remote sensing data have great potential in improving the performance of SOC prediction models compared to single remote sensing data sources. It is evident that utilizing multi-source remote sensing data for efficient monitoring and evaluation of soil properties is of significant importance. In the future, we can further explore the synergistic application of more remote sensing data, providing more possibilities for multi-source remote sensing data in soil mapping.

4.2. Feasibility of Multi-Temporal, Multi-Scale Remote Sensing Images for SOC Mapping

Due to the fact that remote sensing images record the information of a specific moment on the Earth’s surface, the features recorded by satellite images at different times exhibit substantial differences. In vegetation-covered areas, spectral images primarily reflect the vegetation information on the land surface to predict SOC [8,96]. However, the distribution characteristics of vegetation vary with seasons. Therefore, utilizing multi-temporal remote sensing data not only helps eliminate potential seasonal biases in soil property prediction caused by single-temporal remote sensing data but also enables us to obtain more comprehensive and rich soil attribute information, better indicating vegetation growth conditions [30,31]. Our research evaluated the capability of multi-temporal optical remote sensing images and radar remote sensing images in predicting SOC content in densely vegetated areas. In the assessment of the importance of predictor variables in our prediction models, we found that the different temporal remote sensing images all contributed to varying degrees. This suggests that spectral information captured at different times can provide important clues about changes in SOC content. Among them, spectral variables obtained from winter images played a dominant role in interpreting SOC. This is mainly due to the influence of vegetation cover, soil moisture, and variations in organic matter distribution [97,98]. Compared to other seasons, the decrease in vegetation cover during winter might result in clearer and more stable spectral information, thereby reducing the disturbance caused by variations in vegetation type, density, and growth status on remote sensing data [97]. Additionally, winter generally experiences less rainfall, and therefore, maintains relatively stable soil moisture levels. 
The spectral information is often affected by soil moisture, and stable soil moisture allows more accurate and reliable spectral variables to be obtained from winter images [97]. Furthermore, in winter, due to slow vegetation growth or dormancy, the rate of organic matter decomposition in the soil is relatively low. Therefore, the spectral information obtained from winter images is more likely to reflect the content and distribution of soil organic matter [98]. Although winter images provide the main interpretive variables for the model, images from other dates also provide important information on land cover and soil characteristics. Utilizing multi-temporal remote sensing data not only allows for a more comprehensive and rich understanding of soil information but also reflects vegetation growth, coverage, and changes. Some studies have shown that vegetation density is closely related to SOC content [8,31]. By correlating vegetation information with SOC content through multi-temporal remote sensing data, a relationship model can be established to more accurately predict and monitor SOC content. Multi-temporal remote sensing data thus hold great potential as a comprehensive data source for SOC monitoring.
Due to the varying spatial resolutions of different remote sensing data, our prediction model is built on multiple spatial scales. The prediction results show that different modeling resolutions yield different prediction accuracies in different combinations of predictor variables (Figure 5). In comparison to models based on fine (3 m) and coarse (80 m) modeling resolutions, the ideal prediction accuracy is achieved at a moderate modeling resolution (20 m). This finding is consistent with the studies conducted by Samuel-Rosa et al. [99] and Zhou et al. [28]. The differences in predictive accuracy under different modeling scales may stem from the data matching issues during the resampling process [6]. When remote sensing data are resampled to different modeling resolutions, it may result in the loss of fine-grained information or introduce additional noise. Consequently, there can be deviations in obtaining soil and vegetation information, which ultimately affects the accuracy of SOC prediction. Moreover, each model’s reliance on remote sensing data with a modeling resolution of 20 m could be attributed to the characteristics of various sensors and remote sensing data. For example, down-sampling high spatial resolution PlanetScope data to lower resolutions such as 30 m or 80 m, or up-sampling medium spatial resolution Sentinel-2 data to higher resolutions like 3 m, may result in the loss of crucial information for certain remote sensing variables or a decrease in their correlation with the target variable. This can lead to differences in the contribution level for predicting SOC content. The moderate modeling resolution may strike a balance between capturing fine-scale details and capturing general landscape characteristics, allowing remote sensing data to capture effective land use features and improve the accuracy of SOC content prediction. 
Although we generally acknowledge that the spatial scale of input variables may have a substantial impact on the predictive performance of soil properties, most previous soil mapping studies have only conducted analyses at individual scales [7,22,29]. Therefore, in order to enhance the predictive ability for target soil properties, we recommend considering the utilization of multi-scale modeling to optimize the prediction model.
Generally, there are differences in vegetation cover, human disturbance, and climate conditions among different land use types. These factors directly or indirectly affect the accumulation and decomposition of SOC, resulting in different variations in SOC [100]. Compared to studies that use land use types as predictive covariates [24,44], we propose a method to construct predictive models under individual land use types. Through quantitative evaluation of prediction accuracy, we found that this method can substantially improve the accuracy of SOC prediction. This is similar to the research conducted by Wang et al. [22]. The heterogeneity of soil and vegetation conditions under different land use types may be the cause for this phenomenon. By subdividing the area into smaller regions, we can control this heterogeneity within specific spatial boundaries and focus more on the spatial variations within each specific land use type. This approach can greatly enhance the accuracy of prediction models [101]. However, it is important to note that our study area has limited land use types. Further research should be conducted in larger areas with a greater variety of land use types in order to validate our findings and ensure more reliable conclusions.
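The benefit of modeling each land use type separately can be illustrated with a deliberately exaggerated toy example. In the sketch below, a single predictor relates to SOC with opposite signs in the two land use types, so a pooled linear model explains nothing while stratified models fit well. The data are hypothetical, a simple linear fit stands in for XGBoost, and the scores are in-sample only.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against observations y."""
    mean_y = sum(y) / len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    tss = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - rss / tss

# Hypothetical data: same predictor values, opposite relationships with SOC (%)
x = [0.0, 1.0, 2.0, 3.0]
soc = {"paddy": [1.0, 1.2, 1.4, 1.6], "dry": [1.6, 1.4, 1.2, 1.0]}

# Pooled model: one line fitted to all samples
x_all = x + x
y_all = soc["paddy"] + soc["dry"]
slope, intercept = fit_line(x_all, y_all)
pooled_r2 = r_squared(y_all, [slope * v + intercept for v in x_all])

# Stratified models: one line per land use, scored on the combined predictions
stratified_pred = []
for land_use in ("paddy", "dry"):
    s, b = fit_line(x, soc[land_use])
    stratified_pred += [s * v + b for v in x]
stratified_r2 = r_squared(y_all, stratified_pred)
print(round(pooled_r2, 3), round(stratified_r2, 3))
```

Real differences between land use types are subtler than a sign flip, but the example shows why heterogeneous predictor-response relationships penalise a single pooled model.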

5. Conclusions

This study proposes and evaluates a method for predicting SOC content under individual land use types using multi-source, multi-temporal, and multi-scale remote sensing data. To achieve this, we synergistically utilized high-resolution optical images (PlanetScope), medium-resolution optical images (Sentinel-2), commonly used radar images (Sentinel-1), and auxiliary DEM data for modeling. Based on quantitative evaluations of prediction accuracy, the optimal models were selected for mapping the distribution of SOC in paddy field and dry land in the Southwest Wuling Mountain region. The main findings of this study are summarized as follows:
(1) Different satellite data have their own strengths and weaknesses, and the selection and combination of satellite remote sensing data greatly influence the performance of prediction models. Compared to using a single remote sensing dataset, the synergistic utilization of multi-source remote sensing data can achieve complementarity, resulting in higher accuracy in model predictions. The combination of PlanetScope, Sentinel-2, and topography factors gave satisfactory predictions for dry land (R² = 0.673, MAE = 0.107%, RMSE = 0.135%). The addition of Sentinel-1 indicators gave the best predictions for the paddy field (R² = 0.699, MAE = 0.114%, RMSE = 0.148%).
(2) Comparative analysis shows that multi-scale modeling helps optimize prediction models and improve the predictive capability for target soil properties. Overall, moderate modeling resolutions yield better predictions of SOC than fine and coarse resolutions.
(3) The acquisition time of remote sensing images has a certain impact on the interpretation of models. The spectral variables obtained from winter images are the main interpretation variables for SOC.
(4) In regions with complex land use types, constructing prediction models for each of them can help improve the prediction accuracy of SOC. Modeling for paddy field and dry land separately resulted in higher prediction accuracies compared to modeling across the entire region, with improvements in R² of 0.360 and 0.334 (36.0 and 33.4 percentage points), respectively.
(5) Land use type affects, to some extent, how satellite data are interpreted by the models. In models that combine PlanetScope and Sentinel-2 data, the inclusion of radar remote sensing variables improved the prediction accuracy for paddy field. However, for dry land, the prediction accuracy declined, with R² dropping by 0.102 (from 0.673 to 0.571).

Author Contributions

Z.W.: Methodology, Software, Conceptualization, Writing—Original draft preparation, Validation, Visualization, Resources, Formal analysis. W.W.: Writing—Review and Editing, Validation. H.L.: Methodology, Writing—Review and Editing, Funding acquisition, Supervision, Resources, Investigation. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The satellite images used in this study are publicly available and can be accessed through the specified website or repository. However, the specific sampling data and computational code used for analysis are proprietary and confidential, as they are part of ongoing research within Chongqing Key Laboratory of Land Quality Geological Survey. These resources are not publicly available due to non-disclosure agreements. Requests for further information about the data and code can be directed to the corresponding author, subject to the constraints of the confidentiality agreement.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

Table A1. Performance metrics of different SOC prediction models for training and test sets based on eXtreme Gradient Boosting (XGBoost) for the paddy field. The most accurate results are shown in bold.
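| Model | Resolution | Train R² | Train MAE (%) | Train RMSE (%) | Test R² | Test MAE (%) | Test RMSE (%) |
|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.724 | 0.193 | 0.248 | 0.389 | 0.159 | 0.211 |
| Model A | 20 m | 0.471 | 0.278 | 0.344 | 0.566 | 0.138 | 0.178 |
| Model A | 30 m | 0.392 | 0.292 | 0.368 | 0.523 | 0.152 | 0.187 |
| Model A | 80 m | 0.999 | 0.001 | 0.002 | 0.495 | 0.161 | 0.192 |
| Model B | 3 m | 0.294 | 0.318 | 0.397 | 0.498 | 0.154 | 0.192 |
| Model B | 20 m | 0.632 | 0.233 | 0.287 | 0.603 | 0.127 | 0.171 |
| Model B | 30 m | 0.292 | 0.321 | 0.398 | 0.411 | 0.161 | 0.208 |
| Model B | 80 m | 0.999 | 0.002 | 0.001 | 0.505 | 0.158 | 0.192 |
| Model C | 3 m | 0.238 | 0.323 | 0.412 | 0.081 | 0.185 | 0.259 |
| Model C | 20 m | 0.592 | 0.177 | 0.235 | 0.245 | 0.177 | 0.235 |
| Model C | 30 m | 0.996 | 0.021 | 0.028 | 0.131 | 0.192 | 0.252 |
| Model C | 80 m | 0.586 | 0.247 | 0.304 | 0.071 | 0.198 | 0.261 |
| Model D | 3 m | 0.207 | 0.292 | 0.398 | 0.131 | 0.297 | 0.376 |
| Model D | 20 m | 0.621 | 0.223 | 0.291 | 0.414 | 0.162 | 0.207 |
| Model D | 30 m | 0.721 | 0.186 | 0.241 | 0.052 | 0.313 | 0.359 |
| Model D | 80 m | 0.999 | 0.003 | 0.009 | 0.102 | 0.341 | 0.456 |
| Model E | 3 m | 0.298 | 0.322 | 0.396 | 0.412 | 0.156 | 0.207 |
| Model E | 20 m | 0.709 | 0.202 | 0.255 | 0.647 | 0.129 | 0.161 |
| Model E | 30 m | 0.292 | 0.317 | 0.398 | 0.538 | 0.147 | 0.184 |
| Model E | 80 m | 0.999 | 0.001 | 0.001 | 0.125 | 0.191 | 0.253 |
| Model F | 3 m | 0.323 | 0.312 | 0.389 | 0.405 | 0.162 | 0.209 |
| Model F | 20 m | 0.729 | 0.194 | 0.246 | 0.699 | 0.114 | 0.148 |
| Model F | 30 m | 0.622 | 0.241 | 0.291 | 0.382 | 0.162 | 0.213 |
| Model F | 80 m | 0.321 | 0.311 | 0.392 | 0.123 | 0.178 | 0.253 |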
Notes: Model A, DEM + PlanetScope; Model B, DEM + PlanetScope + Sentinel-1; Model C, DEM + Sentinel-2; Model D, DEM + Sentinel-2 + Sentinel-1; Model E, DEM + PlanetScope + Sentinel-2; Model F, DEM + PlanetScope + Sentinel-2 + Sentinel-1.
Table A2. Performance metrics of different SOC prediction models for training and test sets based on eXtreme Gradient Boosting (XGBoost) for dry land. The most accurate results are shown in bold.
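| Model | Resolution | Train R² | Train MAE (%) | Train RMSE (%) | Test R² | Test MAE (%) | Test RMSE (%) |
|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.952 | 0.063 | 0.091 | 0.201 | 0.167 | 0.201 |
| Model A | 20 m | 0.999 | 0.001 | 0.001 | 0.504 | 0.136 | 0.166 |
| Model A | 30 m | 0.904 | 0.081 | 0.128 | 0.244 | 0.132 | 0.196 |
| Model A | 80 m | 0.214 | 0.285 | 0.368 | 0.131 | 0.153 | 0.204 |
| Model B | 3 m | 0.692 | 0.179 | 0.231 | 0.237 | 0.153 | 0.196 |
| Model B | 20 m | 0.999 | 0.001 | 0.001 | 0.341 | 0.151 | 0.192 |
| Model B | 30 m | 0.911 | 0.076 | 0.124 | 0.299 | 0.132 | 0.188 |
| Model B | 80 m | 0.235 | 0.284 | 0.363 | 0.028 | 0.161 | 0.216 |
| Model C | 3 m | 0.273 | 0.287 | 0.354 | 0.161 | 0.156 | 0.206 |
| Model C | 20 m | 0.735 | 0.169 | 0.213 | 0.585 | 0.129 | 0.152 |
| Model C | 30 m | 0.511 | 0.221 | 0.291 | 0.281 | 0.158 | 0.191 |
| Model C | 80 m | 0.961 | 0.046 | 0.083 | 0.211 | 0.153 | 0.194 |
| Model D | 3 m | 0.999 | 0.003 | 0.013 | 0.323 | 0.151 | 0.185 |
| Model D | 20 m | 0.928 | 0.057 | 0.111 | 0.373 | 0.162 | 0.187 |
| Model D | 30 m | 0.999 | 0.003 | 0.012 | 0.377 | 0.145 | 0.177 |
| Model D | 80 m | 0.307 | 0.279 | 0.346 | 0.011 | 0.16 | 0.218 |
| Model E | 3 m | 0.883 | 0.096 | 0.142 | 0.235 | 0.151 | 0.197 |
| Model E | 20 m | 0.692 | 0.181 | 0.229 | 0.673 | 0.107 | 0.135 |
| Model E | 30 m | 0.749 | 0.161 | 0.208 | 0.338 | 0.148 | 0.183 |
| Model E | 80 m | 0.574 | 0.204 | 0.271 | 0.151 | 0.159 | 0.202 |
| Model F | 3 m | 0.999 | 0.002 | 0.003 | 0.426 | 0.136 | 0.171 |
| Model F | 20 m | 0.897 | 0.109 | 0.133 | 0.571 | 0.123 | 0.155 |
| Model F | 30 m | 0.996 | 0.007 | 0.026 | 0.359 | 0.131 | 0.182 |
| Model F | 80 m | 0.289 | 0.275 | 0.351 | 0.058 | 0.162 | 0.212 |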
Notes: Model A, DEM + PlanetScope; Model B, DEM + PlanetScope + Sentinel-1; Model C, DEM + Sentinel-2; Model D, DEM + Sentinel-2 + Sentinel-1; Model E, DEM + PlanetScope + Sentinel-2; Model F, DEM + PlanetScope + Sentinel-2 + Sentinel-1.
Table A3. Performance metrics of different SOC prediction models for training and test sets based on eXtreme Gradient Boosting (XGBoost) for total area. The most accurate results are shown in bold.
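| Model | Resolution | Train R² | Train MAE (%) | Train RMSE (%) | Test R² | Test MAE (%) | Test RMSE (%) |
|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.997 | 0.01 | 0.021 | 0.156 | 0.275 | 0.395 |
| Model A | 20 m | 0.999 | 0.002 | 0.008 | 0.289 | 0.262 | 0.362 |
| Model A | 30 m | 0.999 | 0.001 | 0.001 | 0.138 | 0.307 | 0.399 |
| Model A | 80 m | 0.999 | 0.001 | 0.001 | 0.021 | 0.352 | 0.436 |
| Model B | 3 m | 0.993 | 0.016 | 0.036 | 0.168 | 0.288 | 0.392 |
| Model B | 20 m | 0.997 | 0.012 | 0.023 | 0.195 | 0.286 | 0.386 |
| Model B | 30 m | 0.999 | 0.001 | 0.004 | 0.099 | 0.293 | 0.408 |
| Model B | 80 m | 0.175 | 0.293 | 0.381 | 0.031 | 0.305 | 0.423 |
| Model C | 3 m | 0.988 | 0.038 | 0.047 | 0.179 | 0.291 | 0.389 |
| Model C | 20 m | 0.999 | 0.001 | 0.002 | 0.251 | 0.285 | 0.372 |
| Model C | 30 m | 0.996 | 0.004 | 0.016 | 0.146 | 0.303 | 0.397 |
| Model C | 80 m | 0.999 | 0.004 | 0.013 | 0.094 | 0.264 | 0.341 |
| Model D | 3 m | 0.993 | 0.027 | 0.036 | 0.16 | 0.288 | 0.394 |
| Model D | 20 m | 0.998 | 0.012 | 0.017 | 0.231 | 0.275 | 0.377 |
| Model D | 30 m | 0.999 | 0.006 | 0.008 | 0.13 | 0.309 | 0.401 |
| Model D | 80 m | 0.999 | 0.001 | 0.002 | 0.057 | 0.281 | 0.348 |
| Model E | 3 m | 0.858 | 0.091 | 0.158 | 0.192 | 0.287 | 0.386 |
| Model E | 20 m | 0.937 | 0.078 | 0.105 | 0.339 | 0.268 | 0.349 |
| Model E | 30 m | 0.999 | 0.002 | 0.008 | 0.197 | 0.286 | 0.385 |
| Model E | 80 m | 0.996 | 0.009 | 0.025 | 0.021 | 0.309 | 0.425 |
| Model F | 3 m | 0.993 | 0.015 | 0.035 | 0.211 | 0.289 | 0.384 |
| Model F | 20 m | 0.96 | 0.062 | 0.084 | 0.233 | 0.272 | 0.376 |
| Model F | 30 m | 0.999 | 0.002 | 0.009 | 0.193 | 0.28 | 0.386 |
| Model F | 80 m | 0.997 | 0.012 | 0.021 | 0.037 | 0.308 | 0.422 |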
Notes: Model A, DEM + PlanetScope; Model B, DEM + PlanetScope + Sentinel-1; Model C, DEM + Sentinel-2; Model D, DEM + Sentinel-2 + Sentinel-1; Model E, DEM + PlanetScope + Sentinel-2; Model F, DEM + PlanetScope + Sentinel-2 + Sentinel-1.
Table A4. Performance results of Random Forest (RF) in predicting SOC based on different combinations of environmental variables at different modeling resolutions. The most accurate results are shown in bold.
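| Model | Resolution | Paddy Field R² | Paddy Field MAE (%) | Paddy Field RMSE (%) | Paddy Field AIC | Dry Land R² | Dry Land MAE (%) | Dry Land RMSE (%) | Dry Land AIC | Total Area R² | Total Area MAE (%) | Total Area RMSE (%) | Total Area AIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.401 | 0.155 | 0.209 | 228.950 | 0.069 | 0.178 | 0.217 | 228.515 | 0.084 | 0.295 | 0.411 | 376.667 |
| Model A | 20 m | 0.419 | 0.160 | 0.206 | 227.892 | 0.327 | 0.155 | 0.194 | 211.006 | 0.128 | 0.292 | 0.401 | 373.374 |
| Model A | 30 m | 0.398 | 0.162 | 0.210 | 229.142 | 0.222 | 0.145 | 0.192 | 222.914 | 0.094 | 0.297 | 0.409 | 375.912 |
| Model A | 80 m | 0.328 | 0.188 | 0.222 | 232.966 | 0.087 | 0.161 | 0.215 | 227.877 | −0.038 | 0.299 | 0.366 | 373.813 |
| Model B | 3 m | 0.380 | 0.177 | 0.213 | 278.163 | 0.182 | 0.159 | 0.203 | 272.250 | 0.081 | 0.289 | 0.412 | 424.921 |
| Model B | 20 m | 0.437 | 0.161 | 0.203 | 274.811 | 0.304 | 0.151 | 0.197 | 270.132 | 0.065 | 0.297 | 0.416 | 426.011 |
| Model B | 30 m | 0.418 | 0.162 | 0.206 | 275.952 | 0.109 | 0.162 | 0.212 | 275.097 | 0.065 | 0.311 | 0.416 | 426.027 |
| Model B | 80 m | 0.340 | 0.184 | 0.220 | 280.358 | 0.063 | 0.158 | 0.212 | 274.953 | −0.025 | 0.299 | 0.363 | 420.887 |
| Model C | 3 m | −0.144 | 0.210 | 0.289 | 299.608 | 0.128 | 0.168 | 0.210 | 274.364 | 0.132 | 0.299 | 0.400 | 421.076 |
| Model C | 20 m | 0.414 | 0.150 | 0.207 | 276.199 | 0.199 | 0.171 | 0.211 | 274.756 | 0.202 | 0.277 | 0.380 | 415.397 |
| Model C | 30 m | 0.191 | 0.181 | 0.243 | 287.472 | 0.173 | 0.158 | 0.204 | 272.611 | 0.138 | 0.296 | 0.399 | 420.615 |
| Model C | 80 m | 0.079 | 0.191 | 0.260 | 292.022 | 0.211 | 0.149 | 0.194 | 269.279 | −0.075 | 0.302 | 0.372 | 424.295 |
| Model D | 3 m | −0.135 | 0.221 | 0.288 | 347.337 | 0.255 | 0.146 | 0.194 | 317.169 | 0.098 | 0.300 | 0.408 | 471.644 |
| Model D | 20 m | 0.310 | 0.161 | 0.225 | 329.921 | 0.231 | 0.165 | 0.207 | 321.428 | 0.115 | 0.292 | 0.404 | 470.380 |
| Model D | 30 m | −0.041 | 0.206 | 0.276 | 344.294 | 0.186 | 0.155 | 0.203 | 321.102 | 0.104 | 0.303 | 0.407 | 471.154 |
| Model D | 80 m | 0.095 | 0.187 | 0.257 | 339.418 | 0.054 | 0.156 | 0.213 | 323.274 | −0.081 | 0.309 | 0.373 | 472.681 |
| Model E | 3 m | 0.336 | 0.159 | 0.220 | 488.577 | 0.139 | 0.176 | 0.209 | 481.958 | 0.094 | 0.292 | 0.409 | 631.936 |
| Model E | 20 m | 0.470 | 0.145 | 0.197 | 480.662 | 0.358 | 0.158 | 0.189 | 475.479 | 0.243 | 0.259 | 0.375 | 624.469 |
| Model E | 30 m | 0.366 | 0.167 | 0.215 | 486.939 | 0.207 | 0.163 | 0.200 | 479.231 | 0.132 | 0.284 | 0.401 | 629.108 |
| Model E | 80 m | 0.217 | 0.196 | 0.239 | 494.337 | 0.199 | 0.145 | 0.196 | 477.792 | 0.005 | 0.303 | 0.426 | 634.808 |
| Model F | 3 m | 0.309 | 0.175 | 0.225 | 537.983 | 0.212 | 0.163 | 0.199 | 527.009 | 0.106 | 0.295 | 0.407 | 679.089 |
| Model F | 20 m | 0.533 | 0.146 | 0.185 | 524.246 | 0.316 | 0.157 | 0.195 | 525.568 | 0.147 | 0.280 | 0.371 | 672.369 |
| Model F | 30 m | 0.244 | 0.179 | 0.235 | 541.129 | 0.174 | 0.160 | 0.204 | 528.582 | 0.112 | 0.284 | 0.399 | 676.240 |
| Model F | 80 m | 0.227 | 0.186 | 0.238 | 541.908 | 0.159 | 0.152 | 0.201 | 527.370 | −0.026 | 0.295 | 0.363 | 676.664 |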
Notes: Model A, PlanetScope + DEM; Model B, PlanetScope + Sentinel-1 + DEM; Model C, Sentinel-2 + DEM; Model D, Sentinel-2 + Sentinel-1 + DEM; Model E, PlanetScope + Sentinel-2 + DEM; Model F, PlanetScope + Sentinel-2 + Sentinel-1 + DEM.
Table A5. Performance results of Gradient Boosting Decision Tree (GBDT) in predicting SOC based on different combinations of environmental variables at different modeling resolutions. The most accurate results are shown in bold.
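| Model | Resolution | Paddy Field R² | Paddy Field MAE (%) | Paddy Field RMSE (%) | Paddy Field AIC | Dry Land R² | Dry Land MAE (%) | Dry Land RMSE (%) | Dry Land AIC | Total Area R² | Total Area MAE (%) | Total Area RMSE (%) | Total Area AIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.554 | 0.137 | 0.181 | 218.616 | 0.149 | 0.160 | 0.203 | 223.969 | 0.038 | 0.309 | 0.422 | 379.939 |
| Model A | 20 m | 0.510 | 0.145 | 0.189 | 221.939 | 0.332 | 0.163 | 0.193 | 220.789 | 0.152 | 0.285 | 0.396 | 371.479 |
| Model A | 30 m | 0.449 | 0.161 | 0.201 | 226.007 | 0.203 | 0.148 | 0.201 | 223.400 | 0.059 | 0.299 | 0.417 | 378.432 |
| Model A | 80 m | 0.237 | 0.204 | 0.237 | 237.680 | 0.077 | 0.156 | 0.210 | 226.448 | −0.079 | 0.300 | 0.373 | 376.598 |
| Model B | 3 m | 0.500 | 0.153 | 0.191 | 270.631 | 0.110 | 0.162 | 0.212 | 275.058 | 0.083 | 0.295 | 0.412 | 424.768 |
| Model B | 20 m | 0.445 | 0.158 | 0.201 | 274.270 | 0.374 | 0.141 | 0.186 | 266.639 | 0.073 | 0.296 | 0.414 | 425.463 |
| Model B | 30 m | 0.311 | 0.169 | 0.225 | 281.864 | 0.092 | 0.149 | 0.214 | 275.718 | 0.046 | 0.305 | 0.420 | 427.405 |
| Model B | 80 m | −0.007 | 0.198 | 0.271 | 295.134 | −0.008 | 0.170 | 0.220 | 277.364 | −0.085 | 0.301 | 0.374 | 424.925 |
| Model C | 3 m | 0.049 | 0.191 | 0.264 | 293.148 | 0.137 | 0.158 | 0.209 | 274.015 | 0.153 | 0.300 | 0.396 | 419.404 |
| Model C | 20 m | 0.247 | 0.173 | 0.235 | 284.961 | 0.187 | 0.177 | 0.213 | 275.246 | 0.178 | 0.287 | 0.389 | 417.433 |
| Model C | 30 m | 0.164 | 0.177 | 0.247 | 288.627 | 0.213 | 0.150 | 0.199 | 270.992 | 0.120 | 0.302 | 0.403 | 422.013 |
| Model C | 80 m | 0.089 | 0.182 | 0.258 | 291.609 | 0.137 | 0.150 | 0.203 | 272.228 | −0.019 | 0.293 | 0.362 | 420.503 |
| Model D | 3 m | 0.029 | 0.187 | 0.266 | 341.862 | 0.162 | 0.156 | 0.206 | 321.060 | 0.182 | 0.298 | 0.389 | 465.077 |
| Model D | 20 m | 0.420 | 0.160 | 0.206 | 323.809 | 0.254 | 0.170 | 0.204 | 320.404 | 0.136 | 0.292 | 0.399 | 468.759 |
| Model D | 30 m | −0.135 | 0.204 | 0.288 | 347.332 | 0.197 | 0.158 | 0.202 | 319.666 | 0.105 | 0.310 | 0.407 | 471.093 |
| Model D | 80 m | 0.089 | 0.182 | 0.258 | 339.609 | 0.116 | 0.154 | 0.206 | 321.018 | −0.026 | 0.296 | 0.364 | 469.000 |
| Model E | 3 m | 0.316 | 0.171 | 0.224 | 489.618 | 0.190 | 0.152 | 0.202 | 479.943 | 0.109 | 0.308 | 0.406 | 630.785 |
| Model E | 20 m | 0.598 | 0.135 | 0.171 | 470.985 | 0.493 | 0.136 | 0.168 | 467.689 | 0.213 | 0.280 | 0.381 | 622.491 |
| Model E | 30 m | 0.465 | 0.158 | 0.198 | 480.957 | 0.126 | 0.165 | 0.210 | 482.452 | 0.104 | 0.301 | 0.407 | 631.190 |
| Model E | 80 m | 0.248 | 0.186 | 0.235 | 492.945 | 0.124 | 0.158 | 0.205 | 480.738 | −0.053 | 0.308 | 0.368 | 630.784 |
| Model F | 3 m | 0.286 | 0.175 | 0.229 | 539.126 | 0.247 | 0.143 | 0.195 | 525.544 | 0.139 | 0.289 | 0.398 | 676.442 |
| Model F | 20 m | 0.639 | 0.131 | 0.162 | 515.166 | 0.447 | 0.145 | 0.175 | 518.538 | 0.135 | 0.287 | 0.399 | 676.788 |
| Model F | 30 m | 0.429 | 0.162 | 0.204 | 531.313 | 0.125 | 0.164 | 0.207 | 529.549 | 0.091 | 0.295 | 0.410 | 680.158 |
| Model F | 80 m | 0.165 | 0.192 | 0.247 | 544.579 | 0.090 | 0.154 | 0.209 | 529.973 | −0.046 | 0.298 | 0.367 | 678.334 |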
Notes: Model A, PlanetScope + DEM; Model B, PlanetScope + Sentinel-1 + DEM; Model C, Sentinel-2 + DEM; Model D, Sentinel-2 + Sentinel-1 + DEM; Model E, PlanetScope + Sentinel-2 + DEM; Model F, PlanetScope + Sentinel-2 + Sentinel-1 + DEM.
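Several R² values in Tables A5 and A6 are negative, which can look surprising at first. The sketch below illustrates why, assuming the conventional definitions of R², MAE, and RMSE (the exact formulas are not restated in this part of the article, and AIC is left out because its form is not given here). The function name and the SOC values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return (R2, MAE, RMSE) for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse


# Hypothetical SOC contents (%), for illustration only.
observed = [0.9, 1.1, 1.3, 1.5]
good_fit = [1.0, 1.0, 1.4, 1.4]   # close to the observations
poor_fit = [1.5, 1.3, 1.1, 0.9]   # trend reversed

print(regression_metrics(observed, good_fit))
print(regression_metrics(observed, poor_fit))
```

With these definitions, R² drops below zero whenever the residual sum of squares exceeds the total sum of squares, that is, when the model predicts worse than simply using the mean of the observations.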
Table A6. Performance results of Support Vector Regression (SVR) in predicting SOC based on different combinations of environmental variables at different modeling resolutions. The most accurate results are shown in bold.
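
| Model | Resolution | Paddy field R² | Paddy field MAE (%) | Paddy field RMSE (%) | Paddy field AIC | Dry land R² | Dry land MAE (%) | Dry land RMSE (%) | Dry land AIC | Total area R² | Total area MAE (%) | Total area RMSE (%) | Total area AIC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model A | 3 m | 0.036 | 0.185 | 0.266 | 245.617 | −0.010 | 0.169 | 0.226 | 231.213 | −0.017 | 0.309 | 0.430 | 382.658 |
| Model A | 20 m | 0.033 | 0.185 | 0.266 | 245.741 | 0.012 | 0.178 | 0.235 | 233.710 | 0.001 | 0.310 | 0.429 | 382.453 |
| Model A | 30 m | 0.029 | 0.190 | 0.266 | 245.854 | 0.049 | 0.160 | 0.219 | 229.224 | 0.046 | 0.309 | 0.420 | 379.381 |
| Model A | 80 m | 0.029 | 0.195 | 0.267 | 245.867 | 0.020 | 0.160 | 0.217 | 228.443 | 0.010 | 0.285 | 0.357 | 370.405 |
| Model B | 3 m | 0.022 | 0.189 | 0.268 | 294.130 | −0.006 | 0.168 | 0.226 | 279.087 | −0.002 | 0.312 | 0.430 | 430.707 |
| Model B | 20 m | 0.030 | 0.187 | 0.266 | 293.821 | −0.002 | 0.179 | 0.236 | 282.167 | −0.009 | 0.312 | 0.431 | 431.124 |
| Model B | 30 m | 0.017 | 0.188 | 0.268 | 294.301 | 0.022 | 0.165 | 0.222 | 278.151 | 0.009 | 0.302 | 0.428 | 429.969 |
| Model B | 80 m | 0.025 | 0.188 | 0.267 | 294.017 | 0.027 | 0.159 | 0.216 | 276.191 | 0.018 | 0.286 | 0.356 | 417.861 |
| Model C | 3 m | 0.014 | 0.187 | 0.287 | 294.425 | 0.048 | 0.170 | 0.219 | 277.253 | 0.186 | 0.276 | 0.388 | 416.755 |
| Model C | 20 m | 0.069 | 0.184 | 0.261 | 292.407 | −0.012 | 0.185 | 0.238 | 282.498 | 0.141 | 0.281 | 0.398 | 420.392 |
| Model C | 30 m | 0.032 | 0.184 | 0.266 | 293.759 | 0.056 | 0.166 | 0.218 | 276.970 | 0.045 | 0.298 | 0.419 | 427.452 |
| Model C | 80 m | 0.011 | 0.189 | 0.269 | 294.507 | 0.118 | 0.163 | 0.206 | 272.967 | 0.036 | 0.286 | 0.353 | 416.574 |
| Model D | 3 m | 0.017 | 0.185 | 0.268 | 342.304 | 0.046 | 0.171 | 0.219 | 325.347 | 0.069 | 0.311 | 0.415 | 473.784 |
| Model D | 20 m | 0.240 | 0.163 | 0.234 | 333.304 | −0.009 | 0.189 | 0.237 | 330.410 | 0.064 | 0.299 | 0.416 | 474.087 |
| Model D | 30 m | 0.018 | 0.184 | 0.268 | 342.247 | 0.050 | 0.167 | 0.219 | 325.184 | 0.018 | 0.315 | 0.426 | 477.351 |
| Model D | 80 m | 0.008 | 0.190 | 0.269 | 342.615 | 0.063 | 0.163 | 0.212 | 322.957 | 0.033 | 0.288 | 0.353 | 464.784 |
| Model E | 3 m | 0.005 | 0.189 | 0.270 | 502.724 | 0.025 | 0.169 | 0.222 | 486.067 | 0.027 | 0.306 | 0.424 | 636.725 |
| Model E | 20 m | 0.021 | 0.188 | 0.268 | 502.147 | 0.025 | 0.183 | 0.233 | 483.589 | 0.027 | 0.309 | 0.424 | 636.733 |
| Model E | 30 m | 0.008 | 0.188 | 0.269 | 502.611 | 0.095 | 0.176 | 0.214 | 483.589 | 0.052 | 0.304 | 0.418 | 634.982 |
| Model E | 80 m | 0.048 | 0.183 | 0.264 | 501.167 | 0.019 | 0.160 | 0.217 | 484.476 | 0.035 | 0.281 | 0.353 | 624.630 |
| Model F | 3 m | 0.006 | 0.189 | 0.270 | 550.685 | 0.000 | 0.174 | 0.225 | 534.887 | 0.033 | 0.311 | 0.423 | 684.315 |
| Model F | 20 m | 0.222 | 0.173 | 0.239 | 542.136 | −0.020 | 0.181 | 0.238 | 538.751 | 0.005 | 0.311 | 0.429 | 686.212 |
| Model F | 30 m | 0.009 | 0.189 | 0.269 | 550.584 | 0.033 | 0.174 | 0.221 | 533.790 | 0.025 | 0.304 | 0.424 | 684.829 |
| Model F | 80 m | 0.014 | 0.188 | 0.269 | 550.400 | 0.019 | 0.160 | 0.217 | 532.476 | 0.014 | 0.284 | 0.356 | 674.107 |
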
Notes: Model A, PlanetScope + DEM; Model B, PlanetScope + Sentinel-1 + DEM; Model C, Sentinel-2 + DEM; Model D, Sentinel-2 + Sentinel-1 + DEM; Model E, PlanetScope + Sentinel-2 + DEM; Model F, PlanetScope + Sentinel-2 + Sentinel-1 + DEM.

References

  1. Lal, R. Digging deeper: A holistic perspective of factors affecting soil organic carbon sequestration in agroecosystems. Glob. Chang. Biol. 2018, 24, 3285–3301. [Google Scholar] [CrossRef] [PubMed]
  2. Jing, J.; Li, R.; Zhang, Y.; Wu, Q. Identification of priority areas for soil erosion control based on minimum administrative units and karst landforms in karst areas of Guizhou. Prog. Phys. Geogr. Earth Environ. 2023, 47, 892–911. [Google Scholar] [CrossRef]
  3. Arrouays, D.; Grundy, M.G.; Hartemink, A.E.; Hempel, J.W.; Heuvelink, G.B.; Hong, S.Y.; Lagacherie, P.; Lelyk, G.; McBratney, A.B.; McKenzie, N.J.; et al. GlobalSoilMap: Toward a fine-resolution global grid of soil properties. Adv. Agron. 2014, 125, 93–134. [Google Scholar]
  4. Hengl, T.; Mendes de Jesus, J.; Heuvelink, G.B.; Ruiperez Gonzalez, M.; Kilibarda, M.; Blagotić, A.; Shangguan, W.; Wright, M.N.; Geng, X.; Bauer-Marschallinger, B.; et al. SoilGrids250m: Global gridded soil information based on machine learning. PLoS ONE 2017, 12, e0169748. [Google Scholar] [CrossRef] [PubMed]
  5. Wang, T.; Zhou, W.; Xiao, J.; Li, H.; Yao, L.; Xie, L.; Wang, K. Soil Organic Carbon Prediction Using Sentinel-2 Data and Environmental Variables in a Karst Trough Valley Area of Southwest China. Remote Sens. 2023, 15, 2118. [Google Scholar] [CrossRef]
  6. Zhou, T.; Geng, Y.; Ji, C.; Xu, X.; Wang, H.; Pan, J.; Bumberger, J.; Haase, D.; Lausch, A. Prediction of soil organic carbon and the C: N ratio on a national scale using machine learning and satellite data: A comparison between Sentinel-2, Sentinel-3 and Landsat-8 images. Sci. Total Environ. 2021, 755, 142661. [Google Scholar] [CrossRef]
  7. Xia, C.; Zhang, Y. Comparison of the use of Landsat 8, Sentinel-2, and Gaofen-2 images for mapping soil pH in Dehui, northeastern China. Ecol. Inform. 2022, 70, 101705. [Google Scholar] [CrossRef]
  8. Zhang, Y.; Sui, B.; Shen, H.; Wang, Z. Estimating temporal changes in soil pH in the black soil region of Northeast China using remote sensing. Comput. Electron. Agric. 2018, 154, 204–212. [Google Scholar] [CrossRef]
  9. Wang, X.; Han, J.; Wang, X.; Yao, H.; Zhang, L. Estimating soil organic matter content using sentinel-2 imagery by machine learning in shanghai. IEEE Access 2021, 9, 78215–78225. [Google Scholar] [CrossRef]
  10. Paul, S.S.; Coops, N.C.; Johnson, M.S.; Krzic, M.; Chandna, A.; Smukler, S.M. Mapping soil organic carbon and clay using remote sensing to predict soil workability for enhanced climate change adaptation. Geoderma 2020, 363, 114177. [Google Scholar] [CrossRef]
  11. Hengl, T.; Leenaars, J.G.; Shepherd, K.D.; Walsh, M.G.; Heuvelink, G.B.; Mamo, T.; Tilahun, H.; Berkhout, E.; Cooper, M.; Fegraus, E.; et al. Soil nutrient maps of Sub-Saharan Africa: Assessment of soil nutrient content at 250 m spatial resolution using machine learning. Nutr. Cycl. Agroecosyst. 2017, 109, 77–102. [Google Scholar] [CrossRef]
  12. Planet. Satellite Imagery and Archive. 2021. Available online: https://planet.com/products/planet-imagery/ (accessed on 17 June 2024).
  13. Moon, M.; Richardson, A.D.; Friedl, M.A. Multiscale assessment of land surface phenology from harmonized Landsat 8 and Sentinel-2, PlanetScope, and PhenoCam imagery. Remote Sens. Environ. 2021, 266, 112716. [Google Scholar] [CrossRef]
  14. Kimm, H.; Guan, K.; Jiang, C.; Peng, B.; Gentry, L.F.; Wilkin, S.C.; Wang, S.; Cai, Y.; Bernacchi, C.J.; Peng, J.; et al. Deriving high-spatiotemporal-resolution leaf area index for agroecosystems in the US Corn Belt using Planet Labs CubeSat and STAIR fusion data. Remote Sens. Environ. 2020, 239, 111615. [Google Scholar] [CrossRef]
  15. Cheng, Y.; Vrieling, A.; Fava, F.; Meroni, M.; Marshall, M.; Gachoki, S. Phenology of short vegetation cycles in a Kenyan rangeland from PlanetScope and Sentinel-2. Remote Sens. Environ. 2020, 248, 112004. [Google Scholar] [CrossRef]
  16. Wang, J.; Yang, D.; Detto, M.; Nelson, B.W.; Chen, M.; Guan, K.; Wu, S.; Yan, Z.; Wu, J. Multi-scale integration of satellite remote sensing improves characterization of dry-season green-up in an Amazon tropical evergreen forest. Remote Sens. Environ. 2020, 246, 111865. [Google Scholar] [CrossRef]
  17. Neyns, R.; Efthymiadis, K.; Libin, P.; Canters, F. Fusion of multi-temporal PlanetScope data and very high-resolution aerial imagery for urban tree species mapping. Urban For. Urban Green. 2024, 99, 128410. [Google Scholar] [CrossRef]
  18. Pabla, S.S.; Mandla, M.S.; Narendra, H.; Patel, S. Classification of multi-temporal images using machine learning. EarthArXiv 2021. [Google Scholar] [CrossRef]
  19. Dobrinić, D.; Miler, M.; Medak, D. A Comparative Analysis of Pixel-Based and Object-Based Approaches Using Multitemporal PlanetScope Imagery for Land Cover Classification. Forest 2024, 67204, 39. [Google Scholar]
  20. Qayyum, N.; Ghuffar, S.; Ahmad, H.M.; Yousaf, A.; Shahid, I. Glacial lakes mapping using multi satellite PlanetScope imagery and deep learning. ISPRS Int. J. Geo-Inf. 2020, 9, 560. [Google Scholar] [CrossRef]
  21. Breunig, F.M.; Galvão, L.S.; Dalagnol, R.; Dauve, C.E.; Parraga, A.; Santi, A.L.; Della Flora, D.P.; Chen, S. Delineation of management zones in agricultural fields using cover–crop biomass estimates from PlanetScope data. Int. J. Appl. Earth Obs. Geoinf. 2020, 85, 102004. [Google Scholar] [CrossRef]
  22. Wang, H.; Zhang, X.; Wu, W.; Liu, H. Prediction of Soil Organic Carbon under Different Land Use Types Using Sentinel-1/-2 Data in a Small Watershed. Remote Sens. 2021, 13, 1229. [Google Scholar] [CrossRef]
  23. Yang, R.; Guo, W. Using time-series Sentinel-1 data for soil prediction on invaded coastal wetlands. Environ. Monit. Assess. 2019, 191, 462. [Google Scholar] [CrossRef] [PubMed]
  24. Zhou, T.; Geng, Y.; Chen, J.; Liu, M.; Haase, D.; Lausch, A. Mapping soil organic carbon content using multi-source remote sensing variables in the Heihe River Basin in China. Ecol. Indic. 2020, 114, 106288. [Google Scholar] [CrossRef]
  25. Yang, R.; Guo, W. Modelling of soil organic carbon and bulk density in invaded coastal wetlands using Sentinel-1 imagery. Int. J. Appl. Earth Obs. Geoinf. 2019, 82, 101906. [Google Scholar] [CrossRef]
  26. Dahhani, S.; Raji, M.; Bouslihim, Y. Synergistic Use of Multi-Temporal Radar and Optical Remote Sensing for Soil Organic Carbon Prediction. Remote Sens. 2024, 16, 1871. [Google Scholar] [CrossRef]
  27. Poggio, L.; Gimona, A. Assimilation of optical and radar remote sensing data in 3D mapping of soil properties over large areas. Sci. Total Environ. 2017, 579, 1094–1110. [Google Scholar] [CrossRef]
  28. Zhou, Y.; Wu, W.; Liu, H. Exploring the Influencing Factors in Identifying Soil Texture Classes Using Multitemporal Landsat-8 and Sentinel-2 Data. Remote Sens. 2022, 14, 5571. [Google Scholar] [CrossRef]
  29. Shafizadeh-Moghadam, H.; Minaei, F.; Talebi-khiyavi, H.; Xu, T.; Homaee, M. Synergetic use of multi-temporal Sentinel-1, Sentinel-2, NDVI, and topographic factors for estimating soil organic carbon. Catena 2022, 212, 106077. [Google Scholar] [CrossRef]
  30. Grabska, E.; Frantz, D.; Ostapowicz, K. Evaluation of machine learning algorithms for forest stand species mapping using Sentinel-2 imagery and environmental data in the Polish Carpathians. Remote Sens. Environ. 2020, 251, 112103. [Google Scholar] [CrossRef]
  31. Luo, C.; Zhang, X.; Meng, X.; Zhu, H.; Ni, C.; Chen, M.; Liu, H. Regional mapping of soil organic matter content using multitemporal synthetic Landsat 8 images in Google Earth Engine. Catena 2022, 209, 105842. [Google Scholar] [CrossRef]
  32. Peng, L.; Wu, X.; Feng, C.; Gao, L.; Li, Q.; Xu, J.; Li, B. Assessing the potential of multi-source remote sensing data for cropland soil organic matter mapping in hilly and mountainous areas. Catena 2024, 245, 108312. [Google Scholar] [CrossRef]
  33. Xie, B.; Ding, J.; Ge, X.; Li, X.; Han, L.; Wang, Z. Estimation of soil organic carbon content in the Ebinur Lake wetland, Xinjiang, China, based on multisource remote sensing data and ensemble learning algorithms. Sensors 2022, 22, 2685. [Google Scholar] [CrossRef] [PubMed]
  34. Stevens, A.; Udelhoven, T.; Denis, A.; Tychon, B.; Lioy, R.; Hoffmann, L.; Van Wesemael, B. Measuring soil organic carbon in croplands at regional scale using airborne imaging spectroscopy. Geoderma Int. J. Soil Sci. 2010, 1/2, 158. [Google Scholar] [CrossRef]
  35. Íala, D.; Minaík, R.; Zádorová, T. Soil Organic Carbon Mapping Using Multispectral Remote Sensing Data: Prediction Ability of Data with Different Spatial and Spectral Resolutions. Remote Sens. 2019, 11, 2947. [Google Scholar] [CrossRef]
  36. Wang, B.; Waters, C.; Orgill, S.; Gray, J.; Cowie, A.; Clark, A.; Liu, D.L. High resolution mapping of soil organic carbon stocks using remote sensing variables in the semi-arid rangelands of eastern Australia. Sci. Total Environ. 2018, 630, 367–378. [Google Scholar] [CrossRef]
  37. Tayebi, M.; Fim Rosas, J.T.; Mendes, W.D.; Poppiel, R.R.; Ostovari, Y.; Ruiz, L.F.; dos Santos, N.V.; Cerri, C.E.; Silva, S.H.; Curi, N.; et al. Drivers of Organic Carbon Stocks in Different LULC History and along Soil Depth for a 30 Years Image Time Series. Remote Sens. 2021, 13, 2223. [Google Scholar] [CrossRef]
  38. Maynard, J.J.; Levi, M.R. Hyper-temporal remote sensing for digital soil mapping: Characterizing soil-vegetation response to climatic variability. Geoderma 2017, 285, 94–109. [Google Scholar] [CrossRef]
  39. Pellerin, S.; Bamiere, L.; Constantin, J.; Launay, C.; Martin, R.; Schiavo, M.; Angers, D.; Augusto, L.; Balesdent, J.; Basile-Doelsch, I.; et al. A model-based assessment of the soil C storage potential at the national scale: A case study from France. In Food Security and Climate Change: 4 per 1000 Initiative New Tangible Global Challenges for the Soil; HAL: Poitiers, France, 2019. [Google Scholar]
  40. Zhang, G. Changes of soil labile organic carbon in different land uses in Sanjiang Plain, Heilongjiang Province. Chin. Geogr. Sci. 2010, 20, 139–143. [Google Scholar] [CrossRef]
  41. Fang, X.; Xue, Z.; Li, B.; An, S. Soil organic carbon distribution in relation to land use and its storage in a small watershed of the Loess Plateau, China. Catena 2012, 88, 6–13. [Google Scholar] [CrossRef]
  42. Emadi, M.; Baghernejad, M.; Memarian, H.R. Effect of land-use change on soil fertility characteristics within water-stable aggregates of two cultivated soils in northern Iran. Land Use Policy 2009, 26, 452–457. [Google Scholar] [CrossRef]
  43. Dignac, M.; Derrien, D.; Barré, P.; Barot, S.; Cécillon, L.; Chenu, C.; Chevallier, T.; Freschet, G.T.; Garnier, P.; Guenet, B.; et al. Increasing soil carbon storage: Mechanisms, effects of agricultural practices and proxies. A review. Agron. Sustain. Dev. 2017, 37, 1–27. [Google Scholar] [CrossRef]
  44. Zhang, X.; Xiang, D.Q.; Yang, C.; Wu, W.; Liu, H.B. The spatial variability of temporal changes in soil pH affected by topography and fertilization. Catena 2022, 218, 106586. [Google Scholar] [CrossRef]
  45. Guo, J.; Zhu, Q.; Zhao, X.; Gou, X.; Han, Y.; Xu, J. Hyper-spectral inversion of soil organic carbon content under different land use types. Chin. J. Appl. Ecol. 2020, 31, 863–871, (In Chinese with English Abstract). [Google Scholar]
  46. Cambardella, C.A.; Moorman, T.B.; Novak, J.; Parkin, T.; Karlen, D.; Turco, R.; Konopka, A. Field-scale variability of soil properties in central Iowa soils. Soil Sci. Soc. Am. J. 1994, 58, 1501–1511. [Google Scholar] [CrossRef]
  47. Robertson, G.P.; Gross, K.L.; Caldwell, M.; Pearcy, R. Assessing the heterogeneity of belowground resources: Quantifying pattern and scale. In Exploitation of Environmental Heterogeneity by Plants: Ecophysiological Processes Above- and Belowground; Academic Press: Boston, MA, USA, 1994; pp. 237–253. [Google Scholar]
  48. Bellamy, P.H.; Loveland, P.J.; Bradley, R.I.; Lark, R.M.; Kirk, G.J. Carbon losses from all soils across England and Wales 1978–2003. Nature 2005, 437, 245–248. [Google Scholar] [CrossRef]
  49. Li, M.; Xi, X.; Xiao, G.; Cheng, H.; Yang, Z.; Zhou, G.; Ye, J.; Li, Z. National multi-purpose regional geochemical survey in China. J. Geochem. Explor. 2014, 139, 21–30. [Google Scholar] [CrossRef]
  50. Burrough, P.A.; McDonnell, R.A.; Lloyd, C.D. Principles of Geographical Information Systems; Oxford University Press: New York, NY, USA, 2015. [Google Scholar]
  51. Chen, D.; Chang, N.; Xiao, J.; Zhou, Q.; Wu, W. Mapping dynamics of soil organic matter in croplands with MODIS data and machine learning algorithms. Sci. Total Environ. 2019, 669, 844–855. [Google Scholar] [CrossRef]
  52. Lin, C.; Zhu, A.X.; Wang, Z.; Wang, X.; Ma, R. The refined spatiotemporal representation of soil organic matter based on remote images fusion of Sentinel-2 and Sentinel-3. Int. J. Appl. Earth Obs. Geoinf. 2020, 89, 102094. [Google Scholar] [CrossRef]
  53. Yu, L.; Zhuang, T.; Bai, J.; Wang, J.; Yu, Z.; Wang, X.; Zhang, G. Effects of water and salinity on soil labile organic carbon in estuarine wetlands of the Yellow River Delta, China. Ecohydrol. Hydrobiol. 2020, 20, 556–569. [Google Scholar] [CrossRef]
  54. He, W.; Wang, H.; Ye, W.; Tian, Y.; Hu, G.; Lou, Y.; Pan, H.; Yang, Q.; Zhuge, Y. Distinct stabilization characteristics of organic carbon in coastal salt-affected soils with different salinity under straw return management. Land Degrad. Dev. 2022, 33, 2246–2257. [Google Scholar] [CrossRef]
  55. Haj-Amor, Z.; Araya, T.; Kim, D.G.; Bouri, S.; Lee, J.; Ghiloufi, W.; Yang, Y.; Kang, H.; Jhariya, M.K.; Banerjee, A.; et al. Soil salinity and its associated effects on soil microorganisms, greenhouse gas emissions, crop yield, biodiversity and desertification: A review. Sci. Total Environ. 2022, 843, 156946. [Google Scholar] [CrossRef]
  56. Niro, F.; Goryl, P.; Dransfeld, S.; Boccia, V.; Gascon, F.; Adams, J.; Themann, B.; Scifoni, S.; Doxani, G. European Space Agency (ESA) calibration/validation strategy for optical land-imaging satellites and pathway towards interoperability. Remote Sens. 2021, 13, 3003. [Google Scholar] [CrossRef]
  57. Small, D. Flattening gamma: Radiometric terrain correction for SAR imagery. IEEE Trans. Geosci. Remote. Sens. 2011, 49, 3081–3093. [Google Scholar] [CrossRef]
  58. Lee, J.S.; Pottier, E. Polarimetric Radar Imaging: From Basics to Applications; CRC Press: Boca Raton, FL, USA, 2017. [Google Scholar]
  59. Veloso, A.; Mermoz, S.; Bouvet, A.; Le Toan, T.; Planells, M.; Dejoux, J.F.; Ceschia, E. Understanding the temporal behavior of crops using Sentinel-1 and Sentinel-2-like data for agricultural applications. Remote Sens. Environ. 2017, 199, 415–426. [Google Scholar] [CrossRef]
  60. Trudel, M.; Charbonneau, F.; Leconte, R. Using RADARSAT-2 polarimetric and ENVISAT-ASAR dual-polarization data for estimating soil moisture over agricultural fields. Can. J. Remote Sens. 2012, 38, 514–527. [Google Scholar]
  61. Escadafal, R.; Girard, M.; Courault, D. Munsell soil color and soil reflectance in the visible spectral bands of Landsat MSS and TM data. Remote Sens. Environ. 1989, 27, 37–46. [Google Scholar] [CrossRef]
  62. Pouget, M.; Madeira, J.; Le Floch, E.; Kamal, S. Caracteristiques spectrales des surfaces sableuses de la region cotiere nord-ouest de l’Egypte: Application aux donnees satellitaires SPOT. In Proceedings of the 2e Journée de télédéTection: CaractéRisation et Suivi des Milieux Terrestres en réGions Arides et Tropicales, Paris, France, 4–6 December 1990; ORSTOM: Paris, France, 1990; pp. 4–6. [Google Scholar]
  63. Raya, S.S.; Singhb, J.P.; Dasa, G.; Panigrahyb, S. Use of high resolution remote sensing data for generating site-specific soil mangement plan. Red 2004, 550, 727. [Google Scholar]
  64. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309. [Google Scholar] [CrossRef]
  65. Baret, F.; Guyot, G. Potentials and limits of vegetation indices for LAI and APAR assessment. Remote Sens. Environ. 1991, 35, 161–173. [Google Scholar] [CrossRef]
  66. Qi, J.; Chehbouni, A.; Huerte, A.R.; Kerr, Y.H.; Sorooshian, S. A modified soil adjusted vegetation index. Remote Sens. Environ. 1994, 48, 119–126. [Google Scholar] [CrossRef]
  67. Qi, J.; Kerr, Y.; Chehbouni, A. External factor consideration in vegetation index development. In Proceedings of the Physical Measurements and Signatures in Remote Sensing, ISPRS, Beijing, China, 17–19 October 1994; Volume 723, p. 730. [Google Scholar]
  68. Li, Z.; Moreau, L. A new approach for remote sensing of canopy-absorbed photosynthetically active radiation. I: Total surface absorption. Remote Sens. Environ. 1996, 55, 175–191. [Google Scholar] [CrossRef]
  69. Gupta, R.K. Comparative study of AVHRR ratio vegetation index and normalized difference vegetation index in district level agricultural monitoring. Int. J. Remote. Sens. 1993, 14, 53–73. [Google Scholar] [CrossRef]
  70. Clevers, J. The derivation of a simplified reflectance model for the estimation of leaf area index. Remote Sens. Environ. 1988, 25, 53–69. [Google Scholar] [CrossRef]
  71. Richardson, A.J.; Wiegand, C.L. Distinguishing vegetation from soil background information. Photogramm. Eng. Remote Sens. 1977, 43, 1541–1552. [Google Scholar]
  72. Crippen, R.E. Calculating the vegetation index faster. Remote Sens. Environ. 1990, 34, 71–73. [Google Scholar] [CrossRef]
  73. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  74. Zhou, X.; Dandan, L.; Huiming, Y.; Honggen, C.; Leping, S.; Guojing, Y.; Qingbiao, H.; Brown, L.; Malone, J.B. Use of landsat TM satellite surveillance data to measure the impact of the 1998 flood on snail intermediate host dispersal in the lower Yangtze River Basin. Acta Trop. 2002, 82, 199–205. [Google Scholar] [CrossRef]
  75. Kaufman, Y.J.; Tanre, D. Atmospherically resistant vegetation index (ARVI) for EOS-MODIS. IEEE Trans. Geosci. Remote Sens. 1992, 30, 261–270. [Google Scholar] [CrossRef]
  76. Leprieur, C.; Kerr, Y.H.; Pichon, J.M. Critical assessment of vegetation indices from AVHRR in a semi-arid environment. Remote Sens. 1996, 17, 2549–2563. [Google Scholar] [CrossRef]
  77. Abbas, A.; Khan, S. Using remote sensing techniques for appraisal of irrigated soil salinity. In Advances and Applications for Management and Decision Making Land, Water and Environmental Management; Oxley, L., Kulasiri, D., Eds.; Modelling and Simulation Society of Australia and New Zealand: Christchurch, New Zealand, 2007; pp. 2632–2638. [Google Scholar]
  78. Nicolas, H.; Walter, C. Detecting salinity hazards within a semiarid context by means of combining soil and remote-sensing data. Geoderma 2006, 134, 217–230. [Google Scholar]
  79. Khan, N.M.; Rastoskuev, V.V.; Sato, Y.; Shiozawa, S. Assessment of hydrosaline land degradation by using a simple approach of remote sensing indicators. Agric. Water Manag. 2005, 77, 96–109. [Google Scholar] [CrossRef]
  80. Liu, W.; Henneberry, S.R.; Ni, J.; Radmehr, R.; Wei, C. Socio-cultural roots of rural settlement dispersion in Sichuan Basin: The perspective of Chinese lineage. Land Use Policy 2019, 88, 104162. [Google Scholar] [CrossRef]
  81. Zhou, J.; Li, E.; Wang, M.; Chen, X.; Shi, X.; Jiang, L. Feasibility of stochastic gradient boosting approach for evaluating seismic liquefaction potential based on SPT and CPT case histories. J. Perform. Constr. Facil. 2019, 33, 04019024. [Google Scholar] [CrossRef]
  82. Yu, X.; Wang, Y.; Wu, L.; Chen, G.; Wang, L.; Qin, H. Comparison of support vector regression and extreme gradient boosting for decomposition-based data-driven 10-day streamflow forecasting. J. Hydrol. 2020, 582, 124293. [Google Scholar] [CrossRef]
  83. Fan, J.; Yue, W.; Wu, L.; Zhang, F.; Cai, H.; Wang, X.; Lu, X.; Xiang, Y. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric. For. Meteorol. 2018, 263, 225–241. [Google Scholar] [CrossRef]
  84. Jia, Y.; Jin, S.; Savi, P.; Gao, Y.; Tang, J.; Chen, Y.; Li, W. GNSS-R soil moisture retrieval based on a XGboost machine learning aided method: Performance and validation. Remote Sens. 2019, 11, 1655. [Google Scholar] [CrossRef]
  85. Zheng, H.; Yuan, J.; Chen, L. Short-term load forecasting using EMD-LSTM neural networks with a Xgboost algorithm for feature importance evaluation. Energies 2017, 10, 1168. [Google Scholar] [CrossRef]
  86. Hartig, F.; Dormann, C.F. Does model-free forecasting really outperform the true model? Proc. Natl. Acad. Sci. USA 2013, 110, E3975. [Google Scholar] [CrossRef]
  87. Field, A. Discovering Statistics Using IBM SPSS Statistics; Sage Publications Limited: Washington, DC, USA, 2024. [Google Scholar]
  88. Tang, Z.; Nan, Z. The potential of cropland soil carbon sequestration in the Loess Plateau, China. Mitig. Adapt. Strateg. Glob. Chang. 2013, 18, 889–902. [Google Scholar] [CrossRef]
  89. Zhou, T.; Geng, Y.; Chen, J.; Pan, J.; Haase, D.; Lausch, A. High-resolution digital mapping of soil organic carbon and soil total nitrogen using DEM derivatives, Sentinel-1 and Sentinel-2 data based on machine learning algorithms. Sci. Total Environ. 2020, 729, 138244. [Google Scholar] [CrossRef]
  90. Li, X.; McCarty, G.W.; Karlen, D.L.; Cambardella, C.A. Topographic metric predictions of soil redistribution and organic carbon in Iowa cropland fields. Catena 2018, 160, 222–232. [Google Scholar] [CrossRef]
  91. Xu, X.; Tong, X.; Plaza, A.; Zhong, Y.; Xie, H.; Zhang, L. Using linear spectral unmixing for subpixel mapping of hyperspectral imagery: A quantitative assessment. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2016, 10, 1589–1600. [Google Scholar] [CrossRef]
  92. Dvorakova, K.; Heiden, U.; van Wesemael, B. Sentinel-2 exposed soil composite for soil organic carbon prediction. Remote Sens. 2021, 13, 1791. [Google Scholar] [CrossRef]
  93. Narayan, U.; Lakshmi, V. Characterizing subpixel variability of low resolution radiometer derived soil moisture using high resolution radar data. Water Resour. Res. 2008, 44, W06425. [Google Scholar] [CrossRef]
  94. Bindlish, R.; Barros, A.P. Subpixel variability of remotely sensed soil moisture: An inter-comparison study of SAR and ESTAR. IEEE Trans. Geosci. Remote Sens. 2002, 40, 326–337. [Google Scholar] [CrossRef]
  95. Garosi, Y.; Ayoubi, S.; Nussbaum, M.; Sheklabadi, M.; Nael, M.; Kimiaee, I. Use of the time series and multi-temporal features of Sentinel-1/2 satellite imagery to predict soil inorganic and organic carbon in a low-relief area with a semi-arid environment. Int. J. Remote. Sens. 2022, 43, 6856–6880. [Google Scholar] [CrossRef]
  96. Vaudour, E.; Gomez, C.; Loiseau, T.; Baghdadi, N.; Loubet, B.; Arrouays, D.; Ali, L.; Lagacherie, P. The impact of acquisition date on the prediction performance of topsoil organic carbon from Sentinel-2 for croplands. Remote Sens. 2019, 11, 2143. [Google Scholar] [CrossRef]
  97. Davidson, E.A.; Janssens, I.A. Temperature sensitivity of soil carbon decomposition and feedbacks to climate change. Nature 2006, 440, 165–173. [Google Scholar] [CrossRef]
  98. Fang, X.; Zhu, Y.L.; Liu, J.D.; Lin, X.P.; Sun, H.Z.; Tang, X.H.; Hu, Y.L.; Huang, Y.P.; Yi, Z.G. Effects of moisture and temperature on soil organic carbon decomposition along a vegetation restoration gradient of subtropical China. Forests 2022, 13, 578. [Google Scholar] [CrossRef]
  99. Samuel-Rosa, A.; Heuvelink, G.B.; Vasques, G.M.; Anjos, L.H. Do more detailed environmental covariates deliver more accurate soil maps? Geoderma 2015, 243, 214–227. [Google Scholar] [CrossRef]
  100. Odhiambo, B.O.; Kenduiywo, B.K.; Were, K. Spatial prediction and mapping of soil pH across a tropical afro-montane landscape. Appl. Geogr. 2020, 114, 102129. [Google Scholar] [CrossRef]
  101. Brungard, C.; Nauman, T.; Duniway, M.; Veblen, K.; Nehring, K.; White, D.; Salley, S.; Anchang, J. Regional ensemble modeling reduces uncertainty for digital soil mapping. Geoderma 2021, 397, 114998. [Google Scholar] [CrossRef]
Figure 1. (a) Digital elevation model and distribution of soil sample points in the study area; (b) RGB image acquired by the PlanetScope optical satellite sensor (image date: 27 July 2017); (c) distribution of paddy field and dry land in the study area.
Figure 2. Preprocessing workflow for the environmental data used in modeling.
Figure 3. Multi-temporal normalized difference vegetation index (NDVI) images at 20 m modeling resolution from PlanetScope and Sentinel-2. PlanetScope: the spring images were acquired on 26 May 2017, the summer images on 27 July 2017, the autumn images on 31 October 2017, and the winter images on 22 December 2017. Sentinel-2: the spring images were acquired on 4 May 2017, the summer images on 10 July 2017, the autumn images on 31 October 2017, and the winter images on 22 December 2017.
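For readers unfamiliar with the index mapped in Figure 3, the following is a minimal sketch of the standard NDVI definition, (NIR − Red) / (NIR + Red), applied to single-pixel reflectances. The function name and the reflectance values are hypothetical, and the sketch says nothing about how the authors processed the actual PlanetScope or Sentinel-2 rasters.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        # Undefined when both bands are zero (e.g., masked or no-data pixels).
        return float("nan")
    return (nir - red) / total


# Hypothetical surface reflectances, for illustration only.
dense_vegetation = ndvi(nir=0.45, red=0.05)
sparse_cover = ndvi(nir=0.25, red=0.20)

print(round(dense_vegetation, 3), round(sparse_cover, 3))
```

Values close to 1 indicate dense green vegetation, while values near 0 are typical of bare or sparsely covered soil, which is why seasonal NDVI differences carry information about vegetation cover over arable land.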
Figure 4. Technical workflow.
Figure 5. Performance results of XGBoost in predicting SOC based on different combinations of environmental variables at different modeling resolutions. Model A: PlanetScope + DEM; Model B: PlanetScope + Sentinel-1 + DEM; Model C: Sentinel-2 + DEM; Model D: Sentinel-2 + Sentinel-1 + DEM; Model E: PlanetScope + Sentinel-2 + DEM; Model F: PlanetScope + Sentinel-2 + Sentinel-1 + DEM.
Figure 6. Relative importance of the thirty most important environmental variables used for SOC prediction in Model E and Model F at a resolution of 20 m, based on XGBoost, for paddy field and dry land, respectively. Model E: PlanetScope + Sentinel-2 + DEM; Model F: PlanetScope + Sentinel-2 + Sentinel-1 + DEM. PS, S2, and S1 denote PlanetScope, Sentinel-2, and Sentinel-1 predictors, respectively; May, July, October, and December denote the image acquisition months.
Figure 7. SOC content prediction map for cultivated land in the study area based on the optimal model at 20 m spatial resolution.
Table 1. Spectral bands and resolutions of the PlanetScope and Sentinel-2 sensors used in this study.

| PlanetScope band | Wavelength (nm) | Resolution (m) | Sentinel-2 band | Wavelength (nm) | Resolution (m) |
|---|---|---|---|---|---|
| Blue (B) | 457.5–522.5 | 3 | Blue (B) | 458–523 | 10 |
| Green (G) | 542–577.5 | 3 | Green (G) | 543–578 | 10 |
| Red (R) | 650–680 | 3 | Red (R) | 650–680 | 10 |
| Near infrared (NIR) | 855–875 | 3 | Red edge 1 (RE1) | 698–713 | 20 |
| | | | Red edge 2 (RE2) | 733–748 | 20 |
| | | | Red edge 3 (RE3) | 773–793 | 20 |
| | | | Near infrared (NIR) | 785–900 | 10 |
| | | | Near infrared narrow (NIRn) | 855–875 | 20 |
| | | | Shortwave infrared 1 (SWIR1) | 1565–1655 | 20 |
| | | | Shortwave infrared 2 (SWIR2) | 2100–2280 | 20 |

Table 3. Different combinations of environmental variables for SOC prediction.

| Model | Environmental variables |
|---|---|
| Model A | DEM + PlanetScope |
| Model B | DEM + PlanetScope + Sentinel-1 |
| Model C | DEM + Sentinel-2 |
| Model D | DEM + Sentinel-2 + Sentinel-1 |
| Model E | DEM + PlanetScope + Sentinel-2 |
| Model F | DEM + PlanetScope + Sentinel-2 + Sentinel-1 |
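The six combinations in Table 3 can be written down as a small lookup that picks predictor columns for each model. The following is only a sketch: the column-name prefixes (`DEM_`, `PS_`, `S2_`, `S1_`) and the example predictor names are hypothetical, chosen to echo the PS/S2/S1 abbreviations used in Figure 6, and are not the authors' naming scheme.

```python
# Data sources per model, as listed in Table 3.
MODEL_SOURCES = {
    "A": ("DEM", "PlanetScope"),
    "B": ("DEM", "PlanetScope", "Sentinel-1"),
    "C": ("DEM", "Sentinel-2"),
    "D": ("DEM", "Sentinel-2", "Sentinel-1"),
    "E": ("DEM", "PlanetScope", "Sentinel-2"),
    "F": ("DEM", "PlanetScope", "Sentinel-2", "Sentinel-1"),
}

# Hypothetical column-name prefixes (assumption, not from the paper).
SOURCE_PREFIX = {
    "DEM": "DEM_",
    "PlanetScope": "PS_",
    "Sentinel-2": "S2_",
    "Sentinel-1": "S1_",
}


def select_predictors(columns, model):
    """Return the columns whose prefix matches a data source used by `model`."""
    prefixes = tuple(SOURCE_PREFIX[source] for source in MODEL_SOURCES[model])
    return [name for name in columns if name.startswith(prefixes)]


# Made-up predictor names, for illustration only.
example_columns = ["DEM_slope", "PS_NIR_Dec", "S2_SWIR1_Dec", "S1_VV_Jul"]
for model in MODEL_SOURCES:
    print(model, select_predictors(example_columns, model))
```

Keeping the combinations in one mapping means every model is trained on a subset of the same predictor table, so differences in accuracy can be attributed to the data sources rather than to preprocessing.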
Table 4. Parameter settings for the XGBoost algorithm.

| Parameter | Threshold | Interval |
|---|---|---|
| Eta | 0.01, 0.05, 0.1 | - |
| Max_Depth | 1–11 | 1 |
| Min_Child_Weight | 0–21 | 1 |
| Lambda | 0–11 | 1 |

Notes: Eta = learning rate; Max_Depth = maximum tree depth; Min_Child_Weight = minimum child weight; Lambda = L2 regularization term (lambda).
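To give a sense of the size of the search space in Table 4, the sketch below enumerates every parameter combination. It assumes the ranges read as 1 to 11, 0 to 21, and 0 to 11, inclusive, with a step of 1, and it assumes an exhaustive grid; this section does not state how the authors actually searched the space. The keys `eta`, `max_depth`, `min_child_weight`, and `lambda` are XGBoost's native parameter names.

```python
from itertools import product

# Candidate values, assuming the Table 4 ranges are inclusive with step 1.
GRID = {
    "eta": [0.01, 0.05, 0.1],
    "max_depth": list(range(1, 12)),         # 1..11
    "min_child_weight": list(range(0, 22)),  # 0..21
    "lambda": list(range(0, 12)),            # 0..11
}


def iter_param_sets(grid):
    """Yield one parameter dict per combination in the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


n_combinations = sum(1 for _ in iter_param_sets(GRID))
print(n_combinations)  # 3 * 11 * 22 * 12 = 8712 under these assumptions
```

Each yielded dictionary could be passed as the parameter set of one training run and scored by cross-validation; with several thousand combinations, a random or staged search would be a cheaper alternative.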
Table 5. The statistics of SOC (%) under total area, paddy field, and dry land.

| Land Use Type | N | Min | Max | Mean | SD | CV (%) | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| Total area | 332 | 0.05 | 2.84 | 1.17 | 0.42 | 36.08 | 0.52 | 1.49 |
| Paddy field | 171 | 0.05 | 2.84 | 1.26 a | 0.44 | 34.79 | 0.50 | 1.56 |
| Dry land | 161 | 0.14 | 2.60 | 1.08 b | 0.38 | 35.59 | 0.40 | 1.20 |

Notes: N = number; Min = minimum; Max = maximum; SD = standard deviation; CV = coefficient of variation. Different letters within the mean column indicate that the difference in SOC between paddy field and dry land is significant at p < 0.05.
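As a quick consistency check on Table 5, the coefficient of variation can be recomputed from the tabulated mean and SD using the standard definition, CV = 100 × SD / mean. This is only a sketch: the inputs are the rounded values printed in the table, so the recomputed figures are expected to differ slightly from the reported ones, which were presumably derived from unrounded data.

```python
# Rounded mean and SD of SOC (%) from Table 5, with the reported CV (%).
TABLE5 = {
    "Total area": {"mean": 1.17, "sd": 0.42, "cv_reported": 36.08},
    "Paddy field": {"mean": 1.26, "sd": 0.44, "cv_reported": 34.79},
    "Dry land": {"mean": 1.08, "sd": 0.38, "cv_reported": 35.59},
}


def coefficient_of_variation(mean, sd):
    """Return the CV as a percentage: 100 * SD / mean."""
    return 100.0 * sd / mean


for name, row in TABLE5.items():
    cv = coefficient_of_variation(row["mean"], row["sd"])
    print(f"{name}: recomputed {cv:.2f}%, reported {row['cv_reported']:.2f}%")
```

All three recomputed values fall within half a percentage point of the reported CVs, which is consistent with rounding of the mean and SD to two decimals.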

Citation: Wang, Z.; Wu, W.; Liu, H. Spatial Estimation of Soil Organic Carbon Content Utilizing PlanetScope, Sentinel-2, and Sentinel-1 Data. Remote Sens. 2024, 16, 3268. https://doi.org/10.3390/rs16173268
