Article

Mapping Soil Organic Matter in a Typical Black Soil Region Using Multi-Temporal Synthetic Images and Radar Indices Under Limited Bare Soil Windows

1 College of Land Science and Technology, China Agricultural University, Beijing 100193, China
2 School of Public Administration, South China Agricultural University, Guangzhou 510642, China
3 College of Geoscience and Surveying Engineering, China University of Mining and Technology-Beijing, Beijing 100083, China
4 Bayannur Modern Agriculture and Animal Husbandry Development Center, Bayannur 015000, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Remote Sens. 2025, 17(17), 2929; https://doi.org/10.3390/rs17172929
Submission received: 11 July 2025 / Revised: 10 August 2025 / Accepted: 21 August 2025 / Published: 23 August 2025

Abstract

Remote sensing technology provides an efficient and low-cost approach for acquiring large-scale soil information, offering notable advantages for soil organic matter (SOM) mapping. However, in recent years, the bare soil period of cultivated land in Northeast China has significantly shortened, posing serious challenges to traditional SOM prediction and mapping methods that rely on optical imagery. Meanwhile, current approaches that integrate optical imagery, radar imagery, and environmental covariates have yet to fully exploit the potential of remote sensing data in SOM mapping. To address this, this study focuses on the typical black soil region in Northeastern China, acquiring median synthetic images from different time periods (crop sowing, growing, and harvest stages) along with vegetation and radar indices. Six data groups were created by integrating environmental covariate data. Four machine learning models—XGBoost, BRT, ET, and RF—were used to analyze the SOM prediction accuracy of different groups. The group and model with the highest prediction accuracy were selected for SOM mapping in cultivated land. The results show that: (1) in the same model, incorporating radar images and their related indices significantly improves SOM prediction accuracy; (2) when using four machine learning models for SOM prediction, the RF model, which integrates optical images, radar images, vegetation indices, and radar indices from the crop sowing and growing periods, achieves the highest accuracy (R2 = 0.530, RMSE = 6.130, MAE = 4.822); (3) in the optimal SOM prediction model, temperature, precipitation, and elevation are relatively more important, with radar indices showing greater importance than vegetation indices; (4) uncertainty analysis and accuracy verification at the raster scale confirm that the SOM mapping results obtained in this study are highly reliable. 
This study made significant progress in SOM prediction and mapping by employing a radar–optical image fusion strategy combined with crop growth information. It helped address existing research gaps and provided new approaches and technical solutions for remote sensing-based SOM monitoring in regions with short bare soil periods.

1. Introduction

China feeds 22% of the world’s population on just 7% of the arable land, making it the leading global grain producer. At present, the sustainability of China’s cultivated land is threatened by prolonged and intensive use [1,2]. As a key commodity grain production base in China, the Northeast Black Soil Region has long shouldered the responsibility of safeguarding national food security. This region features loose soil texture and a thick humus-rich topsoil layer, resulting in naturally high soil organic matter (SOM) content [3]. However, in recent years, the widespread adoption of intensive farming practices has led to significant soil degradation, particularly a gradual decline in SOM content within the cultivated black soil layer [4,5]. In response, conservation tillage practices, such as reduced tillage and straw mulching, have been increasingly adopted as effective measures to combat soil degradation [6,7]. These practices leave arable land covered by vegetation or straw for long periods, with extremely short or almost non-existent periods of bare soil, which poses significant challenges for SOM prediction research based on satellite remote sensing.
In recent years, satellite remote sensing has shown its feasibility for SOM prediction and mapping [8,9]. It combines low cost with extensive coverage, enabling continuous monitoring of the Earth’s surface. The Moderate Resolution Imaging Spectroradiometer (MODIS), the Landsat series, and the Sentinel series are among the commonly used sensors and satellites for this purpose, with wide spectral coverage from 350 to 2500 nm that includes SOM-responsive bands [10,11]. Numerous studies have utilized Sentinel-2 images for SOM mapping, and the accuracy of the results effectively meets research requirements [12,13]. The Sentinel-1 satellite offers all-weather Earth observation, unaffected by cloud cover and precipitation events, making it one of the most reliable and high-performance microwave radar remote sensing tools available. Research involving Sentinel-1 images has primarily focused on the extraction of soil roughness and physical properties such as moisture and texture [14,15]. The direct correlation between the backscattering coefficient of Sentinel-1 and SOM remains uncertain. While a few studies have indicated that combining Sentinel-1 and Sentinel-2 data may enhance SOM prediction accuracy to some extent [16,17], substantial potential for further exploration remains.
Besides the choice of satellite imagery, the accuracy of SOM predictions is directly influenced by image acquisition time, image quality, and ground cover conditions, among other factors [18]. Some studies have shown that satellite images taken during the bare soil period play an important role in SOM prediction [19,20]. These studies, however, often used multi-temporal images from the same year, while the bare soil window for agricultural production is very short, and images acquired within it may be degraded by cloud cover and rainy weather [21,22]. To address this issue, some researchers obtain multi-year, multi-temporal remote sensing images during bare soil periods and key crop growth stages for mapping SOM. For example, Rizzo et al. (2020) [23] synthesized a single bare soil image from multi-year and multi-temporal Landsat images of southeastern Brazil. They then constructed the SOM prediction model based on the spectral reflectance data of this image. Luo et al. (2023) found that combining multi-year median composite images from the bare soil period and crop growing season using Landsat-8 images effectively improves SOM prediction accuracy [24]. Mallick et al. (2020) enhanced the accuracy of SOM inversion using Landsat 8 images combined with multiple vegetation indices [25]. Moreover, multi-temporal Landsat images, analyzed by Belenok et al. (2023), demonstrated a strong correlation between vegetation indices and SOM [26]. Yang et al. (2023) confirmed that Fourier transform decomposed (FTD) variables based on vegetation indices effectively capture time series patterns of crop growth, thereby significantly improving the accuracy of SOM mapping [18].
Currently, support vector machines (SVMs), extreme gradient boosting (XGBoost), artificial neural networks (ANNs), and random forest (RF) models are widely applied in SOM mapping studies [27,28,29]. Existing studies have shown that these models can effectively handle the complex nonlinear relationships between multispectral reflectance and SOM through adaptive learning mechanisms, thereby reducing training errors and significantly improving the predictive accuracy of the models [30]. Although numerous studies have been conducted to predict SOM based on remote sensing data, the synergistic response mechanisms between crop growth status and the spatial distribution of SOM remain insufficiently explored. Moreover, the response mechanisms linking Sentinel-1 radar imagery and its derived indices with SOM have not been systematically investigated. Under conditions where cultivated land is covered by vegetation or has minimal bare soil, how to fully exploit existing remote sensing data to extract SOM information to the greatest extent remains the major challenge and the critical frontier in current research.
In recent years, the bare soil period of cultivated land in Northeast China has significantly shortened, making it increasingly difficult to acquire representative bare soil spectral data using remote sensing satellites. Under these circumstances, to enhance the prediction and mapping capabilities of remote sensing data for SOM, this study adopts a multi-temporal radar–optical image fusion strategy to generate median synthetic images from multi-year remote sensing data and calculates vegetation-related remote sensing indices, aiming to enhance the comprehensive characterization of the spatial heterogeneity of SOM. This study focuses on cultivated land in the typical black soil region of Northeast China. By integrating multi-temporal median synthetic images from Sentinel-1 and Sentinel-2 satellites during the sowing, growing, and harvesting stages from 2019 to 2024, along with environmental covariates, multiple SOM prediction models were established. The optimal model was then selected to generate the spatial distribution map of SOM. The specific objectives of this study are: (1) to assess the potential of optical images and environmental covariates at different crop growth stages for SOM mapping; (2) to evaluate the influence of radar images and their derived indices on the accuracy of SOM prediction; and (3) to generate the SOM distribution map at a spatial resolution of 30 m.

2. Materials and Methods

2.1. Study Area

This study area encompasses Zalantun City, Jalaid Banner, Horqin Right Front Banner, and Ulanhot City (Figure 1), located in the eastern part of the Inner Mongolia Autonomous Region and the western part of Northeast China. This region lies in the transitional zone between the Greater Khingan Mountains and the Songnen Plain, with terrain sloping from northwest to southeast and comprising low mountains, hills, and plains. Elevation ranges from 106 to 1697 m above sea level. The area has pronounced seasonal temperature variations and abundant sunshine. The mean annual temperature is approximately 3.6 °C, with an average annual precipitation of about 491.8 mm and a frost-free period of roughly 126 days. The major crops cultivated in the region include corn, soybeans, and rice. Sowing typically begins in May, followed by the growing season from June to August and the maturation and harvesting period from September to October.

2.2. Data Acquisition and Treatment

2.2.1. Soil Sample Collection and Processing

In October 2023, 108 surface soil samples from the top 20 cm were collected within the study area (Figure 1). Sampling points were refined and adjusted on a regular polygon grid by overlaying slope, land use, and soil type data. The 2023 land use data used in this study were obtained from the China land cover dataset (CLCD), with a spatial resolution of 30 m [31]. Accessibility and the feasibility of sampling conditions were taken into account to ensure the reasonable placement of sampling locations. At each sampling point, a square area of 100 × 100 cm was delineated. Sampling was carried out using the five-point plum blossom method, at both the center and the periphery [32]. These sub-samples were combined into one final sample weighing 500 g, with the geographic coordinates of the central sampling point recorded via GPS. The GPS device was manufactured by Wuhan Huancan Engineering Technology Co., Ltd. (Wuhan, China). According to the statistical results in Table 1, the number of soil sampling points and the proportion of cultivated land area across different terrain types were relatively consistent, indicating that the spatial distribution of sampling points was relatively uniform and representative in this study.
The samples were carried to the laboratory in sealed plastic bags and air-dried in the shade, away from direct sunlight. After drying, the soil was ground and then sieved through a 2 mm mesh [33]. The soil organic carbon (SOC) content of soil samples was determined by the external heating potassium dichromate titration method, and the measured values were subsequently multiplied by the conversion factor of 1.724 [34] to calculate the SOM content [35]. The total nitrogen (TN) content was determined by the Kjeldahl method [36]. The chemical reagents and equipment used in this study were produced by Beijing Institute of Chemical Reagents Co., Ltd. (Beijing, China) and Beijing Star-Century Science Development Co., Ltd. (Beijing, China), respectively.

2.2.2. Sentinel-1 Images

The Sentinel-1 system comprises two C-band Synthetic Aperture Radar (SAR) satellites, Sentinel-1A and Sentinel-1B, which together provide a six-day revisit cycle. In this study, Sentinel-1 SAR ground-range-detected (GRD) images for May, June to August, and September to October from 2019 to 2024 were retrieved from the “Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, log scaling” collection on the Google Earth Engine (GEE) platform. According to the official GEE documentation, the data processing sequence consists of the following steps: (1) thermal noise removal; (2) radiometric calibration; and (3) terrain correction using the SRTM 30 DEM, or the ASTER DEM for areas beyond 60 degrees latitude. Next, logarithmic scaling (10*log10(x)) was used to convert the terrain-corrected values to decibels. In this study, Sentinel-1 images in two polarizations were used: vertical transmit/vertical receive (VV) and vertical transmit/horizontal receive (VH). Median composites of images for May, June to August, and September to October from 2019 to 2024 were generated separately, resulting in three median synthetic images (Table 2). These images correspond to the crop sowing, growing, and maturation and harvesting periods, respectively. Research indicates that median synthetic images selectively extract the most representative pixels from time series data while minimizing the impact of outliers [37].
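The logarithmic scaling step, 10*log10(x), can be illustrated with a minimal sketch. It assumes the input is a positive backscatter value in linear power units; the function name `to_decibels` and the sample values are hypothetical and are not taken from the study's processing chain.

```python
import math


def to_decibels(sigma0_linear):
    """Convert a linear-power backscatter value to decibels: 10 * log10(x)."""
    if sigma0_linear <= 0:
        # The logarithm is undefined for zero or negative power values.
        raise ValueError("backscatter must be positive")
    return 10.0 * math.log10(sigma0_linear)


# Hypothetical linear backscatter values, for illustration only.
for value in (1.0, 0.1, 0.01):
    print(value, "->", to_decibels(value), "dB")
```

Each tenfold drop in linear power lowers the result by 10 dB, which is why weak cross-polarized returns appear as strongly negative decibel values.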

2.2.3. Sentinel-2 Images

Sentinel-2 is a high-resolution multispectral imaging mission equipped with the Multispectral Instrument (MSI), which covers twelve different spectral bands from 400 to 2400 nm. The spatial resolutions are 10 m for the visible and broad near-infrared bands, 20 m for the red-edge, narrow near-infrared, and shortwave infrared bands, and 60 m for the atmospheric bands (Table 3). The Sentinel-2 system, comprising the Sentinel-2A and Sentinel-2B satellites, maintains a 5-day revisit interval. In this study, Sentinel-2 images for May, June to August, and September to October from 2019 to 2024 were sourced from the “Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A (SR)” dataset on the GEE platform. Following the cloud cover threshold recommended by the GEE platform, only images with cloud cover below 20% were selected from the image collection. Each image was then masked using the “QA60” band to remove cloudy pixels for the given timeframe. Then, median synthetic images were generated for May, June to August, and September to October, as listed in Table 2.
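The pixel-level cloud masking described above can be sketched as a bit test on the QA60 band, in which bit 10 flags opaque clouds and bit 11 flags cirrus. This is a plain-Python illustration rather than the GEE code used in the study; the helper names `is_clear` and `mask_pixels`, the list-based image layout, and the use of `None` for masked pixels are assumptions made for the example.

```python
CLOUD_BIT = 1 << 10   # QA60 bit 10: opaque clouds
CIRRUS_BIT = 1 << 11  # QA60 bit 11: cirrus clouds


def is_clear(qa60_value):
    """Return True when neither the opaque-cloud nor the cirrus bit is set."""
    return (qa60_value & CLOUD_BIT) == 0 and (qa60_value & CIRRUS_BIT) == 0


def mask_pixels(reflectance, qa60):
    """Replace cloudy pixels with None so later steps can skip them."""
    return [r if is_clear(q) else None for r, q in zip(reflectance, qa60)]


# Hypothetical reflectance values and QA60 flags for three pixels.
print(mask_pixels([0.10, 0.20, 0.30], [0, 1024, 2048]))
```

Only the first pixel survives here, because the second has the opaque-cloud bit set and the third has the cirrus bit set.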
Although surface vegetation interference is relatively low during the winter months, the widespread presence of snow cover in Northeast China poses a significant challenge [38]. In the study area, most surfaces during this period are affected by varying degrees of snow or freezing conditions, which substantially interfere with the backscatter of Sentinel-1 imagery and the surface reflectance signals of Sentinel-2 imagery. Meanwhile, the current GEE platform still has certain limitations in the availability and quality control of winter imagery. A large number of images are affected by snow cover, surface freezing, and low solar elevation angles, making it difficult to meet research needs. Therefore, remote sensing images taken during winter were not selected for this study.
To ensure uniformity, all median synthetic images were resampled to a spatial resolution of 30 m using the nearest-neighbor technique. The Sentinel-1 images were then used to compute the normalized difference polarization index (NDPI) [39] and the polarization ratio (PR) [40]. The normalized difference vegetation index (NDVI) was derived from Sentinel-2 images [41]. Formulas for these remote sensing indices are provided in Table 4.
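As an illustration of how such indices are computed per pixel, the sketch below uses the standard NDVI definition, (NIR − Red)/(NIR + Red). Table 4 is not reproduced in this section, so the radar index forms shown here, NDPI = (VV − VH)/(VV + VH) and PR = VH/VV on linear backscatter values, are commonly used definitions adopted as assumptions and may differ from the paper's exact formulas. All function names and sample values are hypothetical.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b); NaN if the sum is zero."""
    total = a + b
    if total == 0:
        return float("nan")
    return (a - b) / total


def ndvi(nir, red):
    """Standard NDVI from near-infrared and red reflectance."""
    return normalized_difference(nir, red)


def ndpi(vv, vh):
    """ASSUMED form of the normalized difference polarization index."""
    return normalized_difference(vv, vh)


def polarization_ratio(vv, vh):
    """ASSUMED form of the polarization ratio: VH divided by VV."""
    return vh / vv


# Hypothetical pixel values: reflectance for NDVI, linear backscatter for radar.
print(ndvi(nir=0.5, red=0.1))
print(ndpi(vv=0.1, vh=0.02))
print(polarization_ratio(vv=0.1, vh=0.02))
```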

2.2.4. Environmental Covariates

Studies have confirmed that, among topographic factors, elevation and slope have significant effects on SOM [42]. In addition, SOM is a soil physicochemical property formed through long-term accumulation and is strongly influenced by the cumulative effects of climatic factors over time. Studies have shown that using long-term average climatic variables from the 1980s to the present can more comprehensively reflect the background climate conditions associated with the formation and evolution of SOM in the study area, thereby improving the prediction accuracy of SOM [42,43]. The environmental covariates used in this study are as follows.
(1) Topographic data. The Shuttle Radar Topography Mission (SRTM) digital elevation data represent a global effort coordinated by NASA’s Jet Propulsion Laboratory to generate a global-scale digital elevation model (DEM) with 1 arc second resolution, available in the SRTM V3 product, also known as SRTM Plus. The 30 m resolution DEM data for the study area were obtained from the “NASA SRTM Digital Elevation 30 m” dataset on the GEE platform, and slope was then calculated from the DEM for use as an environmental variable.
(2) Climatic data. The European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis Version 5 (ERA5) dataset is the fifth-generation atmospheric reanalysis of the global climate produced by ECMWF. It integrates observational data from around the world with model outputs to produce a spatially and temporally consistent climate dataset. In this study, the “ERA5 Monthly Aggregates–Latest Climate Reanalysis Produced by ECMWF/Copernicus Climate Change Service” dataset provided on the GEE platform was employed. The long-term average of climate factors is widely recognized as a reliable indicator of regional climatic conditions. The average values of air temperature (mean_2m_air_temperature) and precipitation (total_precipitation) recorded in the study area from 1 January 1980 to 31 December 2024 were used as climate data inputs.

2.2.5. SOC Raster Data

The China High-Resolution National Soil Information Grid Basic Attributes SOC Dataset has a spatial resolution of 90 m and includes six soil depth layers in the vertical direction: 0–5 cm, 5–15 cm, 15–30 cm, 30–60 cm, 60–100 cm, and 100–200 cm. The dataset is based on field survey sampling data collected between 2010 and 2018 and follows the predictive soil mapping paradigm [44]. In this study, the average SOC content at the depth of 0–15 cm was obtained based on this dataset, with the data unit being g/kg. This dataset was not used as a covariate, model input, or for accuracy assessment, but rather served as a reference for spatial comparison with the SOM mapping results generated in this study.

2.3. Methods

2.3.1. Median Synthesis Approach

Median synthesis is a commonly used strategy in remote sensing image processing. Research indicates that median synthetic images selectively extract the most representative pixels from time series data while minimizing the impact of outliers [37]. The specific formula is as follows:
$$\mathrm{Median} = X_{\frac{n+1}{2}}, \quad n \text{ is an odd number}$$
$$\mathrm{Median} = \frac{X_{\frac{n}{2}} + X_{\frac{n}{2}+1}}{2}, \quad n \text{ is an even number}$$
where n represents the number of remote sensing images, and X denotes the ordered sequence of values for different spectral bands. In this study, the median function provided by the GEE platform was used to obtain the median synthetic image. This function reduces an image collection by calculating the median of all values at each pixel across the stack of all matching bands, where bands are matched by name.
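The per-pixel median reduction can be sketched in plain Python. This is not the GEE implementation: it assumes each image is a flat list of values for a single band, that masked pixels are stored as `None`, and that the function name `median_composite` is hypothetical. Python's `statistics.median` averages the two middle values when the count is even, which matches the even-n case of the formula.

```python
import statistics


def median_composite(image_stack):
    """Compute the per-pixel median across a stack of single-band images.

    image_stack: list of images, each a list of pixel values of equal length.
    Pixels stored as None (assumed to mean "masked") are ignored.
    """
    n_pixels = len(image_stack[0])
    composite = []
    for p in range(n_pixels):
        valid = [img[p] for img in image_stack if img[p] is not None]
        # A pixel with no valid observation stays masked in the composite.
        composite.append(statistics.median(valid) if valid else None)
    return composite


# Hypothetical stack of three two-pixel images.
stack = [[0.1, 0.5], [0.3, None], [0.9, 0.7]]
print(median_composite(stack))
```

In this toy stack the first pixel has three valid observations and takes the middle one, while the second pixel has two and takes their mean.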

2.3.2. Machine Learning Models

To reduce the loss function and mitigate the risk of overfitting, Friedman et al. (2000) introduced the Boosted Regression Tree (BRT) model, which blends regression trees and boosting approaches [45]. The performance of the BRT model is further improved by a stochastic gradient-based approach. In this study, we implemented the BRT model on the Matlab 2016b platform and optimized its parameters by varying the number of trees (numTrees).
The Extremely Randomized Trees (ET) model, first proposed by Geurts et al. (2006), is a decision tree-based ensemble technique with strong generalization capability [46]. The ET algorithm uses all samples as training samples to construct different decision trees based on different features. At each node, a scoring function evaluates random splits across K randomly selected features, and the highest-scoring split is selected. In predictive scenarios, the decision trees within the ensemble each score new samples, and the aggregate prediction is the average output across the trees. On the Matlab 2016b platform, key parameters such as numTrees, the number of trees, and minLeafSize, the minimum number of samples required per leaf node, were optimized for this study.
The RF model was created by Breiman (2001) [47]. It is an ensemble machine learning technique that combines multiple decision trees for classification and regression. It makes predictions by randomly selecting auxiliary variables and then aggregating the outputs of these trees. This model uses bootstrap sampling to divide the training data into two categories, “in-bag” and “out-of-bag”. This study used Matlab 2016b to run the RF model and optimized its parameters by changing the number of predictor variables (mtry) and trees (ntree).
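The in-bag and out-of-bag division produced by bootstrap sampling can be illustrated with a short sketch. It is written in Python rather than Matlab and is not the study's implementation; the function name `bootstrap_split` and the fixed random seed are assumptions, while the sample count of 108 follows the number of soil samples reported above.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unseen indices are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_bag_set = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in in_bag_set]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the illustration is repeatable
in_bag, out_of_bag = bootstrap_split(108, rng)
print(len(set(in_bag)), "distinct in-bag samples,", len(out_of_bag), "out-of-bag")
```

Because sampling is done with replacement, some samples are drawn several times and others not at all; the latter form the out-of-bag set that each tree never sees during training.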
The XGBoost model, formulated by Chen et al. (2016), is predicated on Gradient Boosting Decision Trees (GBDT) and is renowned for its efficacy [48]. The XGBoost model includes regularization terms to manage complexity and curtail overfitting. In this study, critical parameters such as the learning rate (eta), maximum tree depth (maxdepth), numTrees, and maximum boosting iterations (nrounds) were tuned on the Matlab 2016b platform to enhance the model’s performance.
In this study, model parameter tuning was performed using grid search. Grid search is one of the most common parameter optimization techniques for finding the best combination of parameters in machine learning models. It traverses a predefined grid of values and evaluates every possible combination of parameters. Each combination is scored, and the best-scoring combination is retained [49].
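The exhaustive traversal performed by grid search can be sketched as follows. This is a generic Python illustration, not the Matlab procedure used in the study: the grid values and the `toy_score` function are hypothetical stand-ins for a cross-validated accuracy metric, and only the parameter names `ntree` and `mtry` come from the text.

```python
import itertools


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best-scoring one."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at ntree=500, mtry=4 (higher is better)."""
    return -((params["ntree"] - 500) / 500) ** 2 - (params["mtry"] - 4) ** 2


grid = {"ntree": [100, 500, 1000], "mtry": [2, 4, 6]}  # illustrative values
best_params, best_score = grid_search(grid, toy_score)
print(best_params, best_score)
```

In practice `score_fn` would train a model with the given parameters and return a validation metric, so the cost grows with the product of the grid sizes.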

2.3.3. Validation

Cross-validation is a robust and reliable method for model calibration and validation that splits the data into several training and testing sets so that every sample appears in the validation set exactly once. This method offers higher reliability and less bias when the number of samples is limited [50]. To assess the efficacy of the model, this research employed a ten-fold cross-validation strategy, which assesses model performance based on the average value of the accuracy metrics across folds. The coefficient of determination (R2), root mean square error (RMSE), and mean absolute error (MAE) were used to assess the model’s accuracy. The formulas for these metrics are as follows:
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}$$
$$MAE = \frac{\sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|}{n}$$
where $n$ represents the number of samples, $y_i$ and $\bar{y}$ are the SOM measured and mean values, respectively, and $\hat{y}_i$ is the SOM predicted value. In this study, the unit of RMSE and MAE is g/kg.
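The three accuracy metrics and the ten-fold partition can be written out directly from their definitions. The sketch below is a plain-Python illustration, not the study's Matlab code; the function names, the random seed, and the sample SOM values are hypothetical.

```python
import math
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and deal them into k validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical measured and predicted SOM values in g/kg.
measured = [30.0, 40.0, 50.0]
predicted = [32.0, 40.0, 47.0]
print(r2_score(measured, predicted), rmse(measured, predicted), mae(measured, predicted))

folds = k_fold_indices(108, 10)
print([len(fold) for fold in folds])
```

Each fold serves once as the validation set while the remaining nine are used for training, and the reported metrics are the averages over the ten folds.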

2.3.4. Uncertainty Analysis

To assess the uncertainty in the SOM mapping process, bootstrapping, a non-parametric resampling method, was applied with 50 bootstrap samples. The calibration dataset was randomly divided 50 times to construct predictive models, and the average of these 50 calibration results was computed. The 90% confidence interval (CI) was calculated to estimate the uncertainty and establish the potential range of the prediction results [50].
$$CIs = \bar{Y} \pm a \times \frac{SD}{\sqrt{N}}$$
$$uncertainty = \frac{CI_{upper} - CI_{lower}}{\bar{Y}}$$
where $\bar{Y}$ is the mean of SOM in 50 estimations, $SD$ is the standard deviation, and $N$ is the number of repetitions (50 in this study). $a$ is the coefficient for the 90% confidence interval (1.645). $CI_{upper}$ and $CI_{lower}$ denote the upper and lower confidence limits, respectively.
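For a single pixel, the confidence interval and relative uncertainty follow directly from the repeated predictions. The sketch below is illustrative only: it assumes the sample standard deviation is intended (the text does not say whether the sample or population form was used), uses five hypothetical predictions instead of the 50 repetitions in the study, and the function names are not from the source.

```python
import math
import statistics

Z_90 = 1.645  # coefficient for the 90% confidence interval, as given in the text


def confidence_interval(predictions, z=Z_90):
    """Return (lower, upper) limits: mean -/+ z * SD / sqrt(N)."""
    n = len(predictions)
    mean = statistics.mean(predictions)
    sd = statistics.stdev(predictions)  # ASSUMPTION: sample standard deviation
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def relative_uncertainty(predictions, z=Z_90):
    """Width of the confidence interval divided by the mean prediction."""
    lower, upper = confidence_interval(predictions, z)
    return (upper - lower) / statistics.mean(predictions)


# Hypothetical bootstrap predictions of SOM (g/kg) for one pixel.
pixel_predictions = [40.0, 42.0, 44.0, 38.0, 41.0]
print(confidence_interval(pixel_predictions))
print(relative_uncertainty(pixel_predictions))
```

Applying the same calculation to every raster cell yields an uncertainty map alongside the mean SOM map.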

2.4. Flowchart

The flowchart of this study is depicted in Figure 2. Initially, preprocessing was conducted on Sentinel-1 and Sentinel-2 images obtained from the GEE platform to produce median synthetic images for the periods of May, June to August, and September to October, and to calculate three remote sensing indices (Table 4). Along with environmental covariates data, various groups were established based on the acquisition time of the remote sensing images (Table 5) to construct the required independent variable database. Studies have shown that integrating remote sensing imagery from different crop growth stages can significantly improve the accuracy of SOM mapping [22,24,51]. In view of this, this study constructed SOM prediction models based on single median synthetic images from the crop sowing, growing, and harvesting stages, as well as their various combinations, to systematically evaluate the impact of different remote sensing image combinations on SOM prediction accuracy. Therefore, six remote sensing data groups were defined, as presented in Table 5. Subsequently, the obtained SOM data were used as the dependent variable for modeling. Four machine learning models (BRT, ET, RF, and XGBoost) were employed to construct SOM prediction models for each group. The optimal model was selected based on the evaluation of accuracy results. Finally, this model was used for SOM mapping, followed by an uncertainty analysis and accuracy assessment of the mapping results. The SOM mapping and uncertainty analysis results in this study had a spatial resolution of 30 m, representing a substantial improvement compared to the 90 m spatial resolution of the China High-Resolution National Soil Information Grid Basic Attributes SOC Dataset, thereby enabling more detailed spatial characterization.
Furthermore, some important information in this study needs to be further clarified. Firstly, it is important to note that although SOM in Northeast China shows a slow decreasing trend, its variation within typical black soil cropland areas, where natural conditions are stable and farming practices remain relatively consistent, is generally gradual and often requires several decades to exhibit significant changes [42]. Accordingly, SOM was considered relatively stable during the period from 2019 to 2024. Secondly, the use of median synthetic images helps to significantly reduce the interference of factors such as cloud cover, extreme weather, and seasonal variation. Studies have shown that, in Northeast China, generating median synthetic images using Sentinel-1 and Sentinel-2 data over a four-year period (2019–2022) can ensure both the accuracy and generalization ability of SOM prediction models [22]. Consequently, extending the time range to 2019–2024 in this study was theoretically justified. Finally, in recent years, the Chinese government has implemented a series of strict cultivated land protection policies to minimize the large-scale conversion of cultivated land to other land use types [52]. As a major grain-producing region in China, the Northeast has maintained long-term agricultural cultivation [3]. Therefore, the land use and vegetation cover patterns in the study area were considered stable during the period from 2019 to 2024. Based on the above content, the image synthesis strategy adopted in this study is both feasible and reasonable.

3. Results

3.1. SOM Characterization and Remote Sensing Response Analysis

3.1.1. Descriptive Statistics of SOM Content

The basic statistical parameters of SOM content from the 108 soil samples are presented in Figure 3. SOM content ranged from 27.927 g/kg to 70.113 g/kg, with a mean of 41.491 g/kg, a median of 40.223 g/kg, and a standard deviation of 8.391 g/kg. The coefficient of variation (CV) was 20.224%, indicating a moderate level of variability (10% < CV < 90%). The statistical results show that most sampling points have SOM content within the range of 30–50 g/kg.
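The reported coefficient of variation can be checked from the standard deviation and mean given above, since CV is the standard deviation expressed as a percentage of the mean. The function name in this small sketch is hypothetical; the two input values are those reported in the text.

```python
def coefficient_of_variation(sd, mean):
    """Coefficient of variation as a percentage: 100 * SD / mean."""
    return 100.0 * sd / mean


cv = coefficient_of_variation(sd=8.391, mean=41.491)
print(round(cv, 3))
```

The result agrees with the reported 20.224% up to rounding.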

3.1.2. SOM Content and Relationship with Optical and Radar Images

In this study, the entire sample was divided into three intervals based on the number of soil sampling points and their corresponding SOM content: <40 g/kg, 40–50 g/kg, and >50 g/kg. The number of soil samples in these three intervals is roughly equal. The average spectral reflectance and backscatter coefficients for each interval were calculated. Figure 4 shows the spectral reflectance and backscatter coefficients corresponding to different SOM contents. Figure 4a–c show the spectral reflectance characteristics of the median composite Sentinel-2 images for May, June to August, and September to October, respectively. In this study, ten Sentinel-2 bands were used, including B2, B3, B4, B5, B6, B7, B8, B8A, B11, and B12. Figure 4d–f display the backscatter coefficient characteristics of the median synthetic Sentinel-1 images for the corresponding periods. The results indicate that the spectral curve trends of soils with different SOM contents are similar. In the B2–B12 bands, the reflectance first increases and then decreases, with the overall reflectance range being approximately 0–0.4. However, there are still certain differences in specific bands and reflectance intensities. Specifically, in May, the Sentinel-2 spectral reflectance shows an increasing trend from the B2 to B11 bands (Figure 4a), while from June to August and September to October, an increasing trend is observed in the B2 to B8A bands (Figure 4b,c). In May, the spectral curves show that the higher the SOM content, the lower the spectral reflectance in the B2 to B8A bands (Figure 4a), while from September to October, the same trend is observed in the B2 to B12 bands (Figure 4c). However, the spectral curve from June to August does not exhibit a similar pattern. In addition, the backscatter coefficients of soils with different SOM contents show an increasing trend in both VH and VV polarization directions, with higher SOM content corresponding to higher backscatter coefficients (Figure 4d–f).

3.2. SOM Prediction Results Under Different Groups

3.2.1. SOM Prediction Accuracy of Optical Images

When predicting SOM based on Sentinel-2 optical remote sensing images, NDVI, and environmental variables, the modeling results of four machine learning models across six data groups are shown in Figure 5. Overall, the RF model outperforms the other models in terms of prediction accuracy. The RF model exhibits the best prediction performance in Group 3 (R2 = 0.451, RMSE = 6.615, MAE = 5.290), while it shows the lowest prediction accuracy in Group 6 (R2 = 0.383, RMSE = 7.001, MAE = 5.442). The BRT model performs best in Group 1 (R2 = 0.434, RMSE = 6.714, MAE = 5.326). The XGBoost model achieves the highest prediction accuracy in Group 2 (R2 = 0.371, RMSE = 10.288, MAE = 8.311), while the ET model performs optimally in Group 1 (R2 = 0.250, RMSE = 7.345, MAE = 6.011). It is worth noting that, although Groups 4, 5, and 6 all used multi-temporal image stacking data combinations, the prediction accuracy of these groups was not significantly better than that of Groups 1, 2, and 3 in any of the models. This suggests that, under the condition of using only Sentinel-2 optical images, multi-temporal image stacking did not significantly improve the SOM prediction accuracy of the models, with some groups even exhibiting a decrease in accuracy.

3.2.2. SOM Prediction Accuracy of Optical and Radar Images

When predicting SOM based on Sentinel-2 optical images, NDVI, Sentinel-1 radar images, NDPI, PR, and environmental variables, the modeling results of four machine learning models across six data groups are shown in Figure 6. Comparing the optimal prediction performance of each model, the RF model performs best (R2 = 0.530), followed by BRT (R2 = 0.501), XGBoost (R2 = 0.450), and ET (R2 = 0.406). In addition, the prediction accuracy of the BRT, RF, and XGBoost models in Groups 4 and 5 is generally higher than that in Groups 1 to 3, while the ET model does not exhibit a similar trend. It is worth noting that the prediction accuracy of the BRT, RF, and XGBoost models in Group 6 is lower than that in Groups 1 to 3. These results suggest that the RF model exhibits relatively high accuracy and stability in SOM prediction. Comparing the prediction results of each model for Groups 1, 2, and 3 shows that the median synthetic images obtained during the sowing period offer greater predictive value for SOM than those from the growing and harvest periods. Meanwhile, appropriately incorporating multi-temporal image stacking combinations can enhance the performance of each model, but excessive data fusion may weaken the ability of the model to predict SOM.

3.2.3. Comparison of SOM Prediction Accuracies

The comparison of SOM prediction accuracy for each model before and after incorporating Sentinel-1 radar images, NDPI, and PR as modeling variables is shown in Figure 7. With the addition of these variables, the prediction accuracy of the models improved across the groups, with one exception: the RF model showed a slight decrease in accuracy in Group 3 (ΔR2 = −1.109%). The R2 improvement ranges from 3.243% to 39.167% for the BRT model, from 14.010% to 103% for the ET model, from 15.049% to 29.268% for the RF model, and from 7.397% to 25% for the XGBoost model (Figure 7a). As R2 increases for each model, RMSE and MAE generally decrease, indicating a simultaneous reduction in model prediction errors (Figure 7b,c). These results demonstrate that integrating multi-source remote sensing data with environmental variables significantly enhances the accuracy and stability of SOM prediction.
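The percentage changes above read as relative changes in each metric. The sketch below assumes that definition, ΔR2 = (R2 after − R2 before) / R2 before × 100, which this section does not state explicitly. The "before" value is the RF Group 3 result from Section 3.2.1; the "after" value of 0.446 is an assumed figure used only to illustrate the arithmetic, and the function name is hypothetical.

```python
def relative_change_pct(before, after):
    """Relative change of a metric, as a percentage of its starting value."""
    if before == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after - before) / before * 100.0


r2_optical = 0.451     # RF, Group 3, optical-only inputs (Section 3.2.1)
r2_with_radar = 0.446  # assumed value, for illustration only

delta = relative_change_pct(r2_optical, r2_with_radar)
print(f"{delta:.3f}%")  # -1.109%
```

Under this definition, an improvement of 103% means that R2 roughly doubled, which is possible only when the optical-only baseline was low.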

3.3. Optimal SOM Prediction Models for Different Groups

Based on Figures 5–7, this study further analyzes the robustness of each model in its optimal prediction scenario. At the highest SOM prediction accuracy for each model, the variability of the accuracy assessment metrics during the ten-fold cross-validation process is shown in Table 6 and Figure 8, and the detailed accuracy validation results for each model are provided in Figures S1–S4 and Tables S1–S4 in the Supplementary Materials. The R2 ranges from 0.216 to 0.744 for the BRT model, from 0.013 to 0.654 for the ET model, from 0.387 to 0.800 for the RF model, and from 0.003 to 0.731 for the XGBoost model. These results suggest that the R2 of the ET and XGBoost models fluctuates most, while the R2 variability is smallest for the RF model. The RMSE and MAE of the BRT, ET, and XGBoost models also vary more, while these two metrics are relatively stable for the RF model. Overall, in Group 4, which integrates optical images, radar images, remote sensing indices, and environmental covariates, the RF model exhibits stronger robustness.
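The fold-to-fold ranges quoted above can be obtained by computing each metric per fold and taking the minimum and maximum. The following is a minimal sketch under stated assumptions: R2 is taken as the coefficient of determination (the paper may instead use the squared correlation), the observed and predicted values are hypothetical, and only two folds are shown rather than ten.

```python
from math import sqrt


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize_folds(folds):
    """Return the (min, max) of each metric across cross-validation folds."""
    scores = {"R2": [], "RMSE": [], "MAE": []}
    for y_true, y_pred in folds:
        scores["R2"].append(r2_score(y_true, y_pred))
        scores["RMSE"].append(rmse(y_true, y_pred))
        scores["MAE"].append(mae(y_true, y_pred))
    return {name: (min(vals), max(vals)) for name, vals in scores.items()}


# Hypothetical (observed, predicted) SOM values in g/kg for two folds.
folds = [
    ([30.0, 40.0, 50.0, 60.0], [32.0, 41.0, 48.0, 57.0]),
    ([35.0, 45.0, 55.0, 65.0], [45.0, 44.0, 50.0, 52.0]),
]

summary = summarize_folds(folds)
for name, (low, high) in summary.items():
    print(f"{name}: {low:.3f} to {high:.3f}")
```

A narrow range for all three metrics, as reported for the RF model, indicates that performance depends little on which samples happen to fall in the validation fold.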

3.4. SOM Mapping and Uncertainty Analysis

3.4.1. SOM Mapping Based on the Optimal Model and Group

Based on the RF model with the highest prediction accuracy, the SOM mapping result for the study area is shown in Figure 9a, and the mean SOM mapping result obtained during the uncertainty analysis is shown in Figure 9b. The SOM mapping and the mean SOM mapping exhibit highly similar spatial distribution patterns, both displaying a gradual decrease in SOM content from northwest to southeast along the elevation gradient. This spatial trend along topographical variations suggests that areas with higher elevations and steeper slopes in the study area have higher SOM content, while areas with lower elevations and relatively flat terrain tend to have lower SOM content. In addition, the SOM mapping values range from 27.810 to 64.020 g/kg, while the mean SOM mapping values range from 29.120 to 63.030 g/kg. The small difference between the two further confirms that the RF model demonstrates strong stability and consistency when performing SOM mapping. Based on the overall spatial distribution characteristics of SOM and the local zoomed-in results in Figure 9(1–6), the distinction between high and low SOM value areas in the study area is clear, with natural transitions between regions. Moreover, SOM content differences at the field scale are effectively identified, and the results show a high spatial resolution.

3.4.2. Uncertainty Analysis of SOM Mapping

The uncertainty analysis results for SOM mapping are presented in Figure 10. Overall, the uncertainty level of the SOM mapping in the study area is low, indicating that the constructed RF model has strong spatial generalization ability. In terms of spatial distribution, the SOM uncertainty and standard deviation in the western part of the study area are generally higher than in the eastern part. Notably, Figure 10a,b show that areas with sparse samples or significant terrain variations, particularly in the southwest and parts of the eastern region, exhibit significantly higher uncertainty levels than other areas.
In terms of numerical range, the uncertainty analysis results range from 1.050 to 8.100 g/kg, and the standard deviation ranges from 0.730 to 4.950 g/kg. This indicates that, even in regions with higher uncertainty, the prediction results of the RF model still fall within an acceptable fluctuation range. In addition, the relatively flat areas exhibit lower levels of uncertainty, showing that RF predictions are more stable and reliable. The zoomed-in results show that the SOM uncertainty and standard deviation within contiguous cultivated land are significantly lower than those at the edges of cultivated land adjacent to other land use types (Figure 10(1–6)). This suggests that the spatially contiguous and intact cultivated land structure helps improve the reliability of SOM mapping.
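To make the per-pixel quantities concrete, the sketch below derives a mean, a standard deviation, and an interval width from repeated predictions of the same pixel. It is an illustration under assumptions rather than the procedure used in this study: the uncertainty measure is assumed here to be the width between the 5th and 95th percentiles of the repeated predictions, the prediction values are hypothetical, and `pixel_uncertainty` is a hypothetical helper.

```python
from statistics import mean, quantiles, stdev


def pixel_uncertainty(predictions):
    """Summarize repeated SOM predictions (g/kg) for a single pixel."""
    # 19 cut points at 5% steps; the first and last are the 5th and 95th percentiles.
    cuts = quantiles(predictions, n=20, method="inclusive")
    return {
        "mean": mean(predictions),
        "std": stdev(predictions),
        "width_90": cuts[-1] - cuts[0],  # assumed uncertainty measure
    }


# Hypothetical repeated predictions from five model runs for two pixels.
runs_per_pixel = {
    "interior_pixel": [41.0, 42.0, 41.5, 42.5, 42.0],
    "edge_pixel": [35.0, 44.0, 39.0, 47.0, 40.0],
}

for name, runs in runs_per_pixel.items():
    stats = pixel_uncertainty(runs)
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Applied to every pixel, the `mean` values form a map comparable to Figure 9b, while `std` and `width_90` form maps of the kind shown in Figure 10.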

4. Discussion

4.1. Influence of Crop Growth Information on SOM Prediction

Studies on SOM remote sensing inversion in Northeast China have confirmed that when satellites observe bare soils, the soil spectral reflectance is significantly negatively correlated with SOM content, meaning that higher SOM content is associated with lower spectral reflectance. Most related studies have conducted quantitative inversion analysis based on this spectral response characteristic [42,53]. In recent years, with the widespread adoption of conservation tillage in Northeast China, the duration of bare soil exposure has significantly decreased. This makes it difficult for remote sensing satellites to acquire effective bare soil information [54,55]. In this study area, during September to October each year, after crops mature and are harvested, the cultivated land is left with straw and crop residues. However, these residues do not completely cover the soil surface. By the following May, in order to facilitate cultivation, farmers typically carry out some degree of clearing, and some areas may even be plowed, but this does not mean that all the cultivated land soil is completely bare. Therefore, although remote sensing satellites captured a small amount of effective soil information in May and from September to October, producing a noticeable negative correlation between soil spectral reflectance and SOM content during these periods (Figure 4), relying solely on this correlation cannot achieve sufficient accuracy in SOM prediction.
From June to August each year, as crops continue to grow, their stems and leaves gradually cover the soil surface. During this period, remote sensing satellites can only capture crop-related information [56]. Studies have shown that there is a significant correlation between crop growth status and SOM content, and that remote sensing images acquired during the crop growing period can be used for SOM prediction [43,57]. When acquiring crop growth information through remote sensing satellites, the focus is typically on biophysical parameters related to greenness, such as chlorophyll content, vegetation biomass, and leaf area index. These parameters can directly or indirectly reflect the growth vitality and photosynthetic capacity of crops [58,59]. In agricultural production, various soil nutrient elements affect crop growth, and soil TN is one of the key elements influencing crop development [51]. In this study, quantitative analysis revealed a significant correlation between soil TN and SOM content (Figure 11). Therefore, monitoring crop growth information can indirectly reflect SOM content. This further supports the rationale for combining remote sensing images from different crop growth stages for SOM prediction in this study.

4.2. Influencing Factors of SOM Spatial Distribution

Studies have found that in Northeast China, SOM content generally increases from south to north [42,60]. The spatial distribution of SOC in the study area is shown in Figure 12. This distribution closely mirrors the SOM spatial distribution in Figure 9, with both increasing from southeast to northwest along the elevation gradient. Although the SOC data cover the period from 2010 to 2018, their agreement with the SOM mapping results supports the reliability of the findings in this study.
The variation in SOM content is primarily influenced by both natural factors and human activities [61,62]. In Northeast China, temperature has a more significant impact on SOM content, while precipitation has a relatively weaker effect [63]. At the same time, the elevation in the central and western regions of Northeast China is generally high in the west and low in the east, while the annual average temperature is high in the south and low in the north [60,64]. Similar climatic and topographic characteristics are also observed in this study area. In areas with lower average annual temperatures, soil microbial metabolic activities are inhibited, which slows the decomposition of nutrient substances in the soil and promotes the accumulation of SOM [65]. Therefore, with increasing latitude and elevation, the SOM content in the northwestern part of this study area is higher than that in the southeastern part.
Additionally, human activities may be one of the factors contributing to this result. In recent years, the area of cultivated land in Northeast China has continued to expand, a phenomenon that is particularly noticeable in the mountainous and hilly regions [66,67]. In terms of land use intensity, the flatter regions have a longer history of intensive agricultural use and a higher intensity, reflecting the pattern of human agricultural activities gradually expanding from low to high altitudes [68]. Long-term high-intensity use has led to a decreasing trend in the surface SOM content of cultivated land in Northeast China [69]. Therefore, intensive agricultural production in flat areas accelerates the depletion of soil nutrients, which is unfavorable for SOM accumulation. In contrast, cultivated land in mountainous areas, with relatively lower cultivation intensity, is more conducive to SOM accumulation and preservation [60]. This difference ultimately results in a spatial distribution pattern in which SOM content in high-altitude farmland in mountainous areas is generally higher than in flat regions at lower altitudes.

4.3. Application of Multi-Source Remote Sensing Data on SOM Predictions

With the continuous development of satellite sensor technology and cloud computing platforms, an increasing amount of remote sensing imagery and other basic geographic information data can be quickly accessed. Fusing multi-source data to enhance the accuracy of SOM mapping has thus become a mainstream research trend in related fields. Currently, the bare soil period of cultivated land in Northeast China has significantly shortened, posing considerable challenges for traditional studies that rely on single-temporal optical imagery for SOM prediction and mapping. To address this issue, this study employed multi-temporal Sentinel-1 radar and Sentinel-2 optical remote sensing data to generate median synthetic images and calculate vegetation-related remote sensing indices. These data captured information that directly or indirectly reflects cultivated land SOM under different surface cover conditions. In particular, radar imagery offers all-weather and all-day observational advantages. Vegetation-related remote sensing indices also made it possible to indirectly capture information related to soil structure, nutrients, and moisture during periods of dense crop cover, helping to compensate for the lack of bare soil data. Consequently, the SOM prediction and mapping results in this study were relatively reliable. Group 4 fused optical images, radar images, remote sensing indices, and environmental variables, and with this group the RF model achieved the highest SOM prediction performance (Figure 6 and Figure 9).
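The idea behind a median synthetic image can be shown on a toy stack of scenes. This is a minimal sketch that assumes masked observations (for example cloud) are stored as `None` and skipped; the reflectance values and the function name `median_composite` are hypothetical, and it does not reproduce the GEE workflow used in the study.

```python
from statistics import median


def median_composite(image_stack):
    """Per-pixel median over a time stack; None marks a masked observation."""
    n_rows, n_cols = len(image_stack[0]), len(image_stack[0][0])
    composite = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            valid = [img[r][c] for img in image_stack if img[r][c] is not None]
            # A pixel with no valid observation stays masked in the composite.
            row.append(median(valid) if valid else None)
        composite.append(row)
    return composite


# Three hypothetical 2 x 2 scenes from the same period (one band).
scenes = [
    [[0.10, 0.12], [None, 0.20]],
    [[0.11, 0.50], [0.15, 0.21]],  # 0.50 mimics an undetected bright outlier
    [[0.09, 0.13], [0.16, None]],
]

for row in median_composite(scenes):
    print(row)
```

Because the median ignores extreme values, the outlier of 0.50 does not propagate into the composite, which is one reason median compositing is commonly preferred over averaging for multi-date imagery.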
The relative importance of variables for the optimal RF model in SOM prediction is shown in Figure 13. Studies have shown that SOM content in Northeast China is significantly influenced by temperature, precipitation, and altitude [60,61,63]. In this study, the VH polarization was found to be more important than the VV polarization for SOM mapping (Figure 13). This may be due to two main reasons. First, SAR possesses all-weather earth observation capabilities and certain penetration properties [70]. Second, soils with higher SOM content typically exhibit stronger water retention, and VH polarization data perform better than VV polarization data in predicting soil moisture [71,72], which may make VH data more sensitive to SOM. It is noteworthy that the combination of backscatter coefficients from Sentinel-1 satellite imagery and the VH/VV ratio has demonstrated higher accuracy than NDVI in monitoring biophysical parameters during the peak growth period of crops [40]. This is consistent with the finding that NDPI and PR are more important than NDVI in SOM prediction.
Due to the absorption of infrared frequencies by polar covalent bonds (such as OH, CH, and NH functional groups) in SOM [73,74], soils with higher SOM content exhibit stronger spectral absorption, resulting in lower soil spectral reflectance. This leads to a significant correlation between SOM content and reflectance in the visible to shortwave infrared (350–2500 nm) spectral bands. In this study, the relative importance of B2, B3, and B11 was high in SOM prediction, which is broadly consistent with the results of previous studies [16,43,75]. The results of this study show that integrating multi-source data, including optical images, radar images, various remote sensing indices, and environmental covariates, for collaborative modeling can significantly improve the accuracy of SOM mapping.
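For readers unfamiliar with the indices compared here, the sketch below computes them for a single pixel. NDVI follows its standard definition with the Sentinel-2 near-infrared (B8) and red (B4) bands. The forms used for NDPI, (VV − VH)/(VV + VH), and PR, VH/VV, are common definitions assumed for illustration and may differ from those used in this study; backscatter is assumed to be converted from dB to linear power first, and all input values are hypothetical.

```python
def db_to_linear(value_db):
    """Convert a backscatter coefficient from decibels to linear power."""
    return 10 ** (value_db / 10.0)


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def ndpi(vv, vh):
    """Normalized difference polarization index (assumed form, linear units)."""
    return (vv - vh) / (vv + vh)


def polarization_ratio(vv, vh):
    """Cross- to co-polarization ratio VH/VV (assumed form, linear units)."""
    return vh / vv


# Hypothetical pixel: Sentinel-2 reflectance and Sentinel-1 backscatter in dB.
b8, b4 = 0.40, 0.10
vv = db_to_linear(-10.0)
vh = db_to_linear(-17.0)

print(f"NDVI = {ndvi(b8, b4):.3f}")
print(f"NDPI = {ndpi(vv, vh):.3f}")
print(f"PR   = {polarization_ratio(vv, vh):.3f}")
```

Unlike NDVI, the two radar indices are built only from backscatter, so they remain available under cloud cover.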

4.4. Accuracy of SOM Prediction by Machine Learning Models

Past studies have verified that the RF, ET, BRT, and XGBoost models achieve adequate accuracy in predicting SOM [51]. The RF model is a “forest” created by combining many decision trees. Each tree is produced by randomly selecting samples, features, or both from the training data. When RF is applied to regression, the model outputs the average of the predictions of all the trees [76]. By averaging the predictions across multiple trees, it effectively captures the overall structure and trend in the data, enhancing the generalization ability of the model. The variance caused by overfitting in individual trees is dispersed and mitigated; hence, the RF model is resistant to overfitting [77].
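The averaging step can be illustrated with a deliberately small ensemble. The sketch below is not an RF implementation: each "tree" is a one-split stump on a single hypothetical feature, so only bootstrap sampling of the training samples is shown and random feature selection is omitted. The data and all function names are hypothetical.

```python
import random
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree: (threshold, left_mean, right_mean)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not left or not right:
            continue
        l_mean, r_mean = mean(left), mean(right)
        sse = sum((y - l_mean) ** 2 for y in left)
        sse += sum((y - r_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, l_mean, r_mean)
    if best is None:  # every x is identical: predict the mean everywhere
        overall = mean(ys)
        return (float("inf"), overall, overall)
    return best[1:]


def predict_stump(stump, x):
    threshold, l_mean, r_mean = stump
    return l_mean if x <= threshold else r_mean


def fit_forest(xs, ys, n_trees=50, seed=0):
    """Train each stump on its own bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest


def predict_forest(forest, x):
    """Regression output of the ensemble: the average over all trees."""
    return mean(predict_stump(stump, x) for stump in forest)


# Hypothetical data: mean annual temperature (degrees C) and SOM (g/kg).
temperature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
som = [62.0, 58.0, 55.0, 50.0, 46.0, 41.0, 37.0, 33.0]

forest = fit_forest(temperature, som)
for t in (1.0, 4.5, 8.0):
    print(t, round(predict_forest(forest, t), 2))
```

A single stump can output only two values, whereas the averaged ensemble varies more gradually with the input because each bootstrap sample places the split slightly differently. This is the smoothing effect that the text credits with improving generalization.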
Apart from that, the RF model can handle a large number of features and tolerates noise and outliers. It works well on nonlinear datasets [78]. Given the characteristics of the dataset and models used in this study, the BRT model may have slightly lower resistance to overfitting in regression tasks than the RF model [79]. Unlike the RF model, the BRT model is built over all samples of the training set, while each individual regression tree is trained on only one randomly sampled subset. Like XGBoost, BRT is also sensitive to noise and outliers [80]. In this study, these reasons may explain why, although the BRT, XGBoost, and ET models achieved relatively satisfactory SOM prediction accuracy, their performance was still less robust than that of the RF model [30,48,81].

4.5. Limitations and Future Research Progress

This study has made some progress in the remote sensing inversion of SOM, but several issues still need to be addressed. Firstly, this study collected relatively rich SOM sample data, providing a solid foundation for model training and validation. However, the spatial distribution of the samples remains somewhat uneven, particularly in complex terrain or marginal areas where sample density is relatively low, which may affect the prediction performance of the model in such areas. In the future, consideration should be given to increasing the number of samples in complex terrain areas to enhance the generalization ability of the model. Secondly, this study utilized C-band radar data from the Sentinel-1 satellite for modeling and analysis but did not incorporate L-band or X-band radar data. Given that different radar bands have varying sensitivities to surface information, future studies could integrate multi-source radar data to improve the accuracy and stability of SOM prediction models. Thirdly, studies have demonstrated that applying wavelet transforms, Savitzky-Golay smoothing, and Fourier transforms to spectral reflectance can improve the accuracy of soil property predictions [18,82]. Consequently, subsequent studies should devise methods that combine multi-source remote sensing data more effectively, thus enhancing the efficiency of data utilization.
Moreover, both previous studies [22] and the findings of this study indicate that long-term average climatic variables play a crucial and positive role in SOM prediction. Nevertheless, we also recognize that climatic averages over different temporal scales may exhibit varying sensitivities in relation to SOM prediction performance. Future research is needed to further investigate the differences in climatic variables across different temporal scales and their impacts on SOM prediction accuracy in order to determine a more representative and applicable temporal range for climate-related input variables.
Finally, although surface vegetation interference is relatively low during the winter months (e.g., November to March), the widespread presence of snow cover in Northeast China poses a significant challenge [38]. In the study area, most surfaces during this period are affected by varying degrees of snow or freezing conditions, which substantially interfere with the backscatter of Sentinel-1 imagery and the surface reflectance signals of Sentinel-2 imagery. Meanwhile, the current GEE platform still has certain limitations in the availability and quality control of winter imagery. A large number of images are affected by snow cover, surface freezing, and low solar elevation angles, making it difficult to meet research needs. Future research is needed to further explore the feasibility of using high-quality winter imagery for SOM inversion and mapping, particularly by incorporating multi-source radar data for complementary analysis.

5. Conclusions

This study was conducted in a typical black soil area, where median synthetic images from different time periods (crop sowing, growing, and harvest stages) from Sentinel-1 and Sentinel-2 were used. By integrating terrain and climate covariates with multi-source data groups, SOM prediction was conducted. The best-performing model and data group were selected for SOM mapping. The results indicate that, although cultivated land soil is not fully bare before sowing and after harvest, both optical and radar remote sensing satellites can capture a small amount of effective soil information. By integrating optical images, radar images, vegetation indices, radar indices, and environmental covariates, the accuracy of SOM mapping can be significantly improved. In this study, a single median synthetic image obtained during the sowing period offers greater predictive value for SOM than those obtained during the growing and harvest periods. It is worth noting that, when combining multi-temporal data, overly complex data fusion may lead to data redundancy, which could, in turn, hinder the improvement of SOM prediction accuracy.
Compared to single-image data, variables related to crop growth derived from multi-temporal Sentinel-1 and Sentinel-2 satellite images offer a more comprehensive description of the spatial variation of SOM. This suggests that using crop growth information as modeling variables can effectively improve the accuracy of SOM mapping. Among the various types of vegetation information involved in SOM mapping, the relative importance of the NDPI and PR indices obtained from radar imagery is greater than that of the NDVI derived from optical imagery. This provides new insights for SOM remote sensing monitoring in agricultural areas with short bare soil periods. In addition, this study also analyzed the impact of climate and topographic covariates on the accuracy of SOM mapping. The study area primarily covers plains and mountainous regions; thus, the importance of climate and elevation is relatively high. This study extends the application of optical and radar remote sensing data in SOM mapping. In the future, fully exploiting the potential of different types of remote sensing imagery is expected to enable more precise SOM mapping.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17172929/s1, Table S1: Ten-fold cross-validation statistics of the BRT model at its highest SOM prediction accuracy; Table S2: Ten-fold cross-validation statistics of the ET model at its highest SOM prediction accuracy; Table S3: Ten-fold cross-validation statistics of the RF model at its highest SOM prediction accuracy; Table S4: Ten-fold cross-validation statistics of the XGBoost model at its highest SOM prediction accuracy; Figure S1: Scatter plot of ten-fold cross-validation for the BRT model at its highest SOM prediction accuracy; Figure S2: Scatter plot of ten-fold cross-validation for the ET model at its highest SOM prediction accuracy; Figure S3: Scatter plot of ten-fold cross-validation for the RF model at its highest SOM prediction accuracy; Figure S4: Scatter plot of ten-fold cross-validation for the XGBoost model at its highest SOM prediction accuracy.

Author Contributions

Conceptualization: X.K.; methodology: W.Z. and W.C.; investigation: W.Z., Z.Z. and E.X.; visualization: W.Z., D.Y. and R.Z.; funding acquisition: X.K.; project administration: X.K.; supervision: X.K.; writing—original draft preparation: X.K., W.Z. and T.X.; writing—review and editing: X.K., W.Z., L.L. and L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the following funds: (1) “Inner Mongolia science and technology promotion action” key project (NMKJXM202303). (2) National Natural Science Foundation of China (42171289). (3) Science and Technology Basic Resources Survey Project (2021FY100403).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We are grateful to Jing Zhao and Nanping Pu for editing the manuscript.

Conflicts of Interest

The authors declare that they have no competing interests. The content is solely the responsibility of the authors and does not necessarily represent any official views.

References

  1. Kong, X.B. China must protect high-quality arable land. Nature 2014, 506, 7.
  2. Pravalie, R. Exploring the multiple land degradation pathways across the planet. Earth-Sci. Rev. 2021, 220, 103689.
  3. Liu, H.J.; Wan, W.; Zheng, M.D.; Li, J.W.; Liu, S.W.; Lv, W.; Zhou, Y.X.; Liu, Z. Study on climate suitability for maize and technical implementation strategies under conservation tillage in Northeast China. Soil Tillage Res. 2025, 249, 106473.
  4. Cui, J.W.; Yang, B.G.; Xu, X.P.; Ai, C.; Zhou, W. Long-term maize-soybean rotation in Northeast China: Impact on soil organic matter stability and microbial decomposition. Plant Soil 2025, 507, 141–158.
  5. Li, X.Y.; Shi, Z.Y.; Xing, Z.H.; Wang, M.; Wang, M.C. Dynamic evaluation of cropland degradation risk by combining multi-temporal remote sensing and geographical data in the Black Soil Region of Jilin Province, China. Appl. Geogr. 2023, 154, 102920.
  6. Cai, S.S.; Sun, L.; Wang, W.; Li, Y.; Ding, J.L.; Jin, L.; Li, Y.M.; Zhang, J.M.; Wang, J.K.; Wei, D. Straw mulching alters the composition and loss of dissolved organic matter in farmland surface runoff by inhibiting the fragmentation of soil small macroaggregates. J. Integr. Agric. 2024, 23, 1703–1717.
  7. Wang, L.; Qi, S.J.; Gao, W.F.; Luo, Y.; Hou, Y.P.; Liang, Y.; Zheng, H.B.; Zhang, S.M.; Li, R.P.; Wang, M.; et al. Eight-year tillage in black soil, effects on soil aggregates, and carbon and nitrogen stock. Sci. Rep. 2023, 13, 8332.
  8. Pan, Y.; Zhang, X.L.; Liu, H.J.; Wu, D.Q.; Dou, X.; Xu, M.Y.; Jiang, Y. Remote sensing inversion of soil organic matter by using the subregion method at the field scale. Precis. Agric. 2022, 23, 1813–1835.
  9. Sun, Q.Q.; Zhang, P.; Jiao, X.; Lun, F.; Dong, S.W.; Lin, X.; Li, X.Y.; Sun, D.F. A Remotely Sensed Framework for Spatially-Detailed Dryland Soil Organic Matter Mapping: Coupled Cross-Wavelet Transform with Fractional Vegetation and Soil-Related Endmember Time Series. Remote Sens. 2022, 14, 1701.
  10. Zhang, M.W.; Wang, X.Q.; Ding, X.G.; Yang, H.L.; Guo, Q.; Zeng, L.T.; Cui, Y.P.; Sun, X.L. Monitoring regional soil organic matter content using a spatiotemporal model with time-series synthetic Landsat images. Geoderma Reg. 2023, 34, e00702.
  11. Song, J.; Yu, D.S.; Wang, S.W.; Zhao, Y.H.; Wang, X.; Ma, L.X.; Li, J.G. Mapping soil organic matter in cultivated land based on multi composite images on monthly time scales. J. Integr. Agric. 2024, 23, 1393–1408.
  12. Wang, X.X.; Han, J.G.; Wang, X.; Yao, H.Y.; Zhang, L. Estimating Soil Organic Matter Content Using Sentinel-2 Imagery by Machine Learning in Shanghai. IEEE Access 2021, 9, 78215–78225.
  13. Ayari, E.; Kassouk, Z.; Lili-Chabaane, Z.; Baghdadi, N.; Bousbih, S.; Zribi, M. Cereal Crops Soil Parameters Retrieval Using L-Band ALOS-2 and C-Band Sentinel-1 Sensors. Remote Sens. 2021, 13, 1393.
  14. Lee, J.H.; Walker, J. Inversion of soil roughness for estimating soil moisture from time-series Sentinel-1 backscatter observations over Yanco sites. Geocarto Int. 2022, 37, 1850–1862.
  15. He, Y.J.; Yin, H.Y.; Chen, Y.W.; Xiang, R.; Zhang, Z.T.; Chen, H.Y. Soil Salinity Estimation Based on Sentinel-1/2 Texture Features and Machine Learning. IEEE Sens. J. 2024, 24, 15302–15310.
  16. Li, Z.W.; Liu, F.; Peng, X.Y.; Hu, B.G.; Song, X.D. Synergetic use of DEM derivatives, Sentinel-1 and Sentinel-2 data for mapping soil properties of a sloped cropland based on a two-step ensemble learning method. Sci. Total Environ. 2023, 866, 161421.
  17. Dodin, M.; Levavasseur, F.; Savoie, A.; Martin, L.; Vaudour, E. Farm-scale mapping of compost and digestate spreadings from Sentinel-2 and Sentinel-1. Int. J. Appl. Earth Obs. Geoinf. 2025, 139, 104555.
  18. Yang, C.C.H.; Yang, L.; Zhang, L.; Zhou, C.H. Soil organic matter mapping using INLA-SPDE with remote sensing based soil moisture indices and Fourier transforms decomposed variables. Geoderma 2023, 437, 116571.
  19. Zhang, Y.; Luo, C.; Zhang, Y.H.; Gao, L.R.; Wang, Y.H.; Wu, Z.X.; Zhang, W.Q.; Liu, H.J. Integration of bare soil and crop growth remote sensing data to improve the accuracy of soil organic matter mapping in black soil areas. Soil Tillage Res. 2024, 244, 106269.
  20. Rukhovich, D.; Koroleva, P.; Rukhovich, A.; Komissarov, M. A detailed mapping of soil organic matter content in arable land based on the multitemporal soil line coefficients and neural network filtering of big remote sensing data. Geoderma 2024, 447, 116941.
  21. Shi, P.; Six, J.; Sila, A.; Vanlauwe, B.; Van Oost, K. Towards spatially continuous mapping of soil organic carbon in croplands using multitemporal Sentinel-2 remote sensing. ISPRS J. Photogramm. Remote Sens. 2022, 193, 187–199.
  22. Luo, C.; Zhang, W.Q.; Zhang, X.L.; Liu, H.J. Mapping soil organic matter content using Sentinel-2 synthetic images at different time intervals in Northeast China. Int. J. Digit. Earth 2023, 16, 1094–1107.
  23. Rizzo, R.; Medeiros, L.G.; de Mello, D.C.; Marques, K.P.P.; Mendes, W.D.; Silvero, N.E.Q.; Dotto, A.C.; Bonfatti, B.R.; Demattê, J.A.M. Multi-temporal bare surface image associated with transfer functions to support soil classification and mapping in southeastern Brazil. Geoderma 2020, 361, 114018.
  24. Luo, C.; Zhang, W.Q.; Zhang, X.L.; Liu, H.J. Mapping of soil organic matter in a typical black soil area using Landsat-8 synthetic images at different time periods. Catena 2023, 231, 107336.
  25. Mallick, J.; Ahmed, M.; Alqadhi, S.D.; Falqi, I.I.; Parayangat, M.; Singh, C.K.; Rahman, A.; Ijyas, T. Spatial stochastic model for predicting soil organic matter using remote sensing data. Geocarto Int. 2022, 37, 413–444.
  26. Belenok, V.; Hebryn-Baidy, L.; Bielousova, N.; Zavarika, H.; Kryachok, S.; Liashenko, D.; Malik, T. Application of remote sensing methods for statistical estimation of organic matter in soils. Earth Sci. Res. J. 2023, 27, 299–312.
  27. Wang, J.W.; Feng, C.H.; Hu, B.F.; Chen, S.C.; Hong, Y.S.; Arrouays, D.; Peng, J.; Shi, Z. A novel framework for improving soil organic matter prediction accuracy in cropland by integrating soil, vegetation and human activity information. Sci. Total Environ. 2023, 903, 166112.
  28. Zhou, W.; Xiao, J.Y.; Li, H.R.; Chen, Q.; Wang, T.; Wang, Q.; Yue, T.X. Soil organic matter content prediction using Vis-NIRS based on different wavelength optimization algorithms and inversion models. J. Soils Sediments 2023, 23, 2506–2517.
  29. Emadi, M.; Taghizadeh-Mehrjardi, R.; Cherati, A.; Danesh, M.; Mosavi, A.; Scholten, T. Predicting and Mapping of Soil Organic Carbon Using Machine Learning Algorithms in Northern Iran. Remote Sens. 2020, 12, 2234.
  30. Lu, Q.K.; Tian, S.; Wei, L.F. Digital mapping of soil pH and carbonates at the European scale using environmental variables and machine learning. Sci. Total Environ. 2023, 856, 159171.
  31. Yang, J.; Huang, X. The 30 m annual land cover dataset and its dynamics in China from 1990 to 2019. Earth Syst. Sci. Data 2021, 13, 3907–3925.
  32. Zhang, B.; Guo, B.; Zou, B.; Wei, W.; Lei, Y.Z.; Li, T.Q. Retrieving soil heavy metals concentrations based on GaoFen-5 hyperspectral satellite image at an opencast coal mine, Inner Mongolia, China. Environ. Pollut. 2022, 300, 118981.
  33. O’Kelly, B.C. Accurate determination of moisture content of organic soils using the oven drying method. Dry. Technol. 2004, 22, 1767–1776.
  34. Pribyl, D.W. A critical review of the conventional SOC to SOM conversion factor. Geoderma 2010, 156, 75–83.
  35. NY/T 1121.6-2006; Soil Testing—Part 6: Method for Determination of Soil Organic Matter. Ministry of Agriculture of the People’s Republic of China: Beijing, China, 2006.
  36. HJ 717-2014; Soil Quality—Determination of Total Nitrogen—Modified Kjeldahl Method. Ministry of Environmental Protection of the People’s Republic of China: Beijing, China, 2014.
  37. Luo, C.; Liu, H.J.; Lu, L.P.; Liu, Z.R.; Kong, F.C.; Zhang, X.L. Monthly composites from Sentinel-1 and Sentinel-2 images for regional major crop mapping with Google Earth Engine. J. Integr. Agric. 2021, 20, 1944–1957. [Google Scholar] [CrossRef]
  38. Wu, R.G.; Zhao, P.; Liu, G. Change in the contribution of spring snow cover and remote oceans to summer air temperature anomaly over Northeast China around 1990. J. Geophys. Res. Atmos. 2014, 119, 663–676. [Google Scholar] [CrossRef]
  39. Cao, Y.G.; Yan, L.J.; Zheng, Z.Z. Extraction of information on geology hazard from multi-polarization SAR images. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2008, XXXVII, 1529–1532. [Google Scholar]
  40. Veloso, A.; Mermoz, S.; Bouvet, A.; Le Toan, T.; Planells, M.; Dejoux, J.F.; Ceschia, E. Understanding the temporal behavior of crops using Sentinel-1 and Sentinel-2-like data for agricultural applications. Remote Sens. Environ. 2017, 199, 415–426. [Google Scholar] [CrossRef]
  41. Wittke, S.; Yu, X.W.; Karjalainen, M.; Hyyppä, J.; Puttonen, E. Comparison of two-dimensional multitemporal Sentinel-2 data with three-dimensional remote sensing data sources for forest inventory parameter estimation over a boreal forest. Int. J. Appl. Earth Obs. Geoinf. 2019, 76, 167–178. [Google Scholar] [CrossRef]
  42. Wang, X.; Li, S.J.; Wang, L.P.; Zheng, M.; Wang, Z.M.; Song, K.S. Effects of cropland reclamation on soil organic carbon in China’s black soil region over the past 35 years. Glob. Change Biol. 2023, 29, 5460–5477. [Google Scholar] [CrossRef] [PubMed]
  43. Luo, C.; Zhang, W.Q.; Zhang, X.L.; Liu, H.J. Mapping the soil organic matter content in a typical black-soil area using optical data, radar data and environmental covariates. Soil Tillage Res. 2024, 235, 105912. [Google Scholar] [CrossRef]
  44. Soil SubCenter. National Earth System Science Data Center, National Science & Technology Infrastructure of China. Available online: http://soil.geodata.cn (accessed on 1 June 2025).
  45. Friedman, J.; Hastie, T.; Tibshirani, R. Additive logistic regression: A statistical view of boosting. Ann. Stat. 2000, 28, 337–374. [Google Scholar] [CrossRef]
  46. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42. [Google Scholar] [CrossRef]
  47. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  48. Chen, T.Q.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794. [Google Scholar]
  49. Zhang, W.J.; Zhu, L.; Zhuang, Q.F.; Chen, D.; Sun, T. Mapping Cropland Soil Nutrients Contents Based on Multi-Spectral Remote Sensing and Machine Learning. Agriculture 2023, 13, 1592. [Google Scholar] [CrossRef]
  50. Wang, N.; Peng, J.; Chen, S.C.; Huang, J.Y.; Li, H.Y.; Biswas, A.; He, Y.; Shi, Z. Improving remote sensing of salinity on topsoil with crop residues using novel indices of optical and microwave bands. Geoderma 2022, 422, 115935. [Google Scholar] [CrossRef]
  51. Geng, J.; Tan, Q.Y.; Lv, J.W.; Fang, H.J. Assessing spatial variations in soil organic carbon and C:N ratio in Northeast China’s black soil region: Insights from Landsat-9 satellite and crop growth information. Soil Tillage Res. 2024, 235, 105897. [Google Scholar] [CrossRef]
  52. Qie, L.; Pu, L.J.; Tang, P.F.; Liu, R.J.; Huang, S.H.; Xu, F.; Zhong, T.Y. Gains and losses of farmland associated with farmland protection policy and urbanization in China: An integrated perspective based on goal orientation. Land Use Policy 2023, 129, 106643. [Google Scholar] [CrossRef]
  53. Wang, X.; Wang, L.P.; Li, S.J.; Wang, Z.M.; Zheng, M.; Song, K.S. Remote estimates of soil organic carbon using multi-temporal synthetic images and the probability hybrid model. Geoderma 2022, 425, 116066. [Google Scholar] [CrossRef]
  54. Ma, J.M.; Shi, P. Remotely sensed inter-field variation in soil organic carbon content as influenced by the cumulative effect of conservation tillage in northeast China. Soil Tillage Res. 2024, 243, 106170. [Google Scholar] [CrossRef]
  55. Xiang, X.Y.; Du, J.; Jacinthe, P.A.; Zhao, B.Y.; Zhou, H.H.; Liu, H.J.; Song, K.S. Integration of tillage indices and textural features of Sentinel-2A multispectral images for maize residue cover estimation. Soil Tillage Res. 2022, 221, 105405. [Google Scholar] [CrossRef]
  56. Vavlas, N.C.; Seubring, T.; Elhakeem, A.; Kooistra, L.; De Deyn, G.B. Remote sensing of cover crop legacies on main crop N-uptake dynamics. Eur. J. Soil Sci. 2024, 75, e13582. [Google Scholar] [CrossRef]
  57. Guo, L.; Sun, X.R.; Fu, P.; Shi, T.Z.; Dang, L.N.; Chen, Y.Y.; Linderman, M.; Zhang, G.L.; Zhang, Y.; Jiang, Q.H.; et al. Mapping soil organic carbon stock by hyperspectral and time-series multispectral remote sensing images in low-relief agricultural areas. Geoderma 2021, 398, 115118. [Google Scholar] [CrossRef]
  58. Peng, Y.; Zhu, T.E.; Li, Y.C.; Dai, C.; Fang, S.H.; Gong, Y.; Wu, X.T.; Zhu, R.S.; Liu, K. Remote prediction of yield based on LAI estimation in oilseed rape under different planting methods and nitrogen fertilizer applications. Agric. For. Meteorol. 2019, 271, 116–125. [Google Scholar] [CrossRef]
  59. Zhang, Y.N.; Niu, Y.X.; Cui, Z.H.; Chai, X.Y.; Xu, L.Z. Cross-Year Rapeseed Yield Prediction for Harvesting Management Using UAV-Based Imagery. Remote Sens. 2025, 17, 2010. [Google Scholar] [CrossRef]
  60. Kong, D.P.; Chu, N.C.; Luo, C.; Liu, H.J. Analyzing Spatial Distribution and Influencing Factors of Soil Organic Matter in Cultivated Land of Northeast China: Implications for Black Soil Protection. Land 2024, 13, 1028. [Google Scholar] [CrossRef]
  61. Liu, X.N.; Wang, M.C.; Liu, Z.W.; Li, X.Y.; Ji, X.; Wang, F.Y. Spatial and temporal evolution of soil organic matter and its response to dynamic factors in the Southern part of Black Soil Region of Northeast China. Soil Tillage Res. 2025, 248, 106475. [Google Scholar] [CrossRef]
  62. Li, R.; Hu, W.Y.; Jia, Z.J.; Liu, H.Q.; Zhang, C.; Huang, B.; Yang, S.H.; Zhao, Y.G.; Zhao, Y.C.; Shukla, M.K.; et al. Soil degradation: A global threat to sustainable use of black soils. Pedosphere 2025, 35, 264–279. [Google Scholar] [CrossRef]
  63. Li, Y.; Zheng, S.F.; Wang, L.P.; Dai, X.L.; Zang, D.Q.; Qi, B.S.; Meng, X.T.; Mei, X.D.; Luo, C.; Liu, H.J. Systematic identification of factors influencing the spatial distribution of soil organic matter in croplands within the black soil region of Northeastern China across multiple scales. Catena 2025, 249, 108633. [Google Scholar] [CrossRef]
  64. Wang, X.; Song, K.S.; Wang, Z.M.; Li, S.J.; Shang, Y.X.; Liu, G. Effects of land conversion to cropland on soil organic carbon in montane soils of Northeast China from 1985 to 2020. Catena 2024, 235, 107691. [Google Scholar] [CrossRef]
  65. Liu, Y.; He, N.P.; Xu, L.; Tian, J.; Gao, Y.; Zheng, S.; Wang, Q.; Wen, X.F.; Xu, X.L.; Kuzyakov, Y. A new incubation and measurement approach to estimate the temperature response of soil organic matter decomposition. Soil Biol. Biochem. 2019, 138, 107596. [Google Scholar] [CrossRef]
  66. Zhou, Y.; Zhong, Z.; Cheng, G.Q. Cultivated land loss and construction land expansion in China: Evidence from national land surveys in 1996, 2009 and 2019. Land Use Policy 2023, 125, 106496. [Google Scholar] [CrossRef]
  67. Li, Y.Y.; Li, X.B.; Tan, M.H.; Wang, X.; Xin, L.J. The impact of cultivated land spatial shift on food crop production in China, 1990–2010. Land Degrad. Dev. 2018, 29, 1652–1659. [Google Scholar] [CrossRef]
  68. Ellis, E.C.; Kaplan, J.O.; Fuller, D.Q.; Vavrus, S.; Goldewijk, K.K.; Verburg, P.H. Used planet: A global history. Proc. Natl. Acad. Sci. USA 2013, 110, 7978–7985. [Google Scholar] [CrossRef] [PubMed]
  69. Bao, Y.L.; Yao, F.M.; Meng, X.T.; Fan, J.X.; Zhang, J.H.; Liu, H.J.; Mouazen, A.M. Dynamic modeling of topsoil organic carbon and its scenarios forecast in global Mollisols regions. J. Clean. Prod. 2023, 421, 138544. [Google Scholar] [CrossRef]
  70. Azizi, K.; Garosi, Y.; Ayoubi, S.; Tajik, S. Integration of Sentinel-1/2 and topographic attributes to predict the spatial distribution of soil texture fractions in some agricultural soils of western Iran. Soil Tillage Res. 2023, 229, 105681. [Google Scholar] [CrossRef]
  71. Settu, P.; Ramaiah, M. Estimation of Sentinel-1 derived soil moisture using modified Dubois model. Environ. Dev. Sustain. 2024, 26, 29677–29693. [Google Scholar] [CrossRef]
  72. Li, H.C.; Van den Bulcke, J.; Mendoza, O.; Deroo, H.; Haesaert, G.; Dewitte, K.; De Neve, S.; Sleutel, S. Soil texture controls added organic matter mineralization by regulating soil moisture-evidence from a field experiment in a maritime climate. Geoderma 2022, 410, 115690. [Google Scholar] [CrossRef]
  73. Sanderman, J.; Baldock, J.A.; Dangal, S.R.S.; Ludwig, S.; Potter, S.; Rivard, C.; Savage, K. Soil organic carbon fractions in the Great Plains of the United States: An application of mid-infrared spectroscopy. Biogeochemistry 2021, 156, 97–114. [Google Scholar] [CrossRef]
  74. Bahureksa, W.; Tfaily, M.M.; Boiteau, R.M.; Young, R.B.; Logan, M.N.; McKenna, A.M.; Borch, T. Soil Organic Matter Characterization by Fourier Transform Ion Cyclotron Resonance Mass Spectrometry (FTICR MS): A Critical Review of Sample Preparation, Analysis, and Data Interpretation. Environ. Sci. Technol. 2021, 55, 9637–9656. [Google Scholar] [CrossRef]
  75. Liu, Y.; Chen, S.C.; Yu, Q.Y.; Cai, Z.J.; Zhou, Q.B.; Bellingrath-Kimura, S.D.; Wu, W.B. Improving digital mapping of soil organic matter in cropland by incorporating crop rotation. Geoderma 2023, 438, 116620. [Google Scholar] [CrossRef]
  76. Heung, B.; Bulmer, C.E.; Schmidt, M.G. Predictive soil parent material mapping at a regional-scale: A Random Forest approach. Geoderma 2014, 214, 141–154. [Google Scholar] [CrossRef]
  77. Wang, Z.; Du, Z.P.; Li, X.Y.; Bao, Z.Y.; Zhao, N.; Yue, T.X. Incorporation of high accuracy surface modeling into machine learning to improve soil organic matter mapping. Ecol. Indic. 2021, 129, 107975. [Google Scholar] [CrossRef]
  78. Díaz-Uriarte, R.; de Andrés, S.A. Gene selection and classification of microarray data using random forest. BMC Bioinform. 2006, 7, 3. [Google Scholar] [CrossRef]
  79. Nolan, B.T.; Fienen, M.N.; Lorenz, D.L. A statistical learning framework for groundwater nitrate models of the Central Valley, California, USA. J. Hydrol. 2015, 531, 902–911. [Google Scholar] [CrossRef]
  80. Ahmad, M.W.; Reynolds, J.; Rezgui, Y. Predictive modelling for solar thermal energy systems: A comparison of support vector regression, random forest, extra trees and regression trees. J. Clean. Prod. 2018, 203, 810–821. [Google Scholar] [CrossRef]
  81. Friedman, J.H. Stochastic gradient boosting. Comput. Stat. Data Anal. 2002, 38, 367–378. [Google Scholar] [CrossRef]
  82. Guo, B.; Guo, X.N.; Zhang, B.; Suo, L.; Bai, H.R.; Luo, P.P. Using a Two-Stage Scheme to Map Toxic Metal Distributions Based on GF-5 Satellite Hyperspectral Images at a Northern Chinese Opencast Coal Mine. Remote Sens. 2022, 14, 5804. [Google Scholar] [CrossRef]
Figure 1. Overview of the study area. (a) Location of the study area; (b) Topography of the study area; (c) Land use of the study area and distribution of sampling locations.
Figure 2. Flowchart of this study.
Figure 3. Basic statistical characteristics of SOM content.
Figure 4. Spectral reflectance and backscattering coefficient characteristics for different SOM contents. (a–c) show the spectral reflectance characteristics of the median composite Sentinel-2 images for May, June to August, and September to October, respectively. (d–f) show the backscattering coefficient characteristics of the median composite Sentinel-1 images for May, June to August, and September to October, respectively.
Figure 5. SOM prediction accuracy in different groups based on optical images.
Figure 6. SOM prediction accuracies in different groups based on optical and radar images.
Figure 7. Comparison of SOM prediction accuracy with different data groups.
Figure 8. Variation in accuracy metrics for each model when SOM prediction accuracy is highest.
Figure 9. Mapping results of SOM and mean SOM generated by the RF model with the highest prediction accuracy. (a) is the SOM mapping result; (b) is the mean SOM mapping result.
Figure 10. Uncertainty analysis results for SOM mapping. (a) is the standard deviation of the SOM mapping; (b) is the uncertainty of the SOM mapping.
Figure 11. Correlation analysis between soil sample TN and SOM content.
Figure 12. SOC spatial distribution data of the study area.
Figure 13. Relative importance of variables for the optimal RF model in SOM prediction.
Table 1. Cultivated land area and sampling point statistics by topography.

| Topography | Percentage of Cultivated Land Area | Number of Sampling Points | Percentage of Sampling Points |
|---|---|---|---|
| Flat terrain | 33% | 31 | 28.70% |
| Hilly terrain | 59% | 66 | 61.11% |
| Mountainous terrain | 8% | 11 | 10.19% |
Table 2. Synthesis results of different remote sensing images.

| Data | Years | Months | Synthesis Mode |
|---|---|---|---|
| Sentinel-1 | 2019–2024 | May | Median |
| Sentinel-1 | 2019–2024 | June–August | Median |
| Sentinel-1 | 2019–2024 | September–October | Median |
| Sentinel-2 | 2019–2024 | May | Median |
| Sentinel-2 | 2019–2024 | June–August | Median |
| Sentinel-2 | 2019–2024 | September–October | Median |
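The median synthesis listed in Table 2 can be illustrated with a minimal per-pixel sketch. This is not the authors' processing chain; the function name `median_composite`, the tiny 2 x 2 scenes, and the use of `None` for masked pixels are assumptions made only for illustration.

```python
from statistics import median


def median_composite(stack, nodata=None):
    """Return the per-pixel median of equally sized 2-D grids.

    Pixels equal to `nodata` (e.g. cloud-masked observations) are skipped;
    a pixel with no valid observation stays `nodata` in the output.
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [img[r][c] for img in stack if img[r][c] != nodata]
            out_row.append(median(valid) if valid else nodata)
        composite.append(out_row)
    return composite


# Three hypothetical May scenes from different years (2 x 2 pixels each).
scenes = [
    [[0.10, 0.20], [0.30, None]],
    [[0.12, 0.80], [0.28, None]],
    [[0.50, 0.22], [None, None]],
]
composite = median_composite(scenes)
```

In this toy example the outlier values 0.50 and 0.80 do not reach the composite, which is the usual motivation for choosing a median over a mean when some scenes are contaminated.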
Table 3. Band coverage and corresponding spatial resolutions of Sentinel-2.

| Band Name | Wavelength (nm) | Resolution (m) |
|---|---|---|
| B1 (Aerosols) | 433–453 | 60 |
| B2 (Blue) | 458–523 | 10 |
| B3 (Green) | 543–578 | 10 |
| B4 (Red) | 650–680 | 10 |
| B5 (Red Edge 1) | 698–713 | 20 |
| B6 (Red Edge 2) | 733–748 | 20 |
| B7 (Red Edge 3) | 773–793 | 20 |
| B8 (Near infrared) | 785–900 | 10 |
| B8A (Red Edge 4) | 855–875 | 20 |
| B9 (Water vapor) | 935–955 | 60 |
| B11 (Shortwave infrared 1) | 1565–1655 | 20 |
| B12 (Shortwave infrared 2) | 2100–2280 | 20 |
Note: Sentinel-2 image description information sourced from the GEE platform.
Table 4. Three remote sensing indices derived from satellite images.

| Remote Sensing Indices | Acronyms | Calculation Formula |
|---|---|---|
| Normalized difference polarization index | NDPI | $\mathrm{NDPI} = \dfrac{VV - VH}{VV + VH}$ |
| Polarization ratio | PR | $\mathrm{PR} = \dfrac{VV}{VH}$ |
| Normalized difference vegetation index | NDVI | $\mathrm{NDVI} = \dfrac{NIR - R}{NIR + R}$ |
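As a minimal sketch of the three indices in Table 4, the functions below take NDPI as (VV − VH)/(VV + VH), PR as VV/VH, and NDVI as (NIR − R)/(NIR + R). The function names and sample values are hypothetical, and the backscatter inputs are assumed to be in linear power units rather than decibels, which the table itself does not specify.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")


def ndpi(vv, vh):
    """Normalized difference polarization index from VV and VH backscatter."""
    return normalized_difference(vv, vh)


def polarization_ratio(vv, vh):
    """Polarization ratio VV / VH, or NaN when VH is zero."""
    return vv / vh if vh != 0 else float("nan")


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return normalized_difference(nir, red)


# Hypothetical pixel values, for illustration only.
example_ndpi = ndpi(vv=0.12, vh=0.03)
example_pr = polarization_ratio(vv=0.12, vh=0.03)
example_ndvi = ndvi(nir=0.40, red=0.10)
```

Both normalized indices are bounded between −1 and 1 for non-negative inputs, whereas the ratio is unbounded, so the two radar indices may behave differently in tree-based models even though they are built from the same bands.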
Table 5. Six groups based on remote sensing images.

| Groups | Image Combination |
|---|---|
| Group 1 | May median image |
| Group 2 | June–August median image |
| Group 3 | September–October median image |
| Group 4 | Group 1 and Group 2 overlay |
| Group 5 | Group 1 and Group 3 overlay |
| Group 6 | Group 1 and Group 2 and Group 3 overlay |
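The grouping in Table 5 can be written out as a small lookup, which makes explicit which period composites feed each group. This is only a sketch: the dictionary names, the period labels, and the `band_period` naming scheme for stacked features are assumptions, not the authors' implementation.

```python
# Single-period groups (Groups 1-3 in Table 5).
BASE_GROUPS = {
    "Group 1": ["May"],
    "Group 2": ["June-August"],
    "Group 3": ["September-October"],
}

# Overlay groups (Groups 4-6) expressed through the base groups they stack.
OVERLAY_GROUPS = {
    "Group 4": ["Group 1", "Group 2"],
    "Group 5": ["Group 1", "Group 3"],
    "Group 6": ["Group 1", "Group 2", "Group 3"],
}


def periods_for(group):
    """Return the compositing periods that contribute to a group."""
    if group in BASE_GROUPS:
        return list(BASE_GROUPS[group])
    periods = []
    for member in OVERLAY_GROUPS[group]:
        periods.extend(BASE_GROUPS[member])
    return periods


def feature_names(group, bands):
    """Build hypothetical feature names, one per band and period."""
    return [f"{band}_{period}" for period in periods_for(group) for band in bands]


group4_features = feature_names("Group 4", ["B2", "B3", "VV", "VH"])
```

Under this scheme an overlay group simply multiplies the number of image-derived predictors by the number of periods it stacks, so Group 6 carries three times as many such features as Group 1.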
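Table 6. Results of ten-fold cross-validation at optimal SOM prediction accuracy.

| Models | Groups | Validation | Min | Max | Mean |
|---|---|---|---|---|---|
| BRT | Group 4 | R² | 0.216 | 0.744 | 0.501 |
| BRT | Group 4 | RMSE | 3.932 | 8.737 | 6.312 |
| BRT | Group 4 | MAE | 3.343 | 6.051 | 4.928 |
| ET | Group 2 | R² | 0.013 | 0.654 | 0.406 |
| ET | Group 2 | RMSE | 3.166 | 8.669 | 6.425 |
| ET | Group 2 | MAE | 2.610 | 6.295 | 5.001 |
| RF | Group 4 | R² | 0.387 | 0.800 | 0.530 |
| RF | Group 4 | RMSE | 4.069 | 8.166 | 6.130 |
| RF | Group 4 | MAE | 3.500 | 6.128 | 4.822 |
| XGBoost | Group 4 | R² | 0.003 | 0.731 | 0.450 |
| XGBoost | Group 4 | RMSE | 3.767 | 9.120 | 6.381 |
| XGBoost | Group 4 | MAE | 2.952 | 7.174 | 4.905 |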
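The Min, Max, and Mean columns of Table 6 summarize per-fold accuracy metrics. The sketch below shows the standard definitions of R², RMSE, and MAE and a fold-level summary; the function names and all numbers are hypothetical, and it assumes that each fold is scored separately before summarizing, which the table suggests but does not state.

```python
import math


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize(fold_values):
    """Return (min, max, mean) of one metric across cross-validation folds."""
    return min(fold_values), max(fold_values), sum(fold_values) / len(fold_values)


# Hypothetical observed and predicted SOM values for a single fold.
observed = [20.0, 30.0, 40.0]
predicted = [22.0, 28.0, 41.0]
fold_r2 = r2(observed, predicted)
fold_rmse = rmse(observed, predicted)
fold_mae = mae(observed, predicted)

# Hypothetical R² values from three folds, summarized as in Table 6.
r2_min, r2_max, r2_mean = summarize([0.4, 0.6, 0.5])
```

With small validation folds, a single poorly predicted fold can pull the minimum R² close to zero while the mean stays moderate, which is consistent with the wide Min to Max ranges reported in the table.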
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhang, W.; Chen, W.; Zhao, Z.; Li, L.; Zhang, R.; Yao, D.; Xie, T.; Xie, E.; Kong, X.; Ren, L. Mapping Soil Organic Matter in a Typical Black Soil Region Using Multi-Temporal Synthetic Images and Radar Indices Under Limited Bare Soil Windows. Remote Sens. 2025, 17, 2929. https://doi.org/10.3390/rs17172929
