1. Introduction
The European Union (EU) aims to ensure that 75% of soils are healthy by 2030 to secure healthy food, people, nature, and climate. As of 2020, approximately 60–70% of EU soils remain unhealthy [1]. Healthy soils are the foundation of agricultural production and a crucial component of the ecosystem. Effective soil management and conservation practices, such as crop rotation, cover cropping, and reduced fertilizer usage, sustain soil fertility, enhance biodiversity, and facilitate the colonization of beneficial microorganisms [2]. Soil with appropriate elemental content provides essential nutrients for plants, promoting healthy growth. It enhances soil moisture retention and increases water-holding capacity [3]. Soil testing and fertilizer field trials help ascertain crop fertilizer demand patterns and soil nutrient-supplying capacity, thereby enabling the scientific formulation of effective fertilizer application programs [4]. Therefore, predicting soil elemental content is crucial for guiding agricultural production, promoting plant growth, and reducing costs. Spectral remote sensing acquires soil spectral information without direct contact with the ground, making it an indispensable tool for soil investigation and analysis [5].
The key stages of spectral soil property prediction are data preprocessing and model building. Data preprocessing primarily aims to enhance model accuracy and robustness. Common mathematical transformations, Savitzky–Golay (SG) smoothing, and differential preprocessing methods are effective in improving prediction models [6,7]. SG smoothing suppresses random spectral noise [8]. Differential processing mitigates the influence of instrumental background and drift on the signal and amplifies the spectral response [9]. Recently, more researchers have employed machine learning and deep learning techniques to explore the relationship between hyperspectral data and soil elemental content. Partial least squares regression (PLSR) [10], random forest regression (RFR) [11], and support vector regression (SVR) remain robust prediction models in this field. With advancing computational power, deep learning is gradually being applied across various domains. Convolutional neural networks (CNNs) [12] are widely used in computer vision and geographic information systems (GISs) due to their exceptional performance and versatility [13,14,15]. Integrating mathematical concepts into differential convolutional neural networks facilitates faster error minimization and convergence [16]. Combining CNN component construction with hyperparameter tuning can significantly advance CNN application in soil spectroscopy modeling [17]. The Transformer model, proposed by Google researchers in 2017, is a deep learning method based on an attention mechanism without convolutional or recurrent units. It has been applied to time series prediction [18], air quality prediction [19], risk prediction [20], and more. Its architecture scales with training dataset size and model size, enabling efficient parallel training and long-range dependency capture [19]. These advantages motivated us to consider using such models for spectral prediction of soil elemental content.
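As a rough illustration of the two preprocessing steps named above, the sketch below applies SG smoothing and a first-order derivative by fitting a local quadratic to each window of a spectrum. The function name `savgol_quadratic` is hypothetical, and the window of 11 points with a second-order polynomial is an assumption borrowed from settings commonly reported in the literature, not necessarily the configuration used in this study. In practice, a library routine such as `scipy.signal.savgol_filter` would normally be used instead.

```python
def savgol_quadratic(y, half_window=5):
    """Savitzky-Golay smoothing and first derivative from a local quadratic fit.

    Assumption: evenly spaced samples; the derivative is per sample index.
    Only interior points are returned (the first and last `half_window`
    samples are dropped to keep the sketch short).
    """
    offsets = range(-half_window, half_window + 1)
    n = 2 * half_window + 1
    s2 = sum(k ** 2 for k in offsets)  # sum of k^2 over the window
    s4 = sum(k ** 4 for k in offsets)  # sum of k^4 over the window
    denom = n * s4 - s2 ** 2

    smoothed, derivative = [], []
    for i in range(half_window, len(y) - half_window):
        window = [y[i + k] for k in offsets]
        sum_y = sum(window)
        sum_ky = sum(k * v for k, v in zip(offsets, window))
        sum_k2y = sum(k * k * v for k, v in zip(offsets, window))
        # Least-squares fit of a0 + a1*k + a2*k^2 on a symmetric window:
        # a0 is the smoothed value, a1 is the first derivative at the centre.
        smoothed.append((s4 * sum_y - s2 * sum_k2y) / denom)
        derivative.append(sum_ky / s2)
    return smoothed, derivative


if __name__ == "__main__":
    # Toy "spectrum": a smooth trend plus an alternating disturbance.
    spectrum = [0.01 * i + (0.05 if i % 2 else -0.05) for i in range(30)]
    smooth, deriv = savgol_quadratic(spectrum)
    print(len(smooth), len(deriv))
```

Smoothing and differentiation come from the same local fit, which is why the two steps are often applied together: the derivative removes slowly varying baseline offsets, while the polynomial fit limits the noise amplification that plain finite differences would cause.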
The LUCAS 2009 Topsoil Survey dataset [21] is one of the most comprehensive and consistent soil datasets on a continental scale. Numerous studies have utilized Vis-NIR spectra from the LUCAS dataset to predict various soil properties. Evangelos Simpris [22] proposed using stacked autoencoders to predict eight key soil properties simultaneously, improving prediction accuracy. Zhong et al. [23] evaluated data preprocessing and multi-task deep convolutional neural network (DCNN) models, finding that they outperformed shallow CNNs and traditional machine learning methods. Singh et al. [24] used principal component analysis and locality preserving projections to reduce hyperspectral data dimensionality and proposed a framework combining hybrid features and recurrent neural network (RNN) variants, namely long short-term memory (LSTM) and gated recurrent unit (GRU) networks, demonstrating enhanced predictive capabilities. Stanislaw Grushchensky et al. [25] compared the effectiveness of different machine learning models for large spectral libraries. Hamid Tavakoli et al. [26] investigated the role of stacked models in improving the prediction accuracy of traditional individual models. Midi Wan et al. [27] proposed a near-infrared spectral masking autoencoder that learns highly robust and generalized spectral features from a large public near-infrared spectral dataset. Alex Wangic et al. [28] compared the performance of laser-induced breakdown spectroscopy (LIBS) and visible near-infrared spectroscopy (Vis-NIRS) in predicting soil organic carbon (SOC), texture, extractable phosphorus, clay, and organic carbon ratio. Existing studies have demonstrated the ability to predict a broad spectrum of soil properties with high accuracy (R² > 0.8). Although these studies have been extensively discussed and have yielded reasonably accurate predictions, many have excluded samples from certain portions of the LUCAS dataset and selectively focused on specific soil properties. While this approach may yield favorable results, it does not fully capture the comprehensive nature of the dataset. Building upon prior research, this study aims to comprehensively explore the complete LUCAS dataset and leverage deep learning techniques to enhance prediction accuracy, potentially opening new avenues for soil property prediction.
Many studies have demonstrated that integrating different types of neural networks significantly improves model performance. Yuhao Wang et al. [29] combined an LSTM model with an attention mechanism and peak normalized difference vegetation index (NDVI) images to generate more accurate and timely predictions. Shujun Liu et al. [30] utilized a tensor connection module to combine temporal convolutional networks (TCN) and LSTM neural networks for wind power prediction, achieving excellent results. The powerful sequence processing capabilities of Transformers, combined with the superior feature extraction capabilities of CNNs, have improved performance in both predictive modeling and image processing tasks. This fusion approach leverages the strengths of different models to provide richer feature representations, significantly advancing various application scenarios [31,32,33]. These successes support our use of a similar parallel approach: this study combines Transformer and CNN models to enhance prediction accuracy and model stability. Transformer models excel at capturing long-range dependencies, while CNNs are effective at extracting local features. In contrast to several recent studies [34,35,36,37] that primarily focus on improving accuracy for specific soil properties or under controlled conditions, this study seeks to broaden the model’s applicability across a wider range of soil properties and environmental contexts. By doing so, it not only advances the current state of the art in soil property prediction but also addresses practical challenges that hinder the deployment of deep learning models in soil science.
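The contrast between local feature extraction and long-range dependency capture can be made concrete with a toy example. The sketch below is an illustration only, not the architecture used in this study: `conv1d`, `self_attention`, and `parallel_features` are hypothetical helper names, the attention uses scalar tokens with identity projections, and simple concatenation is assumed as the way the two parallel branches are merged.

```python
import math


def conv1d(x, kernel):
    """Valid 1D convolution (cross-correlation form).

    Each output value depends only on len(kernel) neighbouring inputs.
    """
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]


def self_attention(x):
    """Single-head scaled dot-product self-attention on scalar tokens.

    Assumption: identity query/key/value projections and d_k = 1, so the
    scaling factor sqrt(d_k) equals 1. Every output mixes all positions.
    """
    out = []
    for q in x:
        scores = [q * key for key in x]
        m = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w / total * v for w, v in zip(weights, x)))
    return out


def parallel_features(x, kernel):
    """Run both branches on the same input and concatenate their outputs.

    Concatenation is an assumed fusion step for illustration.
    """
    return conv1d(x, kernel) + self_attention(x)


if __name__ == "__main__":
    band_a = [0.1, 0.4, 0.3, 0.2, 0.5, 0.9]
    band_b = band_a[:-1] + [2.0]  # change only the most distant band
    kernel = [0.25, 0.5, 0.25]
    # The first convolution output ignores the change ...
    print(conv1d(band_a, kernel)[0] == conv1d(band_b, kernel)[0])
    # ... whereas the first attention output reacts to it.
    print(self_attention(band_a)[0], self_attention(band_b)[0])
```

Changing a single distant band leaves the first convolution output untouched but shifts the first attention output, which is the complementary behaviour a parallel design tries to exploit.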
By combining these techniques, we can leverage their strengths for more comprehensive and accurate soil property predictions. Traditional machine learning methods frequently struggle to capture the full complexity and variability of soil spectral data, particularly when confronted with noise and high-dimensional features. These limitations impede both the accuracy and scalability of soil property predictions. Our goal is to overcome traditional model limitations and provide a more robust and efficient tool for soil property prediction through this fusion approach. We will objectively evaluate the strengths and weaknesses of this approach by comparing its performance against traditional machine learning models and existing literature methods. We will explore its applicability and superiority in predicting different soil properties. The evaluation will be multi-dimensional, including prediction accuracy, computational efficiency, applicability, and limitations.
4. Discussion
To further evaluate the predictive performance of the proposed modeling approach for soil attribute content, we compare it with findings from other related studies. First, our study demonstrated significant advantages in predicting different soil attributes. For example, in predicting N, P, and K contents, our model achieved R² values of 0.94, 0.41, and 0.60, respectively, outperforming other studies. This indicates that our model has higher accuracy and reliability in capturing and predicting soil nutrient contents. For OC, our study also achieved impressive results. For most soil properties, our prediction accuracy was higher than that reported in other studies. This demonstrates that our model performs better in predicting soil properties and helps analyze soil trends more precisely.
Our experiments were designed to be reproducible. However, our RMSE values cannot be compared directly with those reported in the literature, because the sampling method used to split the dataset introduces differences between studies. This may also limit the ability to reproduce the exact experimental results of this study.
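For reference, the two metrics discussed here can be computed as in the minimal sketch below, which also includes a seeded train/test split to show one way of making the sampling step repeatable. The helper names (`rmse`, `r2`, `split_indices`), the 80/20 split, and the seed are illustrative assumptions and do not describe the split used in this study.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the predicted property."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices with a fixed seed and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


if __name__ == "__main__":
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.0, 2.0, 3.0, 5.0]
    print(rmse(measured, predicted), r2(measured, predicted))
    train, test = split_indices(10)
    print(train, test)
```

RMSE carries the units and value range of the test samples, so it shifts whenever a different subset is held out, whereas R² is unitless but still depends on the variance of the held-out samples. Fixing the seed makes a single split repeatable, but it does not make results comparable with studies that used a different split.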
Table 6 shows that Zhong et al. [23] achieved a superior R² for pH in H2O compared to our model, but we obtained good results for other soil properties. The matrix heat map in Figure 5 compares the R² values of our model with those of other research models and shows generally higher prediction accuracy and reliability for our model. This indicates that our method has significant advantages in predicting the content of key organic and inorganic substances in soil. Our study demonstrated significant advantages in the prediction and analysis of multiple soil attributes, not only in terms of better prediction accuracy but also in capturing and interpreting the complexity of soil data more comprehensively.
A study [35] employed a CNN model to predict six soil properties (OC, CEC, Clay, Sand, pH, and N) without using additional spectral preprocessing methods. Although this approach showed improvements over traditional methods, a single model could not comprehensively extract spectral features. Another study [36] combined an SG smoothing filter (with a second-order polynomial fit and a window size of 11 data points) with a first-order derivative transform to predict seven soil properties (Clay, Sand, Silt, pH, OC, CaCO3, and N), resulting in improved predictive performance compared to previous methods. In this experiment, we employed a one-dimensional convolutional neural network (1D-CNN) for modeling. Two-dimensional convolutional neural networks (2D-CNNs) applied to spectral images have also proven effective [34]; consistent with that study, we used both organic and mineral soils from the dataset. However, that study did not incorporate spectral preprocessing steps, which could help reduce experimental uncertainty. Our method improved the R² for six soil properties (OC, N, CEC, pH, Sand, and Clay) in the test set. Recent research [37] on seven soil properties (OC, CaCO3, N, CEC, pH, Clay, and Sand) employed seven preprocessing methods. Although additional spectral preprocessing might also enhance our results, and such an approach may improve prediction accuracy, model performance could then depend heavily on the choice of preprocessing methods. Such dependence may lead to inconsistent results and raise concerns about model stability. Variability among different preprocessing methods could cause significant differences in model performance under various conditions, increasing result variability and uncertainty. Compared to previous studies using the LUCAS dataset, we utilized the complete dataset and common preprocessing methods to predict all 11 soil properties. This approach reduces the uncertainty associated with excessive reliance on preprocessing, enhances model stability, and improves the reliability of predictive outcomes, providing a more comprehensive and robust predictive framework for readers.
In this experiment, silt, sand, K, and P were poorly predicted. By contrast, Osayande Pascal Omondiagbe et al. [17] used Adapted-PBT to optimize a CNN for predicting soil texture (three soil properties), with significant results for clay (R² = 0.90, RMSE = 4.2), silt (R² = 0.78, RMSE = 11.4), and sand (R² = 0.81, RMSE = 0.3). Additionally, Wan et al. [27] utilized LUCAS to enhance feature extraction for small datasets, achieving good results: available nitrogen (R² = 0.941, RMSE = 3.873), available phosphorus (R² = 0.926, RMSE = 3.684), and available potassium (R² = 0.903, RMSE = 3.422). These studies demonstrated good application results using the LUCAS dataset. Our model requires a large amount of data, and its performance is not satisfactory with small datasets. Our next goal is to utilize the LUCAS dataset to enhance the prediction of soil properties with a small amount of data.
Training on an NVIDIA GeForce RTX 4090 requires 8 ms/epoch for the Transformer-CNN model, 5 ms/epoch for the Transformer model, 6 ms/epoch for the CNN, 16 ms/epoch for the LSTM, and 33 ms/epoch for ResNet18. Although this may be considered slow compared to traditional preprocessing methods, it is tolerable.
Current studies based on the LUCAS dataset are limited in scope because they all rely on spectral information collected indoors, under conditions that differ from those of future large-scale outdoor studies. Future research should evaluate the validity of the proposed methods on soil spectra collected in the field with portable spectrometers or drones, which are more challenging than laboratory data. To reduce computational complexity and enhance model robustness, the performance of other deep learning approaches, such as semi-supervised learning and self-supervised learning, needs further investigation. The effect of a two-dimensional transformation of the spectra on the experimental results should also be explored. As soil spectral databases continue to emerge in different parts of the world, we will continue to investigate the effectiveness of our model on other spectral databases.
5. Conclusions
In this paper, we first summarize the previous research on the LUCAS dataset and analyze the challenges of predicting soil properties using the LUCAS spectral database. A deep learning method combining Transformer and CNN is proposed for the simultaneous prediction of soil properties from Vis-NIR spectral data. Experimental validation on the LUCAS dataset shows that, by combining the advantages of Transformer and CNN, the method captures both long-range dependencies and local features in the spectral data, significantly improving the prediction accuracy and stability of soil properties. Savitzky–Golay smoothing and first-order differential preprocessing play an important role in reducing spectral noise and improving model prediction performance. We predicted all 11 soil properties in the LUCAS dataset, and the parallel model outperformed each single model and made better predictions than traditional machine learning models. Although this study demonstrated the great potential of the fusion model, there is still room for improvement in computational efficiency, model complexity, and application breadth. Unlike laboratory data, Vis-NIR spectra collected in the field are affected by a variety of environmental factors such as weather, light intensity, and humidity. These factors may introduce higher variability in the data, thus complicating the prediction of soil properties. Based on the favorable results obtained in this study, we will use the more challenging field-collected soil spectra to evaluate our model in future studies.
Future research should continue to optimize the model structure, explore more application scenarios, and promote its use in real agricultural production. Enhancing the prediction capability of soil properties can improve soil management practices, promote sustainable agricultural development, and achieve significant ecological and economic benefits. The Transformer and CNN fusion model proposed in this paper provides an effective and robust new approach for hyperspectral soil property prediction. Through further research and optimization, this method is expected to play a greater role in soil monitoring and agricultural management, helping achieve the goals of healthy soil and sustainable agriculture.