3.1. Results of Chemical and Spectral Data Analysis
A summary of the chemical analysis results for the soil samples from both data sets is shown in Table 2, separated by horizon. The BZE Saxony data consisted of 362 samples in total, with 176 samples from Oh horizons and 186 samples from Ah horizons. In the organic (Oh) horizon, total C content ranged from 8.60 to 49.4%, with a mean of 27.15%. N content lay between 0.32 and 2.08%, with a mean of 1.22%. For the Ah horizon, C content lay between 0.40 and 17.22%, with a mean of 5.12%. N content ranged from 0.02 to 1.06%, with a mean of 0.23%.
For the data set collected in the Zellwald forest area, C content in the Oh horizon ranged from 18.81 to 41.36%, with a mean of 33.99%. The N content in these samples ranged from 0.2 to 2.11%, with a mean of 1.66%. In the Ah horizon, C content lay between 3.16 and 13.88%, with a mean of 6.43%. N content ranged from 0.17 to 1.08%, with a mean of 0.34%.
Graphs of the spectra obtained with the different devices are shown in Figure 2. The figure shows all measured samples of both data sets per spectrometer (each spectrum being the mean of ten single measurements), together with the mean of all measurements. The raw spectra are shown on the left and the pre-processed data on the right. The applied pre-processing methods clearly emphasized the notable features in all measured spectra.
The Hamamatsu and Veris spectra both showed a steady rise through the whole visual range. In the NIR range, the important reflectance dips around 1400 nm and 1900 nm, known as strong water absorption features, could be detected in the spectra of both the Veris and Neospectra devices. The pre-processing accentuated these trends for both devices. In the Neospectra data, the feature around 2200 nm was also visible, as the device's range extends to 2500 nm.
3.2. Predicting Total Carbon Content
To evaluate the suitability of the selected spectrometers for total C prediction in forest topsoil samples, we calibrated models for the Oh and Ah horizons using PLSR and Cubist for every spectral data set, once for the BZE data from the whole of Saxony (regional scale) and once for the Zellwald forest site (local scale). An additional model was calibrated using samples from the Oh and Ah horizons combined for the regional scale.
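The three validation measures reported throughout this section can be sketched as follows. This is a minimal illustration in Python, assuming the standard definitions: RMSE as the root mean squared difference between observed and predicted values, R² as the coefficient of determination (1 - SSE/SST; a squared-correlation definition would give different values), and RPIQ as the interquartile range of the observed values divided by RMSE. The function names, the quartile convention and the toy data are hypothetical and are not taken from the study.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SSE/SST."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def rpiq(observed, predicted):
    """Ratio of performance to interquartile range: IQR(observed) / RMSE."""
    # Assumption: quartiles computed with the "inclusive" method;
    # other quartile conventions give slightly different values.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


# Toy data (hypothetical, not from the study): C content in %
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 3.5, 4.5, 5.5]

print(rmse(observed, predicted))       # 0.5
print(r_squared(observed, predicted))  # 0.875
print(rpiq(observed, predicted))       # 4.0
```

Lower RMSE and higher R² and RPIQ indicate a more accurate model. Because RPIQ scales the error by the spread of the observed values, it allows comparison between data sets with different value ranges, such as the Oh and Ah horizons.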
The validation results of the separate models for the Oh and Ah horizons of the regional BZE data set are shown in Table 3. Using the spectral data in the visual range from the Hamamatsu device to predict carbon, the PLSR results were RMSE = 8.98%, R² = 0.01 and RPIQ = 1.48 (Oh) and RMSE = 2.36%, R² = 0.29 and RPIQ = 1.67 (Ah). Using Cubist, RMSE, R² and RPIQ were 7.03%, 0.40 and 2.90 (Oh) and 2.13%, 0.36 and 2.86 (Ah), respectively. When calibrating the models with NIR range data from Neospectra, the PLSR results were RMSE = 6.87%, R² = 0.43 and RPIQ = 1.94 (Oh) and RMSE = 1.97%, R² = 0.48 and RPIQ = 2.00 (Ah). The Cubist regression resulted in a lower RMSE of 6.70%, with R² = 0.43 and RPIQ = 1.99, for the Oh horizon. For the Ah horizon data, Cubist was more accurate as well (RMSE = 1.69%, R² = 0.61, RPIQ = 2.34). The combination of Hamamatsu and Neospectra data, covering both the visual and NIR range, resulted in RMSE, R² and RPIQ values of 5.87%, 0.57 and 2.27 (PLSR, Oh), 2.01%, 0.49 and 1.97 (PLSR, Ah), 5.60%, 0.61 and 2.38 (Cubist, Oh) and 1.75%, 0.55 and 2.25 (Cubist, Ah).
Modeling results based on the full-range Veris spectrometer with PLSR were RMSE = 6.79%, R² = 0.47 and RPIQ = 1.96 (Oh) and RMSE = 1.93%, R² = 0.55 and RPIQ = 2.05 (Ah). The Cubist results were RMSE = 5.98%, R² = 0.58 and RPIQ = 2.23 (Oh) and RMSE = 1.59%, R² = 0.62 and RPIQ = 2.48 (Ah). Therefore, the most accurate predictions of C content were reported using the full-range approaches. Using the MEMS spectrometers in combination led to similar predictive performance (Oh: 105%, Ah: 89% of the full-range models' R² values), while the visual range model showed the least precise results (Oh: 69%, Ah: 58% of the full-range models' R² values). Estimated and observed values of the independent validation sets of the Cubist regression models, based on data from all spectrometers and both the Oh and Ah horizons, are displayed on the left side of Figure 3. Deviations of the predictions from the observed values were not evenly distributed for the models based on the visual range. The other approaches show a better orientation along the x = y line. For the Oh horizon, high values seem to be overestimated while low values tend to be underestimated. The residual plots (see Figure A1 in Appendix A) underline this observation.
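The relative performance figures quoted above (105%, 89%, 69% and 58%) appear to be ratios of the Cubist R² values to those of the full-range Veris models. The following Python sketch reproduces this arithmetic from the values given in the text; the dictionary layout and the helper name `percent_of_reference` are hypothetical, and the interpretation as R² ratios is an assumption inferred from the reported numbers.

```python
# R² of the Cubist C models for the regional BZE data, as reported in the text
r2_cubist = {
    "visual (Hamamatsu)": {"Oh": 0.40, "Ah": 0.36},
    "combined MEMS": {"Oh": 0.61, "Ah": 0.55},
    "full range (Veris)": {"Oh": 0.58, "Ah": 0.62},
}


def percent_of_reference(value, reference):
    """Express value as a rounded percentage of the reference value."""
    return round(100 * value / reference)


reference = r2_cubist["full range (Veris)"]

# Relative R² of each approach, per horizon, against the full-range model
relative = {
    approach: {
        horizon: percent_of_reference(r2, reference[horizon])
        for horizon, r2 in by_horizon.items()
    }
    for approach, by_horizon in r2_cubist.items()
    if approach != "full range (Veris)"
}

for approach, by_horizon in relative.items():
    print(approach, by_horizon)
# visual (Hamamatsu) {'Oh': 69, 'Ah': 58}
# combined MEMS {'Oh': 105, 'Ah': 89}
```

A value above 100% means the approach explained more of the variance in the validation set than the full-range model did for that horizon.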
For the regional BZE data, we also calibrated models using samples from the Oh and Ah horizons combined. The results of this approach are shown in Table 4. Using the visual range to predict C, the results were RMSE = 6.64%, R² = 0.75 and RPIQ = 3.47 (PLSR) and RMSE = 6.36%, R² = 0.76 and RPIQ = 3.63 (Cubist). Using only the NIR range for model calibration, RMSE, R² and RPIQ were 6.67%, 0.74 and 3.46 for PLSR and 5.01%, 0.85 and 4.60 for Cubist regression, respectively.
When calibrating the models with data from both the visual and NIR ranges combined, the results were RMSE = 6.98%, R² = 0.73 and RPIQ = 3.30 (PLSR) and RMSE = 5.13%, R² = 0.84 and RPIQ = 4.49 (Cubist). Using the full-range spectral data of the Veris device to predict carbon, the results were RMSE = 5.41%, R² = 0.83 and RPIQ = 4.26 (PLSR) and RMSE = 4.29%, R² = 0.89 and RPIQ = 5.38 (Cubist).
In this case, the most accurate predictions of C content were reported using the full-range device. The approaches using the NIR range alone and the MEMS spectrometers in combination led to similar predictive performance. The model based on the visual range again showed less precise results. Predicted and observed values of the independent validation sets from the BZE data for the Cubist regression models based on data from all spectrometers are displayed in Figure 4. Deviations of the predictions from the observed values were generally higher for the samples from Oh horizons, which were distinguishable from the Ah samples.
For the visual range, it is further notable that, in contrast to the other approaches, the predicted values were less oriented along the x = y line. Models including data from the NIR range were more precise in this respect.
The results of the model validation procedure for the Ah horizon of the local Zellwald data set are shown on the left side of Table 5.
Using the spectral data in the visual range to predict carbon, the model results were RMSE = 0.99%, R² = 0.62 and RPIQ = 2.66 for PLSR and RMSE = 1.08%, R² = 0.56 and RPIQ = 2.42 for Cubist. Using NIR spectral data, RMSE, R² and RPIQ were 1.44%, 0.44 and 1.83 (PLSR) and 1.50%, 0.35 and 1.75 (Cubist). When calibrating the models with data from the combined visual and NIR range, the PLSR results were RMSE = 1.58%, R² = 0.66 and RPIQ = 1.66. For Cubist, the results were RMSE = 0.62%, R² = 0.86 and RPIQ = 4.22. For PLSR, the full-range Veris approach resulted in a lower RMSE of 0.74% and higher R² and RPIQ values of 0.89 and 3.55. For the Cubist model, performance was calculated as RMSE = 0.90%, R² = 0.86 and RPIQ = 2.92.
Predicted and observed values of the independent validation sets for the Cubist regression models based on data from all spectrometers are displayed in Figure 5. It shows that the models including data from both the visual and NIR range were more accurate, as the points cluster tightly around the x = y line. It was not possible to calibrate meaningful models based on the Oh horizon only (data not shown).
3.3. Predicting Total Nitrogen Content
In a second step, the performance of the different devices for total nitrogen content prediction was assessed for the Oh and Ah horizons of both data sets using PLSR and Cubist.
The results of the separate N prediction models for the Oh and Ah horizons of the regional BZE data set are shown in the lower part of Table 3. Regressions for the Oh horizon based on spectral data in the visual range were more precise using Cubist, with RMSE = 0.36%, R² = 0.23 and RPIQ = 1.86. For models based on the Ah samples, performance was less precise.
For the NIR range, Cubist model results were more precise, with RMSE = 0.26%, R² = 0.61 and RPIQ = 2.56 for the organic Oh horizon. For the Ah horizon models, we achieved very similar results with both approaches. For the models based on the combined data set covering the visual and NIR range, Oh results were RMSE = 0.29%, R² = 0.51 and RPIQ = 2.28 (PLSR) and RMSE = 0.27%, R² = 0.57 and RPIQ = 2.42 (Cubist). PLSR outperformed Cubist when predicting the total N content of the Ah soil samples. Additionally, we used a full-range device for model calibration. The models based on Oh samples with PLSR resulted in RMSE = 0.30%, R² = 0.48 and RPIQ = 2.42. The Cubist models were more precise, with RMSE = 0.25%, R² = 0.66 and RPIQ = 2.64. Using Ah samples, the Cubist model again resulted in higher accuracy, with RMSE = 0.06%, R² = 0.78 and RPIQ = 3.08. Overall, the full-range device gave the best prediction performance for total N estimation. Predicted and observed total N content values for models based on data from both horizons and all spectrometers for the regional BZE data are shown in Figure A2. The largest deviations in the predictions can clearly be seen for the models based on the visual range. Results for the other approaches were more precise, as the points were distributed closer to and more along the x = y line. Further, lower values tended to be overestimated, while higher ones were underestimated. This tendency was stronger for the models of the Oh horizon.
In addition, we calibrated models using samples from the Oh and Ah horizons combined for the regional BZE data. The results are shown in Table 4. In this case, the models calibrated on the visual range to predict N content achieved similar accuracy for both algorithms. Using the NIR range only, model accuracy increased. RMSE, R² and RPIQ were 0.35%, 0.67 and 3.08 for PLSR and 0.24%, 0.85 and 4.51 for Cubist regression, respectively. The combined-device approach gave similar results. PLSR achieved RMSE = 0.34%, R² = 0.71 and RPIQ = 3.17. Cubist regression resulted in RMSE = 0.24%, R² = 0.85 and RPIQ = 4.57. Models calibrated using data from the full-range device yielded the best accuracy. The PLSR model resulted in RMSE = 0.29%, R² = 0.79 and RPIQ = 3.80, the Cubist approach in RMSE = 0.21%, R² = 0.88 and RPIQ = 5.24. The predicted and observed values of the Cubist model combining both horizons of the regional BZE data are shown in Figure A3. The largest inaccuracies in the predictions were observed for the models based on the visual range. The other approaches were more accurate, as indicated by tighter point clouds and a better orientation along the x = y line.
The combined MEMS-spectrometer approach showed prediction power equivalent to that of the Veris device.
The results of the independent validation procedure for the Ah horizon of the local Zellwald data, separated by algorithm and sensor, are also shown in Table 5. The error measures for N content are on the right side. Regression based on spectral data in the visual range was more precise using Cubist, with RMSE = 0.08%, R² = 0.45 and RPIQ = 1.67. For the NIR range, model performance increased and was more accurate using PLSR, with RMSE = 0.07%, R² = 0.67 and RPIQ = 2.15. Using Cubist, models were less precise. Using the combination of visual and NIR data, the local model for Zellwald resulted in RMSE values of 0.09% (PLSR) and 0.06% (Cubist); R² values ranged from 0.58 (PLSR) to 0.71 (Cubist). RPIQ was calculated as 1.54 for PLSR and 2.41 for Cubist. For the full-range device data, predictions were the most accurate. PLSR results were RMSE = 0.06%, R² = 0.87 and RPIQ = 2.19. For the Cubist regression, RMSE was 0.04%, R² was 0.84 and RPIQ was 3.37. Predicted and observed total N content values of the Ah horizon for models based on local Zellwald data from all spectrometers are shown in Figure A4. The smallest deviations in the predictions can clearly be seen for the models based on the combined and full-range approaches. As for the C predictions, no suitable models could be calibrated for the Oh horizon of the local Zellwald data.