1. Introduction
The European Union (EU) promotes the development of a circular economy aimed at reusing resources and reducing negative environmental impacts [1]. In this context, the development of technologies capable of intercepting nutrients from the waste stream and transforming them into a safe product to be reused for the benefit of agriculture could represent a winning solution to ensure global food security and address the challenges associated with excessive waste production [2].
The latest available data on sewage sludge (SWS) production suggest that more than 9500 thousand tons of dry matter were produced in 2022 in the countries within the European region [3]. The volumes of waste biomass produced annually pose significant challenges for their sustainable management, aimed at minimizing the negative impact on the environment and human health. SWS’s primary disposal and recycling pathways include agricultural reuse, incineration, composting, and landfill. Specifically, in European countries, the valorization of the fertilizing and conditioning properties of SWS through land application (38%) and composting (17%) predominates. This is followed by waste-to-energy treatments (30%), a disposal method widely used in Germany, the Netherlands, Switzerland and Belgium, which can be combined with the potential recovery of thermal energy and the recycling of phosphorus contained in the combustion ashes. The remaining fraction (15%) is typically disposed of in landfills or via other methods [3]. In Italy, in 2021, 52% of the sludge produced from urban wastewater treatment was sent for disposal operations, while 45.6% was directed to recovery operations; the latter share is steadily increasing, particularly in the country’s northern regions [4].
In this context, the application of urban SWS as an organic amendment in agriculture offers significant benefits due to its high nutrient content (20% dry matter (DM), >1.5%DM nitrogen and >0.4%DM phosphorus [5]), which can enhance soil fertility and improve crop productivity. However, uniform SWS distribution is crucial to maximize these benefits and minimize potential environmental risks, such as nutrient runoff or uneven crop growth; it also prevents the hotspot accumulation of the heavy metals and organic pollutants present in SWS [6]. The nutrient content of SWS varies significantly depending on its origin, as it is influenced by regional waste treatment processes and local urban activities. This variability challenges its standardization as a fertilizer in precision agriculture. However, this study acknowledges that such variability can be addressed by implementing site-specific nutrient analysis before application. Techniques like on-the-go sensors and rapid laboratory testing can provide real-time data on nutrient composition, allowing for tailored application strategies. By integrating these approaches, the potential environmental risks associated with nutrient imbalances can be minimized, while maximizing the agronomic benefits of using SWS.
Sludge could be applied using conventional manure spreaders, primarily designed for solid organic matter, which may not provide the even distribution required in agriculture. Inaccurate or uneven distribution can lead to over- or under-application in certain areas, reducing the amendment’s effectiveness and potentially causing environmental harm [7,8].
Even before uniformity, however, precision agriculture (PA) benefits from knowing how the distribution varies, provided that this variability is known and constant. A machine with a well-characterized spread pattern may then be able to match complex prescription maps for variable dose distribution.
PA plays a crucial role in SWS distribution. In Italy, the diffusion of PA is very low (only 1% of the Utilized Agricultural Area) due to the typical orography of Italian territory and farms [9]. Rising prices of agricultural inputs, such as fertilizer, may encourage wider adoption of PA.
The current PA technology provided by tractor owners is now mature and its operation established: it allows precision fertilization, sowing, spreading and irrigation. However, the latest advancements in PA emphasize the need for accurate and efficient application methods, where technologies like interpolation and machine learning can play a pivotal role. Interpolation techniques can model and predict the spatial distribution of SWS, enabling better management practices that align with the goals of PA [10,11]. Furthermore, machine learning models can classify and predict distribution zones, optimizing sludge application and ensuring positive impacts on crop yields and environmental sustainability [12].
When comparing accuracy results to Kriging, which is traditionally effective at modelling spatial autocorrelation, it is encouraging to see that the combination of neural networks and spline interpolation has the potential not only to match but possibly exceed Kriging’s performance in certain aspects, primarily when data preprocessing is carefully managed [7,13]. This observation reflects a promising trend in the literature, where combining machine learning and interpolation techniques leads to more robust and more reliable spatial predictions.
Furthermore, the research conducted by [14], who examined various interpolation methods for predicting soil attributes, adds valuable insight to this study’s findings. Their results highlight the strengths of spline-based methods, particularly when utilized alongside advanced machine learning models. This study suggests that simpler methods, such as inverse distance weighting (IDW), may struggle with more complex datasets, illustrating the advantages of more sophisticated techniques like spline interpolation.
This study investigates the effectiveness of a standard manure spreader in distributing urban sludge across agricultural land, using interpolation methods to assess the spread patterns, and machine learning models to classify and predict distribution zones. The results aim to contribute to the optimization of sludge application strategies, supporting the broader adoption of PA practices [10,13].
2. Materials and Methods
2.1. Field Trials
Field experiments were conducted on a test site located in northern Italy, around San Giorgio di Lomellina (Pavia, Italy; 45.156450° N, 8.768105° E). The SWS came from different urban wastewater treatments across Italy and was then processed with lime (calcium oxide) to reduce the bacterial load and stabilize its mass.
SWS was applied using a conventional manure spreader across a pre-tilled agricultural field. Fifteen trays (0.50 m × 0.50 m) were positioned to capture the spread pattern, distributed systematically along both longitudinal and latitudinal axes (five latitudinally × three longitudinally, every 3 m) (Figure 1). The experiment was repeated three times under consistent conditions to ensure data reliability and capture any spread pattern variability. The sludge collected from the trays was weighed immediately (in grams) to avoid any loss of moisture that could compromise the final result. The weights were recorded in an Excel® file for processing. The aim was to evaluate the machine’s ability to distribute sludge evenly across the field.
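The tray layout and the conversion from tray mass to an application rate can be sketched as follows. This is only an illustrative Python sketch, not part of the original workflow: the function names are hypothetical, the 3 m spacing is assumed to be measured between tray centres, and the mass in the usage example is invented.

```python
# Illustrative sketch; names are hypothetical, dimensions are taken from the text.
TRAY_SIDE_M = 0.50      # tray edge length (m)
SPACING_M = 3.0         # assumed centre-to-centre distance between trays (m)
N_LAT, N_LON = 5, 3     # trays across the width x trays along the travel direction


def tray_grid():
    """Return (x, y) tray-centre coordinates in metres, origin at the first tray."""
    return [(i * SPACING_M, j * SPACING_M)
            for j in range(N_LON) for i in range(N_LAT)]


def grams_to_kg_per_ha(mass_g, tray_side_m=TRAY_SIDE_M):
    """Convert the mass caught by one tray (g) to an application rate (kg/ha)."""
    area_m2 = tray_side_m ** 2            # 0.25 m^2 for a 0.50 m tray
    g_per_m2 = mass_g / area_m2
    return g_per_m2 * 10_000 / 1_000      # 10,000 m^2 per ha, 1,000 g per kg


if __name__ == "__main__":
    grid = tray_grid()
    print(len(grid), "trays; last tray centre at", grid[-1])
    print("250 g in one tray corresponds to", grams_to_kg_per_ha(250.0), "kg/ha")
```

Under this assumption, five trays at 3 m spacing span 12 m across the width, which corresponds to the distribution width reported for the spreader.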
2.2. Spreader Characteristics
The trailed manure spreader (Figure 2) used in the trials had a volume of 8 m³, two rear vertical rotors (360 rpm) for spreading and a distribution width of 12 m. The spreading of organic material was guaranteed by a chain mat that moved a belt on the bottom surface of the wagon. A sensorized hydraulic cylinder (MH-Series MS, Temposonics GmbH Co., Lüdenscheid, Germany) mounted on the rear bulkhead was used to adjust the opening height according to the characteristics of the organic material inside the spreader. Furthermore, a belt advancement control sensor (model M18, DigiDevice, Calvisano, Brescia, Italy) was installed to modulate the speed and, therefore, the amount of material delivered to the rear rotors. In the trials, the spreader was used in conventional mode; the installed sensors acquired information without changing the spreader distribution settings.
2.3. Statistical Analysis
The mass of sludge collected in each tray was analyzed using ANOVA, after checking for homogeneity of variance and normality, to determine whether there were significant differences in the distribution patterns across the repetitions (trials) and along the geographical coordinates. The significance threshold was set at p < 0.05. The analysis aimed to identify any systematic variations in distribution caused by mechanical limitations of the distributor or the configuration of the field. Subdivision into geographical axes made it possible to separately assess the uniformity of distribution along the direction of travel (longitudinal axis) and along the width of the field (latitudinal axis).
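As an illustration of the test described above, the one-way ANOVA F statistic can be computed from its definition. The sketch below is a minimal Python example using only the standard library; the function name is hypothetical, the grouping (for example, tray masses grouped by trial or by position along one axis) is an assumption, and the numbers in the usage example are invented rather than measured.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of observations around their group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Invented masses (g) for three hypothetical groups of trays.
    demo = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    print(one_way_anova_f(demo))
```

The p-value is obtained by comparing F with the F distribution for the returned degrees of freedom; a statistics library (for instance `scipy.stats.f_oneway`, which returns both the statistic and the p-value) would normally be used for that step and for the preliminary normality and variance checks.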
2.4. Interpolation Techniques
Spatial interpolation was used to estimate values at unsampled locations from the limited number of points where values were observed.
The software QGIS 3.28 LTR with the SAGA plugin was used to assess the spatial distribution of the sludge; subsequently, three interpolation methods were applied as follows:
Spline: This method creates a smooth curve that fits the data points, ideal for modelling gradual changes in distribution. The data (amount of sludge, kg) of the single trials are included in the algorithm, and the grid pixel output extent (m) is selected.
Cubic Spline: This is a more refined version of the spline that ensures a smoother transition between data points. Because it follows third-degree polynomials, the interpolated result is smoother and avoids oscillations. The input data are the same as for the spline, but the output settings differ: the cubic spline allows the selection of the minimum and maximum number of points used to create the interpolation (with too few points no interpolation is produced, while capping the number of points limits overfitting) and, finally, the density of points to be used in a specific cell.
Inverse Distance Weighting (IDW): This method calculates the value of unknown points based on the inverse distance from known points, with closer points having more influence. The algorithm assigns each known point a weight that depends on its distance from the point being estimated. This method works best with a regular sample grid; otherwise, the interpolation may contain more errors and be less accurate. The input data are the same as for the previous methods. The set radius influences the output results, as it defines the distance over which a single point influences the final calculated interpolation.
These interpolation methods generated distribution maps visually representing the spread pattern across the field. The maps were analyzed to identify any trends or anomalies in the distribution, with particular attention to areas of over- or under-application.
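Of the three methods listed above, IDW is the simplest to write out explicitly. The following is a minimal Python sketch of the general technique, not the SAGA implementation used in the study: the function name, the exponent `power=2.0` and the sample values are assumptions chosen for illustration, and `radius` plays the role of the search radius mentioned above.

```python
import math


def idw(x, y, samples, power=2.0, radius=None):
    """Inverse-distance-weighted estimate at (x, y).

    samples: list of (xi, yi, value) tuples at known locations.
    power:   distance exponent (2.0 is an assumed, commonly used default).
    radius:  optional search radius; samples farther away are ignored.
    Returns None when no sample lies within the radius.
    """
    num = den = 0.0
    for xi, yi, vi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return vi                  # exactly on a sample: return its value
        if radius is not None and d > radius:
            continue
        w = 1.0 / d ** power           # closer samples receive larger weights
        num += w * vi
        den += w
    return num / den if den > 0.0 else None


if __name__ == "__main__":
    # Two invented samples 2 m apart; the midpoint is weighted equally by both.
    demo = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    print(idw(1.0, 0.0, demo))
    print(idw(1.0, 0.0, demo, radius=0.5))
```

Evaluating such a function on a regular grid of (x, y) positions yields a raster comparable in kind to the distribution maps described above, although the numerical results will depend on the exponent and radius chosen.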
2.5. Machine Learning Classification on Field Data
Machine learning classification was used to validate the interpolation results. Three machine learning algorithms were employed to classify sludge distribution patterns and predict application zones using the interpolated data as input. The implementation was carried out in the R environment using specific packages for each model [15]. The analysis leveraged several key R packages that are commonly used in the field of machine learning and data analysis:
k-Nearest Neighbors (kNN): The k-nearest neighbors algorithm classifies points based on the majority category of their nearest neighbors, making it particularly effective in scenarios where the data are locally structured. This study used the class package [8] to implement the kNN algorithm, with the number of neighbors (k) selected via cross-validation to optimize predictive performance. The caret package [16] was also utilized for model training and evaluation, allowing for a streamlined workflow that includes data preprocessing, model tuning, and validation.
Random Forest: Random forest is an ensemble learning method that builds multiple decision trees and merges their results to enhance classification accuracy and control overfitting [17]. The randomForest package [18] was employed to create and train the random forest models, with hyperparameters such as the number of trees (ntree) and the number of variables tried at each split (mtry) tuned to achieve the best performance. Additionally, the caret package was again used to manage the tuning process and to evaluate model performance using cross-validation techniques.
Neural Network: A neural network, particularly a feedforward neural network, was applied to capture complex relationships within the data, offering high predictive accuracy for non-linear and high-dimensional datasets. The nnet package [8] was used to implement a single hidden-layer neural network, with the number of neurons in the hidden layer (size) and the regularization parameter (decay) optimized through grid search techniques. For more complex deep learning models, the keras package [19] was integrated, allowing for the construction of deep neural networks with multiple layers, activation functions, and dropout regularization to mitigate overfitting. The architecture consisted of an input layer corresponding to the number of features in the preprocessed dataset, a single hidden layer with 64 neurons, and an output layer with one neuron employing a sigmoid activation function to output probabilities for binary classification. The hidden layer utilized the ReLU activation function to handle non-linear relationships in the data. The model was trained using the Adam optimizer with an initial learning rate of 0.001, selected for its efficiency in handling gradient updates and its adaptability during training. The batch size was set to 32, balancing computational efficiency and stability during the learning process. Training was capped at 100 epochs, with early stopping implemented to prevent overfitting. Early stopping monitored the validation loss and halted training when no improvement was observed over 10 consecutive epochs. To further mitigate overfitting, a decay parameter of 0.01 was applied as regularization.
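The early-stopping rule described for the neural network (a cap of 100 epochs and a patience of 10 epochs on the validation loss) can be illustrated independently of any deep learning library. The sketch below is a Python illustration of the rule only; the function name is hypothetical and the loss sequences are invented, so it does not reproduce the study's training runs.

```python
def early_stopping_epoch(val_losses, max_epochs=100, patience=10):
    """Replay per-epoch validation losses and return (stop_epoch, best_epoch).

    Epochs are 1-based. Training stops once the validation loss has failed to
    improve for `patience` consecutive epochs, or when `max_epochs` is reached.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return min(len(val_losses), max_epochs), best_epoch


if __name__ == "__main__":
    # Invented curve: the loss improves for three epochs, then plateaus.
    plateau = [1.0, 0.8, 0.6] + [0.7] * 50
    print(early_stopping_epoch(plateau))
```

In Keras itself this behaviour is normally requested through the `EarlyStopping` callback with `monitor="val_loss"` and `patience=10`, rather than implemented by hand.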
The performance of the neural network was evaluated using several metrics, during both training and validation phases (MT: Training Metrics) and on the test set (MP: Prediction Metrics), including mean absolute error (MAE) and mean squared error (MSE) to measure prediction accuracy, and accuracy (CA) to assess the proportion of correctly classified instances. The area under the curve (AUC) was calculated to evaluate the model’s ability to distinguish between classes. Additionally, F1 score, precision, and recall were used to balance the evaluation of true positive and false negative predictions. The Matthews correlation coefficient (MCC) was particularly useful in assessing the agreement between predictions and true values, especially in imbalanced datasets. These metrics were computed during both training and validation phases to ensure consistency in performance evaluation. The dataset included the distributed sludge quantities (kg) measured at the 15 sampling points and interpolated using the spline, cubic spline and IDW methods.
Data preprocessing steps, including normalization, feature scaling, and handling of missing values, were performed using the caret package, ensuring that all models received comparable input data. To ensure reproducibility, the random seed was fixed in R with set.seed(42), a base R function, before optimization and training so that results were consistent across runs. The value 42 was selected arbitrarily, as it is a widely used convention in computational research; any fixed value would provide the same reproducibility.
For each model, the optimization of the hyperparameters was performed using the caret package, which allowed for the identification of optimal values through cross-validation and error minimization on the test data.
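To make the tuning step concrete, the selection of k for kNN by cross-validation can be sketched in a few lines. This is a Python illustration of the general procedure, not the caret workflow used in the study: the function names, the five folds, the candidate values of k, the class labels and the demonstration data are all assumptions.

```python
import math
import random
from collections import Counter


def knn_predict(train, point, k):
    """Majority vote among the k training samples closest to `point`.

    train: list of ((x, y), label) pairs.
    """
    nearest = sorted(train, key=lambda s: math.dist(s[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(data, k, folds=5, seed=42):
    """Mean kNN accuracy over `folds` cross-validation folds."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)     # fixed seed for reproducibility
    scores = []
    for f in range(folds):
        test = shuffled[f::folds]             # every folds-th sample is held out
        train = [s for i, s in enumerate(shuffled) if i % folds != f]
        hits = sum(knn_predict(train, p, k) == label for p, label in test)
        scores.append(hits / len(test))
    return sum(scores) / folds


def select_k(data, candidates=(1, 3, 5, 7)):
    """Return the candidate k with the highest cross-validated accuracy."""
    return max(candidates, key=lambda k: cv_accuracy(data, k))


def make_demo_data():
    """Two invented, well-separated clusters with hypothetical zone labels."""
    under = [((0.1 * i, 0.1 * i), "under") for i in range(10)]
    over = [((10 + 0.1 * i, 10 + 0.1 * i), "over") for i in range(10)]
    return under + over


if __name__ == "__main__":
    data = make_demo_data()
    print("selected k:", select_k(data))
```

Odd candidate values of k are used so that a two-class vote cannot tie. Note that in this sketch the hyperparameter is chosen on held-out folds of the training data only, which keeps the final test set untouched until evaluation.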
The dataset was divided into training (70%) and testing (30%) sets, and each model was iterated 100 times to ensure robustness. Performance metrics, including area under the curve (AUC), classification accuracy (CA), F1 score, precision, recall, and the Matthews correlation coefficient (MCC), were calculated for each model. AUC was calculated using the R pROC package [20], based on the ROC curves generated for each model. CA, precision, recall and F1 score were derived from the confusion matrices. The MCC was calculated to assess the correlation between predictions and observations, which is particularly useful in the presence of unbalanced datasets. Results were averaged over the 100 iterations to reduce variability due to randomization in the distribution of the data. These metrics provided insights into the effectiveness of the machine learning models in accurately classifying the distribution patterns.
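The confusion-matrix metrics named above follow standard definitions, which can be written out as a short Python sketch. The function name and the counts in the usage example are invented for illustration and are not results from this study; AUC is omitted because it requires the predicted probabilities rather than the confusion matrix.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute CA, precision, recall, F1 and MCC from binary confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # MCC uses all four cells, so it stays informative with unbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"CA": accuracy, "precision": precision, "recall": recall,
            "F1": f1, "MCC": mcc}


if __name__ == "__main__":
    # Invented counts for a hypothetical 100-sample test set.
    for name, value in classification_metrics(tp=40, fp=10, fn=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

Averaging the returned values over repeated random 70/30 splits gives the kind of summary reported for each model.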
4. Discussion
The results of this study highlight both the potential and the limitations of using a conventional manure spreader for sludge application in precision agriculture. The significant variation along the longitudinal axis underscores the challenge of achieving uniform distribution with the current equipment, particularly across the working width. However, integrating interpolation and machine learning techniques presents a viable solution. By accurately modelling the spread pattern, these techniques allow for identifying areas that require corrective action, such as adjusting the overlap of spreader passes to achieve more uniform coverage [7,8].
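The effect of adjusting the overlap between passes can be illustrated by summing a single-pass transverse pattern with copies of itself shifted by the distance between adjacent passes, and then computing the coefficient of variation (CV) of the result. The following Python sketch is an illustration under stated assumptions rather than a procedure taken from this study: the function names are hypothetical, the pattern is an invented symmetric profile sampled at five equally spaced positions, and `shift` is the pass spacing expressed in tray positions.

```python
from statistics import mean, pstdev


def overlapped_pattern(pattern, shift):
    """Sum a single-pass transverse pattern with copies offset by multiples of `shift`.

    pattern: amounts collected at equally spaced positions across one pass.
    shift:   spacing between adjacent passes, in sampling positions (>= 1).
    """
    n = len(pattern)
    return [sum(pattern[i - m * shift]
                for m in range(-n, n + 1)
                if 0 <= i - m * shift < n)
            for i in range(n)]


def coefficient_of_variation(values):
    """CV in percent (population standard deviation over the mean)."""
    return 100.0 * pstdev(values) / mean(values)


if __name__ == "__main__":
    single_pass = [0.0, 1.0, 2.0, 1.0, 0.0]   # invented triangular profile
    for shift in (2, 4):
        combined = overlapped_pattern(single_pass, shift)
        print(shift, combined, round(coefficient_of_variation(combined), 1))
```

With this invented triangular profile, passes spaced at half the pattern width combine into a flat distribution, whereas passes spaced at the full pattern width leave the single-pass unevenness unchanged.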
Furthermore, this study provides valuable insights into developing precision agriculture strategies that incorporate advanced data analysis tools. Predicting distribution patterns with high accuracy enables more informed decision-making, potentially leading to the more efficient use of resources, reduced environmental impact, and improved crop yields. However, the effectiveness of the machine learning models is highly dependent on the quality and quantity of data collected, which may vary under different field conditions or with different types of spreaders. Additionally, while the interpolation methods used in this study provided reliable results, they are based on assumptions about the spatial relationships between data points that may not always hold true in practice [10].
An innovative aspect of this study is the cross-validation of interpolative results using machine learning models, ensuring a robust and accurate analysis of distribution dynamics. This approach, rarely explored in the current literature, demonstrates how advanced data analysis tools can be integrated into precision agriculture practices to improve the efficiency of sludge applications and reduce environmental impact.
Despite encouraging results, this study has some limitations. First, the data were collected at a single experimental site with a specific type of manure spreader. Future studies should replicate these analyses under diverse environmental conditions and with different equipment to confirm the generalizability of the findings. Second, machine learning techniques require large datasets to achieve optimal performance, which may pose a barrier to their practical adoption in agricultural contexts with limited data collection resources. Developing more efficient data collection methods and simplified learning models could facilitate the adoption of these technologies by farmers.
The long-term impacts of urban sludge application, such as heavy metal accumulation and organic pollutants, should also be evaluated. Combining advanced technological approaches with sustainable agronomic strategies is a priority to ensure both environmental and productive benefits.
This study provides practical insights for precision agriculture software developers and agricultural machinery manufacturers. Enhancing the mechanical components of manure spreaders, such as by implementing advanced sensors and automated systems for speed and trajectory adjustment, could further improve distribution uniformity. In parallel, the algorithms developed for distribution analysis could be integrated into automatic tractor guidance systems, enabling site-specific SWS applications aligned with prescription maps.
When evaluating interpolation results against established research, several compelling insights arise. The results presented are consistent with findings from geostatistical and agricultural studies. In terms of spline and cubic spline interpolations, these methods typically outperform inverse distance weighting (IDW) due to their smoothness and ability to model continuous surfaces. Neural networks also perform strongly, combining the advantages of spline methods and machine learning to capture complex patterns [7]. Additionally, random forest is recognized for its ability to balance precision and recall, showing robust results in agricultural contexts, particularly for predicting crop yields and soil conditions [7,10]. This study adheres to this trend, with spline-based techniques combined with neural networks achieving the highest evaluation metrics, particularly in AUC, F1 score, precision, and the Matthews correlation coefficient (MCC).
The integration of neural networks with spline interpolation increases predictive accuracy, especially in intricate data pattern scenarios [7]. The work of [12] supports our conclusions, as they noted that spline-based methods and neural networks performed exceptionally well in spatial predictions, particularly concerning geospatial data on soil properties and crop yields. Likewise, Reference [11] reinforced that spline-based methods are particularly effective for spatial predictions, especially in mapping soil properties. This is consistent with our findings, which indicate that neural networks achieved superior results compared to other models in terms of area under the curve (AUC) and Matthews correlation coefficient (MCC). These insights emphasize the potential of combining these advanced methods for improved spatial analysis.
In their study, Reference [21] provided critical insights into interpolation methods for predicting water quality indices, underscoring the effectiveness of spline-based methods and neural networks. They asserted that these approaches deliver remarkably accurate and robust predictions in complex spatial scenarios, significantly surpassing traditional inverse distance weighting (IDW) techniques.
Overall, the described experiment reinforces the prevailing literature, highlighting the strengths of spline-based methodologies and machine learning models, especially neural networks. Such a combination is key to achieving reliable outcomes in spatial analysis.
5. Conclusions
This study highlights the practical implications of using advanced technologies such as machine learning and interpolation methods in precision agriculture. The combination of IDW interpolation and neural networks achieved the highest accuracy, with a Matthews correlation coefficient (MCC) of 0.9820, showcasing their potential to address challenges in spatial variability and distribution efficiency. These results provide actionable insights for improving manure spreader calibration and optimizing in-field operations. Furthermore, adopting these techniques can enhance nutrient management strategies, reducing environmental risks and ensuring sustainable agricultural practices. Future research should focus on integrating real-time sensors and automation systems to further refine application precision and expand the scalability of these approaches.
However, several critical points must be addressed in future research. Firstly, the current study is based on data from a single field trial with a specific type of manure spreader; additional studies are needed to validate these findings across different field conditions and with other types of spreaders. Secondly, machine learning models, while accurate, require large datasets for training, which may not always be feasible in practical settings. Future research should explore ways to optimize data collection and model training to ensure these techniques are accessible and applicable in diverse agricultural contexts. Furthermore, to improve the precision management of these organic matrices with high variability, it is important to provide preliminary laboratory analysis or, even more precisely, on-the-go analysis with expeditious methods (e.g., NIR) to manage the distribution. Finally, further investigation is needed into the long-term impacts of sludge distribution patterns on soil health and crop productivity to ensure that precision agriculture strategies can be sustainably integrated into farming practices.