A Comprehensive Analysis of Soil Erosion in Coastal Areas Based on an Unmanned Aerial Vehicle and Deep Learning Approach

Li, Han; Miao, Sheng; Qi, Yansu; Gao, Huiwen; Duan, Haoyan; Liu, Chao; Gao, Weijun

doi:10.3390/su17031261


by Han Li ¹, Sheng Miao ², Yansu Qi ³, Huiwen Gao ², Haoyan Duan ³, Chao Liu ¹,* and Weijun Gao ⁴

¹ School of Environmental and Municipal Engineering, Qingdao University of Technology, Qingdao 266520, China
² School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China
³ College of Architecture and Urban Planning, Qingdao University of Technology, Qingdao 266520, China
⁴ Faculty of Environmental Engineering, The University of Kitakyushu, Kitakyushu 808-0135, Japan
* Author to whom correspondence should be addressed.

Sustainability 2025, 17(3), 1261; https://doi.org/10.3390/su17031261

Submission received: 15 January 2025 / Revised: 1 February 2025 / Accepted: 2 February 2025 / Published: 4 February 2025

(This article belongs to the Section Soil Conservation and Sustainability)


Abstract

Soil is an important nonrenewable resource. Soil erosion is increasingly severe, and its accurate identification is crucial for ecological sustainability. In recent years, advances in artificial intelligence have contributed significantly to the development of precise modeling technologies. This study uses high-resolution multispectral images captured by unmanned aerial vehicles and applies five machine learning models, namely convolutional neural network (CNN), support vector classification, random forest, extreme gradient boosting, and fully connected neural network, to identify regional soil erosion. The performance of each model is evaluated using accuracy, recall, and F1-score. The results show that all models exhibit strong recognition capabilities, with CNN outperforming the others in both the training and testing phases. Specifically, CNN achieved a recall of 0.99 on the training set and an F1-score of 0.98. Given the black-box nature of machine learning models, the SHapley Additive exPlanations (SHAP) method is further used to interpret the model outputs. The analysis reveals that the normalized difference salinity index and the soil erodibility factor are the primary factors influencing soil erosion in the study area.

Keywords:

soil erosion; multispectral remote sensing; convolutional neural networks; SHapley Additive exPlanations; deep learning

1. Introduction

As a fundamental provider of numerous ecosystem services, soil plays an indispensable role in sustaining human and ecological well-being while supporting sustainable development. Its significant carbon sink function allows for the absorption and storage of substantial amounts of carbon [1,2], thereby mitigating the adverse effects of global climate change. However, in recent years, escalating global warming and intensified human activities have posed unprecedented challenges to soil resources [3,4]. Among these challenges, soil erosion, one of the most severe threats to global soil resources, poses a significant risk to the sustainable development of natural ecosystems, economies, and societies [5]. According to the Status of the World’s Soil Resources report, over one billion hectares of land worldwide are currently affected by various types of erosion, including water, wind, and gully erosion [6]. This alarming situation underscores the urgent need for immediate and effective measures to combat soil erosion and preserve soil health.

Furthermore, soil erosion is a complex process influenced by multiple interacting factors, including rainfall, soil properties, topography, and climate change [7]. Accurately identifying and monitoring soil erosion therefore remains a significant challenge. Empirical models offer simplified approaches, but their reliability is often limited by oversimplified assumptions and a reliance on specific environmental conditions [8]. Consequently, there is ongoing debate regarding the most effective and accurate classification model for soil erosion. In this context, machine learning (ML) algorithms, with their strong data processing capabilities and ability to capture complex relationships, provide a promising solution for soil erosion classification and monitoring. These algorithms can automatically extract key features from large datasets and uncover intricate relationships among variables, enabling the precise classification of soil erosion phenomena [9]. However, the performance of ML algorithms can vary significantly from task to task. Selecting the most suitable algorithm for soil erosion classification and monitoring has therefore become a key focus of current research.

Given the significant threat that soil erosion poses to global ecosystem services and sustainable development, coupled with the limitations of existing monitoring and identification methods, this study aims to explore a more efficient and accurate approach to soil erosion classification. Using multispectral images collected by unmanned aerial vehicles (UAVs), this study selects several indicators based on an empirical model and the geographical and environmental characteristics of the study area. On this basis, the performance of five machine learning algorithms, namely convolutional neural network (CNN), support vector classification (SVC), random forest (RF), extreme gradient boosting (XGBoost), and fully connected neural network (FCNN), is compared on the soil erosion classification task. The performance of each model is evaluated using three key metrics: accuracy, recall, and F1-score. Furthermore, to gain deeper insight into the influence of various factors on soil erosion classification, SHapley Additive exPlanations (SHAP) is employed for feature importance analysis. The main contributions of this study are as follows:

Five machine learning techniques are applied to model the classification of soil erosion intensity.
The convolutional neural network model demonstrates the best performance in soil erosion modeling.
The normalized difference salinity index is identified as the most important factor influencing soil erosion.

2. Related Works

Soil erosion has a significant impact on the protection and development of soil resources, water resources, and plant resources. The methods for classifying and monitoring soil erosion intensity can generally be divided into physical and empirical models [10], with the latter currently being the most widely utilized. To provide a comprehensive assessment of soil conditions in a basin, Sud [11] combined the Revised Universal Soil Loss Equation (RUSLE) model with Google Earth Engine to reveal different degrees of soil erosion in the Satluj basin. Similarly, Delgado [12] integrated satellite remote sensing data with the universal soil loss equation model to estimate soil loss in the plain basin of Argentina. However, despite the advantages of satellite remote sensing, such as wide coverage and relatively easy data acquisition, its data quality is often compromised by factors such as spatial resolution and cloud cover, which necessitate sophisticated processing techniques. In contrast, UAV technology offers greater flexibility and high-resolution imaging capabilities, providing a rich source of data for soil erosion monitoring. Pijl [13] utilized high-resolution terrain and land use data obtained from UAVs, applying both the RUSLE and the simulated water erosion model to analyze soil erosion in steep slope vineyards. While empirical models, such as RUSLE, play a crucial role in soil erosion research, their reliability is often limited by the simplified assumptions and specific conditions they rely on. This is particularly evident when considering that soil erosion is influenced by a combination of factors, such as slope, temperature, precipitation, and vegetation.

To address the limitations of empirical models, recent research has shown a clear trend toward using machine learning algorithms for more effective analysis of soil erosion. Various ML algorithms, such as support vector machine (SVM), RF [14], and XGBoost [15], have proven to be valuable tools for identifying and mapping different types of soil erosion. Gelete [16] used four ML models, namely XGBoost, RF, SVM, and artificial neural network (ANN), to identify areas sensitive to gully erosion. Additionally, Sadia [17] employed a combination of the analytic hierarchy process and ML techniques, including classification and regression trees, ANN, SVM, and RF, to identify areas in mountainous regions particularly vulnerable to soil erosion. Among these models, the RF model achieved the highest area under the curve (AUC), at 86%. Because ML algorithms behave as "black box" models, their decision-making process is often difficult to interpret, which limits their broader application in soil erosion research [18]. To address this issue, researchers have begun exploring model interpretation methods, such as local interpretable model-agnostic explanations (LIME) [19], gradient-weighted class activation mapping (Grad-CAM) [20], and SHAP [21]. These methods help identify the influence of each feature in the model and assess their relative importance [22]. With rich visualization tools and the ability to capture feature interactions, they help researchers gain deeper insight into the decision-making processes of complex models, thereby enhancing the credibility and transparency of machine learning applications. For instance, Asma [23] employed Grad-CAM to offer a visual explanation for the decisions made by a CNN. Similarly, Mortier [24] used SHAP to analyze the relative importance of various variables, revealing that soil temperature had a significant impact on plant phenology and thereby demonstrating the utility of SHAP in environmental studies.

In short, soil erosion is a complex process. Current research primarily relies on satellite remote sensing data. However, significant challenges remain in data preprocessing, such as denoising and feature extraction. Additionally, empirical models exhibit notable limitations, and existing machine learning-based studies on soil erosion often focus narrowly on technical aspects while lacking comprehensive consideration of regional factors. Moreover, the performance of different machine learning models can vary significantly. To address these issues, this study utilizes UAV imagery, integrates geographic and environmental characteristics of the study area, and selects indicators based on empirical models to identify soil erosion intensity using multiple machine learning models. Furthermore, the SHAP method is introduced to interpret the key factors influencing soil erosion, thereby providing a scientific foundation and robust support for soil resource protection and sustainable management.

3. Materials and Methods

3.1. Study Area

The study area is located in the southern region under the jurisdiction of coastal cities in China, and its geographic coordinate information and boundary are shown in Figure 1. It covers a total area of 751.02 hectares. According to the time series observation data of meteorological stations around the study area, the average annual precipitation is 685.2 mm, the average annual temperature is 13.3 °C, and the average annual relative humidity is 70%. The primary soil types in the study area include brown soil and tidal brown soil. There are no rivers within the study area. The primary land use types are construction land, woodland, bare land, grassland, and so on. The area has rich natural landscapes and abundant tourism resources. However, rapid urban expansion and large-scale construction activities have significantly threatened soil protection in this study area.

3.2. Data Collection and Pre-Processing

The data used in this study primarily consisted of high-resolution multispectral images collected by a UAV and data from the HWSD 2.0 database. The DJI Phantom 4 Multispectral (P4M) UAV (Shenzhen Dajiang Innovation Technology Co., Ltd., Shenzhen, China) was selected for image collection. The P4M UAV is equipped with a sensor that captures six channels: blue, green, red, red-edge, and near-infrared bands, plus a full-color channel for visible imaging. At a relative altitude of 100 m, the UAV achieves an aerial photography accuracy of 0.05 m. The collected UAV multispectral images must be pre-processed before use, and the processing workflow is shown in Figure 2. The resulting orthophoto has a resolution of 80,038 × 91,959 pixels.

Based on the geographical characteristics of the study area and the spectral properties of the UAV imagery, combined with the empirical model, this study selected nine indices to study soil erosion. These indices included the soil erodibility factor (K), slope, aspect, digital surface model (DSM), distance from the coastline, normalized difference salinity index (NDSI), nitrogen reflectance index (NRI), cover management factor (C), and support practice factor (P), as detailed in Table 1 and Table 2. Given the small geographical scale of the study area and the relatively uniform spatial distribution of rainfall, the influence of rainfall was not considered in this study.
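
Indices such as the NDSI are typically computed pixel by pixel as a normalized difference of two spectral bands. The sketch below illustrates that calculation only. It assumes the commonly used form NDSI = (Red − NIR) / (Red + NIR); the band pairing, the function name `normalized_difference`, and the sample reflectance values are assumptions for illustration and are not taken from this study.

```python
def normalized_difference(band_a, band_b, eps=1e-12):
    """Pixel-wise (a - b) / (a + b) for two equally sized lists of reflectance values."""
    # Pixels whose band sum is (near) zero are set to 0.0 to avoid division by zero.
    return [(a - b) / (a + b) if abs(a + b) > eps else 0.0
            for a, b in zip(band_a, band_b)]


# Hypothetical reflectance values for three pixels (not measured data).
red = [0.30, 0.25, 0.10]
nir = [0.20, 0.25, 0.40]

# ASSUMPTION: NDSI = (Red - NIR) / (Red + NIR), one common definition.
ndsi = normalized_difference(red, nir)
print(ndsi)
```

By construction, a normalized difference of non-negative reflectances is bounded between −1 and 1, which makes indices computed from different flights easier to compare.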

Field investigation and sample data collection were carried out in areas of actual soil erosion. Sampling points were randomly distributed throughout the study area, as shown in Figure 1. Based on the investigation, the soil erosion intensity in the study area was classified into three categories: slight, light, and moderate. Two primary methods were used to acquire sample data. One involved taking photographs and recording measurements with a handheld RTK device, which provides centimeter-level positioning accuracy. The other involved capturing aerial images using a CMOS image sensor mounted on a DJI Air 2S UAV. For model training and verification, the collected sample data were divided into three subsets, a training set, a test set, and a validation set, in a ratio of 6:2:2. This structured approach allows for comprehensive model evaluation and improves the reliability of the results.
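
A 6:2:2 partition of this kind can be sketched in a few lines. The example below is a minimal illustration, assuming a simple random shuffle with a fixed seed; the function name `split_6_2_2` is hypothetical, and the study does not state whether the split was random, stratified by erosion class, or spatially blocked.

```python
import random


def split_6_2_2(samples, seed=0):
    """Shuffle samples and split them into training, test, and validation subsets (6:2:2)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n = len(shuffled)
    n_train = n * 6 // 10
    n_test = n * 2 // 10
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, so no sample is dropped
    return train, test, validation


train, test, validation = split_6_2_2(range(100))
print(len(train), len(test), len(validation))  # 60 20 20
```

Keeping the three subsets disjoint is what allows the test set to serve as an unbiased check on the trained models.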

3.3. Methods

3.3.1. Machine Learning Models

Convolutional neural networks (CNNs) are a type of deep learning model particularly well-suited for processing data with grid-like structures, such as time-series data and image data. A typical CNN architecture consists of several components, including convolutional layers, activation functions, pooling layers, fully connected layers, and output layers. It has the characteristics of local connections and weight sharing, which can greatly reduce the number of parameters that need to be learned.
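
The effect of local connections and weight sharing on the parameter count can be checked with simple arithmetic. The sketch below compares one convolutional layer with a fully connected layer that produces an output of the same size. The patch size (32 × 32), the nine input channels (one per index), and the 16 output feature maps are hypothetical values chosen for illustration; the source does not describe the network dimensions.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 2D convolutional layer (one kernel shared across all positions)."""
    return out_channels * (in_channels * kernel_size * kernel_size + 1)


def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_outputs * (n_inputs + 1)


# Hypothetical layer: 9 input channels, 32 x 32 pixels, 16 output feature maps, 3 x 3 kernels.
height = width = 32
conv = conv2d_params(in_channels=9, out_channels=16, kernel_size=3)
dense = dense_params(n_inputs=9 * height * width, n_outputs=16 * height * width)

print(conv)   # 1312
print(dense)  # 151011328
```

Under these assumptions the convolutional layer needs about five orders of magnitude fewer parameters than the dense layer, because its parameter count does not depend on the image size.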

Support vector classification (SVC) is a supervised learning model. Using kernel functions, SVC implicitly maps the input feature vectors into a higher-dimensional space in which a nonlinear problem can be treated as a linear one. One of the key advantages of SVC is its sparse solution: the decision boundary depends only on the support vectors, which significantly enhances computational efficiency. It is particularly effective in situations where precise classification is crucial and where high-dimensional data must be processed.
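
The kernel technique can be illustrated with a degree-2 polynomial kernel, for which the implicit feature map is small enough to write out. The sketch below is an illustration only: it checks numerically that the kernel value equals an inner product in the mapped space, and shows that XOR-like points, which are not linearly separable in the input space, are separated by a single mapped coordinate. The kernel choice and the function names are assumptions; the source does not state which kernel was used.

```python
import math


def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel k(x, y) = (x . y)^2 for 2D inputs."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2


def poly2_features(x):
    """Explicit feature map phi with phi(x) . phi(y) == poly2_kernel(x, y)."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


x, y = (1.0, 2.0), (3.0, -1.0)
print(poly2_kernel(x, y), dot(poly2_features(x), poly2_features(y)))

# XOR-like labels: the sign of the second mapped coordinate separates the classes.
points = {(1, 1): +1, (-1, -1): +1, (1, -1): -1, (-1, 1): -1}
for point, label in points.items():
    print(point, label, poly2_features(point)[1] > 0)
```

The kernel evaluates the inner product without ever building the mapped vectors, which is what keeps the approach tractable when the implicit space is very high-dimensional.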

Random forest (RF) is an ensemble learning method that effectively mitigates overfitting while demonstrating strong performance in handling high-dimensional data. The model operates by constructing multiple decision trees, each independently trained on a randomly selected subset of features and samples. The final prediction is obtained by aggregating the results of these individual trees, by majority vote for classification tasks or by averaging for regression tasks. Notably, RF performs a form of automatic feature selection, which improves computational efficiency.
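
The sampling and aggregation steps described above can be sketched without building the trees themselves. The example below is a simplified illustration, with hypothetical function names and made-up per-tree outputs; it shows bootstrap sampling, a random feature subset, and the two aggregation rules.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as used to train one tree."""
    return [rng.choice(rows) for _ in rows]


def random_feature_subset(n_features, k, rng):
    """Pick k distinct feature indices for one tree (or one split)."""
    return sorted(rng.sample(range(n_features), k))


def aggregate_classification(tree_votes):
    """Majority vote over the class labels predicted by the individual trees."""
    return Counter(tree_votes).most_common(1)[0][0]


def aggregate_regression(tree_outputs):
    """Average of the numeric outputs predicted by the individual trees."""
    return mean(tree_outputs)


rng = random.Random(0)
print(bootstrap_sample(list(range(10)), rng))
print(random_feature_subset(n_features=9, k=3, rng=rng))

# Hypothetical predictions from five trees for a single pixel.
print(aggregate_classification(["slight", "light", "slight", "moderate", "slight"]))  # slight
print(aggregate_regression([1.0, 2.0, 3.0]))  # 2.0
```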

Extreme gradient boosting (XGBoost) is an efficient gradient boosting framework that is highly effective in handling sparse data. It consists of a series of sequentially trained decision trees, each of which is modified based on the residuals of the previous tree to gradually reduce prediction errors. Moreover, XGBoost supports custom loss functions, can effectively handle various types of input data, and offers hyperparameter tuning and built-in cross-validation functions.
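
The idea of fitting each new tree to the residuals of the current ensemble can be shown with a deliberately simplified example. The sketch below implements plain gradient boosting with one-split trees (stumps) and squared-error loss on one-dimensional toy data. It is not XGBoost itself, which additionally uses second-order gradient information and regularization; all names and values here are illustrative assumptions.

```python
def fit_stump(xs, residuals):
    """Find the single-threshold split that minimizes squared error on the residuals."""
    best = None
    for t in sorted(set(xs))[:-1]:  # exclude the maximum so the right side is never empty
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Sequentially add stumps, each fitted to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]
print(mse(boost(xs, ys, n_rounds=1), xs, ys))
print(mse(boost(xs, ys, n_rounds=20), xs, ys))
```

On this toy step function the training error shrinks with every round, which is the gradual error reduction that boosting relies on.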

Fully connected neural network (FCNN) is a classic deep learning model that is particularly well-suited for processing fixed-size input data. It comprises multiple fully connected layers, where each neuron in a layer is connected to every neuron in the adjacent layers. Each layer is characterized by a weight matrix, a bias term, and an activation function, which together facilitate efficient computation. FCNN performs exceptionally well on small-scale datasets that demand precise modeling and accurate classification.

3.3.2. SHAP Interpretation Techniques

SHapley Additive exPlanations (SHAP) is a model interpretation method grounded in Shapley values from game theory, designed to provide a unified and interpretable assessment of feature importance for complex machine learning models, including CNN, RF, and XGBoost. Its primary advantage lies in its ability to allocate the contribution of each feature to the prediction fairly, thereby ensuring both global consistency and local accuracy in interpretation. SHAP rests on three key components: Shapley value calculation, summation of feature contributions, and an expected-value baseline. Together, these enable SHAP to deliver comprehensive insight into the inner workings of machine learning models. Consequently, it has found extensive application in fields such as financial risk assessment, medical diagnosis, and image recognition. It is particularly valuable in scenarios demanding transparency and trust, making SHAP an essential tool for interpretable machine learning.
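
The three components can be made concrete with an exact Shapley computation on a tiny model. The sketch below enumerates all feature coalitions, which is feasible only for a handful of features; practical SHAP implementations rely on approximations and on a background dataset for the baseline. The toy function `toy_model`, the instance, and the all-zero baseline are assumptions for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset of feature indices)."""
    players = range(n_features)
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


def toy_model(x):
    """Hypothetical model: one additive term and one interaction term."""
    return 2 * x[0] + x[1] * x[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]  # ASSUMPTION: a single reference point stands in for the expectation


def value_fn(coalition):
    # Features in the coalition take the instance value; the rest stay at the baseline.
    z = [instance[j] if j in coalition else baseline[j] for j in range(len(instance))]
    return toy_model(z)


phi = shapley_values(value_fn, 3)
print(phi)
print(sum(phi), toy_model(instance) - toy_model(baseline))
```

The contributions come out at approximately 2, 3, and 3: the additive term is credited entirely to the first feature, the interaction term is shared equally between the other two, and the contributions sum to the difference between the prediction and the baseline output.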

3.4. Model Evaluation

To evaluate the performance differences between the machine learning algorithms in soil erosion classification comprehensively and accurately, this study uses three evaluation metrics, namely accuracy, recall, and F1-score, defined as follows:

Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)

Recall = \frac{TP}{TP + FN} \qquad (2)

F1\text{-}score = \frac{2TP}{2TP + FP + FN} \qquad (3)

where TP is the number of pixels that are predicted to be positive and are actually positive, TN is the number of pixels that are predicted to be negative and are actually negative, FP is the number of pixels that are predicted to be positive but are actually negative, and FN is the number of pixels that are predicted to be negative but are actually positive.
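
The three metrics follow directly from the four counts, as the short sketch below shows. The counts in the example are made up for illustration. For the three erosion classes, the counts would be taken per class in a one-vs-rest fashion; how the per-class scores are averaged is not stated in the text, so no averaging is applied here.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, recall, and F1-score from confusion counts, following Equations (1)-(3)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return accuracy, recall, f1


# Hypothetical pixel counts for one class (not results from this study).
accuracy, recall, f1 = classification_metrics(tp=90, tn=80, fp=10, fn=20)
print(round(accuracy, 3), round(recall, 3), round(f1, 3))  # 0.85 0.818 0.857
```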

4. Results and Discussion

4.1. Model Results

To classify soil erosion intensity in the study area, this study employed five different algorithms for modeling and analysis. The results, presented in Figure 3, show that each algorithm has distinct characteristics and that all of them can classify soil erosion with reasonable accuracy. Notably, all five models achieved an F1-score exceeding 0.80, indicating good overall performance in the soil erosion intensity classification task. Among these algorithms, the CNN performed best, achieving an accuracy of 0.987, a recall of 0.986, and an F1-score of 0.985. These results indicate that the CNN model achieved an outstanding balance between precision and recall, identifying real soil erosion events with almost no omissions. The FCNN ranked second, with an F1-score of 0.840, an accuracy of 0.844, and a recall of 0.871. In contrast, the RF and XGBoost algorithms performed comparatively less effectively in this specific task. While these models often yield excellent results in other applications, their performance in soil erosion intensity classification does not reach the level demonstrated by the CNN in this study.

When training the CNN, this study employed cross entropy as the loss function, which measures the difference between the predicted class probabilities and the actual labels. This enabled the model to adjust its parameters through the backpropagation algorithm. As the number of training iterations increased, the CNN model gradually learned the characteristics of the soil erosion classification problem. After 50 training epochs, the loss on the training set had decreased to 0.005, as shown in Figure 4. This decrease shows that the model fitted the training set well; taken together with its performance in the testing phase, it suggests the absence of apparent overfitting. These results indicate that the CNN model captured the underlying patterns of soil erosion classification and can model this complex problem effectively.
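
To give a sense of scale for the reported loss value, the sketch below computes the cross-entropy loss for a single three-class prediction. It is a minimal illustration with made-up logits, assuming a softmax output layer; the function names are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the maximum for numerical stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[true_class])


# Three classes: slight (0), light (1), moderate (2); the logits are made up.
print(cross_entropy([0.0, 0.0, 0.0], true_class=1))  # uninformed guess: ln(3)
print(cross_entropy([6.0, 0.0, 0.0], true_class=0))  # confident and correct: near zero
print(math.exp(-0.005))  # true-class probability implied by a per-sample loss of 0.005
```

For a single sample, a loss of 0.005 corresponds to a probability of roughly 0.995 on the true class, compared with a loss of ln 3 ≈ 1.10 for a uniform guess over three classes.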

The CNN model was employed to classify the soil erosion intensity within the study area, with the results shown in Figure 5. From the figure, it is evident that the majority of the study area is characterized by slight soil erosion, primarily distributed in the southern and eastern regions, where the erosion appears in a sheet-like pattern. This area is mainly covered with ecological forests, and the plant canopy effectively reduces the intensity of rainfall impact, thereby reducing soil erosion. This is followed by light erosion, mainly consisting of village ruins. These ruins are located to the north and southwest of the study area, covered with weeds and featuring a relatively flat terrain. The presence of plant roots helps to reduce the erosive effects of surface runoff and stabilize the soil. In contrast, moderate erosion, which accounts for the smallest proportion, is predominantly located in the abandoned fish pond area in the west, and in the construction zones in the northern and western parts of the study area. These regions show a discrete, blocky distribution, largely caused by human activity. Due to the disturbance from construction, the soil in these areas has been excavated, and the lack of timely protective measures has led to low vegetation coverage or even bare ground. The uneven terrain, combined with minimal vegetation, accelerates rainwater runoff and directly exposes the soil to erosion, resulting in significant soil loss.

4.2. Enhanced Explainability of the ML Model

To further explore the main factors influencing soil erosion in the study area and to determine the key targets for soil erosion control, SHAP was used to analyze and interpret the classification results of the CNN model, as shown in Figure 6. For slight erosion, the K factor emerged as the most influential factor, with a SHAP value of 0.103, indicating that K is a key determinant of the extent of slight erosion. NDSI and NRI also had notable, though smaller, impacts on slight erosion. For light erosion, NDSI had the greatest influence, with a SHAP value of 0.074, followed by DSM and K, which also contributed significantly. For moderate erosion, NDSI once again had the greatest impact, with a SHAP value of 0.149, followed by K and NRI. Overall, NDSI, K, and NRI emerged as the most significant factors influencing soil erosion in the study area. While their contributions varied across erosion intensities, they consistently had a more substantial effect than the other factors. In contrast, the P factor, slope, and aspect had relatively small impacts on soil erosion in the study area, although their potential effects under specific conditions cannot be completely ignored.

To gain a deeper understanding of the interactions among the factors influencing soil erosion, this study calculates the pairwise SHAP interaction values of the impact factors, with the results presented in Figure 7. The color gradient represents the magnitude of the feature values, with pink dots indicating high feature values, blue dots indicating low feature values, and purple dots representing values close to the mean. The horizontal axis shows the extent to which each feature influences the model's output: the farther a point lies from the central line, the greater the impact of the corresponding feature on the model's predictions. A positive SHAP value denotes a positive contribution to the output, whereas a negative SHAP value signifies a negative contribution. The diagonal elements reflect the independent contribution of each factor to soil erosion. For instance, in the case of slight erosion, a lower DSM value exerts a stronger negative effect, while a higher C value also contributes negatively. Conversely, under light erosion conditions, a lower DSM value has a more pronounced positive effect. In moderate erosion scenarios, both higher C values and higher NDSI values exhibit a stronger positive influence on the model's output. The off-diagonal elements illustrate the pairwise interactions between factors. Notably, the interaction values between factor i and factor j are symmetric, with each interaction value shared equally between the two factors, resulting in plots with the same shapes but opposite colors. The interactions between DSM and NDSI and between C and DSM have a significant influence on slight erosion. The interactions between DSM and C and between DSM and the distance from the coastline have substantial effects on light erosion. Under moderate erosion conditions, the interaction between K and the distance from the coastline plays a key role.

4.3. Discussion

Soil is a non-renewable natural resource, and its protection is of great significance to the sustainable development of the ecological environment. At present, soil erosion identification mainly relies on satellite remote sensing technology, but this method is limited by low spatial resolution and is easily affected by weather conditions. To address this problem, UAV-sourced multispectral images were introduced into soil erosion identification in this study. The high resolution of the UAV platform significantly improves the accuracy of soil erosion identification. However, UAV technology still has limitations in practical applications, such as limited flight duration and sensitivity to lighting conditions. As a result, there may be spatial heterogeneity in the collected data, which limits its application in large-scale regional monitoring. In addition, traditional soil erosion identification methods suffer from low efficiency and high labor costs. To overcome these limitations, researchers have introduced ML technology to identify soil erosion. In this study, five machine learning models, CNN, SVC, RF, XGBoost, and FCNN, were used to identify soil erosion. The results show that the CNN model has the best identification performance. A comparison of the identification results of the CNN model with those of the traditional RUSLE empirical model, as shown in Figure 5, combined with verification against field investigation data, showed that the CNN results are more consistent with field conditions [27,28,29]. This difference is mainly due to the relatively limited number of factors considered by the RUSLE model and the regional dependence of its parameter settings.

In this study, because the sample size of the dataset was sufficient, an independent validation set (hold-out) approach was adopted to systematically evaluate the five machine learning models using three evaluation metrics: F1-score, accuracy, and recall. Previous research shows that, for large datasets, hold-out validation can provide stable performance estimates while significantly reducing the computational cost [30]. In other words, when the amount of data is sufficient, hold-out validation compares favorably with cross-validation in terms of efficiency and stability. However, cross-validation may be more suitable when the dataset is small, because it makes full use of limited data and thereby improves the reliability of model evaluation.

Although the results of the black-box model are accurate, they lack the transparency required for a deeper understanding of the underlying decision-making process. To address this limitation, this study employed SHAP to interpret the CNN-based soil erosion classification results. SHAP value plots offer an intuitive way to assess the relative importance of the various impact factors, thereby enhancing the interpretability of the model and providing valuable insight into the factors driving soil erosion in the study area. Soil erosion is a complicated process affected by many factors. The closer a site is to the coastline, the higher the salt content of the soil, owing to the effects of sea level rise and saltwater intrusion [31]. An increase in soil salinity not only damages the soil structure but also reduces the permeability of the soil to air and water, which in turn increases the risk of soil erosion and poses a serious threat to the ecological balance. Moreover, excess salt in the soil affects the photosynthesis and growth rate of plants, resulting in a decline in vegetation health and coverage [32]. The NRI is also associated with vegetation health. When the nitrogen content of plants is insufficient, their growth and development are seriously hindered. Vegetation is an important barrier that protects the soil from erosion: its leaves reduce the erosive impact of rain, and its roots form a network in the soil that improves soil cohesion and stability [33]. Therefore, a reduction in plant nitrogen increases the risk of soil erosion. Most studies find that slope is the factor that most promotes soil erosion [34,35], as it affects erosion rates through its influence on runoff, vegetation cover, and soil type. However, in this study, slope and aspect were found to have the least impact on soil erosion. This may be due to the generally low slopes in the study area, coupled with relatively abundant vegetation in the steeper regions, which helps reduce surface runoff and mitigates erosion.

The findings of this study revealed that the SHAP values for NDSI, K, NRI, and DSM are higher than those of the other factors. Consequently, future efforts to mitigate soil erosion should prioritize these key factors. Addressing this issue effectively will also require coordinated action from all stakeholders. In future work, it would be beneficial to explore additional interpretation methods and to extend these approaches to a variety of soil conservation environments. Notably, while this study is in line with other research in its selection of machine learning algorithms, it distinguishes itself by employing a unique dataset and focusing specifically on the classification of soil erosion in coastal areas. This perspective offers useful insights for studies that use similar datasets and address related challenges.

5. Conclusions

Traditional research methods of soil erosion often rely on empirical models to represent and analyze this phenomenon. However, given the complexity of soil erosion, these conventional approaches often struggle to capture the intricate relationships involved. To address this challenge, this study explored a more accurate and effective soil erosion classification method using five machine learning models. By applying multispectral UAV technology and soil data, combined with the empirical model, nine key indicators were collected and calculated as inputs for the machine learning models. After training and validating the dataset, the CNN demonstrated the best performance in the classification task. Furthermore, SHAP was employed to rank the importance of the nine indicators. The results revealed that the NDSI had the most significant impact on soil erosion classification.

The method proposed in this study not only provides a more accurate and effective approach for soil erosion classification but also lays a foundation for integrating it with other environmental monitoring technologies. This integration could contribute to building a comprehensive and efficient environmental monitoring system in the future. It is worth noting that this method has only been applied in one study area, and its applicability to other similar regions and to long time-series studies spanning different seasons and years still needs to be explored.

Author Contributions

Conceptualization, H.L., C.L. and W.G.; methodology, S.M. and H.G.; validation, Y.Q. and H.D.; analysis, H.L. and C.L.; investigation, H.L. and H.G.; data curation, Y.Q.; writing—original draft preparation, H.L.; writing—review and editing, C.L. and S.M.; visualization, Y.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Qingdao Natural Science Foundation Youth Project (grant number 23-2-1-96-zyyd-jch).

Data Availability Statement

The data that support the findings of this study are available on request from the corresponding author (C.L.).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

ANN	Artificial neural network
C	Cover management factor
CNN	Convolutional neural network
DSM	Digital surface model
FCNN	Fully connected neural network
K	Soil erodibility factor
ML	Machine learning
NDSI	Normalized difference salinity index
NRI	Nitrogen reflectance index
P	Support practice factor
P4M	Phantom 4 Multispectral
RF	Random forest
RUSLE	Revised Universal Soil Loss Equation
SHAP	SHapley additive explanations
SVC	Support vector classification
UAV	Unmanned aerial vehicle
XGBoost	Extreme gradient boosting

References

1. Wu, Y.; Li, X.Y.; Zeng, H.D.; Zhong, X.J.; Kuang, S.N. Analysis of Carbon Sink Benefits from Comprehensive Soil and Water Conservation in the Loess Hilly Gently Slope Aeolian Sand Region. Water 2024, 16, 3434.
2. Mosaid, H.; Barakat, A.; John, K.; Faouzi, E.; Bustillo, V.; El Garnaoui, M.; Heung, B. Improved soil carbon stock spatial prediction in a Mediterranean soil erosion site through robust machine learning techniques. Environ. Monit. Assess. 2024, 196, 130.
3. Guerra, C.A.; Rosa, I.M.D.; Valentini, E.; Wolf, F.; Filipponi, F.; Karger, D.N.; Nguyen Xuan, A.; Mathieu, J.; Lavelle, P.; Eisenhauer, N. Global vulnerability of soil ecosystems to erosion. Landscape Ecol. 2020, 35, 823–842.
4. Tsatsaris, A.; Kalogeropoulos, K.; Stathopoulos, N.; Louka, P.; Tsanakas, K.; Tsesmelis, D.E.; Krassanakis, V.; Petropoulos, G.P.; Pappas, V.; Chalkias, C. Geoinformation Technologies in Support of Environmental Hazards Monitoring under Climate Change: An Extensive Review. ISPRS Int. J. Geo-Inf. 2021, 10, 94.
5. Luetzenburg, G.; Bittner, M.J.; Calsamiglia, A.; Renschler, C.S.; Estrany, J.; Poeppl, R. Climate and land use change effects on soil erosion in two small agricultural catchment systems Fugnitz-Austria, Can Revull-Spain. Sci. Total Environ. 2020, 704, 135389.
6. Tuo, D.; Lu, Q.; Wu, B.; Li, Q.; Yao, B.; Cheng, L.L.; Zhu, J.L. Effects of Wind-Water Erosion and Topographic Factor on Soil Properties in the Loess Hilly Region of China. Plants 2023, 12, 2568.
7. Cao, Y.F.; Hua, L.; Tang, Q.; Liu, L.; Cai, C.F. Evaluation of monthly-scale soil erosion spatio-temporal dynamics and identification of their driving factors in Northeast China. Ecol. Indic. 2023, 150, 110187.
8. Lin, H.; Zhao, Y. Soil Erosion Assessment of Alpine Grassland in the Source Park of the Yellow River on the Qinghai-Tibetan Plateau, China. Front. Ecol. Evol. 2022, 9, 771439.
9. Gholami, H.; Jalali, M.; Rezaei, M.; Mohamadifar, A.; Song, Y.; Li, Y.; Wang, Y.; Niu, B.; Omidvar, E.; Kaskaoutis, D.G. An explainable integrated machine learning model for mapping soil erosion by wind and water in a catchment with three desiccated lakes. Aeolian Res. 2024, 67–69, 100924.
10. Bueno-Hurtado, P.; Seidou, O. Empirical and physical modelling of soil erosion in agricultural hillslopes. J. Hydrol. Hydromech. 2024, 72, 279–291.
11. Sud, A.; Sajan, B.; Kanga, S.; Singh, S.K.; Singh, S.; Durin, B.; Kumar, P.; Meraj, G.; Sahariah, D.; Debnath, J.; et al. Integrating RUSLE Model with Cloud-Based Geospatial Analysis: A Google Earth Engine Approach for Soil Erosion Assessment in the Satluj Watershed. Water 2024, 16, 1073.
12. Delgado, M.I.; Carol, E. Soil loss and its possible consequences at a flatland watershed. Case of study: El Pescado Creek, Central-Eastern Argentina. Nat. Hazards 2024, 120, 6105–6123.
13. Pijl, A.; Reuter, L.E.H.; Quarella, E.; Vogel, T.A.; Tarolli, P. GIS-based soil erosion modelling under various steep-slope vineyard practices. Catena 2020, 193, 104604.
14. Mohammed, S.; Jouhra, A.; Enaruvbe, G.O.; Bashir, B.; Barakat, M.; Alsilibe, F.; Kulimushi, L.C.; Alsalman, A.; Szabó, S. Performance evaluation of machine learning algorithms to assess soil erosion in Mediterranean farmland: A case-study in Syria. Land Degrad. Dev. 2023, 34, 2896–2911.
15. Sun, L.H.; Liu, F.; Zhu, X.C.; Zhang, G.L. High-resolution digital mapping of soil erodibility in China. Geoderma 2024, 444, 116853.
16. Gelete, T.B.; Pasala, P.; Abay, N.G.; Woldemariam, G.W.; Yasin, K.H.; Kebede, E.; Aliyi, I. Integrated machine learning and geospatial analysis enhanced gully erosion susceptibility modeling in the Erer watershed in Eastern Ethiopia. Front. Environ. Sci. 2024, 12, 1410741.
17. Sadia, H.; Sarkar, S.K.; Haydar, M. Soil erosion susceptibility mapping in Bangladesh. Ecol. Indic. 2023, 156, 111182.
18. Van der Westhuizen, S.; Heuvelink, G.; Gardner-Lubbe, S.; Clarke, C.E. Biplots for understanding machine learning predictions in digital soil mapping. Ecol. Inform. 2024, 84, 102892.
19. Chen, Z.Y.; Lian, Z.C.; Xu, Z. Interpretable Model-Agnostic Explanations Based on Feature Relationships for High-Performance Computing. Axioms 2023, 12, 997.
20. Hu, J.X.; Fan, T.H.; Tang, X.L.; Yang, Z.J.; Ren, Y.J. Nonlinear relations of urban morphology to thermal anomalies: A cross-time comparative study based on Grad-CAM and SHAP. Ecol. Indic. 2024, 162, 112024.
21. Lin, N.; Zhang, D.; Feng, S.S.; Ding, K.; Tan, L.B.; Wang, B.; Chen, T.; Li, W.L.; Dai, X.A.; Pan, J.P.; et al. Rapid Landslide Extraction from High-Resolution Remote Sensing Images Using SHAP-OPT-XGBoost. Remote Sens. 2023, 15, 3901.
22. Sun, J.; Sun, C.K.; Tang, Y.X.; Liu, T.C.; Lu, C.J. Application of SHAP for Explainable Machine Learning on Age-Based Subgrouping Mammography Questionnaire Data for Positive Mammography Prediction and Risk Factor Identification. Healthcare 2023, 11, 2000.
23. Alshuhail, A.; Thakur, A.; Chandramma, R.; Mahesh, T.R.; Almusharraf, A.; Kumar, V.; Khan, S.B. Refining neural network algorithms for accurate brain tumor classification in MRI imagery. BMC Med. Imaging 2024, 24, 118.
24. Mortier, S.; Hamedpour, A.; Bussmann, B.; Wandji, R.; Latré, S.; Sigurdsson, B.D.; De Schepper, T.; Verdonck, T. Inferring the relationship between soil temperature and the normalized difference vegetation index with machine learning. Ecol. Inform. 2024, 82, 102730.
25. FAO; IIASA. Harmonized World Soil Database Version 2.0; FAO: Rome, Italy; IIASA: Laxenburg, Austria, 2023.
26. Yu, H.; Zhu, W.; Jin, R. Future soil erosion assessment based on changing land cover and different climate change scenarios in a transboundary river basin. Int. J. Digit. Earth 2024, 17, 2301434.
27. Saha, S.; Sarkar, R.; Thapa, G.; Roy, J. Modeling gully erosion susceptibility in Phuentsholing, Bhutan using deep learning and basic machine learning algorithms. Environ. Earth Sci. 2021, 80, 295.
28. Faouzi, E.; Arioua, A.; Namous, M.; Barakat, A.; Mosaid, H.; Ismaili, M.; Eloudi, H.; Houmma, I.H. Spatial mapping of hydrologic soil groups using machine learning in the Mediterranean region. Catena 2023, 232, 107364.
29. Pal, S.; Paul, S.; Debanshi, S. Identifying sensitivity of factor cluster based gully erosion susceptibility models. Environ. Sci. Pollut. Res. 2022, 29, 90964–90983.
30. Rainio, O.; Teuho, J.; Klén, R. Evaluation metrics and statistical tests for machine learning. Sci. Rep. 2024, 14, 6086.
31. Yu, Q.J.; Suo, L.Z.; Qi, J.; Wang, Y.; Hu, Q.L.; Shan, Y.; Zhao, Y. Soil habitat condition shapes Tamarix chinensis community diversity in the coastal saline-alkali soils. Front. Plant Sci. 2023, 14, 1156297.
32. Hou, J.W.; Ye, M. Effects of Dynamic Changes of Soil Moisture and Salinity on Plant Community in the Bosten Lake Basin. Sustainability 2022, 14, 14081.
33. Dahanayake, A.C.; Webb, J.A.; Greet, J.; Brookes, J.D. How do plants reduce erosion? An Eco Evidence assessment. Plant Ecol. 2024, 225, 593–604.
34. Olii, M.R.; Olii, A.; Pakaya, R.; Olii, M. GIS-based analytic hierarchy process (AHP) for soil erosion-prone areas mapping in the Bone Watershed, Gorontalo, Indonesia. Environ. Earth Sci. 2023, 82, 225.
35. Nguyen, K.A.; Chen, W.; Lin, B.S.; Seeboonruang, U.; Thomas, K. Predicting Sheet and Rill Erosion of Shihmen Reservoir Watershed in Taiwan Using Machine Learning. Sustainability 2019, 11, 3615.

Figure 1. Sketch map of the study area.

Figure 2. The pre-processing flow diagram used for UAV images.

Figure 3. The comparison chart of the soil erosion classification performances of five machine learning models.

Figure 4. The loss rate graph of CNN model training in soil erosion classification.

Figure 5. The comparison of soil erosion results of different study methods.

Figure 6. Importance of impact factors on soil erosion calculated using SHAP values.

Figure 7. Interactive plot of SHAP values for influencing factors of soil erosion.

Table 1. Source and calculation method of indicators.

Indicators	Calculation Method	Source
K	$K = \frac{1}{7.59} \left\{ 0.2 + 0.3 \exp\left[ -0.0256\, \mathrm{SAN} \left( 1.0 - \frac{\mathrm{SIL}}{100} \right) \right] \right\} \times \left( \frac{\mathrm{SIL}}{\mathrm{CLA} + \mathrm{SIL}} \right)^{0.3} \times \left[ 1.0 - \frac{0.25\, C}{C + \exp(3.72 - 2.95\, C)} \right] \times \left[ 1.0 - \frac{0.7\, \mathrm{SNI}}{\mathrm{SNI} + \exp(-5.51 + 22.9\, \mathrm{SNI})} \right]$ where SNI = 1 − SAN/100; SAN is the sand content, %; SIL is the silt content, %; CLA is the clay content, %; and C is the organic carbon content, %.	HWSD 2.0 [25]
Slope	ArcGIS technology	Collected UAV images
Aspect	ArcGIS technology
DSM	ArcGIS technology
Distance from the coastline	Euclidean distance tool of ArcGIS 10.8
NDSI	$\mathrm{NDSI} = \frac{\mathrm{Red} - \mathrm{NIR}}{\mathrm{Red} + \mathrm{NIR}}$
NRI	$\mathrm{NRI} = \frac{\mathrm{NIR}}{\mathrm{Green}}$
C	$C = \begin{cases} 1, & \mathrm{FVC} \leq 0.1 \\ 0.6508 - 0.3436 \lg \mathrm{FVC}, & 0.1 < \mathrm{FVC} \leq 0.783 \\ 0, & \mathrm{FVC} > 0.783 \end{cases}$ where FVC is the vegetation coverage.
P	Assign values based on land use types [26].
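The formulas in Table 1 can be sketched in a few lines of Python. This is an illustrative transcription under stated assumptions, not the processing chain used in this study: the function names are hypothetical, inputs are plain floats rather than raster bands, and FVC is passed on the same scale as the thresholds printed in Table 1, because the table does not state its unit.

```python
import math


def k_factor(san, sil, cla, c):
    """Soil erodibility K from sand, silt, clay and organic carbon (all in %)."""
    sni = 1.0 - san / 100.0
    texture = 0.2 + 0.3 * math.exp(-0.0256 * san * (1.0 - sil / 100.0))
    silt_ratio = (sil / (cla + sil)) ** 0.3
    carbon = 1.0 - 0.25 * c / (c + math.exp(3.72 - 2.95 * c))
    sand = 1.0 - 0.7 * sni / (sni + math.exp(-5.51 + 22.9 * sni))
    return texture * silt_ratio * carbon * sand / 7.59


def ndsi(red, nir):
    """Normalized difference salinity index; undefined when red + nir == 0."""
    return (red - nir) / (red + nir)


def nri(nir, green):
    """Nitrogen reflectance index as the NIR/Green band ratio."""
    return nir / green


def c_factor(fvc):
    """Cover management factor C, piecewise in vegetation coverage FVC.

    FVC uses the same scale as the thresholds in Table 1 (assumption).
    """
    if fvc <= 0.1:
        return 1.0
    if fvc <= 0.783:
        return 0.6508 - 0.3436 * math.log10(fvc)
    return 0.0


# Made-up example values, for illustration only.
print(k_factor(san=40.0, sil=40.0, cla=20.0, c=1.0))
print(ndsi(red=0.3, nir=0.5), nri(nir=0.6, green=0.2), c_factor(0.5))
```

For raster inputs the same expressions apply element-wise, with an extra mask needed wherever the denominator of a band ratio is zero.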

Table 2. Assignment of P values by land use type.

Type of Land Use	Cropland	Woodland	Grassland	Waters	Bare Land	Construction Land
P	0.35	0.20	0.70	0.00	0.90	0.00
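The assignment in Table 2 amounts to a lookup from land use type to P value. A minimal sketch follows, assuming land use classes arrive as text labels; the dictionary name, the function name, and the label normalization are hypothetical choices, while the numeric values are those listed in Table 2.

```python
# P values by land use type, copied from Table 2.
P_BY_LAND_USE = {
    "cropland": 0.35,
    "woodland": 0.20,
    "grassland": 0.70,
    "waters": 0.00,
    "bare land": 0.90,
    "construction land": 0.00,
}


def p_factor(land_use):
    """Return the support practice factor P for a land use label."""
    key = land_use.strip().lower()  # tolerate case and surrounding spaces
    if key not in P_BY_LAND_USE:
        raise KeyError(f"unknown land use type: {land_use!r}")
    return P_BY_LAND_USE[key]


for label in ["Cropland", "Bare Land", "Waters"]:
    print(label, p_factor(label))
```

Raising an error on unknown labels, rather than returning a default, keeps unclassified pixels from silently receiving a P value.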

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
