**Mapping and Monitoring of Geohazards with Remote Sensing Technologies**

Editors

**Constantinos Loupasakis Ioannis Papoutsis Konstantinos G. Nikolakopoulos**

Basel • Beijing • Wuhan • Barcelona • Belgrade • Novi Sad • Cluj • Manchester

*Editors*

Constantinos Loupasakis The National Technical University of Athens (NTUA) Athens, Greece

Ioannis Papoutsis National Observatory of Athens Athens, Greece

Konstantinos G. Nikolakopoulos University of Patras Rio, Greece

*Editorial Office* MDPI St. Alban-Anlage 66 4052 Basel, Switzerland

This is a reprint of articles from the Special Issue published online in the open access journal *Remote Sensing* (ISSN 2072-4292) (available at: https://www.mdpi.com/journal/remotesensing/special_issues/Geohazard_RS).

For citation purposes, cite each article independently as indicated on the article page online and as indicated below:

Lastname, A.A.; Lastname, B.B. Article Title. *Journal Name* **Year**, *Volume Number*, Page Range.

**ISBN 978-3-0365-9216-9 (Hbk) ISBN 978-3-0365-9217-6 (PDF) doi.org/10.3390/books978-3-0365-9217-6**

Cover image courtesy of Constantinos Loupasakis

© 2023 by the authors. Articles in this book are Open Access and distributed under the Creative Commons Attribution (CC BY) license. The book as a whole is distributed by MDPI under the terms and conditions of the Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND) license.

## **Contents**


Reprinted from: *Remote Sens.* **2023**, *15*, 826, doi:10.3390/rs15030826


## **About the Editors**

#### **Constantinos Loupasakis**

Dr. Constantinos Loupasakis is a Professor at the School of Mining and Metallurgical Engineering of the National Technical University of Athens (NTUA) and a contracted professor at the School of Science and Technology of the Hellenic Open University (HOU). He is the President of the Greek National Group of the International Association of Engineering Geology (IAEG) and the Director of the HOU postgraduate program "Natural Catastrophic Events under climate Change impact". He is also a member of the Land Subsidence International Initiative (LaSII) of UNESCO. Dr. Loupasakis received a Ph.D. in Engineering Geology – Geotechnical Engineering (2002), a Diploma in Civil Engineering (2001), an MSc Diploma in Applied and Environmental Geology (1998), and a Degree in Geology (1995) from the Aristotle University of Thessaloniki (AUTH), Greece. He has extensive research experience, having worked since 1996 in several, partly overlapping, positions, and he has participated in more than 50 research projects and studies as coordinator or principal investigator. He has developed numerous international collaborations with institutes and universities from Germany, Italy, Spain, the UK, the United Arab Emirates, Qatar, China, etc. He has also had extensive teaching experience since 1996. He has published more than 100 papers in scientific journals and congress proceedings, 6 scientific–educational books, 2 international conference proceedings volumes, and 55 technical reports.

#### **Ioannis Papoutsis**

Dr. Ioannis Papoutsis is an electrical and computer engineer with a research focus on Earth Observation and artificial intelligence through the processing of big volumes of satellite and geospatial data. He holds a PhD on the use of advanced satellite interferometry techniques. He has a deep understanding of the Copernicus flagship programme, being the Operations Manager of the Greek node of the European Space Agency Hubs that distribute Sentinel data, as well as a Copernicus Emergency Management Services Manager for Risk and Recovery. He leads the AI4EO research group Orion Lab, and his current research focuses on the exploitation, management, and processing of big satellite data, and on the use of artificial intelligence/deep learning for knowledge extraction using Earth Observation and geo-information based on novel information technologies, with emphasis on disaster management. He has participated in 40 research projects funded mainly by the EC (H2020 and FP7), ESA, and Copernicus initiatives. Currently, he is the coordinator of the H2020 project DeepCube, which focuses on AI pipelines for big Copernicus data, and of the ESA-funded project SeasFire on seasonal wildfire forecasting with deep learning. He has co-authored more than 40 research articles in peer-reviewed journals. He was part of the team that won the first prize at the 2014 Copernicus Masters Competition for the platform FireHub.

#### **Konstantinos G. Nikolakopoulos**

Dr. Konstantinos G. Nikolakopoulos is a Professor at the Department of Geology, University of Patras. He earned a BSc degree in geology and a PhD in Remote Sensing. He conducted postdoctoral research in the field of hyperspectral data processing. His areas of specialty are remote sensing and geographic information systems (GIS) for geological applications. Since 2012, he has been the Chairman of the Special Interest Group Geological Applications of the European Association of Remote Sensing Laboratories. He also acts as Associate Editor of the European Journal of Remote Sensing. Dr. Konstantinos Nikolakopoulos has been involved in R&D projects funded by organizations such as the European Union, the European Space Agency, and the Greek Government. He has considerable experience in the area of Earth Observation, GIS, and GNSS measurements. His main research interests include geological mapping, engineering geology, risk analysis, 3D mapping, and coastal area mapping. He has more than 200 publications (167 in Scopus) in peer-reviewed journals and conference proceedings, with an h-index of 19 and more than 1600 citations.

## *Editorial* **Special Issue "Mapping and Monitoring of Geohazards with Remote Sensing Technologies"**

**Constantinos Loupasakis <sup>1,\*</sup>, Ioannis Papoutsis <sup>2</sup> and Konstantinos G. Nikolakopoulos <sup>3</sup>**


Geohazard monitoring is crucial for building resilient communities. By leveraging remote sensing technologies, we can assess hazards, implement early warning systems, and evaluate impacts effectively. These cutting-edge tools enable proactive monitoring and real-time analysis, minimizing the impact of geohazards, protecting lives, and fortifying our society against adversity.

Earth observation (EO) techniques have proven to be reliable and accurate for monitoring land surface deformations that occur naturally (landslides, earthquakes, and volcanoes) or due to anthropogenic activities (ground water overexploitation and extraction of oil and gas).

In cases where mitigation methods must be put into practice, the detailed mapping, characterization, monitoring, and simulation of the geocatastrophic phenomena have to precede their design and implementation. EO techniques possess high potential and suitability as alternative, cost-efficient methods for the management of geohazards, and have been proven to be a valuable tool for verifying and validating the spatial extent and the evolution of the deformations.

To this end, this Special Issue covers innovative applications and case studies on the mapping and monitoring of all kinds of geohazards with remote sensing technologies. It incorporates articles that make use of new tools and methodologies, including data-driven machine learning methods. Machine learning in Earth observation has revolutionized geohazard monitoring. By leveraging advanced algorithms, machine learning can analyze vast amounts of satellite imagery and sensor data to detect subtle changes in terrain, identify precursors to hazards, and forecast their evolution, enabling proactive risk mitigation strategies and bolstering societal resilience.

In particular, Orellana et al. [1] focused on the study of the ground deformations taking place in the Santiago basin, combining multi-temporal differential interferometric synthetic aperture radar (DInSAR) with data coming from GNSS stations. The GNSS datasets showed a constant regional uplift in the metropolitan area, while the DInSAR allows for the identification of areas with anomalous local subsidence due to the overexploitation of the aquifers as well as mountainous areas affected by landslides. Overall, the results are fundamental for urban territorial planning in the city of Santiago and demonstrate the importance of geodetic measurements in assessing the impact of climate change on groundwater storage and how this affects the ground surface elevation.

Liu et al. [2] also studied the subsidence phenomena taking place due to the long-term excessive extraction of groundwater resources by means of the progressive small baseline subset (SBAS) InSAR time series analysis method. The study was conducted at the eastern Beijing Plain, providing significant information for understanding the deformation mechanisms of land subsidence, establishing hydrogeological models, and supporting decision making, early warning, and hazard relief for the urban environment.

**Citation:** Loupasakis, C.; Papoutsis, I.; Nikolakopoulos, K.G. Special Issue "Mapping and Monitoring of Geohazards with Remote Sensing Technologies". *Remote Sens.* **2023**, *15*, 4145. https://doi.org/10.3390/rs15174145

Received: 25 June 2023 Revised: 9 August 2023 Accepted: 10 August 2023 Published: 24 August 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Investigating mining geohazards, Ma et al. [3] and Chen et al. [4] studied subsidence phenomena taking place at the perimeter of coal mines. Specifically, Ma et al. [3] proposed an approach for predicting mine subsidence that leverages Interferometric Synthetic Aperture Radar (InSAR) technology and a long short-term memory network (LSTM). Chen et al. [4] investigated the surface deformation by means of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods. The results were verified by means of GPS, indicating the subsidence location, range, distribution, and space–time subsidence law of surface deformation.

Tzampoglou et al. [5] investigated the seasonal ground swelling/settlement of an urban area in Cyprus. The study area is occupied by highly expansive bentonitic clays, which gave the opportunity to combine an extensive database of geotechnical parameters with the Persistent Scatterer Interferometry (PSI) InSAR datasets produced within the framework of the European Union's research project "PanGeo".

The contributions of Tsironi et al. [6], Ma et al. [7], Chen et al. [8], Tan et al. [9], and Kyriou et al. [10] focused on the study of landslide events. Specifically, Tsironi et al. [6] studied the kinematics of active landslides in mountain areas of Achaia prefecture, Greece, by processing LiCSAR interferograms using the SBAS tool. The results also suggested a correlation between rainfall and landslide motion. Ma et al. [7] created an inventory map of 2665 rainfall-induced landslides triggered from 5 to 10 May 2016 in Fujian Province, China, by using high-resolution satellite imagery. Numerical simulations showed that the temporal evolution of the landslides could be accurately reproduced by using the MAT.TRIGRS tool. Chen et al. [8] proposed an improved multi-source data-driven landslide prediction method that combines a spatio-temporal knowledge graph and machine learning models. This framework could effectively organize multi-source remote sensing data and generate unified prediction workflows. The proposed workflow can alleviate the problem of poor prediction performance caused by limited data availability in county-level predictions. Tan et al. [9] proposed a landslide time prediction method based on the time series monitoring data of micro-deformation monitoring radar. Deformation displacement, coherence and deformation volume, and the parametric degree of deformation (DOD) are calculated and combined with the use of the tangent angle method. The effectiveness of the method was verified by using measured data of a landslide in a mining area. Finally, Kyriou et al. [10] used multi-dated data obtained by Unmanned Aerial Vehicle (UAV) campaigns and Terrestrial Laser Scanning (TLS) surveys for the accurate and immediate monitoring of a landslide located in a steep, V-shaped valley. They demonstrated that point clouds arising from a UAV or a TLS sensor can be effectively utilized for landslide monitoring with comparable accuracies. Furthermore, the outcomes were validated using measurements acquired by the Global Navigation Satellite System (GNSS).

Foroughnia et al. [11] proposed a stepwise sequence of unsupervised and supervised classification methods for the delineation of flooding areas using synthetic aperture radar (SAR) and multi-spectral (MS) data. Furthermore, a new unsupervised classification approach based on a combination of thresholding and segmentation (CThS) was developed to deal with the heterogeneity and fragmentation of water patches. The new approach was tested successfully in two flood events in Italy, achieving high precision and accuracy and making it appropriate for rapid flood mapping due to its ease of implementation.

The identification of Fossil Mass Movements is an intriguing subject. Popit et al. [12] conducted a geomorphometric analysis using a high-resolution lidar-derived DEM for the quantification and the visualization of fossil landslides. The proposed methodology was applied at Vipava Valley (SW Slovenia).

Exploiting the recently launched European Ground Motion Service (EGMS) products, Festa and Del Soldato [13] presented a desktop app, the so-called "EGMStream", that enables users to systematically store, customize, and convert ground movement data into geospatial databases, burst per burst or for an area of interest that is directly selectable on the app interface. EGMStream is a value-adding tool for the optimal dissemination of radar data from the Copernicus Sentinel-1 satellite mission.

Taking Minqin County, Gansu Province, China as the study area, Yang et al. [14] proposed a decision tree model combining four spectral indices for the identification of saline–alkaline areas. The spectral indices are the NDSI34 (Normalized Difference Spectral Index of Band 3 and Band 4), the NDSI25 (Normalized Difference Spectral Index of Band 2 and Band 5), the NDSI237 (Normalized Difference Spectral Index of Band 3 and Band 4), and finally, the NDSInew (New Normalized Difference Salt Index). It was found that this model can be applied for the quick identification of saline–alkaline areas in large regions.
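As an illustration of how a two-band normalized difference index of this kind is typically computed, the following minimal sketch assumes the conventional form (B_i − B_j)/(B_i + B_j). The function name, the sample reflectance values, and the handling of zero-sum pixels are assumptions made for this example; the exact index definitions and the decision tree thresholds of [14] are not reproduced here.

```python
import numpy as np

def normalized_difference(band_i, band_j, eps=1e-12):
    """Return (band_i - band_j) / (band_i + band_j), element-wise.

    Pixels where the band sum is (near) zero are set to NaN rather than
    producing a division warning.
    """
    band_i = np.asarray(band_i, dtype=float)
    band_j = np.asarray(band_j, dtype=float)
    total = band_i + band_j
    out = np.full(total.shape, np.nan)
    valid = np.abs(total) > eps
    out[valid] = (band_i[valid] - band_j[valid]) / total[valid]
    return out

# Hypothetical reflectance values for three pixels (not data from [14]).
band3 = np.array([0.30, 0.10, 0.0])
band4 = np.array([0.10, 0.30, 0.0])
ndsi34 = normalized_difference(band3, band4)
```

A decision tree would then compare such index rasters against thresholds; those thresholds are specific to the cited study and sensor.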

In conclusion, the value of almost any type of remote sensing data, such as radar (SAR), multispectral imagery, data collected by Unmanned Aerial Vehicles and Terrestrial Laser Scanners, and data acquired from airborne Lidar systems, for the mapping and monitoring of geohazards has been demonstrated. Different geohazards, like landslides, ground subsidence in coal mines or urban areas, flooding, and salinization have been addressed in this Special Issue. Most of the researchers that processed SAR data preferred the Sentinel-1 mission. Multispectral data from Sentinel-2, Geoeye, diverse Chinese satellites along with Google Earth data were processed in a fully automatic or semi-automatic way. Different band ratios, supervised and unsupervised classification, diverse spectral indexes, principal component analysis, and soft computing techniques like machine learning were among the methods that were presented for the remote sensing data processing.

**Acknowledgments:** The Guest Editors of this Special Issue would like to thank all authors who have contributed to this volume for sharing their scientific results and experiences. We would also like to thank the journal editorial team and reviewers for conducting the review process.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **High-Resolution Deformation Monitoring from DInSAR: Implications for Geohazards and Ground Stability in the Metropolitan Area of Santiago, Chile**

**Felipe Orellana <sup>1,\*</sup>, Marcos Moreno <sup>2</sup> and Gonzalo Yáñez <sup>3</sup>**


**Abstract:** Large urban areas are vulnerable to various geological hazards and anthropogenic activities that affect ground stability, a key factor in the performance of structures such as buildings and infrastructure, in an inherently expanding context. Time series data from synthetic aperture radar (SAR) satellites make it possible to identify small rates of motion over large areas of the Earth's surface with high spatial resolution, which is key to detecting high-deformation areas. Santiago de Chile's metropolitan region comprises a large Andean foothills basin in one of the most seismically active subduction zones worldwide. The Santiago basin and its surroundings are prone to megathrust and shallow crustal earthquakes, landslides, and constant anthropogenic effects, such as the overexploitation of groundwater and land use modification, all of which constantly affect the ground stability. Here, we recorded ground deformations in the Santiago basin using multitemporal differential interferometric synthetic aperture radar (DInSAR) from Sentinel-1, obtaining high-resolution ground motion rates between 2018 and 2021. GNSS stations show a constant regional uplift in the metropolitan area (~10 mm/year); meanwhile, DInSAR allows for the identification, with unprecedented detail, of areas with anomalous local subsidence (rates < −15 mm/year) and mountain sectors with landslides. Ground deformation patterns vary depending on factors such as soil type, basin geometry, and soil heterogeneities. Thus, the areas with high subsidence rates are concentrated in sectors with fine sedimentary cover and a declining shallow water table as well as in cropping areas with excess water withdrawal. There is no evidence of detectable movement on the San Ramon Fault (the major Quaternary fault in the metropolitan area) over the observational period. Our results highlight the mechanical control of the sediment characteristics of the basin and the impact of anthropogenic processes on ground stability. These results are essential to assess the stability of the Santiago basin and contribute to future infrastructure development and hazard management in highly populated areas.

**Keywords:** geohazard; ground deformation; landslides; DInSAR; GNSS

**Citation:** Orellana, F.; Moreno, M.; Yáñez, G. High-Resolution Deformation Monitoring from DInSAR: Implications for Geohazards and Ground Stability in the Metropolitan Area of Santiago, Chile. *Remote Sens.* **2022**, *14*, 6115. https://doi.org/10.3390/rs14236115

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 30 October 2022 Accepted: 25 November 2022 Published: 2 December 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

The Santiago basin hosts the homonymous capital of Chile, with more than 7 million inhabitants [1]. Along with a high population density, the capital and its surroundings concentrate the majority of the economic, political, and social activities of the country. Therefore, knowledge of the geohazards of the territory where the city of Santiago is located is essential for land use planning and sustainable growth. The Santiago forearc basin formed in a compressive tectonic environment in which the uplift of the Andes during the Miocene caused the basin development on its eastern flank. The contact between the basin and the Andean Cordillera is formed by reverse faulting (San Ramon Fault). Previous geological/geophysical studies in the Santiago Metropolitan Area aimed at evaluating geohazards have focused on geology and hydrogeology [2–4], seismic potential and microzoning [5,6], and tectonics [7–10]. Despite advances in the knowledge and assessment of geohazards in the Santiago basin, few studies have been carried out to analyze the ground surface stability and deformation of the whole basin. In this sense, this research focuses on studying the ground deformations of the metropolitan area using advanced DInSAR technology and ground deformation time series.

The seismic cycle of megathrust earthquakes is the main driver of ground deformation in subduction zones. This process induces long wavelength deformations with regional vertical patterns that oscillate between uplift and subsidence depending on the stage of the seismic cycle and its distance from the trench [11]. Upper plate faults define a secondary earthquake cycle system that has a direct impact on ground deformation. These faults have a long period of elastic energy accumulation (>1000 years) [12] (much longer than the recurrence of earthquakes in the megathrust), during which they induce a very low magnitude deformation of a few mm/year; this was described by [13] in the Andean region close to those faults. To date, it is unclear under which conditions upper plate faults are reactivated and by which mechanism they interact with the plate interface.

It is difficult to estimate upper plate contribution to the ground deformation due to the lack of observables at broad temporal and spatial scales and the fact that active continental faults can be blind in the sense that their location and recent activity may be unknown. Other natural drivers of subsidence and uplift include glacial isostatic adjustment [14], sediment compaction [15], and seasonal hydrological loading [16]. On the other hand, human-induced processes can cause surface level changes on a smaller spatial scale but with a faster response time. Anthropogenic ground deformations can be associated with the exploitation of groundwater [17,18] and hydrocarbons [19–21], which implies subsidence from local to regional scales. Furthermore, much more rapid anthropogenic causes may be related to urbanization, such as building load or removal of materials typically in unconsolidated alluvial deposits, which are linked to surface processes [22,23].

Land surface deformations caused by anthropogenic processes and hydrogeological phenomena, such as the compaction of aquifers, are slower processes that induce low rates of subsidence; therefore, they do not involve situations of immediate risk, as their effects are observed after several years. However, over a period of several years, their effects can change the topography of the land surface, causing damage to the population and civil infrastructure. Quantitative evaluation of ground deformation can be performed based on ground instrumentation and traditional measurement techniques (related to topography/geodesy), i.e., leveling and GNSS [24]; however, these are limited in terms of producing high spatial resolution surface displacement maps over wide areas. DInSAR technology is an alternative solution that is fully comparable to terrestrial monitoring [25]. Ground deformation monitoring with DInSAR takes advantage of the large amount of available data, which is acquired frequently and accurately at low cost; such characteristics make it an attractive source of information [26]. One of the main attractions of satellite-based DInSAR is its ability to cover areas remotely at a systematic and continuous rate, which makes it suitable as a structural monitoring and control tool, since near-vertical deformations of structures can be detected along the line of sight (LOS). In recent years, the DInSAR time-series technique has emerged as an essential tool for measuring slow surface displacement [27]. This technology has been exploited in a wide variety of contexts, such as seismic deformation [28,29], volcanic and landslide monitoring [30–32], infrastructure stability studies [26,33,34], and surface effects of water overexploitation [35–37].

DInSAR technology is essentially based on two approaches associated with the selection of coherent pixels: the Persistent Scatterers (PS) technique, developed by Ferretti et al. [38], and the SBAS (Small BAseline Subset) technique, developed by Berardino et al. [39]. These processing techniques involve multiple time-dependent acquisitions to provide characteristic displacement patterns of surface motion over a period of time, thus providing measurements of surface uplift or subsidence to identify cyclical patterns (due to seasonal variations), trends, and anomalous variations. Such techniques can be applied to collect time series of movements of the Earth's surface over wide areas with millimeter precision [40–42]. The evolution of DInSAR techniques observed in recent years is mainly related to the development of advanced computational algorithms [43]. However, it is also the result of greater possibilities for the acquisition of radar images by satellite missions and their short revisit time of 6 to 12 days for the entire world. Of particular note is the Copernicus Sentinel-1 mission of the European Space Agency (ESA), whose data we use in this work, consisting of two twin SAR satellites, Sentinel-1A and Sentinel-1B [44].
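To make the idea of separating trends from cyclical (seasonal) patterns in a displacement time series concrete, the following minimal sketch fits a linear trend plus an annual sinusoid by ordinary least squares. This is a generic illustration and not the P-SBAS processing chain; the function name, the model form, and the synthetic series (12-day sampling, −15 mm/year trend, 4 mm annual amplitude) are all assumptions chosen for the example.

```python
import numpy as np

def fit_trend_and_annual_cycle(t_years, disp_mm):
    """Least-squares fit of d(t) = a + v*t + s*sin(2*pi*t) + c*cos(2*pi*t).

    Returns (velocity in mm/year, annual amplitude in mm, residuals in mm).
    Large residuals flag epochs that neither the trend nor the annual
    cycle explains (candidate anomalous variations).
    """
    t = np.asarray(t_years, dtype=float)
    d = np.asarray(disp_mm, dtype=float)
    design = np.column_stack([
        np.ones_like(t),          # constant offset
        t,                        # linear trend
        np.sin(2 * np.pi * t),    # annual cycle, sine part
        np.cos(2 * np.pi * t),    # annual cycle, cosine part
    ])
    coeffs, *_ = np.linalg.lstsq(design, d, rcond=None)
    _, velocity, s, c = coeffs
    amplitude = float(np.hypot(s, c))
    residuals = d - design @ coeffs
    return float(velocity), amplitude, residuals

# Synthetic, noise-free series sampled every 12 days over 3 years.
t = np.arange(0.0, 3.0, 12 / 365.25)
d = -15.0 * t + 4.0 * np.sin(2 * np.pi * t)
velocity, amplitude, residuals = fit_trend_and_annual_cycle(t, d)
```

With real data, the residuals would also contain atmospheric and noise contributions, so any anomaly threshold would need to be chosen per data set.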

This research quantifies the temporal evolution of ground motion through the analysis of SAR interferometry together with geophysical and geological data characterizing the features of the Santiago basin. The methodology of this study combines DInSAR with ground-based information. We process multiple C-band Sentinel-1 SAR images with the P-SBAS (Parallel Small BAseline Subset) algorithm [43,45,46], an evolution of the traditional SBAS (Small BAseline Subset) method [39] developed by the CNR-IREA (Institute for Electromagnetic Sensing of the Environment). For the processing of SAR images, we used the cloud-based Geohazards Exploitation Platform (GEP) [47–50]. The results presented here provide new information that characterizes the ground stability of the city of Santiago and constrains subsidence-inducing factors, such as hydrogeological phenomena due to the overexploitation of water, as well as the stability of slopes in areas susceptible to landslides.

#### **2. Present Day Regional Deformation and Geological Setting**

The subduction of the Nazca Plate beneath the South American Plate controls the seismic cycle of large earthquakes and contributes to the permanent deformation that shapes the main morphological features on the Chilean margin. The Santiago basin is located at about 520 m above sea level at the piedmont of the Andean Precordillera, which reaches over 4000 m in altitude (Figure 1b). Santiago lies in the central segment of the Chilean subduction zone, between the rupture zones of the Maule 2010 [51] and Illapel 2015 [52] earthquakes. This segment is considered a seismic gap, in which the last major earthquake that ruptured the entire segment, both along and across the megathrust, was in 1730 [53]. Velocities derived from GNSS observations between 2018 and 2021 show typical interseismic deformation patterns (Figure 1). In this period, Central Chile moved northeastward with magnitudes of ~20 mm/year near the coast, decreasing to ~10 mm/year in Santiago (Figure 1c). Vertical velocities show subsidence near the coastline, suggesting that the offshore megathrust zone is locked, as previous studies have shown [54]. The Santiago basin and its surroundings show a regional uplift trend of about 10 mm/year. The velocity field shows no horizontal or vertical gradients near the San Ramon Fault, so no activity or movement related to this fault is observed.

**Figure 1.** (**a**) Central Chile region showing the GNSS velocity field and traces of active faults (from Maldonado et al. [55]). The red and blue rectangles indicate the areas shown in panels (**b**) and (**c**), respectively. (**b**) Geographical location of the metropolitan area of Santiago de Chile; the 30 m ALOS DEM was used as background. (**c**) Cross-section of horizontal (blue) and vertical (red) velocities derived from GNSS data (from Donoso et al. [56]). A swath profile of the topography is shown in gray.

The Santiago forearc basin developed during successive tectonic events divided into two stages. The first stage, from the Middle Eocene to the Oligocene-early Miocene, was characterized by an extensional setting, which created depocenters where the Abanico Formation accumulated [57]. A second stage of compression, linked to an increase in the plate convergence rate [58], dominated the evolution of the Abanico Basin until the Miocene. This produced the change to a compressive regime, with a partial tectonic inversion of the Abanico Basin from the late Oligocene to early Miocene, inverting the larger NS normal fault systems generated in the first stage [59–63]. The Coastal Cordillera, located on the western flank of the basin, is composed of Jurassic to Late Cretaceous volcanic and sedimentary sequences and Jurassic to Cretaceous intrusive rocks. At the eastern border of the basin, the foothill of the Andean Cordillera corresponds to a thrust deformation front that causes the uplift of the Andean Cordillera [64] and, at the same time, the erosional process that provides the sedimentary supply for the basin infill.

Given the high-energy mountain erosional process, most of the basin deposits correspond to coarse gravel material (Figure 2a); however, in the northern part of the basin, some fine sediments associated with low-energy lacustrine-type deposits are still observed (Figure 2b). Based on gravity measurements constrained by geological observations at wells, it has been possible to estimate the thickness of the sedimentary cover as being in the range of 100–400 m with an irregular morphology [3]. The role of the eastern deformation front, whose westernmost branch corresponds to the San Ramon Fault System (SRFS), has been a source of considerable scientific debate in terms of its relevance as a seismic hazard source [7–10,65]; maximum seismic events and recurrence times are among the most relevant unresolved questions. Finally, the basin water table is controlled by the surface topography (tilted to the west), the asymmetry of the basin (shallowing to the west), the overexploitation of the water resource, and the limited recharge due to a prolonged drought [66]. Rates of water table descent in the last 10 years range between 0.3 and 1 m/year, probably due to a combined effect of the lack of recharge and overexploitation. These first-order geological characteristics of the Santiago Basin are used in this study to explain the surface deformation derived from the DInSAR observations.

**Figure 2.** (**a**) Geological map and (**b**) Sediment map of Santiago Basin (From Yanez et al. [3]).

#### **3. Materials and Methods**

The methodology describes the study of the deformation in the Santiago Metropolitan Area during an observation period of 3 years (2018–2021), using the multitemporal differential interferometry technique (DInSAR). The data obtained with the interferometric radar is complemented by a contextualization of the deformation obtained with the GNSS stations and, furthermore, by available geological and hydrogeological studies to propose a plausible interpretation of the observed anomalous domains. We integrate the results of the DInSAR analysis with the geological background, which allows us to describe the processes responsible for the detected anomalies. Thus, we present detailed high-resolution maps of the entire Santiago Basin, where records of ground deformation were previously lacking. In addition, the long time series with DInSAR and the high spatial density of PSI (Persistent Scattering Interferometry) can undoubtedly contribute to urban planning in Chile.

The main phases of the workflow include: (1) SAR image processing to generate interferograms; (2) SAR time series analysis; (3) vertical and slope velocity estimation in the post-processing; (4) overlaying with geological and hydrogeological information and Digital Elevation Models (DEM); and (5) interpretation of the processes responsible for the anomalies detected.

#### *3.1. Data Set*

We used 204 C-band, VV-polarized Interferometric Wide swath (IW) SAR scenes acquired by the Copernicus Sentinel-1 mission [67] along ascending and descending orbits (Table 1). The revisit time for each stack is 12 days, with a resolution of 5 m (ground range) by 20 m (azimuth) and incidence angles (θ) of 31° to 46°. We use the IW-2 sub-swath and bursts 2–4 to cover the Santiago Basin area of interest. Through the GEP platform, we retrieved updated SAR images from the ESA Open Access Hub repositories as Single Look Complex (SLC) interferometric products [68]. The advantages of Sentinel-1 come from its wide coverage (250 km swath in IW mode) and a spatial resolution sufficient for large areas (90 m × 90 m in range and azimuth, in this case). The IW acquisition mode is based on Terrain Observation with Progressive Scans (TOPS), a ScanSAR-type technique.


**Table 1.** Data set including the main features of Sentinel 1.

#### *3.2. Data Processing*

The processing was run on the Advanced Earth Sciences Cluster operated by Terrafirma, with a run-time of approximately 48 h. We adopted the SAR processing service "CNR-IREA P-SBAS Sentinel-1 on-demand processing" v.1.0.0, which runs on ESA's next-generation GEP iCloud platform and is implemented in the ESA GRID computing environment [43]. The processing approach is based on the SBAS technique [39], applied along the descending orbit of Sentinel-1B (C-band SAR sensor wavelength = 5.6 cm). The algorithm was adapted to run efficiently on high-performance distributed computing and configured for Sentinel-1 IW TOPS data processing [45].

The main processing steps of the SBAS method [39] consist of the generation of differential interferograms from SAR image pairs with a small orbital separation (spatial baseline) to reduce spatial decorrelation and topographic effects. The 1 arcsecond Shuttle Radar Topography Mission (SRTM) DEM from NASA [69] (~30 m pixel size) and precise orbits from the European Space Agency (ESA) were used for the co-registration and for the removal of the topographic signal from the interferometric phase in each of the computed interferograms.

Each SLC data stack was co-registered at the single burst level, ensuring very high co-registration accuracy (on the order of 1/1000 of the azimuth pixel size), as required for TOPS data due to the large along-track Doppler centroid variations [70]. The minimum temporal coherence threshold was set at 0.85. Atmospheric phase components were identified and removed. The reference point for the P-SBAS processing was established at the same location in the center of Santiago for both orbits (coordinates lat: −62.555, lon: −35.172), where the annual LOS velocity values were taken, and the time series were referenced accordingly. The use of a common reference point allowed internal calibration of the two output data sets.

#### *3.3. Post-Processing*

Using the Persistent Scatterer Interferometry (PSI) dataset, we estimate the vertical velocity and the velocity along the steepest slope using the principles described below.

The vertical velocity component *VU* was estimated by combining the ascending and descending data sets. A regular grid of 90 m square cells was used to resample the point data sets and merge the output data sets into a single layer. Where both P-SBAS outputs (ascending and descending) were available at the same location *i*, the combination was achieved under the assumption of negligible north–south velocity, *VN* = 0. This assumption is typically used in DInSAR studies to account for the poor sensitivity of the LOS measurements to north–south horizontal motions [68]. The horizontal E–W displacement components in the Santiago Basin region are much lower than the vertical ones; therefore, they were not considered in our case.
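The resampling step can be sketched as follows. This is a minimal illustration, assuming projected coordinates in metres and simple per-cell averaging; the averaging rule, the function names `grid_mean` and `common_cells`, and the sample points are illustrative assumptions, not details of the P-SBAS processing chain.

```python
import numpy as np

def grid_mean(x, y, v, cell=90.0):
    """Average point velocities falling in the same square cell.

    x, y are projected coordinates in metres (assumption);
    returns {(col, row): mean velocity}.
    """
    cols = np.floor(np.asarray(x, dtype=float) / cell).astype(int)
    rows = np.floor(np.asarray(y, dtype=float) / cell).astype(int)
    sums, counts = {}, {}
    for c, r, val in zip(cols, rows, v):
        key = (int(c), int(r))
        sums[key] = sums.get(key, 0.0) + float(val)
        counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}

def common_cells(asc, dsc):
    """Keep only cells observed in both geometries: {cell: (v_asc, v_dsc)}."""
    return {key: (asc[key], dsc[key]) for key in asc.keys() & dsc.keys()}

# Hypothetical PSI points (coordinates in metres, LOS velocities in mm/year)
asc = grid_mean([10.0, 50.0, 100.0], [10.0, 20.0, 10.0], [-2.0, -4.0, -6.0])
dsc = grid_mean([30.0], [30.0], [-5.0])
pairs = common_cells(asc, dsc)
```

Only the cells returned by `common_cells` carry both an ascending and a descending value, which is the condition needed to combine the two geometries at a location *i*.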

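Given the known values of the LOS deformation velocity in the ascending (*VA*) and descending (*VD*) modes at each location *i*, and the east (*E*) and zenith (*U*) directional cosines of the LOS for each geometry, the vertical velocity *VU* is estimated as follows: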

$$V\_{Ui} = \frac{E\_{Di} \ast V\_{Ai} - E\_{Ai} \ast V\_{Di}}{E\_{Di} \ast U\_{Ai} - E\_{Ai} \ast U\_{Di}} \tag{1}$$
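As a check on Equation (1), the sketch below builds synthetic ascending and descending LOS velocities from a known east and vertical motion and then recovers the vertical component. The directional cosines and velocities are illustrative values chosen for the example, not values from the data set, and the function name `vertical_velocity` is hypothetical.

```python
import numpy as np

def vertical_velocity(v_asc, v_dsc, e_asc, u_asc, e_dsc, u_dsc):
    """Vertical velocity from ascending/descending LOS velocities (Equation (1)).

    Assumes V_N = 0, so that V_LOS = E * V_E + U * V_U in each geometry.
    """
    v_asc = np.asarray(v_asc, dtype=float)
    v_dsc = np.asarray(v_dsc, dtype=float)
    numerator = e_dsc * v_asc - e_asc * v_dsc
    denominator = e_dsc * u_asc - e_asc * u_dsc
    return numerator / denominator

# Illustrative directional cosines (NOT taken from the data set)
E_ASC, U_ASC = -0.62, 0.78
E_DSC, U_DSC = 0.62, 0.78

# Synthetic motion in mm/year: small eastward motion, 20 mm/year subsidence
v_east, v_up = 2.0, -20.0
v_asc = E_ASC * v_east + U_ASC * v_up
v_dsc = E_DSC * v_east + U_DSC * v_up

# Equation (1) eliminates the east component and returns the vertical one
v_u = vertical_velocity(v_asc, v_dsc, E_ASC, U_ASC, E_DSC, U_DSC)
```

Because the east term cancels algebraically, the recovered value does not depend on the assumed east velocity; only a real north–south motion would bias the estimate.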

The estimation of the deformation velocity along the steepest slope direction identifies zones potentially affected by landslides, based on the geomorphological principle that deformation is more likely to occur along the slope direction [71]. This holds for translational landslides and other types of movements present in some sectors of the study area, such as debris flows. Under the assumption that the displacements occur along the direction of the steepest slope, the LOS velocities were projected using the local values of slope (*β*) and aspect (*γ*) from the ALOS PALSAR DEM (30 m). These data were used to identify the orientation of the steepest slope in the Andes mountain range and to compute the conversion factor from LOS to slope values. *E*, *N*, and *U* are the directional cosines of the LOS and the SLOPE vectors in the east, north, and zenith directions, respectively, and they are defined as follows:

$$E\_{Si} = \sin \gamma\_{i} \ast \cos \beta\_{i}, \quad N\_{Si} = \cos \gamma\_{i} \ast \cos \beta\_{i}, \quad U\_{Si} = -\sin \beta\_{i}$$

The velocity along such direction (*VSi*) was then estimated as follows (Cigna et al. [72]):

$$V\_{Si} = \frac{V\_{LOSi}}{E\_{LOSi} \ast E\_{Si} + N\_{LOSi} \ast N\_{Si} + U\_{LOSi} \ast U\_{Si}} \tag{2}$$
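The projection in Equation (2) can be sketched as below. The slope cosines follow the definitions given above; the LOS unit vector, the sample slope and aspect, and the `min_abs_c` guard (which masks points where the conversion factor is close to zero and the projection would be amplified unrealistically) are assumptions introduced for illustration and are not stated in the text.

```python
import numpy as np

def slope_cosines(slope_deg, aspect_deg):
    """Directional cosines (E, N, U) of the steepest-slope vector."""
    beta = np.radians(slope_deg)
    gamma = np.radians(aspect_deg)
    return (np.sin(gamma) * np.cos(beta),
            np.cos(gamma) * np.cos(beta),
            -np.sin(beta))

def slope_velocity(v_los, los, slope_deg, aspect_deg, min_abs_c=0.3):
    """Project LOS velocity onto the steepest slope (Equation (2)).

    los is the (E, N, U) unit vector of the line of sight.
    min_abs_c is a hypothetical guard: points with |C| below it return NaN.
    """
    e_s, n_s, u_s = slope_cosines(slope_deg, aspect_deg)
    c = np.asarray(los[0] * e_s + los[1] * n_s + los[2] * u_s, dtype=float)
    usable = np.abs(c) >= min_abs_c
    safe_c = np.where(usable, c, 1.0)  # avoid division by near-zero factors
    return np.where(usable, np.asarray(v_los, dtype=float) / safe_c, np.nan)

# Illustrative LOS unit vector (NOT taken from the data set)
LOS = (-0.62, -0.11, 0.78)

# A 45-degree slope facing east; synthetic downslope motion of 12 mm/year
e_s, n_s, u_s = slope_cosines(45.0, 90.0)
c_factor = LOS[0] * e_s + LOS[1] * n_s + LOS[2] * u_s
v_slope = slope_velocity(12.0 * c_factor, LOS, 45.0, 90.0)
```

With these conventions a positive `v_slope` corresponds to motion in the downslope direction, since the slope vector points downhill (negative zenith component).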

#### **4. Results and Discussion**

GNSS velocities allow us to characterize the regional deformation field in Central Chile. These data show widespread uplift in and around the Santiago Basin. However, the distribution of GNSS stations does not allow the identification of local uplift and subsidence. Hence, DInSAR data are an excellent complement to improve the spatial resolution of land-level changes and explore features of local deformation. The results of the SAR image processing provide detailed information on ground deformation over the study area. The data set contains the Persistent Scatterer Interferometry (PSI) data, where each PSI record contains a LOS displacement time series; the average LOS velocity; the temporal coherence (> 0.85); the average elevation of the scatterer (topography); and the unit vectors that are the directional cosines of the LOS in the east (*E*), north (*N*), and zenith (*U*) directions, respectively. These were then used for the vertical and steepest-slope velocity projections in the post-processing phase.

The output data set is a CSV file that follows the specifications of the European Plate Observing System Implementation Phase (EPOS-IP), in which the metadata corresponding to the LOS velocity in raster (.png) and Google Earth (.kml) formats are standardized. With these contents, we produced georeferenced GIS-based cartography (WGS84, UTM zone 19S) to represent and provide evidence for the deformation over the study area. Our results present an overview of the ground deformation for both orbits, selecting the PSI of the Santiago Basin area and identifying the local subsidence areas. In addition, we present the displacement map of the estimated vertical components based on the projections of the ascending and descending LOS, and the displacement along the maximum slope in the Andean Cordillera area, obtained by resampling the LOS velocity.

#### *4.1. PSI Measurement and Classifications*

PSI measurements from 2018–2021 have been classified according to deformation velocity (mm/year), with a continuous color scale that varies from dark green to dark red. Negative velocity values indicate motion away from the satellite sensor (orange to dark red PSI), while positive values indicate motion toward the sensor (light to dark green PSI). Although we have not used the maximum resolution of the Sentinel-1 IW images (the PSI are spaced on a 90 m × 90 m grid), the interferometric data deliver a high density of points, reaching 100 PSI/km² in urban areas and 60 PSI/km² in rural areas, where denser vegetation cover decreases SAR detection. In both cases, the resulting data set is adequate to represent the deformation in the study area.

We obtained the PSI LOS deformation maps for the ascending and descending orbits, covering the entire study area: the Santiago Basin and the western flank of the Andean mountain range (see Figure 3). An indicator of the compatibility between acquisition geometries is the standard deviation of the mean LOS velocities, which corresponds to 0.47 mm/year and 0.55 mm/year for the ascending and descending orbits, respectively. Given the difference in the area covered by each orbit, the relatively greater number of PSI points in the ascending solution could be explained by the morphology of the area and the angle of incidence on the mountainous terrain. Apart from the effect of area coverage, the total number of PSI targets appears comparable. This similarity can be attributed to the common observation period of the ascending and descending data sets. PSI LOS velocities in 2018–2021 ranged from −25.56 to +2.73 mm/year in the ascending dataset and from −28.79 to +3.05 mm/year in the descending dataset (see Table 2).


**Table 2.** Basic PSI statistical comparison for ascending and descending orbits.

#### *4.2. Deformation Overview and Vertical Displacement*

The deformation overview of the Santiago Metropolitan Area was calculated from the vertical projection of the ascending and descending LOS velocities. We resampled the data considering the similarity of the data sets from both orbits. Vertical velocity component estimates were calculated after combining the two data sets, using a grid of 90 m square cells to resample the point data sets onto a regular grid and link the output data sets. The resulting single digital layer in SHP format allowed us to create a new vertical deformation map (see Figure 4a).

**Figure 3.** LOS displacement maps for the study area: (**a**) Ascending orbit (A18), and (**b**) Descending orbit (D156), with ALOS DEM (30 m) used as background.

**Figure 4.** (**a**) Vertical displacement map for the study area, (**b**) Sediment cover map, and (**c**) Density of sedimentary infill map (modified from Yanez et al. [3]), with ALOS DEM (30 m) used as background.

The median ground stability is represented by the yellow PSI and varies between −4.99 and 0.00 mm/year, a general trend of negative velocities that indicates pervasive subsidence. On top of this general trend, we found two anomalous domains in which the subsidence rate is much larger: (a) the northern domain, over a NE-elongated surface of more than 500 km² (Quilicura, Chicureo, and Colina localities), in which subsidence of up to −25.00 mm/year is observed (Figure 4a); and (b) the southeastern domain, along a more restricted surface of 250 km² (Paine and Huelquen localities), with subsidence on the order of −15.00 mm/year. The two anomalous domains of relatively large subsidence rates are associated with fine deposits of lacustrine origin in the northern domain and gravel deposits in the southern domain (Figure 4b). However, in terms of deposit density, both anomalous domains share the presence of low-density deposits, which in the southern region are probably associated with its distal position with respect to the Maipo River. On the other hand, these two anomalous domains show different land use over the last few decades: while the southeastern domain is still dedicated to agricultural activities, the northern domain has experienced a rapid transition to urban areas. In Section 4.3 we interpret these anomalous domains in terms of the interaction between soil characteristics and the spatial and temporal evolution of the groundwater processes in the basin.

Finally, the most stable soils are in Santiago's center and are shown by green PSI with values between 0.99 and 5.00 mm/year. For this area, a slight upward trend can be observed, probably caused by the tectonic processes that affect the Santiago Basin regionally. For the area near the San Ramon fault, no indication of fault activity was observed in the analyzed time window; however, an uplift caused by the same regional tectonic effects as those in the central area of Santiago is observed.

#### *4.3. Ground Deformations and Time Series*

Multi-temporal satellite radar interferometry is based on the analysis of a series of SAR (Synthetic Aperture Radar) images, which, in our case, were acquired between May 2018 and May 2021. By selecting the most stable targets in the anomalous areas (temporal coherence > 0.85), which maintain their electromagnetic scattering signature in each image, it is possible to measure ground displacements by exploiting the phase shift (related to the sensor–target distance) and the amplitude of the signal reflected from the ground surface.

The product generated by this multi-interferometric analysis is a cartographic representation of the time series, with a cumulative displacement value of up to −100 mm for isolated PSI targets and values of −74–50 (mm) for homogeneous areas (see Figure 5). The interferometric records for both orbits allowed us to follow the temporal evolution of the deformation in the study area and to build the time series in the anomalous areas. For this purpose, we selected 8 pixels of the study area, numbered 1 to 8, each of which is assigned the name of the locality where the deformation occurs.

Deformation time series represent the most advanced DInSAR product. They provide the history of the deformations during the observed period, which is fundamental to represent and study the ground deformation in the study area and its correlation with the inducing factors. To properly use, interpret, and exploit deformation time series, it is important to consider that they can be affected by geometric and atmospheric distortions. In fact, they contain an estimate of the deformation for each acquisition (SAR image), so they are particularly sensitive to phase noise [40].

Following the results in the deformation area (Pixels 1–8), we focused on a quantitative analysis, comparing the ascending and descending time series for each pixel based on the velocity and R² parameters. Figure 6 shows the comparison of the LOS time series obtained in the Sentinel-1 processing phase for the ascending (blue) and descending (red) orbits. Overall, there is very good agreement between the measurements. However, the difference between the ascending and descending orbits in the SAR-derived time series 7 and 8 (Figure 6) is a consequence of the distance between the PSI of each orbit, which reaches up to 100 m, so the pixels can detect different local deformations. For time series 1, 2, 3, 4, and 6, the values of R² (Figure 6) are close to 1.0, indicating a good fit; therefore, the results of both orbits are mutually consistent and indicate a negative displacement trend (subsidence). For time series 5, the values of R² are close to 0 for both orbits, indicating no significant deformation trend. In time series 7 and 8, the R² values are also close to 1.0; however, there is a difference due to the distance between the PSI pixels, and the series indicate a trend of seasonal subsidence.
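The two parameters used in this comparison, velocity and R², can be obtained from a least-squares line fitted to each displacement time series. The sketch below is a minimal illustration on synthetic series; the function name `linear_trend`, the 12-day sampling, and the sample displacements are assumptions for the example and do not reproduce the actual Pixel 1–8 data.

```python
import numpy as np

def linear_trend(t_years, disp_mm):
    """Fit d = slope * t + intercept; return (slope in mm/year, R^2)."""
    t = np.asarray(t_years, dtype=float)
    d = np.asarray(disp_mm, dtype=float)
    slope, intercept = np.polyfit(t, d, 1)
    fitted = slope * t + intercept
    ss_res = np.sum((d - fitted) ** 2)      # residual sum of squares
    ss_tot = np.sum((d - d.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    return slope, r2

# Synthetic 3-year series sampled every 12 days (time in years)
t = np.arange(0.0, 3.0, 12.0 / 365.25)

# Steady subsidence of 20 mm/year: velocity ~ -20, R^2 ~ 1
subsiding = -20.0 * t
v_sub, r2_sub = linear_trend(t, subsiding)

# Oscillation around zero with no trend: velocity ~ 0, R^2 ~ 0
stable = np.where(np.arange(t.size) % 2 == 0, 1.0, -1.0)
v_stab, r2_stab = linear_trend(t, stable)
```

A series with a strong seasonal component superimposed on a trend would sit between these two cases, which is why R² is read together with the velocity rather than on its own.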

**Figure 5.** Map of cumulative displacements and temporal evolution: (**a**) Recorded in May 2019, (**b**) Recorded in May 2020, and (**c**) Recorded in May 2021, with Maxar image (source Esri) used as background. Numbers in each panel label the localities in which the deformation time series response is described in the text of Section 4.3.

The time series of displacements show values of up to −60 mm for the period May 2018–May 2021, or ~20 mm/year. The greatest deformations are located in the northern area of Santiago, in the towns of Polpaico (Time series 1), Lampa (Time series 2), and Quilicura (Time series 3–4). On the other hand, the area of the San Ramon fault (Time series 5) appears stable, and there is no evidence of activity on the fault. In Maipu west (Time series 6), an anthropogenic deformation caused by the increase in urbanization in the area is observed. The wetland "Aculeo" (Time series 7) is close to the "Aculeo" lake, which has been drying up dramatically [73,74]. Finally, Paine (Time series 8), an area of intensive agriculture, shows an increase in subsidence in recent years.

#### 4.3.1. Subsidence and Groundwater Spatial and Temporal Evolution

Given the spatial relationship between anomalous deformation (mostly subsidence), soil characteristics, and the known increase of groundwater exploitation, both for agriculture and for human consumption, we explore in this section the working hypothesis that the main factors causing the deformation of the Santiago Basin are linked with changes in groundwater flow and static levels.

**Figure 6.** Time series 1–8 in red ascending orbit and blue descending orbit, located at Polpaico, Lampa, Quilicura north, Quilicura west, San Ramon fault, Maipu west, Wetland Aculeo, and Paine.

In order to determine the water table of the Santiago Basin, we include in Figure 7b a water table grid produced by DGA (2000) [75], which is a compilation of data up until the year 2000. This map shows that the anomalous DInSAR domains (Figure 7a) are located in areas where the water table is shallow. In addition, with the aim of validating this water table map while exploring the time evolution of the water table, we analyzed 48 wells within the area of interest. From these wells we extracted the time series of the water table, sampled at 2–4 points per year. From these data we calculated a trend to obtain the gradient per year and an average value for the last 10 years. Comparing Figure 7b,c, we conclude that the DGA 2000 map is basically correct and still valid today, with the deeper water table in the central and eastern part of the basin and the shallower ones on the northern, southern, and western flanks. The majority of the wells show a decrease in their water table, some of them in excess of 1–2 m/year; however, in the central part of the basin (Quinta Normal), the well shows that the water table is actually becoming shallower (at rates greater than 1.5 m/year). A qualitative comparison between the water table evolution and the surface deformation according to DInSAR shows an apparent inconsistency: zones with large water table oscillations, such as the central part of the basin, show no or minor deformation. However, a closer look shows that, in general, these areas are associated with zones of deeper water table and coarse soil in the high-energy area of the fluvial system. In order to gain a better understanding of the role of groundwater dynamics in surface deformation, we need to consider some basic concepts of poroelasticity.

**Figure 7.** (**a**) Ground deformation map and well distribution (white points), (**b**) Static level map (modified from DGA 2000), and (**c**) Piezometric variations map, with ALOS DEM (30 m) used as background.

According to the poroelastic model developed by Terzaghi, K. [76] and Biot, M. [77], the variation in effective stress (geostatic stress minus pore pressure) is related to changes in the hydraulic head (Δh). If such a stress change is applied to a compressible aquifer matrix, it results in matrix deformation, including surface deformation (Δu). For a simple one-dimensional compaction model, Terzaghi, K. [76] established a linear relationship between Δu and Δh, where the constant of proportionality is the skeletal storage (Sk), such that Δu = Sk ∗ Δh. The main factors that control skeletal storage are the compression index (Cc) and the effective stress, with direct and inverse dependence, respectively. Effective stress increases with depth, so larger effects are expected in shallow aquifers. The compression index depends on the granulometry of the soil: fine soils (e.g., clay and silt) show larger Cc, whereas coarse and more rigid soils (e.g., gravels) show Cc values of practically zero, regardless of any other factor.

In the Santiago Basin, the DInSAR data show a major subsidence effect in the northern anomalous domain (Figure 7a), in good spatial agreement with the region in which the static level is shallower than 10 m (Figure 7b) and where fine sediments of lacustrine origin are emplaced (Figure 4b in Yanez et al. [3]). The southeastern anomalous domain is also located in a shallow water table region, where the gravel deposits have low density, suggesting distal, finer sediments. On the other hand, most of the central and southern part of the basin is dominated by high-energy gravel material, which is highly rigid and has an almost zero compression index. In addition, the water table in this region lies at depths greater than 50 m (Figure 7b,c). Both factors correctly predict minor or null surface deformation, as observed in the DInSAR data. The Δu/Δh ratio, which is the skeletal storage in this case, is on the order of 0.01 in the anomalous domains. These values are in good agreement with observations in central Mexico [78], Kumato area, Japan [79], and Tucson, Arizona [80].
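As a rough order-of-magnitude check of the one-dimensional relationship Δu = Sk ∗ Δh, the sketch below applies a skeletal storage of 0.01 to head declines in the range reported for the wells. The function names and the chosen head declines are illustrative assumptions; the calculation is a consistency check, not a calibrated compaction model.

```python
def subsidence_rate_mm(head_decline_m_per_year, skeletal_storage=0.01):
    """One-dimensional compaction: delta_u = Sk * delta_h, returned in mm/year."""
    return skeletal_storage * head_decline_m_per_year * 1000.0

def skeletal_storage_from_rates(subsidence_mm_per_year, head_decline_m_per_year):
    """Invert the same relationship: Sk = delta_u / delta_h (dimensionless)."""
    return (subsidence_mm_per_year / 1000.0) / head_decline_m_per_year

# Illustrative head declines in m/year, spanning the range mentioned in the text
rates = {dh: subsidence_rate_mm(dh) for dh in (0.3, 1.0, 2.0)}

# Inverse check: 20 mm/year of subsidence for a 2 m/year head decline
sk_estimate = skeletal_storage_from_rates(20.0, 2.0)
```

With Sk ≈ 0.01, head declines of 1–2 m/year correspond to roughly 10–20 mm/year of subsidence, the same order of magnitude as the rates mapped in the two anomalous domains.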

#### *4.4. Landslide Identification*

In recent decades, the metropolitan area of Santiago has experienced sustained growth in urbanization towards the Andes Mountain range, where landslide activity is more frequent. The occurrence of landslides in the front of the mountain and on the slopes of the interior basins and, in particular, debris flows that can reach the alluvial plain are common and represent an increasing risk for populated areas [81,82]. A large debris flow event in 1993 [83,84] is an example of the potential impact of catastrophic landslides in the area. The most common types of landslides in the area are debris flows that occur from the mountain range towards the city, rock falls from steep and fractured slopes, and rock slides [85].

To identify landslides, it is first necessary to identify the geomorphological units, because each unit has different conditions that favor one type of landslide but not another. For this reason, we used the susceptibility map (Figure 8a) to focus the analysis mainly on debris and rock flows. We also took into account the high coherence of the SAR signal in highly reflective rocky areas. To compare our PSI interferometric data, we used a landslide susceptibility map [86], which shows areas that have the potential for landslides (see Figure 8a), determined by the correlation of some of the main factors that contribute to landslide generation. The area is affected by the main geometric distortions (shadow, layover, and foreshortening), and the resulting differences between the ascending and descending orbits are evident in Figure 8b,c.

**Figure 8.** (**a**) Landslide susceptibility map (modified from Celis, C. [86]), (**b**) LOS displacement map for ascending orbit, and (**c**) LOS displacement map for descending orbit, with ALOS DEM (30 m) used as background.

From the acquisition geometry, within the area of visibility, the closer the slope is to the incidence angle of the satellite, the better the LOS deformation reflects the real deformation; under this assumption, we selected the descending LOS to continue our analysis. A one-dimensional component along the LOS is suitable for rotational landslides, but not for translational landslides. In the study area, the most representative processes are debris flows, which move parallel to the direction of the steepest slope. In this case, we project the LOS velocity along the steepest slope using the formula proposed by Cascini et al. [71], Colesanti and Wasowski [87], and Plank et al. [88]. The slope and aspect values were calculated in GIS and then used in the post-processing phase to calculate the estimated velocity along the steepest slope direction (Vslope), recalculating the velocities of the PSI targets; the resulting PSI are presented in Figure 9c. For the Vslope projection, we consider slopes (β) > 40° and slope aspects (γ) facing northwest, southwest, and northeast, located in the white box (Figure 9b). We notice that the Vslope velocities increased along the slope and were mostly recorded on rocky slopes and soils located on higher slopes (Figure 9a). These results confirm the correspondence with the most landslide-susceptible areas in the highest parts of the ravines (Figure 8a) and the usefulness of the DInSAR approach to identify regions susceptible to landslide processes.

**Figure 9.** (**a**) Slope map with the highest values in pink color, (**b**) Aspect map indicating the values of the direction of the slopes, with ALOS DEM (30 m) used as background, and (**c**) Velocities along steepest slope (Vslope map), with Maxar image (source Esri) used as background.

#### **5. Conclusions**

The metropolitan area of Santiago de Chile is exposed to numerous natural and anthropic processes that induce regional and local deformations that affect the stability of the basin and its surrounding foothills. GNSS-derived velocities show that the Santiago Basin is tectonically stable, showing typical patterns of interseismic deformation, such as slow eastward motion and uplift. DInSAR allowed us to quantify with excellent spatial resolution areas with local subsidence within the Santiago Basin and landslides in mountainous regions. This allowed us to map the Santiago Basin's ground stability and investigate the mechanisms responsible for these surface-level variations. Therefore, our study highlights the use of the SAR interferometry technique for the detection of subsidence caused by groundwater exploitation and geohazards related to landslides. Our estimates of surface level changes in the Santiago basin will also be of great importance for urban planning; these results emphasize the mechanical effect of sediment thickness on surface stability and demonstrate how groundwater extraction induces significant land subsidence.

In general, DInSAR shows that the ground surface level in the Santiago basin is relatively stable, but there are areas showing anomalous subsidence. These areas are located in places where groundwater is exploited at the expense of non-renewable storage in aquifer systems, which causes land subsidence and other environmental impacts such as the degradation of water quality, reduced spring discharge, and reduced river flow. In Quilicura, north of Santiago, the exploitation and compaction of the aquifer are more evident, and a critical condition is noted. For Paine, in the southern area of Santiago, the deformations have become more evident in recent years and are influenced by intensive agriculture in the area. On the other hand, although the area is located in the Chilean subduction zone, there is no evidence of large deformations influenced by tectonic movement, in particular, movement linked to San Ramon fault activity. However, we do not rule out that the fault is active; more observation time is needed to estimate potential deformations across the fault.

Our results show a constant evolution in the subsidence of the anomalous zones, indicating that water withdrawals have continued to affect soil stability during the observed period. This work also provides semi-theoretical relationships in Section 4.3.1 to link information on metropolitan-scale groundwater use with compaction and storage loss, which could allow predictions of subsidence rates and volumes for different groundwater management scenarios.

In the reliefs of the Andes mountain range, SAR interferometry allowed us to record information in places with difficult access, quickly and efficiently, and to detect numerous measurement points with the SAR sensor. Based on the Vslope projection, we were able to estimate the velocities and determine the state of movement of the slopes, comparing them with the landslide susceptibility catalogs to obtain reliable results. Our results highlight the potential of interferometry to identify landslides; however, more in situ information, such as materials, type of geomorphological units, and meteorological data, is required in order to confirm and define more precisely the activity of the phenomenon and the types of movements present in the area. Our study demonstrates that InSAR provides a spatial accuracy that allows for the detection of ground instabilities in areas above ~500 m². The interferograms have excellent coherence in the analyzed urban areas of the Santiago basin. In mountainous regions there may be topographic effects that introduce noise into ground motion estimates, which is why we limited ourselves to using only the descending orbit for landslide identification. Furthermore, the integration of geological information enabled us to interpret the observed ground elevation anomalies in the Santiago Basin. Our results highlight that the thickness of the basin is a key factor for ground stability; in contrast, groundwater extraction induces local instability. These results are fundamental for urban territorial planning in the city of Santiago and demonstrate the importance of geodetic measurements in assessing the impact of climate change on groundwater storage and how this affects the ground surface elevation.

**Author Contributions:** Conceptualization, F.O., M.M. and G.Y.; methodology, F.O., M.M. and G.Y.; SAR data processing, F.O.; validation, G.Y. and M.M.; writing, F.O., M.M. and G.Y.; supervision, M.M. and G.Y. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by the Millennium Nucleus CYCLO (The Seismic Cycle Along Subduction Zones) (ICM) grant NC160025.

**Data Availability Statement:** The geological maps are available in Yanez et al. [3]. The catalog of active faults is available at: https://fallasactivas.cl/ (accessed on 10 September 2022), under license https://creativecommons.org/licenses/by/4.0/. The hydrogeological data are available on the portal of the DGA 2000, Chilean public water directorate: https://dga.mop.gob.cl/servicioshidrometeorologicos/Paginas/default.aspx (accessed on 15 September 2022). Landslide catalogs are available in the master thesis under license https://creativecommons.org/licenses/by-nc-nd/3.0/cl/ (accessed on 10 October 2022).

**Acknowledgments:** F.O. acknowledges support from the ESA NoR Project ID: 65514. M.M. acknowledges support from the ANID PIA Anillo ACT192169, and National Research Center for Integrated Natural Disaster Management (CIGIDEN).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Citation:** *Remote Sens.* **2022**, *14*, 5810. https://doi.org/10.3390/rs14225810

## *Article* **Prediction of Mine Subsidence Based on InSAR Technology and the LSTM Algorithm: A Case Study of the Shigouyi Coalfield, Ningxia (China)**

**Fei Ma <sup>1,2,</sup>\*, Lichun Sui <sup>2</sup> and Wei Lian <sup>1</sup>**


**Abstract:** The accurate prediction of surface subsidence induced by coal mining is critical to safeguarding the environment and resources. However, the precision of current prediction models is often restricted by a lack of pertinent data or by imprecise model parameters. To overcome these limitations, this study proposes an approach to predicting mine subsidence that combines Interferometric Synthetic Aperture Radar (InSAR) technology with the long short-term memory (LSTM) network. The proposed approach uses small baseline multiple-master high-coherent target (SBMHCT) InSAR technology to monitor the mine surface and applies the LSTM algorithm to construct the prediction model. The Shigouyi coalfield in Ningxia Province, China, was chosen as the study area, and time series ground subsidence data were obtained from Sentinel-1A data acquired from 9 March 2015 to 7 June 2016. To evaluate the proposed approach, the prediction accuracies of LSTM and Support Vector Regression (SVR) were compared. The results show that the proposed approach could accurately predict mine subsidence, with maximum absolute errors of less than 2 cm and maximum relative errors of less than 6%. The findings demonstrate that combining InSAR technology with the LSTM algorithm is an effective and robust approach for predicting mine subsidence.

**Keywords:** InSAR technology; mines; time series subsidence monitoring; deformation prediction; deep learning; LSTM algorithm

#### **1. Introduction**

In recent years, satellite remote sensing images have been increasingly used to monitor land surface changes, owing to the continuous development of computer and space satellite technology. In particular, InSAR has emerged as a popular technology for monitoring surface subsidence, especially goaf deformation, as it is not affected by weather and has extensive coverage [1]. Goaf is formed after coal is extracted from underground, and the continuous and efficient monitoring of surface subsidence above goaf can help to assess subsidence damage to surface structures, explore mining subsidence mechanisms, and provide a decision-making basis for geological disaster prevention and ecological restoration in mining areas [2]. The authors of [3] estimated that the total economic losses due to subsidence from coal mining in China were approximately 32 billion CNY (about 4.9 billion USD) from 2001 to 2010. These losses were primarily due to damage to buildings, roads, and other infrastructure caused by surface subsidence above goaf. Surface subsidence caused by goaf formation can lead to accidents such as landslides, rockfalls, and collapses, which can result in injuries and fatalities; the authors of [4,5] reported that coal-mining-related subsidence causes accidents resulting in deaths or injuries worldwide every year. Efficient monitoring of surface subsidence above goaf can help in preventing accidents and reducing economic losses [6–8]. The traditional form of goaf surface subsidence monitoring relies on point-like monitoring stations, which are characterized by high resource consumption, low efficiency, limited coverage, and insufficient monitoring capability. Thus, it is of great theoretical value and practical significance to use InSAR, a new method for monitoring goaf surface subsidence, to explore the formation mechanism behind subsidence in key mining areas and to predict the evolution and development trend of subsidence. Recent advances in deep learning theory have had a significant impact on time series prediction. As a result, an increasing number of deep-learning algorithms are being used for long time series prediction, making it possible to obtain mine subsidence characteristics and dynamic forecasts in mining areas [9,10]. Deep-learning algorithms such as artificial neural networks (ANNs) [11] and recurrent neural networks (RNNs) [12] can be used to analyze long time series data and predict mine subsidence characteristics and dynamics. However, the prediction accuracy of these methods still needs improvement.

**Citation:** Ma, F.; Sui, L.; Lian, W. Prediction of Mine Subsidence Based on InSAR Technology and the LSTM Algorithm: A Case Study of the Shigouyi Coalfield, Ningxia (China). *Remote Sens.* **2023**, *15*, 2755. https://doi.org/10.3390/rs15112755

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 26 March 2023 Revised: 21 May 2023 Accepted: 22 May 2023 Published: 25 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

InSAR, an active remote-sensing technology, has been widely used for monitoring subsidence and surface deformation [13]. Initially, this technology was employed for ground elevation mapping [14] and was subsequently extended to surface deformation monitoring [15]. However, the atmospheric phase delay and the temporal and spatial decorrelation associated with two-pass InSAR can lead to phase unwrapping failure. Therefore, researchers proposed time series InSAR monitoring technology [16], including PS-InSAR [17] and SBAS-InSAR [18], with the latter being more suitable for deformation monitoring in mining areas. SBAS-InSAR forms interferometric pairs from SAR data sets covering the same area and selects the interferograms whose temporal and spatial baselines meet the thresholds [19]. The deformation of the highly coherent points in the images is then reconstructed from the interferometric phase. SBAS-InSAR has shown high-quality monitoring results with sub-centimeter accuracy [20,21]. G. Herrera et al. [22] demonstrated the monitoring capacity of InSAR technology for very slow landslides using multi-sensor and multitemporal SAR data. Dario Peduto et al. [23] used DInSAR technology to analyze building deformation and presented a multi-scale procedure tailored to the analysis of settlement-induced building damage, which can forecast building damage in urban areas. M. P. Sanabria et al. [24] proposed a methodology to produce subsidence activity maps based on PS-InSAR data; the displacement measurements are interpolated using conditional Sequential Gaussian Simulation, and the resulting maps are helpful for identifying wide subsiding areas.

In the context of mining subsidence, InSAR technology has been increasingly recognized as a valuable tool for monitoring surface deformation. However, predicting subsidence movement remains a challenge and requires a prediction model that integrates InSAR data. To date, two broad categories of prediction models have been employed: traditional and late models. Traditional models use various technical methods to obtain surface deformation data post-mining and predict the maximum deformation value using mathematical functions or numerical models. Examples include numerical simulation, similar material simulation, probability integration, and other static prediction models [25].

The mining subsidence process is a complex spatio-temporal phenomenon, which limits the practical application of static prediction models because they cannot account for dynamic changes in the subsidence process [26]. Alternatively, continuous multi-period surface deformation data obtained by various technical means can be analyzed to predict the location and timing of maximum surface movement and deformation by incorporating time functions such as Knothe [27], Weibull [28], and Logistic [29]. However, these models can only capture the linear relationship between two vectors and are limited in their ability to predict nonlinear deformation in mining areas. Moreover, due to dynamic changes in mining practices, such as mode, speed, and roof management, the accuracy of dynamic time function simulations is often compromised [30]. In addition, the actual surface subsidence differs under different geological and mining conditions, yet the prediction result is the same whenever the same time function is used, which contradicts the actual situation. Therefore, researchers have focused on "late models", such as the grey model [31], regression analysis [32,33], support vector machine regression [9], Bayesian network [10], wavelet analysis [34], and artificial neural network [35,36]. These models rely on modern and efficient monitoring means such as GNSS, InSAR, and LIDAR to obtain long-term series monitoring data and analyze internal statistical laws and trends. However, these methods are sensitive to model parameters, and adding mining geological parameters for goaf prediction can be challenging.

Compared to traditional mining subsidence prediction methods and late models, deep learning offers a novel approach to this challenge. By formulating the relationship between variables as a regression problem, deep-learning algorithms can accurately capture how one or more independent variables influence the dependent variable. While earlier deep-learning techniques such as BP neural networks [37] and recurrent neural networks (RNNs) [12] have been developed, they are not ideal for long-term series prediction [38]. However, recent studies have shown that LSTM models, which combine RNNs with gating mechanisms, are better suited for long-term prediction [39]. These models use a cell structure in which the forget gate discards unnecessary information while the memory gate retains important information.

Homa Ansari et al. [40] conducted an experiment on the Lazufre Volcanic Complex, situated in the central volcanic region, and concluded that the signal error associated with InSAR technology is a crucial factor contributing to inaccurate predictions when combined with LSTM. Hill et al. [41] focused on the influence of seasonal perturbations on forecasting outcomes; the LSTM prediction methodology proved efficient for short-term projections (less than three months). Qinghao Liu et al. [42] proposed a heterogeneous LSTM network model, which integrates spatial heterogeneity into the prediction of ground subsidence and achieves accurate and efficient large-scale subsidence forecasts. Yi Chen et al. [43] demonstrated the effectiveness of an unimproved LSTM neural network approach for time-series InSAR land subsidence prediction. Although these studies all use InSAR technology and LSTM network models to forecast deformation, they produce contradictory findings because the significance of the InSAR training data is insufficiently recognized, particularly in areas affected by error signals, such as volcanic and mining regions. Overall, traditional InSAR needs to be improved to monitor mine deformation; inaccurate training data do not help improve the prediction accuracy of deep learning [44,45].

The purpose of this study is to obtain fine subsidence characteristics and accurate data on mining surfaces by improving InSAR technology and to realize the accurate prediction of mining surface subsidence in combination with the LSTM algorithm. This study presents an integrated monitoring and prediction model for the goaf surface that combines the SBMHCT-InSAR and LSTM algorithms. SBMHCT-InSAR technology improves several aspects of image processing and interpretation, namely the image registration algorithm, the interferogram filtering method, and the high-coherence point extraction method. The objective of these improvements is to mitigate the disruptive influence of noise signals and optimize the training data of the prediction model, ultimately resulting in higher accuracy. In practice, this technology is used to monitor goaf surface deformation, yielding settlement values as equal-interval time series training data. Additionally, the LSTM algorithm is employed to establish a deformation prediction model for coal mining areas by capturing the global dependence between input and output and learning the nonlinear patterns and features of the training data.

The main aim of this research is to forecast geological hazards while also addressing practical problems associated with the prediction of goaf subsidence using InSAR technology with deep learning theory. The study findings not only offer methodological support for mining subsidence management but also promote the quantitative application and development of InSAR technology. Therefore, the proposed model carries significant scientific and practical value.

#### **2. Materials and Methods**

#### *2.1. Site Selection*

The Shigouyi (SGY) mine area, one of the Ningdong coalfields in the eastern region of Ningxia Province, is depicted in Figure 1; it lies between 37°39′15″–37°45′17″N and 106°27′49″–106°30′44″E. It stretches westward to the Liupan Mountain tectonic zone and eastward to the Erdos coal seam, and it comprises a series of folds and faults. Because of its location on the Loess Plateau, the Shigouyi mine experiences a significant, concentrated distribution of surface damage and subsidence. The geological structure above the minable coal beds is highly fragile.

**Figure 1.** Geographical location of Shigouyi coalfield. The background is optical imagery of SGY coalfield.

Considering space constraints, this study focuses primarily on monitoring and predicting surface subsidence in the SGY coal mine. A total of 13 GNSS observation stations were installed above the working face of the mine, and the corresponding settlement data were collected to verify the experimental findings. Chinese Southsurvey GNSS receivers were used to collect GNSS data in real-time kinematic (RTK) mode. One receiver was placed at the base station, which was positioned on a stable surface, while the others monitored displacements at the GNSS stations. The GNSS receivers had horizontal and vertical accuracies of $\pm(10 + 1 \times 10^{-6} \times D)$ mm and $\pm(20 + 1 \times 10^{-6} \times D)$ mm, respectively, where *D* is the distance. From 9 March 2015 to 1 July 2016, GNSS-RTK measurements were taken at intervals of 24 days. First, a GNSS receiver on a tripod was installed at the reference station on the stable surface, the antenna height was measured, the receiver was switched on, and the reference station height, antenna height, and WGS84 coordinates were entered. The radio channel was then turned on and checked. Next, the roving station GNSS receiver, mounted on a centering rod, was switched on; the exact parameters were entered; the radio channel and number of satellites were checked; and simultaneous observation with the reference station receiver was carried out. Using the roving receiver, the coordinates and heights of the 12 GNSS stations were measured. Data were recorded at a sampling rate of 20 Hz, with an observation time of more than 180 s. These 12 GNSS stations were measured again in the same manner at intervals of 24 days in the ensuing months, and the deformation was calculated as the difference between epochs; quality control measures were taken each time. The reliability of the RTK results was verified by comparison with quick static measurements: at least three points were selected as checkpoints, and the quick static observation time exceeded 600 s. After data processing, the maximum height difference between the quick static and RTK measurements was less than 2 cm.
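As a rough illustration of the nominal accuracy figures quoted above, the following sketch evaluates the fixed-plus-distance-dependent error model for a given baseline length and differences the heights of two epochs. The function names, the 5 km baseline, and the two heights are hypothetical values chosen for illustration; they are not data from the study.

```python
def rtk_nominal_accuracy_mm(distance_mm, fixed_mm):
    """Nominal accuracy: a fixed part plus 1 ppm (1e-6) of the distance D."""
    return fixed_mm + 1e-6 * distance_mm


def height_change_mm(height_first_m, height_later_m):
    """Vertical displacement between two epochs (negative means subsidence)."""
    return (height_later_m - height_first_m) * 1000.0


# Hypothetical 5 km distance between the rover and the base station.
distance_mm = 5_000_000
horizontal_mm = rtk_nominal_accuracy_mm(distance_mm, fixed_mm=10.0)
vertical_mm = rtk_nominal_accuracy_mm(distance_mm, fixed_mm=20.0)

# Hypothetical heights of one station measured 24 days apart.
change_mm = height_change_mm(1350.250, 1350.212)
```

Under these assumed inputs the distance-dependent term adds about 5 mm to each fixed part, which is small compared with the centimeter-level checkpoint differences reported in the text.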

#### *2.2. Data Selection*

This study processed 19 SAR images (C-band) obtained from the Sentinel-1 satellite over a 15-month period from 9 March 2015 to 7 June 2016. Table 1 shows the parameters of the SAR data used in this study; their temporal resolution is 24 days, and their spatial resolutions are 20 and 5 m in azimuth and range, respectively.


**Table 1.** Sentinel-1A data covering SGY mine area used in this work.

The SAR data cover a large area that includes the SGY mine area; for computational efficiency, they were clipped during pre-processing. The European Space Agency (ESA) releases precise orbit determination (POD) ephemerides for all Sentinel-1 SAR data. These data are important for reducing registration errors and were used for phase re-flattening and orbital refinement. To eliminate the impact of topography on the measured surface deformation, the authors employed the three-arc-second Shuttle Radar Topography Mission (SRTM) Digital Elevation Model (DEM) obtained from the National Aeronautics and Space Administration (NASA).

#### *2.3. Fundamental Principle of SBMHCT-InSAR Technique*

The present study puts forward SBMHCT-InSAR technology for the precise inversion of surface deformation. The proposed approach integrates the Permanent Scatterer (PS) and high-coherence target methods. Linear and nonlinear deformation are inverted using the coherent target and singular value decomposition methods, respectively. SBMHCT-InSAR technology comprises the following key steps: combination of interferometric pairs with multiple master images, high-precision image registration, interferometric phase noise filtering, high-coherence target extraction, and deformation inversion. The SBMHCT-InSAR processing steps are as follows:

#### 2.3.1. Group the SAR Pairs

The 19 SAR scenes over the SGY mine area are ordered in time as (*t*<sub>0</sub>, ..., *t*<sub>*N*</sub>), so that *N* + 1 = 19. *M* interferograms are constructed using preset spatial and temporal baseline thresholds. The quantity *M* satisfies the following inequality:

$$\frac{N+1}{2} \le M \le N\left(\frac{N+1}{2}\right) \tag{1}$$

Figure 2 shows the connection diagram of the SAR pairs. The spatial and temporal baseline thresholds were set to 300 m and 200 days, respectively. The SAR image acquired on 2 April 2015 was selected as the super master image, to which the other images were co-registered and resampled. All image pairs that meet the threshold conditions form interferometric pairs, resulting in 78 differential interferograms.

**Figure 2.** (**a**) Time–position plot of interferometric pairs; (**b**) time–baseline plot of interferometric pairs. The yellow diamond denotes the super master image.
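The pair-grouping step can be sketched as a threshold filter over all image combinations, together with a check of the bounds in inequality (1). This is a minimal sketch under stated assumptions: the acquisition dates are taken as evenly spaced every 24 days, all perpendicular baselines are set to zero, and the thresholds are treated as inclusive. The function names are hypothetical, and the real baselines of the stack are not reproduced here.

```python
from datetime import date, timedelta
from itertools import combinations


def interferogram_bounds(num_images):
    """Lower and upper bounds of inequality (1) for N + 1 = num_images scenes."""
    n = num_images - 1
    return (n + 1) / 2, n * (n + 1) / 2


def select_pairs(dates, perp_baselines_m, max_days=200, max_perp_m=300.0):
    """Return index pairs whose temporal and spatial baselines meet both thresholds."""
    pairs = []
    for i, j in combinations(range(len(dates)), 2):
        temporal_days = abs((dates[j] - dates[i]).days)
        spatial_m = abs(perp_baselines_m[j] - perp_baselines_m[i])
        if temporal_days <= max_days and spatial_m <= max_perp_m:
            pairs.append((i, j))
    return pairs


# Idealized stack (assumption): 19 scenes every 24 days, zero perpendicular baseline.
dates = [date(2015, 3, 9) + timedelta(days=24 * k) for k in range(19)]
baselines_m = [0.0] * len(dates)

pairs = select_pairs(dates, baselines_m)
lower, upper = interferogram_bounds(len(dates))
```

In this idealized stack only the temporal threshold is active, so the pair count differs from the 78 interferograms reported in the text, which depend on the actual acquisition dates and perpendicular baselines.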

#### 2.3.2. Highly Accurate Image Registration

This paper presents an InSAR image registration method based on optimal matching points. First, an external DEM is used to simulate a SAR image, and matching features are extracted from the simulated image and from the SAR image to be registered. Then, the vector field consistent point set matching algorithm is employed to match the homonymous feature points between the primary and secondary SAR images, remove the outliers, and compute the polynomial transformation parameters for accurate registration. In this way, high-precision registration of the InSAR images is achieved.
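The final part of the registration step, estimating polynomial transformation parameters from tie points after outliers are removed, can be sketched as follows. This is not the vector field consistent matching algorithm named in the text: as a stand-in, the sketch drops the tie point with the largest residual and refits until all residuals fall below a tolerance. The first-order polynomial, the tolerance, the function names, and the synthetic tie points are all assumptions made for illustration.

```python
def solve(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_first_order(src, dst):
    """Least-squares fit of x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y."""
    rows = [(1.0, x, y) for x, y in src]
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    return [solve(normal, [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)])
            for k in (0, 1)]


def residuals(coeffs, src, dst):
    """Distance in pixels between each transformed source point and its target."""
    out = []
    for (x, y), (u, v) in zip(src, dst):
        pu = coeffs[0][0] + coeffs[0][1] * x + coeffs[0][2] * y
        pv = coeffs[1][0] + coeffs[1][1] * x + coeffs[1][2] * y
        out.append(((pu - u) ** 2 + (pv - v) ** 2) ** 0.5)
    return out


def robust_fit(src, dst, tol=0.5):
    """Refit after removing the worst tie point until every residual is within tol."""
    keep = list(range(len(src)))
    while True:
        s, d = [src[i] for i in keep], [dst[i] for i in keep]
        coeffs = fit_first_order(s, d)
        res = residuals(coeffs, s, d)
        worst = max(range(len(keep)), key=res.__getitem__)
        if res[worst] <= tol or len(keep) <= 4:
            return coeffs, keep
        del keep[worst]


# Synthetic tie points on a 4 x 4 grid with one gross mismatch at index 5.
src = [(float(x), float(y)) for x in (0, 100, 200, 300) for y in (0, 100, 200, 300)]
dst = [(2.5 + 1.001 * x + 0.002 * y, -1.0 - 0.003 * x + 0.999 * y) for x, y in src]
dst[5] = (dst[5][0] + 30.0, dst[5][1] - 20.0)
coeffs, keep = robust_fit(src, dst)
```

Removing one point per iteration is slow for large tie-point sets but keeps the sketch short; a production workflow would use a dedicated robust matcher.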

#### 2.3.3. Noise Filtering of Interferometric Phase

This study proposes an interferogram filtering method based on bidimensional empirical mode decomposition, which has the potential to effectively address the issue of noise in SAR images. The proposed approach decomposes the interferogram into image and noise information using a bidimensional empirical mode decomposition algorithm. Filtering is then performed using the local-window signal-to-noise ratio as the filtering factor, with strong filtering applied in regions of high noise and weak filtering in regions of low noise. Specifically, the method decomposes the original interferogram into four intrinsic mode function (IMF) components and uses the signal-to-noise ratio of local windows as the filtering factor of the Goldstein filter to filter the first three IMF components, which contain most of the noise information. The method demonstrates a strong noise-filtering ability while also preserving the edge details of interference fringes. As a result, the coherence of the interferogram is improved significantly after filtering.

#### 2.3.4. High-Coherence Target Extraction

In 2004, Hooper [46] introduced the StaMPS method, which identifies highly coherent points based on the stability of their phase values and high coherence and signal-to-noise ratio. Another method used to identify high-coherence points involves selecting a radius of a circle around known high-coherence points and then applying the amplitude dispersion threshold method to find candidate high-coherence points. Next, iterative analysis is carried out on the phase stability of the candidate points, and the high-coherence points are determined. This method reduces the computational workload and improves efficiency [47].
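The amplitude dispersion threshold mentioned above can be sketched with the commonly used amplitude dispersion index, the ratio of the standard deviation to the mean of a pixel's amplitude over time. The sketch covers only this thresholding step, not the neighborhood search or the iterative phase-stability analysis. The threshold of 0.25, the function names, and the two amplitude series are assumptions for illustration.

```python
from statistics import mean, pstdev


def amplitude_dispersion(amplitudes):
    """Amplitude dispersion index D_A = sigma_A / mu_A for one pixel's time series."""
    return pstdev(amplitudes) / mean(amplitudes)


def candidate_points(stack, threshold=0.25):
    """Return the pixel keys whose amplitude dispersion is below the threshold."""
    return [pixel for pixel, series in stack.items()
            if amplitude_dispersion(series) < threshold]


# Hypothetical amplitude time series for two pixels, keyed by (row, column).
stack = {
    (10, 12): [1.00, 1.05, 0.98, 1.02, 0.99],  # stable scatterer
    (10, 13): [0.20, 1.40, 0.60, 1.90, 0.30],  # strongly fluctuating pixel
}
selected = candidate_points(stack)
```

A low index indicates a temporally stable amplitude, which is used as an inexpensive proxy for phase stability before the more costly iterative analysis.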

#### 2.3.5. Deformation Inversion

After the differential interferograms have been generated and the highly coherent points identified, a Delaunay triangulation network is built over the highly coherent points. A linear model of velocity and elevation errors is then established based on the phase difference between two adjacent highly coherent points on each interferogram. By solving the coherence coefficient equation of the model, the increments of deformation velocity and elevation error between adjacent points are determined. The absolute values are obtained by integrating these increments over the points of the network. Next, after the linear model phase has been removed, the residual phase is unwrapped and calibrated by the discrete point phase. Subsequently, the residual phase of each single SAR image is inverted using an interference combination matrix. Finally, the nonlinear deformation and atmospheric phases are separated through temporal and spatial filtering, and the deformation time series is obtained by combining the linear deformation rate and the nonlinear deformation phase.
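Solving for the velocity and elevation-error increments on one network arc can be sketched as a grid search that maximizes a temporal coherence measure. The source does not state its exact model, so the sketch assumes a common form in time series InSAR: the modeled phase difference is (4π/λ)(Δv·T + Δε·B⊥/(R sin θ)), and the coherence is the magnitude of the mean phasor of the observed-minus-modeled phase. The wavelength, slant range, incidence angle, baselines, and grids below are hypothetical values.

```python
import cmath
import math

WAVELENGTH_M = 0.0555                          # approximate C-band wavelength (assumed)
SLANT_RANGE_M = 850_000.0                      # hypothetical slant range
SIN_INCIDENCE = math.sin(math.radians(39.0))   # hypothetical incidence angle
K = 4.0 * math.pi / WAVELENGTH_M


def model_phase(dv, dh, t_years, b_perp_m):
    """Assumed linear model: velocity increment dv (m/yr), elevation-error increment dh (m)."""
    return K * (dv * t_years + dh * b_perp_m / (SLANT_RANGE_M * SIN_INCIDENCE))


def temporal_coherence(phases, dv, dh, t_years, b_perp_m):
    """Magnitude of the mean phasor of observed minus modeled phase (1 = perfect fit)."""
    total = sum(cmath.exp(1j * (p - model_phase(dv, dh, t, b)))
                for p, t, b in zip(phases, t_years, b_perp_m))
    return abs(total) / len(phases)


def estimate_increments(phases, t_years, b_perp_m, v_grid, h_grid):
    """Grid search returning (coherence, dv, dh) at the coherence maximum."""
    return max((temporal_coherence(phases, v, h, t_years, b_perp_m), v, h)
               for v in v_grid for h in h_grid)


# Synthetic, noise-free wrapped phase differences for one arc and 20 interferograms.
t_years = [24.0 * k / 365.25 for k in range(1, 21)]
b_perp_m = [float((37 * k) % 201 - 100) for k in range(1, 21)]
true_dv, true_dh = -0.030, 12.0
phases = [cmath.phase(cmath.exp(1j * model_phase(true_dv, true_dh, t, b)))
          for t, b in zip(t_years, b_perp_m)]

v_grid = [i / 500 for i in range(-30, 31)]     # -0.06 to 0.06 m/yr in 2 mm/yr steps
h_grid = [2.0 * i for i in range(-15, 16)]     # -30 to 30 m in 2 m steps
coherence, dv_hat, dh_hat = estimate_increments(phases, t_years, b_perp_m, v_grid, h_grid)
```

Because the coherence measure works on phasors, the wrapped phases can be used directly; unwrapping is only needed later for the residual phase.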

In summary, the SBMHCT-InSAR technology introduced above is an improvement on SBAS-InSAR technology. Interactive Data Language (IDL) was used to program improvements to the key steps of SBAS-InSAR, including image registration, filtering, and high-coherence point extraction, in order to improve the adaptability and accuracy of SBAS-InSAR technology in mining deformation applications.

#### *2.4. Principles of LSTM*

The LSTM network introduces a gate mechanism in the hidden layer to regulate information loss and dynamically adjusts the backpropagation process, enabling the network to learn long-distance time series data. This mechanism is crucial for the successful application of the LSTM model in large-scale surface subsidence prediction over extended time periods.

#### 2.4.1. The Framework of LSTM

Figure 3 illustrates the LSTM-based prediction framework for time series mine subsidence. In the first step, the original settlement data are pre-processed to meet the network input requirements. The hidden layer then uses the cell structure to construct the recurrent neural network, and the output layer exports the predicted values. During training, the loss between the predicted and true values is calculated, and the Adam algorithm is used to optimize the model. By dynamically adjusting the long short-term memory network, the network can fully learn the nonlinear correlation of different subsidence time series and thus capture the complex subsidence mechanism in the study area. This approach not only reduces the requirement for high-quality diachronic data but also improves the accuracy and interpretability of subsidence prediction.

**Figure 3.** The prediction framework of LSTM.

#### 2.4.2. Cell Structure of LSTM

The LSTM network comprises a set of cell units that serve as the central structure in the hidden layer. Figure 4 illustrates that the hidden layer contains three cell units. In the LSTM model, the input data at time *t* in the sample time series are represented by *xt*, while the corresponding output of the cell unit, the hidden state, is represented by *ht*. Within each cell unit, the data flow sequentially through input, information forgetting, cell state update, and hidden state output. The forward calculation can be expressed as follows:

$$i\_t = \sigma\left(W\_{xi}x\_t + W\_{hi}h\_{t-1} + W\_{ci}c\_{t-1} + b\_i\right) \tag{2}$$

$$f\_t = \sigma\left(W\_{xf}x\_t + W\_{hf}h\_{t-1} + W\_{cf}c\_{t-1} + b\_f\right) \tag{3}$$

$$c\_t = f\_t c\_{t-1} + i\_t \tanh\left(W\_{xc}x\_t + W\_{hc}h\_{t-1} + b\_c\right) \tag{4}$$

$$o\_t = \sigma\left(W\_{xo}x\_t + W\_{ho}h\_{t-1} + W\_{co}c\_t + b\_o\right) \tag{5}$$

$$h\_t = o\_t \tanh(c\_t) \tag{6}$$

where *i*, *f*, *c*, and *o* represent the input gate, forget gate, cell state, and output gate, respectively; *W* and *b* represent the corresponding weight coefficient matrix and bias, respectively; *σ* and tanh refer to the sigmoid and the hyperbolic tangent activation function, respectively.
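To make the forward calculation concrete, the following sketch evaluates Equations (2)–(6) for a single cell with scalar inputs and states. The scalar simplification, the dictionary of parameters, and the untrained value 0.1 assigned to every weight are assumptions made only for illustration; a practical network uses weight matrices learned during training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell_step(x_t, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell following Equations (2)-(6)."""
    # Input gate, Equation (2)
    i_t = sigmoid(p["W_xi"] * x_t + p["W_hi"] * h_prev + p["W_ci"] * c_prev + p["b_i"])
    # Forget gate, Equation (3)
    f_t = sigmoid(p["W_xf"] * x_t + p["W_hf"] * h_prev + p["W_cf"] * c_prev + p["b_f"])
    # Cell state update, Equation (4)
    c_t = f_t * c_prev + i_t * math.tanh(p["W_xc"] * x_t + p["W_hc"] * h_prev + p["b_c"])
    # Output gate, Equation (5); note that it uses the updated cell state c_t
    o_t = sigmoid(p["W_xo"] * x_t + p["W_ho"] * h_prev + p["W_co"] * c_t + p["b_o"])
    # Hidden state output, Equation (6)
    h_t = o_t * math.tanh(c_t)
    return h_t, c_t


# Illustrative, untrained parameters (every weight and bias set to 0.1)
PARAM_NAMES = (
    "W_xi", "W_hi", "W_ci", "b_i",
    "W_xf", "W_hf", "W_cf", "b_f",
    "W_xc", "W_hc", "b_c",
    "W_xo", "W_ho", "W_co", "b_o",
)
params = {name: 0.1 for name in PARAM_NAMES}

# Run the cell over a short, made-up settlement sequence
h, c = 0.0, 0.0
for x in [0.0, -0.5, -1.1, -1.8]:
    h, c = lstm_cell_step(x, h, c, params)
```

With all parameters set to zero, every gate equals 0.5, so the cell state is simply halved at each step. This gives a quick sanity check of an implementation.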

**Figure 4.** The cell structures of LSTM.

The LSTM network is trained with backpropagation through time (BPTT), which is similar to the traditional backpropagation algorithm [48]. The algorithm involves four steps. First, the output of the cells is calculated with the forward computation specified in Equations (2)–(6). Second, the error term for each cell is calculated in reverse, propagating backward both through time and through the network layers. Third, the gradient of each weight is determined according to the corresponding error term. Finally, the weights are updated using a gradient-based optimization algorithm.
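The final weight-update step can be illustrated with the Adam rule, which Section 2.4.1 names as the optimizer. The sketch below applies the standard Adam update to a single scalar weight and a toy quadratic loss. The learning rate, the number of steps, and the loss function are assumptions chosen for illustration and are not the settings used in this paper.

```python
import math


def adam_minimize(grad_fn, w0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a scalar function with the Adam update rule, given its gradient."""
    w, m, v = w0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g        # first-moment (mean) estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy loss (w - 3)^2 with gradient 2 (w - 3); the minimum is at w = 3
w_opt = adam_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

In the network itself, the gradient of each weight comes from the BPTT error terms rather than from a closed-form expression.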

#### *2.5. Time Series Prediction Model Combining SBMHCT-InSAR Results and LSTM*

Building on the LSTM principles described above, the time series of coal mine subsidence obtained via InSAR technology are used as training samples. These data exhibit nonlinear relationships and take the form {*Ht*} = {*H*1, *H*2, ··· , *Hn*}. From these samples, a predictive model is established, the model parameters are solved, and the corresponding predicted values are obtained. The accuracy of the predictive model is evaluated by comparing each predicted value with the corresponding truth value. The mine forecasting method is implemented in Python.

Step 1: Data preprocessing. The settlement time series are processed by extracting training samples of length L, denoted as *Hs*, from which the last Y values are designated as sample labels, and the first (L−Y) values are used as sample inputs, subject to the constraints 2 ≤ L < m and 1 ≤ Y < L. Figure 5 depicts the form of sample division. By implementing this segmentation method, all highly coherent target points are pre-processed, and *n* training samples can be extracted. The paper adopts a single-step prediction method to construct the network model, whereby the length of the output sequence Y is set to 1, and the settlement at moment L is predicted based on the settlement information at the first (L−Y) moments.
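The sample division of Step 1 can be sketched as below. The sketch assumes a sliding window that advances one epoch at a time, which is one plausible reading of the segmentation; the function name and the sample series are hypothetical.

```python
def make_samples(series, L, Y=1):
    """Split one settlement series into (inputs, labels) pairs.

    Each window has length L; its first L - Y values are the inputs and its
    last Y values are the labels. The stride of one epoch is an assumption.
    """
    m = len(series)
    if not 2 <= L < m:
        raise ValueError("L must satisfy 2 <= L < m")
    if not 1 <= Y < L:
        raise ValueError("Y must satisfy 1 <= Y < L")
    samples = []
    for start in range(m - L + 1):
        window = series[start:start + L]
        samples.append((window[:L - Y], window[L - Y:]))
    return samples


# Made-up cumulative settlement values (cm) for one highly coherent point
series = [0.0, -1.2, -2.9, -4.1, -6.0, -7.4]
samples = make_samples(series, L=4, Y=1)
```

With Y = 1, each sample predicts a single epoch from the preceding L − 1 epochs, which corresponds to the single-step setting adopted in the paper.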

Step 2: Network training. In Figure 3, the hidden layer output value *Ypre* is the final output through all LSTM hidden layer cell units. The input sample {*x*1, *x*2, ··· *xL*−*Y*} of the hidden layer is a two-dimensional array; output *Ypre* of the hidden layer and sample label *Y* are both one-dimensional arrays (*n*,1), where *n* represents the number of highly coherent points. In this paper, the statistical error index is the mean square error, and the following formula is defined as the loss function of the training process:

$$\text{loss} = \sum\_{t=1}^{n} \left( Y\_{pre} - Y \right)^{2} / n \tag{7}$$

**Figure 5.** The structures of data pretreatments.

Step 3: Parameter optimization. To construct an accurate LSTM prediction model, several parameters need to be considered, including the sample partition length (*L*), network layer number (*K*), and feature number (*S*) of each LSTM hidden layer [49,50]. This paper employs a multi-layer grid search method to explore these parameters and selects the parameter combination with the highest average prediction accuracy as the optimal choice. The accuracy is determined by minimizing the prediction error (*ε*) between the predicted sample (*Ypre*) and the actual sample (*Y*). The objective function is expressed as follows:

$$\min \varepsilon \left( \mathbf{Y}, \mathbf{Y}\_{pre} \right) = \left| \mathbf{Y}\_{pre} - \mathbf{Y} \right| \tag{8}$$

$$\text{s.t.} \begin{cases} 2 \le L < M, & STEP\_L \parallel L \\ 2 \le K < i, & STEP\_K \parallel K \\ 10 \le S \le S\_{\max}, & STEP\_S \parallel S \\ L, K, S \in N \end{cases} \tag{9}$$

where *STEP*<sub>*L*</sub>, *STEP*<sub>*K*</sub>, and *STEP*<sub>*S*</sub> are the grid search step sizes of the corresponding parameters.
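The grid search over *L*, *K*, and *S* can be sketched as an exhaustive loop over the candidate values. In the sketch, `toy_error` is a hypothetical stand-in for training an LSTM with a given parameter combination and returning its average prediction error. It is deliberately constructed to have its minimum at the combination reported later in the paper (*L* = 15, *K* = 5, *S* = 55), and the candidate ranges are likewise assumptions.

```python
from itertools import product


def grid_search(evaluate, L_values, K_values, S_values):
    """Return the (L, K, S) combination with the smallest error, and that error."""
    best_params, best_error = None, float("inf")
    for L, K, S in product(L_values, K_values, S_values):
        error = evaluate(L, K, S)
        if error < best_error:
            best_params, best_error = (L, K, S), error
    return best_params, best_error


def toy_error(L, K, S):
    # Hypothetical error surface; a real run would train and validate a model here
    return abs(L - 15) + abs(K - 5) + abs(S - 55) / 5


best, err = grid_search(
    toy_error,
    L_values=range(5, 20, 5),     # sample lengths 5, 10, 15
    K_values=range(2, 8),         # 2 to 7 layers
    S_values=range(10, 101, 15),  # 10, 25, ..., 100 hidden nodes
)
```

Because every combination requires a full training run, the step sizes control the trade-off between search cost and resolution.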

Step 4: Output and accuracy assessment. The LSTM model adapts its parameters during the training and validation process, leading to the optimal model *LSTM*<sup>∗</sup><sub>*net*</sub>. This model is then used to predict future settlement by inputting all standardized prediction samples sequentially. The output of the model is represented as *Ypre* = {*y*1, *y*2, *y*3, ··· , *yn*}, where *Ypre* denotes the set of prediction results for the different highly coherent points. Finally, the discrepancy between the output *Ypre* and the measured data *Ysurvey* is computed, thereby providing a quantitative assessment of the training and prediction accuracy of the model.

In summary, the LSTM neural network was built with Python 3.9 and the PyTorch 1.10 deep learning framework [43]. The input dataset includes the feature vectors of all highly coherent points obtained by SBMHCT-InSAR technology, containing longitude, latitude, coherence value, cumulative time, deformation rate, and cumulative subsidence value, among which the cumulative subsidence value is the label predicted by the model. The grid search algorithm was applied to select the hyperparameters of the LSTM ground deformation prediction model.

The absolute error (AE) and the relative error (RE) are defined as follows:

$$\text{AE} = \left| m\_i - m\_i' \right| \tag{10}$$

$$\text{RE} = \left| \frac{m\_i - m\_i'}{m\_i} \right| \times 100\% \tag{11}$$

where *mi* represents the truth value and *m′i* represents the predicted value obtained by the LSTM model, and the absolute value is taken to avoid negative errors. The absolute error reflects the magnitude of the error between the predicted and truth values, while the relative error indicates the proportion of the error relative to the truth value.

The relative error of the predicted results was evaluated using the mean absolute percentage error (MAPE). The generalization performance and degree of error of the prediction model were evaluated using the Willmott consistency index (WIA), with values ranging from 0 to 1. Specifically, the MAPE is the average of the absolute difference between the predicted and truth values, normalized by the truth value and expressed as a percentage. The WIA is one minus the ratio of the sum of squared prediction errors to the potential error, i.e., the sum of the squared combined absolute deviations of the predicted and truth values from the mean truth value; it measures the degree of deviation of the model from the true values. They are defined as follows:

$$\text{MAPE} = \frac{\sum\_{i=1}^{n} |m\_i - m\_i'| / m\_i}{n} \times 100\% \tag{12}$$

$$\text{WIA} = 1 - \frac{\sum\_{i=1}^{n} \left(m\_i - m\_i'\right)^2}{\sum\_{i=1}^{n} \left(\left|m\_i - \overline{m}\right| + \left|m\_i' - \overline{m}\right|\right)^2} \tag{13}$$

where *mi* represents the truth value, *m′i* represents the predicted value obtained by the LSTM model, *n* is the number of samples, and *m̄* is the average of the truth values *mi*.
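The four accuracy indicators of Equations (10)–(13) can be computed as in the following sketch. Two choices are assumptions rather than statements from the paper: the MAPE divides by the absolute truth value so that negative subsidence values give a positive percentage, and the sample values are made up.

```python
def absolute_error(m, m_pred):
    """Equation (10)."""
    return abs(m - m_pred)


def relative_error(m, m_pred):
    """Equation (11), in percent; undefined when the truth value is zero."""
    return abs((m - m_pred) / m) * 100.0


def mape(truth, pred):
    """Equation (12), in percent; abs() in the denominator is an assumption."""
    return sum(abs(t - p) / abs(t) for t, p in zip(truth, pred)) / len(truth) * 100.0


def wia(truth, pred):
    """Equation (13): index of agreement, 1 for a perfect prediction."""
    mean_truth = sum(truth) / len(truth)
    numerator = sum((t - p) ** 2 for t, p in zip(truth, pred))
    denominator = sum(
        (abs(t - mean_truth) + abs(p - mean_truth)) ** 2 for t, p in zip(truth, pred)
    )
    return 1.0 - numerator / denominator


# Made-up cumulative subsidence values (cm)
truth = [-10.0, -20.0, -40.0]
pred = [-11.0, -20.0, -38.0]

errors = [absolute_error(t, p) for t, p in zip(truth, pred)]
mape_value = mape(truth, pred)
wia_value = wia(truth, pred)
```

Both the MAPE and the RE are undefined for points whose truth value is zero, so stable points with no measured subsidence would have to be excluded or handled separately.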

#### **3. Results and Discussion**

#### *3.1. Analysis of InSAR Results*

In order to confirm the precision and dependability of our experimental results, we conducted an analysis and comparison of InSAR data and the GNSS values in the SGY mining area. As discussed in [51], the study area's InSAR monitoring and precision verification outcomes have been documented and will not be reiterated here. Figure 6a shows the cumulative deformation and coherence maps in the Ningdong coalfield; the maps cover a broader area to exhibit the mesoscale results, and the SGY mine is located in an oval area in the southwest. The coherence map shows that the coherence of the interferogram is greater than 0.41, which indicates good coherence. Figure 6b shows the cumulative deformation map of the SGY mining area from March 2015 to June 2016, where the red triangles represent the GNSS observation stations and the black rectangles represent the location of the underground coal seam. Here, a total of 4795 pixels obtained deformation values from InSAR data; the line-of-sight values were divided by the cosine of the incidence angle to obtain the vertical deformation.
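The conversion to vertical deformation described above amounts to a single division. The sketch assumes that horizontal motion is negligible, which is the usual premise of this projection; the incidence angle of 39° and the sample value are hypothetical and not taken from the paper.

```python
import math


def los_to_vertical(d_los, incidence_deg):
    """Convert a line-of-sight displacement to a vertical displacement.

    Assumes the motion is purely vertical; incidence_deg is the radar
    incidence angle in degrees, measured from the vertical.
    """
    return d_los / math.cos(math.radians(incidence_deg))


# Hypothetical example: -10 cm along the line of sight at a 39 degree incidence angle
d_vertical = los_to_vertical(-10.0, 39.0)
```

Because the cosine is smaller than 1 for any nonzero incidence angle, the vertical value is always larger in magnitude than the line-of-sight value.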

Based on this research, a significant deformation basin has emerged in the study area since March 2015, which is caused by the continuous mining activities in the study area. The maximum observed deformation in the mining area was as high as −94 cm, which is a typical example of the subsidence basin commonly observed during the excavation and mining activities in this area.

The InSAR monitoring of accumulated deformation value error within the study area satisfies standard specifications, validating the applicability of this research outcome for settlement prediction. In contrast to GNSS scattered point deformation monitoring with time series values, InSAR technology acquires continuous and planar deformation information, which can be utilized to predict surface subsidence and provide an effective representation of both the evolving surface trends and the regional distribution of subsidence.

**Figure 6.** (**a**) Cumulative deformation and coherence maps in the Ningdong coalfield. (**b**) Cumulative deformation map in the SGY from March 2015 to June 2016.

#### *3.2. LSTM Prediction Results*

In the present study, 15 sets of InSAR monitoring data obtained between 9 March 2015 and 27 March 2016 were utilized as training samples, while 3 sets of InSAR monitoring data obtained between 27 March 2016 and 7 June 2016 were employed as test samples. Following the deformation results, the settlement sequence of each observation point was pre-processed, and 12 observation points were selected for further analysis.

In this study, the settlement time sequence of each observation point was pre-processed based on the inversion results, and a total of 4795 points were selected as training samples. To determine the optimal network parameters, a grid search method was utilized to investigate the impact of the number of network layers *K* and the number of hidden layer nodes *S* on the prediction accuracy. The resulting heatmap of prediction errors, shown in Figure 7, reveals that the number of network layers and hidden layer nodes are not solely responsible for prediction accuracy. While increasing the number of network layers generally enhances the prediction accuracy, and increasing the number of hidden nodes generally does the same, these correlations are not absolute. The prediction error reaches a trough when the number of network layers *K* reaches 5, and the number of hidden nodes *S* reaches 55. Further increasing the number of network layers and hidden nodes results in decreased efficiency of the prediction model without improving the prediction accuracy. Therefore, the optimal configuration for this study's LSTM network consists of 5 layers and 55 hidden layer nodes.

**Figure 7.** Heatmap of grid search results.

The focus of this study is on the 12 observation stations located in the center of the subsidence area. The objective is to investigate the relationship between the length of training samples and the prediction error. Additionally, the performance of each observation point in both single-step prediction and multi-step prediction is evaluated.

Figure 8 presents a line chart depicting the relationship between the length of training samples and the average prediction error. The predicted step is set as a single-step prediction. The 12 observation stations located in the center of the subsidence area are analyzed in this study to understand the error performance of each observation point. The results indicate that when the training sample length is 5, the prediction errors for all points are considerably higher compared to the prediction results of others. However, when the training sample length is set to 10, some observation points, such as G1, G2, and G9, exhibit better prediction results, while others show larger or similar errors with the prediction results of the blue line. The reason for this could be the fluctuation in subsidence values at these points. On the other hand, when the training sample length is set as 15, the prediction errors for all observation points are significantly lower. The line chart highlights that increasing the number of training samples results in lower prediction errors. Moreover, the local minimum prediction error of grid search gradually decreases with the increase in training sample length. Overall, the LSTM model performs better in predicting settlement values that exhibit stable and orderly changes over a longer period.

Figure 9 presents a line chart depicting the relationship between the predicted step size and the average prediction error. The prediction error increases with the predicted step size, because the longer the prediction sequence, the greater the cumulative error in the predicted values. The error of the light blue curve, which represents a predicted step size of 3, is the largest among the three curves, indicating that the error accumulates as the predicted step size increases. The red curve and the dark blue curve have similar error values, indicating that the error accumulation is relatively small for multi-step prediction with a predicted step size of 2, and the prediction accuracy is still relatively high. However, it is worth noting that for some observation points with small cumulative deformation, the single-step prediction error is slightly smaller than that of multi-step prediction, which may be related to the characteristics of the subsidence process at these points. Overall, single-step prediction is a more accurate and reliable prediction method for SBMHCT-InSAR deformation monitoring values based on the LSTM model.

**Figure 8.** Relationship between training sample length and errors.

**Figure 9.** Relationship between prediction steps and errors.

In multi-step prediction, the model needs to predict several time steps ahead, which increases the complexity of the problem. As a result, the prediction error may accumulate over time, leading to a less accurate prediction. In single-step prediction, by contrast, the model only needs to predict the next time step, which is a simpler problem, and therefore its average error is lower than that of multi-step prediction.
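Error accumulation in multi-step prediction can be illustrated with a recursive forecast, in which each predicted value is fed back as an input for the next step. This is one common way to obtain multi-step forecasts and is not necessarily the scheme used in the paper. The one-step predictor below is a toy linear extrapolation with a fixed bias of 0.1, used only as a stand-in for a trained LSTM.

```python
def recursive_forecast(one_step, history, steps, window):
    """Forecast several steps ahead by feeding each prediction back as input."""
    values = list(history)
    forecasts = []
    for _ in range(steps):
        next_value = one_step(values[-window:])
        forecasts.append(next_value)
        values.append(next_value)  # the prediction becomes part of the next input
    return forecasts


def biased_linear_step(recent):
    # Toy predictor: extrapolate the last increment and add a fixed 0.1 bias
    return recent[-1] + (recent[-1] - recent[-2]) + 0.1


# Made-up series that subsides by exactly 1 cm per epoch
history = [0.0, -1.0, -2.0]
future_truth = [-3.0, -4.0, -5.0]

forecasts = recursive_forecast(biased_linear_step, history, steps=3, window=2)
step_errors = [abs(f - t) for f, t in zip(forecasts, future_truth)]
```

In this toy case the error grows with every additional step because each biased prediction also distorts the increment used for the next one, which mirrors the trend of larger errors at larger step sizes.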

The present study adopts a strategic approach for selecting the sample segmentation length by leveraging the outcomes of the experiments conducted. Specifically, given the small ordinal number and equal time interval of the data, and taking into consideration the LSTM model's ability to learn long-distance time-series data, longer sample lengths were preferred to achieve better prediction results. In this regard, 15 sets of InSAR monitoring data obtained from 9 March 2015 to 27 March 2016 were selected as the training samples, whereas 3 sets of InSAR monitoring data from 27 March 2016 to 7 June 2016 were chosen as the test samples. The table below presents the final combination of LSTM prediction model parameters.

Table 2 shows the parameters of LSTM. Under this parameter configuration, the experiment predicting surface subsidence in the SGY coal mine yielded favorable results. Specifically, the mean absolute error of the cumulative deformation was constrained to within 3 cm, thereby satisfying the accuracy standards prescribed within current InSAR data processing protocols.

**Table 2.** Optimal search results of LSTM parameters.


Figure 10 presents the settlement prediction results for the SGY mining area at the overall experimental scale. To obtain these results, the last 15 groups of deformation data, from 20 May 2015 to 14 May 2016, were used as training samples to predict the deformation on 7 June 2016. The predicted results were compared with the deformation measured by InSAR, and the analysis showed a high degree of consistency between the predicted and measured deformation values, with the predicted settlement center clearly and accurately located. Moreover, the prediction errors of each monitoring point in the subsidence area were within 3 cm. Figure 11 presents the prediction errors of LSTM. Across the 4795 observation points, the maximum prediction error of the cumulative deformation was 2.6 cm, and the average prediction accuracy reached 93.6%. The calculation of the relevant indicators is given in Equations (10)–(13). Therefore, the model proposed in this paper deviates only slightly from the actual settlement, effectively reflects the basic law of land surface settlement changing with time, and yields reliable predictions.

**Figure 10.** Prediction results of LSTM in SGY mine area.

**Figure 11.** Prediction errors of LSTM in SGY mine area.

#### **4. Discussion**

To assess the validity of the prediction model introduced in this study, GNSS observation stations situated on the surface of the mining area were selected for analysis. Real-time kinematic (RTK) technology utilizing the double-difference mode was employed across all stations, with a precision of up to 1 mm for horizontal displacement monitoring. As previously reported, the measurement error of elevation direction is approximately twice that of the horizontal displacement error [52]. The monitoring values derived from the continuous time series deformation of the GNSS observation stations were extracted to verify the dependability of the prediction results obtained from the fusion of InSAR technology with LSTM. The table below exhibits the deformation values over time for each GNSS observation station based on their respective coordinates.

To demonstrate the superiority of the proposed prediction model, the LSTM model is compared with a traditional machine-learning model, using model establishment time and prediction error as evaluation metrics. Specifically, the SVR model is chosen as the representative traditional machine-learning model; it uses a nonlinear kernel function to map multidimensional inputs to a higher-dimensional feature space for regression analysis. In this study, the penalty parameter C and the kernel function parameter g are optimized through cross-validation and grid search to identify the optimal parameter settings. Further details on the parameter searching process and experimental methodology can be found in [53].

Table 3 lists, for the surface GNSS observation points of the SGY coal mine, the cumulative deformation monitored by InSAR over 432 consecutive days (second column) and the cumulative deformation monitored by GNSS over 456 consecutive days (third column). The fourth and fifth columns give the cumulative deformation predicted for 456 consecutive days by the LSTM and SVR algorithms, respectively. Figure 12 plots the data in Table 3 and shows that the predicted values of both the LSTM and SVR algorithms are close to the GNSS monitoring results. Furthermore, the LSTM predictions coincide with the GNSS monitoring values at multiple monitoring points, suggesting a high level of prediction accuracy. Thus, qualitatively, the results indicate the efficacy of the LSTM method in predicting cumulative deformation.


**Table 3.** SGY coal mine observation points' deformation values.

**Figure 12.** Monitoring and forecast results in SGY mine area.

The bar charts displayed in Figures 13 and 14 show the error distribution of the prediction results for the 12 GNSS monitoring points, giving both the absolute and relative prediction errors of the LSTM and SVR prediction methods. These error metrics serve as quantitative indicators of the prediction accuracy of the two methods. The absolute and relative errors of the LSTM prediction at the 12 monitoring stations are smaller than those of the SVR prediction results. Specifically, the LSTM prediction method reports a zero error at G1, G3, G7, G8, G10, G11, and G12, whereas the SVR prediction method only has a zero error at the G7 observation station. The highest prediction error for both methods was observed at G2, G4, and G4 observation stations, with the LSTM method reporting an absolute error of 2 cm and the SVR method reporting an absolute error of 6 cm. This may be attributed to their central location within the subsidence basin, where mining activities were intensive over the 456-day interval. The deformation of these measuring points is influenced by multiple factors, such as geological structure, mining speed, and coal pillars, resulting in a more complex settlement pattern. The LSTM algorithm, with its superior learning ability, demonstrated a higher prediction accuracy than the SVR algorithm. Notably, the relative error between the predicted value and the truth value of the SVR method at the G12 station is as high as 14%, with an absolute error of 2 cm. This is because G12 is situated at the edge of the subsidence basin, where the subsidence value is relatively small. However, the prediction accuracy of observation stations such as G1, G11, and G12 located in other subsidence basins is relatively high. These findings suggest that the LSTM method is better equipped to learn the intricate details of the settlement pattern.

**Figure 13.** Comparison of absolute error between LSTM and SVR.

**Figure 14.** Comparison of relative error between LSTM and SVR.

Table 4 compares the prediction accuracy of the LSTM and SVR methods for the SGY coal mine. The table contains important error metrics such as the maximum absolute error, maximum relative error, average absolute error, and Willmott consistency index, which provide a quantitative assessment of the accuracy of the prediction results.

**Table 4.** Comparison of prediction accuracy between LSTM and SVR.


The results in Table 4 demonstrate that the LSTM model outperforms the SVR time series prediction method in terms of prediction accuracy. Specifically, the maximum prediction error of cumulative subsidence using the LSTM model is less than 2 cm, while the maximum relative error and average relative error are lower than those of the SVR method. These findings indicate that the deep-learning-based prediction model proposed in this study is highly accurate and robust. The WIA and MAPE values for the LSTM model are 0.999 and 1.1%, respectively; a Willmott consistency index this close to 1, as defined by Equation (13), demonstrates the effectiveness of the prediction model. The mining settlement prediction model based on InSAR monitoring data and LSTM is robust and outperforms the representative machine-learning model, SVR, in various evaluation indicators, including the Willmott consistency index. Based on the above analysis, it is evident that the LSTM-based prediction method used for large-scale surface subsidence is highly accurate, efficient, and a better option to ensure production safety.

Compared with the findings in references [44,45], the results show that the SBMHCT-InSAR technology, improved from SBAS-InSAR, is more suitable for deformation monitoring in mining areas. It can obtain accurate deformation values of the mining area surface, which benefits the training of the LSTM prediction model. The prediction results show that predicting mining area deformation by combining SBMHCT-InSAR technology and the LSTM model is highly reliable and robust.

#### **5. Conclusions**

The current study focuses on the surface settlement monitoring of the SGY coal mine, utilizing time series InSAR technology with an optimization algorithm. Through this technique, the study acquired time series settlement values with equal time intervals. Subsequently, a deep-learning mining settlement prediction model was built, which utilized InSAR monitoring data to predict the settlement values of the next time step. The results of this study suggest that the proposed deep-learning-based method exhibits higher accuracy, lower time cost, and better performance in various evaluation indicators than the representative machine-learning-based SVR method. Based on these findings, it can be concluded that the proposed approach is a superior method for fulfilling production safety requirements. The conclusions are as follows:


thereby providing theoretical support for the expansion and application of InSAR monitoring technology in mining areas.

**Author Contributions:** Conceptualization, F.M. and L.S.; methodology, F.M.; software, W.L.; validation, F.M., L.S. and W.L.; formal analysis, F.M.; investigation, F.M.; resources, W.L.; data curation, L.S.; writing—original draft preparation, F.M.; writing—review and editing, W.L.; visualization, F.M.; supervision, L.S.; project administration, L.S.; funding acquisition, F.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Fundamental Research Program of Shanxi Province (grant number 202103021223381).

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality.

**Acknowledgments:** The authors acknowledge the provision of the Sentinel-1 data by ESA and DEM data by NASA.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Integration of DInSAR-PS-Stacking and SBAS-PS-InSAR Methods to Monitor Mining-Related Surface Subsidence**

**Yuejuan Chen 1,2, Xu Dong 1,2, Yaolong Qi 1,2,3, Pingping Huang 1,2,\*, Wenqing Sun 4, Wei Xu 1,2, Weixian Tan 1,2, Xiujuan Li 1,2 and Xiaolong Liu 1,2**


**Abstract:** Over-exploitation of coal mines leads to surface subsidence, surface cracks, collapses, landslides, and other geological disasters. Taking a mining area in Nalintaohai Town, Ejin Horo Banner, Ordos City, Inner Mongolia Autonomous Region, as an example, Sentinel-1A data from January 2018 to October 2019 were used as the data source in this study. Based on the high interference coherence of the permanent scatterer (PS) over a long period of time, the problem of the manual selection of ground control points (GCPs) affecting the monitoring results during refinement and re-flattening is solved. A DInSAR-PS-Stacking method combining the PS three-threshold method (the coherence coefficient threshold, amplitude dispersion index threshold, and deformation velocity interval) is proposed as a means to select ground control points for refinement and re-flattening, as well as a means to obtain time-series deformation by weighted stacking processing. A SBAS-PS-InSAR method combining the PS three-threshold method to select PS points as GCPs for refinement and re-flattening is also proposed. The surface deformation results monitored by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods are analyzed and verified. The results show that the subsidence location, range, distribution, and space–time subsidence law of surface deformation results obtained by DInSAR-PS-Stacking, SBAS-PS-InSAR, and GPS methods are basically the same. The deformation results obtained by these two InSAR methods have a good correlation with the GPS monitoring results, and the MAE and RMSE are within the acceptable range. The error showed that the edge of the subsidence basin was small and that the center was large. Both methods were found to be able to effectively monitor the coal mine, but there were also shortcomings. DInSAR-PS-Stacking has a strong ability to monitor the settlement center. 
SBAS-PS-InSAR performed well in monitoring slow and small deformations, but its monitoring of the settlement center was insufficient. Considering the advantages of these two InSAR methods, we proposed fusing the time-series deformation results obtained using these two InSAR methods to allow for more reliable deformation results and to carry out settlement analysis. The results showed that the automatic two-threshold (deformation threshold and average coherence threshold) fusion was effective for monitoring and analysis, and the deformation monitoring results are in good agreement with the actual situation. The deformation information obtained by the comparison and fusion of multiple methods can allow for better monitoring and analysis of the mining area surface deformation, and can also provide a scientific reference for mining subsidence control and early disaster warning.

**Keywords:** InSAR; mining area; surface subsidence monitoring; DInSAR-PS-Stacking; SBAS-PS-InSAR; ground control point; fusion

**Citation:** Chen, Y.; Dong, X.; Qi, Y.; Huang, P.; Sun, W.; Xu, W.; Tan, W.; Li, X.; Liu, X. Integration of DInSAR-PS-Stacking and SBAS-PS-InSAR Methods to Monitor Mining-Related Surface Subsidence. *Remote Sens.* **2023**, *15*, 2691. https://doi.org/10.3390/rs15102691

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 20 April 2023 Revised: 19 May 2023 Accepted: 19 May 2023 Published: 22 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Coal mine resources play an important role in China's energy resources. The large-scale exploitation of coal mine resources has promoted the development of China's economy, but it has also caused some ecological environment and surface subsidence problems [1–4]. Land subsidence caused by coal mining is a destructive disaster that often occurs in mining areas and is one of the most severe geological disasters in China [1,5]. Large-scale underground coal mining can cause the formation of underground cavities, which can lead to the loss of support for rocks and soils, resulting in ground subsidence and sinking. During the subsidence process, substances such as groundwater, sediment, and coal seams are squeezed and displaced, causing the appearance of depressions and funnel-shaped pits on the surface, known as ground subsidence funnels [6]. Coal mining can also cause changes in geological stress, resulting in the formation of cracks and faults in rock layers, which may trigger ground fissures. These cracks may expand and lead to geological disasters such as ground collapse [7,8]. The large-scale mining of underground coal mines can lead to the occurrence of surface subsidence funnels, ground cracks, and collapses in mining areas, which cause certain safety hazards and affect the local ecological environment and the safety of the surrounding residents. Identifying the causes and risks of surface subsidence by means of monitoring and analyzing the surface deformation of mining areas is of great significance for protecting residents and the safety of their properties, as well as mining subsidence disaster warnings, control, and management [9,10].

Interferometric synthetic aperture radar technology (InSAR) is a new all-weather, all-time Earth observation method [11–16]. With the rapid development of differential interferometric synthetic aperture radar (DInSAR) technology, radar line-of-sight deformation can be obtained up to the millimeter level [17,18]. DInSAR technology has been used to monitor mining subsidence and related research [19]. Berardino et al. proposed small baseline subset InSAR (SBAS) technology, which uses small baseline combinations for measurement, and uses the singular value decomposition (SVD) method to calculate multiple small baseline combinations to effectively obtain deformation information on time series [20–23]. Stacking-InSAR technology is one of the relatively simple time-series InSAR technologies. Specifically, it refers to the linear superposition and weighted average of multiple unwrapped differential interferometry pairs during the study period, which obtains more accurate deformation information [24–28]. Ferretti et al. proposed persistent scatterer InSAR (PS-InSAR) technology. In the long time series of SAR images, the points with high coherence and stability are selected as PS point targets, and then the phase characteristics of these target points are analyzed, and the corresponding atmospheric phases are separated to obtain relatively accurate surface subsidence information [29–32]. Guo Shanchuan et al. used DInSAR technology to effectively monitor and verify the mining area of the Loess Plateau with complex and dangerous terrain [33]. Xia Yuanping et al. combined DInSAR and GIS technology to identify illegal underground mining in the Shanxi Yangquan mining area and provided technical support for monitoring underground mining [34]. Li Da et al. used SBAS-InSAR technology to monitor and analyze the time-series deformation of the Yulin mining area [35]. Ma Fei et al. 
introduced SBAS-InSAR technology to the Ningdong mining area for monitoring, and compared the subsidence value of ground monitoring points in the Shigouyi coal mine with GPS monitoring values, revealing the effectiveness of SBAS-InSAR technology for mining area subsidence monitoring [36].

The InSAR method has achieved remarkable results in monitoring mining deformation, and different methods show their own advantages. Researchers in China and abroad have carried out relevant research on comparisons between and combinations of various InSAR methods for mining area deformation monitoring and analysis [28,37–39]. For example, Wei Jicheng et al. [40] combined DInSAR and PS-InSAR technology to effectively monitor mining subsidence in the Ordos area in the north of the Shendong mining area, revealing the space–time evolution process of surface deformation in this area. Depin Ou et al. [41] combined DInSAR and pixel offset tracking methods to monitor coal mine deformation. The selection of ground control points (GCPs) used for refinement and re-flattening in InSAR processing is crucial for deformation inversion. Some scholars have used the PS points obtained by PS-InSAR as ground control points in the SBAS-InSAR method to monitor surface deformation [42,43] and compared this with the traditional time-series InSAR method to verify its feasibility and improve the deformation accuracy. As mentioned above, combining the advantages of various InSAR methods can allow for more effective monitoring of the deformation of mining areas.

The limitations of individual InSAR technologies should be considered. For example, SBAS-InSAR performs poorly at the center of large subsidence basins and in areas with poor coherence, but it is better suited to monitoring small deformations. In this study, a mining area in Nalintaohai Town, Ejin Horo Banner, Ordos City, Inner Mongolia Autonomous Region, was selected as the study area, and Sentinel-1A data were used as the data source. The improved DInSAR-PS-Stacking and SBAS-PS-InSAR methods are proposed as a means to improve the inversion accuracy by combining the PS three-threshold method (coherence coefficient threshold, amplitude dispersion index threshold, and deformation velocity interval) to select ground control points. The time-series deformation results obtained from the DInSAR-PS-Stacking and SBAS-PS-InSAR methods were compared, verified, and analyzed. The ground displacement time series were fused using the OTSU (Otsu 2007, named after the author) [44] automatic extraction method with two thresholds (the deformation threshold and the average coherence threshold). The deformation and subsidence law of the mining area were then analyzed on the basis of the fused time-series deformation results. The fusion of multiple InSAR methods can allow us to overcome the shortcomings of individual deformation monitoring methods and obtain more complete and accurate deformation results. This study provides a scientific basis and technical support for mining subsidence prevention and sustainable development.

#### **2. Materials and Methods**

#### *2.1. Study Area*

The object of this study is a coal mine in Nalintaohai Town, Ejin Horo Banner, Ordos, Inner Mongolia Autonomous Region. The area is located at the north end of the Loess Plateau, where long-term rain erosion has severely cut the landform, forming a standard beam and hilly platform; the surface vegetation mostly consists of semi-barren areas. The coal mine lies within a local village. The terrain is high in the north and low in the south, where it is relatively flat. The coal mine is close to a traffic line, so access is relatively convenient. The location has a temperate continental monsoon climate, with significant temperature changes across the four seasons and a large temperature difference. The climate is dry: the annual average precipitation is low and is mostly concentrated between July and August. Its geographical location and scope are shown in Figure 1.

The geological structure of the working face is simple: the elevation of the coal seam floor is 1060~1075 m, the ground elevation is 1193.8~1260 m, the thickness of the overlying bedrock of the 4-2 coal seam is 120~140 m, and the thickness of the loose layer is 20~45 m. The mining period was from 1 January 2018 to 30 September 2019.

#### *2.2. Data*

The experimental data involved in this study are as follows: (1) Sentinel-1A C-band synthetic aperture radar data [45], with a satellite revisit period of 12 days. The data used in this study were 52 scenes of Sentinel-1A single look complex (SLC) images acquired from 6 January 2018 to 4 October 2019, on which the analysis of the target area was based. The data were downloaded from the ASF Data Search (URL: https://search.asf.alaska.edu/, accessed on 8 February 2023). (2) SRTM DEM data, released by NASA, with a ground resolution of 30 m [46]. POD precise orbit ephemerides were provided by ESA. The main parameters are shown in Tables 1 and 2.

**Figure 1.** Geographical location and scope of the mining area: (**a**) location of the mining area, (**b**) regional location of the mining area, and (**c**) location of the working face and monitoring points.

**Table 1.** The parameters of the Sentinel-1A images.



**Table 2.** Main parameters of the Sentinel-1A data used in this study.

#### *2.3. Methods*

The deformation monitoring and analysis of the mining areas based on InSAR proposed in this study are divided into seven main steps:


#### 2.3.1. Ground Control Point Screening

The ground control points (GCPs) were used for refinement and re-flattening to estimate and remove the residual constant phase and residual phase ramp after unwrapping, thereby improving the accuracy of deformation monitoring. The GCPs should be located on flat terrain, with no phase jump, and far away from the deformation zone. Manual selection of the GCPs causes large errors. Therefore, we propose a method based on permanent scatterers (PS) to determine the GCPs. Firstly, the PS point targets with high coherence coefficients were identified using the coherence coefficient threshold method. Then, the amplitude dispersion index threshold was set to further screen the PS points with strong, stable scattering as the target. Finally, a deformation velocity interval was set to select the final PS points [47].

**Figure 2.** Technical flow chart.

#### (1) Coherence coefficient method

The coherence coefficient is an important index used to measure the interference quality of interference image pairs. It is mainly used to describe the similarity between the master and slave images in the same area [48]. The expression of the coherence coefficient is:

$$\gamma = \frac{\left| \sum\_{i=1}^{m} \sum\_{j=1}^{n} \mathcal{M}(i,j) \cdot \mathcal{S}^\*(i,j) \right|}{\sqrt{\sum\_{i=1}^{m} \sum\_{j=1}^{n} |\mathcal{M}(i,j)|^2 \cdot \sum\_{i=1}^{m} \sum\_{j=1}^{n} |\mathcal{S}^\*(i,j)|^2}} \tag{1}$$

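where *M* is the master image, *S* is the slave image, ∗ denotes the complex conjugate, and *m* × *n* is the size of the estimation window. After calculating the coherence of each pixel in every interferogram, the average value *γ̄* of each pixel over the *N* coherence maps in the time series is obtained: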

$$\overline{\gamma} = \frac{1}{N} \sum\_{i=1}^{N} \gamma\_i \tag{2}$$

In general, the larger the *γ̄* value, the more stable the pixel, the lower the noise, and the better the interferometric phase quality. A coherence threshold is set, and a pixel whose *γ̄* value is greater than the threshold is taken to be an effective PS point [49].
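As an illustration of Equations (1) and (2), the following minimal sketch estimates the coherence of one estimation window and then averages per-interferogram values over time. The function names and the synthetic 5 × 5 windows are hypothetical and are used only for illustration; the 0.95 threshold is the value reported later in this section.

```python
import numpy as np

def coherence(master, slave):
    """Coherence of Eq. (1) for one estimation window of complex pixels."""
    num = np.abs(np.sum(master * np.conj(slave)))
    den = np.sqrt(np.sum(np.abs(master) ** 2) * np.sum(np.abs(slave) ** 2))
    return float(num / den) if den > 0 else 0.0

def mean_coherence(gammas):
    """Temporal mean of Eq. (2) over per-interferogram coherence values."""
    return float(np.mean(gammas))

# Synthetic data (assumption): random complex windows, not real SAR data.
rng = np.random.default_rng(0)
window = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))

# A window compared with a phase-shifted copy of itself is fully coherent.
gamma_same = coherence(window, window * np.exp(1j * 0.7))

# A window compared with independent noise has low coherence.
noise = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
gamma_noise = coherence(window, noise)

# Threshold test on the temporal mean (0.95 as used in this study).
is_ps_candidate = mean_coherence([gamma_same, 0.97, 0.96]) > 0.95
```

In practice the window would slide over the co-registered image pair, which production InSAR software handles internally.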

#### (2) Amplitude dispersion index method

Ferretti et al. proposed that the stability of the interference phase can be measured by the time series of the amplitude information in the pixel [50]. Let *R* and *I* represent the real and imaginary parts of the image, respectively. If there is Gaussian noise with a standard deviation of *σ<sub>n</sub>*, the amplitude value *A* obeys the Rice distribution [51]:

$$f\_A(a) = \frac{a}{\sigma\_n^2} I\_0\left(\frac{ag}{\sigma\_n^2}\right) e^{-\frac{a^2 + g^2}{2\sigma\_n^2}}, a > 0 \tag{3}$$

In the above formula, *g* is the target reflection (*g* > 0), and *I*<sub>0</sub> is the modified Bessel function of the first kind of order zero. When the signal-to-noise ratio *g*/*σ<sub>n</sub>* is small, the Rice distribution tends towards the Rayleigh distribution. For high-SNR targets (*g*/*σ<sub>n</sub>* > 4), the distribution tends towards a Gaussian distribution [42]. Therefore, when *σ<sub>n</sub>* ≪ *g*, the phase dispersion index can be estimated by the amplitude dispersion index:

$$
\sigma\_{v} \cong \frac{\sigma\_{nI}}{g} \cong \frac{\sigma\_A}{\mu\_A} \triangleq D\_A \tag{4}
$$

In the above formula, *σ<sub>v</sub>* is the phase dispersion index; *σ<sub>nI</sub>* is the standard deviation of the imaginary part; *μ<sub>A</sub>* and *σ<sub>A</sub>* are the mean and standard deviation of the time series amplitude, respectively; and *D<sub>A</sub>* is the amplitude dispersion index.
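The amplitude dispersion index of Equation (4) can be sketched as follows. This is a minimal illustration: the function name and the two synthetic amplitude series are assumptions, and a real stack would first need radiometric calibration.

```python
import numpy as np

def amplitude_dispersion(amplitudes):
    """D_A = sigma_A / mu_A computed along the time axis (axis 0)."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    mu = amplitudes.mean(axis=0)
    sigma = amplitudes.std(axis=0)
    # Pixels with zero mean amplitude get an infinite index (never selected).
    return np.where(mu > 0, sigma / np.where(mu > 0, mu, 1.0), np.inf)

# Synthetic amplitude time series (assumption): 4 acquisitions, 2 pixels.
# Column 0 is a stable scatterer, column 1 fluctuates strongly.
stack = np.array([
    [10.0, 1.0],
    [10.5, 8.0],
    [9.5, 3.0],
    [10.0, 12.0],
])
d_a = amplitude_dispersion(stack)
```

With these numbers the stable pixel has an index of about 0.04 and the fluctuating pixel about 0.72, so only the first would pass a threshold of 0.40.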

The three-threshold method selects stable PS points by comprehensively considering the coherence coefficient, amplitude dispersion index, and deformation velocity via PS-InSAR processing. Firstly, the pixels with high coherence were selected as PS points by the coherence coefficient method. The coherence threshold was set to 0.95, and the pixels with a coherence higher than the threshold were selected as PS points. Secondly, the PS points with a stable phase were further selected by the amplitude dispersion method, and the amplitude dispersion index threshold was set to 0.40. The PS points were retained if their amplitude dispersion index was lower than the threshold. Finally, the PS points were screened by the deformation velocity interval, which was set to [−1 mm/a, 1 mm/a]. Based on the three-threshold method described above, a total of 21 eligible PS points were selected as ground control points (GCPs) for refinement and re-flattening, so as to obtain more accurate surface deformation monitoring results.
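The three screening steps can be sketched as a single boolean mask, which is equivalent to applying them one after another. The function name, argument names, and the four example pixels are hypothetical; only the threshold values (0.95, 0.40, and [−1, 1] mm/a) come from the text.

```python
import numpy as np

def select_gcp_candidates(mean_coh, d_a, velocity_mm_per_year,
                          coh_min=0.95, da_max=0.40, vel_range=(-1.0, 1.0)):
    """Return a mask of pixels passing all three thresholds."""
    mean_coh = np.asarray(mean_coh, dtype=float)
    d_a = np.asarray(d_a, dtype=float)
    vel = np.asarray(velocity_mm_per_year, dtype=float)
    lo, hi = vel_range
    return (mean_coh > coh_min) & (d_a < da_max) & (vel >= lo) & (vel <= hi)

# Four illustrative pixels (assumed values, not from the study area).
mask = select_gcp_candidates(
    mean_coh=[0.97, 0.96, 0.90, 0.98],
    d_a=[0.20, 0.50, 0.10, 0.30],
    velocity_mm_per_year=[0.3, 0.1, 0.0, -2.5],
)
# Only the first pixel passes: the second fails the dispersion test,
# the third fails the coherence test, and the fourth is moving too fast.
```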

#### 2.3.2. DInSAR-PS-Stacking Processing

We selected multiple SAR images of the same area acquired at different times from 6 January 2018 to 4 October 2019. Using the DInSAR-PS-Stacking method, we first preprocessed them by registration and multi-look processing (range/azimuth = 5:1). Then, pairwise interferometric processing was performed to generate the time-series interferograms, which were processed using the external DEM data to obtain the time-series differential interferograms. These were then filtered (Goldstein filtering) and unwrapped (minimum cost flow (MCF) method), along with other data processing steps. The GCPs obtained using the PS-InSAR three-threshold method were used for refinement and re-flattening, and then weighted stacking, phase-to-displacement conversion, and geocoding were performed [52,53] to obtain the accumulated time-series surface deformation information.

The interference phase *ϕ* is expressed as:

$$
\varphi = \varphi\_{flat} + \varphi\_{topo} + \varphi\_{def} + \varphi\_{atm} + \varphi\_{noise} + \varphi\_{orbit} \tag{5}
$$

In the above formula: *ϕflat* is the flat phase, which can be removed by accurate calculation of the baseline length; *ϕtopo* is the topographic phase, which can be removed by the DEM data simulation; *ϕatm* and *ϕnoise* are the atmospheric delay and noise phase, respectively, which can be reduced by filtering; *ϕorbit* is the phase caused by the orbit error, which can be eliminated using precise orbit data of the image pair; and *ϕdef* is the deformation phase.

The deformation calculated from the deformation phase is expressed as:

$$
\Delta r = -\frac{\lambda}{4\pi} \varphi\_{def} \tag{6}
$$

In the above formula, Δ*r* is the deformation variable of the ground target in the direction of the radar line-of-sight (LOS); *λ* is the working wavelength of the radar sensor, 5.6 cm; and *ϕdef* is the phase of the LOS surface deformation.
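As a worked example of Equation (6), the short sketch below converts an unwrapped deformation phase into LOS deformation using the 5.6 cm wavelength stated above. The function name is hypothetical, and the sign follows Equation (6) as written.

```python
import math

WAVELENGTH_M = 0.056  # C-band working wavelength given in the text (5.6 cm)

def phase_to_los_displacement(phase_rad, wavelength_m=WAVELENGTH_M):
    """Eq. (6): LOS deformation in metres from deformation phase in radians."""
    return -wavelength_m / (4.0 * math.pi) * phase_rad

# One full fringe (2*pi) corresponds to half a wavelength of LOS change.
one_fringe_mm = phase_to_los_displacement(2.0 * math.pi) * 1000.0
```

One fringe therefore represents a LOS change of 28 mm in magnitude, which is why large, rapid mining subsidence quickly produces dense fringes.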

In this study, the PS points were selected using the PS-InSAR three-threshold method for conversion into GCPs for refinement and re-flattening so as to remove orbit errors and residual phases and to improve inversion accuracy. The unwrapping phases after refinement and re-flattening were stacked to reduce errors. The cumulative time-series subsidence maps of the study area were obtained by phase-to-displacement conversion, geocoding, and clipping.
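The text does not give the stacking weights. The sketch below uses one common formulation, in which each unwrapped interferogram is weighted by its time span; this is an assumption for illustration and not necessarily the weighting used by the authors. The function name and the synthetic phases are likewise hypothetical.

```python
import numpy as np

def stack_phase_rate(unwrapped, spans_days):
    """Time-span-weighted stacking: rate = sum(dt_i * phi_i) / sum(dt_i ** 2).

    unwrapped has shape (n_interferograms, ...) in radians,
    spans_days has shape (n_interferograms,). Returns radians per day.
    """
    unwrapped = np.asarray(unwrapped, dtype=float)
    spans = np.asarray(spans_days, dtype=float)
    # Reshape the spans so they broadcast over the pixel dimensions.
    weights = spans.reshape((-1,) + (1,) * (unwrapped.ndim - 1))
    return (weights * unwrapped).sum(axis=0) / (spans ** 2).sum()

# Synthetic check (assumption): two pixels deforming at constant phase rates.
true_rates = np.array([0.10, -0.05])        # rad/day
spans = np.array([12.0, 24.0, 36.0])        # days
phases = spans[:, None] * true_rates[None]  # noise-free unwrapped phases
estimated = stack_phase_rate(phases, spans)
```

For noise-free, linearly evolving phases the estimate reproduces the true rate; with real data the averaging suppresses temporally uncorrelated noise such as atmospheric delay.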

#### 2.3.3. SBAS-PS-InSAR Processing

SBAS-InSAR technology is one of the branches of time-series InSAR technology [54]. The SBAS-PS-InSAR method is an improvement of the SBAS-InSAR method. According to the spatio-temporal baseline threshold, all the SAR images from the same area are divided into several small baseline sets. The least squares method solves the deformation phase of each set. All the small baseline sets are connected, and the least squares solution, in the sense of the minimum norm of the deformation phase, is obtained by singular value decomposition (SVD) [55].

Basic principle: *N* + 1 SAR images are selected according to the time sequence, and one scene is selected as the super master image, which is registered and resampled with other *N* SAR images [56]. According to the space–time baseline threshold, *M* differential interference pairs are obtained, where *M* is:

$$\frac{N+1}{2} \le M \le \left(\frac{N+1}{2}\right)N\tag{7}$$

Assuming that the *j*-th (*j* = 1, 2, …, *M*) interferogram is formed from the images acquired at times *t<sub>A</sub>* and *t<sub>B</sub>* (*t<sub>B</sub>* > *t<sub>A</sub>*), the phase value of the pixel (*x*, *y*) can be expressed as Equation (8).

$$\begin{array}{rl}\delta\varphi\_j(x,y) &= \varphi(t\_B,x,y) - \varphi(t\_A,x,y)\\ &\approx \frac{4\pi}{\lambda}\left[d(t\_B,x,y) - d(t\_A,x,y)\right] + \varphi\_{topo}(x,y) + \varphi\_{orb} + \varphi\_{res}(x,y)\end{array} \tag{8}$$

In the above formula, *d*(*t<sub>A</sub>*, *x*, *y*) and *d*(*t<sub>B</sub>*, *x*, *y*) are the LOS deformations of the pixel at times *t<sub>A</sub>* and *t<sub>B</sub>* relative to the initial time; *ϕ<sub>topo</sub>*(*x*, *y*) is the topographic phase error caused by elevation data; *ϕ<sub>orb</sub>* is the orbit error phase; and *ϕ<sub>res</sub>*(*x*, *y*) is the residual phase.

The topographic phase and flat phase were removed using external DEM data and geometric imaging relationships. Then, the required deformation phase information was obtained by filtering and phase unwrapping; the stable PS points obtained using the PS-InSAR three-threshold method were used as GCPs for refinement and re-flattening to further correct the deformation phase and improve the subsequent inversion accuracy. If *d*(*t*<sub>0</sub>, *x*, *y*) = 0, the corresponding time-series phase is:

$$
\varphi(t\_i, x, y) \approx \frac{4\pi}{\lambda} d(t\_i, x, y) \tag{9}
$$

The temporal deformation phase sequence of the pixel to be solved can be expressed as:

$$\varphi = \begin{bmatrix} \varphi(t\_1), \dots, \varphi(t\_N) \end{bmatrix}^T \tag{10}$$

$$\delta\varphi = \begin{bmatrix} \delta\varphi\_1, \dots, \delta\varphi\_M \end{bmatrix}^T \tag{11}$$

The primary image and the secondary image sequences arranged in chronological order are represented by sets *IE* = [*IE*<sup>1</sup> ··· *IEM*] and *IS* = [*IS*<sup>1</sup> ··· *ISM*], respectively, and satisfy *IEj* > *ISj*(*j* = 1, 2, . . . , *M*). Then, all the differential phases can be expressed as:

$$\delta\varphi\_j = \varphi(t\_{IE\_j}) - \varphi(t\_{IS\_j}), \quad (j = 1, 2, \dots, M) \tag{12}$$

Converting it to a matrix form, it can be expressed as:

$$A\varphi = \delta\varphi\tag{13}$$

In the above formula, *A* is the *M* × *N* coefficient matrix. When all the interference pairs belong to a single subset, the rank of *A* is *N* (*M* ≥ *N*), and the phase sequence can be obtained by the least squares method:

$$
\varphi = A^{+} \delta\varphi, \quad A^{+} = \left(A^T A\right)^{-1} A^T \tag{14}
$$

In actual processing, *A* is often rank-deficient and *A*<sup>T</sup>*A* is singular, so the system has infinitely many solutions; the minimum-norm least squares solution is obtained by the SVD of *A*:

$$A = USV^T \tag{15}$$

Finally, the phase-change velocity *V* is solved. After the phase-change velocity is solved, the cumulative deformation can be calculated by integrating the phase-change rate in the corresponding time period [57], so as to obtain the time-series deformation.
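The inversion of Equations (12) to (15) can be sketched for a single pixel as follows. This is a simplified illustration that solves directly for the phases rather than for the phase-change velocity; the helper names, the four acquisition dates, and the phase values are assumptions. `numpy.linalg.pinv` computes the pseudo-inverse through an SVD, so it returns the least squares solution when *A* has full column rank and the minimum-norm solution otherwise.

```python
import numpy as np

def design_matrix(pairs, n_dates):
    """Build A of Eq. (13); each row encodes phi(t_late) - phi(t_early).

    Date index 0 is the reference with zero phase, so A has n_dates - 1 columns.
    """
    A = np.zeros((len(pairs), n_dates - 1))
    for row, (early, late) in enumerate(pairs):
        A[row, late - 1] = 1.0
        if early > 0:
            A[row, early - 1] = -1.0
    return A

def invert_phase(pairs, n_dates, dphi):
    """Solve A * phi = dphi with the SVD-based pseudo-inverse."""
    A = design_matrix(pairs, n_dates)
    return np.linalg.pinv(A) @ np.asarray(dphi, dtype=float)

# Synthetic pixel (assumption): phases at dates 0..3 are 0, 1.0, 2.5, 3.0 rad.
pairs = [(0, 1), (1, 2), (2, 3), (0, 2), (1, 3)]
dphi = [1.0, 1.5, 0.5, 2.5, 2.0]
phi_est = invert_phase(pairs, n_dates=4, dphi=dphi)
```

Here the five interferograms form a single connected subset, so the three unknown phases are recovered exactly.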

The SBAS-PS-InSAR method selected one of the images as the super primary image, and the remaining images were secondary images. The connection graph was determined using the spatio-temporal baseline thresholds; the deformation monitoring accuracy improves as the spatio-temporal baseline decreases [58]. The spatial and temporal baseline thresholds were set to 2% of the maximum critical baseline and 70 days, respectively, and the image interference pairs were generated according to the principle of a small baseline set. The interference processing included the removal of the flat phase and topographic phase, the generation of differential interferograms, Goldstein filtering, and minimum cost flow (MCF) phase unwrapping. The PS-InSAR three-threshold method was used to automatically select the PS points as GCPs for refinement and re-flattening, in order to estimate and remove the residual constant phase and phase ramp after unwrapping, reduce orbit errors, and improve the inversion accuracy. Then, we estimated and removed the residual terrain phase and time low-pass phase and performed secondary phase unwrapping on the residual part. The average deformation velocity and elevation topography were inverted. Then, the phase time series was processed by atmospheric space–time filtering to remove the atmospheric delay phase [59]. Finally, phase-to-displacement conversion, geocoding, and clipping were used to obtain the cumulative time-series deformation sequence of the study area.
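The pair-generation step can be illustrated with the temporal threshold alone. The sketch below is a simplification: the spatial baseline test is omitted because it needs orbit data, the eight regularly spaced dates are hypothetical (the real acquisition list is not given here), and the function names are assumptions. The bounds of Equation (7) are also evaluated for the 52 scenes used in this study.

```python
from datetime import date, timedelta
from itertools import combinations

def pair_count_bounds(n_images):
    """Bounds of Eq. (7), where n_images = N + 1."""
    return n_images / 2, n_images * (n_images - 1) // 2

def temporal_pairs(dates, max_days=70):
    """All image pairs whose temporal baseline does not exceed max_days."""
    return [(a, b) for a, b in combinations(sorted(dates), 2)
            if (b - a).days <= max_days]

# Hypothetical acquisition dates: 8 scenes on a 12-day cycle from 6 Jan 2018.
dates = [date(2018, 1, 6) + timedelta(days=12 * k) for k in range(8)]
pairs = temporal_pairs(dates)

# For the 52 scenes of this study, Eq. (7) gives 26 <= M <= 1326.
lower, upper = pair_count_bounds(52)
```

With a 12-day cycle, a 70-day limit connects each scene to at most the next five, so the eight hypothetical dates yield 25 pairs.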

#### 2.3.4. DInSAR-PS-Stacking and SBAS-PS-InSAR Fusion

The cumulative time-series deformations monitored by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods were compared and analyzed, and the monitoring ability and the advantages and disadvantages of these two methods for surface deformation in the goaf of the working face were studied. The SBAS-InSAR method performs poorly in areas with large subsidence gradients and poor coherence, but it can achieve higher accuracy in subsidence edge regions. On the contrary, the DInSAR method performs well in subsidence central areas and can achieve higher accuracy, but its accuracy is relatively low in subsidence edge regions [60,61]. In this study, the DInSAR-PS-Stacking method can obtain the cumulative subsidence in the center area with a large motion gradient, and SBAS-PS-InSAR can more accurately monitor slow and small deformations. Therefore, OTSU threshold segmentation is used to automatically extract double thresholds (coherence coefficient threshold and deformation threshold) to fuse the simultaneous cumulative time-series deformations obtained by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods.

Firstly, the coherence maps were obtained using Formula (1) in the coherence coefficient method, and then the multi-temporal average coherence maps were obtained using Formula (2).

Secondly, OTSU was used to automatically obtain the threshold of the time-series average coherence map. Pixels whose coherence is less than the threshold take the cumulative deformation result of DInSAR-PS-Stacking, and pixels whose coherence is greater than or equal to the threshold take the cumulative deformation result of SBAS-PS-InSAR.

Thirdly, on this basis, the OTSU method is used to extract the deformation threshold from the cumulative deformation results monitored by DInSAR-PS-Stacking. Pixels whose cumulative deformation is less than the threshold are treated as the settlement center, where the settlement gradient is large, and their DInSAR-PS-Stacking values are merged into the fused cumulative deformation map from the previous step. The fused cumulative time-series deformation results are thus obtained.

In this way, the insufficient monitoring of the subsidence center by SBAS-PS-InSAR and the poor accuracy of DInSAR-PS-Stacking at the subsidence edge are both addressed, and the subsidence of the working face goaf can be further studied and analyzed.
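The two-threshold fusion described above can be sketched as follows. This is a minimal illustration under stated assumptions: Otsu's threshold is computed here from a 256-bin histogram in plain NumPy, the function names are hypothetical, and the six-pixel arrays are synthetic. Subsidence is negative, so "less than the deformation threshold" means stronger subsidence.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Otsu's threshold: the histogram split maximizing between-class variance."""
    values = np.asarray(values, dtype=float).ravel()
    values = values[np.isfinite(values)]
    hist, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist).astype(float)   # pixel count at or below each bin
    w1 = w0[-1] - w0                     # pixel count above each bin
    csum = np.cumsum(hist * centers)
    m0 = np.divide(csum, w0, out=np.zeros_like(csum), where=w0 > 0)
    m1 = np.divide(csum[-1] - csum, w1, out=np.zeros_like(csum), where=w1 > 0)
    between = np.where((w0 > 0) & (w1 > 0), w0 * w1 * (m0 - m1) ** 2, -1.0)
    return float(edges[np.argmax(between) + 1])

def fuse(stack_def, sbas_def, mean_coh):
    """Two-step fusion of cumulative deformation maps (same shape, in mm)."""
    stack_def = np.asarray(stack_def, dtype=float)
    sbas_def = np.asarray(sbas_def, dtype=float)
    mean_coh = np.asarray(mean_coh, dtype=float)
    # Step 1: low-coherence pixels take stacking, the rest take SBAS.
    coh_t = otsu_threshold(mean_coh)
    fused = np.where(mean_coh < coh_t, stack_def, sbas_def)
    # Step 2: strongly subsiding pixels (below threshold) take stacking.
    def_t = otsu_threshold(stack_def)
    fused = np.where(stack_def < def_t, stack_def, fused)
    return fused, coh_t, def_t

# Synthetic example (assumption): three basin-center and three edge pixels.
mean_coh = [0.20, 0.25, 0.30, 0.80, 0.85, 0.90]
stack_def = [-130.0, -120.0, -110.0, -20.0, -10.0, -5.0]
sbas_def = [-60.0, -55.0, -50.0, -15.0, -8.0, -4.0]
fused, coh_t, def_t = fuse(stack_def, sbas_def, mean_coh)
```

In this example the three low-coherence center pixels keep the DInSAR-PS-Stacking values and the three edge pixels keep the SBAS-PS-InSAR values.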

#### **3. Results and Analysis**

#### *3.1. Analysis of Refinement and Re-Flattening Results*

In this study, the above DInSAR-PS-Stacking and SBAS-PS-InSAR methods were used to process 52 scenes of SAR image data covering a coal mine in Nalintaohai Town, Ejin Horo Banner, Ordos City, Inner Mongolia Autonomous Region, from 6 January 2018 to 4 October 2019. Refinement and re-flattening are important steps in InSAR processing and play an important role in improving the accuracy and reliability of measurement. Ground control points must be selected for refinement and re-flattening in order to determine the position and elevation information of the satellite orbit and terrain, improve measurement accuracy, and eliminate the influence of non-surface deformation factors. The ground control points are selected in areas with high coherence and flat terrain that are free of phase jumps and deformation fringes. The stable PS points selected by the PS-InSAR three-threshold method basically meet these selection requirements and are therefore used as ground control points for refinement and re-flattening. We selected representative results of differential interference and unwrapping. Figure 3 shows the differential interferogram after refinement and re-flattening and the phase diagram after unwrapping. It can be seen that the effect after processing is very good, and the root mean square error is within 15 or even 10. The results of the other differential interferograms and unwrapped phases after refinement and re-flattening are also good, and the root mean square errors basically satisfy the requirements. The differential interferogram and phase unwrapping diagram after refinement and re-flattening are relatively smooth, with less noise and no obvious slope or step-like phase deviation, and the deformation area can be clearly seen and is more consistent.
The differential interferogram and phase unwrapping diagram after data processing at different times have a more obvious deformation area, the positions are relatively close, and there is a certain repeatability and space–time continuity. Through the PS-InSAR three-threshold method, the stable PS points are selected as ground control points for refinement and re-flattening, which can correct the satellite orbit and phase offset and effectively eliminate the influence of orbit phase. This method basically meets the selection requirements of ground control points and meets the research needs of deformation monitoring.

**Figure 3.** Differential interferogram and unwrapped phase diagram after refinement and re-flattening. Panels include the interferometric pairs 8 December 2018–20 December 2018, 20 December 2018–1 January 2019, and 1 January 2019–13 January 2019.

#### *3.2. Monitoring and Analysis of DInSAR-PS-Stacking and SBAS-PS-InSAR*

In this study, the DInSAR-PS-Stacking and SBAS-PS-InSAR methods were combined and compared to monitor the surface subsidence of a mining area in Nalintaohai Town, Ejin Horo Banner, Ordos City, Inner Mongolia Autonomous Region, from 6 January 2018 to 4 October 2019. After that, the monitored deformation results were verified and analyzed. Compared with mining area monitoring using a single InSAR method, the proposed method improved the accuracy and integrity of deformation monitoring in the mining area, reduced the atmospheric error, improved the monitoring effect, and was conducive to the efficient identification of goaf in the mining area. Figure 4 shows the average annual displacement velocity of the coal mine monitored using SBAS-PS-InSAR. Figure 5 shows the comparison of the monitoring results of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods in the same period.

**Figure 4.** The average subsidence velocity of the mining area monitored using SBAS-PS-InSAR.

**Figure 5.** Comparison of the cumulative subsidence monitored using DInSAR-PS-Stacking and SBAS-PS-InSAR in the same period.

Coal mining often produces large subsidence in a short time, which can lead to loss of coherence. SBAS-PS-InSAR cannot select high-coherence points in the mining subsidence center, which leads to loss of information.

It can be seen from the surface subsidence results shown in Figures 4 and 5 that from 6 January 2018 to 4 October 2019, both the DInSAR-PS-Stacking and SBAS-PS-InSAR methods detected obvious subsidence funnels, which extended around the center of the subsidence funnel and gradually took on a strip-like distribution. The subsidence range gradually extends from north to south along the working face, which is consistent with the advancement of mining progress. In the mining face area, there was basically no settlement at the beginning. With the advancement of mining progress, the settlement gradually increased and then gradually stabilized. The subsidence change in the mining face is closely related to the advancement of mining progress. The position and spatio-temporal changes in the mining subsidence basin monitored by the two methods were basically the same and were highly consistent with the actual mining area. These two methods reveal that the ground subsidence in the mining area of the coal mining working face is gradually increasing, and the coal mine has been mined in large quantities, resulting in large-scale ground subsidence. As the mining intensity increased, the subsidence range spread around the center of the subsidence funnel. For the mining area, the maximum cumulative subsidence monitored by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods was −131.4 mm and −96.2 mm, respectively. The annual average maximum subsidence velocity obtained by the SBAS-PS-InSAR method was −59.3 mm/year. Overall, the surface subsidence locations, scope, distribution, and temporal and spatial subsidence laws of the coal mine monitored by the two methods showed a high degree of agreement. However, the specific analysis showed that (1) in the same time period, the cumulative subsidence monitored using DInSAR-PS-Stacking is larger than that of SBAS-PS-InSAR, and that (2) in areas with a large subsidence gradient (such as the subsidence center) and poor coherence, DInSAR-PS-Stacking was able to monitor the subsidence information, while SBAS-PS-InSAR failed to do so effectively, leaving gaps in the obtained cumulative time-series subsidence map.

Using the cumulative time-series subsidence maps obtained with the DInSAR-PS-Stacking and SBAS-PS-InSAR methods, a time-series analysis and comparative verification were carried out at four selected subsidence points. The locations of the deformation monitoring points are shown in Figure 1c. Point 1 is at the starting position of the mining face and also at the edge of the settlement. Points 6, 12, and 22 are located in the larger settlement area. These points were chosen because the large settlement area of the mining face is the most significant and most of the area shows large settlement, and because the selection had to take into account the partially missing SBAS-PS-InSAR monitoring results and the locations of the GPS monitoring points. The four points are representative of the subsidence of the mining face and reflect the deformation of the mining face and the surrounding ground, giving a clearer picture of the subsidence near the mining face.

As shown in Figure 6, in the goaf of the coal mine's working face, it can be seen that the four characteristic subsidence points monitored using these two methods were basically the same for the overall subsidence trend. The DInSAR-PS-Stacking and SBAS-PS-InSAR methods both accurately monitor the temporal subsidence trends of the ground surface, which are consistent with GPS measurements. The settlement curves of the three show similar trends. However, the DInSAR-PS-Stacking method detects a greater amount of subsidence compared with the SBAS-PS-InSAR method. Taking Figure 6d as an example, the time-series settlement plot of this point shows the following information: from 6 January 2018 to 29 July 2018, the settlement was relatively small and changed slowly, with a slight downward trend. A significant settlement change occurred between 29 July 2018 and 27 September 2018. After 27 September 2018, although settlement still occurred, the settlement trend became relatively flat. In summary, this point experienced two different settlement stages during the observation period, with relatively slow settlement changes in the early and late stages and a significant change in the middle stage. The time series subsidence trend of Figure 6d is related to the mining progress. In Figure 6, it can be seen that the time series change basically shows a sinking trend. The time series settlement trend is basically that the settlement in the early and late stages is small and gentle, and the settlement in the middle stage is obviously larger, but the large settlement inflection point in the middle stage of each figure is different. The time of this inflection point is related to the degree of mining advancement.

The settlement point in Figure 6a is at the edge of the settlement area, and the settlement points in Figure 6b–d are close to the center of large settlement. In Figure 6a, the time-series subsidence monitored by the SBAS-PS-InSAR method is closer to the GPS-measured time-series subsidence, whereas in Figure 6b–d the time-series subsidence monitored by the DInSAR-PS-Stacking method is closer to the GPS measurements. In other words, the SBAS-PS-InSAR method is more suitable for monitoring the slow and small deformation at the edge of the mining area, where its results are closer to the GPS-measured results. The DInSAR-PS-Stacking method is more suitable for monitoring the deformation of the large subsidence area in the mining area, where its subsidence values and subsidence trend are closer to the GPS-measured results. Combining or fusing the two methods can therefore better monitor the surface subsidence of the mining area and improve the monitoring accuracy and capability.

**Figure 6.** Results and comparison of time-series subsidence points monitored using DInSAR-PS-Stacking and SBAS-PS-InSAR: (**a**) point 1, (**b**) point 6, (**c**) point 12, and (**d**) point 22.

This paper compares and analyzes the correlation between the monitoring results of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods and GPS monitoring results at four subsidence feature points. Pearson correlation coefficient is used to measure the strength of the linear relationship between the two, with a higher coefficient (closer to 1) indicating a more consistent subsidence trend. Figure 7 shows a comparison with the correlation diagrams of settlement curves measured by DInSAR-PS-Stacking, SBAS-PS-InSAR, and GPS. The time-series subsidence results monitored by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods were compared with GPS monitoring results, and indicators such as Pearson correlation coefficient, mean absolute error (MAE), and root-mean-square error (RMSE) were calculated, verified, and analyzed. Table 3 shows the comparison and verification of the DInSAR-PS-Stacking time series settlement results and the GPS monitoring results. Table 4 shows the comparison and verification of the SBAS-PS-InSAR time series settlement results and the GPS monitoring results.
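The three indicators named above can be sketched in a few lines. The following is a minimal illustration using only the Python standard library; the function names and the sample subsidence values are made up for demonstration and are not taken from Tables 3 and 4.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def mae(x, y):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical cumulative subsidence (mm) at one point on matching dates.
gps_mm = [0.0, -5.0, -20.0, -60.0, -80.0, -85.0]
insar_mm = [0.0, -4.0, -18.0, -55.0, -76.0, -83.0]

print("Pearson r:", pearson_r(insar_mm, gps_mm))
print("MAE (mm):", mae(insar_mm, gps_mm))
print("RMSE (mm):", rmse(insar_mm, gps_mm))
```

Note that the Pearson coefficient only measures how well the two curves follow the same trend, whereas MAE and RMSE measure the size of the differences, so a method can have the higher correlation and still the larger error at a given point.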

**Table 3.** Comparison and verification of DInSAR-PS-Stacking time series settlement results and GPS monitoring results.


**Figure 7.** Comparing the correlation diagrams of settlement curves measured by DInSAR-PS-Stacking, SBAS-PS-InSAR, and GPS: (**a**) point 1, (**b**) point 6, (**c**) point 12, and (**d**) point 22.


**Table 4.** Comparison and verification of SBAS-PS-InSAR time-series settlement results and GPS monitoring results.

There are differences between the monitoring results of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods, which were analyzed and verified using Figure 7 and Tables 3 and 4. In the correlation plots in Figure 7, narrower color bands indicate better correlation. Based on Figure 7 and the Pearson correlation coefficients, both InSAR methods show good correlation with the GPS monitoring results, indicating that the subsidence trend detected by these two InSAR methods is consistent with the GPS measurements. Comparing Tables 3 and 4, the Pearson correlation coefficient between the time-series subsidence results of the DInSAR-PS-Stacking method and the GPS monitoring results is higher than that of the SBAS-PS-InSAR method and approaches 1. This suggests that the monitoring results of the DInSAR-PS-Stacking method are more consistent with the GPS monitoring results, indicating higher reliability and accuracy. Combining Tables 3 and 4 with Figure 6, at point 1 the absolute error, mean absolute error (MAE), and root-mean-square error (RMSE) between the SBAS-PS-InSAR time-series subsidence results and the GPS monitoring results are smaller than those of the DInSAR-PS-Stacking method. At points 6, 12, and 22, the absolute error, MAE, and RMSE of the DInSAR-PS-Stacking method are smaller than those of the SBAS-PS-InSAR method. In other words, the SBAS-PS-InSAR results have smaller errors in the marginal subsidence area, while the DInSAR-PS-Stacking results have smaller errors in the large subsidence area. The errors of the surface deformation monitoring results of both InSAR methods are within an acceptable range, indicating that the two methods represent the same deformation field of the mining area. These results indicate that the surface subsidence monitoring results of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods are effective and reliable.

In summary:


#### *3.3. Deformation Fusion Monitoring and Analysis of DInSAR-PS-Stacking and SBAS-PS-InSAR*

The comparative analysis above shows that the DInSAR-PS-Stacking method performs better in areas with large subsidence, while the SBAS-PS-InSAR method is more effective in monitoring slow and small subsidence in the outer regions. Therefore, the cumulative subsidence maps obtained from the DInSAR-PS-Stacking and SBAS-PS-InSAR methods are fused. The fused cumulative subsidence map combines the advantage of the DInSAR-PS-Stacking method in large subsidence areas with that of the SBAS-PS-InSAR method in slow subsidence areas. The fused cumulative subsidence map, shown in Figure 8, is overlaid on the working face and goaf zone shown in Figure 1c. Overall, the fusion of these two InSAR methods is effective, especially in the subsidence area of the working face and goaf zone, where an obvious subsidence trend can be observed and few gaps remain. The fusion compensates for the deformation information that the SBAS-PS-InSAR method loses in the large subsidence area, while the slowly subsiding edge is still effectively monitored by the SBAS-PS-InSAR method. The mining area is little affected by factors such as climate and roads, and the subsidence basin is mainly caused by underground coal mining. As the mining operation progresses, subsidence gradually occurs in the area of the working face, spreads to the surrounding areas and deepens, and eventually becomes stable. This process gradually forms a subsidence basin that eventually matches the actual location of the mining face. By fusing the DInSAR-PS-Stacking and SBAS-PS-InSAR methods to monitor mining area subsidence, their respective advantages complement each other, thereby improving the capability and effectiveness of monitoring mining area surface deformation.

The fusion monitoring results of DInSAR-PS-Stacking method and SBAS-PS-InSAR method are compared with GPS subsidence monitoring results. The distribution of GPS subsidence monitoring points is shown in Figure 1c, and there are 66 monitoring points from north to south. Figure 9 compares the fused cumulative subsidence results from the two InSAR methods with the GPS measurement monitoring results. In Figure 9, it is possible to observe the settlement of monitoring points oriented from north to south on the mining working face very well.

From Figure 9, it can be observed that there are obvious subsidence basins in the mining face from north to south, and the subsidence trend is basically the same. The settlement changes in the middle and front sections of Figure 9a,b are consistent. The settlement of the rear section in Figure 9b is larger than that in Figure 9a, and the position of the last inflection point in Figure 9b is later than that in Figure 9a. This indicates that most areas in the northern part of the mining working face had stabilized subsidence trends by 29 August 2019, while the southernmost part continued to subside until 4 October 2019, before stabilizing. The subsidence of the goaf area of the working face is mainly caused by large-scale coal mining. As the coal mining progresses, the subsidence basin becomes increasingly apparent, and the subsidence volume and subsidence area gradually increase, with steep and uneven subsidence edges. The average absolute errors of the cumulative subsidence at each point on the two subsidence curves in Figure 9a,b are 55.8 mm and 56.1 mm, respectively, and the root mean square errors are 60.8 mm and 59.3 mm, respectively. This method can effectively monitor subsidence and is consistent with the actual situation.

The mining of underground coal can cause movement in the surrounding and overlying strata of the goaf, resulting in deformation and destruction of the surface. The surface deformation information obtained by combining the two methods was analyzed, and the distribution of subsidence in the goaf of the mining face and its surroundings can be clearly seen. Combined with the above analysis, the subsidence range over the mining face and its coincidence with the working face became more obvious over time, and the surface continued to sink; according to this trend, the area will continue to settle in the future. The ground cracks were generally parallel to the boundary of the mined-out area. In the edge area of the subsidence funnel, the ground surface was greatly affected by tensile deformation, resulting in cracks. As shown in Figure 9, the edge of the subsidence funnel is slightly steep, and there is a high possibility of surface cracking. Therefore, preventive measures should be taken before these conditions deteriorate, and during coal mining further measures should be taken to prevent large-scale collapse.

**Figure 8.** The cumulative subsidence maps after the fusion of DInSAR-PS-Stacking and SBAS-PS-InSAR.

**Figure 9.** Comparison between the fusion results of the two InSAR methods and GPS monitoring results: (**a**) 29 August 2019 and (**b**) 4 October 2019.

#### **4. Discussion**

The experimental results show that both the DInSAR-PS-Stacking and SBAS-PS-InSAR methods combine the PS-InSAR three-threshold selection method (the coherence coefficient threshold, amplitude dispersion index threshold, and deformation velocity interval) to select ground control points for orbit refinement and reinterferometric processing, solving the problem of manually selecting ground control points (GCPs) that affect monitoring results during orbit refinement and reinterferometric processing. They can effectively correct satellite orbits and phase offsets, reduce the influence of orbit errors, and improve deformation monitoring effectiveness.

Both the DInSAR-PS-Stacking and SBAS-PS-InSAR methods can be used as effective methods for real-time monitoring of subsidence caused by coal mining. Both methods can accurately monitor the location, extent, and spatio-temporal distribution of coal mining subsidence, with good correlation and consistency. The spatio-temporal subsidence trends monitored by both methods are consistent with the mining progress. It can be seen from Figure 6, Tables 3 and 4, that the deformation error of the DInSAR-PS-Stacking and SBAS-PS-InSAR methods is smaller in the edge region of the subsidence basin and larger in the area with significant subsidence. In the edge region with less subsidence, both methods can reflect the spatio-temporal subsidence trends well, with SBAS-PS-InSAR having higher monitoring accuracy. In areas with greater subsidence, the DInSAR-PS-Stacking method can better monitor the spatio-temporal subsidence trends, with a subsidence monitoring error smaller than that of SBAS-PS-InSAR. However, the deformation results monitored by the SBAS-PS-InSAR method have some missing information, with slightly inferior monitoring capabilities in the subsidence center and the area with significant subsidence. Nevertheless, the SBAS-PS-InSAR method has higher monitoring accuracy in areas with smaller subsidence and is suitable for slow and small deformation monitoring.

In order to better monitor the surface deformation of the mining area, the above characteristics of DInSAR-PS-Stacking and SBAS-PS-InSAR are taken into account. Double thresholds (a deformation threshold and an average coherence threshold) are extracted by OTSU threshold segmentation and used to fuse the contemporaneous time-series deformation results monitored by the two InSAR methods. The fused deformation results combine the effective monitoring of significant subsidence in large deformation areas by DInSAR-PS-Stacking with the effective monitoring of slow and small subsidence at the subsidence edge by SBAS-PS-InSAR, thereby improving the accuracy and completeness of mining area surface deformation monitoring. By comparing and fusing the DInSAR-PS-Stacking and SBAS-PS-InSAR methods, mining area surface deformation can be monitored more effectively, and the problem of incomplete monitoring in some areas can be avoided. The comparison and fusion of these methods can quickly yield the deformation distribution of the mining area and more accurate deformation information. The results of the space–time analysis show that subsidence will continue to occur in this area, which needs to be further studied to form an integrated research system for subsidence monitoring and prediction in mining areas. This will help to provide early warnings before disasters occur and will also provide auxiliary decision support for safe production in mining areas.
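The double-threshold fusion described above can be illustrated with a small sketch. It implements Otsu's method on a histogram in plain Python and then applies one plausible per-pixel decision rule. The rule itself (use the DInSAR-PS-Stacking value where SBAS-PS-InSAR has no result, where the deformation magnitude exceeds the deformation threshold, or where coherence falls below the coherence threshold) is an assumption made for illustration, as are all function names and sample values; the exact rule used in the study is not stated in this section.

```python
def otsu_threshold(values, bins=64):
    """Otsu's method: threshold that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(h * c for h, c in zip(hist, centers))
    best_score, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += hist[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (total_sum - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, lo + (i + 1) * width
    return best_t

def fuse(dinsar, sbas, coherence):
    """Per-pixel fusion; sbas entries are None where SBAS has no result.

    Assumed rule (illustrative only): prefer SBAS unless it is missing,
    the deformation is large, or the coherence is low.
    """
    t_def = otsu_threshold([abs(d) for d in dinsar])
    t_coh = otsu_threshold(coherence)
    fused = []
    for d, s, c in zip(dinsar, sbas, coherence):
        if s is None or abs(d) > t_def or c < t_coh:
            fused.append(d)
        else:
            fused.append(s)
    return fused

# Hypothetical cumulative deformation (mm) and average coherence per pixel.
dinsar_mm = [-1.0, -2.0, -3.0, -2.0, -1.0, -90.0, -100.0, -110.0]
sbas_mm = [-1.5, -2.5, -2.8, -2.2, -0.9, -40.0, None, None]
coh = [0.90, 0.85, 0.80, 0.90, 0.88, 0.30, 0.20, 0.25]

fused_mm = fuse(dinsar_mm, sbas_mm, coh)
print(fused_mm)
```

In this toy example the edge pixels keep their SBAS values and the central, low-coherence pixels take the DInSAR values, which mirrors the complementary behavior of the two methods discussed above.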

The deformation results of the mining area monitored by this research method are good, but the monitoring accuracy is still affected by some error factors. In a follow-up study, DEM data with higher precision and sourced from closer to the mining period can be considered, and the atmospheric delay error can be corrected by using external meteorological data (such as GACOS data [62]) to improve the monitoring accuracy. Subsequently, the phase filtering and phase unwrapping algorithms can be optimized for the study area to improve the monitoring accuracy. This study has scientific guiding significance for rational mining planning, accident prevention and control, and disaster prediction.

#### **5. Conclusions**

Taking a mining area in the Inner Mongolia Autonomous Region of China as the research area, this study used the DInSAR-PS-Stacking and SBAS-PS-InSAR methods to monitor the surface deformation of the mining area and compared and analyzed the deformation results monitored by the DInSAR-PS-Stacking and SBAS-PS-InSAR methods. The deformation results monitored by the two InSAR methods were fused and the fused deformation results were analyzed. The following conclusions are drawn:


**Author Contributions:** Conceptualization, Y.C. and X.D.; methodology, Y.C. and X.D.; data curation, Y.C., X.D., P.H. and W.S.; formal analysis, Y.C. and X.D.; investigation, Y.C., Y.Q. and X.L. (Xiujuan Li); visualization, Y.C. and X.D.; writing—original draft, Y.C. and X.D.; writing—review and editing, P.H., W.X. and W.T.; project administration, Y.C., P.H. and X.L. (Xiaolong Liu). All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant number 52064039; the Natural Science Foundation of Inner Mongolia Autonomous Region, grant numbers 2020ZD18 and 2021MS06004; the Science and technology planning project of Inner Mongolia Autonomous Region, grant numbers 2020GG0073 and 2019GG139; the Major Project of Science and Technology of Inner Mongolia Autonomous Region, grant number 2019ZD022; the Basic scientific research business expenses of colleges and universities directly under Inner Mongolia Autonomous Region, grant number JY20220134; the Scientific research project of Inner Mongolia University of Technology, grant numbers BS201945 and DC2000001249; and the Equipment Engineering Special Project of natural disaster prevention and control, grant number TC210H00L-49.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** Many thanks to Inner Mongolia Key Laboratory of Radar Technology and Application for the great support. Thanks to anonymous reviewers for their constructive comments to improve the quality of this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Seasonal Ground Movement Due to Swelling/Shrinkage of Nicosia Marl**

**Ploutarchos Tzampoglou <sup>1,\*</sup>, Dimitrios Loukidis <sup>1</sup> and Niki Koulermou <sup>2</sup>**


**Abstract:** This research investigates the seasonal ground heave/settlement of an area covered by an expansive soil of Cyprus called Nicosia marl, highlighting the degree of influence of the main causal factors. For this purpose, existing geotechnical data from the archives of the Cyprus Geological Survey were first collected and processed to compile maps of the key geotechnical parameters in the study area. In order to estimate the ground movements in the area, Earth Observation (EO) data for the period from 16 November 2002 to 30 December 2006 were processed. Comparing these movements with the existing geotechnical data indicates that there is a statistically significant correlation between the plasticity index and the ground movements. Multivariate linear regression analysis using Lasso revealed that the plasticity index ranks first in importance among the examined variables.

**Keywords:** ground heave/settlement; expansive clay; seasonal motion; rainfall

#### **1. Introduction**

Expansive soils significantly affect many countries worldwide, damaging buildings, road networks and other infrastructure. The mechanisms involved in this phenomenon are complex, and the factors that play an active role in its development can be divided into two categories: the preparatory and the triggering ones. The first category comprises the geotechnical and geological conditions, while the second comprises the moisture variation of the ground caused by rainfall infiltration and evapotranspiration, the leakage of water supply and sewage pipes, the fluctuations of the water table, etc. The economic losses caused by this hazard are on the scale of billions of dollars each year [1,2]. In fact, the annual cost of damage due to expansive soils surpasses that of other natural geohazards, namely earthquakes and landslides [3]. Examples of regions with expansive clays include China [4], Sudan [5,6], Australia [7,8], Saudi Arabia [9,10], the United Kingdom [1,11], Canada [12], and Sweden [13]. Several studies (numerical or experimental) have been carried out in order to estimate the connection between the degree of damage to buildings and the distribution of ground moisture underneath their foundations, e.g., [14–16].

To date, Earth Observation (EO) technologies have been widely used, along with field surveys and numerical modeling, in studies investigating the failure mechanisms of various natural hazards, such as landslides [17–19], land subsidence [20–22], floods [23,24], etc. However, the use of these technologies for the investigation of vertical ground movements due to the seasonal swelling and shrinkage of clays is at a relatively early stage [25–27]. Nonetheless, InSAR has been successfully used in numerous studies in the last decades for studying a phenomenon that also involves predominantly vertical ground movements, i.e., land subsidence due to changes in the groundwater level. These studies estimate the degree of influence of the factors that play an active role in the future extent and magnitude of land subsidence, such as the geological conditions [28,29], tectonic structure [30], hydrological conditions [31–39], the thickness of the compressible formations [36,40], urbanization [20,41] and land use [40,42].

**Citation:** Tzampoglou, P.; Loukidis, D.; Koulermou, N. Seasonal Ground Movement Due to Swelling/Shrinkage of Nicosia Marl. *Remote Sens.* **2022**, *14*, 1440. https://doi.org/10.3390/rs14061440

Academic Editor: Andrea Ciampalini

Received: 7 January 2022 Accepted: 9 March 2022 Published: 16 March 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The present study employs Interferometric Synthetic Aperture Radar (InSAR) data to investigate the phenomenon of ground swelling/shrinkage due to seasonal moisture changes in an expansive clay formation and to assess the degree of influence of the causal factors. The methodology developed herein relies on the integration of EO technologies, pre-existing local data and experience, conventional geotechnical research and the use of statistical analysis within a GIS environment. Data on all the factors that affect this phenomenon, namely plasticity index [43–45], clay and montmorillonite content [4,46,47], and expansive soil layer thickness and depth [6,15], were compiled, and thematic maps were constructed using spatial analysis tools. It has long been established that the larger the values of plasticity index, clay content and montmorillonite content, the larger the tendency for swelling/shrinkage is expected to be [1,2]. Subsequently, the ground movements in the area as inferred by InSAR for the period from 16 November 2002 to 30 December 2006 [48] were processed. Finally, multivariate linear regression analysis using Lasso was carried out, elucidating the degree of influence of the main causal factors [49].
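To illustrate how a Lasso regression can rank causal factors, the sketch below fits an L1-penalized linear model by cyclic coordinate descent on standardized predictors and orders the factors by the absolute value of their coefficients. It is a minimal standard-library illustration, not the implementation used in the study: the function names, the penalty value `alpha`, and the synthetic data (constructed so that the plasticity index dominates) are all assumptions.

```python
def standardize(column):
    """Scale a column to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / std for v in column]

def soft_threshold(rho, alpha):
    """Soft-thresholding operator used in the Lasso coordinate update."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def lasso(columns, y, alpha, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + alpha*||b||_1 by coordinate descent.

    Predictors are standardized and the response is centered, so the
    coefficients are directly comparable across factors.
    """
    n = len(y)
    X = [standardize(c) for c in columns]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    beta = [0.0] * len(X)
    for _ in range(n_iter):
        for j, xj in enumerate(X):
            rho = 0.0
            for i in range(n):
                # Prediction from all other factors (partial residual for j).
                others = sum(beta[k] * X[k][i] for k in range(len(X)) if k != j)
                rho += xj[i] * (yc[i] - others)
            beta[j] = soft_threshold(rho / n, alpha)
    return beta

# Synthetic, made-up data: the response depends mostly on plasticity index.
plasticity_index = [20, 25, 30, 35, 40, 45, 50, 55]
clay_content = [30, 42, 35, 50, 38, 55, 41, 60]
layer_thickness = [5, 3, 6, 2, 7, 4, 8, 1]
movement = [0.4 * p + 0.05 * c for p, c in zip(plasticity_index, clay_content)]

names = ["plasticity index", "clay content", "layer thickness"]
coefs = lasso([plasticity_index, clay_content, layer_thickness], movement, alpha=0.1)
ranking = sorted(zip(names, coefs), key=lambda item: abs(item[1]), reverse=True)
for name, coef in ranking:
    print(f"{name}: {coef:+.3f}")
```

Because the L1 penalty shrinks weak coefficients toward zero, factors with little explanatory power tend to drop out of the model, which is what makes the remaining coefficients usable as an importance ranking.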

The study area is the central and eastern part of the city of Nicosia, which is the capital of the Republic of Cyprus (Figure 1). Expansive clays cover extensive areas of the island of Cyprus. The majority of these geomaterials are bentonitic clays deposited during the late Cretaceous period as deep-water sediments produced by the hydrothermal weathering of the basaltic rocks of the Tethyan oceanic crust. Nowadays, the outcrops of these extremely expansive materials are rather limited and can be found on the ground surface in a relatively small number of areas along the southern coast and in the mountainous region of Paphos District. However, the weathering of these old formations produced sediments that were redeposited in basins north and south of the axis of the Troodos mountain range, eventually resulting in the creation of the marls of the Nicosia geological formation, which outcrop particularly in areas of important urban development on the island of Cyprus, such as the cities of Nicosia, Larnaca and Paphos. These outcrops, in combination with the climate of the region, which is characterized by rainy winters and very dry summers, are responsible for various types of damage to buildings and other infrastructure. More specifically, the seasonal variations of the degree of ground saturation that happen inside the so-called active zone, the depth of which is of the order of 3 m to 8 m, cause soil expansion (during the rainy winter months) or soil shrinkage (during the dry summer months). This phenomenon puts strain on buildings founded on such soils, which most often results in cracks in in-fill walls. In more extreme cases, these cracks may propagate to the structural frame. According to a previous study [50], which surveyed a part of the city of Nicosia where the marl is extremely expansive, 58 out of 96 buildings showed various degrees of damage due to ground swelling/shrinkage, characterized as moderate/severe for 7 of them. It is worth mentioning that in areas where the slope of the ground is >3° damage is substantially more frequent. This may be due to the ratcheting effect on sloping terrain [51,52]. The annual cost of repairs in the surveyed area of 1.75 km² was estimated to be more than 2.4 million euros (based on 2002 prices), while the decrease in real estate value was estimated at 2 million euros per year.

Based on the above, it is obvious that the social and economic impact of expansive soils has been historically significant and is becoming more important in recent years as urban centers expand into areas covered by such soils. The results of the present study are meant to help in understanding the relationship between the spatial distribution of the causal factors and the magnitude of this phenomenon. Its findings can be of benefit to state authorities and other policy makers, as well as the construction industry, in planning future urban development and the maintenance of existing infrastructure.

**Figure 1.** Satellite image of the study area.

#### **2. Geomorphological, Geological and Climatic Setting**

The study area is mostly urban, with the exception of the northeastern part, which is an industrial-agricultural zone. In the Old Nicosia area, there are mainly older types of buildings, while the modern center of the city has developed in the area west of Lykavitos, with a significant concentration of high-rise buildings (applying high vertical stress to the ground). The areas southeast of Aglantzia and Strovolos are considered predominantly residential, with relatively newly built modern houses. The elevation of the study area ranges from 110 to 190 m. The lowest parts are located in the northeast (110 to 140 m), while the elevation increases towards the southwest, where it reaches 170–190 m in the suburb of Strovolos. The city center presents gentle slopes, with elevation varying from 150 to 170 m.

The geological maps of the Cyprus Geological Survey Department (GSD) pertaining to the wider region of Nicosia, namely the Bedrock Geologic Map of Nicosia and the Surficial Geologic Map of Nicosia (scale 1:25,000), were used. After spatial analysis and digitization, the geological formations with similar characteristics were merged and the surficial geological map of the study area was constructed (Figure 2A). The stratigraphy of the study area can be summarized as follows (starting with the younger formations and moving to the older ones):

**Manmade fills:** They consist of earthfills and manmade materials that have been placed irregularly on the natural ground.

**Alluvial deposits:** Depending on their deposition age, they can be divided into two groups:

The first group contains two distinct series, a more modern one and an older one. The modern series, which was deposited in rivers that eroded into the older alluvial deposits, consists mainly of gravel, sand, silt and clay, as well as organic material. Its thickness reaches 1–2 m. The older series consists of fine sand and silty clay with some pebbles.

The second group overlies the Apalos Formation. It can be divided mainly into two categories. The upper one underlies the alluvial terrace and mainly consists of fine sand, silt, clay, pebbles, cobbles, and sporadic amounts of manmade and organic materials. Its thickness varies from 1 to 4 m, while in some areas it reaches 10 m. The lower one consists of sands and gravels, the thickness of which varies from 1 to 5 m. In some areas, fluvial stream deposits can be found, which are composed of sand and gravel with cobbles of varying size. These deposits do not exceed 6 m in thickness.

**Colluvial deposits:** They consist of sand, calcareous sand, silt, clay, gravel and sporadic boulders. The total thickness reaches 4 to 6 m.

**Apalos Formation (Pleistocene):** Fluvial deposits, consisting of alternating layers of sand and gravel, sand, silt and clay. Its total thickness is approximately 22 m.

**Nicosia Formation:** The Nicosia Formation constitutes the bedrock of the entire study region and consists of several members:

*Athalassa member of Nicosia Formation (Pliocene-Pleistocene):* It mainly consists of calcarenite, calcareous sandstone and sandy marls.

*Kephales member of Nicosia Formation (Pliocene):* It consists of alternations of fine to coarse sands with varying sizes of pebble and cobble gravels. It also contains marine fossils. The thickness of this formation reaches 8 m.

*Marl member of Nicosia Formation (Pliocene):* This member is the dominant one in the region and is overlain unconformably by the Athalassa member. It is made up of marl and silty marl alternations with small amounts of sandy marl. It appears in two distinct horizons, the grey marl and the overlying light brown (khaki) marl. Its total thickness is of the order of hundreds of meters. The grey marl does not outcrop in the Nicosia region and, thus, focus has traditionally been placed on the overlying khaki marl, which in certain locations has a montmorillonite content of up to 25% [50,53], rendering the khaki marl of the Nicosia Formation medium to very highly expansive. The marl member outcrops in almost half of the study region, as seen in Figure 2B.

**Lapatza Formation:** It mainly consists of siltstone, marl, khaki marl, and limestone with thin intercalations of gypsum and silicified zones.

Among the geomaterials described above, only the Marl Member of the Nicosia Formation (and possibly the sandy marl of the Athalassa member, but to a much lesser degree) consists of expansive soils. All other materials are coarse to very coarse grained, and wherever they are encountered overlying the Nicosia marl they act as a buffer zone that lessens the negative effects of the marl swelling/shrinkage.

Figure 2B shows the bedrock geology, i.e., the geology excluding alluvial-colluvial deposits and manmade fills. These geomaterials are still largely inside the so-called active zone (the zone in which the ground moisture varies seasonally). This map is a useful tool as it presents the prevailing geological conditions pertaining to building foundations. Most building foundations in Nicosia are either constructed inside the Nicosia marl or are affected by its presence at some larger but proximal depth.

According to the Köppen–Geiger climate classification, Cyprus has a subtropical-mediterranean and semi-arid climate [54], while it is characterized as arid based on the Thornthwaite climate classification [55]. In the area of Nicosia, the winter is mild, while during the summer season (from June to August) the temperature often exceeds 40 °C. Most of the rainfall occurs between November and March. The average annual rainfall is of the order of 300–350 mm.

**Figure 2.** Maps of: (**A**) surficial and (**B**) bedrock geology of the study area.

#### **3. Geotechnical Properties of the Nicosia Marl**

In order to investigate the spatial distribution of the physical and mechanical characteristics of the Nicosia marl, logs from over 200 boreholes were collected from the numerous geotechnical investigation reports contained in the archives of GSD. The retrieved soil information was mainly about clay content (and occasionally montmorillonite content estimated using the "methylene blue" method), silt content, calcium carbonate content, the Atterberg limits, as well as the free swelling strain and/or swelling pressure measured in conventional oedometers. After data processing in the ArcGIS software (interpolation of geotechnical borehole data by using spatial analysis tools such as Kriging and Topo to Raster), thematic layers were constructed for the factors controlling ground swelling/shrinkage due to seasonal moisture changes, namely plasticity index (PI), clay and montmorillonite content, expansive soil layer thickness inside the active zone and depth. It should be stressed that the grey marl was not investigated because it lies deeper than the zone of ground moisture change (active zone). This database was enriched with geotechnical data from the two campuses of the University of Cyprus, which also lie in the study area, as well as data from the reports of pertinent research projects [25,29].

The physical and mechanical properties of the Nicosia khaki marl are listed in Table 1. The khaki marl consists of horizons that can be classified either as silty clay or silty clay with sand intercalations. The highest clay amounts inside the study region were observed in Lykavitos, the northern part of Aglantzia and the southern part of Palouriotissa. The clay content decreases as we move towards Old Nicosia and Strovolos (Figure 3A). The montmorillonite content on average is 15% (reaching values up to 25%) for the silty clay horizons and 12% (with maximum 26%) for the silty clay with sandy intercalations. The largest montmorillonite content values are observed in the area of Old Nicosia, Lykavitos and to the northeast of Aglantzia. The lowest values are registered in the area around Strovolos (Figure 3B). It should be noted that the ratio of montmorillonite content to clay fraction reaches 0.7 in the area around Old Nicosia. This ratio, although quite high, is confirmed by an earlier study in which 25 mineralogical tests were performed and the montmorillonite content to clay fraction ratio was found to range from 0.4 to 0.75 [53].


**Table 1.** Physical and mechanical parameters of the khaki marl.

**Figure 3.** Thematic layers of Nicosia khaki marl geotechnical properties: (**A**) clay content (%), (**B**) montmorillonite content (%) and (**C**) plasticity index.

As for the Plasticity Index (PI), the highest values are observed in the silty clay horizons, reaching up to 86 with an average of 46. For the sandier horizons, the corresponding values are maximum 48 and average 27. These results place the khaki Nicosia marl in the high plasticity group according to the USCS classification. The highest values were observed at Lykavitos and Aglantzia, while the lower ones in Old Nicosia and Strovolos (Figure 3C).

Finally, the oedometer test results indicate an average free swelling strain of 9%, with values reaching 43% in some cases. As such, the khaki marl can be characterized in general as very highly expansive.

In order to create maps with the variation in the thickness of the Nicosia marl inside the presumed active zone (upper 8 m of the ground profile) (Figure 4A) and of the depth of its upper boundary (Figure 4B), more than 230 boreholes that contained stratigraphic information (not only geotechnical investigation boreholes) were processed. The lowest values of the marl thickness in the assumed 8 m thick active zone were observed in Old Nicosia and at the south part of Strovolos, while the highest ones were mainly in the area around Aglantzia and the Eastern part of Palouriotissa (Figure 4A). The formation is practically outcropping in the areas around Lykavitos, Palouriotissa and Aglantzia, while the depth of the upper boundary increases substantially towards Old Nicosia (Figure 4B).

**Figure 4.** Thematic layers of: (**A**) marl thickness inside active zone and (**B**) upper boundary depth of Nicosia marl in the ground profile.

#### **4. InSAR Data and Processing**

For the purpose of this study, the results of the European Union's research project titled 'PanGeo' are used. The project's objective was to enable free and open access to geohazard information in all EU countries, based on the collection of data via satellites [56]. At the moment, 52 European cities provide open access maps, which consist of vector polygon Ground Stability Layer (GSL) and their GeoHazard Description (GHD).

The Geological Survey Department of Cyprus took part in this EU project. In the context of Cyprus's participation, an urban area of 489 km<sup>2</sup> of the city of Nicosia was chosen for studying ground movements. Data from 20 radar scenes of Envisat [Environmental satellite (Table 2) operated by the European Space Agency (ESA)] were processed with the Persistent Scatterer Interferometry (PSI) technique of InSAR (Interferometric Synthetic Aperture Radar) in the GAMMA IPTA software (version V1.2 June 2007).

The analysis covered the period between 16 November 2002 and 30 December 2006. The scene of 14 May 2005 was taken as the master scene. The georeferencing (X,Y) accuracy was 15 m and the reference data used for georeferencing was Quickbird imagery (0.6 m spatial resolution). A total of 23818 PS points were identified, with a density of about 49 PS points per km<sup>2</sup>. The average annual motion rate of the entire processed area is about −0.173 mm per year, while the standard deviation of the average annual motion rate is 1.523 mm per year.
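As a quick consistency check on the reported density, and assuming that the processed area coincides with the 489 km<sup>2</sup> urban area stated above (the symbol $\rho_{PS}$ is introduced here only for illustration), dividing the number of PS points by the area gives:

$$\rho_{PS} = \frac{23818}{489\ \mathrm{km}^2} \approx 48.7\ \mathrm{PS\ points\ per\ km}^2$$

which rounds to the quoted value of about 49 PS points per km<sup>2</sup>.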




**Table 2.** *Cont.*

#### **5. Results**

The investigation of this phenomenon through the SAR data processing presents some inherent weaknesses. First, due to the nature of the phenomenon (i.e., repeated small amount of swelling followed by shrinkage without a characteristic monotonic trend), the creation of time series of the velocity or the average vertical displacements per year could not provide us with meaningful conclusions. In addition, it has to be recognized that the ground heave decreases for larger applied vertical stress and, thus, depends strongly on the pressures applied by the structures. Indeed, beyond a threshold stress point (the "swelling pressure"), the addition of moisture in the Nicosia marl does not produce swelling, but generates a collapse strain and settlement. Therefore, it is clear that, in an urban environment where high-rise buildings (applying high vertical stress) coexist with smaller ones (lower vertical stress) and buildings with shallow foundations coexist with buildings founded on piles (in which case the vertical movement of the superstructure is drastically limited), the SAR data should be treated with great caution.

In order to accommodate the non-monotonic evolution of the ground movement, the amplitude of the oscillations of the vertical movement, in the form of the difference between the maximum and minimum displacement observed in the given time period, was calculated for all the PS points. This way, the intensity of the vertical deformation fluctuations (and not their tendency) becomes evident. Secondly, to mitigate the fact that ground swelling depends on the applied vertical stress, the PS points that did not belong to buildings were selected and treated separately as Free Field points and assumed to not be influenced by the loads or the rigidity of the neighboring structures. This process was done manually. More specifically, reflectors such as lighting poles, sidewalks, as well as objects and very small structures in parks and unbuilt areas, were searched and separated from PS points that obviously belonged to buildings.
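The amplitude measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the PS identifiers and the displacement values are all hypothetical, and gaps in a time series are assumed to be marked with `None`. The second function fits an ordinary least-squares slope to show why a velocity-type measure can miss an oscillating movement.

```python
def displacement_amplitude(series_mm):
    """Max minus min displacement (mm) of one PS point, ignoring gaps (None)."""
    values = [v for v in series_mm if v is not None]
    if not values:
        raise ValueError("time series contains no observations")
    return max(values) - min(values)


def linear_trend(times, values):
    """Ordinary least-squares slope of values against times (no gaps allowed)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


# Hypothetical displacement series (mm); not taken from the PanGeo dataset.
ps_series = {
    "PS_A": [0.0, 4.0, 9.5, 3.0, -2.5, 5.0],
    "PS_B": [0.0, 1.0, None, 2.0, -1.0, 0.5],  # one missing acquisition
}
amplitudes = {pid: displacement_amplitude(s) for pid, s in ps_series.items()}

# A purely oscillating point: alternating heave and shrinkage every half year.
times_yr = [0.0, 0.5, 1.0, 1.5, 2.0]
oscillating_mm = [5.0, -5.0, 5.0, -5.0, 5.0]
slope = linear_trend(times_yr, oscillating_mm)           # mm per year
oscillation_amplitude = displacement_amplitude(oscillating_mm)
```

For the oscillating series the fitted slope is zero although the ground moves through a 10 mm range, which is the situation the amplitude measure is meant to capture.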

Figure 5A shows all the PS points in the study area for the period between 16 November 2002 and 30 December 2006. All the PS points were divided into five categories according to the Natural Breaks method. It can be seen that the largest vertical displacement amplitudes, which can be as high as 40 mm, are observed in the areas around Lykavitos and Aglantzia. This observation is in good agreement with the Plasticity Index Thematic Layer (Figure 6A), and both the clay content and Nicosia Marl depth, which are also very unfavorable in these areas (Figure 6C,E). The smallest vertical displacement amplitudes appear in the wider Strovolos area, in the neighborhoods bordering the Pedieos river and in Old Nicosia and its vicinity.

**Figure 5.** Spatial distribution of the vertical displacement amplitude for the period 16 November 2002–30 December 2006: (**A**) All PS points, (**B**) free field PS points.

**Figure 6.** Spatial distribution of the vertical displacement amplitude along with the factors which strongly affect the phenomenon: (**A**,**C**,**E**) PS points, (**B**,**D**,**F**) free field PS points.

A similar but clearer picture is observed in Figure 5B, where the PS Free Field points are isolated and plotted. This map pinpoints the highest values of ground movement in the areas of Lykavitos and Aglantzia. These areas present marl of high plasticity index and clay content (Figure 6B,F). In addition, an increase in the variation in the vertical displacement amplitude is marked when moving to the north part of Palouriotissa. This can be attributed to the effect of the relative shallowness of the Nicosia Marl (Figure 6D).

In order to investigate the vertical displacement amplitudes for the three hydrological years, six new maps were created. It should be stressed that the first hydrological year is not presented due to incomplete remote sensing data. Figure 7A,C,E, in which all the PS points are included, shows large vertical displacement amplitudes, especially in the area around Aglantzia and Lykavitos, reaching values as high as 33 mm. Similar conclusions can be drawn for the Free Field PS points (Figure 7B,D,F). More specifically, the highest values are observed in the area around Aglantzia and Lykavitos, and the lowest ones in Old Nicosia and Strovolos. It is worth stressing that, in the western part of the study area, the PS points with the lower amplitude values in Figure 7A,C,E disappear from Figure 7B,D,F, while Free Field PS points with relatively large movement amplitudes remain. This observation suggests that high vertical displacement amplitudes can occur in this area, but that the observed displacements are small because the PS reflectors are located on high-rise buildings that apply high vertical stresses to the ground.

It should be stressed that the vertical displacement amplitudes are similar both throughout the 4-year examination period (40 mm for the PS points and 35 mm for the Free Field PS points) and for each hydrological year separately (33 mm for the general PS points and 32 mm for the Free Field PS points) (Figure 7). This observation confirms the periodicity of the phenomenon and leads to the conclusion that the main cause of ground volume changes is the seasonal fluctuation in the climatic factors (precipitation, temperature, sunshine).

In order to investigate the periodicity of this phenomenon, rainfall data from two meteorological stations were obtained for the examined time period. As Figure 8 shows, the driest period was between June 2004 and September 2004, as the cumulative amount of rainfall did not exceed 1 mm. On the contrary, the periods November 2002–May 2003 and October 2003–May 2004 presented the highest amount of precipitation. A correlation between three free field PS points (19322, 19126, 17779) and rainfall for the period November 2002–November 2006 can be discerned in Figure 8, despite gaps in the time series. It can be seen that rainy periods are followed shortly by an upward movement of the ground, while a steady downward movement develops during the course of the dry periods. The seasonal changes in the displacement of the PS points lead to the conclusion that the repeated changes in the amount of rainfall in conjunction with the high seasonal temperature variations cause significant changes in the ground moisture inside the active zone, resulting in noticeable volumetric changes in the clay-rich horizons of the Nicosia marl.
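The relation between rainfall and ground movement is read visually from Figure 8 in this study. One possible way to quantify such a relation, shown here only as a hedged sketch and not as the procedure used by the authors, is to compute the Pearson correlation between monthly rainfall and displacement for a range of time lags and to pick the lag with the highest correlation. The monthly series below are synthetic and purely illustrative: the displacement is constructed as a scaled copy of the rainfall delayed by three months, so the expected answer is known in advance.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / sqrt(sxx * syy)


def best_lag(rain, disp, max_lag):
    """Correlate rain[t] with disp[t + lag] for lag = 0..max_lag (in samples)."""
    scores = {}
    for lag in range(max_lag + 1):
        scores[lag] = pearson(rain[: len(rain) - lag], disp[lag:])
    return max(scores, key=scores.get), scores


# Synthetic monthly rainfall (mm); hypothetical values, not station data.
rain_mm = [60, 75, 50, 40, 20, 5, 0, 0, 1, 10, 30, 55,
           70, 65, 45, 35, 15, 2, 0, 0, 3, 12, 40, 60]
# Synthetic displacement (mm): rainfall scaled by 0.1 and delayed by 3 months.
disp_mm = [0.0, 0.0, 0.0] + [0.1 * r for r in rain_mm[:-3]]

lag_months, scores = best_lag(rain_mm, disp_mm, max_lag=6)
```

With real PS time series, gaps and irregular acquisition dates would first have to be handled (for example by resampling to a common monthly grid), which this sketch does not attempt.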

**Figure 7.** Spatial distribution of the vertical displacement amplitude for the 3 hydrological years: (**A**,**C**,**E**) PS points, (**B**,**D**,**F**) free field PS points.

**Figure 8.** Time series of displacement (right axis) and rainfall (left axis) for three characteristic PS points.

#### *5.1. Correlation of the Factors Controlling Ground Swelling/Shrinkage with the Remote Sensing Data*

The results of the spatial analysis carried out in the GIS environment are presented below and concern the calculation of the frequency of classes of vertical ground displacement amplitude in correlation with the main causal factors. For this purpose, the PS points were divided into five classes according to the movement amplitude. Figures 9–13 show the results of the statistical analysis in the form of diagrams, accounting both for all PS points and for only the Free Field PS points. It is noted that the relative density is estimated by relating the number of PS points in each individual class to the area that the class occupies.
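The text does not give an explicit formula for the relative density, so the following is only one plausible reading, stated here as an assumption: for a given amplitude class, the number of PS points falling in each class of a causal factor is divided by the area of that factor class, and the resulting densities are expressed as percentages of their sum. The function name, class labels, counts and areas below are hypothetical.

```python
def relative_density(counts, areas_km2):
    """Percentage share of area-normalised PS density per factor class.

    counts    -- number of PS points of one amplitude class in each factor class
    areas_km2 -- map area occupied by each factor class (same keys as counts)
    """
    densities = {k: counts[k] / areas_km2[k] for k in counts}
    total = sum(densities.values())
    return {k: 100.0 * d / total for k, d in densities.items()}


# Hypothetical counts of high-amplitude points per clay-content class,
# and hypothetical areas of those classes (not values from the study).
high_amplitude_counts = {"<=26%": 10, "26-40%": 40, ">40%": 30}
class_areas_km2 = {"<=26%": 10.0, "26-40%": 40.0, ">40%": 10.0}

rd = relative_density(high_amplitude_counts, class_areas_km2)
```

In this toy case the middle class holds the most points (the highest frequency), but the class with more than 40% clay has the highest relative density, because its points are concentrated in a much smaller area. This is the kind of difference between frequency and relative density that the diagrams distinguish.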

Regarding the clay content, it was observed that the study area is covered mainly by khaki marl with a clay percentage of 26–40%. Moreover, the area in which the khaki marl contains more than 40% of clay occupies 29% of the map. The highest frequency of small movement amplitude (blue bar) PS points (824) and large movement amplitude (red bar) PS points (64) occurs in the areas where the clay content is 26–40% (Figure 9a). A similar picture is observed for the Free Field PS points (Figure 9b). After the calculation of the relative density, it was observed that the highest rate of low displacement amplitude PS points occurs in the area with clay content ≤26% for both the Free Field PS points and the general PS points (Figure 9c,d).

Significant variation in the high vertical displacement amplitude class between the results of the Free Field PS points and general PS points was observed. More specifically, the relative density of Free Field red PS points (42.62%) is higher in the area with clay content >40%, while the relative density of the general red PS points is the lowest (14.29%) in the same area.

**Figure 9.** Statistical frequency (**a**,**b**) and relative density (**c**,**d**) of the vertical displacement amplitude points as a function of the khaki marl clay content for total and free field PS points.

**Figure 11.** Statistical frequency (**a**,**b**) and relative density (**c**,**d**) of the vertical displacement amplitude points as a function of the khaki marl Plasticity Index for total and free field PS points.

**Figure 12.** Statistical frequency (**a**,**b**) and relative density (**c**,**d**) of the vertical displacement amplitude points as a function of the khaki marl thickness for total and free field PS points.

**Figure 13.** Statistical frequency (**a**,**b**) and relative density (**c**,**d**) of the vertical displacement amplitude points as a function of the khaki marl upper boundary depth for total and free field PS points.

From Figure 9d, it can be seen that there is a strong correlation between clay content and ground movement amplitude. For example, the relative density of the highest amplitude free field PS points (red bar) increases as the clay content increases, while that of the lowest amplitude points (blue bar) decreases as the clay content increases. This is not observed when all PS points are considered (Figure 9c). This fact clearly demonstrates the necessity of removing the PS points that do not belong to the free field from the statistical dataset. In contrast to the clay content, a clear trend is not observed for the montmorillonite content (Figure 10d). For example, the relative density for both the high amplitude and the low amplitude movement varies non-monotonically with increasing montmorillonite content. The same is true for the marl thickness inside the active zone (Figure 12).

The plasticity index gives an even stronger trend than the clay content (Figure 11d) if only the free-field PS points are considered. The relative density of the highest amplitude displacement rises from 6% to 66% as PI rises to the range 58 to 79. As in the previous cases, if non free-field points are included in the statistical analysis, the correlation with PI is much weaker.

In addition, the depth of the upper boundary of the Nicosia marl (Figure 13d) gives a correlation as strong as that of the clay content. Generally, the relative density of the highest amplitude displacement decreases from 48.9% to 23% as the Nicosia marl depth increases up to 8 m. On the contrary, the relative density of the lower amplitude displacement increases in proportion to the Nicosia marl depth. Comparing the figures of the total points (Figure 13a,c) with those of the free field points (Figure 13b,d), it is obvious that in the latter case the correlation with the Nicosia marl depth is much clearer.

#### *5.2. Lasso Regression Analysis*

In the previous section, the factors controlling ground swelling/shrinkage were investigated by considering each time a single causal variable without taking into account the possible interplay between them. In this section, we fit a linear equation in which all the factors that play an important role in the activation of this phenomenon (Plasticity Index, Clay content, Montmorillonite content, depth of the Nicosia marl) were combined. The Nicosia marl thickness inside the active zone was not considered in this equation since it is firmly correlated with the upper boundary depth (i.e., the thickness of the non-expansive soil cover). After the normalization of each causal variable based on its minimum to maximum range, the linear equation was formulated as follows:

$$AVM_{total,free} = C_0 + C_1 \frac{PI - PI_{min}}{PI_{max} - PI_{min}} + C_2 \frac{CL - CL_{min}}{CL_{max} - CL_{min}} + C_3 \frac{MN - MN_{min}}{MN_{max} - MN_{min}} + C_4 \frac{DP - DP_{min}}{DP_{max} - DP_{min}} \tag{1}$$

where *AVM* is the amplitude of vertical movement for either total or free field PS points (in mm), *PI* is the Plasticity Index (%), *CL* is the Clay content (%), *MN* is the Montmorillonite content (%) and *DP* is the upper boundary depth of the Nicosia marl layer (in m).
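The min-max normalization and the evaluation of Equation (1) can be sketched as follows. This is an illustrative snippet under stated assumptions: the function names are ours, and both the sample Plasticity Index values and the coefficients are hypothetical, not the fitted values of the study.

```python
def min_max_normalise(values):
    """Scale values to the range [0, 1] using their own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant variable")
    return [(v - lo) / (hi - lo) for v in values]


def avm(coeffs, pi_n, cl_n, mn_n, dp_n):
    """Evaluate Equation (1) for already normalised PI, CL, MN and DP."""
    c0, c1, c2, c3, c4 = coeffs
    return c0 + c1 * pi_n + c2 * cl_n + c3 * mn_n + c4 * dp_n


# Hypothetical Plasticity Index values (%) at four locations.
pi_values = [10.0, 29.0, 48.0, 86.0]
pi_norm = min_max_normalise(pi_values)

# Hypothetical coefficients C0..C4 (mm), chosen only to illustrate the formula.
example_coeffs = (5.0, 20.0, 10.0, 2.0, -8.0)
# Highest PI, clay and montmorillonite content, marl at the shallowest depth.
amplitude_mm = avm(example_coeffs, pi_n=1.0, cl_n=1.0, mn_n=1.0, dp_n=0.0)
```

Because every normalised variable lies between 0 and 1, the magnitudes of the coefficients are directly comparable, which is what makes the ranking of the causal factors meaningful.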

The main goal of the linear equation was to investigate the degree of influence of the causal factors by ranking them according to their importance. To achieve this, the Lasso regression analysis was performed using the pertinent built-in function in MATLAB. In this method, a penalty term, which is the sum of the absolute values of the linear equation coefficients Ck (k = 1,2 ... n) times a user-specified multiplier λ (i.e., λΣ|Ck|), is inserted in the objective function [31]. As λ increases, the coefficients of the equation become zero one after another and the corresponding causal factors are effectively removed from the equation. The larger the λ for which Ck becomes zero (threshold value λ<sub>T</sub>), the greater the importance of the respective variable.
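The ranking idea can be illustrated with a small, self-contained sketch. It is not the MATLAB routine used in the study: it is a plain coordinate-descent Lasso written for illustration, and it assumes the objective (1/2n)Σ(residual²) + λΣ|Ck| with an unpenalised intercept, a scaling that the text does not state. The data are synthetic (a noise-free factorial design with made-up coefficients), and the λ grid is arbitrary.

```python
from itertools import product


def soft_threshold(rho, lam):
    """Shrink rho towards zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=200, tol=1e-12):
    """Minimise (1/2n)*sum(residual^2) + lam*sum(|C_k|); intercept unpenalised."""
    n, p = len(y), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - x_mean[j] for row in X] for j in range(p)]  # centred columns
    resid = [v - y_mean for v in y]
    coef = [0.0] * p
    for _ in range(n_sweeps):
        max_change = 0.0
        for j, col in enumerate(cols):
            z = sum(v * v for v in col) / n
            if z == 0.0:  # constant column carries no information
                continue
            rho = sum(c * (r + c * coef[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / z
            delta = new - coef[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                coef[j] = new
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return intercept, coef


def threshold_lambdas(X, y, names, lambdas):
    """Smallest lambda on the grid at which each coefficient is exactly zero."""
    thresholds = {}
    for lam in sorted(lambdas):
        _, coef = lasso_coordinate_descent(X, y, lam)
        for name, c in zip(names, coef):
            if c == 0.0 and name not in thresholds:
                thresholds[name] = lam
    return thresholds


# Synthetic, noise-free data: normalised variables take the values 0 or 1,
# and the "true" coefficients are invented for this example only.
names = ["PI", "CL", "MN", "DP"]
X = [list(map(float, row)) for row in product([0, 1], repeat=4)]
true_c = [20.0, 8.0, 2.0, -12.0]
y = [10.0 + sum(c * v for c, v in zip(true_c, row)) for row in X]

lambdas = [0.5 * k for k in range(1, 13)]  # arbitrary grid: 0.5, 1.0, ..., 6.0
lam_t = threshold_lambdas(X, y, names, lambdas)
lam_max = max(lam_t.values())
importance = {name: lam_t[name] / lam_max for name in names}
ranking = sorted(names, key=importance.get, reverse=True)
```

In this toy design the variable with the largest absolute coefficient survives to the largest λ, so normalising each threshold by the largest one yields a relative importance between 0 and 1. With correlated real-world variables the thresholds no longer follow the coefficient magnitudes so simply, which is why the full regularisation path has to be traced.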

Figure 14 presents the evolution of the coefficients Ck with increasing λ. It can be seen that the plasticity index is the last to drop out of the linear equation and is thus judged as the most important parameter controlling the phenomenon of ground heave/settlement. Figure 15 shows the λ<sub>T</sub> values (values at which Ck = 0) normalized with respect to the λ<sub>T</sub> for the plasticity index, thus quantifying the relative importance of each variable. The plasticity index turns out to be by far the most important variable, followed by the marl depth and the clay content.

**Figure 14.** Trace plots of coefficients from Lasso regression for free field PS points.

**Figure 15.** Rank of relative importance of causal variables based on Lasso regression for free field PS points.

#### **6. Discussion**

The investigation of this phenomenon through the EO technologies presents some inherent difficulties. Firstly, the creation of time series with the average vertical displacements per year could not provide us with meaningful conclusions due to the repeated small amount of swelling followed by shrinkage on a yearly basis. Secondly, the magnitude of the vertical displacement is affected by the pressures applied by the structures. Therefore, in an urban environment where high-rise buildings coexist with smaller ones and buildings with shallow foundations coexist with buildings founded on piles (in which case the vertical movement of the superstructure is drastically limited), the SAR data should be treated with great caution. In order to overcome the above weaknesses, this research relied on estimates of the vertical movement amplitude, in the form of the difference between the maximum and minimum displacement observed in a given time period, calculated for each PS point. This way, the intensity of the vertical deformation fluctuations (and not their tendency) becomes evident. Furthermore, the PS points that did not belong to buildings were selected and treated separately as free field points, aiming to mitigate the fact that ground movements depend on the applied vertical stress.

Based on the Interferometric Synthetic Aperture Radar (InSAR) data from the PanGeo project between 16 November 2002 and 30 December 2006, it was found that the seasonal fluctuations of the ground surface are of the order of a few tens of millimeters, with a maximum value of the order of 30–35 mm. This amplitude is clearly larger than the 5 mm to 10 mm observed in regions of France [25,26], signifying the highly expansive nature of the Nicosia marl in conjunction with the arid climate of Cyprus. Nonetheless, the value of 30–35 mm is not far from the 20–25 mm amplitude observed in eastern parts of Paris where severe damage to buildings due to soil swelling/shrinkage frequently occurs [27]. It should be stressed that the vertical displacement amplitudes are similar both throughout the examined 4-year period and for each hydrological year separately. This observation confirms the periodicity of the phenomenon and indicates that the main cause of ground volume changes is the seasonal fluctuation of ground moisture inside the active zone of Nicosia marl due to the climatic factors (precipitation, temperature, sunlight, wind). Furthermore, as presented in Figure 8, the ground heave follows the periods of heavy precipitation with a delay of two to four months. This observation suggests that ground wetting in Nicosia marl happens slowly and abrupt activations are not expected (in the absence of other triggering factors). Nonetheless, a monotonic trend in the vertical displacement (subsidence) can also be noted in Figure 8, which extends from the middle of the 2nd hydrological year to the end of the examined time period. Given that the water table in the study area is not exploited and lies very deep in the largest part (and, thus, is largely unaffected by the atmospheric factors), this non-periodic subsidence can be attributed to the differences between each year in the total amount of precipitation that occurs during the months in which the evapotranspiration is small (i.e., the winter months).

From the geotechnical point of view, the vertical displacement amplitudes appear to correlate well with the thematic layers of plasticity index and clay content. A similarly strong correlation is observed between the average vertical displacement rate and the clay content in the upper 1 m of the soil profile in large regions of Australia [57]. These observations are in good agreement with numerous laboratory studies, which correlate the swelling pressure (the pressure required to hold the soil, or restore the soil, to its initial void ratio when given access to water) to soil plasticity and clay content [58,59]. Nonetheless, future research could also examine other factors that are known to affect this phenomenon, such as liquid limit and dry unit weight.

Statistical analysis shows that strong and meaningful correlations can be established between ground movement amplitude and the causal factors (plasticity index, clay content, montmorillonite content, marl thickness inside the active zone and depth of its upper boundary), provided that PS points that are located on buildings are excluded and only "free-field" PS points are retained in the dataset. Lasso regression reveals that the most important variable controlling the amplitude of ground heave and settlement is the marl's plasticity index, followed by the depth of the upper boundary of the marl layer (i.e., the thickness of the non-expansive soil cover) and the clay content. Yet, it must be pointed out that this study relied on geotechnical data from boreholes that are sparsely and unevenly distributed. The present findings could be reinforced by applying the proposed approach to a region with a denser and more evenly spaced cloud of points of geotechnical data.

#### **7. Conclusions**

The research work presented herein aimed to develop a methodological approach for investigating the phenomenon of ground swelling/shrinkage due to seasonal moisture changes in an expansive clay formation. This methodology was based on the integration of EO technologies (InSAR techniques), pre-existing local data and experience, conventional geotechnical research and the use of statistical analysis within a GIS environment. The study area is the city of Nicosia, in which an expansive soil formation, called Nicosia marl, dominates the surficial geology. The spatial correlation of seasonal heave/settlement of Nicosia marl with its physical characteristics was investigated using available geological/geotechnical data and InSAR ground motion measurements. It was found that the seasonal fluctuations of the ground surface are of the order of a few tens of millimeters, with a maximum value of the order of 30–35 mm. Statistical analysis shows that there is significant correlation between ground movement amplitude and all causal factors examined, namely plasticity index, clay content, montmorillonite content, marl thickness and thickness of non-expansive cover, provided that PS points that are located on buildings are excluded and only "free-field" PS points are retained in the dataset. Lasso regression reveals that the most important variable controlling the amplitude of ground heave/settlement is the marl's plasticity index, followed by the thickness of the non-expansive cover.

This study presented a new methodological approach for the investigation of ground movement that lacks a characteristic trend (unlike, for instance, land subsidence due to the overexploitation of an aquifer), highlighting the important role of EO technologies as a useful tool for the assessment of ground deformation. The use of the vertical deformation amplitude instead of the velocity can limit the weaknesses of PS data and, provided that the quality of the geotechnical thematic layers is high, the results can be rewarding. The methods and findings of the present study could be applied in other regions around the world that suffer from the adverse effects of expansive soils, helping the state authorities and other policy makers in planning future urban development and the maintenance of existing infrastructure.

**Author Contributions:** Conceptualization, P.T. and D.L.; methodology, P.T. and D.L.; software, N.K.; validation, P.T.; formal analysis, P.T. and N.K.; investigation, P.T. and D.L.; data curation, P.T.; writing—original draft preparation, P.T.; writing—review and editing, D.L. and N.K.; supervision, D.L.; project administration, D.L.; funding acquisition, D.L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was part of the research project EXPASOL funded by the Research & Innovation Foundation of Cyprus under contract INTEGRATED/0916/049.

**Data Availability Statement:** The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Kinematics of Active Landslides in Achaia (Peloponnese, Greece) through InSAR Time Series Analysis and Relation to Rainfall Patterns**

**Varvara Tsironi <sup>1,2,\*</sup>, Athanassios Ganas <sup>1</sup>, Ioannis Karamitros <sup>1</sup>, Eirini Efstathiou <sup>1</sup>, Ioannis Koukouvelas <sup>2</sup> and Efthimios Sokos <sup>2</sup>**


**Abstract:** We studied the kinematic behaviour of active landslides at several localities in the area of Panachaikon Mountain, Achaia (Peloponnese, Greece) using Sentinel (C-band) InSAR time series analysis. We processed LiCSAR interferograms using the SBAS tool, and we obtained average displacement maps for the period 2016–2021. We found that the maximum displacement rate of each landslide is located at about its center. The average E-W velocity of the Krini landslide is ~3 cm/year (toward the east) and 0.6 cm/year downward. The line-of-sight (LOS) velocity of the landslide (descending orbit) agrees with that of a co-located GNSS station to within ±3 mm/yr. Our results also suggest a correlation between rainfall and landslide motion. For the Krini landslide, a cross-correlation analysis of our data suggests that the mean time lag was 13.5 days between the maximum seasonal rainfall and the change in the LOS displacement rate. We also found that the amount of total seasonal rainfall controls the increase in the displacement rate, as 40–550% changes in the displacement rate of the Krini landslide were detected following a seasonal maximum of rainfall values at the nearby meteorological station of Kato Vlassia. According to our results, it seems that a large part of this mountainous region of Achaia suffers from slope instability that is manifested in various degrees of ground displacement, greatly affecting its morphological features and inhabited areas.

**Keywords:** InSAR; GNSS; landslide; rainfall; Achaia; Greece

#### **1. Introduction**

The movement of an active landslide can be efficiently captured by InSAR measurements [1–4]. Through InSAR time series analysis, the velocity of the ground movement can be measured with an accuracy of a few mm/yr (e.g., [5]). This remote sensing technique can provide an accurate identification of the area affected by active landslides and can also assist hillslope monitoring by detecting potential slope failures. We use the term "active landslide" for landslide bodies that have a long history of affecting the area above and around them, especially the road network, which requires constant maintenance. According to [6], an active landslide is one that is currently moving, including first-time movements and reactivations.

The InSAR time series method has been applied successfully in a range of landslide studies, not only to locate landslide bodies, but also to identify spatial-temporal patterns of movement [7]. The analysis of the InSAR-generated displacement time series has the potential to identify periods of accelerated landslide deformation and to evaluate possible correlations with different triggers (rainfall, earthquakes). Large landslides and debris flows form a frequently occurring geohazard posing significant risk to lives and livelihoods.

**Citation:** Tsironi, V.; Ganas, A.; Karamitros, I.; Efstathiou, E.; Koukouvelas, I.; Sokos, E. Kinematics of Active Landslides in Achaia (Peloponnese, Greece) through InSAR Time Series Analysis and Relation to Rainfall Patterns. *Remote Sens.* **2022**, *14*, 844. https://doi.org/10.3390/rs14040844

Academic Editor: Fulong Chen

Received: 26 December 2021 Accepted: 4 February 2022 Published: 11 February 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Landslide phenomena have previously been recognized and mapped in Greek territory using InSAR techniques. Recently, Vassilakis et al. [8] used interferometric processing of TerraSAR-X data to estimate ground displacement rates of a large landslide inside a coal mine in western Macedonia; Kontoes et al. [4] produced susceptibility maps for the landslides in western Greece using ERS/ENVISAT data; Elias et al. [5] used ERS/ENVISAT/Sentinel-1 data to monitor slope stability upstream from an artificial dam; Tsangaratos et al. [9] examined the identification of mass movements due to landslides on Lefkada island with InSAR time series analysis using the TERRAFIRMA data (produced by the persistent scatterers interferometry (PSI) technique); and Kyriou & Nikolakopoulos [10] mapped the landslides south of Patras using interferometric processing (offset-tracking method). Several other studies of critical locations also used InSAR technology [11,12].

The Achaia prefecture in NW Peloponnese (Greece) is an area strongly affected by localized subsidence and landslides [7,13–18], especially the mountainous area near the villages of Krini, Pititsa and Sella (Figure 1; [15,19]). These villages are located on the north-facing slopes of Panachaiko mountain (maximum elevation 1926 m; Figure 1a), and all of them are affected by active landslide phenomena [20]. The NW Peloponnese shows spatially and temporally diverse climatic conditions across its area, mostly due to irregular topography and different atmospheric circulation patterns. Overall, the climate is typically Mediterranean and exhibits large seasonal variations, with mild, wet winters and hot, dry summers [21]. The wet period usually extends from October through May [17].

In this study we used Sentinel-1 (C-band) InSAR displacement data to map the active landslides and monitor their kinematics. We compared our InSAR results to a co-located GNSS station at one of the sites. We then cross-correlated our InSAR time series to the rainfall data from a nearby meteorological station of NOA. We demonstrated a robust correlation between maximum rainfall and an increase in displacement rates.

#### **2. Study Area**

The study area is close to the Gulf of Corinth, which is considered a paradigm of an active rift system in Greece (Figure 1; [22]) and is among the fastest extending continental regions in the world [15,23–25]. This rift was formed by normal slip on large, E-W striking fault segments that extend the crust of central Greece in a N-S direction. The length of the Corinth rift is 130 km, and the width is 20–40 km. The maximum water depth is ~900 m, and the major peaks of the mountains around the Gulf of Corinth are ~2500 m high. The south coast of the Corinth rift is uplifting, whereas the north part is subsiding [15,22,25–28]. From space geodesy data (GNSS) we know that the Peloponnese (southern part) moves faster toward the southwest than the Greek mainland (with respect to "stable" Europe; [23,25,29]). The net result of this movement is that these two areas move away from each other at an average speed of 1 cm/year, with the rate increasing from east to west [23,24]. The onshore active faults have normal kinematics, with the main active structures being north-dipping faults visible along the southern coast of the Gulf of Corinth, while the rest are located offshore in the central part of the rift [29–32]. Moreover, the study area has rugged relief, several rivers flowing in the general N-S direction (Figure 1), many narrow valleys and numerous other erosional landforms such as triangular faceted spurs on the footwall of active faults [33–36]. Moderate to strong earthquakes occur frequently, inducing additional geological phenomena such as slow and continuous ground displacements in the area [13,15,37].

The area of Achaia specifically has been the focus of many studies of landslide events and vulnerability in the last decade [13,14,21,37–44]. Lithology is one of the main controlling parameters for landslide occurrence in this area [38,39]. The highest density of landslides occurs in Pliocene and Pleistocene fluvio-terrestrial and clastic flysch formations [39]. The most common triggering mechanisms are seismicity, slope steepness, heavily fractured rocks in the source areas and heavy rainfall [37,45]. The artificial and natural parts of the terrain slopes along a road, as well as the hydrographic axes of river networks, are also considered important factors for landslide manifestation [39,46]. From these common factors, Koukis et al. [13] proposed two main triggering mechanisms for large mass movements in the area, i.e., excessive rainfall, which results in high pore pressure in the rocks, and strong earthquakes, which result in dynamic loading conditions at the failure surface. Therefore, the increased permeability of the rock formations produced by earthquakes (due to strong ground motion and/or fissuring), together with events or periods of intense rainfall, are the main factors controlling the intensity of landslide events.

The bedrock geology of Achaia comprises mainly carbonates (Figure 1b). According to IGME (Institute of Geology and Mineral Exploration) maps [47,48], the study area is part of the Olonos–Pindos geotectonic zone of the Hellenides and mainly comprises limestone, schists and cherts of Triassic–Jurassic age, covered by thinly bedded limestones with radiolarites of Cretaceous age. A transition zone overlies the radiolarites, including limestones, shales, cherts and marls, leading to typical flysch sequence sediments of the Upper Eocene [13]. The syn-rift rocks are Pliocene–Quaternary sedimentary rocks such as marls, sandstones, conglomerates and alluvial fan deposits.

In this paper we focus on the northern and eastern slopes of Panachaiko mountain, with emphasis on well-known landslide activity near the villages of Krini and Pititsa and the monastery of Agia Eleoussa (Figure 1a). In particular, the area around Krini (elevation 775 m; Figure 1b) mostly consists of Upper Cretaceous–Palaeocene flysch. The flysch mainly includes beds of sandstone and (rarely) marls, alternating with thin pelagic limestone, about 50 m thick [47,48]. In 1985, IGME investigated a landslide observed in the southern part of Krini. According to the field report [49], the sliding area close to the village consists of Neogene clay and marl sediments overlain by weathered material about 1.5–2 m thick. The Krini landslide is a large earth flow with a mainly E-W slope-parallel displacement trend, which in its eastern (lower) part converts to a translational slide (based on the categorization proposed by [6,50,51]). This is possibly due to changes in lithology: the geological map (Figure 1b) shows that the landslide extends from flysch and its weathered mantle (upper part, toward the west) to limestone. The Krini landslide also displays many characteristics of an earth flow as described by [51], with long periods of relatively slow to very slow movement alternating with more rapid surges. This type of mass movement has also allowed settlements (such as the village of Krini) and roads to be built on top of the landslide, a common occurrence in many earth flows around the world [51,52]. During 2021 we visited this area and observed numerous landslides near the village, some of them affecting parts of the road network. Apart from the large slope failures, we also observed many small-scale shallow landslides that affect the weathered cover of both flysch and Neogene (marl and clay) units.

The village of Pititsa (elevation 650 m; Figure 1b) is also located within the Upper Cretaceous–Palaeocene flysch, but in this locality the flysch comprises sandstone alternations with limestone intercalations and marls. North and east of the village, the flysch is overlain by Pliocene–Pleistocene blue marls and sandy clays. The Pititsa landslide can also be categorized as an earth flow with the same characteristics as the main body of the Krini landslide, but it mostly maintains its lithological homogeneity throughout. The main displacement here is also E-W. To the west of Pititsa, the village of Sella is also affected by active landslides; however, it is built upon terrestrial deposits (Figure 1b; [47,48]).

**Figure 1.** (**a**) Shaded relief map of northeastern Achaia showing the area of interest (box), including the villages where landslides have occurred; (**b**) simplified geological map [47,48]. The landslide of the village of Krini is located on flysch, and the Pititsa landslide is located on flysch and on Pliocene deposits. Active faults are shown with black lines (ticks on the downthrown side), and red-dashed outlines represent the boundaries of the landslides.

#### **3. Data, Methods and Results**

#### *3.1. SAR Data Processing*

The InSAR time series analysis was carried out with LiCSBAS, an open-source package that integrates with LiCSAR products [53–55]. The processing chain and a schematic description of the data and methods used are shown in the flow chart of Figure S2. LiCSAR produces interferograms (wrapped and unwrapped) from SLC (single look complex) data of Sentinel-1 acquisitions. The interferograms were multilooked with a factor of 20 × 4 in range and azimuth (46 × 56 m spacing) and spatially filtered with a GAMMA adaptive power spectrum filter with an alpha value of 1.0 to decrease noise [56,57]. LiCSAR unwraps the phase in two dimensions with the SNAPHU software, which uses a statistical cost technique [58]. For our time series analysis, we employed unwrapped interferograms and coherence images that were geocoded with a pixel spacing of 0.001 degree (~100 m) and converted to GeoTIFF format. We used the frames with ID 080D\_05196\_131104 for the descending track and 175A\_05184\_121313 for the ascending track. For the ascending and descending tracks (see Figure S1), we used 346 and 317 available interferograms from the LiCSAR portal, respectively (formed from 99 and 120 SAR images, respectively; see Table S1 for image dates).

The areas of interest are in the center-east of the frames ID 175A and 080D (Figure S1). The time span was 2015–2021 for the ascending track and 2016–2021 for the descending track (see Table S1). Before the main processing, a tropospheric correction was applied to the unwrapped products using data from the generic atmospheric correction online service for InSAR (GACOS) ([59]; pixel size 10 × 10 km). GACOS utilizes the iterative tropospheric decomposition (ITD) model to separate stratified and turbulent signals from tropospheric total delays. The tropospheric corrections were used in GeoTIFF format. To evaluate the success of the correction, the standard deviation (STD) of the unwrapped phase before and after the GACOS correction and the reduction rate of the STD were calculated with the LiCSBAS software [53]. The STD of the unwrapped phase of each interferogram was typically lower after the correction (see correlation diagrams in Figures 2 and 3; see also the reduction of STD through the interferograms in Figures S3 and S4), indicating that this tropospheric correction approach greatly reduced tropospheric noise. The diagrams (Figures 2 and 3) show that the reduction of STD for the ascending orbit dataset (reduction rate 29.57%) is lower than that for the descending orbit (36.29%).
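The reduction rate discussed above can be illustrated with a minimal sketch. It assumes the rate is defined per interferogram as (STD before − STD after) / STD before and then averaged; the function names and the sample values are hypothetical and are not taken from LiCSBAS or from the study data.

```python
from statistics import mean


def reduction_rate(std_before, std_after):
    """Fractional STD reduction for one interferogram (positive = improvement)."""
    if std_before <= 0:
        raise ValueError("std_before must be positive")
    return (std_before - std_after) / std_before


def summarize(stds_before, stds_after):
    """Return (mean STD before, mean STD after, mean of per-interferogram rates)."""
    rates = [reduction_rate(b, a) for b, a in zip(stds_before, stds_after)]
    return mean(stds_before), mean(stds_after), mean(rates)


# Hypothetical phase STDs in radians for three interferograms (not study data).
before = [8.0, 6.0, 4.0]
after = [4.0, 4.5, 4.0]
mean_before, mean_after, mean_rate = summarize(before, after)
rate_of_means = reduction_rate(mean_before, mean_after)
```

In this toy case the mean of the individual rates (25%) differs from the rate computed from the mean STDs (about 31%). The same effect may explain why the average STD values quoted in the captions of Figures 2 and 3 do not reproduce the reported mean reduction rates exactly.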

**Figure 2.** Correlation diagrams showing (**a**) the standard deviation of unwrapped phases before and after the GACOS corrections and (**b**) their reduction rates, for ascending track 175. The gray line denotes the 1:1 line. The reduction of STD for the ascending orbit is moderate. The STD decreased from 6.35 rad to 4.41 rad on average, with a mean reduction rate of 29.57%.

**Figure 3.** Correlation diagrams showing (**a**) the standard deviation of unwrapped phases before and after the GACOS corrections and (**b**) their reduction rates, for descending track 80. The gray line denotes the 1:1 line. The STD decreased from 4.36 rad to 2.61 rad on average, with a mean reduction rate of 36.29%.

After the tropospheric correction, we ran the quality check function of LiCSBAS. Statistics such as average phase coherence (we applied a 0.3 threshold) and the percentage of valid unwrapped pixels are used to identify bad data. Unwrapped data may contain unwrapping errors, which can cause serious errors in the derived time series and should be eliminated or corrected prior to use. We therefore performed network refinement by interferometric loop closure [60] and removed 100 interferograms (95 from the ascending and 5 from the descending orbit) from further processing. After the quality checks, we applied the small baseline (SB) inversion to the network of interferograms (Figure 4). We used the NSBAS approach [61] to produce a more realistic displacement time series even with a disconnected network (Figure 4). The least-squares approach was then used to derive the mean displacement velocity (LOS) from the cumulative displacements. We selected the SBAS technique because its use of distributed scatterers works well in vegetated and rural areas such as the mountainous Achaia region.
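The idea behind the loop closure check can be sketched as follows. For three acquisition dates, the unwrapped phases of interferograms 1-2 and 2-3 should sum to that of interferogram 1-3, so the closure phase should be close to zero, while an unwrapping error shows up as an offset near a multiple of 2π. This is only an illustration: the function names, sample phases and threshold value are assumptions, not the LiCSBAS implementation.

```python
import math


def loop_phase(phi_12, phi_23, phi_13):
    """Closure phase (rad) of an unwrapped interferogram triplet at one pixel."""
    return phi_12 + phi_23 - phi_13


def loop_rms(unw_12, unw_23, unw_13):
    """RMS closure phase (rad) over all pixels of a triplet."""
    closures = [loop_phase(a, b, c) for a, b, c in zip(unw_12, unw_23, unw_13)]
    return math.sqrt(sum(c * c for c in closures) / len(closures))


# Hypothetical unwrapped phases (rad) at three pixels for dates 1-2, 2-3 and 1-3.
unw_12 = [1.0, 2.0, -0.5]
unw_23 = [0.5, -1.0, 0.25]
unw_13 = [1.5, 1.0, -0.25]
THRESHOLD_RAD = 1.5  # assumed cut-off, chosen here only for illustration

consistent_rms = loop_rms(unw_12, unw_23, unw_13)
# Simulate an unwrapping error: one interferogram is off by a full cycle.
bad_23 = [p + 2 * math.pi for p in unw_23]
faulty_rms = loop_rms(unw_12, bad_23, unw_13)
```

A triplet whose RMS closure phase exceeds the threshold points to at least one problematic interferogram. Comparing several loops that share an interferogram is what allows the faulty one to be singled out.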

**Figure 4.** Perpendicular baseline configuration and network of the 346 small baseline interferograms formed from 99 Sentinel-1A images (ascending track, top panel) and network of the 317 SB interferograms formed from 120 Sentinel-1A images used in this study (descending track, bottom panel). Vertical lines indicate the network gaps.

We also used the percentile bootstrap method to calculate the LOS velocity's standard deviation from cumulative displacements. The next processing step was the masking of noisy pixels in the time series and the spatiotemporal filtering of time series. Finally, topography-correlated components (linear with elevation) were subtracted with deramping (see Figures S5 and S6), and we obtained the LOS velocity maps for each satellite track (Figure 5).
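The velocity estimate and its bootstrap uncertainty can be sketched for a single pixel as below. The sketch fits a least-squares line to cumulative displacements, resamples the epochs with replacement, and takes half the width of the 16th to 84th percentile range of the resampled velocities as the standard deviation. The percentile choice, the number of resamples and the synthetic series are assumptions made for illustration.

```python
import random


def ls_velocity(t_years, disp_mm):
    """Least-squares slope (mm/yr) of displacement versus time."""
    n = len(t_years)
    t_mean = sum(t_years) / n
    d_mean = sum(disp_mm) / n
    sxx = sum((t - t_mean) ** 2 for t in t_years)
    sxy = sum((t - t_mean) * (d - d_mean) for t, d in zip(t_years, disp_mm))
    return sxy / sxx


def bootstrap_velocity_std(t_years, disp_mm, n_boot=200, seed=0):
    """Half-width of the 16th-84th percentile range of resampled velocities."""
    rng = random.Random(seed)
    n = len(t_years)
    velocities = []
    while len(velocities) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ts = [t_years[i] for i in idx]
        if min(ts) == max(ts):
            continue  # skip degenerate resamples that contain a single epoch
        velocities.append(ls_velocity(ts, [disp_mm[i] for i in idx]))
    velocities.sort()
    low = velocities[int(0.16 * n_boot)]
    high = velocities[int(0.84 * n_boot)]
    return (high - low) / 2.0


# Hypothetical cumulative LOS displacements: a 25 mm/yr trend plus a small wiggle.
t = [i / 10.0 for i in range(30)]
d = [25.0 * ti + (1.0 if i % 2 else -1.0) for i, ti in enumerate(t)]
velocity = ls_velocity(t, d)
velocity_std = bootstrap_velocity_std(t, d)
```

The fixed seed only makes the sketch reproducible. In practice the estimate would be computed for every unmasked pixel.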

**Figure 5.** Displacement maps in the LOS direction for the broader areas of the village of Krini (**a**,**b**) and of the villages of Pititsa and Sella (**c**,**d**). Panels (**a**) and (**c**) show the ascending track; panels (**b**) and (**d**) show the descending track.

After the main processing, we decomposed the LOS displacement data into east-west (E-W) and up-down (U-D) components, neglecting the north-south (N-S) component, to which the LOS measurement is only weakly sensitive because of the near-polar orbit of the Sentinel-1 satellites [62,63].
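With the N-S component neglected, the decomposition reduces to a 2 × 2 linear system per pixel, one equation for each track. The sketch below uses the same projection as Equation (1) in Section 3.3 with the north term set to zero. The incidence and heading angles are nominal values assumed for illustration, and the function names are hypothetical.

```python
import math


def los_coefficients(inc_deg, heading_deg):
    """Return (east, up) sensitivities of LOS, i.e. Equation (1) with d_n = 0."""
    inc = math.radians(inc_deg)
    look = math.radians(heading_deg) - 1.5 * math.pi
    return -math.sin(inc) * math.sin(look), math.cos(inc)


def decompose(los_asc, los_desc, geom_asc, geom_desc):
    """Solve the 2x2 system for (east, up) from ascending and descending LOS."""
    e_a, u_a = los_coefficients(*geom_asc)
    e_d, u_d = los_coefficients(*geom_desc)
    det = e_a * u_d - e_d * u_a
    if abs(det) < 1e-12:
        raise ValueError("viewing geometries are too similar to separate E and U")
    d_e = (los_asc * u_d - los_desc * u_a) / det
    d_u = (e_a * los_desc - e_d * los_asc) / det
    return d_e, d_u


# Assumed nominal geometries (incidence, heading in degrees); not from the study.
ASC = (39.0, -13.0)
DESC = (39.0, -167.0)

# Forward-project a known motion, then recover it.
true_e, true_u = 28.7, -6.0  # mm/yr, eastward and downward
los_a = sum(c * v for c, v in zip(los_coefficients(*ASC), (true_e, true_u)))
los_d = sum(c * v for c, v in zip(los_coefficients(*DESC), (true_e, true_u)))
east, up = decompose(los_a, los_d, ASC, DESC)
```

Under these assumed geometries, eastward and downward motion gives a negative LOS rate on the ascending track and a positive one on the descending track, which is the sign pattern expected for east-moving landslides.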

In the area around the village of Krini (Figure 1), an active landslide is present (Figure 5a,b). More active landslides are located near the villages of Graikas, Pititsa and Sella and near the monastery of Agia Eleoussa (Figure 5c,d). In those locations we measured LOS velocities up to −55 mm/yr for the ascending track (Figure 5a) and up to 30 mm/yr for the descending track (Figure 5b). Note that the LOS displacements are referenced to a local reference point, which is located about 30 km south of the area of interest (see Figure S1 for location) and is considered stable (free of gravitational or tectonic displacements).

We then produced maps of the east-west and up-down components, again neglecting the north-south component. We found that all active landslides were moving east and downward (subsidence, Figure 6a,b). All the affected areas are located on the east-facing slopes and in the foothills of Panachaiko mountain (Figure 1).

**Figure 6.** Displacement-rate maps in mm/yr after the decomposition of the InSAR data: Up (**a**,**c**) and E-W (**b**,**d**) components for the broader areas of Krini and Pititsa villages, respectively; (**e**) landslide outlines for the Krini and Graikas landslides; (**f**) landslide outlines for the monastery of Agia Eleoussa and the villages of Pititsa and Sella.

We traced the boundaries of the active landslides using the InSAR motion data on the E-W component (Figure 6e,f). We note that our motion data refer to the period 2016–2021 (Figure 4), so it is possible that landslides mapped in previous studies in the same area [15,20] have changed their motion patterns. We measured the mean velocity of the larger landslides (Krini and Agia Eleoussa; see the areas enclosed by orange lines in Figure 6e,f) using statistics of the velocities within each landslide outline (Table 1) rather than by referring to a single pixel. The mean velocity of the Krini landslide was found to be 6 mm/yr (0.6 cm/year) downward and 28.7 mm/yr (nearly 3 cm/year) eastward. The mean velocity of the Agia Eleoussa landslide was found to be 1.8 mm/yr (~0.2 cm/year) downward and 7.7 mm/yr (~0.8 cm/year) eastward. We also computed the histograms of the velocities from both ascending and descending tracks and of the Up and E-W components of the movement for Krini and Agia Eleoussa, respectively (Table 1, Figures 7 and 8). The area affected by the Krini landslide is 4,080,000 m². The area of the active landslide around the village of Pititsa is 800,000 m², and the area of the landslide near the monastery of Agia Eleoussa is 1,600,000 m². It is probable that the latter two landslides are in fact part of one feature (mass movement). However, the velocities we mapped in the area between them are within the uncertainty of our InSAR motion rates (~3 mm/yr LOS; see Figure 5c,d), so our data cannot confirm this.

**Table 1.** Mean and median velocities of the active landslides at the village of Krini and at the monastery of Agia Eleoussa. The landslide outlines are shown in Figure 6e,f. The period of observation was 2016–2020.


**Figure 8.** Histograms of LOS/E-W/U-D velocities of active landslides in the Agia Eleoussa area (affected area: 1,600,000 m²).

#### *3.2. Rainfall Data*

We also collected rainfall data from stations located in the Achaia prefecture (from the database of [64]) to identify possible temporal patterns of ground movement that could be correlated with rainfall. The wet season lasts from October to May, during which the total rainfall accounts for 93% of the annual rainfall. The month with the highest precipitation is December, with a mean rainfall of 128.9 mm, followed by November (124.7 mm). August is the driest month, with a mean rainfall of 7.0 mm, followed by July (8.8 mm) [17].

We used the daily rain data of three meteorological stations of NOA (Kalavrita, Panachaiko and Kato Vlassia; Figure 1) for the same time span (2015–2021) as the InSAR and GNSS time series. These stations are located at elevations similar to those of the landslides and within 20 km of them. The data are plotted in Figures 9–11.

**Figure 9.** LOS-projected GNSS time series (KRIN station: black points) and InSAR position time series (orange points: descending orbit) of the pixel that contains the GNSS station. Blue, green and orange lines represent the rainfall at the meteorological stations Kalavrita, Kato Vlassia and Panachaiko, respectively (see Figure 1 for station localities).

**Figure 10.** Graph showing the LOS position time series for the Krini landslide, descending track (080D). Blue, green and orange lines represent the average monthly rainfall at the meteorological stations Kalavrita, Kato Vlassia and Panachaiko, respectively (right axis). The three black boxes mark the time periods that are used for further processing (see Figure 11). The dashed line indicates the linear fit of the InSAR data for the whole period of observations (25 mm/yr).

**Figure 11.** Graphs showing the InSAR position time series and average monthly rainfall (Kato Vlassia station, top panel) for each time period, with corresponding graphs showing the cross-correlation results (signal-converted waveforms, middle panel; correlation between the two waveforms, lower panel). The corresponding displacement rates are reported in Table 2.

**Table 2.** Displacement rate A corresponds to the linear displacement rate of the Krini landslide (descending orbit) before the maximum seasonal rainfall is reached, and displacement rate B corresponds to the linear displacement rate following the rainfall peak (see Figure 11 for trendlines). The total rainfall corresponds to the total rain during each time period. The displacement rate A and B values in the last line (period 2020–2021) refer to the ascending orbit data.


#### *3.3. Validation of InSAR Time Series with GNSS Data*

To compare our InSAR displacements (velocities) with other geodetic data, we used the GPS observations of the permanent station KRIN of the Corinth Rift Laboratory (CRL) NFO of EPOS (http://crlab.eu/, accessed on 20 December 2021). Published GNSS data (30-s daily positions; [25]) were used. We visited the station in May 2021 and confirmed the stability of the antenna (see Figure S7). We transformed the GNSS components into LOS through the equation [63]:

$$D_{\mathrm{LOS}} = d_u \cos(\mathrm{inc}) - \sin(\mathrm{inc}) \left[ d_n \cos\left(A_h - \frac{3\pi}{2}\right) + d_e \sin\left(A_h - \frac{3\pi}{2}\right) \right] \tag{1}$$

where Ah is the satellite heading, so that Ah − 3π/2 is the azimuth look direction, which is perpendicular to the heading; inc is the radar incidence angle; and du, dn and de are the up, north and east displacements derived from the GNSS solutions (Figure S8).
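A minimal sketch of this projection, applied to a short GNSS position series, is given below. The incidence and heading angles are nominal descending-track values assumed for illustration, and the GNSS positions are hypothetical.

```python
import math


def enu_to_los(d_e, d_n, d_u, inc_deg, heading_deg):
    """Project east/north/up displacements into LOS following Equation (1)."""
    inc = math.radians(inc_deg)
    look = math.radians(heading_deg) - 1.5 * math.pi  # Ah - 3*pi/2
    horizontal = d_n * math.cos(look) + d_e * math.sin(look)
    return d_u * math.cos(inc) - math.sin(inc) * horizontal


# Assumed nominal geometry (degrees) and hypothetical daily GNSS positions (mm).
INC_DEG, HEADING_DESC_DEG = 39.0, -167.0
gnss_enu = [(0.0, 0.0, 0.0), (0.08, -0.01, -0.02), (0.16, -0.02, -0.03)]
los_series = [enu_to_los(e, n, u, INC_DEG, HEADING_DESC_DEG) for e, n, u in gnss_enu]
```

Once both series are expressed in the same LOS geometry, their linear rates can be compared directly, as in the validation that follows.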

The GNSS and InSAR time series show slightly different displacement rates (Figure 9, descending orbit). The GNSS rate is 28 mm/yr and the InSAR rate is 25 mm/yr (see Figure S9), a deviation of 3 mm/yr, or about 11% of the GNSS rate. This difference is expected because GNSS provides a point measurement of displacement (i.e., the motion of the antenna reference point or ARP), whereas InSAR provides a 'pixel' measurement on the ground. Note that the InSAR time series represents the mean displacement within a pixel of 100 × 100 m, which encloses the building where the GNSS antenna is located. This small deviation in rates may indicate: (a) the "smoothing" effect of InSAR; and (b) the possible occurrence of differential movements inside the subareas that each SAR pixel covers. To complete the validation, we also projected the KRIN GNSS data into the LOS geometry of the ascending orbit, confirming similar rates of motion (48 vs. 55 mm/yr, respectively; see time-series diagram in Figure S10).

#### **4. Discussion**

#### *4.1. Landslide Motion and Rainfall Pattern*

There are many publications concerning rainfall triggers for landslides in central and southern Europe (e.g., [65–67]). The most common parameters used for rainfall threshold definition are intensity (of both antecedent cumulated rainfall and event rainfall) and duration [67]. The effect of the antecedent rainfall depends heavily on the hydrological conductivity of the landslide mass, and only rainfall events with a large amount of precipitation are considered capable of increasing the deformation rate of deep-seated landslides. The authors of [68] noted that, in their case, the time lag between peak precipitation and peak deformation rate may be up to 20 days, while [65] found a lag of 12 days. Since proximal rainfall data are available for Achaia (Figure 1), we examined the correlation between the rainfall patterns and the seasonal movement of the Krini landslide, because for this landslide the displacement rate is well constrained by the co-located GNSS station (Figure 9 and Figure S10). For this purpose, we used a signal processing method, cross correlation [69]. With this approach we converted our data to signals and compared them to determine whether a time delay (time lag) exists between the maximum peaks of the rainfall and InSAR time series. The cross-correlation technique measures the similarity of two time series as a function of the displacement of one relative to the other and detects correlations between them [70]. We applied this technique to the daily rainfall dataset, which could play an important role in the movement of a landslide, and to the InSAR time series. We used the InSAR time series of the descending track rather than that of the ascending track because of the completeness of its dataset during the period 2016–2020 (Figure 4). First, we resampled and interpolated the daily rainfall data to fit the time span of the InSAR time series. Because its record was complete, we selected the data from the meteorological station Kato Vlassia (Figure 1a) and then performed the correlation with the motions (displacements) of the Krini landslide.
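The lag estimation by cross correlation can be sketched generically as follows. Both series are assumed to be sampled on the same daily grid and are standardized before the correlation is evaluated at each candidate lag. The series are synthetic, the function names are hypothetical, and the sketch does not reproduce the signal conversion used for Figure 11.

```python
from statistics import mean, pstdev


def standardize(series):
    """Remove the mean and scale to unit standard deviation."""
    m, s = mean(series), pstdev(series)
    if s == 0:
        raise ValueError("series is constant")
    return [(v - m) / s for v in series]


def best_lag(driver, response, max_lag):
    """Return (lag, correlation) where response trails driver by `lag` samples."""
    a, b = standardize(driver), standardize(response)
    n = len(a)
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        cc = sum(a[i] * b[i + lag] for i in range(n - lag)) / n
        if cc > best[1]:
            best = (lag, cc)
    return best


# Hypothetical daily series: a rainfall pulse and a response delayed by 13 days.
rain = [0.0] * 60
rain[20:25] = [1.0, 3.0, 6.0, 3.0, 1.0]
response = [0.0] * 13 + rain[:-13]
lag_days, peak_cc = best_lag(rain, response, max_lag=30)
```

Only non-negative lags are searched here because the question is how long the landslide response trails the rainfall. Dividing by the full series length keeps the estimate from favouring large lags that rest on few overlapping samples.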

We applied the cross correlation to the two time series over three subperiods (see boxes in Figure 10). Each subperiod comprised about eight months of observations with complete datasets. We chose to work with three separate time periods because our InSAR network contains gaps (Figure 4), which correspond to gaps in the time series. We used the following time periods: 6 September 2016–28 May 2017, 19 September 2017–23 April 2018 and 9 October 2019–6 May 2020. We calculated the displacement rate before and after each change in displacement by fitting trendlines to the data (Table 2). Figure 11 shows that each detected change in the displacement rate was preceded by a seasonal maximum of rainfall. Two of the three time periods showed an increase in displacement rate of about 40%, while the total seasonal rainfall (8 months) was quite similar (~700 mm, Table 2). The period September 2017–April 2018 showed an increase in the displacement rate of about 550%, accompanied by a large amount of total rainfall (~1000 mm, Table 2). These findings allowed us to correlate the amount of total rainfall with the increase in the displacement rate of an active landslide. Through the cross-correlation method, we computed the time lag between the maximum peaks of InSAR and rainfall (Figure 11, lower panel; Table 2). The mean time lag over the three periods between the maximum value of rainfall and the change in the InSAR displacement rate was 13.5 days. The changes in InSAR displacement rates occurred on 28 January 2017, 23 January 2018 and 20 December 2019, respectively. Moreover, due to the large number of ascending orbit data available during the winter–spring period in 2020–2021 (Figure 4), we performed a separate cross correlation using the InSAR position time series (ascending orbit) against the daily rainfall data from the Kato Vlassia station (see Figure S11). Again, we found an increase in the displacement rate of the landslide (330%) following a long interval of rainfall, with a time lag of 12 days (see Table 2, last line). The change in the displacement rate of the Krini landslide occurred on 18 February 2021. Thus, whichever Sentinel-1 orbit data we use in the case of Achaia, we find the same pattern, i.e., an increase in the landslide displacement rate following heavy rainfall.
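The percentage changes quoted above compare two trendline slopes, rate A before the change date and rate B after it (Table 2). A minimal sketch of that comparison follows. The break date is assumed to be known, the percentage is assumed to be defined as 100 × (B − A) / |A|, and the series is synthetic.

```python
def slope(t, d):
    """Least-squares slope of d versus t."""
    n = len(t)
    t_mean = sum(t) / n
    d_mean = sum(d) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (di - d_mean) for ti, di in zip(t, d))
    return sxy / sxx


def rate_change(t_days, disp_mm, break_day):
    """Return (rate A, rate B, percent change) around a chosen break date."""
    before = [(t, d) for t, d in zip(t_days, disp_mm) if t <= break_day]
    after = [(t, d) for t, d in zip(t_days, disp_mm) if t >= break_day]
    rate_a = slope(*zip(*before))
    rate_b = slope(*zip(*after))
    if rate_a == 0:
        raise ValueError("rate A is zero; percent change is undefined")
    return rate_a, rate_b, 100.0 * (rate_b - rate_a) / abs(rate_a)


# Hypothetical series sampled every 12 days with a rate change at day 120.
t_days = list(range(0, 241, 12))
disp = [0.05 * t if t <= 120 else 6.0 + 0.07 * (t - 120) for t in t_days]
rate_a, rate_b, pct = rate_change(t_days, disp, break_day=120)
```

Under this definition, a 40% increase means that rate B is 1.4 times rate A, and a 550% increase means that rate B is 6.5 times rate A.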

#### *4.2. Kinematic Characteristics of the Landslides*

The Krini, Agia Eleoussa monastery and Pititsa landslides are active, and we measured their motion by InSAR time series analysis for the period 2016–2021. First, we identified the moving pixels in the InSAR data (draped over the high-resolution DEM, Figure 6) and delineated the landslide boundaries. We examined the decomposed velocities and found that the maximum displacement rate of each landslide is located near its center (Figure 12). Then, through InSAR mapping we identified more cases of mass movement in the broader research area than these three well-known landslides. In particular, we identified two additional active landslides, one around the village of Graikas (3 km north of Krini) and another around the village of Sella (Figure 5c,d and Figure 6c,d). Our results show that a large part of this mountainous region suffers from slope instability that is manifested in various degrees of ground displacement, greatly affecting its morphological features and inhabited areas. To validate these results, we provide field photographs of active landslides (taken on 7 May and 10 September 2021), which agree with our InSAR findings (Figure 12).

**Figure 12.** Maps showing U-D component of displacement rate (left panel, 2016–2021). Upper panel map includes the landslide of Krini and Graikas, and the bottom panel map includes the landslides of Pititsa and Agia Eleoussa, respectively. On the right side we provide field photos with numbers corresponding to the green markers on the maps.

In addition, the NOAFAULTs database of active faults of Greece [71,72] contains two active, north-dipping normal faults that affect the area of the active landslides. In particular, the Pititsa landslide seems to be located on the hanging wall of the Panachaiko fault segment, the Agia Eleoussa landslide is situated between two faults (Panachaiko F. and Pititsa F.), while the Krini and Graikas landslides overlie the Panachaiko fault trace (Figure 1b). We could not find in the literature a fine-scale geological map showing the fault traces with high planimetric accuracy, so we had to rely on the coarse information provided by the NOAFAULTs database. In addition, there is no geodetic or geological evidence for any fault creep along these faults, so the tectonic influence on the landslide mobility is uncertain. However, we cannot exclude the possibility that strong earthquakes along these faults during the Holocene further mobilized these landslides.

Furthermore, due to the lack of fine-scale geological mapping in this region, the lithological contacts are based on 1:50,000 scale maps, where both cartographic accuracy and lithological detail are reduced. Nevertheless, the area around the Eleoussa monastery landslide consists mainly of flysch and unconformably overlying Neogene deposits, with small chert (Lower–Upper Cretaceous) and limestone (Jurassic) occurrences (Figure 12, point 7). In the area around Krini (Figures 1b, 12 and 13), we observed that the main body of the landslide is located near the contact between the "lower" flysch formation and the pelagic limestone of Upper Cretaceous age. In addition, the main body of the landslide extends to areas consisting of Neogene sedimentary (clastic) formations. This is in accordance with the observations of [13,19,20] in the same region, which identify the highly sheared and weathered nature of the flysch and Neogene sediments as contributing to the instability of the area. The weakened state of these clastic sediments influences the reactivation of ground motions when nearby seismic activity and heavy rainfall occur.

**Figure 13.** Maps showing InSAR velocities in the area of the village of Krini (shaded polygon: period 2016–2021). Up component (**left**), E-W component (**right**). Yellow boxes indicate houses, and black triangle shows the location of the GNSS station.

Inside the village of Krini itself, we mapped downward velocities of up to 2.1 cm/year and eastward velocities of up to 6.8 cm/year (Figure 13). The InSAR coverage of surface motion is discontinuous because of the scarcity of ascending orbit acquisitions (spring 2017–2018, Figure 4 and Table S1), yet there is robust geodetic (InSAR) evidence that the whole village, as well as the road access to it from both the north and the east, is affected by the deep-seated landslide.

Deep-seated landslides are characterized by long-term gradual deformation at the millimeter to decimeter scale per year [73]. Their movement can be divided into slow and accelerated deformation phases, triggered by increased precipitation intensity [74]. The deformation rate in deep-seated landslides is mainly controlled by hydrometeorological conditions. Their sliding behavior results from the relation between the shear strength of the soil and the shear (sliding) force applied by gravity to the landslide mass, a balance that the hydrological condition of the area can greatly affect [73]. This may also be the case for the Krini deep-seated landslide. We interpret its kinematic behavior as a result of seasonal changes in rainfall. For the first time, we determined the correlation between rainfall and the movement of this landslide. Using the cross-correlation method, the maximum correlation between the two series was found at a lag of about 13.5 days. In all three time periods studied, there was an increase in displacement rate right after a period of rainfall. We suggest that the spatiotemporal pattern of movement is modulated by the seasonal rainfall, which in turn allows us to expect an increase in the displacement rate of the Krini landslide at the end of the rainy season and at the beginning of the dry period. In addition, a moderate to strong earthquake in this area could increase the displacement rate of the landslide, but such a case has not yet been demonstrated.
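The lag between rainfall and displacement discussed above can be illustrated with a minimal sketch. It assumes two evenly sampled, equal-length daily series; the function name `best_lag` and the synthetic data are hypothetical and do not reproduce the processing chain used in this study.

```python
import numpy as np

def best_lag(rain, disp_rate, max_lag):
    """Return (lag, r): the lag in samples at which the displacement rate
    correlates best with earlier rainfall, and the Pearson coefficient there."""
    rain = np.asarray(rain, dtype=float)
    disp_rate = np.asarray(disp_rate, dtype=float)
    n = len(rain)
    lags = np.arange(0, max_lag + 1)
    corr = np.empty(len(lags))
    for i, k in enumerate(lags):
        # Pair rainfall at time t with the displacement rate at time t + k.
        corr[i] = np.corrcoef(rain[: n - k], disp_rate[k:])[0, 1]
    i_best = int(np.argmax(corr))
    return int(lags[i_best]), float(corr[i_best])

# Synthetic example (assumption): the displacement rate follows rainfall by 13 days.
rng = np.random.default_rng(0)
rain = rng.gamma(shape=0.5, scale=10.0, size=300)
disp_rate = 0.01 * np.roll(rain, 13) + rng.normal(0.0, 0.01, size=300)
lag, r = best_lag(rain, disp_rate, max_lag=30)
```

With real data, the series would first need to be resampled to a common time step, which is why a fractional lag such as 13.5 days can arise from averaging or interpolation.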

#### **5. Conclusions**

The main findings of this paper are:


**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs14040844/s1, Figure S1. ESRI map with the ID frames of LicSAR Portal (red outline is descending; blue outline is ascending frame). Green rhomb indicates the location of the reference point used in the InSAR analysis. Figure S2. Data processing flow chart used in this study. The arrows indicate input/output actions. The products are shown in orange. Figure S3. Unw\_org corresponds to the interferogram before the GACOS correction, and Unw\_cor corresponds to the interferogram after the GACOS correction. The standard deviation (STD) decreased from 3.2 rad to 2.4 rad. The reduction rate for this interferogram is 26.6%. The dates of the interferometric pair are 24 January 2019 and 17 February 2019 (ascending orbit). Figure S4. Unw\_org corresponds to the interferogram before the GACOS correction, and Unw\_cor corresponds to the interferogram after the GACOS correction. The standard deviation (STD) decreased from 5.2 rad to 2.2 rad. The reduction rate for this interferogram is 60.1%. The dates of the interferometric pair are 13 December 2016 and 4 January 2017 (descending orbit). Figure S5. Graph showing the time series of displacement of the village of Krini corresponding to the pixel enclosing the GNSS station (ascending orbit). Vel(1) indicates the velocity of this pixel (blue line) after the spatio-temporal filtering and deramping. Figure S6. Graph showing the time series of displacement of the village of Krini corresponding to the pixel enclosing the GNSS station (descending orbit). Vel(1) indicates the velocity of this pixel (blue line) after the spatio-temporal filtering and deramping. Figure S7. Field photograph showing the antenna of KRIN GNSS Station. View to the northeast. Photograph was taken on 7 May 2021. Figure S8. Graph showing the geometry of the LOS velocity vector. 34.58° is the incidence angle in the Krini study area.
Figure S9. Graph showing the trend differences between InSAR and GNSS LOS displacements (descending orbit) of the same dates during the common period of observation (58 common dates). For comparison, the blue line indicates a perfect (1:1) correlation. Figure S10. LOS-projected GNSS time series (KRIN station; black points) and InSAR position time series (orange points, ascending orbit) of the pixel which contains the GNSS station. Figure S11. Graphs showing position time series and average monthly rainfall (Kato Vlassia station, top panel) for the period September 2020–May 2021 and corresponding graphs (middle and lower panel) showing cross-correlation results. The corresponding displacement rates are reported in Table 2. Table S1. Dataset of Sentinel-1 SAR acquisitions of ascending and descending orbit used in this study.

**Author Contributions:** Conceptualization, A.G. and V.T.; methodology, V.T., A.G.; software, V.T.; validation, V.T., A.G. and I.K. (Ioannis Karamitros); formal analysis, V.T.; investigation, V.T., A.G.; resources, E.E., I.K. (Ioannis Karamitros); data curation, V.T., E.E. and I.K. (Ioannis Karamitros); writing – original draft preparation, V.T., A.G., I.K. (Ioannis Karamitros); writing – review and editing, V.T., A.G., I.K. (Ioannis Karamitros), E.E., I.K. (Ioannis Koukouvelas), E.S.; visualization, V.T.; supervision, A.G., I.K. (Ioannis Koukouvelas), E.S.; project administration, A.G.; funding acquisition, A.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the research project 'Platform of multidimensional monitoring with micro sensors of Enceladus of Hellenic Supersite-PROION' with Grant Number: MIS-5070928. https://proion-hellas.eu/ (accessed on 20 January 2022).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The InSAR data are freely accessible through the LiCSAR-Copernicus portals; the rainfall data are available through www.meteo.gr (data accessed on 20 December 2021); GNSS observations (daily files) are available via the CNRS-CRL web portal.

**Acknowledgments:** We thank the CRL team which operated the GNSS station KRIN. Thanks to PROION partners for comment and discussions. We acknowledge concession for the use of ESRI products licensed to the Hellenic National Tsunami Warning Center, National Observatory of Athens, through the project "HELPOS—Hellenic Plate Observing System" (MIS 5002697). HELPOS is implemented under the Action "Reinforcement of the Research and Innovation Infrastructure", funded by the Operational Programme "Competitiveness, Entrepreneurship and Innovation" (NSRF 2014–2020) and co-financed by Greece and the European Union (European Regional Development Fund). We acknowledge KTIMATOLOGIO S.A. for kindly providing the digital elevation model. We also thank Vassileios Tsironis, Athina Psalta and Ioannis Fountoulakis for their help with cross-correlation techniques. We also thank Sotiris Valkaniotis and Helena Partheniou for comments and discussions.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

#### **References**


## *Article* **Landslides Triggered by the 2016 Heavy Rainfall Event in Sanming, Fujian Province: Distribution Pattern Analysis and Spatio-Temporal Susceptibility Assessment**

**Siyuan Ma 1,2, Xiaoyi Shao 3,4 and Chong Xu 3,4,\***


**Abstract:** Rainfall-induced landslides pose a significant threat to the lives and property of residents in the southeast mountainous area. From 5 to 10 May 2016, Sanming City in Fujian Province, China, experienced a heavy rainfall event that caused massive landslides, leading to significant loss of life and property. Using high-resolution satellite imagery, we created a detailed inventory of landslides triggered by this event, which totaled 2665 across an area of 3700 km<sup>2</sup>. The majority of landslides were small-scale, shallow and elongated, with a dominant distribution in Xiaqu town. We analyzed the correlations between the landslide abundance and topographic, geological and hydro-meteorological factors. Our results indicated that the landslide abundance index is related to the gradient of the hillslope, distance from a river and total rainfall. The landslide area density (LAD) increases with these influencing factors, following an exponential or linear relationship. Among all lithological types, Sinian mica schist and quartz schist (Sn-s) were found to be the most prone to landslides, with over 35% of landslides occurring in just 10% of the area. Overall, the lithology and rainfall characteristics primarily control the abundance of landslides, followed by topography. To gain a better understanding of the triggering conditions for shallow landslides, we conducted a physically based spatio-temporal susceptibility assessment in the landslide abundance area. Our numerical simulations, using the MAT.TRIGRS tool, show that it can accurately reproduce the temporal evolution of the instability process of landslides triggered by this event. Although rainfall before 8 May may have contributed to decreased slope stability in the study area, the short duration of heavy rainfall on 8 May is believed to be the primary triggering factor for the occurrence of massive landslides.

**Keywords:** landslides; heavy rainfall; distribution pattern; spatiotemporal assessment; Sanming area; Fujian province

#### **1. Introduction**

Rainfall-induced landslides are a type of slope instability that may occur in densely distributed soil and/or debris under heavy rainfall, producing a significant amount of sediments in river networks [1,2]. These landslides often result in catastrophic debris flows, which cause severe damage to agricultural crops, infrastructure and human lives [3,4]. Therefore, effective risk mitigation measures and early warning systems are urgently needed to minimize the detrimental impacts of these slope instabilities on both local and regional scales [5–7].

The southeast coastal area of China falls within the subtropical monsoon climate zone and is frequently affected by typhoons and rainstorms. The coastal areas are characterized by mountainous and hilly terrains, which cover approximately 75% of the total area. The climate in this region is warm and humid with abundant rainfall, leading to strong physical weathering of rocks [8]. The slope surfaces in this area are predominantly covered by residual soil and heavily weathered rocks. As a result, approximately 90% of landslides occur during the rainy seasons, which typically span from May to September. During these months, landslides triggered by heavy rainfall events are the primary cause of building damage and human casualties. In August 2005, a landslide triggered by Typhoon "Sudilo" caused direct economic losses amounting to USD 4.0 million in Zhejiang Province and posed a threat to 4616 local residents. On 10 June 2019, Longchuan County, Guangdong Province, China, was subjected to incessant heavy rainfall, which subsequently led to widespread landslides, collapses and debris flows. Among the 352 affected villages, Mibei village in Longchuan County was hit the hardest, with 1571 individuals affected, 120 buildings completely destroyed and over 100 houses sustaining damage of varying degrees. The direct economic loss of this event reached USD 15.4 million [9]. In June 2019, due to the influence of southwest airflow on the south side of low-level shear, heavy rainfall occurred in the western region of Fujian Province. Since 6 June, 75,900 people have been affected in 19 counties of Sanming, Nanping and Longyan cities, resulting in a direct economic loss of USD 37.9 million. Rainfall-induced landslides have seriously affected the life safety as well as economic development in the southeastern coastal areas. Therefore, the study of distribution characteristics and spatiotemporal prediction of such landslides has become a major demand for ensuring national security and social development.

**Citation:** Ma, S.; Shao, X.; Xu, C. Landslides Triggered by the 2016 Heavy Rainfall Event in Sanming, Fujian Province: Distribution Pattern Analysis and Spatio-Temporal Susceptibility Assessment. *Remote Sens.* **2023**, *15*, 2738. https://doi.org/10.3390/rs15112738

Academic Editors: Ioannis Papoutsis, Konstantinos G. Nikolakopoulos and Constantinos Loupasakis

Received: 6 April 2023 Revised: 22 May 2023 Accepted: 22 May 2023 Published: 24 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

For the analysis of regional landslides induced by a single extreme event (e.g., earthquake or heavy rainfall), a comprehensive landslide inventory is indispensable. Such data provide a crucial foundation for analyzing the distribution characteristics of landslides [10], susceptibility and risk assessment [11,12], and landslide formation mechanisms and geomorphological evolution [13,14]. In contrast to earthquake events, landslide databases for heavy rainfall events are still limited [15–17]. At present, only 16 public rainfall-event-based landslide databases are available worldwide, and most of them are small in scale. Only four contain more than 2000 landslides: the Morakot Typhoon event in southern Taiwan on 6–9 August 2008 [18], the heavy rainfall event in the Teresópolis region of Brazil on 11–13 January [17], the long-term heavy rainfall event in Japan's Hiroshima region from 28 June to 9 July 2018 [19] and the Hurricane Maria event in the Dominica region on 18–22 September 2017 [20,21]. Therefore, compared to earthquake-event-based landslide inventories, the landslide inventories associated with heavy rainfall still need more in-depth investigation.

Assessing the spatial susceptibility of rainfall-induced landslides plays an important role in effective landslide prevention and control [22–25]. Currently, three primary methods are widely used for rainfall-induced landslide prediction: empirical models [26–28], data-driven models [24,29] and physically based models [30,31]. The empirical model is primarily based on the analysis of rainfall characteristics, such as rainfall intensity, duration and total amount, and is used to determine the likelihood of landslides. Although this method is simple and easy to implement, it considers rainfall as the single factor and ignores other important topographical, geological and hydrological factors. In addition, empirical models require abundant landslide and rainfall data to determine the empirical rainfall threshold. Where such data are lacking, it is difficult to develop an efficient rainfall threshold model, even in mountainous areas with severe landslide disasters [5,28]. The data-driven (statistical) model analyzes the relationship between landslide occurrence and various factors, such as elevation, hillslope gradient, slope aspect, average rainfall and vegetation coverage. Then, based on these relationships and actual landslides, a rainfall-induced landslide assessment model is created [15,32]. However, this method cannot explain the physical mechanism of landslide occurrence, and it requires a large amount of landslide data to establish the susceptibility assessment model. As a result, the assessment frequently lags behind practical application and cannot serve emergency assessment within a short time [33].

The physically based method does not rely on actual landslide data but rather simulates the physical process of rainfall-induced landslide occurrence by combining hydrological and infinite slope models. Because physical models can reproduce this process, they are considered an effective means of landslide susceptibility analysis [34]. Furthermore, GIS technology has facilitated the use of physically based models over large regions. As a result, physically based models have been widely used in the prediction and early warning of rainfall-induced landslides [35,36]. Commonly used physical models for landslide susceptibility assessment include the SHALSTAB [37] and SINMAP models [38] based on steady-state hydrological modeling, as well as the SLIP [2,39], CRESTSLIDE [40], HIRESSS [41,42] and TRIGRS models [43] based on transient physical modeling. The TRIGRS model has been applied worldwide, including in Italy, the United States, China, South Korea and Southeast Asia [9,43–46], and is currently one of the most popular models for the spatiotemporal prediction of rainfall-induced landslides. However, its application in China's southeast area is limited, so it is necessary to investigate the applicability of the model in the southeast mountainous area.

From 5 to 10 May 2016, Sanming City in Fujian Province, China, experienced an unprecedented heavy rainfall event with a maximum hourly rainfall of 56.6 mm and a maximum daily rainfall of 259.6 mm, breaking the local daily rainfall record of 178.2 mm set in 1961. This rainfall event triggered extensive landslides, resulting in significant loss of life and property. During this rainfall event, a landslide occurred in the Chitan Village of Kaishan Township and buried the office building and construction site dormitory of the expansion project of the Chitan hydropower plant, which belongs to the China Huadian Corporation. As of 1:00 p.m. on 10 May, the landslide had caused 35 deaths and one person was missing. To better understand the characteristics of the landslides induced by this fatal event, a comprehensive landslide inventory is needed. Therefore, the objectives of this study are twofold: (1) to establish a rainfall-induced landslide inventory through visual interpretation and to analyze the distribution pattern of landslides in relation to relevant factors; and (2) to perform a physically based spatiotemporal susceptibility assessment using the open-source tool MAT.TRIGRS (V1.0) and to back-analyze the response of slope stability to the rainfall process. This study can provide a scientific basis for understanding the formation mechanism and for the spatiotemporal prediction of rainfall-induced landslides in the southeast coastal area.

#### **2. Study Area**

The study area is located in the western part of Fujian Province, between 25°30′N–27°07′N latitude and 116°22′E–118°39′E longitude. It neighbors Fuzhou to the east, connects to Jiangxi Province to the west and borders Nanping to the north. The landform of the study area is mainly hilly terrain, with medium- and low-elevation mountains. Geologically, the area is an erosion-dominated mountainous region with strong tectonic activity. The terrain is generally higher in the southeast and lower in the northwest, with elevations ranging from 130 m to 1847 m (Figure 1). In addition, the area features a subtropical monsoon climate with a typical mountain climate. Rainfall is abundant, with an annual average precipitation reaching up to 1700 mm. It is mainly concentrated from March to August, and the annual average temperature is about 19.9 °C.

Figure 2 illustrates the distribution of the main lithology exposed in the study area, which ranges from Quaternary loose deposits (Q) to Proterozoic granite (Pt-g). The main lithology is Proterozoic plagioclase hornblende and granulite (Pt-p), which is mainly distributed in the north of the study area. In addition, Jurassic variegated sandstone and glutenite and K-feldspar granite (Jg and Js-g) are distributed in the west and east of the study area. Cretaceous rhyolite porphyry and glutenite (Kr-g) are mainly exposed in the southwestern area. Sinian mica schist and quartz schist (Sn-s) and Cambrian quartz sandstone (∈) are observed in the middle part of the study area. The slope surface is mostly covered by residual soil and strongly weathered rock. Under rainfall, shallow landslides form mainly in these loose deposits.

**Figure 1.** The geographical and topographical maps displaying the location and elevation distribution of the study area. (**a**) The location of Fujian Province; (**b**) the location of the study area and rainfall stations; and (**c**) the distribution of elevation, rainfall and water networks in the study area.

**Figure 2.** Lithology distribution of the study area. The geological map was created using China Geological Survey's 1:200,000 geological maps (http://dcc.cgs.gov.cn/, accessed on 5 April 2023).

#### **3. Data and Methods**

#### *3.1. Landslide Mapping*

A comprehensive rainfall-induced landslide inventory is of significance for studying the distribution pattern of landslides, landslide susceptibility and their impact on geomorphological evolution. In this study, we were able to conduct a detailed visual interpretation of landslides due to the availability of high-resolution satellite photographs on the Google Earth (GE) platform [47,48]. The satellite images used for landslide interpretation were all based on the GE platform, which provided a 100% coverage of high-resolution satellite images. Due to the high vegetation coverage in the study area, optical images can be used to better identify the landslide locations before and after the events. By comparing the pre- and post-rainfall images combined with field investigations, the landslide inventory associated with this rainfall event was ultimately established. Figure 3 shows the field photos of landslides triggered by this event.

**Figure 3.** Field photos of landslides triggered by this rainfall event.

#### *3.2. Rainfall Data*

We collected precipitation data in the Sanming area for the last two decades (from 2000 to 2020). According to these data, the annual rainfall in the Sanming area has generally remained between 1200 and 2400 mm, with prominent fluctuation. In 2016, the annual rainfall exceeded 2200 mm, while in 2003 the annual rainfall was relatively low, less than 1200 mm (Figure 4a). The precipitation in May 2016 was 300 mm, which is more than the monthly average rainfall over the past 20 years (approximately 200 mm) (Figure 4b).

**Figure 4.** Monthly precipitation data of Sanming city over the past 20 years (2000–2020); (**a**) monthly and annual average precipitation data over the last 20 years; (**b**) comparison of the monthly rainfall in 2016 with the average precipitation over the last two decades.

We collected rainfall data from 19 rainfall stations of the China Meteorological Administration within a radius of 100 km of the study area. These stations recorded rainfall data every 12 h. Figure 5 shows the rainfall data from two stations located in the northern and western parts of the study area from 1 April to 30 May. The data show that the rainfall event mainly occurred from 5 May to 10 May. The highest precipitation occurred on 8 and 9 May, reaching about 100–120 mm, which accounted for more than half of the total rainfall amount. The precipitation on the other four days was relatively low, averaging about 15–40 mm. Based on the 19 rain gauges, the commonly used Kriging interpolation method was applied to obtain the distribution of the rainfall at different times during this event (Figure 6). The result indicates that the daily rainfall in the study area varies greatly in space, with a difference of around 160 mm on 8 May. In contrast, the spatial distribution of daily rainfall on the other days shows relatively small variation, ranging from 10 mm to 40 mm.
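As an illustration of the interpolation step, the following is a minimal sketch of ordinary Kriging with NumPy. The variogram model is not stated in the text, so the exponential variogram, its sill and range, and the station coordinates and rainfall values below are all assumptions chosen only for demonstration.

```python
import numpy as np

def exp_variogram(h, sill, vrange):
    """Exponential semivariogram without nugget (assumed model)."""
    return sill * (1.0 - np.exp(-3.0 * h / vrange))

def ordinary_kriging(xy, values, targets, sill=1.0, vrange=50.0):
    """Predict values at target points from scattered station observations."""
    xy = np.asarray(xy, dtype=float)
    values = np.asarray(values, dtype=float)
    targets = np.atleast_2d(np.asarray(targets, dtype=float))
    n = len(xy)
    # Station-to-station semivariances, bordered by the unbiasedness constraint.
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(dist, sill, vrange)
    A[n, n] = 0.0
    out = np.empty(len(targets))
    for i, p in enumerate(targets):
        b = np.ones(n + 1)
        b[:n] = exp_variogram(np.linalg.norm(xy - p, axis=1), sill, vrange)
        w = np.linalg.solve(A, b)  # n weights followed by the Lagrange multiplier
        out[i] = w[:n] @ values
    return out

# Hypothetical stations (km) and daily rainfall (mm).
stations = [[0, 0], [30, 10], [60, 40], [20, 70], [80, 80]]
rain_mm = [40.0, 120.0, 200.0, 90.0, 60.0]
grid_pred = ordinary_kriging(stations, rain_mm, [[40, 40], [10, 30]])
```

Without a nugget term the interpolator honors the station values exactly; in practice, the variogram parameters would be fitted to the observed rainfall rather than fixed in advance.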

**Figure 5.** Precipitation data from every 12 h of two national stations in the study area from 1 April to 30 May; (**a**) the rainfall station (58,820) located in the north of the study area; (**b**) the rainfall station (58,821) located in the west of the study area.

**Figure 6.** The spatial distribution of daily rainfall from 5 to 10 May during this rainfall event; (**a**) 5 May; (**b**) 6 May; (**c**) 7 May; (**d**) 8 May; (**e**) 9 May; (**f**) 10 May.

#### *3.3. Data of Other Influencing Factors*

Based on the distribution characteristics of landslides and influencing factors in the study area, as well as previous studies [9], we selected six factors covering topography, geology, hydrology, land cover and rainfall. The ALOS PALSAR DEM data with a resolution of 12.5 m were used to calculate the hillslope gradient and slope aspect. We calculated the topographic relief as the elevation range within a 1.0 km radius. The topographic wetness index (TWI) was calculated from the elevation data using GRASS GIS software, and drainages were derived from the DEM using ArcGIS software. The land use type data were derived from the 10 m resolution global land cover results [49]. Finally, all influencing factor layers were divided into 12.5 × 12.5 m grids and subjected to statistical analysis (Figure 7).
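To make the terrain derivatives concrete, the sketch below computes the hillslope gradient from a gridded DEM with finite differences and evaluates the standard TWI definition, ln(a / tan β), where a is the specific catchment area and β the slope. It is a NumPy illustration only, not the GRASS GIS or ArcGIS workflow used in this study; the function names and the minimum-slope guard are assumptions.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Hillslope gradient in degrees from a regular-grid DEM."""
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index ln(a / tan(beta)).

    Slopes are clipped at min_slope_deg (an assumed guard) so that flat
    cells do not produce a division by zero.
    """
    beta = np.radians(np.maximum(slope_deg, min_slope_deg))
    return np.log(np.asarray(specific_catchment_area, dtype=float) / np.tan(beta))

# Synthetic 12.5 m DEM: a plane rising 12.5 m per cell towards the east.
dem = np.tile(np.arange(6) * 12.5, (6, 1))
slope = slope_degrees(dem, cell_size=12.5)
wetness = twi(100.0, 45.0)
```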

**Figure 7.** Map showing the spatial distribution of the influencing factors; (**a**) hillslope gradient; (**b**) aspect; (**c**) topographic relief; (**d**) topographic wetness index; (**e**) land cover type; (**f**) total precipitation of this event.

#### *3.4. TRIGRS Modelling*

The TRIGRS model (transient rainfall infiltration and grid-based regional slope-stability model) was developed by the USGS (United States Geological Survey) [50,51] and is widely used for evaluating the susceptibility of rainfall-induced shallow landslides [52,53]. It requires specific input data, such as the rainfall, soil mechanical properties and hydrological characteristics of the study area [50]. After these parameters are determined, the model uses a GIS platform to calculate the stability of each grid cell from the change in its transient pore water pressure during the rainfall period. The infiltration model is founded on the linearized solution of the Richards equation given by Iverson [54] for wet initial conditions and comprises steady and transient seepage components. The former is governed by the initial depth of the water table and the constant infiltration rate, under which the slope remains stable. The latter refers to the increase in pore water pressure caused by rainfall, which can cause instability. The generalized solution in TRIGRS is

$$
\begin{aligned}
\psi(Z,t) ={}& (Z-d)\beta \\
&+ 2\sum_{n=1}^{N}\frac{I_{nZ}}{K_s}\,H(t-t_n)\,[D_1(t-t_n)]^{1/2}\sum_{m=1}^{\infty}\left\{\operatorname{ierfc}\left[\frac{(2m-1)d_{LZ}-(d_{LZ}-Z)}{2[D_1(t-t_n)]^{1/2}}\right]+\operatorname{ierfc}\left[\frac{(2m-1)d_{LZ}+(d_{LZ}-Z)}{2[D_1(t-t_n)]^{1/2}}\right]\right\} \\
&- 2\sum_{n=1}^{N}\frac{I_{nZ}}{K_s}\,H(t-t_{n+1})\,[D_1(t-t_{n+1})]^{1/2}\sum_{m=1}^{\infty}\left\{\operatorname{ierfc}\left[\frac{(2m-1)d_{LZ}-(d_{LZ}-Z)}{2[D_1(t-t_{n+1})]^{1/2}}\right]+\operatorname{ierfc}\left[\frac{(2m-1)d_{LZ}+(d_{LZ}-Z)}{2[D_1(t-t_{n+1})]^{1/2}}\right]\right\}
\end{aligned} \tag{1}
$$

where $\psi$ denotes the pressure head; $t$ is the rainfall time; $N$ is the number of rainfall time intervals; $Z$ is the depth below the surface; $d$ is the depth of the water table; $d_{LZ}$ is the depth of the impervious basal boundary; $\beta = \cos^2\delta - (I_{ZLT}/K_s)$, where $\delta$ is the hillslope gradient; $I_{ZLT}$ is the constant surface flux; $K_s$ is the saturated hydraulic conductivity; $I_{nZ}$ is the surface flux in the $n$th time period; $D_1 = D_0/\cos^2\delta$, where $D_0$ is the saturated hydraulic diffusivity; and $H(t - t_n)$ is the Heaviside step function, in which $t_n$ is the time at the $n$th time interval in the rainfall sequence.

$$
\operatorname{ierfc}(\eta) = \frac{1}{\sqrt{\pi}} \exp\left(-\eta^2\right) - \eta \operatorname{erfc}(\eta) \tag{2}
$$

where $\operatorname{erfc}(\eta)$ denotes the complementary error function.
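Equations (1) and (2) can be evaluated directly once the infinite series over $m$ is truncated. The following sketch is one possible transcription in Python, not the TRIGRS or MAT.TRIGRS source code; the function names, the truncation at `m_max` terms and the parameter values in the example are assumptions for illustration.

```python
import math

def ierfc(eta):
    """Equation (2): exp(-eta^2) / sqrt(pi) - eta * erfc(eta)."""
    return math.exp(-eta * eta) / math.sqrt(math.pi) - eta * math.erfc(eta)

def pressure_head(Z, t, d, d_lz, delta, I_zlt, Ks, D0, I_n, t_n, m_max=20):
    """Equation (1) for an impervious basal boundary at depth d_lz.

    I_n holds the surface flux of each rainfall interval and t_n the
    interval boundaries, so len(t_n) == len(I_n) + 1. delta is in radians.
    """
    cos2 = math.cos(delta) ** 2
    beta = cos2 - I_zlt / Ks
    D1 = D0 / cos2
    psi = (Z - d) * beta  # steady component
    for n, flux in enumerate(I_n):
        # Each interval adds a term at its start and subtracts one at its end.
        for tau, sign in ((t_n[n], 1.0), (t_n[n + 1], -1.0)):
            if t <= tau:  # Heaviside step function H(t - tau)
                continue
            s = math.sqrt(D1 * (t - tau))
            series = sum(
                ierfc(((2 * m - 1) * d_lz - (d_lz - Z)) / (2.0 * s))
                + ierfc(((2 * m - 1) * d_lz + (d_lz - Z)) / (2.0 * s))
                for m in range(1, m_max + 1)
            )
            psi += sign * 2.0 * flux / Ks * s * series
    return psi

# Illustrative values (SI units), not calibrated to the study area.
slope = math.radians(30.0)
psi_dry = pressure_head(1.0, 1800.0, 2.0, 2.0, slope, 0.0, 1e-5, 1e-4, [0.0], [0.0, 3600.0])
psi_wet = pressure_head(1.0, 1800.0, 2.0, 2.0, slope, 0.0, 1e-5, 1e-4, [5e-6], [0.0, 3600.0])
```

With zero rainfall flux only the steady term $(Z-d)\beta$ remains, which gives a quick sanity check on any implementation.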

The TRIGRS model computes infiltration ($I$) for each cell by adding precipitation ($P$) and any runoff from upslope cells ($R_u$). However, infiltration cannot exceed the saturated hydraulic conductivity ($K_s$), which ensures that the model accounts for the limits of soil permeability.

$$I = P + R_u, \quad \text{if } P + R_u \le K_s \tag{3}$$

$$I = K_s, \quad \text{if } P + R_u > K_s \tag{4}$$

When $P + R_u$ surpasses $K_s$ in a cell, the surplus is referred to as runoff ($R_d$) and is directed to neighboring downslope cells.

$$R_d = P + R_u - K_s, \quad \text{if } P + R_u - K_s \ge 0 \tag{5}$$

$$R_d = 0, \quad \text{if } P + R_u - K_s < 0 \tag{6}$$
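The partition of the water supply into infiltration and runoff in Equations (3)–(6) reduces to a minimum and a maximum. A minimal per-cell sketch follows; the function name and the example values are hypothetical, and all quantities are assumed to share the same rate units.

```python
def partition_rainfall(P, Ru, Ks):
    """Split the supply P + Ru into infiltration I and downslope runoff Rd."""
    supply = P + Ru
    I = min(supply, Ks)          # Equations (3) and (4)
    Rd = max(supply - Ks, 0.0)   # Equations (5) and (6)
    return I, Rd

# Supply below the conductivity: everything infiltrates.
I_low, Rd_low = partition_rainfall(P=2.0, Ru=1.0, Ks=5.0)
# Supply above the conductivity: the surplus is routed downslope.
I_high, Rd_high = partition_rainfall(P=4.0, Ru=3.0, Ks=5.0)
```

By construction, $I + R_d = P + R_u$ in every cell, so no water is lost in the routing step.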

The TRIGRS model computes slope stability using an infinite-slope stability analysis (Equation (7)), as explained in Iverson [54]. In this analysis, the stability of an infinite slope is characterized by the ratio of the resisting basal Coulomb friction to the gravitationally induced downslope basal driving stress [55]. The TRIGRS model calculates this ratio, known as the factor of safety (*FoS*), at depth *Z* by

$$FoS(Z, t) = \frac{\tan \phi'}{\tan \delta} + \frac{c' - \psi(Z, t)\, \gamma_w \tan \phi'}{\gamma_s Z \sin \delta \cos \delta} \tag{7}$$

where $c'$ is the soil cohesion; $\phi'$ is the friction angle of the soil; $\gamma_s$ is the unit weight of soil and $\gamma_w$ is the unit weight of groundwater.
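Equation (7) can be checked numerically with a short function. This is a sketch under assumed units (kPa for cohesion, kN/m<sup>3</sup> for unit weights, m for depth and pressure head); the function name and the parameter values are illustrative and are not those used in this study.

```python
import math

def factor_of_safety(psi, Z, delta_deg, c, phi_deg, gamma_s, gamma_w=9.81):
    """Equation (7): infinite-slope factor of safety at depth Z."""
    delta = math.radians(delta_deg)
    phi = math.radians(phi_deg)
    friction_term = math.tan(phi) / math.tan(delta)
    cohesion_term = (c - psi * gamma_w * math.tan(phi)) / (
        gamma_s * Z * math.sin(delta) * math.cos(delta)
    )
    return friction_term + cohesion_term

# Same slope with zero pressure head and with 1 m of positive pressure head.
fos_dry = factor_of_safety(psi=0.0, Z=2.0, delta_deg=30.0, c=5.0, phi_deg=30.0, gamma_s=19.0)
fos_wet = factor_of_safety(psi=1.0, Z=2.0, delta_deg=30.0, c=5.0, phi_deg=30.0, gamma_s=19.0)
```

In this example the rise in pressure head alone moves the cell from stable ($FoS > 1$) to unstable ($FoS < 1$), which is the mechanism the transient term of the pressure head is meant to capture.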

To overcome the difficulties associated with manually updating several model parameters and the sophisticated data processing required by the standard TRIGRS model, Ma et al. [56] proposed a new implementation of the TRIGRS model using Matlab® programming. This implementation can directly read grid data in TIF format as input and write the prediction results, greatly simplifying data preparation and parameter configuration. It includes two script files, INPUT DATA.m and TRIGRS.m. The TIF input files are read by the INPUT DATA.m file, whereas TRIGRS.m is the executable program that calculates the pressure head and FoS. By computing the pressure head and FoS at various soil depths, the model provides the minimum FoS and the accompanying pressure head in TIF format. More information may be found in [56]. The flow chart of this study is shown in Figure 8.

**Figure 8.** Flow chart of this study.

#### **4. Results**

#### *4.1. Basic Characteristics of Rainfall-Induced Landslides*

Based on the detailed landslide database of this rainfall event, a total of 2665 landslides were triggered by this event (Figure 9a). The landslides were mostly small-scale shallow landslides with an elongated shape. The largest landslide has an area of about 50,000 m<sup>2</sup>, while the smallest one is only 36 m<sup>2</sup>; the average area is 1070 m<sup>2</sup>. Six landslides have an area greater than 10,000 m<sup>2</sup>, 21 landslides have an area of 5000–10,000 m<sup>2</sup> and 927 landslides have an area of 1000–5000 m<sup>2</sup>. However, the majority of landslides (1711) have an area of less than 1000 m<sup>2</sup>. We used a moving window with a radius of 2.5 km and a Gaussian density kernel function to calculate the landslide number density (LND) in the study area (Figure 9b). The results show that the maximum LND reaches over 80/km<sup>2</sup>. Spatially, the landslides are mainly concentrated in the northern part of Xiaqu town and the southern part of Yufang town.
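A kernel-based number density of this kind can be sketched as follows. The exact kernel parameters are not given in the text, so the bandwidth (one third of the window radius) and the truncation of the kernel at the window radius are assumptions, as are the function name and the example coordinates.

```python
import numpy as np

def landslide_number_density(points_km, grid_x, grid_y, radius_km=2.5, sigma_km=None):
    """Gaussian kernel density of landslide points, in landslides per km^2."""
    if sigma_km is None:
        sigma_km = radius_km / 3.0  # assumed bandwidth
    gx, gy = np.meshgrid(np.asarray(grid_x, float), np.asarray(grid_y, float))
    density = np.zeros_like(gx)
    norm = 1.0 / (2.0 * np.pi * sigma_km ** 2)
    for px, py in np.asarray(points_km, dtype=float):
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        kernel = norm * np.exp(-d2 / (2.0 * sigma_km ** 2))
        kernel[d2 > radius_km ** 2] = 0.0  # restrict to the moving window
        density += kernel
    return density

# One hypothetical landslide at the origin of a 10 km x 10 km grid (50 m cells).
axis = np.linspace(-5.0, 5.0, 201)
lnd = landslide_number_density([[0.0, 0.0]], axis, axis)
total = lnd.sum() * (axis[1] - axis[0]) ** 2
```

Because each kernel integrates to approximately one, summing the density over the grid recovers roughly the number of mapped landslides, which is a convenient consistency check.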

**Figure 9.** Spatial distribution of landslides and landslide number density (LND). (**a**) Inventory of the rainfall-induced landslides; (**b**) landslide number density (LND); the red line delineates the landslide abundance area.

#### *4.2. Correlation between Landslides and Influencing Factors*

To analyze the relationship between the influencing factors and landslide occurrence, we conducted a statistical analysis of the frequency distribution of landslides and landscape (non-landslide) areas under different intervals, as well as landslide area density (LAD) under different intervals. Figure 10 shows the frequency density distribution of landslides and landscape areas under different influencing factors. Figure 11 shows the LAD distribution of six influencing factors under different intervals. Higher LAD values indicate areas that are more prone to landslides. The results show that, for elevation, most landslides are concentrated between 300 and 500 m. The landslide frequency density reaches a maximum of 0.22 in the elevation interval of 370–430 m. As for the hillslope gradient, most landslides are concentrated in the interval of 15–25°, with an average slope of 21.7°. Both landslide and non-landslide areas have the highest frequency density in the slope interval of 14–18°, with values of 0.15 and 0.16, respectively. The same distribution pattern can also be observed in TWI. Specifically, the landslide frequency density reaches its maximum near 4.5, with a value of 0.36. As for topographic relief, non-landslide areas are mainly concentrated between 300 and 500 m, while landslide areas are mainly concentrated between 150 and 200 m. For distance to a river, most landslides are concentrated within 800 m, which is the area most affected by river erosion. In terms of total rainfall, landslides are predominantly concentrated in the regions with total rainfall intervals of 320–330 mm and 380–400 mm, with landslide frequency densities of 0.1 and 0.36, respectively.

**Figure 10.** Frequency density estimates of landslide and landscape areas of six influencing factors; (**a**) elevation; (**b**) hillslope gradient; (**c**) topographic relief; (**d**) distance to river; (**e**) topographic wetness index (TWI); (**f**) total rainfall.

**Figure 11.** The relationship between the landslide areal density (LAD) and six influencing factors; (**a**) elevation; (**b**) hillslope gradient; (**c**) topographic relief; (**d**) distance to river; (**e**) topographic wetness index (TWI); (**f**) total rainfall.

Based on the statistical relationship between LAD and different influencing factors (Figure 11), there is no significant correlation between elevation, relief or TWI and the landslide abundance index. For elevation, the highest LAD (0.17%) is observed in the elevation range of 400–450 m, while for relief the maximum LAD (0.13%) is observed in the ranges of 100–200 m and 550–650 m. By comparison, a tight correlation is seen between LAD and the other three influencing factors (i.e., hillslope gradient, distance to a river and total rainfall). For hillslope gradient, the LAD increases with the gradient following an exponential relationship: *y* = 0.0557*e*<sup>0.0244*x*</sup>, where *x* is the hillslope gradient and *y* is the LAD (Figure 11b). The equation indicates that the likelihood of landslide occurrence rises as the hillslope gradient increases. In terms of distance to a river, there is a negative linear relationship between the LAD and the distance: *y* = −6 × 10<sup>−5</sup>*x* + 0.124, where *x* represents the distance to a river and *y* is the LAD (Figure 11d), indicating that the LAD decreases with increasing distance to rivers. For total rainfall, the LAD and total rainfall show an exponential relationship: *y* = *e*<sup>0.023*x*</sup>, where *x* is the total rainfall and *y* is the LAD. Such a relationship demonstrates that landslides are more likely to occur in areas with high precipitation (Figure 11f).
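How such an exponential relationship can be obtained and evaluated is sketched below in Python. The fitting procedure used for Figure 11 is not described, so the log-linear least-squares fit shown here is an assumption; the function names are hypothetical, and only the coefficients 0.0557 and 0.0244 come from the text.

```python
import math


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by ordinary least squares on log(y).

    This weights the points differently from a nonlinear fit of y itself,
    so the coefficients may differ slightly from those of other tools.
    """
    n = len(x)
    log_y = [math.log(value) for value in y]
    mean_x = sum(x) / n
    mean_log_y = sum(log_y) / n
    sxy = sum((xi - mean_x) * (li - mean_log_y) for xi, li in zip(x, log_y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = math.exp(mean_log_y - b * mean_x)
    return a, b


def lad_from_gradient(gradient_deg, a=0.0557, b=0.0244):
    """Evaluate the reported LAD-gradient relation (LAD in %)."""
    return a * math.exp(b * gradient_deg)


# Evaluate the reported relation at two gradients.
lad_10 = lad_from_gradient(10.0)   # about 0.071 %
lad_30 = lad_from_gradient(30.0)   # about 0.116 %
```

Under the reported relation, the LAD at a gradient of 30° is roughly 1.6 times that at 10°, which quantifies the statement that steeper slopes are more landslide-prone.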

Figure 12 shows the areal coverage (%) of various lithological types for landslide and landscape areas, overlaid by the average landslide area and LAD estimated per unit. The result shows that the predominant lithology is Proterozoic plagioclase hornblende and granulite (Pt-p), which accounts for 25% of the study area, followed by Cretaceous rhyolite porphyry and glutenite (Kr-g), which accounts for more than 15% of the study area. Among all lithological types, Sinian mica schist and quartz schist (Sn-s) is the most prone to landslides, with over 35% of landslides occurring in 10% of the area. Furthermore, statistics on the average landslide area of different lithological units show that Kr-g and Pt-p have the largest average landslide area (>1400 m<sup>2</sup>), followed by Cambrian quartz sandstone (∈), which has an average landslide area of 1200 m<sup>2</sup>. Quaternary loose deposits (Q) have a small average landslide area of only 600 m<sup>2</sup>.

**Figure 12.** Areal coverage (%) of various lithological types for landslide and landscape, overlaid by average landslide area and LAD estimated per unit.

Based on the frequency density distribution and statistical analysis of landslide area density (LAD) across slope aspects in both landslide and non-landslide areas, we found that landslides are highly developed on E-SE-oriented slopes, particularly within a slope aspect of 60–100°. The statistical results further revealed that the LAD was highest within the 50–110° range, reaching 0.17%. Overall, east-facing slopes displayed a greater concentration of landslides, while west-facing slopes exhibited lower levels of landslide development (Figure 13).

**Figure 13.** (**a**) The distribution of aspect within landslide and landscape areas; (**b**) correlations between aspect and LAD.

#### *4.3. Spatio-Temporal Susceptibility Assessment*

To achieve accurate landslide spatiotemporal susceptibility results using physically based models, obtaining sufficient and correct input parameter data is the foremost requirement [30,52,57,58]. The Z-model of Saulnier et al. [59] was used to evaluate the thickness of the weathered soil mass. We assumed that the weathered soil mass overlying the bedrock has a maximum thickness of 5 m and a minimum thickness of 0.5 m, as determined by previous studies [60,61]. The soil thickness can then be estimated from altitude using Equation (8).

For landslide abundance areas, the lithology in the study area mainly includes Proterozoic granite (Pt-g), Sinian mica schist and quartz schist (Sn-s), Cambrian quartz sandstone (∈) and Quaternary loose deposits (Q). Based on previous studies [9,23,62] and rock engineering standards used in China [63], we assigned the corresponding values to hydrological and mechanical parameters, including soil cohesion (*c*), internal friction angle (*ϕ*), unit weight (*γ<sub>s</sub>*) and saturated hydraulic conductivity (*K<sub>s</sub>*) for different lithological types. Specific mechanical and hydrological parameter assignments for different lithologies can be found in Supplementary Materials Table S1. In addition, based on previous experience [44,64], the saturated hydraulic diffusivity *D*<sub>0</sub> was set to 200*K<sub>s</sub>*. The initial surface flux (*I<sub>ZLT</sub>*) is generally at least one order of magnitude lower than *K<sub>s</sub>* and was set to *I<sub>ZLT</sub>* = 0.01*K<sub>s</sub>*.

$$h\_i = h\_{\max} - \left(\frac{Z\_i - Z\_{\min}}{Z\_{\max} - Z\_{\min}}\right) (h\_{\max} - h\_{\min}) \tag{8}$$

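where *h<sub>i</sub>* is the estimated thickness of the weathered soil mass at grid cell *i* and *Z<sub>i</sub>* is the altitude of that cell; *Z<sub>max</sub>* and *Z<sub>min</sub>* are the maximum and minimum altitudes, respectively; and *h<sub>max</sub>* and *h<sub>min</sub>* are the maximum and minimum thicknesses of the weathered soil mass, respectively.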
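Equation (8) can be sketched in a few lines of Python. The sketch reads the equation as mapping the altitude *Z<sub>i</sub>* of a cell to its soil thickness *h<sub>i</sub>*, in line with the stated aim of estimating soil thickness from altitude, and uses the 5 m and 0.5 m bounds given above. The function name and the sample elevations are hypothetical.

```python
def soil_thickness(elevations, h_max=5.0, h_min=0.5):
    """Z-model soil thickness (m) from cell elevations (m), following Equation (8).

    The lowest cell receives h_max, the highest cell receives h_min,
    and thickness decreases linearly with elevation in between.
    """
    z_min = min(elevations)
    z_max = max(elevations)
    if z_max == z_min:  # flat terrain: avoid division by zero
        return [h_max for _ in elevations]
    scale = (h_max - h_min) / (z_max - z_min)
    return [h_max - (z - z_min) * scale for z in elevations]


# Hypothetical elevations in metres.
thickness = soil_thickness([200.0, 500.0, 800.0])  # [5.0, 2.75, 0.5]
```

Because the mapping is linear in elevation, it ignores local controls on soil depth such as curvature or slope position; it serves here only as a first-order estimate of the input layer.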

Figure 14 shows the predicted FoS maps based on rainfall data over different time periods. Prior to the rainfall, most of the study area had an FoS greater than 1.2. As the rainfall event began, unstable areas with an FoS less than 1.2 (shown in red) mainly appeared on both sides of the gullies and, by 8 am on 7 May, the unstable area within the study area had rapidly increased. After 12 h of rainfall (reaching a rainfall amount of 102 mm) on 8 May at 8 am, the unstable area (shown in red) reached its critical value. Although several subsequent intermittent rainfall events occurred, their impact on the FoS was negligible and the unstable area remained unchanged. We compared the prediction accuracy of the simulation results over different time periods based on the ROC curve. As shown in Supplementary Materials Figure S1, the prediction ability of the assessment results in different time periods varied between 0.68 and 0.72. Among them, the evaluation results for 7 and 8 May had the highest prediction accuracy, around 0.72.
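A ROC-based comparison of this kind can be illustrated with a short Python sketch that computes the area under the ROC curve from an FoS map and a landslide inventory. It assumes that a lower FoS means a higher predicted likelihood of failure and that the reported values are areas under the curve; the function name and the sample values are hypothetical, and this is not the evaluation code behind Figure S1.

```python
def roc_auc(is_landslide, fos):
    """Area under the ROC curve, treating low FoS as a positive prediction.

    Equals the probability that a randomly chosen landslide cell has a lower
    FoS than a randomly chosen non-landslide cell (ties count as one half).
    """
    landslide_fos = [f for flag, f in zip(is_landslide, fos) if flag]
    stable_fos = [f for flag, f in zip(is_landslide, fos) if not flag]
    if not landslide_fos or not stable_fos:
        raise ValueError("need both landslide and non-landslide cells")
    wins = 0.0
    for f_landslide in landslide_fos:
        for f_stable in stable_fos:
            if f_landslide < f_stable:
                wins += 1.0
            elif f_landslide == f_stable:
                wins += 0.5
    return wins / (len(landslide_fos) * len(stable_fos))


# Hypothetical cells: two mapped landslides, two stable cells.
auc = roc_auc([1, 0, 1, 0], [0.9, 1.0, 1.5, 2.0])  # 0.75
```

The pairwise loop is quadratic in the number of cells, so a rank-based formulation or a library routine would be preferable for full rasters.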

We conducted a statistical analysis of the temporal variations of the FoS results for different hillslope gradients during different time periods. Figure 15a shows the variations in the FoS of the grids with gradients less than 30°, where the majority of the FoS values ranged from 1.8 to 2.4, with an average of approximately 2.1, before the onset of rainfall. As heavy rainfall occurred, the FoS gradually decreased and the average FoS reached around 1.95 at 8:00 on 7 May. Subsequently, the intense rainfall on 8 May resulted in a significant decrease in the FoS for the grids with gradients less than 30°, with the average FoS maintained at around 1.85. Despite the persistent rainfall in the later period, the overall variation in the FoS for most grids was small, with an average of approximately 1.8. Similar phenomena were observed for the grids with gradients greater than 30° (Figure 15b). Prior to the rainfall, the FoS for these grids was mostly distributed between 1.4 and 1.7, with an average of approximately 1.6. As rainfall increased, the FoS continued to decrease and, after the intense rainfall event on 8 May, most of the shallow weathered soil layers became saturated. As a result, the FoS reached critical values and remained between 1.1 and 1.4, with an average of approximately 1.3. Although the rainfall continued, its impact on the FoS was small due to the saturated state of the soil layers and the FoS for the grids remained stable.

**Figure 14.** Conditions for slope stability, as measured by the factor of safety (FoS) at different times during the 2016 rainfall event; (**a**) 20:00 on 4 May (UTC + 8, before rainfall event); (**b**) 8:00 on 5 May (UTC + 8); (**c**) 8:00 on 6 May (UTC + 8); (**d**) 8:00 on 7 May (UTC + 8); (**e**) 8:00 on 8 May (UTC + 8); (**f**) 20:00 on 8 May (UTC + 8); (**g**) 8:00 on 9 May (UTC + 8); (**h**) 8:00 on 10 May (UTC + 8).

**Figure 15.** FoS results for various hillslope gradient intervals at different rainfall times; (**a**) hillslope gradient: <30°; (**b**) hillslope gradient: >30°.

#### **5. Discussion**

Topography, geological features and rainfall characteristics are considered to be essential factors influencing the occurrence of rainfall-induced landslides [65–67]. It is generally accepted that steeper terrain, weaker rock strength and higher rainfall amounts increase the likelihood of landslides [17,68,69]. To better understand the spatial distribution of the rainfall-induced landslides with different elevations, hillslope gradients, topographic relief, lithological types and rainfall characteristics, swath profiles (EEN-WWS) with a width of 10 km are presented (Figure 16). The spatial distribution of landslides shows that the majority of the landslides are concentrated in Xiaqu town, which has the maximum landslide number density (100/km<sup>2</sup>). In terms of topography, the Xiaqu area belongs to the transitional zone from high to low altitude, with an elevation generally ranging from 400 to 600 m. The hillslope gradients in this area are moderate, ranging from 10 to 30° with an average gradient of 20°. Regarding the rainfall distribution, the Xiaqu area is located in the region with the highest precipitation of nearly 400 mm. Thus, the spatial distribution of landslides is strongly controlled by the rainfall characteristics (Figures 10f and 16). In addition, the primary rock type in the Xiaqu area is Sinian mica schist and quartz schist (Sn-s) and the statistical results show that over 35% of landslides are distributed in the area with this lithological type (Figure 12). Schist is a metamorphic rock whose behavior is influenced by geological structure, tectonics and mineral composition. When schist is in contact with water, characteristics such as creep, mechanical anisotropy, softening and deterioration may be observed. Due to these properties, schist in mountainous areas often experiences frequent landslides [70,71]. In the study area, mica schist and quartz schist have well-developed cleavage and high mica content. Meanwhile, due to long-term physical weathering, the rock mass on the surface of the bedrock is fragmented, with highly developed fissures that are locally filled with mud. Therefore, this lithological type is prone to the development of weak structural layers after prolonged rainfall infiltration, thereby providing a natural sliding surface for landslides [72]. Additionally, due to the development of cracks in the weathered rock and soil, long-term rainfall can penetrate through the cracks and increase pore water pressure in the rock mass; the decreased shear resistance caused by the increased self-weight of the rock mass is also one of the main reasons for the occurrence of landslides.

In the southeast coastal area, orographic amplification of rainfall and the projection of the rainfall vector on hillslopes might result in greater rainfall on the windward hillslopes, resulting in a higher incidence of landslides on the hillslope scale [73]. During the summer months (June and July), the Sanming area is dominated by the southeast monsoon, which is influenced by monsoon depressions and tropical cyclones. The distribution of landslides during this rainfall event indicates that hillslopes facing southeast and east are more susceptible to collapse than those facing northwest-north (Figure 13). This phenomenon can be mainly attributed to the fact that the southeast- and east-facing slopes are predominantly windward, leading to greater rainfall and splash erosion.

**Figure 16.** Comparison of longitudinal (EEN-WWS) swath profiles of elevation, slope angle, rainfall, lithology and LND; the location of the swath profile is shown in Figure 8.

In addition, we collected the landslide data and the corresponding population statistics from different towns (Figure 17). According to the statistical results, the landslide size in each town was similar, with most landslides ranging from 600 to 1000 m<sup>2</sup>. Hangtan, Guangming and Yunkou towns had relatively larger average landslide areas, reaching 1000 m<sup>2</sup> (Figure 17). In terms of the relationship between landslide numbers and population distribution in each town, the towns with the most concentrated landslides were those with the lowest populations (Figure 17). For example, more than 1000 landslides occurred in Xiaqu town, but the local population is only 7000. Yufang town had nearly 600 landslides, but its population is only around 2700. On the other hand, areas with fewer landslides corresponded to regions with more concentrated populations, such as Guyong and Shancheng towns, which had populations of 50,000 and 45,000, respectively, but only around 50 landslides each. By comparing the topography, geomorphology and rainfall characteristics of these towns, we found that Yufang and Xiaqu towns were located in an area with the highest rainfall intensity and the most susceptible strata for landsliding. In contrast, Guyong and Shancheng towns had gentler terrain, with most hillslope gradients being less than 15°, making them low-susceptibility areas for landsliding. Although these areas are considered low-susceptibility areas, the loss of life and property is still possible due to landslides triggered by typhoons and heavy rainfall events. Conversely, although Yufang and Xiaqu towns are considered high-susceptibility areas for landsliding, their low population density may result in a lower impact on residents and public facilities.

**Figure 17.** Landslide scale, number of landslides and the corresponding population distribution in different towns of the study area.

We calculated the antecedent precipitation index (API) of the study area with different statistical steps (5, 10 and 15 days) based on the daily precipitation data (Figure 18). The relevant information about the API is detailed in the Supplementary Materials. From 1 April to 16 May, all three API curves were in the high-value range, with peaks occurring around 10 April, 20 April and 10 May. The first two peaks had relatively smaller API values (90 mm and 80 mm), while the largest peak appeared around 10 May. The Sanming area witnessed a strong rainstorm from 5 May to 10 May, with a total precipitation of 250 mm within 24 h and a maximum rainfall intensity of 56.6 mm/h. During this time, the API curves increased significantly and reached their peak (250–280 mm). Subsequently, as the rainfall events ceased, the API rapidly decreased. In summary, the preceding low-intensity rainfall events (before 5 May) increased the soil water content and the short-duration heavy rainfall on 8 May rapidly saturated the soil, thus leading to the occurrence of massive landslides in the Sanming area.
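For readers without access to the Supplementary Materials, one widely used form of the API can be sketched in Python as a decay-weighted sum of the rainfall on the preceding *n* days, API<sub>*t*</sub> = Σ *k*<sup>*i*</sup>*P*<sub>*t*−*i*</sub> for *i* = 1 to *n*. This form, the decay constant `k` and the function name are assumptions; the definition actually used in this study is the one given in the Supplementary Materials and may differ.

```python
def antecedent_precipitation_index(daily_rain_mm, n_days, k=0.84):
    """Decay-weighted antecedent precipitation index for each day (mm).

    daily_rain_mm: daily rainfall series, oldest value first.
    n_days:        number of preceding days included (e.g., 5, 10 or 15).
    k:             decay constant between 0 and 1 (hypothetical value).
    Days without a full n_days history are returned as None.
    """
    weights = [k ** i for i in range(1, n_days + 1)]
    api = [None] * len(daily_rain_mm)
    for t in range(n_days, len(daily_rain_mm)):
        # Rainfall on days t-1, t-2, ..., t-n_days (most recent first).
        preceding = daily_rain_mm[t - n_days:t][::-1]
        api[t] = sum(w * p for w, p in zip(weights, preceding))
    return api


# Hypothetical series: one wet day followed by dry days, 3-day window.
api = antecedent_precipitation_index([10.0, 0.0, 0.0, 0.0], n_days=3, k=0.5)
```

In this toy series the API on the last day is 1.25 mm, the contribution of the 10 mm that fell three days earlier after three halvings, which shows how older rainfall is progressively discounted.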

**Figure 18.** The rainfall data and calculated antecedent precipitation index (API) of this rainfall event.

In physically based models, accurate input information is essential to obtain accurate simulation results [58,74]. However, obtaining accurate input parameters for rock and soil is challenging in practice, so the input data differ from the actual conditions to some extent. For example, the mechanical parameters vary spatially across a slope, and the mechanical and hydrological parameters may differ at various depths [75]. In addition, physically based models are typically applied over large areas, where sampling is limited by manpower and material resources, resulting in a certain degree of uncertainty in the model input parameters [76,77]. The evaluation results based on MAT.TRIGRS (V1.0) show that the overall prediction accuracy is around 0.7 (Figure S1), which indicates that the predictions are reasonably reliable. However, due to the subjectivity of the input parameters, there are still some discrepancies between the overall evaluation results and the actual landslides (Figure 14), and some areas with little landsliding are predicted as high-susceptibility areas. Therefore, obtaining accurate input parameters on a regional scale remains an important constraint for rainfall-induced landslide susceptibility assessment based on physically based models.

#### **6. Conclusions**

The objective of this study was to examine the landslides that occurred during the heavy rainfall event from 5 to 10 May 2016 in the Sanming area and to identify the geological, geomorphological and hydrometeorological factors that contributed to landslide hazards. The rainfall event resulted in around 2700 landslides covering a total area of 2.8 km<sup>2</sup>, predominantly in Xiaqu town. We analyzed the landslide distribution pattern and its relationship with various factors. Our findings suggest that elevation, relief and topographic wetness index (TWI) were not significantly correlated with the landslide abundance index, while hillslope gradient, distance to a river and total rainfall played a significant role in the occurrence of landslides. The Sinian mica schist and quartz schist (Sn-s) lithological type was found to be the most susceptible to landslides, with more than 35% of landslides occurring in only 10% of the area. This indicates that landslides are more likely to occur in Sn-s strata during rainfall events. Additionally, the study showed that southeast- and east-facing hillslopes are more susceptible to collapse than northwest-north-oriented hillslopes due to the influence of the summer southeast monsoon. To further examine the susceptibility of the area to landslides, we used the MAT.TRIGRS (V1.0) tool for spatio-temporal susceptibility assessment. The results showed that unstable areas mainly appeared on both sides of the gullies as the rainfall event began. By 8 am on 7 May, the unstable area within the study area had rapidly increased. Our numerical simulations indicate that the preceding low-intensity rainfall events (before 5 May) increased the soil water content and the short-duration heavy rainfall on 8 May rapidly saturated the soil, thus leading to the occurrence of massive landslides in the Sanming area. Based on the findings, we recommend that more attention should be given to Sn-s lithological types in the region during the planning and implementation of landslide hazard mitigation measures. Additionally, our study highlights the importance of considering the influence of monsoons on landslide susceptibility in the area. Future studies could consider using high-resolution satellite images and other remote sensing techniques to monitor and map the landslide distribution in the study area, as well as to assess the effectiveness of the mitigation measures implemented.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs15112738/s1. Figure S1. Prediction curves of simulation results over different time periods; Table S1. Mechanical and hydrological parameters for different lithological types in landslide abundance areas. References [78–83] are cited in the Supplementary Materials.

**Author Contributions:** The research concept was proposed by C.X. who also contributed to the data curation and analysis. S.M. designed the research framework, processed the relevant data and drafted the manuscript. X.S. participated in the data analysis and contributed to the manuscript revisions. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by the National Key Research and Development Program of China (2021YFB3901205).

**Data Availability Statement:** The source code of MAT.TRIGRS (V1.0) (https://doi.org/10.1016/j.nhres.2021.11.001, Ma et al., 2022 [56]) is available at https://github.com/Masiyuanlandslides/MAT.TRIGRS-V1.0- (accessed on 10 May 2023).

**Acknowledgments:** We thank Google Earth for the free-to-access satellite images used in this study. We would also like to thank the three anonymous reviewers for their comments, which were very useful for improving the quality of the manuscript.

**Conflicts of Interest:** The authors declare that they have no known competing financial interest or personal relationship that could have appeared to influence the work reported in this paper.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **An Improved Multi-Source Data-Driven Landslide Prediction Method Based on Spatio-Temporal Knowledge Graph**

**Luanjie Chen <sup>1,2</sup>, Xingtong Ge <sup>1,2</sup>, Lina Yang <sup>1,2,\*</sup>, Weichao Li <sup>1</sup> and Ling Peng <sup>1,2</sup>**


**Abstract:** Landslides pose a significant threat to human lives and property, making the development of accurate and reliable landslide prediction methods essential. With the rapid advancement of multi-source remote sensing techniques and machine learning, remote sensing data-driven landslide prediction methods have attracted increasing attention. However, the lack of an effective and efficient paradigm for organizing multi-source remote sensing data and a unified prediction workflow often results in the weak generalization ability of existing prediction models. In this paper, we propose an improved multi-source data-driven landslide prediction method based on a spatio-temporal knowledge graph and machine learning models. By combining a spatio-temporal knowledge graph and machine learning models, we establish a framework that can effectively organize multi-source remote sensing data and generate unified prediction workflows. Our approach considers the environmental similarity between different areas, enabling the selection of the most adaptive machine learning model for predicting landslides in areas with scarce samples. Experimental results show that our method outperforms machine learning methods, achieving an increase in F1 score by 29% and an improvement in processing efficiency by 93%. Furthermore, by comparing the susceptibility maps generated in real scenarios, we found that our workflow can alleviate the problem of poor prediction performance caused by limited data availability in county-level predictions. This method provides new insights into the development of data-driven landslide evaluation methods, particularly in addressing the challenges posed by limited data availability.

**Keywords:** landslide prediction; spatial–temporal knowledge graph; machine learning; multi-source remote sensing data

**Citation:** Chen, L.; Ge, X.; Yang, L.; Li, W.; Peng, L. An Improved Multi-Source Data-Driven Landslide Prediction Method Based on Spatio-Temporal Knowledge Graph. *Remote Sens.* **2023**, *15*, 2126. https://doi.org/10.3390/rs15082126

Academic Editors: Ioannis Papoutsis, Konstantinos G. Nikolakopoulos and Constantinos Loupasakis

Received: 31 December 2022 Revised: 12 April 2023 Accepted: 14 April 2023 Published: 17 April 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

A landslide is a process in which the soil or rock on a slope falls, topples, slides, spreads or flows due to the influence of various causative factors [1]. In recent years, landslide hazards have caused serious losses of human life and property, severely constraining economic and social development on a worldwide scale. The scientific and accurate prediction of landslides is thus of primary importance.

The common methods of landslide prediction can be divided into knowledge-driven methods and data-driven methods. Knowledge-driven methods are based on an understanding of the mechanisms of landslide formation for susceptibility prediction. One of the most dominant approaches is to predict landslides by comprehending the physical mechanisms of landslide formation using physical equations and numerical simulation methods. Liu et al. [2] utilized physical modeling and various instruments to study the evolution and instability of a locked segment landslide under rainfall conditions and identified tilting deformation as a criterion for landslide instability. Capparelli et al. [3] used a physical model, SUSHI, to simulate the role of subsurface hydrology in rain-induced landslides in Campania, Italy. The model enables a better understanding of rainfall infiltration and suction changes in the triggering mechanism of the phenomena. Additionally, some studies have predicted landslide susceptibility based on empirical or statistical methods that assign weights to each causative factor. Mandal et al. [4] applied the analytical hierarchy process (AHP) using geospatial tools to develop a landslide susceptibility map for the Lish River basin in the eastern Darjiling Himalaya. Akgun et al. [5] also produced landslide susceptibility maps for a landslide-prone area in Findikli District using likelihood frequency ratio (LRM) and weighted linear combination (WLC) models. The results showed that the WLC model performed better. However, knowledge-driven methods heavily rely on professional knowledge, and the results are greatly influenced by human expertise.

To overcome this shortcoming, remote sensing data-driven methods have been proposed for landslide prediction. Supervised machine learning methods are by far the most widely used data-driven approach applied to landslide prediction. Typically, machine learning models use remotely sensed images as the data source to generate landslide inventories [6], and then construct relationships between input and output variables based on these inventories [7]. The most commonly used machine learning methods include Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and Artificial Neural Network (ANN). For example, Chen et al. [8] compared kernel logistic regression, naive-Bayes tree and alternating decision tree models in landslide prediction in Taibai County (China). Tian et al. [9] adopted an artificial neural network (ANN) model to predict landslides in Minxian, China. Marjanović et al. [10] tested different kernel functions of the support vector machine (SVM), selected the most accurate kernel function as the model parameter, and carried out landslide susceptibility mapping for Chittagong, Bangladesh. In addition, ensemble methods have been gradually applied to produce landslide susceptibility maps [11]. Pham et al. [12] combined a rotation forest and different machine learning classifiers to produce landslide susceptibility maps of India. Dou et al. [13] used SVM as the base learner to generate four classes of ensemble learning models to predict catastrophic rainfall-induced landslides. Hong et al. [14] used the J48 decision tree to construct adaptive boosting (Adaboost), bootstrap aggregating (Bagging) and rotation forest models to conduct a comparative study of landslide susceptibility in Guangchang County, Fuzhou City, and the results showed that the rotation forest model has better spatial prediction performance. Although machine learning methods can predict landslides and achieve high accuracy, the prediction effectiveness of the model is closely related to the quantity of data available in the study area. For example, the study area may have problems such as the low spatial resolution of remote sensing data and noisy historical landslide data, which can result in a scarcity of usable data and make it difficult to fit the model.

To date, some methods have addressed the problem of sample scarcity by introducing Generative Adversarial Networks (GANs). For example, Al-Najjar et al. [15] proposed a novel approach using GANs to correct imbalanced landslide datasets. Their research showed that integrating GANs with machine learning models can improve the effectiveness of landslide prediction. However, GANs' complex training procedures and lack of interpretability may limit their practicality and reliability for landslide prediction in real-world scenarios. Furthermore, some methods have addressed the problem of scarce environmental data by considering the environmental information of multiple regions. For instance, Zhu et al. [16] added an unsupervised representation learning module to form the underlying representations embedded in thematic maps, which improved the model's accuracy. Ai et al. [17] transferred features from a large dataset region, utilized a pre-trained model, and established a transfer-learning-based susceptibility assessment model to enhance landslide prediction in regions with limited samples. These methods involve multi-source remote sensing data, and as the number of research areas increases, the data scale sharply increases. Therefore, these remote sensing data need to be scientifically integrated and organized in practical applications to meet the requirements of effectiveness and efficiency. Based on well-organized data, it is necessary to establish a unified prediction process to ensure an accurate and fast landslide prediction analysis of the regions of interest according to a standardized procedure. However, existing methods seldom consider the difficulties of organizing environmental big data from multiple sources, which reduces the efficiency of data reuse. Additionally, the lack of a systematic workflow for transfer-learning-based methods leads to the need to establish different models for different areas, reducing the prediction efficiency of machine learning methods.

A knowledge graph is a modeling approach that uses symbols to describe entities, concepts, and relationships in the real world [18–20]. It has received increasing attention in recent years. In the domain of geoscience, knowledge graphs are used to obtain spatio-temporal knowledge and geographic knowledge from multi-source remote sensing data and textual data; such graphs are known as spatio-temporal knowledge graphs [21,22] or geographic knowledge graphs [23,24]. The spatio-temporal knowledge graph uses a graph structure for unified spatio-temporal data management, intelligent retrieval and inference analysis, which makes it an effective means to fuse, organize and compute the multi-source data involved in landslide prediction. In this paper, we propose a workflow for landslide prediction based on a spatio-temporal knowledge graph, which not only alleviates the problem of landslide sample scarcity but also improves the efficiency of data usage and landslide prediction. On the one hand, the spatio-temporal knowledge graph is used to fuse remote sensing environmental data, models, and datasets that are closely related to landslide prediction, which makes multi-area environmental data under different conditions rapidly available. On the other hand, the applicability of the machine learning model is enhanced by designing semantic reasoning rules in the knowledge graph. The method extends the traditional machine-learning-based landslide prediction method by adding the process of extracting, storing, and analyzing environmental knowledge, which improves landslide prediction under the condition of sample scarcity.

This paper makes the following main contributions: (1) We propose a workflow for landslide susceptibility evaluation that combines spatio-temporal knowledge graphs and machine learning models. (2) We propose a method for organizing remote sensing environmental data based on semantic structure, and improve the efficiency of remote sensing data usage by constructing a schema. (3) We define inference rules for candidate model selection and environmental similarity analyses to reduce the impact of sample scarcity on landslide prediction results. (4) We incorporate the knowledge of environmental features in the remote sensing data-driven machine learning method to enhance the applicability of the model, and demonstrate the benefits of this method through experiments. In the following, Section 2 explains our proposed workflow, introduces the construction method of the spatio-temporal knowledge graph, and gives the details of predicting landslides using our method. The advantages of our method are then demonstrated by experiments in Section 3, and the experimental results are analyzed in Section 4. Finally, our study is concluded in Section 5.

#### **2. Materials and Methods**

#### *2.1. Workflow for Landslide Prediction*

Generally, when using machine learning models for landslide prediction, it is necessary to first define the boundary of the area for landslide prediction. Secondly, data related to landslides, including historical landslide data and environmental data in the area, are collected by means of remote sensing techniques or fieldwork. Based on these data, datasets are created. The dataset contains the environmental features, i.e., causative factors, that need to be input to the model, and the landslide prediction results, i.e., labels, that are output from the model. Then, the parameters of the model are trained based on the dataset. After training, the optimal model is obtained, and the prediction performance is evaluated based on the model. Eventually, the landslide is predicted based on the model.

To improve the effectiveness of landslide prediction in areas with scarce samples, we introduce a knowledge graph into the workflow of machine-learning-based landslide prediction. Firstly, as in the general workflow, we define the boundary of the area to be evaluated and collect historical landslide data and environmental data from the area. Secondly, the environmental data are structured as knowledge and imported into a knowledge graph, i.e., knowledge is extracted from the environmental data. Then, we evaluate whether the quantity of historical landslide data can support the training of the model. If it can, the subsequent steps are performed following general machine learning methods, including producing the dataset of the area, training the model, and predicting landslides. If the quantity of the historical landslide data cannot support the training of the model, the environmental similarity between the study area and the areas of the candidate models is analyzed based on the knowledge graph. Finally, the model with the highest similarity to the study area is selected among the candidate models for landslide prediction. Figure 1 shows the difference between the general workflow and the workflow using the knowledge graph.

**Figure 1.** Landslide prediction workflow using machine learning (**left**), and additional steps for improvement using the spatio-temporal knowledge graph (**right**).

#### *2.2. Design of Spatio-Temporal Knowledge Graph for Landslide Prediction*

In this paper, we use the spatio-temporal knowledge graph to organize remote sensing environmental data, machine learning models and datasets of the study area. The spatio-temporal knowledge graph includes the schema layer and the data layer, as shown in Figure 2.

**Figure 2.** The structure of the spatio-temporal knowledge graph for Landslide Prediction.

#### 2.2.1. Schema Layer

The schema of the knowledge graph is used to describe and organize the spatio-temporal data related to landslides and to define the rules for landslide prediction. We implement the schema using ontologies, which include a spatial ontology, a temporal ontology, and a landslide prediction ontology. Each ontology defines classes, properties, and rules. The classes and properties describe the concepts and their relations involved in landslide prediction, while the rules use classes and properties as symbols to describe the process of spatio-temporal analysis and landslide prediction. The structure of the schema, as well as the main ontologies, concepts, and attributes used, are shown in Figure 3.

**Figure 3.** Schema structure, including the division of ontology, main concepts and attributes.

• Spatial ontology

The spatial ontology is used to describe the spatial information of geographic objects and is constructed based on the GeoSPARQL ontology [25]. Geocoding rules are designed in the spatial ontology to serve as the location index of the geographic object.

The classes of the spatial ontology mainly introduce two subclasses of geographic objects in the GeoSPARQL ontology: the feature class and the geometry class. The spatial terms defined based on the feature and geometry classes can be helpful in modeling geospatial data.

The properties of the spatial ontology mainly define the topological spatial relations between geographic objects, as well as the geometry literal [26], which is the serialization standard used when generating geometry descriptions and the supported geometry types. In addition, the properties of the spatial ontology also include Metric [26], which are scalar spatial properties that describe the geographic object. The main rule of the spatial ontology includes basic ontology constraints for class and property, such as constraints on hierarchical relationships between classes and constraints on property values. The core rule includes rules defined in the GeoSPARQL ontology, such as the query transformation rule for computing spatial relations between geographic objects based on their geometries [25].

In addition, indexable location information helps to improve the efficiency of spatial analysis. However, remote sensing data describe spatial information with latitude and longitude coordinates, which cannot be objectified and indexed. To solve this problem, we designed a geographic tile-based spatial indexing rule, i.e., geocode. Figure 4 illustrates an example of geocoding.

**Figure 4.** An example of a spatial description of a geographic object (landslide point), converting the coordinate property of the geographic object to a geocode property (Tile5536) so that the spatial information of the geographic object can be indexed.

Geocode converts the coordinate properties of geographic objects into tile-coded entities according to the Web Mercator rules [27], i.e., the tile number of the Web Mercator coordinate system is used, instead of the latitude and longitude coordinate system, as the unit to describe the location of the geographic entity. Each tile number consists of the horizontal coordinates, vertical coordinates and zoom level of the tile. The conversion rules are as follows:

$$x = \frac{\text{lon} + 180}{360} \cdot 2^z \tag{1}$$

$$y = \left(1 - \frac{\ln\left(\tan\left(lat \cdot \frac{\pi}{180}\right) + \frac{1}{\cos\left(lat \cdot \frac{\pi}{180}\right)}\right)}{\pi}\right) \cdot 2^{z-1} \tag{2}$$

where *lon* and *lat* denote the entered longitude coordinate and the entered latitude coordinate, *x* denotes the tile horizontal coordinate after conversion, *y* denotes the vertical coordinate after conversion, *z* denotes the zoom level of the tile. Each tile in the geocode represents a set of latitude and longitude coordinates, and tiles with different zoom levels contain different amounts of latitude and longitude coordinates. The higher the zoom level, the fewer the number of latitude and longitude coordinates in a tile, and the more accurate the spatial description of the geographic object.
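As an illustration, Equations (1) and (2) can be sketched in a few lines of Python. Flooring the results to integer tile indices, clamping them to the valid range, and the `Tile_z_x_y` naming pattern are assumptions made for this sketch; the paper does not specify how the tile entity is named.

```python
import math


def lonlat_to_tile(lon: float, lat: float, z: int) -> tuple:
    """Convert longitude/latitude (degrees) to Web Mercator tile indices."""
    lat_rad = math.radians(lat)
    n = 2 ** z  # number of tiles along each axis at zoom level z
    # Equation (1): horizontal tile coordinate
    x = (lon + 180.0) / 360.0 * n
    # Equation (2): vertical tile coordinate
    y = (1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) * 2 ** (z - 1)
    # Assumption: floor to integer indices and clamp to [0, n - 1]
    x_idx = min(max(math.floor(x), 0), n - 1)
    y_idx = min(max(math.floor(y), 0), n - 1)
    return x_idx, y_idx, z


def geocode(lon: float, lat: float, z: int) -> str:
    """Build a tile identifier; the naming pattern is hypothetical."""
    x, y, z = lonlat_to_tile(lon, lat, z)
    return f"Tile_{z}_{x}_{y}"


# The origin (0, 0) falls in the tile just below and right of the map center.
print(lonlat_to_tile(0.0, 0.0, 1))
print(geocode(0.0, 0.0, 2))
```

Because every coordinate inside a tile maps to the same identifier, two geographic objects can be tested for co-location at a given zoom level by comparing their geocodes.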

• Temporal ontology

The temporal ontology is used to describe the temporal information of geographic objects, and we construct it based on the OWL-Time ontology [28].

The classes of the temporal ontology mainly define the instant and interval to describe the temporal position and duration of the geographic object. The properties of the temporal ontology mainly define the topological temporal relations between geographic objects, such as "meets", "overlaps" and "during", developed by Allen [29]. The temporal ontology also defines the Date–Time Literal, which is a serialization standard describing time. Similar to the spatial ontology, the main rule of the temporal ontology includes basic ontology constraints for classes and properties. Additionally, the main rule includes rules defined in the OWL-Time ontology. For example, OWL-Time defines time analysis rules for computing temporal relations between geographic objects based on their time instant and time interval [28].

• Landslide prediction ontology

The Landslide Prediction Ontology is used to describe the concepts needed for landslide prediction, the relations between concepts, and the reasoning process of landslide prediction.

The classes of the ontology define concepts that describe the landslide situation, such as the severity of the landslide and the phase it is in. Since the environment is the root cause of landslides, the classes also define concepts describing the environment, including the natural environment and the social environment. Additionally, concepts related to machine learning are defined in the classes, such as vocabulary to describe the features of models and datasets. Furthermore, the process of landslide prediction is divided into several events and actions; hence, we also need to define the events and actions involved in landslide prediction in the Landslide Prediction Ontology. For example, when selecting the best model for an area, candidate models are described in the ontology by defining classes.

The properties of the ontology mainly define the relations among landslides, the environment, and machine learning methods, such as describing which environmental factors are causative factors for landslides and which causative factors are used as features of the dataset. The Landslide Prediction Ontology also defines the relations between events and actions in the landslide prediction process. For example, when the environmental similarity between areas is calculated, the result triggers the action of model selection. The relation of this "trigger" is described as a property.

The main rule of the ontology includes basic ontology constraints for class and property. Meanwhile, based on the classes and properties of the Landslide Prediction Ontology, we use the production representation to define a series of rules to describe the process of remote sensing data-driven landslide prediction. This includes the calculation method of environmental similarity, the process of model selection, and the process of landslide prediction.

#### 2.2.2. Data Layer

The data layer consists of subject–predicate–object (SPO) triples, where subjects and objects represent entities in the knowledge graph, and predicates denote the edges connecting them. The raw data include three types of independent data: environmental data, area-based dataset, and candidate model.


After generating SPO triples in the data layer, the schema layer relates and organizes these triples to form knowledge that is useful in landslide prediction.

#### *2.3. Landslide Prediction Using Spatio-Temporal Knowledge Graph*

#### 2.3.1. Knowledge Extraction and Storage

When processing environmental data, it is essential to extract and store knowledge while constructing a knowledge graph. After preprocessing the remote sensing monitoring data, they pass through the steps designed to generate knowledge. Simultaneously, data associated with the dataset and the machine learning model are produced by utilizing knowledge, and these data undergo a series of steps to generate corresponding knowledge that helps optimize the results of landslide prediction. The process of transforming data into knowledge is depicted in Figure 5.

**Figure 5.** The process of producing knowledge from data in the knowledge graph approach.

• Data preprocessing

Remote sensing monitoring data quantify environmental elements by assigning a value to each pixel, such as the elevation in DEM data. Since knowledge in a knowledge graph is based on object representation, we convert discrete features in remote sensing data into attributes in objects. The GeoJSON file uses a feature object-based storage mode, which is more conducive to knowledge graph reading than remote sensing data. We preprocess the data and convert the original multi-source remote sensing data into a GeoJSON file that describes the distribution of environmental elements in the study area. After generating the GeoJSON file, we classify adjacent pixels in the remote sensing data with the same environmental element value into the same feature object.

Typically, remote sensing monitoring data of different environmental elements have different spatial ranges. Therefore, in the data preprocessing stage, we need to crop the original remote sensing data to obtain the remote sensing data within the spatial range of the research area. Additionally, remote sensing data from different sources may use different projected coordinate systems, so we must convert multi-source remote sensing data into the same projected coordinate system. We also scale the raster values, which may be decimal, to an integer by multiplying and rounding them. Finally, we generate a GeoJSON file by converting the raster data into vector data and then converting the vector data into the GeoJSON format. During the conversion process, we ensure that the original raster value is restored. The entire process can be automated using the GDAL library [30].
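The value scaling and pixel grouping steps can be illustrated with a small GDAL-free sketch. It is a stand-in for the GDAL-based pipeline rather than a reproduction of it: the scale factor, the 4-connectivity rule, the `MultiPoint` geometry in pixel coordinates, and the property key are all assumptions chosen to keep the example self-contained.

```python
from collections import deque

SCALE = 100  # hypothetical scale factor (keeps two decimal places)


def scale_raster(raster, scale=SCALE):
    """Scale decimal raster values to integers by multiplying and rounding."""
    return [[int(round(v * scale)) for v in row] for row in raster]


def group_pixels(int_raster):
    """Group 4-connected pixels that share the same integer value."""
    rows, cols = len(int_raster), len(int_raster[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = int_raster[r][c]
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc), (pr, pc - 1), (pr, pc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and int_raster[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            groups.append((value, pixels))
    return groups


def to_feature_collection(groups, key, scale=SCALE):
    """Emit one GeoJSON-like feature per group, restoring the original value."""
    features = [
        {
            "type": "Feature",
            # Assumption: pixel (column, row) positions stand in for real coordinates
            "geometry": {"type": "MultiPoint", "coordinates": [[c, r] for r, c in pixels]},
            "properties": {key: value / scale},
        }
        for value, pixels in groups
    ]
    return {"type": "FeatureCollection", "features": features}


raster = [[1.25, 1.25, 2.0], [1.25, 3.5, 2.0]]
collection = to_feature_collection(group_pixels(scale_raster(raster)), "precipitation")
print(len(collection["features"]))
```

In a real pipeline the pixel groups would be converted to polygons in the common projected coordinate system; the sketch only shows how adjacent equal-valued pixels collapse into one feature object whose property carries the restored raster value.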

• Knowledge production

After data preprocessing, a GeoJSON file is generated, followed by the process of knowledge production. First, geocoding is calculated based on the spatial information of each object in the GeoJSON file. Next, SPO triples are generated to describe geographic entity attributes based on the GeoJSON file. The objects in the GeoJSON file correspond to the subjects in the SPO triples, the keys of the object properties correspond to the predicates in the SPO triples, and the values of the object properties correspond to the objects in the SPO triples. In this process, geocoding is also generated as an attribute of geographic entities in the form of SPO triples.
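The mapping from GeoJSON objects to SPO triples described above can be sketched as follows. The subject identifier, the predicate name `hasGeocode`, and the sample property values are hypothetical; the sketch assumes the geocode has already been computed for the feature.

```python
def feature_to_triples(feature, subject_id, geocode):
    """Turn one GeoJSON feature into (subject, predicate, object) triples.

    Property keys become predicates and property values become objects.
    The geocode is attached as one more attribute of the same subject.
    """
    triples = [(subject_id, key, value) for key, value in feature["properties"].items()]
    triples.append((subject_id, "hasGeocode", geocode))  # hypothetical predicate name
    return triples


# Hypothetical landslide point with two environmental attributes
feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [105.7, 35.97]},
    "properties": {"elevation": 1920, "lithology": "loess"},
}

for triple in feature_to_triples(feature, "landslide_001", "Tile_10_812_402"):
    print(triple)
```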

Next, we import the generated SPO triples into the knowledge graph. If the SPO triples are imported for the first time, the ontology needs to be created according to the ontology structure of the schema layer we designed. The spatial ontology and the temporal ontology can be used directly as basic ontologies. For the landslide prediction ontology, we use Protégé [31] to define the classes, properties, and rules of the ontology. Protégé is a tool that helps users quickly create and edit ontologies. The landslide prediction ontology edited with Protégé can be directly imported into the knowledge graph. In this paper, Virtuoso [32] is used to store ontologies and SPO triples. After importing the ontology and SPO triples, we map the SPO triples of the data layer to the ontology of the schema layer, which generates semantic associations between data features and thereby produces knowledge.

• Knowledge usage

During the process of knowledge usage, additional structured data related to the dataset and the machine learning model are generated, which also need to be extracted and stored in the knowledge graph. We extract instances of features from the dataset and the model, and write them into a GeoJSON file. The description objects of the dataset and the model are areas, and the characteristics of the dataset and the model in different areas are different. In the GeoJSON file, an area is defined as a feature object. The geometry of the feature describes the location of this area, and the properties of the feature represent the instance of the dataset feature and the model feature. After generating the GeoJSON file, we follow the steps of knowledge production to generate and import SPO triples. We map the SPO triples to schema layer ontologies to produce knowledge related to domain models and datasets.

#### 2.3.2. Semantic Reasoning

Semantic reasoning is based on the production representation and recommends models for areas with sparse samples while following the main rule in the schema. It consists of two phases, similarity analysis and candidate model selection, each with several production rules. The general reasoning program automatically performs semantic reasoning as shown in Figure 6. A rule is triggered by an event object, and the corresponding action function is executed to generate a result based on the defined action object. The generated result then triggers the execution of other rules in the rule set until the phase is complete. Figure 7 shows the template defining this process.

**Figure 6.** General reasoning program for semantic reasoning in each phase.

**Figure 7.** Template for completing phase reasoning from multiple events, and three types of rules, including one event triggering one action, event combination triggering one action, and one event triggering action combination.
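The event-action mechanism can be sketched as a small forward-chaining loop. This is a minimal illustration of production rules in general, not the reasoning program used in the paper; the class `Rule`, the function `run_phase`, and all event names and values are hypothetical.

```python
class Rule:
    """A production rule: when all events are present, run the action."""

    def __init__(self, name, events, action):
        self.name = name
        self.events = frozenset(events)  # events that must all have occurred
        self.action = action             # callable(facts) -> dict of new results


def run_phase(rules, initial_events):
    """Fire rules until no pending rule can be triggered any more."""
    facts = dict(initial_events)
    pending, fired = list(rules), []
    progress = True
    while progress:
        progress = False
        for rule in list(pending):
            if rule.events.issubset(facts):
                facts.update(rule.action(facts))  # results act as new events
                fired.append(rule.name)
                pending.remove(rule)
                progress = True
    return facts, fired


def summarize(facts):   # one event triggers one action
    return {"study_stats": {"lithology": "loess"}}


def compare(facts):     # an event combination triggers one action
    return {"most_similar_area": "Area 1"}


def select(facts):      # one event triggers an action combination
    area = facts["most_similar_area"]
    return {"selected_model": f"model of {area}", "log": f"selected from {area}"}


rules = [
    Rule("select_model", ["most_similar_area"], select),
    Rule("compare_areas", ["study_stats", "candidate_stats"], compare),
    Rule("summarize_study", ["study_area_defined"], summarize),
]
facts, fired = run_phase(rules, {"study_area_defined": True, "candidate_stats": {}})
print(fired)
print(facts["selected_model"])
```

The rules are deliberately listed out of order: each one fires only after the events it depends on have been produced, which is the chaining behaviour the template describes.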

The Jaccard index is used to evaluate the environmental similarity between areas. The equation is as follows:

$$J(A,B) = \frac{|A \cap B|}{|A| + |B| - |A \cap B|} \tag{3}$$

where *A* and *B* denote the environmental feature collection of area A and area B, and the larger the Jaccard index, the more similar the environment of the two areas.

The Jaccard index essentially compares the number of environmental features that are similar between areas. For discrete environmental features, the mode of the feature values within the area is taken as the environmental feature value representing that area. If the environmental feature values representing two areas are equal, then this environmental feature is considered similar in the two areas. For continuous environmental features, the average of the feature values within the area is taken as the environmental feature value representing that area. For area A and area B, the similarity of the environmental features in the two areas is determined according to the following equation:

$$isSimilar = \begin{cases} \text{Yes} & |F_A - F_B| \le \frac{F_{\max} - F_{\min}}{N_A + N_B} \\ \text{No} & |F_A - F_B| > \frac{F_{\max} - F_{\min}}{N_A + N_B} \end{cases} \tag{4}$$

where *F<sub>A</sub>* and *F<sub>B</sub>* denote the environmental feature values representing area A and area B, *F<sub>max</sub>* and *F<sub>min</sub>* denote the maximum and minimum values that can be obtained for the environmental feature, *N<sub>A</sub>* denotes the number of values of this feature in area A, and *N<sub>B</sub>* denotes the number of values of this feature in area B.
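A minimal sketch of Equations (3) and (4) is given below. It assumes that |A| and |B| are the numbers of environmental features described for each area and that |A ∩ B| is the number of features judged similar; this reading, the function names, and the sample values are assumptions made for illustration.

```python
from statistics import mean, mode


def representative(values, discrete):
    """Mode for discrete features, mean for continuous features."""
    return mode(values) if discrete else mean(values)


def is_similar(values_a, values_b, discrete, f_min=None, f_max=None):
    """Decide whether one environmental feature is similar in two areas."""
    f_a = representative(values_a, discrete)
    f_b = representative(values_b, discrete)
    if discrete:
        return f_a == f_b
    # Equation (4): threshold shrinks as the number of values grows
    threshold = (f_max - f_min) / (len(values_a) + len(values_b))
    return abs(f_a - f_b) <= threshold


def jaccard(area_a, area_b, spec):
    """Equation (3), counting similar features as the intersection.

    area_a, area_b: dict mapping feature name -> list of values
    spec: dict mapping feature name -> (discrete, f_min, f_max)
    """
    shared = set(area_a) & set(area_b)
    similar = sum(1 for f in shared if is_similar(area_a[f], area_b[f], *spec[f]))
    return similar / (len(area_a) + len(area_b) - similar)


# Hypothetical feature values for two areas
area_a = {"lithology": [3, 3, 5], "elevation": [1000.0, 1100.0]}
area_b = {"lithology": [3, 4, 3], "elevation": [1500.0, 1700.0]}
spec = {"lithology": (True, None, None), "elevation": (False, 0.0, 2000.0)}

print(jaccard(area_a, area_b, spec))
```

In this example the lithology modes agree, but the mean elevations differ by 550 against a threshold of 2000 / 4 = 500, so only one of the two features counts as similar.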

The statistical parameters calculated by the similarity analysis are stored as properties in the triples generated from the area-based dataset. When predicting landslides in study areas with sparse samples, the statistical parameters of the study area are first calculated. Then, a similarity analysis is performed based on the statistical parameters. Eventually, the area with the most similar environmental features to the study area is selected from the knowledge graph, and the model trained from the dataset of that area is obtained through a semantic query, i.e., the process of candidate model selection.
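The selection step can be sketched with plain dictionaries standing in for the semantic query against the knowledge graph. The minimum sample threshold and the model identifiers are hypothetical; the similarity values 0.6 and 0.3 and the 82 landslide events are those reported for Xiji County in Section 3.

```python
def select_candidate_model(similarities, candidate_models):
    """Return the most similar area and the model trained on its dataset."""
    best_area = max(similarities, key=similarities.get)
    return best_area, candidate_models[best_area]


def plan_prediction(n_samples, min_samples, similarities, candidate_models):
    """Choose between the general workflow and the knowledge graph workflow."""
    if n_samples >= min_samples:
        # Enough historical landslides: train a model on the area's own dataset
        return {"workflow": "general", "model": None}
    area, model = select_candidate_model(similarities, candidate_models)
    return {"workflow": "knowledge_graph", "source_area": area, "model": model}


similarities = {"Area 1": 0.6, "Area 2": 0.3}                 # Jaccard indices with Xiji
candidate_models = {"Area 1": "rf_area_1", "Area 2": "rf_area_2"}  # hypothetical model ids

# 500 is a hypothetical threshold; the paper does not state one
print(plan_prediction(82, 500, similarities, candidate_models))
```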

#### **3. Experiment and Result**

#### *3.1. Study Area*

We obtained historical landslide data for China from the Global Landslide Catalog [33]. China is one of the countries in the world with the highest frequency of landslide hazards, posing threats to both the ecological environment and the safety of people and their property. Furthermore, China is situated at the intersection of continental plates, and its mountainous areas account for nearly 70% of the land area, with a highly undulating terrain that provides natural conditions for landslides to occur.

To demonstrate the effectiveness of our method, we applied the DBSCAN algorithm [34] to cluster landslide points based on their spatial locations. Landslide points belonging to the same category are indicated in the same area, resulting in four simulated study areas, denoted as area 1, area 2, area 3, and area 4. Among them, area 3 and area 4 have the smallest sample sizes and can be simulated as sample scarcity cases. Then, environmental data were collected as causative factors for the training of machine learning models. Table 1 shows the sources and details of the experimental data, and Figure 8 depicts the process of obtaining samples from the experimental data for the four areas.
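For readers unfamiliar with DBSCAN, the clustering step can be illustrated with a compact pure-Python version of the algorithm. The neighborhood radius `eps`, the density threshold `min_pts`, the use of Euclidean distance on raw coordinates, and the sample points are assumptions; the paper does not report the settings it used.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # All points within eps of point i (including i itself)
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later become a border point of a cluster
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Hypothetical landslide coordinates forming two groups and one outlier
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

Each resulting label corresponds to one simulated study area, and points labeled -1 are left unassigned.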

Additionally, although the selected simulated areas can demonstrate the advantages of our method in terms of effectiveness, it is difficult to show the actual prediction results because the simulated areas do not have clear boundaries. Therefore, we validated the practical effectiveness of our method using landslide data from Xiji County, located in the southern part of Ningxia Province, China. Xiji County has an area of approximately 1581.5 square kilometers, ranging from 35°35′ to 36°14′ north latitude and 105°20′ to 106°04′ east longitude. We obtained 82 landslide events, which mainly occurred in areas with broken topography and narrow ridges. We used the environmental data in Table 1 to create samples, but due to the scarcity of samples, it is difficult for conventional machine learning methods to make accurate predictions. The distribution of landslide points and the boundary of Xiji County are shown in Figure 9.

| Type | Source | Spatial Resolution | Temporal Resolution | Acquisition Method or Sensor Used |
|---|---|---|---|---|
| Landslide | NASA Global Landslide Catalog [33] | Nationwide vector data | Acquired 1915-2021 | Crowdsourcing |
| Terrain | Shuttle Radar Topography Mission DEM [35] | 30 m × 30 m | Acquired 11-22 February 2000 | STS Endeavour OV-105, SIR-C/X-SAR |
| Precipitation | Annual spatial interpolation dataset of Chinese meteorological elements [36] | 1 km × 1 km | Update annual | Multi-element weather station |
| Lithology | Global Lithological Map [37] | 0.5° × 0.5°; Rasterized at 250 m resolution | Released 2014 | Assembled from existing regional geological maps |
| Landform | Global Landform classification from ESDAC [38] | 500 m × 500 m | Released 2008 | Applied two algorithms [39,40] on global DEM datasets |
| Land Cover | Landsat-derived annual land cover product of China [41] | 30 m × 30 m | Update annual | Landsat |
| Road | OpenStreetMap [42] | Nationwide vector data | Update daily | Crowdsourcing |
| Normalized Difference Vegetation Index (NDVI) | China Annual NDVI Spatial Distribution Dataset [36] | 1 km × 1 km | Update annual | SPOT/VEGETATION |

**Table 1.** Sources and details of experimental data.


**Figure 8.** The process of obtaining samples from the experimental data for the four areas.

**Figure 9.** Information of the Study Area: Xiji County.

#### *3.2. Machine Learning Model*

The performance of four methods, SVM [43], RF [44], KNN [45], and GCF [46], was compared based on landslide prediction research. To assess the landslide prediction, the landslide and non-landslide samples are both randomly divided into two parts: samples for model training and samples for performance testing.


#### *3.3. Metrics*

The process of landslide prediction based on machine learning is a binary classification process for landslide and non-landslide points. Several measures, including precision, recall and the F1 index, are employed to evaluate the overall landslide prediction accuracy for model comparisons. The equations of precision, recall, and F1 index are shown below:

$$Precision = \frac{TP}{TP + FP} \tag{5}$$

$$Recall = \frac{TP}{TP + FN} \tag{6}$$

$$F_1 = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall} \tag{7}$$

where *TP* denotes the number of true positives predicted as being in the positive category; *TN* denotes the number of true negatives predicted as negatives; *FP* denotes the number of true negatives predicted as positives; *FN* denotes the number of true positives predicted as negatives.

Additionally, the Receiver Operating Characteristic (ROC) curve and the Area Under the Curve (AUC) are used to evaluate the results. The horizontal and vertical axes of the ROC curve represent the false positive (FP) rate and true positive (TP) rate, respectively. The AUC is the area under the ROC curve. When the AUC exceeds 0.5, the model is considered to have positive discriminative ability. A higher AUC value, closer to 1, indicates a better predictive performance.
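The metrics can be computed directly from binary labels, as in the sketch below. It uses the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN), and computes the AUC as the fraction of positive-negative pairs that the scores rank correctly, which equals the area under the ROC curve. The labels and scores are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = landslide, 0 = non-landslide)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def precision_recall_f1(y_true, y_pred):
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5]

print(precision_recall_f1(y_true, y_pred))
print(auc(y_true, scores))
```

Here TP = 3, FP = 2 and FN = 1, so precision is 0.6 and recall is 0.75, which shows that the two measures diverge whenever false positives and false negatives differ in number.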

#### *3.4. Experimental Results*

#### 3.4.1. Effectiveness of the Method

Table 2 presents a summary of the results obtained from predicting landslides in four areas using different candidate models. Initially, existing samples of each area were used to predict landslides, and candidate models numbered 1, 2, 3, and 4 were obtained from the training samples of areas 1, 2, 3, and 4, respectively. It was observed that models in areas with sparse samples were generally difficult to fit or had a poor performance after dividing the training and test sets. To address this issue, the environmental similarity (Jaccard index) between sample-sufficient areas and sample-scarce areas was calculated based on the spatio-temporal knowledge graph's reasoning rules, and candidate models were selected for landslide prediction. When predicting landslides for area pairs with similar environmental features, for instance using model 2 to predict the landslides of area 3, the issue of sample scarcity in the prediction process for area 3 was reduced. It was also noted that a larger Jaccard index indicated that the environmental features of the two areas were more similar, and the model performed better. In terms of selecting model types, SVM and GCF showed better prediction performance for area 3, while KNN and GCF performed better for area 4. Table 3 presents the optimal performance achieved by predicting sample-sparse areas using the general workflow and the workflow incorporating the knowledge graph, while Figure 10 displays the corresponding ROC curves. It is evident that incorporating the knowledge graph into the workflow enhances the predictive capability for sample-scarce areas. However, it should be noted that the number of candidate models and the limitations of model training knowledge still leave room for improvements in the AUC.


**Table 2.** Results of using candidate models to predict landslide in different areas, including predictions for regular areas and predictions for scarce areas with similar environments.

**Table 3.** Predicted performance of sample scarcity areas.


**Figure 10.** Comparison of Receiver Operating Characteristic Curves between General Workflow and Workflow Using Knowledge Graph in Area 3 and Area 4.

Moreover, we implemented two workflows for landslide prediction using machine learning. In areas with limited samples, the general workflow involves more manual steps. On the other hand, the workflow based on the knowledge graph offers the advantages of automation and faster computation. Table 4 provides a comparison of the two workflows.


**Table 4.** Effectiveness comparison of general workflow and workflow with additional knowledge graph steps.

#### 3.4.2. Validation in Xiji

To further demonstrate the effectiveness of this method in real-world scenarios, we applied the knowledge-graph-based workflow to produce a landslide susceptibility map of Xiji County. Our approach has shown promising results in preliminary studies and we sought to validate it in a practical setting. We first collected environmental data for Xiji County from the sources listed in Table 1. Next, we extracted knowledge from the data and stored it in the knowledge graph, following the data processing steps outlined in Figure 5. The knowledge graph performed semantic reasoning to predict landslides, using similarity analysis and candidate model selection as detailed in Section 2.3.2. Based on Equations (3) and (4), the Jaccard index of Area 1 and Xiji is 0.6, and the Jaccard index of Area 2 and Xiji is 0.3. Therefore, the knowledge graph selected the model produced by Area 1 from the candidate models to generate the landslide susceptibility map. Among the candidate models, RF produced the best results for predicting landslide susceptibility in Xiji, with 100 trees in the forest, a minimum of 2 samples required to split an internal node, and a ratio of positive to negative samples of 1.7.

In addition, we followed the general machine learning method shown in Figure 1 to generate the susceptibility map and compared it with our method. The results are presented in Figure 11. Compared to the real landslide sites, the general machine learning method was unable to accurately evaluate the spatial distribution of susceptibility in Xiji County due to the lack of training data. On the other hand, the method using the knowledge graph workflow mitigated the effect of sample scarcity on the results.

**Figure 11.** Comparison of Landslide Susceptibility Maps Produced by General Workflow and Workflow Using Knowledge Graph for Xiji Landslide.

#### **4. Discussion**

In our experiments, we conducted both an effectiveness validation and a validation based on real scenarios. For the effectiveness validation, we divided the landslide dataset into four areas, including two sample-sufficient areas and two sample-scarce areas. We then used our proposed knowledge-graph-based method and general machine learning methods to predict the sample-scarce areas. Our method demonstrated several advantages over general machine learning methods, including better precision due to the use of similarity reasoning rules and environmental features stored in the spatio-temporal knowledge graph. The similarity analysis method we designed quantifies the similarity of geographical features, which improves prediction accuracy, as shown in our experiments. Additionally, the knowledge graph accelerated the prediction process by using automatic semantic reasoning rules and the storage advantages of the graph structure, providing a speed advantage over other methods.

Furthermore, for validation based on real scenarios in Xiji County, we further compared the effectiveness of our workflow and a general machine learning workflow to draw susceptibility maps. Our study demonstrated that the proposed workflow can mitigate the problem of poor prediction in sample-scarce areas. Among the candidate models, Random Forest performed the best, likely due to its ability to handle high-dimensional variables without variable deletion and reduce overfitting through the use of multiple trees, substitution methods, and random subset selection to split nodes.

However, it is important to acknowledge that our proposed method has limitations. Firstly, it is sensitive to prediction size, and larger study areas may require longer processing times and more storage space. Secondly, while our method shows promising results, the precision still needs improvement in real scenarios, which may be achieved by using higher-resolution environmental remote sensing data and more comprehensive landslide point records. Lastly, in future experiments, specific model training techniques could be incorporated as knowledge in the knowledge graph to standardize the comparison criteria, and the design of inference rules for model training should be carefully considered.

#### **5. Conclusions**

Data-driven methods usually require a sufficient number of samples to train the models. In areas where samples are limited, some studies have employed prediction methods based on transfer learning or GANs. However, these methods face challenges in organizing multi-source remote sensing data or are difficult to train, making them unsuitable for disaster scenarios that require real-time prediction. Moreover, the lack of a systematic prediction process and the low level of automation result in low prediction efficiency for landslides. In this paper, we propose a novel approach to improve the performance of remote sensing data-driven landslide prediction, which makes the following main contributions:


In future research, we will strive to improve the generalization ability of the spatio-temporal knowledge graph. On the one hand, we should define the inference rules for machine learning training strategies in the spatio-temporal knowledge graph to improve the prediction accuracy of candidate models. We will also attempt to integrate other data-driven methods, such as representation learning. On the other hand, we will incorporate more disaster knowledge, such as exposure factors and other geological disaster concepts, into the model to assess the comprehensive risk of geological disasters. Additionally, we will pay more attention to the interpretability of landslide prediction methods. By leveraging the structural advantages of the knowledge graph, modeling landslide disaster environments based on multi-source remote sensing data helps to explain the inherent features between causative factors and contributes positively to the prediction results of landslides.

**Author Contributions:** Conceptualization, L.C., L.P. and L.Y.; methodology, L.C., L.P. and W.L.; validation, L.C. and W.L.; resources, L.C.; data curation, L.C. and X.G.; writing—original draft preparation, L.C. and X.G.; writing—review and editing, L.C., X.G. and L.Y.; funding acquisition, L.P. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by Ningxia Key R&D Program (2020BFG02013). This work was sponsored by Tianjin intelligent manufacturing special fund project (NO. 20201198).

**Data Availability Statement:** Data sharing not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **A Method for Predicting Landslides Based on Micro-Deformation Monitoring Radar Data**

**Weixian Tan 1,2, Yadong Wang 1,2, Pingping Huang 1,2,\*, Yaolong Qi 1,2, Wei Xu 1,2, Chunming Li 1,2 and Yuejuan Chen 1,2**

**\*** Correspondence: hwangpp@imut.edu.cn; Tel.: +86-0471-360-1821

**Abstract:** Mine slope landslides seriously threaten the safety of people's lives and property in mining areas. Landslide prediction is an effective way to reduce losses due to such disasters. In recent years, micro-deformation monitoring radar has been widely used in mine slope landslide monitoring. However, traditional landslide prediction methods are not able to make full use of the diversified monitoring data from these radars. This paper proposes a landslide time prediction method based on the time series monitoring data of micro-deformation monitoring radar. Specifically, the parameter degree of deformation (*DOD*) is calculated from the deformation displacement, coherence, and deformation volume and is combined with the tangent angle method. Finally, the effectiveness of the method is verified by using measured data of a landslide in a mining area. The experimental results show that our proposed method can identify the characteristics of a slope that is about to slide and predict the landslide in advance, providing monitoring personnel with more reliable landslide prediction results.

**Keywords:** landslide prediction; deformation monitoring; micro-deformation monitoring radar; coherence

**1. Introduction**

As one of the most harmful natural disasters in the world, landslides can cause drastic loss of life and property every year [1–4]. In recent years, open-pit mining engineering activities have been expanding on a large scale along with the rapid development of mining technology and the continuous growth in global raw material demand, resulting in a sharp increase in the number of high and steep slopes in mining areas [5–8]. According to incomplete statistics, there are currently 10,100 non-coal metal mines and 73,548 nonmetallic mines in China, and from 2001 to 2007, there were 1951 landslide accidents in metal and non-metal open-pit mines, with 3065 casualties [9–12]. The potential safety hazards on the rock slopes of open-pit mines represent the core problem of mine safety production. Accordingly, in 2018, China put forward the idea of "establishing an efficient and scientific natural disaster defense system and improving the ability of the whole society to prevent and control natural disasters" to achieve early identification, monitoring, and early warning of major geological disasters. In the future, organic collaborative means of air–space–ground monitoring will be used to ensure the safe production of mines [13–15].

The slopes of open-pit mining areas are so steep and abrupt that they can be rather dangerous to even observe. Large-scale and long-term real-time monitoring cannot be achieved using traditional slope measurement technologies such as GPS and total station, and it is also difficult to reflect the overall trend of slope deformation [16–21]. With the advancement of science and technology, micro-deformation monitoring radar has been widely used in open-pit mine slope landslide monitoring. It is capable of long-distance and all-weather real-time monitoring, using the phase change of electromagnetic waves to obtain the overall deformation information of the target area. As a result, it has attracted extensive attention worldwide [22–29]. In addition, coherence of the target deformation area can also be obtained using micro-deformation monitoring radar. A significant difference in the coherence of the target at two different times means that the target has a large deformation. The coherence will be reduced by rainfall, snowfall, and other factors, so good coherence is an important prerequisite for obtaining the deformation value [29–32]. In general, coherence gives an important and valuable reference and implication for deformation.

**Citation:** Tan, W.; Wang, Y.; Huang, P.; Qi, Y.; Xu, W.; Li, C.; Chen, Y. A Method for Predicting Landslides Based on Micro-Deformation Monitoring Radar Data. *Remote Sens.* **2023**, *15*, 826. https://doi.org/10.3390/rs15030826

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 4 January 2023 Revised: 29 January 2023 Accepted: 30 January 2023 Published: 1 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Since the 1960s, many scholars have carried out a series of studies on landslide prediction, wherein the Saito model, Fukuzono model, and Voight model are typical representatives of creep theory. They are suitable for short-term landslide prediction and temporary slip prediction [33–37] but are limited to gravity-type landslides only. Due to the complex failure mechanism of slopes, nonlinear models such as neural network models and collaborative prediction models always result in differences from the actual slope system [38–40]. In addition, Xu Qiang used the improved tangential angle method as an important indicator in early warning for landslides and achieved good results [41–43]. The existing methods can barely make the best of diverse data obtained by micro-deformation monitoring radar, despite its wide application. So, to take full advantage of these data, there is an urgent need to find a landslide prediction method suitable for micro-deformation monitoring radar.

In addition, the setting of the slope threshold is one of the key steps in landslide prediction. Once the corresponding parameters of a slope exceed the threshold, it is considered to be at risk of collapse. The early warning value is adjusted and continuously optimized in the process of data accumulation and application through the initially set empirical value so as to slowly approach the best early warning threshold applicable to different types of slopes [44]. However, the setting of the early warning value is not static, and there is no absolute universal early warning value. The traditional strategy usually uses a single threshold setting for early warning, which can be susceptible to the environment and other factors, resulting in false early warning.

In summary, based on the time series monitoring data of mine slopes obtained by micro-deformation monitoring radar, the degree of deformation (*DOD*) was calculated in this study by applying parameters such as deformation displacement, coherence, and deformation volume. Then, a comprehensive landslide early warning method combining the multi-threshold criterion and tangent angle was proposed. The effectiveness of the method in realizing early warning of slope landslides is verified through measured data.

#### **2. Study Area**

The experimental site, located in the northwest of China, is an open-pit mine on the slope of the low and middle part of the Altai Mountains. The altitude of the open-air stope is 1000~1300 m. The site is 1.2 km long in the northwest–southeast direction and 0.7 km long in the northeast–southwest direction; the terrain is high in the north and low in the east and south, with a relative height difference of 50~300 m; the bedrock slope was formed after the exploitation of the open-pit mine with the destruction of original terrain and landform. Accordingly, multiple platforms with noticeable elevation differences occur, leading to undulating terrain. Figure 1 shows a panorama of the open-pit mine from the radar perspective.

Geologically, the lithology of the study area can be divided into five categories: gneiss, schist, granulite, marble, and amphibolite. The rock mass inside the slope is relatively intact, and its integrity gradually increases from the surface to the deep part. The exception is the inner part of the slope, where a structural fracture zone has fragmented the rock mass; the remaining rock masses are relatively intact and hard.

**Figure 1.** Open-pit mine slope from the radar perspective.

In this experiment, linear scanning micro-deformation monitoring radar was used for monitoring; the radar operates in the Ku band, with an azimuth resolution of 0.3° and a range resolution of 0.2 m (Figure 2). The monitoring period was from 0:00 on 18 April 2021 to 10:00 on 21 April 2021, and a total of 225 radar images were acquired. A radar image obtained by the micro-deformation monitoring radar is shown in Figure 3. Since the monitoring scenes are mostly rock structures with strong scattering, the imaging results are fairly clear.

**Figure 2.** (**a**) micro-deformation monitoring radar protection device; (**b**) micro-deformation monitoring radar.

**Figure 3.** Radar image of monitored scene.

#### **3. Method**

In the 1960s, after a large number of landslide experiments, Saito proposed a landslide prediction model based on creep theory (Figure 4). The model divides the landslide process into three stages, and slope instability usually occurs in the third stage, namely the accelerated deformation stage. In 2009, Xu Qiang proposed a tangent angle early warning criterion for landslides based on the three-stage evolution characteristics of the Saito model curve, applying a coordinate transformation so that the vertical and horizontal coordinates of the displacement–time curve have consistent dimensions [41,45]. The improved tangent angle is expressed as follows:

$$\alpha = \arctan(V/V_a) \tag{1}$$

where *α* is the improved tangent angle, *V* is the actual deformation rate of the landslide, and *V<sub>a</sub>* is the deformation rate in the uniform speed deformation stage of the landslide. A tangent angle greater than 45° marks the start of the accelerated deformation stage, and the closer the tangent angle is to 90°, the closer the landslide is to occurring (Figure 5).
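As a minimal sketch of Equation (1), the function below converts a deformation rate into the improved tangent angle in degrees. The function names and the sample rates are illustrative assumptions and are not taken from the monitoring data.

```python
import math


def tangent_angle_deg(v, v_a):
    """Improved tangent angle of Equation (1), returned in degrees.

    v   : current deformation rate (same unit as v_a, e.g. mm/h)
    v_a : deformation rate of the uniform speed deformation stage
    """
    if v_a <= 0:
        raise ValueError("v_a must be positive")
    return math.degrees(math.atan(v / v_a))


def in_accelerated_stage(v, v_a):
    """True when the tangent angle exceeds 45 degrees, i.e. when v > v_a."""
    return tangent_angle_deg(v, v_a) > 45.0


# Illustrative rates only (not measured values).
for rate in (0.5, 1.0, 5.0, 50.0):
    print(rate, round(tangent_angle_deg(rate, v_a=1.0), 2), in_accelerated_stage(rate, 1.0))
```

Because the rate is normalized by the uniform-stage rate, the angle is dimensionless with respect to the units chosen for displacement and time, which is the purpose of the coordinate transformation.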

**Figure 4.** Saito curve model.

**Figure 5.** Tangent angle model.

#### *3.1. Forecast Process*

The flow chart of the slope landslide prediction method proposed in this paper is shown in Figure 6. First, all pixels of the radar data are traversed, and deformed pixels are selected according to the cumulative deformation displacement threshold and the deformation velocity threshold. The selected pixels are then connected by a connectivity algorithm to obtain multiple connected areas. Once the deformation area is determined, the *DOD* (degree of deformation) of the target area is obtained by calculating the area, volume, and average coherence of the deformation area. Finally, the new tangent angle warning criterion is used for landslide prediction.

**Figure 6.** Flowchart of the landslide prediction method.
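The first two steps of the workflow, threshold filtering and connection of the selected pixels, can be sketched as follows. The text does not specify the connectivity algorithm, so this sketch assumes 4-connectivity with a breadth-first flood fill, assumes that a pixel must exceed both thresholds, and adds a hypothetical `min_pixels` parameter to discard isolated pixels. All names and sample values are illustrative.

```python
from collections import deque


def deformation_regions(displacement, velocity, t_d, t_v, min_pixels=1):
    """Group pixels exceeding both thresholds into 4-connected regions.

    displacement, velocity : 2-D lists (rows x cols) of equal shape
    t_d, t_v               : cumulative displacement and velocity thresholds
    min_pixels             : hypothetical minimum region size (assumption)
    Returns a list of regions, each a list of (row, col) tuples.
    """
    rows, cols = len(displacement), len(displacement[0])
    flagged = [[displacement[r][c] >= t_d and velocity[r][c] >= t_v
                for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not flagged[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from an unvisited flagged pixel.
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and flagged[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_pixels:
                regions.append(region)
    return regions


# Illustrative 3 x 3 scene: three connected deformed pixels and one isolated pixel.
disp = [[5, 5, 0], [0, 5, 0], [0, 0, 5]]
vel = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
regions = deformation_regions(disp, vel, t_d=1, t_v=0.5, min_pixels=2)
print(regions)
```

Requiring a minimum region size is one simple way to suppress single-pixel false alarms; other connectivity rules (for example 8-connectivity) would be equally plausible readings of the workflow.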

#### *3.2. DOD*

The azimuth resolution of the micro-deformation monitoring radar is expressed as an angle, so the actual terrain corresponding to each pixel of the radar data varies in size: the farther the pixel is from the radar, the larger its area. The pixel area can be calculated from the azimuth resolution, range resolution, and range sampling position of the micro-deformation monitoring radar [30]:

$$S_{\text{single}} = \frac{\alpha}{2} \left( [n\beta + R]^2 - [(n-1)\beta + R]^2 \right) = \alpha n\beta^2 + \alpha\beta R - \frac{\alpha}{2}\beta^2 \tag{2}$$

where *S<sub>single</sub>* is the area of a single pixel, *α* is the azimuth resolution, *β* is the range resolution, *n* is the sampling position of the corresponding pixel in the range direction, and *R* is the near range of the radar monitoring.
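Equation (2) is the area of an annular sector, which can be checked numerically with a short sketch. The sector formula requires the azimuth resolution in radians, so the conversion from degrees is an assumption made here; the near range of 100 m is a hypothetical value, while 0.3° and 0.2 m are the resolutions given in Section 2.

```python
import math


def pixel_area(n, alpha_deg, beta, r_near):
    """Area (m^2) of the n-th range pixel, n >= 1, following Equation (2).

    alpha_deg : azimuth resolution in degrees (converted to radians here)
    beta      : range resolution in metres
    r_near    : near range of the radar in metres
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    alpha = math.radians(alpha_deg)
    outer = n * beta + r_near          # outer radius of the annular sector
    inner = (n - 1) * beta + r_near    # inner radius of the annular sector
    return 0.5 * alpha * (outer ** 2 - inner ** 2)


def pixel_area_expanded(n, alpha_deg, beta, r_near):
    """Expanded right-hand side of Equation (2), used as a cross-check."""
    alpha = math.radians(alpha_deg)
    return alpha * n * beta ** 2 + alpha * beta * r_near - 0.5 * alpha * beta ** 2


# Hypothetical near range of 100 m; resolutions as in Section 2.
near = pixel_area(1, 0.3, 0.2, 100.0)
far = pixel_area(1000, 0.3, 0.2, 100.0)
print(near, far)
```

The comparison of a near and a far pixel shows the growth of pixel area with range that motivates weighting each pixel by its own area.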

During long-term monitoring, the data may fluctuate due to uncontrollable factors in the monitoring environment, such as human activities, local vibrations caused by construction, and the noise of the system itself. If region identification relies on single pixels only, pixels whose values fluctuate due to interference are mistaken for deformed pixels. A landslide, on the other hand, usually involves a deformation area of a certain extent rather than a few isolated pixels. In this paper, pixels are filtered according to the deformation velocity threshold *T<sub>v</sub>* and the cumulative displacement threshold *T<sub>d</sub>*. The identified pixels are connected through the connectivity algorithm to identify the sensitive areas that need to be focused on. Landslide warning under multiple thresholds is effective and avoids single-point false warnings. The thresholds should be set with regard to the actual situation of the monitoring site, including the influence of topography, landform, and human factors, and they need to be continuously adjusted as data are collected. The area of the deformation area is the sum of the areas of the pixels it contains, and the deformation volume is calculated as follows:

$$B_v = \sum_{n=1}^{N} \left( D_{\text{defo},n} \times S_n \right) \tag{3}$$

where *B<sub>v</sub>* is the deformation volume, *D<sub>defo,n</sub>* is the deformation value corresponding to the *n*th pixel, *S<sub>n</sub>* is the area of the *n*th pixel, and *N* is the total number of pixels in the deformation area.

The micro-deformation monitoring radar can obtain the coherence of the target, which is the similarity of the observation results of the target at two different times [46]. If the target does not change between two echoes, the target coherence is 1. If the coherence reaches 0, the target is completely incoherent. When the wavelength, angle of incidence, and time baseline of the radar are constant, there is a nonlinear inverse relationship between the deformation velocity and the coherence [32,47]. Qi Lin also demonstrated this conclusion through simulation [30]. The coherence is calculated as follows:

$$\gamma = \left| \frac{\sum_{i=1}^{n} A_i^2 A_{1,i} \exp\left(i \left[ -\frac{4\pi}{\lambda} \Delta R + \Delta \beta \right] \right)}{\sqrt{\sum_{i=1}^{n} |A_i|^2 \cdot \sum_{i=1}^{n} |A_i A_{1,i}|^2}} \right| \tag{4}$$

where *γ* is the coherence, *A* is the backscattering coefficient of the target, *λ* is the radar wavelength, Δ*R* is the deformation value of the target, and Δ*β* is the scattering phase. Therefore, the average coherence of the deformation region is as follows:

$$\gamma_{\text{average}} = \left(\sum_{n=1}^{N} \gamma_n\right) \Big/ N \tag{5}$$

where *γ<sub>average</sub>* is the average coherence of the deformation region, *γ<sub>n</sub>* is the coherence of the *n*th pixel, and *N* is the number of pixels in the deformation region.
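The region-level quantities that feed the *DOD* can be gathered in one pass over a deformation region. The sketch below follows Equation (3) for the deformation volume and reads Equation (5) as the arithmetic mean of the per-pixel coherence values; the function name, data layout, and sample numbers are assumptions made for illustration.

```python
def region_summary(region, deformation, coherence, pixel_areas):
    """Area, deformation volume (Eq. 3) and mean coherence (Eq. 5) of one region.

    region      : iterable of (row, col) pixel indices
    deformation : 2-D list of per-pixel deformation values
    coherence   : 2-D list of per-pixel coherence values in [0, 1]
    pixel_areas : 2-D list of per-pixel areas (for example from Equation (2))
    """
    region = list(region)
    if not region:
        raise ValueError("region must contain at least one pixel")
    area = sum(pixel_areas[r][c] for r, c in region)
    volume = sum(deformation[r][c] * pixel_areas[r][c] for r, c in region)
    gamma_average = sum(coherence[r][c] for r, c in region) / len(region)
    return {"area": area, "volume": volume, "gamma_average": gamma_average}


# Illustrative 2 x 2 maps (not measured data); units are left to the caller.
deformation = [[0.01, 0.02], [0.00, 0.03]]
coherence = [[0.9, 0.8], [1.0, 0.7]]
pixel_areas = [[0.5, 0.5], [0.6, 0.6]]
summary = region_summary([(0, 0), (0, 1), (1, 1)], deformation, coherence, pixel_areas)
print(summary)
```

Keeping the three quantities together makes it straightforward to recompute them for every new radar image and to track their evolution over time.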

Traditionally, the landslide is usually predicted according to a single pixel which can be easily disturbed by environmental factors during monitoring. Thus, the deformation value may be abrupt, leading to misjudgment of the situation of the slope in monitoring. Therefore, from the point of view of slope, this paper proposes the parameter *DOD* for slope landslide prediction by using the nonlinear inverse relationship between deformation velocity and coherence. To describe the degree of deformation of the slope over a certain period of time, the *DOD* is calculated as follows:

$$DOD = \frac{B_v}{\gamma_{\text{average}} r^2} \tag{6}$$

The *DOD* parameter presented in this paper is calculated from coherence and deformation volume, whereas the displacement tangent angle method uses displacement, so the two methods rely on completely different variables. The displacement tangent angle method cannot fully exploit the diverse data obtained by micro-deformation monitoring radar. The newly proposed *DOD* tangent angle method is an attempt to address this problem. In the actual monitoring process, multiple thresholds and the *DOD* tangent angle are used to recognize which stage of accelerated deformation the landslide is currently in so that the landslide can be accurately predicted before it happens.

In summary, this paper proposes a landslide prediction method combining a multi-threshold criterion and the tangent angle, which provides a feasible method for the early warning of open-pit landslides. For a deformed mine slope, as the deformation accumulates, the calculated *DOD* begins to increase, and when the *DOD* increases to a certain extent, a landslide will occur. Usually, before the landslide, the *DOD* curve will increase significantly.

#### **4. Result**

The cumulative deformation displacement map within the time window of 0:00 on 18 April 2021 to 9:00 on 21 April 2021 is shown in Figure 7, and it can be clearly seen that the deformation value of pixels in region 1 changes greatly. The pixels are filtered according to the conditional threshold proposed in this article, and the obtained results are shown in Figure 8, where it is obvious that regions 1 and 2 are in the same position in the observation scene, and most of the pixels in region 1 are identified.

**Figure 7.** Cumulative deformation diagram.

It can be seen from Figure 9 that the displacement in the monitoring area increases continuously, and the cumulative displacement curve basically conforms to the "three stages" of the Saito landslide model. From 22:54 on 18 April to about 2:00 on 19 April, the region was basically stable without obvious deformation; from 3:00 on 19 April to around 19:00 on 20 April, the area was in the constant velocity deformation stage; and from 19:00 on 20 April to around 8:50 on 21 April, the area was in the accelerated deformation stage, during which the cumulative displacement of the characteristic points increased sharply and the deformation speed also increased continuously. The maximum cumulative displacement is 57 mm, and the maximum speed is 10.1 mm/h (Figure 10).

**Figure 9.** Displacement curve.

**Figure 10.** Velocity curve.
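A velocity curve of the kind shown in Figure 10 can be derived from a cumulative displacement series by finite differences. The sketch below assumes a simple backward difference between consecutive samples; the text does not state how the velocity was actually computed, and the sample series is illustrative rather than measured.

```python
def deformation_rates(times_h, displacements_mm):
    """Backward-difference deformation rate (mm/h) between consecutive samples.

    times_h          : strictly increasing sample times in hours
    displacements_mm : cumulative displacement at each sample time in mm
    """
    if len(times_h) != len(displacements_mm):
        raise ValueError("inputs must have the same length")
    rates = []
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        rates.append((displacements_mm[i] - displacements_mm[i - 1]) / dt)
    return rates


# Illustrative series only: slow creep followed by acceleration.
times = [0.0, 1.0, 3.0, 4.0]
displacement = [0.0, 2.0, 8.0, 18.0]
rates = deformation_rates(times, displacement)
print(rates)
```

In practice the raw differences would usually be smoothed before thresholding, since the text notes that single measurements fluctuate under environmental interference.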

Just before sliding, the displacement curve flattens and the velocity curve turns downward. This is because rapid deformation significantly reduces the coherence of the target region, which makes it impossible for the system to obtain valid deformation data (Figures 9 and 10). As shown in Figure 11, the coherence curve began to decrease overall on the morning of 20 April, and at about 8 a.m. on 21 April, coherence fell below 0.7 and began to drop sharply (based on monitoring experience, the threshold for coherence is usually 0.7). Changes in coherence before landslides provide a new perspective for monitoring slopes. Therefore, as shown in Figure 12, the change in coherence is added to the calculation not only to reproduce the deformation trend of the landslide area, but also for comparison with the velocity curve. The *DOD* curve is more concise and clear, and the starting point of rapid growth in *DOD* corresponds to the accelerated deformation stage in the Saito model. *DOD* increases sharply before slippage and reaches its maximum as the landslide occurs. After sliding, the target area gradually stabilizes and coherence gradually increases, so *DOD* continues to decrease. In summary, the *DOD* curve reflects the overall trend of the slope in the accelerated deformation stage.

The most critical step in determining the displacement tangent angle is identifying the uniform speed deformation stage of the displacement–time curve. Because of interference from environmental factors and measurement errors, the deformation rate in the constant velocity deformation stage fluctuates from moment to moment, which makes it difficult to determine the rate of the uniform speed deformation stage. Therefore, this rate should be continuously adjusted in line with the actual deformation (Figure 13). The calculation of the parameter *DOD* proposed in this paper begins when the cumulative displacement and the deformation rate reach a certain threshold, at which point the slope of the *DOD* curve changes greatly within a relatively short time window. The time window of the uniform deformation stage of *DOD* is narrower, making it easier to obtain the uniform deformation rate of *DOD* than that of displacement. As can be seen from Figure 14, the *DOD* tangent angle curve is more intuitive and clear, and its maximum is reached at the moment of slip, whereas the displacement tangent angle is dominated by fluctuations in the data. Although the displacement tangent angle shows an overall trend of gradual increase, its up and down fluctuations during actual monitoring make misjudgments likely.

**Figure 11.** Coherence curve.

**Figure 12.** *DOD* curve.

Landslide prediction can be divided into medium and long-term prediction (more than 6–12 months), short-term prediction (3–6 months), and imminent sliding prediction. Generally, for a mine, prediction 2 h in advance is required. In addition, the instability of slope rock mass is unavoidable. On the premise of not affecting production, the mine will pursue the steepest slope angle with maximum benefit [11]. In summary, in order to more accurately judge the slippage trend of the slope, the velocity threshold, cumulative displacement threshold, and *DOD* tangent angle threshold of different deformation stages before the slope slippage of the open-pit mine were established based on the historical data of the previous slope landslide and the diversified data of the micro-deformation monitoring radar. In addition, a comprehensive early warning criterion, as shown in Table 1, was established. The acceleration is divided into three stages: the initial acceleration stage is a blue warning, the medium acceleration stage is an orange caution, and the critical sliding stage is a red alarm.

**Figure 13.** Displacement tangent angle.

**Figure 14.** *DOD* tangent angle.


**Table 1.** Comprehensive early warning criterion at an accelerated stage.
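A comprehensive criterion of this kind can be expressed as a small lookup over warning levels. The numerical thresholds of Table 1 are not available in this text, so every threshold in the sketch below is a placeholder and not a value from the paper; the assumption that all three conditions must hold simultaneously is also ours. Only the level names (blue, orange, red) and the three monitored quantities follow the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    name: str
    min_velocity: float      # mm/h
    min_displacement: float  # mm
    min_dod_angle: float     # degrees


# Placeholder thresholds for illustration only; NOT the values of Table 1.
LEVELS = (
    Level("red", 7.0, 45.0, 85.0),      # critical sliding stage
    Level("orange", 4.0, 30.0, 80.0),   # medium acceleration stage
    Level("blue", 2.0, 15.0, 45.0),     # initial acceleration stage
)


def warning_level(velocity, displacement, dod_angle, levels=LEVELS):
    """Return the most severe level whose three thresholds are all met, else None.

    `levels` must be ordered from most to least severe.
    """
    for level in levels:
        if (velocity >= level.min_velocity
                and displacement >= level.min_displacement
                and dod_angle >= level.min_dod_angle):
            return level.name
    return None


print(warning_level(5.0, 35.0, 82.0))
print(warning_level(1.0, 1.0, 10.0))
```

Requiring several independent indicators to agree is what distinguishes this criterion from a single-threshold alarm; site-specific thresholds would replace the placeholders in any real deployment.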

According to the landslide warning criterion proposed in this paper, at 7:05 a.m. on 21 April 2021 the fractal deformation rate was 7.5 mm/h, the cumulative deformation displacement was 47.9 mm, and the *DOD* tangent angle reached 89.13°, triggering the red warning. At about 9:37 a.m. on 21 April 2021, the slope failed, and the surveillance video of the mining area recorded the moment of the landslide (Figures 14 and 15). Under the displacement tangent angle criterion, the slope enters the critical sliding stage when the tangent angle exceeds 85°, and the corresponding moment is 04:45 a.m. on 21 April 2021 (Figures 12 and 13). In line with the management requirements for open-pit mine slopes, the *DOD* tangent angle can give a red alarm 2 h in advance so that the mining area is not caught unprepared. This leaves sufficient time to move the mining facilities out of the dangerous area, avoiding property loss and casualties to the greatest possible extent. The scene after the landslide is shown in Figure 16.
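The warning lead time implied by the reported times can be verified with a short calculation. The timestamps are those given in the text (with the failure time stated there as approximate), and the 2 h requirement is the one mentioned for mines; the function name is an illustrative choice.

```python
from datetime import datetime


def lead_time_minutes(warning_time, failure_time):
    """Minutes between a warning and the observed failure (negative if late)."""
    return (failure_time - warning_time).total_seconds() / 60


red_alarm = datetime(2021, 4, 21, 7, 5)   # DOD tangent angle red warning
failure = datetime(2021, 4, 21, 9, 37)    # approximate time of slope failure
required_minutes = 120                     # 2 h advance requirement

lead = lead_time_minutes(red_alarm, failure)
print(f"lead time: {lead:.0f} min, requirement met: {lead >= required_minutes}")
# prints "lead time: 152 min, requirement met: True"
```

The red alarm therefore preceded the failure by roughly two and a half hours, which satisfies the 2 h requirement.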

**Figure 15.** Landslide collapse process.

**Figure 16.** The scene after the landslide.

The deformation process of the landslide area is shown in Figure 17. The deformation data from radar monitoring are combined with digital surface model (DSM) data so that monitoring personnel can identify large deformation in the monitoring area by observing the displacement and deformation cloud map. Used together with the slope landslide early warning system, this further improves the accuracy of landslide monitoring and greatly improves the effect of landslide prevention.

**Figure 17.** Slope radar monitoring cumulative deformation.

#### **5. Discussion**

The traditional threshold early warning strategy usually sets a threshold on a single pixel. This is not ideal, because a single pixel is vulnerable to interference from the external environment and other factors, which can cause its monitoring data to change suddenly or to keep increasing. Therefore, when the change in the monitoring data of a single pixel is the only evaluation standard for landslide early warning, false warnings are likely. Remote sensing monitoring is conducted in point cloud mode, yet there is currently little research on landslide early warning based on the regional surface. Therefore, from the point of view of the landslide surface and landslide mass, this paper combines double thresholds of deformation velocity and cumulative displacement with the tangent angle model to form a comprehensive early warning criterion for open-pit mine slope landslides.

Setting a slope threshold is always a difficult problem, and few scholars have proposed effective thresholds. The identification of rapidly growing data points therefore relies on a threshold that is set after a period of data accumulation and analysis and that needs to be continuously optimized and adjusted as data are collected and enriched. This value varies for slopes of differing scales, formation conditions, and rock and soil types. The historical data analysis method is used in actual monitoring: the maximum deformation rate observed in the past is taken as the basis for the threshold, that is, the unacceptable maximum deformation rate that would result in a landslide. When the actual threshold is set, the nature of the rock and soil mass, the degree of artificial disturbance, the possible influence range, and other factors should be considered comprehensively, and the maximum deformation value is increased by 10–20%. When the slope deformation speed exceeds this value at a certain time, it is necessary to focus on slope stability. If there is no landslide or sign of a landslide, the threshold value is updated according to the deformation speed at this time.
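The threshold-setting procedure described above can be sketched as two small functions. This is one possible reading of the text: the initial threshold is the historical maximum rate raised by a margin of 10–20%, and the threshold is replaced by the newly observed rate when it is exceeded without any landslide. The function names, the default margin, and the sample rates are assumptions.

```python
def initial_threshold(historical_rates, margin=0.15):
    """Historical maximum deformation rate increased by a 10-20 % margin."""
    if not historical_rates:
        raise ValueError("historical_rates must not be empty")
    if not 0.10 <= margin <= 0.20:
        raise ValueError("margin is expected to lie between 0.10 and 0.20")
    return max(historical_rates) * (1.0 + margin)


def update_threshold(threshold, observed_rate, landslide_observed):
    """Raise the threshold to the observed rate if it was exceeded without a landslide."""
    if observed_rate > threshold and not landslide_observed:
        return observed_rate
    return threshold


# Illustrative rates in mm/h (not measured values).
threshold = initial_threshold([1.2, 2.0, 4.0], margin=0.10)
threshold_after = update_threshold(threshold, observed_rate=5.0, landslide_observed=False)
print(threshold, threshold_after)
```

The update only ever raises the threshold after a harmless exceedance, so the value gradually approaches a level suited to the specific slope as monitoring data accumulate.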

Based on the nonlinear inverse relationship between deformation volume and coherence, a new tangent angle early warning criterion is proposed according to monitoring experience. The target deformation will affect the scattering characteristics of the target. Similarly, weather factors such as rain and snow will also affect the radar echo. Ensuring good coherence is a prerequisite for accurate early warning. Added to that, it should be noted that *DOD* will be calculated only when the deformation speed and cumulative displacement reach a certain cumulative level. Accurate identification of the acceleration phase of the landslide requires substantial data accumulation and analysis.

#### **6. Conclusions**

The stability of high and large rock slopes in open-pit mines represents the core problem of safety production in open-pit mines, and the security of open-pit mines in China is still a serious issue. Micro-deformation monitoring radar is widely used in slope monitoring based on its characteristics of high spatial resolution, short time baseline, submillimeter monitoring accuracy, surface monitoring, and suitability for all-weather monitoring. In this paper, a new parameter, *DOD*, is proposed for measuring the degree of slope deformation based on the deformation volume and coherence value, and the tangent angle value of this parameter is combined with the dual threshold values of velocity and displacement in landslide prediction. With the introduction of deformation volume and coherence, this method can be used to provide timely and effective early warning before the sliding of a target and more accurate and reasonable prediction results for slope managers. Based on the measured data of a slope and landslide in a mining area, the method realizes timely warning of an impending landslide and allows the avoidance of property loss in the mining area.

**Author Contributions:** Conceptualization, W.T.; writing—original draft, Y.W.; methodology, W.T., Y.W. and Y.Q.; formal analysis, Y.W., P.H. and W.X.; supervision, P.H., C.L. and Y.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 61971246 and 52064039), the Joint Funds of the National Natural Science Foundation of China (Grant No. U22A2010), the Science and Technology Planned Project of Inner Mongolia (Grant Nos. 2019GG139 and 2020GG0073), the Science and Technology Major Project of Inner Mongolia (Grant No. 2019ZD022), the Science and Technology Leading Talent Team of Inner Mongolia (Grant No. 2022LJRC0002), the Fundamental Research Funds for Universities of Inner Mongolia (Grant No. JY20220077), and the Natural Science Foundation of Inner Mongolia (Grant Nos. 2019MS04004, 2020ZD18 and 2021MS06004).

**Data Availability Statement:** Not applicable.

**Acknowledgments:** Many thanks to Inner Mongolia Pattern Technology Co., Ltd., for providing the field-measured data. Thanks to anonymous reviewers for their constructive comments to improve the quality of this paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Timely and Low-Cost Remote Sensing Practices for the Assessment of Landslide Activity in the Service of Hazard Management**

**Aggeliki Kyriou, Konstantinos G. Nikolakopoulos \* and Ioannis K. Koukouvelas**

Department of Geology, University of Patras, 265 04 Patras, Greece **\*** Correspondence: knikolakop@upatras.gr

**Abstract:** Landslides are among the most dangerous and catastrophic events in the world. The increasing progress in remote sensing technology has made landslide observations timely, systematic and less costly. In this context, we collected multi-dated data obtained by Unmanned Aerial Vehicle (UAV) campaigns and Terrestrial Laser Scanning (TLS) surveys for the accurate and immediate monitoring of a landslide located in a steep, V-shaped valley, in order to provide operational information concerning the stability of the area to the local authorities. The derived data were processed appropriately, and UAV-based as well as TLS point clouds were generated. The monitoring and assessment of the evolution of the landslide were based on the identification of instability phenomena between the multi-dated UAV and TLS point clouds using direct cloud-to-cloud comparison and the estimation of the deviation between surface sections. The overall evaluation of the results revealed that the landslide has remained active for three years but is progressing particularly slowly. Moreover, point clouds arising from a UAV or a TLS sensor can be effectively utilized for landslide monitoring with comparable accuracies. Nevertheless, TLS point clouds proved to be denser and more appropriate in terms of enhancing the accuracy of the monitoring process. The outcomes were validated using measurements acquired by the Global Navigation Satellite System (GNSS).

**Keywords:** landslide mapping; landslide monitoring; UAV; terrestrial laser scanning; 3D point clouds

#### **1. Introduction**

Landslides are among the most dangerous and catastrophic natural disasters in the world. They usually occur suddenly and can be detrimental to the natural environment, the infrastructure or even human life itself. As climate change is unequivocal, model-based estimations suggest that warming temperatures will lead to increased landslide activity [1–3]. Over the years, several researchers have investigated such phenomena, and various methods focused on landslide vulnerability mapping, hazard zoning, risk assessment or rock-fall simulations have been developed [4,5]. Nevertheless, landslide research has never been more urgent and important.

The increasing progress in remote sensing technology has made landslide observations more timely, more systematic and less costly. Moreover, new possibilities have emerged for high-precision research of extensive landslides or landslides located in inaccessible areas. Novel methodologies, based on multiple remote sensing data, have already become indispensable tools for landslide assessment and risk prevention [6,7].

The growing use of Unmanned Aerial Vehicles (UAVs) has been a real milestone in Earth observation, and therefore, in landslide research [8,9]. The first approaches were based on the successful exploitation of UAVs with compact cameras for the rapid identification of landslides [10]. Subsequently, UAVs mounted with digital single-lens reflex (DSLR) cameras were used for the documentation and monitoring of large earthflows [11]. An academic research team developed a UAV, equipped with a consumer-grade optical camera, in order to generate 3D surface models for the more comprehensive characterization and monitoring of unstable areas [12].

**Citation:** Kyriou, A.; Nikolakopoulos, K.G.; Koukouvelas, I.K. Timely and Low-Cost Remote Sensing Practices for the Assessment of Landslide Activity in the Service of Hazard Management. *Remote Sens.* **2022**, *14*, 4745. https://doi.org/10.3390/rs14194745

Academic Editor: Christian Bignami

Received: 18 August 2022 Accepted: 19 September 2022 Published: 22 September 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Moreover, time series of UAV images were processed via the Structure from Motion (SfM) technique and the outputs were utilized for the quantification of the surface deformation, the measurement of the landslide volumetric change and the determination of the landslide's dynamics [13,14]. In a respective study, UAV data contributed effectively to the assessment of residual risk (post-landslide risk) on a medium- and long-term scale through the estimation of the evolution of the area [15]. UAV imagery along with digital photogrammetry has successfully assisted in the recording of slope conditions as well as in the enhancement of the understanding of landslide processes and the precise assessment of slope instabilities [16]. As time passes, new and more innovative approaches for landslide investigation are emerging based on the combined use of UAVs and machine learning algorithms for the extraction of landslide susceptibility maps and the monitoring of landslide risk areas [17].

Another common tool for many geotechnical studies and landslide investigations is Light Detection and Ranging (LiDAR) technology [18]. In particular, Airborne LiDARs have proven to be particularly effective in the detailed and accurate representation of the landslide's surface, the recognition of different types of mass movements and the monitoring of landslide dynamics as well as the classification of slow-moving landslides in densely vegetated areas [19–22]. In addition, a variety of studies have already been carried out regarding the utilization of LiDARs on rockfall mapping, rock mass characterization and rockfall susceptibility analysis [23,24]. However, landslide research through airborne LiDAR constitutes a quite expensive approach and thus continuous airborne monitoring is relatively limited.

In contrast, Terrestrial LiDAR (TLS) surveys are more affordable and provide data with a higher temporal and spatial resolution. A comprehensive overview of TLS acquisition and data processing concerning the characterization, volume estimation and monitoring of rock slopes has already been published [25]. Furthermore, a truly comprehensive study took place in Yosemite Valley, where TLS and SfM were utilized for the detection of rockfalls over a 40-year period and the updating of the inventory database with more precise measurements (number, area, volume) [26]. In fact, the terrain models resulting from remote sensing techniques (TLS, SfM) were compared with the corresponding models derived from the processing of historical oblique photographs, allowing the detection and quantification of surface changes and providing long-term monitoring.

The integrated use of a variety of remote sensing data is suggested as an alternative perspective for more comprehensive landslide research. Specifically, spaceborne satellite data (high resolution multispectral and radar images) were combined with UAV imagery and ground-based techniques, such as Ground-Based Interferometric SAR (GB-InSAR), TLS, etc., in order to identify, map and monitor landslides, which vary in their characteristics, failure mechanisms, evolution processes, spatial distribution and risk of instability [27]. Moreover, the analysis of the activity of the landslide and the estimation of its kinematic evolution can be achieved effectively either by the execution of repeated UAV campaigns along with Global Navigation Satellite Systems (GNSS) surveys [28] or by the utilization of UAV data in conjunction with (a) airborne LiDAR data [29] or (b) TLS surveys [30].

In the current study, we collected multi-dated data obtained by Unmanned Aerial Vehicle (UAV) and Terrestrial Laser Scanning (TLS) surveys for the accurate and immediate monitoring of a landslide, aiming at the provision of operational information concerning the stability of the area to the local authorities. Specifically, a landslide occurred in an environmentally sensitive area, which is located in a steep and narrow valley. Thus, the monitoring of the area constitutes a particularly challenging task, as it must be achieved in a timely and accurate manner with as little environmental impact as possible and at minimum cost. As both UAV and TLS field campaigns involve only transportation expenses to the study area, they have proved to be a reliable and cost-effective tool for the continuous monitoring of small (tens of meters) to moderate (hundreds of meters) active landslides. Addressing the aforementioned challenge is the main objective of the current work. In this framework, the derived data were processed appropriately, and UAV-based as well as TLS point clouds were generated. The monitoring and assessment of the evolution of the landslide were based on the identification of instability phenomena between the multi-dated UAV and TLS point clouds using direct cloud-to-cloud comparison and the estimation of the deviation between surface sections. Moreover, systematic measurements obtained by the Global Navigation Satellite System (GNSS) were utilized for the verification of the results. Finally, research outcomes were communicated to local authorities so that appropriate measures for the mitigation of the risk could be executed.

#### **2. Landslide Area**

Northern Peloponnese is characterized as one of the most tectonically and seismically active continental regions worldwide due to the existence of the fast-extending Corinthian rift [31–35]. The rift has been dissecting the entire region, from the coastline in the north to the ridges inland, since the Pliocene. Normal faulting, sea level changes, as well as the tectonic uplift of rift sediments, are recognized as dominant features in this process. The morphology of the wider area is clearly affected by the ongoing tectonics, as evidenced by the development of deep and narrow valleys.

Our area of interest is located on the outskirts of the village of Kato Zachlorou within the Region of Western Greece (Figure 1). The first phenomena of instability at the specific site started as rockfalls on 19 April 2019. A subsequent reactivation took place on 14 November 2019 through rockfalls and debris falls, while a more extensive event, including rockfalls and debris falls, occurred on 4 April 2020. Landslide material covered the area of the road in three different locations, contributing to the isolation of the local community from the surroundings (Figure 2).

**Figure 1.** Area of Interest.

**Figure 2.** (**a**,**b**) UAV photos of the area of interest after the landslide on 4 April 2020. (**c**) Photo of the landslide material displaced on the area of the road.

The specific site is an outstanding natural heritage area, as well as a famous tourist destination. A historic rack railway, named "Odontotos", has run through the gorge of Vouraikos since 1896 and is one of the most iconic tourist features of the wider area. In this context, the mitigation of landslide risk and the maintenance of human security are top priorities, especially in our area of interest, which is located in a particularly narrow part of the gorge and quite close to the railway lines (~30 m) (Figure 2a). Additionally, this environmentally sensitive area must be safeguarded and thus any monitoring should take place with as little environmental disturbance as possible.

#### **3. Materials and Methods**

#### *3.1. Data Acquisition*

The precise monitoring of our area of interest started after the occurrence of the extensive rockfalls and debris fall on 19 April 2019 and is still in progress. Our datasets included repetitive representations of the study area acquired by either UAV or TLS, as well as high-precision GNSS measurements (Table 1). Repeated UAV/TLS surveys have been carried out at regular intervals for the immediate provision of stability information to the local authorities. Each UAV flight is completed within 1 h, while each TLS survey lasts approximately 4 h.

A DJI Matrice 600 was utilized for the collection of the UAV imagery. The specific hexacopter is equipped with a Zenmuse X5 camera with a 15 mm F/1.7 lens and a 72-degree diagonal field of view. The camera operates with an electronic shutter, capturing images at 16 MP resolution, i.e., an image size of 4608 × 3456 pixels. The campaigns were executed once per month at an altitude of 70 m above ground level, maintaining the same flight grid (Figure 3) and the corresponding photogrammetric characteristics throughout the monitoring period. The acquired UAV photos have 90% along-track and 75% across-track overlap, and photogrammetric processing was carried out in Agisoft Metashape software.
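The flight parameters quoted above determine the ground sampling distance (GSD) and the spacing between exposures. The following minimal sketch works through that arithmetic for a nadir-looking camera. The sensor width of 17.3 mm (a nominal Micro Four Thirds value) and the orientation of the image relative to the flight direction are assumptions not stated in the text; the function names are hypothetical.

```python
def ground_sampling_distance(sensor_width_mm, focal_length_mm, altitude_m, image_width_px):
    """Ground size of one pixel in metres for a nadir-looking camera."""
    return (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)


def exposure_spacing(footprint_m, overlap):
    """Distance between neighbouring exposures (or flight lines) for a given overlap."""
    return footprint_m * (1.0 - overlap)


SENSOR_WIDTH_MM = 17.3   # assumption: nominal Micro Four Thirds sensor width
gsd = ground_sampling_distance(SENSOR_WIDTH_MM, 15.0, 70.0, 4608)

# Assumption: the long image side lies across the flight direction.
footprint_across = 4608 * gsd
footprint_along = 3456 * gsd

base_along = exposure_spacing(footprint_along, 0.90)     # 90% along-track overlap
line_spacing = exposure_spacing(footprint_across, 0.75)  # 75% across-track overlap

print(f"GSD: {gsd * 100:.2f} cm/px")
print(f"spacing between exposures: {base_along:.1f} m, between lines: {line_spacing:.1f} m")
```

Under these assumptions the GSD is on the order of a couple of centimetres per pixel; on steep terrain the effective value varies across the scene because the distance to the ground is not constant.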

TLS surveys were conducted using a Leica ScanStation P50, which allows the extremely fast scanning (1 million points per second) of large areas along with the extraction of high-quality 3D representations. The range accuracy of the specific laser scanner is estimated at about 1.2 mm for ranges varying from 120 m to 270 m. In addition, scanning can be performed under almost any weather conditions (exceptions include stormy winds and heavy rainfall) in a 360-degree horizontal and 270-degree vertical field of view. A Canon EOS 80D camera with 24 MP resolution is mounted onto the Leica ScanStation P50 for the improvement of the sharpness of the obtained point clouds. The processing of the multi-dated point clouds took place in Leica Cyclone software.


**Table 1.** Dates of the repeated UAV, TLS and GNSS surveys.

**Figure 3.** The flight grid of the photogrammetric UAV campaigns.

Moreover, several repetitive static GNSS measurements were executed utilizing a Leica GS08 GNSS Receiver. The adopted methodology for the construction of the permanent GNSS pillars and the subsequent monitoring has already been described in detail [28]. The measurements were performed on permanent pillars (Figure 4), which are located at key points within the area of interest in order to guarantee that each measurement is performed at exactly the same position. In particular, three of these permanent pillars were constructed along the paved road, while two others were placed outside the landslide. GNSS measurements were utilized both to monitor instability phenomena and to verify the results of remote sensing approaches.

**Figure 4.** Permanent location on the road area for the execution of GNSS measurements.

Finally, square 4.5" black and white targets were distributed throughout the area of instability (Figure 5) in each UAV campaign or TLS survey, aiming at minimizing georeferencing errors and enhancing the registration quality of the multi-dated outputs. The position of each target was measured using a Leica GS08 GNSS Receiver.

#### *3.2. Methodology*

The current research focuses on the accurate and timely monitoring of landslide activity in an environmentally sensitive area, using low-cost, repeatable remote sensing data obtained by UAV and TLS sensors. Our main purpose is to inform the local authorities about the stability of the area within two days (maximum) of the field surveys, so that they can respond immediately by planning the appropriate measures in case of a possible emergency (future landslide). An overview of the adopted methodology is shown in the following flowchart (Figure 6). The validation of the results takes place through their comparative assessment with GNSS measurements, performed on permanent positions (Figure 7).

**Figure 5.** (**a**) Distribution of GCPs in the area of interest, (**b**) 4.5" black and white target, (**c**) 4.5" black and white target within the landslide area.


**Figure 6.** Flowchart depicting the applied methodology.

**Figure 7.** Distribution of the GNSS permanent pillars. The orthophoto presents the sliding area on 21 April 2019.

Specifically, the obtained UAV images were processed in Agisoft Metashape (v. 1.7.2., Agisoft LLC, St. Petersburg, Russia) according to SfM photogrammetry. The technique transforms the overlapping, multi-view UAV images into a three-dimensional object model [36–38]. The origin of SfM is traced to the fields of computer vision and photogrammetry. In our case, the UAV images were aligned using the highest-quality option as described in the Agisoft Metashape manual [39]. The specific option is inextricably linked to the quality of the 3D reconstruction since the camera positions are calculated more accurately. In addition, the collected images were processed at their original size and, at the same time, upscaled by a factor of 4. The dense point cloud generation was performed according to the ultra-high-quality setting, which allows for the creation of more detailed and precise depth maps. Concerning the camera calibration and optimization, the default setting of Agisoft Metashape was selected. The SfM processing resulted in the extraction of point clouds, which were projected into the Hellenic Geodetic Reference System 1987.

The processing of the TLS data was carried out in Leica Cyclone software. Six scan stations were required in each TLS survey to fully cover the area of interest. Therefore, the primary step of the processing was the correct registration of the scans obtained from the different scan positions. The procedure took place using Leica Cyclone REGISTER 360 through the identification of the 4.5" black and white targets. The registration error in the specific step was measured at 6 mm. Afterward, the multi-dated point clouds were imported into Leica Cyclone REGISTER in order to be properly aligned. The alignment was executed through the detection of common points between the point clouds. The derived TLS three-dimensional representations along with the corresponding UAV-based point clouds were utilized for the monitoring of the landslide's evolution.

#### **4. Results**

#### *4.1. UAV Surveys*

The systematic monitoring of mass movements across the investigated area using UAV imagery was based on change detection approaches. Specifically, high-resolution orthophotos were utilized for the visual identification of surface changes between the repetitive sliding episodes. Some typical relief changes, detected between the first mass movements in April 2019 (Table 1) and the more extensive ones in April 2020, are presented in Figure 8. Specifically, Figure 8a,b show area 1 between the two episodes, while Figure 8c,d correspond to area 2. In both cases, the crosshair is located in the same position. As can be observed, instabilities were evolving as evidenced by the displacement of large conglomerate boulders and soil towards the area of the road as well as vegetation changes.

**Figure 8.** Orthophotos: (**a**) Area 1 on 11 April 2020, (**b**) Area 1 on 21 April 2019, (**c**) Area 2 on 11 April 2020, (**d**) Area 2 on 21 April 2019.

After the occurrence of the extensive mass movements (April 2020), local authorities decided to execute some stabilization measures in order to reduce the risk. Operations took place in late July 2020 and included scaling and trimming of the overhanging conglomerates. Therefore, the evolution of the topography of the investigated area from April 2020 to the end of 2021 is strongly related to man-made activities as well as natural processes.

The derived products of UAV photo processing were used for monitoring the topographic variation arising from both factors. In fact, multi-dated orthophotos, covering the investigated area, were exploited for the determination of the areas of instabilities through the monitoring of the vegetation evolution (Figure 9). The initial extent of the mass movements is displayed in blue, whilst the corresponding boundary of November 2021 is shown in red. The lines present significant changes, which are expressed by local vegetation variations and they are related to stabilization operations. Relief modifications related to the specific operations are equally obvious in surface profiles derived from UAV's DSMs (Figure 10). The surface profile of April 2019 is shown in magenta, while the respective profile after the remediation of the slope is depicted in red.

**Figure 9.** Evolution of the extent of the mass movements over time.

Concerning the monitoring of the instabilities using UAV point clouds, the processing relied on direct cloud-to-cloud comparison, which was carried out in the Leica Cyclone 3DR extension. The results of this comparison, related to either man-made activities or natural processes, are presented in Figure 11. In particular, UAV point clouds obtained on 10 June 2020 and 25 September 2020 were exploited to detect relief variations that emerged from the stabilization measures (Figure 11a). The greatest changes ranged from 1.5 m to 1.7 m and are highlighted in reddish to magenta colors. Regarding the evolution of the mass movements influenced by natural processes, 3D point clouds acquired on 25 September 2020 and 7 November 2021 were compared (Figure 11b). Following this comparison, small, scattered surface variations were detected throughout the landslide. In the specific output, the spatial distribution of the topographic variations was overestimated, which is associated with the sparser density of UAV point clouds and the temporal changes in vegetation. It is worth mentioning that in both periods, the surface changes depicted the dispersion area, despite the positive values that emerged from the processing procedure. Unfortunately, local authorities were removing the sliding materials from the area of the road daily, so it was impossible to map the accumulated volume.
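The principle behind a direct cloud-to-cloud comparison can be sketched in a few lines: for every point of the later cloud, the distance to its nearest neighbour in the earlier cloud is computed. This is only an illustration of the general idea under simplifying assumptions, not the algorithm implemented in Cyclone 3DR; the function names, the 5 cm tolerance and the coordinates are hypothetical.

```python
import math


def cloud_to_cloud(compared, reference):
    """Distance from each compared point to its nearest reference point."""
    return [min(math.dist(p, q) for q in reference) for p in compared]


def fraction_changed(distances, tolerance_m):
    """Share of compared points that moved by more than the tolerance."""
    return sum(d > tolerance_m for d in distances) / len(distances)


# Tiny made-up clouds (x, y, z in metres), for illustration only.
epoch_1 = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (2.0, 0.0, 10.0)]
epoch_2 = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.0), (2.0, 0.0, 8.5)]

distances = cloud_to_cloud(epoch_2, epoch_1)
changed = fraction_changed(distances, tolerance_m=0.05)
```

The brute-force search above scales with the product of the two cloud sizes, so real point clouds with millions of points require a spatial index such as a k-d tree or an octree. In this simple form the distances are unsigned, which means that material loss and material gain cannot be told apart from the distance value alone.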

**Figure 10.** Surface profiles capturing relief modification after the stabilization operations. (**a**) Distribution of the profile sections within the area of interest. (**b**) Multitemporal surface profiles. The surface profile of April 2019 is shown in magenta, while the respective profile after the remediation of the slope is depicted in red.

#### *4.2. TLS Surveys*

The detection of potential phenomena of instability within the area of interest, utilizing TLS surveys, was performed through cloud-to-cloud comparisons between the acquired multi-dated point clouds. In particular, TLS sensors are able to identify surface differences arising from man-made activities (Figure 12a) or natural processes (Figure 12b) in a more accurate way. The largest surface change associated with the stabilization measures was calculated at about 1.60 m and is close to the actual topographic modifications (Figure 12a). On the other hand, scattered relief changes related to erosion processes were noticed between the point clouds obtained on 25 September 2020 and 7 November 2021; they were estimated at about 0.5 m (Figure 12b). The procedure concerned the dispersion areas and not the areas of accumulation, as has already been noted.

**Figure 11.** (**a**) Cloud-to-cloud comparison between the UAV point clouds, acquired on 10 June 2020 and 25 September 2020. (**b**) Cloud-to-cloud comparison between the UAV point clouds, acquired on 25 September 2020 and 7 November 2021. (**c**) UAV point cloud acquired on 25 September 2020 in RGB.

**Figure 12.** (**a**) Cloud-to-cloud comparison between the TLS point clouds acquired on 10 June 2020 and 25 September 2020. (**b**) Cloud-to-cloud comparison between the TLS point clouds, acquired on 25 September 2020 and 7 November 2021. (**c**) TLS point cloud acquired on 10 June 2020 in RGB.

Moreover, surface profiles of the multi-dated TLS 3D topographic representations were generated in the context of more comprehensive monitoring of the mass movements within the area of interest. Specifically, the intersection between the surface profiles of the TLS point clouds managed to capture accurately and efficiently the parts where the works for the remediation of the slope took place (Figure 13a). Red shades highlighted the areas where the greatest topographic changes have taken place due to the aforementioned operations. Although the output was expected, considering the accuracy of TLS data, it was truly surprising to discover that TLS sensors are able to detect micro-displacements associated with the particularly slow evolution of the topography of the area of interest (Figure 13b). The blue-green colors of the intersection depict the micro-displacements, while the road area seems to be lifting. In fact, these local surface changes, which emerged from the intersection of TLS surface profiles, were estimated at about 0.321 m and they have been confirmed by field observations and GNSS measurements (Table 2). These micro-displacements that indicate the ongoing activity of the slide, are easily observable in the contours of the multi-dated TLS representations, especially in the area of the road (Figure 14).
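The deviation between two surface sections can be pictured as the elevation difference between profiles sampled along the same line. The sketch below is a simplified illustration under stated assumptions: each profile is a list of (chainage, elevation) pairs with strictly increasing chainage, the later profile is linearly interpolated at the stations of the earlier one, and all names and values are hypothetical rather than taken from the surveys.

```python
def interpolate(profile, s):
    """Linearly interpolate the elevation at chainage s.

    profile is a list of (chainage, elevation) pairs with strictly
    increasing chainage.
    """
    for (s0, z0), (s1, z1) in zip(profile, profile[1:]):
        if s0 <= s <= s1:
            t = (s - s0) / (s1 - s0)
            return z0 + t * (z1 - z0)
    raise ValueError("chainage outside the profile")


def profile_deviation(earlier, later):
    """Elevation change (later minus earlier) at each station of the earlier profile."""
    return [(s, interpolate(later, s) - z) for s, z in earlier]


# Made-up profiles in metres, for illustration only.
profile_2020 = [(0.0, 10.0), (5.0, 12.0), (10.0, 14.0)]
profile_2021 = [(0.0, 10.0), (10.0, 13.0)]

deviation = profile_deviation(profile_2020, profile_2021)
largest_change = max(abs(dz) for _, dz in deviation)
```

Negative values indicate lowering of the surface between the two epochs, and positive values indicate uplift or accumulation.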

**Figure 13.** (**a**) Deviation between the surface sections of TLS point clouds acquired on 10 June 2020 and 25 September 2020. (**b**) Deviation between the surface sections of TLS point clouds, acquired on 25 September 2020 and 7 November 2021.


**Table 2.** Surface deformation on permanent GNSS pillar 2.

#### *4.3. Monitoring Overview and Computational Effort*

As already mentioned, the main objective of this work is to provide information regarding the stability of the investigated area to the local authorities immediately, using low-cost and environmentally friendly approaches. The results demonstrated that UAVs, as well as TLS point clouds, are able to monitor the stability of the selected area precisely and effectively at regular intervals; local authorities have access to operational information about the selected site within 2 days.

In more detail, each UAV flight is performed in about 60 min and each scanning requires approximately 4 h for an 11,423 m² area with a length of 120 m (Table 3). The processing of the collected UAV and TLS data was performed in Agisoft Metashape and Cyclone software, respectively. The processing time for each approach is displayed in Table 3, while Table 4 contains the characteristics of the computer used. It is worth noting that TLS cloud density was particularly high across the repeated surveys, in contrast to the density of the UAV point clouds, which was clearly sparser. Indeed, the increased and constant point density of the TLS point clouds is responsible for the good performance of the specific sensor even in the identification of micro-displacements. Furthermore, the operational cost of the applied approach is directly related only to the purchase of equipment and software or to repairing/upgrading issues.

**Table 3.** Characteristics and computational effort of UAV/TLS point clouds.


**Table 4.** Characteristics of the used computer.


Meanwhile, we calculated the volume of the hanging rocks utilizing the point clouds obtained by either UAV or TLS, in the context of a comparison of the data used (Table 5). The calculation was carried out using the stockpile tools in Cyclone 3DR. Both approaches yielded similar results, which are slightly different from the assessment of the volume of the detached hanging rocks, as calculated by the local authorities. In particular, local authorities appointed staff from the Department of Geology of the University of Patras as consultants to monitor the instabilities within the area of interest [40]. In this framework, multi-dated UAV orthophotos and DSMs were imported into a GIS environment in order to calculate the volume of the hanging rocks. The estimation was based on simple mathematical operations between the digitized extents of the hanging rocks and the elevation differences. As can be observed, the specific estimation led to an overestimation of the volume, which is expected since the calculation was performed in a rougher way. On the contrary, the proposed methodology, consisting of UAV and TLS surveys, achieved a more realistic calculation of the volume with millimetric-scale accuracies.
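The GIS-style estimate described above, which combines a digitized extent with elevation differences, can be sketched as a sum of per-cell elevation changes multiplied by the cell area. This is a minimal illustration under stated assumptions: both DSMs are co-registered grids of equal size, the digitized extent is given as a Boolean mask, and the function name and all values are hypothetical. It does not reproduce the stockpile tools of Cyclone 3DR.

```python
def removed_volume(dsm_before, dsm_after, cell_size_m, mask=None):
    """Volume (m^3) of material lost between two co-registered DSM grids.

    Positive values mean that the surface was lowered. mask, if given,
    is a grid of booleans marking the cells inside the digitized extent.
    """
    cell_area = cell_size_m ** 2
    volume = 0.0
    for i, (row_before, row_after) in enumerate(zip(dsm_before, dsm_after)):
        for j, (z_before, z_after) in enumerate(zip(row_before, row_after)):
            if mask is None or mask[i][j]:
                volume += (z_before - z_after) * cell_area
    return volume


# Made-up 2 x 2 grids with 0.5 m cells, for illustration only.
before = [[12.0, 12.0], [11.0, 11.0]]
after = [[10.0, 10.0], [10.0, 11.0]]
extent = [[True, True], [True, False]]

volume = removed_volume(before, after, cell_size_m=0.5, mask=extent)
```

The result is sensitive to the cell size and to how generously the extent is digitized, which is one plausible reason why a coarse grid-based estimate can differ from a point-cloud-based one.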


**Table 5.** Estimation of the volume of the hanging rocks using UAV/TLS point clouds.

#### **5. Discussion**

Over the years, numerous researchers have investigated landslides through the development of effective methodologies and tools for different aspects of landslide research, such as vulnerability mapping, risk assessment, etc. Nevertheless, proper landslide research has never been more urgent. New possibilities for landslide assessment and risk mitigation have arisen from the ongoing development of technology and remote sensing. The aim of the current study is to provide the local authorities with useful and immediate operational information regarding the stability of an environmentally sensitive area while keeping operational costs as low as possible. On this basis, we performed multiple UAV flights and TLS surveys between April 2020 and November 2021. The collected UAV data were processed via SfM photogrammetry, resulting in the generation of multi-temporal point clouds. The accuracy of the derived SfM products has already been examined in several studies using quantitative and qualitative methods along with a variety of reference data sets consisting of GNSS measurements, classical topographic measurements, etc. [41–43]. Most studies emphasize the effect of the number of GCPs and their distribution on the accuracy of the derived products [44–46]. Additionally, the registration and alignment of the TLS data led to the extraction of dense three-dimensional representations.

The monitoring and assessment of the landslide activity were carried out through the exploitation of the derived UAV-based point clouds and the respective TLS threedimensional representations. TLS sensors have proven their effectiveness in the analysis of the spatio-temporal evolution of landslides in previous studies [47–49]. In this context, we tried to identify any surface change related to instability phenomena between the multi-dated UAV/TLS point clouds through the direct cloud-to-cloud comparison and the estimation of the deviation between surface sections. Surface variations emerging from either human activities or natural processes were successfully traced by both types of remote sensing data. The greatest surface modifications within the area of interest varied between 1.5 and 1.7 m; they were closely associated with the execution of the stabilization operations. Small, scattered surface displacements were detected throughout the area of interest during the last year as a result of erosion processes. Local authorities are constantly removing the newly induced sliding materials from the road area in order to keep the traffic flowing; however, this affects research in a way, as it only allows the mapping of the dispersion areas. In general, the outputs of the photogrammetric and the TLS processing were comparable; nevertheless, the evolution of the landslide area can be achieved more precisely (millimeter-scale) with the TLS-based point, as the corresponding UAV models were less accurate (centimeter-scale) due to the lower density of the point clouds (Table 3). Indeed, the micro-displacements which were observed in the TLS outputs (Figures 12–14), have been confirmed by field observations and the analysis of the GNSS measurements. In particular, Figure 15 depicted the x-, y-, z- coordinate variations of the permanent pillar 2 after the works for the remediation of the slope until now. The greatest differences vary between 0.1 m and 0.28 m. 
Therefore, it was demonstrated that the phenomena of instability within the area of interest were still ongoing until the end of 2021, but they are evolving slowly. We also have to mention that, with the exception of the artificial boulder removal, all the other mass wasting over the slope is regularly distributed over the sliding area. These observations support the classification of the landslide as debris falls including minor conglomerate boulders. Similar methodologies based on cloud-to-cloud comparison have been applied towards the determination of the deformation of active landslides or in other emergencies [50–52].
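To make the idea of direct cloud-to-cloud comparison concrete, the following is a minimal sketch that assigns to every point of a later epoch the distance to its nearest neighbour in an earlier epoch. It is an illustration only and not the processing chain used in the study: the function name, the toy coordinates and the 0.05 m detection limit are assumptions, and the brute-force search would be replaced by a spatial index for real point clouds.

```python
import math

def cloud_to_cloud_distances(reference, compared):
    """Distance from each point of `compared` to its nearest neighbour in `reference`."""
    return [min(math.dist(p, q) for q in reference) for p in compared]

# Hypothetical toy epochs (x, y, z in metres); the last point has dropped by 0.25 m.
epoch_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
epoch_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, -0.25)]

distances = cloud_to_cloud_distances(epoch_1, epoch_2)

# Assumed detection limit (metres): smaller differences are treated as noise.
DETECTION_LIMIT_M = 0.05
changed_points = [p for p, d in zip(epoch_2, distances) if d > DETECTION_LIMIT_M]
print(changed_points)
```

Nearest-neighbour distances are unsigned, so a sketch of this kind flags where the surface changed but not whether material was lost or gained.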

**Figure 15.** Variation of the coordinates of permanent pillar 2 over time. The *x*-axis corresponds to the period of GNSS measurements, while *y*-axis displays the observed displacements.

Many previous studies have examined the synergy of new technologies for landslide monitoring in conjunction with the minimization of operational costs. Synergistic use of TLS point clouds and GNSS measurements was proposed by [53] in order to map a landslide deformation in Puerto Rico. In order to avoid the high cost of purchasing commercial software, that study adopted an open-source software package named Generic Mapping Tools for the processing of TLS data. Another study [54] proposes a combination of 12 low-cost single-frequency GNSS sensors, one seismometric station enhanced with another single-frequency GNSS sensor and only one dual-frequency GNSS receiver established outside of the unstable area. That system provided daily information to the local stakeholders about the landslide movement. Another low-cost GNSS network was established by [55], composed of six sensors inside the landslide area and one outside to be used as a reference. Special attention was given to the logistic limitations (no electrical power, no Wi-Fi, etc.). The GNSS network results were analyzed with other data derived from boreholes, piezometers, inclinometers, crackmeters and meteorological stations. The complementarity of TLS and UAV point clouds and the operational cost decrease with repeated UAV campaigns in areas with steep reliefs were also discussed [56]. Ref. [56] emphasized the fact that in areas of very steep topography, the TLS data acquisition implies discontinuities in the point cloud and, as a result, there is no homogeneous rendering over the broader area. This gap in point cloud data is filled with low-cost UAV campaigns. Similarly, it was mentioned that UAV campaigns at low altitudes and SfM photogrammetry can overcome the visibility limitations that are present in land-based methodologies, such as TLS [57–59]. The TLS and UAV data combination for landslide monitoring in Brazil is presented in [60].
It is mentioned that the combination of those two methods presents great advantages over conventional, costly and time-consuming methods.

The current work proposes an effective methodology for the monitoring of challenging areas using low-cost data, acquired by UAVs and TLSs along with repetitive GNSS measurements. The total survey time (for both UAV and TLS) is only 5 h, while the processing of the collected data is carried out within 24 h. Thus, local authorities are informed about the stability of the landslide area, almost simultaneously with the monitoring procedure (within 2 days). The suggested methodology can be used as a guide for monitoring challenging sites or for the quick detection of surface/terrain changes in other emergencies. In more detail the main recommendations of the proposed methodology are:


#### **6. Conclusions**

The provision of operational information concerning the stability of an environmentally sensitive area to local authorities for the mitigation of the risk was the main objective of the current research. Based on this, the monitoring of the selected site was implemented using timely and low-cost remote sensing data. The appropriately processed multi-dated UAV-based point clouds and the respective TLS point clouds were submitted to comparison procedures aiming at the determination of the evolution of the landslide over time. Direct cloud-to-cloud comparison and the intersection between surface sections were exploited as change detection approaches. The overall evolution of the landslide is divided into two sub-periods: one that was in line with human activities and another emerging from natural processes, which are still in progress. Both types of data managed to trace the surface changes in the different sub-periods, proving that UAV and TLS data can be used effectively in emergencies and for the accurate classification of the landslide as debris falls. In fact, the outputs of the photogrammetric and the TLS processing were comparable. Specifically, the largest surface changes detected in UAV/TLS point clouds for the first monitoring period varied between 1.5 m and 1.7 m. Additionally, the small local surface changes (0.20 m–0.30 m) identified on the sliding area during the second monitoring period, related to natural processes, indicated that the landslide was still active until November 2021, but it is evolving slowly. It is worth mentioning that the immediate and precise monitoring of the evolution of the landslide seems to be more efficient with the usage of the TLS-based point clouds, which are denser throughout the monitoring procedure. However, integrated use of UAV and TLS data could be suggested as a hybrid approach for the improvement of the spatial coverage and the point density of UAV-based point clouds.
Eventually, this low-cost methodology can be utilized as a guide for the monitoring of challenging or sensitive sites or for the rapid detection of deformations arising from natural disasters or human activities.

**Author Contributions:** Conceptualization, K.G.N., I.K.K. and A.K.; methodology, K.G.N. and A.K.; software, K.G.N. and A.K.; validation, K.G.N., I.K.K. and A.K.; formal analysis, K.G.N., I.K.K. and A.K.; investigation, K.G.N., I.K.K. and A.K.; resources, K.G.N. and I.K.K.; data curation, K.G.N., I.K.K. and A.K.; writing—original draft preparation, K.G.N. and A.K.; writing—review and editing, K.G.N., I.K.K. and A.K.; visualization, K.G.N., I.K.K. and A.K.; supervision, K.G.N. and I.K.K.; project administration, K.G.N. and I.K.K.; funding acquisition, K.G.N. and I.K.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Data available on request due to restrictions. The data presented in this study are available on request from the corresponding author. The data are not publicly available due to public safety reasons.

**Acknowledgments:** The authors would like to thank Eidikos Logariasmos Kondilion kai Ereunas University of Patras for the support and the payment of the publication fees.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Evaluation of SAR and Optical Data for Flood Delineation Using Supervised and Unsupervised Classification**

**Fatemeh Foroughnia <sup>1</sup>, Silvia Maria Alfieri <sup>1</sup>, Massimo Menenti <sup>1,2,</sup>\* and Roderik Lindenbergh <sup>1</sup>**


**Abstract:** Precise and accurate delineation of flooding areas with synthetic aperture radar (SAR) and multi-spectral (MS) data is challenging because flooded areas are inherently heterogeneous as emergent vegetation (EV) and turbid water (TW) are common. We addressed these challenges by developing and applying a new stepwise sequence of unsupervised and supervised classification methods using both SAR and MS data. The MS and SAR signatures of land and water targets in the study area were evaluated prior to the classification to identify the land and water classes that could be delineated. The delineation based on a simple thresholding method provided a satisfactory estimate of the total flooded area but did not perform well on heterogeneous surface water. To deal with the heterogeneity and fragmentation of water patches, a new unsupervised classification approach based on a combination of thresholding and segmentation (CThS) was developed. Since sandy areas and emergent vegetation could not be classified by the SAR-based unsupervised methods, supervised random forest (RF) classification was applied to a time series of SAR and co-event MS data, both combined and separated. The new stepwise approach was tested for determining the flood extent of two events in Italy. The results showed that all the classification methods applied to MS data outperformed the ones applied to SAR data. Although the supervised RF classification may lead to better accuracies, the CThS (unsupervised) method achieved precision and accuracy comparable to the RF, making it more appropriate for rapid flood mapping due to its ease of implementation.

**Keywords:** SAR; optical; flood mapping; random forest; Otsu thresholding; unsupervised classification; supervised classification

#### **1. Introduction**

The literature on the detection and mapping of flooding events is rather abundant and documents a broad spectrum of methods, relying on multi-spectral (MS) and microwave remote sensing observations. Given the high dimensionality of flood mapping, we deemed it useful to evaluate in detail a limited number of combinations of methods and remote sensing signals. We chose for this experiment two extreme events in two different areas, which led to two completely different flooding patterns. We considered that by analyzing in detail a limited number of cases, it would be possible to understand the causes of the performance on flood mapping achieved in each case. This includes exploring the advantages and disadvantages of either signal, simple vs. advanced algorithms or single image vs. time series data sets. We conceived this study as an attempt to structure such options as an incremental approach where increasingly complex signals, data sets and algorithms are applied as needed, rather than going a priori for the most complex solution. The capability of Copernicus satellites to acquire remote sensing data at higher spatial and temporal resolution has improved surface water and flood mapping. Both optical and microwave sensors can be used for flood mapping, providing different capabilities and accuracy. MS imaging radiometers measure spectral radiance from the visible (VIS) through the shortwave infrared (SWIR) spectrum. The near-infrared (NIR) region is most suitable to distinguish water from dry surfaces due to the strong absorption of water [1]. Based on this, many simple spectral indices have been developed to delineate water areas using MS images.

**Citation:** Foroughnia, F.; Alfieri, S.M.; Menenti, M.; Lindenbergh, R. Evaluation of SAR and Optical Data for Flood Delineation Using Supervised and Unsupervised Classification. *Remote Sens.* **2022**, *14*, 3718. https://doi.org/10.3390/rs14153718

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 3 June 2022 Accepted: 24 July 2022 Published: 3 August 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The normalized difference vegetation index (NDVI), derived from Red and NIR ranges, has been widely used in water and flood mapping [2–5]. The NDVI, however, is a vegetation index, i.e., it is sub-optimal to capture information on a water surface [6]. The normalized difference water index (NDWI), calculated using Green and NIR spectral radiances, aims to maximize the spectral contrast between water and other terrestrial land covers in the Green and NIR regions [6]. NDWI has been extensively used to map inundated areas [7–12]. Although other indices were also successively developed [13–16], NDWI was demonstrated to provide higher performance in detecting water bodies [17]. The index ranges from −1 to +1, in which positive values are associated with surface water in the ideal situation of deep and clear water. The presence of dense vegetation, however, may easily lead to a higher NIR than Green reflectance, and NDWI values closer to values over land. In these cases, delineation of heterogeneous flooded areas using NDWI is not straightforward [8,18].

Microwave signals, on the other hand, benefit from good penetration through clouds, providing more efficient measurements in cloudy conditions than optical observations [19]. The difference in surface roughness is the main feature to detect surface water using synthetic aperture radar (SAR) data. Ideally, smooth open water exhibits specular reflection, i.e., away from the line of sight (LOS) of the SAR sensor, in strong contrast with the scattering of surrounding natural surfaces in dry conditions [20]. SAR backscattering is mainly influenced by soil roughness and the soil dielectric constant [21]. Specular reflectance can be affected by weather conditions, such as wind and precipitation, and also by ground-target types such as emergent vegetation, making the detection of open water difficult [22,23]. In addition, overestimation of the water extent using SAR backscatter is also frequent in sandy areas due to the similarity of radar backscatter over sand and water [24]. Notably, the quality of radar imaging of sandy regions is affected by the random reflection of the incident electromagnetic pulse which results in a loss of energy [25].

Various methodologies have been applied to delineate surface water from MS and SAR data. Water surfaces can be delineated by unsupervised [18,26–31] and supervised approaches [18,32–37] using single or multiple bands.

The literature reviewed above shows that in the case of an ideal situation, i.e., without any disturbance factors, unsupervised thresholding approaches provide a quick assessment of flooded areas. Thresholding methods, however, may perform less effectively in the presence of disturbance factors, which influence the real SAR backscatter and optical reflectance of the targets. When using MS data, NDWI thresholding may fail to detect standing water bodies beneath dense canopies and emergent vegetation due to the sensitivity of NIR reflectance to vegetation [38,39]. When using SAR data, flooded vegetation or forests appear bright due to the double and/or multi-bouncing effects, i.e., the interaction between the water surface and the vertical structure of stems and trunks [40–42]. Wind waves can also roughen the water surface, causing an increase in SAR backscatter to a similar or even higher level than in surrounding non-flooded areas [43]. Moreover, the speckle noise inherent to all coherent imaging devices causes statistical fluctuations in the backscatter of pixels, which prevents stable estimates of threshold values [44].

Technically speaking, although histogram thresholding is one of the most rapid techniques in flood mapping, the selection of a suitable threshold value represents a critical step that strongly influences the outcome [45,46]. Essentially, a threshold-based method requires a bimodal histogram to binarize an image into the two semantic classes, target and background. However, since the water class only represents a small portion of the whole image in most flood cases, the histogram of the image values is often not obviously bimodal and it becomes difficult to separate the two classes [47]. To address this issue, some studies tried to divide the image into many sub-images and then apply the thresholding method to each sub-image to estimate a suitable threshold, where the histogram was bimodal [47,48]. An alternative is to merge all sub-images containing a sufficient number of flood pixels and to estimate one global threshold value which is then applied to all sub-images [49]. Other than pixel-based thresholding discrimination, image segmentation techniques, which gather connected homogeneous pixels into patches, can provide information at the object level. Furthermore, in the case of analyzing SAR data, image segmentation will reduce the speckle effect because both morphological and radiometric information is used.

The literature review led us to identify the following gaps in knowledge:


In this study, we addressed these gaps by developing and applying a stepwise approach to delineate surface water types and flooded areas, defined by the comparison of surface water areas before and during a flood event.

The research goal of this study was to evaluate alternate combinations of remote sensing data and delineation methods to determine flood extent. In this study, the flooded area is determined as the difference between the surface water area during a flooding event and the surface water area before it.

The approach applied in the study required multi-temporal image analysis. We have analyzed the MS and SAR signatures to identify a procedure to separate different water and non-water surfaces. In addition, both the MS and SAR signatures are likely to be rather heterogeneous due to the combined effects of terrain, vegetation, and sediments transported by flood water. For example, the optical signatures of emergent vegetation and turbid water were largely overlapping, but these surface types could be discriminated using SAR backscatter signatures. The optical signature of clear water, however, was very different from anything else and suggested the possibility of delineating this surface type using a simple spectral index. Hence, a classical thresholding procedure, i.e., with a predefined threshold, was not applicable to separate all water surface types. Therefore, we used a grid-based Otsu thresholding related to the distribution of threshold values in a set of heterogeneous sample areas. This approach, however, does not solve the problem of the fragmentation of flooded areas. To deal with fragmentation, we developed and applied a new unsupervised approach that benefits from the combination of thresholding and segmentation methods (CThS).

Given the heterogeneity of the water surfaces, we have experimented with AI algorithms to explore whether we could discover additional classification rules to classify different surface water types, which then could be aggregated to delineate the entire "surface water" area. The supervised classification method, random forest (RF), was applied to our datasets. This solution was suggested by its performance being less affected by outliers and noisy data, along with the easier parametrization and the absence of assumptions on data distribution [50]. Flood maps obtained with the RF classifier were explored to understand (a) the achievable improvements by using, first, either only SAR or MS data and, second, by combining both datasets for flood delineation and (b) which features are determinant in improving flood map accuracy. This has been done particularly focusing on the heterogeneous cases mentioned earlier. It should be noted that our first and second classification approaches are unsupervised while the third one is supervised. The solutions proposed in this study have been evaluated in two different case studies with highly heterogeneous water surfaces under different hydro-meteorological conditions.

The accuracy and precision of the methods were then evaluated using different reference datasets. A comparison between the three methods was performed and the difference in accuracy due to the use of different methodologies was evaluated.

#### **2. Materials and Methods**

#### *2.1. Case Studies: Sesia and Enza Rivers*

We carried out two case studies during extreme flood events in areas located in Northern Italy along the Sesia and Enza rivers (Figure 1). The extreme events were selected and characterized by [51] as part of a study on extreme hydro-meteoric events in the Emilia-Romagna region during the period 1989–2018.

**Figure 1.** Location map of the two case studies in Italy. The lower and upper pictures show where the 2017 and 2020 events occurred, respectively. The blue lines represent rivers' routes. The footprints of the image tiles, i.e., the borders of the regions of interest, are indicated by black solid squares. The background is Google satellite imagery, available in the QGIS environment.

The Sesia is a left tributary of the Po River with the catchment entirely located in the Piemonte region. Its source is the Monte Rosa at 2500 m. It rapidly flows in the Valsesia valley where several smaller rivers flow into it, largely increasing its discharge. Between 2 and 3 October 2020, several flood events occurred in the Piemonte region. Among them, the one that occurred along the Sesia caused an embankment failure near Caresana, at the boundary with Pavia Province, leading to extensive flooding of agricultural fields and inundation of the municipalities of Borgosesia and Vercelli.

The Enza river, flowing between the Parma and Reggio provinces (Emilia-Romagna region), is a right tributary of the river Po. Its source is in the Alpe di Succiso, in the northern Apennines at 1406 m. An extreme flood event occurred on 12 December 2017 when the Enza reached the maximum historical level of 12.44 a.m.s.l. [52] at Sorbolo in the Parma province. Consequently, the river breached the embankment near Lentigione, inundating the entire urban area and forcing hundreds of residents to evacuate.

#### *2.2. Datasets*

The Sentinel-1 (S-1) and Sentinel-2 (S-2) satellites provide users with short revisit time data, good global coverage, and quick and free image delivery, and have good potential in land monitoring and emergency response [53,54].

According to the European Space Agency (ESA), the S-1 dual-polarized level-1 ground range detected high resolution (GRDH) products can be used in mapping flood-affected areas. These datasets are acquired, multi-looked and projected to the ground range using an Earth ellipsoid model. The resulting product has approximately square pixels with a spatial resolution of 10 m, with speckle reduced at the cost of a coarser spatial resolution. A number of pre-processed (radiometrically calibrated and terrain corrected) GRD products with VV and VH polarizations (see Table 1) were downloaded from the Google Earth Engine (GEE) server. S-1 SAR imagery in the GEE consists of Level-1 GRD scenes processed to backscatter coefficient (*σ*°) in decibels (dB) to ensure that images are statistically comparable [55]. The backscatter coefficient captures the target backscattering area (radar cross-section) per unit of ground area. Because this coefficient can vary by several orders of magnitude, it is usually converted to dB as 10·log<sub>10</sub>*σ*°. The pre-processing of SAR GRD data in GEE includes applying orbit tracking, GRD border noise removal, thermal noise removal, radiometric correction and terrain correction using shuttle radar topography mission (SRTM) digital elevation data. In addition, some single look complex (SLC) data products were downloaded from the ESA Copernicus Open Access Hub [56] since we used the phase information as explained in Section 2.3.4. Level-1 SLC products consist of geo-referenced SAR data and are provided in zero-Doppler slant-range geometry. The data include a single look in each azimuth and range direction using the full transmit signal bandwidth and contain complex samples preserving the phase information [56].
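The decibel conversion mentioned above can be sketched in a few lines. The function names and the sample values are hypothetical and serve only to show how values spanning several orders of magnitude are compressed into a compact range.

```python
import math

def sigma0_to_db(sigma0_linear):
    """Convert a linear backscatter coefficient to decibels: 10 * log10(sigma0)."""
    if sigma0_linear <= 0:
        raise ValueError("sigma0 must be positive to take its logarithm")
    return 10.0 * math.log10(sigma0_linear)

def db_to_sigma0(sigma0_db):
    """Invert the conversion: sigma0 = 10 ** (dB / 10)."""
    return 10.0 ** (sigma0_db / 10.0)

# Hypothetical linear values spanning three orders of magnitude.
for value in (0.001, 0.01, 0.1, 1.0):
    print(f"{value:>6} -> {sigma0_to_db(value):6.1f} dB")
```

Because the logarithm is non-linear, the mean of dB values generally differs from the dB value of the mean linear backscatter, so it matters in which domain any averaging is carried out.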


**Table 1.** Dates of GRDH Sentinel-1 (S-1), SLC S-1, and Sentinel-2 (S-2) data used. The explanation of the data products can be found in the text above. The flood and non-flood images are represented in blue and orange color respectively. The footprints of the images are also shown in Figure 1.

The S-2 satellites carry the multi-spectral instrument (MSI) providing high spatial resolution multispectral imagery. MSI measures the Earth's reflected radiance in 13 spectral bands from VIS/NIR to SWIR with a spatial resolution ranging from 10 m to 60 m. Some S-2 Level-2A products (see Table 1) were downloaded from the ESA Copernicus Open Access Hub. S-2 Level-2A (MS) products are provided after applying radiometric, geometric and atmospheric correction and were directly used in further processing. S-2 atmospheric correction (S2AC) is based on the algorithm Atmospheric/Topographic Correction for Satellite Imagery [57]. This algorithm allows retrieval of bottom-of-atmosphere (BOA) reflectance from top-of-atmosphere (TOA) reflectance images, available as Level-1C products. The method performs atmospheric correction based on the LIBRADTRAN radiative transfer model [58].

A complete list of the images applied in this study and the date of their acquisition is given in Table 1, while the footprints of the images are shown in Figure 1. A two-month time series per event, including 13 images (twelve non-flood and one flood image) in the same path (ascending with the track numbers of 15 and 88 for the events in 2017 and 2020, respectively), was used.

#### *2.3. Methods*

#### 2.3.1. Overview of the Approach

The first step in the procedure adopted in this study is to evaluate SAR and MS signals. We identified, first, the main landscape units by interpreting true and false color composites, then analyzed the optical and microwave signatures of such units. The analyses of the signatures suggested that surface water types can be discriminated better by combining MS and SAR signatures. The characterization of landscape units is described in Section 3.1.

Second, a new stepwise workflow was developed, as schematically illustrated in Figure 2, to delineate heterogeneous surface water. In the first classification experiment, the Otsu method was applied to distinguish two classes with minimal intra-class variance and maximal inter-class difference. In the second experiment, we focused on improving the delineation of fragmented flood water patches by combining thresholding and segmentation. Based on the MS and SAR signatures, we labeled these classes as water and non-water, as required by these unsupervised classifications. The SAR backscatter and NDWI images were used for the unsupervised methods.

**Figure 2.** The work-flow of the approach, consisting of three different methods.

Flooded areas are more readily detected with co-polarized SAR data than with cross-polarized data [59–61]. The S-1 data do not include HH backscatter data for our case studies, so we used the pre-processed S-1 backscatter data with only VV polarization, although the literature suggests that HH data may perform better [55,62,63]. Further detailed information about unsupervised methods can be found in Section 2.3.2. The RF approach was applied to multiple features obtained from SAR and MS data with a dual scope: first, to discover new classification rules to classify different surface water types and, second, to evaluate the classification performance when using either SAR or MS data only and when combining them. The RF supervised method is described in Section 2.3.3.

In this study, the flooded areas were determined as the difference between the water area during a flooding event and the permanent water areas before it. The accuracy and precision of water maps were evaluated by applying three different methods and two datasets.
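The definition of the flooded area as a difference between two water maps can be illustrated with a small sketch on binary masks. The function names, the toy masks and the 10 m pixel size used for the area conversion are assumptions made for this example.

```python
def flooded_mask(water_during, water_before):
    """A pixel is flooded if it is water during the event but was not water before it."""
    return [
        [bool(d) and not bool(b) for d, b in zip(row_d, row_b)]
        for row_d, row_b in zip(water_during, water_before)
    ]

def area_km2(mask, pixel_size_m=10.0):
    """Area covered by the True pixels of a mask, assuming square pixels."""
    n_pixels = sum(sum(row) for row in mask)
    return n_pixels * pixel_size_m ** 2 / 1e6

# Hypothetical 3 x 3 water maps (1 = water): a river column that overflows its banks.
before = [[0, 1, 0],
          [0, 1, 0],
          [0, 1, 0]]
during = [[1, 1, 0],
          [1, 1, 1],
          [0, 1, 0]]

flood = flooded_mask(during, before)
print(area_km2(flood))
```

Permanent water (the river column) is excluded from the result, so only the newly inundated pixels contribute to the flooded area.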

There are four innovative elements in the proposed workflow: (1) the stepwise approach as an exploration of the capability of each dataset to distinguish landscape units starting from a simple method and simple data to increasingly complex algorithms and features to resolve ambiguities remaining at each step; (2) the combination of thresholding and segmentation; (3) the combination of optical and SAR derived features for RF classification and (4) the use of time-dependent features (anomalies) in the RF.

#### 2.3.2. Unsupervised Methods

Basically, the global thresholding method assumes that image pixel intensity values follow a bimodal frequency distribution (histogram). The method tries to find a single intensity threshold that separates pixels into two classes, foreground and background. However, in most flood cases, the water feature covers only a small fraction of the scene, and the bimodality does not appear in the histogram. Furthermore, the abundance of two spectrally different features in the image, such as bare soil and vegetation, may give a threshold that is not appropriate to delineate water. To tackle the non-bimodality issue, we applied the simple Otsu thresholding method to sub-images and proposed a new unsupervised method (called CThS) based on the combination of histogram thresholding and active contour segmentation methods.

• Otsu thresholding method

The threshold in the Otsu method is determined by minimizing intra-class intensity variance, or equivalently, by maximizing inter-class variance [64]. To tackle the abovementioned issue of non-bimodality, Otsu thresholding was applied to small sub-images of the original image. The entire image was subsampled into one hundred sub-images using a regular grid, and thresholds found were pooled to determine their frequency distribution (histogram). A unique threshold was determined from the histogram of sub-image thresholds. If the histogram of the thresholds is not bimodal, the threshold is identified first by a visual analysis to determine a threshold range that is not referring to water features and by excluding values in this range. Then, the maximum value (for SAR backscatter) or the minimum value (for NDWI) of the remaining thresholds is selected and applied to the entire image.
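The grid-based procedure described above can be sketched as follows. This is a simplified, pure-Python illustration, not the implementation used in the study: the 64-bin histogram, the function names, the toy backscatter image and the numeric range standing in for the visually determined non-water range are all assumptions.

```python
def otsu_threshold(values, bins=64):
    """Histogram-based Otsu threshold: maximize the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant tile, no meaningful split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_var, best_k, w0, sum0 = -1.0, 0, 0, 0.0
    for k in range(bins - 1):
        w0 += hist[k]
        sum0 += centers[k] * hist[k]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        between = w0 * w1 * (sum0 / w0 - (sum_all - sum0) / w1) ** 2
        if between > best_var:
            best_var, best_k = between, k
    return lo + (best_k + 1) * width  # upper edge of the last low-class bin

def tile_thresholds(image, n_rows, n_cols):
    """Apply Otsu thresholding to each cell of a regular n_rows x n_cols grid."""
    h, w = len(image), len(image[0])
    thresholds = []
    for r in range(n_rows):
        for c in range(n_cols):
            rows = range(r * h // n_rows, (r + 1) * h // n_rows)
            cols = range(c * w // n_cols, (c + 1) * w // n_cols)
            thresholds.append(otsu_threshold([image[i][j] for i in rows for j in cols]))
    return thresholds

def select_sar_threshold(thresholds, non_water_range):
    """Drop thresholds inside the non-water range, then take the maximum of the rest."""
    lo, hi = non_water_range
    return max(t for t in thresholds if not lo <= t <= hi)

# Hypothetical VV backscatter (dB): left tile half water, right tile land only.
land = [-9.0, -7.0, -9.0, -7.0]
image = [[-20.0] * 4 + land for _ in range(2)] + [[-8.0] * 4 + land for _ in range(2)]

thresholds = tile_thresholds(image, 1, 2)
threshold = select_sar_threshold(thresholds, non_water_range=(-12.0, 0.0))
water = [[value < threshold for value in row] for row in image]
```

For NDWI, where water has high values, the same pooling step would take the minimum of the remaining thresholds and the mask would keep pixels above it, following the description in the text.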

• The CThS method (combination of thresholding and segmentation)

The main idea of CThS is to find seeds that are definitely samples of water areas. To identify the water seeds, a two-step procedure is applied using a textural feature, namely entropy, which maximizes the contrast between homogeneous pixel samples. First, the entropy image is generated by applying a moving window. Then, a water and a non-water mask are constructed by applying a threshold to the moving window. The window size and the threshold are estimated by trial and error using the Otsu delineation of rivers and lakes in a pre-event image as a reference. By applying the mask to the input image at full resolution, the distribution of NDWI and VV backscatter values for water and non-water pixels is obtained. These distributions were applied to identify the water seeds. The histogram of all extracted seeds is reasonably bimodal so that a suitable threshold value can be determined by fitting a curve to the histogram to separate water and non-water pixels. The minimum turning point of the curve determines the threshold to extract water seed pixels. Having separated water seed points, in the second step an active contour segmentation method is used to delineate the full flood extent. The segmentation extends the initial seeds to fragmented patches. The active contour segmentation method has been widely employed for flood mapping [35,65,66]. Tong et al. (2018) showed that the Chan–Vese (C-V) active contour model [67] is computationally more efficient than the classical snake model, while it also performs better in weak boundary detection than the snake model. The snake model needs an initial set of boundary points, which are identified by applying the water and non-water gradient as a characteristic of boundary points. The gradient is estimated using an initial set of water patches. Because of the irregular and extensive distribution of the inundated areas, it is challenging to construct the initial set of water patches and estimate robust statistics on the gradient.
Thus, the C-V model was applied in this study.
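The entropy texture used in the first CThS step can be illustrated with a small moving-window sketch. It covers only the texture computation and a simple low-texture mask, not the seed thresholding or the C-V segmentation; the window size, the number of quantization levels, the entropy cut-off and the toy image are assumptions chosen for the example.

```python
import math
from collections import Counter

def local_entropy(image, window=3, levels=16):
    """Shannon entropy (bits) of quantized values in a window x window neighbourhood."""
    h, w = len(image), len(image[0])
    flat = [v for row in image for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant image
    q = [[min(int((v - lo) / span * levels), levels - 1) for v in row] for row in image]
    half = window // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            counts = Counter(
                q[a][b]
                for a in range(max(0, i - half), min(h, i + half + 1))
                for b in range(max(0, j - half), min(w, j + half + 1))
            )
            n = sum(counts.values())
            out[i][j] = -sum(c / n * math.log2(c / n) for c in counts.values())
    return out

# Hypothetical VV backscatter (dB): smooth water on the left, textured land on the right.
image = [
    [-20.0] * 3 + [(-12.0 if (i + j) % 2 == 0 else -5.0) for j in range(3, 6)]
    for i in range(4)
]

entropy_image = local_entropy(image)

# Assumed cut-off (bits): windows with lower entropy are treated as homogeneous.
ENTROPY_CUTOFF = 0.2
homogeneous = [[e < ENTROPY_CUTOFF for e in row] for row in entropy_image]
```

In this toy case the smooth left columns have zero entropy while the textured right columns approach one bit, which is the contrast the seed selection relies on.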

#### 2.3.3. Supervised Random Forest Classification

RF is a machine learning-based method, which combines many weak classifiers, the individual decision trees, to obtain a strong classifier, the Random Forest, consisting of all decision trees together [68], and is, therefore, an example of a so-called ensemble method. The method takes a number of features as input, and, when applied to classification, is trained by a set of training feature vectors for which it is known to which class they correspond. All possible values of all features in the feature vectors together form the feature space. Each node of a decision tree in the random forest corresponds to a split of the feature space for one of its features, or one dimension. To partly decorrelate the decision trees, one individual tree is only built using a subset of the features, and each split in a decision tree is determined in the training phase such that it minimizes impurity, where the impurity is a measure of heterogeneity (entropy) of the two subsets of the feature space, generated by the split [69].
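How a single node chooses its split can be sketched for one feature as follows. The sketch uses entropy as the impurity measure, as in the description above, and a weighted average of the two child impurities as the quantity to minimize; the function names, the candidate thresholds (midpoints between sorted values) and the toy samples are assumptions.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted child impurity) of the best split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for k in range(1, n):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical feature values
        left = [label for _, label in pairs[:k]]
        right = [label for _, label in pairs[k:]]
        impurity = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if impurity < best[1]:
            best = ((pairs[k - 1][0] + pairs[k][0]) / 2, impurity)
    return best

# Hypothetical training samples: VV backscatter (dB) with known classes.
vv = [-21.0, -19.0, -18.0, -10.0, -9.0, -7.0]
classes = ["water", "water", "water", "land", "land", "land"]

threshold, impurity = best_split(vv, classes)
print(threshold, abs(impurity))
```

In a random forest the same search is repeated at every node, but only over a random subset of the features, which is what partly decorrelates the trees.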

Each decision tree is created by selecting at random only two-thirds of the training feature vectors with replacement. The remaining one-third of the training samples are assigned as out-of-bag (OOB) data [68], which are used for inner cross-validation to evaluate the performance of RF. The importance of the input variables can be measured, which indicates their contributions to the classification accuracy [68]. Only two parameters need to be specified to parameterize the classifier: ntree, the number of decision trees making up the whole forest, and mtry, the number of randomly selected features. In general, the OOB error decreases with the growth of ntree, and a plot of OOB error vs. ntree is needed to check whether a given number of trees is sufficient to achieve the required performance in the grown forest [50].

A key functionality of the RF is the application of alternate criteria and metrics to rank candidate features on the basis of their importance. Gini importance, or mean decrease impurity (MDI), is one of the methods to calculate the feature importance. For each feature, it is possible to assess how much, on average, it decreases the impurity. The average over all trees in the forest is then the measure of the feature importance.
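The impurity decrease that underlies MDI can be made concrete with a small sketch for a single split. It uses the Gini index as the impurity measure, which is an assumption of this example (entropy could be substituted); the function names are hypothetical. In a full forest, the decreases attributed to a feature are weighted by the fraction of samples reaching each node and averaged over all trees.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Impurity of the parent node minus the size-weighted impurity of its children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted
```

For example, a node holding two water and two non-water samples has a Gini impurity of 0.5; a split that separates the two classes perfectly removes all of it.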

#### 2.3.4. Feature Generation

S-2 bands 3 (*Green*) and 8 (*NIR*) were used to generate the *NDWI* image for both pre and in-flood dates. *NDWI* is mathematically expressed (Equation (1)) as a combination of *NIR* and *Green* bands [6]:

$$NDWI = \frac{Green - NIR}{Green + NIR} \tag{1}$$

where, *Green* and *NIR* refer to the reflectance in the green and near-infrared bands of the MS data, respectively. The *NDWI* products from S-2 data are represented in Figure 3.
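Equation (1) can be evaluated per pixel as in the following minimal sketch. The function name and the handling of a zero denominator (returning NaN) are assumptions of this example.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index from green and NIR reflectance (Equation (1))."""
    denom = green + nir
    if denom == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (green - nir) / denom
```

Values range from -1 to 1. A pixel reflecting more in the green band than in the NIR band yields a positive index, while the opposite yields a negative one.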

A total number of 32 features at pixel level were calculated as input for Random Forest Classification. These are based on statistics calculated with SAR amplitude, phase, temporal information and textural characteristics of S-1 data (Table 2).

**Figure 3.** VV-derived SAR backscatter (**left**) and NDWI (**right**) images for the events in (**a**) 2017 and (**b**) 2020.

**Table 2.** Random forest SAR-derived features.


The main parameter retrieved by SAR sensors is the backscatter coefficient, i.e., the amplitude squared of the SAR complex signal. SAR backscattering is mainly affected by soil roughness and soil dielectric constant [71]. Flooded areas appear darker, i.e., with lower backscatter coefficient values, than non-flooded areas due to specular reflectance of flood water surfaces when smooth, i.e., still and free of emergent vegetation. We did also benefit from VH available SAR polarization in line with the findings of [72]. The pre-processed VV (Figure 3) and VH SAR backscatter images were retrieved from the GEE server.

Additionally, some polarimetric and phase-based features can be extracted from the dual-polarized SLC data. SAR interferometry (InSAR), which provides information about the Earth's topography by processing two or more SAR data images, produces interferometric coherence. InSAR coherence is sensitive to the physical changes in the ground surface and is therefore useful for image segmentation and identification of geometeorological and hydrological features [73]. Coherence (*γ*) is defined as the normalized cross-correlation coefficient between two interferometric images *I*<sub>1</sub> and *I*<sub>2</sub>:

$$\gamma = \frac{E\left(I\_1 \cdot I\_2^\*\right)}{\sqrt{E\left(\left|I\_1\right|^2\right)E\left(\left|I\_2\right|^2\right)}},\tag{2}$$

where *E* is the expectation operator, and the asterisk indicates complex conjugation.
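As an illustration of Equation (2), the sketch below estimates the coherence magnitude from two lists of co-registered complex samples. Replacing the expectation operator with an average over a local estimation window, and returning the magnitude of *γ* rather than the complex value, are assumptions of this example; the function name is hypothetical, and the study itself computed coherence in SNAP.

```python
import math


def coherence(i1, i2):
    """Magnitude of the sample coherence between two complex sample lists."""
    cross = sum(a * b.conjugate() for a, b in zip(i1, i2))
    power = sum(abs(a) ** 2 for a in i1) * sum(abs(b) ** 2 for b in i2)
    if power == 0:
        return 0.0
    return abs(cross) / math.sqrt(power)
```

Two identical signals, or signals that differ only by a constant phase shift, give a coherence of 1, whereas signals whose phases change independently between acquisitions give values close to 0.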

An interferogram generated from two images before and after a flood represents flooded areas with uncorrelated phase information, thus more incoherent than non-flooded ones, since both dielectric constant and surface roughness of flooded patches can be different [73]. The SLC image pairs acquired on 7–13 December 2017 and 27 September– 3 October 2020 were used to generate coherence maps for the first and second events, respectively. The short spatio-temporal baselines ensure that the least coherent areas most likely are flooded areas. The coherence processing was conducted using the Sentinel Toolbox software (SNAP) and consists of steps to apply orbit files, pre- and post-flood images co-registration, de-speckling (Refined Lee filter with the window size of 5 × 5), interferogram/coherence generation, Sentinel de-bursting, and multi-looking (to get square pixels). The coherence image finally was geocoded by correcting SAR geometric distortions using a digital elevation model (DEM), here the 1 Arc-Second SRTM DEM.

H/Alpha dual-polarimetric decomposition, which allows the separation of different scattering mechanisms, was also included in the RF classification. The H/Alpha decomposition of dual-polarization data uses an eigenvector analysis of the coherence matrix, which separates the parameters into scattering processes and their relative magnitudes [74]. Two parameters are extracted from the H/Alpha decomposition, entropy (H), and alpha (α). Entropy is calculated from the eigenvalue information and represents the heterogeneity of the scattering. Alpha (α) is calculated from the eigenvectors and represents a rotation that indicates the type of scattering mechanism.

Texture is one of the characteristics used in identifying objects or regions of interest in an image. The so-called gray-level co-occurrence matrix (GLCM) method is used to extract second and higher-order statistical texture features, considering the relationship between neighboring pixels. The GLCM function is an image texture indicator that works by computing the frequency of occurrence of pixel pairs with specific values and in a specific spatial relationship within an image. Fourteen textural features can be calculated from the probability matrix to derive the characteristics of texture statistics of images. Detailed definitions of the textural features can be found in [70]. We used, however, ten GLCM-derived features for each polarization, which are listed in Table 2.
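The co-occurrence counting described above can be sketched as follows. The image is assumed to be quantized to integer grey levels in the range 0 to `levels - 1`, the matrix is made symmetric, and contrast and homogeneity are shown only as two examples of GLCM statistics; whether they coincide with the features in Table 2 is not implied. All names are hypothetical.

```python
def glcm(image, levels, dr=0, dc=1):
    """Normalized, symmetric grey-level co-occurrence matrix for the offset (dr, dc)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # pair counted in both directions
                counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts  # no pixel pair fits the offset
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p(i, j) weighted by the squared grey-level difference."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    """Sum of p(i, j) weighted towards the diagonal of the matrix."""
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(len(p)) for j in range(len(p)))
```

A uniform patch concentrates all probability on the diagonal (zero contrast, homogeneity of 1), which is the behaviour expected of smooth open water in a quantized backscatter image.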

At the same time, to improve the reliability of classification, some temporal SAR features, including standard deviation (*Std*), temporal Z-scores (*Zs*) [75] and normalized anomaly (*Anomaly*) of image pixels within our time-series images, were also calculated according to Equations (3), (4) and (5), respectively. Flooded pixels can be identified by using these indicators as features in the RF classification. The temporal Z-score, *Zs*, is a measure of the difference between the backscatter during the flood and the mean backscatter during the entire period of the observations (including pre-event data and co-event data). The anomaly is a measure of the difference between the backscatter during the flood and the mean backscatter during the non-flood period.

The time-series SAR backscatter data (including flood and non-flood images) was used to calculate the above-mentioned temporal features. Note that during these time periods, only the events analyzed in the current study occurred. These features were computed separately for both VV and VH polarizations.

$$Std = \sqrt{\frac{\sum\_{n=1}^{N} \left(\sigma\_n^\circ - \sigma\_M\right)^2}{N-1}},\tag{3}$$

where *σ*<sub>*n*</sub><sup>°</sup> indicates the SAR backscatter coefficient of each pixel of the *n*th image within the time series, *σ*<sub>*M*</sub> is the mean backscatter of each pixel along the whole stack of images, and *N* is the number of images.

$$\text{Zs} = \frac{\sigma\_F^\circ - \sigma\_M}{Std},\tag{4}$$

$$Anomaly = \frac{\sigma\_F^\circ - \sigma\_{M\\_pre}}{\sigma\_{Max}^\circ - \sigma\_{Min}^\circ},\tag{5}$$

where *σ*<sub>*F*</sub><sup>°</sup> represents the SAR backscatter of the flood image, *σ*<sub>*M*\_*pre*</sub> is the mean backscatter of the pre-flood data stack only, and *σ*<sub>*Max*</sub><sup>°</sup> and *σ*<sub>*Min*</sub><sup>°</sup> refer to the maximum and minimum of flood backscatter.
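Equations (3) to (5) can be evaluated for a single pixel as in the sketch below. The function names are hypothetical, and the maximum and minimum backscatter in Equation (5) are passed in by the caller, because how they are derived is not specified beyond the definition above.

```python
import math


def temporal_std(series):
    """Sample standard deviation of one pixel's backscatter time series (Equation (3))."""
    n = len(series)
    mean = sum(series) / n
    return math.sqrt(sum((s - mean) ** 2 for s in series) / (n - 1))


def z_score(flood, series):
    """Temporal Z-score of the flood backscatter against the whole stack (Equation (4))."""
    mean = sum(series) / len(series)
    return (flood - mean) / temporal_std(series)


def anomaly(flood, pre_series, sigma_max, sigma_min):
    """Normalized anomaly against the pre-flood mean (Equation (5)).

    sigma_max and sigma_min are supplied by the caller (assumption of this sketch).
    """
    pre_mean = sum(pre_series) / len(pre_series)
    return (flood - pre_mean) / (sigma_max - sigma_min)
```

A pixel whose backscatter drops during the flood relative to its usual level gives negative values for both the Z-score and the anomaly.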

For MS data, the R, G, B, and NIR bands and the same GLCM-derived features (see Table 2) of RGB bands (at pixel level), as well as NDWI (15 features altogether for only the flooding date), are used for the classification.

#### 2.3.5. Evaluation of the Classifications

To construct training and testing datasets, we first identified by expert interpretation the landscape units observable in both true color (R = S2 band 4, G = S2 band 3, and B = S2 band 2) and false color composites (R = S2 band 8; G and B as in the true color composite). Then, we analyzed the spectral and SAR signatures of such units to evaluate their separability. These signatures were estimated by sampling the MS and SAR images at locations identified as similar in both the true and false color composites. Specifically, the samples were randomly collected in small polygonal blocks so that all pixels within each polygon represented the same class. The training and testing samples were 70% and 30% of the total number of samples, respectively. The total number of samples was approximately 14,000 and 15,200 pixels for case studies in 2017 and 2020 respectively.

Three methods were applied to evaluate the accuracy of water maps, using different testing datasets. To compare the supervised and unsupervised results, the five classes mapped with the supervised RF classifications were aggregated into water and non-water classes. The testing dataset was used to calculate the producer accuracy related to the water class. The second reference dataset was obtained by delineating the water class in the entire scene and then estimating the fractional abundances (fraction of classified water pixels to the total number of pixels) of water on a regular 500 m resolution grid. The third evaluation was based on estimating the precision of each method by comparing each estimate within each grid cell with the median of all the estimates in the same cell.
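The aggregation of the five classes and the first evaluation can be sketched as follows. The class codes for soil and vegetation, the function names, and the use of the standard definition of producer accuracy (correctly classified reference pixels of a class divided by all reference pixels of that class) are assumptions of this example.

```python
WATER_CLASSES = {"EV", "TW", "CW"}  # emergent vegetation, turbid water, clear water


def to_binary(label):
    """Collapse a five-class label into water / non-water."""
    return "water" if label in WATER_CLASSES else "non-water"


def producer_accuracy(reference, predicted, target="water"):
    """Share of reference pixels of the target class that were classified as that class."""
    hits = 0
    total = 0
    for ref, pred in zip(reference, predicted):
        if ref == target:
            total += 1
            if pred == target:
                hits += 1
    return hits / total if total else float("nan")
```

Applying `to_binary` to both the reference and the classified labels before calling `producer_accuracy` puts the supervised and unsupervised results on the same two-class footing.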

#### **3. Results**

This section presents the results of the different analyses in our research. First, the SAR and MS signatures of different landscape units are described and interpreted. The results of unsupervised and supervised methods are then presented. After the accuracy and precision of the water maps have been evaluated, results are illustrated for heterogeneous surface water, which makes flood delineation more challenging.

#### *3.1. SAR and Multispectral Signatures of the Classes*

The classes to be mapped by the supervised classification were identified by visual interpretation of the true and false color composites of the optical images acquired during the flood events. The two composites clearly identified similar land features, as shown in Figure 4, which were taken as a reference for the sampling and analysis of spectral signatures.

The first inspection (Figure 5) of the spectral profile of the land features led us to identify five different classes. According to the observed signatures, three water subclasses can be distinguished: emergent vegetation (EV), turbid water (TW) (defined here as flood) and clear water (CW). On the other hand, the signatures in Figure 5 identify two non-water classes: soil and vegetation. Due to the presence of sediments, most of the flooded areas had spectral characteristics different from clear water, which typically tends to have low reflectance values. Turbid water spectral reflectance is typically higher at increasing solid particle concentration [76,77]. The spectral reflectance of emergent vegetation was similar to that of turbid water (flood), except for the higher NIR and SWIR reflectance, while the spatial variability of the reflectance of turbid water was rather limited. The spectral profiles shown in Figure 5 suggest that the five classes might be discriminated by combining the SAR and MS signatures. The partial overlap of the MS signatures of emergent vegetation and flood water should be solved by using the VV and VH backscatter (Figure 5b), which are clearly different. On the other hand, as expected, there is a clear overlap in the backscatter signatures of emergent and terrestrial vegetation, which can be tackled using the sharp contrast in the corresponding MS signatures.

**Figure 4.** First row: Example of false color composites (R = S2 band 8(NIR), G = S2 band 3 (Green), B = S2 band 2 (Blue)) over (**a**) emergent vegetation (EV), (**b**) clear water (CW) and (**c**) turbid water (TW) classes. Second row: example of true color composite (RGB) over (**d**) emergent vegetation, (**e**) clear water, and (**f**) turbid water classes.

**Figure 5.** (**a**) Spectral profile (average and standard deviation) (**b**) VV and VH backscattering of the classes (EV: Emergent Vegetation, Flood: flood (turbid) water, Water: clear water, vegetation, and soil) included in the training dataset for Random Forest classification (the 2020 event).

#### *3.2. Flood Maps Derived from Unsupervised Methods: Otsu and CThS Methods*

The unsupervised classification methods were applied to flood SAR VV backscatter and optical images for each case study. Flood maps obtained by Otsu and CThS methods are shown in Figure 6. The RGB true color composite of the S-2 data for the flooding date is used in the background as a reference to provide visual support. To delineate only the area actually flooded during each event, pre-flood maps of permanent water bodies were generated and removed from the water maps obtained during the flood events. In both case studies, Otsu thresholding applied to MS data provided better delineation of flood areas than using SAR VV backscatter. In fact, it provided better-defined patterns than the SAR data, which gave more fragmented flood maps, i.e., some flood patterns with defined geometry, observable in the true color composite, were not well identified.

Results obtained by the CThS method (Figure 6e–h) showed that similar flood patterns were obtained by CThS compared to Otsu. The main difference between the two methods is that the CThS provided a less fragmented shape of flood areas than Otsu (see illustrations in Section 3.4. for a better visualization of the differences), as expected with the contour reconstruction. Overall, the total flooded areas were also comparable for the two 2017 and 2020 events and the given method (Figure 7). Differences between estimates based on SAR and MS were rather large, however. These differences are due to the partial overlap of the SAR signatures of TW and EV with vegetation and soil (Figure 5 and Table 3). The smaller difference between the SAR and MS total flooded area estimates for the 2020 event suggests, as expected, better performance of the CThS method when dealing with fragmented flooded areas.

#### *3.3. Flood Maps with Supervised Methods: Random Forest Classification*

The previous classification experiments highlighted a large variability in spectral signatures and the complexity of classifying water and non-water. To explore the potential advantages of using a larger number of features, we applied the RF classifier on either SAR and MS or both. Specifically, according to Section 2.3.4, we used 32 SAR (Table 2), 15 MS-derived features, and 47 SAR + MS features.

The optimal number of trees (ntree) can be determined by plotting the OOB error versus ntree and locating the point where the OOB error curve converges. Figure 8a,b exhibits the ranking of SAR feature importance we used for the classification of the flood event in 2017 (Enza), and a plot of the OOB error curve for the flood event 2020 (Sesia), respectively. The larger the number of trees, the smaller the error. On the basis of the trend shown in the figure, we regarded the error as stabilized at around 400 trees, where the fluctuations became smaller than 0.001; the residual fluctuations beyond 400 trees were ignored. Similarly, we regarded the contribution of additional features past the first six as negligible (Figure 8a), since the incremental contribution of each additional feature was smaller than 0.01. Therefore, to reduce the computational load, 400 trees were selected and only the six most important features were used to apply the RF classifier. A comparison between the results obtained with all features and with the first six features showed differences in overall accuracy of less than three percent, confirming that the best six features were sufficient for our cases.
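The convergence criterion for ntree can be expressed as a short sketch: given OOB errors evaluated at increasing forest sizes, it returns the smallest size after which successive changes in the error all stay below a tolerance. The function name, the default tolerance of 0.001 taken from the text, and the sample values in the usage note are illustrative assumptions, not results from the study.

```python
def stable_ntree(ntrees, oob_errors, tol=0.001):
    """Smallest forest size after which every successive change in OOB error is below tol.

    ntrees and oob_errors are parallel lists ordered by increasing forest size.
    If the curve never settles earlier, the largest size is returned; an empty
    input returns None.
    """
    for k in range(len(ntrees)):
        tail = oob_errors[k:]
        if all(abs(b - a) < tol for a, b in zip(tail, tail[1:])):
            return ntrees[k]
    return None
```

With hypothetical errors of 0.12, 0.10, 0.095, 0.0902, 0.0900 and 0.0904 at 100 to 600 trees in steps of 100, the function returns 400, since all later changes are smaller than 0.001.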


**Figure 6.** Flood maps obtained by unsupervised methods for the case studies Enza and Sesia rivers. The Otsu results of VV for the case study of (**a**) Enza river (in 2017) and (**c**) Sesia river (in 2020) and of NDWI for the case study of (**b**) Enza river (in 2017) and (**d**) Sesia river (in 2020). The CThS results of VV for the case study of (**e**) 2017 and (**g**) 2020 and of NDWI for the case study of (**f**) 2017 and (**h**) 2020. Pixels classified as water are displayed in blue color, overlaid onto the RGB true color composite of flooding date. The backscatter data on 7 December 2017 and 27 September 2020 as well as the MS data on 24 October 2017 and 9 August 2020 were used to map the permanent water body areas by each method separately.

**Figure 7.** Total flooded areas estimated with Otsu and CThS methods using SAR and multi-spectral data for the events (**a**) 2017, Enza river and (**b**) 2020, Sesia river.



Random Forest flood maps provided even better-defined flood patterns compared to unsupervised results (Figures 6 and 9). Well-defined geometric patterns were identified and mapped correctly when using SAR features. This is especially evident by looking at the event that occurred in 2020 (Figures 6 and 9).

The results obtained with the RF classifier using either SAR only or MS only or combined SAR and MS features for the event 2020 were compared by calculating confusion matrices (Table 3). The performance of classification was good when using MS features. In all cases, the percentage of misclassification of emergent vegetation as soil is rather high, with the worst case being SAR features. Likewise, the ambiguity between emergent vegetation and terrestrial vegetation was high in the case of SAR classification. This was slightly improved by combining MS features with SAR features.

**Figure 8.** (**a**) Random Forest feature importance based on MDI for SAR-derived features, and (**b**) OOB error vs. number of trees for the best six SAR-derived features (event 2017).

**Figure 9.** RF flood maps obtained for the two events in Enza (event 2017) and Sesia (event 2020) rivers by SAR (**a**) and (**d**), MS (**b**) and (**e**) and a combination of SAR and MS feature (**c**) and (**f**). Pixels classified as water are displayed in blue color, overlaid onto the RGB true color composite of flooding date. The backscatter data on 7 December 2017 and 27 September 2020 as well as the MS data on 24 October 2017 and 9 August 2020 were used to map the permanent water body areas by the Otsu thresholding method.

We have evaluated the composition of the lumped water class delineated by the unsupervised Otsu and CThS methods by using the information on water type in the testing dataset (Table 4). In addition, we have compared the EV, TW, and CW pixels classified by the supervised method on SAR and SAR + MS with the actual number of EV, TW, and CW pixels in the testing dataset. It appears that the SAR + MS RF classifier captured the largest fraction of the EV pixels, although still much lower than the total number of EV pixels in the testing dataset. As expected, the unsupervised methods captured almost all CW pixels in the testing dataset, but only part of the TW pixels and a small fraction of the EV pixels. The lumped water class delineated by the CThS method included a greater number of TW pixels than by the Otsu method and close to the number of TW pixels identified by the RF classifier applied to SAR data. The use of MS features in combination with the SAR improved the number of pixels that were classified correctly as emergent vegetation and turbid water.

**Table 4.** Otsu and CThS classifiers: number of pixels lumped in the water class, disaggregated into emergent vegetation (EV), turbid water (TW), and clear water (CW) according to the testing dataset. Number of pixels classified as EV, TW, and CW by the SAR RF and SAR + MS RF classifiers and actual number of EV, TW, and CW pixels according to the testing dataset.


#### *3.4. Evaluation of Flood Delineation*

As defined in the methodology section, the water maps obtained by the different methodologies used in this work have been assessed in three ways. In the first evaluation, the producer accuracy of classified water (=percentage of correctly classified pixels divided by the total number of the testing samples, i.e., roughly 4000 and 4500 for the events 2017 and 2020, respectively) obtained by the different methodologies has been evaluated (Table 5).

**Table 5.** Water classification accuracies for the case studies of Enza (the event in 2017) and Sesia rivers (the event in 2020).


As mentioned before, the three water classes mapped with RF, i.e., emergent vegetation, turbid water, and clear water, were aggregated into a single water class. The accuracy values of the water class confirmed a better performance with MS data compared to SAR for all the methods, as stated in the analysis of the flood maps provided in Sections 3.2 and 3.3. The CThS method outperformed the Otsu method. The accuracy improvement ranges from about 1% (case 2017) to 19% (case 2020) when using SAR data and from 2% (case 2017) to 5% (case 2020) with MS-based classification. RF provided the highest accuracies, with significant improvements in SAR-based supervised classification compared to unsupervised (from 1% to 20%).

The second evaluation was performed by comparing the fractional abundance of water estimated with each classification against reference values obtained by delineating the water area by visual interpretation of the false color composites. For this purpose, we applied a regular 500 m resolution grid to sample the maps obtained as the results of the classifications. Then, the fractional abundance of water in each cell was plotted against the corresponding reference values (Figures 10 and 11). The plots show a greater dispersion of the results obtained with unsupervised methods compared to RF, where the classification exactly matches the reference values in most cases. The unsupervised methods generally underestimated flood extent, especially when using SAR, which gave the highest root mean square error (RMSE) values (from 17 to 34%) of the fractional abundance of water. Among the unsupervised classifications, the CThS method provided a better delineation, with lower RMSE values for both SAR and MS-based classification. The combination of SAR features with MS in RF classification did not give an improvement in terms of accuracy (Table 6) and flood delineation performance compared to considering only MS features. The Enza river case study (2017) showed a much better agreement between classification results and reference data (Figure 10).
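The quantities used in this comparison can be sketched as follows: the fractional abundance of water in one grid cell, and the RMSE and R2 between per-cell estimates and reference values. Computing R2 as the squared Pearson correlation is an assumption of this example, as are the function names; the binary cell masks are assumed to have been cut from the water maps beforehand.

```python
import math


def fractional_abundance(cell_mask):
    """Percentage of water pixels in one grid cell, given a 2-D boolean mask."""
    pixels = [p for row in cell_mask for p in row]
    return 100.0 * sum(1 for p in pixels if p) / len(pixels)


def rmse(estimates, reference):
    """Root mean square error between per-cell estimates and reference values."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, reference)) / n)


def r_squared(estimates, reference):
    """Squared Pearson correlation between estimates and reference values."""
    n = len(reference)
    mean_e = sum(estimates) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimates, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0 or var_r == 0:
        return float("nan")  # undefined for constant inputs
    return cov * cov / (var_e * var_r)
```

Because the squared correlation ignores systematic bias, a method that consistently underestimates flood extent can still reach a high R2, which is why the RMSE is reported alongside it.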

**Figure 10.** Plots of the fractional abundance of water (%) calculated over a regular 500 m resolution grid for the flood event 2017 (Enza river): results obtained by supervised and unsupervised methodologies (*x*-axis) against a reference flood delineation (*y*-axis).

In a further evaluation, the water maps were compared to assess the precision of the methods in each case study. This analysis was performed by evaluating the deviation of the fractional abundance calculated over the 500 m resolution grid from the median value of the results obtained from all the methods. Histograms of the deviations of fractional abundances are shown in Figures 12 and 13 for the events in 2017 and 2020, respectively.
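The precision analysis can be sketched as follows: for each grid cell, the median of the fractional abundances estimated by all methods is taken as the common reference, and each method's deviation from it is computed. The dictionary layout, the method keys, and the values in the usage note are hypothetical.

```python
from statistics import median


def deviations_from_median(estimates):
    """Per-cell deviation of each method from the median over all methods.

    estimates maps a method name to its list of fractional abundances,
    one value per grid cell, with cells in the same order for every method.
    """
    methods = list(estimates)
    n_cells = len(estimates[methods[0]])
    medians = [median(estimates[m][k] for m in methods) for k in range(n_cells)]
    return {
        m: [estimates[m][k] - medians[k] for k in range(n_cells)]
        for m in methods
    }
```

For instance, with three methods estimating 10, 20 and 30% water in a cell, the median is 20% and the deviations are -10, 0 and +10 percentage points; histograms of such deviations over all cells summarize how far each method departs from the consensus.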

**Figure 11.** Plots of the fractional abundance of water (%) calculated over a regular 500 m resolution grid for the flood event 2020 (Sesia river): results obtained by supervised and unsupervised methodologies (*x*-axis) against a reference flood delineation (*y*-axis).

**Table 6.** Coefficient of determination (R2) and root mean square error (RMSE) calculated between the fractional abundance of water calculated from supervised and unsupervised classification methods vs. a reference flood delineation.


The results in Figure 12 indicate that the different methods provided similar estimates for the 2017 event (median deviations close to zero), with unsupervised methods giving in some areas smaller flood extent than RF. MS-based RF classification gave larger (positive) deviations from the other methods as well as larger dispersion of the deviations. Contrariwise, deviations were positive and larger with the SAR-based RF for the 2020 event, with a median value close to 30%. Positive deviations for RF classification in the 2020 event (Figure 14) were determined by the better performance, in terms of accuracy, compared to all the unsupervised methods (Table 5).

**Figure 12.** Deviation from the median value of the fractional abundance of water (event 2017) calculated over the nodes of a regular 500 m grid.

**Figure 13.** Deviation from the median value of the fractional abundance of water (event 2020) calculated over the nodes of a regular 500 m grid.

**Figure 14.** Flood maps of EV: emergent vegetation, TW: turbid water, and sand for all the experiments in 2017 and 2020. Pixels classified as water are displayed in black color, overlaid onto the NIRGB false color composite of the flooding dates. The light blue patches correspond to the river, which was masked out from the flooding water maps (Figure 2). The first row shows the pictures of RGB true color composite of flood conditions; 13 December 2017 and 03 October 2020. The backscatter data on 7 December 2017 and 27 September 2020 as well as the MS data on 24 October 2017 and 9 August 2020 were used to map the permanent water body areas by each method separately.

#### *3.5. Sub-Cases: Emergent Vegetation, Sandy Areas, and Turbid Water*

Based on the literature, there are some land cover types, including emergent vegetation, sandy areas, and turbid water, which make accurate flood mapping challenging. We randomly selected some areas with these land covers to investigate the differences between the flood maps derived using SAR and MS data (Figure 14).

The results indicated that the SAR-based unsupervised classifications did not completely capture the emergent/submerged vegetation observable in the NIR, G and B false color composite. To understand this issue, we compared the distribution of NDWI and backscatter (Figure 15) within the emergent vegetation (selected area in Figure 14) and a part of the river. The similarity in the distributions of the NDWI indicates that emergent vegetation is mapped as water, while this cannot be achieved using backscatter, which has different distributions for emergent vegetation and river water. This implies that the unsupervised methods using NDWI provide a better delineation of the water class, as defined in Section 3.1.

**Figure 15.** Signatures of emergent vegetation vs. reference water body observed during the 2020 flooding event: (**a**) NDWI from S2/MSI data and (**b**) VV backscatter (in dB) from S-1 SAR data.

We observed an improvement in the delineation of emergent vegetation when using SAR-based RF classification, since even fragmented patches of emergent/submerged vegetation were correctly classified (see Figure 14, emergent vegetation sub-case, and Table 4). Random Forest achieved the highest performance in delineating emergent vegetation when using MS and a combination of MS and SAR signals.

Over sandy areas (based on the soil map of Regione Piemonte [78]), SAR data led to an over-estimation of water since the backscatter of sandy soil is similar to water, as shown by our analysis of the frequency distribution of backscatter (Figure 16) and confirmed in the literature [24]. The water extent was over-estimated in these sandy areas even when using RF with backscatter data. The distribution of NDWI suggests a clear threshold and good separability between water and sandy soil.

As illustrated in Figure 14, turbid water was mapped correctly by unsupervised classification since it was still possible to determine an appropriate threshold on NDWI. On the other hand, the pre-flood delineation of turbid water by the active contour segmentation when using MS data of the event 2020 was not completely accurate at the location indicated in Figure 14.

**Figure 16.** Signatures of sandy soils vs. reference water body observed during the 2020 flooding event: (**a**) NDWI from S2/MSI data and (**b**) VV backscatter (in dB) from S1 SAR data.

#### **4. Discussion**

The results presented in the previous section provide answers to the research questions stated in the introduction as regards three main aspects:


1. Delineation of landscape units. Land cover, terrain and the depth of flood water concur in determining fragmented and heterogeneous patterns in floodwater. The high spatial resolution of S1/SAR and S2/MSI may capture small patches of emergent vegetation and of turbid water, which increases the heterogeneity of floodwater. The terrain in the two study areas is rather different, i.e., rather flat in the Sesia area and more heterogeneous in the Enza area. Land cover is also very different, with extensive rice, maize, and pastures, irrigated by flooding, in the Sesia area and more fragmented agricultural land cover in the Enza area. In addition, hydro-meteorological conditions were quite different; the 2020 flood in the Sesia river was caused by an embankment failure (see Section 2.1), rather than extreme rainfall and/or river water level. Contrariwise, the 2017 event in the Enza river was caused by a record high river water level. Precipitation was slightly higher for the 2020 Sesia event than for the 2017 Enza event. In other words, the combination of terrain, land cover and hydro-meteorological conditions made the 2020 event a rather complex flooding pattern, which explains the observed lower performance for this event.

A critical step in our approach was the delineation of landscape units to be mapped by interpreting the true and false color composites. The lack of calibration/validation data is a common problem when observing past extreme events associated with natural hazards. Under such circumstances, it is unlikely that concurrent in situ observations are available to analyze remote sensing data. Photo-interpretation of color composites is a widely used approach in these cases [35,79]. The identification of clearly different land units by photo-interpretation is still a challenge, however, and requires particular attention. Since our main interest was delineating water areas, we mainly focused on the correct identification of different surface water types, i.e., water–vegetation–sediment mixtures. Soil and vegetation classes, even showing intra-class heterogeneity in terms of spectral signature, could be easily identified as unique classes. The spectral signature of the classes is presented in Figure 5, where the mean values of the spectral reflectance confirm the overall separability of the defined classes. The slight overlap of the standard deviations of the emergent vegetation and turbid water classes suggests that a few pixels may present a similar spectral signature. As regards the unsupervised methods, where the target classes are water and non-water, this similarity had no impact on the results obtained by thresholding of NDWI. Contrariwise, the thresholding of backscatter was not adequate to separate emergent (as water) from terrestrial vegetation (as non-water). The ambiguity of SAR backscatter data in the classification of the two classes, however, could be addressed by applying the RF classifier to the combined MS and SAR signatures. The five classes identified on the basis of the true and false color composites could be separated by applying the RF classifier to the combined MS and SAR signatures.

2. Spectral and backscatter features. The spectral and backscatter signatures of flooded areas are complex in two different ways. SAR backscatter is sensitive to the physical characteristics of the ground surface, i.e., roughness and the dielectric constant, making it more difficult to interpret. This concept is supported by the evidence in Table 3. Furthermore, the heterogeneity of the flooding pattern in both events implies that observed targets include rather different components, e.g., different vegetation types and water conditions, that can be better identified using MS spectral features. A flooded area is likely to include patches of turbid water and emergent vegetation, which have different signatures from water. The spectral signatures in Figure 5 confirmed this hypothesis, since the MS signatures of emergent vegetation and turbid water were roughly overlapping at shorter wavelengths, but slightly different beyond 740 nm. On the one hand, the SAR signatures of emergent and terrestrial vegetation were completely overlapping. On the other hand, the combined MS and SAR signatures suggested that it was feasible to separate the five identified classes, as shown by the confusion matrices (Table 3) and by the frequency distributions (Figures 15 and 16). Emergent vegetation during the 2020 event had a spectral signature (NDWI) similar to river water (Figure 15a) and was classified correctly as a component of the water class. When using SAR to observe the same targets, however, the emergent vegetation appeared much brighter than water (Figure 15b) and was not classified correctly. Most likely, this is due to the double-bounce effect, which increases the backscatter and causes an under-estimation of water areas [41,42,80].

Values of NDWI and VV SAR backscatter in a sandy area, where we observed an overestimation of flooded areas with both supervised and unsupervised methods (Figure 14), were also compared with those of reference (river) water (Figure 16). The histograms show that SAR backscatter (Figure 16b) led to an overestimation of flood water through the misclassification of sandy soil because of its weak backscatter (Martone et al., 2014). As in the case of emergent vegetation, the MS data separated the classes correctly: the sandy soil had a "drier" MS signature, i.e., negative NDWI, than water (Figure 16a).
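The sign-based separation of sandy soil from water described above can be illustrated with a minimal sketch. It assumes the common green/near-infrared definition of NDWI, (Green − NIR)/(Green + NIR); the reflectance values below are hypothetical and are not taken from the study data.

```python
def ndwi(green, nir):
    """Normalized Difference Water Index, assuming the green/NIR definition."""
    denom = green + nir
    if denom == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (green - nir) / denom

# Hypothetical surface reflectances (illustrative only, not from the paper).
water_pixel = {"green": 0.08, "nir": 0.02}  # water absorbs strongly in the NIR
sand_pixel = {"green": 0.20, "nir": 0.30}   # dry sand is brighter in the NIR

ndwi_water = ndwi(water_pixel["green"], water_pixel["nir"])
ndwi_sand = ndwi(sand_pixel["green"], sand_pixel["nir"])

# A positive index suggests open water; a negative one suggests a "drier" surface.
print(f"water NDWI = {ndwi_water:+.2f}, sand NDWI = {ndwi_sand:+.2f}")
```

Under these assumed values the water pixel yields a positive index and the sandy pixel a negative one, which is the kind of contrast that a SAR backscatter threshold alone cannot provide when both surfaces return a weak signal.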

3. Classification methods. In general, the complexity of the landscape, as a consequence of the flooding pattern, makes it rather challenging to estimate both a reliable threshold in unsupervised methods and reliable signatures when applying the supervised method. As observed, the flooding pattern in 2020 was more complex than in 2017, thus explaining the generally lower performance of all the methods evaluated in this study (Table 6).

Unsupervised methods demonstrated good overall performance. The grid-based estimation of the water/non-water thresholds gave satisfactory results when applying the Otsu approach to discriminate water from non-water. However, the accuracy analysis revealed better overall performance of the CThS method in delineating water extents compared to Otsu (Table 5). CThS generally improved the delineation of water extents with a better-defined geometric structure, as it uses segmentation to grow the seed points toward the optimal water boundaries (Figure 14 and Table 4). Nevertheless, the water class was misclassified in the pre-flood MS image (see Figure 14). This implies that the map of flood water extent beyond the boundary of the permanent water bodies was less accurate, since the bare soil around the river was confused with the water class in the reference/pre-flood image (Figure 14, 2020 event using MS). As a result, when performing the change detection to remove the reference permanent water bodies, the flooded portion of the bare soil area was removed. On the other hand, the RF classification provided the highest accuracy in our flood mapping cases (Table 5). The advantage of RF appears when dealing with challenging cases, namely emergent vegetation, which cannot be discriminated using SAR data alone, while acceptable results are obtained when using MS signatures. This is rather evident using SAR data alone. However, the CThS method provided, overall, precision and accuracy comparable to the supervised method, and it is more appropriate for rapid flood mapping due to its easy implementation (Table 4).
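To make the thresholding step concrete, the following is a minimal sketch of Otsu's method applied to a list of index values. It is a global, histogram-based version written for illustration only; the grid-based threshold estimation and the CThS seed-growing segmentation used in the study are not reproduced here, and the sample values and the bin count are hypothetical.

```python
def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # a constant image has no meaningful threshold
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]                # pixels at or below the candidate split
        sum0 += centers[i] * hist[i]
        w1 = total - w0              # pixels above the candidate split
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, lo + (i + 1) * width
    return best_t

# Hypothetical NDWI samples: five non-water pixels and five water pixels.
samples = [-0.50, -0.45, -0.40, -0.35, -0.30, 0.20, 0.25, 0.30, 0.35, 0.40]
threshold = otsu_threshold(samples)
water_mask = [v > threshold for v in samples]
print(f"threshold = {threshold:.3f}, water pixels = {sum(water_mask)}")
```

A purely per-pixel threshold of this kind tends to yield fragmented water patches, which is the limitation that the segmentation step of CThS is reported to address.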

Besides the complexity of constructing appropriate training and testing sets and defining efficient features for the supervised method, the computational complexity of RF is much higher than that of CThS. The computational complexity of RF is O(ntree × N × K × log N), where ntree is the number of trees, N is the number of training samples, and K is the number of features [81], which gives, in the more complex case of the 2020 event, O(400 × 15,200 × 6 × log 15,200) ≈ O(152,553,654). The computational complexity of the CThS is mostly related to the active contour segmentation, which is O(M × N), where M and N refer to the image sizes [82]. Hence, the computational complexity of the CThS is equal to O(number of seeds) ≈ O(52,000) in our complex case. The advantage of RF appears only when dealing with challenging cases, namely emergent vegetation, which cannot be discriminated using SAR data alone, while acceptable results are obtained when using MS signatures.
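The order-of-magnitude comparison above can be checked with a few lines of arithmetic. The sketch below assumes a base-10 logarithm, since that is the base that reproduces the quoted figure of about 152.5 million; the function name is ours and purely illustrative.

```python
import math

def rf_cost(n_trees, n_samples, n_features):
    """Operation-count estimate n_tree * N * K * log10(N) for a random forest."""
    return n_trees * n_samples * n_features * math.log10(n_samples)

# Values quoted in the text for the 2020 event.
rf_ops = rf_cost(n_trees=400, n_samples=15_200, n_features=6)
cths_ops = 52_000  # number of seeds quoted for CThS

print(f"RF   ~ {rf_ops:,.0f} operations")
print(f"CThS ~ {cths_ops:,} operations")
print(f"ratio ~ {rf_ops / cths_ops:,.0f}x")
```

Under these figures the RF estimate exceeds the CThS estimate by roughly three orders of magnitude, which is consistent with the authors' preference for CThS in rapid mapping.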

The use of various input features instead of one, as well as the definition of the water classes on the basis of the signatures, increased the possibility of accurate class discrimination. The presence of emergent vegetation and sandy soil was the most problematic issue for flood mapping with the SAR data. Additionally, the overestimation of floods in non-water areas could also be due to the misclassification of vegetation as water. For the turbid water case, most MS-derived features (used in the RF) were able to distinguish between the turbid water and clear water classes, leading to the most accurate delineation with RF. The detection of emergent vegetation by the SAR supervised method improved when compared to the unsupervised methods (Table 4). Both supervised and unsupervised methods overestimated flooded areas in sandy areas where the SAR backscatter signal is weak.

The confusion matrices in Table 3 indicated that SAR data could only discriminate clear water from all other classes. According to our experiments, described above, MS-derived features provided more reliable information on flooding than SAR. For example, the relatively small differences in reflectance beyond 740 nm between emergent vegetation and turbid water (Figure 5) were sufficient to mitigate the misclassifications between the two classes. The use of SAR in combination with MS resulted in more confusion in classifying emergent vegetation and soil compared to MS features alone. That was induced by the presence of sand and the misclassification of sandy areas as water with SAR data. Furthermore, the use of MS features in combination with SAR data improved the separation of emergent vegetation from turbid water and from terrestrial vegetation compared with SAR only (Table 3).

The better performance of MS data for both supervised and unsupervised methods suggests that optical data should be preferred to SAR. However, SAR data provide more efficient measurements in cloudy conditions than optical observations and increase the availability of data during flood events. Our study suggested that the combined use of SAR data and machine learning methods may lead to a better compromise in terms of data availability and method accuracy, providing performance improvements compared to unsupervised methods, notably in the presence of emergent and/or subemergent vegetation. On the other hand, the CThS (unsupervised) method provided, overall, precision and accuracy comparable to the supervised method and is the most rapid technique to delineate flooded areas with acceptable performance.

#### **5. Conclusions**

Flood monitoring by remote sensing is a useful tool for rapid emergency response. The precise and accurate retrieval of flood maps is, however, a challenge, mainly due to the heterogeneity of flooded and land areas. The use of multisource remote sensing imagery increases not only the chance of data availability at the time of extreme events but also precision and accuracy due to the different nature of signals. The goal of this paper was to evaluate the precision and accuracy of alternate combinations of classification methods and measurements of different and complementary natures (MS and SAR).

Flood mapping of two events in different regions of interest using S-1 (SAR) and S-2 (MS) datasets acquired during the 2017 and 2020 heavy precipitation was performed and evaluated. Two unsupervised methods, Otsu and CThS, as well as the RF supervised method, were applied. The results indicated that multi-spectral data provided more accurate flood maps using all methods compared to SAR data. Otsu-derived maps exhibited more fragmented flooding areas, which was addressed by applying the CThS method. The CThS method takes advantage of both thresholding and segmentation approaches. Consequently, better-defined patterns of inundated areas were obtained. Generally, the CThS resulted in more reliable water maps than Otsu.

Some areas, such as emergent vegetation and sandy soil, led to misclassifications when using VV SAR backscatter data. The issue was tackled by applying supervised RF, in which different intensity-, phase-, texture- and temporal-based features were utilized to improve the SAR classification. An improvement was observed for the emergent vegetation case, while some overestimation of the water class over sandy soil remained with the RF as well. In another experiment, the RF classifier was also applied to MS-derived features separately, as well as to the combination of all SAR and MS features together. The highest accuracy in flood mapping was obtained by the supervised RF method in all the cases. Accuracies of 92%, 99%, and 99% were achieved for the 2017 event using SAR, MS, and SAR + MS, respectively. Similarly, high values were obtained for the 2020 event, i.e., 64%, 98%, and 98%. Taking all the solutions evaluated in this study together, a better performance was achieved when using MS data, possibly due to the high heterogeneity of the two flooded areas because of the combined effect of terrain, land cover, and hydro-meteorological conditions in the 2017 and 2020 events.

**Author Contributions:** Conceptualization, S.M.A., M.M. and R.L.; methodology, F.F. and S.M.A.; software, F.F. and S.M.A.; validation, F.F., S.M.A. and M.M.; formal analysis, F.F. and S.M.A.; investigation, F.F. and S.M.A.; resources, F.F. and S.M.A.; data curation, F.F. and S.M.A.; writing—original draft preparation, F.F.; writing—review and editing, S.M.A., M.M. and R.L.; visualization, F.F. and S.M.A.; supervision, S.M.A., M.M. and R.L.; project administration, M.M. and R.L.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was carried out under the framework of OPERANDUM (OPEn-air laboRAtories for Nature baseD solUtions to Manage hydrometeorological risks) project, which is funded by the Horizon 2020 Program of the European Union under Grant Agreement No. 776848. M.M. acknowledges the support received from the MOST High-Level Foreign Expert program (Grant No. GL20200161002) and the Chinese Academy of Sciences President's International Fellowship Initiative (Grant No. 2020VTA0001).

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors are grateful to B. Pulvirenti, F. Porcu and colleagues at the University of Bologna for discussions on flood events to be analyzed and vulnerable areas. We would also like to thank the European Space Agency (ESA) for providing Sentinel-1 and Sentinel-2 data free of charge.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


## *Article* **Using a Lidar-Based Height Variability Method for Recognizing and Analyzing Fault Displacement and Related Fossil Mass Movement in the Vipava Valley, SW Slovenia**

**Tomislav Popit \*, Boštjan Rožič, Andrej Šmuc, Andrej Novak and Timotej Verbovšek**

Department of Geology, Faculty of Natural Sciences and Engineering, University of Ljubljana, Aškerčeva 12, SI-1000 Ljubljana, Slovenia; bostjan.rozic@ntf.uni-lj.si (B.R.); andrej.smuc@ntf.uni-lj.si (A.Š.); andrej.novak@ntf.uni-lj.si (A.N.); timotej.verbovsek@ntf.uni-lj.si (T.V.)

**\*** Correspondence: tomi.popit@ntf.uni-lj.si; Tel.: +386-14-704-601

**Abstract:** The northern slopes of the Vipava Valley are defined by a thrust front of Mesozoic carbonates over Tertiary flysch deposits. These slopes are characterized by a variety of different surface forms, among which recent and fossil polygenetic landslides are the most prominent mass movements. We used the height variability method as a morphometric indicator, which proved to be the most useful among the various methods for quantifying and visualizing fossil landslides. Height variability is based on the difference in elevations derived from a high-resolution lidar-derived DEM. Based on geologic field mapping and geomorphometric analysis, we distinguished two main types of movements: structurally induced movement along the fault zone and movements caused by complex Quaternary gravitational slope processes. The most pronounced element is the sliding of the huge rotational carbonate massif, which was displaced partly along older fault structures in the hinterland of fossil rock avalanches and carbonate blocks. In addition to the material properties of the lithology, the level of surface roughness also depends on the depositional processes of the individual sedimentary bodies. These were formed by complex sedimentary events and are intertwined in the geological past. The sedimentary bodies indicate two large fossil rock avalanches, while the smaller gravity blocks indicate translational–rotational slides of carbonate and carbonate breccia.

**Keywords:** slope process; surface roughness; rock avalanche; geomorphometric analysis; geological setting; deep-seated rotational and translational slides

#### **1. Introduction**

The Vipava Valley (SW Slovenia) is located between the Karst plateau on the southwest side and the Nanos Mountain range in the northeast. The N and NE slopes of the valley are defined by a thrust front of Mesozoic carbonates over Tertiary flysch deposits [1–3]. This overthrusting has resulted in steep slopes and fracturing of the rock, producing highly weathered carbonates and large amounts of scree deposits in the upper part of the valley. In the lower part, these slopes are characterized by a variety of different surface forms, among which recent and fossil polygenetic landslides are the most prominent mass movements. Superficial deposits range from large-scale, deep-seated rotational and translational slides to shallow landslides, slumps, and sedimentary gravity flows in the form of debris or mudflows reworking the carbonate scree and flysch material [4–8]. The influence of tectonic fractures on mass movements is a common phenomenon in Slovenia; an example is the Ciprnik complex landslide in the Tamar Valley in northwestern Slovenia [9]. Due to tectonic stresses in the hinterland of the Ciprnik landslide, the initially highly bedded rocks were additionally fractured. This intense fracturing caused an increase in the effective porosity and a decrease in the strength of the material [9,10]. The relationship between tectonics and gravitational movement in the Vipava Valley and similar extreme cases in the Alps and Dinarides point to the need for a complex study of geologic processes [11].

**Citation:** Popit, T.; Rožič, B.; Šmuc, A.; Novak, A.; Verbovšek, T. Using a Lidar-Based Height Variability Method for Recognizing and Analyzing Fault Displacement and Related Fossil Mass Movement in the Vipava Valley, SW Slovenia. *Remote Sens.* **2022**, *14*, 2016. https://doi.org/10.3390/rs14092016

Academic Editors: Constantinos Loupasakis, Ioannis Papoutsis and Konstantinos G. Nikolakopoulos

Received: 1 March 2022 Accepted: 20 April 2022 Published: 22 April 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In this paper, we present the morphometrical analysis of two large sedimentary bodies of fossil rock avalanches, Podrta gora and Gradiška gmajna, and a few smaller detached and translationally moved carbonate blocks named Stara baba, Veliki strel, Klapačiše, and Zagriža. As a morphometric indicator, we used a variation of the roughness index, i.e., the variability of surface elevation, which has been assessed using five different methods [12]. Quantification of the variability of surface roughness was based on the height variability method (HV) [4]. This method is based on the differences in elevations obtained from a digital elevation model with a high spatial resolution (1 m × 1 m), derived from lidar scanning.

Based on the geomorphometric analyses of surface roughness in conjunction with geomorphological and geological mapping, two main types of displacements were analyzed in this work: structurally conditioned displacements (at the fault zone) and Quaternary displacements caused by gravitational processes. By studying fossil and (sub)recent landslides, we were able to identify the shape of the crown, main, and lateral edges, as well as the geomorphometric characteristics at the top of the individual rock avalanche sedimentary bodies.

#### **2. Materials and Methods**

#### *2.1. Geological Setting*

The general topography of the Vipava Valley is determined by thrust fronts of the Trnovo and Hrušica nappes composed of Mesozoic shallow-marine limestone and dolomites. Carbonates are thrust over gentle slopes of strongly folded Paleocene and Eocene basinal clastic deposits (flysch), which are composed of alternating sandstone, shale, and marl beds (Figure 1). The Mesozoic carbonate rocks are highly fractured along the thrust contacts and are cut by large NW–SE striking, Neogene, dextral, strike-slip faults characterized by fault zones up to 300 m wide [11,13,14].

The structural contact is expressed morphologically by the difference in lithology between the steep carbonate rocks in the massif and the gentler slopes of the underlying flysch in the lower part of the slope. This lithologic boundary, formed by the thrust contact, is covered by a variety of Quaternary slope deposits, broadly divided into two groups. The first group is lithified and unlithified scree deposits covering the upper part of the slope, while the second group is partially cemented complex Quaternary slope deposits covering the lower part of the slope. The latter represents a series of composite, fan-shaped sedimentary bodies with different compositions, internal structures, and textures, indicating a complex depositional history and polyphase genesis [4,6,15–17]. Quaternary slope deposits are moderately sorted and consist of gravel to medium boulder-sized clasts. Rarely, very large (approximately several meters in a linear direction) individual boulders are also present. In addition, carbonate megablocks reaching more than 100 m in length are found at the lower parts of the slopes (e.g., in the Lokavec area near Ajdovščina). They were detached from the stable carbonate karst plateau and were transported up to 2 km by translational and rotational slope movements [18] (Figure 2).

The structural and lithological settings also determine the hydrogeological conditions; therefore, most springs originate close to the contacts between the limestone and flysch (springs of the Vipava, Lijak, and Hubelj rivers) [14]. Many smaller springs also emerge within the gravel layers and lenses. These are very permeable and allow rapid infiltration of rainwater, which then flows within them until it encounters the significantly less permeable rocks of the weathered flysch. In this area, the infiltrated rainwater emerges in springs at the contact between the flysch below and the limestone gravel above and continues to flow superficially. The described geological structure and related hydrogeological conditions also influence the complex depositional processes of the slope deposits [3,4,15,19–23].

**Figure 1.** (**A**) Geological map of the extended area of the Vipava Valley; (**B**) cross-section through the Trnovski gozd and Vipava Valley; (**C**) panoramic view from Sveti Socerb to the Vipava Valley and Trnovski gozd in the hinterland (geological map modified after [1,3,11,14,24–27]).

**Figure 2.** Photograph of carbonate gravitational blocks (upper figure) from Navrše hill (view towards W) and lithology of the wider source area and carbonate blocks (lower figure). Adapted and modified after [18]. Reprinted and adapted with permission from Ref. [18]. Copyright 2019, Založba ZRC SAZU, Geografski inštitut Antona Melika.

#### *2.2. Geomorphological Analysis*

A detailed digital hillshaded terrain model (DTM) was obtained from an openly available airborne laser scanning (ALS) dataset of Slovenia [28]. The DTM, at a 1 m × 1 m resolution, was produced from the ALS point cloud by rasterization combined with filtering and removal of non-ground points through adaptive triangulated irregular network densification [12]. Relief Visualization Toolbox (Version 1.1 [29]) was used to aid the visual inspection of the hillshaded DTM and to emphasize the positive and negative geomorphological anomalies, which simplifies the geomorphological interpretation of the studied area [12,30,31].

For quantitative geomorphological analysis of the rock avalanche surfaces, we have used several surface roughness methods which, apart from the surface curvature analyses, proved to be useful parameters for the investigation and detection of fossil landslides [4,12,32–35] and recent landslides [36–40].

A quantitative analysis of the surface roughness of the studied sedimentary bodies of fossil rock avalanches was performed in the program ArcGIS using the height variability (HV) method, which proved to be the most useful among the several methods for the quantification and visualization of landslide parts with different sedimentary composition and genesis [12]. The height variability method is performed on a raster elevation surface in two steps: first, the original 1 m × 1 m lidar resolution elevation data are resampled into a coarser 3 m × 3 m resolution (by replacing the center cell with the average value in the 3 m × 3 m moving window with the ArcGIS Focal Statistics tool); second, the center cell is replaced with the difference between the highest and lowest elevation (hence the name of the method) in the same window size. The original resolution turned out to be too detailed (too noisy) for further analyses. Conversely, larger moving windows were too coarse, as surface details were lost.
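The two focal operations described above (a 3 × 3 mean followed by a 3 × 3 elevation range) can be sketched in plain Python as follows. This is only an illustrative stand-in for the ArcGIS Focal Statistics workflow: the function names are ours, the synthetic elevation grid is hypothetical, and the truncation of windows at the raster edges is an assumption rather than a documented choice of the authors.

```python
def focal(grid, reducer):
    """Apply `reducer` to each cell's 3 x 3 neighborhood (truncated at edges)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = reducer(window)
    return out

def height_variability(dem):
    """Focal mean (smoothing) followed by focal range (max minus min)."""
    smoothed = focal(dem, lambda w: sum(w) / len(w))
    return focal(smoothed, lambda w: max(w) - min(w))

# Hypothetical 1 m grid: a flat surface at 100 m next to one at 110 m.
dem = [[100.0] * 4 + [110.0] * 4 for _ in range(5)]
hv = height_variability(dem)

# HV is zero on the flat parts and peaks along the 10 m step.
print([round(v, 2) for v in hv[2]])
```

In this synthetic case the flat surfaces return an HV of zero, while the cells along the step return the largest values, mirroring how sharp edges of sedimentary bodies stand out against smoother surroundings.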

The HV method is very similar to the method of slope variability (cf. [41]); the only difference is that it uses the difference between the maximum and minimum elevation instead of the slope difference: DTVmax–DTVmin [12]. The results are given as the elevation difference (range) in meters, with a more variable ("rough") surface inside the search window giving the higher numerical values. The advantage of this method is a very distinguishable visualization of areas with different roughness, which correspond well to different sedimentary processes on the landslide body. This method was tested on the Quaternary slope sedimentary bodies of (i) Podrta gora and Gradiška gmajna fossil rock avalanches; (ii) the adjacent gravity carbonate megablocks of Stara baba, Veliki strel, Klapačiše, and Zagriža on the slopes of the Vipava Valley; and (iii) on the structural elements in the hinterland high karst plateau.

We also suggest using the additional approach for geomorphometric analysis of the surface using the VAT method [24,42]. This method was originally named visualization for archaeological topography, and although the name comes from its primary use in archeology, it can be used to explore small-scale topographic variations also in geology or in any field of geomorphology, generally. It is based on the analysis of surface elevation data, and it combines several DEM-derived input layers: hillshaded relief, positive openness [43], slope, and sky-view factor [31], and blends them to combine the information into a single image (VAT) suitable for visual or quantitative inspection of the surface morphology features, helping with the interpretation of the surface changes (landslide, erosion, construction works, etc.). The VAT method is a part of the Relief Visualization Toolbox (RVT) described in [31,42]. We have tested the VAT method and found that it is extremely useful when analyzing smaller landslide features on fossil rock avalanches [24]. However, in the presented case, in which we analyzed a much bigger (regional-scale) area several kilometers in size, the use of the VAT method did not contribute useful results to help with the interpretation of the terrain analysis.

#### **3. Results**

The HV values in the N and NE areas of the Vipava Valley reflect the differences between the surfaces of the flysch base and the surfaces of the fossil rock avalanche, gravity blocks built up of carbonate gravels, and structural elements in the hinterland of sedimentary bodies. Rock avalanches and carbonate blocks have high HV and a high degree of surface roughness at the edges, in contrast to flysch rocks, which have low HV and a low degree of surface roughness (Figure 3). The exceptions are flysch-cut ravines, where greater erosion of flysch rocks occurs and which represent areas of greater slope inclination than the surrounding area [44] and high height variability. The structural elements can be identified by linear changes in height variability.

**Figure 3.** Map of HV with marked individual sedimentary bodies of Gradiška gmajna and Podrta gora rock avalanches and larger leveling area of Stara baba, Veliki strel, Klapačiše, and Zagriža gravity carbonate blocks.

#### *3.1. Boundary of Individual Sedimentary Bodies and Their Source Area*

The values of HV in the Podrta gora and Gradiška gmajna fossil rock avalanches are shown in Figure 3. In narrow bands along the edges of the bodies, the variability is medium and rarely high, while the central part of the sedimentary bodies is mostly an area of low HV or low surface roughness. Medium to high HV is also observed in the narrow bands in the lower (fan-shaped) part of the Gradiška gmajna rock avalanche and in two tongue-shaped bodies of the Podrta gora rock avalanche (Figure 3). Almost the entire narrow area of the margins of both sedimentary bodies is dominated by medium to large surface roughness, which is in sharp contact with the smooth surface. The boundary is partially blurred only in the SW part of the Podrta gora rock avalanche, where the difference in surface roughness is not strongly pronounced, and in the central part of the Gradiška gmajna rock avalanche, where the boundary between the sedimentary body and its surroundings is not definable. On the NE side of the Gradiška gmajna rock avalanche, the lateral edge is clearly visible, as the gorge of the Hubelj River is cut next to it (Figure 3).

A well-recognized geomorphological element is also the upper edge of the rock face in the hinterland of the Podrta gora and Gradiška gmajna rock avalanches (Figure 3). Below the upper edge, the values of HV (in the form of a jagged convex edge) are very high and indicate a sharp boundary between the carbonate rock face and the karst surface at the top of Trnovski gozd in the area of Rob, Pravi vrh, and especially of Podrta gora (Figure 3).

#### *3.2. Boundary of Individual Sedimentary Bodies and Their Source Area*

In both the Podrta gora and Gradiška gmajna rock avalanche areas and in the area between them, there are extensive areas with an extremely low degree of surface roughness (Figure 3). These areas are the locations of the gravitational carbonate blocks named the Stara baba, Veliki strel, and Klapačiše sedimentary bodies and the Zagriža sedimentary body, located within the Podrta gora rock avalanche. The mentioned areas in the lower part, towards the SW, are adjacent to the areas of high surface roughness. The contacts between the smoothed and rough areas at the Stara baba, Veliki strel, and Klapačišče bodies are clearly concave in the downward direction, while the contact at the Zagriža body (within the Podrta gora rock avalanche) is approximately flat or slightly convex in the downward slope (Figure 3).

#### *3.3. Identification of Structural Elements in the Hinterland of Sedimentary Bodies*

The HV method is also useful for identifying faults and fracture zones. Fracture deformation and crack structures are most evident in the area of high karst plateaus, where faults can be identified as linear changes in the values of height variability. In some places, the value of surface roughness increases; in other places, the response of the rock mass to the fracture zone is just the opposite, and the HV decreases. The most pronounced are the northern part of the Predjama fault, which runs across the plateau above Rob, and the fault on Mala gora above the Slano blato (Figure 3, cf. [45]). Linearly distributed changes in height variability are also evident in the carbonate rock face, mostly oblique to the edges of the rock face. Most likely, this is a morphological reflection of fracture zones running at different angles between the main fault systems [3,14].

Generally, the middle and upper parts of the carbonate rock face (yellow and red color), extending over the whole Vipava Valley, have the greatest HV values (Figure 3). In the lower parts of the rock face in the foothills, the values of HV are medium (Figure 3). The transition to the upper karst plateau of Trnovski gozd is morphologically pronounced, as medium and sometimes low values of HV begin to predominate along a sharp line. This transition is related to the erosional frontal retraction of the overthrust fronts and to the structural features of the area. The lower boundary, at the base of the rock face, is directly related to the thrust or fault contact between the carbonate rocks and the flysch bedrock. The sharpness of the boundary is evidently determined by the thickness of the carbonate gravel or scree deposit. In places where individual sedimentary bodies with a lot of carbonate gravel are observed, higher values of surface roughness are obtained, while in areas with thin gravel, the value of HV is low.

#### **4. Discussion**

In general, landslides are complex and consist of parts with different geomorphological characteristics [46,47]. Using the visual interpretation of the digital elevation model (DEM) and the calculated surface roughness indicator, we were able to identify the surface properties of the individual sedimentary bodies very well in most cases. Based on a combination of geomorphometric indicators, we conclude that the sedimentary bodies have a very complex structure formed by different Quaternary sedimentation processes. The analysis of the typical morphological elements found in the sedimentary bodies of the Podrta gora and Gradiška gmajna rock avalanches and on the rotational blocks of Stara baba, Veliki strel, Klapačiše, and Zagriža is presented in Figure 4, cf. [47,48]. It has been shown that the degree of surface roughness is most strongly influenced by various sedimentation processes, in addition to the characteristics of the sedimentary material. Similarly, Grohmann et al. [49,50] determined that different surface roughness values are attributed to landslide formation processes (recent and fossil) and time elapsed since the surface formation, in addition to material characteristics.

The high contrast between the degree of surface roughness occurring at the margins of each body and the surface belonging to the surrounding and base of the sedimentary deposit reflects the difference between the two distinct lithologic units. The carbonate gravel or breccia belongs to the Gradiška gmajna and Podrta gora rock avalanches and to the Stara baba, Veliki strel, Klapačiše, and Zagriža gravity blocks. These blocks are characterized by a high degree of roughness, while flysch bedrock forms the base and immediate surroundings of sedimentary bodies, creating a relatively smooth surface.

#### *4.1. Podrta Gora and Gradiška Gmajna Fossil Rock Avalanches*

#### 4.1.1. Crowns and Main Scarp

Based on the surface roughness, the upper edge above the Podrta gora and Gradiška gmajna landslides is clearly visible in the area of Rob, Pravi vrh, and Podrta gora (Figures 3 and 4). The values of HV in this area are very high in the form of a convex edge. The areas of fossil rock avalanches determined by remote sensing correspond with the results of geological mapping. High values of surface roughness in this area indicate that this geomorphometric element represents the crowns and main scarps of the Podrta gora and Gradiška gmajna rock avalanches (Figure 4A, points 1 and 2). The main direction of elongation of the crown and main scarp is perpendicular to the direction of mass transport, which is also one of the typical characteristics of crowns and can be detected at least in the case of the Podrta gora fossil rock avalanche, cf. [47]. In the immediate vicinity of the main scarp is also an area of carbonate gravel, representing a recent scree deposit (Figure 4A, point 7).

**Figure 4.** Geomorphometric elements of sedimentary bodies: (**A**) Podrta gora fossil rock avalanche; (**B**) Gradiška gmajna fossil rock avalanche; and (**C**) Stara baba, Veliki strel, and Klapačiše carbonate blocks, based on the classification of the schematic representation of the landslide. Adapted from [46–48].

#### 4.1.2. Minor (Lateral) Scarp of Fossil Rock Avalanches

At the border of sedimentary bodies, especially in the lower part, sharp transitions of HV values occur. These locations are the steep boundaries between the flysch bedrock and the sedimentary bodies of the Podrta gora and Gradiška gmajna fossil rock avalanches, which are made of carbonate gravel, mostly lithified in a slope breccia (Figure 5). Lateral scarps are areas of medium to high surface roughness, while the flysch bedrock is primarily a smooth area with an extremely low degree of HV (Figures 3 and 4). The lateral scarps of sedimentary bodies are approximately parallel to the main direction of transport, which is one of the characteristic elements of landslides [48]. The sharp transitions of HV at the periphery of the sedimentary bodies are also affected by the erosion of carbonate gravels and breccias. This is especially marked in the eastern part of the Gradiška gmajna fan-shaped sedimentary body, where the Hubelj River erodes part of the fan and changes its original shape (Figure 3). Shulz [51], for example, explained the lower reliability in detecting the lateral scarp and the toe of the Gradiška gmajna body precisely with the reworked surface of the fossil rock avalanches. The toe of the Gradiška gmajna rock avalanche with a high degree of surface roughness has also been eroded.

**Figure 5.** Up to a 10 m high wall of carbonate gravel, partly strongly lithified to breccia, in the lower part of the lateral scarp of the Podrta gora fossil rock avalanche in the abandoned Apnenec quarry, above the village of Kožmani.

#### 4.1.3. Geomorphometry of the Central Part of Sedimentary Bodies

The surface roughness is low in the interior of the Podrta gora and Gradiška gmajna fossil rock avalanches. It is particularly low in the central part of the tongue- or fan-shaped areas of the bodies. Similarly, Glenn et al. [37] recognized the high surface roughness at the head scarp and the toe of rock avalanches. Surface roughness differs between parts of a single landslide; namely, it is high in the areas of erosion and low in the body of the landslide [36,37].

The central parts of the sedimentary bodies were well identified at the Podrta gora rock avalanche, but their identification was the least accurate compared to other geomorphological elements. This is due to erosional processes that change the shape of the fan and increase the surface roughness. This can be seen in the eastern part of the Gradiška gmajna rock avalanche, where carbonate gravels occur at quite high elevations, at the source of the Hubelj, and the riverbed is cut to the flysch base (Figure 3). Habič [52] even stated that the water of the Hubelj River caused the sliding of the Gradiška gmajna breccia material, and the Gradišče carbonate breccia was displaced and transported when the breccia had already formed. The presence of the Hubelj karst spring [53] even before the lithification of the older gravel into the breccia indicates that this carbonate gravel had dammed the karst spring for a long time, and such a dam could only form when larger quantities of gravel poured into the original riverbed in a relatively short time during stronger earthquakes [52].

In the upper part of the Podrta gora rock avalanche is a large area of an accumulation of carbonate rocks with low surface roughness, most likely representing a huge rotational block that slid on the weathering flysch bedrock or muddy sediment at its base (Figure 3). The block, which is divided into three parts, consists of strongly cracked carbonate rocks, while carbonate breccias and slope gravels occur only in the hinterland. Similar gravity blocks were recorded near the Lokavec slide in a combination of translational and rotational block-type slope movements [18]. The gravitational block of Podrta gora most likely represented the first transport phase of the complex Podrta gora rock avalanche, from which a huge gravel landslide further developed and was transported in the form of a rock avalanche in the Vipava Valley. The two phases of the Podrta gora rock avalanche mass movement are also evidenced by the forms of the secondary scarp (convexity in the downward direction) in the Zagriža area (Figure 3). The latest proposed classification system by Hungr et al. [54], modified after Varnes [55] and Cruden and Varnes [48], classifies two-phase landslides in the class of complex landslides, and their transport complexity is referred to as a two-phase event [56]. In contrast to the two-phase Podrta gora rock avalanche, the two-phase transport process is not observed in the case of the Gradiška gmajna rock avalanche. The large area of the main scarp and the well-defined upper crown, as well as the large fan-shaped body in the lower part of the avalanche, may indicate that Gradiška gmajna represents a huge rockfall in the initial (first) phase, which further developed into a debris avalanche.

#### *4.2. Gravitational Blocks*

#### 4.2.1. Planation Surface Area

Based on the analyses, we have identified many areas of extremely low surface roughness (Stara baba, Veliki strel, and Klapačiše), which spread upwards to individual scree deposits in the foothills. Individual planation surfaces were formed by large rotational slides, where individual blocks of the carbonate breccias rotated along the sliding surface at the contact of the gravel with the underlying flysch bedrock. The blocks of breccia also tilted towards the slope as they slid (Figure 6). At the outer edges, there was even a reverse tilt of the breccia blocks and the formation of steep rock faces with an extremely high degree of surface roughness and smooth areas in the hinterland depression. Similar geomorphometric features in the Rebrnice area were also recognized [6,19].

**Figure 6.** Rotational Zagriža carbonate block, which is part of the complex two-phase Podrta gora rock avalanche. The detail of rotational Zagriža carbonate block is on the right side of the image (area a).

#### 4.2.2. Structurally Conditioned Movement

By analyzing the geomorphometric features, we can recognize the structurally induced movement. Today, the blocks are generally inactive and are important mainly because of their influence on the geological structure of the area [14,26,45,57]. The contact between the flysch and carbonate bedrock in the northwestern part of the area (north of the Gradiška gmajna rock avalanche) is at significantly higher elevations than in the central part. Thus, we find the highest flysch outcrops in the area of Gosta meja at 475 m and at the source of Hubelj at 240 m above sea level [14]. The structure in this area is a depressional synclinal bend of the overthrust surface of the Trnovo nappe with an axis in a northeast–southwest direction [57]. In addition, a regionally significant Avče fault was explored in the area, which would be at least a partially displaced thrust fault in this area [26,57]. In the Hubelj Spring area, detailed geological mapping identified a complex, highly branched NW–SE oriented fault system, one segment of which merges with the northern branch of the Predjama fault [14]. Thus, in the studied area, we are dealing with a complex structure, within which we cannot determine the structural significance of a single segment of the contact between the carbonates and flysch on the basis of the outcrops alone due to the overlap of the outcrops. Indeed, it can be assumed that the structures triggered a significant reduction of the thrust fault and, thus, influence the formation and hydrogeological characteristics of the area. At the same time, the diversity and intensity of the slope processes in this part of the examined area have significantly increased.

On the nearby Mala gora on the other side of the valley of the Lokavšček stream (Figure 7A), Placer et al. [45] suggested so-called structural landslides, reportedly also known in the Rebrnice area [58]. The exposed carbonate massif of Mala gora slides in the form of a large (deep-seated) rotational slide about 300 m down the slope toward the valley, including the flysch layers in the slide. The sliding surfaces (main scarps) in the carbonate massif represent the shape of normal faults in the hinterland of the Mala gora block [45].

**Figure 7.** Interpretation of the processes in the area around Ajdovščina, where the whole area (Block A) above the hinterland of the fossil rock avalanche of Gradiška gmajna and Podrta gora could be classified as part of a large slide block by analogy with Mala gora. (**A**) HV map, (**B**) geological map, and (**C**) map with individual geomorphometric elements marked: 1 = linear distribution of change in the degree of HV along the northern branch of the Predjama fault is comparable to the main scarp of Mala gora; 2 = areas north to northeast of both lines have greater HV than southwestern areas; 3 = arcuate curvature of lines; 4 = southwestern carbonate massifs in the concave part of the lines, which are subsided; 5 = slopes below the concave margin where recent and sub-recent gravitational processes are intense.

A comparison of the results of the geomorphometric characteristics of the whole study area with the situation in Mala gora provides some details, namely:


Based on these results, we propose that the whole area between Rob and Podrta gora, analogous to the Mala gora massif, could be part of a large carbonate block, settled relative to the hinterland. The movement of the carbonate block (Block A; Figure 7A,C) occurred along the fracture surface of the Predjama fault in the western part and most likely along roughly parallel fracture zones in the eastern part. Consequently, the displacement of Block A also affected the increased intensity and diversity of gravitational movements in the lower part of the slopes, such as the large fossil rock avalanches Podrta gora and Gradiška gmajna. Based on the observed similarities, two questions regarding the geological history of the area are still unanswered: whether the lowered thrust fault in the area under consideration is only an influencing factor or if it is an active participant in the gravitational movement of rock masses. The other question is if the linear geomorphometric elements indicate a connection with the Mala gora main scarp and the fault zone of the northern branch of the Predjama fault. It is possible that both displacements could belong to a large-scale structural movement of Mala gora and Block A (Figure 7C).

#### **5. Conclusions**

Based on the geomorphometric analyses of surface roughness, we have roughly distinguished two main types of displacements: structurally induced displacements (along the fault zone) and displacements caused by Quaternary gravitational slope processes. Quaternary slope deposits were studied geomorphometrically on two larger sedimentary bodies, the Podrta gora and Gradiška gmajna fossil rock avalanches, and on some smaller gravitational carbonate bodies: Stara baba, Veliki strel, Klapačiše, and Zagriža. The quantitative parameter of surface roughness proved to be very useful in the studies of fossil and recent or sub-recent rock avalanches and of landslides in general. Specifically, we were able to detect very well the shapes of the main and minor scarps, as well as the geomorphometric characteristics of the deposits within individual bodies. In addition, structural elements that influence mass movements have been successfully identified.

The degree of surface roughness depends mainly on various deposition processes, in addition to material properties. The visualization of roughness values in a GIS environment allowed us to understand the two phases of complex avalanches that evolved from a sliding rotational landslide, in the case of Podrta gora, or a large rockfall, in the case of Gradiška gmajna, to a rock avalanche. The cases of the Stara baba, Veliki strel, and Klapačiše gravity blocks indicate large translational–rotational slides. In this sense, we strongly recommend the use of surface roughness analysis in future research of mass movements induced by various displacement causes. In addition, the geomorphometric analyses also revealed some peculiarities in the structural observation, the most pronounced element being the sliding of the huge carbonate block on the Predjama fault in the hinterland of the fossil rock avalanches and carbonate blocks. A comparison of the geomorphometric elements of the carbonate block (A) with the Mala gora rotational block above Lokavec shows that the whole area may be a part of a major rotational slide of the carbonate massif that was displaced partly along older fault structures.

**Author Contributions:** Conceptualization, T.P., B.R., A.Š., A.N. and T.V.; methodology, T.V. and T.P.; writing—original draft preparation, T.P., B.R., A.Š., A.N. and T.V.; writing—review and editing, T.P., B.R., A.Š., A.N. and T.V.; visualization, T.P. and T.V.; funding acquisition, T.P. and T.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was partly funded by the Slovenian Research Agency's core funding (No. P1-0195—Geoenvironment and Geomaterials).

**Acknowledgments:** The work was carried out in the framework of the International Consortium on Landslides (ICL), project number IPL-216 (entitled Diversity and Hydrogeology of Mass Movements in the Vipava Valley, SW Slovenia), and the ICL Adriatic–Balkan Network (ABN). We appreciate the constructive comments and suggestions made by the anonymous reviewers.

**Conflicts of Interest:** The authors declare no conflict of interest. The sponsors had no role in the design, execution, interpretation, or writing of the study.

#### **References**


### *Communication* **EGMStream, a Desktop App for EGMS Data Downstream**

**Davide Festa and Matteo Del Soldato \***

Earth Science Department, University of Firenze, 50121 Firenze, Italy; davide.festa@unifi.it **\*** Correspondence: matteo.delsoldato@unifi.it; Tel.: +39-055-2757551

**Abstract:** The recent release of European Ground Motion Service (EGMS) products implemented under the responsibility of the Copernicus Land Monitoring Service (CLMS) guarantees free and accessible Europe-wide ground motion data for ground deformation analysis at the local and regional scales. The need for value-adding services and tools for optimal dissemination of radar data from the Copernicus Sentinel-1 satellite mission urges the scientific community to find efficient solutions. A desktop R-based application with a user-friendly interface is presented here; it automatically downloads EGMS products, which are delivered as large .csv tiles each equivalent to a radar burst, and transforms them into geospatial databases. EGMStream is a self-contained desktop app that enables users to systematically store, customize, and convert ground movement data into geospatial databases, burst per burst or for an area of interest directly selectable on the app interface.

**Keywords:** downstream; EGMS; InSAR; Europe; ground deformation

#### **1. Introduction**

Copernicus is the Earth Observation Programme of the European Union, managed by the European Commission and implemented in partnership with the European Member States; the ESA (European Space Agency); the EUMETSAT (European Organization for the Exploitation of Meteorological Satellites); the ECMWF (European Centre for Medium-Range Weather Forecasts); European Union (EU) agencies, such as the European Environment Agency (EEA); and Mercator Ocean. The program aims to make the monitoring of the planet and its environment benefit all European citizens by using data collected from different sources, such as Earth Observation satellites and in situ sensors. The program is divided into different services: (i) Copernicus Atmosphere Monitoring Service (CAMS); (ii) Copernicus Marine Service (or Copernicus Marine Environment Monitoring Service); (iii) Copernicus Land Monitoring Service (CLMS); (iv) Copernicus Climate Change Service (C3S); (v) Copernicus Service for Security; and (vi) Copernicus Emergency Management Services (EMS). The latest addition to the CLMS portfolio is the European Ground Motion Service (EGMS), launched in mid-2022 to provide consistent A-DInSAR (Advanced Differential Interferometric Synthetic Aperture Radar) data derived from high-resolution satellite imagery (Sentinel-1) at a continental scale and to allow ground motion analysis and monitoring [1]. At the national or regional scale, ground motion monitoring services have been adopted only sporadically. Starting from the first application launched in Italy in 2007 (Italian Special Plan of Remote Sensing of the Environment), which took advantage of the ERS (European Remote Sensing), ENVISAT (Environmental Satellite), and, partially, COSMO-SkyMed data [2,3], processed with the PSP (persistent scatterer pairs [4]), PSInSAR [5,6], and SqueeSAR [7] algorithms, other nations, such as Norway [8,9] and Germany in 2018 [10], decided to develop similar services.
In addition, the Agency for Data Supply and Efficiency of Denmark is working on a WebGIS platform, not yet available as a full, free, and open service, based on Sentinel-1 data processed with the SqueeSAR algorithm; it provides InSAR data and information along the LOS (Line of Sight), GNSS (Global Navigation Satellite System)-calibrated velocity [11], and their components. Additionally, the Netherlands is implementing its GMS based on Sentinel-1 data at the national level for investigating the height change obtained via A-DInSAR [1].

**Citation:** Festa, D.; Del Soldato, M. EGMStream, a Desktop App for EGMS Data Downstream. *Remote Sens.* **2023**, *15*, 2581. https://doi.org/10.3390/rs15102581

Academic Editor: Soe Myint

Received: 3 April 2023 Revised: 28 April 2023 Accepted: 12 May 2023 Published: 15 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In Italy, taking advantage of the Sentinel-1 regularity of acquisition and coverage, in 2016, the Tuscany Region (central Italy) implemented the first continuous monitoring system according to the Prime Minister Decree 27 February 2004 [12,13]. Following the same approach, the northwestern Valle d'Aosta Region, in 2018, and the northeastern Veneto Region, in 2019, started near-real-time monitoring of ground deformation [14]. These examples show that the launch of the Sentinel-1 constellation opened the possibility of investigating several geohazards thanks to its short revisit time, its worldwide coverage, and its free availability for scientific applications. Several examples can be found in the literature on landslide detection [15–18], characterization [19–21], and monitoring [22–24], subsidence phenomena analysis and monitoring [25–29], infrastructure monitoring [30–33], or mining instability analysis and monitoring [34–37].

Along the same lines, the EGMS was conceived following the direct request of many users for free InSAR data over Europe. The service is managed and implemented under the responsibility of the EEA, and it is part of the CLMS portfolio. The EGMS provides an annual update of the ground motion over the whole European territory, with time series starting from February 2015 at full spatial resolution, based on the Sentinel-1 radar data. The Sentinel-1 images are processed by four different algorithms [38]: PSP-IFSAR [4,39,40], SqueeSAR [7], GSAR-GTSI [41,42], and PSI (Persistent Scatterer Interferometry), performed with an IWAP (Integrated Wide Area Processor) [43,44].

The processed data have been available for visualization and download since November 2022 at three levels of processing [45]: (i) basic (L2a), (ii) GNSS-calibrated (L2b), and (iii) Ortho (L3).

The basic data provide InSAR velocity and displacement information along the LoS (Line of Sight) with information about the geolocalization and quality measurements for every MP (Measurement Point). A time series is associated with each MP, while the velocity and displacement information are relative, thus referring to a stable virtual Reference Point chosen during processing of the Sentinel-1 SAR image stacks frame by frame. The L2a products are provided in full resolution as two discrete datasets referred to the Sentinel-1 ascending (South to North) and descending (North to South) orbits. The basic product is a necessary first step to the more advanced products.

The calibrated data are an advanced product, considered the main EGMS product, consisting of a deformation map with LoS absolute velocity and displacement information corrected by a model derived from GNSS time series data across Europe. For some isolated islands where GNSS data are not available, the L2b products cannot be calibrated in this way and " ... *are produced by harmonizing Basic products with respect to each other, and then adjusting the mean ground velocity to zero.*" [22]. Like the L2a products, the L2b products are provided in full resolution as two discrete ascending and descending datasets.

The L3 data, named Ortho products, are the vertical and horizontal (East–West) components of velocity, complete with their time series, calculated from the L2b data. Unlike the previous products, the L3 data follow a regular grid, since the L3 MPs are synthetic points summarizing the L2b ascending and descending velocity in a cell of 100 m (coinciding with the Copernicus DEM). As for the L2a and L2b data, the time series, both for the vertical and horizontal components, are present in this level of product.
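The way vertical and East–West components can be derived from ascending and descending LoS velocities can be illustrated with a minimal Python sketch. It is not the EGMS production algorithm: it assumes a right-looking sensor, neglects the North–South component (to which the near-polar orbit is only weakly sensitive), takes positive LoS values as motion toward the satellite, and uses illustrative incidence and heading angles. The function names `los_unit_vector` and `decompose` are hypothetical.

```python
import math

def los_unit_vector(incidence_deg, heading_deg):
    """East and Up components of the ground-to-satellite unit vector
    for a right-looking SAR (the North component is neglected here)."""
    theta = math.radians(incidence_deg)
    alpha = math.radians(heading_deg)
    east = -math.sin(theta) * math.cos(alpha)
    up = math.cos(theta)
    return east, up

def decompose(v_asc, v_desc, geom_asc, geom_desc):
    """Solve the 2x2 system
         v_asc  = e_a * v_east + u_a * v_up
         v_desc = e_d * v_east + u_d * v_up
    for (v_east, v_up)."""
    e_a, u_a = geom_asc
    e_d, u_d = geom_desc
    det = e_a * u_d - e_d * u_a
    if abs(det) < 1e-12:
        raise ValueError("The two viewing geometries are not independent")
    v_east = (v_asc * u_d - v_desc * u_a) / det
    v_up = (e_a * v_desc - e_d * v_asc) / det
    return v_east, v_up

# Illustrative geometries (assumed values, not taken from EGMS metadata)
ASC = los_unit_vector(incidence_deg=39.0, heading_deg=-10.0)
DESC = los_unit_vector(incidence_deg=39.0, heading_deg=190.0)
```

With two independent viewing geometries the system has a unique solution; a single geometry alone cannot separate vertical from horizontal motion, which is why the Ortho level requires both the ascending and descending L2b datasets.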

Prior to publication, all EGMS data underwent an extensive quality control protocol involving the validation of several criteria, such as a suitable density of measurement points within the Corine Land Cover (CLC) categories and an appropriate displacement standard deviation [46]. The validation of the EGMS velocity data by an external consortium is still ongoing.

In this work, we introduce the R-based EGMStream tool developed with the Shiny package [47,48] and available as a desktop application for downloading, transforming, and exporting the EGMS products into customized geospatial databases storable in ESRI shapefiles or GeoPackage containers. The meaningfulness of such a value-adding service relies on offering an improved user experience for the management of A-DInSAR data covering Europe. In particular, the capability of cropping the EGMS products based on a personalized Area of Interest (AoI) is implemented within the EGMStream application. Within this framework, thanks to an easy-to-use interface, we foster the usage and the growth of the potential pool of users interested in the downstream applications of EGMS data and interferometric products.

#### **2. Materials and Methods**

EGMStream is a free app that does not require any previous software installation or the use of a third-party server, being a self-contained R-based application that can be deployed to the desktop. The presented data downstream approach follows a precise workflow to download and convert the EGMS data, regardless of the level of processing, by using a list of links available for download directly from the EGMS viewer [49]. The underlying flowchart of EGMStream can be split into two main parts: the upload of the input data required for the application to run and the design of the geospatial database prior to the final conversion (Figure 1).

**Figure 1.** EGMStream concept workflow.

#### *2.1. Input Data*

The EGMStream app is designed to retrieve and manage EGMS ground deformation products at a pan-European level. The EGMS satellite-based land monitoring data are made available for download through the EGMS Product Archive and Dissemination System by accessing the EGMS Explorer [49].

Upon registration and authentication, users can locate and download multiple InSAR datasets (with a maximum of two simultaneously) related to a geographical territory. The only limit on the extent of the latter is a maximum width of 3 degrees. In addition, the EGMS Explorer system allows downloading the "Download links" file, an ASCII file (*.txt*) containing hyperlinks corresponding to a list of products queried for bulk download. Multiple EGMS product levels can be listed within a single ASCII file, where the provided hyperlinks come with a security token that keeps the links valid for downloading for one hour. Moreover, the token is refreshed if a download is in progress, allowing the hyperlinks to remain valid for another hour after a given download finishes [50]. When the token expires, the app will appear frozen and must be relaunched from the desktop shortcut or the start menu. New valid download links are then required to correctly run the app.

The EGMStream application is conceived to be fed, via an upload control, with an ASCII file ("*download links*" from the EGMS viewer) containing one or several download links. The selected file will serve as the input for the successive operations concerning data download, optional data cropping over the AoI, and setting of the database attributes prior to InSAR data conversion.
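As a rough illustration of how such a links file could be read, the following Python sketch (not the app's R code) extracts one entry per hyperlink and guesses the product level from the file name. The file-name pattern `EGMS_<level>_...` and the example URLs are assumptions made for this sketch, and `parse_download_links` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse

# Assumed naming pattern: the product level follows the "EGMS_" prefix.
_LEVEL = re.compile(r"EGMS_(L2a|L2b|L3)_")

def parse_download_links(text):
    """Return a list of (url, file_name, level) tuples, one per hyperlink."""
    entries = []
    for line in text.splitlines():
        url = line.strip()
        if not url.lower().startswith(("http://", "https://")):
            continue  # skip blank lines and anything that is not a link
        # The security token travels in the query string, so only the
        # path is used to recover the file name.
        file_name = urlparse(url).path.rsplit("/", 1)[-1]
        match = _LEVEL.search(file_name)
        level = match.group(1) if match else "unknown"
        entries.append((url, file_name, level))
    return entries

# Placeholder content standing in for a real "Download links" file
SAMPLE = """https://example.invalid/files/EGMS_L2a_088_0297_IW2_VV.zip?id=TOKEN

https://example.invalid/files/EGMS_L3_E40N30_100km_U.zip?id=TOKEN
not a link
"""
entries = parse_download_links(SAMPLE)
```

Keeping the full URL, token included, next to the parsed file name lets a later download step reuse the link unchanged within its validity window.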

#### *2.2. Data Storage Setting and Conversion*

EGMStream can unscramble encoded URLs with a timeout limit set to 1 h to ensure an appropriate time window for starting the download of server-intensive files. In the first instance, the resulting EGMS products are temporarily stored as zipped files within an automatically created folder named 'Downloaded'. After this operation, EGMStream automatically proceeds by unzipping the retrieved files and by creating the directory 'Unzipped', where the ground motion data are stored in the *.csv* format (standard download format from the EGMS viewer). Both folders are automatically generated and then deleted at the end of the data conversion process. Pop-up notifications are extensively used within the app to inform the user about the processes achieved by the ongoing R session. In case a pop-up expires, all the processes made by the app are reported in the "Processing history" tab.
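The temporary-folder handling described above can be sketched in Python as follows. This is only an illustration of the 'Downloaded' and 'Unzipped' folder logic under the assumption that the archives are already on disk; the download step itself is omitted, and `unzip_products` and `clean_up` are hypothetical names rather than functions of the app.

```python
import shutil
import zipfile
from pathlib import Path

def unzip_products(work_dir):
    """Extract every .zip found in <work_dir>/Downloaded into
    <work_dir>/Unzipped and return the extracted .csv paths."""
    downloaded = Path(work_dir) / "Downloaded"
    unzipped = Path(work_dir) / "Unzipped"
    unzipped.mkdir(parents=True, exist_ok=True)
    for archive in sorted(downloaded.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(unzipped)
    return sorted(unzipped.glob("*.csv"))

def clean_up(work_dir):
    """Delete both temporary folders once the conversion has finished."""
    for name in ("Downloaded", "Unzipped"):
        shutil.rmtree(Path(work_dir) / name, ignore_errors=True)
```

Removing both folders at the end leaves only the converted databases on disk, since the intermediate *.csv* tiles can be large.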

EGMStream's main functionalities concern EGMS data cropping based on the user's AoI and data conversion into geospatial databases. A major feature of the app is the possibility of interactively drawing a rectangular AoI through a map viewer panel created using the JavaScript 'Leaflet' library [51]. This functionality allows the user to manually derive the geographical area, which acts as a mask covering the underlying EGMS data products to be clipped. In the event that the AoI spans two different bursts (on the same track), the MP data will be merged into a unique shapefile or database. On the contrary, if the selected data overlay two or more tracks (which implies different acquisition dates from the satellite), the converted geodatabases will be kept separate and named after the track number. EGMStream allows the user to choose the folder location where the converted data will be saved locally. Moreover, several settings, selectable from the app interface, allow designing the data storage characteristics. In particular, the geospatial database can be designed by the user by selecting: (i) whether the time series information is retained; (ii) the format of the date column names; and (iii) the output file format.


Excluding the time series information results in a lighter conversion process and a geospatial database with a reduced file size. This can be particularly helpful for users who only need ground deformation maps and, therefore, only the mean velocity values of deformation. The default option is set to 'With Time Series'.

When retained, date column names related to the time series information can be adjusted to the *Dddmmyyyy* format or the *Dyyyymmdd* format. In this case, there is no default option; therefore, the date format needs to be explicitly selected.
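The two settings above, retaining or dropping the time series and choosing the date-column format, can be sketched in Python as a single column-design step. The sketch assumes that the raw date columns are named with eight digits in year-month-day order (e.g., `20180105`) and that every other column is an attribute to keep; `design_columns` is a hypothetical helper and not part of the app.

```python
import re

_DATE = re.compile(r"^(\d{4})(\d{2})(\d{2})$")  # assumed raw form: yyyymmdd

def design_columns(columns, keep_time_series=True, date_format=None):
    """Return the output column names for the chosen storage settings."""
    if keep_time_series and date_format not in ("Dddmmyyyy", "Dyyyymmdd"):
        # Mirrors the absence of a default: the format must be chosen.
        raise ValueError("date_format must be 'Dddmmyyyy' or 'Dyyyymmdd'")
    designed = []
    for name in columns:
        match = _DATE.match(name)
        if match is None:
            designed.append(name)          # ordinary attribute column
        elif keep_time_series:
            year, month, day = match.groups()
            if date_format == "Dddmmyyyy":
                designed.append(f"D{day}{month}{year}")
            else:
                designed.append(f"D{year}{month}{day}")
        # date columns are silently dropped when the time series is excluded
    return designed
```

For instance, `design_columns(["pid", "mean_velocity", "20180105"], True, "Dddmmyyyy")` yields `["pid", "mean_velocity", "D05012018"]`. A leading letter such as `D` is a common way to avoid field names that begin with a digit, which many GIS attribute tables handle poorly.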

Once all the parameters are set, the downloaded *.csv* files can be converted into two of the most common file formats for geospatial data, namely shapefiles (i.e., *.shp*) or GeoPackage (i.e., *.gpkg*), with the latter being the default option. Shapefile is a proprietary format native to ESRI (Environmental Systems Research Institute) that consists of a collection of files, some of which are mandatory, and is especially designed for use in Geographic Information Systems (GIS) software. On the other hand, GeoPackage is an open and platform-independent format for storing geospatial information within an SQLite database in a single file that can be used directly. The required and supported content of a GeoPackage is entirely defined in the Open Geospatial Consortium (OGC®) standard document [52].

As a result of the selectable storage setting options, six different configurations are made available to convert EGMS products:

1. *.shp* without time series;
2. *.gpkg* without time series;
3. *.shp* with *Dddmmyyyy* time series format;
4. *.gpkg* with *Dddmmyyyy* time series format;
5. *.shp* with *Dyyyymmdd* time series format;
6. *.gpkg* with *Dyyyymmdd* time series format.
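The six configurations follow from combining three time series options with two output formats (3 × 2 = 6). A short Python sketch of this enumeration is given below; the labels are illustrative and are not identifiers used by the app.

```python
from itertools import product

# Three time series options and two output formats (illustrative labels)
TIME_SERIES = ("without time series", "Dddmmyyyy", "Dyyyymmdd")
FORMATS = (".shp", ".gpkg")

# Cartesian product, in the same order as the list above
configurations = list(product(TIME_SERIES, FORMATS))
```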

#### **3. Results and Discussion**

EGMStream is an open-source tool realized by using the Shiny R-package framework. To encourage distribution and make the program platform independent, EGMStream is shared as a Windows desktop application ready for immediate use without the need for installing any external software (i.e., R). To achieve this, the built framework uses R-Portable [53], while the app's primary package dependencies are loaded when the application is run for the first time. The presented application is bundled into an executable installation (i.e., a setup wizard), which guarantees control over the destination location and allows creating a desktop shortcut for the program. Once EGMStream has been installed, the front-end interface of the app loads dynamically in the PC default web browser (tested on Google Chrome and Microsoft Edge).

Leveraging Shiny's reactive framework, the user-driven draw toolbar can be used to reshape the AoI multiple times within the same session; only the last drawn element is kept. The drawn AoI can also be deleted interactively from the application interface. Furthermore, EGMStream is designed to overwrite the output results when the same instructions are repeated. It should be noted that EGMStream tackles the conversion of very large EGMS files by limiting the process to 500,000 rows at a time; consequently, a large downloaded file is split into several converted geospatial databases (which are labeled with a progressive number for reference).
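The batching behaviour described above can be illustrated with a minimal Python sketch (the tool itself is written in R). Only the 500,000-row batch size comes from the text; the function names and the output naming scheme are hypothetical.

```python
from itertools import islice

CHUNK_ROWS = 500_000  # batch size reported in the text


def split_rows(rows, chunk_rows=CHUNK_ROWS):
    """Yield (part_number, batch) pairs with at most chunk_rows rows each."""
    iterator = iter(rows)
    part = 1
    while True:
        batch = list(islice(iterator, chunk_rows))
        if not batch:
            break
        yield part, batch
        part += 1


def part_name(stem, part, extension="gpkg"):
    """Hypothetical naming scheme: append a progressive number to the stem."""
    return f"{stem}_{part}.{extension}"
```

Because `split_rows` consumes any iterable lazily, it can be fed a `csv.reader` object, so a very large file never has to be held in memory as a whole.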

EGMStream's current limitation concerns the maximum number of EGMS products that can be downloaded within the same session. This number varies according to the speed of the user's internet connection and the size of the files requested for conversion. Based on tests performed with an Intel Core i7-4790QM at 3.60 GHz, 4 cores, 8 threads, 8 MB cache, 16 GB RAM, 250 GB SSD disk, and Windows 10, 64 bits, we recommend feeding EGMStream fewer than 50 download links.

#### *3.1. Example*

To demonstrate the added value of EGMStream for the rapid downstream and deployment of EGMS products, we briefly showcase the successful conversion of multiple levels of interferometric data covering part of the Rhenish coalfields of Germany, as shown in Figure 2. In particular, the converted database contains a geographical subset of the interferometric products overlaying the AoI drawn by the user.

**Figure 2.** (**a**) Outline of the EGMStream app interface; (**b**) framework of EGMStream output products; and (**c**) visualization of the converted EGMS product via GIS platform.

EGMStream implements intuitive workflow routine tasks, such as (i) deployment of the ASCII input file obtained from the EGMS viewer; (ii) implementation of an arbitrary extraction mask to clip the input EGMS data to an AoI; (iii) data download-unzipping procedure; (iv) data conversion according to the specified storage settings; and (v) storage of converted data.

The output results, being vector data points, are stored following a precise framework; every level of EGMS data and every related acquisition geometry is stored within a dedicated folder (Figure 2b), enabling a direct reference to the queried products.

Thanks to the EGMStream functionalities, the downloaded and converted data are particularly suited for visualization, data handling, and post-processing in GIS software (Figure 2c).

A deep understanding of ongoing surface displacement can rely on the analysis of the spatial pattern of the various levels of EGMS data. Figure 3 shows a landslide, one of the most common geohazards that can be detected and analyzed via the available EGMS products (reference period: 2016–2021), downloaded with the EGMStream tool for the area of interest. In particular, the joint use of the calibrated (Figure 3a,b) and ortho (Figure 3c,d) products enables the identification of clusters of points with comparable motion trends and deformation patterns (which can be identified according to the mean velocity). The displacement of the area of interest can be further evaluated by looking at the time series of every MP. Mapping and monitoring potentially risky and/or critical areas can greatly contribute to more resilient urban planning and a prompt response from civil protection authorities.

**Figure 3.** (**a**,**b**) Three-dimensional visualization of the calibrated ascending and descending EGMS products related to an ongoing slope displacement located near Canillo (Andorra); a more accurate picture of the deformation scenario can be obtained by consulting the EGMS (**c**,**d**) ortho data.

#### *3.2. Future Developments*

The EGMS foresees an annual update of the three levels of data, and the EGMStream tool is already set to process the new data.

In addition, in a future version of the EGMStream tool, the authors would like to implement several further functions.


In addition to the above-mentioned ideas in the pipeline, every future suggestion from the scientific community or end-users will be considered.

#### **4. Conclusions**

EGMStream is an open-source, interactive, and user-friendly Shiny/R desktop application designed to enhance the downstream of EGMS products by enabling the user to seamlessly download, customize, convert, and store radar-based geospatial databases. Distributed as a self-contained application, EGMStream deployment is tied to an initial installation procedure on a local Windows PC. Leveraging the app's intuitive interface, no prior knowledge is required to obtain reliable and handy results.

The development of EGMStream is still ongoing in order to guarantee a more personalized experience.

**Author Contributions:** Conceptualization and methodology, M.D.S.; software, D.F. and M.D.S.; validation, D.F.; writing – original draft preparation, D.F.; writing – review and editing, M.D.S.; visualization, D.F.; supervision and project administration, M.D.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** EGMStream v1.0 is freely available as a standalone desktop app at http://cpc.unifi.it/EGMStream\_v1.zip. By clicking on the link the download of the zip file containing a Windows installer (i.e., EGMStream\_installer.exe) will start. To remain updated about future releases or to receive further support, please contact the authors at egmstream@dst.unifi.it. The code is freely available and can be found within the installed folders.

**Acknowledgments:** This work was developed within the project EGMS RASTOOL (ECHO–UCPM-2021-PP (Prevention and preparedness in civil protection and marine pollution), Contract No. 101048474). We would like to thank Lorenzo Solari for his support in understanding the data and in testing the tool. In addition, we would like to thank the Copernicus Land Monitoring Service (CLMS) for providing the European Ground Motion Service (EGMS) data to users.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Technical Note* **A High-Precision Remote Sensing Identification Method on Saline-Alkaline Areas Using Multi-Sources Data**

**Jingyi Yang 1,2,3, Qinjun Wang 1,2,3,4,\*, Dingkun Chang 1,2,3, Wentao Xu 1,2,3 and Boqi Yuan 1,2,3,5**


<sup>3</sup> University of Chinese Academy of Sciences, Beijing 100049, China


**Abstract:** Soil salinization is a widespread and important environmental problem. We propose a high-precision remote sensing identification method for saline-alkaline areas using multi-source data, which can help to address the ecological and environmental problems caused by soil salinization worldwide. Its principle is to identify saline-alkaline areas from remote sensing imagery with a decision tree model that combines four spectral indices, namely NDSI34 (Normalized Difference Spectral Index of Band 3 and Band 4), NDSI25 (Normalized Difference Spectral Index of Band 2 and Band 5), NDSI237 (Normalized Difference Spectral Index of Band 2, Band 3 and Band 7) and NDSInew (New Normalized Difference Salt Index), which can distinguish saline-alkaline areas from other features. In this method, the complementary information within the multi-source data is used to improve classification accuracy. The main steps of the method include multi-source data acquisition, adaptive feature fusion of multi-source data, feature identification and integrated expression of the saline-alkaline area from multi-source data, fine classification of the saline-alkaline area, and accuracy verification. Taking Minqin County, Gansu Province, China as the study area, we use the method to identify saline-alkaline areas based on GF-2, GF-6/WFV and DEM data. The results show that the overall accuracy of the method is 88.11%, which is 7.69% higher than that of the traditional method, indicating that it can effectively identify the distribution of saline-alkaline areas, and thus provides a scientific technique for the quick identification of saline-alkaline areas in large regions.

**Keywords:** remote sensing; saline-alkali areas; salinization identifying; high precision; multisource data

#### **1. Introduction**

Soil salinization is a major type of land degradation in arid and semi-arid areas [1,2], one which causes soil consolidation and crop yield decline, and thus results in huge losses in agricultural production. In addition, its mutual induction with soil desertification will cause more significant damage to the ecological environment and even cause serious geological disasters [3–6]. Soil salinization lasts for a long time, and salinized land recovers only with difficulty, which makes for a continuous impact on the human living environment and economic development [7–9]. More than 100 countries and 7% of the global land area are affected by land salinization [10,11]. It has become a worldwide environmental issue of wide-ranging concern, leading many countries to pay close attention to the amelioration and development of saline-alkaline areas. China is one of the countries seriously affected by salinization [12]. Therefore, it is important to strengthen the dynamic monitoring of saline-alkaline areas to curb the source of land degradation, and to make rational use of land to improve the ecological environment.

**Citation:** Yang, J.; Wang, Q.; Chang, D.; Xu, W.; Yuan, B. A High-Precision Remote Sensing Identification Method on Saline-Alkaline Areas Using Multi-Sources Data. *Remote Sens.* **2023**, *15*, 2556. https://doi.org/10.3390/rs15102556

Academic Editor: Bas van Wesemael

Received: 27 March 2023 Revised: 10 May 2023 Accepted: 11 May 2023 Published: 13 May 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The methods of saline-alkaline area monitoring can be currently divided into two types: instrument-measured soil data [13,14] and large-scale monitoring with remote sensing. With the development of spatial information technology, remote sensing has become the most widely used method in large-scale saline-alkaline area monitoring [5,15–17]. The methods for monitoring saline-alkaline areas based on remote sensing technology have mainly changed from visual interpretation to methods using computers to process image data and extract features [18].

The exploration of saline-alkaline area identification methods based on spectral features has been a subject of frequent scholarly discussion. In 1992, Dwivedi [19] performed experimental research on the best combination of remote sensing bands for monitoring saline-alkaline areas, and concluded that the combination of bands 1, 3, and 5 of Landsat TM remote sensing images contained the largest amount of information, while the accuracy with which a saline-alkaline area was identified was not proportional to the amount of information in the remote sensing data. Farifteh [20] found that soil reflectance had a good response to the salinity of the soil surface layer when using hyperspectral data for soil salinization classification, and concluded that there was a linear relationship between soil salinization and its spectral reflectance. By correlating the spectral parameters from MODIS images with salinization levels, Bouaziz et al. [21] constructed a linear spectral unmixing (LSU) model to examine the status of soil salinization in semi-arid areas. Xiao Dong et al. [22] obtained reflectance and salinity data by field sampling to construct an inversion model and a correction model. Yanhua Fu [23] constructed a model relating spectral data to salt content, organic matter content, and pH level.

Research efforts using indirect features are mainly used to verify the saline-alkaline soil distribution with the help of some other auxiliary information. For example, the growth condition of vegetation can be affected by salinity; thus, vegetation is a good indirect indicator of salinity [24]. Some salt-tolerant vegetation can also be one of the salinization signs. On the ecological scale, soil salinity can adversely limit species diversity and species' ecological niches [25]. Salinity is especially associated with negative osmotic potential, which inhibits seed germination and debilitates cell turgidity [26]. R. L. Dehaan et al. [27] demonstrated that the growth and distribution of vegetation had a strong correlation with soil salinity. By developing the normalized difference vegetation index–salinity index (NDVI–SI) feature-space remote sensing model of soil salinization, Wang et al. [28] successfully monitored the change of saline soil in the Tarim Basin, Xinjiang.

Although these two methods have attained some achievements, how to effectively identify saline-alkaline areas with high accuracy is still the focus of present research.

Minqin County in Gansu Province, China is located at the junction of the Tengger Desert and the Badain Jaran Desert [29], where land degradation is serious. Since the middle of the 20th century, Qingtu Lake, which is located in the deepest part of the two deserts, has gradually dried up. In the 1970s, Minqin County started to use a large amount of groundwater, which caused soil salinization. If salinization continues at this severity, it will eventually lead to the merger of the Tengger Desert and the Badain Jaran Desert, which will directly affect the geomorphology, climate, and human environment of the northwest region and even threaten the survival of local people [30].

Although scholars have made some progress in the quantitative monitoring of land cover using remote sensing, there is currently little research on the application of remote sensing data to the identification and monitoring of saline-alkaline areas. Traditional saline-alkaline area identification methods rely on a single feature parameter, which makes an optimal classification difficult to achieve. To solve the problem of the low accuracy of saline-alkaline area identification based on the traditional spectral indices, taking Minqin County as the study area, we propose a high-precision method of saline-alkaline area identification using multi-source data. By analyzing the trends in and reasons for changes in saline-alkaline areas in the Minqin oasis, this paper aims to provide a reference for the timely monitoring of saline-alkaline areas and for ecological environment construction in arid areas worldwide.

#### **2. Methodology and Experimental Application**

*2.1. Methodology*

2.1.1. Identification Method of Saline-Alkaline Area

A decision tree [31] is a method for hierarchical processing of remote sensing images which is suitable for features with blurred boundaries and complex structures. Its main idea is to gradually mask and separate each feature as a layer from the imagery, avoiding any impact on the other features' identification. Therefore, it is possible to integrate various effective feature quantities, thus improving the identification accuracy of saline-alkaline areas.

Firstly, we use GF-6/WFV (Chinese satellite GaoFen-6/Wide Field View) image data, combined with GF-2 (Chinese satellite GaoFen-2) image and Google Earth high-resolution image data, to select different types of samples and find the best spectral index of band combinations for saline-alkaline areas. Secondly, the GF-2 image data are used to extract textures. Elevation and slope from DEM (Digital Elevation Model) data are used as elevation features to build a decision tree model for saline-alkaline area identification. Finally, the accuracy of the classification results of the constructed decision tree model is verified in ArcGIS using the random scattering function combined with visual interpretation. The technical flowchart is shown in Figure 1.

**Figure 1.** Technical flowchart (GLCM: grey-level co-occurrence matrix).

#### 2.1.2. Accuracy Evaluation Method

Evaluation of feature classification is an important part of remote sensing monitoring, attempting to determine whether the results are credible. The most commonly-used evaluation method is the error matrix method, also called the confusion matrix method [32].

In this paper, the confusion matrix is calculated by comparing each actual measured image element with the corresponding classified one [33]. Each column of the confusion matrix represents the actual measured information, and each row of the confusion matrix represents the classified information of the remote sensing data (Table 1).


**Table 1.** Example of confusion matrix.

Various land type: Class 1, Class 2, . . . , Class *n*.

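User's accuracy is the percentage of the check points assigned to a category on the classification map that actually belong to that category, where *X<sub>nn</sub>* is the number of correctly classified points of class *n* and *Cd<sub>n</sub>* is the total number of points classified as class *n* (the corresponding row total of the confusion matrix).

$$\text{UA (User's accuracy)} = \frac{X_{nn}}{Cd_n} \tag{1}$$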

Producer's accuracy is the probability that the ground truth reference data for a category are correctly classified, where *Td<sub>n</sub>* is the total number of reference points of class *n* (the corresponding column total of the confusion matrix).

$$\text{PA (Producer's accuracy)} = \frac{X_{nn}}{Td_n} \tag{2}$$

Overall accuracy is the percentage of check points of all correctly-classified land cover categories relative to the total number of check points.

$$\text{OA (Overall accuracy)} = \frac{\sum_{i=1}^{n} X_{ii}}{All} \tag{3}$$

The Kappa coefficient is a metric that indicates how much better the classification result is than random classification. It compares two kinds of agreement: the observed agreement between the automatic classification and the reference data (i.e., the overall accuracy), and the agreement that would be expected by chance, which is computed from the row and column totals of the confusion matrix. In practice, the Kappa coefficient usually lies between 0 and 1. A higher Kappa coefficient indicates a higher classification accuracy.

$$\text{Kappa} = \frac{OA - \frac{\sum_{i=1}^{n} Cd_i \times Td_i}{All^2}}{1 - \frac{\sum_{i=1}^{n} Cd_i \times Td_i}{All^2}} \tag{4}$$
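The four measures can be computed from a confusion matrix in a few lines. The Python sketch below follows the row/column convention stated earlier (rows are classified data, columns are reference data) and takes the chance agreement as the sum of the products of row and column totals divided by the squared number of check points. The function name and the small matrix in the usage note are illustrative and are not data from this study.

```python
def accuracy_metrics(matrix):
    """Return (UA, PA, OA, Kappa) for a square confusion matrix.

    matrix[i][j] counts points classified as class i whose reference
    class is j (rows: classified data, columns: reference data).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)                        # All
    row_totals = [sum(row) for row in matrix]                      # Cd_i
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # Td_j
    diagonal = [matrix[i][i] for i in range(n)]                    # X_ii

    ua = [d / c if c else 0.0 for d, c in zip(diagonal, row_totals)]
    pa = [d / t if t else 0.0 for d, t in zip(diagonal, col_totals)]
    oa = sum(diagonal) / total
    # Agreement expected by chance, from the marginal totals.
    chance = sum(c * t for c, t in zip(row_totals, col_totals)) / total ** 2
    kappa = (oa - chance) / (1 - chance)
    return ua, pa, oa, kappa
```

With the toy matrix `[[40, 10], [5, 45]]`, the overall accuracy is 0.85, the chance agreement is 0.5, and the Kappa coefficient is therefore 0.7.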

From 5 to 8 March 2023, we collected 143 samples for verification at a depth of 0 to 5 cm from the surface. They were recorded, associated with information such as number, location, depth, personnel, date, and then brought back to the laboratory.

#### *2.2. Experimental Application*

#### 2.2.1. Study Area

Minqin County is located in the downstream region of the Shiyang River Basin in eastern Gansu Province, China, with an altitude of 1200–1500 m (Figure 2). Tengger Desert is in the east, and Badain Jaran Desert is in the north [34] (Figure 3).

**Figure 2.** Location map of the study area.

**Figure 3.** Map of Minqin.

The study area has a temperate continental desert climate, characterized by cold winters, hot summers, little precipitation, and dry, windy, and sandy conditions [35]. Its average annual temperature is 8.2 °C and the average annual precipitation is 115 mm [36]. The total area of oasis in this area is about 1352 km², which only accounts for 9% of the total area of Minqin County [37]. Due to environmental characteristics such as high temperatures, low precipitation, and high evaporation, water resources in Minqin are scarce, which leads to soil desertification [30].

The main soil types are Fragic Arenosol, Solonchak, Solonetz, Plaggic Anthrosol, and Irragric Anthrosol [38], the first of which can be classified as Arenosol with sand content exceeding one-half [39].

#### 2.2.2. Data

GF-6/WFV images and GF-2/PMS images, as well as DEM, slope, and vector boundary data, are used in this paper (Table 2). The remote sensing images were acquired in June and July, because vegetation grows more luxuriantly and there is no snow or ice cover in this period, which favors the identification of saline-alkaline areas.

**Table 2.** Data sources for identification.


The GF-6/WFV data were pre-processed for radiometric calibration, atmospheric correction, orthorectification correction, and vector cropping to obtain eight-band surface reflectance data for the study area (Figure 4). The atmospheric correction was implemented by the FLAASH (fast line-of-sight atmospheric analysis of spectral hypercubes) atmospheric correction module (Table 3).

**Figure 4.** GF6 pre-processing flow chart.

**Table 3.** Parameters of the FLAASH atmospheric correction module.


The GF-2/PMS image has MSS (Multispectral) and PAN (Panchromatic) data. They have been pre-processed for radiometric calibration, atmospheric correction, geometric correction, image fusion, etc. to obtain a four-band fused image with a spatial resolution of 1 m for the study area (Figure 5).

**Figure 5.** GF2 pre-processing process.

**Figure 6.** Comparison of image fusion: (**a**) image of PAN; (**b**) image of MSS; and (**c**) sharpened MSS.

**Figure 7.** DEM with slope.

Image fusion is an image processing technique that combines a low-resolution multispectral image with a high-resolution panchromatic image to generate a high-resolution multispectral image, so that the processed image has both high spatial resolution and multispectral characteristics. Here, we use the Gram–Schmidt (GS) pan-sharpening fusion method (Figure 6). Its advantages are that it is not limited in the number of bands, is suitable for processing high spatial resolution images, and better preserves texture and spectral information.

The DEM data was cropped in ArcGIS using vector boundary files and then output to obtain the elevation data of the study area. Furthermore, slope was obtained from the cropped DEM in ArcGIS (Figure 7).
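For readers who want to reproduce the slope layer outside ArcGIS, the Python sketch below applies the widely used Horn 3 × 3 finite-difference scheme to a single elevation window. The text does not state which settings were used in ArcGIS, so the choice of scheme, the function name, and the requirement that the cell size be in the same unit as the elevations are assumptions of this example.

```python
import math


def horn_slope_degrees(window, cell_size):
    """Slope in degrees at the centre cell of a 3 x 3 elevation window.

    window is given row by row from north to south; cell_size must use
    the same unit as the elevation values.
    """
    (a, b, c), (d, _centre, f), (g, h, i) = window
    # Weighted differences between the east/west and south/north columns.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)
    dz_dy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
```

A surface that rises by one cell size per cell, such as `[[0, 1, 2], [0, 1, 2], [0, 1, 2]]` with `cell_size=1`, yields a slope of 45°, and a flat window yields 0°.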

2.2.3. Feature Extraction

• Spectral features

The main land cover types in the study area include desert, saline-alkaline area, vegetation, urban, and water. After pre-processing the GF-6/WFV images, the original spectral characteristics of each type in the study area were analyzed (Table 4, Figure 8).

**Table 4.** GF-6/WFV band.


**Figure 8.** Spectra of features in the study area. (Bands are arranged by increasing wavelength).

From Figure 8, we can see that some spectral features of a saline-alkaline area and a desert are easily confused; more spectral indices are needed to improve the saline-alkaline areas' classification accuracy.

The NDSI34 (Normalized Difference Spectral Index of Band 3 and Band 4) was constructed using band 3 (Red) and band 4 (NIR).

$$\text{NDSI}_{34} = (\text{NIR} - \text{R})/(\text{R} + \text{NIR}) \tag{5}$$

The NDSI25 (Normalized Difference Spectral Index of Band 2 and Band 5) was constructed using band 2 (Green) and band 5 (Red edge1: Re1).

$$\text{NDSI}_{25} = (\text{Re1} - \text{G}) / (\text{Re1} + \text{G}) \tag{6}$$

The NDSI237 (Normalized Difference Spectral Index of Band 2, Band 3 and Band 7) was constructed using bands 2 (Green), 3 (Red), and 7 (Violet).

$$\text{NDSI}_{237} = (\text{R} + \text{G} - \text{V})/(\text{R} + \text{G} + \text{V}) \tag{7}$$

The final composite salinity index NDSInew (New Normalized Difference Salt Index) was constructed as:

$$\text{NDSI}_{\text{new}} = \text{NDSI}_{25} + \text{NDSI}_{237} - \text{NDSI}_{34} \tag{8}$$

The spectral indices are mainly selected depending on the spectral characteristics of each feature. For example, NDSI34 can sufficiently separate the vegetation in the image. The saline-alkaline area is associated with a large difference between the red edge1 band and the green band; therefore, NDSI25 can distinguish the saline-alkaline areas from other features. Because the reflectance of saline-alkaline soil in the red and green bands is significantly higher than that in the violet band, NDSI237 can sufficiently separate saline-alkaline soil from other features. Considering the three indices together, we finally construct the comprehensive index, NDSInew, by which the saline-alkaline areas can be well distinguished.
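The four indices translate directly into code. The Python sketch below evaluates Equations (5)–(8) for a single pixel; the function names, the handling of a zero denominator, and the reflectance values in the usage note are illustrative assumptions and are not taken from the study data.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total else 0.0


def ndsi_34(red, nir):
    """Equation (5): band 3 (red) and band 4 (NIR)."""
    return normalized_difference(nir, red)


def ndsi_25(green, red_edge1):
    """Equation (6): band 2 (green) and band 5 (red edge 1)."""
    return normalized_difference(red_edge1, green)


def ndsi_237(green, red, violet):
    """Equation (7): bands 2 (green), 3 (red) and 7 (violet)."""
    return normalized_difference(red + green, violet)


def ndsi_new(green, red, nir, red_edge1, violet):
    """Equation (8): composite index built from the three indices above."""
    return (ndsi_25(green, red_edge1)
            + ndsi_237(green, red, violet)
            - ndsi_34(red, nir))
```

For made-up reflectances of green = 0.2, red = 0.3, NIR = 0.5, red edge 1 = 0.4 and violet = 0.1, the three component indices are 0.25 (NDSI34), about 0.333 (NDSI25) and about 0.667 (NDSI237), giving a composite value of 0.75.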

• Texture features

When the spectra of the features are relatively similar, the spectral differentiability decreases and texture information can play an important role in distinguishing the features, raising the accuracy rates of classification [40].

Among the methods for computing image texture features, GLCM (grey-level co-occurrence matrix) is one of the most widely used statistical methods [41]. GLCM can describe the spatial distribution and structural characteristics of the image grayscale, which is advantageous in improving the classification of geological targets by using texture. There are eight main feature quantities commonly used for texture identification in remote sensing images: mean, variance, homogeneity, contrast, dissimilarity, entropy, angular second moment, and correlation.

**Figure 9.** Textures.

We used the GLCM method to extract textures from GF-2 images and calculated eight textures on four bands with a 3 × 3 window (Figure 9). After that, we selected mean, dissimilarity and entropy as the parameters for classification.
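To make the three selected texture measures concrete, the following Python sketch builds a symmetric, normalized GLCM for one small window and derives the mean, dissimilarity, and entropy from it. The horizontal one-pixel offset, the quantization of the window to integer grey levels in `[0, levels)`, the use of the natural logarithm, and the function names are assumptions of this example; the software settings used in the study are not specified in the text.

```python
import math


def glcm(window, levels, offset=(0, 1)):
    """Symmetric, normalized co-occurrence matrix of a grey-level window."""
    d_row, d_col = offset
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = window[r][c], window[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Return (mean, dissimilarity, entropy) of a normalized GLCM p."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    mean = sum(i * v for i, _, v in cells)
    dissimilarity = sum(abs(i - j) * v for i, j, v in cells)
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    return mean, dissimilarity, entropy
```

A uniform 3 × 3 window has zero dissimilarity and zero entropy, whereas a checkerboard window such as `[[0, 1, 0], [1, 0, 1], [0, 1, 0]]` gives a dissimilarity of 1 and an entropy of ln 2, which illustrates why these measures help to separate smooth surfaces from rough ones.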

### • Elevation features

Height and slope information from DEM are introduced to carry out reclassification in ArcGIS. As shown in Figure 10, there are some differences in elevation among features. For example, vegetation and urban types are generally flatter.

**Figure 10.** Elevation characteristics: (**a**) height of features; and (**b**) slope of features.

#### **3. Results and Discussion**

#### *3.1. Classification and Verification*

Based on multi-source data, the saline-alkaline area identification map of the study area, obtained with a decision tree classification method combining spectral features (NDSInew), texture features (mean, dissimilarity and entropy), and elevation features (height and slope), is shown in Figure 11.
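The layer-by-layer masking idea behind the decision tree can be sketched as an ordered sequence of rules. The Python example below is only an illustration of the mechanism: the threshold values are hypothetical placeholders (the rule values of the actual model are not listed in the text), the function name is invented, and the texture and elevation rules are omitted for brevity.

```python
# Hypothetical thresholds, for illustration only; they are NOT the
# values used in the study.
VEGETATION_NDSI34_MIN = 0.30
SALINE_NDSINEW_MIN = 0.60


def classify_pixel(ndsi34, ndsi_new):
    """Apply the rules in order; each rule masks one land cover layer."""
    if ndsi34 > VEGETATION_NDSI34_MIN:
        return "vegetation"        # masked first, so it cannot affect later rules
    if ndsi_new > SALINE_NDSINEW_MIN:
        return "saline-alkaline"
    return "other"
```

Because vegetation is removed before the salinity rule is evaluated, a vegetated pixel with a high NDSInew value is never labelled as saline-alkaline, which is the practical benefit of separating the features one layer at a time.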

**Figure 11.** Saline-alkaline areas: map and sample points.

Through random distribution and considering the accessibility of each site, we travelled to Minqin County for a field survey (Figure 12). The land types of verification points were investigated and labeled, and 143 verification samples were obtained.

**Figure 12.** Field survey: (**a**) saline-alkaline areas; (**b**) handheld GPS recording; and (**c**) sampling.


**Figure 13.** Percentage of saline-alkaline area in Minqin County.

The verification samples were evaluated with a confusion matrix, using producer's accuracy, user's accuracy, overall accuracy and the Kappa coefficient. The results are shown in Table 5, and indicate that the overall accuracy of the proposed saline-alkaline area identification method is 88.11%.

As shown in Figure 13, the saline-alkaline area in Minqin County is 3385.17 km², accounting for 20.4% of the total area of Minqin County.

In terms of the spatial distribution shown in Figure 11, soil salinization is most serious in the northwest, where saline-alkaline land covers a large area, followed by the eastern region, and finally the Minqin oasis area, where saline-alkaline patches are small, scattered, and distributed on both sides of the oasis.

#### *3.2. Comparison of the Results of Different Indices*

Based on the same data, the traditional salinity index NDSIold = (NIR − R)/(R + NIR) was used for salinity identification, and its accuracy was verified to be 80.42%. The comparison of the salinity identification results between these two methods is shown in Figure 14.

**Figure 14.** Comparison of the identification results of different indices.

From these results, we can see that the accuracy of the new salinity index NDSInew is improved by 7.69% compared with the traditional salinity index NDSIold, indicating the effectiveness of the new spectral index in the identification of saline-alkaline areas.

#### *3.3. Analysis of Saline-Alkaline Area Change*

Three Landsat8 OLI remote sensing images acquired in July and August were downloaded from https://www.gscloud.cn (accessed on 28 December 2022) (Table 6).

**Table 6.** Data sources for analysis.


**Figure 15.** Saline-alkaline areas identified in the Minqin oasis in 2010, 2015, and 2020.

The remote sensing data were pre-processed with ENVI.

The results of the saline-alkaline area identification in 2010, 2015 and 2020 (Figure 15) were statistically analyzed in ArcGIS to calculate the total area of saline-alkaline land. The saline-alkaline areas in 2010, 2015 and 2020 were 2276.21 km², 2186.28 km² and 1922.93 km², respectively (Figure 16). From this, we can see that the saline-alkaline area decreased by 353.28 km² from 2010 to 2020.
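As a quick check of these figures, the decrease and its share of the 2010 area can be worked out directly. The symbol *A* with a year subscript is notation introduced here for the identified saline-alkaline area, and the percentage is derived from the reported values rather than stated by the authors.

$$\Delta A = A_{2010} - A_{2020} = 2276.21 - 1922.93 = 353.28\ \text{km}^2, \qquad \frac{\Delta A}{A_{2010}} = \frac{353.28}{2276.21} \approx 15.5\%$$

The same arithmetic shows that most of the decrease occurred between 2015 and 2020 (2186.28 − 1922.93 = 263.35 km²) rather than between 2010 and 2015 (2276.21 − 2186.28 = 89.93 km²).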

**Figure 16.** Change of saline-alkaline areas in the Minqin oasis.

There are many natural and human factors affecting the saline-alkaline area changes in the Minqin oasis:

(1) Climate change has brought many problems to the soil environment, such as changes in the soil's physical composition (water content), chemical composition (contents of various salt ions), and plant species. Climatic warming not only causes microorganisms to decompose soil organic matter rapidly, which quickly depletes soil nutrients, but also causes soil moisture to evaporate, accelerating the upward movement of salt and causing soil salinization.

According to the statistics of the Minqin meteorological station, Minqin has little rainfall but a high level of evaporation (Figure 17). Combined with rising temperatures, these conditions promote the salinization of the soil.

**Figure 17.** Climate in Minqin county: (**a**) curve of annual average temperature and precipitation; and (**b**) annual average monthly potential evaporation.

Coupled with the weathering effect of rocks, a large amount of salt is released in the soil's parent material of the northwestern remnant hills, and then carried to the lowlands through precipitation, resulting in serious salinization in the northwest [42].

(2) Historically, Minqin has been one of the important salt-producing areas, with many salt ponds [34]. With the gradual depletion of the Shiyang River, Minqin started to seek groundwater instead (Figure 18). Due to the adjustment of agricultural structure, water resources were redistributed spatially, salt was transferred with water, and the overuse of irrigation water also led to the transformation of some depressions at the edge of the oasis into saline-alkaline areas [43].

**Figure 18.** Changes of surface water inflow in Minqin County from 2005 to 2012.

(3) In 2007, the government implemented the "Key Control Plan of Shiyang River Basin" and began to transfer water to Minqin County at the lower reaches of the Shiyang River [44]. The surface water runoff into Minqin County has increased year by year (Figure 18). Since then, the soil salinization in Minqin has been improved to a certain extent.

Some studies [45–47] on saline-alkaline area identification in Minqin County are shown in Table 7. However, all of them considered salinization only as a type of desertification and did not conduct in-depth research on the fine classification of saline-alkaline areas. From this, it can be concluded that there are few studies on saline-alkaline area identification in Minqin County. At the same time, decision tree classification methods have already been used for land classification in this region, which supports the applicability of the method in Minqin.


**Table 7.** Comparison of studies on the identification of saline-alkaline areas in Minqin County.

#### **4. Conclusions**

We used multi-source data for saline-alkaline area identification in Minqin County and draw the following conclusions:

(1) The proposed method is effective in saline-alkaline area identification.

Based on multi-source data, we used a decision tree classification method to extract saline-alkaline areas by constructing three types of features: spectral indices, textures, and topography (elevation and slope). The results show that the accuracy of saline-alkaline area identification is 88.11%, which is 7.69% greater than that of the traditional salinity indices, indicating the effectiveness of the proposed method.

(2) The multi-source data can help to identify features and improve accuracy.

GF-6 data help to improve the accuracy of saline-alkaline area identification. In particular, band 3 and band 7 are important for saline-alkaline area identification in the study area.

The high spatial resolution of GF-2 data provides rich texture information, thus reducing the misclassification of features caused by "the same feature with different spectra" or "different features with the same spectrum".

The height and slope from DEM can quantify the topography of the study area, which is also helpful for identifying features and improving the classification accuracy.

(3) Monitoring and prevention of unused land in the study area are necessary.

With 20.4% of the land considered to be within a saline-alkaline area, soil salinization in Minqin County is a serious concern, especially in the northwestern areas. Therefore, we should strengthen the monitoring and prevention of unused land to prevent further soil salinization.
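As an illustration of the DEM-derived slope feature mentioned in conclusion (2), the following is a minimal sketch of how a slope grid might be computed from a gridded DEM using central differences. It is not the procedure used in this study, which does not state its slope algorithm; the function name `slope_degrees`, the toy DEM, and the 10 m cell size are all assumptions made for the example.

```python
import math


def slope_degrees(dem, cell_size):
    """Return slope angles in degrees for the interior cells of a DEM.

    dem is a list of rows of elevations (metres); cell_size is the grid
    spacing in metres. Border cells are left as None because central
    differences need a neighbour on both sides.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Elevation gradient along the column (x) and row (y) directions.
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            # Slope angle is the arctangent of the gradient magnitude.
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


# Hypothetical 3x3 DEM rising 10 m per 10 m cell towards the east.
toy_dem = [
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
    [100.0, 110.0, 120.0],
]
toy_slope = slope_degrees(toy_dem, cell_size=10.0)
```

In this toy grid the surface rises as fast as it extends horizontally, so the single interior cell has a slope of 45 degrees. Production tools usually use a weighted 3x3 kernel rather than plain central differences, so their values can differ slightly on real terrain.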

In summary, building on previous studies, we proposed a high-precision saline-alkaline area identification method based on multi-source data. The results demonstrate the effectiveness of the method, which addresses the current problem of low accuracy in saline-alkaline area identification and may be applied to large-scale saline-alkaline area monitoring in the future. It should be noted that, although the decision tree classification method achieved better classification results in this study, the significance of the selected feature variables and the grading criteria need to be further studied and improved to make the discriminative rules and classification results more realistic. Research on the identification and classification of soil salinization in arid zones therefore needs to be developed further.
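To make the idea of discriminative rules and grading criteria concrete, the sketch below shows what a hand-built decision tree over the three feature types (spectral index, texture, and topography) could look like, together with an overall accuracy calculation. Every name and threshold in it (`PixelFeatures`, `SI_MIN`, `CONTRAST_MAX`, `ELEVATION_MAX_M`, `SLOPE_MAX_DEG`) is a hypothetical placeholder chosen for illustration; none of them are the rules or values used in this study.

```python
from dataclasses import dataclass


@dataclass
class PixelFeatures:
    salinity_index: float    # spectral salinity index (hypothetical scale)
    texture_contrast: float  # texture measure from high-resolution imagery
    elevation_m: float       # elevation from the DEM, in metres
    slope_deg: float         # slope from the DEM, in degrees


# Placeholder thresholds for illustration only, not values from the study.
SI_MIN = 0.30
CONTRAST_MAX = 0.50
ELEVATION_MAX_M = 1400.0
SLOPE_MAX_DEG = 5.0


def classify(p):
    """Walk a small rule-based decision tree and return a class label."""
    if p.salinity_index < SI_MIN:
        return "other"  # spectral response too weak
    if p.slope_deg > SLOPE_MAX_DEG or p.elevation_m > ELEVATION_MAX_M:
        return "other"  # rejected by the topographic rules
    if p.texture_contrast > CONTRAST_MAX:
        return "other"  # rejected by the texture rule
    return "saline-alkaline"


def overall_accuracy(predicted, reference):
    """Fraction of samples whose predicted label matches the reference."""
    matches = sum(1 for a, b in zip(predicted, reference) if a == b)
    return matches / len(reference)


# Hypothetical validation samples and reference labels.
samples = [
    PixelFeatures(0.45, 0.2, 1310.0, 1.0),
    PixelFeatures(0.10, 0.2, 1310.0, 1.0),
    PixelFeatures(0.45, 0.2, 1310.0, 12.0),
    PixelFeatures(0.45, 0.8, 1310.0, 1.0),
]
reference = ["saline-alkaline", "other", "other", "saline-alkaline"]
predicted = [classify(s) for s in samples]
accuracy = overall_accuracy(predicted, reference)
```

Changing a single threshold changes which samples pass the tree, which is why the choice of feature variables and grading criteria has such a direct effect on the reported accuracy.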

**Author Contributions:** Conceptualization, J.Y. and Q.W.; methodology, J.Y.; validation, J.Y. and D.C.; formal analysis, J.Y. and D.C.; investigation, J.Y. and W.X.; resources, Q.W. and B.Y.; data curation, J.Y. and D.C.; writing—original draft preparation, J.Y.; writing—review and editing, Q.W.; visualization, J.Y. and W.X.; supervision, Q.W.; project administration, Q.W.; funding acquisition, Q.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded in part by the National Key R&D Program of China (grant number 2021YFB3900503), the National Natural Science Foundation of China (grant number 42071312), the Hainan Hundred Special Project (grant number 31, JTT [2018]), the Innovative Research Program of the International Research Center of Big Data for Sustainable Development Goals (grant number CBAS2022IRP03), the Special Project of Strategic Leading Science and Technology of the Chinese Academy of Sciences (grant number XDA19090139), the Second Tibetan Plateau Scientific Expedition and Research (STEP) (grant number 2019QZKK0806), and the Hainan Provincial Department of Science and Technology (grant number ZDKJ2019006).

**Data Availability Statement:** Not applicable.

**Acknowledgments:** Thanks are due for the GF data provided by CRESDA and AIRCAS.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


*Remote Sensing* Editorial Office E-mail: remotesensing@mdpi.com www.mdpi.com/journal/remotesensing

