$$\frac{d(m\mathbf{v})}{dt} = \mathbf{f} \tag{1}$$

where *m* is the mass, **v** is the velocity, and **f** is the resultant force, which depends on the movement type. The left-hand side (LHS) of Equation (1) represents the motion process, and the right-hand side (RHS) is the resultant force.

**Figure 2.** Eulerian and Lagrangian descriptions.

## 2.1.1. Models Based on the Lagrangian Description

The rigid body motion includes translation and rotation, which is a six-degrees-of-freedom process (three components for the translation process and three components for the rotation process). Using the center of mass and the inertia matrix, the force and torque equations take the form:

$$\mathbf{F} = m\mathbf{a}, \quad \mathbf{T} = [I_R]\boldsymbol{\alpha} + \boldsymbol{\omega} \times [I_R]\boldsymbol{\omega} \tag{2}$$

where **F** is the force, **T** is the torque, **a** is the acceleration, *m* is the mass, [*I<sub>R</sub>*] is the moment of inertia matrix, **ω** is the angular velocity, and **α** is the angular acceleration. The equations of translation and rotation in the Lagrangian description are

$$\mathbf{x}_{t+\Delta t} = \mathbf{x}_t + \mathbf{v}_t \Delta t + \frac{1}{2} \mathbf{a}_t \Delta t^2 \tag{3}$$

$$\boldsymbol{\theta}_{t+\Delta t} = \boldsymbol{\theta}_t + \boldsymbol{\omega}_t \Delta t + \frac{1}{2} \boldsymbol{\alpha}_t \Delta t^2 \tag{4}$$

where **x**<sub>t</sub> is the location at time *t*, **v**<sub>t</sub> is the velocity at time *t*, **a**<sub>t</sub> is the acceleration at time *t*, **θ**<sub>t</sub> is the angle at time *t*, **ω**<sub>t</sub> is the angular velocity at time *t*, and **α**<sub>t</sub> is the angular acceleration at time *t*.
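A minimal sketch of how Equations (2)–(4) advance a rigid body state by one explicit time step, assuming a full 3 × 3 inertia matrix; the function and variable names are illustrative, not taken from any cited code:

```python
import numpy as np

def step_rigid_body(x, v, theta, omega, F, T, m, I_R, dt):
    """Advance translation and rotation by one step using Equations (2)-(4).

    x, v: position and velocity (3,); theta, omega: angle and angular velocity (3,);
    F, T: resultant force and torque (3,); m: mass; I_R: 3x3 moment of inertia matrix.
    """
    a = F / m                                                       # Equation (2), translation
    alpha = np.linalg.solve(I_R, T - np.cross(omega, I_R @ omega))  # Equation (2), rotation
    x_new = x + v * dt + 0.5 * a * dt**2                            # Equation (3)
    theta_new = theta + omega * dt + 0.5 * alpha * dt**2            # Equation (4)
    return x_new, v + a * dt, theta_new, omega + alpha * dt
```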

## 2.1.2. Models Based on the Eulerian Description

In continuum systems, the Eulerian description is best suited to simulating motion, and the Navier–Stokes equations are the basis of modeling [32]:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0 \tag{5}$$

$$\rho\left(\frac{\partial \mathbf{v}}{\partial t} + \mathbf{v} \cdot \nabla \mathbf{v}\right) = \mathbf{S}_{\mathbf{f}} \tag{6}$$

where *ρ* is the mass density, *t* is time, **v** is the velocity, and **S<sub>f</sub>** is the force field, which includes gravity, friction, and other forces. Because these equations lack a smooth analytical solution, they are best solved through numerical analysis. Traditional CFD programs such as OpenFOAM® use these equations to simulate landslide dynamics, but their geometric modeling process is complex.

To describe a flowing grain–fluid mixture, Iverson proposed a mixture theory [33], in which **S<sub>f</sub>** is:

$$\mathbf{S}\_{\mathbf{f}} = -\nabla \cdot \left( \mathbf{T}\_{\mathbf{s}} + \mathbf{T}\_{\mathbf{f}} + \mathbf{T}' \right) + \rho \mathbf{g} \tag{7}$$

in which

$$\rho = \rho_s \upsilon_s + \rho_f \upsilon_f \tag{8}$$

$$\mathbf{v} = \left(\rho_s \upsilon_s \mathbf{v}_s + \rho_f \upsilon_f \mathbf{v}_f\right)/\rho \tag{9}$$

Here, *ρ* is the mixture mass density, **v** is the mixture velocity, **v**<sub>s</sub> is the velocity in the solid phase, **v**<sub>f</sub> is the velocity in the fluid phase, *υ<sub>s</sub>* is the volume fraction of solid, *υ<sub>f</sub>* is the volume fraction of fluid, **g** is the gravitational acceleration, **T**<sub>s</sub> is the solid stress, **T**<sub>f</sub> is the fluid stress, and **T**′ is a contribution to the mixture stress. The stress **T**′ can be avoided by using an approximation suitable for many debris flows.
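For concreteness, a minimal sketch of Equations (8) and (9), assuming the two volume fractions sum to one (as in mixture theory); all names are illustrative:

```python
import numpy as np

def mixture_state(rho_s, rho_f, ups_s, v_s, v_f):
    """Mixture density and velocity from Equations (8) and (9).

    rho_s, rho_f: solid/fluid mass densities; ups_s: solid volume fraction
    (the fluid fraction is assumed to be 1 - ups_s); v_s, v_f: phase velocities.
    """
    ups_f = 1.0 - ups_s                      # volume fractions sum to one (assumption)
    rho = rho_s * ups_s + rho_f * ups_f      # Equation (8)
    v = (rho_s * ups_s * np.asarray(v_s)
         + rho_f * ups_f * np.asarray(v_f)) / rho  # Equation (9)
    return rho, v
```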

To reduce the difficulty of traditional CFD, Savage and Hutter proposed the depth-averaged theory [34]. This model allows GIS platforms to integrate simulation codes. Depth averaging eliminates the calculation in the *z*-direction: the averaged velocities and resultant forces are calculated on the *x*–*y* plane [32,34,35]. This method assumes that *ρ* is constant and the landslide depth is shallow. Thus, the equations [32,36] are

$$\frac{\partial h}{\partial t} + \frac{\partial (h\overline{v}_x)}{\partial x} + \frac{\partial (h\overline{v}_y)}{\partial y} = 0 \tag{10}$$

$$\rho \left[ \frac{\partial (h\overline{v}_x)}{\partial t} + \frac{\partial (h\overline{v}_x^2)}{\partial x} + \frac{\partial (h\overline{v}_x \overline{v}_y)}{\partial y} \right] = -\int_0^h \left[ \frac{\partial T_{xx}}{\partial x} + \frac{\partial T_{yx}}{\partial y} + \frac{\partial T_{zx}}{\partial z} - \rho g_x \right] dz \tag{11}$$

$$\rho \left[ \frac{\partial (h\overline{v}_y)}{\partial t} + \frac{\partial (h\overline{v}_y^2)}{\partial y} + \frac{\partial (h\overline{v}_x \overline{v}_y)}{\partial x} \right] = -\int_0^h \left[ \frac{\partial T_{xy}}{\partial x} + \frac{\partial T_{yy}}{\partial y} + \frac{\partial T_{zy}}{\partial z} - \rho g_y \right] dz \tag{12}$$

where $\overline{v}_x = \frac{1}{h}\int_0^h v_x \, dz$ and $\overline{v}_y = \frac{1}{h}\int_0^h v_y \, dz$. As shown in the equations, the velocity in the *z*-direction is averaged. Based on the depth-averaged theory, a depth-averaged mixture model was proposed by Iverson and Denlinger [32]. Subsequently, two-phase and multi-phase depth-averaged models were proposed to describe distinct mechanical responses and dynamic behaviors of material [35,37].
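As a small illustration of the averaging step, the definition of $\overline{v}_x$ can be evaluated numerically for any velocity profile; the profile below is hypothetical:

```python
import numpy as np

h = 2.0                        # flow depth (m), illustrative
z = np.linspace(0.0, h, 101)
v_x = 3.0 * np.sqrt(z / h)     # hypothetical shear profile, fastest at the surface
v_bar = np.trapz(v_x, z) / h   # depth-averaged velocity: (1/h) * integral of v_x dz
# v_bar is close to the exact average of 2.0 m/s for this profile.
```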

Some studies consider the erosion process, a mechanical process by which bed material is mobilized by the flow and one of the dominant mechanical processes in geophysical mass flows [38]. Erosion determines enhanced or reduced mobility but is not yet thoroughly understood. In the dynamic model, the erosion rate *E* = −*∂b*/*∂t* and the erosion velocity *u<sub>b</sub>* are two important parameters, which enter the mass and momentum balances as productions *E* and *u<sub>b</sub>E* [38], respectively. Subsequently, Pudasaini and Fischer proposed a two-phase erosion model on this basis [39]. The emergence of these models has further developed the continuum model on GIS platforms.

## *2.2. Forces*

Gravity, friction, collision force, and hydraulic pressure are the major forces during motion. Some forces have less influence and can be ignored during the movement.

## 2.2.1. Collision Force

Collision force is one of the factors affecting the process of landslide movement, especially during falls and topples. The collision process is complicated, which makes the calculation difficult. Traditionally, the spring–dashpot model is widely used to describe the nonlinear process, which is

$$F\_n = F\_{el} + F\_{diss} \tag{13}$$

where *F<sub>n</sub>* is the collision force, *F<sub>el</sub>* is the elastic force (spring), and *F<sub>diss</sub>* is the dissipative force (dashpot). The spring obeys Hooke's law, and the dashpot obeys Newton's law of viscosity [40,41]. To simplify the process, Evans and Hungr used a lumped mass model in the ROCKFALL program in 1993 [42]. The assumptions of the lumped mass model are as follows: (1) each rock is a small spherical particle; (2) rocks have no size, only mass. The lumped mass model uses one or two restitution coefficients to express this process and avoid calculating complex collision forces. The restitution coefficients are the ratio of the rebound velocity to the incident velocity, the impulse ratio, or the work ratio, which involves the square root of the work performed. Most models use two restitution coefficients, the tangential restitution coefficient *R<sub>t</sub>* and the normal restitution coefficient *R<sub>n</sub>*; a few models use only one restitution coefficient to quantify dissipation in terms of the loss of velocity magnitude. Hybrid approaches were proposed on GIS platforms to simulate rockfall accurately. These models, such as CRSP [43], RocFall [44], and STONE [45], consider the influence of shape and nonlinear collision based on the lumped mass model, and they have become the mainstream models on GIS platforms. In summary, there are three models for describing the collision force on GIS platforms: a fully rigid body model, a lumped mass model, and a hybrid approach [46].
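A minimal sketch contrasting the two approaches described above; the stiffness, damping, and velocity decomposition are illustrative, not taken from any of the cited programs:

```python
def spring_dashpot_force(delta, delta_dot, k, c):
    """Normal contact force of Equation (13): elastic (spring) + dissipative (dashpot).

    delta: overlap (m); delta_dot: overlap rate (m/s); k: stiffness; c: damping.
    No tensile clamp is applied here; real DEM codes usually add one.
    """
    return k * delta + c * delta_dot if delta > 0.0 else 0.0

def lumped_mass_rebound(v_n, v_t, R_n, R_t):
    """Lumped mass alternative: skip the contact force and rescale velocities with
    the normal (R_n) and tangential (R_t) restitution coefficients at impact."""
    return -R_n * v_n, R_t * v_t
```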

## 2.2.2. Friction Force

Friction is the force that resists the relative motion of solid surfaces, fluid layers, and material elements sliding against each other, which is important for the movement calculation. In this process, kinetic energy is converted to thermal energy when motion with friction occurs. There are two types of resistance: viscosity and Coulomb's friction.

For dry granular flow, Coulomb's friction is adopted, which is

$$\tau = \mu N \tag{14}$$

where *τ* is the unit base resistance, *µ* is the friction coefficient, and *N* is the normal force. Generally, the basal shear forces, obtained by simple infinite-slope landslide models, are calculated by Coulomb's friction.

For fluid flow, viscosity is a measure of resistance to deformation at a given rate. It can be conceptualized as the internal frictional force that arises between adjacent layers of fluid that are in relative motion. There are three viscosity equations: Newtonian, Bingham, and quadratic fluids.

(1) The Newtonian fluid is:

$$\tau = \mu \gamma' \tag{15}$$

where *µ* is the shear viscosity of the fluid and *γ*′ is the derivative of the velocity component.

(2) The Bingham fluid model is

$$\tau = \tau_0 + \mu \gamma' \tag{16}$$

where *τ*<sub>0</sub> is a constant yield strength, *µ* is the shear viscosity of the fluid, and *γ*′ is the derivative of the velocity component.

(3) The quadratic fluid model is

$$\tau = \tau_0 + \mu \gamma' + \zeta \gamma'^2 \tag{17}$$

The first two terms are referred to as the Bingham shear stresses. The last term represents the dispersive and turbulent shear stresses. Fluid friction can be generalized as:

$$f = f\_s + f\_v + f\_t \tag{18}$$

where *f<sup>v</sup>* is the viscosity term, *f<sup>s</sup>* is the constant term, and *f<sup>t</sup>* is a turbulent term. Based on the depth-averaged theory, shear stress is depth-integrated, and the corresponding equation is

$$S = \frac{1}{h} \int f \, dz \tag{19}$$

where *S* is depth-integrated shear stress. Therefore, Equation (18) can be transformed to:

$$S\_{fx} = S\_{\tau} + S\_{v} + S\_{td}.\tag{20}$$
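Equations (15)–(17) nest inside one expression (Newtonian when *τ*<sub>0</sub> = *ζ* = 0, Bingham when *ζ* = 0), so a single helper covers the three rheologies, with Equation (14) handling the dry granular case; a sketch with illustrative names:

```python
def coulomb_resistance(mu, N):
    """Dry granular flow, Equation (14): resistance proportional to the normal force."""
    return mu * N

def fluid_resistance(gamma_dot, mu, tau0=0.0, zeta=0.0):
    """Unit base resistance for Equations (15)-(17).

    tau0 = zeta = 0        -> Newtonian fluid, Equation (15)
    tau0 > 0, zeta = 0     -> Bingham fluid, Equation (16)
    tau0 > 0 and zeta > 0  -> quadratic model, Equation (17)
    gamma_dot: shear rate (velocity derivative); mu: shear viscosity.
    """
    return tau0 + mu * gamma_dot + zeta * gamma_dot**2
```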

## 2.2.3. Other Forces

Hydraulic pressure in the depth-averaged model is the force imparted per unit area of liquid or flow-like materials on the surfaces, which can be expressed in the Eulerian description as:

$$F_l = \frac{\partial}{\partial x}\left(\beta \frac{h^2}{2}\right) \tag{21}$$

where *β* can be changed to *β<sub>s</sub><sup>x</sup>* and *β<sub>f</sub><sup>x</sup>* based on the phase [37]. For the fluid, *β* is *g<sup>z</sup>*. For solids, the force created by collisions among particles is simplified to an internal force based on soil mechanics, and *β* is:

$$\beta_s^x = K g^z \left(1 - \gamma_s^f\right) \tag{22}$$

$$K\_{\rm pas/act} = 2 \sec^2 \phi \left\{ 1 \pm \left( 1 - \cos^2 \phi \sec^2 \delta \right)^{1/2} \right\} - 1 \tag{23}$$

where *φ* is the internal frictional angle and *δ* is the basal frictional angle. The collisions and frictions among particles are difficult to calculate during the motion.
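Equation (23) transcribes directly; the sketch below assumes angles in radians with *δ* ≤ *φ* so that the square root stays real, and the function name is illustrative:

```python
import math

def earth_pressure_coefficients(phi, delta):
    """Passive/active lateral earth pressure coefficients of Equation (23).

    phi: internal friction angle (rad); delta: basal friction angle (rad).
    """
    root = math.sqrt(1.0 - math.cos(phi) ** 2 / math.cos(delta) ** 2)
    sec2 = 1.0 / math.cos(phi) ** 2
    K_pas = 2.0 * sec2 * (1.0 + root) - 1.0   # "+" branch of Equation (23)
    K_act = 2.0 * sec2 * (1.0 - root) - 1.0   # "-" branch of Equation (23)
    return K_pas, K_act
```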

There are two methods to describe the form of hydraulic pressure in the Lagrangian description. The first is the particle-in-cell (PIC) method [47], which converts particles or columns to fields based on the volume in each cell. The other is the smoothed-particle hydrodynamics (SPH) approximation [48–50], and they are:

$$\overline{P}_I = \frac{1}{2} \beta h_I^2 \tag{24}$$

$$\mathbf{H}_p = \sum_J m_J \left( \frac{\overline{P}_I}{h_I^2} + \frac{\overline{P}_J}{h_J^2} \right) \operatorname{grad} W_{IJ} \tag{25}$$

where *P̄<sub>I</sub>* is an averaged hydraulic pressure term and *W<sub>IJ</sub>* is the value of the SPH kernel function centered at node *I* evaluated at node *J*. The weighting function or kernel *W<sub>IJ</sub>* is a symmetric function of **x**<sub>I</sub> − **x**<sub>J</sub>. Additionally, *m<sub>J</sub>* has no physical meaning. When the node moves, the material contained in a column of base Ω<sub>I</sub> has entered it or will leave it as the column moves with an averaged velocity, which is not the same for all particles or columns in it [48].
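A sketch of the pressure sum of Equation (25). The 2-D cubic spline kernel below is a common SPH choice but not necessarily the one used in [48–50]; `h_col` is the column height of Equation (24), `h_s` the smoothing length, and all names are illustrative:

```python
import numpy as np

def grad_cubic_spline_2d(r_vec, h_s):
    """Gradient of the 2-D cubic spline kernel (normalization 10/(7*pi*h_s^2))."""
    r = np.linalg.norm(r_vec)
    q = r / h_s
    sigma = 10.0 / (7.0 * np.pi * h_s**2)
    if q < 1.0:
        dw = sigma * (-3.0 * q + 2.25 * q**2) / h_s
    elif q < 2.0:
        dw = -0.75 * sigma * (2.0 - q) ** 2 / h_s
    else:
        return np.zeros(2)
    return dw * r_vec / (r + 1e-12)  # dW/dq times the unit vector from J to I

def pressure_term(I, x, m, P_bar, h_col, h_s):
    """Hydraulic pressure contribution H_p at node I, following Equation (25)."""
    H_p = np.zeros(2)
    for J in range(len(x)):
        if J == I:
            continue
        coeff = m[J] * (P_bar[I] / h_col[I] ** 2 + P_bar[J] / h_col[J] ** 2)
        H_p += coeff * grad_cubic_spline_2d(x[I] - x[J], h_s)
    return H_p
```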

In addition, buoyancy and drag forces in the two-phase and multi-phase flow can also influence the landslide motion. Buoyancy is an upward force exerted by a fluid that opposes the weight of a partially or fully immersed object, and it is a vertical force. Buoyancy can reduce the pressure at the basal surface. Therefore, buoyancy can reduce the resistance, especially in multi-phase flow. Drag forces are shear forces caused by different velocities and accelerations in different phases. In other words, solid particles may accelerate relative to fine solids or fluids [37].

## **3. Software**

With the development of computer technology, GIS has gradually become a mainstream system for engineering design and urban planning. There are many GIS programs, such as GRASS GIS, QGIS, and ArcGIS. GRASS GIS and QGIS are popular in program development because they are free and open source. ArcGIS is a mature commercial program widely applied in urban planning and engineering design. These programs provide rich interfaces, such as raster import, vector import, raster statistics, and vector analysis, making it convenient for users to call these functions and develop their own programs. In addition, they provide a GUI to display the calculated results. More and more landslide simulation codes support GIS. At present, there are two kinds of programs: programs based on the rigid body model and programs based on the flow-like model.

## *3.1. Programs Based on the Rigid Body Model*

On GIS platforms, there are several programs to simulate discrete rigid body motions, as shown in Table 1. The format of the input GIS data needs to be considered, especially the DEM data, and the choice of parameter expressions determines the model. There are three types of DEM representation on GIS platforms: triangulated irregular networks (TINs), grid networks, and vector or contour-based networks. The selection of the DEM also depends on the algorithm; for example, Rockfall Analyst and Rockyfor3D select grid networks. Among these models, lumped mass models were popular on early GIS platforms. With the development of GIS techniques, the hybrid model has become mainstream on GIS platforms; such programs include Hy-STONE, Rockyfor3D, and PICUS Rock'n'Roll.


**Table 1.** Rigid body programs on GIS platforms.

## *3.2. Programs Based on the Flow-like Model*

Based on the description, the grid networks of the DEM are best for flow-like models on GIS platforms, since they make the calculation simple and accurate. At present, there are many codes on GIS platforms, such as r.avaflow, LA, DA, Titan2D, and Massflow (Table 2). As for the model, the depth-averaged theory is employed to obtain the depths and velocities in each cell. The simulation code r.avaflow is popular around the world. The program has a built-in multi-phase flow method based on the depth-averaged theory and is applied in GRASS GIS and R. It is a cross-platform program, which means that it can be used on Windows, Linux, and macOS [35,37,38].


**Table 2.** Flow-like programs on GIS platforms.

For flow-like programs, the numerical scheme is one of the factors affecting results. Most codes use 1st or 2nd order finite difference methods to solve the partial differential equations. In these programs, the numerical diffusion and numerical oscillation are the difficulties for the models in the Eulerian description. Some codes use the TVD method and adaptive mesh refinement (AMR) to improve precision. In addition, some methods use the Lagrangian or Eulerian–Lagrangian description to solve difficulties such as the material point method (MPM) [62], particle-in-cell (PIC) [47], and smoothed particle hydrodynamics (SPH) [63]. In these models, the computational cost of simulations per number of particles may be higher than the cost of grid-based simulations per number of cells. In some cases, these methods solve the numerical solution problem of differential equations to a certain extent.

## **4. Results and Discussion**

Many factors affect the simulation results, such as models, algorithms, and descriptions. In this section, we show cases such as bilateral dam break, a rockfall example in the RA program, and the Yigong landslide to analyze the effect on models, algorithms, and descriptions.

## *4.1. Reason for Differences*

## 4.1.1. Differences Caused by Models

Different models can produce different results due to their assumptions. We used a 2-D bilateral dam break simulation to analyze the applicability of the traditional CFD model (two-phase model), the depth-averaged model, and the SPH model (Figure 3). In these simulations, we set the initial fluid column to a height of 1 m over [0, 1]. Under the action of gravity, the fluid moves downward. In this case, we applied the interFoam solver of OpenFOAM® to calculate the two-phase model [64], PySPH to calculate the SPH model [65], and the Lax–Friedrichs scheme to calculate the depth-averaged model. In OpenFOAM®, we selected the area in the grid with a water content greater than 0.6 to obtain the profile. In the SPH model, we set the particles to have a diameter of 0.03 m.

The results show that the range of movement observed in all models is similar at 0.5 s, spanning [−2, 3]. In the 2-D simulation, different models will have different results in terms of details. The depth-averaged model can obtain a smoother result than the other models. Additionally, the results of the two-phase model and the SPH model include more details in the *z*-directions than the depth-averaged model. Based on the above analysis, the model selection depends on the relevant requirements. When detailed information is required, we must select a complex model to calculate the results. When we focus on the range, we can use the depth-averaged model.

**Figure 3.** Bilateral dam break simulation with different models: (**A**) the initial state of the simulation, (**B**) obtained by OpenFOAM at 0.5 s, (**C**) obtained by PySPH at 0.5 s, and (**D**) obtained by the depth-averaged model at 0.5 s.
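For reference, a minimal 1-D sketch of the depth-averaged dam-break calculation described above, using the Lax–Friedrichs scheme; the domain size, thin wet bed, and periodic boundary handling are illustrative choices, not the exact setup behind Figure 3:

```python
import numpy as np

g, dx, dt = 9.81, 0.01, 0.001
x = np.arange(-3.0, 4.0, dx)
h = np.where((x >= 0.0) & (x <= 1.0), 1.0, 0.01)  # 1 m column on [0, 1]; thin wet bed elsewhere
hu = np.zeros_like(h)                              # momentum h*u, initially at rest

def flux(h, hu):
    """Shallow water fluxes for mass and momentum."""
    u = hu / h
    return np.array([hu, hu * u + 0.5 * g * h**2])

for _ in range(int(0.5 / dt)):                     # advance to t = 0.5 s
    U = np.array([h, hu])
    F = flux(h, hu)
    Up, Um = np.roll(U, -1, axis=1), np.roll(U, 1, axis=1)
    Fp, Fm = np.roll(F, -1, axis=1), np.roll(F, 1, axis=1)
    U = 0.5 * (Up + Um) - 0.5 * (dt / dx) * (Fp - Fm)  # Lax-Friedrichs update
    h, hu = U[0], U[1]
# h now approximates the smooth depth profile of Figure 3D.
```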

## 4.1.2. Differences Caused by Algorithms

Differences in parameters and numerical schemes can affect the runout zones significantly (Figure 4). In this case, we used Rockfall Analyst to analyze the influence of algorithms. The results obtained with ArcGIS 9.x and ArcGIS 10.x differ because of different point extraction algorithms (Figure 4). The small difference in the point extraction of the GIS module affects the runout zone: the rock fell into the river in ArcGIS 9.x, whereas it stopped on the road in ArcGIS 10.x, although the DEM is the same. The results show that a small difference in the algorithm can produce large differences in trajectory.

**Figure 4.** Rockfall simulations using RA: (**A**) in ArcGIS 9.x and (**B**) in ArcGIS 10.x.

In the Eulerian-based model, some numerical schemes handle one property of ADEs well but perform badly on another [17]. Therefore, the balance between numerical diffusion and numerical oscillation is key to obtaining a suitable physical solution. For the model in the Eulerian description, we simulated the uniform linear motion of a block at 1 m/s, with a space interval of 0.1 m and a time interval of 0.01 s. The Lax–Friedrichs scheme, first-order in time and second-order in space, shows numerical diffusion during motion (Figure 5A), whereas the Lax–Wendroff scheme, second-order in both space and time, shows numerical dispersion (Figure 5B). These errors are generally caused by neglecting high-order terms. Higher-order linear schemes (3rd order and above), although more accurate for smooth solutions, are not TVD and tend to introduce spurious oscillations (wiggles) where discontinuities or shocks arise. Various high-resolution schemes use flux/slope limiters to maintain the TVD property, thereby reducing the impact of numerical dissipation and numerical oscillation [66,67]. In these methods, the accuracy is high in smooth areas, and the flux/slope limiter is applied in shock areas to avoid producing nonphysical solutions.

**Figure 5.** Numerical diffusion and numerical oscillation: (**A**) numerical diffusion obtained by the Lax–Friedrichs scheme, and (**B**) numerical oscillation obtained by the Lax–Wendroff scheme.
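The diffusion/oscillation contrast in Figure 5 can be reproduced in a few lines. A sketch of the 1 m/s block-advection test with the stated Δx = 0.1 m and Δt = 0.01 s (domain length and block placement are illustrative):

```python
import numpy as np

c, dx, dt, steps = 1.0, 0.1, 0.01, 500           # 1 m/s block, advanced for 5 s
x = np.arange(0.0, 20.0, dx)
u0 = np.where((x > 2.0) & (x < 4.0), 1.0, 0.0)   # square "block" profile
nu = c * dt / dx                                 # Courant number = 0.1

def lax_friedrichs(u, n):
    for _ in range(n):
        up, um = np.roll(u, -1), np.roll(u, 1)
        u = 0.5 * (up + um) - 0.5 * nu * (up - um)   # averaging smears the edges
    return u

def lax_wendroff(u, n):
    for _ in range(n):
        up, um = np.roll(u, -1), np.roll(u, 1)
        u = u - 0.5 * nu * (up - um) + 0.5 * nu**2 * (up - 2 * u + um)  # 2nd order
    return u

print("LF peak:", lax_friedrichs(u0, steps).max())  # < 1: diffusion, no wiggles (Figure 5A)
print("LW peak:", lax_wendroff(u0, steps).max())    # > 1: dispersive wiggles (Figure 5B)
```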

## 4.1.3. Differences Caused by Description

In this section, a 3-D sliding block is used to analyze the influence of different descriptions. In this case, we assumed the acceleration in the *x*- and *y*-directions is 5 m/s<sup>2</sup> and that the grid interval in the *x*- and *y*-directions is 10 m. The center of the block is initially located at (200, 200) (Figure 6A). In this motion process, the deformation of the block is zero, and the block can be considered a rigid body. After 20 s, the block moves to (1200, 1200) based on Newton's second law. We selected the lumped mass model in the Lagrangian description and the depth-averaged model in the Eulerian description to simulate the same motion.

**Figure 6.** Simulations using different numerical schemes: (**A**) the original location, (**B**) the result in the Lagrangian description, and (**C**) the result in the Eulerian description.

In the Lagrangian description, the lumped mass model is simple (Figure 6B) because the block is assumed to be a mass point in the model. Based on the lumped mass model, we can calculate the location, and the result is close to the observed one. In the Eulerian description, we used the depth-averaged theory and the McCormack–TVD scheme. After 20 s, the block also moves to (1200, 1200) (Figure 6C). However, numerical diffusion (the green area) is notable at the boundary of the block, even though the TVD method is applied to reduce diffusion (Figure 6C). Additionally, the calculation efficiency is far lower than that of the method based on the Lagrangian description when reaching the same precision. This case shows that proper model assumptions are a prerequisite for good results. Suitable assumptions can reduce some errors in the calculation process. The lumped mass model is simple and obtains results quickly and accurately. Therefore, the lumped mass model is more suitable for this motion.

In the 3-D landslide simulation, we selected the Yigong landslide to evaluate the influence of different numerical schemes. The Yigong landslide happened at the head of the Zhamulong gully (30.178° N, 94.940° E) on 9 April 2000 [68] and blocked the Yigong River at the foot of the slope with 3 × 10<sup>8</sup> m<sup>3</sup> of sediment that formed a 60 m high dam [69]. Aiming at the characteristics of the long-distance, high-speed movement of the Yigong landslide, researchers have used many numerical models to investigate it.

We selected different descriptions to simulate this process. Based on previous studies [70,71], the Voellmy model can obtain a suitable result [72]. In this case, the basal friction angle is 12°, the internal friction angle is 13°, and the turbulence coefficient is assumed to be 1000 s<sup>2</sup>/m (Table 3).

**Table 3.** Mechanical parameters of the Yigong landslide.

| Parameter | Value |
| --- | --- |
| Basal friction angle (°) | 12 |
| Internal friction angle (°) | 13 |
| Turbulent coefficient (s<sup>2</sup>/m) | 1000 |


For the numerical schemes, we selected the NOC–TVD scheme in the Eulerian description, which is applied in r.avaflow, and the depth-averaged SPH method in the Lagrangian description to analyze the differences. The states at 50 s, 100 s, and 200 s using the two methods are given in Figure 7. From the simulation, both methods obtain similar results. The Yigong landslide reached the foot of the mountain at 200 s. In detail, the process calculated in the Eulerian description is smoother than that in the Lagrangian description. The material in the Lagrangian description is concentrated in the channel, and the maximum height is higher than in the Eulerian description. As for the maximum speed, the NOC–TVD scheme gives 90.23 m/s, and the depth-averaged SPH gives 97.20 m/s. The precision of the SPH model is related to the number of particles or columns: the more particles or columns, the finer the description and the closer to the analytical solution of the fluid equations, whereas fewer particles or columns result in lower interaction forces between columns or particles. Additionally, the computational cost of SPH simulations per number of particles or columns is significantly larger than the cost of grid-based simulations for flow-like motion.

**Figure 7.** Yigong landslide simulation using different descriptions: (**A**) at 50 s using the Eulerian method; (**B**) at 100 s using the Eulerian method; (**C**) at 200 s using the Eulerian method; (**D**) at 50 s using depth-averaged SPH; (**E**) at 100 s using depth-averaged SPH; (**F**) at 200 s using depth-averaged SPH.

Based on the above analysis, the Eulerian description is more suitable for flow-like motion, but the Lagrangian description is more suitable for discrete rigid body motion. The description is one of the factors influencing the result.

## *4.2. Model Selection*

Based on Varnes' classification [27], the materials include rock, debris (coarse soil), and earth (fine soil). Rock is a solid mass of geological material, debris is scattered material (large rock fragments), and earth is a cohesive, plastic, clayey soil. These terms are neither geological nor geotechnical [32,34,35,73,74] but are related to the size, shape, quantity, and properties of the material, which help us select a suitable description. A rock is a solid mass, an aggregate of minerals, that can be considered a discontinuous rigid body in landslide dynamics. "Earth" is neither a geological term nor a geotechnical term; it describes construction material or agricultural soil [75] and is defined as a material in which at least 80% of particles are smaller than 2 mm [27]. Debris is a mixture of large and small blocks of rock, and debris motion involves multi-phase flow (20% to 80% of particles >2 mm). The properties of debris encompass the characteristics of both rigid bodies and fluids. Particle size is one of the critical factors to consider in the description of motion. When the material volume is small and the quantity of material is large, the Eulerian description, including the depth-averaged model, is generally suitable for describing these motions. The Lagrangian description is suitable for large-volume, small-quantity discrete rigid body movement. The Eulerian–Lagrangian method may be suitable for rock and soil aggregate movement.

In addition, the motion types include falling, sliding, spreading, and flowing. In falling and toppling, collisions and shearing are the major contact processes, and shearing is the main contact process in sliding, flowing, and spreading. This indicates that the movement type determines the force model in a given situation. Collision and friction are critical forces to change the movement state for falling and toppling. However, dry friction and viscosity need to be considered in sliding, spreading, and flowing. Therefore, landslide classification can guide the selection of models of landslide dynamics (Table 4).

**Table 4.** Relationship between models and types.

| Motion | Rock | Earth | Debris |
| --- | --- | --- | --- |
| Falling and toppling | Force: collision, friction. Motion: Lagrangian | Force: collision, friction. Motion: Eulerian | Force: collision, friction. Motion: Eulerian/Lagrangian |
| Sliding, spreading, and flowing | Force: dry friction and viscosity. Motion: Lagrangian or Eulerian | Force: dry friction and viscosity. Motion: Eulerian | Force: dry friction and viscosity. Motion: Eulerian/Lagrangian |

During the movement, the movement forms transform into each other. Fragmentation is a key process in rock movement and influences the whole process; it can change the movement form from falling to flowing [76]. As the amount of debris increases due to fragmentation, the interactions within the debris become increasingly complex. From an energy perspective, fragmentation results in energy dissipation [62] and drag reduction. After fragmentation, small debris has a lubricating effect, and large debris can be transported over a long distance. Therefore, fragmentation in long-runout landslides is very complicated [77]. The movement process gradually transforms from falling and bouncing to sliding and flowing, so a single-phase model is not suitable for these processes. We should build an episode-based multi-phase model on GIS platforms to describe the different states: a rigid body model in the initial stage and a flow-like model after fragmentation. Therefore, we may use various model forms to obtain more accurate results in landslide simulation.

## **5. Conclusions**

Based on the above analysis, we can draw the following conclusions:


**Author Contributions:** Conceptualization, H.L. and Y.W.; methodology, Y.W.; software, Y.W. and A.T.; validation, H.L. and Y.W.; formal analysis, Y.W.; writing—original draft preparation, Y.W.; writing—review and editing, Y.W.; visualization, Y.W.; supervision, Y.W.; funding acquisition, H.L. and Y.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS) (Grant No. XDA23090301), National Natural Science Foundation of China (Grant No. 41941019, 42041006), and the Second Tibetan Plateau Scientific Expedition and Research (STEP) program (Grant No. 2019QZKK0904).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## **The Effects of Rainfall, Soil Type and Slope on the Processes and Mechanisms of Rainfall-Induced Shallow Landslides**

**Yan Liu , Zhiyuan Deng and Xiekang Wang \***

State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University, Chengdu 610065, China; liuyan2021scu@gmail.com (Y.L.); dengzhiyuan2019@163.com (Z.D.) **\*** Correspondence: wangxiekang@scu.edu.cn

**Abstract:** Landslides are a serious geohazard worldwide, causing many casualties and considerable economic losses every year. Rainfall-induced shallow landslides commonly occur in mountainous regions. Many factors affect an area's susceptibility, such as rainfall, the soil, and the slope. In this paper, the effects of rainfall intensity, rainfall pattern, slope gradient, and soil type on landslide susceptibility are studied. Variables including soil volumetric water content, matrix suction, pore water pressure, and the total stress throughout the rainfall were measured. The results show that, under the experimental conditions of this paper, no landslides occurred on a 5◦ slope. On a 15◦ slope, when the rainfall intensity was equal to or less than 80 mm/h with a 1 h duration, landslides also did not happen. With a rainfall intensity of 120 mm/h, the rainfall pattern in which the intensity gradually diminishes could not induce landslides. Compared with fine soils, coarser soils with gravels were found to be prone to landslides. As the volumetric water content rose, the matrix suction declined from the time that the level of infiltration reached the position of the matrix. The pore water pressure and the total stress both changed drastically either immediately before or after the landslide. In addition, the sediment yield depended on the above factors. Steeper slopes, stronger rainfall, and coarser soils were all found to increase the amount of sediment yield.

**Keywords:** landslides; artificial rainfall; grain size; rainfall pattern; pore water pressure

## **1. Introduction**

Landslides refer to the geological phenomenon of rock and soil mass sliding along a slope; they are a type of natural hazard that is widely distributed throughout the world [1,2]. Landslides cause tens of billions of dollars of economic losses and serious casualties worldwide every year [3]. In areas with complex geological conditions, landslides occur more frequently [4–7].

In mountain regions, landslides are often triggered by intensive rainfall [8,9]. Rainfall and infiltration enhance moisture content, which further decreases the matrix suction and soil shear strength [10,11]. Rainfall characteristics are often used as criteria for landslide occurrence [12,13]. The rainfall's intensity and pattern influence the characteristics of landslides [14–16]. To date, there have been many studies on the effects of rainfall on landslides. Research methods include in situ experiments, laboratory experiments, and numerical simulations [17,18]. The artificial rainfall test is an effective method when used to study rainfall-induced landslides [19].

Slopes with different soil compositions respond differently to rainfall. Fine particle migration leads to pore blockage [20], and the soil composition of slopes is closely related to their landslide susceptibility. Research has shown that a prerequisite for landslides to occur is that the clay percentage of the soil is higher than 2.5% [21]. The soils in regions where earthquakes happen often contain many coarse particles, such as gravel. The proportion of gravel in soils has a great influence on the density and void ratio, which determines the timing and type of landslides [22]. The failure mode is closely related to the grain size [23].


The type of landslide is usually a gully failure when small gravel content is present and usually a "layer-by-layer sliding" failure when large gravel content is present [24].

Rainfall and infiltration change the soil characteristics greatly [25]. Some physical qualities of the soil change during rainfall, such as the water content, matrix suction, pore water pressure, and total stress [26,27]. The volumetric water content is the ratio of the volume of water to the unit volume of soil, and it increases during rainfall. Matrix suction is a sensitive parameter when unsaturated soils encounter rainfall, and soils with varying levels of rainfall infiltration are affected differently [28,29]. The pore water pressure increases during rainfall and reduces the soil's shear strength. Slopes with greater inclination have larger pore water pressure [30]. Total stress is the basis of the stability analyses that are used for calculating the factor of safety [31]. However, all of the above studies measured only one or several of the physical quantities of the soil and lacked a comprehensive reflection of the changes in soil properties during landslides.

The landslide is a typical example of gravity-based erosion. The sediment yield of landslides varies with a number of factors [32,33]. The time scale of the impact of a landslide on the sediment yield in a basin is large; to comprehensively analyze sediment transport, it is necessary to investigate the landslide history of the basin for at least the preceding 100 years [34]. The landslide sediment yield of soil affected by an earthquake under rainfall has been studied and quantified [35]. One model expressing the contribution of shallow landslides to sediment yield as a function of rainfall characteristics has been established [36]. Another model, called SHETRAN, was established in order to analyze the impact of rainfall on landslides and sediment transport. It was applied to the Valsassina Basin, a wide glaciated valley with a U-shaped profile, where the superficial deposits found on the valley's slopes consist of calcareous-dolomitic chaotic material with loose and sharp-edged fragments [37–39]. The effects of rainfall on sediment yield are obvious, and the sediment yield increases with an increase in the gravel percentage [40,41].

Therefore, the purpose of this paper is to make clear how each factor influences landslide occurrence. A series of physical model tests using an artificial rainfall system was performed in order to investigate the process and mechanics of slope failure. The rainfall intensity, rainfall pattern, soil type, and slope gradient each play a unique role in slope stability. In the process, the main physical parameters of the soil, including volumetric water content, matrix suction, pore water pressure, and total stress, were all measured. Based on the detailed measured data, we were able to derive the relationship between these physical quantities and the timing of landslides. Moreover, the sediment yield of the landslides was quantified, and the dependence of the yield on the test variables was analyzed. The present experimental results contribute to improving the understanding of landslide mechanisms and mitigating landslide disasters.

## **2. Materials and Methods**

## *2.1. Experimental System*

The artificial rainfall test site that was used in this study is located in the State Key Laboratory of Hydraulics and Mountain River Engineering of Sichuan University. The system is composed of a reservoir, water pump, water delivery pipe, rain gauge, electromagnetic flowmeter, nozzle, and valve (Figure 1). It is controlled by computer software and can be self-adjusted automatically. When the automatic adjustment mode is turned on, the water pressure and valve opening can be adjusted automatically in order to hold or change the rainfall intensity. A water content sensor, tensiometer, earth pressure cell, and pore water pressure sensor were used in the test (Figure 2c). The flume that was used for the test was made of impermeable transparent polymer plastic material, with a length of 2 m, a width of 0.3 m, and a height of 0.8 m. The flume was placed horizontally. The three slope angles that were set in this research were 5°, 15°, and 30°. The flume, soil, and instruments that were used are shown in Figure 2.

**Figure 1.** Sketch of experimental setup.

**Figure 2.** Tested samples and experimental apparatus: (**a**) side view of the slope; (**b**) soil samples; (**c**) measuring apparatuses.

## *2.2. Experimental Program*

The experiment was set up with four variables: rainfall intensity, rainfall pattern, soil type, and slope gradient (Table 1). Each variable was changed only while the other variables were kept constant, in order to study the relationship between that variable and landslide susceptibility. Four values of 40 mm/h, 80 mm/h, 120 mm/h, and 160 mm/h were taken for the rainfall intensity variable when the rainfall pattern was uniform (tests no. 1–4). The rainfall pattern variable was set to I, II, III, and IV (tests no. 5–7 and 3); these rainfall intensity change processes are shown in Figure 3.

In tests no. 8–10 and 3, four soil types were used. Three of the soil types were mainly composed of silt, sand, and gravel, respectively, and the mixed type was a 1:1:1 mixture of the aforementioned three. The soil compositions were set according to the common soil types that are seen in Min Jiang River basin in southwest China. The soil that was used for the tests was collected in nature through systemic screening and mixed. The soil compositions are listed in Table 2 and the grain size distribution curves of the soil types are shown in Figure 4.


**Table 1.** Test variables.

**Figure 3.** Rainfall patterns: (**a**) Pattern I; (**b**) Pattern II; (**c**) Pattern III; (**d**) Pattern IV.



**Figure 4.** Grain size distribution curves.

## *2.3. Test Procedures*

Before the test began, the soil was filled into the flume, with each layer being 10 cm deep. The soil was paved to the same thickness in each layer and then compacted evenly with wooden blocks. Measuring instruments were embedded in the positions shown in Figure 1. After the preparation was completed, the test started with the commencement of the rainfall. Each rainfall lasted 1 h. At the end of a test, the amount of sediment yielded by the landslide was measured.

## **3. Results**

## *3.1. Slope Instability Processes under Different Rainfall Intensities*

When a slope encounters rainfall of different intensities, its water content, matrix suction, pore water pressure, and total stress exhibit different change processes. Water content is a basic parameter used to describe soil properties. The volumetric water content in the tests with higher rainfall intensity was found to rise earlier, and the matrix suction declined earlier, too. When the rainfall intensity was 160 mm/h, the volumetric water content reached its maximum at about 40 min; when the rainfall intensity was 120 mm/h, it reached its maximum at about 55 min. The water gradually penetrated downward from the soil's surface, so the volumetric water content deeper within the soil was found to increase later than that near the surface.

Matrix suction is an important parameter of the mechanical properties of unsaturated soils. The pores of unsaturated soil are filled with water and air. The water–air interface has surface tension. In unsaturated soil, through capillary action, the pore water pressure under the bent liquid surface is less than the pore air pressure. The shrinkage membrane is subjected to air pressure greater than the water pressure, and this pressure difference is called matrix suction. The processes of matrix suction that were observed at the different depths under the different rainfall intensities are shown in Figure 5b. It can be seen from the figure that the change curve of matrix suction was divided into three stages, namely the initial stage, steep fall stage, and stable stage. Taking, as an example, the matrix suction change process that was observed at position I when the rainfall intensity is 160 mm/h, it was noted that within 22 min of the beginning of rainfall, the change in the soil matrix suction was not obvious. At 22–42 min, the matrix suction of the slope soil decreased abruptly. As the infiltration of the rainfall continued to increase, the soil matrix suction decreased to the minimum and became stable.

Pore water pressure is the pressure of the groundwater that is present in soil or rock, which acts between particles or pores and is an important indicator of stress changes in the soil. The variation of pore water pressure that was observed at the different slope locations under the different rainfall intensities is shown in Figure 5c. The pore water pressure variation curves during the test were observed in three stages: the initial stage, surging stage, and slowly increasing stage. It was specifically noted that a greater intensity of rainfall led to a shorter duration of the initial stage. For example, the initial stage of pore water pressure at position I lasted about 8 min when the rainfall intensity was 160 mm/h, and about 24 min when the rainfall intensity was 40 mm/h.

The total stress is the total force per unit area that is acting within a mass of soil. It increases with the greater depth of the measurement point. Different rainfall intensities lead to different variation processes of total stress. When the rainfall intensity was small, the total stress started to increase a short time after the rainfall began and the increase process was relatively smooth. When the rainfall intensity was higher, the total stress started to increase at the beginning of the rainfall.

The intensity of the rainfall is closely related to the occurrence of a landslide. When the intensity of rainfall was 40 mm/h and 80 mm/h, no landslide occurred. For the case of a rainfall intensity of 120 mm/h, the time of the initial landslide occurrence was about 47 min. For the case of a rainfall intensity of 160 mm/h, the time of the initial landslide occurrence was about 40 min.

**Figure 5.** Variation in measured data when rainfall intensity is different: (**a**) water content; (**b**) matrix suction; (**c**) pore water pressure; (**d**) total stress.

## *3.2. Slope Instability Processes under Different Rainfall Patterns*

The changes in the water content, matrix suction, pore water pressure, and total stress observed under the different rainfall patterns are shown in Figure 6. The volumetric water content measured at the measurement points did not change for a period of time after the onset of rainfall. The volumetric water content of rainfall pattern I started to increase at the earliest time point; later, the volumetric water contents of rainfall patterns IV and III started to increase, and that of rainfall pattern II started to increase last. The rate of increase in the volumetric water content in the rainfall pattern I test decreased gradually, the rate of increase in the rainfall pattern IV test remained constant, and the rates of increase in the rainfall pattern II and III tests gradually increased.

The matrix suction of the tests with rainfall patterns I and IV began to diminish earlier than that of patterns II and III. The matrix suction curves of the tests with rainfall patterns I and IV began to enter the attenuation stage at about 20 min, while those of patterns II and III began to enter the attenuation stage at about 35 min. The rainfall of patterns I and IV was relatively large in the initial stage, and the attenuation processes of these tests were similar. In the stable stage, the matrix suction of the tests with patterns I and IV was the smallest, and those of patterns II and III were relatively large.

In the changing process of pore water pressure, the slow-changing stages of the rainfall pattern I and IV tests lasted the shortest time; the rate of the curve was very large at the beginning and, after the surge stage, tended to be smooth. The slow change phase of rainfall patterns II and III lasted longer, and the pore water pressure increased slowly throughout the test process. As the rainfall continued, the rainfall intensity of patterns II and III gradually increased, and the pore water pressure changed into the surge stage. The pore water pressure of rainfall patterns I, III, and IV increased slightly in the stable stage.

**Figure 6.** Variation in measured data when rainfall patterns are different: (**a**) water content; (**b**) matrix suction; (**c**) pore water pressure; (**d**) total stress.

Different rainfall patterns cause different response processes of total stress. The soil response under rainfall pattern I was rapid: the total stress began to rise at 5 min, and the growth rate slowed down at about 20 min. The total stress of the tests with rainfall patterns II and III began to increase close to the 20 min mark; compared with rainfall pattern I, these patterns displayed a lag. The total stress of the tests with rainfall patterns II and IV suddenly decreased in the later rainfall stage, due to the landslide.

The occurrence of landslides varied under the different rainfall patterns: no landslides occurred in rainfall pattern I, while landslides did occur in rainfall patterns II, III, and IV. The first landslide of the test with rainfall pattern II occurred at about 42 min. The first landslide of the test with rainfall pattern III occurred at about 45 min. The first landslide of the test with rainfall pattern IV occurred at about 47 min.

## *3.3. Slope Instability Processes with Different Soil Types*

Figure 7 shows the variation of water content, matrix suction, pore water pressure, and total stress of the slopes with different soil compositions. For the same rainfall intensity, there was no significant difference in the time at which the volumetric water content started to rise for the different soils. The volumetric water content for all of them started to increase at about 25 min. However, there were large differences in the final volumetric water content of the different soil types. The upper limit of the volumetric water content was the highest for the silty soil, at about 36%. The upper limit of the volumetric water content for the mixed soil was about 30%. The sandy soil had an upper volumetric water content of about 27%. The gravel soil had the lowest upper limit of volumetric water content, which was about 18%. The volumetric water content of the silty and mixed soils rose very quickly and reached their maximum water content at about 35 min. The volumetric water content of the sandy soil rose over a longer period of time and the rate gradually decreased. The volumetric water content of the gravel soil plateaued after a small increase.

**Figure 7.** Variation in measured data when soil types are different: (**a**) water content; (**b**) matrix suction; (**c**) pore water pressure; (**d**) total stress.

The initial matrix suction of the different soils showed great differences. The initial matrix suction of the fine-grained soils was higher than that of the coarse-grained soils. The initial matrix suction of the silty soil was about 78 kPa, that of the sandy soil was about 52 kPa, that of the gravel soil was only about 14 kPa, and that of the mixed soil was about 48 kPa. The changing processes of matrix suction were also different. The matrix suction of the silty soil at location I entered the diminished stage at about 28 min, decreased sharply in the time period of 30–40 min, and finally stabilized at about 20 kPa. The matrix suction of the sandy soil and mixed soil decreased at about 25 min, and the matrix suction of the sandy soil reached the stable stage at 35 min with a value of 18 kPa. The matrix suction of the mixed soil continued to decline and finally decreased to about 8 kPa. The matrix suction of the gravel soil was very small and did not change significantly in the first 30 min of rainfall. It entered the fluctuating stage at 30 min.

The soil type had a strong influence on the changing process of pore water pressure. The increasing process of pore water pressure was longer for the mixed soil. The peak occurred between 30 and 40 min and it reached about 4 kPa. The pore water pressure changes that were observed in the silty, sandy, and gravel soils can be divided into two stages: a rapid-increase stage and a stable stage. The increasing stage was seen in approximately the first 20 min of the test. When the increasing stage was over, the pore water pressure of the silty soil was greater than that of the sandy soil, and that of the sandy soil was greater than that of the gravel soil.

The different soil types had different total stress values. At position I, the total stress was highest in the silty soil. It increased slowly during the time period from 10 to 40 min and remained constant at about 10 kPa during the time period from 40 to 60 min. The total stress of the sandy soil varied little in the first 20 min, increased and fluctuated in the time period from 20 to 50 min, and decreased abruptly at about 50 min due to the landslide. The total stress of the gravel soil was the smallest; it increased only slightly during the rainfall and then stabilized around 4 kPa. The total stress variation pattern of the mixed soil was similar to that of the sandy soil.

Soil type is a key factor for determining the stability of slopes. Under the fixed rainfall process and slope conditions that were set in these tests, no landslide occurred when the soil type was silty or sandy. Landslides were generated when the soil type was gravelly or mixed. The initial landslide occurrence time was about 39 min for the gravel soil and about 47 min for the mixed soil.

## *3.4. Slope Instability Processes on Differently Angled Slopes*

Figure 8 shows the variation processes of the measured data when the slope gradients were different. The different slope gradients had little influence on the variation of the volumetric water content; the variation in each test was generally similar. The rate of rise on the 30° slope was less than that on the 5° and 15° slopes.

The changing process of the matrix suction of the different gradient slopes was generally similar. For example, in a test with a 30° slope at position I it was observed that, in the first 25 min of rainfall, the change in the matrix suction was not obvious. At 26–46 min, the matrix suction at this location decreased abruptly. As the infiltration of the rainfall continued to increase, the soil matrix suction dropped to the minimum level and then stabilized. Since the precipitation on the steep slope was largely converted into surface runoff during the rainfall-infiltration process, the infiltration volume was smaller than that of the gentle slope at the same time. The matrix suction started to decrease earlier for the tests on the gentle slope than those on the steep slope.

The increasing and stable stages of pore water pressure on the slopes with different gradients had large differences. Taking location I as an example, when the slope was gentle (5°), the rising curve of the pore water pressure resembled a convex function. That is, the function slope was larger at the beginning, and the period from 5–12 min accommodated the concentrated rising section of the pore water pressure process. When the slope was 15°, the growth of the pore water pressure was approximately linear. When the slope was 30°, the curve of the pore water pressure increase was similar to a concave function, which decreased slightly in the time period from 0–15 min and increased rapidly in the time period from 20–25 min. In the stable stage, the pore water pressure on the 15° slope was the largest, around 3.2 kPa. The pore water pressure on the 30° slope was about 2.5 kPa and the pore water pressure on the 5° slope was stable at about 2.8 kPa.

The slope gradient had a significant effect on the changing process of the total stress. For the case of the 5° slope, the total stress at position I and position II increased linearly throughout the test. For the case of the 15° slope, the total stress at position II changed very little in the first 25 min, increased significantly in the time period from 25 to 45 min, and suddenly decreased at about 47 min due to a landslide. For the case of the 30° slope, the total stress changed little in the first 40 min at position II, increased continuously from 40 min, and decreased suddenly at around 50 min due to the landslide.


**Figure 8.** Variation in measured data when slope gradients are different: (**a**) water content; (**b**) matrix suction; (**c**) pore water pressure; (**d**) total stress.

The gradient of the slope was a decisive factor for the occurrence of landslides. No landslide occurred on the 5° slope. Landslides occurred on both the 15° and 30° slopes. The initial landslide occurred at about 47 min on the 15° slope and about 33 min on the 30° slope.


## **4. Discussion**

A series of physical model tests of rainfall-induced shallow slides was carried out and is reported in this paper. The soil samples were made according to the natural soil in the Min Jiang River basin in southwestern China. The different soil types have unique characteristics. Silty soils are usually well-aggregated, but the aggregates break down rapidly when wetted, allowing non-aggregated soil particles to be easily transported [42]. For sandy soil, there is a clear linkage between landslides and sediment yield [43]. Seismic activity generates a large amount of gravel soils and makes a region susceptible to geohazards [22]. Therefore, the impact of the typical soil constitution on the landslides and sediment yield is analyzed in this paper. The sediment yield of the landslides that resulted from tests with different rainfall intensities, rainfall patterns, slope gradients, and soil particle compositions is shown in Table 3. All of the soil that slid down came from within 10 cm of the surface layer of the original mass, so the mass of the soil that slid down by landslide, as a percentage of the total mass of the upper 10 cm layer of the original slope, was used to measure the severity of the landslide.
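Stated as a formula (our restatement of the definition above; the symbol names are ours, not the paper's):

$$
\text{sediment yield} = \frac{m\_{\text{slide}}}{m\_{10\,\text{cm}}} \times 100\%
$$

where *m*<sub>slide</sub> is the mass of soil that slid down and *m*<sub>10 cm</sub> is the total mass of the upper 10 cm layer of the original slope.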

Some research on rainfall-induced shallow landslides has been conducted. A comprehensive physics-based Integrated Hydrology Model was set up, which is applicable to different rainfall characteristics [44]. In this paper, more factors are studied in order to investigate the mechanisms of rainfall-induced shallow landslides. The impact of the water content and pore water pressure was analyzed by monitoring a natural slope [45]. However, the variables of that study are incomplete because matrix suction and total stress are lacking. A distributed one-dimensional modeling approach for predicting shallow rainfall-induced landslides was established [46], but actual landslides are complex. In this paper, more factors have been considered. The influence of soil depth on the occurrence of shallow landslides has been previously investigated [43], but evidence of how the rainfall and soil conditions affect a landslide's occurrence is lacking.

The occurrence of landslides and their sediment yield are related to many factors. To investigate the effect of each single factor in the following comparisons, the other factors were kept the same. Under a rainfall intensity of 120 mm/h (Test 3), the first landslide occurred at about 47 min. Under a rainfall intensity of 160 mm/h (Test 4), the first landslide occurred at about 40 min. This suggests that the higher the intensity of rainfall, the earlier the initial landslide occurs. The sediment yield of Test 4 was also larger than that of Test 3. When the rainfall intensity was 80 mm/h, there was no landslide. This indicates that the occurrence of landslides under the test conditions requires the rainfall intensity to exceed 80 mm/h. The rainfall pattern also affected the occurrence of landslides and the first landslide's time of occurrence. In the tests with rainfall patterns II, III, and IV (Tests 6, 7, and 3), the first landslides occurred at 42 min, 45 min, and 47 min, respectively. In the test with rainfall pattern I, no landslide occurred. For the different soil types, the initial landslide occurrence times of the gravel soil and the mixed soil were 39 min and 47 min, and the sediment yields were 9.8% and 11.2%, respectively. No landslide occurred in the silty soil or the sandy soil. Soils composed of coarse grains were shown to be prone to landslides. The landslide on the 30° slope (Test 12) occurred earlier, at 33 min, than those in all of the tests on the 15° slope. The sediment yield was 24.2%, which was also higher than the yields on the 15° slope.

## **5. Conclusions**

Landslides are a gravity-driven mass movement and induce an increase in sediment yield in the watershed. In this paper, a series of artificial rainfall tests was conducted in order to investigate rainfall-induced shallow landslides. Four impact factors (rainfall intensity, rainfall pattern, soil type, and slope gradient) were studied. The changing processes of volumetric water content, matrix suction, pore water pressure, and total stress during rainfall were analyzed. The conclusions are as follows.

The occurrence of rainfall-induced shallow landslides was related to the intensity and pattern of the rainfall, slope gradient, and soil composition. Landslides were triggered by rainfall of a certain intensity and, according to the results obtained from the present tests, the rainfall intensity must exceed 40 mm/h in order to trigger a landslide. The higher the rainfall intensity, the earlier the landslide occurs. The rainfall pattern also influenced landslide generation.

The variables had unique processes of change during the rainfall that occurred under the different combinations of impact factors. The process of matrix suction consisted of an initial stage, steep decline stage, and stable stage. With greater intensity of rainfall, the matrix suction diminished earlier. The pore water pressure continued to rise through the rainfall and it started to rise earlier when the rainfall intensity was greater. In the tests with different rainfall patterns, matrix suction began to diminish later when the rainfall intensity peaked in the middle or late stage (patterns II and III). On the steep slope, soil water content rose and matrix suction diminished later than on the gentler slopes.

When rainfall-induced shallow landslides occurred, there was a corresponding significant change in the physical parameters of the slope. The landslides occurred after the matrix suction entered the diminished stage. The landslide occurrence times differed with soil composition: the gravelly soil failed earlier than the mixed soil, while no landslides occurred in the silty and sandy soils. The pore water pressure rose briefly before the landslide occurred and fell back down after the landslide occurred. As the landslides caused rapid local soil movement, the total stress changed rapidly during the landslides and, in most cases, decreased markedly.

The sediment yield from the landslides was influenced by various factors. When the intensity of the rainfall increased, the sediment yield increased. The sediment yield of the landslides with a coarser particle composition was greater than that of the finer soils. The sediment yield of the 30° slope was significantly higher than that of the 15° slope.

**Author Contributions:** Conceptualization, Y.L. and X.W.; methodology, Y.L. and X.W.; software, Y.L. and Z.D.; validation, Z.D.; investigation, Y.L. and Z.D.; data curation, Y.L. and Z.D.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L. and X.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the opening fund from the State Key Laboratory of Hydraulics and Mountain River Engineering, Sichuan University (grant nos. SKHL1804 and SKHL2009).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## **Geometry-Based Preliminary Quantification of Landslide-Induced Impulse Wave Attenuation in Mountain Lakes**

**Andrea Franco 1,\*, Barbara Schneider-Muntau <sup>2</sup> , Nicholas J. Roberts 3,4 , John J. Clague <sup>3</sup> and Bernhard Gems <sup>1</sup>**


**Abstract:** In this work, a simple methodology for preliminarily assessing the magnitude of potential landslide-induced impulse waves' attenuation in mountain lakes is presented. A set of metrics is used to define the geometries of theoretical mountain lakes of different sizes and shapes and to simulate impulse waves in them using the hydrodynamic software Flow-3D. The modeling results provide the 'wave decay potential', a ratio between the maximum wave amplitude and the flow depth at the shoreline. Wave decay potential is highly correlated with what is defined as the 'shape product', a metric that represents lake geometry. The relation between these two parameters can be used to evaluate wave dissipation in a natural lake given its geometric properties, and thus estimate expected flow depth at the shoreline. This novel approach is tested by applying it to a real-world event, the 2007 landslide-generated wave in Chehalis Lake (Canada), where the results match well with those obtained using the empirical equation provided by ETH Zurich (2019 Edition). This work represents the initial stage in the development of this method, and it encourages additional research and modeling in which the influence of the impacting characteristics on the resulting waves and flow depths is investigated.

**Keywords:** landslide-induced wave; lake-tsunami assessment; mountain lakes

## **1. Introduction**

Landslide-induced impulse waves in lakes are gaining interest in the scientific community due to the hazards they pose to people living or recreating along their shorelines and to dams and other infrastructure [1]. Additionally, climate change is driving rapid geomorphological changes in high mountains, which may increase the likelihood of landslides into mountain lakes. Rapid thinning and retreat of glaciers and an increase in heavy rainfall events can destabilize slopes adjacent to water bodies [2–5].

In December 2007, a 3-Mm<sup>3</sup> rock avalanche entered Chehalis Lake (Canada), generating an impulse wave that destroyed forest and campgrounds along the shoreline and achieved a maximum run-up of 37.8 m above average lake level [6,7]. In July 2014, a 10-Mm<sup>3</sup> rockslide collapsed into Askja Lake (Iceland) from its rimming caldera and generated a wave that propagated >3 km across the lake with localized run-ups of up to 60–80 m [8]. In October 2015, a 50-Mm<sup>3</sup> landslide collapsed into Taan Fiord (Alaska), inducing one of the largest landslide-induced waves ever recorded, with a maximum run-up of 193 m on the slope facing the landslide; the wave traveled >17 km down the fiord, devastating forests and eroding soil and sediments along its path [5,9]. In June 2017, an impulse wave generated by a 50-Mm<sup>3</sup> subaerial rockslide into Karrat Fiord on Greenland's west coast killed four people and destroyed 11 buildings in the village of Nuugaatsiaq, 32 km from its source, and flooded other settlements along the coast [10]. These events are just a small subset of all the subaerial landslides known to have generated impulse waves around the world [1].

Different methods, including scaled physical tests, field investigations, and numerical models are used to investigate aspects of this phenomenon, including landslide behavior, landslide–water interaction, and wave formation, propagation, and inundation. The effect of basin geometry on wave propagation has been experimentally and numerically investigated [11–13]. Researchers have also analyzed wave dispersion and related wave decay. Ruffini et al. [14] state that wave decay results from (i) frequency dispersion, (ii) bottom friction, (iii) lateral spreading of the water, and (iv) breaking of waves during generation and propagation. According to those authors, an increase in the lateral angle of the basin leads to a decay of solitary waves. The effect of lateral energy spread on wave amplitude is larger in a 3D basin-type geometry than the effect of frequency dispersion in a 2D flume-type geometry.

To virtually reproduce wave dynamics generated by an impact, researchers have used several different numerical approaches, notably non-linear shallow water equations (NSWE), Reynolds-averaged Navier–Stokes equations (RANS), and smoothed particle hydrodynamic methods (SPH) [13]. These approaches have been used to retrospectively investigate specific landslide-induced impulse wave events, typically with much success. However, such modeling approaches, when used in assessments of possible future events, are compromised by cost, computational time, the considerable amount of data required for model calibration and validation, and the absence of run-up and inundation evidence typically used to fine-tune model parameters.

Other methods, such as generic empirical equations, can be used for preliminary assessment of the potential impact of landslide-generated waves and to guide decisions on the need for further investigations. Empirical relationships, commonly based on field-measured wave impacts of historical events or scaled physical experiments, are used to determine wave characteristics. The analytical equations of Heller et al. [15], Heller and Hager [16], and Evers et al. [17] were developed to assess impulse waves in artificial water basins impounded by dams but are also applicable to natural lakes. Their equations, and the 3D approach employed in them, provide estimates of wave characteristics while accounting for variable propagation angles and travel distances, thus covering a wide range of water-basin geometries. The main inputs into their set of equations are landslide properties and a representative lake depth. Their workflow provides a first estimate of impulse wave celerity and wave run-up.

Strupler et al. [18] suggest a classification of mountain lakes based on their impulse wave potential, which they derived from both subaerial and subaqueous mass movements at Swiss perialpine lakes >1 km<sup>2</sup> . Their method relies on parameters calculated from digital elevation data (using geospatial software), for instance, topographic surfaces and bathymetry, together with seismologic data (e.g., the local acceleration), considered as an external factor for landslide initiation. They argue that lakes in the Alps have a high potential for impact by subaerial or subaqueous mass movements due to the surrounding steep slopes and fiord-like morphology of the lake basins compared to perialpine lakes. The latter lakes have a lower potential for subaerial mass movement, but a larger potential to inundate surrounding areas because of the typically lower relief surrounding them.

Despite the simplicity of the diverse approaches available in the literature, there are some limitations, such as the technical knowledge required to properly use these tools and the limited availability of high-resolution geospatial data—particularly bathymetry—which may necessitate the use of lower-quality digital elevation data. In this paper, a simple alternative method, based on lake shape and extent, for estimating impulse wave propagation in mountain lakes (Figure 1) is presented. The method has a short workflow based on analytical equations derived from the analysis of wave characteristics estimated using hydrodynamic numerical models in Flow-3D for theoretical mountain lakes.

A fixed impact volume is used to induce the impulse wave. In comparison to other approaches, the equations are easy to apply. The variables required are the geometric characteristics of lakes; no specifics of the impacting landslide are needed.

**Figure 1.** Examples of mountain lakes, differing in shape, bathymetry, and spatial extent. (**a**) Location of the presented lakes in the Alps region. (**b**) Examples of diverse lake geometries: Lake Piburger (1-Austria), Lake Achen (2-Austria), and Lake Lucerne (3-Switzerland). Bathymetric data provided by the Sedimentary Geology Working Group of the University of Innsbruck [19,20] and the Swiss Service-Federal Office of Topography swisstopo (https://www.swisstopo.admin.ch/ en/geodata/height/bathy3d.html) (Accessed on 3 September 2021), [21,22]. The topographical background is taken from Microsoft Bing Maps—2021 Microsoft Corporation Earthstar Geographics Sio (Accessed on 26 July 2021).

We describe the near-shore impulse-wave magnitude in terms of flow depth at the shoreline. This novel approach enables rapid, resource-efficient, preliminary investigations that may be required to identify basins of particular concern or high risk and to justify more detailed data collection and analysis. An Excel spreadsheet comprising the presented workflow is available as a computational tool in the Supplementary Materials.

## **2. Materials and Methods**

Data used in this study are derived from 56 subjectively selected alpine and perialpine lakes in the Alps region (table in Supplementary Materials). All of the lakes are bordered by slopes that could potentially generate subaerial mass movements. Characteristics of most lakes—water volume (*V<sup>w</sup>*), lake area (*A*), width (*W*), length (*L*), and mean and maximum water depth (*d<sup>w</sup>* and *D<sup>w</sup>*)—are already available from the literature and diverse online sources [23–26]. Bathymetric data for some of the lakes were provided by the Sedimentary Geology Working Group at the University of Innsbruck (see the listing in the table in the Supplementary Materials [19,20]) and the Swiss Service (Federal Office of Topography, swisstopo), enabling a more accurate geometric data calculation using the Raster Layer Statistics and Raster Surface Volume tools in QGIS v3.16.

A classification scheme is developed to group and discriminate the geometries of the 56 lakes in the dataset (Section 3). Frequency analyses are performed to identify the most common lake metrics and to combine them to produce a representative range of theoretical lake-basin shapes for use in subsequent numerical modeling. These theoretical lake basins are generated as 3D solid bodies and exported to stereolithography (STL) files using Rhinoceros 6 software (see Supplementary Materials).

Numerical models are implemented in the finite-volume-based computational fluid dynamics (CFD) software Flow-3D v11.1 [27–30]. This software simulates two-fluid problems, where all velocity components (*u, v, w*) are computed in the 3D domain using RANS equations [31] in combination with the volume-of-fluid method [29,32], and adopting the fractional-area/volume-obstacle representation [33]. To compute turbulence and viscosity processes in Flow-3D, the renormalized group model-based k-epsilon turbulence model [34] is applied to create a fluid–fluid coupled model of the impulse wave. This model uses statistical formulations to compute the turbulent kinetic energy dissipation rate [35–37]. A Newtonian-like fluid (see Section 4), featuring a higher density compared to the still-water density, is adopted to simulate the impacting volume [38]. Modeled free-water-surface elevations, wave-crest elevations in open water, and flow depths at the lake shoreline are post-processed in FlowSight v11.1 [27]. Finally, correlation matrices between the numerical input parameters and the modeling results are generated to establish a relationship that can be used to assess potential impulse waves in natural mountain lakes.

## **3. Classification and Frequency Analysis of Alpine Lake Geometries**

A classification of the considered alpine lakes is shown in Figure 2. The figure shows the relationships between water volume (*V<sup>w</sup>*) and lake area (*A*) and between water volume and mean (*d<sup>w</sup>*) and maximum (*D<sup>w</sup>*) lake depth (each data point represents a single lake). The data are subjectively grouped and named based on ranges of *V<sup>w</sup>* as follows: small-size lakes (*V<sup>w</sup>* < 1 Mm<sup>3</sup>); medium-size lakes (*V<sup>w</sup>* 1–100 Mm<sup>3</sup>); large-size lakes (*V<sup>w</sup>* 100–10,000 Mm<sup>3</sup>); and very large-size lakes (*V<sup>w</sup>* > 10,000 Mm<sup>3</sup>).

**Figure 2.** Classification of 56 alpine lakes based on water volume, lake area, and mean and maximum lake depths: the dotted blue line is *V<sup>w</sup>* against *A*; the dotted orange line is *V<sup>w</sup>* against *d<sup>w</sup>*; the dotted green line is *V<sup>w</sup>* against *D<sup>w</sup>*. Trendlines enable a qualitative description of the lakes in terms of their extent and depths relative to water volume. Axes are logarithmic (base 10).

Trendlines in Figure 2 quantitatively describe lake extents and water depths relative to water volume. The higher a lake is above the trendline, the larger the volume-normalized extent or the deeper the volume-normalized depth. Conversely, increasing distance below trendlines indicates a smaller extent or shallower depth relative to lake volume. Among the 56 lakes considered, 42% and 58% have large- and small-volume-normalized extents, respectively; and 53% and 47% have deep and shallow-volume-normalized depths, respectively.

Considering a ratio between *L* and *W,* it is possible to differentiate elongate lake shapes (*L/W* > 1.5, about 60% of them; e.g., Lake Achen in Figure 1) from those that are approximately equidimensional (1 < *L/W* < 1.5, about 16%). Other lakes (the remaining 24%) have complex shapes, generally comprising sub-basins (e.g., Lake Lucerne in Figure 1). These are mostly large- and very large-size lakes. Given that sub-basins of complex lakes are themselves either medium or large size and that only a few small lakes were identified, the small-size and very large-size lake classes are not considered further in the analysis. Results of a frequency analysis of lake geometrical characteristics for the medium-size and large-size lake classes are shown in Figure 3.
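As an illustration only, the size and shape thresholds above can be encoded in a few lines; this is a sketch, with function and variable names of our own choosing, not part of the paper's workflow:

```python
def classify_lake(v_w_mm3: float, length_m: float, width_m: float) -> tuple[str, str]:
    """Classify a lake by water volume (in Mm^3) and planform shape (L/W ratio)."""
    # Size classes follow the subjective V_w ranges defined above.
    if v_w_mm3 < 1:
        size = "small"
    elif v_w_mm3 < 100:
        size = "medium"
    elif v_w_mm3 < 10_000:
        size = "large"
    else:
        size = "very large"

    # Shape classes follow the L/W thresholds above (L is the longer axis by
    # convention). Complex, multi-basin lakes cannot be identified from L/W
    # alone and would need manual inspection.
    ratio = length_m / width_m
    shape = "elongate" if ratio > 1.5 else "equidimensional"
    return size, shape

# Example: a 9 km x 1 km lake holding 480 Mm^3 -> ("large", "elongate")
print(classify_lake(480, 9_000, 1_000))
```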

**Figure 3.** Frequency analysis and related histograms of geometrical characteristics (lake width and length, mean and maximum depth) of medium- and large-size lakes.

Based on the aforementioned classification, different combinations of parameters are subjectively defined to design theoretical lakes that cover the considerable variability of mountain lake configurations in the Alps. Table 1 shows the selected values for the two size classes, resulting in 51 combinations for medium-size lakes and 18 for large-size lakes. These 69 theoretical lakes form the basis for the numerical modeling part of this study (both a complete table with related information and STL files are available in the Supplementary Materials).

**Table 1.** Geometry parameters chosen for the theoretical lakes [lake width (*W*), length (*L*), and mean and maximum water depths (*d<sup>w</sup>* and *Dw*)]. Permutations of these three variables yield a total of 69 lake-basin configurations.


## **4. Numerical Model Set-Up**

Numerical models are designed to reflect conditions typical of mountain lakes in alpine settings, albeit with some necessary simplifications. Alpine basins, whether small or large, commonly fill topographic depressions with adjacent steep slopes that in many cases are the product of glacial erosion or deposition [39]. Steep slopes are particularly common along valley sides, making subaerial landslides capable of generating impulse waves more likely along lakesides than ends (e.g., [6,39,40]). Consequently, impacting volumes initiated in these environments enter the lake along its sides in all modeled scenarios.

Each numerical simulation considers the same impacting volume for triggering the wave. This volume lies initially on a sliding surface dipping 45° toward the lake and is represented by a 0.5-Mm<sup>3</sup> prismatic fluid body, 20 m thick, 208 m long, and 120 m wide, with a toe position directly above the lake surface (Figure 4). The volume is located mid-way along the side of each theoretical lake.

A bulk material density of 1620 kg m<sup>−3</sup> is used for the landslide [38]. Additionally, an initial speed of 20 m s<sup>−1</sup> at simulation time 0 is arbitrarily set. This, together with volume deformation during the sliding process, results in a maximum speed of 60 m s<sup>−1</sup> for the center of mass at the impact in all models, regardless of lake geometry.

For each simulation, the model domain includes the slide area, the entire water body, and the air above it to 40 m above lake level (m a.l.l.). The origin of the system (*x,y,z* equal to 0) is located midway along the long side of each artificial lake at the middle of the landslide toe (0 m a.l.l.). The maximum depth *D<sup>w</sup>* is located at the center of each theoretical lake (Figure 4e). A mesh block of 120 m × 140 m × 162 m, comprising 2 m × 2 m × 2 m mesh cells, includes the slope and the impacting volume (Figure 4a,e). To improve computational efficiency while also representing the complexity of nearfield landslide–water interaction, a finer mesh block within 500 m of the model origin and a coarser one beyond that is used. Medium-size lakes up to 500 m wide are modeled using uniform 2 m × 2 m × 2 m mesh cells in the nearfield and non-uniform 4 m × 4 m × 2 m cells beyond 500 m.

Medium-size lakes wider than 500 m are modeled using uniform cells that increase beyond 500 m from 2 to 4 m. For all large-size lakes, uniform cells (5 m) and non-uniform cells (10 m × 10 m × 5 m) are used, respectively, within and beyond 500 m of the origin.
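As a quick plausibility check of the set-up described above (our own arithmetic, not part of the published workflow), the following sketch verifies the impacting-volume size and counts the cells in the nearfield mesh block:

```python
# Impacting prism (see above): 20 m thick, 208 m long, 120 m wide.
prism_volume = 20 * 208 * 120          # 499,200 m^3, i.e., about 0.5 Mm^3

# Nearfield mesh block around the slide: 120 m x 140 m x 162 m, 2 m cubic cells.
block_extents = (120, 140, 162)        # metres
cell_size = 2.0                        # metres

cells_per_axis = [int(e / cell_size) for e in block_extents]   # [60, 70, 81]
total_cells = cells_per_axis[0] * cells_per_axis[1] * cells_per_axis[2]

print(f"{prism_volume:,} m^3")                    # 499,200 m^3
print(cells_per_axis, f"{total_cells:,} cells")   # [60, 70, 81] 340,200 cells
```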

In all models, the "wall" boundary condition of the mesh block at the lateral margins of the impacting volume is set in Flow-3D to simulate a continuous slope along the length of the lake, allowing wave reflection modeling. A "symmetry" boundary condition is set

A "symmetry" boundary condition is set to allow the impacting volume to leave the mesh of the sliding area and enter the mesh comprising the lake. "Outflow" boundary conditions at the lake borders, on the other edges of the model domain, allow the flow to exit the domain without reflection to avoid wave interference and, thus, simplify the computation process. This simplification was implemented because the wave characteristics of the first front arrival at the shorelines are of principal interest (Figure 4a,e). The water level at lake rest (0 m a.l.l. on the *z*-axis in the domain system) is used as the initial condition for all model runs. Data recording intervals of 0.5 and 1 s are set for the medium-size and large-size lakes, respectively.

**Figure 4.** Examples of theoretical lakes and model set-ups showing simulations of a landslide-induced impulse wave. (**a**) Small elongated medium-size lake; in this example, the boundary conditions are shown (S—symmetry; W—wall; O—outflow). (**b**) Equidimensional medium-size lake. (**c**) Narrow elongated medium-size lake. (**d**) Elongated large-size lake (dotted white lines represent line probes (L.P. 1–12) used to record water-surface elevations in FlowSight). (**e**) Example of a longitudinal section for the theoretical lake in (**c**) along the slide direction (dotted white line); boundary conditions are shown. The yellow dot represents the origin of the system in the model domain.

The model set-ups are chosen to allow completion of each simulation in a reasonable amount of time (between 2 and 24 h per simulation) and to optimize the balance between output accuracy and output file size (4 to 109 GB per simulation). The model environments have been designed to ensure that output files are within the processing capabilities of FlowSight, while also producing results suited for inter-model comparisons.

The computational resource and hardware components used for numerical modeling are the following:


## **5. Results**

## *5.1. Numerical Simulations and Wave Decay Potential Parameter*

For each scenario, lake-surface elevations are estimated along line probes (L.P.), which are lines along which the free water surface in the 3D domain is monitored over time following impact (Figure 4d). L.Ps are used to document and analyze maximum wave crest elevations in open water and flow depths at the shoreline. A set of 12 L.Ps extend from the domain origin (Section 4) to equally spaced points along the shoreline, allowing the propagation of the impulse wave, and how lake geometry affects the wave, to be analyzed qualitatively and quantitatively. An example is provided in Figure 5, which shows that a decrease in water depth results in progressively higher wave dissipation and a lower flow depth at the shore (compare Figure 5a–c). These trends reflect the increase in the water volume that is mobilized by the impacting volume and the reduced interaction and friction with the lake floor as the water depth increases. Conversely, the longer travel distances in incrementally wider lakes result in greater wave dissipation and, consequently, lower flow depth at the shoreline. The slight increase of flow depth along the L.Ps in the proximity of the shoreline (Figure 5a,d,e) shows the wave deformation due to interaction with the lake floor while approaching the shoreline.

**Figure 5.** Free-water-surface elevation computed along with the L.Ps (e.g., Figure 4d) for typical circular artificial lakes with different widths and maximum depths (**a**–**e**). Each lake configuration influences the landslide–water interaction and wave dissipation, thus causing different impulse wave propagation patterns.

The same patterns are observed for all 69 scenarios. Generally, for the medium-size lakes, the maximum wave crest elevation ranges from 29 to 5.9 m a.l.l. along L.P. 1 (the midline representative of the landslide's travel direction). Considering all L.Ps, average wave crest elevations range from 15.4 to 27.5 m a.l.l. At the shoreline, the average flow depth is 4.1 to 19.7 m. The highest values of flow depth (6 to 35.6 m) occur at the shoreline along L.P. 1. In the case of large-size lakes, maximum wave crest elevations range from 29.4 to 33.8 m a.l.l. along L.P. 1, and the average for all L.Ps ranges from 15.4 to 24 m a.l.l. At the shoreline, the average flow depth is 4.1 to 7.4 m, with the highest values on L.P. 1 (4.2 to 15.9 m).

The numerical analyses provide insight into potential impulse-wave threats for particular mountain lake configurations. A possible indicator of the threat level is the "dissipation power" of an impulse wave in a lake, which is a measure of the degree to which the wave attenuates as it moves away from the impact location. This metric is determined for each L.P. by calculating the ratio of the maximum wave crest elevation to the flow depth at the shoreline. A new parameter, defined as a single weighted-average decay ratio representative of the entire lake and henceforth termed the "wave decay potential parameter" (*WDPP*), is given by Equation (1):

$$WDPP = \frac{\sum\_{i=1}^{n} (a\_{mi}) \left(\frac{a\_{mi}}{f\_{dci}}\right)}{\sum\_{i=1}^{n} a\_{mi}} \ (-) \tag{1}$$

where *a<sup>m</sup>* and *f<sup>dc</sup>* are, respectively, the maximum wave elevation and the flow depth at the shoreline location for each of the L.Ps. For medium-size lakes, values of *WDPP* range from 1.09 to 4.87, and for large-size lakes from 2.61 to 6.6. A higher value of *WDPP* indicates a higher dissipation power of the lake, and vice versa.
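Equation (1) is straightforward to transcribe. The sketch below is illustrative only, with array names and probe values of our own choosing (the values merely fall within the ranges reported above):

```python
import numpy as np

def wdpp(a_m: np.ndarray, f_dc: np.ndarray) -> float:
    """Wave decay potential parameter, Equation (1).

    a_m  -- maximum wave elevation along each line probe (m a.l.l.)
    f_dc -- flow depth at the shoreline end of each line probe (m)
    Each probe's decay ratio a_m/f_dc is weighted by its own a_m.
    """
    return float(np.sum(a_m * (a_m / f_dc)) / np.sum(a_m))

# Hypothetical readings for a 12-probe medium-size lake model:
a_m = np.array([29.0, 26.0, 23.5, 21.0, 19.0, 17.5, 16.5, 16.0, 16.5, 18.0, 21.0, 25.0])
f_dc = np.array([15.0, 12.5, 10.5, 9.0, 8.0, 7.0, 6.5, 6.0, 6.5, 7.5, 9.5, 12.0])
print(round(wdpp(a_m, f_dc), 2))  # a single decay value for the whole lake
```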

## *5.2. Correlation Analysis and the Shape-Product Approach*

A correlation analysis is completed to better understand how the numerical results are related to the input parameters, with matrices showing the resulting correlation coefficients (*r*) calculated using the Pearson function. The correlation matrix in Figure 6a considers all analyzed theoretical lakes. The maximum wave crest elevation does not correlate well with any input parameter, suggesting that it depends on the combination of all lake characteristics (and in a real situation also on the impacting landslide properties, which are not considered in this study). Similarly, no relevant correlations are evident for flow depth at the shoreline. In the case of the mean flow depth at the shoreline, *r*-values of 0.669 and 0.665 are obtained for lake width and length, respectively. Maximum and minimum flow depths yield *r*-values of 0.71 and 0.694 for lake width and length, respectively. Higher correlation coefficients are calculated between *WDPP* and lake characteristics: *r*-values of 0.835 and 0.786 for lake width and length, respectively; and *r*-values of 0.830 and 0.806 for lake area and volume, respectively.
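For readers wishing to reproduce this step, the matrices reduce to a standard Pearson computation. A minimal sketch, assuming the per-lake inputs and results are tabulated (all column names and values below are illustrative, not the paper's data):

```python
import pandas as pd

# One row per theoretical lake: geometric inputs and modeling outputs.
df = pd.DataFrame({
    "width":  [200, 500, 1000, 2000],     # m
    "length": [400, 800, 2000, 5000],     # m
    "area":   [8e4, 4e5, 2e6, 1e7],       # m^2
    "volume": [1.6e6, 1.2e7, 1.0e8, 2.0e9],  # m^3
    "wdpp":   [1.2, 1.9, 3.1, 4.8],
})

# Pearson correlation coefficients (r) between all pairs of columns.
print(df.corr(method="pearson").round(3))
```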

By plotting *WDPP* against lake area and water volume (Figure 7) and considering the maximum water depth as an indicator, it is found that the theoretical lake basins form two groups. Lakes with a maximum *D<sup>w</sup>* of 20 m (small red circles in Figure 7) can be separated from the others. This separation is supported by correlation analyses provided for the two lake geometry subgroups (*D<sup>w</sup>* ≤ 20 m and *D<sup>w</sup>* > 20 m). In the case of lakes with maximum depths greater than 20 m, the correlations between *WDPP* and lake area and lake volume (see Figure 6b) have *r* values of 0.90 and 0.88, respectively.

These relationships can be approximated by hyperbolic (power-type) functions. The coefficients of determination, *R<sup>2</sup>*, express the variation of the dependent variable predicted by the independent variable and provide a measure of how well the curves fit the data (the closer *R<sup>2</sup>* is to 1, the better the fit; Figure 7 and Table 2).

Lake basins with *D<sup>w</sup>* ≤ 20 m likely plot farther from deeper basins because their water depths are similar to or less than the thickness of the impacting volume. This may imply that numerical results for these scenarios are influenced by how the impacting volume (the dense fluid) enters the lake and propagates through the water body [41]. For example, the dense volume might induce an overestimation of the maximum wave crest in open water. Consequently, further analysis and discussion only consider model scenarios with maximum depths greater than 20 m. *WDPP* is well correlated to the lake area and water volume, thus a parameter that relates different lake characteristics and *WDPP* is formulated to improve rapid assessments of impulse waves in mountain lakes.

The new parameter is the "shape product" (herein labeled *ShpP*), which is defined using empirical Equation (2). Results are well correlated to *WDPP* with an *r*-value of 0.962 (Figure 6 and Table 2):

$$ShpP = \frac{V\_w}{A} \ast \left(\frac{W \ast L}{d\_w^3}\right)^{1/3} (\text{m}^{2/3})\tag{2}$$

where *V<sup>w</sup>* is lake volume, *A* is lake area, *W* and *L* are lake width and length, respectively, and *d<sup>w</sup>* is mean lake depth.
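Equation (2) translates directly into code. A sketch with argument names of our own choosing:

```python
def shape_product(v_w: float, area: float, width: float, length: float, d_w: float) -> float:
    """Shape product ShpP of Equation (2), in m^(2/3).

    v_w    -- lake water volume (m^3)
    area   -- lake area (m^2)
    width  -- lake width (m)
    length -- lake length (m)
    d_w    -- mean lake depth (m)
    Dimensionally: m * (m^2 / m^3)^(1/3) = m^(2/3).
    """
    return (v_w / area) * ((width * length) / d_w**3) ** (1.0 / 3.0)
```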

The shape product does not represent any specific physical measurement, nor is it a dimensionless parameter. Rather, it is the measure of a lake's geometric characteristics most closely related to the *WDPP*.

A plot of *ShpP* against *WDPP* reveals a linear relationship with an *R<sup>2</sup>* of 0.9257 (Figure 8a and Table 2). Linear regression of *ShpP* against *WDPP* verifies the reliability of this relationship (Figure 8b), as it yields the same equation and *R<sup>2</sup>* proposed in the previous analysis (Figure 8a).

After calculating *WDPP* from a specific *ShpP*, the equations in Table 3 provide a rough estimate of the expected flow depth at the lakeshore. Plots of *WDPP* against the maximum, mean, and minimum flow depths obtained from the numerical models are shown in Figure 9. The mean flow depth is strongly and negatively correlated to *WDPP* (*r* = −0.849, Table 3). The maximum and minimum flow depths are also strongly negatively correlated to *WDPP* calculated at L.Ps 1 and 5 (*r* = −0.751 and −0.749, respectively; Figure 6b).



**Figure 6.** Correlation matrices show the correlation coefficients (*r*) between lake geometrical characteristics and the hydrodynamic modeling results. (**a**) Matrix for all lakes (69 models). (**b**) Matrix for all lakes with a maximum depth greater than 20 m (52 scenarios).

**Figure 7.** Hyperbolic relations between *WDPP* and (**a**) lake area and (**b**) water volume (X-axis in logarithmic scale, base 10). Circle size and color relate to lake depth. Related equations, coefficients of determination (*R<sup>2</sup>*), and correlation coefficients (*r*) are also shown. The *r*-values labeled in red, which are not shown in Figure 6, are correlation coefficients specifically for medium-size lakes with a maximum depth of 20 m. The *r*-values labeled in black, shown in Figure 6b, are correlation coefficients for theoretical lakes with a maximum depth greater than 20 m.

**Table 2.** Correlation coefficients (*r*) between *WDPP* and *A*, *V<sup>w</sup>*, and *ShpP*; equations; and corresponding coefficients of determination (*R<sup>2</sup>*), where y is *WDPP* and x is *A*, *V<sup>w</sup>*, or *ShpP*.


**Figure 8.** (**a**) Plot of *WDPP* against *ShpP* (x-axis in logarithmic scale, base 2), and related equation, coefficient of determination (*R<sup>2</sup>*), and correlation coefficient (*r*). Data size and color relate to maximum lake depth. The blue circle refers to the 2007 Chehalis Lake landslide-generated wave case, used as a test of the proposed approach (Section 6). The *r*-values with red font, which are not shown in Figure 6, correspond only to the red dots. (**b**) Linear regression plot obtained for the relation between *ShpP* and *WDPP* (the resulting equation is the same as the one obtained with the trendline in Figure 8a).

**Table 3.** Relationships between *WDPP* and flow depths at the shoreline, including correlation coefficients (*r*), equations, and coefficients of determination (*R<sup>2</sup>*). In the equations, y is flow depth and x is *WDPP* obtained from the equations in Table 2.


The Supplementary Materials include an Excel spreadsheet that summarizes the workflow, covers the entire calculation procedure, and includes informative charts to display the results.

**Figure 9.** Hyperbolic relation between *WDPP* and flow depth at the shoreline (mean—blue data; maximum—orange data; minimum—green data). Flow depth values for the Chehalis Lake event (blue) are obtained using the 2019 ETH Zurich equations.

## **6. Example of Application of the Proposed Approach—The Chehalis Lake Landslide-Generated Wave**

The landslide-induced wave in Chehalis Lake on 4 December 2007 (Figure 10, see Section 1) [6,7] is chosen as a test case of the proposed methodology for several reasons. It involves a representative mountain basin, and both the landslide [6] and lake characteristics [7] are well documented. In addition, multiple numerical modeling studies [6,42–44] provide insights into the open-water hydrodynamics necessary to reproduce documented impacts. Bathymetric data acquired through a SONAR survey [7] are used in QGIS v3.16 to estimate the lake's characteristics (see Section 2). The volume of the initial rockslide is estimated to be about 3 Mm<sup>3</sup>. The rock mass rapidly fragmented, transforming into a rock avalanche as it approached the shoreline. Approximately 2.2 Mm<sup>3</sup> of rock debris entered the lake and triggered the wave [6,7,44]. Although this volume is larger than the one employed in the hydrodynamic models (see Section 4), the maximum impact speeds are similar (about 60 m s<sup>−1</sup> [42,45], Figure 10), making the case study suitable for testing the method.

The case study data are first used as inputs in the empirical equations provided by Evers 2019 [46] (ETH-Zurich, 3D approach—overland flow) to estimate flow depths at the Chehalis Lake shoreline. As this approach cannot be applied to the entire area of the lake, and because the lake is divided into two sub-basins by a shallow subaqueous ridge, the test is limited to the north sub-basin (yellow rectangle in Figure 10). The sub-basin characteristics and landslide properties are summarized in Figure 10.

**Figure 10.** Chehalis Lake overview. Lake and landslide properties are shown (data from [6,7]). The distance and angle of wave propagation from the landslide impact point, which are useful inputs for the 2019 ETH-Zurich equations [46], are also shown. Flow depths at the shoreline in different locations are shown in white. Topographical background taken from Microsoft Bing Maps—2021 Microsoft Corporation Earthstar Geographics Sio (Accessed on 22 July 2021).

Flow depths are calculated at shoreline locations corresponding to wave propagation angles in 10° increments, ranging from −30° to +80° on both sides of the landslide midline (Figure 10). Calculations yield an open-water amplitude of the initial impulse wave of 37 m a.l.l. and shoreline flow depths ranging from 3.3 to 13.5 m, with a maximum opposite the slide source and a mean shoreline flow depth of 9.2 m (Figure 10).

A *ShpP* value of 139.34 m<sup>2/3</sup> and a *WDPP* value of 2.495 (Equation (5) in Table 2 and Figure 8a) are calculated for the north sub-basin. Applying the latter value in Equations (6)–(8) yields estimates of maximum, mean, and minimum flow depths of 15.5, 8.8, and 4.9 m, respectively. The maximum and minimum values obtained with Equations (6)–(8) are higher than the ones obtained with the ETH equations, but a close match is found for the mean flow depth (Figure 9). However, the ratio of the maximum wave amplitude (37 m a.l.l.) to the *WDPP* of 2.495 is 14.8 m, which is very close to the maximum flow depth calculated using the ETH equations. These results pertain only to the north sub-basin of Chehalis Lake. Considering the whole lake, the *ShpP* value is 196 m<sup>2/3</sup> and the *WDPP* value is 3.36, yielding slightly lower estimates of maximum, mean, and minimum flow depths at the shoreline of 12.63, 6.95, and 4.12 m, respectively.
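The amplitude-to-*WDPP* ratio quoted above can be retraced in two lines (values copied from the text; the flow-depth regressions of Equations (6)–(8) live in Tables 2 and 3, which are not reproduced here):

```python
wdpp_north = 2.495       # from ShpP = 139.34 m^(2/3) via Equation (5) in Table 2
a_max = 37.0             # maximum open-water wave amplitude, m a.l.l. (ETH equations)

print(round(a_max / wdpp_north, 1))  # 14.8 m, close to the 13.5 m ETH maximum flow depth
```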

These results show the applicability of the proposed method and the use of the equations in Table 3, with specific values of *WDPP*, for a preliminary evaluation of potential landslide-induced waves in natural alpine lakes, providing reliable estimates of the flow depths at the lake shoreline. Comparisons between the flow depths obtained with the proposed equations and those derived using the ETH equations suggest that the new methodology provides a reliable estimate of flow characteristics. Regardless of what was observed in the presented test, appropriate validation of the proposed method is still required; a comparison with real data or observations would increase the reliability of this approach's application.

## **7. Comparison between Real and Theoretical Lakes**

Based on this work, the *WDPP* is proposed as a suitable first-order descriptor of a lake basin's potential for dissipating an impulse wave. Figure 11 is a plot of *WDPP* values superposed on the alpine lake classification. It provides a screening tool to identify lakes that are most likely to disperse a wave (Section 3). For the medium-size lakes, *WDPP* generally increases as lake size increases and depth decreases. This is consistent with the numerical modeling results (Figure 5)—lower water depths and longer distances to shorelines enhance wave dissipation and thus decrease shoreline flow depths.

**Figure 11.** Comparison of modeling results (circle size and color based on *WDPP*; black circles are from the alpine lakes classification chart, see Section 3).

In the case of large-size lakes, *WDPP* provides no clear separation of geometrical characteristics of lake basins (lake extent or depth), although a larger *WDPP* appears to be favored by larger lake volumes.

Figure 12 shows expected *WDPP* values for all the natural lakes in the Alps examined in this study. Circles on the map are lakes for which *WDPP* has been derived from the *ShpP* relation (Equation (5)). For lakes with complex shapes (diamonds in Figure 12), for which *ShpP* is not available, *WDPP* is instead based on the water volume (Table 2, Equation (4)). Generally, high values of *WDPP* are related to perialpine lakes with large areas and volumes, whereas lower *WDPP* values are associated with smaller alpine lakes.

**Figure 12.** Alpine and perialpine lakes considered in this study and related *WDPP* values obtained using the *ShpP* relation (circles in the legend). Where *ShpP* is unavailable, *WDPP* was obtained using the water volume relation (squares in the legend, see also Figure 7b). Topographical background taken from Microsoft Bing Maps—2021 Microsoft Corporation Earthstar Geographics Sio (Accessed on 26 July 2021).

A critical issue in this work is the reliable range of lakes for which the new approach is applicable (see Table 1 and the complete table of theoretical lakes in the Supplementary Data). The data suggest that the approach has limitations when used for some lakes that are outside of this range, notably some large-size and very large-size lakes (i.e., *WDPP* > 7 in Figure 12). Further study is required to extend these investigations to alpine lakes outside of the stated range in this work to test the broader application of the proposed method.

## **8. Discussion and Further Required Research**

## *8.1. Applicability of the Proposed Approach*

The approach introduced in this work is intended to be a high-level screening tool for identifying the greatest landslide impulse wave threats and thus helping prioritize resources for more detailed and thorough analyses. The study is based on 3D hydrodynamic modeling and considers a wide range of water body geometries. Results demonstrate the reliability of a lake geometry-based approach for a first-order assessment of the danger posed by potential landslide-triggered impulse waves in mountain lakes. The suggested method is applicable in situations where the characteristics of the possible impacting volume are unknown, as only the geometrical properties of lakes are required. Unlike other methods, this approach requires neither a deep knowledge of wave theory nor additional software. A reliable constraint on wave attenuation, as is provided by this approach, means that the initial wave amplitude needs only to be generally approximated.

The method can be applied to lakes with diverse shapes and dimensions, subject to the limitation that the potential impacting volume has similar characteristics to the one adopted for the modeling in this work (Section 4). In the case of lakes with complex shapes, Equation (5) can be applied to sub-basins with specific geometrical properties to estimate the wave decay potential parameter *WDPP*, as was done for the Chehalis Lake test case (Section 6). If Equation (5) is not applicable because the lake-geometry parameters in Equation (2) are unknown, *WDPP* can be estimated using Equation (3) or Equation (4), which consider lake area or volume, respectively. The equations in Table 3 provide the flow depths expected at the shoreline. However, because models with a maximum depth of less than 20 m were excluded from the later stages of the workflow, this method might overestimate the hazard potential for small shallow lakes.
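The descriptor-selection logic described above (prefer the *ShpP* relation; fall back to the area- or volume-based relations when *ShpP* cannot be computed) can be sketched as follows. All coefficients are illustrative stand-ins: the linear *ShpP* relation is back-solved from the two (*ShpP*, *WDPP*) pairs quoted for Chehalis Lake in Section 6, and the volume power law is a hypothetical placeholder, not the published fit from Table 2.

```python
from typing import Optional

def wdpp_from_shpp(shpp: float) -> float:
    """Illustrative linear stand-in for Equation (5); coefficients back-solved
    from the two Chehalis Lake pairs (139.34 -> 2.495 and 196 -> 3.36)."""
    return 0.0153 * shpp + 0.37

def estimate_wdpp(shpp: Optional[float] = None,
                  volume_mm3: Optional[float] = None) -> float:
    """Prefer the ShpP relation; fall back to a volume relation if needed."""
    if shpp is not None:
        return wdpp_from_shpp(shpp)
    if volume_mm3 is not None:
        # Placeholder power law standing in for Equation (4); the exponent
        # and prefactor are assumptions, not the published coefficients.
        return 0.9 * volume_mm3 ** 0.15
    raise ValueError("need ShpP or water volume to estimate WDPP")

print(round(estimate_wdpp(shpp=139.34), 2))  # ~2.50 (north sub-basin)
print(round(estimate_wdpp(shpp=196.0), 2))   # ~3.37 (whole lake)
```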

The Chehalis Lake example shows that analyzing individual sub-basins provides a more conservative estimate of flow characteristics compared to an analysis of the entire lake. As waves propagating from one basin to the next may have already attenuated, sub-basin-specific estimations using this approach can be deemed worst-case scenarios. It is worth noting that the subdivision into sub-basins does not take into account wave propagation from one basin to another. However, this limitation is a minor issue because a wave entering a second sub-basin has already significantly attenuated, implying that the highest threat is already identified by intra-basin events.

Nonetheless, wherever sub-basins join, complex variations in flow depth at the shoreline might be expected as impulse waves approach from the adjacent sub-basin. This is due to changes in bathymetry (e.g., a shallow sill as at Chehalis Lake, Figure 10), basin alignment (e.g., non-parallel valleys such as the western part of Lake Lucerne, Figure 1b), or both (e.g., the eastern part of Lake Lucerne, Figure 1b). In some instances, these effects might result in locally increased near-shore wave energy and run-up. As a consequence, although intra-basin wave generation gives a conservative assessment of near-shore wave threats, intra-basin waves may underestimate such threats where sub-basins join.

Given the numerical modeling results for the theoretical lakes (Section 5.1) and the results produced for the Chehalis Lake test (Section 6), it is concluded that the proposed approach is applicable to lakes that are flanked by steep unstable slopes, for which failures with volumes ranging from thousands to a few million cubic meters can be expected. However, this assertion requires validation with additional historical case studies from a variety of mountain regions, as well as fjords with basin characteristics similar to alpine lakes. Extending the dataset to other mountain regions for which good data are available (e.g., the Canadian Cordillera or selected parts of the Andes), to newly formed glacial water bodies [39], and to artificial reservoirs would also improve the analysis and results.

## *8.2. Limitations, Uncertainties, and Further Potential Development*

This work provides a workflow to estimate near-shore impulse-wave magnitude in terms of flow depth at the shoreline (see "Excel spreadsheet—Computational tool" in the Supplementary Materials). By contrast, most other studies report impacts in terms of inundation distance and run-up height, metrics that are not considered in this work. Nonetheless, shoreline water depth provides a preliminary characterization of run-up potential and inundation, although these are also heavily affected by the morphology and the steepness of the surrounding topography. As a result, observations of an impulse wave or post-event measurements of its impacts provide only a very general indication of the large-scale variability of flow depth at the shoreline. A more reasonable assessment of this new approach can be achieved using previously established relationships that independently estimate flow depth at the shoreline, as was done for the Chehalis Lake case study (Section 6).

A current disadvantage of this work is the lack of adequate validation. The Chehalis Lake case study and comparison of the results with those based on the ETH equations are insufficient to fully evaluate the validity of this approach. Proper validation would entail a comparison of the results derived using this method with real data collected in the field. However, flow depths at the shoreline caused by an impulse wave are not directly observable in the field and therefore must be calculated independently. Such calculation requires the use of detailed, site-specific, retrospective numerical simulations capable of accurately recreating a landslide-generated wave event.

Future extension and expansion of the proposed approach will involve two critical steps: (1) implementing numerical analyses on well documented, real-world events to validate the geometry-based strategy as a predictor of flow depth at the shoreline; and (2) developing rapid, low-computational-cost methods for estimating run-up and flooding potential from flow depth at the shoreline. This next phase of development will expand the range of outputs to include expected run-up, thereby extending the applicability of the proposed approach and enhancing opportunities to reliably validate it.

Another possible limitation of this study is its applicability to large lakes and its use of a single impact volume. The size and properties of the impacting landslide substantially influence wave characteristics. For example, the maximum wave height resulting from an impact depends on landslide volume and competence, debris thickness and frontal width, impact speed, and slope angle and roughness [17,37]. Further research that addresses these issues is required. For example, the approach can be extended to consider diverse characteristics of the impacting volume to provide a relation for *WDPP* that considers landslide properties in addition to the geometrical characteristics of the lake (the *ShpP*). Incorporation of additional considerations into the workflow, including varying impact processes, volumes, and velocities, might lead to the development of additional equations that can be applied to the full range of real-world situations. This can be accomplished in Flow-3D by treating the impacting volume as a dense fluid, as done in the present study. Different volumes, shapes, impact velocities, and slope angles can be implemented in the model set-up. Scaling impact volumes relative to lake volume may assist in standardizing comparisons between lakes. In this work, the slide source is located halfway along the long side of the lake. Further development of the approach should consider different impact locations along the lakeshore to extend the applicability of the suggested method. Moreover, it would be valuable to consider submerged slide sources, as subaqueous mass movements can also trigger lake impulse waves [47,48].

Further development of this methodology would also benefit from the inclusion of geotechnical properties of the materials that collapse and enter the lake. This would be especially interesting when studying mass movements made up of unlithified materials or heterogeneous rock masses, as well as studies of moraine-dammed lakes where large sediment masses can fail. Finally, landslide hazard and the erosional vulnerability of nearshore elements should be addressed in the context of risk analysis to complement the relative level of threat posed by the degree and pattern of impulse wave attenuation in a given basin.

## **9. Conclusions**

A novel method for conducting a preliminary evaluation of the size and propagation of landslide-induced waves in mountain lakes is proposed. A set of equations is used within a numerical modeling framework to describe the geometric characteristics of a lake and quickly assess the possible wave threat in terms of wave dissipation and expected flow depth along the shoreline (which is distinct from the expected run-up). Wave propagation controlled by lake geometry provides a general indication of where run-up or inundation would be highest, although the behavior of the breaking wave is also heavily influenced by topography along the shoreline. The incorporation of landslide properties, behavior, and locations into the workflow will further increase the utility of this approach. The proposed method can be used as a first-order indicator for prioritizing lakes that appear to be particularly susceptible to landslide-induced impulse waves.

The findings of this study reveal that *WDPP* is a valid metric for the "dissipation power" (attenuation) of an impulse wave in a mountain lake and that it correlates well with lake geometrical characteristics, particularly area and water volume. The linear relationship between *WDPP* and *ShpP*, a metric that takes into account diverse lake properties, provides a valuable approach for estimating potential wave dissipation. If the *ShpP* is unavailable, for example in lakes with complex shapes, water volume can be used to estimate *WDPP*. Furthermore, *WDPP* can be used to calculate the expected flow depth at the lake shoreline. Results suggest that large perialpine lakes (*V<sup>w</sup>* > 10,000 Mm<sup>3</sup>) are more likely to disperse impulse waves; waves attenuate less in smaller alpine lakes (*V<sup>w</sup>* < 10 Mm<sup>3</sup>).

This method can be applied worldwide to mountain lakes that differ in shape, extent, and volume. The minimum required input data are the geometrical characteristics of the lakes. Unlike other proposed methodologies, specific technical knowledge or additional inputs such as digital elevation data are not required, making the proposed method easy to use. However, the inclusion of high-resolution digital bathymetric data would produce more accurate assessments.

The proposed method is tested with data available for the 2007 Chehalis Lake landslide-generated wave event. The results match well with those obtained using the empirical equations published by ETH Zurich (2019 Edition—3D approach, [17,46]). Despite the positive results, one case study is insufficient to assess the overall reliability of our novel approach, and a validation considering real, observed data is still required. Indeed, additional historic events in which wave characteristics were directly observed would be particularly valuable for further assessing the method. Furthermore, the applicability of impact volumes that differ from the one used in this study, together with a variety of impact characteristics, must be determined.

This study is an initial step in the development of this methodology; additional research is required to improve the method and its application, for example by taking into account the properties of the potential impacting mass. Stability analyses of slopes bordering lakes with high impulse wave hazard, including the implications of external triggers such as intense rain events or earthquakes, are recommended to contribute to a proper estimate of the related hazard in a cascade-effect context. Alpine lakes recently formed by glacier retreat should also be included in these investigations.

**Supplementary Materials:** The following are available online at: https://zenodo.org/record/5569220 (accessed on 29 November 2021), Data set—Zenodo: https://doi.org/10.5281/zenodo.5569220, Data set (Excel spreadsheet): Alpine Lakes and Artificial Lakes; Data set (STL files): Alpine lakes. The following is available online at: https://zenodo.org/record/5733950 (accessed on 29 November 2021), Zenodo: https://doi.org/10.5281/zenodo.5733950, Computational tool (Excel spreadsheet): UIBK\_Computational tool for Landslide\_Induced Impulse Wave 2021.

**Author Contributions:** Conceptualization, A.F. and B.G.; data curation, A.F., N.J.R., J.J.C. and B.G.; formal analysis, A.F., B.S.-M. and N.J.R.; investigation, A.F. and B.S.-M.; methodology, A.F. and B.G.; software, B.G.; supervision, B.S.-M. and B.G.; validation, N.J.R.; visualization, A.F., N.J.R. and J.J.C.; writing—original draft, A.F.; writing—review and editing, B.S.-M., N.J.R., J.J.C. and B.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Open source data set and free access tools that support the findings in this work are available in the links posted in the Supplementary Materials.

**Acknowledgments:** We thank the Sedimentary Geology Working Group at the University of Innsbruck for the useful data, especially Jasper Moernaut and Michael Strasser. We thank the anonymous referees for their constructive contribution in helping to improve this work.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

## **References**


*Article*

## **A Case Study of a Large Unstable Mass Stabilization: "El Portalet" Pass at the Central Spanish Pyrenees**

**Guillermo Cobos 1,\* , Miguel Ángel Eguibar <sup>2</sup> , Francisco Javier Torrijo 1,3 and Julio Garzón-Roca <sup>4</sup>**


**Abstract:** This case study presents the engineering approach conducted to stabilize a landslide that occurred at "El Portalet" Pass in the Central Spanish Pyrenees, activated due to the construction of a parking lot. Unlike common slope stabilization cases, the measures projected here were aimed at slowing and controlling the landslide, not at completely stopping the movement. This decision was taken due to the slow movement of the landslide and the large unstable mass involved. The degree of success of the stabilization measures was assessed by stability analyses and data obtained from different geotechnical investigations and satellite survey techniques, such as GB-SAR and DinSAR, conducted by different authors in the area under study. The water table was found to be a critical factor in the landslide's stability, and the tendency of the unstable slope toward null movement (total stability) was related to the water table lowering process, which requires more than 10 years due to regional and climatic factors. Results showed a good performance of the stabilization measures in controlling the landslide, demonstrating the effectiveness of the approach followed, which became an example of a good response to the classic engineering trade-off between cost and safety.

**Keywords:** landslide; safety factor; Central Spanish Pyrenees; soft rocks; stabilization measures

## **1. Introduction**

Slope stabilization is probably one of the most typical, ancient and challenging issues of civil engineering. Commonly, problematic landslides affecting buildings and infrastructures are solved by installing a series of measures that lead to completely and immediately stopping the ground movement. Those measures normally include rigid retaining walls, anchors or a substantial variation of the ground profile's geometry, and their implementation can give rise to a compromise between cost and safety [1–3]. However, when dealing with a great amount of ground material, stopping the movement absolutely and instantaneously may not be the optimal solution. The literature shows the possibility of using different techniques for reducing the mobility of landslides, including small excavations and adjustments of the slope profile, the implementation of flexible walls or piles and the installation of drainage [4–9]. The latter is especially important since landslides are often triggered by precipitation and the resulting change in the groundwater level. Water reduces ground strength and increases pore pressures, contributing to the instability of a slope [7,10–13]. Controlling the water table is essential when dealing with shallow and slow movements, as any rainfall may accelerate or even reactivate a landslide. These issues are particularly relevant in areas characterized by rainy seasons followed by dry periods, such as the Mediterranean region and, more precisely, the Iberian Peninsula.

The "El Portalet" Pass is located at the municipality of "Sallent de Gallego", Huesca province, in Spain, and belongs to the Central Spanish Pyrenees. In 2004, an excavation to

**Citation:** Cobos, G.; Eguibar, M.Á.; Torrijo, F.J.; Garzón-Roca, J. A Case Study of a Large Unstable Mass Stabilization: "El Portalet" Pass at the Central Spanish Pyrenees. *Appl. Sci.* **2021**, *11*, 7176. https://doi.org/ 10.3390/app11167176

Academic Editors: Ricardo Castedo, Miguel Llorente Isidro and David Moncoulon

Received: 2 July 2021 Accepted: 2 August 2021 Published: 4 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

create a parking lot next to a local road was performed in the area. Parking excavation was conducted and resulted in the activation of a large landslide. That landslide occurred on a hillside in the surroundings of Petrusos Peak, in an area of about 0.35 km<sup>2</sup> , where superficial cracking and diverse instabilities at different movement rates had been identified in the past [14]. A carboniferous substrate characterizes the zone, affected by faults of slate nature, with gravel colluvium and sandy clay materials on foothills as a result of previous paleo slides. Natural slopes (average angles of 12◦ ) were also identified to be on the verge of instability [15].

The parking excavation reactivated small slides previously identified in the area [16,17] and produced new ones. The first instability signs appeared at the end of the year, when large cracks appeared at the head of the slope and propagated afterwards along the hillside, showing cracks and longitudinal and transversal deformations, even breaking the slope toe. This type of deformation characterizes a paleo-slide failure mechanism involving an important volume of material [15,18–20]. Due to the great deformations and large unstable masses involved in the landslide, the typical solutions aimed at absolutely and instantaneously stopping the ground movement would have represented a higher cost than the total elimination of the unstable hillside. Instead, the approach was based on controlling the evolution of the hillside movement and implementing a series of measures to slow down the ground movements, achieving the stabilization of the slope after some years while ensuring no damage to any infrastructure during that time.

This paper shows and analyses the stabilization measures projected and implemented at "El Portalet" Pass. The area has been intensively investigated since 2004 by diverse authors. Classical geotechnical investigations, including boreholes and the installation of inclinometers [15], were conducted after landslide activation. Several Differential GPS campaigns, as well as different satellite surface monitoring techniques, including the ground-based SAR (GB-SAR) technique and the Differential Synthetic Aperture Radar Interferometry (DinSAR) technique [14,21–24], were performed from 2006 to 2010. In addition, a new geotechnical investigation was conducted to monitor the landslide evolution 10 years after the initial implementation of the stabilization measures. These data are presented for the first time in this paper. Data from all of these investigations are used to assess the performance and the degree of success of the stabilization measures, establishing the expected future evolution of the landslide.

## **2. Materials and Methods**

## *2.1. Geographical and Geological Situation*

The case study is located at "Sallent de Gallego" (Huesca, Spain). That area belongs to the high basin of the Gallego River Valley (Figure 1), in the Central Spanish Pyrenees.

The studied hillside corresponds to the southwestern slope of the spurs located between Petrusos Peak and the Old Pass of Sallent (latitude 42°48′4.83″ N; longitude 0°24′48.19″ W), with heights between 2128 m and 1848 m. The road next to which the parking lot was built, road A-136, is situated at the hillside foot, and links "El Portalet" Pass and "Sallent de Gallego" village. The Gallego River flows through lower elevations, parallel to the road. The excavation was performed at altitudes between 1735 and 1775 m. Figure 1 shows a view of the parking excavation, where road A-136 can also be seen.

From the geological point of view, the area under study is located in the Pyrenean Axial Zone, primarily composed of Devonian and Carboniferous materials, amongst which slates (sometimes with sandstone interlayers) are common, with some calcareous zones, all of which are affected by an intense Hercynian folding, accompanied by low-grade metamorphism. From a tectonic point of view, all of these materials are part of the Gavarnie thrust, which contains part of the southern configuration of the Pyrenean chain. Most of the hillside is covered by recent colluvial and quaternary deposits [25–27].

**Figure 1.** Geographical and geological setting of the area under study in the Central Spanish Pyrenees and view of the parking excavation works (note: bottom left image modified from [22]).

The area contains low strength rocks with intense active slope processes and with portions with low structural control. This is observed between the Inner Chains and "El Portalet" on the Gallego River basin, where slates, schists and clays have caused several slope evolution phenomena, associated with mass movements. Slumps are the most spectacular shapes, which leave a scar or tension crack on the head of the slope. Those slumps are deep and affect the substrata, also presenting deep weathering due to both their plasticity and tectonization. In some cases, Devonian slates have acted as a lubricant, their structure and faults located at the top of the slope sometimes causing an additional instability issue. The length of tension cracks can exceed 1 km, and the landslides portray the typical internal corrugation, a consequence of a mass movement.


One of these slumps was mapped by García Ruiz [16,17]. Two parallel tension cracks, which form successive steps (see Figure 1), match the southern slope of the Petrusos spurs and they also overlap with the convex surface presented by the rocky hillside. At its toe, the slumps can be found with two coalescent lobes that reach the fracture northwest of the sunken block of "El Portalet". The high density of such morphologies can be found on the Gallego head of the slope, especially between "El Portalet" and Formigal ski station in addition to the Peña Foratata southwestern slope, where superposed landslides tend to form great slumps. Most of them do not look very active, but other slumps are still moving or show a local reactivation.

This area is also located in a seismic zone, presenting a basic seismic acceleration of 0.10 g (with g being the gravitational acceleration), according to the Spanish NCSR-02 standard [28].

## *2.2. Previous Field Investigations*



Previous geological–geotechnical investigations [15] conducted in 2005 consisted of six boreholes (location is shown in Figure 2) with depths between 24 m and 40 m. Standard Penetration Tests (SPT) were performed on each borehole when crossing soils and very weathered rocks, with a frequency of about 3 m. Disturbed and undisturbed samples were extracted from each level identified. Laboratory tests conducted on such samples included general identification tests (e.g., grain size, Atterberg limits, unit weight) and mechanical tests (uniaxial compression strength on rocks and direct CD shear tests and triaxial CU tests on soils).


Inclinometers were installed on each borehole performed. Besides, topographical landmarks were placed along the hillside to measure surface movements. Attempts to locate landmarks were made next to the existing tension cracks on the head of the slope as well as in the excavated slope. Movements between pairs of landmarks were measured every 15 days and five measurement campaigns were carried out.

In addition, an inventory of water points (e.g., springs and permanent courses) was carried out along the entire hillside, also recording the water table position in the excavated slope.

**Figure 2.** Location of the boreholes performed in the previous study [15] ("old" boreholes, conducted in 2005, reprinted with permission from ref. [15], Copyright Year 2005, Copyright Owner F.J. Torrijo) and the new geotechnical investigation ("new" boreholes, conducted in 2016); profile I-I′ relates to Figure 5.

## *2.3. Corrective Measures*

A series of corrective measures were proposed to be performed on the slope to stabilize the landslide. Those measures (Figure 3) were not aimed to stop the hillside movement immediately and absolutely but to slow it down and prevent any damage to surrounding infrastructures. Eventually, the measures projected were expected to completely stabilize the hillside and stop the movement (or reduce it to a minimal value) after some years' time. Corrective measures applied included:

• A slight modification of the slope geometry by means of the excavations of interme-


**Figure 3.** Stabilization measures projected; red dotted line and green line show the slope profile before and after implementing the stabilization measures (highlighted in bold), respectively.

## *2.4. Satellite Surveys*

After conducting stabilization works on the slope, the ground-based SAR (GB-SAR) technique was used to monitor the landslide during the years 2006 and 2007 as well as to predict the slope movement evolution [14]. That technique is based on SAR interferometry, i.e., the use of consecutive pairs of SAR images for obtaining information about displacements according to the phase difference (interferogram) between two consecutive SAR images [31,32]. For monitoring the landslide movements [33], the GB-SAR sensor was installed about 600 m from the slope (target distance was between 200 and 1300 m), and worked for 47 days, with an acquisition rate of 1 image per hour, a range resolution of 1.7 m and an azimuth resolution of 0.74°.
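For orientation, the conversion from an interferometric phase difference to a line-of-sight displacement follows the standard relation d = (λ/4π)·Δφ. The sketch below assumes a Ku-band wavelength typical of ground-based radar interferometers; the operating frequency of the sensor used in this campaign is not stated here.

```python
import math

# Line-of-sight displacement from an unwrapped interferometric phase
# difference: d = (lambda / (4*pi)) * delta_phi. The wavelength is an
# assumed Ku-band value, not a parameter quoted in the text.

WAVELENGTH_M = 0.0174  # ~17.4 mm, assumed Ku-band GB-SAR wavelength

def los_displacement(delta_phi_rad: float,
                     wavelength_m: float = WAVELENGTH_M) -> float:
    """Line-of-sight displacement (m) for an unwrapped phase difference (rad)."""
    return wavelength_m / (4.0 * math.pi) * delta_phi_rad

# One full phase cycle (2*pi) corresponds to half a wavelength of motion:
print(f"{los_displacement(2.0 * math.pi) * 1000:.2f} mm")  # -> 8.70 mm
```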

Results of GB-SAR were validated with the results of five Differential GPS (DGPS) campaigns, in which 93 points were monitored, both stable points located next to the landslide active area and points within the landslide mass. A good match between the two techniques (GB-SAR and DGPS) was found, with a maximum difference of 1.5 cm.

In addition, the Differential Synthetic Aperture Radar Interferometry (DinSAR) technique was employed [21,22] to monitor different landslides of the Upper Tena Valley (located in the Central Spanish Pyrenees area), including the area under study. That technique uses microwave remote sensing and produces measurements of surface displacement of high accuracy with a great coverage capacity [34]. The SPN approach [35], which combines both the Persistent Scatterers [35–38] and Small Baselines [39–43] methods, was used.



In total, 43 SAR images acquired by ERS-2 and ENVISAT satellites (from 2001 to 2007), 14 SAR images acquired by the TerraSAR-X satellite (2008) and 12 SAR images acquired by the ALOS PALSAR (L-band) satellite (from 2006 to 2010) were processed. An auxiliary DGPS campaign was also undertaken to validate the results.

More details about this investigation can be found in [14,21–24].


## *2.5. New Field Investigations*

In 2016, approximately 10 years after conducting the stabilization works of the slope, three new boreholes were performed (Figure 4a) at the area under study. Those boreholes attained a depth between 30 m and 45 m and were performed close to those executed in the previous field investigation (see Figure 2). On each borehole, SPT tests were performed in soils and very weathered rocks, with a frequency of about 3 m. Disturbed samples were extracted from each level found and general identification laboratory tests were carried out (grain size, Atterberg limits and unit weight).

Similar to the actions performed in the previous geological–geotechnical investigation, inclinometers were installed on the new boreholes (Figure 4b). Besides, a new inventory of water points was carried out along the entire hillside (Figure 4c).

**Figure 4.** New geotechnical investigation works: (**a**) Borehole performance; (**b**) Inclinometer installation; (**c**) Inventory of water points; (**d**) Example of the Quaternary material obtained in the boreholes.

## *2.6. Limit Equilibrium Analysis*

Stability analysis of the landslide was carried out in its natural state and after implementing the corrective measures, based on limit equilibrium methods. These methods apply the laws of statics and assume the shear strength of the soil to be totally and simultaneously developed along the sliding surface (failure surface). With such methods, the slope safety factor may be computed as the ratio between the available shear strength on the sliding surface and the shear strength needed to keep the sliding mass in strict equilibrium.

Nowadays, the most common way of applying limit equilibrium methods is the method of slices. This method divides the sliding soil mass into many vertical slices to solve the stability problem. The method assumes that failure of the soil is governed by the Mohr–Coulomb criterion, that slices behave as rigid bodies and that no stresses exist inside each slice. The equation system obtained once the equilibrium of forces is established on each slice requires assuming different simplifications and/or hypotheses to be solved, which leads to several "sub-methods". In this work, Bishop's method [44] and the Morgenstern–Price method [45] were used. The former is an approximate method that establishes the equilibrium of vertical forces and bending moments, assumes a horizontal resultant of the interslice forces and does not take into account interslice shear forces. The latter is an exact method that establishes the stability problem using the three equilibrium conditions in slices of differential thickness and assumes that the inclination of the forces between slices is proportional to a given function.
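To illustrate the iterative character of these computations, the sketch below implements Bishop's simplified method for a circular failure surface. The three-slice profile and strength parameters are invented for demonstration only (they are not the profile I-I′ data), and a pseudo-static variant would additionally add horizontal forces proportional to the seismic coefficient to the driving term.

```python
import math
from dataclasses import dataclass

@dataclass
class Slice:
    weight_kn: float  # slice weight W per unit length (kN/m)
    alpha_deg: float  # inclination of the slice base (degrees)
    width_m: float    # slice width b (m)
    pore_kpa: float   # pore pressure u at the slice base (kPa)

def bishop_fs(slices, c_kpa, phi_deg, tol=1e-6, max_iter=100):
    """Iterate FS = resisting / driving terms until convergence (Bishop simplified)."""
    tan_phi = math.tan(math.radians(phi_deg))
    driving = sum(s.weight_kn * math.sin(math.radians(s.alpha_deg)) for s in slices)
    fs = 1.0  # initial guess
    for _ in range(max_iter):
        resisting = 0.0
        for s in slices:
            a = math.radians(s.alpha_deg)
            m_alpha = math.cos(a) + math.sin(a) * tan_phi / fs  # Bishop's m_alpha
            resisting += (c_kpa * s.width_m
                          + (s.weight_kn - s.pore_kpa * s.width_m) * tan_phi) / m_alpha
        fs_new = resisting / driving
        if abs(fs_new - fs) < tol:
            return fs_new
        fs = fs_new
    return fs

# Invented three-slice demo (residual c' = 5 kPa, phi' = 22 degrees):
demo = [Slice(400, 35, 4, 30), Slice(650, 20, 4, 45), Slice(300, 5, 4, 20)]
print(f"FS = {bishop_fs(demo, c_kpa=5.0, phi_deg=22.0):.2f}")  # ~0.88, i.e. unstable
```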

Several slope profiles were selected to compute the landslide safety factor applying the two mentioned methods. Simulations were run with and without considering the seismic acceleration (0.10 g, with g being the gravitational acceleration).

## **3. Results**

## *3.1. Field Investigations and Satellite Surveys*

In situ (SPT) and laboratory tests conducted during the different investigations identified six geotechnical units in the area under study. Table 1 lists the average parameters of those units, related to the SUCS classification, Atterberg limits, unit weight, water content and mechanical parameters (cohesion and friction angle for soil materials and uniaxial compression strength for rocks).


**Table 1.** Geotechnical parameters of the materials found in the area under study.

Notation: SUCS: Soil Unified Classification System (Casagrande's classification); WP: plastic limit; PI: plasticity index; γ: unit weight; W: water content; c: cohesion (residual); ϕ: friction angle (residual); UCS: uniaxial compression strength.

Rock mass of the area consisted of variably calcareous slates with calcite-filled veins (geotechnical unit "Slate rock") belonging to the "Facies Culm" from the Carboniferous period. This rock mass presented an average RQD of 60% and the RMR reached an average value of 60. According to the geological–geotechnical investigations conducted, the rocky formation crops out only in the northern area, where no instabilities were detected. The rest of the slope surface is covered by quaternary deposits (Figure 4d), which include colluvial deposits, green clayey sands, black clayey sands and fragmented calcareous rocks and slates. Additionally, fault breccia materials were found in an old fault detected, though currently inactive (yet some water was seen circulating through it). Except for colluvial deposits, the quaternary materials come from a roto-translational landslide with unidirectional movement that occurred in the area in the past. Materials appeared unstructured and remoulded due to paleo-sliding.

Data from boreholes, inclinometers and surface movements led to defining a failure surface located at depths between 5 m and 19 m (see Figure 5), with a fairly planar shape and with greater curvature in the slope toe area. The failure surface daylights on the slope and is observed on the ground surface (at both the slope head and toe) as seen in Figure 6. Surface movements in the E–SE direction, topographically recorded during the previous geotechnical investigations [15], showed small movements on the head of the slope as well as on the excavated slope. Those movements ranged between 1 cm and 3 cm, with movements in the northern area being slightly inferior to southern areas.


**Figure 5.** Geotechnical profile I-I′ (see Figure 2) of the area under study; the failure surface detected is shown in red.

**Figure 6.** Daylight of the failure surface: (**a**) stress crack; (**b**) warped plane morphology.

Inclinometers revealed the development of the failure surface mainly through the sandy clay materials in contact with the fragmented calcareous rock. An example of the inclinometer data obtained in the previous study [15] and the new geotechnical investigations is shown in Figure 7 (note that those inclinometers belong to profile I-I′ shown in Figure 5; the failure surface may be easily identified).


**Figure 7.** Inclinometer records: (**a**) Inclinometer placed in borehole B-5, reprinted with permission from ref. [15], Copyright Year 2005, Copyright Owner F.J. Torrijo (old geotechnical investigations [15], conducted in 2005); (**b**) Inclinometer placed in borehole B-9 (new geotechnical investigations, conducted in 2016).

Topographical works developed during the previous investigation [15] detected the existence of three main landslides in the area under study (landslides 1, 2 and 3 in Figure 8). This information was confirmed by the DinSAR data [21,22]. One of those landslides (landslide 3) was not affected by any anthropogenic activity (i.e., it was not a consequence of the parking excavation) and was an extremely slow movement [46,47] with an estimated average velocity [22] of approximately 16 mm/year, affected by seasonal rainfalls. The second landslide (landslide 1) was activated by the parking excavation and corresponds to the slope movement under study. Its movement rate, estimated by the auxiliary DGPS campaign at approximately 0.1 m/year, meant it was not possible to compute its velocity with the DinSAR technique [22]. The third landslide (landslide 2) was located next to the previous one, was also activated by the parking excavation, and its average velocity was estimated [22] at approximately 25 mm/year. ALOS PALSAR differential interferometry [24] was also applied to this last landslide, obtaining similar results.

**Figure 8.** Main landslides detected in the area under study.

DGPS measurements [14], conducted after performing the stabilization work on the slope (i.e., landslide 2 according to Figure 8), showed that maximum displacement rates were around 1 mm/day in the summer while rates increased up to 2 mm/day in the fall. Those measurements also identified that the landslide's most active part was an area located below the main scarp. The GB-SAR technique discovered that the landslide displacements were fairly linear in time, although slight differences were observed due to daily rainfall variations. Generally, an acceleration of the landslide occurs when rainfall events take place.


In fact, hydrogeological results obtained in the previous study [15] and the new geotechnical investigation indicated that surface materials were permeable and the hillside drains with abundant subterranean water flow. Continuous flow springs near the slope toe were identified, with that flow occurring through both the permeable material and the failure surface. The water table was found approximately on the failure surface, suffering oscillations with precipitations. Thus, the relationship between the landslide evolution rate and rainfall reported by the use of the GB-SAR technique [14] was confirmed due to the easy infiltration of water in the sliding mass, thanks to the high drainage capacity of the colluvial material located on the top of it.

## *3.2. Limit Equilibrium Analysis*

Several slope profiles were analysed using limit equilibrium methods by means of the method of slices. The geotechnical parameters obtained in the field investigations and listed in Table 1 were used to conduct such analyses. Bishop's method [44] was applied to compute the safety factor without considering the seismic action. The Morgenstern–Price method [45] was applied to compute the safety factor considering the seismic action, which was taken into account by a pseudo-static approach considering two orthogonal forces, with their values based on the Spanish code [28].

Under natural conditions, once the parking area was excavated and prior to implementing any corrective measure, profile I-I′, according to Figures 2 and 5, was found to be the most unfavourable profile. Without taking into account the seismic component, the safety factor at that profile resulted in 0.86. That confirms the existence of the landslide produced by the parking excavation and the subsequent movement of the slope. It should be mentioned that this landslide was a first-time failure, not a reactivation of a previous one. When the seismic action was taken into account, a safety factor of 0.59 was obtained for the same profile (i.e., about 30% lower).

The analyses were run again introducing the projected measures and considering the final stage of the projected slope. The water table was located following the data obtained from the inventory of water points carried out in the new investigations and considering that the capacity of the California drains is greater than the recharge of the aquifer (an average level was considered since the water table depends on rainfall, see the next section). Profile I-I′ was still found to be the most unfavourable. In this case, the safety factor without taking into account the seismic component increased to 1.3; taking into account the seismic acceleration, the safety factor was 1.1 (about 15% lower). This result indicates that the slope is stable and the landslide is expected to be under control with the projected measures. Figure 9 shows the results computed by Bishop's method for profile I-I′ before and after implementing the stabilization measures, using the software Slide v5.028 [48].

**Figure 9.** Safety factor analyses of profile I-I′ (see Figure 2): (**a**) slope in its natural state after the parking excavation; (**b**) slope state after implementing the stabilization measures.

It is interesting to mention that the stability analysis calculations conducted considered the whole slope, and the corrective measures were designed accordingly. Considering only a part of the slope would not have been correct and may have given rise to erroneous results. For instance, Figure 10 shows a stability analysis carried out by both the Bishop [44] and the Morgenstern–Price [45] methods taking into account only the area around the toe wall. The safety factor obtained was nearly 1.6 in both cases, about 23% and 45%, respectively, greater than that computed in the scenario considered in Figure 9b. This may have led to the assumption that the installation of that wall was the only corrective measure needed, which would have resulted in it not being enough for controlling the hillside movement.

**Figure 10.** Stability analysis considering the projected wall alone.

## *3.3. Drainage Evolution*


Water table lowering plays an essential role in the stability of the landslide studied. The approximate volume of the adjacent aquifer to the area under study was estimated at 0.86 Hm<sup>3</sup>. However, based on the available data, the aquifer lateral supply was unknown, so this volume could be much higher. Although the effective porosity of the different materials traversed by the California drains was also unknown, adopting an estimated value of 5–7%, the volume of water to drain is probably between 100,000 m<sup>3</sup> and 200,000 m<sup>3</sup>. Besides, the groundwater recharge of the aquifer should be considered and this does not stop, being especially significant due to the high rainfalls in the area. Recharge can be estimated at around 1900 mm per year, widely distributed throughout the year except for the summer season.

As a consequence, the water table lowering does not only depend on the drainage capacity and the efficiency of the Californian drain but also on the weather and the annual recharge (Figure 11). As a result, the drainage of the slope needs time. During that time, the slope is expected to have a safety factor lower than 1.0, which explains the movement of the slope recorded by the different investigations (in fact, the safety factor increased from the situation depicted in Figure 9a to the one given in Figure 9b). Thus, although the future solution in the long term will be stable, attaining this situation will need to wait for some time during which the slope will exhibit unstable behaviour.

**Figure 11.** Expected evolution of the water table and its qualitative relationship with the safety factor.

Although the tendency of the water table is the one explained, this evolution does not take place through a uniform decay. Figure 12 shows the evolution of the lateral displacements recorded by the inclinometers located at a depth of 5 m in boreholes B5 and B9 (Figure 7). Both boreholes were located close to one another and correspond to the previous (year 2005) and new (year 2016) geotechnical investigations, respectively.

Figure 12 shows that, in 2005, the deformations were increasing and the slope was moving faster (A). Between 2005 and 2016 the deformations were greatly reduced, so the landslide was stopping (B). Finally, according to 2016 records, the slope continued to accelerate; from April to September the movement evolved from 5 to 20 mm, although from September to November it seemed to stabilize (C). This acceleration in a given year was also observed in 2005 for the same period, April to September. That period corresponds to the spring season when precipitations normally occur in the area under study together with the snowmelt phenomenon [49]. Thus, these results confirm the influence of the water table in the landslide stability, the long-term tendency observed in the evolution of the slope movement and the long time needed to attain drainage of the ground mass (Supplementary Materials).

**Figure 12.** Evolution of the movements at a depth of 5 m. Long-term trend. The movements are represented in blue for borehole B5 (installed in year 2005) and brown for borehole B9 (installed in year 2016). Date given in Day/Month/Year. Letter A refers to the fast increment in deformations in 2005; B refers to the reduction of deformations between 2005 and 2016; C refers to the acceleration of deformations registered in 2016.


## **4. Discussion**

The case study presented in this paper at "El Portalet" Pass in the Spanish Pyrenees is an example of dealing with a large landslide by slowing it and controlling its behaviour by different instrumentation and monitoring techniques. The proposed approach was fostered by the slow surface movements (about 0.3 m/year) observed during the previous geotechnical investigations and recorded in the inclinometers installed [15]. This value led to the conclusion that the hillside material had a slow type of movement [46] with both distensile and compressive areas. Therefore, permanent infrastructures (such as the parking lot) situated over the hillside were expected not to be seriously harmed by the natural movement of the slope. Nevertheless, the stability of the landslide in the long term must be ensured for both the safety and the economic efficiency of the parking lot and the surrounding area.

The slow movement allowed for the projection of a series of hillside stabilization measures that tended towards the reduction of the movement velocity, eventually attaining the total stopping of the landslide evolution. These measures included the modification of the slope geometry, the installation of drainage and the construction of a flexible retaining toe wall. Data obtained from diverse geotechnical investigations, satellite monitoring [14,15,18,19,21–24] and numerical models were used to assess the effectiveness of such measures. In addition, new geotechnical investigations, involving new boreholes and the installation of new inclinometers, were conducted, showing that the evolution of the landslide tends towards its total stabilization (no movement).

The initial design of the parking lot included a basic stability analysis of the hillside and the implementation of a rigid retaining wall to contain the ground [15]. However, the landslide activated once the parking area was excavated, which showed the inefficacy of that analysis and those measures. As indicated in Section 3.2, the stability analysis of the area under study should have considered the entire slope and not have been restricted to the area just adjacent to the parking excavation, to avoid erroneous results and a misjudgement of the measures needed. A stability analysis that leads to a safety factor greater than 1.0 does not guarantee the slope to be stable, especially when involving large areas, as is the case under study. Besides, it should be noted that limit equilibrium methods do not consider deformations, so a slope may sometimes be unstable even though the safety factor is greater than 1.0, as reported in some cases found in the literature [50].

An important aspect involving a large area, and also shown in this case study, is the possibility that a landslide triggers other landslides in the area. The cascade effect that a potential landslide may have on other points of the slope should never be neglected. These additional landslides are difficult to predict unless a geological exploration is conducted in the area potentially affected. All in all, the stability analysis of potential landslides involving large ground masses, even though adjusted to the corresponding design requirements, should be supported by thorough geological and geotechnical surveys that enable verifying that the analysis conducted properly fits the natural processes.

In the case under study, the influence of the water table plays a critical role in the stability of the landslide. In particular, the drainage time needed for lowering the water table was shown to be decisive in the instability of the system. Thus, establishing only a final target situation and assuming it will be reached in the short term may give rise to errors. As shown in this work, stability takes time to be reached, an issue that should be considered in the design of slope stability measures and the subsequent monitoring process. Thus, although a large number of drains were installed, the water outlet from the aquifer did not have a high discharge ratio, so lowering the water table needs more than 10 years to be effective.

Regional climatology showed a direct influence on the water table evolution. The region under study presents high values of rain and snow precipitation as well as snowmelt events [49], all of which prevent the water table from decreasing efficiently. According to the data obtained by the inclinometers installed at the geotechnical investigations, during certain periods of the year the water table increases, as indicated by the landslide acceleration in 2016. This result is evidence that the water table does not always decrease but it may also increase in some periods throughout the year in which the water supply (due to precipitation and snowmelt events) exceeds the outlet of water through the drainage measures.

Therefore, the transition period between the construction works and the stabilization of the long-term water table may give rise to an unfavourable scenario in terms of stability. This motivates the need for monitoring large landslides such as the one under study to establish the moment when the landslide may be considered to be under control. In this case, once the water table reaches the desired level considered in the stability analyses, seasonal oscillations are not expected to cause any instability in the slope, so the landslide may be considered stable and no significant movements of the ground are expected from that point.

Finally, it is interesting to mention that the cost of the stabilization measures carried out in this case study is estimated at EUR 1.5 M. This type of solution forces slope control (monitoring) and the acceptance of a certain degree of ground deformation; however, considering that the cost of implementing the common measures aimed at totally stopping the landslide would be considerably higher, it is clear that the engineering solution adopted was a good answer to the classical engineering trade-off between cost and safety.

**Supplementary Materials:** A Kmz file is available at https://www.mdpi.com/article/10.3390/app11167176/s1.

**Author Contributions:** Conceptualization, F.J.T. and J.G.-R.; methodology, F.J.T. and J.G.-R.; project administration, G.C. and F.J.T.; software, J.G.-R.; validation, G.C. and F.J.T.; formal analysis, M.Á.E. and F.J.T.; writing—original draft preparation, J.G.-R.; writing—review and editing, G.C. and M.Á.E.; visualization, G.C. and M.Á.E.; supervision, G.C. and F.J.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. The authors fully acknowledge the financial support provided by the Department of Geological and Geotechnical Engineering of the UPV.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy reasons.

**Acknowledgments:** Authors thank ICOG for providing them with the geotechnical data related to the "previous geotechnical investigation" works. Thanks are also given to IBERGEOTECNIA for its collaboration in the "new geotechnical investigation" works.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Article* **A Novel Decomposition-Ensemble Learning Model Based on Ensemble Empirical Mode Decomposition and Recurrent Neural Network for Landslide Displacement Prediction**

**Xiaoxu Niu <sup>1</sup> , Junwei Ma 1,2,\* , Yankun Wang <sup>3</sup> , Junrong Zhang <sup>4</sup> , Hongjie Chen <sup>5</sup> and Huiming Tang 1,2**


**Featured Application: The proposed decomposition-ensemble learning model can be efficiently used to enhance the prediction accuracy of landslide displacement prediction and can also be extended to other difficult forecasting tasks in the geosciences with extremely complex nonlinear data characteristics.**

**Abstract:** As a vital component of landslide early warning systems, accurate and reliable displacement prediction is essential and of significant importance for landslide mitigation. However, obtaining the desired prediction accuracy remains highly difficult and challenging due to the complex nonlinear characteristics of landslide monitoring data. Based on the principle of "decomposition and ensemble", a three-step decomposition-ensemble learning model integrating ensemble empirical mode decomposition (EEMD) and a recurrent neural network (RNN) was proposed for landslide displacement prediction. EEMD and kurtosis criteria were first applied for data decomposition and construction of trend and periodic components. Second, a polynomial regression model and an RNN with maximal information coefficient (MIC)-based input variable selection were implemented for individual prediction of the trend and periodic components independently. Finally, the predictions of the trend and periodic components were aggregated into a final ensemble prediction. The experimental results from the Muyubao landslide demonstrate that the proposed EEMD-RNN decomposition-ensemble learning model is capable of increasing prediction accuracy and outperforms the traditional decomposition-ensemble learning models (including EEMD-support vector machine and EEMD-extreme learning machine). Moreover, compared with standard RNN, the gated recurrent unit (GRU)- and long short-term memory (LSTM)-based models perform better in prediction accuracy. The EEMD-RNN decomposition-ensemble learning model is promising for landslide displacement prediction.

**Keywords:** landslide displacement prediction; decomposition-ensemble model; recurrent neural network (RNN); ensemble empirical mode decomposition (EEMD); maximal information coefficient (MIC)

## **1. Introduction**

Landslides are a ubiquitous global hazard [1] posing significant threats to life and property. Statistical data show that landslide disasters affected 5 million people and caused total damage of 4.7 billion US dollars during the period from 2000 to 2020 [2].


As shown in Figure 1, China, the USA, Japan, Nepal, and India are the most landslide-prone regions [3], among which China suffers the most landslide disasters. In the past two decades, landslides have killed 3706 people and caused over 2 billion US dollars of estimated damage in China. Landslide early warning has proven to be the most effective measure for landslide mitigation [4,5], and landslide displacement prediction has been catching extensive attention from practitioners and scholars because of its significant importance in early landslide warning systems [6,7]. However, due to the inherent nonlinear characteristics of landslide monitoring data, achieving the desired prediction accuracy remains highly difficult and challenging. Therefore, it is essential to develop an effective and accurate prediction model to improve the performance of landslide displacement prediction, thus aiding landslide mitigation.

**Figure 1.** Spatial distribution of landslide disasters during the period from 2000 to 2020. Each dot represents a single landslide. The insets show the total deaths and total estimated damages. (Source: https://public.emdat.be/data, accessed on 2 December 2020).

A variety of landslide displacement prediction models have been proposed since the pioneering work of Saito [8]. These prediction models generally fall into two main groups: Physics-based models and data-driven models [9]. Physics-based models generally require a clear understanding of the physical processes that involve a large amount of input, sophisticated mathematical tools, and significant user expertise. Therefore, the generalization ability of physics-based models is limited [4].

Recently, data-driven models, including artificial neural networks (ANNs) [10], decision trees [4], extreme learning machines (ELMs) [11,12], support vector machines (SVMs) [13–15], quantile regression neural networks [16], random forest (RF) [17], and kernel-based ELMs and SVMs [9,18,19], have attracted attention in landslide displacement prediction. These studies have demonstrated that a data-driven model is capable of providing satisfactory predictions by recognizing movement patterns in historical monitoring data and establishing a mapping between input and output displacements without the requirement of complex physical processes. Recent applications have demonstrated the feasibility of data-driven models to capture nonlinear relationships and to model landslide dynamic processes based on historical model data; however, limitations remain.

First, in most data-driven models, the input variables that have an important influence on the accuracy of landslide displacement prediction [9] are selected based on a priori expert knowledge, trial and error, or linear cross-correlation [9,12]. Nevertheless, a priori expert knowledge of landslide systems is biased [20], or not always available, or even when available, knowledge acquisition tends to be a difficult and time-consuming process. Generally, input variable selection via trial and error is a brute-force process that is computationally expensive, especially for data-driven models with large input candidates. The most commonly used linear correlation coefficients only evaluate linear correlation and cannot reveal the nonlinear relationships that are generally involved in data-driven models. Therefore, a clear need exists for a systematic input variable selection process that does not rely on a priori expert knowledge, is computationally inexpensive, and can describe nonlinear relationships.

Second, conventional data-driven models ignore the intrinsic temporal dependency, which involves the effect of preceding actions on present actions [21,22]. Actually, measured landslide displacement data contain temporal dependencies [23,24].

The abovementioned limitations can be addressed from the following perspectives. The first is to utilize a mutual information index, which describes nonlinear relationships by the amount of related information jointly owned by two or more variables [25], for input variable selection. The second solution is to recognize intrinsic temporal dependencies by deploying advanced modeling techniques. A promising solution is the recurrent neural network (RNN) [26]. The temporal dependency in monitoring data can be captured by adopting a sequential approach, thereby improving the ability to model dynamic systems. In addition, the "decomposition-ensemble" learning paradigm can also be considered a promising tool for analyzing series with complex nonlinearity characteristics and enhancing prediction accuracy [27–31]. The effectiveness of the "decomposition-ensemble" paradigm has already been confirmed in a variety of fields.

Based on the "decomposition-ensemble" principle, a novel "decomposition-ensemble" learning model integrating EEMD and RNN was proposed in this study to enhance the performance of landslide displacement prediction. The Muyubao landslide located in the Three Gorges Reservoir area was selected as a case study to verify the performance of the proposed model.

## **2. Study Area and Datasets**

## *2.1. Overview of the Muyubao Landslide*

The Muyubao landslide, an ancient landslide, is located in Zigui County, Hubei Province and is situated on the right bank of the Yangtze River (see Figure 2 for the landslide location). The length and width of the landslide are approximately 1500 m and 1200 m, respectively. The landslide is 50 m thick on average. The landslide covers approximately 2 million m<sup>2</sup> in the planar area and has a volume of approximately 90 million m<sup>3</sup>. The altitude at the toe of the landslide is 100 m, and the altitude at the crown is 520 m (see Figure 2 for the landslide geological profile). The Muyubao landslide mainly slides in a direction of 20 degrees from North. The borehole analysis reveals that the Muyubao landslide slides along a soft coal layer with an average thickness of 0.2 m. The landslide materials are distributed in two layers: The upper Quaternary deposit and the lower highly disturbed rock mass (Figure 2). The underlying bedrock is mudstone and sandstone of the Jurassic Xiangxi Formation.


**Figure 2.** Location and geological profile of the Muyubao landslide, Three Gorges Reservoir area.

## *2.2. Data Collection*

The ancient Muyubao landslide was reactivated by the impoundment of the Three Gorges Reservoir in September 2006. A landslide monitoring system consisting of twelve GPS survey monuments was installed on the landslide mass (see Figure 2 for GPS monument locations) to monitor landslide movement. Nearly 13 years of monitoring data from October 2006 to October 2018 were acquired. According to the monitoring data, the maximum landslide displacement occurred at ZG291 with a cumulative displacement of 2437.36 mm. The landslide displacement at ZG291, reservoir level in the Yangtze River, and rainfall intensity are shown in Figure 3. As shown, the Muyubao landslide exhibits step-like deformation. Sharp increments of displacement occur mainly from November to March, with the reservoir level decreasing from 175 m to 165 m.

**Figure 3.** Time series of landslide displacement at ZG291, reservoir level, and rainfall intensity during the monitoring period from October 2006 to October 2018.

## **3. Methodology**

## *3.1. Ensemble Empirical Mode Decomposition*

Empirical mode decomposition (EMD) is an approach to decompose nonlinear signals into a finite number of simple components called intrinsic mode functions (IMFs). These components form a complete and nearly orthogonal basis for the original signal. The main idea of EMD is repeatedly subtracting the local mean from the original signal. EEMD was improved from EMD to overcome modal aliasing problems by adding white noise [32], and it has been widely used for the decomposition of nonlinear and nonstationary signals [33]. EEMD has the advantages of robust self-adaptability and local variation. As shown in Figure 4, the EEMD decomposition process can be briefly described by the following steps:

**Figure 4.** Schematic diagram of the EEMD decomposition process.

(1) Add a random noise signal *nj*(*t*) to the original raw data *x*(*t*) to obtain the noise-added data signal *xj*(*t*):

$$\mathbf{x}\_{j}(t) = \mathbf{x}(t) + n\_{j}(t), j = 1, 2, \cdots, M \tag{1}$$

(2) Use EMD to decompose the noise-added data *xj*(*t*) into several IMFs:

$$x\_j(t) = \sum\_{i=1}^{L} c\_{i,j}(t) + r\_{L,j}(t), j = 1, 2, \cdots, M \tag{2}$$

where *ci*,*j*(*t*) is the *i*th IMF of noise-added data *xj*(*t*) in the *j*th decomposition and *rL*,*j*(*t*) is the corresponding residue.

(3) Compute the ensemble means of the corresponding IMFs and residues over the *M* trials to obtain the final decomposition:
$$\overline{c}\_{i}(t) = \sum\_{j=1}^{M} c\_{i,j}(t) / M \tag{3}$$

$$\overline{r}\_L(t) = \sum\_{j=1}^{M} r\_{L,j}(t) / M \tag{4}$$
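For readers who want to reproduce the decomposition, the following Python sketch implements the ensemble loop of Equations (1)–(4) on top of a single-trial EMD. It assumes the third-party PyEMD package (published as EMD-signal), which also ships a ready-made EEMD class that would normally be preferred in practice; the function below is written out only to mirror the steps above.

```python
import numpy as np
from PyEMD import EMD  # third-party package "EMD-signal" (an assumed dependency)

def eemd(x, trials=100, noise_width=0.2, max_imfs=6, seed=0):
    """Ensemble EMD sketch following Eqs. (1)-(4): add white noise,
    decompose each noisy copy with EMD, then average the components."""
    rng = np.random.default_rng(seed)
    sigma = noise_width * np.std(x)          # noise scaled to the signal
    acc = np.zeros((max_imfs + 1, len(x)))   # rows: IMFs plus the residue
    for _ in range(trials):
        xj = x + rng.normal(0.0, sigma, len(x))       # Eq. (1)
        imfs = EMD().emd(xj, max_imf=max_imfs)        # Eq. (2)
        k = min(len(imfs), max_imfs + 1)              # trials may differ in count
        acc[:k] += imfs[:k]
    return acc / trials                               # Eqs. (3) and (4)
```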

## *3.2. Maximal Information Coefficient (MIC)*

Compared with traditional statistical indexes such as the Pearson coefficient, the MIC allows the detection of various correlation relationships, including linear, non-linear, functional, and non-functional relationships. Secondly, the MIC is designed to maintain similar results even in the presence of equal levels of noise of different types [34,35].

For continuous variables *x* and *y*, the MIC between *x* and *y* is described by the following formula:

$$\text{MIC}(x, y) = \max\left\{ \frac{I(x, y)}{\log\_2 \min\{n\_x, n\_y\}} \right\} \tag{5}$$

where

$$I(x, y) = H(x) + H(y) - H(x, y) = \sum\_{i=1}^{n\_x} p(x\_i) \log\_2 \frac{1}{p(x\_i)} + \sum\_{j=1}^{n\_y} p(y\_j) \log\_2 \frac{1}{p(y\_j)} - \sum\_{i=1}^{n\_x} \sum\_{j=1}^{n\_y} p(x\_i, y\_j) \log\_2 \frac{1}{p(x\_i, y\_j)}$$

where *p*(*xi*) denotes the marginal probability of *x*, *p*(*yj*) denotes the marginal probability of *y*, *p*(*xi*, *yj*) denotes the joint probability density function of *x* and *y*, and *nx* and *ny* are the numbers of bins of the partitions of the *x*- and *y*-axes. An MIC of zero indicates that there is no dependence between the concerned variables, while an MIC of one implies a strong relationship [36]. Based on previous research, the final input variables with MICs greater than 0.1 [37,38] were selected from the input candidates for model training.
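The following Python sketch evaluates Equation (5) on one fixed *nx* × *ny* grid. The true MIC maximizes this quantity over many grid resolutions (libraries such as minepy implement the full search), so this is a simplified, illustrative estimate only.

```python
import numpy as np

def mic_fixed_grid(x, y, nx=8, ny=8):
    """Simplified MIC-style score on a single nx-by-ny grid (cf. Eq. (5));
    the full MIC would maximize this value over many grid resolutions."""
    # 2D histogram -> joint probabilities, then the two marginals
    pxy, _, _ = np.histogram2d(x, y, bins=(nx, ny))
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0  # only nonzero cells contribute to I(x, y)
    i_xy = np.sum(pxy[nz] * np.log2(pxy[nz] / np.outer(px, py)[nz]))
    return i_xy / np.log2(min(nx, ny))

rng = np.random.default_rng(1)
x = rng.normal(size=500)
print(mic_fixed_grid(x, x**2))                   # nonlinear dependence -> high score
print(mic_fixed_grid(x, rng.normal(size=500)))   # independent -> near zero
```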

## *3.3. Recurrent Neural Network*

An RNN is an artificial neural network wherein adjacent hidden neurons are connected [39]. These recurrent structures of RNNs can transfer time dependence through hidden units and consider temporal correlations. There are three main types of RNNs: Standard RNN, long short-term memory (LSTM), and gated recurrent unit (GRU) (Figure 5).

**Figure 5.** Basic structures of RNN units: (**a**) Standard RNN; (**b**) LSTM; (**c**) GRU.

## 3.3.1. Standard RNN

A standard RNN is a simple and powerful RNN. Figure 5a shows the typical structure of a standard RNN. *x<sup>t</sup>* is the input vector at time step *t* and *h<sup>t</sup>* is the hidden state of RNN cell at time step *t*, which is computed based on the hidden state (*ht–*1) at the previous time step *t*–1 and the input vector (*xt*) at the current time step *t*. Formally, the output of the hidden units of the standard RNN can be formulated as follows:

$$h\_t = \tanh\left(\mathcal{W}\_\mathbf{x}\mathbf{x}\_t + \mathcal{W}\_h h\_{t-1} + b\right) \tag{6}$$

The final output of the RNN depends not only on the input at the current time step but also on the calculated state of the hidden layer at the previous time step. Theoretically, an RNN can take advantage of all information no matter how long the sequences are. However, according to previous studies, because of the vanishing gradient problem, standard RNNs are suitable only for short-term dependencies [39,40].
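A single step of Equation (6) is straightforward to express in code. The NumPy sketch below uses toy random weights and the 7-input, 50-unit layout reported later in Section 3.4.2; it illustrates the recurrence only and is not the trained model.

```python
import numpy as np

def rnn_step(x_t, h_prev, Wx, Wh, b):
    """One standard-RNN step, Eq. (6): h_t = tanh(Wx x_t + Wh h_{t-1} + b)."""
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

# Toy dimensions mirroring the paper's setup: 7 inputs, 50 hidden units.
rng = np.random.default_rng(0)
Wx, Wh, b = rng.normal(size=(50, 7)), rng.normal(size=(50, 50)), np.zeros(50)
h = np.zeros(50)
for x_t in rng.normal(size=(12, 7)):  # unroll over a 12-step input sequence
    h = rnn_step(x_t, h, Wx, Wh, b)
```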

## 3.3.2. LSTM

LSTM was improved to overcome the gradient disappearance problem [41] in standard RNN [42]. Figure 5b shows the basic structure of LSTM. A typical LSTM cell consists of one unit state and three types of gates: Input gate (*it*), output gate (*ot*), and forget gate (*ft*). These three gates act as filters, serving different purposes. The input gate (*it*) determines what new information is going to be stored in the cell state (*Ct*). The output gate (*ot*) specifies what information from the cell state (*Ct*) is used as output. The forget gate (*ft*) determines what information will be moved away from the cell state (*Ct*). More formally, the outputs of the input gate (*it*), output gate (*ot*), and forget gate (*ft*) can be formulated as follows:

$$f\_t = \sigma \left(\mathcal{W}\_{f\mathbf{x}}\mathbf{x}\_t + \mathcal{W}\_{fh}h\_{t-1} + b\_f\right) \tag{7}$$

$$i\_t = \sigma(W\_{ix}\mathbf{x}\_t + W\_{ih}h\_{t-1} + b\_i) \tag{8}$$

$$o\_t = \sigma(W\_{ox}\mathbf{x}\_t + W\_{oh}h\_{t-1} + b\_o) \tag{9}$$

The current cell state (*Ct*) can be formulated as follows:

$$C\_t = f\_t \odot C\_{t-1} + i\_t \odot \tilde{C}\_t \tag{10}$$

The candidate cell state *C̃t* can be described by the following formula:

$$\tilde{C}\_t = \tanh(W\_{Cx}\mathbf{x}\_t + W\_{Ch}h\_{t-1} + b\_C) \tag{11}$$

The LSTM unit (*ht*) can be formulated as follows:

$$h\_t = o\_t \odot \tanh(C\_t) \tag{12}$$

where *Wf h*, *Wih*, *Woh*, and *WCh* are the linear correlation coefficient matrices; *Wf x*, *Wix*, *Wox*, and *WCx* are the coefficient matrices of the input variable; *σ*(·) denotes the sigmoid activation function; and *b<sup>f</sup>* , *b<sup>i</sup>* , *bo*, and *b<sup>C</sup>* are the bias terms of the corresponding formula.
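Equations (7)–(12) translate almost line by line into code, with the elementwise product ⊙ becoming `*`. The NumPy sketch of one LSTM step below uses hypothetical weight dictionaries keyed by gate name; it is meant only to make the gate structure concrete.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM step following Eqs. (7)-(12); W and b hold the gate weights
    keyed by 'f', 'i', 'o', 'C' with 'x' (input) and 'h' (recurrent) parts."""
    f_t = sigmoid(W['fx'] @ x_t + W['fh'] @ h_prev + b['f'])      # Eq. (7)
    i_t = sigmoid(W['ix'] @ x_t + W['ih'] @ h_prev + b['i'])      # Eq. (8)
    o_t = sigmoid(W['ox'] @ x_t + W['oh'] @ h_prev + b['o'])      # Eq. (9)
    C_tilde = np.tanh(W['Cx'] @ x_t + W['Ch'] @ h_prev + b['C'])  # Eq. (11)
    C_t = f_t * C_prev + i_t * C_tilde                            # Eq. (10)
    h_t = o_t * np.tanh(C_t)                                      # Eq. (12)
    return h_t, C_t

# Tiny usage with random weights: 7 inputs, 55 hidden units.
rng = np.random.default_rng(0)
n_in, n_h = 7, 55
W = {g + s: rng.normal(scale=0.1, size=(n_h, n_in if s == 'x' else n_h))
     for g in 'fioC' for s in 'xh'}
b = {g: np.zeros(n_h) for g in 'fioC'}
h, C = lstm_step(rng.normal(size=n_in), np.zeros(n_h), np.zeros(n_h), W, b)
```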

## 3.3.3. GRU

GRU was developed by [43] to simplify LSTM. Figure 5c shows the basic structure of GRU. A typical GRU unit contains two types of gates: A reset gate (*rt*) and an update gate (*zt*). The reset gate (*rt*) controls how much information from the previous state is written into the current candidate hidden layer vector *h̃t*. The smaller the reset gate (*rt*), the less information from the previous state is written. The update gate (*zt*) is used to control the degree to which the state information *ht*−<sup>1</sup> at the previous time step *t* − 1 will be brought into the current time step *t*. The larger the value of the update gate (*zt*), the more state information from the previous time step is brought in. The reset gate (*rt*) and update gate (*zt*) can be defined by the following formulas:

$$r\_t = \sigma \left(\mathcal{W}\_{rx}\mathbf{x}\_t + \mathcal{W}\_{rh}\mathbf{h}\_{t-1} + \mathbf{b}\_r\right) \tag{13}$$

$$z\_t = \sigma(\mathcal{W}\_{\text{zx}}\mathbf{x}\_t + \mathcal{W}\_{\text{z}h}h\_{t-1} + b\_{\mathbf{z}}) \tag{14}$$

The candidate hidden layer vector *h̃t* is defined as follows:

$$\tilde{h}\_t = \tanh(W\_{hx}\mathbf{x}\_t + W\_{hh}(r\_t \odot h\_{t-1}) + b\_h) \tag{15}$$

The output of the GRU unit can be formulated as follows:

$$h\_t = (1 - z\_t) \odot h\_{t-1} + z\_t \odot \tilde{h}\_t \tag{16}$$

where *Wrx*, *Wrh*, *Wzx*, and *Wzh* are the weight matrices; *b<sup>r</sup>* and *b<sup>z</sup>* are the bias terms.

## *3.4. Decomposition-Ensemble Learning Model for Landslide Displacement Prediction*

Based on the principle of the "decomposition-ensemble" methodology, a three-step learning model integrating EEMD and RNN can be formulated for landslide displacement prediction. As shown in Figure 6, the proposed EEMD-RNN learning model mainly consists of the following steps: Data decomposition, individual prediction, and ensemble prediction.

**Figure 6.** Overall process of the decomposition-ensemble learning model based on EEMD and RNN.

## 3.4.1. Data Decomposition

The data decomposition technique is useful for the accurate prediction of landslide displacement, as it can reduce the complexity and improve the interpretability of nonlinear time series. In the present study, EEMD and kurtosis criteria were applied for landslide displacement decomposition and construction of trend and periodic components for further landslide displacement prediction.

Kurtosis is a dimensionless parameter [44,45] describing the waveform peak that is formulated as follows:

$$K = \frac{1}{M} \sum\_{t=1}^{M} \left[ \frac{x(t) - \mu}{\sigma} \right]^4 \tag{17}$$

where *M* is the signal length, *µ* presents the average of the signal, and *σ* presents the standard deviation. A decomposed component with a higher kurtosis retains more deformation characteristics.
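Equation (17) is the fourth standardized moment and takes only a few lines of code; the NumPy sketch below follows the formula directly (note that, unlike some library defaults, it is not the excess kurtosis).

```python
import numpy as np

def kurtosis(x):
    """Kurtosis as defined in Eq. (17): the fourth standardized moment."""
    x = np.asarray(x, dtype=float)
    return np.mean(((x - x.mean()) / x.std()) ** 4)
```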

EEMD was used to decompose the landslide displacement data shown in Figure 3 into six IMFs and one residual (Figure 7). According to previous works [32], the noise added to the original signal and the maximum number of iterations were set to 0.2 and 100, respectively. The decomposed IMFs oscillate in descending order. The corresponding kurtoses for the decomposition components are listed in Table 1. The obtained kurtoses indicate that the decomposed residual term retains the overall deformation trend of the original time series with the largest kurtosis. Therefore, the residual component was treated as the main trend series for further landslide displacement prediction. The periodic series was obtained by subtracting the trend series from the original series [15]. As shown in Figure 7, the obtained trend components (*yT*) and periodic components (*yP*) show two characteristics: The trend components show an approximate monotonic increase in displacement with time, and the periodic components exhibit characteristics of a chaotic time series.

**Table 1.** The kurtoses for the decomposition components.


**Figure 7.** Decomposition results of landslide displacement using the EEMD method and time series of original landslide displacement and the trend and periodic components.

## 3.4.2. Individual Prediction

In the present study, 124 measurements from October 2006 to January 2017 were used as the training set for the prediction model, and 21 measurements from February 2017 to October 2018 were treated as testing data. According to previous research on landslide displacement prediction [46,47], the trend components are mainly controlled by internal geological conditions and can be perfectly predicted by polynomial regression fitting. In contrast, the periodic component is mainly controlled by external triggering factors, such as rainfall intensity and reservoir fluctuation. The major difficulty in landslide displacement prediction is accurate prediction of the periodic components. Therefore, polynomial regression fitting was treated as an individual prediction model to predict trend components. The trend component shown in Figure 7 can be fitted as follows:

$$y\_T(t) = -0.0337t^2 + 21.7046t + 7.1479\tag{18}$$

The coefficient of determination (R<sup>2</sup>) for the trend component is 1.000, which indicates a perfect model for the prediction of the trend component.
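Fitting such a trend component reduces to a degree-2 least-squares fit, e.g., with `np.polyfit`. In the sketch below the "measured" trend is a stand-in series generated from the coefficients of Equation (18) itself, since the original residual data are not reproduced here; with real data the same lines would return the fitted coefficients and R<sup>2</sup>.

```python
import numpy as np

# Hypothetical monthly time index covering the 145 measurements (124 + 21).
t = np.arange(1, 146)
# Stand-in for the EEMD residual; real data would replace this line.
y_trend = -0.0337 * t**2 + 21.7046 * t + 7.1479

coeffs = np.polyfit(t, y_trend, deg=2)   # degree-2 fit, analogous to Eq. (18)
y_fit = np.polyval(coeffs, t)
r2 = 1 - np.sum((y_trend - y_fit) ** 2) / np.sum((y_trend - y_trend.mean()) ** 2)
```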

Aiming at interpreting the behaviors between input candidates and model outputs and excluding irrelevant and redundant variables to develop accurate and cost-effective prediction models [48], the RNN with MIC-based input variable selection was implemented for individual prediction of the periodic components. Based on previous research related to landslide displacement prediction [49], seven commonly used variables were selected as input candidates, including three state candidates and four trigger candidates. The selected four trigger input candidates are the one-month antecedent rainfall (*x*1), two-month antecedent rainfall (*x*2), average value of the reservoir level for the current month (*x*3), and reservoir fluctuation for the current month (*x*4). The state candidates are the displacement in the past month (*x*5), displacement of the landslide in the past two months (*x*6), and displacement of the landslide in the past three months (*x*7). Pair plots and MICs between the input candidates and periodic components are shown in Figure 8. The pair plots show an approximately linear dependency between the periodic components (*yP*) and the state candidates (*x*5, *x*6, and *x*7). The MICs indicate that the seven input candidates have significant dependency on the periodic components, with MIC values larger than 0.2. Therefore, the seven input candidates were treated as the input for individual prediction of the periodic components.

**Figure 8.** Pair plots and MICs between the input candidates and periodic components.

The landslide measurements were first normalized in the range of 0 to 1 by min-max feature scaling. After the outputs from the EEMD-RNN approach were renormalized, the final displacement predictions were obtained. A simple trial-and-error method was adopted for the parameter tuning of the RNN, GRU, and LSTM networks. The results from the trial-and-error analysis show that RNN, GRU, and LSTM networks with one hidden layer perform better for landslide displacement prediction than multi-layer networks. Therefore, one hidden layer with topologies of 7-50-1, 7-55-1, and 7-50-1 was set up for RNN, LSTM, and GRU, respectively, in the present study. The epoch strategy, in which all data are sent through the network to complete one iterative calculation, was adopted. The epoch sizes were set to 1000, 400, and 100, respectively. Moreover, learning rate scheduling was adopted for faster convergence and convergence to a better minimum [50]. The corresponding learning rate parameters for RNN, LSTM, and GRU were set to 0.6, 0.7, and 0.5, respectively. More details about the parameter settings in the comparative studies are shown in Table 2.


**Table 2.** Parameter settings in the comparative studies.
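Under the settings reported above, the three networks can be assembled in a few lines with a high-level library. The sketch below uses TensorFlow/Keras as an assumed stand-in (the original analysis was run in RStudio); the topologies, epoch sizes, and learning rates follow Table 2, while the optimizer choice, `timesteps = 1`, and the toy data are assumptions made only for illustration.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_model(cell="LSTM", units=55, timesteps=1, n_features=7, lr=0.7):
    """One-hidden-layer recurrent model matching the reported topologies
    (7-50-1 for RNN and GRU, 7-55-1 for LSTM)."""
    cells = {"RNN": layers.SimpleRNN, "LSTM": layers.LSTM, "GRU": layers.GRU}
    model = tf.keras.Sequential([
        cells[cell](units, input_shape=(timesteps, n_features)),
        layers.Dense(1),  # single output: predicted periodic displacement
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=lr), loss="mse")
    return model

# Toy min-max scaled data standing in for the 124 training samples.
X = np.random.rand(124, 1, 7).astype("float32")
y = np.random.rand(124, 1).astype("float32")
model = build_model("LSTM", units=55, lr=0.7)
model.fit(X, y, epochs=400, verbose=0)
```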

## 3.4.3. Ensemble Prediction

The final ensemble predictions of landslide displacement were obtained by aggregating the predictions of the trend and periodic components. A comparative analysis was conducted with the following decomposition-ensemble learning models: EEMD-based RNN, EEMD-LSTM, EEMD-GRU, EEMD-SVM, EEMD-ELM, and EMD-LSTM. The parameters of the different models used in the comparative studies are listed in Table 2. The model comparative processes were performed in RStudio Version 1.2.5042 running on an Intel(R) Core (TM) i5-6300HQ CPU @ 2.3 GHz with 4 GB RAM.

## *3.5. Evaluation Metrics*

In the present study, six evaluation metrics, namely the mean absolute error (MAE), mean square error (MSE), mean absolute percentage error (MAPE), normalized root mean square error (NRMSE), coefficient of determination (R<sup>2</sup> ), and Kling-Gupta efficiency (KGE), were applied to evaluate the model performance. These evaluation metrics are defined as follows:

$$\text{MAE} = \frac{1}{N} \left( \sum\_{t=1}^{N} |y\_{pre,t} - y\_{obs,t}| \right) \tag{19}$$

$$\text{MSE} = \frac{1}{N} \sum\_{t=1}^{N} (y\_{pre,t} - y\_{obs,t})^2 \tag{20}$$

$$\text{MAPE} = \frac{1}{N} \left( \sum\_{t=1}^{N} \left| \frac{y\_{pre,t} - y\_{obs,t}}{y\_{obs,t}} \right| \right) \times 100\% \tag{21}$$

$$\text{NRMSE} = \frac{1}{\overline{y}\_{obs}} \sqrt{\frac{\sum\_{t=1}^{N} (y\_{pre,t} - y\_{obs,t})^2}{N}} \tag{22}$$

where *N* is the quantity of deformation monitoring data; *yobs*,*<sup>t</sup>* denotes the measured values of landslide displacement; *ypre*,*<sup>t</sup>* denotes the predicted values of landslide displacement; *yobs* and *ypre* represent the mean values of the observations and predictions; *r* is the linear correlation coefficient between the predicted displacement values *ypre*,*<sup>t</sup>* and the observed displacement values *yobs*,*<sup>t</sup>*; *α* = *σypre*/*σyobs* is a metric of the relative variability between the predicted and observed displacement; and *β* = *µypre*/*µyobs* is the ratio of the average predicted displacement to the average observed displacement. The MAE is the average of the absolute errors between the predicted values and actual values, which reflects the actual prediction error. The MSE is the expected value of the square of the difference between the predicted values and actual values, which evaluates the degree of variability in the data. The MAPE further considers the ratio between the error and the actual value. In general, the smaller the MAE, MSE, and MAPE values, the better the model performs. The NRMSE allows the errors to be read in a more understandable way, since it is a non-dimensional parameter. The R<sup>2</sup> measures the linear relationship between the predicted and actual values of a dependent variable, whereby a high value of R<sup>2</sup> (up to one) indicates a perfect model. KGE values range from negative infinity to 1. The KGE can evaluate model performance from three perspectives: Correlation, bias, and variability [51,52]. For an ideal prediction model, the value of KGE should be as close to 1 as possible.
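The excerpt does not print the R<sup>2</sup> and KGE formulas, but they follow the standard definitions consistent with the *r*, *α*, and *β* given above, with KGE = 1 − sqrt((*r* − 1)<sup>2</sup> + (*α* − 1)<sup>2</sup> + (*β* − 1)<sup>2</sup>). A compact NumPy implementation of all six metrics might look as follows.

```python
import numpy as np

def metrics(y_obs, y_pre):
    """Evaluation metrics of Eqs. (19)-(22) plus R^2 and KGE, with r, alpha,
    and beta defined as in the text (KGE per the standard definition)."""
    y_obs, y_pre = np.asarray(y_obs, float), np.asarray(y_pre, float)
    err = y_pre - y_obs
    mae = np.mean(np.abs(err))                         # Eq. (19)
    mse = np.mean(err ** 2)                            # Eq. (20)
    mape = np.mean(np.abs(err / y_obs)) * 100          # Eq. (21), y_obs != 0
    nrmse = np.sqrt(mse) / np.mean(y_obs)              # Eq. (22)
    r = np.corrcoef(y_obs, y_pre)[0, 1]
    alpha = np.std(y_pre) / np.std(y_obs)              # relative variability
    beta = np.mean(y_pre) / np.mean(y_obs)             # bias ratio
    kge = 1 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return dict(MAE=mae, MSE=mse, MAPE=mape, NRMSE=nrmse, R2=r ** 2, KGE=kge)
```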

## **4. Results and Discussion**

The final ensemble predictions from EEMD-RNN, EEMD-LSTM, EEMD-GRU, EEMD-SVM, EEMD-ELM, and EMD-LSTM are shown in Figure 9. The evaluation metrics, including MAE, MSE, MAPE, NRMSE, R<sup>2</sup> , and KGE, are shown in Figure 10. As shown, satisfactory predictions were achieved, with R<sup>2</sup> values greater than 0.98, which demonstrate the effectiveness of the "decomposition-ensemble" learning model.

**Figure 9.** Time series plots of observed and predicted landslide displacement. The training set is shown with the white background, while the testing set is shown with the blue background.

**Figure 10.** Comparison of model performance in terms of MAE, MSE, MAPE, NRMSE, R<sup>2</sup> , and KGE.

## *4.1. Comparison of EEMD-SVM, EEMD-ELM, and EEMD-RNNs*

As shown in Figure 10, in terms of correlation (R<sup>2</sup> ), there are no significant differences among the models. For standard RNN, SVM, and ELM, the values of R<sup>2</sup> are 0.994, 0.992, and 0.993, respectively, and the values of KGE are 0.987, 0.983, and 0.974, respectively. In terms of KGE, predictions with less bias and variability were achieved by the RNN-type network than SVM and ELM because recurrent networks provide higher nonlinearity. Moreover, landslide movements are essentially suspended during the dry season. Because static models can only learn current information and can only learn from a portion of historical data, static approaches, including ELM and SVM, provide unreasonable results. The suspended movement characteristics can be approximated well using dynamic RNN approaches through connections of adjacent hidden neurons and learning from a fully historical sequence.

## *4.2. Comparison of EEMD-Based Standard RNN, LSTM and GRU*

As shown in Figure 10, LSTM has higher prediction accuracy than GRU, both of which are better than standard RNN. For the standard RNN, LSTM, and GRU models, the MAE values are 16.935, 5.357, and 9.8425, respectively, and the NRMSE values are 6.1, 11.3, and 17, respectively. The evaluation metrics in Figure 10 illustrate that the prediction accuracy of the LSTM and GRU models is better than that of the standard RNN model. The problem of gradient disappearance in standard RNN is the primary cause for this performance distinction. The LSTM and GRU approaches are more practicable for landslide displacement prediction because of the gated unit structures.

In this study, the GRU model consumes 47.66 s to train, while the LSTM model consumes 201.75 s, an increase of nearly three times the computational cost due to the more complex network structure. The comparative analysis shows that the LSTM and GRU models provide equally satisfactory performance for landslide displacement prediction, but GRU is more efficient because of its simpler network structure.

## *4.3. Comparison of EMD-LSTM and EEMD-LSTM*

As shown in Figure 10, the R<sup>2</sup> and KGE of EMD-LSTM are lower than those of the EEMD-LSTM decomposition-ensemble learning model. The lower performance statistics of the EMD-LSTM decomposition-ensemble learning model are caused mainly by the mode mixing problem in EMD.

Figure 11 compares the model performance for the periodic component in terms of R<sup>2</sup> when varying the training data size: The model performance improves with the data capacity of the training set. The outperformance of EEMD-LSTM over EEMD-based static methods, including ELM and SVM, is not remarkable when the training dataset is smaller than 80%. This can be explained as follows: Compared to traditional models, more parameters must be tuned in the LSTM-based prediction model. Therefore, more input data are required to maintain the model performance.

**Figure 11.** Comparison of model performance for periodic components in terms of R<sup>2</sup> when varying the training set size.

The case study from the Muyubao landslide shows that the hybrid EEMD-RNN decomposition-ensemble learning model is promising for accurate prediction of landslide displacement by combining the advantages of EEMD and RNN. The main advantages of the proposed EEMD-based RNN decomposition-ensemble learning model can be outlined as follows:

The MIC-based input variable selection is a systematic process that requires no a priori expert knowledge, is computationally inexpensive, and is capable of describing nonlinear relationships. The performance of the prediction model is improved by the EEMD decomposition of a complicated forecasting problem into several easier ones, and the temporal dependency in complicated monitoring data is captured by adopting an RNN approach, thereby improving the ability to model dynamic systems.

Although the EEMD-RNN decomposition-ensemble learning model has potential for the accurate prediction of landslide displacement, it has inherent limitations associated with data-driven approaches, including lack of transparency and a requirement for large quantities of training data [53,54].

## **5. Conclusions**

According to the decomposition-ensemble principle, a novel three-step decomposition-ensemble learning model integrating EEMD and RNN was proposed for landslide displacement prediction. The experimental results from the Muyubao landslide in the Three Gorges Reservoir area demonstrate that the proposed EEMD-RNN decomposition-ensemble learning model is capable of increasing prediction accuracy and outperforms traditional decomposition-ensemble learning models (including EMD-LSTM, EEMD-SVM, and EEMD-ELM) in terms of prediction accuracy. Moreover, the GRU- and LSTM-based models perform better than the standard RNN, providing equally satisfactory performance to each other in terms of prediction accuracy. Due to its simpler structure, GRU is more efficient than the standard RNN and LSTM. Therefore, in practical application, the EEMD-GRU learning model is more suitable for medium-term to long-term horizon displacement prediction of reservoir landslides in the Three Gorges Reservoir area. In addition to landslide displacement prediction, the proposed EEMD-RNN decomposition-ensemble learning model can also be extended to other difficult forecasting tasks in the geosciences with extremely complex nonlinear data characteristics.

**Author Contributions:** The work was carried out in collaboration between all the authors. X.N. and J.M. guided and supervised this research; Y.W., J.Z., H.C. and H.T. performed the field investigation; X.N. wrote the original draft; and X.N. and J.M. reviewed and edited the draft. All authors have contributed to, seen, and approved the manuscript. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Major Program of National Natural Science Foundation of China (Grant No. 42090055), Chongqing Geo-disaster Prevention and Control Center (Grant No. 20C0023), Science and Technology Projects of the Huaneng Lancang River Hydropower Co., Ltd. (Grant No. HNKJ18-H24), and the National Natural Science Foundation of China (Grant Nos. 41702328 and 42090055). All support is gratefully acknowledged.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data used in this study are available from the corresponding author upon reasonable request.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Article* **Stabilization Methodology in Foundation Soils by ERT-3D Application in Estepona, South Spain**

**Alfonso Gutiérrez-Martín 1,\* , José I. Yenes <sup>2</sup> , Marta Fernández-Hernández 3,4 and Ricardo Castedo <sup>4</sup>**


## **Featured Application: A non-invasive solution developed to preserve damaged buildings using injections of cement grout into the soil to stabilize it.**

**Abstract:** The paper proposes a novel methodology for the stabilization of shallow foundations, with a simplified model combining 3D electrical resistivity tomography (ERT-3D) and consolidation injections. To determine its usefulness, the method has been applied in a case located in Estepona (southern Spain). The chosen tomography model is the dipole–dipole configuration, with an optimized distance between electrodes of 0.80 m for a better visualization of the foundation subsoil; with this parameterization, a total of 72 electrodes were installed in the analyzed case. In this work, the depth of the anomaly in the building's supporting subsoil was detected ranging from 2.00 m to 3.90 m deep. The study also delineates areas of high resistivity variations (50–1000 Ω m) in the middle and eastern end of the field. These data have been validated and corroborated with a field campaign. The results of the ERT-3D monitoring are presented, once the inversion data have been processed with the RES3DINV software, from the beginning to the end of the stabilization intervention. The novelty lies in the interaction between the tomography and the foundation consolidation injections, until the final stabilization. This is a very useful methodology in cases of emergency consolidation, where there is a need to minimize damage to the building. Thus, people using this combined system will be able to practically solve the initial anomalies of the subsoil that caused the damage, in a non-invasive way, considerably lowering the value of the resistivities.

**Keywords:** electric tomography; three-dimensional; electrodes; seat control; foundations; stabilization methodology

## **1. Introduction**

The success of architectural structures, which are built directly on the earth's surface, depends, among other factors, on the support offered by the foundation materials bearing the structures' loads [1–6]. In turn, the ability of a building's foundation to offer the necessary support for architectural structures depends on the bearing capacity of the soils underneath, and, if the upper layer has heterogeneous physical properties, it could cause spatial variability in the foundation material's strength. Spatial variability of the soil's bearing capacity puts stress on poorly supported architectural structures. Associated structural failure could occur as total, partial, or differential settlement, or even total collapse of the structure, with differential settlement in the foundations being one of the worst problems [7–10].

The global prevalence of failure and collapse, with the associated loss of life and property, has made it necessary to ensure that buildings are properly constructed [11]. It is common to find many stress-induced cracks and other defect-related issues due to the various structures and foundations of historic, public, and private buildings, and some of these problems are disasters in the making [12–14]. This necessitates proposing a non-invasive intervention methodology for the subsoil of these buildings to avoid collapse.

One of the most used non-invasive techniques for subsoil exploration is geophysics and, more specifically, electrical resistivity tomography, in both two and three dimensions [15–21], including metrological perspectives of tomography in civil engineering [22]. In recent years, this method has proven to be an efficient tool, not only to monitor wall degeneration [23], but also to detect other types of problems in building foundations, such as subsoil degradation [14,24,25]. Spatial variability in the load-bearing rock/soil beneath a building puts stress on poorly supported structures, causing failure [10,26]. Therefore, a geophysical investigation offers a faster, non-invasive means of obtaining detailed, credible information about the subsurface under a building. Electric tomography can also image the ground's distribution and structural deformation, both of which offer credible information regarding the strength the rock/soil is likely to offer a building [24,27,28]. Electrical resistivity imaging has, for decades, been very effective in illuminating the subsurface, and apt at providing information about the soil's physical properties for economic, environmental, and geological engineering. As mentioned, this technique is a useful and non-invasive method to diagnose subsoil problems in shallow foundations of buildings [3,29–32], which represent 25% of the claims reported in Spain, according to recent data provided by Aseguradora Mutua de Arquitectos Superiores, a public limited company (ASEMAS S.A.). However, the three-dimensional geophysical research approach, ERT-3D [33–35], is better able than the two-dimensional approach, ERT-2D, to characterize the subsurface and determine heterogeneity in measured rock properties along the vertical (Z) and orthogonal horizontal (X, Y) axes [36,37]. Thus, determining the variation in soil properties along the three orthogonal directions allows evaluation of the spatial variation in the strength of the foundations, imposed by the heterogeneous properties of the soil [38].

Electrical resistivity tomography is an appropriate procedure for detecting and controlling underground consolidation and stabilization, specifically the ERT-3D technique [39–42]. As mentioned above, the method consists of setting out parallel lines of observation covering the study area to obtain underground data; however, the key point lies in processing the data. Fortunately, important advances have been made in 3D resistive imaging and its inversion processing through the powerful resistivity inversion software RES3DINV [43]. This is possibly the most widespread, accurate, simple, and affordable software for data inversion in electrical tomography, and, thus, it has been used in this work. Several factors have been reported to impact the variation of electrical resistivity in the subsurface. These include variation in rock type, rock fabrics, rock deformation, water saturation, different degrees of weathering, etc. [44,45]. These factors, which are known to impact spatial variation in ground electrical resistivity, are also capable of impacting variation in other physical properties of rocks [46,47]. With different electrode distances, ERT-3D offers the possibility of locating gaps/cavities and possible holes under the foundation of a building, which can cause severe deformation and settlement. Both consolidating and stabilizing soil through injections have been proven effective [48–51] and are used to mitigate or even solve differential settling. The injection material generally depends on the lithological morphology of the soil in question. Synthetic resin is frequently used as an injection fluid [52], but this paper proposes injections of cement grout as a more economical and versatile solution.

For a soil consolidation project to be successful, electrical resistivity data must be available in advance to provide initial information on the subsurface structure. In our methodology for consolidating and stabilizing buildings with shallow foundations, we use electrical tomography as a tool to detect empty cavities, which are typical examples of anthropic (resistive) fillings with low bearing capacity in subsoils [3]. Thus, this work proposes ERT-3D to monitor the subsoil in real time, from the beginning to the final consolidation, through the different injection passes into the subsoil until its stabilization. As an example of an application of our combined methodology, we have chosen a historic building with serious stability problems in its foundations [5], located in Estepona, on the Spanish south coast. The subsoil was consolidated with injections of cement grout applying the electrical methodology, resulting in the preservation of the building and avoiding its possible collapse.

## **2. The Model Development**

The ERT method discharges an electric current into the ground and measures the potential difference at two determined points on the surface. The suitability of this method lies in the fact that irregularities in the subsoil beneath a building can be identified as contrasts or anomalies in the subsoil's electrical properties.

This method is based on Ohm's law:

$$\rho\_{a} = k\,(\Delta V / I) \tag{1}$$

where: *ρ<sub>a</sub>* = apparent resistivity [36]; *k* = geometric constant that depends only on the reciprocal positions of the current and potential electrodes; *∆V* = potential difference; *I* = intensity of the injected current.
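As a concrete reading of Equation (1), the sketch below computes the geometric factor *k* for a dipole–dipole array (the configuration used later in this work) and converts a measured ∆V/I into an apparent resistivity. The voltage and current values are illustrative assumptions, not field data.

```python
import math

def dipole_dipole_k(a: float, n: int) -> float:
    """Geometric factor for a dipole-dipole array with electrode
    spacing a (m) and separation factor n: k = pi * n * (n+1) * (n+2) * a."""
    return math.pi * n * (n + 1) * (n + 2) * a

def apparent_resistivity(delta_v: float, current: float, k: float) -> float:
    """Equation (1): rho_a = k * (dV / I), in ohm-m."""
    return k * (delta_v / current)

# Illustrative reading with the 0.80 m electrode spacing used in this study.
k = dipole_dipole_k(a=0.80, n=1)
rho_a = apparent_resistivity(delta_v=0.12, current=0.05, k=k)
print(f"k = {k:.2f} m, apparent resistivity = {rho_a:.1f} ohm-m")
```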

The apparent resistivity values depend on the true resistivity distribution in the tested area. The true resistivity distribution can be estimated through an inversion procedure, based on minimizing a suitable objective function [15,39,44]. The solution to this problem is not unique: for the same data set, a wide range of models can produce the same apparent resistivity values. A preliminary lithological analysis of the subsoil's nature is usually carried out to reduce the range of possible models, which can then be incorporated into the inversion subroutine.

The solution method minimizes the difference between the measured apparent resistivities and those calculated by the RES3DINV software, which uses a smoothness-constrained inversion formulation that restricts how abruptly the model's resistivity values may change [15,36,43,53]. This study used RES3DINVx64, which implements a smoothing routine based on least squares and is practically the only commercial software of its type available [36,54,55]. The inversion routine used by the program is based on the smoothness-constrained least-squares method [56], whose basic form is the following equation.

$$\left(\mathbf{J}^{\mathrm{T}}\mathbf{J} + \lambda \mathbf{F}\right) \Delta\mathbf{q}\_{k} = \mathbf{J}^{\mathrm{T}}\mathbf{g} - \lambda \mathbf{F}\mathbf{q}\_{k-1} \tag{2}$$

where:

$$\mathbf{F} = \alpha\_{x} \mathbf{C}\_{x}^{\mathrm{T}} \mathbf{C}\_{x} + \alpha\_{y} \mathbf{C}\_{y}^{\mathrm{T}} \mathbf{C}\_{y} + \alpha\_{z} \mathbf{C}\_{z}^{\mathrm{T}} \mathbf{C}\_{z} \tag{3}$$

**J** = Jacobian matrix of partial derivatives; **J**<sup>T</sup> = transpose of **J**; *λ* = damping factor; ∆**q**<sub>k</sub> = model perturbation vector at iteration *k*; **g** = data misfit vector; *α<sub>x</sub>*, *α<sub>y</sub>*, *α<sub>z</sub>* = weights for the roughness filters; **C**<sub>x</sub>, **C**<sub>y</sub> = horizontal roughness filters; **C**<sub>z</sub> = vertical roughness filter.
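To make the update rule concrete, the following minimal sketch performs one iteration of Equation (2) on a synthetic 1D problem. The Jacobian and misfit vector are random stand-ins, and the roughness filter is a simple first-difference operator, so this illustrates the algebra only, not the RES3DINV implementation.

```python
import numpy as np

def first_difference(n: int) -> np.ndarray:
    """First-difference matrix, a minimal 1D roughness filter C."""
    c = np.zeros((n - 1, n))
    for i in range(n - 1):
        c[i, i], c[i, i + 1] = -1.0, 1.0
    return c

def gauss_newton_step(J, g, q_prev, F, lam):
    """Solve (J^T J + lam*F) dq = J^T g - lam*F q_prev, Equation (2)."""
    return np.linalg.solve(J.T @ J + lam * F, J.T @ g - lam * F @ q_prev)

rng = np.random.default_rng(0)
n_data, n_model = 30, 20
J = rng.normal(size=(n_data, n_model))  # synthetic Jacobian (sensitivities)
g = rng.normal(size=n_data)             # synthetic data misfit vector
C = first_difference(n_model)
F = C.T @ C                             # Equation (3) with unit alpha weights
dq = gauss_newton_step(J, g, np.zeros(n_model), F, lam=0.1)
print("model update norm:", np.linalg.norm(dq))
```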

This method is advantageous because of its versatility, since the damping factor and roughness filters adjust to the different types of data. The program uses the Gauss–Newton method, which recalculates the Jacobian matrix after each iteration [57]. The interpretations of the electrical tomography profiles are made using RES3DINVx64 [34,36] for resistivity and induced polarization. As mentioned above, this calculation software is based on the least-squares method with forced smoothing, modified with the quasi-Newton optimization technique. The inversion method builds a subsoil model from rectangular prisms and determines the resistivity values for each of them, minimizing the difference between the observed and calculated apparent resistivity values [15,58]. The results of our model show an uncertainty of acceptable magnitude, even though the analyzed building and its foundations are in a critical state with respect to their structural stability, load capacity and, under these conditions, the safety of the building's habitability, with the increasing risks of instability that this implies [5].

## *2.1. Application*

Results reported in the literature over the last eleven years concern the application of the ERT-3D method to study geotechnical anomalies in subsoils on slopes after landslides in different geographical contexts. Such reports support considering the ERT method a very suitable tool and methodology for investigating these geotechnical anomalies of the subsoil during the pre-event and post-event phases of a disaster cycle [30,31,59–61], or simply a poor foundation-support subsoil (anthropic fill).

In fact, during the pre-event phase, the resistivity contrasts that characterize ERT-3D allow the geological environment of the subsoil to be defined and areas of high water content, which could be responsible for reactivation events, to be identified. In the post-event phase, ERT-3D enables us to reconstruct the damaged or altered subsoil body, also providing information on the volume of the removed or altered material. This information can help to better plan future mitigation activities. Our application not only detects these subsoil alterations in real time; we also propose an original tool for subsequent mitigation where existing buildings or infrastructure are affected, acting in a non-invasive way and thus preventing their collapse.

One of the biggest drawbacks of 3D tomography for investigating shallow movements and disturbances in the subsurface was that it did not provide continuous acquisitions over time, which made it unsuitable for studying their dynamic nature.

Fortunately, the development of systems for the continuous acquisition of electrical resistivity over time, and of software for data inversion [61], is paving the way to test this method during the emergency phase, as in the application and methodology developed in the present work, where the geoelectric inversion data are applied in real time while the affected subsurface area is consolidated, increasing its resistivity in the emergency case and thus making it possible to recover existing constructions on the altered subsoil.

The possibility of using ERT-3D to monitor geotechnical changes and alterations in the first layers of a ground removal-settlement area adds important information during the emergency phase. The preliminary results obtained when applying our methodology with ERT-3D for this purpose are very encouraging. The proof is that we were able to recover the building founded on the geotechnically altered substrate, where the cement grout, the filler material proposed in our methodology for consolidating the subsoil, satisfactorily filled the gaps that water and air occupied in the altered substrate. Cement grout has advantages over other fillers, such as its low cost, ease of use on site, ease of dosing, and versatility, among others.

In applying our methodology in these emergency cases, where surface soil removal after a shallow movement of the subsurface affects buildings and infrastructure [62,63], we observed that a low-resistivity zone lies in the upper area, from the surface to a depth of up to 4.00 m, below which the ground shows a higher resistivity of greater than 400 Ω m. According to the data obtained in this work, the upper area of the soil unit consists mostly of silty soil with granules from the erosion of the bedrock. Therefore, we attribute the anomaly in the upper area to the silty soil and granules with low resistivity. This zone of low resistivity coincides with the zone of removal and presence of water, which we consolidated with our cement grout filling methodology, monitoring in real time to verify the satisfactory level of the filling.

The procedure followed by the authors clearly delineates the altered surface material, with cavities and gaps, from the rocky matrix (phyllite), marked by lower resistivities in the bedrock (50–150 Ω m). This methodology characterizes the subsoil altered by shallow movements, mainly composed of clay (colluvial) material with high pore content and high resistivities. The modelling of the subsoil analyzed with our system was monitored with ERT throughout the grouting process using the RES3DINV software [63]. Another novelty of our tool is the geological characterization of this type of shallow phenomenon and of how it affects existing buildings on slopes, which recurs in the south of Spain, in the Cordillera Bética [62–64]. Having a tool such as the one proposed here, to characterize and mitigate damage to existing buildings and infrastructure non-invasively in this type of geological formation, is an advantage.

The results after the application of the methodology in the case study showed the effectiveness of the diagnostic and intervention methodology for mitigating the serious damage suffered by the building, preventing the collapse and destruction of the building while preserving the safety of its inhabitants.

## *2.2. Phases: Developed Methodology*

A new methodology has been developed, consisting of different phases, and applied to the case study (see the block-flow diagram in Figure 5).


## **3. Case Study**

This research shows the results of applying the proposed methodology to a specific case of differential settlements in a building in an at-risk area after soil removal [5]—see Figure 1. The land movements occurred after heavy rains [64] in the 2009–2010 hydrological years, but the measurements presented in this paper were taken in 2012.

The applied methodology is based on ERT-3D and consists of placing electrodes along profiles separated from each other according to the resolution, depth, and objectives to be covered: the smaller the separation, the greater the resolution; the greater the separation, the greater the depth. In our methodology, an optimal distance of 0.80 m between electrodes was determined, to balance tomographic profile resolution and depth in accordance with the problem in question.

Prior to performing the tomographic profiles, it is recommended that preliminary geological research be carried out in the area where the buildings are located [64]. In this case, the affected buildings were located at Paraje del Arroyo, La Cala, Estepona, Malaga (Spain), which is in the south of the Baetic Mountain Range (South Iberian Peninsula). The damage occurred at coordinates 36.461094, −5.160498 (Figure 1). In the first phase, it is proposed that a granulometric analysis and a Standard Penetration Test (SPT) be executed in the geotechnical surveys. In the present case, soil consistency increased with depth (Table 1). Change in the geotechnical response of the soil occurred at an approximate depth between one and four meters, according to the SPT hits along the analyzed sample (Table 1).

**Figure 1.** Aerial picture. Red circle: building location. April 2012. Source: geographical application Google Maps/https://www.google.com/maps/place (accessed on 25 April 2012), coordinates 36.461094, −5.160498.

**Table 1.** Summary of the consistency and admissible stress of the soil according to the rotary probes. Measurements were taken at the height of the main facade of the most affected building.

| Depth (m) | SPT Hits | Consistency | Admissible Stress |
|-----------|----------|-------------|-------------------|
| **1.00–2.00** | **3** | **Soft** | **20** |
| **2.00–3.00** | **6** | **Slightly hard** | **40** |
| **3.00–4.00** | **7** | **Slightly hard** | **50** |
| **4.00–4.40** | **10** | **Slightly hard** | **70** |
| 4.40–6.00 | 19 | Moderately hard | 120 |
| 6.00–7.00 | 52 | Hard | 310 |
| 7.00–8.00 | 63 | Hard | 350 |
| 8.00–8.60 | 84 | Hard | 440 |

Note that the numbers in bold correspond to values that are too low. The discontinuity after the 4.00–4.40 m row (a grey-to-white background change in the original table) marks an important change of capacity and bearing resistance in the subsoil, coincident with level I of Table 2.

Based on geological and geotechnical studies, the materials extracted in the area depict the lithological levels listed in Table 2. The analyzed soils were mainly rocky and clayey, of varying thickness, which could be verified in field work and through laboratory geotechnical tests.

**Table 2.** Lithology from the affected area via subsoil removal. Data are from lab tests. The depth of soil affected by lack of cohesion and similarity corresponds with level I.

| Levels | Lithology | Depth (m) |
|--------|-----------|-----------|
| I | Colluvial clay material | 0.00–4.40 |
| II | Modified phyllites | 4.40–6.00 |
| III | Phyllites | 6.00–25.00 |

## *3.1. Damaged Building Analysis*

The most common deformations in a building are related to foundation differential movements. These alterations result in structural deformations in the building and angular distortions in its foundation [3,65]. This provokes stress throughout the construction, and when this stress limit is exceeded, cracking or breakages occur.

The Spanish Building Technical Code (2006) defines the maximum angular distortion values for a building's ultimate limit of state and service, conditioning this value to the type of structure. In the service limit state, the allowable angular distortion value is L/500 for reticulated structures, and L/300 for isostatic structures and for load-bearing walls (the studied case). For L, the length in a straight line between the axes of the footings is analyzed. However, the structure's ability to assume these deformations will depend, among other factors, on the stiffness of each element comprising it, so any fixed distortion value could be conservative or give rise to non-tolerable deformations [66–69].

According to current urban regulations, the rural house examined in this study would be difficult to demolish and rebuild. Therefore, a major challenge to this research and methodology was finding a solution to recover the building without having to demolish it. The main building in this study was a rectangular two-story building, 17.15 m long and 12.05 m wide, with a surface area of approximately 206.66 m<sup>2</sup> per floor and a total building area of 413.32 m<sup>2</sup> (Figure 2).

**Figure 2.** (**a**) Structural scheme of the damaged building, formed by three load-bearing walls, A, B, and C, and a floor slab of semi-resistant joists, 24 cm thick. Building foundation scheme formed by three parallel, longitudinal shallow foundations (A, B, C), where the load-bearing walls rest and transmit the structure's weight. (**b**) Floor plan of the distribution of the damaged building on the slope.

This was a residential building, used as a rural house (country holiday house); hence, guest and owner safety was paramount. The distribution comprised two building components: two floors, with an open area on the ground floor, and an upper, noble floor (Figure 2). Both the main building and the entrance foyer were designed in the late 1980s and completed in the early 2000s. The building's structure consisted of load-bearing walls and solid brick pilasters on the façade's main porch (vertical structure); these rested on a shallow foundation base of reinforced concrete simple wall footings, formed by three parallel simple wall footings (A, B, C) (Figure 2). The horizontal structures of the building were 24 cm thick unidirectional slabs formed of semi-resistant joists, supported in three spans by load-bearing walls and pilasters resting on a shallow foundation composed of three longitudinal trench foundations.

From the previous geotechnical and structural analysis of the building, illustrated in Figures 2 and 3, a problem of support in the foundation due to differential settlements was deduced.

Of the A, B, and C foundations, B and C had become unstable (sunk downward) with respect to A (Figure 4). This phenomenon caused the building to swing forward, which caused serious damage to its structure, as can be seen in Figure 4. The foundation of wall A also suffered a small movement in its seat (S1) (Figure 4).


**Figure 3.** The building's state of conservation in April 2012, after the movement of the subsoil and its shallow foundation. (C—see Figure 2 for reference) Structure of load pilasters and semi-circular arches, a structure that has suffered a decrease in its foundation with respect to the foundations of the load-bearing walls (B) and (A) in Figure 2. This has caused the building to deform and rotate forward.

**Figure 4.** Cross-section scheme, presenting the likely behavior of the building's foundation and structure. There was a differential settlement of the S1 and S2 foundations due to a possible settlement-removal of the soil and the lack of bearing capacity of the resistant subsoil. Displacement occurred through the variable Φ = 95 mm, which is the depth B and C have descended with respect to A. The seat of the foundation for wall A (S1) has lowered a depth of 25 mm, remaining within the admissible limit according to the Spanish Building Technical Code (CTE).

Due to the type of movement and structural typology, displacement occurred as shown in Figure 4. When load-bearing walls deform, they collapse, and destruction of the building becomes a major threat.

To determine the building deformation as a function of the angular distortion, it was calculated as the differential settlement, defined as the settlement difference between foundations A and C, which were the extreme foundations affected by the settlements, applying the following equation, knowing that, in this case, B and C descended the same height with respect to A:

$$\delta\_{AC} = S\_{A} - S\_{C} \tag{4}$$

where: *δ<sub>AC</sub>* = foundation vertical differential displacement between points A and C; *S<sub>A</sub>* = differential settlement suffered at point A; *S<sub>C</sub>* = differential settlement suffered at point C.

The equation of angular distortion is also applied [3,70–72]:

$$\beta = \frac{\delta\_{AC}}{L\_{AC}} \tag{5}$$

where: *β* = angular distortion; *L<sub>AC</sub>* = distance between foundation points A and C.

The Spanish Building Technical Code (2006), also known as CTE, establishes limitations on *β* movements, according to building structure typology. Studies regarding *β* began in the 1940s [73] and continue today [74–76]. The CTE limitation for the *β* value, for building damages, must be >L/300 for load-bearing walls. In this case, the differential settlement between foundations A and C was *δAC* = 95 mm and *β* = 0.0086, which are above the admissible limit established by the CTE, indicating imminent building collapse.
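The check above can be reproduced numerically. Since *L<sub>AC</sub>* is not reported explicitly, the sketch below back-calculates it (≈11.0 m) from the published *δ<sub>AC</sub>* = 95 mm and *β* = 0.0086; the 1/300 limit is the dimensionless form of the CTE value quoted for load-bearing walls.

```python
def angular_distortion(delta_mm: float, span_m: float) -> float:
    """Equation (5): beta = delta_AC / L_AC (dimensionless)."""
    return (delta_mm / 1000.0) / span_m

L_AC = 95.0 / 1000.0 / 0.0086   # back-calculated span, approx. 11.0 m
beta = angular_distortion(delta_mm=95.0, span_m=L_AC)
limit = 1.0 / 300.0             # CTE service limit for load-bearing walls
print(f"beta = {beta:.4f}, limit = {limit:.4f}, exceeded: {beta > limit}")
```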

Authors also link angular distortion *β* values with damages suffered in enclosures (shell and internal partitions) and structural elements. The present case exceeded the limit values established by these authors, where the dimensions of cracks in enclosures must be >25 mm to be considered a serious structural risk, according to a programmed damage classification [77]. As shown in Figure 4, this building had serious damage, due to cracks in enclosures and structural elements (load-bearing walls), which were measured at greater than 50 mm. Therefore, according to this classification, the building was close to collapse. Due to the type of structure, foundation, and damages suffered, the methodology proposed in this research seems appropriate for the case. It would allow detection of zones of soil weakness where the differential settlement has been produced and consolidation of the subsoil from outside the construction, thus solving the building's issues non-invasively.

## *3.2. Analysis and Stabilisation Methodology*

According to the data—regulations, the state of the building, the undeveloped land surrounding the building, and the necessity of keeping the building stable (i.e., avoiding imminent collapse)—a simplified model was proposed, in which ERT-3D would be used, with subsequent processing of the inversion data via the RES3DINV software [43], as seen in Figure 5. This analysis considers the geotechnical data given in the previous sections.

The developed methodology allowed the researchers to see the reality of the soil, as well as any possible cavities in the subsoil supporting the building's foundation. These cavities were filled with grout to consolidate the soil, and then, via contrast campaigns (ERT-3D), the researchers checked to determine if the filling worked, if it moved, etc. Thus, the proposed methodology provided a complete, three-dimensional analysis of the subsoil supporting the damaged foundation, where the differential settlements occurred. Later, along with this original geophysical technique, the researchers monitored, through successive contrast campaigns, the consolidation and underpinning of the subsoil with different passes of consolidation injections. The injection material depends on the lithological morphology of the soil to be consolidated. Due to the type of subsoil in this case, as well as the material's low cost and ease of use and control, this paper proposed injections of controlled cement grout [78,79]. Synthetic resin is frequently used as an injection fluid, though it does not allow for high solicitation [52]. Three beneficial effects of injecting cement grout into the subsoil have been observed: (1) filling existing gaps, (2) soil compaction, and (3) interstitial water reduction and/or elimination.
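The monitor-inject-verify cycle described in this paragraph can be summarized as a control loop. Below is a minimal, hypothetical emulation of that loop; the resistivity values and the convergence behavior are synthetic, and the 55 Ω m target is taken from the interpretation bands reported later, so this is a sketch of the workflow rather than the authors' field procedure.

```python
import random

random.seed(1)
# Synthetic subsoil model: one inverted resistivity (ohm-m) per cell.
subsoil = [random.uniform(10, 3000) for _ in range(50)]
TARGET = 55.0  # consolidation goal, ohm-m (assumed from Section 3.3)

for ert_pass in range(1, 6):
    # "Contrast campaign": re-survey and flag anomalous cells.
    anomalies = [i for i, rho in enumerate(subsoil) if rho > TARGET]
    print(f"ERT-3D contrast {ert_pass}: {len(anomalies)} anomalous cells")
    if not anomalies:
        print("subsoil stabilized; intervention complete")
        break
    # Consolidation injection pass: grout fills gaps, resistivity drops.
    for i in anomalies:
        subsoil[i] = random.uniform(10, TARGET)
```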

**Figure 5.** Block–flow diagram, showing the phases and methodology developed for intervening with the subsoil under the foundations of buildings with differential settlements.

## *3.3. Approach Methodology: Application*

Had the proposed methodology not been applied in this case, the building could have been damaged to a critical point, making its repairs too expensive to pursue. Other, more invasive proceedings, such as underpinning through micropiles [5,80–82], were not feasible due to the state of the building and the structural typology of the load-bearing walls. However, making consolidation injections into the subsoil allowed recovery, to a certain degree, of the differential settlement suffered and rendered the building useable again.

The first necessary action involved using electrical tomography to analyze the subsoil beneath the affected foundations. In an initial campaign, this study carried out various tests with spacings of 0.80, 1.00, and 1.50 m to identify the best resolution for checking the subsoil's state. A spacing of 0.80 m was chosen, based on other recent research [11].

The Syscal Switch 48 (V114++) from IRIS Instruments was used as the electrical tomography equipment. It is a multi-electrode unit with an integrated computer capable of managing up to 900 electrodes, with a resolution/accuracy of 1 µV/0.2% [43]. The equipment power source is 250 W and 2.5 A, which generates 880 Vp-p pulses, and the manufacturer has incorporated a transmitter and receiver into the system. Among the equipment's features are a time injection adjustment, an automatic apparent resistivity and chargeability processor, a 3D real-time resistivity control, a voltage and current injection curve control, an integrated PC, and a commutation processor. The locations of the two parallel tomography profiles (E1-E2) were recorded via coordinates given by a Garmin Etrex Ventura GPS [14], with a WGS84 coordinate system. The results are listed in Table 3.

In the developed methodology, two parallel profiles (E1 and E2) were selected, covering an area of approximately 240 m<sup>2</sup>, which included the main affected building area in accordance with Figure 6 and Table 3. The electrodes were placed in parallel.

RES3DINV—3D data inversion software for electrical imaging and induced polarization (IP) [83]—was used to automatically invert the acquired apparent resistivity data and produce a 3D resistivity model. The two parallel profiles were executed using a dipole–dipole array, chosen due to its sensitivity to lateral variations of resistivity [41,84]; thus, structures such as gaps and cavities that were damaging the building's subsoil could be detected. This dipole–dipole array provided a sharper horizontal resolution of the whole, which is an important advantage in this type of building pathology. As discussed previously, an inter-electrode separation of 0.80 m was estimated as suitable for this methodology. The two profiles extended along 28 m, with 72 electrodes distributed according to the coordinates listed in Table 3 and Figure 6.

**Table 3.** UTM coordinates of the tomography profiles; location of the profiles according to the distribution of Figure 6.
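As a quick consistency check of the survey geometry, a sketch assuming the 72 electrodes are split evenly between profiles E1 and E2 (36 each) recovers the stated 28 m profile length from the 0.80 m spacing:

```python
SPACING_M = 0.80        # inter-electrode separation chosen in this study
TOTAL_ELECTRODES = 72
N_PROFILES = 2          # parallel profiles E1 and E2

per_profile = TOTAL_ELECTRODES // N_PROFILES   # 36 electrodes
length_m = (per_profile - 1) * SPACING_M       # (36 - 1) * 0.80 = 28.0 m
print(f"{per_profile} electrodes per profile, {length_m:.1f} m per profile")
```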

**Figure 6.** (**a**) Longitudinal scheme of the methodology and installation of the electrodes at 0.80 m in the building (house), as well as the subsoil analyzed with ERT-3D. The analyzed subsoil has a depth of 5.80 m, which corresponds to levels I and II of the geotechnical layers detected. (**b**) ERT-3D (TE1-3D) delimited tomography surface, covering the main building's foundation structure, within the parallel profiles proposed: E1 and E2. Circles refer to the E1 profile, while squares refer to E2; orange refers to the initial UTM coordinates, while green refers to the final ones (Table 3).

Figure 7 shows a 3D electrical resistivity model from the studied terrain and the depths reached, in this case, up to 5.80 m depth. Both profiles show resistivity values represented in a range of colors for better and easier observation of the variations in subsoil vertical and horizontal resistivities. Zones with resistivity values between 0–55 Ω m corresponded to levels of clay or colluvial material, represented in green; zones between 55–1000 Ω m, shown in orange, corresponded to fillings or colluvium removed with low compactness; and resistivity > 1000 Ω m, in red, indicated anomalies due to the presence of interstitial gaps.

**Figure 7.** Advanced inversion algorithm of 3D synthetic data (RES3DINV software). Electric ground section ERT-3D: cross-section diagram of the ground, with the probable behavior of the soil and foundation.

Figure 7 shows resistivity anomalies presented in slices of subsoil supporting the foundation of the damaged building. The red color shows a higher resistivity value, up to 3000 Ω m, which suggests that there are cavities causing the resistivity to rise. From the obtained resistivities, combined with the geotechnical data (SPT standards) [85,86], the physicochemical properties of the geological materials can be determined.
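The color bands used to read Figure 7 map directly to a small classifier; the thresholds below are the ones given in the text (0–55, 55–1000, and >1000 Ω m), and the function is only an illustrative restatement of that legend.

```python
def classify(rho_ohm_m: float) -> str:
    """Interpretation bands described for Figure 7 (thresholds in ohm-m)."""
    if rho_ohm_m <= 55:
        return "clay or colluvial material (green)"
    if rho_ohm_m <= 1000:
        return "fill / removed colluvium, low compactness (orange)"
    return "anomaly: interstitial gaps or cavities (red)"

for rho in (30, 400, 3000):
    print(f"{rho:>5} ohm-m -> {classify(rho)}")
```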

The survey depth was established as 5.80 m, considering the previous geotechnical results of Table 1. From that depth, level III began to appear, determined by a rocky matrix of phyllite. The main anomalies were located under the main load-bearing walls, specifically A and B.

The structure of the building had been seriously damaged by the settlement of its foundation and the rotation of the building, specifically in the main facade between the pilasters and the main bearing wall (wall B). There were also apparent interstitial anomalies in the subsoil, up to 3.90 m deep in electrical profile E2 (wall B) and 2.00 m deep in electrical profile E1 (wall A). The anomalies and settlement in the foundation of wall B (S2) were of greater importance than the settlement at the base of wall A (S1), with a characteristic angular distortion of *β* = 0.0086, hence the forward inclination of the building. In summary, according to the algorithm developed in Figure 6, the first phase of the tomographical campaign was carried out by analyzing the resistivities of the subsoil supporting the building and considering previous data, including the initial geotechnical data. Anomalies and areas with low compaction and/or gaps were identified and isolated in ERT-3D for the building's main load-bearing walls (Figure 8), so this knowledge might be acted on in the future.

Given this powerful information about the subsoil under the building's main load-bearing walls, the second phase of this methodology was proposed: consolidation injections into the subsoil at the detected anomalies (flow diagram, Figure 5). Figure 9 isolates the areas to be treated with injections via the RES3DINV inversion program.

The damages observed in the analyzed building and the initial geotechnical tests confirmed the diagnosis from the ERT-3D tomography methodology. The initial ERT-3D results—that is, the application of the first phase of the flow chart (Figure 5)—showed high coincidence in the geological composition of the supporting subsoil beneath the damaged building: altered clay edges and colluvial sand from the weathering of the rock matrix (phyllite).

This innovative intervention technique allowed the researchers to evaluate the induced effects in the field and, in light of them, carry out possible modifications in the distribution of injection points to consolidate the subsoil. In the procedure's second phase, after the injections in the anomalies were made, the ERT-3D contrast was repeated to check the consolidation results.

**Figure 8.** Volume plot rendered for the subsoil model of the building analyzed, with topography included; results of 3D inversion of electrical resistivity tomography over the whole survey area. On the left, view from the south, and, on the right, view from the north.

**Figure 9.** Example of a 3D survey using the offset dipole–dipole: (**a**) resistivity contour plot by anomalies, scheme obtained through ERT-3D; (**b**) resistivity contour plot by anomalies, isolation of areas to be treated with soil consolidation grout injections using the RES3DINV software.

The second phase of the proposed procedure (Figure 5) intended to obtain uniformity of chemical and physical features in the stabilized subsoil [65,70,86] until the desired goal was reached. As shown in Figure 10, the contrast tomography gave a favorable result after the cement grout injections were applied. The resistivities of up to 3000 Ω m that caused distortions (cavities), affecting the stability of the building in Figure 8, were generally reduced drastically, even to values below 55 Ω m. This means the anomalies and stability issues of the building were resolved. In this way, the affected building could be recovered, and demolition avoided.


**Figure 10.** Results of 3D inversion of electrical resistivity tomography over the whole survey area. On the left, view from the south, and, on the right, view from the north. Contrast tomography after the consolidation injections in the subsoil.


The proposed intervention methodology is, therefore, highly effective for buildings damaged by settlements in shallow foundations due to a deficient stress capacity of the subsoil, in this case a removed anthropic fill. In this case study, a building on the verge of collapse was recovered. Figure 10, showing the tomography performed after the consolidation injections, indicates that the resistivity anomaly areas of up to 3000 Ω m have disappeared, leaving the building's subsoil consolidated and stabilized.

Four beneficial effects of our combined interactive methodology (tomography–injection) have been observed on subsoil where a building foundation rests:


## **4. Results and Discussion**

The interpretation and adaptation of this paper's proposed methodology were satisfactory and successful. The building, on the verge of collapse, recovered its stability and was ready for use again. This procedure is promising for buildings that have reached their ultimate limit state but must be maintained, either because they could not be rebuilt due to urban regulations (as in this case), or because they are listed as historic and cultural heritage buildings to be preserved at all costs.

The advantage of this proposed method is that ERT-3D allows researchers to determine whether the foundation settlement and subsidence of the subsoil have been permanently eliminated after the consolidation injections. Consolidating foundations through grout injections is inherently problematic because researchers do not know how much injection grout or how many injections are necessary, and they are unable to control or physically see the result, as it is underground. Therefore, an intervention methodology such as the one presented here, uniting ERT-3D monitoring with the consolidation injection process, is essential for checking the results. The advantage of this ERT-3D methodology combined with consolidation grout injection, compared to other underpinning systems (e.g., micropiles), is that it is a non-invasive method for an already damaged building. This makes the system useful in extreme cases, like the one analyzed here.

In the application of our methodology, two electrical profiles were made, E1 and E2 (Figure 6). An upper part of approximately 3.90 m thick was defined for E2 (wall B) and 2.00 m for E1 (wall A), with high resistivity ranging from 500 up to 3000 Ω m. This section corresponded to the colluvial material (porous) that covers the metamorphic rock (phyllites). Under this lithology was the rocky matrix, formed by altered phyllites with lower resistivity, of the order of 55 Ω m. The upper colluvial material, which was in contact with the underside of the building's footings, showed a significant degree of alteration, which means a high resistivity (from 1000 up to 3000 Ω m). However, the resistivity values were quite variable within this mass of colluvial material, and this is linked to two causes: (1) lack of compaction and (2) the presence of gaps and cavities in the subsoil on which the building's foundation was supported, which coincided with the altered levels I and II. These data confirm the instability of the shallow foundation of the building. The building's final state, after the application of this methodology, is shown in Figure 11.


**Figure 11.** Current state of the main building. Image taken after the intervention and application of the methodology. Had this methodology not been used, the only solution would have been demolition of the building.

#### **5. Conclusions and Follow-Up**

The ERT-3D technique proposed in this research proved to be versatile, fast, and cost-effective for detecting subsurface anomalies. In the present case study, the researchers were able to analyze the subsurface to a depth of 5.80 m, deep below the severely damaged building. Nevertheless, anomalies only existed at a depth of 3.90 m, below the foundation of load-bearing wall B. When the subsoil has holes, those spaces are filled with air or with water which, when evaporated, leaves dielectric voids. This means that the terrain presents a strong gradient anomaly and very high resistivity values, which causes serious damage to the foundation and structure of the building. Our methodology solves this problem by reducing the resistivity of these initial anomalies (cavities) by means of combined grout injection in two passes, with the control of resistivities through 3D tomography, returning stability to the foundation of the building in a non-invasive way. Applied to this case study, the methodology was totally effective: the initial resistivities of up to 3000 Ω m were reduced to 55 Ω m.

In the present case study, to implement the methodology, the subsoil cavities and weaknesses were first identified and located beneath the building's foundations. Secondly, a consolidation grout injection campaign was carried out, enhancing the subsoil conditions to an optimum result. The consolidation fluid injections eliminated the high initial subsoil resistivities (up to 3000 Ω m) and moved those resistivities to values between 55 and 1000 Ω m, filling in those cavities. Through this process, ERT-3D showed itself to be a fundamental, minimally invasive tool for research. It was useful both during the project phase and in the follow-up, underpinning work when the consolidation injections were placed under the foundations.

The limitations of our methodology increase with the depth of the altered subsoil, beyond 5.80 m. As our research has shown, at greater depths it is difficult to obtain a reliable electrical resistivity section on which to base later consolidation injections. Although at such depths another geophysical methodology and another type of deep consolidation intervention, using piles or micropiles, would be necessary, our methodology is appropriate for emergency surface consolidations in disturbed porous subsoils. Because researchers work safely outside the affected building, this methodology offers them an accurate lithological model and correctly highlights subsurface anomalies. Hence, ERT-3D allows them to determine the causes of the instability of the foundation and building, to decide how the subsoil can later be consolidated and stabilized, and to monitor the chosen solution.

**Author Contributions:** Conceptualization, A.G.-M. and R.C.; methodology, A.G.-M.; software, A.G.-M. and M.F.-H.; validation, A.G.-M., J.I.Y. and M.F.-H.; formal analysis, A.G.-M. and M.F.-H.; investigation, A.G.-M., J.I.Y. and R.C.; writing—original draft preparation, A.G.-M. and R.C.; writing—review and editing, R.C., A.G.-M.; visualization, R.C. and M.F.-H.; supervision, R.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** Some or all data, models, or approaches that support the findings of this study are available from the corresponding author upon reasonable request.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


*Article*

## **Dynamic Effect of the Earth Fissure Sites in the Yuncheng Basin, China**

**Ge Cao <sup>1</sup> , Yahong Deng 1,2,\*, Jiang Chang <sup>1</sup> , You Xuan <sup>1</sup> , Nainan He <sup>1</sup> and Huandong Mu <sup>3</sup>**


**Abstract:** Earth fissures are widely distributed worldwide, and the Fenwei Basin in China is one of the regions with the most significant number and scale of fissures in the world. The Yuncheng Basin is an important constituent basin of the Fenwei Basin in China, where earth fissures are densely developed and cause severe damage. In particular, the impact of earth fissures on the seismic response of the site is still unknown and is an urgent problem that needs to be solved. Based on microtremor tests, three types of typical earth fissure sites in the Yuncheng Basin were selected for field testing. Through spectrum analysis, the dynamic response characteristics of the earth fissure sites were determined. The results show that the dynamic response of the site is significantly affected by the earth fissures. The dynamic response strength of the site is the largest on both sides of the earth fissures, and it decreases and gradually stabilizes with increasing distance from the fissures. The influence range of the earth fissures on the hanging side is slightly larger than that on the heading side.

**Keywords:** earth fissure; microtremor testing; spectrum analysis; amplification effect

**Citation:** Cao, G.; Deng, Y.; Chang, J.; Xuan, Y.; He, N.; Mu, H. Dynamic Effect of the Earth Fissure Sites in the Yuncheng Basin, China. *Appl. Sci.* **2023**, *13*, 9923. https://doi.org/ 10.3390/app13179923

Academic Editors: Miguel Llorente Isidro, Ricardo Castedo and David Moncoulon

Received: 8 August 2023; Revised: 28 August 2023; Accepted: 31 August 2023; Published: 1 September 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

## **1. Introduction**

Earth fissures are linear tensile fissures with a certain extension length that develop on the Earth's surface and may be accompanied by vertical fissures. They are affected by various complex factors such as internal and external forces and human activities [1]. Earth fissures have developed widely around the world and occur in the United States, Mexico, Africa, Europe, and throughout most of China [2,3]. Thus, the formation and development of earth fissures seriously affect infrastructure construction and restrict national economies [4].

The study of earth fissures began in 1929 [5,6], and humans have been studying earth fissures for a century. The first step in preventing and controlling earth fissure hazards is to identify the mechanisms of their formation and evolution. There are many factors affecting the formation of earth fissures, and the explanations of the formation mechanisms of earth fissures by domestic and foreign scholars can mainly be divided into three categories: ground subsidence induced by tectonic genesis [7,8], groundwater genesis [9–11], and structural and groundwater compound genesis [12–14]. Among them, China is one of the countries with the largest scale of earth fissure development, the broadest distribution, and the most severe surface damage caused by earth fissures. Since 2000, more than 5000 earth fissures have been discovered in China, mainly in the Fenwei Basin, the Hebei Plain, and the Su-Xi-Chang area (Figure 1) [15]. Compared to the other two regions, the planar spread of the earth fissures within the Fenwei Basin is the longest and most active. There have been many excellent results and consensus on the distribution pattern, activity characteristics, and genesis mechanisms of fissures in the Fenwei Basin [16–28].


**Figure 1.** Geographical location of the Yuncheng Basin.


The existence and activity of earth fissures do not only cause direct damage to surrounding structures; when an earthquake occurs, the presence of earth fissures can change and even increase the seismic response of the site, causing the structures near the earth fissures to suffer severe damage or even be destroyed. Thus far, research on the development, distribution characteristics, and formation mechanisms of earth fissures all over the world has produced results, but our understanding of the dynamic effects of earth fissure sites is insufficient. Several studies have investigated the dynamic effects of earth fissures under earthquake loads based on numerical simulations [29,30], and others have used indoor simulation experiments to explore the influence of earth fissures on the dynamic response of a site [31–35], but the existing results are few and unsystematic and are not sufficient to guide engineering practice. Therefore, the specifications for site investigation and engineering design on Xi'an ground fractures (DBJ61-6-2006) [36] only regard the seismic effects of earth fissure sites as a general site threat, in accordance with the code for the seismic design of buildings (GB50011-2010) [37]. Furthermore, because there are no systematic measurement data on the seismic response of earth fissure sites, in order to study this problem it is necessary to develop new ideas and appropriate methods.

A microtremor is a kind of constant micro-movement that has no specific seismic source and can be observed at any time, and its amplitude is generally only a few microns. Microtremor studies originated in the 1950s. Kanai and Tanaka [38] quantitatively analyzed the nature of microtremor surface waves. After this, many research works were carried out on the sources, formation mechanisms, and waveforms of microtremors [39–42]. In the late 1960s, Toksoz and Lacoss used a seismic network to separate, extract, and analyze the various periodic components of microtremors [43,44] and reported on the components of the different bands of the microtremors and their possible causes. In the 1990s, Nakamura proposed a new microtremor method [45]. This horizontal-to-vertical spectral ratio (H/V) method has gradually become the focus of the analysis of site dynamics around the world. Many scholars have tried to explain the theoretical basis of this method from different angles [46,47]. In 1994, Lermo and Chavez-Garcia [48] summarized the three main types of microtremor analysis methods and concluded that the H/V method could effectively eliminate the source effect. Since then, the H/V method has been widely used in the field of engineering. Many scholars in China have also contributed to the microtremor theory and practical engineering applications [49–51]. A large number of studies have shown that microtremor measurement is an efficient, economical, and convenient method for testing the dynamic characteristics of a site, and the records contain a large amount of information on the site soil structure that can be used as field measurement data. Moreover, the microtremor test has begun to be applied to the topic of the dynamic effect of earth fissure sites [52,53], and these studies showed that the existence of earth fissures does indeed aggravate the vibration intensity of the site.
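To make the H/V idea concrete, the following minimal Python sketch computes a Nakamura-style spectral ratio from three-component records. The function name, the geometric-mean combination of the horizontals, and the moving-average smoothing are our illustrative assumptions, not details taken from the cited studies.

```python
import numpy as np

def hv_spectral_ratio(ew, ns, ud, fs, smooth_win=11):
    """Nakamura-style H/V ratio from three-component microtremor records.

    ew, ns, ud : equal-length arrays (east-west, north-south, vertical).
    fs         : sampling frequency in Hz.
    Returns (frequencies, H/V ratio).
    """
    n = len(ud)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # One-sided Fourier amplitude spectrum of each component.
    a_ew = np.abs(np.fft.rfft(ew))
    a_ns = np.abs(np.fft.rfft(ns))
    a_ud = np.abs(np.fft.rfft(ud))
    # Combine the horizontals (geometric mean is one common choice).
    a_h = np.sqrt(a_ew * a_ns)
    # Light moving-average smoothing to stabilize the ratio.
    kernel = np.ones(smooth_win) / smooth_win
    a_h = np.convolve(a_h, kernel, mode="same")
    a_ud = np.convolve(a_ud, kernel, mode="same")
    return freqs, a_h / np.maximum(a_ud, 1e-12)

# The site's predominant frequency can then be read off the H/V peak, e.g.:
# freqs, hv = hv_spectral_ratio(ew, ns, ud, fs)
# band = (freqs > 0.5) & (freqs < 20.0)
# f0 = freqs[band][np.argmax(hv[band])]
```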

The Yuncheng Basin is an important part of the Fenwei Basin and contains the most well-developed earth fissures in China. The earth fissures in the Yuncheng Basin have caused serious disasters and have become the most notable geological hazard in this area. Furthermore, the basin has a large population and developed agriculture and industry, and the earth fissures not only cause direct damage to local roads and houses but also severely restrict the construction of urban infrastructure, and thus they restrict the economic and social development of the Yuncheng Basin. The results of this study provide theoretical support for the development of seismic fortification and avoidance measures for the earth fissure sites in this area.

## **2. Regional Structure**

From north to south, the Fenwei graben system consists of the Datong Basin, the Xinding Basin, the Taiyuan Basin, the Linfen Basin, the Yuncheng Basin, the Weihe Basin, and several other large fault basins (Figure 1). The Yuncheng Basin is located in the southwestern part of Shanxi Province and has a total area of 4885 km<sup>2</sup>.

The Yuncheng Basin is a Cenozoic faulted basin superimposed on the multi-cyclic superimposed Ordos Basin. The sedimentary strata have been deformed by multiple tectonic movements and have experienced multiple tectonic periods, such as the Yanshanian and Himalayan movements. A representative large rift basin formed later [27]. The basic framework of the Zhongtiao mountain uplift area and the Emei platform uplift area in the basin was laid by the strong tectonic movement in the Yanshanian period, and it was mainly affected by the Himalayan movement during the Paleogene. Large-scale fault block tectonic movement occurred in this area. The tectonic setting of the basin has changed from a compressive and twisting thrust fault to a tension and twisting normal fault, and the embryonic form of the basin has been modified since then. The main structural deformations preserved today include a fault system in the Himalayan strike-slip extensional background [2].

The main forms of tectonic movement in the Yuncheng Basin are faults. The basin was controlled by the Yanshan movement in the Mesozoic era, the Himalayan movement in the Cenozoic era, and especially by tectonic movement since the Late Cenozoic era, and therefore fault structures have been well-developed in the basin. The main active fault in the basin is the large fault on the northwest side of Zhongtiao mountain, which is also the main controlling boundary fault in the Yuncheng Basin [25]. There are eight active faults in the basin, with NE, NEE, and NNE strikes. They enclose the boundaries of the Yuncheng Basin and the secondary structural units inside the basin, and they mutually restrict and control the structural framework of the basin [27]. The Yuncheng Basin is bordered by the Emei Platform to the west, and the Zhongtiao mountains to the southeast. The NE-trending Mingtiaogang in the middle of the basin divides the Yuncheng Basin into the Sushui river plain and the Qinglong river plain. The basin is an asymmetrical sag basin that is deep in the south and shallow in the north.

The basin contains a large amount of Cenozoic strata, accounting for about 80% of the total area, and the thickness of the strata increases from the northeast to the southwest. The thickness of the Cenozoic strata in the basin is generally greater than 1000 m, and the Quaternary sediments are also more than 300 m thick. This loose, thick sedimentary layer provided the material basis for the extensive development of earth fissures.

In general, factors such as the uplift of the horst in the Yuncheng basin, the faulted basement, the over-exploitation of groundwater, and the loose, thick sedimentary layer have laid the foundation for the development of earth fissures in the basin.

### **3. Characteristics of the Earth Fissures in the Yuncheng Basin**


The development of earth fissures is controlled by tectonic movement. The earth fissures in a basin are mainly affected by the secondary structural units. The Yuncheng Basin has been affected by the Himalayan tectonic movement since the Cenozoic era. Due to the control of the faults at the northern foot of the Zhongtiao mountain, the basement has been in a faulted extensional environment. Under the strong tectonic activity of the Zhongtiao mountain fault, the secondary structural unit in the basin was controlled, which intensified the uplift of the block and caused the formation of the Mingtiaogang earth fissures. The dip slip extension of the hanging wall normal fault in the basin caused the tension shear cracking of the surface soil layer, which laid the structural foundation for the development of earth fissures in the basin. In addition, since the 1980s, the over-exploitation of groundwater has been severe in this area, causing the groundwater level to continuously fall and the falling funnel to expand year by year, which has induced a large number of earth fissures in the Yuncheng Basin. Thus, the Yuncheng Basin has become the faulted basin with the largest number of earth fissures in Shanxi.


The earth fissures in the basin are mainly distributed on top of and on both sides of the Mingtiaogang uplift within the basin, in the Zhongtiao mountain uplift to the southeast, and in the Emei platform front edge to the north (Figure 2). The earth fissures in the Yuncheng Basin are distributed along the boundaries of the geomorphology, are concentrated along the fault zone, and are associated with land subsidence. The earth fissures predominantly have NE strikes. According to our survey, the earth fissure disasters in the basin mainly occurred before 1980 and between 1995 and 2005. A total of 119 earth fissures (belts) have developed in the study area. They mainly strike NE and are generally 100–2000 m long, with the longest reaching up to 5000 m. They are generally 0.05–0.5 m wide, and the widest can reach up to 2 m. The small earth fissures account for 45% of the earth fissures (length less than 500 m); the medium earth fissures account for 28% (length between 500 and 1000 m); and the large earth fissures account for 17% (length between 1000 and 5000 m). There is only one giant earth fissure in the basin. It is located in the salt lake district and is 10 km long and 0.3–1 m wide.

**Figure 2.** Structure and earth fissure distribution map of the Yuncheng Basin.

Figure 3 shows six typical earth fissures in the Yuncheng Basin and their profiles. Earth fissure F<sup>1</sup> (Pleistocene earth fissure on the southern margin of the Mingtiaogang uplift) is located in the salt lake district of Yuncheng. It is the only giant earth fissure in the Yuncheng Basin. The earth fissure F<sup>1</sup> appeared in 1975. It has a long extension, large scale, and strong continuity, and is not controlled by the pavement, buildings, or roads along the route. It is 10 km long and crosses Taocun–Banpo–Wucao–Xincao and other villages, and it is very destructive along the strike (Figure 4a,c). Earth fissure F<sup>1</sup> strikes NE, which is consistent with the active fault in the underlying bedrock. In a plane view, the earth fissure F<sup>1</sup> is long, and its overall shape is linear. In profile, the site is dominated by loess and paleosol, with secondary fissures on either side of the main fissure symmetrically developed and approximately parallel. The vertical displacement between the two sides near the surface of the main fissure is small, while the deep dislocation is large, showing the characteristics of a synsedimentary fault (Figure 3a).


**Figure 3.** The locations of six typical earth fissures and survey lines in the Yuncheng Basin and their profiles. (**a**) Profile of F<sup>1</sup> earth fissure site, (**b**) profile of F<sup>2</sup> earth fissure site, and (**c**) profile of F<sup>5</sup> earth fissure site.

The Zhongtiao mountain piedmont fault zone is located in the southern part of Xia County. Affected by tectonic movement, the five earth fissures (F<sup>2</sup>–F<sup>6</sup>) developed in the Xia County area are all distributed on the northwestern side of the Zhongtiao mountain fault zone. The earth fissures are distributed in parallel rows at nearly equal intervals from southeast to northwest (Figure 3). The earth fissure F<sup>2</sup> is farthest (~9.5 km) from the Zhongtiao mountains in Xia County. It has the most significant vertical surface dislocation among the five earth fissures in Xia County. The largest surface vertical dislocation occurs in the playground of Xinmiao Primary School in Yuwang, where the surface dislocation reaches 30 cm (Figure 4b,d). The earth fissure F<sup>2</sup> appeared in 1998 and developed rapidly from 2007 to 2008. Its strike is consistent with the trend of the Zhongtiao mountains, and it extends for 3.9 km; its scale is relatively large. The earth fissure F<sup>2</sup> passes through Yuwang and other places, and it is more destructive along the strike. As can be seen from the cross-section shown in Figure 3b, the formation is composed of interbedded silt and silty clay, and the vertical dislocation increases with increasing depth. The earth fissure F<sup>5</sup> is 5 km from the Zhongtiao mountain fault zone, making it the closest of the five earth fissures to the fault zone. The earth fissure F<sup>5</sup> starts in Yuguo and ends in Zhongwei. Its overall trend is NE55°, which is consistent with the trend of the Zhongtiao mountains. It extends for about 2.3 km, and the vertical dislocation of the ground surface is 8–20 cm. The earth fissure F<sup>5</sup> appeared in 2000, and it entered a period of rapid development after 2007. It is active and destructive along the strike, and it is still developing. The earth fissure is a tilt-slip tension crack with obvious horizontal extensional movement and vertical differential movement. The differential settlement on both sides of the earth fissure is obvious, and the maximum vertical dislocation is 13.5 m. As can be seen from the cross-section shown in Figure 3c, the formation is composed of silt and silty clay. The stratum is relatively weak.


**Figure 4.** Earth fissure hazards in the Yuncheng Basin. (**a**–**c**) earth fissure hazards, (**d**) surface dislocations caused by earth fissures.

## **4. Microtremor Tests and Analysis**

### *4.1. Survey Line and Data Point Layout*

Survey lines were laid along the previously described earth fissures (F<sup>1</sup>, F<sup>2</sup>, and F<sup>5</sup>). Two survey lines (S1 and S2) were laid in Banpo and Wucao for the earth fissure F<sup>1</sup>, one survey line (S3) was laid in Yuwang for the earth fissure F<sup>2</sup>, and one survey line each was laid in Yuguo and Zhongwei (lines S4 and S5, respectively) for the earth fissure F<sup>5</sup>. As an example, the survey line layout of F<sup>2</sup> in Yuwang is shown schematically in Figure 5. The survey lines were perpendicular to the earth fissures. Each survey line contained 18 data acquisition points, with 9 measurement points set on either side of the earth fissure. The survey lines were about 60 m long.

**Figure 5.** Schematic diagram of microtremor testing (A represents the hanging side, B the heading side, and M a measurement point; the same applies in the following figures).

#### *4.2. Equipment and Methodology*

The testing instrument used for the microtremor measurements was a high-sensitivity servo-type velocity network seismograph (CV-374AV) manufactured by Tokyo Sokushin Co., Ltd. (Tokyo, Japan). The sampling frequency of the instrument was 0.1–100 Hz, and it can record microtremor data in three orthogonal directions simultaneously. The instrument therefore meets the requirements of the microtremor test.

The tests were carried out at night when it was quiet and in good weather in order to avoid obvious vibration sources. The microtremors in the X, Y, and Z directions were measured at each measurement point, and each measurement point was monitored for more than 10 min. If pedestrians or vehicles passed by during the test, these details were recorded, and the affected time period was avoided as much as possible when selecting the data.

From each reliable recorded signal, a segment of the velocity time history was intercepted and converted into an acceleration time history. The intercepted data were imported into the SeismoSignal program. After preprocessing, including filtering and baseline correction, Fourier spectrum, response spectrum, and Arias intensity analyses were performed.
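As a rough illustration of this processing chain (outside SeismoSignal), the sketch below differentiates a velocity record, applies a baseline correction and a band-pass filter, and returns the Fourier amplitude spectrum. The filter order and the 0.5–20 Hz band are assumed values for illustration only, not the authors' settings.

```python
import numpy as np
from scipy import signal

def preprocess_and_spectrum(vel, fs, band=(0.5, 20.0)):
    """Convert a velocity record to acceleration and return its Fourier
    amplitude spectrum after baseline correction and band-pass filtering.

    vel : 1-D velocity time history (m/s); fs : sampling frequency (Hz).
    """
    # Differentiate velocity to obtain acceleration.
    acc = np.gradient(vel, 1.0 / fs)
    # Baseline correction: remove mean and linear trend.
    acc = signal.detrend(acc, type="linear")
    # Zero-phase Butterworth band-pass filter (4th order, illustrative).
    b, a = signal.butter(4, band, btype="bandpass", fs=fs)
    acc = signal.filtfilt(b, a, acc)
    # One-sided Fourier amplitude spectrum.
    freqs = np.fft.rfftfreq(len(acc), d=1.0 / fs)
    amp = np.abs(np.fft.rfft(acc)) / len(acc)
    return acc, freqs, amp
```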

### **5. Microtremor Analysis of the Earth Fissure Sites**

Figures 6–10 show the results of the microtremor spectrum analysis of the five survey lines. Figures 6 and 7 show the results obtained from the S1 and S2 survey lines on the F<sup>1</sup> earth fissure site, respectively. Figure 8 shows the result of the F<sup>2</sup> fissure site, and Figures 9 and 10 show the spectral results of the F<sup>5</sup> earth fissure site.


**Figure 6.** Fourier spectrum, response spectrum, and Arias intensity analysis results for line S1.

**Figure 7.** Fourier spectrum, response spectrum, and Arias intensity analysis results for line S2.

**Figure 8.** Fourier spectrum, response spectrum, and Arias intensity analysis results for line S3.


**Figure 9.** Fourier spectrum, response spectrum, and Arias intensity analysis results for line S4.

**Figure 10.** Fourier spectrum, response spectrum, and Arias intensity analysis results for line S5.

The Fourier spectral results show that the same earth fissure site yielded consistent Fourier spectral features, while the spectral patterns from different fissure sites were distinctive. As can be seen from Figures 6 and 7, the Fourier spectrum patterns of the F<sup>1</sup> earth fissure site are dominated by 'single-peak' spectra, with small spectral areas, prominent main peaks, and narrow spectral energy distribution intervals. From Figures 8–10, it can be seen that the Fourier spectra of the F<sup>2</sup> and F<sup>5</sup> fissure sites are both broad and dominated by 'multi-peak' spectra, with more secondary peaks and a relatively wide range of spectral energy distribution. The Fourier spectral pattern is dependent on the fissure site, with the F<sup>1</sup> fissure site having relatively hard soil conditions, dominated by interbedded loess and paleosol. The F<sup>2</sup> and F<sup>5</sup> fissure sites are in different locations, but the profiles reveal that the two sites have similar soil conditions, dominated by silty clay intercalated with silt, with a relatively weak soil layer. Therefore, the Fourier spectral patterns of the earth fissure sites F<sup>2</sup> and F<sup>5</sup> are similar, and they are significantly different from the Fourier spectral characteristics of the F<sup>1</sup> site.

Further analysis of the predominant frequencies revealed that the presence of earth fissures had little effect on the predominant frequencies of the site. This means that the predominant frequencies of the same survey line did not change significantly with increasing distance from the earth fissures, and the ranges of the predominant frequencies on the hanging and heading sides were basically the same. From the results of the spectral analysis, it can be concluded that the predominant frequency of the F<sup>1</sup> earth fissure site is 6–7 Hz, that of F<sup>2</sup> is 3–5 Hz, and that of F<sup>5</sup> is 3–5 Hz. However, because of the different soil structures of different earth fissure sites, the predominant frequencies of the different sites are different.

The response spectra of the same fissure sites were also generally consistent, with all three types of fissure sites showing a predominantly 'single-peak' response spectrum, with prominent main peaks and few or no secondary peaks. Similar to the results of the Fourier spectrum analysis, the response spectrum of the F<sup>1</sup> fissure site was the narrowest among all of the fissure sites, while the F<sup>2</sup> and F<sup>5</sup> fissure sites revealed a similar response spectrum with a slightly larger spectral area than that of F<sup>1</sup>. All three fissure sites had a relatively concentrated predominant period, which was distributed in the interval of 0–0.5 s.

The Arias intensity is the curve of the energy accumulation at a measurement point over time, and the differences between the energies of the different measurement points are best seen from the final accumulated values. Based on the Arias intensity of the three earth fissure sites, the closer the measurement point to the earth fissure, the greater the energy accumulated at the measurement point. The accumulated energies of each site reached their extreme values at measurement points A1 and B1, and as the distance increased, the energy gradually decreased and finally stabilized.
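For reference, the Arias intensity of an acceleration record $a(t)$ is commonly defined as

$$I_a(t) = \frac{\pi}{2g} \int_0^t a^2(\tau)\, d\tau$$

so the final value of the running integral is the total accumulated energy measure compared between measurement points. A minimal sketch, assuming a uniformly sampled record and a simple rectangle-rule integration:

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def arias_intensity(acc, fs):
    """Cumulative Arias intensity curve of an acceleration record.

    acc : acceleration time history (m/s^2); fs : sampling rate (Hz).
    Returns the running curve I_a(t); the last sample is the total value
    compared between points at different distances from the fissure.
    """
    dt = 1.0 / fs
    # Rectangle-rule approximation of the integral of a(t)^2.
    return np.pi / (2.0 * G) * np.cumsum(acc**2) * dt

# total = arias_intensity(acc, fs)[-1]
```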

From the spectral results, it can further be found that earth fissures have no significant influence on the inherent characteristics of the site, such as the predominant frequency and the predominant period. However, when we explored the dynamic response of the earth fissure sites, the amplitude differences between the measurement points were notable. As shown in Figures 6–10, there are significant differences in the amplitudes of the various measurement points, but it is difficult to visualize the relationship between these amplitudes and the location of the earth fissure from the spectrum results alone. Therefore, we averaged the amplitudes of each measurement point in the X, Y, and Z directions and obtained the relationship curves between the average amplitudes of the Fourier spectrum, response spectrum, and Arias intensity and the distance from the earth fissure, as shown in Figure 11. As can be seen from the figure, although the response intensity varied from site to site, the measurement points near the earth fissures all exhibited a significant amplification effect, which gradually attenuated and stabilized with increasing distance from the earth fissure.

**Figure 11.** The amplitude attenuation curves of each site.

### **6. Analysis of Dynamic Effects of the Earth Fissure**

### *6.1. Amplification Effect*

In order to more intuitively reveal the degree and scope of the impact of the earth fissures on the dynamic response of the site, the concept of the amplification factor was introduced. Figure 12 shows that the amplitudes of the three measurement points on both sides of the earth fissure on each survey line were basically the same and tended to be stable. Therefore, the stable amplitude was defined as the average of the amplitudes of these three measurement points, and the amplification factor was defined as the average amplitude of each measurement point divided by the stationary amplitude. Figure 12 shows the attenuation curve and fitting curve of the amplification factor with distance. As can be seen from the figure, the earth fissure site amplification effect revealed via Fourier spectrum, response spectrum, and Arias intensity had the same attenuation mode, and the dynamic amplification response of the site was most significant when closest to the earth fissure and tended to decay as the distance from the fissure increased until reaching the area farthest away from the earth fissure, where the dynamic response of the site gradually stabilized and the amplification factor approached 1. This also indicates that the influence of the earth fissure on the dynamic response of the site had a limited range.
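A minimal sketch of this normalization, assuming the nine per-side measurement points are ordered from nearest to farthest from the fissure (the array values in the usage comment are hypothetical, for illustration only):

```python
import numpy as np

def amplification_factors(amplitudes):
    """Amplification factors along one side of a survey line.

    amplitudes : mean spectral amplitudes of the measurement points,
    ordered from nearest to farthest from the fissure. Following the
    definition in the text, the stable amplitude is the mean of the
    three farthest points, and each point is normalized by it.
    """
    amplitudes = np.asarray(amplitudes, dtype=float)
    stable = amplitudes[-3:].mean()
    return amplitudes / stable

# Example with made-up amplitudes decaying away from the fissure:
# amplification_factors([3.1, 2.6, 2.0, 1.6, 1.3, 1.2, 1.05, 1.0, 0.95])
# -> the nearest points exceed 1 strongly; the farthest hover around 1.
```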

According to the amplification factor fitting curves (Figure 12), we can obtain the extreme values of the amplification factor for the different analysis results. As shown in Table 1, the extreme value on the hanging and heading sides obtained by the Fourier method is 1.8–2.1. The extreme value of the amplification factor on the hanging side obtained from the response spectrum is 1.8, and that of the heading side is 1.6, while the Arias intensity has the largest extreme values: 3.2 for the hanging side and 2.9 for the heading side. Because the amplification factors obtained by the three analysis methods differ, we can choose different amplification factors according to the needs of seismic fortification. The Fourier method focuses on the inherent information of the site soil layer, which can intuitively reflect the vibration characteristics of the site. Therefore, we can select the Fourier amplification factor when considering only the dynamic characteristics of the site itself. The response spectrum can reflect the dynamic characteristics of different structural particle systems under ground motion, which is very effective for the seismic fortification of structures. The Arias intensity is a time-dependent curve, indicating the strength of the overall dynamic response of the site over a period of time. According to the amplification factor of the Arias intensity, the peak acceleration of the site under seismic action can be obtained, and the seismic fortification level of the site can be adjusted. The different methods show that the extreme value of the amplification factor on the hanging side of the same earth fissure site is slightly larger than that on the heading side. Therefore, under dynamic load, at the same distance from the earth fissure, the dynamic amplification response of the hanging side is stronger.

**Figure 12.** Amplification factor diagrams of the different analyses. (**a**) Fourier amplification factor and fitting curve, (**b**) response spectrum amplification factor and fitting curve, and (**c**) Arias intensity amplification factor and fitting curve.


**Table 1.** The extreme value of the amplification factor of each analysis method.

## *6.2. Range of Influence*

Based on the amplification factor fitting curves, the distances at which the amplification factor decays to 1.5 and to 1 were used to divide the influence of the earth fissure on the dynamic response of the site into three regions, as shown in Table 2. In future seismic fortification work, the seismic fortification levels of buildings and structures can be adjusted according to the amplification factors in the different areas.

**Table 2.** The influence range of each analysis method.


Where the amplification factor is greater than 1.5, the dynamic response of the site is amplified most intensely. The extent of this zone varies between the three methods: the Fourier and response spectra give relatively close ranges, with the hanging side at about 5.0–7.1 m and the heading side at 3.8–5.9 m, while the zone given by the Arias intensity is larger, with the hanging side at 0–13.3 m and the heading side at 0–9.2 m. Where the amplification factor is between 1.5 and 1, the dynamic response of the site is still significantly amplified, and the buildings and structures in this area should also increase their seismic protection level accordingly. Furthermore, where the amplification factor has attenuated to 1, the site can be considered as no longer being affected by the amplification effect of the earth fissure, and a zone of influence of the dynamic response can therefore be delineated accordingly. The Fourier method yielded hanging and heading side ranges of approximately 20 m, the response spectrum about 18 m, and the Arias intensity an influence zone of 23.9 m for the hanging side and 17.3 m for the heading side.
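The influence ranges above come from the fitted attenuation curves. As an illustration of how such boundaries can be extracted, the sketch below fits an exponential decay toward 1 (our assumed functional form; the paper does not state its fitting function) and solves for the distance at which the factor reaches a given threshold:

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_model(x, a, lam):
    # Assumed decay law: the amplification factor tends to 1 far from
    # the fissure, with near-field excess a and decay length lam.
    return 1.0 + a * np.exp(-x / lam)

def influence_range(distances, factors, threshold=1.5):
    """Fit the amplification-factor decay and return the distance at
    which the fitted curve falls to `threshold`."""
    (a, lam), _ = curve_fit(decay_model, distances, factors, p0=(1.0, 5.0))
    if a <= threshold - 1.0:
        return 0.0  # the fitted curve never exceeds the threshold
    # Solve 1 + a*exp(-x/lam) = threshold for x.
    return lam * np.log(a / (threshold - 1.0))

# e.g. influence_range(dist_m, k_values, threshold=1.5) gives the boundary
# of the strongly amplified zone; a threshold just above 1 approximates
# the outer limit of the fissure's influence.
```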

As a result, the influence ranges given by the same method were similar on the two sides of the fissure, but the hanging side influence range was always greater than that of the heading side. Therefore, from the perspective of practical engineering for seismic protection, new buildings should avoid the influence area wherever possible. If it cannot be avoided, the seismic protection intensity of the building should be increased according to the corresponding amplification factor. Moreover, the dynamic amplification effect of the hanging side has a wider and more extensive scope of influence than that of the heading side, which should be considered in particular when setting the seismic fortification level on the hanging sides of earth fissure sites.

## **7. Conclusions**

(1) The presence of the earth fissure significantly amplifies the peaks of the direct Fourier spectra, the acceleration response spectra, and the Arias intensity within the site. In particular, the peaks of each spectrum increased dramatically in the range of 5 m immediately on either side of the fissure.

(2) The amplification effect at the fissure site follows the pattern of steep increase in the near field, slow rise in the middle field, and steady rise in the far field. Areas with amplification factors higher than 1.5 showed a steep rise in amplification, with both spectral and intensity peaks rapidly increasing to several times the original site and eventually reaching extreme values at the outcrops of the fissures. Areas with an amplification factor between 1.5 and 1 are zones with a slow rise of the amplification effect, and the dynamic response of the site is still affected by the fissures.

(3) The amplification of the dynamic response of an earth fissure site with positive fault characteristics has a significant "hanging side effect." The amplification factor is often higher in the steep rise zone of the hanging side, and the hanging side has a further influence range.

(4) In the actual seismic fortification of the project, it is necessary to avoid the area where the amplification factor rises sharply. Alternatively, the strength of structures and foundations in the area should be increased, and additional seismic fortification measures should be taken. In areas where the amplification factor rises slowly, the seismic fortification level of the structure should also be strengthened according to the amplification factor, so that it can withstand 1.5 times the expected seismic factor.

In this study, based on microtremor tests, typical earth fissures in the Yuncheng Basin of China were taken as the research objects, and the dynamic amplification patterns at earth fissure sites were systematically revealed using spectral characterization of the microtremors. We propose preliminary seismic fortification measures for the different dynamic amplification areas. This study provides a reference for the seismic fortification of sites with similar engineering conditions and for the study of seismic amplification effects at similar sites. In addition, numerical simulation of the seismic responses of fissure sites and different structures is expected to further refine the amplification results and provide more detailed fortification suggestions under different seismic intensities.

**Author Contributions:** Conceptualization, Y.D.; Software, G.C.; Investigation, G.C., J.C., Y.X. and N.H.; Writing—original draft, G.C.; Writing—review & editing, G.C., Y.D. and H.M.; Supervision, Y.D.; Funding acquisition, Y.D. and H.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by [National Natural Science Foundation of China] grant number [41772275], [Fundamental Research Funds for the Central Universities] grant number [300102268203], [Scientific Research Plan Projects of Shaanxi Education Department] grant number [20JK0801], [Natural Science Basic Research Program of Shaanxi Province] grant number 2022JQ-289, [Fundamental Research Funds for the Central Universities] grant number 300102262505, and [Key Research and Development projects of Shaanxi Province] grant number 2022SF-197.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available because the processed data required to reproduce these findings form part of an ongoing study.

**Conflicts of Interest:** The authors declare no conflict of interest.

**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

**Péter Szabó 1,\*, László Tóth <sup>2</sup> and Judith Cerdà-Belmonte <sup>1</sup>**


**Abstract:** In this article we present a space–time epidemic-type aftershock sequence (ETAS) model for the area of Hungary, motivated by the goal of its application in insurance risk models. High-quality recent instrumental data from the period 1996–2021 are used for model parameterization, including data from the recent nearby Zagreb and Petrinja event sequences. In the earthquake-triggering equations of our ETAS model, we replace the commonly used modified Omori law with the more recently proposed stretched exponential time response form, and a Gaussian space response function is applied with a variance add-on for epicenter error. After this model was tested against the observations, an appropriate overall fit for magnitudes *M* ≥ 3.0 was found, which is sufficient for insurance applications, although the tests also show deviations at the *M* = 2.5 threshold. Since the data used for parameterization are dominated by Croatian earthquake sequences, we also downscale the model to regional zones via parameter adjustments. In the downscaling, older historical data are incorporated for a better representation of the key events within Hungary itself. Comparison of long-term large event numbers in simulated catalogues versus historical data shows that the model fit by zone is improved by the downscaling.

**Keywords:** insurance hazard model; earthquake clustering; space–time ETAS model; Hungarian earthquake catalogue; 2020 Petrinja earthquake

## **1. Introduction**

## *1.1. Motivation*

The origin of the study presented in this article is an ongoing effort by the UNIQA Insurance Group (www.uniqagroup.com, accessed on 5 February 2023) to build a proprietary earthquake model for Hungary. Earthquake models in insurance are used to measure the risk of a set of objects, such as buildings, civil engineering structures, and means of transport, with a focus on assessing the potential losses caused by severe earthquakes.

One key application of an earthquake model or, more generally, a natural catastrophe model in insurance is the determination of capital requirements. According to the Solvency II regulatory standards applicable in European Union member countries [1,2], the solvency capital requirement (SCR) of an insurance company shall correspond to the 200-year loss; that is, the insurer must hold their own funds to cover the risk of a loss whose probability of occurrence within a year is 0.5%. The SCR is calculated either by the Solvency II standard formula or by an internal model. For natural perils, the standard formula is a deterministic approximation of the risk based on fixed factors applied to insured amounts of the objects by geographical area. On the other hand, an internal model is based on the stochastic simulation of the peril, and its output is not just an SCR single value but a full probability distribution forecast of the loss. The use of an internal model by an insurance company for calculating the SCR is subject to case-by-case regulatory approval by the competent supervisory authorities, including a comprehensive and lengthy validation process. The development of an internal model requires a significant effort; however, the benefits include a more accurate and more detailed evaluation of the risk, and a range of model uses in risk management tasks beyond the calculation of the SCR. Another typical and important use of stochastic catastrophe models is the optimization and pricing of protection covers for the purpose of risk transfer. Reinsurance risk transfer is an instrument for risk mitigation, and it can reduce the SCR of the ceding insurer.

**Citation:** Szabó, P.; Tóth, L.; Cerdà-Belmonte, J. Hazard Model: Epidemic-Type Aftershock Sequence (ETAS) Model for Hungary. *Appl. Sci.* **2023**, *13*, 2814. https://doi.org/10.3390/app13052814

Academic Editors: Miguel Llorente Isidro, Ricardo Castedo and David Moncoulon

Received: 19 December 2022; Revised: 16 February 2023; Accepted: 18 February 2023; Published: 22 February 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The general architecture of a stochastic earthquake model for insurance is the following: (1) the first fundamental module is a statistical model of earthquake occurrences, which is used to generate a synthetic event catalogue via stochastic simulation. This is followed by two equally important ones: (2) the ground motion attenuation model, translating the catalogue into event footprints, and (3) the vulnerability model, translating the event footprints into the loss of the insured objects in financial terms. All three modules require a full study of their own. In this article, only the first module is discussed, that is, a model of earthquake occurrences.

Earthquake models used in the field of insurance typically focus on the mainshocks only. This simplifies the modelling effort drastically since the mainshocks are assumed to follow a stationary process. This simplification is justified if aftershock losses are low compared to the mainshock loss. Furthermore, customer claims arising from aftershock losses are often not reported separately from the mainshock; therefore, it is a straightforward approach to model them only implicitly via conservative parameterization. Nevertheless, the recent nearby Petrinja 2020 event in Croatia and its aftershocks highlighted some properties of earthquake sequences [3], suggesting that clustering cannot necessarily be safely ignored even in a Central European model: aftershock activity after such a major event may last for months or even years, and some aftershocks (or foreshocks) are significant events on their own. Moreover, there are historical records of complex earthquake sequences (e.g., Romania 1991) lasting for several months and including multiple significant shocks [4]. Large events are still possible late in the sequence, and large aftershocks can trigger second-generation sequences, epidemic style.

## *1.2. Aim of the Study*

The aim of this study is to build an epidemic-type aftershock sequence (ETAS) model for Hungary using recent instrumental earthquake catalogue data. ETAS methodology assumes that, on top of a constant background seismicity, every single earthquake is a potential trigger for subsequent events. Such models have been widely used to analyse earthquake clustering, both in studies covering broad regions and in local studies focusing on individual earthquake sequences. This study builds on existing methodologies; however, the planned implementation for insurance applications is always considered in the choice of methods.

Due to its moderate seismicity, Hungary is not an optimal target area for an ETAS model. Nonetheless, the nearby Zagreb and Petrinja event sequences in 2020–2021 produced rich data that make the parameter fitting feasible. The authors are not aware of a previously published region-specific ETAS study for Hungary relying on these recent catalogue data; therefore, the results may be of interest in their own right.

The data used in the study are described in Section 2. In Section 3, an overall model for Hungary is formulated, and in Section 4, its results are described. In Sections 5 and 6, downscaled parameters are determined, representing local regions within the total modelled area. The motivation for parameter downscaling stems from the application of the model in insurance, where local geographical outputs are needed. Section 7 includes the conclusions of the study.

## *1.3. Background*

Since the initial formulation of the model by Ogata [5,6], ETAS methodology has been researched extensively. Works by several authors suggested different methods and algorithms for parameter estimation [6–10] and investigated the optimal forms of event-triggering functions [6,8,11,12]. The literature on ETAS also includes a discussion of important model properties. One such model property is magnitude independence, i.e., the assumption that the magnitude of an earthquake is independent of predecessor events [8,13]. The stability or criticality conditions of the event triggering, as well as the question of self-similarity, have been investigated [14,15]. Certain enhancements of the model that allow a more accurate modelling of complex processes, such as three-dimensional or finite-source ETAS models, have also been developed [9,16,17]; however, the benefits of these advanced features for modelling a moderately active region like Hungary are not immediately clear.

This study significantly relies on the work of Zhuang et al. [7,8], who developed an iterative algorithm for ETAS parameterization, based on a kernel density estimation of the background activity in combination with stochastic declustering. The authors analyse the properties of the method in regional studies covering areas of Japan and New Zealand, and they suggest a range of methods for testing the different components of the model.

ETAS parameter estimation is affected by known sensitivities and biases, which are discussed at length by Seif et al. [18], who demonstrate these effects on actual and simulated catalogues derived from Italian and South Californian data. Typical sources of parameter bias include early aftershock incompleteness in the wake of a major earthquake, the anisotropy of aftershock clusters, and the choice of the magnitude threshold. These effects also need to be considered in this Hungarian study.

## **2. Data**

The data quality requirements of an ETAS model fitting are challenging. A consistent instrumental earthquake catalogue for Hungary and the seismograph network collecting the measurements for this catalogue have only been in place since 1996 [19], and there has been a scarcity of significant Hungarian events from 1996 until 2022. For a sufficiently rich catalogue, the 2020 Zagreb and the 2020 Petrinja, Croatia, events and their aftershock sequences need to be included in the study. It is relevant to note that the Petrinja mainshock also caused damage and triggered insurance claims within Hungary [20]. These two event sequences have been recorded by the Hungarian seismograph network. On the other hand, both clusters fell largely outside the core geographical window of the Hungarian earthquake catalogue; this means that the study area is extended to the periphery of the data sources where the completeness and accuracy of data is less than optimal. Furthermore, large Croatian event sequences will have a dominant effect on the model fitting.

The core geographical window of the study is the area between latitudes 45.5–49.0 N and longitudes 16.0–23.0 E. In order to include the Zagreb and Petrinja events, a margin to the west and to the south is added by defining the extended window as the area between latitudes 45.3–49.0 N and longitudes 15.7–23.0 E. The magnitude threshold of the study is set at *M<sup>w</sup>* = 2.5.

The data available for the study include both historical and recent instrumental data belonging to several Hungarian earthquake catalogues. These catalogues have been merged and event magnitudes have been homogenized to moment magnitude *M<sup>w</sup>* for this study. Hereafter, the subscript *w* is omitted when referring to magnitudes. The component catalogues are the following (see also Figure 1):


1. Catalogue A: Historical earthquake catalogue data. The number of events included in the extended window that reach the magnitude threshold is 2402. Due to obvious data quality limitations, these events are not used directly for ETAS parameter fitting but only as data for subsequent model downscaling.
2. Catalogue B: Instrumental catalogue covering the core window and the period 1996 to 2019, based on the recordings of the Hungarian seismograph network [19]. The number of events reaching the magnitude threshold is 654.
3. Catalogue C: Instrumental catalogue data for the period 1996 to 2010, collected and merged from international sources [22–24]. While some of the underlying sources are continuously updated, the merged dataset is only available to the end of 2010. The number of events included in the extended window that reach the magnitude threshold is 1107, partly overlapping with catalogue B data in the core window.
4. Catalogue X: Initial event list (IEL) data covering the period 2012 to 2021 [25], based on the recordings of the Hungarian seismograph network. IEL data are an interim phase of the yearly updates of the Hungarian instrumental earthquake catalogue (catalogue B). Geographical coverage is wider than the core window by a margin of 0.2 degrees latitude and 0.3 degrees longitude in all directions. Automated preliminary epicenter determinations are post-processed for the monthly publications of the list, which may result in some coordinate shifts across the window boundary. Thus, the western and southern edge of the extended window corresponds approximately to the geographical scope of the IEL. The number of events from IEL included in the extended window that reach the magnitude threshold is 2313, fully overlapping with catalogue B data in the core window in the period 2012 to 2019.

Microseismic events caused by quarry blasts and explosions had been identified and excluded from the datasets during the preparation of the catalogues prior to this study.

**Figure 1.** Schematic diagram of the coverage and overlaps of the source catalogues used in the study.

It is observed that the events in the Romanian area of the study show unusual magnitude–frequency patterns. This suggests an inhomogeneity of the merger of the primary sources underlying the Romanian part of catalogue C. Therefore, the affected area is excluded from the ETAS parameter fitting due to concerns of data quality. The boundary of the excluded area follows the area source zone boundaries of the SHARE project [26]. Therefore, the exclusion affects a strip within Hungary along the Romanian border.

Magnitude–frequency distributions suggest that the completeness threshold of the remaining catalogue on the core window is *M* ≥ 2.8 at the beginning in 1996, which has gradually improved to *M* ≥ 2.5 by 2000 and to *M* ≥ 2.2 by 2013. The sensitivity contours of the Hungarian seismograph network indicate completeness for *M* ≥ 2.5 on most of the modelled area in the core window, with some lower-sensitivity areas on the periphery and with gradually improving coverage over time [19]. The catalogue is considered complete for *M* ≥ 2.5 in the extended window only since 2018, except for a completeness gap immediately after the Petrinja 2020 mainshock: the detection of early aftershocks after a large earthquake is typically incomplete since the seismograms in this initial period are saturated with overlapping waves from multiple events [18]. Given the trade-off between completeness and the size of the catalogue, we find that the optimal choice of the threshold for this study is *M* ≥ 2.5. A threshold increase would marginalize the part of the catalogue within Hungary itself and would sharply reduce the number of events for estimating the spatial distribution of the background process. For the model parameterization, completeness is assumed on the core window since 2000 and on the extended window since 2018, except for a 1-day period after the Petrinja mainshock.

After merging catalogues B, C, and X, applying the *M* ≥ 2.5 threshold, removing duplications, removing events outside the completeness periods, and removing events in the excluded area, the remaining size of the target catalogue for ETAS parameter fitting is 1978. An additional 236 events reaching *M* ≥ 2.5 are used as auxiliary trigger events; this includes 27 events from year 1999 and 209 events from the Petrinja completeness gap. A map of the fitting catalogue is shown in Figure 2. Another 218 events from catalogues B, C, and X outside the completeness periods are used as additional epicenter nodes for the kernel density estimation of the spatial distribution of background events.


**Figure 2.** (**a**) Location of the model area within Europe. (**b**) Plot of the earthquake catalogue for Hungary used for epidemic-type aftershock sequence (ETAS) model parameterization—target catalogue plus auxiliary events, 1999–2021. The map covers the extended window. The boundaries of the core window (dashed black line) and excluded area (red line) are marked. Events reaching *M* ≥ 4.0 are highlighted.

For model calculations, geographical latitude–longitude coordinates (*ϕ*, *λ*) are converted to an (*x*, *y*) kilometre grid via the following transformation:

$$x = R \cdot \cos\varphi \cdot \tan(\lambda - \lambda_0), \quad y = R \cdot (\varphi - \varphi_0), \tag{1}$$

where *R* = 6 371.01 is the mean radius of the Earth in kilometres. This corresponds to a tangent transverse cylindrical projection whose central meridian is at *λ*<sup>0</sup> and that transforms parallels into straight lines. The point (*ϕ*0, *λ*0) is set at (47.5◦ , 19◦ ), i.e., the rounded coordinates of Budapest, which are transformed into (*x*, *y*) = (0, 0) in the kilometre grid.
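For illustration, a minimal Python sketch of the projection in Equation (1); the function and constant names are our own choices, not part of the original implementation.

```python
import math

R_EARTH = 6371.01     # mean Earth radius in km (value from the text)
PHI0, LAM0 = 47.5, 19.0  # reference point (rounded Budapest coordinates), degrees

def to_km_grid(phi_deg: float, lam_deg: float) -> tuple[float, float]:
    """Transform latitude/longitude (degrees) to the (x, y) km grid of Eq. (1)."""
    phi, lam = math.radians(phi_deg), math.radians(lam_deg)
    phi0, lam0 = math.radians(PHI0), math.radians(LAM0)
    x = R_EARTH * math.cos(phi) * math.tan(lam - lam0)
    y = R_EARTH * (phi - phi0)
    return x, y

# Budapest itself maps to the origin of the grid:
print(to_km_grid(47.5, 19.0))  # (0.0, 0.0)
```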

## **3. Epidemic-Type Aftershock Sequence (ETAS) Modelling Methods**

## *3.1. Model Formulation*

The ETAS model describes earthquake occurrences as a non-stationary Poisson point process where every event can trigger a wave of aftershocks. In a time-only model, the rate of occurrence at time *t*, conditional on the process history *H<sup>t</sup>* , is expressed as [5]

$$
\lambda(t \,|\, H_t) = \mu + \sum_{j:\, t_j < t} K \cdot e^{\alpha(M_j - m_0)} \cdot g(t - t_j), \tag{2}
$$

which is extended to a space–time model as [6]

$$\lambda(t, x, y \,|\, H_t) = \mu \cdot u(x, y) + \sum_{j:\, t_j < t} K \cdot e^{\alpha(M_j - m_0)} \cdot g(t - t_j) \cdot f(x - x_j, y - y_j \,|\, M_j), \tag{3}$$

where the first term represents background seismicity with a constant overall rate *µ* and area density *u*(*x*, *y*), and the sum in the second term reflects the rate of activity triggered by events in the catalogue prior to time *t*. An exponential relationship is assumed between the trigger event magnitude *M<sup>j</sup>* and the triggered rate of activity, where *K* and *α* are constant parameters and *m*<sup>0</sup> is the magnitude threshold. The functions *g* and *f* describe the form of the time and space response, respectively, relative to the trigger event occurrence time and epicenter. The process history *H<sup>t</sup>* = {(*t<sup>j</sup>*, *x<sup>j</sup>*, *y<sup>j</sup>*, *M<sup>j</sup>*) : *t<sup>j</sup>* < *t*} is the catalogue of events before *t*, defined by their occurrence time, location, and magnitude. Depth is not included in this ETAS model as a third coordinate, and the question of the depth distribution is not discussed in this article. This is a reasonable simplification since all events in the study catalogue were shallow crustal earthquakes and all except nine of them had a focal depth of less than 20 km.

It was assumed that event magnitudes follow a truncated exponential (Gutenberg-Richter) distribution independent of the process history [8,13]. The probability density function (PDF) of magnitudes between the bounds *m*<sup>0</sup> and *m<sup>x</sup>* is given by the following:

$$J(m) = \frac{\beta\, e^{-\beta(m - m_0)}}{1 - e^{-\beta(m_x - m_0)}}, \tag{4}$$

where *β* is the frequency decay parameter. According to the magnitude independence assumption, an aftershock can be larger than its trigger event.
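As a sketch, the truncated magnitude density of Equation (4) and inverse-CDF sampling from it, using the values quoted later in Section 3.2 (*β* = 2.602, *m*<sup>0</sup> = 2.45, *m<sup>x</sup>* = 6.55) as defaults; the function names are our own.

```python
import numpy as np

def magnitude_pdf(m, beta=2.602, m0=2.45, mx=6.55):
    """Truncated exponential (Gutenberg-Richter) magnitude density, Eq. (4)."""
    norm = 1.0 - np.exp(-beta * (mx - m0))
    return np.where((m >= m0) & (m <= mx),
                    beta * np.exp(-beta * (m - m0)) / norm, 0.0)

def sample_magnitudes(n, beta=2.602, m0=2.45, mx=6.55, rng=None):
    """Draw magnitudes by inverting the truncated exponential CDF."""
    rng = rng or np.random.default_rng()
    u = rng.uniform(size=n)
    norm = 1.0 - np.exp(-beta * (mx - m0))
    return m0 - np.log(1.0 - u * norm) / beta
```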

For this study, we used a stretched exponential function truncated for a minimum delay as the time response:

$$g(t) = \begin{cases} t^{q-1} \cdot \exp(-\eta t^q) & \text{if } t > \Delta t_0 \\ 0 & \text{if } t \le \Delta t_0 \end{cases} \tag{5}$$

where ∆*t*<sup>0</sup> is a minimum delay parameter, and the normal distribution with an allowance for epicenter error is the space response function:

$$f(x, y \,|\, M) = \frac{1}{2\pi\sigma^2} \exp\left[-\frac{x^2 + y^2}{2\sigma^2}\right], \quad \text{with } \sigma^2 = D^2 \cdot e^{\alpha(M - m_0)} + \varepsilon^2, \tag{6}$$

where *D* is the scale parameter of the magnitude-dependent part of the variance, and *ε* is a parameter representing a variance add-on for epicenter error. Since both Equations (5) and (6) represent a departure from the mainstream techniques, we explain the rationale of these approaches below.
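A minimal Python rendering of the two response functions, Equations (5) and (6); the truncation value ∆*t*<sup>0</sup> = 2.315·10<sup>−4</sup> days and the self-similar choice *α* = *β* = 2.602 are taken from Section 3.2, while the function names are our own.

```python
import numpy as np

DT0 = 2.315e-4  # minimum delay in days (20 s), value from Section 3.2

def g(t, eta, q):
    """Stretched exponential time response truncated below DT0, Eq. (5)."""
    t = np.asarray(t, dtype=float)
    return np.where(t > DT0, t**(q - 1.0) * np.exp(-eta * t**q), 0.0)

def f(dx, dy, M, D, eps, alpha=2.602, m0=2.45):
    """Gaussian space response with epicenter-error add-on, Eq. (6)."""
    sigma2 = D**2 * np.exp(alpha * (M - m0)) + eps**2
    return np.exp(-(dx**2 + dy**2) / (2.0 * sigma2)) / (2.0 * np.pi * sigma2)
```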

The most often used time response function is the modified Omori law [27], which is a power law function in the form of *g*(*t*) = (*t* + *c*)<sup>−*p*</sup>. The stretched exponential alternative has been suggested by Mignan [12], who, besides quantitative tests showing that the stretched exponential form is better fitted to the observations from several regions than the power law, also expressed qualitative concerns about the modified Omori law. Firstly, the positive time shift constant *c* avoids the singularity at zero, but it is difficult to find a physical explanation for it. Secondly, a parameter value of *p* ≤ 1 would mean an infinite number of triggered events unless the process is artificially truncated in time. Furthermore, due to the heavy tail of the power law function, supercritical parameter fitting outcomes, which imply an unstable process, are sometimes obtained when the modified Omori law is used in an epidemic-type model [18]. The minimum delay parameter ∆*t*<sup>0</sup> in Equation (5) is motivated by experience with stochastic simulations based on the model. This truncation was introduced to avoid modelling an unrealistically high number of aftershocks immediately after the trigger event, which were never observed or identified as separate events in actual measurements since the point process model reaches its limits at very short time intervals.

Regarding the space response function, observed aftershock sequences often show anisotropic patterns. Nonetheless, isotropic functions are widely used in ETAS models as a simplification. Both the power law function, e.g., in the form *f*(*x*, *y*|*M*) ∝ [(*x*<sup>2</sup> + *y*<sup>2</sup>)/*e*<sup>*α*(*M*−*m*0)</sup> + *d*]<sup>−*q*</sup>, and the bivariate normal distribution have been discussed in the literature, with a general preference for the power law based on goodness-of-fit diagnostics [6,8]. The Gaussian form is chosen in this study mainly for its analytical simplicity and ease of implementation in the model. Since the exponential scaling parameter *α* is the same in Equations (3) and (6), the space response formula assumes that the aftershock area grows in proportion to the number of events triggered, which is in line with the observed trend known as the Utsu–Seki law [6,27]. This constraint from early ETAS model variants has been challenged in the literature, raising the suggestion for a separate spatial scaling parameter *γ* [8,11]. However, due to the small number of major clusters available in the Hungarian catalogue, it is preferred to keep the number of fitted parameters as low as possible. Finally, for the moderately strong Hungarian events in the study period, a weak spatial scaling trend between trigger event magnitudes and aftershock radii is observed. It is assumed that the underlying effect is the epicenter error, which becomes a dominant factor at low magnitudes; hence the motivation to include the *ε* parameter. We note that the formulation in Equation (6) is a simplified reflection of epicenter error when the space response function *f* is centred on the observed epicenters rather than the underlying ones. In a precise mathematical model, the conditional distributions of the error vectors would be neither isotropic nor independent.

When an isotropic space–time model is fitted to non-isotropic aftershock data, a typical impact is a downward bias of the *α* parameter; an indication of this effect is a significant gap between the *α* values estimated from space–time versus time-only models [18,28]. This effect is avoided by applying the constraint *α* = *β*, motivated by the assumption of self-similarity: when the magnitude threshold of the model is shifted from *m*<sup>0</sup> to *m*<sup>1</sup>, the *K* parameter is rescaled to *K*<sup>1</sup> = *K*<sup>0</sup>·*e*<sup>(*α*−*β*)(*m*1−*m*0)</sup> while ignoring the upper magnitude bound for simplicity; therefore, the model parameters are almost unchanged across magnitude scales if *α* = *β*, except in the vicinity of the upper bound *m<sup>x</sup>*. Such a model also preserves Båth's law, i.e., the observation that the average magnitude difference between a trigger event and its biggest aftershock appears to be invariant [9,14,18] (S1.2.1) [29].

The branching ratio of the model characterizing the stability of the process is defined as the average number of the first-generation descendants triggered by an event and expressed as [8,18]:

$$\varrho = \int_{m=m_0}^{m_x} \int_{t=0}^{\infty} K \cdot e^{\alpha(m-m_0)} \cdot g(t) \cdot J(m)\, dt\, dm. \tag{7}$$

The process is subcritical (stable) if *ϱ* < 1, critical (unstable) if *ϱ* = 1, and supercritical (unstable, potentially explosive) if *ϱ* > 1. In a model describing earthquake recurrence over extended periods, subcritical behaviour is assumed. Therefore, assuming a self-similar process with *α* = *β* and substituting Equations (4) and (5) in Equation (7), the branching ratio is:

$$\varrho = \frac{K}{\eta q} \cdot \frac{\beta (m_x - m_0)}{1 - e^{-\beta (m_x - m_0)}} \cdot \exp\left(-\eta \Delta t_0^q\right), \tag{8}$$

which considers the truncation of the time response function at ∆*t*<sup>0</sup>. There is a near-linear dependence of *ϱ* on the magnitude span *m<sup>x</sup>* − *m*<sup>0</sup>, where the upper bound is difficult to estimate and the lower bound is subject to expert judgement. Therefore, the branching ratio according to Equations (7) and (8) is better viewed as a model property rather than as a physical parameter of the observed earthquake catalogue.
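As a numerical check, a small sketch of Equation (8); the illustrative parameter values are taken from within the ranges reported later in Table 2 and yield a subcritical ratio of about 0.44, consistent with the *ϱ* values reported there.

```python
import numpy as np

def branching_ratio(K, eta, q, beta=2.602, m0=2.45, mx=6.55, dt0=2.315e-4):
    """Branching ratio of the self-similar (alpha = beta) model, Eq. (8)."""
    span = beta * (mx - m0) / (1.0 - np.exp(-beta * (mx - m0)))
    return K / (eta * q) * span * np.exp(-eta * dt0**q)

# Illustrative values from within the Table 2 ranges:
rho = branching_ratio(K=0.0045, eta=0.44, q=0.23)
print(rho, rho < 1.0)  # ~0.44, subcritical
```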

## *3.2. Parameter Fitting*

In this subsection, the parameter fitting process for the space–time ETAS model formulated in Equations (3)–(6) is described. The study period is [0, *T*] with *T* = 8036, where time is measured in days beginning at 2000-01-01 00:00 UTC. The broader study area is *A* = [*X*<sup>0</sup>, *X*<sup>1</sup>] × [*Y*<sup>0</sup>, *Y*<sup>1</sup>] = [−258, 313] × [−245, 167], where space coordinates follow the kilometre grid defined in Equation (1). The core study area *A*<sup>0</sup> and the proper study area *A*<sup>1</sup> are narrower than *A*, so *A*<sup>0</sup> ⊂ *A*<sup>1</sup> ⊂ *A* (see Figures 2 and 3). The lower and upper magnitude bounds are set at *m*<sup>0</sup> = 2.45 and *m<sup>x</sup>* = 6.55, respectively; it is assumed that the magnitude values in the catalogues, rounded to multiples of 0.1, reflect the mid-points of the respective magnitude bins. In addition, the minimum aftershock delay parameter is fixed at ∆*t*<sup>0</sup> = 2.315·10<sup>−4</sup> days (equivalent to 20 s).

The maximum likelihood estimation (MLE) method seeks the parameters where the maximum of the likelihood function is reached. Expressed as a function of the parameter set *θ* = (*µ*, *K*, *η*, *q*, *α*, *D*, *ε*), the log-likelihood function is [6]:

$$\log L(\theta) = \sum_{i \in H_V} \log \lambda_{\theta}(t_i, x_i, y_i \,|\, H_{:i}) - \iiint\limits_V \lambda_{\theta}(t, x, y \,|\, H_t)\, dx\, dy\, dt, \tag{9}$$

where *V* is the space–time target region of the study. The different event ranges considered in Equation (9) must be noted: *H<sup>V</sup>* is the target catalogue, i.e., the list of all the events in *V*. *H<sup>V</sup>* can also be a subset of the fitting catalogue *H*; in other words, auxiliary events from outside the target region *V* can be taken into account as historical triggers in the conditional occurrence rate *λ<sup>θ</sup>* for Equation (9) [7,8]. Whereas *H<sup>V</sup>* is assumed to be a complete catalogue, the completeness of the auxiliary catalogue *H*\*H<sup>V</sup>* is not necessary. In the first term, *H*:*<sup>i</sup>* ⊂ *H* is a shorthand notation for the sub-catalogue of prior events that can have a triggering effect on event *i*. In this case, *H*:*<sup>i</sup>* = *H*<sup>*t*<sub>i</sub>−∆*t*<sub>0</sub></sup> according to the minimum delay assumption.


When implementing the maximum likelihood estimation, the complex non-rectangular form of the space–time target region, shown in Figure 3, must be considered: (1) the extended geographical window itself is not rectangular in the kilometre grid, (2) an area along the Romanian border is excluded due to concerns of data inhomogeneity, (3) the catalogue of the window extension is considered complete only from 2018, and finally, (4) a 1-day completeness gap from the target region beginning after the Petrinja 2020 mainshock is excluded. The exclusion of a time window after a large event is one of the techniques suggested in the literature to deal with the problem of aftershock incompleteness [18,28]. In fact, events falling into this time gap are used as auxiliary events, so their triggering effect is considered in the model, nonetheless.

When an estimate of the background density function *u*(*x*, *y*) is provided, the parameters (*µ*, *K*, *η*, *q*, *D*, *ε*) that maximize the log-likelihood function can be computed through non-linear optimization. In this study, the Davidon–Fletcher–Powell method [30] is implemented using custom Python code. The space integrals in the log-likelihood function in Equation (9) are approximated by integration on the rectangular broader area *A*; the difference is immaterial since the Gaussian kernels in the space functions quickly fall close to zero outside the study area *A*<sup>1</sup>. In accordance with the self-similarity assumption, parameter *α* is set equal to the *β* magnitude–frequency decay parameter of the target catalogue *H<sup>V</sup>*, which is estimated at *β* = 2.602 and is equivalent to a Gutenberg–Richter *b* = 1.13.
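To illustrate the structure of the optimization, below is a self-contained sketch of the log-likelihood of Equation (9) reduced to the time-only model of Equation (2), for which the integral term has a closed form; scipy's general-purpose optimizers stand in here for the custom Davidon–Fletcher–Powell implementation used in the study, and all function names are our own.

```python
import numpy as np
from scipy.optimize import minimize

ALPHA, M0, DT0 = 2.602, 2.45, 2.315e-4  # alpha = beta (self-similarity), threshold, min delay

def G(t, eta, q):
    """Closed-form integral of the truncated stretched exponential g over [0, t]."""
    t = np.maximum(t, DT0)
    return (np.exp(-eta * DT0**q) - np.exp(-eta * t**q)) / (eta * q)

def neg_loglik(params, times, mags, T):
    """Negative log-likelihood of the time-only ETAS model, Eqs. (2) and (9).
    `times` and `mags` are NumPy arrays sorted by occurrence time (days)."""
    mu, K, eta, q = params
    if min(mu, K, eta, q) <= 0:
        return np.inf  # keep the optimizer inside the valid domain
    ll = 0.0
    for i, t in enumerate(times):
        dt = t - times[:i]
        mask = dt > DT0
        dts = np.where(mask, dt, 1.0)  # dummy values where masked out
        trig = K * np.exp(ALPHA * (mags[:i] - M0))
        ll += np.log(mu + np.sum(np.where(mask,
                     trig * dts**(q - 1.0) * np.exp(-eta * dts**q), 0.0)))
    ll -= mu * T + np.sum(K * np.exp(ALPHA * (mags - M0)) * G(T - times, eta, q))
    return -ll

# Hypothetical usage with a catalogue of occurrence days and magnitudes:
# res = minimize(neg_loglik, x0=[0.1, 0.005, 0.4, 0.25],
#                args=(times, mags, 8036.0), method="Nelder-Mead")
```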

The function *u*(*x*, *y*) is approximated by the kernel smoothing of a declustered epicenter catalogue, following an iterative process suggested by Zhuang et al. [7,8] after certain adaptations. An initial declustered catalogue is obtained via the window method, i.e., by removing those events from the full catalogue that fall inside a pre-defined space–time neighbourhood of a bigger event, where the window parameters suggested in [31] (pp. 173–174) are used. Starting from the initial declustering, cycles of the following steps are iterated until convergence is reached:

1. kernel smoothing for *u*(*x*, *y*);
2. log-likelihood optimization for (*µ*, *K*, *η*, *q*, *D*, *ε*);
3. follow-on declustering steps based on the ETAS model.


For a better-smoothed background density estimate, during the kernel smoothing and declustering steps, the epicenter catalogue *E* is broader than the catalogue *H* used for the log-likelihood optimization. The latter is kept narrower to reduce computing time. The epicenter catalogue *E* includes all *M* ≥ 2.5 events in the study area *A*<sup>1</sup> from catalogues B, C, and X in the years 1996–2021, giving a total number of 2 432 potential epicenter nodes before declustering. The incompleteness of *E* implies an inhomogeneous coverage between the core area and the extension margin, which needs to be corrected by appropriate weighting.

Declustering in an ETAS context aims to identify background events rather than mainshocks, as in the parameterization of a stationary Poisson model. Since both the background process and the trigger process contribute simultaneously to the Poisson rate at event *i*, declustering in an ETAS context is non-binary. A background weight 0 ≤ *ξ<sup>i</sup>* ≤ 1 is defined for each event in the epicenter catalogue, expressing the relative contribution of the background process to event *i* [7,8]:

$$\xi_i = \mu \cdot u(x_i, y_i) / \lambda(t_i, x_i, y_i \,|\, H_{:i}). \tag{10}$$

Despite its non-binary nature, the observed distribution of *ξ<sup>i</sup>* is typically strongly U-shaped with modes close to 0 and 1, suggesting that background events can be well distinguished from triggered events in most cases [7,9]. As a simplification, the continuous background weights *ξ<sup>i</sup>* are replaced by the binary background indicators *χ<sup>i</sup>* :

$$\chi_i = \begin{cases} 0 & \text{if } \xi_i < 1/2, \\ 1 & \text{if } \xi_i \ge 1/2. \end{cases} \tag{11}$$

The conversion into binary indicators is mathematically not strictly necessary, as the equations used in the parameter fitting also work with continuous weights. The advantage of binary indicators is that they are robust, so they ensure a quick convergence of the iteration, and they also provide an unambiguous declustered catalogue of background events.
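In code, Equations (10) and (11) amount to a one-line ratio and a threshold; a minimal sketch with our own naming, where the intensity values at the events are assumed to be precomputed.

```python
import numpy as np

def background_weights(mu, u_at_events, lam_at_events):
    """Background weights xi_i of Eq. (10): the share of the background rate
    mu * u(x_i, y_i) in the total conditional intensity at each event."""
    return mu * np.asarray(u_at_events) / np.asarray(lam_at_events)

def background_indicators(xi):
    """Binary declustering indicators chi_i of Eq. (11)."""
    return (np.asarray(xi) >= 0.5).astype(int)
```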

Given the background indicators *χ<sup>i</sup>* , the background density function is approximated as a weighted sum of Gaussian kernels placed around the epicenter nodes [7,8]:

$$u(x, y) = \frac{1}{C} \cdot \left[ \sum_{i \in E_0} \chi_i \cdot k_d(x - x_i, y - y_i) + w \cdot \sum_{i \in E \setminus E_0} \chi_i \cdot k_d(x - x_i, y - y_i) \right], \tag{12}$$

with

$$k_d(x, y) = \frac{1}{2\pi d^2} \cdot \exp\left(-\frac{x^2 + y^2}{2d^2}\right), \tag{13}$$

where *C* is a normalising constant and *E*<sup>0</sup> is the epicenter sub-catalogue corresponding to the core area *A*<sup>0</sup>. The weighting *w* of the sum over *E*\*E*<sup>0</sup> is introduced to correct the uneven coverage of the epicenter catalogue between the core area *A*<sup>0</sup> and the extension margin *A*<sup>1</sup>\*A*<sup>0</sup>, and it is defined such that the corrected total weight in the two area parts is proportional to the respective background occurrence rates, estimated from the most recent declustering step of the iteration. While Zhuang et al. [7,8] suggest a variable bandwidth, a fixed bandwidth is used in this study, set by expert judgement to *d* = √50 = 7.071 km.
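A sketch of the kernel estimate of Equations (12) and (13), up to the normalizing constant *C*; the array shapes and names are our own assumptions.

```python
import numpy as np

D_BW = np.sqrt(50.0)  # fixed bandwidth d = 7.071 km from the text

def background_density(x, y, nodes, chi, in_core, w, d=D_BW):
    """Unnormalized u(x, y) of Eq. (12): a chi-weighted sum of Gaussian
    kernels (Eq. (13)) over epicenter nodes, with weight w applied to
    nodes outside the core area A0 to correct uneven catalogue coverage."""
    dx = x - nodes[:, 0]  # nodes is an (n, 2) array of epicenter coordinates
    dy = y - nodes[:, 1]
    kernels = np.exp(-(dx**2 + dy**2) / (2.0 * d**2)) / (2.0 * np.pi * d**2)
    weights = np.where(in_core, 1.0, w) * chi
    return float(np.sum(weights * kernels))
```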

## **4. Modelling Results**

## *4.1. ETAS Parameters*

The convergence of the fitted parameters after each iteration of the declustering–kernel smoothing–log-likelihood optimization cycle is shown in Table 1; the preferred parameters are those from iteration 3. A plot of the modelled density of the background rate *µ*·*u*(*x*, *y*) is shown in Figure 4 and the histogram of background weights *ξ<sup>i</sup>* and background indicators *χ<sup>i</sup>* is shown in Figure 5. The branching ratio of the model with the parameters after the last iteration is *ϱ* = 0.47. The major clusters in the study catalogue show anisotropy, which could have an impact in an alternative parameterization when the *α* parameter is set free. The result from space–time fitting is *α* = 1.79 while a time-only fitting yields *α* = 2.76—this pattern is consistent with the effects reported by Hainzl et al. [28].

**Table 1.** Convergence steps of the fitted ETAS parameters for the study catalogue. Parameter *α* is fixed through the fitting. The value *µ*<sup>0</sup> is included for information and it reflects the daily frequency of background events within the core area *A*<sup>0</sup>. The preferred parameters are those from iteration 3.


## *4.2. Parameter Ranges*

The estimation of parameter errors is not straightforward, since the errors obtained from analytical solutions can be different from simulation results [32]. Here, two simple tests to derive approximate error ranges are carried out: (1) In the first test, the MLE calculation is re-run on 20 simulated catalogues, using the preferred parameters for the simulation. Each simulated catalogue includes 2214 events, i.e., the same number as the actual study catalogue *H*. For comparability with the actual catalogue, a fixed copy of the *M* = 6.3 Petrinja mainshock is also included at the beginning of each simulation. Table 2 shows the minimum and maximum fitted parameters from these simulations. (2) In the second test, the sensitivity of the MLE calculation to a ±0.1 error in the magnitude of the Petrinja mainshock is reported.
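The first test is essentially a parametric bootstrap; a schematic sketch, where `simulate_catalogue` and `fit_mle` are hypothetical callables wrapping the simulation and MLE machinery described above.

```python
import numpy as np

def parameter_ranges(theta_hat, simulate_catalogue, fit_mle, n_sim=20):
    """Re-fit the MLE on catalogues simulated from the preferred parameters
    and report the min/max of each fitted parameter (test 1 in the text).
    Each simulated catalogue is assumed to include a fixed copy of the
    M = 6.3 Petrinja mainshock for comparability with the actual data."""
    fits = np.array([fit_mle(simulate_catalogue(theta_hat))
                     for _ in range(n_sim)])
    return fits.min(axis=0), fits.max(axis=0)
```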

**Table 2.** Approximate error ranges for the estimated parameters and for the branching ratio *ϱ*.

| | *µ* (1/day) | *K* | *η* | *q* | *D* (km) | *ε* (km) | *ϱ* |
|---|---|---|---|---|---|---|---|
| Ranges from 20 simulations: min | 0.1031 | 0.0044 | 0.3694 | 0.2200 | 0.0389 | 2.2394 | 0.44 |
| Ranges from 20 simulations: max | 0.1284 | 0.0056 | 0.5148 | 0.2805 | 0.0455 | 2.4919 | 0.49 |
| Alternative Petrinja mainshock magnitude *M* = 6.4 | 0.1148 | 0.0045 | 0.4401 | 0.2308 | 0.0357 | 2.3842 | 0.44 |

**Figure 4.** Modelled background density *µ*·*u*(*x*, *y*) plotted at 10 km pixel size (unit is 1/(day·km<sup>2</sup>)).

**Figure 5.** Histogram of background weights *ξ<sup>i</sup>* over the epicenter catalogue of the study. A background weight close to 1 indicates a background event while a background weight close to zero indicates a triggered event.

## *4.3. Event Accumulation Test*

The cumulative number of modelled events is obtained by integrating the conditional occurrence rate function [5]:

$$
\Lambda(t) = \int_0^t \iint\limits_{A(\tau)} \lambda(\tau, x, y \,|\, H_{\tau})\, dx\, dy\, d\tau, \tag{14}
$$

where the notation *A*(*τ*) reflects the time-dependent area coverage of the study. Figure 6 shows the comparison of modelled and observed cumulative event numbers in the study period. The estimated number of non-observed Petrinja aftershocks is 155; this is the difference between the modelled and observed number of occurrences in the 1-day interval after the Petrinja mainshock. When the observed curve is corrected for this difference, the model captures the large Croatian earthquake sequences quite well. A weak point is the seemingly uneven observed background rate, which is not matched by the model: the underlying issue is that the southern part of the study area is unevenly covered by the patchwork of catalogues; this affects not only the southern extension margin but the southern edge of the core window too.

**Figure 6.** Modelled versus observed event accumulation in the study period 2000–2021 for *M* ≥ 2.5. The two large event surges late in the period correspond to the Zagreb and Petrinja, Croatia, 2020–2021 earthquake sequences. The model estimates that 155 occurrences in the Petrinja aftershock sequence are missing from the observed catalogue. Additionally, it is worth noting the step-up of the background rate from day 6576 (1 January 2018), which is the beginning of the period when the study area covers the extended window.

### *4.4. Poisson Goodness-of-Fit Test* study area covers the extended window.

*Appl. Sci.* **2023**, *13*, x FOR PEER REVIEW 14 of 31

The Poisson behaviour of the background process on the core area and of the full non-declustered process on the total modelled area is tested. *4.4. Poisson Goodness-of-Fit Test*

2021 earthquake sequences. The model estimates that 155 occurrences in the Petrinja aftershock sequence are missing from the observed catalogue. Additionally, it is worth noting the step-up of the background rate from day 6576 (1 January 2018), which is the beginning of the period when the

In the case of the background process, the modelled occurrence rate on the core area *A*<sup>0</sup> is *µ*<sup>0</sup> = 0.0643 at the magnitude threshold *M* ≥ 2.5. When splitting the period into *k*-day intervals, the event counts by interval should follow a Poisson distribution with a mean parameter *k*·*µ*<sup>0</sup> according to the model. The outcome of chi-squared tests at a 0.95 confidence level is the following: at the magnitude threshold *M* ≥ 2.5 the catalogue typically fails the test for the period 2000–2021, but it passes the test for the period 2000–2010. At the magnitude threshold *M* ≥ 3.0, with the Poisson mean scaled accordingly, the catalogue passes the test for the full 2000–2021 period. This can be explained by possible data deficiencies, i.e., missing small events along the periphery after the termination of Catalogue C in 2010. The Poisson behaviour of the background process on the core area and of the full non-declustered process on the total modelled area is tested. In the case of the background process, the modelled occurrence rate on the core area 0 is 0 = 0.0643 at the magnitude threshold ≥ 2.5. When splitting the period into day intervals, the event counts by interval should follow a Poisson distribution with a mean parameter ∙ 0 according to the model. The outcome of chi-squared tests at a 0.95 confidence level is the following: at the magnitude threshold ≥ 2.5 the catalogue typically fails the test for the period 2000–2021, but it passes the test for the period 2000–2010. At the magnitude threshold ≥ 3.0, with the Poisson mean scaled accordingly, the catalogue passes the test for the full 2000–2021 period. This can be explained by possible data

In the case of the full non-declustered process, the time transformation *t* 7−→ Λ(*t*), according to Equation (14), is used to convert the process into a stationary one [5]. When splitting the full period into *k*-unit intervals of transformed time, the event counts by interval should follow a Poisson distribution with a mean parameter *k* for *M* ≥ 2.5. Chisquared tests at a 0.95 confidence level fail the model at the threshold *M* ≥ 2.5, but the model passes the tests at *M* ≥ 3.0. This indicates that the observed process has some features at low magnitudes that this simplified model is unable to capture. Figure 7 shows some of the results of the Poisson goodness-of-fit tests. deficiencies, i.e., missing small events along the periphery after the termination of Catalogue C in 2010. In the case of the full non-declustered process, the time transformation ⟼ Λ(), according to Equation (14), is used to convert the process into a stationary one [**Error! Reference source not found.**]. When splitting the full period into -unit intervals of transformed time, the event counts by interval should follow a Poisson distribution with a mean parameter for ≥ 2.5. Chi-squared tests at a 0.95 confidence level fail the model

The following three tests (i.e., distance test, time lag test, and triggering ability test) have been suggested and used with Japanese earthquake data by Zhuang et al. [8] in order to assess how well the space and time response functions of a model match the observations. The next subsections show the test results of this model with the Hungarian data.



**Figure 7.** Examples of chi-squared Poisson goodness-of-fit tests: the distributions of event counts per interval are compared to the modelled Poisson distribution for (**a**) background events in the core area, *M* ≥ 3.0; (**b**) background events in the core area, *M* ≥ 2.5; (**c**) all events, *M* ≥ 3.0; and (**d**) all events, *M* ≥ 2.5.

## *4.5. Distance Test*

This test compares the distribution of observed distances between trigger events and their aftershocks against the modelled space response function [8]. If an event *i* is triggered by an event *j*, the standardized distance between the two events can be defined as:

$$r\_{ij} = \frac{\sqrt{\left(x\_i - x\_j\right)^2 + \left(y\_i - y\_j\right)^2}}{\sqrt{D^2 \cdot e^{\alpha\left(M\_j - m\_0\right)} + \varepsilon^2}}.\tag{15}$$

If the approximation of using a Gaussian space response function in the model is correct, then *rij* should follow a chi distribution with two degrees of freedom (or a Rayleigh distribution with *σ* = 1). Since ancestor–descendant relationships between events cannot be identified unambiguously, the weights *ζij* are introduced to express the fraction of event *i* which is triggered by event *j*:

$$\zeta\_{ij} = \frac{K \cdot e^{\alpha\left(M\_j - m\_0\right)} \cdot g\left(t\_i - t\_j\right) \cdot f\left(x\_i - x\_j,\, y\_i - y\_j \mid M\_j\right)}{\lambda\left(t\_i, x\_i, y\_i \mid H\_{:i}\right)}.\tag{16}$$

In order to build a histogram of the observed distribution of *rij*, the sum of the *ζij* values is taken in each standardized distance bin, where *i* runs over *H<sup>V</sup>* and *j* runs over *H*:*<sup>i</sup>*.
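As a minimal sketch (assuming the pairwise standardized distances *rij* and weights *ζij* from Equations (15) and (16) have already been computed and flattened into arrays), the reconstruction and its theoretical counterpart can be written as:

```python
import numpy as np
from scipy.stats import rayleigh

def distance_test(r, zeta, bins=40, r_max=4.0):
    """Weighted histogram of the standardized triggering distances r_ij,
    each (i, j) pair contributing its triggering weight zeta_ij, compared
    with the Rayleigh(sigma = 1) density implied by the Gaussian space
    response."""
    edges = np.linspace(0.0, r_max, bins + 1)
    observed, _ = np.histogram(r, bins=edges, weights=zeta, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    modelled = rayleigh.pdf(centers, scale=1.0)  # sigma = 1, not 1/sqrt(2)
    return centers, observed, modelled
```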

The test results are shown in Figure 8, and they show a good fit. This outcome seems to be in contrast with the results of Zhuang et al., who conclude that the Gaussian space response form fits poorly to the Japan Meteorological Agency (JMA) catalogue data, as opposed to the power law, which shows good results. However, the authors used the Rayleigh distribution with the wrong parameter for the test, that is, *σ* = 1/√2 instead of *σ* = 1 [8] (Equation (26), p. 7, Figure 7a, p. 10). With the correct theoretical distribution, the difference between the Gaussian model variant and the JMA data is much less than what the 2004 article suggests. In the case of this study, the additional spatial parameter *ε* also helps to achieve better results in this test.

**Figure 8.** Distance test results: reconstruction of the distribution of the standardized triggering distance from observations versus the modelled distribution.

## *4.6. Time Lag Test*

The purpose of this test is to analyse the fit of the modelled time response function to the observations [8]. The test method is analogous to the distance test: instead of standardized distances, the distribution of the time lag *tij* = *t<sup>i</sup>* − *t<sup>j</sup>* between triggered events and their ancestors is reconstructed using the triggering weights *ζij*, as defined in Equation (16). Unlike in the distance test, *i* is allowed to run over the full catalogue *H* rather than excluding the first day after the Petrinja mainshock; as a result, the incompleteness of the catalogue is also visible in the test outcome. Two views of the test output are shown in Figure 9: a plot of the probability density function (PDF) on a logarithmic scale and a complementary cumulative distribution function (CCDF) log–log plot, with the latter suggested as a more powerful diagnostic tool by Mignan [12].
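A sketch of the weighted CCDF reconstruction follows, with array inputs analogous to those assumed in the distance test sketch:

```python
import numpy as np

def weighted_ccdf(lags, zeta):
    """Complementary cumulative distribution of the time lags t_i - t_j,
    each pair weighted by its triggering probability zeta_ij; plotting
    the result on log-log axes gives the CCDF view of Figure 9b."""
    order = np.argsort(lags)
    lags, zeta = np.asarray(lags)[order], np.asarray(zeta)[order]
    tail = np.cumsum(zeta[::-1])[::-1]  # total weight with lag >= lags[k]
    return lags, tail / tail[0]
```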

The test results show the following deviations: a downward deviation of the observed PDF from the model for very short time lags, apparently due to the incompleteness of early aftershock data in the Petrinja earthquake sequence. However, the test shows this deviation only at time intervals of less than 10<sup>−2</sup> days, that is, less than 15 min, whereas in the study catalogue, events with *M* < 3 are missing from the initial part of the observed Petrinja aftershock sequence for more than 10<sup>−1</sup> days, that is, more than 2.4 h. The test also shows a downward deviation of the observed PDF from the model for very long time lags due to the boundary effect at the end of the study period. The latter effect also explains the downward deviation of the observed CCDF from the model.

## *4.7. Triggering Ability Test*

This test compares the modelled and observed triggering ability of events by magnitude class [8]. The observed number of events triggered by an event *j* is expressed as ∑*<sup>i</sup> ζij* when using the triggering weights defined in Equation (16), while the modelled ultimate number is an exponential function of *M<sup>j</sup>*, and the expected number within the study period is obtained by restricting the integration bounds of the time responses. The test results are shown in Figure 10. While the test confirms the exponential trend, it also reveals an upward deviation from the model at low magnitudes. The graph suggests an exponential trend with a lower *α* parameter than that of the model; the likely underlying effect is the anisotropy of aftershock clusters.

**Figure 9.** Time lag test results: reconstruction of the distribution of the time lag between triggered events and their ancestors from observations. (**a**) Plot of probability density. (**b**) Plot of complementary cumulative distribution function (CCDF). To check the hypothesis that the deviation of the observed curve is explained by the time boundary effect, an alternative CCDF is also shown where the time lag is truncated at 400 days.

**Figure 10.** Results of the triggering ability test: comparison of modelled and observed triggered event numbers by trigger event magnitude.


## **5. Parameter Downscaling Methods**

## *5.1. Motivation for Downscaling*

By downscaling, we mean the adaptation of the model parameters to smaller zones within the study area, where a stand-alone ETAS parameter estimation is not feasible due to insufficient data. Due to the data quality requirements of an ETAS study, the parameterization presented in the previous section relies on the instrumental catalogue data from the period 1996–2021, which are dominated by the 2020–2021 Croatian earthquake sequences. On the other hand, the largest historical occurrences within Hungary itself date to earlier than the study period. From the insurance industry point of view, a Hungarian earthquake model needs a regional parameterization, covering the central area around Budapest, since this area has the densest accumulation of insured property in the country. Furthermore, for a full coverage of the country, the geographical coverage gap along the Romanian border would need to be filled.

Sections 5.2 and 5.3 describe an approach for adapting ETAS parameters to regional source zones that incorporates historical earthquake data from a longer period. Therefore, we combined ETAS with standard techniques that have been used to parameterize stationary mainshock-only models. The zonation used in this study is based on the area sources of the SHARE project [26], with zone groupings for larger datasets. Figure 11 shows the map of the zones. The downscaling effort was restricted to the part of the zones within the core geographical window, defined between 45.5–49.0° N and 16.0–23.0° E.

**Figure 11.** Source zones and zone groupings used in the model, based on the area sources of the SHARE project [26]. The core geographical window is also displayed; the downscaled model covers the part of the zones falling within the core window. Zones 01–02–03, Zones 04–10, and Zones 05–06 are grouped for the model. The non-contiguous zones with grey labels are grouped for the model as the residual Zone 99. Zones with empty labels are not modelled.


## *5.2. Maximum Likelihood Estimation with Variable Observation Periods*

In this section, we refer to the maximum likelihood estimation (MLE) method developed by Weichert [33]. In a nutshell, the aim of the estimation is to determine recurrence parameters for mainshocks, defined as the largest earthquakes in each cluster. The accumulation of mainshocks is assumed to follow a stationary Poisson point process with a constant recurrence rate *ν*. It is assumed that event magnitudes are independent and follow a truncated exponential distribution with decay parameter *β* between the bounds *m*<sup>0</sup> and *m<sup>x</sup>*, according to Equation (4). Magnitudes are rounded and grouped into the magnitude bins [*m<sup>k</sup>* − *δ*, *m<sup>k</sup>* + *δ*). The length *T<sup>k</sup>* of the time interval when the historical event catalogue is complete varies by the magnitude class *m<sup>k</sup>*. The number of events in the respective magnitude class and completeness period is denoted by *n<sup>k</sup>*. The parameters *m*<sup>0</sup>, *m<sup>x</sup>*, *δ*, and *T<sup>k</sup>* are determined by prior expert judgement and are fixed in the estimation; the unknown parameters to be fitted are *ν* and *β*. Under these assumptions, the event count *n<sup>k</sup>* is drawn from a Poisson distribution whose mean parameter is:

$$\lambda\_k(\nu, \beta) = \nu \cdot T\_k \cdot \int\_{m\_k - \delta}^{m\_k + \delta} \frac{\beta\, e^{-\beta(m - m\_0)}}{1 - e^{-\beta(m\_x - m\_0)}}\, dm = \nu \cdot T\_k \cdot e^{-\beta(m\_k - m\_0)} \cdot \frac{2 \sinh(\beta \delta)}{1 - e^{-\beta(m\_x - m\_0)}}.\tag{17}$$

Maximizing the corresponding likelihood function *L*(*ν*, *β*) leads to the equations [33]:

$$\frac{\sum\_{k} T\_{k} \cdot m\_{k} \cdot \exp(-\beta m\_{k})}{\sum\_{k} T\_{k} \cdot \exp(-\beta m\_{k})} = \frac{\sum\_{k} n\_{k} \cdot m\_{k}}{\sum\_{k} n\_{k}} = \overline{m},\tag{18}$$

$$\nu = \frac{1 - e^{-\beta(m\_x - m\_0)}}{2\sinh(\beta\delta)} \cdot \frac{\sum\_k n\_k}{\sum\_k T\_k \cdot \exp(-\beta(m\_k - m\_0))}.\tag{19}$$

The recurrence parameter estimates are obtained by solving Equation (18) for *β* and then calculating *ν* from Equation (19). The strength of this MLE method is that it makes optimum use of available data by allowing magnitude-dependent observation periods. The method aims to find only temporal parameters. Its limitation is the reliance on a stationary Poisson event counting process, which normally restricts its application to declustered catalogues. Because the largest event in every cluster is selected, the estimated *β* is expected to be lower than the *β* parameter of the full population.
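The estimation translates into a short root-finding routine; in this sketch, the bracketing interval for *β* is an illustrative assumption, as are the function and argument names:

```python
import numpy as np
from scipy.optimize import brentq

def weichert_mle(m_k, n_k, T_k, m0, mx, delta):
    """Solve Equation (18) for beta by root finding, then evaluate
    Equation (19) for nu; inputs are the magnitude-class centres,
    the event counts, and the completeness period lengths."""
    m_k, n_k, T_k = (np.asarray(a, dtype=float) for a in (m_k, n_k, T_k))
    m_bar = (n_k * m_k).sum() / n_k.sum()

    def eq18(beta):  # weighted-mean condition of Equation (18)
        w = T_k * np.exp(-beta * m_k)
        return (w * m_k).sum() / w.sum() - m_bar

    beta = brentq(eq18, 1e-3, 10.0)  # assumed bracket for the root
    nu = ((1.0 - np.exp(-beta * (mx - m0))) / (2.0 * np.sinh(beta * delta))
          * n_k.sum() / (T_k * np.exp(-beta * (m_k - m0))).sum())
    return nu, beta
```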

In our ETAS model context, the above MLE technique with variable observation periods could be straightforwardly adapted to find the (*µ*, *β*) parameters of the stationary Poisson background process while focusing on the initial event in each cluster rather than the largest one. For that, the calculation of the background weights *ξ<sup>i</sup>* and indicators *χ<sup>i</sup>* according to Equations (10) and (11) would need to be extended to the full historical catalogue. Such an extension is possible, with the following caveats: due to incomplete data, the process may identify some triggered earthquakes as background events. In addition, the high epicenter uncertainty of historical events, and especially the use of macroseismic epicenters, may lead to implausible results unless Equation (10) is adjusted to take this effect into account, e.g., by the convolution of the space distributions with error functions. However, we did not elaborate on this method and its results in this paper because this study took a different approach for the downscaling of ETAS parameters, as seen in Section 5.3.

## *5.3. Transformed Time Estimates*

To be able to perform the MLE with variable observation periods on a non-declustered catalogue, the *t* ↦ Λ(*t*) transformation was used, which converts the event counts into a stationary Poisson point process with unit rate [5], where Λ(*t*) is given by Equation (14).

When the event catalogue is incomplete, the part of the transformed time triggered by missing events is unknown. Given the ETAS parameters (*µ*, *K*, *η*, *q*, *α*) and the magnitude–frequency decay parameter *β*, an estimate is provided for the transformed time interval length Λ(*T*2) − Λ(*T*1) corresponding to the observation period [*T*1, *T*2] where the observations are complete at magnitude threshold *mc*. Here, the space–time region is defined by a geographical zone and an observation period, and, as a simplification, it is assumed that the triggering effects crossing the boundaries of this region are negligible in both directions, which is a reasonable assumption if there are no major events close to the zone and period boundaries. This simplification allows for disregarding the space–time integration bounds of the response functions *g* and *f* and ignoring all events outside the selected region. The total transformed time is divided into the following parts: (1) Λ<sup>0</sup> is the transformed time coming from the background process, (2) Λ1<sup>+</sup> is the transformed time triggered by the above-threshold events, and (3) Λ1<sup>−</sup> is the transformed time triggered by the below-threshold events. Given that the observed below-threshold events only set a lower bound, Λ1<sup>−</sup> is unknown.

$$
\Lambda\_0 = \mu \cdot (T\_2 - T\_1) \tag{20}
$$

$$\Lambda\_{1+} = \frac{K}{\eta q} \cdot \exp\left(-\eta \Delta t\_0^q\right) \cdot \sum\_{j:\, t\_j \in [T\_1, T\_2],\; M\_j \ge m\_c} e^{\alpha(M\_j - m\_0)},\tag{21}$$

$$\Lambda\_{1-} \ge \Lambda\_{1-}^{\min} = \frac{K}{\eta q} \cdot \exp\left(-\eta \Delta t\_0^q\right) \cdot \sum\_{j:\, t\_j \in [T\_1, T\_2],\; M\_j < m\_c} e^{\alpha(M\_j - m\_0)}.\tag{22}$$

Beyond the lower bound in Equation (22), Λ1<sup>−</sup> is estimated in Equation (23) by emulating iterations of the triggering process, tracing back every below-threshold event either to some above-threshold event or to the background:

$$\Lambda\_{1-}^{\text{est}} = (\Lambda\_0 + \Lambda\_{1+}) \cdot \left[ p(m\_c) \cdot \varrho(m\_c) + p(m\_c)^2 \cdot \varrho(m\_c)^2 + \dots \right] = (\Lambda\_0 + \Lambda\_{1+}) \cdot \frac{p(m\_c) \cdot \varrho(m\_c)}{1 - p(m\_c) \cdot \varrho(m\_c)}\tag{23}$$

where *p*(*mc*) is the probability that the magnitude of an event is lower than *m<sup>c</sup>* according to Equation (24), and *ϱ*(*mc*) is the average reproduction rate of events below magnitude *m<sup>c</sup>* according to Equation (25). The truncation at *m<sup>c</sup>* is carried out to avoid double counting with Λ1+.

$$p(m\_c) = \frac{1 - e^{-\beta(m\_c - m\_0)}}{1 - e^{-\beta(m\_x - m\_0)}},\tag{24}$$

$$\varrho(m\_c) = \begin{cases} \dfrac{K}{\eta q} \cdot \exp\left(-\eta \Delta t\_0^q\right) \cdot \beta \cdot \dfrac{m\_c - m\_0}{1 - e^{-\beta(m\_c - m\_0)}} & \text{if } \alpha = \beta, \\ \dfrac{K}{\eta q} \cdot \exp\left(-\eta \Delta t\_0^q\right) \cdot \dfrac{\beta}{\beta - \alpha} \cdot \dfrac{1 - e^{-(\beta - \alpha)(m\_c - m\_0)}}{1 - e^{-\beta(m\_c - m\_0)}} & \text{if } \alpha \neq \beta. \end{cases}\tag{25}$$

Considering the lower bound in (22), the total estimated transformed time is then

$$\Lambda(T\_2) - \Lambda(T\_1) = \Lambda\_0 + \Lambda\_{1+} + \max\left(\Lambda\_{1-}^{\text{est}}, \Lambda\_{1-}^{\min}\right).\tag{26}$$
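Equations (20)–(26) translate directly into code. The following is a sketch under the stated simplifications; the argument names are illustrative, with `dt0` standing for the model's Δ*t*<sup>0</sup> and `mags` holding the observed magnitudes of the events falling inside [*T*1, *T*2]:

```python
import numpy as np

def transformed_time(mags, mu, K, eta, q, alpha, beta, T1, T2, m0, mx, mc, dt0):
    """Estimate Lambda(T2) - Lambda(T1) from Equations (20)-(26) for a
    catalogue that is complete above the threshold mc."""
    mags = np.asarray(mags)
    C = K / (eta * q) * np.exp(-eta * dt0 ** q)        # common trigger factor
    lam0 = mu * (T2 - T1)                                          # Eq. (20)
    lam1p = C * np.exp(alpha * (mags[mags >= mc] - m0)).sum()      # Eq. (21)
    lam1m_min = C * np.exp(alpha * (mags[mags < mc] - m0)).sum()   # Eq. (22)
    p = (1 - np.exp(-beta * (mc - m0))) / (1 - np.exp(-beta * (mx - m0)))  # Eq. (24)
    if np.isclose(alpha, beta):                                    # Eq. (25)
        rho = C * beta * (mc - m0) / (1 - np.exp(-beta * (mc - m0)))
    else:
        rho = (C * beta / (beta - alpha)
               * (1 - np.exp(-(beta - alpha) * (mc - m0)))
               / (1 - np.exp(-beta * (mc - m0))))
    lam1m_est = (lam0 + lam1p) * p * rho / (1 - p * rho)           # Eq. (23)
    return lam0 + lam1p + max(lam1m_est, lam1m_min)                # Eq. (26)
```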

The MLE framework with variable observation periods is used with the above time transformation for downscaling the ETAS parameters (*µ*, *K*, *η*, *q*, *α*, *D*, *ε*) to zones. The starting values are the estimated overall parameters from Table 1. The background rate parameter *µ* is estimated for each zone from the 2000–2021 catalogue as ∑ *χi*/*T*, where *T* is the length of this period in days; because the *χ<sup>i</sup>* indicators are robust, they are not recalculated with the downscaled parameters. The time response shape parameters (*η*, *q*) are left unchanged during the downscaling as most of the zone sub-catalogues have insufficient data to reparametrize them. The trigger scale parameter *K* is rescaled to *K*·*r* for each zone individually, and the *α* parameter is adjusted by zone using the *α* = *β* assumption. The transformed time intervals corresponding to the observation periods change with the rescaling. Starting from *r* = 1, the following iteration steps are repeated until convergence:

1. Parameter *β* is obtained by solving Equation (18) for the non-declustered event set in the transformed time, and *α* is set equal to *β*.

2. The scaling factor *r* is set such that *ν* = 1 for the non-declustered event set in the transformed time, where *ν* is calculated according to Equation (19).

The data by zone are not sufficient for an independent rescaling of the space response parameters *D* and *ε*. Therefore, *ε* is kept unchanged while *D* is rescaled to *D*·*r*<sup>1/2</sup>; the latter assumption keeps the number of triggered events per unit of aftershock area approximately invariant.

## **6. Parameter Downscaling Results**

## *6.1. Downscaled Parameters*

The magnitude bounds are set at *m*<sup>0</sup> = 2.45 and *m<sup>x</sup>* = 6.55, and the magnitude class width is 2*δ* = 0.1. For simplicity, the same upper bound is used for all zones. Considering the maximum magnitudes of the model variants as in Tóth et al. [34], *m<sup>x</sup>* = 6.55 appears to be a conservative choice for the core geographical window. The largest homogenized magnitude in the historical catalogue observed within this area is *M* = 6.1, which is estimated for the Érmellék 1834 earthquake. The observation periods by magnitude class are given in Table 3.


**Table 3.** Observation periods by magnitude class.

The downscaled parameter estimates by zone are shown in Table 4. The obvious boundary effects at the Croatian edge of the core window make the transformed time estimates unreliable in Zone 99. For this zone, all original parameters other than *µ* are kept unchanged. Because of data quality concerns, the parameters are not downscaled in the Romanian border Zone 05–06 either. As a placeholder solution for including the Romanian border area in the model, the kernel estimate of the background density *u*(*x*, *y*) and the declustering indicators *χ<sup>i</sup>* are extended to Zone 05–06 without changing the original ETAS parameters, while applying a 0.5 credibility weight to all events with questionable data quality (that is, all events in Zone 05–06 whose only source is Catalogue C) and restricting the estimation of *µ* to the years 2000–2010. Another problem area is Zone 13 on the northeastern periphery, where there is a very steep frequency decay at low-end magnitudes, which is a possible indication of the presence of an inhomogeneous sub-population. Therefore, in this zone, the lowermost magnitude classes (*M* < 2.7 for the estimation of *µ* and *M* < 2.6 for the transformed-time MLE) are excluded from the calculation. Thereby, some low-magnitude residual activity in this zone is left unmodelled.

## *6.2. Sensitivity of the Downscaled Parameters*

For assessing the robustness of the downscaling, the parameters are recalculated using two alternative parameterizations of the overall model from Section 4.2, i.e., with the two parameter sets calculated with alternative magnitudes of the Petrinja mainshock. The results of this calculation are shown in Table 5. Except for *D*, the changes in most parameters are small. One cannot conclude from these results that the errors of the downscaled parameters are small; however, the test suggests that the sensitivity of these parameters to the overall parameterization is limited and that the parameters depend more strongly on the zone catalogues.

**Table 4.** Preferred ETAS parameters by zone after downscaling. The event numbers used for parameter downscaling and the implied branching ratios *ϱ* are also shown. The downscaling approaches were not applied to the Romanian border Zone 05–06 due to data deficiencies and to the residual Zone 99 due to boundary effects. Note that the *µ* background rates (without Zone 05–06) do not add up to the core area total *µ*<sup>0</sup> in Table 1; the difference is due to the non-covered residual activity in Zone 13.


\* No downscaling. \*\* With credibility weights.

**Table 5.** Alternative results for the downscaled parameters, starting from two alternative parameter sets of the overall model calculated with different magnitudes of the Petrinja mainshock, *M* = 6.2 and *M* = 6.4, respectively (cf. Table 2).

| Zone | Alternative | *µ* (1/day) | *K* | *η* | *q* | *α* = *β* | *D* (km) | *ε* (km) | *ϱ* |
|---|---|---|---|---|---|---|---|---|---|
| 01–02–03 | *M* = 6.2 | 0.0101 | 0.0067 | 0.4011 | 0.2700 | 2.4414 | 0.0545 | 2.3035 | 0.60 |
| 01–02–03 | *M* = 6.4 | 0.0101 | 0.0065 | 0.4401 | 0.2308 | 2.4423 | 0.0429 | 2.3842 | 0.60 |
| 04–10 | *M* = 6.2 | 0.0113 | 0.0038 | 0.4011 | 0.2700 | 2.8070 | 0.0410 | 2.3035 | 0.39 |
| 04–10 | *M* = 6.4 | 0.0113 | 0.0037 | 0.4401 | 0.2308 | 2.8073 | 0.0323 | 2.3842 | 0.39 |
| 08 | *M* = 6.2 | 0.0174 | 0.0049 | 0.4011 | 0.2700 | 2.7400 | 0.0466 | 2.3035 | 0.49 |
| 08 | *M* = 6.4 | 0.0174 | 0.0047 | 0.4401 | 0.2308 | 2.7414 | 0.0368 | 2.3842 | 0.49 |
| 11 | *M* = 6.2 | 0.0092 | 0.0054 | 0.4011 | 0.2700 | 2.3870 | 0.0490 | 2.3035 | 0.47 |
| 11 | *M* = 6.4 | 0.0092 | 0.0052 | 0.4401 | 0.2308 | 2.3872 | 0.0387 | 2.3842 | 0.47 |
| 13 | *M* = 6.2 | 0.0059 | 0.0032 | 0.4011 | 0.2700 | 2.8651 | 0.0376 | 2.3035 | 0.33 |
| 13 | *M* = 6.4 | 0.0059 | 0.0031 | 0.4401 | 0.2308 | 2.8680 | 0.0296 | 2.3842 | 0.33 |
## *6.3. Test of Long-Term Event Numbers by Zone*

For this test, 500 stochastic catalogues were generated using the downscaled parameters and another 500 stochastic catalogues with the original parameters, each stochastic catalogue spanning 300 years. This provided a modelled distribution of event numbers by zone before and after downscaling, which were compared with the respective event numbers in the historical catalogue. The comparison focused on magnitudes *M* ≥ 4.0, i.e., events with a potential for insurance losses. Event numbers in the observation periods according to Table 3 were compared to those in the equivalent simulated periods by magnitude class. The test results are shown in Figure 12.




**Figure 12.** Comparison of long-term event counts by zone, modelled and observed. The modelled distributions were obtained by generating 500 stochastic catalogues using the downscaled parameters and another 500 stochastic catalogues with the original parameters, each one spanning 300 years. The observation periods reflecting completeness differ by magnitude class; therefore, the event numbers shown are not always decreasing with the magnitude threshold. (**a**,**c**,**e**,**g**,**i**) Zone-by-zone downscaled model results. (**b**,**d**,**f**,**h**,**j**) Results for the same zones with parameters before downscaling, for comparison. (**k**,**l**) Results for Zones 05–06 and 99 where the downscaling approach is not used. (**m**,**n**) Results for the total modelled area, with and without downscaling.

## *6.4. Test of Event Accumulation by Zone, 2000–2021*

The comparison of modelled versus observed cumulative event numbers (cf. Figure 6) is performed in this test for each zone separately. The results are shown in Figure 13. The model shows a better fit to the observations in the central Zones (01–02–03 and 04–10) than on the periphery; this can also reflect variations of underlying data quality. The biggest deviation is observed in Zone 13, where the smallest events were excluded from the downscaling to obtain a better fit to long-term event counts in higher magnitude classes.

**Figure 13.** Modelled versus observed event accumulation by zone in the study period 2000–2021 for *M* ≥ 2.5. (**a**–**e**) Results of the test shown for each zone. Zones 05–06 and 99 with no downscaling are omitted.

## **7. Discussion and Conclusions**

ETAS methodology has been applied to Hungarian earthquake data of the period 1996–2021. Despite the data challenges caused by moderate seismicity, we succeeded in fitting space–time ETAS parameters to recent instrumental data of the region, which include the 2020–2021 Croatian earthquake sequences. Our conclusions from the modelling exercise are the following:


shown in Figure 13a; however, this graph masks the inhomogeneities between the two 2013 event sequences, which were geographically distant but overlapped in time.

Finally, we reflect on how the results of this study contribute to the planned application in insurance. In this study, only the first module of an earthquake risk model was covered, that is, a statistical model of event occurrences. With the help of this model, synthetic earthquake catalogues can be generated, representing sequences of events that may happen in the future and possibly cause losses. Unlike in those models where only mainshocks are modelled, the synthetic catalogues of this model also include aftershock sequences. However, the subset of the mainshocks can be extracted from the catalogues if a simplified model variant is preferred. If aftershocks are modelled, then the future scenarios are not independent of the past, as large earthquakes are followed by periods of increased seismic activity. This effect can be captured by the stochastic catalogue generator of this model if recent events are added to the simulation as fixed inputs, occurring before the simulated time interval.

The synthetic catalogues mentioned above will be used as a starting point for simulating insurance losses when the full model is in place. Future work on the model will focus on developing those modules that were not discussed in this study: the ground motion attenuation and the vulnerability modules. Another possible area of future work is to extend the geographical coverage of the model to other Central European countries.

**Author Contributions:** Conceptualization, J.C.-B. and P.S.; Methodology, P.S.; Software, P.S.; Formal analysis, P.S.; Data curation, L.T.; Writing—original draft preparation, P.S.; Writing—review and editing, J.C.-B. and L.T.; Supervision, J.C.-B. and L.T.; Project administration, J.C.-B. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was fully funded by the UNIQA Insurance Group AG.

**Data Availability Statement:** Original datasets analysed in this study are mostly publicly available and can be found here: http://mek.niif.hu/04800/04801/ (accessed on 5 February 2023), http://www.isc.ac.uk/iscbulletin/search/ (accessed on 5 February 2023), https://earthquake.usgs.gov/earthquakes/search/ (accessed on 5 February 2023), http://www.georisk.hu/Bulletin/bulletinh.html (accessed on 5 February 2023), http://www.georisk.hu/Tajekoztato/tajekoztato.html (accessed on 5 February 2023). Other, pre-processed data presented in this study are available on request from the author: László Tóth; email: toth@georisk.hu.

**Acknowledgments:** The authors acknowledge the financial support of the UNIQA Insurance Group AG, Wien, Austria and especially Kurt Svoboda and Roman Schneider. The authors also thank Balázs Sághy at UNIQA Biztosító Zrt., Budapest, Hungary for all his support and encouragement given to this research, and for his help with software tools.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **The Conditional Probability of Correlating East Pacific Earthquakes with NOAA Electron Bursts**

**Cristiano Fidani 1,2,3**

<sup>2</sup> Central Italy Electromagnetic Network, 63847 Fermo, Italy

<sup>3</sup> Osservatorio Sismico "A. Bina", 63847 Perugia, Italy

**Abstract:** A correlation between low L-shell 30–100 keV electrons precipitating into the atmosphere and M ≥ 6 earthquakes in the West Pacific was presented in past works, where ionospheric events anticipated earthquakes by 1.5–3.5 h. This was a statistical result obtained from the Medium Energy Protons Electrons Detector on board the NOAA-15 satellite, which was analyzed for 16.5 years. The present analysis, utilizing the same database, translated into adiabatic coordinates during geomagnetically quiet periods, led to another significant correlation regarding East Pacific strong earthquakes. This new correlation is still observed between high-energy precipitating electrons detected by the NOAA-15 0◦ telescope and M ≥ 6 events of another very dangerous seismic region of the Pacific ring of fire. The particle precipitation that contributed to this correlation was characterized by electron L-shell, pitch angle, possible disturbance altitudes, and geographical locations. This correlation occurred circa 57 h prior to the East Pacific earthquakes, in accordance with past single-case reports. The conditional probability corresponding to the cross-correlation peak of 0.024 for binary events reached a value of 0.011. A probability gain of 2 was calculated for earthquakes after an independent L-shell EB detection; it is therefore applicable to future earthquake forecasting experiments. Moreover, a time-dependent probability gain approaching the correlation peak was estimated.

## **1. Introduction**

Low Earth Orbit (LEO) satellites fly at altitudes ranging between 200 and 2000 km, providing a platform of observation extending over hundreds of km and thus being able to monitor large regions struck by strong earthquakes (EQs). Likewise, given that LEO satellites are not stationary and circle the Earth many times every day, this platform is capable of monitoring the entire Earth's surface, assuring multiple passages over the same areas in a few days [1]. This type of monitoring uses non-seismic detectors, which are mainly electromagnetic, since the atmosphere is absent at satellite altitudes. Non-seismic phenomena observed on the ground during strong EQs are well recognized as happening before, during, and after a seismic manifestation [2]. These phenomena include fluid migration [3], the Earth's electric currents [4], atmospheric phenomena [5], and electromagnetic perturbations [6]. However, at ground level, they can be influenced by local effects, including land, atmospheric variables, and anthropogenic activities [7]. Thus, they cannot be reliably studied and related to the different phases of EQ preparation [8]. In this regard, LEO satellite observations overcome such difficulties by averaging signals over large areas and thus reducing the influence of local phenomena [9].

Electromagnetic detectors at LEO altitudes are also disturbed by atmospheric phenomena [10] and extraterrestrial perturbations mainly associated with the Sun [11]. Therefore, it has been suggested that phenomena observed with EQs by remote sensing from near-Earth space be associated with seismic activity through statistical approaches [12]. An additional advantage of LEO satellites over terrestrial observations is that they are able to continuously monitor multiple regions to investigate many strong seismic events over a few years for statistical analyses [13].


Statistical correlations between time series of strong earthquakes and time series of remotely collected signals have recently been obtained in several fields of electromagnetic observations [14]. A study lasting 7.5 years concerning Pc1 pulsations at a low-latitude station in Parkfield, California, reported an enhanced occurrence probability of such phenomena about 5–15 days prior to EQs, during the daytime [15]. A statistical correlation was calculated for ULF geomagnetic fluctuations, and this phenomenon anticipated moderate earthquakes by 1–5 days at Japanese ground stations [16]. Seismo-ionospheric effects on long sub-ionospheric paths have been investigated in amplitude variations of signals [17,18]. In a statistical study concerning VLF/LF wave paths, transmitter signal amplitudes revealed perturbations with a frequency excess 3–6 days before strong seismic events over a 10-year period [19]. Concerning the magnetometers on board LEO satellites, shallow earthquakes with M > 5.5 have been anticipated by magnetic field perturbation and electron density signals by 10<sup>−0.96+0.51·M</sup> and 10<sup>−3.46+0.83·M</sup> days, respectively, using the first 8 years of Swarm data [20]. A significant correlation between VLF wave intensity and strong EQs occurring at 0–4 h was observed in a statistical study using the micro-satellite DEMETER [21,22], where EQs were preceded by decreasing intensities. Furthermore, the largest occurrence rates of anomalies in TEC data from the global ionosphere map were correlated with strong superficial EQs, occurring 1–5 days before the EQs [23,24]. Significant increases in electron density measurements observed on board the DEMETER satellite were correlated with moderate seismic events that occurred 10 to 6 days later [25]. A statistical analysis of NOAA POES data has evidenced Electron Bursts (EBs), which are sudden increases in electron fluxes, in the loss cone 2–3 h before M ≥ 6 quakes in Indonesia and the Philippines [26]. This analysis has recently been improved upon by considering the interval of 1.5–3.5 h before strong EQs in the West Pacific and resolving the ambiguity of recognizing EBs without knowing the EQ epicenters a priori [27]. The aim was to develop a methodology for EQ forecasting and verification [28].

Based on the theory of conditional probability developed in past studies [27,28] for EQ forecasting, an application for both EQ and tsunami risk reductions will be described for seismic activity in the East Pacific. Moreover, Section 2 will describe the databases used for the analysis and their relative restrictions. Section 3 will deal with the selection of EBs used to verify their correlation with East Pacific's strong EQs. This correlation is used for possible risk reduction applications discussed in Section 4, due to a longer time lag of correlated EQs. Finally, a summary of the results is reported in the Conclusions.

## **2. Materials and Methods**

Data from the NOAA-15 polar satellite were used in this study. Particle counting rates (CRs) were produced by detectors on board the satellite, which monitors protons and electrons while flying in a polar orbit at altitudes between 808 and 824 km at perigee and apogee, respectively, with an inclination of 98.7◦ and a period of 101.2 min [29]. The particle detectors on which the analysis was performed comprised eight solid-state telescopes called the medium energy proton and electron detectors (MEPED), measuring proton and electron fluxes in the 0◦, 90◦, and omni-directions from 30 keV to 200 MeV [30]. MEPED data were provided every 2 s, whereas all the sets of orbital parameters were provided every 8 s; consequently, 8 s averages of the CRs were calculated, discarding unreliable negative values and correcting for proton contamination [31]. Only the zenith telescope, 0◦, was used to count electrons in this study, as both the 90◦ and omnidirectional telescopes were investigated in the past, reporting no positive correlations [26–28]. The detected electron energies were within the three integral ranges of 30–2700 keV, 100–2700 keV, and 300–2700 keV, so three differences in CRs were calculated to create new sets of data from 30 keV to 100 keV, 100 keV to 300 keV, and 300 keV to 2700 keV. In fact, different behaviors in particle dynamics are determined by energy, and, thus, the new sets resulted in a simpler analysis. When the mirror point of such electrons goes under 100 km, the particles are sure to be stopped in the residual atmosphere. This occurs when electrons enter the loss cone, which can be determined by minimizing electron mirror point altitudes through the UNILIB libraries [32]; so, the minimum mirror altitude was also added to the data.
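As an illustration of this preprocessing, a minimal sketch follows; the proton-contamination correction is omitted, and the function names are our own:

```python
import numpy as np

def preprocess_meped(cr_gt30, cr_gt100, cr_gt300):
    """Average the 2 s MEPED counting rates to the 8 s orbital cadence,
    discarding unreliable negative values, and difference the integral
    channels into the 30-100, 100-300, and 300-2700 keV sets."""
    def avg_8s(cr):
        cr = np.asarray(cr, dtype=float)
        cr[cr < 0] = np.nan                       # drop negative readings
        cr = cr[: cr.size // 4 * 4].reshape(-1, 4)
        return np.nanmean(cr, axis=1)             # four 2 s samples per 8 s
    c30, c100, c300 = map(avg_8s, (cr_gt30, cr_gt100, cr_gt300))
    return c30 - c100, c100 - c300, c300          # differential channels
```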

The International Geomagnetic Reference Field model (IGRF-13) was used to precisely determine the B-field and the L-shell at the satellite orbit. The dynamics of electrons were described using adiabatic invariants, as they are more stable. Therefore, the analysis parameters were the geomagnetic field at the mirror points, B<sub>m</sub> = B/cos<sup>2</sup>α, where α is the pitch angle, that is, the angle between the electron velocity and the geomagnetic field direction, and the L-shell. A 4-dimensional matrix (t; L; α; B) was filled with data relative to electron CRs, where t is the time. Intervals were chosen to be 8 s for t, 0.1 for L, 15◦ for α, and non-linear for B, with the shortest B intervals used where CRs were highest, to better describe the CR spatial variations, and the largest intervals used where the CRs were less frequent, so as to obtain a greater number of samples [26–28]. The time period was from July 1998 to December 2014. The parameter L was restricted to a range between 1.0 and 2.2 to focus the analysis on the inner Van Allen Belts. Moreover, the α range was 180◦, and the geomagnetic field range was between 16 µT and 47 µT.
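A sketch of the binning step is given below; the non-linear B edges are purely illustrative placeholders, since the exact grid used in [26–28] is not reproduced here:

```python
import numpy as np

def fill_cr_matrix(t, L, alpha, B, cr):
    """Accumulate counting rates into the 4-D (t; L; alpha; B) matrix:
    8 s bins in t, 0.1 in L, 15 degrees in alpha, non-linear bins in B."""
    t_edges = np.arange(t.min(), t.max() + 8.0, 8.0)           # seconds
    L_edges = np.arange(1.0, 2.2 + 0.1, 0.1)
    a_edges = np.arange(0.0, 180.0 + 15.0, 15.0)               # degrees
    B_edges = np.array([16, 18, 20, 22, 25, 29, 34, 40, 47]) * 1e-6  # tesla
    sample = np.column_stack([t, L, alpha, B])
    bins = [t_edges, L_edges, a_edges, B_edges]
    total, _ = np.histogramdd(sample, bins=bins, weights=cr)
    count, _ = np.histogramdd(sample, bins=bins)
    with np.errstate(invalid="ignore"):
        return total / count      # mean CR per cell (NaN where no samples)
```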

Solar activity is the primary cause of precipitating electrons and is also able to influence the inner Van Allen Belts; thus, days of moderate to high solar activity were excluded from the analysis. This was performed by neglecting data when the daily Ap index exceeded a threshold. The threshold was set according to the seasons and years due to the solar cycle, using the relation Ap = 11.1 + 0.8 sin [0.37(year − 1996)] + {2.1 − 0.1 sin [0.37(year − 1996)]} cos [0.0172(day − 27)]. The phase shift defined by the above relation corresponds to the minimum of the Sun's activity in 1996, whereas the 27-day modulation was dictated by the Sun's rotation. Furthermore, CRs were not included in the analysis whenever the Dst index was lower than −27 nT, so as not to be influenced by any substorm activity.
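A sketch of the quiet-time selection follows; the trigonometric arguments are assumed to be in radians, and `day_of_year` is assumed to be the day number entering the relation:

```python
import numpy as np

def quiet_day(ap_daily, year, day_of_year, dst_min):
    """Geomagnetic quiet-day filter: Ap below the seasonal/solar-cycle
    threshold quoted in the text, and no substorm activity (Dst >= -27 nT)."""
    s = np.sin(0.37 * (year - 1996))
    ap_thr = (11.1 + 0.8 * s
              + (2.1 - 0.1 * s) * np.cos(0.0172 * (day_of_year - 27)))
    return (ap_daily <= ap_thr) & (dst_min >= -27.0)
```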

Since the CR distributions were Poissonian in every interval of (t; L; α; B), an electron CR fluctuation was considered more than a statistical fluctuation, with a probability of less than 1%, if its probability under the Poisson distribution was less than 0.01. CRs that satisfied this criterion were defined as EBs. All EBs detected while the satellite ran one semi-orbit were labeled as only one EB. Including only EBs with L-shells in a restricted interval [27], a histogram of Correlation Events (CEs) between EQs and EBs was calculated. A CE was defined by the time difference between an EQ and an EB, TEQ − TEB, which permitted the filling of the histogram with a bin interval of 2 h. Whenever a CE peak appeared at a certain time interval for some kind of EQs, it was tested for its significance [26], thus determining whether it was a positive correlation. The conditional probability of an EQ event following the observation of an EB event was defined for binary events by [27,28]:

$$\mathbf{P(EQ \mid EB)} = \mathbf{CE / N\_{EB}}.\tag{1}$$

The cross-correlation Pearson coefficient for binary events, called the Matthews correlation coefficient [33], can be obtained from [27,28]:

$$\text{P(EQ} \mid \text{EB)} = \text{P(EQ)} + \text{corr(EQ,EB)} \left\{ \text{P(EQ)} \left[ 1 - \text{P(EQ)} \right] \left[ 1 - \text{P(EB)} \right] / \text{P(EB)} \right\}^{1/2} \tag{2}$$

and

$$\text{corr(EQ,EB)} = [\text{CE/N}\_{\text{h}} - \text{P(EQ)P(EB)}][\text{P(EQ)}[1 - \text{P(EQ)}]\text{P(EB)}[1 - \text{P(EB)}]]^{-1/2},\tag{3}$$

where P(EQ) = NEQ/Nh, P(EB) = NEB/Nh, NEQ was the number of considered EQs, NEB was the number of considered EBs, and N<sup>h</sup> was the number of time intervals. The corresponding ratio P(EQ|EB)/P(EQ) = G is the probability gain of the correlation peak.
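For binary time series over the N<sup>h</sup> intervals, these quantities can be sketched as follows; the inputs and names are illustrative:

```python
import numpy as np

def correlation_stats(eq, eb, ce):
    """Binary-event statistics of Equations (1)-(3): eq and eb are 0/1
    arrays over the N_h time intervals, and ce is the number of
    correlation events at the histogram peak."""
    n_h = eq.size
    p_eq, p_eb = eq.sum() / n_h, eb.sum() / n_h
    p_eq_given_eb = ce / eb.sum()                      # Equation (1)
    corr = ((ce / n_h - p_eq * p_eb)                   # Equation (3)
            / np.sqrt(p_eq * (1 - p_eq) * p_eb * (1 - p_eb)))
    gain = p_eq_given_eb / p_eq                        # probability gain G
    return p_eq_given_eb, corr, gain
```

Equation (2) offers a consistency check, since P(EQ) + corr(EQ,EB)·{P(EQ)[1 − P(EQ)][1 − P(EB)]/P(EB)}<sup>1/2</sup> must reproduce the same conditional probability.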

## **3. Results**

The extraction of EBs from the NOAA-15 database of CRs starting from July 1998 to December 2014 was performed using the same procedure as in previous analyses [26–28]. Being so, the starting set of EBs to correlate to EQs remained unvaried and is reported in Figure 2 of a past work [28]. These EBs were all in the loss cone with bouncing altitudes lower than 200 km, and for 95% of cases lower than 100 km, in correspondence to the South Atlantic Anomaly (SAA). Figure 1 reports the geographical distribution of the NOAA-15 trajectories in one day, where the electron flux distribution along the trajectory is represented in colors. The black contour in the center represents the SAA. As reported in previous analyses, see, for example, Figure 1 of [26], precipitating electrons were observed far from the SAA to the west up until 170◦ in longitude, indicated by the cyan contour of Figure 1. The space–time distribution of the detected EBs consists of a series of consecutive events corresponding to time intervals lasting up to a few minutes. However, one or more non-consecutive detections also characterized the space–time distribution of EBs along the single semi-orbit, and the observation of one or more EBs in a semi-orbit was defined as only one EB event. Since the EB L-shells were limited to L < 1.4 by the requirements of precipitating particles from the inner Van Allen Belts, the time length of every EB was always found to be less than 12 min. The EB time was defined as the average time among the times of the selected EBs in the semi-orbit.

**Figure 1.** The set of NOAA-15 semi-orbits on 4 January 2001, with the CRs along the line orbits evidenced by colors; CR = 0 in blue, CR = 10<sup>0</sup> in light blue, CR = 10<sup>1</sup> in green, CR = 10<sup>2</sup> in orange, and CR = 10<sup>3</sup> in yellow; the SAA region is evidenced using a black contour, while the region where the satellite detects electrons in the loss cone is evidenced by a cyan contour.

Unlike the previous study [26], which investigated strong EQs worldwide, the subsequent work [28] reported on correlations obtained with EQs restricted to the Philippine and Indonesian areas, in order to maximize the ratio between the correlation amplitude and the non-correlation amplitude. Afterward, numerous geographical areas were tested for their suitability, with the aim of maximizing this ratio, and a stronger correlation was obtained for West Pacific EQs [27]. During our study, the extension of the EQ area was pushed through the Central Pacific until it reached the East Pacific, where another correlation peak emerged. Given this, a new analysis was started for EQs in the East Pacific.

A set of six correlation plots between 30–100 keV EBs and East Pacific M ≥ 6 EQs is shown in Figure 2, where the bin time interval was chosen between 2 and 6 h. A range of ±3 days was considered for the time difference between EQs and EBs, where a positive T<sub>EQ</sub> − T<sub>EB</sub> indicates that EBs precede EQs, while EQs precede EBs for negative time differences. CE distributions were Poissonian as in past cases; the average values are indicated using black dashed lines, while the red dashed lines are used for standard deviations. Significant correlation peaks appear at time differences around 57 h, which means that the EB observation anticipated the corresponding EQ. Unlike past works [26,28], where the correlation significance was represented by EQ projections at certain altitudes, here the correlation significance was calculated exclusively with respect to the EB features. In analogy with the case of West Pacific EQs [27], the L-shell interval of electrons that maximizes the correlations between EBs and East Pacific EQs was obtained at 1.1 ≤ L ≤ 1.3. A plot of this EB L-shell parameter against latitude is shown in Figure 3. The plot reports two distinct distributions: a correlation for negative latitudes and a lack of correlation for positive ones. The electron pitch angles were concentrated in intervals around 67° and 117°, with 56° ≤ α ≤ 72° and 108° ≤ α ≤ 126°. The correlations in Figure 2 were obtained for EQ depths lower than 200 km, supporting the hypothesis that only EQs close to the surface seem to be correlated with ionospheric activity.
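To make the procedure concrete, the following minimal sketch (not the authors' code; array names, units in hours, and the flat-list inputs are assumptions) fills such a histogram of T<sub>EQ</sub> − T<sub>EB</sub> for a chosen bin width:

```python
# Minimal sketch of the Figure 2 correlation histogram: every EQ-EB pair
# within +/-3 days contributes its time difference T_EQ - T_EB (hours).
import numpy as np

def correlation_histogram(t_eq_hours, t_eb_hours, bin_h=4.0, window_h=72.0):
    diffs = np.subtract.outer(np.asarray(t_eq_hours),
                              np.asarray(t_eb_hours)).ravel()
    diffs = diffs[np.abs(diffs) <= window_h]          # keep the +/-3 day range
    edges = np.arange(-window_h, window_h + bin_h, bin_h)
    counts, _ = np.histogram(diffs, bins=edges)
    return counts, edges                              # positive bins: EB before EQ
```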

**Figure 2.** Correlations obtained by filling histograms of the time difference T<sub>EQ</sub> − T<sub>EB</sub>, which is positive when EBs anticipated EQs. They are plotted for bins of 2, 2.5, 3, 4, 5, and 6 h from top left to right; the centers of the correlation peaks are 57.2, 56.05, 55.8, 56.3, 57.3, and 57.2 h, respectively. Averages are evidenced by the black horizontal dotted lines, while the red ones indicate the 3σ levels; note that the border bins are not completely populated, as they are partially outside the range of ±3 days.

**Figure 3.** The L-shell parameter distribution is compared to the EB latitudes; the distribution pattern is different from those of the EB correlation with West Pacific EQs as in this case EB longitudes correspond to the SAA.

The significance of the correlation peaks was calculated for all the cases shown in Figure 2. The plot in Figure 4 summarizes the number of standard deviations that define the correlation peak significance above the average values, as a function of the bin duration, together with the relative average of correlated events. A statistical increase in the correlation peak significances was observed: they exceeded 3 standard deviations for correlation bins less than or equal to 6 h and reached a maximum of over 4 for a correlation bin of 4 h, while the significance was less than 3 for bins greater than 6 h. The number of correlated events decreases for short correlation bins.

In Figure 4, a continuous red line indicates the number of standard deviations of the 57 h correlation peaks observed with different bins, a continuous black line indicates the number of events at the peaks, and the dashed line indicates the averages. All the correlation peak significances exceeded the 99% threshold. As in past publications, correlations were also calculated using randomized spatial and temporal distributions of EQs, obtained by keeping the same EQ times and the same EQ epicenters, respectively. In these randomized cases, the previously obtained correlation peaks completely disappeared.
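Under the Poissonian assumption stated above, the peak significance in standard deviations can be estimated as in this hedged sketch (function and variable names are illustrative):

```python
# Peak significance for a Poissonian correlation histogram: the number of
# standard deviations, (peak - mean) / sqrt(mean), above the bin average.
import numpy as np

def peak_significance(counts):
    mean = counts.mean()
    sigma = np.sqrt(mean)          # Poisson standard deviation of a bin
    peak = counts.max()
    return (peak - mean) / sigma   # compare against the 3-sigma level
```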

**Figure 4.** The correlation significance in red is compared with the correlation peaks in black and the corresponding correlation averages for the bin intervals reported in Figure 2.

The maximum number of 45 correlation events was found for the greatest time bin of 6 h, corresponding to the 45 EQs identified in the map in Figure 5. The geographical region where EQs correlated with NOAA-15 EBs is delimited by −40° to 30° in latitude and by 245° to 300° in longitude (the yellow line in Figure 5), and thus includes regions with strong seismic activity, such as Mexico, the Caribbean Sea, Guatemala, Honduras, Nicaragua, Costa Rica, Panama, Colombia, Ecuador, Perú, Bolivia, Chile, and a large part of the South-eastern Pacific. EQ epicenters are indicated by red dots in Figure 5, about half of which are located offshore. Note that the total number of mainshocks that occurred during the 16.5 years in the yellow square of Figure 5 was 199, about 1/3 of the mainshocks that struck the West Pacific in the same period. However, the 45 East Pacific mainshocks of the peak correlation are close to the 44 EQs of the West Pacific correlation peak, the East Pacific correlation bin being three times the West Pacific one. The time distribution of the 45 considered EQs from July 1998 to December 2014 is shown in Figure 6, with their relative magnitudes. Concerning EB detection positions, the geographical region is delimited at −35° and 20° in latitude and at 205° and 295° in longitude, divided into two inclined belts indicated by cyan contours in Figure 5, whose inclinations are due to the asymmetry of the geomagnetic field. Finally, the electron mirror points of detected EBs over the EQ epicenters are plotted in Figure 7, with minimum altitudes ranging between 100 km and 700 km.

**Figure 5.** The very large area of EQ epicenters that contributed to the correlations in Figure 2 is delimited by yellow lines, EQ positions are reported by red dots, and cyan contours delimit the region where the NOAA satellite detected EBs correlated with EQs.

**Figure 6.** The magnitudes and times of the EQs that contributed to the correlations in Figure 2.

**Figure 7.** The electron mirror point altitudes at L = 1.2 are indicated by continuous and dashed lines, which are compared to continuous horizontal lines indicating NOAA-15 and atmosphere altitudes; green-colored areas indicate longitude regions where electrons are detectable, and the yellow to red areas indicate the probable interaction areas between EQs and EBs; arrows next to shades indicate increasing probabilities; the SAA region excluded by the analysis is delimited by vertical lines, which discriminate between the southern and the northern hemisphere using continuous and dashed lines, respectively.

## **4. Discussion**

A comparison of the results obtained above for East Pacific EQs with those obtained for West Pacific EQs [27] is useful to describe the similarities and differences between them. The first striking difference was found in the correlation time difference, which was a few hours for West Pacific EQs and a few days for East Pacific EQs, even though in both cases EBs anticipated EQs. The first striking similarity was found in the maximum CE number. Concerning the CE significance, calculated as a number of σ, the results were slightly worse: the maximum significance was a little more than 4, observed for bin intervals of 4 h, compared to a little more than 5 for bin intervals of 2 h in the past study. The CE number relative to the maximum significance was lower for the East Pacific case. The geographical positions of EBs were nearly entirely overlapping, with a slight shift to the East of around 10°, whereas the relative positions with respect to the correlated EQ epicenters were completely different and non-overlapping. The altitudes of electrons calculated over the new EQ epicenters were also completely different, resulting many hundreds of kilometers lower compared to past studies. The EQ area in the West Pacific belonged more to the Northern hemisphere, whereas the area of the East Pacific EQs belonged more to the Southern hemisphere. This seems to reflect the geomagnetic field asymmetry, following the geomagnetic equator, which crosses the geographic equator from north to south moving eastwards at those longitudes. The EQs are instead quite similar for the two regions, being in both cases superficial and equally distributed between land and oceanic crusts.

Concerning the adiabatic parameters of EBs, the L-shell range in this study was more extended towards lower values, and the latitude dependence was lost for positive latitudes. A positive-latitude dependence of the L-shell was instead observed for EBs over West Pacific EQ epicenters around 125°; see Figure 2 of [27]. However, EB longitudes over East Pacific EQs are close to the SAA, where the geomagnetic field deviates from the dipolar shape. Specifically, the angle between the satellite's orbit and the geomagnetic field line is small for positive latitudes and close to 90° for negative latitudes. Therefore, the satellite crosses the interval 1.1 ≤ L ≤ 1.3 in more than 20° of positive latitude, whereas it crosses the same interval in 5° of negative latitude. In fact, the pitch angle of positive-latitude EBs is restricted to 56° ≤ α ≤ 72°, while the pitch angle of negative-latitude EBs is restricted to 108° ≤ α ≤ 126°. Furthermore, two different distribution bands were observed for negative latitudes (see Figure 3). They were separated approximately by α = 120°, with 1.1 ≤ L ≤ 1.22 in the interval 108° ≤ α ≤ 120° and 1.22 ≤ L ≤ 1.3 in the interval 120° ≤ α ≤ 126°. The pitch angle results are divided into two intervals in both studies, coinciding exactly. Given this, the striking difference in the correlation times between the two studies seems most naturally associated with the major difference retrieved from the analyzed data of the two cases: the altitude of the electrons passing above the East and West Pacific EQs. If the lithosphere–ionosphere interaction occurs above the future epicenters and its magnitude depends on the distance of the electrons from the lithosphere, as previously proposed for magnetic pulses [6], a lower altitude of charged particles above the East Pacific means that lower-magnitude magnetic pulses could be able to modify electron trajectories, even though it has not been proven that the intensity of magnetic pulses decreases with the temporal remoteness of strong EQs. Case studies based upon DEMETER satellite data for East Pacific EQs, such as the Chile EQ M = 8.8 on 27 February 2010 and the Haiti EQ M = 7.0 on 12 January 2010, recorded similar delays in electron detections [34], also associated with VLF wave activity.

Preparedness for strong EQs and tsunamis has been developed over recent decades [35], either by triggering early warnings or by rapidly assessing the expected damage [36]. Early warnings consist of alerts issued seconds before the arrival of destructive seismic waves in populated regions. Such an alert may be useful for controlling the shutdown of gas pipelines and critical facilities, reducing the speed of rapid-transit vehicles, and advising the affected populations to take the necessary precautions. Reduced lead times limit the possible preparedness actions, thus leaving part of the population at the mercy of imminent

danger. Utilizing the possible correlation results obtained above for a short-time warning in strong EQ and tsunami preparedness might contribute to reducing the impact of EQs.

A way of using such results is to define a scenario for a short-time prediction model [27]. Following the work of Console [37], the representation of the EQ prediction model requires defining the target volume V<sub>T</sub> where EQs occur, which is a 2-d geographic space plus a 1-d time space. This volume is displayed by a black contour in Figure 8, with EQ occurrences marked by red stars. If an EQ occurs in the alarm volume V<sub>A</sub>, in yellow, it is a success (S), while if an EQ occurs outside the yellow volume it is a failure of prediction (F). The precursor volume V<sub>P</sub>, which is generally different from V<sub>T</sub>, contains the alarm events. In this case, V<sub>P</sub> is the cyan volume of Figure 8 where EBs are detected, the time of EB observations lasting less than 12 min. When an EB is detected in V<sub>P</sub>, it is an alarm that defines a V<sub>A</sub>. V<sub>T</sub> is the time span of the EQ observations multiplied by the area of the East Pacific where the correlations in Figure 2 were calculated. An EQ event is included in this scenario only if M ≥ 6. Unlike West Pacific earthquakes, where the geographical areas of V<sub>P</sub> and V<sub>A</sub> were completely disjoint, the areas of EQs and EBs largely overlap in this scenario. In this representation, V<sub>A</sub> has a time dimension T which coincides with the correlation bin duration; it is generated by an EB detection in V<sub>P</sub> and covers the same area as V<sub>T</sub>.

To have the greatest significance, the 4 h bin interval of 54.3–58.3 h should be chosen. The V<sub>A</sub> dimension and shape are the same for each EB detected in V<sub>P</sub>; if an EQ with M ≥ 6 occurs in V<sub>A</sub>, one success is collected. Similarly, if an EQ occurs outside V<sub>A</sub>, meaning that it happened before or after the 4 h window centered at 56.3 h after the EB, one failure is collected. Moreover, one false alarm is collected for any EB detection not followed by any EQ. V<sub>T</sub> is non-continuous when solar activity leads to discarding days, or due to the intermittence of the NOAA-15 satellite in the detection area west of the SAA. A V<sub>A</sub> is generated within V<sub>T</sub> 56.3 h after an EB is detected in V<sub>P</sub>, and the warning lead time is the vertical distance in the representation of Figure 8.
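A minimal sketch of this success/failure/false-alarm bookkeeping (not the authors' code; event times are assumed to be plain lists in hours, and the window follows the 54.3–58.3 h interval quoted above):

```python
# Each EB opens a 4 h alarm window 54.3-58.3 h after its detection; an EQ
# caught by at least one window is a success, an EB whose window catches
# no EQ is a false alarm, and uncaught EQs are failures.
def score_alarms(t_eb, t_eq, start=54.3, stop=58.3):
    caught = set()
    false_alarms = 0
    for te in t_eb:
        hits = {tq for tq in t_eq if te + start <= tq <= te + stop}
        if hits:
            caught |= hits
        else:
            false_alarms += 1
    failures = len(t_eq) - len(caught)
    return len(caught), failures, false_alarms
```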

**Figure 8.** The scenario is the volume representation where a forecasting model can be tested; here it is delimited by the target volume V<sub>T</sub>, the alert volume V<sub>A</sub>, and the precursor volume V<sub>P</sub>, the latter two being products of the geographical coordinates of EQs and EBs and the time of observations. Discontinuities in time are due to the solar activity intermittence. The alarm duration is chosen to be 4 h in this work, which corresponds to the greatest significance. V<sub>T</sub> and V<sub>A</sub> cover the entire East Pacific area, while V<sub>P</sub> is restricted to two latitudinal belts, more extended on the west. In red is the possible precursor volume, which hypothesizes a physical action of the future epicenter lithosphere on the ionosphere.

V<sub>T</sub> can be obtained by multiplying the East Pacific area by the total number of time intervals [27], which corresponds to N<sub>h</sub> = 36,135 for 4 h intervals. The following were observed in V<sub>T</sub>: a total number of N<sub>EQ</sub> = 199 EQs with M ≥ 6 and depths ≤ 200 km, resulting in P(EQ) = 0.0055; a total number of N<sub>EB</sub> = 3371 EBs, which are alarms; and a total number of N<sub>S</sub> = 37 CEs, which are successes. The success rate, N<sub>S</sub>/N<sub>EB</sub> = 0.011, is the conditional probability (1) of [27]; 1 − (N<sub>S</sub>/N<sub>EB</sub>) = 0.989 is the false alarm rate, while the alarm rate is N<sub>S</sub>/N<sub>EQ</sub> = 0.186, and the failure rate is 1 − (N<sub>S</sub>/N<sub>EQ</sub>) = 0.814. As there were more alarms than for the West Pacific EQs, with the number of successes being nearly the same in the two cases, the number of false alarms in this study increased compared to the previous research. A cross-correlation of 0.024 was calculated using relation (3). The EQ occurrence probability of at least one target event was estimated to be P(EQ) when no EB was observed two and a half days before, while it increased to P(EQ|EB), given by (2), two and a half days after an EB observation. The probability distribution is shown in Figure 9, which indicates a probability gain near G = 2. As in the study of the West Pacific [27], days with more than one burst occurred with a frequency of about 20%, and EBs belonging to successive orbits generated a partial overlap between two consecutive alarms. These are not considered here and will be presented in a future publication.
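These quoted rates follow directly from the counts, as in the small sketch below (the numbers are taken from the text; the formula numbers (1)–(3) refer to [27]):

```python
# Probability bookkeeping for the East Pacific scenario.
N_h, N_EQ, N_EB, N_S = 36_135, 199, 3_371, 37

P_EQ = N_EQ / N_h                      # unconditional probability, ~0.0055
P_EQ_given_EB = N_S / N_EB             # success rate, ~0.011
false_alarm_rate = 1 - P_EQ_given_EB   # ~0.989
alarm_rate = N_S / N_EQ                # ~0.186
failure_rate = 1 - alarm_rate          # ~0.814
G = P_EQ_given_EB / P_EQ               # probability gain, ~2
```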

**Figure 9.** The conditional probability distribution P(EQ|EB) corresponding to the correlation bin of 4 h in Figure 2; the average probability, indicated by a horizontal dashed line, coincides with the unconditional probability P(EQ).

An early warning system for evacuation should be based on effective EQ observations [37]. However, geohazard risk reduction can gain valuable preparation time by adopting a probabilistic short-term warning a few hours prior, especially for tsunamis [36]. Thus, using NOAA-15 electron CR analysis is in principle achievable based on the infrastructure of antennas operating along the West Coast of the US [28], at the same longitude where EBs are detected. The system would need to download NOAA data early enough that it can be analyzed within a few minutes and then compared with the geomagnetic activity within a day. As EBs were detected in the same region where they anticipated West Pacific EQs [27], the probability of a strong EQ in the West Pacific cannot be neglected; this possibility vanishes in less than 4 h. Then, if no strong EQs occur in the West Pacific, based on the purely statistical evaluations of disastrous events presented in this work, an East Pacific coast forecast can be generated using the G increase in EQ probability. Moreover, correlation bins of time lengths greater than 6 h were investigated, namely the cases of 7, 8, 9, 10, 12, 14, and 17 h. A general decrease in significance was observed for the corresponding correlations, with the sigma decreasing to 1.1 for the 17 h case.

Furthermore, the probability gain was calculated for each bin and reported in the plot of Figure 10 for each starting/ending time of the correlation peak. The case of 1.5 h was also added to this plot. A fit of the G distribution is reported in Figure 10, which can be interpreted as the time-dependent conditional probability of a strong EQ in the East Pacific area after an EB has been observed at time 0. Figure 10 shows that an increase in conditional probability is observed at about 40 h after the NOAA detection; around 54 h, the probability overcomes the 99% significance level, remaining above it up to around 59 h. Finally, the probability decreases to the unconditional level around 70 h after the EB detection.

**Figure 10.** The probability gain approaching the correlation peak around 57 h; it was obtained by fitting the probability gain distribution from bin intervals of 1.5, 2, 2.5, 3, 4, 5, 6, 7, 8, 9, 10, 12, 14, and 17 h; the 3σ limit is indicated by the red horizontal dashed line.

## **5. Conclusions**

A new statistical correlation analysis between precipitating EBs and strong EQs was carried out from the analysis of exactly 16.5 years of NOAA-15 particle data. This seemed to indicate that electrons in the loss cone were mainly observed around 57 h before main shocks with M ≥ 6 in the East Pacific, a region comprising countries where seismic activity is frequently a danger. The results are in line with previous single-case observations by the DEMETER satellite. The new correlation time is significantly longer than the correlation time found for strong West Pacific EQs, where EBs occurred a few hours prior. It again supports the hypothesis that there might exist a link between the ionospheric and lithospheric activities of shallow EQs whose depths are less than 200 km. As for West Pacific EQs, this correlation holds regardless of whether the EQs occurred in the sea or on the mainland.

The L-shell parameter of the electrons was defined uniquely, as in the previous publication on West Pacific EQs [27], thereby providing a precise definition of an EB with which to also carry out an EQ forecast in the East Pacific. The L-shell distribution in this new case was distributed differently in latitude compared to the West Pacific case, which probably reflects the asymmetry of the geomagnetic field. A probability gain G = 2 was calculated, which could be used to increase the pre-alarm time of some early warnings for both strong EQs and tsunamis on the East Pacific coasts. Although the number of false alarms was greater than in the West Pacific case, a time-dependent conditional probability interpretation of the process was proposed.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The NOAA-15 electron CRs from 1998, first accessed on 22 August 2022: http://www.ngdc.noaa.gov/stp/satellite/poes/dataaccess.html. Corrections by proton contaminations were carried out using software downloaded from the Virtual Radiation Belt Observatory, first accessed on 22 August 2022: http://virbo.org/POES#Processing. The geomagnetic field was re-evaluated together with L-shells on the NOAA-15 orbit using the International Geomagnetic Reference Field (IGRF-13) model, first accessed on 22 August 2022, and downloaded at: http://www.ngdc.noaa.gov/IAGA/vmod/igrf.html. Geomagnetic Ap indexes and Dst variations, first accessed on 22 August 2022, were downloaded at the links https://www.ngdc.noaa.gov/geomag/data.shtml and http://wdc.kugi.kyoto-u.ac.jp/dst\_final/index.html, respectively. The UNILIB libraries to calculate mirror altitudes, first accessed on 22 August 2022, were downloaded at https://www.mag-unilib.eu. Finally, EQ events were first accessed on 22 August 2022, and downloaded at https://earthquake.usgs.gov/earthquakes/search/, and the database was declustered using CLUSTER 2000 software, first accessed on 22 August 2022, and downloaded at https://www.usgs.gov/media/images/cluster2000.

**Acknowledgments:** I would like to express my thanks to Craig J. Rodger and Janet Green from NOAA for their useful codes to subtract the proton contamination of electron channels. Additionally, I would like to express my gratitude to M. Kruglanski for the library to calculate bouncing altitudes. I would like to thank the project "Limadou Science +", and an anonymous reviewer for their helpful suggestions.

**Conflicts of Interest:** The author declares no conflict of interest.

## **References**


## *Article* **Stochastic Generator of Earthquakes for Mainland France**

**Corentin Gouache <sup>1,</sup>\*, Pierre Tinard <sup>2</sup> and François Bonneau <sup>1</sup>**


**Abstract:** Mainland France is characterized by low-to-moderate seismic activity, yet it is known that major earthquakes could strike this territory (e.g., Liguria in 1887 or Basel in 1356). Assessing this French seismic hazard is thus necessary in order to support building codes and to guide prevention actions for the population. The Probabilistic Seismic Hazard Assessment (PSHA) is the classical approach used to estimate the seismic hazard. One way to apply PSHA is to generate synthetic earthquakes by propagating information from past seismicity and building various seismic scenarios. In this paper, we present an implementation of a stochastic generator of earthquakes and discuss its relevance for mimicking the seismicity of low-to-moderate seismic areas. The proposed stochastic generator produces independent events (main shocks) and their correlated seismicity (only aftershocks). Main shocks are simulated first in time and magnitude, considering all available data in the area, and then localized in space with the use of a probability map and regionalization. Aftershocks are simulated around main shocks by considering both the seismic moment ratio and the distribution of the aftershock proportion. The generator is tested with mainland France data.

**Keywords:** generator of earthquakes; low-to-moderate seismicity; stochastic; France

**Citation:** Gouache, C.; Tinard, P.; Bonneau, F. Stochastic Generator of Earthquakes for Mainland France. *Appl. Sci.* **2022**, *12*, 571. https:// doi.org/10.3390/app12020571

Academic Editors: Ricardo Castedo, Miguel Llorente Isidro, David Moncoulon and Arcady Dyskin

Received: 1 October 2021 Accepted: 28 December 2021 Published: 7 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

## **1. Introduction**

Mainland France seismicity is considered low to moderate due to its long return periods and weak maximal magnitudes. Only a few earthquakes have caused damage in this territory since the introduction of the French insurance compensation system in 1982. The six major earthquakes that occurred after 1982 resulted in less than EUR 600 M of insured losses, representing approximately 1.5% of the total amount compensated under this scheme, https://catastrophes-naturelles.ccr.fr/ (accessed on 20 December 2021). Nevertheless, major earthquakes could strike this territory (e.g., Liguria in 1887, Basel in 1356) and create financial losses and casualties. For example, CCR (*Caisse Centrale de Réassurance*, French reinsurance company) and BRGM (*Bureau de Recherches Géologiques et Minières*, the French geological survey) have quantified the probable insurance losses associated with historical earthquakes if they occurred in the present day:


This shows that estimating the seismic hazard and risk is necessary even in low-to-moderate seismic areas, since losses can be significant.

The Probabilistic Seismic Hazard Assessment (PSHA) [5] is a group of methodologies that estimate seismic hazard in a probabilistic way using seismic, tectonic and geologic data, as well as their uncertainties. It leads to a measure of probabilities of exceedance for different hazard levels (e.g., Peak Ground Acceleration) at a given site. The spatio-temporal analysis of past seismicity recorded in an area leads to distributions of earthquake occurrence and location that are the cornerstone of the PSHA. Traditionally, these distributions can be used in an integrative way [5] or as an input to constrain a stochastic generator of earthquakes [6]. The main advantage of the latter is to explicitly consider the synthetic seismic events, which allows for the computation of the contribution of each earthquake to the hazard evaluated at a given site. This aspect is useful in an insurance context and also makes it possible to link earthquakes to other hazards (e.g., [7] for tsunamis, [8] for liquefaction, [9] for landslides and so on). Regardless, this study only focuses on earthquakes.

The PSHA process takes into account the spatio-temporal behavior of seismicity. To carry this out, three principal methods can be involved:


Each of these methods has drawbacks and advantages. They are classically used in parallel in order to combine the advantages of each. Their application depends on the user's wishes and mainly on the studied region.

Mainland France is a low-to-moderate seismic area since it is located in an intraplate domain, also called a stable region, i.e., far from tectonic plate boundaries.

The spatial segregation of the zoning method induces a drastic reduction in data: the data available in each region are far fewer than those observed over the whole studied area. This is even more critical in low-to-moderate seismic areas, where seismic data are scarce. Using this method calls for specific care regarding data completeness and representativeness.

The smoothing method provides a good description of the seismicity but is not complete. In intraplate regions, where seismic frequencies are low, longer observation periods are needed. This is particularly true for extreme events (highest magnitudes) that may not have been seen during the period of observation. In fact, according to some authors, e.g., [14,15], stable regions present a seismicity that is more diffuse and homogeneous than past observations suggest.

Finally, in intraplate domains, only a few (or even no) active faults are known. Moreover, the seismic cycle assumption could be questionable in stable regions [15]. According to these authors, intraplate seismicity is not driven by repeated tectonic loading and energy release but by transient crust deformations [15] with origins other than tectonics; see [16] for a review in the French context. For this reason, the application of the active fault method is disputed. For example, in an application to the southeastern part of France [17], Martin et al. used this method as a complement to others, but only for the Provence region, where active faults are better known, e.g., [18,19].

In this paper, we propose a new stochastic generator of synthetic earthquake catalogues that considers the specificities of intraplate regions (e.g., little data, various and complex origins of seismicity). We start by presenting the stochastic generator, which simulates both main shocks (main, independent events) and aftershocks (correlated events). Then, we apply this methodology to mainland France.

## **2. Methodology**

The proposed generator first simulates main shocks and then their associated aftershocks. A global scheme of the method is illustrated in Figure 1. In this scheme, the method is represented by the blue blocks, whereas inputs are marked as green blocks. Data used to produce these inputs are referred to in orange. Since this paper only focuses on methodology, the user is free to use their own data.

**Figure 1.** General workflow of earthquake generation. Blue: generator. Green: generator's inputs. Orange: data. FMD: Frequency–Magnitude Distribution. PMD: Proportion–Magnitude Distribution.

## *2.1. Main Shock Generation*

Only a few data are available in low-to-moderate seismic regions. Knowing this, instead of computing a Frequency–Magnitude Distribution (FMD) per zone or a smoothed FMD, we established the FMD of main shocks recorded over the whole territory. This has the advantage of maximizing the amount of data and evaluating the probability of occurrence of an earthquake in France without any prior assumption. This follows recent studies, e.g., [14,15], which stipulate that a stable region's seismicity is more diffuse and homogeneous than past observations suggest. Main shocks are thus generated through time (year) and magnitude for the whole territory. Then, the position of each main shock is assigned according to its magnitude and a fault probability map.

## 2.1.1. Main Shock Generation in Time and Magnitude

Synthetic main shocks are first generated through time and magnitude. The chosen time step is the year, while the magnitude step is set to 0.1. We first set the number of years *N<sup>y</sup>* over which the synthetic catalogue will span. Then, the generator calls for a stochastic Frequency–Magnitude Distribution (FMD) of main shocks, which corresponds to a probabilistic distribution of the annual number of main shocks for each magnitude step. The annual density *λy*(*M*) of main shocks is defined year after year *y* for each magnitude step *M* according to this stochastic FMD of main shocks.

Main shocks are simulated yearly following a homogeneous Poisson point process of annual density *λy*(*M*):

$$n\_y(M) \sim Poisson(\lambda\_y(M))\,. \tag{1}$$

with *ny*(*M*) being the number of main shocks of magnitude *M* ∈ [*Mmin*, *Mmax*] generated at the year *y* ∈ [1, *Ny*].

The annual number of main shocks of magnitude equal to *M* at year *y* (*λy*(*M*) in Equation (1)) is drawn from the stochastic FMD of main shocks. However, these FMDs are defined for main shocks with magnitude greater than or equal to *M* (*λy*(≥*M*)). The annual density obtained from this kind of FMD is therefore not *λy*(*M*) but *λy*(≥*M*). Thus, *λy*(*M*) is defined as:

$$\forall y \in [1, N\_y],\ \forall M \in [M\_{\text{min}}, M\_{\text{max}}]: \quad \lambda\_y(M) = \begin{cases} \lambda\_y(\ge M) & \text{if } M = M\_{\text{max}}\\ \lambda\_y(\ge M) - \lambda\_y(\ge M + dM) & \text{otherwise} \end{cases} \tag{2}$$

In some cases, *λy*(≥*M* + *dM*) can be higher than *λy*(≥*M*); in that case, *λy*(*M*) is set to 0. This step is referred to as 1A in Figure 1.
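A minimal sketch of steps (1) and (2) follows (not the authors' implementation; the fixed-rate simplification is an assumption, since the FMD itself is stochastic in the paper):

```python
# Draw yearly Poisson counts per 0.1-magnitude step from a cumulative FMD.
import numpy as np

rng = np.random.default_rng()

def yearly_counts(lambda_geq, n_years):
    """lambda_geq: annual rates lambda(>=M) on an increasing magnitude grid."""
    lambda_geq = np.asarray(lambda_geq, dtype=float)
    # Equation (2): lambda(M) = lambda(>=M) - lambda(>=M+dM); the last step
    # keeps lambda(>=Mmax) as-is.
    lam = lambda_geq - np.append(lambda_geq[1:], 0.0)
    lam = np.clip(lam, 0.0, None)    # negative densities are set to 0 (see text)
    # Equation (1): n_y(M) ~ Poisson(lambda_y(M)) for each year and step.
    return rng.poisson(lam, size=(n_years, lam.size))
```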

## 2.1.2. Main Shock Generation in Space

Every main shock simulated in terms of year and magnitude is then localized somewhere in the whole of mainland France.

A recent study [16] stipulates that structural inheritance, that is, crust or mantle weakening from past tectonics, can play an important role in deformation localization in an intraplate context. According to previous works and new observations, these authors estimate that 55–95% of stable-region seismicity occurs near such weaknesses. Based on this, we suppose here that such geological objects, which testify to hundreds of millions of years of seismicity, give more exhaustive information than past seismicity for predicting the possible locations of future earthquakes. In this paper, as a first approximation, we propose guiding the location of generated main shocks by using a fault probability map.

However, even in active regions around the world, fault activity can be difficult to assess, e.g., [20] for the Canterbury earthquake in New Zealand. Recent studies in China [21] and earthquakes in France [22] and Australia [23] serve as reminders of how difficult it is to define fault activity in stable regions. Since, in these regions, faults can be inactive over long periods of time [14,21], we decided to consider all faults, regardless of their presumed importance and activity.

The map of faults is converted into a density map by computing the average length of fault lines per unit area on a Cartesian grid. The density values are then normalized so that they sum to 1, which leads to a probability map. In that sense, each cell of the grid is characterized by a probability of hosting faults. The grid resolution is set by the user.
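A rough sketch of this conversion is given below (not the authors' code; attributing each segment's full length to the cell of its midpoint is a simplifying assumption):

```python
# Fault probability map: total fault-line length per grid cell,
# normalized so that all cells sum to 1.
import numpy as np

def fault_probability_map(faults_xy, x_edges, y_edges):
    density = np.zeros((len(x_edges) - 1, len(y_edges) - 1))
    for line in faults_xy:                      # line: (n, 2) array of vertices
        for (x0, y0), (x1, y1) in zip(line[:-1], line[1:]):
            # crude attribution: whole segment length goes to its midpoint cell
            mx, my = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
            i = np.searchsorted(x_edges, mx) - 1
            j = np.searchsorted(y_edges, my) - 1
            if 0 <= i < density.shape[0] and 0 <= j < density.shape[1]:
                density[i, j] += np.hypot(x1 - x0, y1 - y0)
    return density / density.sum()              # probabilities summing to 1
```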

Looking at past seismicity, one notices that the earthquakes with the highest magnitudes are concentrated in the most active regions, even in intraplate domains; in France, for example, these regions are the Alps, the Pyrenees and the Rhine basin. We decided to apply a magnitude threshold to drive the highest-magnitude earthquakes into the areas where they are most likely to occur. To achieve this, we segregate the fault probability map by regionalization. Each region is characterized by a maximal magnitude *Mmax region*, which corresponds to the magnitude of the largest earthquakes that this region can host.

Coupling the regionalization with the probability map leads to a set of regions, each defined by its own probability map and maximal magnitude *Mmax region*. A main shock of magnitude *M* generated at year *y* can only be localized in eligible regions, i.e., regions where *M* ≤ *Mmax region*. Once the set of eligible regions is defined, the location of this main shock is drawn through the cumulative distribution function of the corresponding probability map (Figure 2). This procedure is a heterogeneous Poisson process whose density depends on the magnitude of the synthetic main shock to be located.

The Cartesian grid on which the fault probability map is computed, whose resolution depends on the user, is used to select a cell in which the main shock's coordinates (the black point in Figure 2) are uniformly drawn. Figure 2 summarizes the localization process. Owing to this approach, the spatial distribution of generated main shocks is limited by magnitude, with respect to the fault probability map that characterizes seismicity.

This step is referred to as 2A in Figure 1.
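A hedged sketch of this step 2A drawing (names are illustrative; `region_id` is assumed to be a per-cell array of region indices):

```python
# Restrict the probability map to regions eligible for magnitude M,
# renormalize, draw a cell, then draw uniform coordinates inside it.
import numpy as np

rng = np.random.default_rng()

def draw_location(prob_map, region_id, m_max_region, M, x_edges, y_edges):
    eligible = m_max_region[region_id] >= M     # regions where M <= Mmax_region
    p = np.where(eligible, prob_map, 0.0).ravel()
    p /= p.sum()
    k = rng.choice(p.size, p=p)                 # heterogeneous Poisson cell draw
    i, j = np.unravel_index(k, prob_map.shape)
    x = rng.uniform(x_edges[i], x_edges[i + 1])  # uniform draw inside the cell
    y = rng.uniform(y_edges[j], y_edges[j + 1])
    return x, y
```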

**Figure 2.** General workflow of main shock localization. A probability map is defined through a Cartesian grid. Each region is characterized by its own probability map and maximal allowed magnitude. The magnitude *M* of the generated main shock limits the number of eligible regions. The spatial drawing is then realized only within eligible regions. Once a cell of the grid is drawn, the epicentral coordinates of the main shock (black point) are uniformly drawn within this cell.

## 2.1.3. Rupture Plane's Parameters

Seismic hazard produced by an earthquake at a given site is computed by ground-motion prediction equations. These equations accept the definition of earthquakes as points. However, the geometry of the rupture plane associated with an earthquake can play a significant role in the seismic hazard computation. Knowing this, a rupture plane needs to be described for each generated earthquake.

In this study, a rupture plane is considered as a plane at a particular depth, parametrized by a length (*L*) and three orientation angles: (i) azimuth (clockwise angle from north, between 0° and 360°), (ii) dip (angle from the horizontal, between 0° and 90°) and (iii) rake (angle between −180° and 180°) that defines the component of fault movement (normal, reverse, strike-slip).

*L* is evaluated from moment magnitude *M<sup>w</sup>* of the earthquake using the following relation, e.g., [24,25]:

$$L = 10^{(M\_w - l\_1)/l\_2} \tag{3}$$

with *L* being the rupture plane's length in km and *l*<sup>1</sup> and *l*<sup>2</sup> being two constants. The other parameters are defined per region, with intervals of values set according to data. Classically, the following kinds of data are used:


See Appendix A for an example of definition of these intervals for an application in mainland France.

Values for each parameter are randomly drawn from a uniform distribution when a main shock is generated in a given region.

This step is referred to as 3A in Figure 1.
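A short sketch of step 3A under these definitions (not the authors' code; the constants *l*<sub>1</sub>, *l*<sub>2</sub> and the angle intervals must be supplied from regional data, see Appendix A):

```python
# Rupture-plane parameters: length from Equation (3), angles drawn
# uniformly within per-region intervals.
import numpy as np

rng = np.random.default_rng()

def rupture_plane(M_w, l1, l2, azi_range, dip_range, rake_range):
    L = 10 ** ((M_w - l1) / l2)               # Equation (3), length in km
    return {
        "L": L,
        "azimuth": rng.uniform(*azi_range),   # degrees, clockwise from north
        "dip": rng.uniform(*dip_range),       # degrees from horizontal
        "rake": rng.uniform(*rake_range),     # degrees, fault movement component
    }
```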

## *2.2. Aftershock Generation*

Aftershocks can play a significant role in seismic risk, e.g., [26], since their occurrence can destroy a building already damaged by the main shock. Generating aftershocks is thus necessary in the context of seismic risk. Once again, the low amount of data makes seismic sequences difficult to observe in intraplate domains. Thus, applying the well-known Epidemic Type Aftershock-Sequence (ETAS [27]) model, a heavily parameterized marked point process, is difficult.

Computing the Frequency–Magnitude Distribution (FMD) of main shocks in order to generate main shocks through time and magnitude calls for a differentiation between main shocks and aftershocks in the analyzed data. This leads to a Proportion–Magnitude Distribution (PMD) of main shocks, which depicts the proportion of main shocks in the studied catalogue as a function of magnitude. Here, we propose generating aftershocks complementarily to main shocks by following the same PMD used to produce the main shocks. Once every main shock is generated, aftershocks are produced and related to their own main shock. This aftershock production is carried out through the three steps described below and represented in Figure 1 (1–3B).

## 2.2.1. Number of Aftershocks to be Produced

The first step consists of defining the number *NbAs*(≥*M*) of aftershocks with magnitude greater than or equal to *M* to be produced. This number is obtained for each magnitude step according to the number of main shocks *NbMs*(≥*M*) already generated and the proportion of main shocks *PropMs* as follows:

$$NbAs(\geq M) = NbMs(\geq M) \times \left(\frac{1}{PropMs(\geq M)} - 1\right).\tag{4}$$

An example of a PMD of main shocks, as well as the number of main shocks with magnitude greater than or equal to 4 generated over 100,000 years, is shown in Figure 3. The number of aftershocks with *M* ≥ 4 to be produced, computed by Equation (4), is also visible in Figure 3. Thus, aftershocks are first associated with their magnitude.
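Equation (4) translates directly into code; the sketch below assumes that both inputs are arrays defined over the same magnitude steps:

```python
# Number of aftershocks above each magnitude step (Equation (4)).
import numpy as np

def n_aftershocks(nb_ms_geq, prop_ms_geq):
    nb_ms_geq = np.asarray(nb_ms_geq, dtype=float)      # NbMs(>=M)
    prop_ms_geq = np.asarray(prop_ms_geq, dtype=float)  # PropMs(>=M), in (0, 1]
    return nb_ms_geq * (1.0 / prop_ms_geq - 1.0)        # NbAs(>=M)
```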

**Figure 3.** Example of number of aftershocks generated over 100,000 years. This number is obtained from examples of number and proportion of main shocks (Equation (4)).

2.2.2. Aftershock–Main-Shock Relation

According to the Båth law [28,29] (Equation (5)), the main shock is the event of the sequence with the highest magnitude. Moreover, its magnitude *Mms* is bounded from below by the magnitudes *Mas* of its aftershocks as follows:

$$M\_{\rm ms} \ge M\_{\rm as} + \Delta M.\tag{5}$$

where ∆*M* is a constant equal to 1.2 for crustal seismicity (<50 km depth, in [28]).

In this paper, we use this convenient law to find the eligible main shocks for each aftershock. Owing to the first step, the number of aftershocks to be produced is known for each magnitude step (Figure 3). Applying Equation (5) to each aftershock allows us to define the minimal magnitude *Mms min* of its related main shock. Its linked main shock is then randomly selected among all of the main shocks with magnitude greater than or equal to *Mms min*. The aftershock's year of occurrence is set to that of its main shock.

However, one needs to keep in mind that Equation (5) is an empirical model and thus not a general truth. For example, the M6.5 Amatrice (Italy) earthquake was followed by an aftershock of magnitude 6.1 three months later. In order to take the ∆*M* variability into account in the aftershock–main-shock association, the seismic moment ratio method [30] is used in this paper. It computes the term:

$$\Delta M = -\frac{\log\_{10}(R)}{1.5}\,, \tag{6}$$

according to the *R* ratio equal to:

$$R = \frac{\sum M\_{0as} - \sum M\_{0fs}}{M\_{0ms}} \approx 0.05\,\text{.}\tag{7}$$

where ∑*M*<sub>0as</sub> and ∑*M*<sub>0fs</sub> are the seismic moments, equivalent to energies, released by all of the aftershocks and foreshocks, respectively, and *M*<sub>0ms</sub> is the seismic moment released by their main shock. According to [30], *R* is equal to 0.05.

A ∆*M* distribution is obtained by sampling *R* in a Gaussian law with a mean of 0.05 and a standard deviation of 0.0125 [30]. Such a distribution gives an average ∆*M* of 0.87, while 90% of the ∆*M* values lie between 0.77 (Q5) and 1.02 (Q95). Drawing a ∆*M* from this distribution makes it possible to estimate *Mms min*, the minimal magnitude of its main shock, from Equation (5).
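
A minimal sketch of this sampling, assuming a simple rejection of non-positive *R* draws (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(42)

# Sample the seismic moment ratio R in a Gaussian law N(0.05, 0.0125) [30],
# keeping only positive draws so that log10(R) is defined.
R = rng.normal(0.05, 0.0125, size=100_000)
R = R[R > 0]

# Equation (6): distribution of Delta M derived from R.
dM = -np.log10(R) / 1.5
print(f"mean dM = {dM.mean():.2f}")                           # ~0.87
print(f"Q5, Q95 = {np.quantile(dM, [0.05, 0.95]).round(2)}")  # ~[0.77, 1.02]

# Equation (5): minimal main-shock magnitude for an aftershock of magnitude 4.3.
M_ms_min = 4.3 + rng.choice(dM)
```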

## 2.2.3. Aftershock Location and Rupture-Plane Parameter Definition

We suppose that aftershocks are localized on the same fault as the main shock. Aftershocks' epicenters (*xas*, *yas*) are thus placed at approximately 0.75 *L* from the main shock's epicenter (*x*, *y*) in the direction of its azimuth (*azi*), to within 10°:

$$\begin{array}{l} x_{as} \sim \mathcal{N}(x,\ 0.75 \times L \times \sin(-azi \pm 10)) \\ y_{as} \sim \mathcal{N}(y,\ 0.75 \times L \times \cos(-azi \pm 10)) \end{array} \tag{8}$$

where *L* (km) is the rupture plane length associated with the main shock (Equation (3)).

The depth *zas* (km), azimuth *azias* (°) and dip *dipas* (°) of the aftershock's rupture plane are sampled from normal laws. These laws have a mean equal to the value of the main shock (*z*, *azi* and *dip*) and a standard deviation depending on the parameter:

$$\begin{array}{l} z_{as} \sim \mathcal{N}(z,\ 2.5) \ge 0 \\ azi_{as} \sim \mathcal{N}(azi,\ 5) \ge 0 \\ dip_{as} \sim \mathcal{N}(dip,\ 2.5) \ge 0 \end{array}. \tag{9}$$

The movement of the aftershock's rupture plane (normal, reverse, strike-slip or unknown) is set equal to that of the main shock's rupture plane.
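
The geometry sampling of Equations (8) and (9) can be sketched as follows; the function and variable names are ours, and Equation (8) is read here with the 0.75 *L* sin/cos terms as the standard deviations of the normal laws:

```python
import numpy as np

rng = np.random.default_rng(0)

def aftershock_geometry(x, y, z, azi, dip, L):
    """Sample an aftershock's epicenter (Equation (8)) and rupture-plane
    parameters (Equation (9)) from those of its main shock.
    x, y, z and L in km; azi and dip in degrees."""
    # Equation (8): normal laws centred on the main shock's epicenter, with the
    # 0.75 L offset along the azimuth (to within 10 degrees) as the scale.
    a = np.radians(-azi + rng.uniform(-10.0, 10.0))
    x_as = rng.normal(x, abs(0.75 * L * np.sin(a)))
    y_as = rng.normal(y, abs(0.75 * L * np.cos(a)))
    # Equation (9): normal laws centred on the main-shock values, truncated at 0.
    z_as = max(rng.normal(z, 2.5), 0.0)
    azi_as = max(rng.normal(azi, 5.0), 0.0)
    dip_as = max(rng.normal(dip, 2.5), 0.0)
    return x_as, y_as, z_as, azi_as, dip_as

print(aftershock_geometry(x=650.0, y=6200.0, z=8.0, azi=120.0, dip=60.0, L=12.0))
```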

## **3. Application to Mainland France**

*3.1. Data Used in This Study*

## 3.1.1. Regionalization

Even in intraplate domains, the spatial distribution of seismicity can seem heterogeneous. For example, in France, the Pyrenees and Alps are linked to the collision of the European and African plates, which accentuates the tectonic features and seismicity in these areas. The SHARE project [31] has proposed segregating European territories according to tectonic features by differentiating pure Stable Continental Regions (SCR) and Oceanic Crusts (OC) from shallow active regions. This tectonic distinction is visible in Figure 4 (Figure 2 in [32]) for France. This regionalization is used in this paper in order to limit the spatial distribution of magnitudes (see Section 2.1.2 and Figure 2).

**Figure 4.** SHARE tectonic regionalization [32] used in this paper. Regions 1 correspond to Stable Continental Regions (SCR). Region 2 represents the Oceanic Crust (OC). Regions 3 (Pyrenees) and 4 (Alps) represent compressional active shallow regions, whereas regions 5 (Rhine Basin) and 6 (Alps) correspond to extensional ones.

## 3.1.2. Seismic Catalogues

## Historical Catalogue

The historical French catalogue FCAT-17 [33] is composed of 4250 earthquakes felt and reported by the French population from 463 to 1964. Despite being associated with large uncertainties, such a long catalogue is useful to define the maximal magnitude of the study (*Mmax*) and per region (*Mmax region*). In this study, we select these maximal magnitudes as the maximum magnitudes recorded in the FCAT-17 catalogue, plus their uncertainties. The strongest earthquake of this catalogue occurred in Liguria in 1887, with a moment magnitude of 6.7 ± 0.6. Thus, the maximal magnitude *Mmax* is set to 6.7 + 0.6 = 7.3. Since this earthquake is localized in region 4 (compressional Alps, Figure 4), this region is characterized by an *Mmax region* equal to 7.3. Table 1 summarizes the maximal magnitudes observed in each SHARE region.

**Table 1.** Maximal magnitude *Mmax region* of each region. Region numbers refer to Figure 4.


Consistent with a recent study [34], we consider that a seismic event with a magnitude of up to 5.5 can occur anywhere in France. As a result, *Mmax region* is raised to 5.5 if the estimation using the catalogue leads to a smaller value, which is the case for the OC region (n°2, Table 1). Maximal magnitudes are constant; thus, their uncertainties are not explored in this paper.

## Instrumental Catalogue

This study uses both the instrumental French catalogue SIHex [35] and the RéNaSS catalogue, https://api.franceseisme.fr/fr/search (accessed on 20 December 2021). The former is composed of 37,408 earthquakes recorded between 1965 and 2009, whereas the latter regroups 25,042 seismic events indexed from 2010 to 2020. In total, the merged catalogue brings together 62,450 earthquakes that occurred during the 1965–2020 period.

Such a catalogue is not complete, since the seismometers' number and resolution have improved over time. The catalogue therefore needs to be processed in order to reveal representative and meaningful information. The classical approach to making this catalogue exhaustive is to consider cut-off magnitudes (*Mc*) and years (*Yc*). The catalogue can be considered exhaustive on a particular territory for earthquakes with a magnitude greater than or equal to *M<sup>c</sup>* only during the period *Yc*–2020. For the French territory, we determine *M<sup>c</sup>* and *Y<sup>c</sup>* using the cumulative visual method [36,37]. The (*M<sup>c</sup>*, *Y<sup>c</sup>*) pairs that we found are listed in Table 2.

**Table 2.** Periods of completeness in function of magnitude for the instrumental catalogue.


Figure 5 presents the spatial distribution of the 15,567 earthquakes *M<sup>w</sup>* ≥ 2 contained in the instrumental catalogue. The strongest earthquake of this catalogue occurred near Arette in 1967, with a moment magnitude of 5.2 ± 0.3.

**Figure 5.** SIHex and ReNaSS catalogues (1965–2020) used in this study to calculate the FMD of main shocks for the whole of mainland France. Only the 15,567 earthquakes with magnitudes greater than or equal to 2 are represented.

## 3.1.3. Faults

As a first approximation, we propose building a spatial probability map (Figure 2) by using the CHARM database, https://infoterre.brgm.fr/page/telechargement-cartes-geologiques (accessed on 20 December 2021), produced by BRGM (Figure 6a). This database contains fault traces, i.e., intersection lines between faults and the Earth's surface mapped on geological maps. The CHARM database is composed of 10,576 terrestrial fault lines covering the whole of mainland France and its vicinity (Figure 6a). As already stated in Section 2.1.2, we decide to consider all faults, regardless of their presumed importance and activity. This map of fault traces is converted into a density map by computing the average length of fault lines per unit area on a Cartesian grid with a 5 km spacing. The density map is illustrated in Figure 6b.

**Figure 6.** (**a**) Fault traces map of the French territory (CHARM database of BRGM). (**b**) Spatial density map of fault traces.

Spatial distributions of past earthquakes (Figure 5) and fault traces (Figure 6) show the same trend. However, one can note some differences between these two maps. For example, some moderate earthquakes have been observed north of Bordeaux, whereas no fault trace is visible nearby. These differences are explained by:


In order to allow for the possibility of localizing main shocks anywhere [14,15], thus avoiding specific cases, such as the one observed near Bordeaux, regions where no fault lines have been documented are associated with a minimal density value instead of 0. In this study, in the first instance, this minimal value is arbitrarily set to 1% of the maximal density value. In that sense, we consider that the least represented regions have a density that is 100 times lower than that of the most represented regions. Concretely, according to Figure 6b, every 5 × 5 km cell characterized by a density value lower than 0.34/100 = 0.0034 is associated with this minimal value (0.0034). The cells concerned by this density modification represent around 15% of mainland France. This mainly concerns the Parisian and Aquitaine basins (SCR), as well as the Mediterranean Sea (oceanic crust). This map is then transformed into a probability map by normalization. The final regionalized fault probability map is used as an input of the generator of earthquakes for mainland France ("*Regionalized density map of faults depending on magnitude*", Figure 1).
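
A minimal sketch of this flooring and normalization step, assuming the fault-trace density has already been gridded (the placeholder grid is random):

```python
import numpy as np

rng = np.random.default_rng(1)
density = rng.random((200, 240))  # placeholder 5 x 5 km density grid (Figure 6b)

# Cells below 1% of the maximal density get that minimal value, so that main
# shocks can be localized anywhere.
floor = 0.01 * density.max()
density = np.maximum(density, floor)

# Normalization turns the density map into a spatial probability map.
prob_map = density / density.sum()
assert np.isclose(prob_map.sum(), 1.0)
```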

Faults (Figure 6a) can also be employed to define the probability density functions (pdf) of azimuth in each region. Please refer to Appendix A for an application to mainland France.

## *3.2. Temporal Description of the Instrumental Catalogue*

## 3.2.1. Frequency–Magnitude Distribution

The frequency–magnitude distribution (FMD) describes the temporal occurrence of earthquakes. It consists of computing the annual number of earthquakes as a function of the magnitude. The classical way to compute the FMD is to apply the GR law [38] to observed data:

$$\log_{10}(N(\geq M)) = a + b \times M\,,\tag{10}$$

where *M* is the earthquake's magnitude and *N*(≥*M*) is the annual number of earthquakes with a magnitude greater than or equal to *M* observed in the seismic catalogue. In this study, *M* stands for the moment magnitude (*Mw*), and it is limited from below by a minimal magnitude *Mmin* (which, here, is equal to 2).

The seismic catalogue is composed of two types of earthquakes: main shocks and their correlated seismicity (foreshocks or aftershocks). Since they are characterized by different spatial and temporal behaviors, seismologists usually analyze them separately. A declustering algorithm is classically used for that purpose, e.g., [39]. It applies spatio-temporal windows around earthquakes in a catalogue in order to find clusters. In each cluster, the earthquake with the maximal magnitude is considered as the main shock, and the others are considered as correlated events. Thus, it is possible to segregate independent events from dependent ones by conserving only main shocks and lone earthquakes (clusters with only one earthquake). Then, the population of main shocks can be studied separately through a Poisson distribution in time, space and magnitude. In this study, the G85 declustering algorithm [40] is chosen. The Proportion–Magnitude Distribution (PMD) of main shocks obtained by applying this algorithm to French instrumental seismicity is represented by the blue empty dots in Figure 7.

**Figure 7.** Proportion–Magnitude and Frequency–Magnitude Distributions (PMD and FMD) of main shocks for the whole of mainland France. The PMD is obtained by applying the G85 declustering algorithm [40] to the data. A regression of the GR law (Equation (10)) has been applied to the data from *M*2 to *M*5 and extrapolated using the truncated GR law (Equation (11)).

Orange empty dots in Figure 7 show the annual number of main shocks observed in the exhaustive and declustered catalogue as a function of their magnitude. Applying Equation (10) to these data requires a straight-line distribution. Such a distribution is observed between magnitudes 2 and 5 (Figure 7). Thus, we apply Equation (10) between *M*2 and *M*5 and obtain *a* = 4.41 and *b* = −1.12.
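
The regression itself is a straight-line fit in log10 space. A sketch with synthetic annual frequencies generated from the fitted trend, for illustration only:

```python
import numpy as np

# Annual numbers of main shocks N(>=M) between M2 and M5, here generated from
# the GR trend reported in the text (a = 4.41, b = -1.12) for illustration.
mags = np.arange(2.0, 5.1, 0.1)
n_annual = 10 ** (4.41 - 1.12 * mags)

# Equation (10): linear regression of log10(N(>=M)) against M.
b, a = np.polyfit(mags, np.log10(n_annual), 1)
print(f"a = {a:.2f}, b = {b:.2f}")  # recovers a = 4.41, b = -1.12
```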

From *M*5.1, the annual numbers of main shocks fall and do not respect the GR law (Equation (10)). This discrepancy for large magnitudes is classically attributed to the non-completeness of data for these magnitudes. Under this assumption, the GR law estimated between magnitudes 2 and 5 is extrapolated for *M* ≥ 5.1.

As the total energy released by earthquakes is finite, some deviation from the GR straight line is required for the largest magnitudes in order to avoid an infinite distribution. A truncation is generally applied to the GR law [41]:

$$\forall M \in [M_{\min}, M_{\max}],\; N(\geq M) = N(\geq M_{\min}) \times \frac{e^{-\beta(M - M_{\min})} - e^{-\beta(M_{\max} - M_{\min})}}{1 - e^{-\beta(M_{\max} - M_{\min})}},\tag{11}$$

with *β* = |*b*| × ln(10) and *N*(≥*Mmin*) given by Equation (10). Applying Equation (11) with the *a* and *b* estimated before (Equation (10)) gives the FMD of main shocks modeled from the exhaustive and declustered catalogue. This FMD is visible in Figure 7 and is characterized by a mean slope (*b* value) equal to 1.12 and an asymptotic fall of frequencies towards the maximal magnitude *Mmax* = 7.3.
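
A sketch of Equation (11); the rate `n_min` of *M* ≥ *Mmin* main shocks stands for the prefactor *N*(≥*Mmin*) and is an illustrative value chosen so that the *M* ≥ 4 return period is close to the 1.14 years of Table 3:

```python
import numpy as np

def truncated_gr(M, n_min, b=1.12, M_min=2.0, M_max=7.3):
    """Annual number of main shocks >= M under the truncated GR law
    (Equation (11)), with beta = |b| * ln(10). n_min is the annual rate of
    main shocks with M >= M_min (the prefactor of Equation (11))."""
    beta = b * np.log(10.0)
    num = np.exp(-beta * (M - M_min)) - np.exp(-beta * (M_max - M_min))
    den = 1.0 - np.exp(-beta * (M_max - M_min))
    return n_min * num / den

n_min = 150.0  # illustrative annual rate of M >= 2 main shocks
for M in (4.0, 5.0, 6.0, 7.0):
    print(f"M >= {M}: return period ~ {1.0 / truncated_gr(M, n_min):.2f} y")
```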

The mean return periods (the inverse of the annual frequencies) of main shocks are summarized for six magnitude steps in Table 3. These return periods describe the average waiting time between main shocks of magnitudes greater than or equal to *M*. As expected, the mean return periods increase as the magnitude grows.

**Table 3.** Mean return periods of main shocks in function of magnitude *M* computed in the instrumental catalogue. These return periods are calculated from G85-GR application (Figure 7). Standard deviations are also reported. d: days; y: years.


## 3.2.2. Consideration of Magnitudes' Uncertainties through a Monte Carlo Scheme

Taking magnitudes' uncertainties into account in the calculation of the FMD through a Monte Carlo scheme allows us to obtain a stochastic set of PMDs and FMDs that corresponds to a global representation of main shocks' proportion and occurrence (Figure 7). This Monte Carlo process consists of producing 1000 initial catalogues of 15,567 earthquakes with *M* ≥ 2. Only the magnitudes differ from one catalogue to another, since the magnitude of each earthquake is drawn from a normal law *N*(*µ*, *σ*), where *µ* is its catalogue magnitude and *σ* its magnitude's uncertainty.
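
A sketch of this perturbation step, with placeholder magnitudes and a uniform 0.3 uncertainty (a real application reads per-event magnitudes and uncertainties from the catalogue):

```python
import numpy as np

rng = np.random.default_rng(2)

# Placeholder magnitudes and uncertainties for the 15,567 M >= 2 events of the
# instrumental catalogue.
mags = 2.0 + rng.exponential(1.0 / (1.12 * np.log(10.0)), size=15_567)
sigmas = np.full(mags.size, 0.3)

# 1000 alternative catalogues: each event's magnitude is redrawn in N(mu, sigma).
catalogues = rng.normal(mags, sigmas, size=(1000, mags.size))
```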

In Figure 7, each magnitude step is characterized by a set of annual frequencies of main shocks. Thus, for each magnitude step, the annual frequency of main shocks can be described by a probability density function (pdf). These pdfs are visible in Figure 8 for six magnitude steps. For a given magnitude *M*, the annual frequency of main shocks of magnitudes greater than or equal to *M* is thus defined as a pdf. The pdf values are normalized by their sum. In this way, the normalized values are dimensionless and sum to 1, which makes them probabilities.

These pdfs of annual frequencies as a function of magnitude are inputs of the generator of earthquakes for mainland France ("*Stochastic FMD of main shocks*"), whereas the mean PMD of main shocks constitutes another input ("*PMD of main shocks*", Figure 1).

**Figure 8.** Probability density function of frequency of main shocks for different magnitudes according to the stochastic FMD of main shocks visible in Figure 7.

## *3.3. Results*

This section presents 100,000 years of synthetic seismicity generated by using SHARE regionalization (Figure 4). Generated earthquakes have magnitudes greater than or equal to 4.

## 3.3.1. Main Shock Generation

## Cumulative Seismic Moment Distribution

Cumulative seismic moments produced by both synthetic and instrumental catalogues are compared in Figure 9. The seismic moment *M*<sub>0</sub> (N.m) released by an earthquake is calculated for each real or synthetic main shock directly from its moment magnitude *M<sub>w</sub>* following [42]:

$$M_0 = 10^{1.5\,M_w + 9.1}\,.\tag{12}$$

According to Table 2, the completeness period associated with instrumental *M* ≥ 4 earthquakes is 1965–2020, which represents 56 years. Comparisons are thus made over this completeness period. Since 100,000 years have been generated, we divided the synthetic seismicity into a set of 1785 sub-catalogues of 56 years. Each of these sub-catalogues is comparable to the complete instrumental catalogue.
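
The comparison can be sketched as follows, assuming arrays of occurrence years and magnitudes for the synthetic *M* ≥ 4 main shocks (placeholder data here):

```python
import numpy as np

def seismic_moment(Mw):
    """Seismic moment M0 (N.m) from moment magnitude Mw (Equation (12))."""
    return 10.0 ** (1.5 * Mw + 9.1)

# Placeholder synthetic catalogue: occurrence years and magnitudes of M >= 4
# main shocks over 100,000 years (~0.88 events per year on average).
rng = np.random.default_rng(3)
years = rng.uniform(0.0, 100_000.0, size=88_000)
mags = 4.0 + rng.exponential(1.0 / (1.12 * np.log(10.0)), size=88_000)

# Split into 1785 sub-catalogues of 56 years and compare cumulative moments.
cum_m0 = []
for i in range(1785):
    window = (years >= 56.0 * i) & (years < 56.0 * (i + 1))
    cum_m0.append(seismic_moment(mags[window]).sum())
print(f"mean cumulative M0 per 56-year window: {np.mean(cum_m0):.2e} N.m")
```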

The final seismic moment observed in the instrumental catalogue is smaller than the final mean seismic moment generated. This slight difference is due to the fact that the return periods of *M* ≥ 4 main shocks are 1.14 years as given as an input (Table 3) and 1.24 years according to the initial catalogue. Thus, the initial catalogue (red line in Figure 9) is composed of 45 *M* ≥ 4 main shocks over 56 years, whereas the 56-year sub-catalogues are on average composed of 49 *M* ≥ 4 main shocks (black line in Figure 9). The generator thus produces a few more *M* ≥ 4 earthquakes than are observed in the initial catalogue.

**Figure 9.** Cumulative seismic moment from observed and generated *M* ≥ 4 seismicities over the period of completeness (Table 2). The synthetic catalogue used is 100,000 years long.

The mean distribution of generated seismic moments is, by definition, smooth, and could differ from the one observed in the instrumental catalogue, but the trends are similar. The distributions visible in the synthetic sub-catalogues show variations, as does the instrumental distribution.

## Frequency-Magnitude Distribution

The Frequency–Magnitude Distribution (FMD) of generated main shocks can be observed in the 100,000 years of generated seismicity. This FMD of main shocks is compared in Figure 10 with the mean one given as an input to the generator (Figure 7). The generator manages to reproduce this FMD of main shocks, especially for magnitudes smaller than 6.5. For higher magnitudes, slight variations are observed between the generator's FMD and the data's FMD. This part needs to be improved in order to avoid these variations.

The FMDs of generated main shocks for the six SHARE regions (Figure 4) are also shown in Figure 10. Table 4 presents the b-values computed by regressing the GR law (Equation (10)) on these FMDs of main shocks, from magnitudes 4 to 5 and between *M*5 and *M*6.

**Table 4.** b-values computed by regressing GR law (Equation (10)) for different ranges of magnitudes on synthetic FMD of main shocks (Figure 10) obtained in each SHARE region. Region numbers refer to Figure 4.


**Figure 10.** Mean frequency–magnitude distributions of main shocks given as input (black) and observed in a synthetic catalogue of 100,000 years for the whole of mainland France (grey) and for the SHARE regions (Figure 4).

Aside from the Rhine basin region (n°5, Table 4), every region shares practically the same slope of the FMD of generated main shocks below magnitude 5 (Figure 10): from 1.09 to 1.15. These slopes are close to the slope of the FMD of main shocks given as an input: 1.12. This is expected, since we only use this one FMD as an input. Homogeneity is thus observed in the results.

Table 5 lists the b-values calculated on the data in each region. Please refer to Appendix B for more details on how these values have been computed. Whether obtained from the observed (Table 5) or synthetic (b(M4-5), Table 4) seismicities, the b-values of a given region keep the same position relative to the b-value used as the input. In fact, regions 1, 5 and 6 are characterized by both b-values being greater than or equal to the reference b-value (1.12). In parallel, regions 2, 3 and 4 are characterized by both b-values being lower than the reference b-value. Thus, variations of the synthetic b-values around the initial value are consistent with the b-values obtained from data.

**Table 5.** b-values computed by regressing GR law (Equation (10)) on FMD of main shocks observed in instrumental seismicity for each SHARE region. Region numbers refer to Figure 4. See Appendix B for more details.


Moreover, for *M* ≥ 5.1, the slopes of the FMDs diverge from one region to another. Between magnitudes 5 and 6, the FMDs of the oceanic crust and the Pyrenees are, respectively, characterized by the highest and the lowest slopes: 1.98 and 0.98 (Table 4). The higher the slope, the lower the seismic activity. Thus, according to the generator, the stable continental and oceanic regions are the least seismic areas, and the most active regions (Alps, Pyrenees and Rhine basin) are the most seismic areas. These results are in line with our knowledge of mainland France seismicity. The use of a fault probability map as an input and only one FMD of main shocks thus leads to consistent results.

From magnitude 5.5, the limitation of large earthquakes' spatial distribution through the regionalization seems to be efficient. In fact, when a magnitude is too high to appear in a given region, it moves to another one, which concentrates the generated main shocks in regions where high magnitudes are allowed. Thus, the limitation of large earthquakes' spatial distribution has an impact on the change in b-values noted in Table 4. Moreover, a relatively pronounced asymptotic fall of frequency is noticed near each region's maximal magnitude, except for the Rhine basin region (Figure 10). This observation is also in line with our knowledge of general seismicity: without using the truncated GR law (Equation (11)), the generator seems to reproduce this asymptotic fall at the largest magnitudes. Finally, the sum of each region's FMD is equal to the FMD of the whole of mainland France. This behavior is due to the use of only one FMD as an input, and brings coherence to the results.

## 3.3.2. Aftershock Generation

Figure 11 illustrates the number of main shocks and aftershocks per unit of magnitude generated over 100,000 years. The higher the magnitude, the higher the proportion of main shocks. This is consistent with the literature and with the Proportion–Magnitude Distribution (PMD) of main shocks given as an input of the generator (Figure 3), which is identical to the one observed in the synthetic seismicity. Thus, the generator of aftershocks is working well, since it manages to reproduce this PMD.

Figure 12 shows the number of aftershocks per main shock as a function of the magnitude of the main shock. This number is constant and equal to 1.6 on average for magnitudes of main shocks between 5.2 and 6.9.

**Figure 11.** Number of main shocks and aftershocks generated over 100,000 years as function of magnitude.

**Figure 12.** Number of aftershocks produced by one main shock according to magnitude of the latter. Error bars correspond to standard deviation observed in the 100,000-year synthetic catalogue.

## **4. Discussion**

## *4.1. On the Use of Fault Probability Map*

The use of the spatial probability of faults to guide the location of generated earthquakes is a strong choice. As already stated, it is motivated both by the various seismic origins in stable regions and by the fact that existing faults are more likely to localize further seismic events [16]. The lack of completeness and of a clear definition of active faults has led us to consider all faults, regardless of their alleged importance and activity. This choice is questionable but consistent with the fact that, in stable regions, faults can be inactive over long periods of time [14,21]. The results described in this paper are encouraging, since they are consistent with the data given as an input and with our knowledge of French seismicity. However, the method presented in this article corresponds to a first step in a new way of generating stable seismicity, and some improvements must be achieved.

First of all, we have seen that the spatial distribution of past seismicity in mainland France is heterogeneous and that the strongest earthquakes are concentrated in specific regions (Alps, Pyrenees and Rhine basin). To account for this statistical observation, we opt for a spatial limitation of magnitudes by applying regionalization. Still, one could imagine using other methodologies to generate seismicity in the most active French regions, for example, a methodology closer to what is conventionally carried out, with FMDs computed by region/zone or smoothly, e.g., [17,34,43] (for recent mainland France applications). Another possible methodology could be to compute an FMD of main shocks in the Alps, the Pyrenees and the Rhine basin and to use a spatial probability map of past seismicity. One could also use a spatial probability of faults weighted by stress rate or other tectonic and geologic information in order to address the seismic potential of faults.

Furthermore, one can imagine weighting the probability map used in this study with our knowledge of intraplate domains. For example, various origins of stable seismicity in the long term could be considered: topography potential energy, erosion and glacial isostatic adjustment since the last glaciation and so on [16]. Transient processes (e.g., fluid pore pressure increase and hydrological or sedimentary load change [15]) leading to stable seismicity could also be taken into account.

Finally, the use of a fault map calls for an exhaustive set of faults. However, this is not the case here, since the map used is only composed of terrestrial fault traces, excluding covered, deep and/or bathymetric faults. Completing this database should be one of the next steps of our work.

## *4.2. On the Definition of Maximal Magnitudes*

As already stated, 1500 years of historical data (FCAT-17 catalogue) cannot be representative of stable seismicity. That is why we choose to use a fault map instead of an earthquake map to drive the synthetic earthquakes' locations. Our definition of maximal magnitudes is thus paradoxical, since it is based on historical seismicity.

Analyzing analogous regions in Europe or around the world through a Bayesian approach, e.g., [44,45], can make the definition of maximal magnitudes more robust. This approach also has the advantage of defining the maximal magnitude not as a constant but across a range of values, which is interesting when investigating maximal magnitudes' uncertainties.

## *4.3. On the Aftershock Production*

Epidemic Type Aftershock Sequence (ETAS) models are marked point processes [27] that produce main shocks and associated aftershocks. For that purpose, they use various laws, such as the aftershock production law ([46] from [47]):

$$K = k \times 10^{\alpha\,(M_{\rm ms} - M_{\rm c})}\,,\tag{13}$$

where *K* is the number of aftershocks produced by a main shock of magnitude *Mms*, *M<sup>c</sup>* is the minimal magnitude of interest and *k* and *α* are two constants. Thus, the number of aftershocks produced by a main shock increases with *Mms*.
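
As a sketch, Equation (13) with the values regressed later on our results (*k* = 0.887 and *α* = 0.331, Figure 13) gives:

```python
def n_aftershocks(M_ms, k=0.887, alpha=0.331, M_c=4.0):
    """Expected number of aftershocks K for a main shock of magnitude M_ms
    (Equation (13)); k and alpha are the values regressed in Figure 13 and
    M_c = 4 is the minimal generated magnitude."""
    return k * 10.0 ** (alpha * (M_ms - M_c))

for M in (5.0, 6.0, 7.0):
    print(f"M{M} main shock -> ~{n_aftershocks(M):.1f} aftershocks")
```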

In our results (Figure 12), the number of aftershocks per main shock seems to be independent of the magnitude of the main shocks. This observation is mainly due to the fact that the Proportion–Magnitude Distribution (PMD) of main shocks used as an input in this study (Figure 3) is derived from the instrumental catalogue. However, as already stated, this catalogue is far from exhaustive for large magnitudes (*M* > 5). Although this is not a problem for the FMD of main shocks, since it is produced by extrapolating the exhaustive part (see Section 3.2), it is a problem for the PMD, which is not extrapolated and is thus not exhaustive. Hence, among all generated *M* ≥ 4 earthquakes, aftershocks represent only 6% (Figure 3), which is low, as Figure 11 illustrates.

A PMD of main shocks computed thanks to the G85 declustering algorithm [40] from the exhaustive historical catalogue (FCAT-17) should be more representative of the true distribution. According to this catalogue, aftershocks represent 15% of the whole *M* ≥ 4 seismicity. Applying the PMD of main shocks estimated from the exhaustive FCAT-17 catalogue gives results that are shown in Figure 13.

These results seem more realistic, since they are based on more exhaustive data and since the number of aftershocks per main shock follows Equation (13). Moreover, our method of generating aftershocks is non-parametric, in contrast to this equation, which requires setting two parameters (*k* and *α*). This is an advantage in the context of low-to-moderate seismicity, where objective parametrization from data is limited since data are sparse.

Main shocks are generated according to the proportion of main shocks *p* observed in the data, whereas the number of aftershocks produced depends on the complement of *p* (1 − *p*). However, the results in Figure 13 have been obtained by using instrumental data to generate main shocks and historical data to produce aftershocks. Thus, two PMDs have been used, and the consistency between *p* and 1 − *p* no longer holds. A solution could be to produce both the PMD and the FMD of main shocks with historical data. Nevertheless, this catalogue is characterized by large uncertainties (magnitude, time of occurrence and space) and is known to overestimate magnitudes compared to the instrumental catalogue (e.g., Figure 2c in [16]). Its use must therefore be carried out with care.

**Figure 13.** Number of aftershocks produced over 100,000 years by using the PMD estimated from the exhaustive historical French catalogue (FCAT-17). (**Left**): Number of main shocks and aftershocks as function of magnitude. (**Right**): Number of aftershocks produced by one main shock according to magnitude of the latter. Red line represents the regression of Equation (13) on these results: *k* = 0.887 and *α* = 0.331.

## **5. Conclusions**

In this paper, a new generator of earthquakes is proposed and applied to mainland France. Applying a stochastic generator of earthquakes to a French context is not new (see, for example, Ref. [48] for a Pyrenean application). However, the method of generating synthetic seismicity is new. Classically, two methods exist to compute a Frequency–Magnitude Distribution (FMD) of main shocks: (i) smoothly, through a kernel approach [11,12], or (ii) discretely, by using a zoning approach [5]. These allow spatial and temporal behaviors of seismicity to be analyzed at the same time, but call for a reduction of the number of data. In intraplate domains, i.e., far from tectonic plate boundaries, data are sparse. Thus, contrary to these classical approaches, the proposed generator uses an FMD of main shocks at the national scale to generate main shocks in time and magnitude, thereby maximizing the number of data available. These main shocks are then spatially distributed through the use of a probability map and regionalization. The former is used to guide the location of main shocks, whereas the latter limits the distribution of large earthquakes in space. Intraplate seismicity seems to be more uniformly positioned than in active regions [14,15], structural inheritance "*can play a strong role in deformation localization*" [16] and stable faults' activity is difficult to define [21–23]. For these reasons, faults, regardless of whether they are supposed to be active or not, are used to produce the probability map.

Aftershocks are then produced by using Båth's law [28,29], the seismic moment ratio [30] and the Proportion–Magnitude Distribution (PMD) of main shocks. This approach defines aftershocks and main shocks differently from the well-known ETAS models. Nevertheless, it remains consistent, since the numbers of main shocks and aftershocks are complementary and depend on data. Moreover, unlike ETAS models, this approach has the advantage of being non-parametric.

Temporal and energetic behaviors of generated main shocks are in line with the inputs (FMD of main shocks and regionalization) and with our knowledge of mainland France seismicity.

However, some improvements can be achieved, such as completing the fault map, in particular with bathymetric faults, and better describing the maximal magnitudes in each region. Moreover, we have seen that using instrumental seismicity to produce the PMD of main shocks is not representative enough. Using the FCAT-17 historical French catalogue could be a solution, but it needs to be carried out with care due to its large uncertainties and its overestimation of magnitudes. Finally, the method proposed in this paper should be applied to other stable continental regions, such as Northwestern European countries, Australia and so on, in order to test its effectiveness.

**Author Contributions:** Conceptualization, P.T., C.G. and F.B.; methodology, C.G., P.T. and F.B.; data curation, C.G. and P.T.; writing—original draft preparation, C.G.; writing—review and editing, C.G., F.B. and P.T. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was performed in the frame of the RING and EXTRA&CO/ICEEL projects at Université de Lorraine. It has been funded by Caisse Centrale de Réassurance (CCR).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Fault lines from CHARM database (BRGM).

**Acknowledgments:** We would like to thank the industrial and academic sponsors of the RING-GOCAD Consortium managed by ASGA for their support, especially CCR. The authors also want to thank the anonymous reviewers for their meaningful reviews that helped considerably to improve this article.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **Abbreviations**

The following abbreviations are used in this manuscript:


## **Appendix A. Parameters of Rupture Plane**

Each region is characterized by rupture-plane parameters defined by ranges of values. These values are defined according to seismic and tectonic data (seismicity, faults and focal mechanisms). In this study, we use the European [49] and World [50] focal mechanism catalogues. Table A1 summarizes these ranges of values for depth, azimuth, dip and movement.

**Table A1.** Upper and lower bounds of the ranges of values defined for depth, azimuth, dip and movement used in this study. Movement: U = unknown/N = normal/S = strike-slip/R = reverse.


Since SHARE regions are large (Figure 4), the ranges of values are wide: up to 25 km for depth, 120° for azimuth and 40° for dip (Table A1). All the movements (normal, strike-slip and reverse) are allowed in the stable continental region due to its wideness. Moreover, every azimuth value can be drawn in this region. Finally, since no focal mechanism is available in the oceanic crust region, all dip (20 to 90°) and azimuth (0 to 359°) values are allowed.

Azimuth can also be described through a pdf of values derived from the fault map (Figure 6a). However, these data give information only on orientation (between 0 and 180°) and not azimuth (between 0 and 360°). We thus propose calculating the pdf of orientation from the fault map. Then, when an orientation *O*° is drawn from one of these pdfs, the azimuth is chosen as equal to *O*° or *O* + 180°. Figure A1 illustrates the pdfs of orientation derived from the fault map for each SHARE region (except the oceanic crust one).
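
A sketch of this draw, assuming the orientation pdf has been discretized into bins with associated weights (names are ours):

```python
import numpy as np

rng = np.random.default_rng(4)

def draw_azimuth(orientations, weights):
    """Draw an azimuth (0-360 degrees) from a pdf of fault orientations
    (0-180 degrees): the drawn orientation O is kept as-is or shifted by
    180 degrees with equal probability."""
    O = rng.choice(orientations, p=weights / weights.sum())
    return O if rng.random() < 0.5 else O + 180.0

# Hypothetical orientation pdf discretized in 10-degree bins.
bins = np.arange(0.0, 180.0, 10.0)
weights = rng.random(bins.size)
print(draw_azimuth(bins, weights))
```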

**Figure A1.** Pdfs of orientation obtained from the fault map for each SHARE region. C. and E. Alps respectively stand for Compressional and Extensional Alps (regions 4 and 6 in Figure 4). No fault is localized in the oceanic crust region in the data we used; thus, this region is not associated with a pdf of orientation.

## **Appendix B. Application of the GR Law in Each SHARE Region**

For reasons explained in the text, only one Frequency–Magnitude Distribution (FMD) of main shocks, computed at the national scale, is used in this paper. However, in order to compare the FMDs obtained in each region from synthetic main shocks (Figure 10) with data, we also apply the GR law regression on the instrumental catalogue for each region.

We first divide the initial instrumental catalogue into six sub-catalogues, one per region. Only earthquakes localized in a given region are listed in the associated sub-catalogue. Then, we explore the completeness of these sub-catalogues through the cumulative visual method [36,37]. The results of this analysis are shown in Table A2.

**Table A2.** Cut-off years obtained from instrumental catalogue in each SHARE region according to the cumulative visual method [36,37]. Region numbers refer to Figure 4.


Once the sub-catalogues are complete, only main shocks are kept by applying the G85 declustering algorithm [40]. Finally, we regress the GR law on the observed frequencies of main shocks as a function of magnitude. The obtained b-values for each region are summarized in Table A3.

**Table A3.** b-values computed by regressing GR law (Equation (10)) on FMD of main shocks observed in instrumental seismicity for each SHARE region. Magnitude's ranges and number of data used for regression are also detailed. Region numbers refer to Figure 4.


One can see that the two extreme b-values, 0.72 and 1.36 (Table A3), are obtained with the lowest numbers of data: 40 and 301, respectively. Thus, these values must be used carefully.

## **References**


## *Article* **The Effect of the Wenchuan and Lushan Earthquakes on the Size Distribution of Earthquakes along the Longmenshan Fault**

**Chun Hui 1,2,3, Changxiu Cheng 1,2,3,4,\*, Shi Shen 1,2,3, Peichao Gao 1,2,3, Jin Chen 1,2, Jing Yang 5 and Min Zhao 1,2,3**



**Citation:** Hui, C.; Cheng, C.; Shen, S.; Gao, P.; Chen, J.; Yang, J.; Zhao, M. The Effect of the Wenchuan and Lushan Earthquakes on the Size Distribution of Earthquakes along the Longmenshan Fault. *Appl. Sci.* **2021**, *11*, 8534. https://doi.org/10.3390/ app11188534

Academic Editors: Ricardo Castedo, Miguel Llorente Isidro and David Moncoulon

Received: 10 August 2021 Accepted: 11 September 2021 Published: 14 September 2021



**Abstract:** Changes in the stress state of faults and their surroundings are a highly plausible mechanism explaining earthquake interaction. These stress changes can impact the seismicity rate and the size distribution of earthquakes. However, the effect of large earthquakes on the earthquake size distribution along the Longmenshan fault has not been quantified. We evaluated the levels of the *b* value for the stable state before and after the large earthquakes on 12 May 2008 (Wenchuan, *M*<sup>S</sup> 8.0) and 20 April 2013 (Lushan, *M*<sup>S</sup> 7.0) along the Longmenshan fault. We found that after the mainshocks, the size distribution of the subsequent earthquakes shifted toward relatively larger events in the Wenchuan aftershock zone (*b* value decreased from 1.21 to 0.84), and generally remained invariable in the Lushan aftershock zone (*b* value remained at 0.76). The time required for the *b* value to return to stable states after both mainshocks was entirely consistent with the time needed by the aftershock depth images to stop visibly changing. The temporal variation of *b* values shows decreasing trends before both large earthquakes. Our results can serve as a reference for assessing the potential seismic risk of the Longmenshan fault.

**Keywords:** *b* value; stable state; depth; Longmenshan fault

## **1. Introduction**

Following the Wenchuan *M<sup>S</sup>* 8.0 earthquake on 12 May 2008, the Longmenshan fault zone was struck by the 20 April 2013 *M<sup>S</sup>* 7.0 Lushan earthquake. The Longmenshan fault zone is composed of several almost parallel thrust faults, forming a boundary fault between the Sichuan Basin and Tibetan Plateau, and controls the seismicity of the Longmenshan region (Figure 1). The epicenters of the Wenchuan and Lushan earthquakes were approximately 87 km apart, and the focal mechanism of both events showed a thrust rupture [1].

According to the characteristics of the Wenchuan and Lushan earthquakes, whether the *M<sup>S</sup>* 7.0 Lushan event was a strong aftershock of the *M<sup>S</sup>* 8.0 Wenchuan earthquake or a new and independent event has been a topic of debate. For example, some researchers suggest that the two large earthquakes were independent events. The reasons are as follows: (1) there is no overlapping area between the Wenchuan and Lushan earthquake rupture zones [2]; (2) the Wenchuan and Lushan earthquakes were generated on different faults in the Longmenshan fault zone [3]; (3) the rupture processes of the Wenchuan and Lushan earthquakes were different, and the aftershock zones of the two events were nearly 45 km apart [4]. Alternatively, some scientists propose that the Wenchuan and Lushan earthquakes were a mainshock–aftershock sequence and note that the Lushan event struck in an area where Coulomb stress was increased due to the Wenchuan earthquake [5,6].

**Figure 1. The topographic and tectonic map of the Longmenshan fault zone and its surrounding region.** Blue beach ball represents the focal mechanism of the Wenchuan *M<sup>S</sup>* 8.0 earthquake. Pink beach ball represents the focal mechanism of the Lushan *M<sup>S</sup>* 7.0 earthquake. Red circles represent epicenters of earthquakes (*M<sup>S</sup>* ≥ 4) from 1 January 2000 to 1 January 2019.

The controversy over the relationship between the Wenchuan and Lushan earthquakes highlights the complexity of earthquake interaction in the Longmenshan fault zone. It is widely accepted that earthquake interactions can be understood by identifying changes in static and dynamic stress states around faults [7–9]. The most observable effect of this stress change is a significant increase in seismicity rate, which is generally considered an aftershock phenomenon [10–12]. Statistically, aftershock activity is classically described by *n*(*t*) = *K*/(*t* + *c*)<sup>*p*</sup>, where *n*(*t*) is the number of aftershocks after time *t* and *K*, *c*, and *p* are constants that describe the aftershock productivity [13]. Ogata [14,15] described aftershock activity as a multigenerational branching process and proposed the epidemic-type aftershock sequence model, which is a stochastic point process model of self-exciting point processes.

Changes in stress can impact the seismicity rate and the frequency size distribution, which is alternatively known as the frequency-magnitude distribution (FMD) [16] or Gutenberg–Richter (G–R) law [17] and is expressed as log *N* = *a* − *bM*, where *N* is the number of events in a given time period with magnitude greater than *M*, *a* describes the seismicity of a given seismogenic volume, and *b* is the slope of the FMD. Previous studies showed that *b* values fall within the range of 1.02 ± 0.03 on a large scale for a long time [18,19]. For regions on a smaller scale, the *b* values show a broad range of spatial and temporal variations. For example, the *b* value ranged from 0.5 to 2.5 in the Andaman–Sumatra region and California [20]. Interpretation of the variation of *b* values is based on several factors, including stress state [21,22], focal depth [23], faulting style [24,25], fluid pressure [26], and so on.

The earthquake size distribution generally follows a power law, with a slope of *b* values, which characterizes the relative occurrence of large and small events. A low *b* value indicates a larger proportion of large earthquakes and vice versa. Zhao et al. [27] (2008) compared the spatial footprint of *b* values before and after the Wenchuan earthquake in the Longmenshan fault zone, and the results showed that the *b* values tend to change from lower in the southern region to higher in the northeastern region. The temporal change in *b* values before the Wenchuan *M<sup>S</sup>* 8.0 earthquake showed a decreasing trend [28–30], and the Lushan *M<sup>S</sup>* 7.0 earthquake showed similar temporal change trends in *b* values [31]. These studies focused on the variation trend of *b* values before and after the two large events along the Longmenshan fault. However, the fundamental effect of the Wenchuan and Lushan earthquakes on the size distribution of earthquakes along the Longmenshan fault has not been quantified, which limits our understanding of how the apparent stress changes in the region affect the size distribution of earthquakes.

In this study, we evaluated the spatiotemporal evolution of the *b* values along the Longmenshan fault in the past nearly 20 years. Moreover, we estimated the levels of the *b* value for the stable state before and after the Wenchuan and Lushan earthquakes and quantified the effects of the two large earthquakes on the size distribution of subsequent events at different times. In addition, the spatial evolution process of the deep seismogenic environment in the Wenchuan and Lushan aftershock zones in two and three dimensions was illustrated via spatial scanning and data fitting, which can be used to analyze the aftershock activity of the two large earthquakes.

## **2. Data and Postulates**

The earthquake catalog used here covers the Longmenshan fault during the period from 1 January 2000 to 1 January 2019; it was documented by the regional seismic network and then verified by the China Earthquake Networks Center (CENC). It is a relatively complete catalog containing the Wenchuan-Lushan earthquake sequence.

Figure 2 shows the magnitude-time distribution of earthquakes in the Wenchuan source region and the Lushan source region. The locations of the earthquakes in this catalog were corrected for accuracy. Each event includes the time, location, magnitude, and depth. The homogeneity of the catalog was iteratively checked and optimized for subsequent research and analysis.

**Figure 2. Magnitude versus time for events in the Longmenshan fault zone from 2000 to 2019.** (**a**) Magnitude versus time for events in the *M<sup>S</sup>* 8.0 Wenchuan source region; (**b**) magnitude versus time for events in the *M<sup>S</sup>* 7.0 Lushan source region.

Stress change has been widely used to interpret the triggering of mainshock–mainshock and mainshock–aftershock events [7,32–36]. However, the exact measurement of stress states is difficult; thus, a relationship between the stress state and the *b* value has been proposed [16,24,37,38]. Schorlemmer et al. [24] demonstrated that the *b* value can be regarded as a stress indicator that depends inversely on differential stress. Therefore, changes in the stress state on faults lead to variations in *b* values, followed by time-dependent recovery. In general, the larger the magnitude of the earthquake, the greater the stress changes and the larger the *b* value fluctuations. No event of magnitude larger than *M<sup>S</sup>* 7.0 had been reported in the historical record of the Longmenshan fault zone, and the catalog we used in this paper contains more than 80,000 events and only two large earthquakes greater than magnitude 6.5, that is, the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes. Thus, these two events are mainly responsible for the apparent changes in stress state along the Longmenshan fault zone over the past 20 years.

Therefore, there are two postulates: first, without a large earthquake perturbation, the *b* value will remain in a stable state with a small fluctuation range for a long time; second, after the perturbation of a large earthquake, the *b* value may recover to another stable state. We evaluated the levels of the *b* value for the stable state before and after these two large earthquakes in the study region.

## **3. Methods**

The main methods we used include estimation methods (MaxCurvature for the estimation of the completeness magnitude; maximum likelihood estimation (MLE) for the *b* value estimation), test methods (the Akaike information criterion (AIC) for the variation trend of *b* values; the nonlinearity index (NLIndex) for the linearity assessment of the frequency–magnitude distribution) and kriging interpolation to describe the spatial and temporal evolution images of aftershock focal depth.

## *3.1. Completeness Magnitude (MC) and b Value Estimation*

The estimation of the completeness of earthquake catalogs is essential to the computation of *b* values, and the lowest magnitude of all earthquakes that are reliably detected in a space-time volume is defined as the completeness magnitude (*MC*) [39]. The lower the *MC*, the higher the detection capability. Here, we use the MaxCurvature technique, which estimates the *M<sup>C</sup>* by locating the magnitude bin with the highest frequency of events in the FMD. Mignan [40] showed that the MaxCurvature technique underestimates the *M<sup>C</sup>* in cases involving gradually curved FMDs and postulated that this underestimation tendency arises from spatiotemporal heterogeneities within the earthquake monitoring network. Therefore, we used the corrected MaxCurvature method with a correction factor of +0.2 [41], and the uncertainties were determined by bootstrapping.
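
A minimal sketch of the corrected MaxCurvature estimate, on synthetic magnitudes whose FMD peaks at the lowest bin:

```python
import numpy as np

def max_curvature_mc(mags, bin_width=0.1, correction=0.2):
    """Completeness magnitude M_C: the magnitude bin with the highest event
    count in the non-cumulative FMD, plus the +0.2 correction factor [41]."""
    bins = np.arange(mags.min(), mags.max() + bin_width, bin_width)
    counts, edges = np.histogram(mags, bins=bins)
    return edges[np.argmax(counts)] + correction

# Synthetic magnitudes: GR-distributed above 1.3, so the FMD peaks at 1.3.
rng = np.random.default_rng(5)
mags = np.round(1.3 + rng.exponential(1.0 / np.log(10.0), size=20_000), 1)
print(max_curvature_mc(mags))  # ~1.5
```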

The least-squares method and maximum likelihood estimation are often used to calculate the *b* value, and the latter approach is considered more stable. In this work, maximum likelihood estimation was used to calculate the *b* value and its standard deviation [42,43]:

$$b = \frac{1}{\ln(10)(\overline{M} - M\_{\mathbb{C}})} \tag{1}$$

where *M̄* is the average magnitude of earthquakes with *M* ≥ *MC* and *M<sup>C</sup>* is the cutoff magnitude. The confidence limit of the *b* value is expressed as follows:

$$
\sigma = \frac{b}{\sqrt{N}} \tag{2}
$$

where *N* is the number of earthquake cases of the given sample.
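
A sketch of Equations (1) and (2), checked on synthetic GR-distributed magnitudes with a true *b* value of 1.0:

```python
import numpy as np

def b_value_mle(mags, Mc):
    """Maximum likelihood b value (Equation (1)) and its confidence limit
    (Equation (2)) for events with M >= Mc."""
    m = np.asarray(mags)
    m = m[m >= Mc]
    b = 1.0 / (np.log(10.0) * (m.mean() - Mc))
    sigma = b / np.sqrt(m.size)
    return b, sigma

rng = np.random.default_rng(6)
mags = 1.5 + rng.exponential(1.0 / np.log(10.0), size=5_000)  # true b = 1.0
b, sigma = b_value_mle(mags, Mc=1.5)
print(f"b = {b:.2f} +/- {sigma:.3f}")
```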

## *3.2. Estimation of the Frequency-Magnitude Distribution (FMD) Extrapolation*

The nonlinearity index (NLIndex) can be used to assess whether the extrapolation of a given high-magnitude FMD is likely an overestimate or underestimate of the probable rates for large events [44].


If NLIndex ≤ 1, the FMD is regarded as linear; if NLIndex > 1, it is not, and a clearly positive or negative slope at *M<sub>cut</sub>* indicates, respectively, that the FMD overestimates or underestimates the rates of large-*M* events.

## *3.3. Akaike Information Criterion*

To quantify the changing trend of *b* values, the *P* test was conducted for *b* values in two sample windows based on the Akaike information criterion (AIC) [45]. Hypothesis 1: the *b* values in the two sample windows are the same; Hypothesis 2: the *b* values in the two sample windows are different, represented as *b<sup>1</sup>* and *b<sup>2</sup>*, respectively. Comparing the AIC values of the two hypotheses leads to the difference ∆*AIC* [46]:

$$\Delta AIC = -2(N\_1 + N\_2)\ln(N\_1 + N\_2) + 2N\_1\ln(N\_1 + \frac{N\_2b\_1}{b\_2}) + 2N\_2\ln(N\_2 + \frac{N\_1b\_2}{b\_1}) - 2 \tag{3}$$

where *N<sup>i</sup>* is the number of events in each sample window and *b<sup>i</sup>* is the *b* value in each window. *P<sup>b</sup>* represents the probability that the events in the two sample windows come from the same population and can be derived from the AIC as follows:

$$P\_{\mathfrak{b}} = e^{\left(-\Delta A I \mathbb{C}/\mathfrak{2}\right) - \mathfrak{2}} \tag{4}$$

The *b* value in the sample window represents a significant change when ∆*AIC* ≥ 2 (*P<sup>b</sup>* ≈ 0.05) and is highly significant when ∆*AIC* > 5 (*P<sup>b</sup>* ≈ 0.01) [47].
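
A sketch of the test (Equations (3) and (4)), applied here to the stable-state *b* values reported below for the Wenchuan zone (the window sizes are illustrative):

```python
import numpy as np

def delta_aic(N1, b1, N2, b2):
    """Difference of AIC between the single-b and two-b hypotheses (Equation (3))."""
    return (-2.0 * (N1 + N2) * np.log(N1 + N2)
            + 2.0 * N1 * np.log(N1 + N2 * b1 / b2)
            + 2.0 * N2 * np.log(N2 + N1 * b2 / b1)
            - 2.0)

def p_b(d_aic):
    """Probability that both windows share the same b value (Equation (4))."""
    return np.exp(-d_aic / 2.0 - 2.0)

d = delta_aic(N1=500, b1=1.21, N2=500, b2=0.84)
print(f"dAIC = {d:.1f}, Pb = {p_b(d):.1e}")  # dAIC >> 5: highly significant
```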

## *3.4. Kriging Interpolation*

Kriging interpolation is the most commonly used geostatistical approach for spatial interpolation. With this method, a semivariogram is used to express the spatial relationship as a function of the distance between samples. This technique depends on the spatial model between samples to predict attribute values at unsampled locations [48]. As a widely used interpolation method, kriging takes into account the distance between unknown positions and the sample locations as well as the distance between sample locations, effectively reducing the interference of clustering in samples on the accuracy of the interpolated estimates [49]. We used the kriging interpolation algorithm to produce maps incorporating anisotropy and underlying trends from irregularly spaced data. The exponential semivariance model, which had the smallest prediction errors, was chosen over the Gaussian and spherical models for the spatial interpolation of focal depth data.
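
As an illustration, ordinary kriging with an exponential semivariogram can be run with the third-party PyKrige package; this is our choice for the sketch, not necessarily the tool used in this study, and the coordinates and depths are placeholders:

```python
import numpy as np
from pykrige.ok import OrdinaryKriging  # third-party: pip install pykrige

# Placeholder event coordinates (degrees) and focal depths (km).
rng = np.random.default_rng(7)
lons = rng.uniform(102.0, 106.0, size=200)
lats = rng.uniform(30.0, 33.0, size=200)
depths = 10.0 + 5.0 * rng.standard_normal(200)

# Exponential semivariogram model, as chosen in this study for its smaller
# prediction errors compared with the Gaussian and spherical models.
ok = OrdinaryKriging(lons, lats, depths, variogram_model="exponential")
grid_lon = np.arange(102.0, 106.0, 0.1)  # 0.1-degree scanning step
grid_lat = np.arange(30.0, 33.0, 0.1)
depth_grid, variance = ok.execute("grid", grid_lon, grid_lat)
```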

## **4. Results and Analysis**

## *4.1. Completeness Magnitude (MC) and Linearity Assessment of Frequency-Magnitude Distribution (FMD)*

As shown in Figure 3, the results of the corrected MaxCurvature method show that the *M<sup>C</sup>* of the earthquake catalog used in this work is *M<sup>C</sup>* = 1.5. This result is consistent with previous studies of the Wenchuan earthquake zone [30,50]. Fang et al. [51] described in detail the aftershock performance and analysis of the Lushan earthquake based on combined data from permanent and temporary seismic stations. They concluded that the minimum complete magnitude was *M* = 1.0. To keep the *M<sup>C</sup>* of the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes in the Longmenshan fault zone consistent, we selected events with magnitudes of *M* ≥ *M<sup>C</sup>* = 1.5.

We performed a linearity check on the FMD, and the results are shown in Figure 4. The NLIndex (red) is shown for different cutoff magnitudes (upper inset), and NLIndex ≤ 1 for all cutoff magnitudes; thus, the FMD is accepted as linear at the chosen *MC*.

**Figure 3.** Frequency-magnitude distribution of the seismicity of the Wenchuan-Lushan sequence from 1 January 2000 to 1 January 2019.

**Figure 4.** Frequency-magnitude distribution and NLIndex.

## *4.2. Time-Space Analysis of b Values*

Earthquake frequency increases immediately within a short time after a large event and may exceed the recording capacity of the seismic network. Before establishing the time-space series of *b* values with aftershocks, we should eliminate the events documented in the early catalog, which is somewhat heterogeneous and incomplete for small events [41]. In this work, the exclusion period depends on the magnitude of completeness over time. Therefore, we first removed the events documented in the initial catalog within two months after the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes, a period for which the data are highly incomplete. Then, we calculated the spatiotemporal distributions of *b* values before and after the two large events that occurred from 2000–2019 along the Longmenshan fault by selecting events with *M* ≥ *M<sup>C</sup>* = 1.5 and using a time window and spatial grid to calculate the *b* values. In this computation, the window lengths were set to at least 500 events in the Wenchuan aftershock zone and 200 events in the Lushan aftershock zone. Each window was moved forward by one event at a time.

Figure 5a,b display the time series of the *b* value in the source regions, and the overall change trend conforms to our postulate that *b* values undergo relatively significant changes in a period of time before and after a large earthquake. Specifically, *b* values show a decreasing trend before the occurrence of both large earthquakes in both zones. To ensure that this trend is statistically significant, we quantitatively assessed the temporal variation in *b* values using the *P* parameter test and selected three windows before the *M<sup>S</sup>* 8.0 event (*W*1, *W*2, and *W*3) and the *M<sup>S</sup>* 7.0 event (*L*1, *L*2, and *L*3). Window selection was based on the significance of changes in *b* values (Figure 5a,b). The results are shown in Table 1. The *b* value in a sample window has significantly changed when ∆AIC ≥ 2 (*P<sup>b</sup>* ≈ 0.05). Table 1 shows that the *b* values decreased before both large earthquakes with statistically significant variations.

After the Wenchuan *M<sup>S</sup>* 8.0 event, the *b* values in the Wenchuan aftershock zone experienced a period of dramatic fluctuation (indicated by the pink shading lasting not more than one year) before gradually stabilizing within a small fluctuation range (Figure 5a), which was similar to the range of *b* values in the third period (Figure 5c, *b* = 0.84). After the Lushan *M<sup>S</sup>* 7.0 event, the *b* values in the Lushan aftershock zones increased rapidly and then slowly dropped to a stable state (Figure 5b), which was similar to the FMD in the first period (Figure 5d, *b* = 0.76). As shown in Figure 5b (red shading), the *b* value required less than ten months to return to a stable state.

The reference *b* values can be estimated for the background levels (for the pre-mainshock period, *b* = 1.21 in the Wenchuan aftershock zone and *b* = 0.76 in the Lushan aftershock zone). When the perturbation effect of the mainshocks gradually decreases, the *b* values in the Lushan aftershock zone eventually return to the background level (*b* = 0.76), whereas those in the Wenchuan aftershock zone drop below the background level (from 1.21 to 0.84). To date, there have been no earthquakes that have significantly changed the stability of the Longmenshan fault zone since the *M<sup>S</sup>* 7.0 Lushan earthquake.

The temporal distribution of earthquakes also indicates the change in the stress state of faults and their surroundings. As shown in Figure 2a, before the *M<sup>S</sup>* 8.0 Wenchuan earthquake, the frequency of events in the Wenchuan aftershock zone had been decreasing since 2006, and only two events greater than *M<sup>S</sup>* 4 occurred during the period when *b* values were significantly decreasing (Figure 5a). However, earthquakes greater than *M<sup>S</sup>* 4 struck the entire Wenchuan aftershock zone after the *M<sup>S</sup>* 8.0 event (Figure 1). In addition, there were no strong aftershocks above magnitude 6.5 along the faults, and only six events greater than *M<sup>S</sup>* 4 occurred within two months after the mainshock. These phenomena indicate that, without the continuous perturbation of strong aftershocks, the *b* value gradually stabilized to the state mainly determined by the background earthquakes, which tended to shift to larger events following the *M<sup>S</sup>* 8.0 event in the Wenchuan aftershock zone.

After the *M<sup>S</sup>* 8.0 Wenchuan earthquake, the Lushan aftershock zone also experienced a "seismic quiescence" of approximately two years, and only one event greater than *M<sup>S</sup>* 4 occurred before the *M<sup>S</sup>* 7.0 Lushan earthquake during the period when *b* values were significantly decreasing (Figure 2b). Moreover, only a few events greater than *M<sup>S</sup>* 4 occurred within two months after the mainshock; the subsequent events basically returned to the magnitude of the background earthquakes before the mainshock, which shows that the *b* values in the Lushan aftershock zone eventually returned to the background level (Figure 5b).

To analyze the spatial footprints of the changes in the *b* values, we divided the two study regions into 0.1° × 0.1° grids, sampled the 300 events nearest to each grid node within a radius of 30 km, and re-estimated the *M<sup>C</sup>* at each node. For this purpose, we used a bootstrap approach, randomly resampling the events 1000 times. The spatial footprints of the changes in the *b* values are consistent with the FMDs (Figure 5c,d). Figure 5e–g demonstrate the spatial variation in the *b* values throughout the Wenchuan aftershock zone, and Figure 5h–j show the spatial variation in the *b* values in the Lushan aftershock zone.
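A hedged sketch of this node-sampling and bootstrap procedure, reusing `b_value_ml` from the previous sketch, might look as follows; the flat-earth distance approximation, the minimum sample size, and the random seed are our own illustrative choices.

```python
import numpy as np

def node_b_value(lons, lats, mags, node_lon, node_lat, mc,
                 n_nearest=300, r_max_km=30.0, n_boot=1000, seed=0):
    """b value at one grid node: take the n_nearest events within
    r_max_km, then bootstrap-resample them to estimate the b-value
    uncertainty. All inputs are NumPy arrays; a flat-earth distance
    is used, which is adequate at this spatial scale."""
    rng = np.random.default_rng(seed)
    dx = (lons - node_lon) * 111.2 * np.cos(np.radians(node_lat))
    dy = (lats - node_lat) * 111.2
    dist = np.hypot(dx, dy)
    idx = np.argsort(dist)[:n_nearest]
    idx = idx[dist[idx] <= r_max_km]
    if len(idx) < 50:                       # skip poorly sampled nodes
        return np.nan, np.nan
    sample = mags[idx]
    boots = [b_value_ml(rng.choice(sample, size=len(sample)), mc)[0]
             for _ in range(n_boot)]
    return float(np.mean(boots)), float(np.std(boots))
```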

**Figure 5.** Temporal and spatial analysis of the *b* values for the Wenchuan-Lushan sequence. (**a**,**b**) Temporal variation of *b* values for the Wenchuan and Lushan source regions. The dashed black lines represent the times of the *M<sup>S</sup>* 8.0 (Wenchuan) and *M<sup>S</sup>* 7.0 (Lushan) events, and the dashed blue lines show the background *b* values for the Wenchuan and Lushan source regions. The dashed red line represents the *b* value after the time of the *M<sup>S</sup>* 7.0 event (Lushan) for the Wenchuan source region. The shaded regions represent the uncertainties in the *b* values. (**c**,**d**) FMDs for the two aftershock zones in three different periods. (**e**–**g**) Map showing the spatial footprint of *b* values for the Wenchuan aftershock zone in three different periods. (**h**–**j**) Map showing the spatial footprint of *b* values for the Lushan aftershock zone in three different periods.


**Table 1.** Results of the *P* parameter test between windows.

Before the *M<sup>S</sup>* 8.0 Wenchuan event, the *b* values in the southern part of the Wenchuan source region and in the Lushan source region were lower than those in the northern part of the Wenchuan aftershock zone (Figure 5e,h). This pattern is consistent with the characteristics of the Longmenshan fault, which is a strike-slip fault in the northern section and a thrust fault in the southern section. It is generally considered that the *b* value is inversely proportional to stress, and the *b* values of different types of faults are as follows: *b* (normal) > 1, *b* (strike-slip) ~1, and *b* (thrust) < 1 [24,25]. The conditions changed markedly after the *M<sup>S</sup>* 8.0 Wenchuan event; the *b* values decreased in both the Wenchuan source region and the Lushan source region (Figure 5f,i). This finding illustrates the effect of the Wenchuan earthquake on the stress change along the Longmenshan fault. Figure 5g,j illustrate the stable state of the *b* value in the Longmenshan fault zone.

## *4.3. Evolution of Images of Aftershock Activity Depicted by Focal Depth*

Spatial scanning was performed using the events with depth data in the catalog, with a step size of 0.1° in both longitude and latitude. For all earthquakes in each 0.1° × 0.1° grid cell, the average depth was used as the depth value of the grid point, and kriging interpolation was then applied to all the grids. We required at least ten events in each grid node in the Wenchuan aftershock zone (and at least five events in the Lushan aftershock zone) to prevent the average depth of a grid point from being biased by too few events. The contour lines of the depth distribution at different periods after the mainshock were obtained and superimposed on the regional faults (Figures 6 and 7).
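A minimal version of this depth-gridding step, before the kriging interpolation (which could be delegated to an external library such as PyKrige), might look like the sketch below; the function name and the NaN masking rule beyond the stated minimum counts are our own choices.

```python
import numpy as np

def mean_depth_grid(lons, lats, depths, lon_nodes, lat_nodes,
                    cell=0.1, min_events=10):
    """Average focal depth per 0.1-degree cell; cells with fewer than
    min_events are left as NaN so they are masked before interpolation."""
    zmap = np.full((len(lat_nodes), len(lon_nodes)), np.nan)
    for i, la in enumerate(lat_nodes):
        for j, lo in enumerate(lon_nodes):
            inside = ((np.abs(lons - lo) <= cell / 2) &
                      (np.abs(lats - la) <= cell / 2))
            if inside.sum() >= min_events:
                zmap[i, j] = depths[inside].mean()
    return zmap
```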

To illustrate the evolution of the deep seismogenic environment in the Wenchuan aftershock zone, Figure 6 shows the spatial evolution of the aftershock depth at one day, one week, one month, six months, one year, and three years after the mainshock. The analysis shows that the focal depths in the Wenchuan aftershock zone spread along the strike of the fault. With Mianyang as the boundary, the aftershock zone can be divided into a southern section and a northern section. The depth distribution is generally deep in the southeast and shallow in the northwest, and the average aftershock focal depth is 10–15 km.

A comparison of the aftershock activity images depicted by the focal depth information in Figure 6d,e revealed that the pattern formed one year after the mainshock did not change significantly in the following two years. This finding indicates that the aftershock activity tended to be stable one year after the mainshock, which means that the active aftershock period of the Wenchuan *M<sup>S</sup>* 8.0 earthquake was less than one year.

Figure 7 shows the spatial evolution of aftershock depths at one day, one week, five months, ten months, one year, and three years after the mainshock in the Lushan aftershock zone. The analysis shows that the focal depths of the Lushan aftershocks are distributed around the fault, which acts as a boundary, with a trend of deep in the southeast and shallow in the northwest. Moreover, a comparison of the regional aftershock activity images depicted by the focal depth information in Figure 7d,e revealed that the pattern of the image formed ten months after the mainshock presented limited changes in the following year, which indicates that the aftershock frequency tended to be stable ten months after the Lushan mainshock. Therefore, the active aftershock period of the Lushan *M<sup>S</sup>* 7.0 earthquake was less than ten months.

**Figure 6.** Spatial distribution of the focal depths of the aftershocks following the Wenchuan *M<sup>S</sup>* 8.0 earthquake on the fault plane. Red lines represent the locations of faults. (**a**) One day after the mainshock; (**b**) 1 week after the mainshock; (**c**) 1 month after the mainshock; (**d**) 1 year after the mainshock; and (**e**) 3 years after the mainshock.

**Figure 7.** Spatial distribution of the focal depths of the aftershocks following the Lushan *M<sup>S</sup>* 7.0 earthquake on the fault plane. Red lines represent the locations of faults. (**a**) One day after the mainshock; (**b**) 1 month after the mainshock; (**c**) 5 months after the mainshock; (**d**) 10 months after the mainshock; and (**e**) 2 years after the mainshock.

## **5. Discussion**

Earthquake interaction can change the stress state of faults, which is reflected in both earthquake activity rate and earthquake size distribution. The relationship between stress state and *b* value can be used to quantify the effect of large earthquakes that can significantly change the stress states and, therefore, the earthquake size distribution. In contrast to previous studies on the Wenchuan and Lushan earthquakes, our paper focuses on a quantitative analysis and discussion of the effect of the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes on the size distribution of the earthquakes along the Longmenshan fault at different times.

Interpretation of the *b* value and its variability according to physical mechanisms has received considerable attention and discussion. In most cases, the observed spatial and temporal *b* value variability can be caused by several factors: (i) Process of estimation: the homogeneity of the catalog and the method of calculation can affect the results. All the data in this work are from the Longmenshan fault and its surroundings, and each event includes the time, location, magnitude, and depth. The maximum likelihood estimation and least-squares regression methods [30,41–43,52–55] are used to estimate the *b* value and its uncertainty, but the latter is excessively affected by the largest earthquake magnitude. Marzocchi et al. [56] measured the bias in *b* values caused by magnitude binning and catalog incompleteness when the *b* value is estimated by maximum likelihood estimation and provided guidance to reduce the likelihood of being misled by *b* value variation. (ii) Stress conditions: the *b* value and its variation represent stress buildup and release. An inverse dependence of the *b* value on differential stress has been observed in laboratory experiments [57,58] as well as in the field [59,60]. The stress acting on a fault may control the variation in the *b* value in space and time. Parsons et al. [61] calculated the regional Coulomb stress changes on major faults surrounding the rupture of the Wenchuan *M<sup>S</sup>* 8.0 event and showed a significant stress increase in the Lushan aftershock zone. Other studies obtained similar results [5,6,62]. The *b* values throughout the Lushan aftershock zone decreased after the *M<sup>S</sup>* 8.0 mainshock (Figure 5i), and the same effect occurred in the southern part of the Wenchuan aftershock zone after the *M<sup>S</sup>* 7.0 event (Figure 5g). These findings suggest that the *b* value is negatively correlated with stress and indicate the effect of earthquake interaction along the Longmenshan fault zone. (iii) Crustal tectonics: the variation in the *b* value can be interpreted according to tectonic characteristics, i.e., rock heterogeneity [63], focal depth [23], pore pressure [26], and fault type [24,25]. Previous studies have shown that the *b* values of different types of faults are *b* (normal) > 1, *b* (strike-slip) ~1, and *b* (thrust) < 1 [26,64]. As shown in the spatial footprints in Figure 5, the *b* value in the southern part of the Longmenshan fault zone is lower than that in the northern part. This pattern is consistent with the tectonic characteristics of the Longmenshan fault, which is a strike-slip fault in the north and a thrust fault in the south [65,66].

Stress changes seem to be a key factor that affects the *b* value and its variation. Except for the estimation procedure, all other factors are secondary because they are directly or indirectly affected by the stress [67]. Therefore, the observed falls in the *b* values shown in Figure 5a,b were interpreted as changes in the related stress conditions, which could be precursors to large earthquakes. However, these temporal variations may occur over timescales ranging from months to years, and the timeliness and effectiveness of this variability as an indicator are difficult to guarantee. Additionally, there is usually an insufficient number of events to accurately calculate the *b* value before large earthquakes. Gulia and Wiemer [41] pointed out that the period following a moderate earthquake is rich in such data, with thousands of events occurring within a short period. These events may allow real-time monitoring of the evolution of *b* values. The authors claim that the probability of a larger earthquake following a moderate earthquake increases by several orders of magnitude if the *b* value remains the same or drops significantly rather than increases. However, Brodsky [68] suggested that the observed pattern of changes in *b* values is a statistical effect rather than deterministic and that researchers need more cases to test this claim.

In general, thousands of aftershocks occur in the period following a large earthquake. Based on these abundant data, two commonly used operational aftershock forecasting models are applied in aftershock hazard assessment, namely the short-term earthquake probability (STEP) model [69] and the epidemic-type aftershock sequence (ETAS) model [14,15]. Gulia et al. [16] reported that these models forecast a high probability of a repeat of the mainshock rupture and thus substantially overestimate the aftershock hazard. This paradox can be resolved by taking into account the stress changes and their effect on the earthquake size distribution.

Our results showed that the time series of the *b* value in the Longmenshan fault zone after the mainshocks exhibited a period of significant fluctuation before returning to a stable state in both the Wenchuan aftershock zone (one year) and the Lushan aftershock zone (ten months). Figures 6 and 7 show that the time required for the *b* values to return to a stable state after both mainshocks was entirely consistent with the time required for the aftershock depth images to cease changing visibly. The spatial footprints of the changes in the *b* value reveal that the *b* values in the southern part of the Longmenshan fault zone are lower than those in the northern part. This finding demonstrates that the Wenchuan and Lushan events did not change the pattern of higher stress in the southern part of the Longmenshan fault zone than in the northern part. However, the most obvious change is that after the mainshock, the size distribution of the subsequent earthquakes in the Wenchuan source region shifted toward relatively larger events (lower *b* values).

The Longmenshan fault zone began to develop in the Late Triassic, and severe tectonic deformation occurred during the Indo-China and Himalayan movements, forming a combination of thrust and strike-slip displacement [66,70]. Previous studies did not comprehensively quantify the stable state of the Longmenshan fault zone before and after the two large events over a long time series [5,6,27,29,30,55,71]. The *b* value is a measurable indicator of the earthquake size distribution within a specified region and period of time and is dependent on stress. Here, we reported the temporal and spatial variation in *b* values before and after the two large earthquakes and fitted the source depths in time and space to quantify the stress changes and their effect on the earthquake size distribution in the Longmenshan fault zone.

## **6. Conclusions**

Based on the tectonic characteristics and potential seismicity surrounding the aftershock zones of the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes, we studied the spatial and temporal variation of *b* values in two source regions from 2000 to 2019. In addition, the spatial evolution process of the deep seismogenic environment in the Wenchuan and Lushan aftershock zones was drawn by spatial scanning and depth data fitting.

The results depict decreasing trends in the *b* values before the two large earthquakes in the study region. Additionally, the *b* value in the Wenchuan aftershock zone took approximately one year to enter a new stable state (*b* values falling from 1.21 to 0.84), while the *b* value in the Lushan aftershock zone took approximately ten months to return to its original stable state (*b* = 0.76). Moreover, the major aftershock active periods of the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes were less than one year and ten months, respectively, which is consistent with the time required for the *b* value to return to a stable state. The spatial footprints of the changes in the *b* values reveal that the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 events did not change the pattern of high *b* values in the north and low *b* values in the south along the Longmenshan fault zone.

We quantified the effect of the Wenchuan *M<sup>S</sup>* 8.0 and Lushan *M<sup>S</sup>* 7.0 earthquakes on the size distribution of earthquakes along the Longmenshan fault. Future studies can focus on quantifying the effect of large earthquakes on the earthquake size distribution across different tectonic regimes and on applying the findings to potential seismic risk assessment.

**Author Contributions:** Conceptualization, C.H., C.C. and P.G.; Data curation, P.G. and J.Y.; Formal analysis, C.H.; Funding acquisition, S.S.; Investigation, S.S.; Methodology, S.S. and J.C.; Resources, C.C.; Software, J.C., J.Y. and M.Z.; Supervision, P.G.; Validation, P.G. and M.Z.; Visualization, S.S. and M.Z.; Writing–Original draft, C.H.; Writing–Review & editing, C.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (Grant No. 41771537) and the Fundamental Research Funds for the Central Universities.

**Data Availability Statement:** Datasets for this research are available in these in-text data citation references: Shi et al. (2018), [http://doi.org/10.6038/cjg2018M0024], Li et al. (2018), [http://doi.org/ 10.6038/cjg2018M0129] and China Earthquake Networks Center (CENC), [http://www.cenc.ac.cn/].

**Acknowledgments:** The authors thank American Journal Experts (AJE) for polishing the language of this article.

**Conflicts of Interest:** No conflicts of interest exist in the submission of this manuscript, and the manuscript is approved by all authors for publication.

## **References**


## *Article* **Liquefaction Potential and Vs30 Structure in the Middle-Chelif Basin, Northwestern Algeria, by Ambient Vibration Data Inversion**

**Abdelouahab Issaadi 1,2,\*, Ahmed Saadi 1,2, Fethi Semmane <sup>1</sup> , Abdelkrim Yelles-Chaouche <sup>1</sup> and Juan José Galiana-Merino 2,3**


**Abstract:** The Middle-Chelif basin, in northwestern Algeria, is located in a seismically active region. In its western part lies the El-Asnam fault, a thrust fault responsible for several strong earthquakes, the most important being the El-Asnam earthquake (*Ms* = 7.3) of 1980. In the present study, ambient vibration data with single-station and array techniques were used to investigate the dynamic properties of the ground and to estimate the Vs30 structure in the main cities of the basin. Soil resonance frequencies vary from 1.2 to 8.3 Hz, with a maximum amplitude of 8.7. Collapsing behavior has also been demonstrated west of the city of El-Attaf, reflecting a strong potential for liquefaction. A Vs30 variation map and a soil classification for each city were obtained mainly by inversion of the HVSR and Rayleigh wave dispersion curves. Finally, an empirical prediction law of Vs30 for the Middle-Chelif basin was proposed.

**Keywords:** Middle-Chelif Basin; ambient vibrations; HVSR; array techniques; Vs30; site classification; liquefaction

## **1. Introduction**

Northern Algeria is located in the collision zone between the African and Eurasian plates. It is characterized by moderate to high seismic activity (e.g., [1–3]), mainly concentrated in the marginal areas of the Neogene basins [4]. The Chelif Basin is located in the northwestern part of Algeria (Figure 1). It is the largest of the northern Neogene sedimentary basins and hosts an important seismic activity. The basin is mainly affected by NE-SW oriented reverse faults [4,5]. The most important is the El-Asnam fault, a reverse fault about 40 km long [6], that generated several destructive earthquakes during the last century, such as the 1934 Carnot earthquake, now El-Abadia, (*M<sup>S</sup>* = 5.1, [7]); the 1954 Orléansville earthquake, now Chlef (*M<sup>S</sup>* = 6.7, [7]); and the 1980 El-Asnam earthquake, now Chlef (*M<sup>S</sup>* = 7.3, [6]). The latter is the largest and most destructive earthquake recorded in Algeria in the instrumental era.

The Chelif Basin is divided into three parts: the Lower-Chelif Basin, the Middle-Chelif Basin, and the Upper-Chelif Basin. The Middle-Chelif extends from Oued-Fodda in the west to Ain-Defla in the east (Figure 1). The cities of the Middle-Chelif suffered important damage during the 1980 El-Asnam earthquake. The cities of Oued-Fodda, El-Abadia, and El-Attaf were almost totally destroyed. In addition, several secondary effects of the earthquake were observed along the rupture zone [8], such as cracks, settlement, and soil liquefaction. Moreover, the coseismic uplift of the western part of the fault obstructed the flow of the Chelif River, forming a natural dam and causing flooding, and the phenomenon of liquefaction occurred over a wide area west of El-Abadia and El-Attaf [8,9].


**Figure 1.** Location of the study area. The numbered yellow stars correspond to the epicenters of the major earthquakes in the area: (1) the 1934 Carnot (I<sup>0</sup> = IX) earthquake, (2) the 1954 Orléansville (*Ms* 6.7) earthquake, and (3) the 1980 El-Asnam earthquake (*Mw* 7.2). EAF = El-Asnam fault trace; UCB = Upper-Chelif Basin; MCB = Middle-Chelif Basin; LCB = Lower-Chelif Basin.

Prior to the 1980 El-Asnam earthquake, several geological and geophysical studies were conducted on the Chelif Basin (e.g., [10,11]). However, these studies were mainly concentrated in the Lower-Chelif Basin due to the presence of oil indices. Only a few studies have been carried out on the Middle-Chelif Basin [12,13]. Right after the 1980 earthquake and given the extent of its damage, the region has finally been the subject of several geological, seismological, and geophysical studies (e.g., [6,8,14,15]). The firm Woodward and Clyde Consultants [15] conducted an important seismic microzonation study in eight cities of the Lower and Middle-Chelif Basins, including El-Abadia and El-Attaf. During the investigations, several holes were drilled using Standard Penetration Tests (SPT). The study provided geotechnical, hydrogeological, landslide, and liquefaction potential maps for each of the eight cities. The Neogene formations of the Middle-Chelif were described in detail in [12], and later in [16]. Furthermore, the structural aspect of the shallow and deep lithological units was recently imaged using land gravity data [17].

In the present study, we used ambient vibration data to characterize some geotechnical features and to estimate some dynamic properties of the soil column in the cities of El-Attaf, El-Abadia, and Ain-Defla. Ambient vibration-based techniques have been used previously in the Chelif Basin [18–21]. The aim was to determine the resonance frequencies of the ground, the shear-wave velocity structure of the sedimentary layers, and the bedrock depth, where the impedance contrast with the sedimentary cover may be the cause of ground shaking amplification during strong earthquakes. This work is a continuation of the ones carried out in the Middle-Chelif basin [20,21].

In the first part of this study, we used ambient vibration data recorded from single stations to estimate the ground resonance frequencies and assess the liquefaction potential in the three cities under study. In the second part, we used ambient vibration data recorded from single station and array techniques to estimate the average shear-wave velocity in the upper 30 m of the soil column (Vs30). Finally, a Vs30 predictive equation for the Middle-Chelif Basin was proposed.

This study contributes to the seismic hazard assessment in northern Algeria. The results obtained in this work can be used for ground motion simulation, for the calculation of amplification factors, and for many other studies related to the reduction of seismic risk.


## **2. Geological Framework**

The Middle-Chelif is an intra-mountainous basin structured during the Neogene [5,10,12], and located within the Tellian Atlas mountain belts (Figure 1). The depression is filled with a thick cover of Mio-Plio-Quaternary sediments. The basement is composed of hard clays and marls of Cretaceous age [12,16]. In its southern part, autochthonous formations of Jurassic to Silurian age outcrop [22], and form the Temoulga, Rouina, and Doui massifs (Figure 2). The Middle-Chelif plain is crossed from west to east by the Chelif River, the longest in Algeria, which contributes to form the present-day alluviums.

The stratigraphical column is composed of a succession of marine, continental, and lacustrine deposits. Lateral variations in facies were also reported [10,16], and further confirmed in [21], where important lateral variations in the shear-wave velocity were observed within the same formations. All these perturbations in the sedimentary cover reflect several instability periods with intense tectonic activities and different episodes of marine regression and transgression that conditioned the sedimentation process [10]. However, the geologists are not in agreement concerning the age of the first sedimentary deposits. Some authors attributed those sediments to the Lower Miocene [10,12], while others assigned them to the Middle Miocene (Serravallo-Tortonian) [16,23]. The sediments are affected by a series of normal and thrust faults, located in the margins of the Middle-Chelif Plain.

**Figure 2.** Geological map of the study area. Compiled and modified from [22,24].

In terms of lithology, the Miocene series occupies the major part of the sedimentary column. These formations were described and detailed in [12]. The first deposits are detritic continental series of conglomerates and marls. These sediments are overlain by different intercalations of marls, clays, limestones, and sandstones [12,16]. A thin layer of blue marls marks the transition between the Miocene and the Pliocene sediments [10,12]. The Miocene series outcrop in succession on the hills that overlook the city of El-Abadia (Figure 2). The shear-wave velocity varies from 640 to 1450 m/s for the Miocene formations [21]. The Pliocene is divided in two stages, marine and continental, with alternations between sands, sandstones, and conglomerates [10,16]. The Quaternary deposits are continental and predominant in the Middle-Chelif Plain. They are represented by Holocene and Pleistocene alluviums.

The three cities under study are built on Quaternary alluviums of different thicknesses. The city of Ain-Defla is built at the bottom of the northern flank of the Doui massif (Figure 2), where the topmost layer of the soil is composed of old Quaternary clays and gravels (Pleistocene), reaching a maximum thickness of 60 m [21]. The old alluviums lie directly over hard Jurassic limestone and Silurian schists and quartzite [22].

The cities of El-Attaf and El-Abadia are built on stiff Quaternary soil, the topmost layer is thin (<20 m) [21], and composed of recent alluviums (Holocene), which lie over older alluviums (Pleistocene). The engineering bedrock in El-Abadia is composed of Pliocene sandstones, while in El-Attaf, it is composed of Miocene marls and sandstones [14,21]. In these cases, the engineering bedrock has been characterized as the first layer of the soil column that contains a shear-wave velocity value above 750 m/s.

The presence of important sandbanks at shallow depths (<10 m) in El-Attaf and El-Abadia may lead to liquefaction phenomena during ground shaking. The risk is weak in El-Abadia since the sandy layers are dense and the ground water level is between 15 and 30 m deep [15]. However, in El-Attaf the risk is significant since the sandy banks are loose and the ground water level is shallow (between 7 and 10 m) [15].

## **3. Data and Methodology**

## *3.1. Horizontal-to-Vertical Spectral Ratio Technique (HVSR)*

The HVSR (or H/V) technique [25] allows retrieval of the resonance frequency of the soil at a given site, using single-station ambient vibration measurements. The theoretical aspect of this technique consists of calculating the ratio between the amplitude spectra of the horizontal and vertical components of ambient vibrations. As a result, an HVSR curve is obtained. The HVSR frequency peak is well correlated with the soil resonance frequency.

Although this technique is very effective in estimating the soil resonance frequency [26], the scientific community remains reluctant about its ability to estimate amplification factors [27–29]. This issue is due to the contribution of different seismic waves to the wavefield; for now, it is very difficult to quantify the ratio between body and surface waves. Bonnefoy-Claudet et al. [30] showed that in the case of high impedance contrasts between the bedrock and the sediments, the HVSR curve is mainly controlled by surface waves. The relatively low amplitude of the body waves is not sufficient to correctly estimate the amplification factor. Moreover, La Rocca et al. [31] and Benkaci et al. [32] have shown that the HVSR peak amplitude varies considerably with time. The amplitude of the frequency peak obtained from the HVSR technique in this study is therefore interpreted as a relative indicator and not as a true amplification factor value.

Ambient vibration single-station measurements were carried out in October 2021 at 71 sites in the cities of El-Attaf (24 sites), Ain-Defla (33 sites), and El-Abadia (14 sites) (Figure 3). Some measurement points were taken from a previous study [21]. Recordings were performed at night and in calm weather, as recommended by the SESAME project [33]. The acquisition time was 20 min. At some sites, the recording time was extended to 30 min due to anthropogenic noise from human activity. The equipment used for the recording was a pair of Tromino seismographs, with a sampling rate of 512 samples per second.

The HVSR technique was processed following the recommendations of the SESAME project [33]. The whole dataset was processed in the same way using the Geopsy software [34]. First, the signals were divided into several windows of 30 s each, tapered by a 5% cosine function. The window selection was made automatically using an anti-triggering algorithm, which allows selecting windows where the ambient noise is stationary. After that, the Fast Fourier Transform (FFT) is computed for each component (vertical and both horizontals), and the amplitude spectra of the two horizontal components are combined by an RMS (root mean square) average computation. Then, the ratio between the combined horizontal and the vertical amplitude spectra is calculated, and an HVSR curve is obtained for each window. The resulting curve is smoothed using the Konno–Ohmachi algorithm [35] with a smoothing coefficient of 40. Finally, the curves are averaged, and the final HVSR curve is retrieved in the frequency range between 0.2 and 20 Hz. The resulting HVSR curve may contain one or several frequency peaks, which are directly linked to impedance contrasts in the soil column.
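As a rough illustration of this processing chain, the sketch below computes a window-averaged H/V curve with Konno–Ohmachi smoothing in Python/NumPy. It is a simplification, not the Geopsy implementation: a Hann taper stands in for the 5% cosine taper, and the anti-triggering window selection is omitted.

```python
import numpy as np

def konno_ohmachi(spec, freqs, b=40.0):
    """Konno-Ohmachi (1998) smoothing with bandwidth coefficient b."""
    pos = freqs > 0
    smoothed = np.full_like(spec, np.nan)
    for i, fc in enumerate(freqs):
        if fc <= 0:
            continue
        x = b * np.log10(freqs[pos] / fc)
        w = np.ones_like(x)                 # kernel limit at f == fc
        nz = x != 0
        w[nz] = (np.sin(x[nz]) / x[nz]) ** 4
        smoothed[i] = np.sum(w * spec[pos]) / np.sum(w)
    return smoothed

def hvsr_curve(z, n, e, fs, win_s=30.0):
    """Window-averaged H/V: per-window FFT amplitude spectra, RMS
    combination of the horizontals, ratio to the vertical, smoothing."""
    nwin = int(win_s * fs)
    taper = np.hanning(nwin)
    ratios = []
    for k in range(len(z) // nwin):
        sl = slice(k * nwin, (k + 1) * nwin)
        az = np.abs(np.fft.rfft(z[sl] * taper)) + 1e-12  # avoid /0 at DC
        an = np.abs(np.fft.rfft(n[sl] * taper))
        ae = np.abs(np.fft.rfft(e[sl] * taper))
        ratios.append(np.sqrt((an**2 + ae**2) / 2.0) / az)
    freqs = np.fft.rfftfreq(nwin, d=1.0 / fs)
    return freqs, konno_ohmachi(np.mean(ratios, axis=0), freqs)
```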

**Figure 3.** Distribution of the data compiled in this study. The orange circles correspond to data measured in [21]. The left panels correspond to the legend, the borehole data, and the array configuration, from top to bottom, respectively. Bh03 was taken from [14]. The rest was taken from [15].

## *3.2. Seismic Vulnerability Index (Kg) and Shear Strain (γ)*

The stability of structures during an earthquake depends on the behavior of the ground, which in turn depends on the dynamic properties of the soil column. Ishihara [36] established a relation between the shear strain deformation and the dynamic properties of the soil, by compiling several earthquake data, reports, and laboratory tests (Table 1). Nakamura [37] introduced a method based on the vulnerability index calculation using ambient vibration data to estimate the shear strain, for the purpose of potential earthquake damage assessment (soil liquefaction, landslides).

**Table 1.** Strain dependence of dynamic properties of the soil [36].

The vulnerability index is calculated using the following equation [37]:

$$K\_g = e \times \frac{\left(\frac{A\_g^2}{F\_g}\right)}{\left(\pi^2 \times V\_b\right)} \tag{1}$$


where *F<sup>g</sup>* is the resonance frequency of the soil, *A<sup>g</sup>* is the corresponding amplitude obtained with the HVSR method, *V<sup>b</sup>* is the velocity at the bedrock, and *e* is the applied dynamic force. According to Nakamura [37], who tested several values of dynamic force together with several bedrock velocity values, the optimal results are obtained with an applied dynamic force of 60%. Thus, assuming this value and a velocity at the bedrock of around 6 × 10<sup>4</sup> cm/s [37], Equation (1) can be simplified as follows:

$$K\_g = \left(\frac{A\_g^2}{F\_g}\right) \times 10^{-6} \tag{2}$$

The shear strain can then be calculated by multiplying the vulnerability index (*Kg*) by the maximum observed acceleration, or peak ground acceleration (α*<sup>g</sup>*) [37]:

$$\gamma = K\_g \times \alpha\_g \tag{3}$$

where the α*<sup>g</sup>* value is in cm/s<sup>2</sup> (Gal). The peak ground acceleration values for the El-Attaf and El-Abadia cities are 410 and 520 cm/s<sup>2</sup>, respectively, for a return period of 500 years [15]. For the Ain-Defla city, a value of 250 cm/s<sup>2</sup> is proposed [38].

This technique is used in the present work mainly to assess the liquefaction potential in the study area. For a shear strain value *γ* > 10<sup>−2</sup>, phenomena of liquefaction and landslides are likely to occur during earthquakes.
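Equations (1)–(3) reduce to a few lines of arithmetic. The sketch below applies them to an illustrative case built from numbers quoted in this paper (an HVSR amplitude of 8.7 near 1.2 Hz in El-Attaf and the 500-year PGA of 410 Gal); the function names are ours.

```python
def vulnerability_index(a_g, f_g):
    """Equation (2): Kg = (Ag^2 / Fg) * 1e-6."""
    return a_g**2 / f_g * 1e-6

def shear_strain(a_g, f_g, pga_gal):
    """Equation (3): gamma = Kg * alpha_g, with alpha_g in Gal (cm/s^2)."""
    return vulnerability_index(a_g, f_g) * pga_gal

# Illustrative only: HVSR peak of amplitude 8.7 at 1.2 Hz with the
# 500-year PGA of 410 Gal quoted for El-Attaf [15]
gamma = shear_strain(8.7, 1.2, 410.0)      # ~2.6e-2
print(gamma, gamma > 1e-2)                 # exceeds the 1e-2 liquefaction threshold
```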

## *3.3. Array Techniques*

Array-based techniques make it possible to obtain surface wave dispersion curves from ambient vibration records. In this study, three array techniques were used to extract the Rayleigh wave dispersion curves: the frequency-wavenumber (F-K) analysis [39–42], the spatial autocorrelation (SPAC) technique [43–45], and the extended spatial autocorrelation (ESAC) technique [44,46]. Some assumptions about the soil conditions are required before using these techniques. For example, it is assumed that the ambient vibration wavefield is essentially dominated by surface waves (especially the fundamental mode) and that the subsurface layers are homogeneous and horizontally stratified, which means that in each layer the seismic waves propagate at a constant velocity [44].

The array measurement campaign was carried out in January 2021. The difficulty of finding open fields far enough from human activities inside urban areas limited the measurement sites to 11 (four each in El-Attaf and El-Abadia, and three in Ain-Defla). The measurements were made during the daytime with a recording time of 40 min. The configuration of the array is triangular. Concretely, the sensor deployment consists of an equilateral triangle of 30 m on each side, with a sensor placed at each vertex, a sensor placed midway on each side, and an additional sensor placed at the center (Figure 3). This configuration provides better coverage. According to [44], the equilateral triangle is the most efficient configuration for array techniques. The equipment consisted of 7 SARA SS10 triaxial velocity sensors (f<sup>0</sup> = 1 Hz) connected to SL06 digitizers.

## 3.3.1. The Frequency-Wavenumber (F-K) Analysis

F-K analysis is one of the most commonly used techniques for estimating Rayleigh wave dispersion curves. It has some advantages over other techniques, such as the ability to identify the direction of ambient vibration source and the recognition of the different modes presented in the wavefield [47]. The latter is composed of a superposition of several propagated waves. The F-K technique allows estimating the velocity and the direction of approach (back-azimuth) of these waves [48].

The F-K technique is based on two fundamental assumptions: the first is that the process is stationary in time; the second is that the process is stationary in the horizontal plane and that the propagation of the wavefront is only in the vertical direction. The stationarity of the propagated seismic waves allows the frequency-wavenumber power spectral density function to be calculated, which contains information about the power as a function of the frequency and velocity vectors of the propagated wave. There are two methods for calculating the power spectral density: the maximum likelihood method (MLM), or high-resolution method [40,41], and the beamforming method (BFM) [42]. The BFM technique is used in this study, as it is less sensitive to errors than the MLM technique [41].

For all the array techniques, only the vertical component of the records was analyzed, as it is the one required for the estimation of the Rayleigh wave dispersion curves. The Sesarray software package [34] was used to perform the F-K analysis. First, the coordinates of each of the seven sensors were introduced in the WARANGPS software in order to calculate the array transfer function and the theoretical wavenumber limits (Kmin and Kmax). Then, the signals were loaded into the Geopsy software and the BFM method was applied. The signals were divided into several windows of frequency-dependent lengths (50 periods). After that, an anti-triggering algorithm was used for the window selection. The processing requires the input of the grid step and grid size parameters. The grid size corresponds to the Kmax value, which is related to the aliasing limit. The grid step determines the maximum resolution and was chosen as Kmin/2. Once all these parameters were introduced, the final processing was launched and the dispersion curve was obtained. The processing was the same for the data of the 11 arrays.

## 3.3.2. The SPAC and ESAC Techniques

The spatial auto-correlation and the extended spatial auto-correlation techniques are based on the assumption of a stochastic wavefield being stationary both in space and time [43].

In theory, the SPAC technique consists of calculating a single phase-velocity value at each frequency in a predefined frequency band by fitting the SPAC coefficient to a Bessel function. For a circular array, the Bessel function represents the average cross-correlation between pairs of stations as a function of their distance. Aki [43] showed that the SPAC coefficient at a given frequency has the same form as the zeroth-order Bessel function of the first kind.

The SPAC method requires a circular array configuration with a centrally located sensor [44]. A modification of this technique has been suggested by Bettig et al. [45], which allows the SPAC technique to be applied to arrays of arbitrary configuration. The modification consists of replacing the use of fixed radius values with rings of finite thickness.

The ESAC method differs from SPAC by fixing the frequency values instead of the radius. At each frequency, the normalized cross-spectrum is fitted to the Bessel function. The inverted Bessel function that best fits the normalized cross-spectra allows the phase velocity to be obtained [46].

As with the F-K analysis, the SPAC technique was performed using the SESARRAY software package [34]. The first step is to define the ring parameters using the SPAC toolbox of the Geopsy software. Once the coordinates of the sensors are entered, the software will define a set of spatially distributed sensor pairs (e.g., 21 pairs for 7 sensors). Then, the sensor pairs must be included in one or more rings. For this purpose, inner and outer radii of rings that best correspond to the sensor pairs were introduced. Note that a ring can contain a minimum of two pairs. A maximum number of rings is recommended for better resolution [45,49]. Similar to the F-K analysis, the signals are divided into frequency-dependent length windows containing 50 periods. Windows are selected using the anti-triggering algorithm. Then, the analysis is launched and the spatial auto-correlation curves are obtained for each ring. The Spac2Disp software is used to display the phase velocity histograms derived from the set of the calculated spatial auto-correlation values. Then, the Rayleigh phase velocity values that best contribute to the dispersion curve are chosen within the Kmin and Kmax values. In this way, a final dispersion curve is obtained.
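For a single ring radius and frequency, the SPAC inversion amounts to solving the Bessel relation for the phase velocity. A hedged sketch, restricting the root search to the first, monotonic branch of *J*<sub>0</sub> (a simplification of what the SESARRAY tools do internally), is:

```python
import numpy as np
from scipy.special import j0
from scipy.optimize import brentq

def spac_phase_velocity(rho, freq, r, c_max=3000.0):
    """Solve J0(2*pi*freq*r / c) = rho for the phase velocity c (m/s),
    restricted to the first, monotonic branch of J0 (argument below its
    first zero at ~2.4048); returns NaN if no root is bracketed there."""
    def residual(c):
        return j0(2.0 * np.pi * freq * r / c) - rho
    c_lo = 2.0 * np.pi * freq * r / 2.4048   # smallest c on that branch
    if c_lo >= c_max or residual(c_lo) * residual(c_max) > 0:
        return np.nan
    return brentq(residual, c_lo, c_max)

# e.g., an illustrative SPAC coefficient of 0.5 at 8 Hz for a 15 m ring
print(spac_phase_velocity(0.5, 8.0, 15.0))   # ~490-500 m/s
```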

The ESAC analysis was carried out using a specific Matlab© (Natick, MA, USA) application developed by the University of Alicante [50]. For each sensor, the recorded signal is divided into non-overlapping 30 s windows. Then, the cross-spectrum is calculated and smoothed using a triangular window, in the frequency range from 0.1 to 15 Hz.

## *3.4. Inversion of Dispersion Curves and HVSR Curves*


The inversion was carried out with the Dinver software (Sesarray package [34]) using the neighborhood algorithm [51]. In order to have a better spatial distribution of the Vs30 values, some HVSR curves were inverted in El-Abadia (5 sites) and Ain-Defla (5 sites). Only the part around the fundamental frequency peak was considered in the inversion process [52]. For the array data inversion process, the 3 dispersion curves obtained for each array using the 3 different techniques (F-K, SPAC, and ESAC) (Figure 4) were averaged to obtain a better constrained dispersion curve with an optimized frequency range. The input parameters required for the inversion (Vp, Vs, densities, and number of layers) were taken from previous studies [14,21]. The maximum number of iterations was set to 300, and 100 models were generated at each iteration. The experimental average dispersion curve was compared to the theoretical one via a misfit value. Then, the Vs model corresponding to the minimum misfit was selected (Figure 5).


**Figure 4.** Rayleigh wave dispersion curves obtained from the F-K, SPAC, and ESAC methods.

## *3.5. Estimation of the Vs30*

## 3.5.1. Vs30 from NSPT Measurements

There are a considerable number of studies that propose equations relating shear-wave velocity to the Normalized Standard Penetration Test (NSPT). However, the equations are specific to the region under study. Sil et al. [53] compiled data from different continents and proposed empirical equations correlating NSPT values with shear-wave velocity for sands (Equation (4)), clays (Equation (5)), and for all soil types (Equation (6)):

$$V\_S = 79.217 \times N^{0.3699} \tag{4}$$

$$V\_S = 99.708 \times N^{0.3358} \tag{5}$$

$$V\_S = 75.478 \times N^{0.3799} \tag{6}$$

where *V<sup>S</sup>* is the shear-wave velocity and *N* is the number of blows in the SPT measurements. However, when the SPT borehole does not reach 30 m, which is the case in this study, Vs30 can be estimated from the average velocity at depth *z* using the following equation from [54]:

$$\text{Log } V\_{S30} = a + (b \times \text{Log } V\_{sz}) \tag{7}$$

where *Vsz* is the average velocity down to depth *z*, and *a* and *b* are coefficients that vary with depth (see Table 2 in [54]).
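A small sketch of Equations (4)–(7) follows. The depth-dependent coefficients *a* and *b* of Equation (7) are tabulated in [54] and are not reproduced here, so they enter the function as explicit arguments supplied by the user.

```python
import numpy as np

def vs_from_nspt(n_value, soil='all'):
    """Equations (4)-(6) [53]: empirical Vs (m/s) from NSPT blow counts."""
    coef = {'sand': (79.217, 0.3699),
            'clay': (99.708, 0.3358),
            'all':  (75.478, 0.3799)}
    a, p = coef[soil]
    return a * n_value**p

def vs30_from_vsz(vsz, a, b):
    """Equation (7): log Vs30 = a + b * log Vsz; a and b must be taken
    from Table 2 of [54] for the actual profile depth z (not given here)."""
    return 10.0 ** (a + b * np.log10(vsz))
```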

**Figure 5.** Results of the inversion of Rayleigh waves dispersion curves. The left panels for each site represent the dispersion curves. The right panels represent the Vs models. The black line corresponds to the best fit model, the dark grey represents models with minimum misfit + 10%. All the tested models are in light grey. The misfit is shown at the middle top of each site.

## 3.5.2. Vs30 from Vs Models

The inversion of the dispersion curves allows retrieval of the Vs models. The averaged Vs30 can be calculated from the Vs models using the following equation [55]:

$$V\_{S30} = \frac{30}{\sum\_{i=1}^{N} \frac{H\_i}{V\_i}} \tag{8}$$

where *H<sup>i</sup>* is the thickness and *V<sup>i</sup>* is the shear-wave velocity of layer *i*.
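Equation (8) is a harmonic (travel-time) average over the top 30 m. A direct implementation, with a purely illustrative three-layer profile, is:

```python
def vs30_from_model(thicknesses_m, velocities_ms):
    """Equation (8): Vs30 = 30 / sum(Hi / Vi) over the top 30 m; layers
    are truncated once the accumulated depth reaches 30 m."""
    depth, travel_time = 0.0, 0.0
    for h, v in zip(thicknesses_m, velocities_ms):
        h_used = min(h, 30.0 - depth)
        travel_time += h_used / v
        depth += h_used
        if depth >= 30.0:
            break
    return 30.0 / travel_time

# illustrative profile: 10 m at 250 m/s, 15 m at 450 m/s, halfspace at 900 m/s
print(vs30_from_model([10.0, 15.0, 100.0], [250.0, 450.0, 900.0]))  # ~380 m/s
```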

The obtained Vs30 values were then spatially meshed using the Kriging method [56], with a linear transformation. A map of Vs30 variation was obtained for each city. As for the classification of the sites, it was completed according to the NEHRP site classification [57] (Table 2).

**Table 2.** NEHRP soil classification as a function of the average shear-wave velocity to 30 m depth [57].


## **4. Results and Discussion**

## *4.1. Soil Resonance Frequencies and Amplitudes*

The soil resonance frequencies and the corresponding amplitudes for the cities of El-Attaf, El-Abadia, and Ain-Defla are mapped in Figure 6. In El-Attaf city, the resonance frequencies are between 1.2 and 8.3 Hz. In the major part of the city, the predominant frequencies are between 1.2 and 5 Hz. The obtained values are related to impedance contrasts in the subsoil between the Quaternary alluviums and the Miocene marls and sandstones. Near Ouled Moussa, south of the city, higher frequencies are observed (between 5 and 8.3 Hz). This increase is most likely related to the Cretaceous marls outcropping in the south [21]. Since the buildings in this area have between 1 and 5 floors, the resonance frequencies of the ground are close to the buildings' frequencies, which can be damaging for the structures during strong shaking. The predominant amplitudes of the frequency peaks vary between 2.2 and 8.7. The amplitudes are lower in the northern areas of the city. The highest amplitudes (between 5 and 8.7) are observed in the "Cité Bouzar" neighborhood, in the western areas of the city.

**Figure 6.** Results of the single-station measurements analysis.

In the El-Abadia city, the resonance frequencies vary between 1.4 and 4.2 Hz, while the corresponding amplitudes vary between 2 and 4. The obtained resonance frequencies are related to impedance contrasts between the Quaternary alluviums and the Pliocene sandstones. The amplitudes slightly increase to the southwest towards the Middle-Chelif plain, where the Quaternary stiff and soft soils are thicker. Finally, in the Ain-Defla city, the resonance frequencies vary between 1.4 and 4.5 Hz. These frequencies are related to the impedance contrasts between the Quaternary deposits and the Cretaceous-Jurassic bedrock. The increase in the resonance frequency peak from north to south is related to the presence of the Doui Massif and its hard Jurassic limestones [22] to the south of the city. The predominant amplitude of the frequency peaks varies between 2 and 7.1. The amplitudes are relatively low in the central part of the city. However, in the "Mohamad Khiat" neighborhood, in the SW, the amplitudes are relatively higher (5–7.1). In the northern areas, the amplitude reaches a value of 6.8.

## *4.2. Shear Strain and Liquefaction Potential*

The shear strain variation map for the three cities (right column in Figure 6) gives valuable information about the dynamic properties of the soils and their possible behavior during earthquakes. In El-Abadia and Ain-Defla cities, the shear strain values reflect an elasto-plastic soil behavior, where cracks and settlements may occur during strong ground shaking. However, in the central part of Ain-Defla city, the lower strain values indicate that the soil column tends towards a more elastic behavior, which is probably due to the thickening of the ancient Quaternary deposits in this zone, with the presence of very dense clays and gravels [13]. In both cities, the results indicate that the soils show no predisposition to liquefaction and landslide phenomena during earthquakes.

On the other hand, in El-Attaf city, the shear strain map shows different dynamic properties of the soil: an elasto-plastic behavior in the east and a collapse behavior in the west, more precisely in the "Cité Bouzar" neighborhood, where the soil is subject to liquefaction. Piezometric measurements in "Cité Bouzar" have shown that the water table is around 7 m deep [15]. The presence of sandbanks and of a water table at very shallow depth increases the risk of liquefaction in the area, as was the case during the 1980 El-Asnam earthquake. Indeed, liquefaction phenomena were reported west of El-Attaf, where large sand boil formations (>6 m) were observed [9].
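The strain formulation behind these maps is not reproduced in this section. A widely used HVSR-based formulation is Nakamura's vulnerability index, Kg = A0²/f0, which scales the effective shear strain for a given bedrock acceleration; the sketch below assumes that formulation, and the bedrock acceleration and velocity values are hypothetical round numbers, not parameters from this study.

```python
import math

def vulnerability_index(a0, f0):
    """Nakamura's Kg = A0^2 / f0 (A0 is dimensionless, f0 in Hz)."""
    return a0 ** 2 / f0

def effective_strain(kg, accel_bedrock, vs_bedrock, efficiency=0.6):
    """Effective shear strain, gamma ~ e * Kg * a_b / (pi^2 * Vs_b).

    accel_bedrock (m/s^2) and vs_bedrock (m/s) are hypothetical inputs here;
    thresholds of roughly 1e-4 (elastic limit) and 1e-2 (collapse) are often
    used when reading strain maps of this kind.
    """
    return efficiency * kg * accel_bedrock / (math.pi ** 2 * vs_bedrock)

# Extreme values mapped at El-Attaf (A0 = 8.7, f0 = 1.2 Hz); the 2 m/s^2
# bedrock acceleration and 600 m/s bedrock velocity are round placeholders.
kg = vulnerability_index(8.7, 1.2)
print(f"Kg = {kg:.0f}, gamma ~ {effective_strain(kg, 2.0, 600.0):.1e}")
```

With these placeholder inputs the proxy lands above the 10⁻² level commonly associated with collapse behavior, which is consistent with the interpretation mapped at "Cité Bouzar".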

## *4.3. Dispersion Curve Inversion and Vs Models*

The dispersion curves obtained using the three techniques, F-K, SPAC, and ESAC, and presented in Figure 4, are plotted within the theoretical limits of the wavenumber (Kmin, Kmax). The dispersion curves are valid between 7 and 12 Hz in El-Attaf and between 5 and 12 Hz in Ain-Defla. In the El-Abadia city, the curves are valid between 5.5 and 11 Hz. This difference is due to local site conditions. The dispersion curves are well correlated at most sites (Figure 4). At sites ATF1 and AIN1, the dispersion curves obtained with the ESAC technique tend to diverge from the other curves at high frequencies. We note that at these sites, the array was deployed on slightly sloping terrain, which might indicate that the ESAC technique could be more sensitive to slopes than the other techniques.

An average curve was calculated at each site. In this way, a better constrained dispersion curve is used for the inversion process to obtain a better consistency of the resulting Vs profiles. The results of the inversion are shown in Figure 5. In the El-Attaf city, the engineering bedrock corresponds to the Miocene marls and sandstones, with a Vs value varying between 970 and 1280 m/s. Soft and stiff Quaternary alluvium occupies the first 30 m of the soil column. In the Ain-Defla city, the bedrock Vs value varies between 1390 and 1450 m/s. The thickness of the Quaternary deposits varies between 16 and 43 m. At El-Abadia city, the bedrock Vs value is divided into two different ranges, between 870 and 900 m/s to the east (ABD1, ABD3), and between 1150 and 1200 m/s to the west (ABD2, ABD4). This difference is related to the change in bedrock composition from Late Pliocene sandstones in the east to Miocene marls and sandstones in the west. The thickness of the Quaternary layers varies between 11 and 29 m.

## *4.4. Vs30 Structure and Site Classification*

Vs30 was calculated from the shear-wave velocity models and the additional SPT surveys. A map of Vs30 variation, along with site classification, is provided for each of the three cities (Figures 7–9).
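For readers unfamiliar with the computation, the sketch below implements the standard time-averaged definition, Vs30 = 30 / Σ(hᵢ/Vsᵢ), together with a simplified NEHRP class lookup based on the Vs30 boundaries of Table 2; the layered profile in the example is hypothetical, not one of the inverted models of this study.

```python
# Minimal sketch: time-averaged shear-wave velocity over the top 30 m (Vs30)
# and a simplified NEHRP site-class lookup. Layer values below are hypothetical.

def vs30(thicknesses_m, vs_mps):
    """Vs30 = 30 / sum(h_i / Vs_i) over the layers spanning the top 30 m."""
    travel_time, depth = 0.0, 0.0
    for h, vs in zip(thicknesses_m, vs_mps):
        h_used = min(h, 30.0 - depth)      # clip the last layer at 30 m
        travel_time += h_used / vs
        depth += h_used
        if depth >= 30.0:
            break
    if depth < 30.0:                       # extend the deepest layer if needed
        travel_time += (30.0 - depth) / vs_mps[-1]
    return 30.0 / travel_time

def nehrp_class(v):
    """Simplified NEHRP classes from Vs30 (m/s)."""
    if v > 1500: return "A"
    if v > 760:  return "B"
    if v > 360:  return "C"
    if v > 180:  return "D"
    return "E"

v = vs30([8.0, 14.0, 20.0], [220.0, 420.0, 900.0])  # hypothetical profile
print(f"Vs30 = {v:.0f} m/s, class {nehrp_class(v)}")
```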


**Figure 7.** Vs30 map and soil classification in the El-Attaf city [21].


**Figure 8.** Vs30 map and soil classification in the Ain-Defla city [21].


**Figure 9.** Vs30 map and soil classification in the El-Abadia city [21].

In El-Attaf city, the Vs30 values vary between 290 and 460 m/s (Figure 7). In most of the city, the soil is classified as very dense (C), except for a small area in the center where the soil is classified as stiff (D). The increase in velocity towards the west is caused by the presence of shallow Jurassic limestones, which outcrop about 1 km west of the city in the Temoulga Massif (Figure 2). In the southern part of the city, the slight increase in Vs30 is related to the thinning of the Quaternary alluvium and the predominance of the Miocene stiff formations [21].

In the Ain-Defla city, the Vs30 values vary between 250 and 550 m/s (Figure 8). The soil is classified as very dense and soft rock (C) in most of the city. The velocity gradually decreases towards the northwest and the soils become stiff (D). The variation in Vs30 at Ain-Defla is mainly controlled by the ratio of Quaternary alluvium to Jurassic limestone in the first 30 m. In the south, where the city backs onto the Doui Massif, the ancient Quaternary alluvium forms a thin layer and the upper 30 m of the soil is dominated by Jurassic limestone. Moving northwest, the Quaternary deposits become thicker and dominate the top 30 m of the soil column; thus, the Vs30 values decrease and the soils are classified as stiff.

In the El-Abadia city, the Vs30 values range from 340 to 530 m/s (Figure 9). In most of the city, soils are classified as very dense and soft rock (C). The high Vs30 values are related to the presence of Pliocene conglomerates and sandstones at shallow depths. The shear-wave velocity values decrease towards the south where the Quaternary alluvium is thicker [16]. The lowest Vs values are observed around the southern part of the Boukalli River in the city, where the upper layer is composed of present alluvium.

## *4.5. Vs30 Predictive Equation for the Middle-Chelif Basin*

The wavelength corresponding to the Vs30 value was estimated for each average dispersion curve. The average wavelength found is λ = 41 ± 3 m. The Vs30 values were then correlated with the V<sub>R41</sub> values (Rayleigh-wave velocity at λ = 41 m), and the best linear fit was obtained (Equation (9)):

$$V\_{s30} = 1.0171 \, V\_{R41} - 6.719 \tag{9}$$

The regression plot, along with the residuals, is shown in Figure 10. The coefficient of determination is R<sup>2</sup> = 0.9472. Equation (9) was applied to dispersion curves obtained in two other cities of the Middle-Chelif Basin in a previous study [21] (Table 3). The aim was to

evaluate the reliability of the Vs30 predictive equation. The Vs30 values were predicted within a maximum error of 5.6%, and the site classifications were correct. For the dispersion curves obtained in the present study, the Vs30 values were predicted within a maximum error of 7.6%, and the site classifications were correct, except for the ATF1 site.
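The construction of such a predictive equation can be sketched in a few lines: the Rayleigh-wave velocity at the target wavelength is read off each dispersion curve (the point where v/f = 41 m), and a line is then fitted through the (VR41, Vs30) pairs. All numerical values below are placeholders, not the measured data behind Equation (9) or Table 3.

```python
import numpy as np

# Hypothetical dispersion curve: phase velocity (m/s) vs. frequency (Hz).
freq = np.array([3.0, 4.0, 5.0, 6.0, 8.0, 10.0, 12.0])
vel = np.array([620.0, 540.0, 470.0, 430.0, 380.0, 350.0, 330.0])

# Rayleigh-wave velocity at lambda = 41 m: find where wavelength v/f = 41.
wavelength = vel / freq
v_r41 = np.interp(41.0, wavelength[::-1], vel[::-1])  # xp must be increasing

# Linear fit Vs30 = a * VR41 + b over several sites (placeholder pairs).
vr41_sites = np.array([310.0, 360.0, 420.0, 470.0, 520.0])
vs30_sites = np.array([305.0, 365.0, 425.0, 470.0, 530.0])
a, b = np.polyfit(vr41_sites, vs30_sites, 1)
pred = a * vr41_sites + b
r2 = 1.0 - np.sum((vs30_sites - pred) ** 2) / np.sum(
    (vs30_sites - vs30_sites.mean()) ** 2)
print(f"VR41 = {v_r41:.0f} m/s; Vs30 ~= {a:.4f} * VR41 + {b:.2f} (R^2 = {r2:.4f})")
```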


**Figure 10.** (**A**) Regression line (Vs30 vs. VR41) with the fitted equation; R<sup>2</sup> is the coefficient of determination. (**B**) The corresponding residual values.


**Table 3.** Evaluation of Vs30 predictive equation based on Vs30 values obtained in this study, and in [21].

In the case of the ATF1 site, the estimated and predicted Vs30 values are close to the limit between classes C and D (according to the NEHRP classification [57]).

## **5. Conclusions**

In the present study, ambient vibration records were used to characterize the dynamic properties of the soil and the velocity structure of its upper 30 m (Vs30) in the cities of El-Attaf, Ain-Defla, and El-Abadia, in the Middle-Chelif Basin. Both single-station and array-based techniques were applied. The studied cities are a good example of growing cities located in a highly seismic zone. This study improves on the one carried out by the WCC (1984) by investigating the behavior of the soils and quantifying the liquefaction potential. Additionally, the Vs30 values allowed the soils of the three cities to be classified for the first time.

The HVSR technique was applied to single-station measurements to estimate the ground resonance frequencies. In the El-Attaf city, the frequencies vary between 1.2 and 8.3 Hz, and between 1.4 and 4.2 Hz in the El-Abadia city. In the Ain-Defla city, the resonance frequencies vary between 1.4 and 4.5 Hz. The frequency peaks are directly related to impedance contrasts at different depths between sediments and bedrock. The corresponding amplitudes range between 2 and 8.7.

The obtained resonance frequencies and the corresponding amplitudes were used to calculate the shear strain, which may give an idea about the possible behavior of the soils during major earthquakes. In the El-Abadia and Ain-Defla cities, the shear strain values reflect the elasto-plastic behavior of the soil column. Cracks and settlements may occur during earthquakes, especially in El-Abadia city. In El-Attaf, the shear strain analysis also shows an elasto-plastic behavior of the soil in most of the city, except in its western part, where a collapse behavior is observed. Consequently, the soil there is subject to liquefaction.

Rayleigh wave dispersion curves were obtained from array recordings at 11 sites, using F-K, SPAC, and ESAC techniques. Shear-wave velocity models were obtained from the inversion of the mean dispersion curves. From the Vs30 variation maps, the local soils were classified using the NEHRP chart for site classification. In the El-Attaf city, the Vs30 values vary between 300 and 470 m/s. The soil is classified as very dense and soft rock (C) in most of the city. In the Ain-Defla city, Vs30 values vary between 250 and 530 m/s. The soils are classified as very dense (C) in the central and eastern sides. In the west, the soils are stiff (D) due to the thickening of the Quaternary alluviums. Finally, in the El-Abadia city, the Vs30 values vary between 340 and 530 m/s. In the major part of the city, the soil is classified as very dense and soft rock (C). In addition, a predictive equation for Vs30 in the Middle-Chelif Basin was proposed based on the obtained dispersion curves and Vs30 values.

The three studied cities extend to the alluvial plains of the Middle-Chelif, an area of unstable soils, which has undergone several phenomena induced by past earthquakes (e.g., soil liquefaction, landslides, cracks, settlements). A well-constrained characterization of the dynamic properties of the soil, as well as the shear-wave velocity structure, allows a better understanding of the soil behavior during strong earthquakes and, therefore, helps minimize damage during future earthquakes. The present study aims to contribute to the seismic hazard assessment in northern Algeria.

**Author Contributions:** A.I. acquired and processed the data and wrote the paper. A.S. acquired and prepared the data and helped with figure conceptualization. F.S. proposed the methodology and the structure, and reviewed the paper. A.Y.-C. provided the acquisition materials and supervised the paper. J.J.G.-M. provided Matlab code and reviewed the paper. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** The data used in this study belong to the Research Center in Astronomy, Astrophysics and Geophysics (CRAAG, Algeria).

**Acknowledgments:** The authors would like to thank the members of the CRAAG team that participated in field measurements: B. Melouk, O. Haddad, and R. Chimouni.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## **Shallow S-Wave Velocity Structure in the Middle-Chelif Basin, Algeria, Using Ambient Vibration Single-Station and Array Measurements**

**Abdelouahab Issaadi 1,2,\*, Fethi Semmane <sup>1</sup> , Abdelkrim Yelles-Chaouche <sup>1</sup> , Juan José Galiana-Merino 2,3 and Anis Mazari <sup>1</sup>**


**Citation:** Issaadi, A.; Semmane, F.; Yelles-Chaouche, A.; Galiana-Merino, J.J.; Mazari, A. Shallow S-Wave Velocity Structure in the Middle-Chelif Basin, Algeria, Using Ambient Vibration Single-Station and Array Measurements. *Appl. Sci.* **2021**, *11*, 11058. https://doi.org/10.3390/ app112211058


**Abstract:** In order to better assess the seismic hazard in the northern region of Algeria, the shear-wave velocity structure in the Middle-Chelif Basin is estimated using ambient vibration single-station and array measurements. The Middle-Chelif Basin is located in the central part of the Chelif Basin, the largest of the Neogene sedimentary basins in northern Algeria. This basin hosts the El-Asnam fault, one of the most important active faults in the Mediterranean area. In this seismically active region, most towns and villages are built on large unconsolidated sedimentary covers. Application of the horizontal-to-vertical spectral ratio (HVSR) technique at 164 sites, and frequency–wavenumber (F–K) analysis at 7 other sites, allowed for the estimation of the ground resonance frequencies, shear-wave velocity profiles, and sedimentary cover thicknesses. The electrical resistivity tomography method was used at some sites to further constrain the thickness of the superficial sedimentary layers. The soil resonance frequencies range from 0.75 Hz to 12 Hz and the maximum frequency peak amplitude is 6.2. The structure of the estimated shear-wave velocities is presented in some places as 2D profiles to help interpret the existing faults. The ambient vibration data allowed us to estimate the maximum depth in the Middle-Chelif Basin, which is 760 m near the city of El-Abadia.

**Keywords:** Middle-Chelif sedimentary basin; HVSR; array measurement; frequency–wavenumber (F–K) method

## **1. Introduction**

Northern Algeria is characterized by a series of Neogene basins (e.g., Constantine Basin, Hodna Basin, Soummam Basin, Tizi-ouzou Basin, Mitidja Basin, Medea Basin, and Chelif Basin), elongated in an E–W direction, and surrounded by the Tellian Atlas mountain belts, which act as a substratum for their sedimentary covers [1–3]. These basins host important seismic activity, mainly in their marginal zones [4]. Active tectonics in the northern part of the country and the related seismicity are due to the fact that this zone is located at the boundary between the African and Eurasian convergent plates [4,5].

In the northwestern part of Algeria, between the septentrional and meridional Tellian Atlas mountain belts, lies the Chelif Basin, a wide depression of over 450 km in length, which is the largest and the most subsident of the sublittoral basins [2,3]. The dimensions of the basin and the complexity of its hydrographic network has led several authors to divide it into three sub-basins: The Lower-, the Middle-, and the Upper-Chelif. In this study, we focus on the plain extending from Oued-Fodda to Ain-Defla (Figure 1). However, one must note that this subdivision is not unanimous within the scientific community. While some authors assign our study area to the Middle-Chelif Basin [6–8], others have


assigned it to the oriental part of the Lower-Chelif Basin [2,3,9]. In this study, we consider the study area as a part of the Middle-Chelif Basin.


**Figure 1.** Situation of the Middle-Chelif Basin. LCB: Lower-Chelif Basin. MCB: Middle-Chelif Basin. UCB: Upper-Chelif Basin. OF: Oued-Fodda. AB: El-Abadia. AT: El-Attaf. AM: El-Amra. RO: Rouina. AD: Ain-Defla.

The Middle-Chelif plain is home to over 400,000 inhabitants, who are mainly concentrated on its edges, and distributed over its principal cities, which are: Ain-Defla, Rouina, El-Amra, El-Attaf, El-Abadia, and Oued-Fodda.

The region has witnessed several destructive earthquakes, such as the 1858 El-Amra earthquake (I<sub>0</sub> = IX, [10]); the 1934 El-Abadia earthquake (I<sub>0</sub> = IX, [10]); the 1954 Orléansville earthquake (Ms 6.7, [10]); and the 1980 El-Asnam earthquake (Ms 7.3, [11]).

In the last few decades, techniques based on ambient vibrations have been widely used to study sedimentary basins because of the quick implementation and the lower cost of the process, compared to boreholes or other geophysical prospecting methods (e.g., [12–16]). It is proven that soft sedimentary layers can amplify the ground shaking and prolong its duration during an earthquake, which can be harmful for buildings. Therefore, the determination of the geotechnical characteristics of the soils is primary for seismic risk and site effects assessments. One of the most important parameters for determining the geotechnical characteristics is the shear-wave velocity structure. The high-velocity contrast between soft sediments and the bedrock, along with the local geological and topographical aspects, are important factors for ground motion amplification.

The Chelif Basin has been the subject of several geological and geophysical studies. The majority of these studies have been done on the Lower-Chelif Basin, and only a few of them have included the western part of the Middle-Chelif. It was only at the end of the 1960's that the CGG (Compagnie Générale de Géophysique) carried out a geophysical prospecting campaign by electric methods in the Middle-Chelif Plain in order to study the structure of the sedimentary deposits [7].

Since the 1980 El-Asnam earthquake, and the damage it caused, the Middle-Chelif region has been the subject of a multitude of geological and geophysical studies. The first study was carried out by the Institute of Earthquake Engineering of Skopje University (Macedonia), which resulted in the elaboration of the code for the repairing and strengthening of damaged buildings in the Chelif region. The code was based on studies of seismic hazards and the geotechnical conditions of the soil. Several seismic refraction profiles were performed in the cities of Chlef and El-Attaf [17]. The Woodward Clyde Consultants also carried out a complete geotechnical and geological study in eight cities of the Chelif region, including Oued-Fodda, El-Attaf, and El-Abadia, where several boreholes were drilled. The study outcomes provided geotechnical and hydrogeological maps for each city, along with seismic microzoning survey maps [18]. The neotectonic and paleoseismological studies carried out on the formations of the oriental Lower-Chelif Basin (e.g., [9,19]) were intended to highlight the tectonic elements and to identify geological structures and faults likely to be reactivated by generating earthquakes.

More recently, site effects investigations using earthquakes and ambient vibration data were conducted in the cities of Chlef in the Lower-Chelif Basin [20,21] and Oued-Fodda in the Middle-Chelif Basin [22]. The main objectives were to estimate the soil resonance frequencies, the shear-wave velocities in the sedimentary rocks, and the depths to seismic bedrock, where the relatively high impedance contrast with Miocene formations may amplify ground shaking during an earthquake. In the Mitidja basin, Bouchelouh [23] and Tebbouche [24] estimated and mapped the roof of the engineering bedrock using mainly ambient vibration data.

In this study, ambient vibration measurements using single-station and array techniques were used to estimate the shear-wave velocity for the sedimentary layers, and to map the bedrock structure in the Middle-Chelif Basin. In the first part of the study, we estimate the soil resonance frequencies (f0) and the corresponding amplitudes (*A*0) using the horizontal-to-vertical spectral ratio (HVSR) technique [25,26]. In the second part, we perform array measurements using the frequency–wavenumber (F–K) technique [27–30] to retrieve the surface wave dispersion curves. After that, the dispersion and HVSR curves are inverted jointly to estimate the shear-wave velocity (Vs) profiles at each site. Available borehole information and electrical resistivity profiles constrain the Vs models used for the inversion process. As a result, the bedrock model is proposed.

This study is a continuation of previous work conducted in the city of Oued-Fodda [22]. The results obtained in this work will contribute to the seismic hazard reassessment in the Chelif Basin. They can also be used for seismic hazard mitigation studies, such as strong ground motion simulation, soil liquefaction studies, and for the updating of the Algerian seismic code.

## **2. Geological Framework**

The Neogene basins, in the western part of Algeria, stretch parallel to the Mediterranean coast and lie within the septentrional and meridional Tellian Atlas mountain belts, which belong to the southern branch of the Alpine chain [1,31]. The intramountainous Chelif Basin is a wide depression of over 450 km in length filled with Mio-Plio-Quaternary deposits [1–3,9,32,33]. It is formed by a succession of plains, hills, and folds, bordered by the Dahra and Boumaad Mountains in the north, and the Ouarsenis Mountains in the south. These Tellian belts were structured during the Mesozoic [34] and constitute the substratum of the Neogene deposits.

The Middle-Chelif Basin extends from the Sara-El-Maarouf anticline to the Beni-Ghomerian plateau (Figure 2). In its central part, the Quaternary plain forms a narrow strip that stretches for over 50 km. This region is affected by normal and thrust faults, including the so-called "El-Asnam Fault". The orientation of these faults is parallel to the synclinal axis of the basin. The cross-sections performed by the National Society of Petroleum [35] and the General Company of Geophysics [7] show that the synclinal axis of this basin appears to be north of the Chelif River. In the southern part of this plain, inlier terrains of Jurassic-to-Silurian age outcrops form the Doui, Rouina, and Temoulga Massifs, from east to west, respectively. These schistosity massifs are considered to be autochthonous formations [33,34]. Although the Jurassic formations are predominant in these outcrops, the

bedrock is mainly Cretaceous in most of the basin area, and it is composed of Cenomanian to Senonian clays and marls [36].


**Figure 2.** Geological map of the Middle-Chelif Basin. Modified and compiled from [33,36]. The lithological cross-section, AA', is digitalized from [35]; BB', CC', and DD' are from [32].

Several previous geological studies have detailed the lithological and stratigraphical aspect of the Neogene deposits in the Middle-Chelif Basin [2,9,32]. The first sedimentary deposit is a red detritic series of conglomerates, poudingues, and marls. Brives [32] and Perrodon [2] attribute these formations to the Lower Miocene, while Meghraoui [9] and Thomas [3] assign them to the Middle Miocene (Serravalo-Tortonian). These formations outcrop on the southern flank of the Tsili Massif, north of El-Amra (Figure 2). Above these formations lies an important intercalation of marls, clays, and limestones of the Lower Tortonian [9]. The boreholes and the electric surveying data used in this study [7] show that these formations can reach a considerable thickness of 400 m. This layer lies in discordance over the conglomeratic series in the northern part of the basin, and directly over the Cretaceous marls in the south. The passage between the Upper Tortonian sandstones and the Lower Tortonian marls is completed abruptly [32]. This contrast can be observed in Kef-Ensoura, north of El-Abadia, and also on the Beni-Ghomerian Plateau, where the two formations outcrop. The Messinian is represented mainly by blue marls with a maximum thickness of 50 m [9]. The marine regression in the Pliocene divided this stage into marine and continental formations [2,3,32]. The marine Pliocene is represented by sands and sandstones, while the continental is composed of red sands and conglomerates [2,9,32]. The Pliocene constitutes the principal deposits of the Sara-el-Maarouf and the Sara-Belaggoune anticlines, north of Oued-Fodda, and extends eastward by outcropping on a narrow band until the Beni-Ghomerian Plateau. However, this formation does not outcrop in the southern parts of the basin, which suggests that it disappears somewhere under the Quaternary plain. These deposits can reach a thickness of 200 m [9].

The Quaternary is predominant in the wide plain that stretches from Oued-Fodda to Ain-Defla. The Pleistocene formations surround the Holocene alluviums and form the first hills in the northern and southern margins of the plain. The Pleistocene is composed of clays, gravels, and conglomerates. The Holocene is constituted of recent alluviums. The Chelif River crosses the Middle-Chelif Quaternary Plain from Ain-Defla to Ouled-Abbes and carries the materials that form the present-day alluviums.


The geological and geotechnical data show important lateral variations in the facies and thicknesses within the sedimentary layers and the bedrock. For that, the study area was divided into three zones: The Oued-Fodda Plain, the Carnot Plain, and the Ain-Defla Region (Figure 3). The Oued-Fodda Plain lies between the Temoulga Massif and the Sara-El-Maarouf anticline in a NE–SW direction, with an average width of 3 km. The fault affecting the Temoulga Massif, east of the plain, and the El-Asnam fault in the west, suggest that this plain acts as a graben structure. The different outcrops in the area suggest a low sedimentary thickness [22]. The bedrock is mainly composed of Senonian marls.


**Figure 3.** Zonation map of the study area. Red dots correspond to single-station measurements. Red polygons correspond to the limits of the different cities.

The Carnot Plain, also known as the El-Attaf Plain, extends from Bir-Safsaf to Rouina in an E–W direction. The main cities of the plain are El-Attaf and El-Abadia, built on the southern and northern plain margins, respectively. The average basin width of 8 km suggests that the maximum depth of the basin is likely to be observed in this region. Lateral variations in the facies are observed in the Cretaceous series, which is considered as the bedrock of the sedimentary deposits in this area. The facies change from marly to mainly composed of limestones west of El-Abadia. The geophysical data show that the sedimentary layers can reach a depth of 700–800 m from El-Abadia to Rouina [6,7,35].

## **3. Methodology**

## *3.1. The Horizontal-to-Vertical Ratio Technique*

The HVSR technique (also known as H/V) was initiated by Nogoshi and Igarashi [26], and developed by Nakamura [25]. It is one of the most commonly used techniques that allows for the retrieval of one of the most important parameters of site response, i.e., the resonance frequency at a given site. This technique has proven its reliability in estimating the soil resonance frequency [37], although several authors have reported that this technique is not reliable in estimating the amplification factors, especially at lower frequencies [38–41]. This inability is due to the contribution of body and surface waves, and it is the latter ones that control the HVSR curve. The low contribution of body waves is sufficient for estimating the fundamental frequency, but not enough for estimating the amplification factor [38,39,41]. Recently, LaRocca [42] has also demonstrated how the peak amplitude of the HVSR of the seismic noise may vary considerably with time.

This method consists of estimating the ratio between the Fourier amplitude spectrum of the horizontal and vertical components of the ambient noise, which provides the HVSR curve as a result. The resulting HVSR frequency peak is very well-correlated with the resonance frequency of the soil [43]. This peak corresponds to an impedance contrast in the soil column.
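As a minimal sketch of this ratio, the function below combines the Fourier amplitude spectra of the two horizontal components (here as a quadratic mean, one common choice) and divides by the vertical spectrum; the window selection, smoothing, and averaging applied in this study are deliberately omitted, and the demo signals are synthetic.

```python
import numpy as np

def hvsr(ns, ew, ud, fs):
    """Single-window H/V spectral ratio from three-component ambient noise.

    ns, ew, ud : equal-length arrays (north-south, east-west, vertical); fs in Hz.
    Returns (frequencies, H/V). A real workflow would average many detrended,
    tapered, smoothed windows (e.g., SESAME-style processing).
    """
    n = len(ud)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spec = lambda x: np.abs(np.fft.rfft(x * np.hanning(n)))
    h = np.sqrt(0.5 * (spec(ns) ** 2 + spec(ew) ** 2))  # quadratic mean
    v = spec(ud)
    return freqs[1:], h[1:] / v[1:]                      # drop the DC bin

# Synthetic demo: a 2 Hz resonance on the horizontals should dominate the ratio.
fs = 100.0
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(0)
ns = np.sin(2 * np.pi * 2.0 * t) + 0.5 * rng.standard_normal(t.size)
ew = np.sin(2 * np.pi * 2.0 * t + 1.0) + 0.5 * rng.standard_normal(t.size)
ud = 0.5 * rng.standard_normal(t.size)
f, ratio = hvsr(ns, ew, ud, fs)
print(f"peak near {f[np.argmax(ratio)]:.2f} Hz")
```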

The low cost, the simplicity of acquisition and processing, and the reliability in estimating the soil resonance frequencies make this technique one of the most used for site characterization, especially in urban areas.

## *3.2. Inversion of the HVSR Curves*

The inversion of the HVSR curves allows for the retrieval of the shear-wave velocity values for the sedimentary layers, along with the bedrock depths of our study area.

The velocity profiles are calculated using the neighborhood algorithm [44]. As there is no unique solution for the inversion problem, several velocity models are calculated. For each model, the calculated ellipticity of the Rayleigh wave fundamental mode is compared to the HVSR curve and a misfit value is given [45]. The ellipticity curves and the corresponding velocity models with the lower misfit values are considered.

As in all inversion processes, the nonuniqueness of the solution is a very delicate matter to deal with, and hundreds of thousands of velocity models can result from one simple HVSR curve. For that reason, some soil parameters are required to better guide and constrain the process. The main parameters are the number of sedimentary layers and the range of thicknesses for each layer. Geological or lithological cross-sections are usually good enough for providing these two parameters. Some geotechnical parameters are also required for each layer, either as fixed values or as intervals. These parameters are the Vp, Vs, density, and Poisson's ratio. The latter is set between 0.2 and 0.5 (universal values for soils). For good and reliable results, it is necessary to have good data coverage in the study area, as well as boreholes and geophysical prospections (e.g., seismic refraction experiments).

## *3.3. The Frequency–Wavenumber (F–K) Analysis*

In the last decade, array-based techniques have become popular and are widely used for the analysis of ambient vibrations. The concept of configurations in arrays, with simultaneous ambient vibration recordings, has proven its reliability and efficiency through several studies and investigations [16,46,47]. One of the most applied techniques for array processing is frequency–wavenumber analysis, commonly known as F–K analysis [27–30]. This technique is performed in the frequency domain and allows for the estimation of the back azimuth and the slowness of the seismic wave sources recorded by the array [47].

The theoretical aspect of this technique is based on two main assumptions. The first one is that the wavefront propagation is on the vertical plane, and the process is stationary in the other two planes. The second assumption is that the process is stationary in time. The microtremor wavefield is a superposition of seismic waves propagated from several distant sources [28]. The F–K analysis exploits the stationary and stochastic character of these propagated waves to construct the frequency–wavenumber power spectral density function, which holds information on the power as a function of the frequency and the velocity of travelling waves. The power spectral density can be estimated using the following two methods: the beam-forming method (BFM) [29,30], and the maximum likelihood method (MLM), also called the high-resolution method (HR) [27,28,48]. In this study, we used the beam-forming method to calculate the frequency–wavenumber power spectral density and thereby retrieve the velocity and the directions of the propagating seismic waves. The HR method provides a higher resolution in the frequency–wavenumber plane than the BFM method. However, the latter is less sensitive to measurement errors since it requires fewer computations [27]. Rosa-Cintas [16] calculated the shear-wave velocity using both the BFM and the HR methods, and the obtained dispersion curves were very similar. The ability of these methods to identify the directional properties of the noise wavefield is highlighted by Maresca [46].
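A bare-bones version of the conventional (beam-forming) estimator at a single frequency can be sketched as follows: the cross-spectral matrix of the vertical-component spectra is projected onto plane-wave steering vectors over a grid of trial wavenumbers, and the grid maximum gives the slowness and back azimuth at that frequency. This is a simplified illustration, not the Geopsy implementation used by the authors.

```python
import numpy as np

def fk_power(spectra, coords, k_grid):
    """Conventional beam-forming F-K power at one frequency.

    spectra : (n_windows, n_sensors) complex Fourier coefficients at that frequency
    coords  : (n_sensors, 2) sensor x, y positions in meters
    k_grid  : (n_k, 2) trial wavenumber vectors in rad/m
    Returns the normalized beam power for each trial wavenumber.
    """
    # Cross-spectral matrix averaged over time windows.
    csm = np.einsum("wi,wj->ij", spectra, spectra.conj()) / spectra.shape[0]
    power = np.empty(len(k_grid))
    for i, k in enumerate(k_grid):
        steer = np.exp(-1j * coords @ k)  # plane-wave phase delays per sensor
        power[i] = np.real(steer.conj() @ csm @ steer) / coords.shape[0] ** 2
    return power
```

The slowness of the propagating wave then follows from the peak wavenumber as s = |k|/(2πf), which is how one point of the dispersion curve is obtained at each processed frequency.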

## *3.4. Electrical Resistivity Surveying*

The electrical methods allow for the estimation of the resistivity at the different layers that compose the subsurface structure and, therefore, for inferences of the thicknesses of these layers [49–51]. In this study, we performed two electrical prospections in order to obtain the thickness of the Quaternary alluviums. Moreover, the information available from thirteen additional electrical surveys was used.

The resistivity of a rock is the physical property that determines the ability of the rock to conduct the electric current. In a sedimentary layer, the resistivity is mainly controlled by the electrical resistivity of the fluids within this layer.

The prospecting consists of injecting an electric current into the soil column using two electrodes. After that, the potential difference is measured between two other electrodes planted between the injecting electrodes. This technique allows us to image the vertical succession of layers with different resistivities. One must note that the higher the resistivity is, the more solid the rock is. The depth of investigation is proportional to the spacing between the transmitter electrodes. Keeping a constant spacing between the electrodes yields a profile of the lateral and vertical variations of the resistivity.

The configuration of the measuring device is chosen according to the type of study and the expected results. There are several possible electrode configurations. The most commonly used are the Wenner, Schlumberger, and dipole–dipole configurations [49,52]. In this study, we used the dipole–dipole configuration.
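For the dipole–dipole configuration, the apparent resistivity follows from the measured ΔV/I through a purely geometric factor, k = π n(n+1)(n+2) a, where a is the electrode spacing within each dipole and n the separation factor. A minimal sketch, with hypothetical measurement values:

```python
import math

def apparent_resistivity_dipole_dipole(delta_v, current, a, n):
    """Apparent resistivity (ohm.m) for a dipole-dipole array.

    delta_v : measured potential difference (V)
    current : injected current (A)
    a       : electrode spacing within each dipole (m)
    n       : separation factor between current and potential dipoles
    Geometric factor: k = pi * n * (n + 1) * (n + 2) * a.
    """
    k = math.pi * n * (n + 1) * (n + 2) * a
    return k * delta_v / current

# Example: 10 m spacing (as in this survey); the 12.5 mV reading at 200 mA
# with n = 3 is a made-up measurement for illustration.
print(f"{apparent_resistivity_dipole_dipole(0.0125, 0.2, 10.0, 3):.1f} ohm.m")
```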

## **4. Geotechnical Information**

The set of parameters to be determined in the inversion process requires additional geotechnical information (number of layers, thicknesses, Vp, Vs, and densities). For a large area, such as the Middle-Chelif Basin, where lateral variations in the facies are quite frequent [2,9,32], a wide coverage of geotechnical data is recommended. In this study, geological and geophysical data were used and compiled for a well-guided inversion process with reliable results.

The synclinal aspect of the study area made it easier to determine the number of sedimentary layers since the oldest layers outcrop in succession on the edges of the Middle-Chelif Quaternary Plain, and dip under the alluviums [36]. The modified geological map, along with the lithological cross-sections (Figure 2), were good enough for fixing the number of layers.

A total of 16 geotechnical boreholes were also used, whose depths varied from 20 to 250 m [6,7,18]. These boreholes allowed us to fix the thicknesses of the Quaternary and Pliocene layers in some areas. To compensate for the lack of data and to obtain information on deeper layers, a total of 13 electrical profiles were added. Nine of them were made by the General Company of Geophysics between El-Attaf and Ain-Defla [7]. The profiles are oriented in a N–S direction, with an average length of 5 km and a maximum depth of investigation of 1000 m. The remaining electrical profiles were carried out by the ANRH (Agence National de Ressource Hydraulique) in the El-Attaf region. The average length was 800 m, with an investigation depth of 150 m.

The velocity and density values for the different sedimentary rocks are provided in Talaganov [17], obtained by using the seismic refraction method in the cities of Chlef, Beni-Rached, and El-Attaf. For the seismic bedrock, a Vs value ranging between 1800 and 2600 m/s was assigned. These values were obtained by the inversion of the ambient vibration data for the cities of Chlef [20] and Oued-Fodda [22]. There are two different formations composing the bedrock, the Cretaceous flysches, clays, and marls in the Middle-Chelif Plain, and the older Jurassic limestones on its southern borders. For the limestones, a Vs value of 2800 m/s and a Vp value of 4800 m/s were assigned using local earthquake
tomographic inversion results [53]. The younger marls and clays of the late Cretaceous constitute the bedrock of the sedimentary deposits in the region of Oued-Fodda since it outcrops about 2 km southwest of the city. The Vs values obtained in Issaadi [22] are, therefore, assigned to the Cretaceous marls (Upper Senonian).

A lithological and geotechnical model for the Middle-Chelif Basin was built from the compilation of all the gathered data (Table 1).

**Table 1.** Geotechnical information used for the inversion process. Hl: Holocene. Ps: Pleistocene. As: Astian. Pl: Placenzian. Ms: Messinian. Tr: Tortonian. Sr: Seravallian. Sn: Senonian.


## **5. Data Acquisition and Processing**

## *5.1. Data Acquisition*

Ambient vibrations were recorded using single-station and array measurement techniques. The HVSR method was applied at 164 ambient vibration measurement points, distributed over 20 profiles (Figure 3). The measurements were made following the SESAME recommendations [54]. The F–K method was applied to array measurements at seven sites. Additional measurements, using the electrical prospection technique, were carried out in the Mecheta-Nouasser and El-Amra regions. The aim was to determine the thicknesses of the Plio-Quaternary layers for the HVSR inversion. It is well-known that, in a sedimentary basin, the maximum depth to bedrock is observed beneath the youngest formations, which are generally located in the middle of the basin. Thus, ambient vibrations were recorded mainly in the Quaternary plain of the Middle-Chelif, from Zbabdja village in the west to Ain-Defla in the east (Figure 3). Moreover, in order to better image the synclinal shape of the basin, the HVSR profiles are perpendicular to its axis, varying from a WNW–ESE direction in Oued-Fodda to a NNW–SSE direction in the rest of the plain.

The single-station measurements were carried out in calm weather, with a recording time of 16 min away from human activities, and 26 min inside cities and villages. Ambient vibrations were recorded using Tromino seismographs, with a sampling rate of 512 samples per second. The first array measurement campaign (AR1, AR4, and AR2 in Figure 4) was carried out in February 2019, using between seven and nine Mark L22 seismographs (f<sup>0</sup> = 2 Hz) set in a circular configuration, with two apertures of 20 m and 50 m, respectively, at each site. The vertical components of these sensors were plugged into an Arduino-based multichannel acquisition system [55] for 30 min of simultaneous recording at each site. The second measurement campaign (AR3, AR5, AR6, and AR7) was carried out in September 2020, using nine new SS10 (f<sup>0</sup> = 1 Hz) triaxial velocity sensors connected to their respective

SL06 digitizers (SARA electronic instruments). The dispersion curves obtained in this way produced good resolution only for the first 50 m of the soil column. Thus, additional single-station measurement points were carried out at the center of each array, recording ambient vibrations at the same time, in order to perform a joint inversion and obtain a better resolution in both the shallow and deep sediments.

**Figure 4.** Compiled data in Zone 1.

For the electrical resistivity tomography survey, we used a set of 40 electrodes in a dipole–dipole configuration, linked to an ABEM Terrameter LS 2 resistivity meter. The spacing between the electrodes was 10 m.

The distribution of the single-station and array measurements was made according to the available geophysical and geological data. The aim was to optimize the data coverage for the three zones.

## Zone 1. The Oued-Fodda Plain.

Ambient vibrations were recorded at 67 sites, including 6 in the village of Zbabdja and 9 in Oued-Fodda city (Figure 4), and 58 of these are distributed over 8 profiles oriented in a WNW–ESE direction. As an attempt to image the rupture trace of the 1980 El-Asnam earthquake in the upper sedimentary layers, the profiles, Pr 6, Pr 7, and Pr 8 (Figure 3), are crossing the surface trace around the village of Zmoul (Figure 4), with 11 measurement points on the Sara-El-Maarouf anticline, which corresponds to the overlapping block. Ambient vibrations were also recorded using the array measurement technique at 5 sites, including 2 (AR3 and AR4) within Oued-Fodda city (Figure 4).

## Zone 2. The Carnot Plain.

A total of 64 single-station measurement points were conducted in this zone, distributed over 8 profiles oriented in a NW–SE direction (Figures 3 and 5). An array measurement in a circular configuration was carried out in the city of El-Attaf, with apertures of 20 m and 40 m, along with two single-station measurement points at the same site, in order to perform a joint inversion. An electrical resistivity survey was also carried out in the Mechta-Nouasser locality (Figure 5).

**Figure 5.** Compiled data in Zone 2.

## Zone 3. Rouina-Ain-Defla Region.

In this zone, 33 single-station measurement points were carried out along 4 profiles, oriented NNW–SSE. Two of these profiles, Pr 17 and Pr 20 (Figures 3 and 6), cross the cities of Rouina and Ain-Defla.

**Figure 6.** Compiled data in Zone 3.


In addition, an array ambient vibration measurement was carried out near the town of El-Amra, with apertures of 20 m and 50 m (AR7). Furthermore, an electrical resistivity survey (E14) was performed south of the El-Amra locality to better constrain the thickness of the shallow Quaternary layers in this area.

## *5.2. Data Processing*

## 5.2.1. HVSR Technique

The HVSR technique was applied to the ambient vibration measurements using Geopsy software as a processing tool [45]. The recordings of the 164 measurement points have been processed as follows: The time series were divided into 30-s windows that were 5% cosine-tapered. The windows were selected automatically using the anti-triggering algorithm, which allows for the avoidance of transients and the selection of only windows with stationary ambient vibrations. The anti-trigger parameters were used as recommended by the SESAME project [54]. The fast Fourier transform (FFT) was computed for each window. After that, the Konno–Ohmachi algorithm was applied in order to smooth the amplitude spectra [56], with a smoothing coefficient of 40. The HVSR curve was calculated for each selected window, and an averaged HVSR curve was retrieved. Some examples of the obtained HVSR curves at each zone are shown in Figure 7.
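The Konno–Ohmachi step can be sketched as follows: each smoothed value is a weighted average with the window W(f, fc) = [sin(b log10(f/fc)) / (b log10(f/fc))]^4, whose bandwidth is constant on a logarithmic frequency axis; b = 40 is the coefficient used here. The direct O(N²) loop below favors clarity over speed.

```python
import numpy as np

def konno_ohmachi(spectrum, freqs, b=40.0):
    """Konno-Ohmachi smoothing of an amplitude spectrum (b: bandwidth coeff.)."""
    smoothed = np.zeros_like(spectrum)
    for i, fc in enumerate(freqs):
        if fc <= 0.0:
            smoothed[i] = spectrum[i]   # leave the DC bin untouched
            continue
        with np.errstate(divide="ignore", invalid="ignore"):
            x = b * np.log10(freqs / fc)
            w = (np.sin(x) / x) ** 4
        w[np.isnan(w)] = 1.0            # the window equals 1 at f == fc
        w[freqs <= 0.0] = 0.0           # exclude the DC bin from the average
        smoothed[i] = np.sum(w * spectrum) / np.sum(w)
    return smoothed
```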

**Figure 7.** Some examples of the calculated HVSR curves at each zone.

## 5.2.2. F–K Analysis

The F–K analysis was applied to array recordings using the Sesarray package [45]. The first step consisted of calculating the array transfer function, along with the theoretical wavenumber limits (Kmin–Kmax) (Figure 8), from the number of receivers and the XY coordinates of each receiver of the array. The computation was carried out using WARANGPS software from the Sesarray package. After that, the beam-forming method (BFM) was applied to the vertical components using Geopsy software. The signals were divided into windows of frequency-dependent lengths, including 50 periods. For the processing, two F–K gridding parameters had to be defined: the grid step and the grid size. The grid step corresponds to the Kmin/2 value, which determines the maximum resolution. The grid size corresponds to the Kmax value, which determines the aliasing limit. As a result, a dispersion curve is obtained, i.e., the slowness as a function of the frequency.
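The array transfer function that underlies these limits can be computed directly from the sensor coordinates as the normalized response of the array to a plane wave of wavenumber k; the width of its central lobe and the position of the first side lobes are what suggest the usable Kmin and Kmax. The circular nine-sensor geometry below is illustrative only.

```python
import numpy as np

def array_transfer(coords, kx, ky):
    """|sum_n exp(-i k.r_n)|^2 / N^2 over a wavenumber grid (rad/m)."""
    kxg, kyg = np.meshgrid(kx, ky)
    phase = np.exp(-1j * (np.outer(coords[:, 0], kxg.ravel())
                          + np.outer(coords[:, 1], kyg.ravel())))
    resp = np.abs(phase.sum(axis=0)) ** 2 / coords.shape[0] ** 2
    return resp.reshape(kxg.shape)

# Illustrative circular array: 9 sensors, 20 m aperture (radius 10 m).
angles = np.linspace(0, 2 * np.pi, 9, endpoint=False)
coords = 10.0 * np.column_stack([np.cos(angles), np.sin(angles)])
k = np.linspace(-1.0, 1.0, 201)
resp = array_transfer(coords, k, k)
print(f"central peak = {resp.max():.2f} (1.0 at k = 0 by construction)")
```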

**Figure 8.** Theoretical wavenumbers obtained at each array recording site.

## 5.2.3. Inversion of HVSR and Dispersion Curves

The inversion process was carried out using the Dinver software from the Sesarray package [45]. This software uses the neighborhood algorithm to estimate the shear-wave velocity profiles from the HVSR and dispersion curves [44]. The geotechnical model built from the gathered geotechnical data (Table 1) was used as an input parameterization for the process. However, the number of layers was not constant; it varied from one site to another depending on the lithological aspect at each measurement site.

The HVSR curves were inverted entirely (0.2–20 Hz). The maximum number of iterations was fixed at 350, and 100 models were generated at each iteration. Only models with acceptable misfit values were considered. The threshold for an acceptable misfit value was 0.45 for curves with one frequency peak and 0.6 for curves with two frequency peaks. For curves with multiple frequency peaks, misfit values under 0.8 were considered.
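The acceptance rule can be summarized as a small filter over the generated models; a sketch follows, with a hypothetical model layout and the thresholds quoted above.

```python
def misfit_threshold(n_peaks):
    """0.45 for single-peak curves, 0.6 for two peaks, 0.8 otherwise."""
    return {1: 0.45, 2: 0.6}.get(n_peaks, 0.8)

# Example: keep only acceptable models for a two-peak HVSR curve.
models = [{"misfit": 0.41}, {"misfit": 0.63}, {"misfit": 0.58}]
accepted = [m for m in models if m["misfit"] <= misfit_threshold(2)]
```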

For the array measurements, the dispersion curves were not inverted entirely; the curves were cut and considered only within the theoretical wavenumber limits (Kmin/2, Kmax). For each dispersion curve, the corresponding HVSR curve was added in order to perform a joint inversion. Both curves were equally weighted.
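A minimal sketch of the equal weighting is given below, assuming an RMS-type misfit for each curve (Dinver's internal cost function may differ) and a mask that keeps only the dispersion samples inside the theoretical wavenumber band.

```python
import numpy as np

def rms_misfit(observed, predicted):
    """Relative RMS misfit between two sampled curves."""
    return np.sqrt(np.mean(((observed - predicted) / observed) ** 2))

def joint_misfit(hv_obs, hv_pred, dc_obs, dc_pred, band_mask):
    """Equally weighted joint misfit of an HVSR curve and a dispersion
    curve; `band_mask` keeps only the samples inside the (Kmin/2, Kmax)
    band. A sketch, not Dinver's actual implementation."""
    return 0.5 * rms_misfit(hv_obs, hv_pred) + \
           0.5 * rms_misfit(dc_obs[band_mask], dc_pred[band_mask])
```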

## **6. Results and Discussions**

## *6.1. Fundamental Frequencies and the Corresponding Amplitudes*

The resonance frequencies and the corresponding amplitudes obtained from the HVSR analysis are mapped in Figure 9. The frequencies range between 0.75 and 12 Hz, while the amplitudes vary from 2 to 6.2. The obtained HVSR curves contain one frequency peak, two peaks, and multiple peaks. Curves with two frequency peaks are predominant, especially in the Quaternary plain.

In the Oued-Fodda Plain (Zone 1), the predominant resonance frequencies of the soil vary between 1.5 and 3 Hz, and the corresponding amplitudes vary between 2 and 6. However, particularly in the center of Oued-Fodda city, higher frequencies, between 5 and 12 Hz, are observed. This high-frequency range is related to the outcropping bedrock in the area [22]. In the village of Zbabdja, about 3 km west of Oued-Fodda city, the frequencies are around 1.5 Hz. In terms of shape, two types of curves are observed in this zone: curves with two frequency peaks are predominant in the Oued-Fodda Plain, while curves with one peak were obtained only from recordings in the central parts of Oued-Fodda city and Zbabdja village. Second peaks at higher frequencies are related to impedance contrasts in the soil column at shallow depths [22]. These peaks are only observed at sites located on Holocene alluviums. This allowed us to deduce that the shallow impedance contrast between the Holocene silt layer and the Pleistocene gravel and conglomerate layers is responsible for the second frequency peaks.

In the Sara-El-Maarouf anticline, the predominant resonance frequency is around 2 Hz, while the amplitudes of the peaks range between 2.5 and 4.5. The second peaks at higher frequencies appear to be related to the impedance contrast between the Upper Pliocene sands and the Lower Pliocene sandstones.

In the Carnot Plain (Zone 2), the predominant resonance frequencies vary between 0.75 and 5 Hz. The corresponding amplitudes vary from 2.2 to 5.8. HVSR curves with one frequency peak, two peaks, and multiple peaks were observed in this area.

The variation in shape for these curves is mainly related to the variation in the lithology, with the appearance of new sedimentary layers and the disappearance of others in some places, caused by different factors that affect the sedimentation process, such as erosion and faulting. The shapes of the peaks are mainly controlled by the dipping angles of the different layers [54,57].

Peaks at higher frequencies (between 8 and 12 Hz) are observed on the hills overlooking El-Attaf city from the southwest. These peaks appear to be caused by the outcropping Cretaceous marls. In the city of El-Abadia, the resonance frequency is around 2 Hz, while in El-Attaf city, it is around 3 Hz.


**Figure 9.** (**A**) Fundamental frequencies obtained from the HVSR analysis. (**B**) Amplitudes of the HVSR fundamental frequency peaks. Red circles represent the cities. The blue line represents the surface trace of the El-Asnam fault. OF: Oued-Fodda. AB: El-Abadia. AT: El-Attaf. RO: Rouina. AM: El-Amra. AD: Ain-Defla.

In the third zone, the resonance frequencies range between 1.3 and 3 Hz. The corresponding amplitudes vary between 2.2 and 3.5. Two types of HVSR curves were obtained: curves with one frequency peak in the southern part, and curves with two frequency peaks in the northern part. The frequency of the second peak varies from 5 to 12 Hz and appears to be related to an impedance contrast between the Holocene and Pleistocene alluviums at shallow depths. In the city of Ain-Defla, the resonance frequency of the soil increases from 1.8 to 3 Hz, from north to south, respectively. The corresponding peak amplitudes are around 5.

## *6.2. F–K Analysis: Surface Wave Dispersion Curves*

In Figure 10, the dispersion curves estimated from the different deployed arrays are shown. In some places, a single array was enough to obtain a reliable dispersion curve, while, in other places, two arrays of different apertures were implemented, obtaining an average dispersion curve. The frequency range is fixed by the theoretical wavenumber limits (Kmin/2, Kmax), which depend on the array configuration and aperture (Figure 8). The curves are, therefore, not very reliable outside the theoretical wavenumber limits.

In the city of Oued-Fodda and its surroundings, the analysis of the array measurements at sites AR1, AR2, AR3, AR4, and AR5 provided dispersion curves between 2.5 and 10 Hz. The dispersion curves obtained from the array recording AR6 in the city of El-Attaf are reliable in the frequency range between 5 and 10 Hz, and between 2.5 and 8 Hz in the city of El-Amra (AR7).

**Figure 10.** Surface wave dispersion curves.

## *6.3. Electrical Resistivity Tomography*

The electrical resistivities, obtained from measurements in Mechta-Nouasser (E1, Figure 5) and in the southeast of El-Amra (E14, Figure 6), are shown in Figure 11. In order to interpret the obtained results, we used the resistivity scale established by the CGG [7] for the Middle-Chelif Basin (left panel in Figure 11).

For the resistivity profile (E1), the investigation depth of 115 m allowed for the identification of the two Quaternary layers. The thicknesses of these two layers (1 and 2 in Figure 11) appeared to be about 50 m in total (~10 m of Holocene and ~40 m of Pleistocene alluviums). For the profile (E14), the depth of investigation was 105 m. The resistivity scale for the Middle-Chelif Basin allowed us to identify four different sedimentary layers; the two top layers belong to the Quaternary, while the underlying layers are attributed to the Miocene. The thickness of the Quaternary deposits is about 35 m (~10 m of Holocene and ~25 m of Pleistocene).

The results of the electrical resistivity measurements in Mechta-Nouasser (E1) and El-Amra (E14) allowed us to fix the thicknesses of the Quaternary layers (Holocene and Pleistocene) for the inversion of the HVSR curves recorded at Sites 77, 78, and 149.

**Figure 11.** Resistivity profiles. The shear-wave velocity models correspond to the HVSR points, P77 and P78. The top left panel represents the resistivity scale for the Middle-Chelif Basin [7]. The bottom left panel represents the lithological units identified in the resistivity profiles.

## *6.4. Shear-Wave 2D Velocity Profiles*

The inversion of the obtained HVSR curves allowed for the retrieval of the S-wave velocity models at each site. For sites where array measurements were also taken, a joint inversion of the obtained dispersion and HVSR curves was carried out. The number of analyzed layers, as well as the range of the respective Vs and thickness parameters, were constrained according to previous information obtained from geological cross-sections, boreholes, and geotechnical studies (see Table 1). The aim was to obtain a result closer to the true model by minimizing the misfit value. Some examples are shown in Figure 12.

**Figure 12.** Examples of the inversion results. The left panels for each site represent the Vs models. The black line corresponds to the best-fit model, and the dark grey represents models within the minimum misfit + 10%. All the tested models are in light grey. In the right panels, we show the computed fundamental mode of the Rayleigh wave ellipticity curve (dark grey curve) and the inverted part of the HVSR curve (black dotted curve). For sites AR1 and AR3, the bottom of the right panel represents the surface wave dispersion curve.

For the first zone, the eight obtained 2D velocity profiles are mapped in Figure 13. The two uppermost layers in the plain correspond to the two Quaternary stages, the Holocene and Pleistocene, mainly composed of alluviums. The obtained Vs values for the Holocene alluviums vary between 210 and 350 m/s. This range is explained by the lateral variation between clays and silts. For the Pleistocene layer, the Vs value ranges between 405 and 630 m/s. This variation is due to the alternation between gravel and clay facies. The high contrast in velocity between the two Quaternary stages is responsible for the second frequency peak in the area. The thickness of the Holocene formation does not exceed 10 m, while the Pleistocene deposits reach a maximum thickness of 65 m north of Oued-Fodda city.

On the Sara-El-Maarouf anticline, in the profiles Pr6, Pr7, and Pr8 (Figure 13), the two topmost layers correspond to the Upper and Lower Pliocene. The Vs values obtained for the Upper Pliocene sands vary between 300 and 400 m/s, and between 610 and 780 m/s for the Lower Pliocene sandstones. As for the Quaternary stages down in the plain, the contrast in velocity between the two Pliocene layers is responsible for the second frequency peak in the HVSR curves obtained from recordings on the Sara-El-Maarouf anticline. These formations reach a maximum thickness of 100 m in the eastern part of the anticline. A thin Tortonian limestone layer lies under the Pliocene layers, for which the Vs value ranges between 650 and 900 m/s. This layer dips to the northwest and is not present in the Oued-Fodda Plain. The Quaternary layers in the plain are, therefore, lying directly over a thick layer composed of Serravallo-Tortonian marls and clays. This is the thickest sedimentary layer in the Middle-Chelif Basin, reaching a maximum thickness of 250 m in this zone. The Vs value for this layer varies between 850 and 1290 m/s. This formation is in direct contact with the Cretaceous bedrock, which is mainly composed of hard marls. The obtained shear-wave velocity of these marls varies between 1700 and 2300 m/s in this zone.

**Figure 13.** 2D shear-wave velocity profiles for Zone 1. Qt: Quaternary. Pl: Pliocene. Mi: Miocene. Cr: Cretaceous.

In profiles Pr 6, Pr 7, and Pr 8 (Figure 13), we can see that the sediments of the plain are separated from those of the Sara-El-Maarouf anticline by a deep reverse fault that outcrops at the surface, the so-called "El-Asnam fault". The surface trace coordinates of the fault, along with its dipping angle calculated by Yielding [8] and Ouyed [11], were taken into consideration for the inversion process. In profile Pr 7, we can observe normal faults at the top of the anticline. The surface trace of these faults was observed during the HVSR measurement campaign. The important variations in the thickness of the layers between the measurement points P54, P55, P56, and P57 (Profile Pr7 in Figure 13) allowed us to obtain an approximate idea of the fault position below the surface. The presence of normal faults at the top of the Sara-El-Maarouf anticline was highlighted before by Philip and Meghraoui [19]. This normal faulting results from the extension of the surface layers of the anticline due to the vertical slip on the El-Asnam fault [11,19].

The eight 2D velocity profiles in the Carnot Plain are mapped in Figure 14. Unlike the first zone, the synclinal shape of the basin is clearly visible in this zone. The depocenters are located north of the Chelif River, as attested by the CGG study [7]. The highest depths of the basin are observed around the city of El-Abadia, with a maximum depth of 760 m (P120 in Profile Pr14, Figure 14). Compared to the first zone, the lithostratigraphic column contains two additional layers of Miocene age: the Messinian marls and the Serravallian poudingues. The Holocene alluviums have a Vs value between 220 and 370 m/s and reach a maximum thickness of 21 m in the middle of the plain. The Pleistocene alluviums reach a maximum thickness of 86 m, and the Vs value for this layer varies between 330 and 680 m/s. The Upper and Lower Pliocene layers, which outcrop on a narrow band north of the plain, dip south under the Quaternary layers and disappear in the middle of the plain. The absence of Pliocene deposits in the south is due to marine regression during the late Miocene–early Pliocene period, when the shorelines were located in the middle of the plain [2]. These formations are therefore thicker in the north and reach thicknesses of 145 m (P92 in Profile Pr11, Figure 14). The Vs value for the Upper Pliocene sands and conglomerates varies between 390 and 750 m/s, and between 510 and 900 m/s for the Lower Pliocene sandstones. The Pliocene deposits lie over the Messinian blue marls, the uppermost layer of the Miocene. The Miocene deposits occupy most of the sedimentary column of the Middle-Chelif Basin, reaching thicknesses of 550 m. However, there is no impedance contrast between the four Miocene layers in the area. The Vs values for the blue marls stand between 640 and 1190 m/s. For the Tortonian sandstones, they vary between 830 and 1280 m/s, and between 890 and 1380 m/s for the Serravallo-Tortonian clays and marls. For the Serravallian poudingues and conglomerates, the shear-wave velocities vary between 1190 and 1400 m/s.

For the Cretaceous bedrock, the Vs varies between 1650 and 2270 m/s. This variation is due to the lateral change of formations from Senonian clays and marls to Albian marls and limestones. In the southwestern part of the plain (Profiles Pr 9, Pr 10, and Pr 11, Figure 14), the Jurassic limestones that compose the Temoulga Massif dip vertically under the Neogene sediments, with a faulted contact between the limestones and the Cretaceous marls [35]. The calculated shear-wave velocities for the Jurassic limestones vary between 2300 and 2670 m/s.

The third zone, which covers the Rouina-Ain-Defla region, has a different local geological context. The vast Quaternary plain gives way to hills and plateaus. The four 2D velocity profiles obtained for this zone are mapped in Figure 15. The lithostratigraphic column is dominated by the Tortonian sandstones and Serravallo-Tortonian clays and marls. The Vs values for the sediments are in the same range as for the second zone. In profile Pr20 (Figure 15), we can see the structural complexity that characterizes this closure zone.


**Figure 14.** Shear-wave velocity profiles for Zone 2. Qt: Quaternary. Pl: Pliocene. Mi: Miocene. Cr: Cretaceous. Ju: Jurassic.

**Figure 15.** Shear-wave velocity profiles for Zone 3. Qt: Quaternary. Pl: Pliocene. Mi: Miocene. Cr: Cretaceous. Ju: Jurassic.

The sedimentary layers become thinner and lie over different types of bedrock. The old formations become more rugged to the south because of the presence of the epimetamorphic Doui Massif (Figure 2). North of the city of Ain-Defla (Profile Pr 20), the Albian clays and marls are separated from the Jurassic limestones by a block of Neocomian schists, which belongs to the Arib Massif that overlooks the Upper-Chelif Plain. The presence of the Neocomian schists marks the transition between the Middle and Upper Chelif Basin. The Vs values for this formation vary between 2280 and 2410 m/s.

The average Vp/Vs ratio was calculated from the 164 velocity models. The averaged ratio for the sedimentary column is 1.95. For the bedrock formations, the averaged ratio is 1.76. This value is in agreement with the 1.7 found by Bellalem [53] in a seismic tomography study.
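The quoted averages can be reproduced from layered models with a thickness-weighted mean, as sketched below; the (thickness, Vp, Vs) layout and the weighting choice are assumptions, since the averaging scheme is not detailed in the text.

```python
def average_vp_vs(models):
    """Thickness-weighted Vp/Vs over a list of layered models, each given
    as a list of (thickness_m, vp_m_s, vs_m_s) tuples (hypothetical layout)."""
    num = den = 0.0
    for layers in models:
        for thickness, vp, vs in layers:
            num += thickness * (vp / vs)
            den += thickness
    return num / den

# Example with two toy models (values are illustrative only).
print(average_vp_vs([[(10.0, 550.0, 280.0)], [(40.0, 1200.0, 620.0)]]))
```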

The depths to seismic bedrock are mapped in Figure 16. We can see that the basin reaches higher depths in a narrow band stretching from the west of El-Abadia to El-Amra, and reaches a maximum depth of 760 m (P120 in Profile Pr14, Figure 14), which is very close to the 800 m measured by the CGG [7].

**Figure 16.** Bedrock depths of the Middle-Chelif Basin. Red circles represent the cities. OF: Oued-Fodda. AB: El-Abadia. AT: El-Attaf. RO: Rouina. AM: El-Amra. AD: Ain-Defla.

## **7. Conclusions**

In the present work, ambient vibration data were used to investigate the soil characteristics and the site effects in the Middle-Chelif Basin, with both single-station and array measurements. The region is home to over 400,000 inhabitants, distributed over its six main cities: Ain-Defla, El-Amra, Rouina, El-Abadia, El-Attaf, and Oued-Fodda. Being located in a zone with moderate to high seismic activity, these cities have suffered important building damage during past earthquakes. The fact that most of them are built on soft sediment layers has played a role in amplifying the ground shaking and lengthening its duration.

The first tens of meters of the topsoil column play an important role in ground shaking amplification during an earthquake. In our case study, it corresponds principally to the thickness of the Quaternary layers. In the major part of the basin, these layers lie directly over the Miocene sediments, which creates an important impedance contrast. Thus, the thickness of the Quaternary layers and the corresponding Vs values had to be determined with more precision. For this purpose, seismic noise array measurements and electrical resistivity measurements were carried out. The HVSR technique was applied on ambient vibration recordings at 164 sites, and F–K analysis was applied on array recordings at 7 sites. The obtained HVSR curves allowed us to identify the soil resonance frequencies at each site. The F–K analysis allowed us to retrieve the surface wave dispersion curve at each of the 7 sites. The HVSR and dispersion curves were then jointly inverted to obtain the Vs profiles at each site.

The analysis of the HVSR curves showed the existence of different types of curves: curves with one, two, and multiple frequency peaks. The HVSR curves with two peaks are predominant in the Middle-Chelif Plain. The predominant soil resonance frequencies vary between 0.75 and 12 Hz. This wide range of frequencies is explained by the synclinal aspect of the basin, as well as the thinning of the sedimentary layers on its edges. The corresponding amplitudes vary between 2 and 6.2. However, the amplitudes obtained from the HVSR analysis may not be a reliable estimation of true amplification. The high-frequency peaks (between 5.9 and 12 Hz) appear to be related to an impedance contrast between the Holocene and Pleistocene alluviums at shallow depths.

In the Middle-Chelif Plain, the topmost layer is composed of thin Holocene alluvial deposits, for which the shear-wave velocity (Vs) ranges between 210 and 370 m/s. The largest thickness observed for this layer is 21 m. For the Pleistocene deposits, the Vs value varies between 330 and 680 m/s, with a maximum observed thickness of 86 m. The Pliocene deposits are present only in the northern part of the plain and on the Sara-El-Maarouf anticline in the west. The shear-wave velocity varies between 300 and 750 m/s for the Upper Pliocene sands and conglomerates, and between 510 and 900 m/s for the Lower Pliocene sandstones. The largest observed thickness for the Pliocene layers is 145 m, while the Miocene deposits occupy a major part of the sedimentary column, often composed of four distinct layers. Their thickness reaches 550 m around the city of El-Abadia. From the Messinian stiff marly formations to the Serravallian poudingues, the Vs values vary from 640 to 1450 m/s.

In the southern part of the Middle-Chelif Plain, Jurassic formations outcrop in succession on the massifs of Doui, Rouina, and Temoulga. The Jurassic limestones are dipping, generally north, and constitute the bedrock for the Neogene deposits in this part of the basin. The shear-wave velocity for these formations varies between 2300 and 2670 m/s. However, in a major part of the basin, the sedimentary deposits lie over a Cretaceous bedrock, for which the Vs values range between 1620 and 2300 m/s. The calculated Vp/Vs ratio for the sedimentary column is 1.95. For the bedrock formations, the ratio is 1.76.

The obtained velocity and resistivity profiles highlight the existence of important lateral variations in the velocity and resistivity of the whole sedimentary column. This variation is directly linked with variations in lithology (lateral changes of facies), caused by different factors that have affected the sedimentation process in the Middle-Chelif Basin (marine regression and transgression episodes during the Miocene and Pliocene, erosion, faulting, etc.).

The sedimentary layers show a synclinal aspect in the plain of Carnot, unlike the Oued-Fodda and Ain-Defla regions, where the underground structures are rugged, affected by different faults, folds, and outcrops. The depocenters in the Middle-Chelif Basin appear to be located in a narrow band extending from the western part of the town of El-Abadia to the northern part of the town of Rouina. The maximum observed depth to bedrock is 760 m.

The thickness of the sedimentary layers differs beneath the cities situated in the northern and southern parts of the Middle-Chelif Plain. In the city of Ain-Defla, which backs onto the northern flank of the epimetamorphic Doui Massif, the sedimentary column does not exceed 60 m, while in the cities of Oued-Fodda and El-Attaf, which were built around the western and eastern parts of the Temoulga Massif, respectively, the sedimentary column reaches a thickness of 200 m. The cities of Rouina, El-Abadia, and El-Amra, however, lie over a thick sedimentary cover that exceeds 300 m.

The cities cited in this study extend over a region where strong earthquakes with secondary effects (liquefaction, landslides, etc.) have already occurred several times before. This study is therefore intended as a contribution to a better assessment of the seismic hazard in the Chelif Basin. The results obtained here can be used, for example, in the modeling of strong ground motions in order to update the Algerian seismic code.

**Author Contributions:** A.I. acquired and processed the data and wrote the paper. F.S. proposed the methodology and reviewed the paper. A.Y.-C. reviewed and supervised the paper. J.J.G.-M. helped with array processing and reviewed the paper. A.M. performed the electrical prospecting. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was funded by the Consellería de Participación, Transparencia, Cooperación y Calidad Democrática de la Generalitat Valenciana, and by Research Group VIGROB-116 (University of Alicante).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available upon request from the corresponding author. The data are not publicly available because of their large size.

**Acknowledgments:** We would like to thank Hamai Lamine for providing the equipment for the resistivity measurements. We also thank Z. Sahari for her help with the lithological cross-sections. A special thanks to the CRAAG team that participated in field measurements: A. Belahouene, A. Sebbane, A. Saadi, B. Melouk, O. Haddad, and R. Chimouni.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## **Site Response Evaluation in the Trans-Mexican Volcanic Belt Based on HVSR from Ambient Noise and Regional Seismicity**

**L. Francisco Pérez-Moreno <sup>1</sup>, Quetzalcoatl Rodríguez-Pérez <sup>2,</sup>\*, F. Ramón Zúñiga <sup>3</sup>, Jaime Horta-Rangel <sup>1</sup>, M. de la Luz Pérez-Rea <sup>1</sup> and Miguel A. Pérez-Lara <sup>1</sup>**


**Featured Application: The results obtained in this study allow an initial overview of the variation of the site response in an area for which low to moderate seismic risk is usually considered. The information presented may be used to analyze the seismic effects in the study zone associated with the geological characteristics.**

**Abstract:** The Trans-Mexican Volcanic Belt (TMVB), located in central Mexico, is an area for which low to moderate seismic risk is considered. This is based on the limited instrumental data available, even though large historical earthquakes have damaged some urban centers in the past. However, site effects are an aspect that must be considered in estimating risk, because there are documented instances of important amplifications with serious consequences. In this work, ambient noise and earthquake records from 90 permanent and temporary seismic stations are used to analyze site response in the TMVB. The results obtained show a heterogeneous range in the value of the fundamental frequency. When possible, a comparison was made of the results obtained from ambient noise and earthquake records. In almost all these comparisons, no significant differences were observed in terms of the fundamental frequency. However, there were some stations with a flat average HVSR ambient noise curve that contradicted earthquake data results, which showed peaks at some frequencies. Our results are a first step towards categorizing the different site responses in the TMVB, but in order to provide finer details, it is necessary to improve the current monitoring conditions.

**Keywords:** site effects; spectral ratio; trans-mexican volcanic belt

## **1. Introduction**

Within the context of seismic engineering, the study of site response involves changes in variables related to the seismic intensity, in terms of amplitude, duration, and frequency content. These changes depend on the geological features at the measurement site, and usually, lead to larger amplitudes on soil sites than on hard rock.

Throughout history it is possible to find documented cases in which site effects have played a decisive role in observed damage after significant earthquakes, such as in San Francisco (1906), Mexico City (1985 and 2017), and Kobe (1995). Thus, in the seismic design of buildings and civil infrastructure it is important to estimate the maximum expected intensities at the site.

The evaluation of changes in amplitude of ground motion of a particular site is commonly made with respect to a reference position. The most popular and reliable way to do this is through the Standard Spectral Ratio (SSR) technique [1]. This method is based on the ratio of the Fourier amplitude spectra observed during an earthquake at a soil site *Ai*(*f*) to a reference site *Aj*(*f*). The Fourier spectra of ground motion *A*(*f*) can be expressed as the multiplication of source *S*(*f*), path *P*(*f*), and site *H*(*f*) terms in the frequency domain:

$$A(f) = S(f) \times P(f) \times H(f) \tag{1}$$

where *f* is the frequency.

In the SSR technique, the reference site is required to have the most homogeneous geological conditions as it is assumed to be free of anomalous amplifications. The two sites should be close to each other in comparison with the distance to the source. Considering that the analysis is performed for stations which register the same earthquake:

$$SSR(f) = \frac{|A\_i(f)|}{|A\_j(f)|} = \frac{|H\_i(f)|}{|H\_j(f)|}. \tag{2}$$

However, it is not always possible to find records of the same event with a good signal-to-noise (S/R) ratio at two stations that meet these conditions. Thus, the Horizontal-to-Vertical Spectral Ratio (HVSR) technique [2] is used as an alternative to studying the site response. In this method, the ratio between the mean of the horizontal components *AHi*(*f*) and the vertical component *AVi*(*f*), recorded at a single station, is calculated:

$$HVSR(f) = \frac{|A\_{Hi}(f)|}{|A\_{Vi}(f)|}. \tag{3}$$

This technique can be used to estimate the fundamental resonant frequency (*f*0) of a site, but is unreliable for determining its transfer function [3].
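Once the amplitude spectra are available, Equations (2) and (3) reduce to element-wise ratios; the short sketch below computes both on synthetic traces, with all variable names hypothetical and the horizontals combined as a simple arithmetic mean.

```python
import numpy as np

def amplitude_spectrum(trace):
    """One-sided Fourier amplitude spectrum of a trace."""
    return np.abs(np.fft.rfft(trace))

rng = np.random.default_rng(0)
soil, reference = rng.normal(size=4096), rng.normal(size=4096)
north, east, vertical = (rng.normal(size=4096) for _ in range(3))

# Eq. (2): soil-to-reference ratio for the same event (SSR).
ssr = amplitude_spectrum(soil) / amplitude_spectrum(reference)

# Eq. (3): mean-horizontal over vertical at a single station (HVSR).
mean_h = 0.5 * (amplitude_spectrum(north) + amplitude_spectrum(east))
hvsr = mean_h / amplitude_spectrum(vertical)
```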

In the literature, several studies can be found in which the site response is analyzed in areas around the world, based on these empirical techniques and numerical models [4–8]. The study of this phenomenon remains a current and important issue to be considered in the estimation of seismic risk and damage prevention, even in areas of low seismicity [9].

The Trans-Mexican Volcanic Belt (TMVB), located in central Mexico (Figure 1), is an example of those areas where the site response might represent a potential risk despite the low frequency of earthquakes. It is a volcanic arc that covers Cretaceous and Cenozoic magmatic provinces [10]. Seismicity in this region is related to extensional faults in the crust and normal faults in the subducted Cocos plate. This zone is located between 18°30′ and 21°30′ and extends from the coast of the Pacific Ocean to the Gulf of Mexico. It has an approximate length of 1000 km and a variable width of 90–230 km.

**Figure 1.** Location of the Trans-Mexican Volcanic Belt and interaction of tectonic plates in Mexico.

This area presents low seismicity compared to the Pacific Ocean coast, where the Cocos plate subducts under the North American plate. Nevertheless, there are important recent and historical occurrences of destructive events [11,12]. Examples of these are the earthquake that occurred in Acambay on 12 November 1912 (Mw 6.9, H = 33 km) [13] and, more recently, the Puebla earthquake of 19 September 2017 (Mw 7.1, H = 51.2 km) [14]. This aspect is particularly important given that the TMVB comprises some of the largest cities in Mexico, with significant industrial development and population growth in recent years.

It may be difficult to find a zone in the TMVB with negligible site effects, due to the variability in geological properties [10]. Several articles have been published on this topic related to Mexico, mainly focused on the behavior of site effects in Mexico City.

Ordaz et al. [15] found that stations located in the hill zone in the Valley of Mexico present amplifications that are 10 times higher than those predicted from ground motion attenuation equations. In a later study, García et al. [16] compared amplification responses at some of the stations analyzed by [15]. The records analyzed corresponded to inslab earthquakes with an epicenter outside the TMVB, and they observed similar behavior despite the source type and location.

Singh et al. [17] briefly discuss HVSR curves for 2 stations crossing the TMVB, obtained from records of 9 shallow coastal earthquakes on the Pacific Ocean coast. Likewise, Lozano et al. [18] studied the influence of source characteristics on the site effects in the Valley of Mexico from 36 interplate and inslab Mexican earthquakes and 12 teleseismic events originating in South America. Their results agree with those obtained by [16], concluding that the observed site effects were independent of the characteristics and location of the source.

Clemente-Chávez et al. [19] presented the first study of site effects in the TMVB outside the Valley of Mexico, based on local shallow seismicity. They analyzed HVSR curves obtained from 22 earthquakes recorded by 25 seismological stations located in the study zone. The events studied have depths H < 10 km, and magnitudes between 3.6 and 4.3. Average values for the fundamental frequency *f*<sup>0</sup> and amplification factors obtained from the HVSR curves were reported. Their results were compared with previous studies and they attributed the differences found to the location of the source. However, due to the limited amount of data employed, some of their conclusions were supported only with one or two records. Furthermore, it should be considered that HVSR curves do not unequivocally represent the actual site amplification.

Considering the low density of permanent seismological stations in the TMVB [20], the number of records that can be analyzed with the SSR technique is very limited. So, the use of ambient noise records represents a feasible option to include additional data for the analysis of site response in terms of *f*0. In this paper, we analyze HVSR curves obtained from ambient noise and regional instrumental records related to 121 crustal and inslab events with an epicenter inside the TMVB. Data considered were recorded by 90 seismological stations belonging to seven seismic networks.

The results of our study may be useful in disaster prevention and in estimating the behavior of future buildings in the study area. The information presented may also be used to complement studies examining the damage observed to existing infrastructure after the occurrence of a seismic event [21,22]. However, it is important to consider that a more accurate and reliable analysis can be carried out throughout microzonation studies, so our results only give a general overview.

## **2. Data and Methods**

To perform this study, we compiled a catalog of earthquakes with epicenter inside the area of the TMVB, based on the seismicity catalogs of the Servicio Sismológico Nacional [23], the U.S. Geological Survey [24], and the International Seismological Centre (ISC) [25]. Avoiding duplications and magnitudes reported as "non-calculable", 2113 earthquakes with magnitude 1.5–7.8 and depth 1–191 km were selected.
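Merging several catalogs requires a duplicate-matching rule that is not spelled out here; the sketch below drops near-coincident entries using illustrative time and distance tolerances.

```python
import math

def distance_km(a, b):
    """Haversine great-circle distance between two epicenters."""
    la1, lo1, la2, lo2 = map(math.radians,
                             (a["lat"], a["lon"], b["lat"], b["lon"]))
    h = math.sin((la2 - la1) / 2) ** 2 + \
        math.cos(la1) * math.cos(la2) * math.sin((lo2 - lo1) / 2) ** 2
    return 2.0 * 6371.0 * math.asin(math.sqrt(h))

def merge_catalogs(events, dt_s=60.0, dx_km=50.0):
    """Keep the first of any pair of events closer than `dt_s` seconds and
    `dx_km` km; the tolerances are illustrative, not the authors' rule."""
    kept = []
    for ev in sorted(events, key=lambda e: e["t"]):
        if not any(abs(ev["t"] - k["t"]) < dt_s
                   and distance_km(ev, k) < dx_km for k in kept):
            kept.append(ev)
    return kept
```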

Based on this information, instrumental records were searched in seven permanent and temporary seismological networks in Mexico (Table 1), with broadband stations located inside the TMVB. It should be considered that over time there have been significant changes in instrumental density in this zone [20]. Additionally, the stations that belong to permanent networks are widely dispersed, and the availability of continuous data is limited to the last 30 years.

**Table 1.** Seismological networks that were analyzed in this study. The continuous data availability is variable depending on each station.


Data were gathered from stations with velocity sensors belonging to Servicio Sismológico Nacional (SSN) [23], Centro de Geociencias de la Universidad Nacional Autónoma de México (CGEO), GEOSCOPE [26], and the temporary networks: The MesoAmerican Subduction Experiment (MASE) [27,28], Colima Deep Seismic Experiment (CODEX) [29], Mapping the Rivera Subduction Zone (MARS) [30], and Geometry of the Cocos Plate (GECO) [31].

The stations belonging to SSN and CGEO are equipped with STS-2 broadband seismometers with sampling rates of 80 and 100 Hz. Only the ACIG station has a Trillium 120 seismometer with a sampling rate of 20 Hz. In the case of the GEOSCOPE network, the UNM station has an STS-1 velocity sensor recording at 20 samples/s. During their period of operation, the GECO network stations were equipped with Reftek 151B-60 and Guralp 40T sensors. The temporary networks MASE and CODEX had CMG 3T and 40T seismometers, respectively, with a sampling rate of 100 Hz. The MARS network was equipped with CMG 3T sensors at 40 sps.

Only records that met the following criteria were considered for the analyses: (1) Station and epicenter located inside the TMVB; (2) signal-to-noise ratio (S/R) > 2.0; and (3) clear arrival of P and S waves, based on visual inspection.
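Criterion (2) can be checked automatically; a minimal RMS-based version is sketched below, assuming the signal and pre-event noise windows have already been picked (the phase arrivals in criterion (3) were checked visually).

```python
import numpy as np

def rms(x):
    """Root mean square of a windowed trace."""
    return np.sqrt(np.mean(np.square(x)))

def passes_snr(signal_window, noise_window, threshold=2.0):
    """True when the RMS signal-to-noise ratio exceeds the S/R > 2.0 cut."""
    return rms(signal_window) / rms(noise_window) > threshold

# Toy check with synthetic windows.
rng = np.random.default_rng(1)
noise = rng.normal(scale=1.0, size=2000)
signal = rng.normal(scale=3.0, size=2000)
print(passes_snr(signal, noise))  # True
```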

Considering these restrictions, a total of 352 records related to 24 inslab and 97 crustal earthquakes were chosen (Figure 2). Such events have magnitudes between 2.0 and 7.1, a hypocentral depth of 1 ≤ H ≤ 105 km, and occurred between 1993 and 2019 in the TMVB. The main characteristics of the selected earthquakes are listed in Table 2.

In Figure 2, it can be observed that the central zone has the lowest number of records. As mentioned by Zúñiga et al. [20], the other regions present higher seismic activity due to active fault systems, the subduction zone on the Pacific Ocean coast, and the occurrence of inslab earthquakes in the central-eastern sector.

The number of records selected for each station was variable, and in most of them, the data with the characteristics sought were considerably limited. An HVSR analysis was performed at all stations with at least 10 earthquake records. This threshold was considered to include as many stations as possible, trying not to compromise the reliability of the results. It is important to note that despite the large number of seismological stations belonging to the MASE, MARS, and CODEX networks, the suitable records were limited due to the characteristics of the earthquakes during their short period of operation.

The processing of records was performed by means of the package Geopsy [32]. For each analyzed record, the mean and the trend were removed. The corresponding Fourier Acceleration Spectra (FAS) were smoothed using the Konno–Ohmachi function [33], considering a b-value of 20. Window lengths that included 95% of the total energy were taken, beginning from the S-wave arrival.
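For reference, the Konno–Ohmachi window has the closed form W(f) = [sin(b log10(f/fc)) / (b log10(f/fc))]^4 around each center frequency fc; a direct, unoptimized sketch of the smoothing follows (implementations such as Geopsy's may differ in details like normalization).

```python
import numpy as np

def konno_ohmachi_smooth(freqs, spectrum, b=20.0):
    """Smooth an amplitude spectrum with Konno-Ohmachi windows (b = 20
    here, matching the text). Direct O(N^2) evaluation for clarity."""
    smoothed = np.empty_like(spectrum)
    for i, fc in enumerate(freqs):
        if fc <= 0.0:
            smoothed[i] = spectrum[i]   # keep the zero-frequency bin as-is
            continue
        with np.errstate(divide="ignore", invalid="ignore"):
            x = b * np.log10(np.where(freqs > 0.0, freqs / fc, 1.0))
            w = np.where(x == 0.0, 1.0, (np.sin(x) / x) ** 4)
        w[freqs <= 0.0] = 0.0           # exclude non-positive frequencies
        smoothed[i] = np.sum(w * spectrum) / np.sum(w)
    return smoothed
```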





**Figure 2.** Location of epicenters and seismological stations analyzed in this study. Cyan circles: Crustal earthquakes; orange circles: Inslab earthquakes. Red triangles: MARS temporary network stations; blue triangles: CODEX temporary network stations; green triangles: MASE temporary network stations; purple triangles: GECO temporary network stations; black triangles: SSN permanent network stations; yellow triangle: CGEO permanent station; and white triangle: GEOSCOPE permanent station. Dashed line: TMVB.

Due to the lack of earthquake records, an HVSR analysis was performed in all the stations using ambient noise data. Although the energy of ambient noise does not compare with that of an earthquake, this second analysis allowed us to increase the number of stations analyzed and to compare the results obtained with both procedures. For each station, a database of 30 records of one-hour duration was collected on random dates within its period of operation. In all ambient noise records, the mean and trend were removed and the same smoothing process was applied as in the earthquake records. For the HVSR analyses, the records were divided into 60 windows of 1 min and the root mean square of the horizontal components was calculated. SESAME criteria [34] were considered for the analyses and classification of the curves obtained. Finally, the results were correlated with the geological characteristics of the study area.
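As a pointer to what the SESAME classification involves, the snippet below encodes the three reliability conditions for an averaged HVSR curve as they are commonly summarized (window length, number of windows, peak frequency, and amplitude scatter); the full guidelines [34] add further peak-clarity criteria.

```python
def sesame_reliable(f0, lw_s, n_windows, sigma_a_max):
    """Commonly cited SESAME reliability checks for an averaged HVSR curve:
    (1) f0 above 10 cycles per window, (2) more than 200 significant
    cycles in total, (3) amplitude scatter below a frequency-dependent
    bound (sigma_a_max is the maximum std around the peak)."""
    n_cycles = lw_s * n_windows * f0
    return (f0 > 10.0 / lw_s
            and n_cycles > 200.0
            and sigma_a_max < (3.0 if f0 < 0.5 else 2.0))

# Example: 60 windows of 60 s with a 1.2 Hz peak and moderate scatter.
print(sesame_reliable(f0=1.2, lw_s=60.0, n_windows=60, sigma_a_max=1.6))
```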

## **3. Results**

Due to the large extension of the study area, the stations analyzed were grouped into four zones. This division was based on the distribution of stations and on geologic information given by [10]. Most of the average curves shown correspond to the HVSR analyses based on ambient noise data, between 0.1 and 25 Hz. Some of the considered stations have a Nyquist frequency of 10 or 20 Hz, so not all curves are displayed in the same frequency range. When possible, a comparison is made with results from earthquake records. Regarding geology, information from the Geologic Map of North America [35] was used.

## *3.1. Western TMVB*

In the westernmost part of the TMVB, there are three main fault systems: Tepic-Zacoalco, Colima, and Chapala [10]. Regarding significant earthquakes, there is a historical account of the occurrence of a damaging event during the 16th century near the town of Ameca, in the state of Jalisco. For this event, a magnitude of Mw 7.2 ± 0.3 was estimated from a rupture vs. magnitude scaling relation [36,37]. Another earthquake occurred on 2 October 1847, with a mean magnitude of 5.7 ± 0.4, which has been associated with the faults of the Chapala Graben. It affected several towns and caused significant damage to buildings. An intensity of IX has been estimated for this event [38].

Later, on 11 February 1875 an earthquake near the city of Guadalajara caused great destruction in nearby areas including dozens of deaths [39].

Figure 3 shows the seismological stations analyzed in the Western sector of the TMVB. As can be observed, there is only one station belonging to a permanent seismological network. The other temporary stations are mostly concentrated in the south of this sector.

**Figure 3.** Geological units in the western sector of the TMVB [35]. Red triangles: MARS temporary network stations; blue triangles: CODEX temporary network stations; and black triangle: SSN permanent network station. Continuous line: Main fault systems in the area; dashed line: TMVB.

In this sector, there are a variety of geological formations ranging from the Lower Cretaceous to the Quaternary [35]. Most of the stations analyzed are located in the vicinity of the Colima volcanic complex. Figure 4 shows some of the average HVSR curves obtained in this zone, in which only the ANIG station could be analyzed with earthquake records.

Flat responses or complex shapes with low amplitudes at different frequencies can be observed in some curves: MA47, MORA, MAZE, CUAT, ZAPO, MA24, OLOT, SNID, SINN, EMBG (Figure 4), SCRI, and COMA (Figure 5). Station MA47 is located on sedimentary rocks from the Lower Cretaceous. Its HVSR curve has a flat shape and no significant impedance contrast is identified. In contrast, stations MORA, MAZE, CUAT, and ZAPO were deployed on Oligocene felsic rocks, and near their location there are volcanic and sedimentary rocks from the Neogene and Quaternary. The site response could be attributed to lateral heterogeneity and the presence of sedimentary rocks from the Cretaceous. Stations MA24, OLOT, SINN, and EBMG were located on mafic and intermediate rocks from the Neogene and Quaternary. MA24 and OLOT were relatively close to each other, but a resonant frequency cannot be identified in either of them. As for SINN and EBMG, the corresponding curves have a flat shape, indicating hard rock sites. Stations SCRI and COMA (Figure 5) were deployed on Quaternary sedimentary rocks, and in both of them a predominantly flat shape is also observed.

There is another group of stations in which the HVSR curves present multiple peaks, with amplitudes similar to what could be identified as the clearest peak. Station MA54 was located on sedimentary rocks from the Upper Cretaceous. The peak around 6 Hz could be attributed to a thin layer of sediments, and the amplitude between 0.6 and 2 Hz to the contact with formations from the Lower Cretaceous. Meanwhile, MA46 was deployed on Oligocene felsic rocks. The observed peaks above 2 Hz could indicate multiple layers of sediments of varying thickness above the bedrock. Station MA42 was on mafic and intermediate rocks from the Neogene and Quaternary. The shape of the HVSR curve corresponding to this station could be attributed to variations in the topography or a deep interface with Oligocene formations. SNID (Figure 4) and MA48 (Figure 5) were deployed on Quaternary sedimentary rocks. The amplitudes in SNID could be associated with lateral heterogeneities given its location near Neogene formations. Likewise, some of the multiple peaks in MA48 could be spurious or related to multiple layers of sediments at variable depth.

**Figure 4.** Average HVSR curves for stations located in the western sector of the TMVB. Blackline: HVSR results using ambient noise records; redline: HVSR results from earthquake records.

In the remaining stations, clear peaks of different amplitude can be observed, revealing different impedance contrasts between sediment layers and the bedrock: BAVA, JANU, MA44, MA45, ANIG, MA40, MA41, PAVE, GARC, ALPI, SANM (Figure 4), CDGZ, COLM, CANO, and ESPN (Figure 5). This is also indicative of possible amplifications in ground motion during the occurrence of seismic events. In the case of stations BAVA and JANU, located on Oligocene felsic rocks, the observed peaks at low frequencies could be associated to thick sedimentary deposits. High clear peaks at 3 and 4 Hz are observed at stations MA44 and MA45. These may be produced by strong impedance contrasts with interlayered sedimentary and volcanic rocks from the Lower Cretaceous.

**Figure 5.** Average HVSR curves for stations located in the western sector of the TMVB.

The stations located on mafic and intermediate rocks from the Neogene and Quaternary have a heterogeneous response (ANIG, MA40, MA41, PAVE, GARC, ALPI, and SANM) (Figure 4). In general, a clear peak is observed above 5 Hz, which could be attributed to an impedance contrast with thin layers of sediment. It should be noted that the average HVSR curve obtained with earthquake data at the ANIG station does not vary significantly in shape from that obtained from ambient noise. There is a peak above 10 Hz that could have been produced by the presence of thin volcanic layers.

Regarding the average HVSR curves of stations on Quaternary sedimentary rocks (ESPN, CDGZ, COLM, and CANO), there are varied shapes (Figure 5). In ESPN, there is a peak above 10 Hz, which may require the analysis of a wider frequency bandwidth. The clear peak in CANO could be associated with the contact with Neogene and Quaternary rocks. As for CDGZ, there is a high peak near 1 Hz, attributable to an impedance contrast with layers of the Lower Cretaceous or thick layers of sediments. Likewise, in COLM there is a small peak around 10 Hz, probably due to the presence of shallow unconsolidated sediments.

## *3.2. Central TMVB*

The Central sector is the one with the lowest instrumental density. There is a group of nine temporary stations concentrated in the southwest of the sector, and only two permanent stations that are considerably separated in the east (Figure 6). There were few earthquake records that met the established restrictions. For this reason, the results presented here correspond only to the analysis of ambient noise data.

Most of the stations in this sector were deployed on Neogene and Quaternary mafic rocks. In MA15, MA18, MA27, MA29, and MOIG, the HVSR curves have flat shapes and no significant impedance contrast is observed, while stations MA20 and MA28 do not show clear peaks (Figure 7).

Station MA21 has a clear peak at 1 Hz and additional unclear peaks above this frequency. In this case, the shape of the HVSR curve could be related to the calderas that predominate in the area or to lateral variations with sedimentary rocks from the Lower Cretaceous. Furthermore, the curve corresponding to the IGIG station shows peaks at frequencies beyond 10 Hz. This permanent station is located on felsic rocks from the Neogene, and its response could be related to thin layers of sediments.

**Figure 6.** Geological units in the central sector of the TMVB [35]. Red triangles: MARS temporary network stations; black triangles: SSN permanent network stations. Continuous line: Main fault systems in the area; dashed line: TMVB.

In stations MA16 and MA17, there are clear peaks at 1.5 and 5 Hz, respectively. This could be attributed to impedance contrasts with layers of sediments of different thickness.

## *3.3. Central-Eastern TMVB*


In the central-eastern zone of the TMVB lies the Acambay fault system, which includes several faults considered active [40]. In the same area, there are the Taxco-San Miguel de Allende and Chapala-Tula systems, with evidence of activity in recent years. Over time, a large number of relevant earthquakes have occurred in this sector, among which the following stand out.

In the 18th century, there are reports of crustal earthquakes related to the Venta de Bravo fault. The first occurred in November 1734, followed by more than 30 strong and small events between November 1734 and March 1735 [39]. The largest crustal earthquake in the TMVB during the 20th century took place on 19 November 1912, in Acambay [13]. It had a magnitude of Mw 6.9 with a maximum intensity of IX in the epicentral area. Later, a series of 90 earthquakes was registered between February and June 1979 in a region comprising Maravatio and the State of Mexico. The mainshock occurred on 22 February 1979, with a magnitude of Mw 5.3 and a maximum intensity of VIII [40].

**Figure 7.** Average HVSR curves for stations located in the central sector of the TMVB.

Figure 8 shows the seismological stations analyzed in this sector. In this case, it was possible to analyze nine stations with earthquake data and to compare the results with those obtained using ambient noise. Particularly noteworthy are the stations belonging to the MASE temporary network, deployed between 2004 and 2007. Although this network was numerous, owing to the characteristics of the events that occurred during this period, only a few stations had earthquake records with the characteristics sought.

In this sector, there are some stations located on Neogene and Quaternary geological units (Figure 8). In JRQG, ECID, SNLU, COAC, PTRP, ESTA, and MIXC, the HVSR curves are mostly flat, and no clear peaks are identified. In DHIG, CUIG, UNM, SABI, and PSIQ, the shape of the average curves obtained from ambient noise and earthquake records does not vary significantly (Figure 9). Small peaks can be observed at different frequencies, revealing no significant impedance contrasts or possible amplifications. As a result, the locations of these stations and those mentioned above may be considered hard rock sites. In the HVSR curve corresponding to station TEPE (Figure 10), it is not possible to identify a resonance frequency. This shape could be produced by lateral heterogeneity due to the vicinity of some Quaternary mafic rocks.

In some stations located on Quaternary volcanic rocks, the HVSR curves are almost flat (ACIG, PACH, MIMO, SAPE, SAPA, and TOSU). On the other hand, stations CUCE and CUNO have HVSR curves in which it is not possible to clearly identify any resonant frequency, and these could also be considered flat (Figure 10).

In station SALU, deployed on Quaternary sedimentary rocks, there are multiple peaks below 1 Hz and between 3 and 10 Hz (Figure 9). This could be due to the influence of topography or underlying geological units. The curve corresponding to station CHIC, located on Quaternary volcanic rocks, has multiple peaks at high frequencies, but only the one at 20 Hz can be considered clear, indicating shallow layers of sediments. Additionally, in stations VEGU, ARBO, and VLAD, the peaks between 1 and 10 Hz are not very well defined (Figure 10). These could be attributed to small impedance contrasts at different depths or lateral heterogeneity due to the presence of Neogene sedimentary rocks.

In station PTCU, there are clear peaks between 1 and 3 Hz that may be due to impedance contrasts with Cretaceous layers.

**Figure 9.** Average HVSR curves for some stations located in the central-eastern sector of the TMVB. Black line: HVSR results using ambient noise records; red line: HVSR results using earthquake records.

**Figure 10.** Average HVSR curves for some stations located in the central-eastern sector of the TMVB. Black line: HVSR results using ambient noise records; red line: HVSR results using earthquake records.

## *3.4. Eastern TMVB*

Regarding relevant historical earthquakes in this sector, one can mention the event of 1546, which caused significant damage in the city of Jalapa and nearby towns. There is no specific date reported for this event nor wider descriptions in the historical records, but it is known that the first Catholic church built in America was destroyed [41]. Later, on 4 January 1920, an earthquake with magnitude Mw 6.4 occurred in the city of Jalapa. This event caused structural damage in nearby towns, and it is second only to the 1985 Michoacán earthquake in terms of fatalities.

Figure 11 shows the seismological stations analyzed in this sector. Most of them belong to the temporary network GECO and only two to the permanent network of the SSN.

**Figure 11.** Geological units in the eastern sector of the TMVB [35]. Purple triangles: GECO temporary network stations; black triangles: SSN permanent network stations. Dashed line: TMVB.

In this sector, it was possible to analyze almost half of the stations with earthquake data (Figure 12). In the case of the LVIG station, no significant differences are found between the curves obtained using ambient noise and those obtained through earthquake data. As at this station, no impedance contrast is observed at AYAH and TPIG, and the shape of the curves is almost flat, indicating hard rock sites.

Moreover, TEPY was located on Quaternary mafic rocks, while QUEC and LUPE were on Neogene and Quaternary sedimentary rocks. In these stations, there are multiple peaks. For station TEPY, it is not possible to identify a value for the resonant frequency. As for QUEC and LUPE, there is a peak near 10 Hz, which may indicate the presence of thin deposits of sediments. Likewise, the peaks at lower frequencies could be due to the influence of sedimentary rocks from the Neogene and Upper Cretaceous.

**Figure 12.** Average HVSR curves for stations located in the eastern sector of the TMVB. Black line: HVSR results using ambient noise records; red line: HVSR results using earthquake records.

Stations TATA, NAOL, HUEY, and HUAT are on Quaternary mafic rocks and have varied fundamental frequencies. The observed clear peaks could be attributed to impedance contrasts related to layers of sediments with varied thickness. In TATA and HUAT, there are visible peaks between 1 and 10 Hz. The shape of these curves does not vary significantly when using earthquake and ambient noise data. In the case of the NAOL station, the amplitude of the peak is increased but its location in the considered frequency range does not change.

## **4. Conclusions**

From the results obtained in this study, it is observed that the site response in terms of the fundamental frequency *f*<sup>0</sup> varied widely throughout the TMVB, and there was no visible correlation between the shape of the HVSR curves and the extent of any particular geological unit. In almost 46% of the stations analyzed, the response was mostly flat or it was not possible to identify clear peaks. In most of these cases, no significant impedance contrasts were observed, so they could be considered hard rock sites. There was another small group of stations, distributed across the four sectors, in which multiple peaks were observed in the HVSR curves. Some of these peaks could be identified as spurious or attributable to the influence of underlying layers of varying stiffness and thickness.

Additionally, in approximately 36% of the analyzed sites there were clear peaks of varied amplitude in the HVSR curves. In these cases, it was possible to identify the value of the resonance frequency, which was mostly between 1 and 10 Hz. Similarly, in some sites there were peaks in frequencies lower than 1 Hz, which should not be underestimated considering the modern trend of tall buildings in the populated areas of the TMVB.

To date, most of the seismicity studies in Mexico have focused on earthquakes originating on the Pacific Ocean coast. Except in Mexico City, with its particular geotechnical properties, the ground motion from these events usually reaches the TMVB considerably attenuated. For this reason, this study focused on analyzing what happens in this zone with regional earthquakes. In this regard, it is important to note that there were sites with high peaks in all four sectors analyzed, not only in the central-eastern sector where Mexico City is located.

In some stations, it was possible to make a comparison of the results obtained from ambient noise and earthquake records. In most of them, there were no significant differences in the shape of the average HVSR curves. In the analyses carried out with seismic records, no evidence of non-linear behavior was identified in the site response with respect to the ambient noise results. This could be attributed to the fact that most of the events considered are of low magnitude. In general, there are few major earthquakes with both epicenter and instrumental records inside the TMVB. However, there were some sites with a flat HVSR ambient noise curve in which new peaks appeared when analyzing earthquake data. Considering that the energy associated with ambient noise is not comparable to that associated with a seismic event, it is possible that several sites considered as hard rock may have significant amplifications during the occurrence of an earthquake.

Based on these observations, we consider it necessary to increase the density of the permanent seismic instrumentation in the TMVB. This would provide greater coverage, better quality records, and the possibility of carrying out more analyses using earthquake data, including the SSR technique. Greater permanent seismic instrumentation would make it possible to obtain isofrequency maps, which could be used in soil-structure interaction analyses and as a basis for estimating seismic hazard and risk in the TMVB. It would also allow the risk of populated zones to be better delimited based on local seismicity, considering the significant growth that the cities in the area have experienced in recent years.

**Author Contributions:** Conceptualization, L.F.P.-M., Q.R.-P. and F.R.Z.; methodology, L.F.P.-M.; formal analysis, L.F.P.-M.; investigation, L.F.P.-M. and M.A.P.-L.; writing—original draft preparation, L.F.P.-M.; writing—review and editing, Q.R.-P., F.R.Z., J.H.-R. and M.d.l.L.P.-R.; supervision, J.H.-R., M.d.l.L.P.-R. and M.A.P.-L. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding. The APC was funded by L.F.P.-M.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** Special thanks to the Mexican National Council for Science and Technology (CONACYT) for the support provided to L. Francisco Pérez-Moreno during the realization of his Ph.D. studies and to Quetzalcoatl Rodríguez-Pérez in the Cátedras program (project 1126). SSN data was obtained by the Servicio Sismológico Nacional (México). Station maintenance, data acquisition, and distribution is thanks to its personnel. The authors thank Antonio de J. Mendoza Carvajal and Jorge A. Real Pérez for the maintenance and operation of the Geometry of the Cocos Plate (GECO) array. This network was financed by PAPIIT (projects IN110913, IN105816, and IN106119) and CONACYT (project 177676). We are also particularly grateful to Xyoli Pérez-Campos for providing access to the network data. We appreciate the contributions of the four anonymous reviewers of the initial version of this paper. Their valuable comments allowed us to improve this study.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Article* **Influence of Rainfall Intensity and Slope on the Slope Erosion of Longling Completely Weathered Granite**

**Haojun Tian 1,2 and Zhigang Kong 1,2,\***


**Abstract:** Serious slope erosion occurs in the distribution areas of completely weathered granites, and rainfall intensity and slope gradient are important factors affecting slope erosion. In this study, we investigate the erosion characteristics of the Longling completely weathered granite with a focus on the effects of rainfall intensity and slope gradient. Based on indoor 60-min simulated rainfall tests, we selected four slope gradients (10°, 20°, 30°, and 40°) and three rainfall intensities (50, 80, and 110 mm/h) for evaluation. A total of 12 groups of tests were conducted to analyze the erosion and surface hydrodynamic characteristics of the completely weathered granite slope. The results indicate a significant positive correlation between flow velocity and both rainfall intensity and slope gradient, and the correlation between rainfall intensity and flow velocity became stronger as the slope gradient increased. The peak sediment yield rate represents the moment at which the change in slope shape is maximized. After the peak appears, the slope no longer undergoes great deformation, and the sediment yield rate decreases and then becomes stable. Finally, slope flow velocity is described as a binary function of rainfall intensity and slope gradient, the two key factors that determine it. The findings provide a reference for the study of slope erosion in completely weathered granites.

**Citation:** Tian, H.; Kong, Z. Influence of Rainfall Intensity and Slope on the Slope Erosion of Longling Completely Weathered Granite. *Appl. Sci.* **2023**, *13*, 5295. https://doi.org/ 10.3390/app13095295

Academic Editors: Miguel Llorente Isidro, Ricardo Castedo and David Moncoulon

Received: 2 March 2023 Revised: 16 April 2023 Accepted: 20 April 2023 Published: 23 April 2023

**Copyright:** © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

**Keywords:** completely weathered granite; runoff; sediment production; slope erosion; flow velocity; hydrodynamic characteristics

## **1. Introduction**

Slope erosion is the result of an interactive process between slope water flow and soil, leading to erosion, transportation, and removal of soil from the slope surface under the influence of water. This phenomenon results in soil fertility decline, soil erosion, and slope damage. Numerous studies have demonstrated that the hydraulic characteristics of slopes are the primary factors that influence slope erosion [1], with rainfall intensity and slope gradient being the most crucial [2,3]. Changes in these factors can significantly alter slope runoff, affecting sediment transport capacity and erosion rate [4,5]. Additionally, the effect of slope runoff on erosion varies depending on soil type. Therefore, understanding the hydraulic characteristics of slope erosion is crucial for preventing and managing slope erosion.

To better understand the hydraulic characteristics of slope runoff, scholars have utilized indoor rainfall simulation tests to investigate the mechanisms of slope erosion [6] by varying rainfall intensity and slope gradient [7,8]. Extensive research has indicated a positive correlation between slope runoff velocity and both rainfall intensity and slope gradient [9,10]. As rainfall intensity increases, the water content in slope runoff also increases, while an increase in slope gradient accelerates the runoff's speed under gravity, leading to heightened impact and erosion on the soil surface. Dong et al. [11] studied the soil accumulation on highway slopes and found that sediment yield increases with increasing rainfall intensity. Niu et al. [12] conducted indoor experiments on different slope ratios and initial seepage fields under rainfall infiltration to investigate the failure mechanism and pattern of fly ash dam slopes under rainfall influence. Yan et al. [13] conducted simulated rainfall experiments under various typical underlying surface conditions to study the runoff characteristics and mechanisms under different rainfall intensities and durations. In addition, researchers have also studied thin-layer water flow on slopes [14–16]. Yang et al. [17] conducted artificial rainfall simulation experiments under varying slope gradients and rainfall intensities to determine the velocity law of thin-layer water flow. Various hydraulic parameters, including the Reynolds number, Froude number, Manning coefficient, and hydraulic power, have been applied to slope erosion research [18,19]. Yuan et al. [20] studied the soil erosion and hydraulic characteristics of the loess slope in Beijing and found that mean flow velocity is the hydraulic parameter most closely related to sediment concentration in runoff.

Slope erosion, a vital aspect of soil erosion, has garnered significant attention in China. Nevertheless, the patterns of erosion differ considerably among various soil types. Presently, research on slope erosion of diverse soil types focuses mainly on the Northwest loess region [21,22] and southern red soil region [23,24]. In the loess region, the loose soil texture and high porosity render it vulnerable to wind and water erosion [25,26], leading to severe slope erosion. Zhao et al. [27] calculated the slope, aspect, and channel network of loess slopes in different evolution stages and analyzed the relationship between variation characteristics and erosion and evolution processes of the slopes. In the southern red soil region, the red soil in the Jiangnan hilly area and Nanling Mountains are the primary research objects [28,29]. Red soil is characterized by high fertility, loose structure, and sensitivity to erosion [30,31], making it susceptible to rainfall erosion. Feng et al. [32] studied the impacts of rainfall intensity, slope gradient, and surface cover on the erosion process of granite red soil slopes and concluded that the impact of slope gradient on sediment yield increases with increasing rainfall intensity. However, in granite regions, erosion characteristics and mechanisms vary due to differences in granite weathering degree, resulting in limited applicability of slope management measures [33–35].

The northeast of Longling County in Yunnan Province is a distribution area of completely weathered granite and also an operating area of oil and gas pipelines. The oil and gas pipelines pass through the distribution area of completely weathered granite. To lay the pipelines, grooves need to be dug, and the completely weathered granite excavated from the grooves is then backfilled. However, the backfill soil containing completely weathered granite has high sand content, low clay content, loose structure, poor physical and mechanical parameters, and low fertility. As a result, the working slope surface is exposed, leading to poor erosion resistance. Under heavy rainfall conditions, the working slope is prone to erosion by slope runoff, resulting in slope collapse, surface water and soil loss, and other disasters. These problems result in pipeline exposure and suspension, which substantially impact the normal functioning of pipelines. However, traditional management methods have often proved insufficient, emphasizing the urgency of responding to pipeline safety challenges and ensuring the continuous operation of pipelines. It is crucial to conduct focused research on the completely weathered granite area, which involves determining the erosion characteristics and patterns of completely weathered granite slopes at a certain slope gradient and rainfall intensity. In addition, it is necessary to determine the spatiotemporal variation properties of flow velocity and hydraulic laws under different circumstances. This will ultimately provide a theoretical basis for soil erosion management of completely weathered granite slopes.

## **2. Materials and Methods**

## *2.1. Experimental Design*

According to the analysis of rainfall data in the Longling area, three rainfall intensity levels were selected for the experiments: 50, 80, and 110 mm/h. These correspond, respectively, to the average maximum single-point rainstorm of the past 10 years, the calculated 10-year single-point rainstorm, and the 100-year single-point rainstorm (the rainfall device can be set within 20–240 mm/h). Under each of these three rainfall intensity levels, four slope gradients (10°, 20°, 30°, and 40°) were experimentally evaluated, resulting in a total of 12 test configurations. During each experiment, the erosion process of the simulated slope surface was monitored in real time using a digital pan-tilt camera until the slope was damaged. The soil moisture contents and bulk densities were measured by drying methods before filling. The soil dry densities were controlled between 1.3 and 1.5 g/cm³, and the soil moisture contents were approximately 10% (near the natural state values). The test soil was naturally dried and passed through a 10 mm sieve to remove stones and weeds. The filling thickness was 50 cm, paved and compacted to reduce the influence of the boundary effect. After the filling was complete, the soil surface was scraped with a wooden board to create a certain surface roughness. The soil bulk density of the slope was measured using the ring knife method to verify that it reached the experimental design level. Each experiment lasted 1 h.
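
As a trivial sketch of the factorial design just described, the 12 test configurations can be enumerated programmatically; the variable names below are ours, not from the study.

```python
# Enumerate the 12 experimental configurations (3 rainfall intensities x
# 4 slope gradients) described above.
from itertools import product

rain_intensities = [50, 80, 110]    # mm/h
slope_gradients = [10, 20, 30, 40]  # degrees

tests = list(product(rain_intensities, slope_gradients))
assert len(tests) == 12
for i, (r, s) in enumerate(tests, start=1):
    print(f"Test {i:2d}: rainfall {r} mm/h, slope {s} deg")
```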

## *2.2. Test Materials and Devices*

The soil used in this experiment was trench backfill soil taken from the completely weathered granite distribution area of the Longling section of the China-Myanmar oil and gas pipeline in Longling County, Baoshan City, Yunnan Province. The sampling time was July 2020.

For simulated rainfall experiments, a down-spraying ZYJY-DZ02 system was used. The height of the rainfall source was 5.0 m, and the rainfall source was pure, non-polluted water. The adjustable-slope steel groove used in the simulated rainfall experiments was 3 m in length, 0.5 m in width, and 0.5 m in height. A V-shaped collecting port was placed at the tail of the soil bin to collect runoff and sediment samples generated by the simulated rainfall. The rainfall intensity could be controlled within the range of 20–270 mm/h using a switch in the control room, and the rainfall uniformity remained above 85%. The rainfall amount and duration could be controlled using a combination of the pressure pump and nozzle. Thus, the experimental rainfall system met the requirements for our experiments. The test device is shown in Figure 1.

**Figure 1.** Schematic diagram of the system for indoor simulated rainfall experiments. 1—Water storage tank; 2—Water supply pipe; 3—Pressure stabilizing water pump, flow meter, and water valve; 4—Rainfall simulator; 5—Rainfall nozzle; 6—Test bed; 7—Completely weathered granite for testing; 8—Slope adjustment device; 9—Triangular weir.

## *2.3. Test Procedure*

The experiments were conducted in the Soil and Water Conservation Laboratory of Kunming University of Science and Technology from August to October 2020. After the rain began, the initial runoff time of each test was recorded, and the erosion process was observed. Each rainfall event lasted 60 min. The flow rate was measured every 3 min, and a 1 L container was used to collect runoff sediment samples every 2 min (for a total of 30 samples in each test). After the slope flow became stable, a tracer method based on red ink was used to measure the slope runoff velocity. The entire test process was recorded using a high-speed camera, and the flow velocity was calculated from the recorded images. Flow velocity was measured over different slope sections of the 3 m soil trough: 0.5–1.5 m and 1.5–2.5 m from the top of the slope. After the rainfall test, the volume of the sample in the container was measured. After the sample was allowed to stand for 24 h to settle, the supernatant was poured off, and all the sediment in the container was transferred to a disposable paper cup for drying to determine the dry weight of the sediment.
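
As a small illustration of the tracer measurement just described, the surface velocity can be derived from the video frames at which the dye front enters and leaves a measuring section. The frame rate and frame numbers below are invented placeholders, not values from the experiment.

```python
# Estimating surface flow velocity from dye-tracer timings recorded on
# high-speed video. Frame rate and frame numbers are invented examples.
FPS = 120.0          # assumed camera frame rate (frames/s)
SECTION_LEN = 1.0    # m; both measured sections (0.5-1.5 m, 1.5-2.5 m) are 1 m

def tracer_velocity(frame_in, frame_out):
    """Surface velocity (m/s) from the frames where the dye front enters
    and leaves a measuring section."""
    travel_time = (frame_out - frame_in) / FPS
    return SECTION_LEN / travel_time

print(tracer_velocity(100, 580))    # upper section example: ~0.25 m/s
print(tracer_velocity(600, 1000))   # middle section example: ~0.30 m/s
```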

## *2.4. Test Data Analysis*

The hydrodynamic parameters of the slope runoff were determined as follows.

(1) The runoff yield rate *N* (mL/s) was calculated as

$$N = \frac{q}{t},$$

where *q* is the runoff (mL) in a single simulated rainfall event, and *t* is the single sampling time (s).

(2) The sediment yield rate *M* (g/s) was calculated as

$$M = \frac{M_0}{t},$$

where *M*<sup>0</sup> is the sediment yield in a single sampling time (g), and *t* is the single sampling time (s).

(3) The Reynolds number *Re* was calculated as

$$Re = \frac{vR}{\mu},$$

where *v* is the flow rate (m/s), *R* is the hydraulic radius (m), and *µ* is the kinematic viscosity coefficient of the water flow (based on the field tests, the water temperature was 16–18 °C, giving *µ* = 1.061–1.115 × 10⁻⁶ m²/s).

(4) The Froude number *Fr* was calculated as

$$Fr = \frac{v}{(gh)^{0.5}},$$

where *g* is the acceleration due to gravity (m/s²; based on the test site, *g* = 9.79 m/s²), and *h* is the water depth (m).

(5) The resistance coefficient *f* was calculated as

$$f = \frac{8ghJ}{v^2}$$

where *g* is the acceleration due to gravity (m/s²), *h* is the water depth (m), and *J* is the hydraulic gradient (*J* = sin *β*, where *β* is the slope gradient).

(6) The Manning coefficient *n* was calculated as

$$n = \frac{R^{2/3}S^{1/2}}{V}$$

where *R* is the hydraulic radius (m), *S* is the slope gradient (radians), and *V* is the average velocity (m/s).
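
To make the definitions above concrete, the sketch below bundles the six formulas into one routine. The input values in the example call are hypothetical, chosen only so that the output reproduces the qualitative behavior reported in Section 3.4 (laminar yet rapid flow); the viscosity default of 1.09 × 10⁻⁶ m²/s lies within the stated 1.061–1.115 × 10⁻⁶ m²/s range.

```python
# Illustrative calculator for the six hydraulic parameters defined above.
# All sample values below are assumptions for demonstration, not measured data.
import math

def hydraulic_parameters(q_ml, t_s, m0_g, v, R, h, beta_deg,
                         mu=1.09e-6, g=9.79):
    """q_ml: runoff per sample (mL); t_s: sampling time (s); m0_g: sediment
    per sample (g); v: flow velocity (m/s); R: hydraulic radius (m);
    h: water depth (m); beta_deg: slope gradient (degrees);
    mu: kinematic viscosity (m^2/s); g: local gravity (m/s^2)."""
    N = q_ml / t_s                         # runoff yield rate (mL/s)
    M = m0_g / t_s                         # sediment yield rate (g/s)
    Re = v * R / mu                        # Reynolds number
    Fr = v / math.sqrt(g * h)              # Froude number
    J = math.sin(math.radians(beta_deg))   # hydraulic gradient, J = sin(beta)
    f = 8 * g * h * J / v**2               # resistance coefficient
    S = math.radians(beta_deg)             # slope gradient in radians, as above
    n = R**(2 / 3) * S**0.5 / v            # Manning coefficient
    return {"N": N, "M": M, "Re": Re, "Fr": Fr, "f": f, "n": n,
            "regime": "laminar" if Re < 500 else "turbulent/transitional",
            "flow": "rapid" if Fr > 1 else "tranquil"}

# Hypothetical 40 deg / 110 mm/h sample: Re stays below 500 (laminar)
# while Fr exceeds 1 (rapid flow)
print(hydraulic_parameters(q_ml=4000, t_s=120, m0_g=270,
                           v=0.26, R=0.002, h=0.003, beta_deg=40))
```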

## **3. Results and Analysis**


## *3.1. Process of Runoff Formation*

As shown in Figure 2, a higher intensity of precipitation results in a more pronounced surge in the runoff rate. Taking a slope gradient of 10° as an instance, the peak runoff rates achieved at precipitation intensities of 50, 80, and 110 mm/h are 16.3, 31.2, and 38.5 mL/s, respectively, with corresponding average runoff rates of 13.6, 27.7, and 32.7 mL/s. Notably, the augmentation of both peak and average runoff rates is substantial when the precipitation intensity increases from 50 to 80 mm/h. However, this enhancement is not linear: the peak runoff rate experiences only a 23% surge, while the average runoff rate increases by less than 20%, as the precipitation intensity elevates from 80 to 110 mm/h.

**Figure 2.** Variation in runoff rate with time under different rainfall intensities for the same slope gradients. (**a**) Slope is 10°. (**b**) Slope is 20°. (**c**) Slope is 30°. (**d**) Slope is 40°.

It is also worth mentioning that, for a fixed precipitation intensity, the impact of slope gradient on the runoff rate is relatively minor. Specifically, when the precipitation intensity is 110 mm/h, the average runoff rate undergoes a modest 19% escalation as the slope gradient rises from 20° to 30°. Subsequently, as the slope gradient increases from 30° to 40°, the minimum increase in average runoff rate (8%) is observed. Consequently, it can be inferred that the rise in runoff rate with increasing slope gradient diminishes substantially once the slope gradient exceeds 30°.

For a given slope gradient, the runoff rates associated with distinct rainfall intensities exhibit a descending order of 110 mm/h > 80 mm/h > 50 mm/h. Specifically, the runoff rate initially rises as rainfall intensity grows, followed by a stabilization of the runoff rate. This phenomenon is attributed to the low soil water content during the initial rainfall stage, leading to the transformation of rainfall into seepage. As the soil water content gradually increases, the slope water content approaches saturation, and the rainfall is converted into slope runoff.

Compared to the slope gradient, rainfall intensity has a more significant effect on the runoff rate. Specifically, the escalation of rainfall intensity leads to an augmented amount of precipitation on the slope, leading to a corresponding increase in the runoff volume. In contrast, a rise in the slope gradient alters the slope's shape, resulting in greater downward acceleration and flow velocity of the water traveling down the slope.

## *3.2. Sediment Production Process*

As shown in Figure 3, an evident rise in the sediment yield rate occurs with an increase in the rainfall intensity for the same slope gradient. For instance, at a slope gradient of 40◦ , the peak sediment yield rates attained at rainfall intensities of 50, 80, and 110 mm/h are 0.81, 1.47, and 3.07 g/s, respectively, with corresponding average sediment yield rates of 0.24, 0.81, and 2.25 g/s, respectively. When the rainfall intensity elevates from 50 to 80 mm/h, the peak sediment yield rate experiences an 81% surge, and the average sediment yield rate increases by 235%. Similarly, when the rainfall intensity grows from 80 to 110 mm/h, the peak sediment yield rate undergoes a 107% rise, and the average sediment yield rate increases by 178%.

Under uniform rainfall intensity, a minor elevation in the slope gradient leads to a slight rise in the sediment yield rate. The most significant increment in peak sediment yield rate with increasing slope is detected under the rainfall intensity of 110 mm/h, with a substantial surge observed as the slope gradient escalates from 30◦ to 40◦ . Conversely, the most trivial rise in peak sediment yield rate occurs at the rainfall intensity of 80 mm/h, with only a modest increase detected as the slope gradient increases from 10◦ to 20◦ . The maximal increment in the average sediment yield rate transpires when the rainfall intensity is 110 mm/h, with a notable rise observed as the slope gradient increases. When the slope gradient increases from 30◦ to 40◦ , the average sediment yield rate undergoes a substantial escalation of 269%. In contrast, the most minor augmentation in average sediment yield rate is recorded when the rainfall intensity is 80 mm/h; with only a modest increment of 8% detected as the slope gradient elevates from 10◦ to 20◦ . For insignificant slope gradients and rainfall intensities, the effect of slope gradient on sediment yield rate is not apparent. However, as the slope gradient and rainfall intensity increase, the sediment yield rate also rises, with the slightest growth rate observed under the slope gradient of 20◦ and the rainfall intensity of 80 mm/h. Based on these observations, it can be speculated that a critical point exists at or near these conditions.

Elevating either the rainfall intensity or slope gradient induces an augmentation in sediment yield to a certain extent. Raising the slope gradient diminishes the stability of the slope. Meanwhile, alterations in rainfall intensity modify the intensity and erosive force of slope flow, with augmented runoff leading to heightened sediment yield. The simultaneous effect of increased slope and rainfall intensity accelerates the slope erosion process, giving rise to rills and depressions on the slope surface and collapses in areas with the most severe erosion. Augmentations in both rainfall intensity and slope gradient also abbreviate the time taken to reach the peak sediment yield.


**Figure 3.** Changes in sediment yield rate with time under different rainfall intensities for the same slope. (**a**) Slope is 10°. (**b**) Slope is 20°. (**c**) Slope is 30°. (**d**) Slope is 40°.


Under the same slope gradient, the sediment yield rate initially rises with the intensifying rainfall intensity, subsequently declines, and ultimately stabilizes. This occurs because as the slope surface experiences erosion, forming rills, the quantity of sediment transported escalates. When the slope morphology is disrupted, a sudden collapse triggers a massive volume of sediment to be carried away by the slope flow, reaching the peak sediment yield rate. Following the initial significant damage, the slope morphology rapidly achieves stability; during this phase, the slope flow is inadequate to inflict secondary damage on the slope, causing the sediment yield rate to diminish and stabilize.

## *3.3. Spatial and Temporal Differences in Flow Velocity*

As shown in Figure 4, under the examined slope gradients, the escalation in rainfall intensity leads to substantial growth in flow rate. Using the 40° slope gradient as an example, with rainfall intensities of 50, 80, and 110 mm/h, the mean flow velocities are 0.15, 0.18, and 0.26 m/s, while the peak flow velocities are 0.17, 0.23, and 0.32 m/s, respectively. As the rainfall intensity rises from 50 to 80 mm/h, the average flow velocity experiences a 23% enhancement, and the peak velocity shows a 35% expansion. When the rainfall intensity climbs from 80 to 110 mm/h, the average flow velocity exhibits a 35% increment, and the peak velocity demonstrates a 40% amplification. Hence, the upsurge in rainfall intensity has a notable impact on the flow velocity.


**Figure 4.** Comparison of slope flow velocity under different slope gradients. (**a**) Slope is 10°. (**b**) Slope is 20°. (**c**) Slope is 30°. (**d**) Slope is 40°.

The maximum flow velocity reaches 0.32 m/s, while the minimum flow velocity is a mere 0.09 m/s. The timing of the peak flow velocity does not coincide with the timing of runoff and sediment production. This is due to the formation of rills on the slope during erosion by overland flow, which alters the slope's shape, rendering it uneven and causing the overland flow to deviate from uniform laminar flow. The impact of rills on overland flow is multifaceted. Runoff within rills can continuously affect their inner walls, making them smoother, thereby diminishing resistance and augmenting flow velocity. Simultaneously, rills may also evolve along the cross-section, raising the height difference between the rill and the original slope, consequently generating a height drop in the slope flow and diminishing the flow velocity.

During simulated rainfall, the flow velocity on the slope exhibits fluctuations with a general upward tendency. The flow velocity demonstrates a significant positive correlation with both slope gradient and rainfall intensity. As the flow velocity escalates, the slope flow intensifies the erosion of the slope, and the continuous erosion of the slope surface by runoff leads to the emergence of rills on the slope surface. Rills predominantly appear in the middle section of the slope, while their presence in the upper and lower sections is scarce. This can be ascribed to the flow velocity and slope strength. The flow velocity in the upper part of the slope is comparatively low, and the volume of runoff is relatively small; as a result, the erosive force is insufficient to erode the slope. The flow velocity in the middle section is relatively high, and the volume of runoff is larger. In this scenario, the erosive force on the slope is relatively significant, culminating in the formation of rills of varying sizes in the middle section of the slope under the action of erosion from runoff. Although the flow velocity in the lower slope section is relatively substantial, the slope strength exceeds that of other sections. Consequently, the volume of runoff in the lower section of the slope is larger than in the upper section but smaller than in the middle section.

## *3.4. Hydraulic Characteristics of Slope Runoff*

As shown in Table 1, *Re* increases with the intensification of slope gradient and rainfall intensity. The value of *Re* remains lower than 500; therefore, despite the evident rise in *Re*, the flow of the slope runoff remains laminar. The mean value of *Fr* exceeds 1, signifying rapid overland flow that escalates with the augmentation of slope gradient and rainfall intensity. The resistance coefficient *f* diminishes with the amplification of slope gradient and rainfall intensity, which also elucidates why enhancing the slope gradient or rainfall intensity leads to greater slope runoff velocity. The Manning coefficient *n* reflects the roughness of the slope surface: the greater the slope gradient and rainfall intensity, the stronger the erosion by slope runoff, and the more readily depressions and rills form on the slope surface. The more complex the slope morphology, the greater the value of *n*.


**Table 1.** Slope runoff hydraulic parameters under different slope gradients and rainfall intensities.

## *3.5. Slope Flow Velocity, Rainfall Intensity, and Slope Gradient Fitting Equation*

An intensification of precipitation strength results in an expansion of surface water flow volume, subsequently accelerating the flow velocity of slope runoff. An increase in slope magnitude signifies a greater inclination of the terrain, which, under the influence of gravitational forces, amplifies the energy of water traversing the slope and further accelerates the flow velocity of slope runoff. The combined action and mutual constraints of these factors ultimately impact the velocity of water flow along the slope. To delve deeper into the impact of rainfall intensity and slope gradient on slope flow velocity, the rainfall intensity and slope gradient tangent values were denoted as X and Y, respectively. A binary fitting analysis of the rainfall intensity and slope gradient tangent value was subsequently conducted. The fitted surface is illustrated in Figure 5, and the fitting equation is as follows:

$$v = 0.057\ln X + 0.033\ln Y - 0.053 \quad (R^2 = 0.908)$$

**Figure 5.** Relationships among slope flow velocity, rainfall intensity, and slope gradient tangent value.
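
For readers wishing to reproduce this kind of binary log-linear fit, the sketch below recovers coefficients of the form v = a lnX + b lnY + c by ordinary least squares. The (X, Y, v) triples are synthetic placeholders generated from the reported equation, not the measured data of this study.

```python
# Reproducing a binary log-linear fit v = a*ln(X) + b*ln(Y) + c by ordinary
# least squares. The (X, Y, v) values below are placeholders.
import numpy as np

# X: rainfall intensity (mm/h); Y: tan(slope gradient); v: velocity (m/s)
X = np.repeat([50, 80, 110], 4)
Y = np.tile(np.tan(np.radians([10, 20, 30, 40])), 3)
v = 0.057 * np.log(X) + 0.033 * np.log(Y) - 0.053  # synthetic, noise-free

A = np.column_stack([np.log(X), np.log(Y), np.ones(len(X))])
(a, b, c), *_ = np.linalg.lstsq(A, v, rcond=None)
pred = A @ np.array([a, b, c])
r2 = 1 - np.sum((v - pred) ** 2) / np.sum((v - v.mean()) ** 2)
print(f"v = {a:.3f} lnX + {b:.3f} lnY + {c:.3f}  (R^2 = {r2:.3f})")
```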


## **4. Conclusions**



**Author Contributions:** Conceptualization, H.T.; methodology, H.T.; software, H.T.; validation, Z.K.; formal analysis, H.T.; investigation, Z.K.; resources, Z.K.; data curation, H.T.; writing—original draft preparation, H.T.; writing—review and editing, Z.K.; visualization, Z.K.; supervision, Z.K.; project administration, Z.K.; funding acquisition, Z.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the Key Research and Development Plan of Yunnan Province, The Technology of the Comprehensive Risk Assessment of the Earthquake Catastrophe and the Disaster Chains in Yunnan and Its Application (Grant No. 202203AC100003); the scientific and technological development project of Southwest Pipeline Co., Ltd. (Chengdu, China), National Pipe Network Group, Research on Hydraulic Protection and Soil and Water Conservation of Oil and Gas Pipelines through Fully Weathered Granite Area (Grant No. 2018016); and the science and technology development project of China Hydropower Foundation Co., Ltd. (Tianjin, China), Evaluation of rapid excavation of slope cut-off wall in complex geological background area and treatment technology of mud and water inrush in tunnel engineering (Grant No. 2022530103001936).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

## *Article* **Time-Dependence of the Mechanical Behavior of Loess after Dry-Wet Cycles**

**Kai Liu <sup>1</sup>, Tianfeng Gu 1,\*, Xingang Wang 1,2,\* and Jiading Wang <sup>1</sup>**


**Abstract:** The structure, time-dependent mechanical deformation, and strength characteristics of loess, which is loose and porous with well-developed vertical joints, are greatly affected by dry-wet cycles, which are attributed to periodic artificial irrigation, rainfall, and water evaporation. To better understand the creep characteristics of loess under the effect of dry-wet cycles, Q2 loess samples obtained from South Jingyang County, China, were subjected to different numbers of dry-wet cycles (0, 5, 10, 15, 20) and sheared in triaxial creep tests. The experimental results revealed that, firstly, the maximum value of the deviatoric stress corresponding to creep failure gradually decreases with an increase in the dry-wet cycles. Secondly, the long-term strength of the loess after dry-wet cycles was obtained through the Isochronous Curve Method; it is found that the long-term strength and the number of dry-wet cycles show an exponentially decreasing relationship. In addition, the creep damage mechanism of loess due to dry-wet cycles is proposed. This study may provide a basis for understanding the mechanical behavior of loess under the effect of dry-wet cycles, as well as guidelines for the prevention and prediction of loess landslide stability.

**Keywords:** loess; dry-wet cycles; triaxial creep test; creep behavior; strain-time; long-term strength

## **1. Introduction**

Loess is a weakly cemented loose sediment with a special material composition, which is widely distributed in the Loess Plateau of China (Figure 1) [1,2]. The creep phenomenon is common in loess-covered areas, and the creep characteristics of loess have an important impact on landslide formation and the construction of engineering structures [3–5]. The creep properties of loess vary with changes in the environment. For example, unsaturated loess soils are usually subjected to wetting because of natural rainfall or artificial irrigation, and loess soils are then subjected to a drying process, i.e., water evaporation, as rainfall or irrigation stops. Such a process can be called a dry-wet cycle [6,7] and leads to the following consequences: the loss of soluble salts in the loess [8–10], the deterioration of the mechanical properties of the loess [11,12], and changes in the shear strength, deformation, and permeability of the loess [13–15]. As a result, the creep deformation of the loess is aggravated, which contributes to the formation of disasters such as landslides.

The influence of dry-wet cycles on the time-dependent behavior of rocks has been considered in some research [16–19], while in the last few years increasing attention has been paid to experimental research into the behavior of loess soils. It is acknowledged that the reduction in the shear strength of the loess due to the dry-wet cycles is obvious [3,20,21]. Malusis et al. [22] reported that the hydraulic conductivity of the soil gradually increases as the number of dry-wet cycles increases.

**Citation:** Liu, K.; Gu, T.; Wang, X.; Wang, J. Time-Dependence of the Mechanical Behavior of Loess after Dry-Wet Cycles. *Appl. Sci.* **2022**, *12*, 1212. https://doi.org/10.3390/ app12031212

Academic Editors: Ricardo Castedo, Miguel Llorente Isidro and David Moncoulon

Received: 15 December 2021 Accepted: 23 January 2022 Published: 24 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Until now, extensive attempts have been made on the physical and mechanical properties of loess after dry-wet cycles by using the uniaxial tensile shear test [23], direct shear test [24,25], the uniaxial compressive test [26], conventional triaxial test [7,27], the soil-water characteristic curve (SWCC) of loess [28], etc. For example, Yuan et al. [23] carried out an experimental study on the tensile strength of intact loess samples under dry-wet cycles and concluded that the repeated dry-wet cycle destroys the original structure of the intact loess and causes its tensile strength to disappear. Furthermore, Mu et al. [24] investigated the shear characteristics of loess samples under dry-wet cycles through direct shear tests and pointed out that the shear strength of loess gradually decreased as the number of dry-wet cycles increases. By conducting experiments, Li et al. [26] reported that the unconfined compressive strength, elastic modulus, and cohesive force of the loess specimen decreased with an increase in the number of dry-wet cycles, while the vertical compressive strain and failure deformation increased. Hu et al. [27] conducted triaxial tests considering three influence factors, namely dry density, dry-wet cycle amplitude, and the lower bound water content of dry-wet cycles, and established a compacted loess deterioration model (CLDM). Wang et al. [29] explored the effect of the dry-wet cycle on the collapsible deformation characteristics of compacted loess, and the results revealed that the compressive strain at all levels of deviatoric stress of specimens with different initial density gradually increases with the increase of the number of dry-wet cycles. The higher the initial compaction degree, the more significant the effect of the dry-wet cycle on the mechanical behavior of compacted loess.

been made on the physical and mechanical properties of loess after dry-wet cycles by using the uniaxial tensile shear test [23], direct shear test [24,25], the uniaxial compressive test [26], conventional triaxial test [7,27], the soil-water characteristic curve (SWCC) of


**Figure 1.** The study area.

The studies above have focused on the variation in conventional mechanical properties such as shear strength, uniaxial tensile strength, uniaxial compressive strength, and triaxial shear strength of loess after dry-wet cycles. However, very little attention has been paid to the damaging effect of the dry-wet cycles on the creep properties of loess. Additionally, a few experimental results have shown that the creep strains of loess soils developed after dry-wet cycles are not negligible [30] and can reach a critical value, eventually triggering loess landslides. Against this backdrop, typical loess samples obtained from South Jingyang, China, were used to conduct triaxial creep tests under different numbers of dry-wet cycles to gain full insight into the time-dependent characteristics of loess soils and the deterioration of the long-term strength caused by dry-wet cycles. Furthermore, the damage mechanism of the creep characteristics of loess due to the dry-wet cycles is explored. This study would be beneficial for understanding the time-dependence of the mechanical behaviors of loess soils as well as the long-term stability of loess slopes.

## **2. Triaxial Creep Tests**

### *2.1. Sampling Site and Dry-Wet Cycle Process*


The loess samples used in the tests were collected from the back scarp of the Dabuzi landslide in Jingyang County, Shaanxi Province (Figure 1). The sampling site was about 20 m from the edge of the platform, and the collected samples were identified as Lishi loess (Q2) according to previous studies [31]. The loess in this area has suffered from different degrees of dry-wet cycles as a result of the long-term repeated rise and fall of the groundwater level due to farmland irrigation (Figure 2a) [32]. Therefore, the mechanical properties of the loess deteriorate, resulting in the occurrence of the Dabuzi landslide (Figure 2b). All loess samples were first sealed and kept in iron buckets. Then, the samples were brought back to the laboratory and immediately cut into cylindrical specimens with a diameter of 61.8 mm and a height of 125 mm. Procedures based on the ASTM standards were conducted to determine the index properties, namely the Atterberg limits [33], specific gravity and density [34], and moisture content [35]. The basic physical index properties of the loess are listed in Table 1.

**Figure 2.** The sampling site. (**a**) Irrigation canal, (**b**) landslide boundary.

**Table 1.** The physical index properties of loess.

| Moisture Content (%) | Dry Density (g/cm<sup>3</sup>) | Density (g/cm<sup>3</sup>) | Specific Gravity | Void Ratio |
|---|---|---|---|---|
| 16 | 1.503 | 1.858 | 2.711 | 0.768 |

A two-step process for preparation of the sample was used for each dry-wet cycle. The first step is sample saturation, which is briefly described as follows. The specimens were saturated by the vacuum saturation method. Filter papers and porous stones were first placed on the top and bottom of the soil sample. Then, the soil sample was fixed in a saturator chamber, as shown in Figure 3a, and the saturator chamber with the soil sample was tightened with screw nuts and placed in a vacuum container (Figure 3b). After that, the soil specimen was submerged in de-aired water inside the vacuum container for at least 24 h to achieve a degree of saturation greater than 97% [36,37]. The saturation phase was followed by a drying phase. After saturation, the soil sample was taken out by loosening the screw nuts and removing the filter papers and porous stones. Then, the soil sample, still held by the saturator chamber, was placed in a drying oven at about 105 °C for 48 h to fully remove its water content [38,39]. Next, the soil sample was taken out and cooled down. This process, consisting of saturation and drying phases, is called one dry-wet cycle in this study. After the last dry-wet cycle, the water required to obtain the target water content was dripped on the top surface of the soil sample, and then the soil sample was wrapped in plastic wrap and placed in a humidifier for 48 h to obtain a homogeneous sample at room temperature [40,41]. In this study, samples with different numbers of dry-wet cycles (0, 5, 10, 15, and 20) were prepared; the sample after 10 dry-wet cycles is shown in Figure 3c.

**Figure 3.** Specimen with the dry-wet cycles. (**a**) Frame saturator chamber, (**b**) vacuum container, (**c**) specimen after dry-wet cycles, (**d**) triaxial chamber.

### *2.2. Testing Apparatus*

The FSR-20 triaxial creep apparatus in the State Key Laboratory of Continental Dynamics in China, designed for testing cylindrical soil samples with a height of 125 mm and a diameter of 61.8 mm (see Figures 3c and 4), was utilized in this study. As shown in Figure 4, the apparatus consists of several parts: an axial loading system, a pore water pressure controlling system, a confining pressure controlling system, a matric suction controlling system, and a data measurement and collection system. By using a pneumatic servo control system, the apparatus is capable of providing not only constant shear stress but also constant air pressure for a long time with high accuracy [3,42]. Creep tests can be carried out for confining pressures up to 1 MPa, pore water pressures up to 500 kPa, axial loads up to 2 MPa, and pore gas pressures up to 500 kPa. The tests may be run under undrained or consolidated undrained conditions, with or without pore water pressure measurement, or under drained conditions. In this study, drained creep tests were conducted on the loess samples.

**Figure 4.** FSR-20 triaxial creep apparatus.

### *2.3. Testing Scheme*

According to the basic physical index properties listed in Table 1, samples with a moisture content of 16% were tested using the multi-level loading method [4,43]. Considering the density (ρ = 1858 kg/m<sup>3</sup>) and the thickness (20 m) of the overlying loess, a confining pressure of 200 kPa was chosen for the triaxial creep test. The multi-level loading method used in this study is described as follows: the first level of deviatoric stress was applied to the specimen until stability was attained, defined as a deformation of the specimen of less than 0.01 mm in 24 h. To verify that 24 h is a suitable observation window, a series of preliminary tests was conducted in which 24 h, 36 h, and 48 h were used for specimens under the same testing conditions. The creep shear test results show that for the cases of 24 h, 36 h, and 48 h, the strength and strain-time curves of the loess have no obvious differences, indicating that 24 h is enough for reaching the stability state of the loess. Therefore, 24 h was selected from the point of view of time saving. Then, the second level of deviatoric stress was applied to the specimen until the stability state was achieved. The specimen was subjected to shearing until creep failure occurred at some level of deviatoric stress. In this study, failure was considered to occur when the strain of the specimen reached 20%, following the previous study by Xie et al. [44]. The samples after different numbers of dry-wet cycles were subjected to consolidation and shearing in the conventional triaxial compression test to determine the instantaneous failure strength. Then, the loading scheme of the triaxial creep test (Table 2) was determined according to the instantaneous failure strength.
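To make the stabilization rule above concrete, the following minimal sketch checks whether a loading stage satisfies the criterion of less than 0.01 mm deformation change over a trailing 24 h window. The function name and the (time, deformation) data layout are illustrative assumptions, not part of the original test software.

```python
def is_stable(readings, window_h=24.0, tol_mm=0.01):
    """Check the stabilization criterion used above: the specimen's
    deformation must change by less than 0.01 mm over a full 24 h window.

    readings: list of (time_h, deformation_mm) pairs, time ascending.
    """
    if not readings:
        return False
    t_end = readings[-1][0]
    if t_end - readings[0][0] < window_h:
        return False  # the stage has not yet lasted a full window
    window = [d for t, d in readings if t >= t_end - window_h]
    return max(window) - min(window) < tol_mm

# e.g. hourly readings creeping at 0.0001 mm/h over 48 h -> stable
print(is_stable([(h, 1e-4 * h) for h in range(49)]))  # True
```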

**Table 2.** Loading scheme of the triaxial creep test.

| Number of Dry-Wet Cycles | Deviatoric Stress Levels (kPa) |
|---|---|
| 10 | 50, 125, 200, 250, 300 |
| 15 | 50, 100, 175, 250 |
| 20 | 50, 100, 175, 250 |

## **3. Test Results**

Based on the experimental data collected, the typical creep strain curves for loess with different numbers of dry-wet cycles are plotted in Figures 5–9. According to Boltzmann's linear superposition principle [45,46], the creep curves corresponding to each level of deviatoric stress were obtained by using "the coordinate translation method" [47,48]. Figures 5–9 show the whole process curve of creep for loess and the creep curves of loess with different numbers of dry-wet cycles. In Figures 5–9, *q* represents the deviatoric stress, *ε* is the strain, *w* is the moisture content, and *t* is time.

It can be observed from Figures 5–9 that the characteristics of the whole process curve of creep for loess and the creep curves are as follows: (i) The specimens with different numbers of dry-wet cycles underwent three typical creep stages, namely the decelerating creep stage, the steady state stage, and the accelerating creep stage. With the same number of dry-wet cycles, the creep curve tends to flatten with increasing time when a small load is applied, that is, the steady state stage; however, with a higher loading, the creep curve of the stable phase increases linearly with time. (ii) The slope of the curve corresponding to the steady state stage increases with an increase of loading. However, sudden failures could be observed within the loess specimen once the load exceeded a certain value. For example, when the deviatoric stress of the sample with 20 dry-wet cycles increased to 250 kPa, the deformation increased dramatically. (iii) With the increase of the number of dry-wet cycles, the maximum value of the deviatoric stress corresponding to creep failure gradually decreases, indicating that the creep strength of the sample is deteriorated by the dry-wet cycles.
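The coordinate translation method is cited above [47,48] but not spelled out. As one hedged reading of it, the sketch below splits a multi-level strain-time record into per-level creep curves by shifting each loading stage to a common origin; the function name and array layout are assumptions for illustration only, and the actual processing in the study may differ.

```python
import numpy as np

def translate_stages(t, eps, stage_starts):
    """Shift each loading stage of a multi-level creep record to a common
    (t = 0, eps = 0) origin, yielding one creep curve per stress level.

    t, eps       : 1-D arrays covering the whole staged test
    stage_starts : indices at which each deviatoric-stress level begins
    """
    curves = []
    bounds = list(stage_starts) + [len(t)]
    for i0, i1 in zip(bounds[:-1], bounds[1:]):
        curves.append((t[i0:i1] - t[i0], eps[i0:i1] - eps[i0]))
    return curves
```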

**Figure 5.** (**a**) The whole process curve of creep for loess without dry-wet cycles, (**b**) creep curves of the loess without dry-wet cycles.

**Figure 6.** (**a**) The whole process curve of creep for loess with 5 dry-wet cycles, (**b**) creep curves of loess with 5 dry-wet cycles.

**Figure 7.** (**a**) The whole process curve of creep for loess with 10 dry-wet cycles, (**b**) creep curves of loess with 10 dry-wet cycles.


**Figure 8.** (**a**) The whole process curve of creep for loess with 15 dry-wet cycles, (**b**) creep curves of loess with 15 dry-wet cycles.

**Figure 9.** (**a**) The whole process curve of creep for loess with 20 dry-wet cycles, (**b**) creep curves of loess with 20 dry-wet cycles.

## **4. Discussion**

### *4.1. The Long-Term Strength of Loess Samples with Different Dry-Wet Cycles*

The long-term strength of rock and soil is an important parameter for assessing the stability of landslides [49,50]. At present, the main method to obtain the long-term strength from creep test curves is the isochronous curve method [3,51]. This method uses the creep test curve and the Boltzmann superposition principle [52], which states that the stress response of a system to a time-dependent shearing deformation may be written as the sum of responses to a sequence of step strain perturbations in the past [53], to obtain the stress-strain isochronous curves corresponding to the different deviatoric stresses at the same time [54]. Figure 10a–e show the isochronous curves corresponding to the creep curves of the samples after five kinds of dry-wet cycles (*n* = 0, 5, 10, 15, 20). Based on the isochronous curves, the long-term strength (*q*<sub>L</sub>) was obtained.

The experimental data listed in Table 3 are plotted in Figure 11. One can observe a clear correlation between the long-term strength and the number of dry-wet cycles. The overall relationship between the long-term strength of loess and the number of dry-wet cycles can be described by the following equation:

$$q_L = 123.50e^{-n/9.98} + 127.33, \; n \le 20, \tag{1}$$

where *q*<sub>L</sub> is the long-term strength measured in kPa and *n* is the number of dry-wet cycles.
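For readers who want to reproduce the isochronous construction behind Figure 10, the following sketch samples each stress level's creep curve at fixed times and collects the resulting (strain, stress) pairs. The data structures are hypothetical, and the actual processing in the study may differ.

```python
import numpy as np

def isochronous_curves(creep_curves, times):
    """Sample each stress level's creep curve at fixed instants to build
    stress-strain isochronous curves (cf. Figure 10).

    creep_curves: dict mapping deviatoric stress q (kPa) -> (t, eps) arrays
    times: sampling instants, e.g. (10, 100, 1000) minutes
    Returns {time: [(strain, q), ...]} ordered by stress level.
    """
    out = {}
    for t_star in times:
        pts = []
        for q in sorted(creep_curves):
            t, eps = creep_curves[q]
            if t_star <= t[-1]:  # only levels observed long enough
                pts.append((float(np.interp(t_star, t, eps)), q))
        out[t_star] = pts
    return out
```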


**Figure 10.** Creep isochronous curve after different dry-wet cycles. (**a**) *n* = 0, (**b**) *n* = 5, (**c**) *n* = 10, (**d**) *n* = 15, (**e**) *n* = 20.



**Figure 11.** The fitting curve of the long-term strength after different dry-wet cycles.

It is clear that the long-term strength of loess decreases with an increasing number of dry-wet cycles. However, the magnitude of the reduction in long-term strength gradually decreases as the number of dry-wet cycles increases, until the strength approaches a constant value. That is to say, the long-term strength of the loess and the number of dry-wet cycles (*n* ≤ 20) show an exponentially decreasing relationship (see Equation (1)).
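As a quick check of Equation (1), the snippet below evaluates the fitted decay law at the tested cycle counts; the printed values are model predictions from the stated coefficients, not measured data.

```python
import math

def q_long_term(n):
    """Equation (1): long-term strength q_L (kPa) after n dry-wet cycles,
    valid for the fitted range n <= 20."""
    return 123.50 * math.exp(-n / 9.98) + 127.33

for n in (0, 5, 10, 15, 20):
    print(n, round(q_long_term(n), 1))
# Decays from ~250.8 kPa at n = 0 toward the 127.33 kPa plateau.
```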

### *4.2. Creep Damage Mechanism of Loess Samples Due to the Dry-Wet Cycles*

The creep damage mechanism of the loess samples due to the dry-wet cycles was explained from the perspective of macroscopic pictures of the loess samples after creep failure and the change of long-term strength with dry-wet cycles. Figure 12 shows loess samples after experiencing creep failure. One can observe that no obvious failure surface appeared in the specimens, whereas bending and bulging occurred on the sides of the specimens. Furthermore, as the number of dry-wet cycles increases, the bending and bulging of the specimen become more noticeable. This phenomenon can be explained as follows: under the effect of the dry-wet cycles, the microstructure of the loess is destroyed due to the repeated migration and loss of soluble salts between soil particles in the loess sample [8,10]. Moreover, the dry-wet cycles also promote the development and expansion of joint fractures in the loess sample [7], resulting in the worsening of the creep mechanical properties (see Table 3 and Figure 11). As the number of dry-wet cycles further increased, the soluble salts in the loess sample gradually dissolved in water, which results in the coarse particles being further dispersed and disintegrated and the total pore volume of the sample being increased [25]. Therefore, the creep curves of the loess specimen show a strain-softening pattern, and bending or lateral swelling occurred in the specimen (see Figure 12, *n* = 5 and *n* = 10). When the number of dry-wet cycles reaches a certain value, structural stability (micro- and macro-structure stability) within the loess soils is achieved. Consequently, the long-term strength of the loess samples tends to be stable (see Figure 10, *n* = 15 and *n* = 20), causing the creep mechanical properties to be consistent with each other.

**Figure 12.** Specimen after creep failure.

### *4.3. Limitations of the Experimental Test in This Study*

Concerning the influence of dry-wet cycles, the experimental results herein can be used for the determination of numerical simulation parameters and for better evaluating the long-term stability of loess slopes. Therefore, the study would be beneficial to understanding the time-dependence of the mechanical behaviors of loess soils as well as the long-term stability of loess slopes, providing a basis for understanding the mechanical behavior of loess under the effect of dry-wet cycles, as well as guidelines for the prevention and prediction of loess landslide stability. However, there are some potential limitations in this study that could be addressed in future research. First, the loess samples were sheared under a saturated condition with a maximum of 20 dry-wet cycles, whereas triaxial creep tests were not conducted on unsaturated loess samples with a wide range of dry-wet cycles. Therefore, the conclusions and theoretical formulas obtained in this study have certain restrictions. Secondly, the water used in this study is distilled water, which is unrealistic because the water in the field contains many mixtures due to solute transport and other chemical effects [10,55–57]. Finally, the influence of dry-wet cycles on loess in the field is very complicated due to the influence of rainfall, irrigation water, evaporation, and other factors [8,9,11,12], and could not be simply simulated by laboratory dry-wet cycles. On the other hand, it is difficult to estimate the actual number of dry-wet cycles that the loess experienced in the field. Thus, there are some difficulties in applying the experimental results of this study to engineering practice. All the limitations above will be considered in future research.

## **5. Conclusions**

To assist in the understanding of the time-dependent behaviors and the long-term stability of loess slopes after dry-wet cycles, the mechanical properties of intact loess samples were investigated through triaxial creep tests. The following conclusions can be drawn according to the obtained experimental results:

(i) With the same number of dry-wet cycles, the strain-time curves of the loess samples show a similar trend, where the strain eventually reaches a certain value with increasing time when a small load is applied, whereas the creep curve of the stable phase increases linearly with time when the loess specimen is subjected to a higher loading. As the number of dry-wet cycles increases, the maximum value of the deviatoric stress corresponding to creep failure gradually decreases, indicating that the deterioration of triaxial compressive strength is attributed to the dry-wet cycles.


**Author Contributions:** Conceptualization, X.W. and T.G.; methodology, K.L.; software, X.W. and K.L.; writing—original draft preparation, K.L.; writing—review and editing, X.W. and T.G.; visualization, X.W.; supervision, J.W.; funding acquisition, X.W. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (No. 41902268) and the Opening Fund of State Key Laboratory of Geohazard Prevention and Geoenvironment Protection (Chengdu University of Technology) (No. SKLGP2020K016).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** We thank the editors and the two anonymous reviewers for their constructive comments, which have helped us improve the quality of the manuscript.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


**Debris Flow Characteristics in Flume Experiments Considering Berm Installation**

**Hyungjoon Chang, Kukhyun Ryou \* and Hojin Lee**

School of Civil Engineering, Chungbuk National University, Cheongju 28644, Korea; param79@cbnu.ac.kr (H.C.); hojinlee@cbnu.ac.kr (H.L.)

**\*** Correspondence: rgh0126@naver.com; Tel.: +82-10-4240-9941

**Abstract:** This study was conducted to identify the characteristics and mobility of debris flows and analyze the performance of a berm as a debris flow mitigation measure. The debris flow velocity, flow depth, Froude number, flow resistance coefficients, and mobility ratio were accordingly determined using the results of flume tests. To analyze the influence of the berm, the results for a straight channel test without a berm were compared with those for a single-berm channel test. The debris flow velocity was observed to increase with increasing channel slope and decreasing volumetric concentration of sediment, whereas the mobility ratio was observed to increase with increasing channel slope and volumetric concentration of sediment. In addition, it was confirmed that the installation of a berm significantly decreased the debris flow velocity and mobility ratio. This indicates that a berm is an effective method for reducing damage to areas downstream of a debris flow by decreasing its potential mobility. By identifying the effects of berms on debris flow characteristics according to the channel slope and volumetric concentration of sediment, this study supports the development of berms to serve as debris flow damage mitigation measures.

**Keywords:** debris flow; berm; mitigation measure; volumetric concentration

## **1. Introduction**

The frequency of torrential rainfall has increased worldwide due to climate change caused by global warming, in turn increasing the occurrence of forest soil sediment disasters, such as landslides and debris flows [1]. Indeed, in South Korea, the occurrence of forest soil sediment disasters has increased due to the increasing frequency of torrential rainfall. The average annual area affected by the occurrence of forest soil sediment disasters in Korea was 290 ha for the period from 1980 to 1999, but significantly increased to 469 ha for the period from 2000 to 2019. In particular, a large-scale debris flow occurred in July 2011 at Mt. Woomyeon, located in an urban area, causing serious human casualties and property damage. This incident increased public interest in debris flow safety and expanded debris flow policy-making from mountainous areas to include urban areas as well [2]. It is known that debris flows can be initiated by various factors such as rainfall, snowmelt, typhoons, volcanoes, and earthquakes [3], but rainfall has been identified as the main cause of the debris flows that have occurred in Korea [4,5].

Debris flow, a type of mass movement, is a mixture composed of water and sediments of various particle sizes from clay to boulders. It is a dynamic phenomenon that moves downhill under the influence of gravity and may cause human casualties and property damage along the way [3,6–9]. Field monitoring of debris flows is difficult because they occur irregularly and exist for only a short duration of time [7,10,11]. Furthermore, it is difficult to accurately predict and respond to debris flows because they increase in scale by absorbing the sediment and water along their movement paths through strong erosive force [7,11–13]. Notably, the debris flow is distinguished from other mass movements because it is a sediment–water mixture, and thus can transfer momentum under the influence of grain friction, grain collision, and viscous fluid dynamics [14]. In addition, momentum transfer caused by collisions between grains in a debris flow leads to excess pore water pressure, which reduces the shear resistance of the debris flow and results in very high mobility [14]. Thus, a debris flow can produce a peak discharge dozens of times higher than that of a flood in the same watershed [15,16], and its flow characteristics may vary depending on the relative contents of water and sediment as well as the size and type of particles [3]. Therefore, it is necessary to identify the flow and deposition characteristics of debris flows and establish mitigation measures to predict and reduce the damage they cause.

Debris flow mitigation measures can be mainly divided into structural and non-structural measures. Structural mitigation measures directly reduce debris flow damage using structures installed in the path of a debris flow; erosion control dams, flexible debris flow barriers, and berms can all be considered appropriate structures for this purpose. Non-structural mitigation measures indirectly limit debris flow damage using methods such as debris flow forecasting and warning systems or land use regulations. Figure 1 shows examples of real-world slopes with multiple berms. This study evaluated the use of a berm, a structural debris flow mitigation measure consisting of a small step installed on a slope that effectively disperses rainwater, reduces the momentum of debris flows, and reduces riverbed erosion. Berms can be easily constructed at a low cost compared to other debris flow mitigation measures. Therefore, several studies applying berms as a debris flow mitigation measure have recently been conducted. VanDine [17] introduced a method of applying berms to lateral, deflection, and terminal walls serving as open debris flow control structures, and suggested major considerations for structural design. Prochaska et al. [18] pointed out that existing analysis methods for debris basins and deflection berms were not sufficient for the prediction of debris flow volume or the subsequent design of the berm geometry, impact load, and outlet, and accordingly suggested solutions and guidelines. Kim and Lee [19] performed numerical simulations by applying the finite difference method based on the mass conservation and momentum conservation equations to evaluate the behavior and mechanism of debris flow in response to the installation of berms on a slope. Sharma et al. [20] calculated the axial force, horizontal displacement, and safety factor using the finite element method to identify the stability of a slope according to the soil cohesion, the angle of internal friction, and the width and height of the berm installed upon it.

**Figure 1.** Examples of using berms in real-world slopes: (**a**) slopes with berms in coal mine areas (Dr Ajay Kumar Singh/Shutterstock.com); (**b**) slopes with berms in Peru (lialina/Shutterstock.com).

To effectively design debris flow mitigation measures, it is necessary to first calculate the main debris flow parameters, such as the potential debris volume, impact force, flow velocity, peak discharge, mobility ratio, runout distance, and total travel distance [16,21]. In particular, the flow velocity must be considered in the design of any debris flow mitigation measure because it directly influences the impact force and deposition characteristics, and is affected by the channel topography [3,15,22,23]. In the countries where studies on debris flow characteristics have been actively conducted, the debris flow process and impact force can be estimated relatively accurately [10,21,24]. Domestic studies on debris flows in South Korea, however, have been limited to trend analyses for simple experimental conditions. Numerical analysis, field observations, and flume experiments have all been previously used for research into the behavior and mechanism of debris flows. Numerical analyses provide relatively high accuracy, but it is difficult to obtain the necessary parameters through field observations, and as such, the observational data required to verify the results of such analyses are insufficient [2]. Thus, while field observations and numerical analyses are suitable for developing and testing methods to predict the behavior of debris flows, flume experiments are more suitable for conducting research on the behavior of debris flows under controlled conditions to develop related prediction equations [8]. Indeed, flume experiments are the only available method for identifying the flow and deposition characteristics of debris flows through repeatable experiments with reproducible results [25]. For these reasons, flume experiments have been mainly used in the past to identify the behaviors of debris flows.

In this study, flume experiments were therefore performed to analyze debris flow characteristics and mobility ratios according to channel slope, volumetric concentration of sediment, and presence of a berm. The flow velocity, flow depth, Froude number, and mobility ratio of the debris flows were then calculated based on the experimental observations. In addition, flow resistance coefficients were calculated by substituting the observed flow velocity and flow depth into the debris flow velocity estimation equations.

## **2. Methods for Assessing Debris Flow Characteristics**

## *2.1. Flow Velocity*

The debris flow velocity must be considered in any debris flow risk assessment and subsequent design of mitigation measures, because it significantly affects the impact force. It can be directly measured through field monitoring or flume experiments, or calculated using flow velocity estimation equations. The basic formula suggested in previous studies for estimating the flow velocity of a debris flow is [16,26–28]:

$$v = Ch^a \alpha^b \tag{1}$$

where *v* is the flow velocity, *C* is the flow resistance coefficient, *h* is the flow depth, *α* is the channel slope, and *a* and *b* are exponential factors defined according to the flow characteristics.

Table 1 defines Equations (2) through (6), which have been proposed by Hungr et al. [15], Rickenmann [16,29], Takahashi [22], Koch [26], and Lo [27]. These equations have been proposed to calculate the flow velocity for different flow types through field observations, flume experiments, and numerical analyses based on Equation (1). In Table 1, *ρ* is the mixed density of the debris flow (kg·m<sup>−3</sup>), *g* is the gravitational acceleration (9.81 m·s<sup>−2</sup>), *k* is the cross-sectional shape factor (3 for wide rectangular channels, 5 for trapezoidal channels, and 8 for semicircular channels), *µ* is the apparent dynamic viscosity of the debris flow (Pa·s), *ξ* is the lumped coefficient depending on the volumetric concentration of sediment (m<sup>−1/2</sup>·s<sup>−1</sup>), *n* is the Manning coefficient (m<sup>−1/3</sup>·s), *C*<sub>1</sub> is the Chezy coefficient (m<sup>1/2</sup>·s<sup>−1</sup>), and *C*<sub>2</sub> is the empirical coefficient proposed by Koch [26] (m<sup>0.78</sup>·s<sup>−1</sup>). In this study, the flow resistance coefficients *µ*, *ξ*, *n*, *C*<sub>1</sub>, and *C*<sub>2</sub> were calculated by substituting the experimentally observed debris flow velocity and flow depth into Equations (2) through (6).

The various coefficients in Equations (2) through (6) are empirical constants known to provide results that are reasonably consistent with field observations [27]. Hungr et al. [15] suggested that Equation (2) is suitable for estimating debris flow velocities because a laminar flow regime is formed near the peak of a debris flow surge. The variable *k* in Equation (2) takes a different value depending on the cross-sectional shape of the channel [16]. Rickenmann [16] and Eu et al. [2] used a value of 3 for *k*, which is valid for a wide rectangular channel, because the debris flow depth is generally smaller than the width. Equation (3) is based on Bagnold's theory [30] of dilatant grain shearing in the inertial regime. Takahashi [22] estimated the debris flow velocity based on Equation (3) and suggested that it is suitable for stony debris flows in Japan. Equation (4) is known as Manning's formula, and Mizuyama [31] proposed using it for estimating the debris flow velocity. Equation (5) is known as Chezy's formula, and was initially introduced to estimate the velocity of snow avalanches; Rickenmann [29] used it to estimate the debris flow velocity. Equation (6) was proposed by Koch [26] through numerical analyses. Koch [26] confirmed that the Newtonian turbulent, Voellmy, and empirical models, which have smaller exponents (*a* and *b* in Equation (1)), are closer to observed values in the field than other models.
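As an illustration of how a flow resistance coefficient can be back-calculated from observations, the sketch below inverts Manning's formula (Equation (4)) for *n*, taking the standard form *v* = (1/*n*)*h*<sup>2/3</sup>*S*<sup>1/2</sup> with *S* = sin *α*. Since Table 1 itself is not reproduced here, this form and the example numbers are assumptions for illustration, not the study's exact data.

```python
import math

def v_general(C, h, alpha, a, b):
    """Equation (1): v = C * h**a * alpha**b, the generic velocity form."""
    return C * h ** a * alpha ** b

def manning_n(v_obs, h_obs, alpha_deg):
    """Back-calculate the Manning coefficient n from an observed front
    velocity and flow depth, assuming the standard Manning form
    v = (1/n) * h**(2/3) * S**(1/2) with slope S = sin(alpha)."""
    s = math.sin(math.radians(alpha_deg))
    return h_obs ** (2.0 / 3.0) * math.sqrt(s) / v_obs

# Hypothetical observation: a 2.0 m/s surge, 0.05 m deep, 15 degree channel.
print(round(manning_n(2.0, 0.05, 15.0), 4))  # ~0.0345 m^(-1/3)·s
```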

**Table 1.** Equations used to estimate the mean velocity of debris flow according to flow type.


## *2.2. Froude Number*

Debris flows are dominated by the influence of gravity. Therefore, dynamic similarity must be applied using the Froude number, which represents the ratio of inertial to gravitational forces. The Froude number is expressed as

$$F_r = \frac{v}{\sqrt{gh}} \tag{7}$$

where *F*<sub>r</sub> is the Froude number.

## *2.3. Mobility Ratio*

The debris flow mobility ratio is a dimensionless number, generally known as the Heim coefficient, obtained by dividing the total drop between the initial occurrence and final deposition points of the debris flow by the total travel distance, which is defined as the horizontal distance between the two points. This concept was introduced by Heim [32] to analyze rock avalanches, and its applicability was expanded when Iverson [14] applied it to the analysis of debris flows. The debris flow mobility ratio can be used to predict the runout distance and flow velocity [33], and many studies have used this ratio because it accurately reflects the risk and potential mobility of debris flows [8,11,14,33,34].
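A minimal sketch combining the two dimensionless measures of Sections 2.2 and 2.3 follows: Equation (7) and the Heim coefficient *H*/*L*. The example inputs are hypothetical, not values reported by the study.

```python
import math

G = 9.81  # gravitational acceleration (m·s^-2)

def froude(v, h):
    """Equation (7): Fr = v / sqrt(g*h)."""
    return v / math.sqrt(G * h)

def mobility_ratio(H, L):
    """Heim coefficient as defined in Section 2.3: total drop H divided
    by total travel distance L (both in metres)."""
    return H / L

# Hypothetical flow: 2.0 m/s at 0.05 m depth -> supercritical (Fr > 1).
print(round(froude(2.0, 0.05), 2))         # ~2.86
print(round(mobility_ratio(0.6, 2.4), 2))  # 0.25
```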

## **3. Flume Experiments**

## *3.1. Experimental Setup*

Figure 2 shows the experimental setups and main parameters of the flume experiments performed in this study. Here, *H* is the total drop between the initial occurrence and final deposition points of the debris flow, and *L* is the total travel distance. The setup for the flume experiments was composed of a sample box, a channel, and a deposition plane. The sample box was 0.2 m long, 0.15 m wide, and 0.3 m high, and was installed at the top of the channel. The sediment–water mixture was supplied to the channel by manually lifting the gate of the sample box. The channel was made of steel in consideration of the strong erosion that accompanies debris flow, and had a length of 1.3–1.9 m, a width of 0.15 m, and a height of 0.3 m (Table 2). The channel was fabricated in separate upper and lower slope segments so that a 0.6-m long berm could be installed between them. At the outlet of the channel, a 1.5-m long, 1.0-m wide deposition plane was installed to analyze the debris flow deposition characteristics. The deposition plane was composed of 0.05-m grids in the longitudinal and lateral directions.


To observe the flow and deposition characteristics of the debris flow, cameras capable of capturing video at 60 fps were installed at the front and on a side of the flume. The front camera was used to measure the mean velocity of the head of the debris flow from the top of the channel to the outlet, and the side camera was used to measure the maximum depth of the debris flow at a point 0.1 m upstream of the channel outlet. The debris flow mobility ratio was calculated upon the completion of sediment–water mixture deposition by observing the total drop *H* and total travel distance *L*, as indicated in Figure 2.

**Table 2.** Dimensions of the channels for each test type.

| Test Type | Length (m) | Width (m) | Height (m) |
|---|---|---|---|
| Straight channel test | 1.3 | 0.15 | 0.3 |
| Single-berm channel test | 1.9 | 0.15 | 0.3 |

**Figure 2.** Experimental setups for flume experiments: (**a**) front image of the straight channel test; (**b**) front image of the single-berm channel test; (**c**) side image of the straight channel test; (**d**) side image of the single-berm channel test; (**e**) schematic diagram of the straight channel test; and (**f**) schematic diagram of the single-berm channel test.

## *3.2. Experimental Conditions*

Debris flows are a sediment–water mixture; thus, their viscosity is difficult to measure using only a general soil test [1]. Takahashi [9] indirectly estimated the viscosity of debris flows using the volumetric concentration of sediment, and confirmed that the viscosity of debris flows increases as the volumetric concentration of sediment increases. As the viscosity of a debris flow increases, its shear resistance and bed friction increase. This leads to a decrease in the momentum of the debris flow, which changes the debris flow characteristics. Therefore, it is necessary to compose various experimental conditions for the viscosity of debris flows. The volumetric concentration of sediment can be obtained by dividing the volume of sediment by the volume of the sediment–water mixture as follows:

$$C_V = \frac{V_S}{V_S + V_W} = \frac{V_S}{V_{total}} \tag{8}$$

where *C<sup>V</sup>* is the volumetric concentration of sediment, *V<sup>S</sup>* is the volume of sediment, *V<sup>W</sup>* is the volume of water, and *Vtotal* is the total volume of the sediment–water mixture.
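As a quick check of Equation (8), a minimal sketch (variable names are ours):

```python
def volumetric_concentration(v_sediment: float, v_water: float) -> float:
    """Volumetric concentration of sediment, Equation (8): CV = VS / (VS + VW)."""
    return v_sediment / (v_sediment + v_water)

# Equal volumes of sediment and water give CV = 0.50.
print(volumetric_concentration(2250.0, 2250.0))  # 0.5
```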

In this study, the viscosity of the debris flow was also considered using *CV*. Table 3 shows the experimental cases evaluated in this study considering the experiment type (straight channel test or single-berm channel test), α (10–25° in 5° increments), and *CV* (0.40–0.60 in increments of 0.05). Consequently, a total of 40 experimental conditions were evaluated in this study, each of which was repeated five times.


**Table 3.** Experimental conditions used in this study.

### *3.3. Sample Properties*

In this study, a sediment–water mixture was used to reproduce the conditions of a debris flow. Table 4 shows the particle composition of the sediment as determined through a sieve analysis, with reference to previous studies [35,36], and Figure 3 shows the particle-size distribution curve of the sediment. The weight ratios of each particle size range were 25% for 4.75–9.50 mm, 25% for 2.00–4.75 mm, and 50% for 2.00 mm or less. In this study, *CV* was set to be 0.40–0.60, based on previous studies with similar experimental setup scales [1,8,11,12]. Table 5 shows the weight of sediment and water required for each experiment when *CV* was adjusted within the 0.40–0.60 range, where W is the weight of the sediment–water mixture. The same volume of sediment–water mixture (4500 cm<sup>3</sup>) was used in the straight channel test and single-berm channel test. As *CV* increased from 0.40 to 0.60, the density of the sediment–water mixture linearly increased from 1578 to 1867 kg·m<sup>−3</sup> (Table 5).
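The linear density trend in Table 5 is consistent with a two-phase mixture rule, ρ<sub>mix</sub> = *CV*·ρ<sub>s</sub> + (1 − *CV*)·ρ<sub>w</sub>; back-calculating from the reported 1578–1867 kg·m<sup>−3</sup> endpoints implies a sediment grain density of about 2445 kg·m<sup>−3</sup>. The sketch below uses that inferred value, which is our assumption rather than a figure stated in the paper, to recover the sample masses for the 4500 cm<sup>3</sup> mixture.

```python
RHO_WATER = 1000.0     # kg/m^3
RHO_SEDIMENT = 2445.0  # kg/m^3, back-calculated from Table 5 (our assumption)
V_TOTAL = 4500e-6      # m^3, mixture volume used in every test

def mixture_sample(cv: float):
    """Return (sediment mass [kg], water mass [kg], mixture density [kg/m^3])
    for a target volumetric concentration of sediment CV."""
    m_sediment = cv * V_TOTAL * RHO_SEDIMENT
    m_water = (1.0 - cv) * V_TOTAL * RHO_WATER
    rho_mix = cv * RHO_SEDIMENT + (1.0 - cv) * RHO_WATER
    return m_sediment, m_water, rho_mix

for cv in (0.40, 0.45, 0.50, 0.55, 0.60):
    m_s, m_w, rho = mixture_sample(cv)
    print(f"CV={cv:.2f}: sediment {m_s:.2f} kg, water {m_w:.2f} kg, {rho:.0f} kg/m^3")
# The last column reproduces the 1578 -> 1867 kg/m^3 range reported in Table 5.
```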

**Figure 3.** Grain-size distribution curve of the mixed sediment used in the debris flow.


**Table 4.** Weight ratios of the sediment–water mixture by particle size.

| Particle Size (mm) | Weight Ratio (%) |
|---|---|
| 4.75–9.50 | 25 |
| 2.00–4.75 | 25 |
| ≤2.00 | 50 |


**Table 5.** Weight of debris flow samples according to volumetric concentration of sediment *CV*.


### *3.4. Experimental Method*

Figures 4 and 5 show examples and a flow chart of the flume experiments performed in this study, respectively. The experimental videos are separately presented in Videos S1 and S2. The flume experiments were first planned by determining the setup and conditions for each experiment, as defined in Table 3. Then, the effects of α, *CV*, and berm installation were examined for each of the 40 defined test conditions in the following sequence. Sediment and water were added to the sample box to meet the required weight and mixed well. The gate was then rapidly removed from the sample box to supply the sediment–water mixture. Finally, the flow velocity, flow depth, and mobility ratio of the resulting debris flow were observed as described in Section 3.1, and the flow resistance coefficient and Froude number were calculated accordingly.

**Figure 4.** Examples of flume experiments conducted in this study: (**a**) Straight channel test (α was 25° and *CV* was 0.50); and (**b**) single-berm channel test (α was 25° and *CV* was 0.50).

**Figure 5.** Flow chart of flume experiments.

## **4. Results and Analysis**

Table 6 shows the debris flow velocity, flow depth, Froude number, total travel distance, and mobility ratio observed through the flume experiments, where *N* is the experiment number. In the case of the straight channel test, the development of debris flow was insufficient to observe the flow depth when α was 10° and *CV* was 0.60; however, debris flow was sufficiently developed when α was 15° or greater. In the single-berm channel test, debris flow stopped in the channel when *CV* was 0.60, regardless of α, so these values are not reported in the analysis below. In addition, when α was 20° or less and *CV* was between 0.50 and 0.55, the development of debris flow was insufficient to observe the flow depth in some experiments.

**Table 6.** Observed data from flume experiments.




## *4.1. Debris Flow Characteristics*

#### 4.1.1. Flow Velocity

Figure 6 shows the flow velocities observed through the flume experiments and their changes due to the installation of the berm. Figure 6a,b show the debris flow velocity according to α and *CV*, respectively. Figure 6c,d show the percent change in debris flow velocity according to α and *CV*, respectively, after the berm was installed. The experimental results indicated that the debris flow velocity increased as α increased or *CV* decreased (Figure 6a,b). In the case of the straight channel test, the incremental increase in flow velocity with increasing α was similar for *CV* values less than 0.55, but increased noticeably with α at a higher *CV*. The incremental decrease in flow velocity with increasing *CV* was similar, regardless of α. In the case of the single-berm channel test, the incremental increase in flow velocity with increasing α was found to be similar, regardless of *CV*. The incremental decrease in flow velocity with increasing *CV* was similar when α was 15° or greater, but increased noticeably with *CV* when α was less than 15°. In addition, it was confirmed that the installation of the berm on the slope reduced the debris flow velocity by 3.2–34.3% (Figure 6c,d). The average decreases in debris flow velocity were found to be 22.9%, 17.4%, 15.2%, and 9.5% for a *CV* of 0.40–0.55 and α of 10°, 15°, 20°, and 25°, respectively, and 9.0%, 13.0%, 15.9%, and 27.0% for an α of 10°–25° and *CV* of 0.40, 0.45, 0.50, and 0.55, respectively. This confirmed that the installation of the berm was more effective in reducing the debris flow velocity at smaller α values and larger *CV* values.


**Figure 6.** (**a**) Flow velocity according to α (at *CV* = 0.40–0.55); (**b**) flow velocity according to *CV* (at α = 10°–25°); (**c**) percent decrease in flow velocity due to berm according to α (at *CV* = 0.40–0.55); (**d**) percent decrease in flow velocity due to berm according to *CV* (at α = 10°–25°).

#### 4.1.2. Flow Depth

Figure 7 shows the flow depths observed through the flume experiments and their changes due to the installation of the berm. Figure 7a,b show the debris flow depth according to α and *CV*, respectively. Figure 7c,d show the percent change in debris flow depth according to α and *CV*, respectively, after the berm was installed. In the case of the straight channel test, the debris flow depth decreased as α increased or *CV* increased (Figure 7a,b). The decrease in flow depth with increasing α slowed when *CV* was 0.50 or greater, and the decrease in flow depth with increasing *CV* slowed when α was 15° or greater. In the case of the single-berm channel test, the debris flow depth decreased as *CV* increased (Figure 7b). The decrease in flow depth with increasing α changed when *CV* was 0.50, and the flow depth suddenly increased when α was 25° and *CV* was 0.45 or less (Figure 7a). The installation of the berm was found to decrease the debris flow depth by an average of 3.9–71.2%, even though the flow depth increased by 15.6% when α was 25° and *CV* was 0.40 (Figure 7c,d). The average decreases in debris flow depth were 43.7%, 53.8%, 44.4%, and 1.1% for a *CV* of 0.40–0.50 and α values of 10°, 15°, 20°, and 25°, respectively, and 25.4%, 37.6%, and 44.4% for an α of 10°–25° and *CV* values of 0.40, 0.45, and 0.50, respectively. This confirmed that the installation of the berm was more effective in reducing the debris flow depth at larger *CV* values.

**Figure 7.** (**a**) Flow depth according to α (at *CV* = 0.40–0.55); (**b**) flow depth according to *CV* (at α = 10°–25°); (**c**) percent decrease in flow depth due to berm according to α (at *CV* = 0.40–0.50); (**d**) percent decrease in flow depth due to berm according to *CV* (at α = 10°–25°).

#### 4.1.3. Froude Number

Figure 8 shows the Froude numbers calculated through the flume experiment observations and the change in the Froude number due to the installation of the berm. Figure 8a,b show the Froude number according to α and *CV*, respectively. Figure 8c,d show the change in the Froude number according to α and *CV*, respectively, when the berm was installed. In the case of the straight channel test, the Froude number increased as α increased (Figure 8a). With the berm installed, the Froude number decreased by 2.1–9.7% when α was 25° and *CV* was 0.50 or less (Figure 8c,d); however, when α was less than 25°, the Froude number increased by 9.7–58.0% due to the installation of the berm (Figure 8c,d). The average incremental increases in the Froude number were found to be 12.8%, 30.8%, and 19.5% for a *CV* of 0.40–0.50 and α values of 10°, 15°, and 20°, respectively, and 8.4%, 12.3%, and 25.6% for an α of 10°–25° and *CV* values of 0.40, 0.45, and 0.50, respectively. In other words, the average increase in the Froude number due to the installation of the berm increased as *CV* increased.
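The paper's Froude number definition is given earlier with its governing equations and is not restated in this section; a common convention for inclined channels, assumed in the sketch below, is Fr = v/√(g·h·cos α), which reduces to v/√(g·h) on gentle slopes.

```python
import math

G = 9.81  # m/s^2

def froude_number(velocity_m_s: float, depth_m: float, slope_deg: float) -> float:
    """Fr = v / sqrt(g * h * cos(alpha)).

    The cos(alpha) slope correction is a common convention for steep channels
    and is assumed here; it is not quoted from the paper.
    """
    alpha = math.radians(slope_deg)
    return velocity_m_s / math.sqrt(G * depth_m * math.cos(alpha))

# Illustrative values chosen to fall inside the 4.14-10.79 range reported in this study.
print(f"Fr = {froude_number(2.5, 0.015, 25.0):.2f}")  # Fr ~ 6.8
```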


**Figure 8.** (**a**) Froude number according to α (at *CV* = 0.40–0.55); (**b**) Froude number according to *CV* (at α = 10°–25°); (**c**) percent increase in Froude number due to berm according to α (at *CV* = 0.40–0.50); (**d**) percent increase in Froude number due to berm according to *CV* (at α = 10°–25°).

#### 4.1.4. Flow Resistance Coefficients

Table 7 shows the ranges, mean values, and standard deviations of the flow resistance coefficients *μ*, *ξ*, *n*, *C1*, and *C2* calculated using Equations (2)–(6) based on the experimental results in this study. Figure 9 shows the experimentally obtained flow resistance coefficient values according to α and *CV*. The experimental results showed that α had little influence on any flow resistance coefficient (Figure 9a,c,e,g,i), but *CV* clearly affected all flow resistance coefficients except *C2* (Figure 9b,d,f,h,j): as *CV* increased, the values of *μ* and *n* decreased, whereas the values of *ξ* and *C1* increased. When the berm was installed, the value of *ξ* decreased by 4.3% when α was 25° and *CV* was 0.40. Except for this one observation, each flow resistance coefficient exhibited a consistent change pattern due to the installation of the berm. The installation of the berm decreased *μ* by 8.6–93.8%, increased *ξ* by up to 563%, decreased *n* by 7.0–57.2%, increased *C1* by 10.6–91.4%, and increased *C2* by 2.9–26.4%.
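Equations (2)–(6), which define *μ*, *ξ*, *n*, *C1*, and *C2*, appear earlier in the paper and are not reproduced in this section. As one standard member of this family, Manning's relation for a wide channel can be inverted to n = h^(2/3)·√S/v; the sketch below implements only that textbook form and should not be read as the paper's exact equation set.

```python
import math

def manning_n(velocity_m_s: float, depth_m: float, slope_deg: float) -> float:
    """Manning roughness for a wide channel: n = h^(2/3) * sqrt(S) / v.

    Approximates the hydraulic radius by the flow depth h and the energy
    slope S by tan(alpha). This is the textbook open-channel form, not
    necessarily the paper's Equation for n.
    """
    s = math.tan(math.radians(slope_deg))
    return depth_m ** (2.0 / 3.0) * math.sqrt(s) / velocity_m_s

print(f"n = {manning_n(2.5, 0.015, 25.0):.4f}")  # units: s * m^(-1/3)
```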

**Table 7.** Flow resistance coefficients calculated in this study.



**Figure 9.** (**a**) *μ* according to α (at *CV* = 0.40–0.55); (**b**) *μ* according to *CV* (at α = 10°–25°); (**c**) *ξ* according to α (at *CV* = 0.40–0.55); (**d**) *ξ* according to *CV* (at α = 10°–25°); (**e**) *n* according to α (at *CV* = 0.40–0.55); (**f**) *n* according to *CV* (at α = 10°–25°); (**g**) *C1* according to α (at *CV* = 0.40–0.55); (**h**) *C1* according to *CV* (at α = 10°–25°); (**i**) *C2* according to α (at *CV* = 0.40–0.55); (**j**) *C2* according to *CV* (at α = 10°–25°).

#### 4.1.5. Mobility Ratio


Figure 10 shows the debris flow mobility ratio observed through the flume experiments and the percent decrease in the mobility ratio due to the installation of the berm. Figure 10a,b show the debris flow mobility ratio according to α and *CV*, respectively. Figure 10c,d show the percent decrease in the debris flow mobility ratio according to α and *CV*, respectively, when the berm was installed. The experimental results confirmed that the debris flow mobility ratio increased as α or *CV* increased (Figure 10a,b). In the case of the straight channel test, the incremental increases in the mobility ratio due to the increase in α were similar when *CV* was less than 0.55, but larger when it was 0.55 or greater. The incremental increases in the mobility ratio due to the increase in *CV* were similar when α was 15° or greater, but considerably smaller when it was less than 15°. In the single-berm channel test, the incremental increases in the mobility ratio due to the increase in α were similar when *CV* was less than 0.55, but larger when it was 0.55 or higher. The incremental increases in the mobility ratio due to the increase in *CV* were similar when α was 20° or greater, but relatively smaller when it was less than 20°. In addition, the debris flow mobility ratio was observed to decrease by 4.6–15.7% due to the installation of the berm on the slope (Figure 10c,d). The average reductions in the debris flow mobility ratio were found to be 9.8%, 9.4%, 9.1%, and 10.2% for a *CV* of 0.40–0.55 and α values of 10°, 15°, 20°, and 25°, respectively, and 13.6%, 10.0%, 8.2%, and 6.7% for an α of 10°–25° and *CV* values of 0.40, 0.45, 0.50, and 0.55, respectively. In other words, the installation of the berm was more effective in reducing the debris flow mobility ratio at lower *CV* values.
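The mobility ratio itself is just *H*/*L* as defined in Section 3.1, and the berm effect quoted above is a percent change between paired tests; a minimal sketch with illustrative numbers (not taken from Table 6):

```python
def mobility_ratio(total_drop_m: float, travel_distance_m: float) -> float:
    """Debris flow mobility ratio H/L (Section 3.1)."""
    return total_drop_m / travel_distance_m

def percent_decrease(no_berm: float, with_berm: float) -> float:
    """Percent decrease of a flow property after berm installation."""
    return 100.0 * (no_berm - with_berm) / no_berm

# Illustrative H and L pairs only.
hl_no_berm = mobility_ratio(0.50, 2.00)   # H/L = 0.250
hl_berm = mobility_ratio(0.45, 2.00)      # H/L = 0.225
print(f"{percent_decrease(hl_no_berm, hl_berm):.1f}% reduction")  # 10.0%
```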


**Figure 10.** (**a**) Mobility ratio according to α (at *CV* = 0.40–0.55); (**b**) mobility ratio according to *CV* (at α = 10°–25°); (**c**) percent decrease in mobility ratio due to berm according to α (at *CV* = 0.40–0.55); (**d**) percent decrease in mobility ratio due to berm according to *CV* (at α = 10°–25°).

## **5. Discussion**

Debris flows are affected by the characteristics of the sediment–water mixture (magnitude, *CV*, and grain-size distribution) and channel shape (α, width, and curvature) [3,6,9]. In this study, various α and *CV* values were therefore evaluated in flume tests, along with changes in channel shape due to the installation of a berm in the middle of the channel. The experimental results showed that the development of the flow changed markedly at an α of 15° (Table 6), which has been identified in previous studies as the slope that triggers debris flow [3,9]. Takahashi [9] mentioned that the debris flow pattern suddenly changes when its *CV* is less than 0.55 because of active particle separation, and that a debris flow cannot reach the outlet of its channel when the *CV* exceeds 0.58. In this study, the development of the debris flow was also observed to differ around a *CV* of 0.55; at a *CV* of 0.60 in the single-berm channel test, the debris flow stopped in the channel before reaching the outlet (Table 6). In the straight channel test, however, it appears that α had a larger influence on the flow. Thus, debris flow occurred when α was 15° or higher, even when *CV* exceeded 0.58.

As α increased, the debris flow velocity and mobility ratio were both observed to increase (Figures 6a and 10a). The changes in the flow velocity and mobility ratio differed around an α of 15°. The flow depth consistently decreased as α increased when no berm was installed, but it suddenly increased at an α of 25° when the berm was installed (Figure 7a). This appears to be because the debris flow moved along the steeply sloped channel with considerable momentum, and then the flow suddenly changed under the influence of the channel cross-section geometry where the berm was installed. The installation of the berm caused the Froude number to increase when α was less than 25°, but to decrease when it was 25° (Figure 8c). In other words, it was confirmed that the 15° slope known to cause debris flow indeed affects the debris flow velocity and mobility ratio, and that an α of 25° affects the flow depth and Froude number in the single-berm channel.


As *CV* increased, the debris flow velocity and flow depth decreased (Figures 6b and 7b), but the mobility ratio increased (Figure 10b). A *CV* of 0.50 marked important changes in the development of the debris flow depth and Froude number, and a *CV* of 0.55 marked important changes in the development of the debris flow velocity and mobility ratio. At a *CV* of 0.60, the debris flow did not reach the outlet at any α when the berm was installed, but it reached the outlet if α was 15° or higher when no berm was installed. Furthermore, it was confirmed that the *CV* dominated the flow resistance coefficients *μ*, *ξ*, *n*, and *C1*, while *C2* was not affected by the *CV*, likely because it was defined using a numerical analysis, unlike the other flow resistance coefficients, which were defined using the experimental results. Similarly, Rickenmann [16] mentioned that *C2* is appropriate for the numerical analysis of unsteady debris flows.

The Froude numbers obtained in this study ranged from 4.14 to 10.79. This range is similar to those reported in studies conducted using an experimental setup with a channel length of 2 m or less [2,35]. The Froude numbers of actual debris flows have been determined to range from 0.36 to 7.56 in previous studies [3,15,23,37–39], and the Froude numbers of debris flows produced using flume experiments have been reported to range from 0.6 to 12.44 [2,12,35,40–43]. Since debris flow is affected by various conditions, the Froude number varies depending on the characteristics of the target debris flow. In general, the Froude numbers of actual debris flows are higher than those obtained in flume experiments. In the case of an actual debris flow, it appears that the Froude number decreases because the flow depth increases with the large amount of sediment absorbed into the flow by the riverbed erosion during the movement process [7,13].

In this study, the use of a berm was considered as a debris flow mitigation measure. The installation of the berm in the channel was observed to reduce the debris flow velocity, depth, and mobility ratio by up to 34.3%, 71.2%, and 15.7%, respectively (Figure 6c,d; Figure 7c,d; and Figure 10c,d, respectively). This indicates that the berm effectively decreased the kinetic energy and mobility of the debris flow moving downstream under the influence of gravity. However, it is important to note that various experimental berm conditions were not considered in this study due to laboratory test limitations. The effects of berms on debris flow characteristics could be more effectively identified if additional experimental berm conditions, such as the length, location, and back slope, were considered.

## **6. Conclusions**

In South Korea, where mountainous areas account for more than 63% of the land, debris flows have occurred with increasing frequency each year due to rapidly intensifying torrential rainfall. However, related studies on debris flow and the preparation of mitigation measures remain insufficient. Therefore, this study was conducted using a straight channel test (without a berm) and a single-berm channel test to determine the effects of channel slope, volumetric concentration of sediment, and berm installation on the resulting flow velocity, flow depth, Froude number, flow resistance coefficients, and mobility ratio.

The experimental results showed that the debris flow velocity and mobility ratio increased, but the debris flow depth decreased as the channel slope increased. In addition, as the volumetric concentration of sediment increased, the debris flow velocity and depth both decreased, whereas the mobility ratio increased. When the berm was installed on the channel slope, the debris flow velocity, depth, and mobility ratio all significantly decreased, indicating that the installation of a berm on a slope can effectively decrease the spread of debris flow in downstream areas. In this study, the Froude number exhibited a range similar to those determined in previous studies at similar experimental scales.

The results of this study provide a useful understanding of the effects of channel slope and volumetric concentration of sediment on debris flow characteristics. They also provide details describing the effects of berm installation, which are required to design adequate debris flow damage reduction measures. In future studies, the down-channel depositions will be further analyzed to derive the correlation between the flow characteristics and deposition characteristics.

**Author Contributions:** Conceptualization and methodology, H.L.; formal analysis, H.C.; investigation, K.R.; writing-original draft preparation, K.R. and H.C. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2017R1D1A3B03035477 & NRF-2019R1A6A3A01096145).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are not available without the author's agreement. To use the data, a request should be made to the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

## **References**


## *Article* **Modeling the Impact of Extreme Droughts on Agriculture under Current and Future Climate Conditions Using a Spatialized Climatic Index**

**Dorothée Kapsambelis 1,2,\*, David Moncoulon <sup>1</sup> , Martine Veysseire <sup>3</sup> , Jean-Michel Soubeyroux <sup>4</sup> and Jean Cordier <sup>2</sup>**


**Abstract:** Extreme droughts have a strong impact on agricultural production. In France, the 2003 drought generated records of yield losses at a national scale for grassland (more than 30%) and for cereals (more than 10% for soft winter wheat and winter barley). These extreme events raise the question of farm resilience in the future. Studying them makes it possible to adapt risk management policy to climate change. Therefore, the objective of this paper was to analyze the frequency and the intensity of extreme drought in 2050 and their impact on crop yield losses (grassland and cereals) in France. We used the DOWKI (Drought and Overwhelmed Water Key Indicator) meteorological index based on a cumulative water anomaly, which can explain droughts and their consequences on agricultural yield losses at a departmental scale. Then, using the ARPEGE-Climat Model developed by Meteo-France, DOWKI was projected in 2050 and grassland, soft winter wheat, and winter barley yield losses were simulated. The results compare the frequency and intensity of extreme droughts between the climate in 2000 and 2050. Our results show that the frequency of extreme droughts (at least as intense as in 2003) doubled in 2050. In addition, the yield losses due to 10-year droughts increased by 35% for grassland and by more than 70% for cereals.

**Keywords:** extreme droughts; climate change; modeling; crop yield losses; crop insurance

## **1. Introduction**

## *1.1. Consequences of Climate Change on Agriculture*

Agriculture in France represents an important economic activity (leading producer in the European Union). In 2014, of the EUR 373 billion of gross agricultural products (GAP) produced in the European Union, France produced EUR 67 billion, representing 18% of the GAP [1,2]. France is the main producer of wheat and cattle in the European Union, and these two activities cover a large part of its utilized agricultural land [3]. In 2003, a severe drought caused a massive decrease in agricultural production and income (30% of the production was lost [4]), despite the rise in prices that some crops experienced [5]. Grassland yields were also greatly reduced and public support (via the Calamity Fund System) was necessary to allow farmers to get through the year, especially in the milk production community. These elements indicate that despite technological progress, crop production remains highly dependent on water resources and climatic conditions. In this context, increasing our scientific knowledge of the intensity and frequency of these extreme droughts is necessary to evaluate their impact on agricultural production.


There is a global consensus in the scientific community that the climate will be very different in the middle of the 21st century [6–9]. The main cause of climate change is the increase in atmospheric concentrations of several greenhouse gases as a result of human activity [10,11]. Several models are used in the community to study different scenarios of climate change and its consequences on agriculture. The following studies highlight several important points:


Finally, several studies demonstrate that heat waves like the 2003 one, which particularly affected crop yields, will be more frequent in the future [2,6,26–28].

Extreme drought events are difficult to study because they are, by definition, rare events that occur very infrequently, so an archive of historical data may contain just a few extreme events [29]. The IPCC defines the concept of "extreme" as "the occurrence of a value of a weather or climate variable above (or below) a threshold value near the upper (or lower) ends ('tails') of the range of observed values of the variable" [8]. Thus, the notion of an extreme drought event depends on the value of the climatic index chosen to characterize the climate. Therefore, the link between the climate index and the impact on agriculture (yield losses) has to be explicit. However, annual yield losses are due to a set of phenomena (diseases, climatic events, changes in cropping practices), and it is not easy to assess the weight of one phenomenon independently. The best-known drought indicators are:


These indicators characterize meteorological and hydrological droughts, which are not necessarily similar to agricultural droughts. Although many indices have been developed to analyze the evolution of droughts, the direct relation between these indices and crop yield has not been frequently investigated. Some studies have been conducted in China, in the United States, and in Europe comparing the SPEI, PDSI (Palmer Drought Severity Index), and SPI indices for the detection of agricultural droughts [32,33]. The best correlations are obtained with SPEI. In addition, in Canada, a study was carried out to analyze the correlation between grassland losses linked to droughts and certain agro-climatic indices like PDSI and SPI. The results indicate that the coefficients of determination remained very low with all indices [34]. Other drought indices have been developed for specific territories, such as ARID, used to study the link between water stress and plant growth in the United States [35]. Thus, some studies show that an index developed with the parameter of precipitation alone, like SPI, is not sufficient to explain the variability in crop production due to drought, particularly for extreme events like the one in 2003 [36,37], because this drought was characterized by an increase in evapotranspiration rates [38]. At the French country scale, one indicator used to analyze the effect of climate change on agriculture is the Standardized Soil Water Index (SSWI) [39]. This indicator represents the useful water reserve of the soil, or water availability for plants. Water deficit and temperature are parameters commonly used to study the climate effect on agricultural crops [40,41].

## *1.2. Objectives of This Study*

In this paper, we propose an evaluation of the frequency and intensity of extreme droughts under current climate and future climate conditions (year 2050). Our methodology is based on a simple drought index [42] correlated to crop yield losses that can be projected into the future using a global climate model—for instance, the ARPEGE-Climat Model from Meteo-France.

Based on this method, the objectives are to:


The model is applied to three crop categories: grasslands, soft winter wheat, and winter barley.

This study aims to provide insight on the following issues:


## **2. Materials and Methods**

*2.1. Modeling Extreme Droughts and Their Consequences on Yield Losses*

### 2.1.1. DOWKI Computation on the SAFRAN Reanalysis

We used the DOWKI, which characterizes extreme events of drought and excess water. DOWKI is (1) simple to compute, (2) purely meteorological, and (3) independent of crop categories. It can be compared to yield losses for several types of crops and on large areas. DOWKI is a cumulative efficient rain anomaly, computed on a 10-day time step, between the current year value and the historical average. It is computed for the growing period of a given crop, and starts at 0 on 1 January. It is expressed in mm. Its equation for drought event characterization is as follows:

$$ERNC_{i,n} = \left[\left(P_{i-1,n} - PET_{i-1,n}\right) - \left(\overline{P_{i-1,P} - PET_{i-1,P}}\right)\right] + \left[\left(P_{i,n} - PET_{i,n}\right) - \left(\overline{P_{i,P} - PET_{i,P}}\right)\right]$$

$$\text{s.t.:} \dots$$

$$DOWKI_{drought,\,c,n} = \min\left(ERNC_{i_0 \to i_j,\,c,n}\right)$$

where *ERNC<sub>i,n</sub>* is the cumulative rain anomaly computed in decade *i* for year *n*, *P* is the precipitation, and *PET* is the potential evapotranspiration. The overlined term *P<sub>i−1,P</sub>* − *PET<sub>i−1,P</sub>* represents the average of the difference between *P* and *PET* computed for all *i*−1 10-day periods in the historical period *P*. DOWKI is an annual value and is computed by taking the minimum of the values of *ERNC* for any 10-day period between *i*<sub>0</sub> (the first 10-day period) and *i<sub>j</sub>* (the final 10-day period).
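A minimal sketch of the index computation, assuming decadal series of *P* and *PET* for the target year plus a historical climatology laid out as a (years × decades) array; the names and data layout are ours.

```python
import numpy as np

def dowki_drought(p_year, pet_year, p_clim, pet_clim):
    """DOWKI drought value for one year at one grid cell.

    p_year, pet_year: decadal (10-day) precipitation and PET for year n,
                      in mm, over the crop vulnerability period.
    p_clim, pet_clim: arrays of shape (n_years, n_decades) covering the
                      historical reference period P.
    """
    # Decadal efficient-rain anomaly against the historical average.
    anomaly = (np.asarray(p_year) - np.asarray(pet_year)) \
        - (np.asarray(p_clim) - np.asarray(pet_clim)).mean(axis=0)
    ernc = np.cumsum(anomaly)  # cumulative anomaly ERNC, starting at decade i0
    return float(ernc.min())   # DOWKI = most negative cumulative anomaly (mm)

# Toy example: 36 decades per year, 30-year synthetic climatology.
rng = np.random.default_rng(0)
p = rng.gamma(2.0, 15.0, (30, 36))    # decadal precipitation, mm
pet = rng.gamma(2.0, 12.0, (30, 36))  # decadal PET, mm
print(dowki_drought(p[0], pet[0], p, pet))
```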

In [42], DOWKI was computed for representative meteorological stations at the departmental scale to match the available yield loss data, and we showed model uncertainties as we simulated all the crop losses. One notable limitation is that the climate was measured at a single point in each department. On the one hand, this measure is not necessarily representative of the climate over the whole territory of the department. On the other hand, the crop parcels were not necessarily located at the climate measuring point (meteorological station point). In this second case precisely, this would mean that we measured the hazard but not the agricultural risk. In this paper, we computed DOWKI on the SAFRAN grid (8 km × 8 km over the French metropolitan area) for two reasons:

1. To reduce the uncertainties of the model;
2. To match the output scale of the ARPEGE-Climat model using a quantile–quantile downscaling method on the SAFRAN daily reanalysis data (1981–2010 for rainfall and 1989–2018 for potential evapotranspiration).

After computing DOWKI on the SAFRAN grid, we crossed the SAFRAN reanalysis grid with the Graphic Plot Register (GPR), as shown in Figure 1. To compute an index value by department, we calculated the average DOWKI value over each cell of the department where the crop was present. This methodology allowed us to measure climate risk specifically on crop production, since we integrated the hazard parameter (the DOWKI value) and crop vulnerability (the DOWKI computation corresponds to the crop vulnerability period and the agricultural parcel locations).
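A sketch of the grid-to-department aggregation described above, assuming a per-cell DOWKI array and a boolean crop-presence mask obtained from crossing the grid with the GPR; this data layout is our assumption.

```python
import numpy as np

def departmental_dowki(dowki_grid: np.ndarray, crop_mask: np.ndarray) -> float:
    """Average DOWKI over the SAFRAN cells of one department where the crop
    is present (crop_mask comes from crossing the grid with the GPR)."""
    if not crop_mask.any():
        raise ValueError("crop absent from this department")
    return float(dowki_grid[crop_mask].mean())

# Toy 3x3 department: crop present in four cells only.
grid = np.array([[-120., -90., -60.], [-150., -30., -10.], [-200., -80., -40.]])
mask = np.array([[True, True, False], [True, False, False], [True, False, False]])
print(departmental_dowki(grid, mask))  # -140.0
```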

**Figure 1.** Crossing the SAFRAN-grid with the Graphic Plot Register for DOWKI computation at the departmental scale.

### 2.1.2. Computing Yield Losses

We used the AGRESTE database (https://agreste.agriculture.gouv.fr, accessed on 5 March 2019), which refers to yield by crop and department in the historical period (1989–2018 for soft winter wheat and winter barley and 2000–2018 for grassland), with one value by year and by crop produced and declared on a given surface. Yield losses for the *n*-th year were computed by comparing the annual yield with a yield reference defined by the Olympic average over 5 years. This methodology is used in agricultural public policies like crop insurance [43]. The crop yield loss computation using the Olympic average is presented here:

$$\text{Yield loss}_{c,n} = \frac{\text{Yield}_{c,n} - \overline{\text{Yield}}_{c,n}}{\overline{\text{Yield}}_{c,n}}, \qquad \overline{\text{Yield}}_{c,n} = \frac{1}{3}\left(\sum_{i=n-5}^{n-1} \text{Yield}_{c,i} - \max_{n-5 \le i \le n-1} \text{Yield}_{c,i} - \min_{n-5 \le i \le n-1} \text{Yield}_{c,i}\right)$$

where *c* is the crop, *n* is the year, and the reference yield is the Olympic average of the five preceding years.
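A minimal sketch of the yield-loss computation with the 5-year Olympic reference (drop the best and worst of the five preceding years, average the remaining three); function names are ours.

```python
def olympic_reference(last_five_yields):
    """Olympic average: mean of the 5 preceding annual yields after removing
    the best and the worst year."""
    y = sorted(last_five_yields)
    return sum(y[1:-1]) / 3.0

def yield_loss(yield_n, last_five_yields):
    """Relative deviation of year-n yield from the Olympic reference.
    Negative values are losses."""
    ref = olympic_reference(last_five_yields)
    return (yield_n - ref) / ref

# Example: 6.0 t/ha against the previous years [7.1, 6.8, 7.4, 5.9, 7.0].
print(f"{yield_loss(6.0, [7.1, 6.8, 7.4, 5.9, 7.0]):.1%}")  # about -13.9%
```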

Figure 2 represents yield losses for soft winter wheat, winter barley, and grassland at the national scale for the historical period 2000–2018. Over this period, yields were affected by several events:

• The most significant soft winter wheat and winter barley yield losses were registered for the 2016 excess water event (27% and 17%, respectively) and the 2003 drought (14% and 16%, respectively).

• The most important grassland yield losses were registered for the 2003 drought (32%) and the 2011 drought (21%).

Cereals seem to be sensitive to two natural extreme hazards—excess water and droughts—and grassland specifically to droughts. Over the historical period, there were two extreme drought events: in 2003 and in 2011. These two years saw large-scale severe droughts, the worst droughts in 30 years. On average, these two events caused 25% crop losses for grassland and 10% crop losses for soft winter wheat. In order to characterize the extreme events in the future, we used these two extreme droughts as a reference.

**Figure 2.** Crop yield losses (%) for grassland, soft winter wheat, and winter barley at the national scale computed over the historical period 2000–2018 with the AGRESTE database.

### 2.1.3. Analyzing the Link between DOWKI and Yield Losses

DOWKI values and yield losses were computed for each department over the historical period 2000–2018. This calibration matrix of 1800 values was used to study the statistical relationships between the index value and the yield losses. The index values were classified using 50 mm steps. For each class, we calculated the number of yield loss values exceeding 0% and the average yield loss value.

The parameters of the model were:

• The period over which the annual value of the index was computed. This period corresponded to the vulnerability period of the crop and was different for each crop;
• The extreme event threshold at the departmental scale;
• The minimum cultivation area to be taken into account, to rule out small areas in which yields are very volatile.


These parameters were optimized using an experimental design, which consisted of computing a high number of calibration processes with different values for each parameter. The size of the experimental design was *n<sup>p</sup>*, with *p* being the number of parameters (here, *p* = 4) and *n* being the number of values for each parameter (here, *n* = 10). In our case study, the number of calibration processes was therefore *n<sup>p</sup>* = 10,000. The experimental design was evaluated by analyzing the following errors:

• The number of false positives (an extreme index value without an observed yield loss);
• The number of false negatives (an observed yield loss without an extreme index value).

This experimental design allowed us to select the best parameters by minimizing both errors. The best parameters are presented in Table 1. A specific experimental design with the same 4 parameters was run for each crop studied. Thus, a selection of the best parameters for grassland, soft winter wheat, and winter barley was made. The following table gives these parameter values.
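The experimental design itself is a plain full-factorial grid. A sketch in R, where the parameter names, value ranges, and `run_calibration()` are illustrative placeholders for the authors' actual calibration procedure:

```r
# 4 parameters x 10 candidate values each = 10^4 = 10,000 calibration runs.
design <- expand.grid(
  period_start = seq(30, 120, length.out = 10),    # start of vulnerability period (day of year)
  period_end   = seq(150, 240, length.out = 10),   # end of vulnerability period (day of year)
  threshold_mm = seq(-600, -150, length.out = 10), # extreme event threshold (mm)
  min_area_ha  = seq(100, 1000, length.out = 10)   # minimum cultivation area
)
nrow(design)  # 10,000 parameter combinations

# Hypothetical: run_calibration() would return c(false_pos, false_neg)
# for one parameter set; the retained set minimizes both errors.
errors <- t(apply(design, 1, function(p) run_calibration(as.list(p))))
best   <- design[which.min(rowSums(errors)), ]
```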

**Table 1.** Best parameters used for the calibration of the model by crop.


For soft winter wheat, two climatic regions were defined—the North and South of France—to improve the calibration results.

## *2.2. Modelling Climate Scenarios with ARPEGE-Climat*

## 2.2.1. General Methodology

Unlike most climatic projections, here the ARPEGE-Climat was not used to simulate a continuous period between 2000 and 2050 but to simulate a 400-year-long time series with year 2000 climate forcing and with year 2050 climate forcing under an RCP 8.5 scenario. The objective was to collect a large panel of possible meteorological situations, but not ones that necessarily occurred, for these two target years. These 400 years had to be interpreted as possible realizations of a given targeted year. With 400 possible realizations for the year 2000 and the year 2050, we had at our disposal a large series of data. Then, it was possible to analyze extreme events and to estimate probabilities of occurrence.

Meteorological data such as precipitation and potential evapotranspiration were the outputs of this model for the climate in 2000 and the climate in 2050. The results were analyzed on an 8 km × 8 km grid for the whole French territory.

## 2.2.2. Targeting the Year 2050

The year 2050 was chosen for this study as the target year for our climatic projections. This mid-term target year, 30 years in the future, will allow us to analyze the consequences of climate change on crop production and support public policy decisions. Within this financial context, insurers are able to make projections of their market in 2050. The target year 2100—widely used by climatologists to study the impact of climate change—is too far in the future to make serious hypotheses on the evolution of agriculture, landscapes, economy, and risk management policies.

## 2.2.3. Choice of RCP 8.5

The Representative Concentration Pathway 8.5 scenario (RCP 8.5) is characterized by increasing greenhouse gas concentration levels (>1370 ppm CO<sub>2</sub>-eq in 2100). This scenario is the most extreme and corresponds to a radiative forcing of +5 W/m<sup>2</sup> in 2050 (only +4 W/m<sup>2</sup> for RCP 4.5) [7]. The RCP 8.5 scenario represents a "pessimistic" or "conservative" vision of what the climate could be like in 2050. In this scenario, the energy demand is high, with the highest greenhouse gas emissions, corresponding to a high population and modest technological improvements. In France, RCP 8.5 corresponded to a temperature increase of 2.2 °C in 2050 compared to the 1976–2005 period, and a temperature increase of 1.7 °C for RCP 4.5 in 2050 [44,45]. According to the IPCC, RCP 8.5 is the scenario that corresponds most closely to the historical emission path followed since 1992.

## 2.2.4. ARPEGE-Climat Model Description and Parameterization

The numerical model ARPEGE is a global and spectral general circulation model developed for an "operational numerical weather forecast" by Meteo-France in collaboration with the ECMWF (European Centre for Medium-Range Weather Forecasts). ARPEGE-Climat became the atmospheric part of the CNRM earth-modelling system, which couples different components of the climate system (atmosphere, ocean, land surface, sea ice). The ARPEGE grid can be tilted and stretched by changing the position of the pole and by increasing the horizontal resolution over an area of interest. This zoom ability allows regional climate to be studied with ARPEGE-Climat.

In our case, ARPEGE-Climat had the pole in Germany (9.97° E, 50.00° N). The spatial resolution over Europe was about 20 km. The time step of the model was 600 s (10 min).

The exchanges between atmosphere and soil were taken into account by the specific SVAT (Soil Vegetation Atmosphere Transfer) module SURFEX (V7) implemented in ARPEGE-Climat.

The climate forcing allowed the climate to be kept stationary using fixed parameters.


## 2.2.5. Model Outputs

The archive held model outputs over Europe and North Africa at the stretched and tilted ARPEGE grid points, at an hourly time step for 36 near-surface parameters and at a 3-hour time step for 5 altitude parameters at 9 different levels. The data were then generally interpolated on user-specific grids.

## 2.2.6. Downscaling and Post-Processing

We needed precipitation and potential evapotranspiration for metropolitan France. Precipitation could be directly extracted and interpolated on the 8 km × 8 km SAFRAN grid. Potential evapotranspiration was computed at a daily time step according to the Penman–Monteith formula, using 2 m temperature, sea level pressure, 2 m specific humidity, 10 m wind speed, surface downwards global short-wave radiation, and surface long-wave radiation. These parameters were retrieved and interpolated on the 8 km × 8 km SAFRAN grid.
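For illustration, daily reference evapotranspiration in the standard FAO-56 Penman–Monteith form can be computed as below. This is a generic sketch of the formula, not Meteo-France's operational code; it converts the 10 m wind speed to the 2 m height the formula expects, and input names are assumptions.

```r
# FAO-56 Penman-Monteith daily reference evapotranspiration (mm/day).
pet_fao56 <- function(t2m,        # 2 m air temperature (degC)
                      rn,         # net radiation (MJ m-2 day-1)
                      u10,        # 10 m wind speed (m/s)
                      ea,         # actual vapour pressure (kPa)
                      p = 101.3,  # surface pressure (kPa)
                      g = 0) {    # soil heat flux (MJ m-2 day-1)
  u2    <- u10 * 4.87 / log(67.8 * 10 - 5.42)         # FAO-56 wind height conversion
  es    <- 0.6108 * exp(17.27 * t2m / (t2m + 237.3))  # saturation vapour pressure (kPa)
  delta <- 4098 * es / (t2m + 237.3)^2                # slope of the vapour pressure curve
  gamma <- 0.000665 * p                               # psychrometric constant
  (0.408 * delta * (rn - g) + gamma * 900 / (t2m + 273) * u2 * (es - ea)) /
    (delta + gamma * (1 + 0.34 * u2))
}
```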

The imperfections of the models induced biases in the outputs, and downscaling and interpolation are not perfect methods. Therefore, we removed the biases using 30 years of the climatic reference database SAFRAN (SIM2 reanalysis).

The precipitation and the parameters used to compute the potential evapotranspiration were generally corrected with the quantile mapping method. A specific method was developed at Meteo-France for global radiation. Finally, potential evapotranspiration was corrected with the quantile mapping method.
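A minimal sketch of empirical quantile mapping in R, assuming a model series and a SAFRAN reference series over the same 30-year calibration window (vector names are illustrative):

```r
# Map each model value onto the observed distribution at the same
# empirical quantile: x -> F_obs^{-1}(F_model(x)).
quantile_map <- function(x, model_ref, obs_ref) {
  probs <- ecdf(model_ref)(x)  # quantile of each value in the model climate
  quantile(obs_ref, probs = probs, names = FALSE, type = 6)
}

# corrected <- quantile_map(arpege_precip, arpege_ref_precip, safran_ref_precip)
```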

## *2.3. Uncertainty Analysis*

The simulation results using a model chain like ours carried important uncertainties that needed to be evaluated and taken into account in the confidence interval of the results. Different uncertainties were contained in our model chain: climatic model uncertainties, index uncertainties, and damage model uncertainties.

As seen in its definition above, the DOWKI index computation is deterministic with no addition of uncertainties between the input data (*P* and *PET*) and index value. The hazard uncertainties were thus contained in the values of *P* and *PET* provided by the ARPEGE-Climat values and the downscaling process. To evaluate these uncertainties contained in the input data, we relied on two hypotheses:


The most important uncertainty lay in the crop yield loss simulations using DOWKI values and the damage model. As seen during the calibration process, false positives and false negatives induced model errors. We decided to take this uncertainty into account in the confidence interval by simulating each climate year in the ARPEGE-Climat model 100 times: For each of the 100 repeats of the same year, a yield loss value was randomly chosen within the index class at the department scale. This method allowed the confidence interval (for example, quantiles 10 and 90) to be estimated for each year and department.
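A sketch of this resampling step in R, where `losses_by_class` is an illustrative lookup holding the calibrated yield loss values per 50 mm DOWKI class for one department:

```r
# For one department-year: draw 100 yield losses from the DOWKI class the
# simulated index falls into, then report the confidence interval bounds.
simulate_department_year <- function(dowki_class, losses_by_class, n_rep = 100) {
  draws <- sample(losses_by_class[[dowki_class]], size = n_rep, replace = TRUE)
  quantile(draws, probs = c(0.10, 0.90))  # quantiles 10 and 90, as in the text
}
```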

## **3. Results**

## *3.1. Historical Reanalysis*

The relationship between the DOWKI values and yield losses for grassland is illustrated in Figure 3. The damage model is the statistical relation between climatic index and yield losses at the department scale. It is a combination of two predictive models:

	- Prediction of the probability of a claim (frequency of claims) at the department scale;
	- Prediction of the yield loss value at the department scale.


**Figure 3.** Damage function for grassland yield loss simulations: frequency of claims, percentiles 10 and 90, and average of yield losses according to the DOWKI values.

When a DOWKI value exceeded −475 mm, the departmental yield loss was equal to 42% with a probability of occurrence close to 90%. For DOWKI values close to zero, the probability of claims was significantly lower (~30%), as was the yield loss value (10%).

The calibration generated false positives and false negatives not explained by our index. False positives are departments where the index indicated, for a given year, an intense drought but without consistent yield loss. A false negative, on the contrary, is a case where a high yield loss could not be explained by the index value. Several hypotheses were formulated to explain these errors (Table 2).

**Table 2.** False positives and false negatives.

| False Positives | False Negatives |
|---|---|
| Protection measures (irrigation) | Combination of several climatic events, including droughts |
| Modifying the sowing period or harvest period, choice of varieties | Development and propagation of disease |

To validate our damage model, back testing was performed by comparing, at the national scale, the observed yield losses and the simulated yield losses (Figure 4).


**Figure 4.** Average grassland yield losses at the national scale (%) computed in the AGRESTE database and simulated by the model with DOWKI values.

The back-testing relative error at the national scale was 5.5% for grasslands (14.6% for soft winter wheat and 20.4% for winter barley). The 2011 and 2003 intense droughts were explained by the model with an underestimation of 24% (2003) and an overestimation of 2% (2011), but the highest simulated yield losses remained, as expected. The lowest yield losses at the national scale (years 2000, 2001, 2002, and 2008) were overestimated by the model, but the simulated yield losses were still the lowest in the distribution. The two droughts in 2003 and 2011 were characterized by a lack of precipitation and an increase in evapotranspiration rates. In addition, for the 2003 drought, record extreme temperatures were experienced during the summer. The main difference between these two droughts is that they did not begin at the same period of the year. The 2011 drought was a spring drought and the extreme values of DOWKI were computed in June. For the 2003 drought, extreme values of DOWKI were computed in August. The DOWKI values were more extreme for grassland than for cereals because the drought lasted all of August and the vulnerability period of cereals is shorter.

The most difficult issue with these model results is the case of the drought in 2018: high yield losses due to an extreme drought that occurred in the northeast region of France were not detected by our model. This was due to multiannual drought cycles. The DOWKI index value was initialized at 0 on 1 January of each simulated year, whereas in 2017, the soils were abnormally dry in December.

As shown in Figure 4, the back testing of the model showed its capacity to simulate extreme drought events and predict the national yield loss.


## *3.2. Agro-Climatic Model Results in 2000 and 2050*

## 3.2.1. Comparison of DOWKI Distributions between 2000 and 2050

Using ARPEGE-Climat, two event sets of 400 years (2000 climate and 2050 climate) were computed. The first issue was to determine whether these distributions were significantly different. The numerous repeats in each target year allowed us to use a statistical test to answer this first question.

We compared the distributions of the annual average national scale DOWKI values with a Wilcoxon–Mann–Whitney non-parametric test commonly used to compare medians of two samples that do not follow a Gaussian distribution. The test rejected the null hypothesis that the two distributions were samples from continuous distributions with equal medians. The *p*-value was equal to 0.027.
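In R, this comparison reduces to a call to `wilcox.test()`, which implements the two-sample Wilcoxon–Mann–Whitney rank-sum test; `dowki_2000` and `dowki_2050` stand for the 400 annual national averages per climate (illustrative names):

```r
# Two-sided rank-sum test of the null hypothesis of equal medians.
res <- wilcox.test(dowki_2000, dowki_2050, alternative = "two.sided")
res$p.value  # the paper reports p = 0.027, rejecting equal medians
```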

## 3.2.2. National Analysis

In this first approach, we analyzed the frequency of extreme droughts with an intensity equal or superior to those of 2003 and 2011 at the national scale. In the current climate distribution, 29 drought events were identified. A quick estimation of the return period of these extreme droughts in the current climate was 13 years. This first result is consistent with the 30 years of available historical data (1989–2018), with two extreme droughts (2003 and 2011), giving an empirical return period of 15 years. In the 2050 scenario, 57 extreme drought events were identified, with a return period of seven years.
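These return periods follow directly from the size of the event sets:

$$T = \frac{N_{\text{years}}}{N_{\text{events}}}, \qquad T_{2000} = \frac{400}{29} \approx 13.8 \text{ years}, \qquad T_{2050} = \frac{400}{57} \approx 7.0 \text{ years}$$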

On average, these droughts affected 81.7% of utilized agricultural land (UAL) in 2000 and 86.1% in 2050. All these events were systemic, with a minimum of 61.8% (2000 climate) and 52.7% (2050 climate) of UAL affected by drought.

The annual DOWKI value at the national scale decreased by 40% (DOWKI was equal to −78 mm for the climate in 2000 and −110 mm for the climate in 2050) when comparing the 2000 and 2050 distributions.

At the national scale, the effect of climate change on the frequency of extreme droughts, considering 2003 as the reference, will increase significantly (+100%) between 2020 and 2050, according to the ARPEGE-Climat simulations. These events will remain systemic in the climate in 2050, with at least 50% of the UAL affected by drought.

Beyond the average, Figure 5 illustrates the yield losses (%) at the national scale for soft winter wheat (a), winter barley (b), and grassland (c) with respect to their return period in the current climate and in 2050.

A lot of information can be extracted from Figure 5. When integrating the model uncertainties (percentile 10–90), the empirical cumulative distribution functions (ECDF curves) between the 2000 climate and the 2050 climate did not overlap over 10 years, showing a significant increase in yield losses between 2000 and 2050 for all return periods. First, the annual average loss will increase in 2050 by:


The yield losses due to 10-year droughts will increase by 35% for grassland and by more than 70% for soft winter wheat and winter barley.


The results show a more important yield loss increase for cereals than grassland (Table 3) due to 10-year droughts. The whole of France would be impacted by a significant increase in risk in 2050, but the evolution would be even more significant in the northern half of France, where straw cereals are cultivated. Indeed, we analyzed the DOWKI values of 10-year droughts between the 2000 climate and the 2050 climate: A critical increase in the water balance anomaly (30–50%) was registered in the North of France, particularly where cereals are cultivated [46]. In the South of France, for 10-year droughts an increase in the water balance anomaly of 10–30% was recorded [46].


**Figure 5.** Average yield losses and percentile 10–90 on 100 simulations for the year 2000 and the year 2050 (RCP 8.5) for (**a**) soft winter wheat, (**b**) winter barley, and (**c**) grassland.

**Table 3.** Average yield losses at the national scale for soft winter wheat, winter barley, and grassland for 10-year droughts for the climate in 2000 and the climate in 2050.


The return period of the highest losses (a 50-year return period in the current climate) will fall to 19.6 years for grasslands and winter barley and to 26.1 years for soft winter wheat.

In terms of output and income losses, these droughts will affect the agricultural economy with a loss of:


## 3.2.3. Regional Analysis

Were the evolutions highlighted at the national scale consistent with a geographical study at the departmental scale?

We analyzed the intensity and frequency of extreme droughts at the local level using DOWKI values computed at an 8 km × 8 km scale and crop yield losses simulated at the departmental level.

The results presented in Figure 6a illustrate that the increase in the water deficit will be more significant on average in the South of France (southwest and Mediterranean region). Overall, we observed a worsening water deficit of 30% to 50% throughout France and above 50% in the south.

The translation of the DOWKI values in terms of yield losses showed the following results: the northeastern and southeastern parts of France will incur high yield loss increases for straw cereals. Depending on the department, Figure 6b,c shows that the yield losses will increase by 30% to 100% in 2050 for straw cereals. For grasslands, the whole of France will be affected by a significant increase in yield losses of between 30% and 75%.

**Figure 6.** Annual average evolution between the climate in 2000 and in 2050 of (**a**) DOWKI values, (**b**) yield losses of soft winter wheat, (**c**) yield losses of winter barley, and (**d**) grassland.

Brittany, Normandy, and the coastal northern regions showed the lowest evolution of the drought index (<30%). In these areas, yield losses for grassland will increase by 30% to 50% on average.

## **4. Discussion**

## *4.1. Comparison of the Results with Others Studies*

This study shows that significant droughts from the recent past generated high yield losses at the national scale with a systemic impact on the French territory. Events more extreme than those observed were simulated under the current climate in terms of hazard and yield losses. Their return period was estimated by our model to be 13 years. Our results show that the frequency of these extreme events will increase in the future, to a return period of seven years.

These results are consistent with the ClimSec project [39]; with the IPCC studies [7,44], which focus on extreme events; and with other European studies using EURO-CORDEX models [47–50]. Climatic projections indicate that droughts will have a severity never before registered in terms of spatial extension and intensity. Other studies point out that the frequency of extreme drought events will strongly increase in the future, leading to a crop yield decrease, including for grasslands, under the RCP 8.5 scenario in all French territory [26]. In addition, studies focusing on specific countries and analyzing the evolution of droughts using climatic indices show an increase in severe drought in Greece [51], a decrease in wheat yield due to drought severity [52], and an increase in drought frequency and severity in Spain [53], as well as in other areas such as China [54,55] and the United States [56].

Many studies show that the Mediterranean region appears to be very exposed to droughts in the future [57,58]. Indeed, the different models used at the regional scale (RCM models) to measure the impact of climate change on drought events agree that the droughts will be more intense in southern Europe, especially in the Mediterranean region [7,57,59]. Extreme heat wave studies show that the Mediterranean region will therefore probably record a cumulative water deficit anomaly; however, it is more widely throughout France that the evolution between the climate in 2000 and the climate in 2050 will be the most marked [60–63].

These extreme events are the most worrying for the sustainability of agricultural production systems because they generate very significant losses at the country level, affecting food security. For example, the extreme drought in 2011 was responsible for losses of more than USD 1 billion for animal production in the United States [64]. In the European Union, losses due to the 2003 drought are estimated at EUR 13 billion, including EUR 4 billion for France [65]. Nowadays, it is well documented that in many rural areas, small farms do not have the financial capacity to cope with systemic climate shocks [66]. In the future, climate change will increase extreme drought frequency [8,67], which raises the question of the resilience of farm income. The improvement of risk knowledge supports the assessment of the risk management systems currently in place and their sustainability in the context of climate change.

## *4.2. Limits of This Study*

The first limit to this work is the use of a single climate model. It was important for the authors to question the reliability of this climate model. The specificity of our approach was to simulate 400 years of steady-state climate under the conditions of the years 2000 and 2050. Was the variability of other CORDEX-Drias models contained in these 2 × 400-year event sets? CORDEX-Drias simulations between 1985 and 2005 (current climate) and 2040–2060 (climate in 2050) for six different models were compared with ARPEGE-Climat. It appears that the current climate, future climate, and evolution ratio of the six models at the French scale were included in the ARPEGE-Climat 400-year outputs, as shown in Figures A1 and A2 in Appendix A. After this validation was complete, it was obvious that obtaining extreme event values was tougher when mixing short-scale event sets from six different models than with the use of the ARPEGE-Climat model. This study highlights the relevance of using large-scale event sets to represent the variability of climate, especially for extreme values.

Another limitation is the computation of crop yield losses using the Olympic average. This method allowed us to integrate a certain variability of yields over time. However, the crop yield loss computed was annual and was a sum of different factors, and this explains, in part, the errors in the model. Moreover, crop yields are not stable over time, and many authors have shown that the cereal yield in France increased until 1996 and then stagnated or decreased [68,69]. However, the results contrast depending on geographical locations. Many studies have been done to eliminate bias introduced by non-climatic factors in the computation of yield losses [70,71], and it would be interesting to apply this kind of methodology. However, other factors may arise the same year, such as several climatic events. This was the case in 2003. A significant frost occurred in the central region of France, which contains 20% of the cultivated area for soft winter wheat [72]. The effects of frost accumulated with those of drought, which partly explains the significant crop losses and our difficulty in simulating them using a drought index.

Finally, this study was conducted with all other elements being equal by definition. It did not take into account agricultural adaptation to climate change. The cultivated area for each crop modeled was the same in 2050. Our methodology was to project yield losses based on the relation between index values and historical yield losses. Therefore, the cultivation of varieties resistant to extreme drought was not included in these results.

## **5. Conclusions**

This paper analyzed the intensity and frequency of extreme agricultural droughts in 2050. For this purpose, the analysis focused on three crops: soft winter wheat, winter barley, and grassland. A new meteorological index was developed, which represents a cumulative water anomaly and is correlated to the yield losses. The model we created simulated the crop yield losses at the departmental scale from the index values. Then, the index was projected to 2050 using the ARPEGE-Climat model from Meteo-France. The results compared the intensity and frequency of extreme droughts between the climate in 2000 and the climate in 2050 and show that the yield losses due to a 10-year drought will increase by 35% for grassland and by more than 70% for cereals.

Within the frameworks of both (1) the new CAP program (2023–2027) and (2) the French risk management scheme reform, these numbers are useful to alert and inform political stakeholders to the consequences of climate change, at the national and regional scale, on grassland and cereals. Our results show that to calibrate a risk management scheme and to be able to estimate the national farm exposure in the mid-term, the evolution of climatic extremes has to be taken into account. Insurers, reinsurers, public funds, and farmers (individually and globally) are exposed at different levels to the increase in climatic events in the next 30 years.

Insurance and reinsurance solvability is linked to the capacity to face extreme losses and, by definition, to the capacity to model the frequency/intensity curve. Nevertheless, as shown in this paper, this frequency/intensity curve cannot be considered stationary over the next 30 years. Under this condition, the pricing of treaties that aims to balance losses in the mid-term has to integrate a mix of current and future losses.

The next step will be to integrate risk management scenarios in our model and to estimate the losses for the different stakeholders. A public–private partnership is a promising route to face systemic extreme events when insurance mutualization is to be reconsidered. Today, in France, the crop insurance diffusion rate is 30% for cereals and less than 2% for grasslands [73]. In this respect, after the occurrence of an extreme event at the national scale, the State must intervene to support farmers' resiliency. A significant increase in the diffusion rates is one way to achieve sustainable agriculture in the context of increasing risks.

Agriculture has always been able to adapt to the changing climate. However, considering pessimistic scenarios like RCP 8.5 and the fast increase in extreme droughts, risk management policies must support national agricultural production during the adaptation period.

**Author Contributions:** Validation, J.C. and J.-M.S.; writing—original draft, D.K., D.M. and M.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Data used for yield are available at https://agreste.agriculture.gouv.fr (accessed on 20 January 2022).

**Acknowledgments:** The authors want to thank Elizabeth Harader-Coustau for helping with the revision of the manuscript and Lea Boittin for the English corrections. We also want to thank our anonymous reviewers for their valued advice on the structure of the paper and the addition of complementary studies.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **Appendix A**

Using a single climatic model can generate bias in the results. This section presents the multi-model study. To analyze it with an objective approach, the data from five climatic models were downloaded (IPSL-CM5A, CNRM-CERFACS-ALADIN, NCC, MPI, MOHC-HadGEM2) using:

• The years 1985–2005 for the current climate;
• The years 2040–2060 for the future climate according to RCP 8.5.


The parameters we analyzed were the annual average DOWKI values for the French territory and the evolution of the annual average between 2000 and 2050 for each model. To compare the DRIAS models with ARPEGE on the same basis, we randomly chose a set of 20 years in the current climate and 20 years in the future climate 100 times in the ARPEGE event set.

The distribution of the 100 values for ARPEGE for the annual average values and the evolution of the annual average values were compared. As shown in Figures A1 and A2 below, we can see that the 400 years of ARPEGE simulations contained the annual average values of the five models and their evolution.
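A sketch of this subsampling in R, with `dowki_2000` and `dowki_2050` again standing for the 400 annual national DOWKI averages (illustrative names):

```r
# 100 random draws of 20 years from each 400-year event set, to put
# ARPEGE on the same footing as the 20-year CORDEX-Drias windows.
set.seed(42)
draws <- replicate(100, c(
  current = mean(sample(dowki_2000, 20)),
  future  = mean(sample(dowki_2050, 20))
))
evolution <- draws["future", ] / draws["current", ]  # per-draw evolution ratio

quantile(draws["current", ], probs = c(0.05, 0.10, 0.90, 0.95))  # boxplot/whisker bounds
```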

**Figure A1.** Distribution of the 100 average annual DOWKI values from ARPEGE in black. The limits of the box plot represent percentiles 10–90 and the error bars represent percentiles 5–95. The average is also represented by a bar in the boxplot. The average annual DOWKI values computed in 5 models are represented in color. The two distributions are computed for the climate in 2000 and the climate in 2050.

**Figure A2.** Distribution of the evolution between the climate in 2000 and the climate in 2050 of the annual average values of DOWKI from ARPEGE-Climat in black. The limits of the boxplot represent percentiles 10–90 and the error bars represent percentiles 5–95. The average of the 100 values is also represented by a bar in the boxplot. The evolution of the average annual DOWKI values between the climate in 2000 and the climate in 2050 computed in 5 other climate models are represented in color.

We can thus consider that ARPEGE, with a long-range simulation of 400 years, takes into account more uncertainties than the five DRIAS models.


*Article*

## **Modelling Fire Risk Exposure for France Using Machine Learning**

**Baptiste Gualdi <sup>1</sup> , Emma Binet-Stéphan <sup>1</sup> , André Bahabi <sup>1</sup> , Roxane Marchal 2,\* and David Moncoulon <sup>2</sup>**


**Abstract:** Wildfires generating damage to assets are extremely rare in France. The peril is not covered by the French natural catastrophes insurance scheme (law of 13 July 1982). In the context of the changing climate, Caisse Centrale de Réassurance—the French state-owned reinsurance company involved in the Nat Cat insurance scheme—decided to develop its knowledge on the national exposure of France to wildfire risks. Current and future forest fire events have to be anticipated in case one of them threatens buildings. The present work introduces the development of a catastrophe loss risk model (Cat model) for forest fires for the French metropolitan area. Cat models are the tools used by the (re)insurance sector to assess their portfolios' exposure to natural disasters. The open-source national Promethée database focusing on the South of France for the period 1973–2019 was used as training data for the development of the hazard unit using machine learning-based methods. As a result, we observed an extension of the exposure to wildfire in northern areas, namely Landes, Pays-de-la-Loire, and Bretagne, under the RCP 4.5 scenario. The work highlighted the need to understand the multi-peril exposure of the French country and the related economic damage. This is the first study of this kind performed by a reinsurance company in collaboration with a scholarly institute, in this case EURIA Brest.

**Keywords:** forest fires; Cat model; climate change; disaster risk; machine learning; R programming language

## **1. Introduction**

In the world, large forest fire events are generating significant damage to natural ecosystems, human lives, and critical infrastructures [1]. In the last few years, large events occurred especially in the United States and in Australia [2]. In 2017 and 2018, in California, wildfire events were estimated, respectively, to have caused \$12 bn in damages for the Tubbs Fire and the Camp Fire. It has been estimated that wildfires caused \$150 bn damage globally, with \$27.7 bn for direct losses to buildings and houses, or 20% of the total [3–5]. Between 2011 and 2020, the average annual loss for the USA was \$4.7 bn for forest fires [3]. More recently, we had in mind last year's Black Summer in Australia, with the sad images of koalas and kangaroos burnt by the flames; in addition to the 10,000 people displaced, 25 people died, 5.5 million ha were burnt, and 2448 homes were destroyed [6]. Those wildfires generated colossal economic losses. Periods of long and intense drought elevate fire risk, which is especially the case in Canada and the Western USA [7]. Nowadays, in early July 2021, the world watches, helplessly, the heat wave hitting Lytton (Canada), which recorded temperatures of 49.6 °C, with flames destroying the city [8], as well as the large events in Greece and Turkey due to the greatest heat wave in thirty-four years [9].

Modelling wildfire is a complex task, as several parameters have to be defined (fire propagation, fuel, wind speed, terrain type, smoke, prevention actions and building


susceptibility). This leads to the development of detailed models to assess fire propagation, ignition, and dynamics, as exemplified by the well-known models of the literature (FlamMap, FSPro, FARSITE, FIRETACTIC, FEPS, HYSPLIT, PHOENIX, and Minimum Travel Time) [1,4,10–14]. The insurance sector also developed models at the asset level, modelling the roofs, walls, and windows which were the most susceptible to burning with destruction functions [3,11,15–17]. In comparison to the US or Australia, southern Europe records less burnt surface, with 7.4 million ha burnt between 2000 and 2018 [18]. For France, from 1982 to 2017, 12 events were recorded for a total of 350,000 ha burnt in the Mediterranean area (EM-DAT data 2021, https://public.emdat.be/, (accessed on 2 August 2021)). Then, comparing these elements to the Promethée database (Promethée data, https://www.promethee.com/, (accessed on 12 October 2020)), France records an accumulation of small events (approximately 2000 events per year) with small-to-medium surfaces, with strong spatial and temporal fluctuations (approximately 7.3 ha burnt per year). At the end of August 2021, the South of France recorded a large wildfire event of 8100 ha in the natural park Plaine des Maures. The event was extreme in terms of its propagation speed of up to 8 km/h, destroying a dozen houses in Val de Gilly Grimaud. It is the worst wildfire event occurring in France since 2–3 September 2003, when flames destroyed—in the same location—20,000 ha of forest, with three lives lost. It was demonstrated that the 1994 fire protection was successful [19]. On the contrary, for other countries, as explained by [20], there is a lack of internationally coordinated safety procedures for wildfires.

Nevertheless, considering climate change, it is important to anticipate the future exposure. Indeed, France is the fourth European country in terms of forest cover, with 17 million ha of forest, which implies an increase in exposure in the next few years. Climate change affects the frequency of wildfires due to anomalous maximum temperatures, lower humidity, higher maximum wind speed, and fewer rainy days [5,21–23]. A study projected the fire danger due to climate change in Southern France [24]. In 2019, a lot of kermes oak trees died due to the heatwave, as the lethal temperature was reached, with temperatures of more than 60 °C measured (French Ministry of Ecology, 2021, https://www.ecologie.gouv.fr/prevention-des-feux-foret, (accessed on 7 September 2021)). There is a clear need to develop a model to identify the exposed areas, and to protect them from significant losses (to the ecosystem and to the economy). Taking those elements into account, CCR experts in modelling the natural disasters covered by the Nat Cat scheme took an interest in forest fires. In collaboration with EURIA Brest, we developed a Cat model from scratch within seven months, from data collection and hazard modelling through machine learning to exposure and damage estimates. A Cat model is the tool of the (re)insurance sector to estimate the consequences of natural disasters on their portfolios. It is composed of three submodules: the hazard, vulnerability, and damage units (Figure 1). We aimed to test the ability of machine learning-based methods to model the fire hazard, namely burnt surface and occurrence [25]. The hazards themselves, such as fire smoke and earth imagery, were not the target of this study. The outputs of the hazard and vulnerability units are combined in the damage unit in order to provide estimates of the amount of loss due to the natural events. A special interest was taken in the wildland–urban interface in order to consider the increasing number of houses in the French littoral at-risk areas [26]. Due to data availability, the RCP 4.5 IPCC scenario was used [27]. The model was developed using the R programming language. We combined meteorological data from spatially synchronized Safran daily weather data, building locations from the BD TOPO IGN®, and insured values at the department scale.

**Figure 1.** Cat model structure used by the (re)insurance sector to estimate the amount of loss due to natural disasters, in this case wildfire events. For the study, machine learning-based methods were integrated into the model, namely into the hazard unit. The latter was based on the use of machine-learning methods to define the number of fire events and the burnt surface. Once the best model for each of the variables was defined, the outputs were combined with the damage unit. The vulnerability unit gathers all of the information about the insured portfolios with building locations and insured values. Then, the damage unit (risk assessment) allows the loss estimation for a wildfire event.

Cat modelling allows a probabilistic assessment of wildfire risk, examining key locations in order to determine the potential property losses; the model calculates risk by looking at a range of factors, in this case simplistic factors, in order to define the first exposure of France. The aim is to estimate the probability of fire occurrence and the burnt surface per DFCI (DFCI, https://www.geonov.fr/smartdata/carroyage/, (accessed on 12 October 2020)) mesh according to the Safran meteorological conditions of the mesh. This study was designed to demonstrate learning for residential exposure.

The paper is structured as follows. Section 2 relates the data collection process and the implemented machine learning methods for Cat model development. Section 3 presents the results of the Cat models from current to future exposure with potential losses. The paper ends with the discussions and conclusion.

## **2. Materials and Methods**

## *2.1. Fire and Meteorological Data as the Input Data for a Machine Learning-Based Hazard Unit*

The historical patterns of wildfires in Southern France were based on the Promethée dataset from 1973 to 2019. The database contains, for each DFCI mesh, several metrics (Table 1).

**Table 1.** Metrics of the Promethée database used for the study.

| Metrics | Details |
|---|---|
| Date | Date of occurrence of the wildfire |
| Number | Id of the fire |
| Type of fire | Unfilled variable |
| Department | Localization of the fire |
| INSEE ID | French ID for community |
| Community | Name of the community |
| DFCI mesh | Id of the DFCI mesh |
| Origin of the alert | Policemen, population, aerial, etc. |
| Alert | The date and hour of the first fire alert |
| Max_burnt_surf | Maximal burnt surface for each DFCI mesh |

The DFCI geographical mesh system is used in France by actors for fire prevention from the 100-km to the 2-km resolution (Figure 2).

**Figure 2.** DFCI mesh, from the national to the local scale (https://www.data.gouv.fr/fr/datasets/carroyage-dfci-2-km/, (accessed on 12 October 2020)). The mesh has a value of 100 km with a letter code, exemplified with "LD". Then, for the 20 km resolution, two figures are added—e.g., "LD26"—and for the 2 km a letter and a figure are added—e.g., "LD26G2"—providing a unique code for each mesh.

The Fire Weather Index (FWI) system was developed by the Canadian Forest Fire Danger Rating System (CFFDRS) in the seventies [28]. This indicator is used worldwide as a trustable indicator for the study of climate change effects in fire exposure [29]. The indicator is available on the EFFIS Copernicus website throughout Europe. The FWI is calculated daily by Météo-France for France via Arpège-Climat 4.6 over the period of reference: 1959–2007. The FWI data were downloaded from Drias's Météo-France platform for the period 1973–2007. The FWI is only available on a seasonal average from March to November. The higher the FWI is, the greater the probability of wildfire is. The winter season is not studied in this article.

The Safran data provide information about the temperature, humidity, wind, and precipitation. The data were downloaded from the Drias's Météo-France platform. The resolution is 8 km × 8 km; for the different RCP scenarios, a total of 8602 points cover France. The evolution of the critical meteorological parameters was calculated for 1973–2005 and for the 2050 horizon under RCP 4.5, considering the seasonality of the parameters and a 20 × 20 km<sup>2</sup> analysis. The study focuses on a seasonal timescale in order to highlight the variation of the meteorological metrics. Climsec Météo-France data are available as a seasonal average for the entire year. We assume that the fire event within a DFCI mesh is uniquely determined by the mesh's meteorological conditions.

EURO-CORDEX (Coordinated Downscaling Experiment) data are available daily by Safran point. The data were reanalyzed in order to obtain them for the same seasonality as the FWI and Climsec data (Table 2).

**Table 2.** EURO-CORDEX and Climsec data.

| Metrics | Details |
|---|---|
| TASMIN | Daily minimal temperature at 2 m (altitude) |
| TASMAX | Daily maximal temperature at 2 m |
| TAS | Daily averaged temperature at 2 m |
| PR | Daily precipitation (mm) |
| SFCWIND | Wind speed at 10 m (altitude) (m/s) |
| SPI | Meteorological drought for 3 months |

The meteorological Safran data were overlaid with the DFCI mesh. The resolution is 20 km × 20 km. The meteorological database is thus based on 1467 meshes (Appendix A).

## *2.2. Machine Learning-Based Methods for the Development of the Hazard Unit*

Artificial intelligence and machine learning methods have been used in wildfire science since the 1990s. Within the sets of available tools suggested in Jain et al. [30], we focused only on the following: (i) decision trees, (ii) support vector machines, and (iii) artificial neural networks.

We used machine learning-based methods for the development of the hazard unit from the historical data collected in the Prométhée database. We focused only on the definition of the burnt surface and the occurrence of fire events. The meteorological data were integrated as indicators in order to assess their consequences on the area covered by the fire, and on the occurrence.

## 2.2.1. Burnt Surface

The first approach was to predict the burnt surface in each DFCI mesh, and to validate it against the real historical data. The training data were the total burnt surfaces and the maximal burnt surface. In order to solve the issue of extreme events and fires with low intensity, we focused only on fires between 1 ha and 100 ha. The first tested method was the adjustment according to a statistical law. Indeed, this makes it possible to study the data as draws of a random variable X, the law of which was known but the parameters of which were not. In order to do this, we must choose a known law of X that seems to be close to the distribution of our data. Then, by a method of optimization of the parameters of this law, we find the parameters that maximize the likelihood between the data and the density of this law. We can predict future data by randomly drawing from this law as many times as necessary in a statistical approach. Linear regression establishes a linear relation between an explained variable and one or more explicative variables. The model was defined as follows, with Y being the explained variable, X<sub>1</sub>, ..., X<sub>p</sub> being the p explicative variables, ε being the error, and β being the parameters of the model:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon$$

We focused on the least-squares method, which minimizes the squared deviation of the estimated regression. The R tools used were the fitdistr function of the MASS package and the fitdistrplus package. After different tests, the most appropriate law was Burr's law.
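As an illustration, the following is a minimal R sketch of such a fit with fitdistrplus, assuming the Burr density functions (dburr/pburr/qburr/rburr) from the actuar package are used; the data vector and start values are illustrative, not the Prométhée data.

```r
library(fitdistrplus)
library(actuar)  # provides the Burr density functions used by fitdist()

# Illustrative burnt surfaces (ha); the real input is the Prométhée database
burnt <- c(1.2, 2.4, 1.8, 3.9, 7.5, 15.2, 2.2, 41.6, 5.3, 88.1, 9.7, 1.4)

# Maximum-likelihood fit; start values are a rough guess and may need tuning
fit <- fitdist(burnt, "burr", start = list(shape1 = 2, shape2 = 1, rate = 0.5))
summary(fit)
plot(fit)  # includes the quantile-quantile diagram mentioned in the text

# Draw several thousand values to estimate the average burnt surface
sims <- rburr(10000, shape1 = fit$estimate["shape1"],
              shape2 = fit$estimate["shape2"],
              rate   = fit$estimate["rate"])
mean(sims)
```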

The second process was to test a neural network. A neural network is composed of neurons distributed in several layers: input neurons, neurons in different hidden layers, and output neurons. The input data pass through the hidden layers, which modulate them by different weights and biases, producing a value at the output neurons [31–33]. Then, the squared error of the prediction was calculated by comparing the differences between the data and the predicted value. The neural network was then modified in order to minimize this error. By repeating this operation, the neural network obtains accurate predictions while avoiding overlearning. The package used was nnet. The network size was 10, with 40 iterations.
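A minimal R sketch with the nnet package, using the settings quoted above (size 10, 40 iterations); the training frame is a simulated stand-in for the seasonal indicators, not the study's data.

```r
library(nnet)

set.seed(7)
# Hypothetical stand-ins for the meteorological indicators and burnt surface
train <- data.frame(fwi     = runif(200, 0, 30),
                    tas     = rnorm(200, 22, 4),
                    surftot = rexp(200, rate = 0.05))  # burnt surface (ha)

# linout = TRUE gives a linear output unit, suited to a regression target
fit <- nnet(surftot ~ fwi + tas, data = train,
            size = 10, maxit = 40, linout = TRUE, trace = FALSE)

# Mean absolute error on the learning base
mean(abs(predict(fit, train) - train$surftot))
```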

## 2.2.2. Fire Occurrence

The objective was to predict the probability of a DFCI mesh being affected by a fire event, and to validate it with the real historical data. The training data were the number of fires recoded as a Boolean variable, in order to predict the occurrence of at least one fire event for the summer period. Three machine learning-based methods were tested, and are detailed below. In addition to the machine learning models developed with R, we tested the ArcGis® GIS-based machine learning hot-spot analysis.

A decision tree is a decision support tool that takes the form of a tree. At each node, a decision is taken according to a parameter, and we descend in the tree to a new node until we arrive at a leaf [34]. In a classification tree, the leaves contain qualitative variables (labels); in a regression tree, the leaves contain quantitative variables. The package used was rpart. The maximal depth of the trees was 3, with at least 50 individuals per terminal node. Between 1973 and 2005, during the summer season, 64% of the meshes recorded at least one fire; the learning base gathered 50% of the database.
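A minimal R sketch of such a tree with rpart, using the settings quoted above (depth 3, at least 50 individuals per terminal node); the mesh database below is simulated, and the column names are illustrative.

```r
library(rpart)

set.seed(3)
# Hypothetical seasonal mesh database: did at least one fire occur?
db <- data.frame(fire     = factor(rbinom(2000, 1, 0.64)),
                 fwi      = runif(2000, 0, 30),
                 dry_days = rpois(2000, 40))

tree <- rpart(fire ~ fwi + dry_days, data = db, method = "class",
              control = rpart.control(maxdepth = 3, minbucket = 50))

# Confusion matrix on the learning base
table(predicted = predict(tree, db, type = "class"), observed = db$fire)
```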

A Random Forest aggregates many decision trees, whose main advantages are their readability and speed of execution [35]. The package used was randomForest; 500 trees were chosen, with a height of five, and with at least 10 individuals per terminal node.
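A minimal R sketch with randomForest and the settings quoted above (500 trees, at least 10 individuals per terminal node). randomForest exposes no direct tree-height parameter; maxnodes = 32 is used here as an approximation of a depth of five, which is an assumption, not the authors' exact setting.

```r
library(randomForest)

set.seed(11)
# Hypothetical seasonal mesh database (same structure as the tree sketch)
db <- data.frame(fire   = factor(rbinom(2000, 1, 0.64)),
                 fwi    = runif(2000, 0, 30),
                 tasq50 = rnorm(2000, 20, 3))

rf <- randomForest(fire ~ fwi + tasq50, data = db,
                   ntree = 500, nodesize = 10, maxnodes = 32)
rf  # prints the out-of-bag error and confusion matrix
```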

A Support Vector Machine (SVM) is a supervised learning technique. If the points (target values) are linearly separable in the space of explanatory variables, the SVM will search for the hyperplane boundary (the decision boundary) [36]. However, the points may not be separable by a hyperplane, and it is then possible to reconsider the problem in a higher-dimensional space [37]. In order to deform the original space, we apply a kernel function; in this new space, it is then likely that there is a linear separation. The package used was e1071, and the duration of the calculation was a few minutes.
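A minimal R sketch with e1071 and the radial kernel retained later in the text; the data frame is a simulated stand-in for the 14 meteorological indexes, and the γ value is illustrative.

```r
library(e1071)

set.seed(5)
# Hypothetical learning base: fire occurrence and two of the indexes
db <- data.frame(fire = factor(rbinom(1000, 1, 0.64)),
                 fwi  = runif(1000, 0, 30),
                 pr   = rexp(1000, 0.1))

model <- svm(fire ~ ., data = db, kernel = "radial", gamma = 0.5)

# Confusion matrix on the learning base
table(predicted = predict(model, db), observed = db$fire)
```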

The space–time path is useful to visualize and understand the relationship between time and geographic data. The geographical data are represented along the x and y axes, and the cube's height represents the time on the z axis [38]. The Promethée dataset fits perfectly with the ArcGis® geovisualization tool, as there is information on both the location and the time series.

In order to perform the analysis, the Space Time Cube tool (ArcGIS Pro 2.8 online support: https://pro.arcgis.com/fr/pro-app/latest/tool-reference/space-time-patternmining/learnmorecreatecube.htm (accessed on 10 March 2021)) was used on the Promethée data, in order to define the hot spots whilst considering their evolution over the timeline. The point data per year and community were integrated within the tool, and were aggregated considering space and time. The fire events were aggregated within a hexagonal grid at yearly time steps, and the spatial model then provided the evolution over time in NetCDF format. We did not define the interval distance between the points, because they are the centroids of the communities. Then, the Emerging Hot Spot Analysis tool was used to read the NetCDF file. It identifies the areas in which the fire events are statistically emerging or declining. The temporal interval is defined as one year.

## *2.3. Vulnerability and Damage Modelling*

For the development of the vulnerability unit, information about the portfolio exposure is required. First, the land-use type is needed; we used the Theia data at a 100-m resolution. Theia defines 23 land-use types; the data were re-categorized into 4 categories (Table 3). We assume the land use to be constant for the horizon 2050, as well as the number of buildings. The land-use data were overlaid with the DFCI meshes.


**Table 3.** New classification of the THEIA land-use data.


Secondly, building location data were obtained from the vector building dataset BD TOPO IGN®. In this study, we focused only on individual residential buildings, for which we have more detailed insurance data. For the damage model, we assume that when a fire crosses three departments with different building densities and insured values, the three are proportionally damaged (Figure 3).

**Figure 3.** When a fire (in red) proportionally touches the departments within the DFCI mesh, the density of the houses is applied, and we obtain the number of houses burnt and the damage costs.

In terms of insured damages, wildfires may have important consequences for buildings: they can totally destroy the infrastructure. Nevertheless, wildfire damage is not covered within the Nat Cat scheme, and events destroying residential assets are extremely rare in France. French newspapers report the costs for the firefighting effort, but not the insured losses; thus, the wildfire-related insured damages are not available. In order to bridge the gap, we used confidential CCR portfolio data aggregated at the department scale. The leaflet R package was used to map the data. We consider, in the model, that if the fire touches a house, it is totally destroyed within the damage function. As there are no data on fire-related claims, we considered the insured values as the claims. This is contrary to Australia or the USA, where buildings are destroyed and the destruction functions are then calibrated [11]. The price of a m² of building per department was downloaded (https://www.meilleursagents.com/prix-immobilier/ (accessed on 1 December 2021)). The fictive model applies the maximal historical burnt surface of each DFCI mesh since 1973 to the existing urban areas. Then, we count the number of damaged assets, *ihouse*, to which we apply the cost of a square meter of house per department, *P€M2*, and the insured values of the house and furniture at the department scale, *P€fur*:

$$Damages = \sum_{i_{house}} \left( P_{€M2} + P_{€fur} \right)$$
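A minimal R sketch of this aggregation, with hypothetical inputs (the mesh names, department prices, furniture values and the average insured surface are illustrative assumptions, not the CCR portfolio):

```r
# n_burnt: number of damaged houses per DFCI mesh (hypothetical values)
# dep: department of each mesh; price_m2: price of a m2 per department (EUR)
# avg_area: assumed average insured surface of a house (m2), an assumption
# fur_value: insured house-contents/furniture value per department (EUR)

n_burnt   <- c(mesh1 = 3, mesh2 = 0, mesh3 = 7)
dep       <- c(mesh1 = "13", mesh2 = "83", mesh3 = "13")
price_m2  <- c("13" = 3100, "83" = 4200)    # EUR per m2, hypothetical
avg_area  <- 100                            # m2, hypothetical
fur_value <- c("13" = 25000, "83" = 30000)  # EUR, hypothetical

# Loss per mesh: each burnt house counts its rebuilding and contents values
loss <- n_burnt * (price_m2[dep] * avg_area + fur_value[dep])
sum(loss)  # total potential damages
```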

Third, the Wildland–Urban Interface (WUI), developed in the USA, describes areas where wildfires and urban areas interact, generating a potential loss of properties and life [10,39]. The WUI intermix and interface types were applied to the entire French scale at the 20 × 20 km DFCI mesh.


Concerning the damage model, we apply the maximal surface of burnt areas of each DFCI mesh to the urban surface of the same mesh; then, by using the departmental insured values, we are able to calculate the potential damages.

## **3. Results**

## *3.1. Statistical Analysis*

The statistical analysis of the data highlights the variations of the total burnt areas and fire occurrence per season per year in the Promethée area. The seasonal variability is very important; it is correlated with the fact that if large areas burn during the year n − 1, the probability of fire is decreased for the year 0 or year +1 (Figure 4).

**Figure 4.** The occurrence and surface area of fire events tend to decrease over time, especially since the 1990s, with the reinforcement of the preventive measures. In terms of burnt surface, the year 2003 stands out, and is well known for the high intensity of its heat wave.

A large majority of the DFCI meshes have no or only two fire departures. On the contrary, some meshes have more than 100 fire departures over the entire studied period and over the years (Figure 5).

**Figure 5.** The two maps exemplify the diversity of the fire occurrence (**left**) and burnt surface (**right**) according to each DFCI mesh for the summer period in 2005.


For a daily analysis, the link between a high FWI and fire occurrence is important; nevertheless, when considering the average value of the FWI over three months, the link is not assured. The variability of the data poses an issue of extreme values, and adds complexity to the implementation of machine learning-based methods (see Section 3.2). In order to cope with this issue, the correlation matrix allows us to better understand the relationships and interdependence between the metrics. The number of fires is positively correlated with the FWI (0.31), the number of days without precipitation (0.21) and the mean temperature (0.29). The number of fires is, on the contrary, negatively correlated with the precipitation; in particular, the negative values are between −0.09 and −0.17. This matrix also highlights that the SPI and SSWI indexes are not correlated with the other variables, and especially not with the targeted variables NBFEUX and SURFTOT (values equal to 0). On the contrary, variables representing the same data have a high correlation coefficient; for example, the temperature variables TASQ50 and TASQ90 have a correlation of 0.96 (Figure 6).

**Figure 6.** The correlation matrix highlights the coefficient of correlation for the different possible pairs of variables. There is a positive correlation between the FWI and the Q50/Q90 of the daily temperature (0.71). We observed a correlation between the number of days without precipitation and the number of fires (0.21).
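A minimal R sketch of how such a matrix can be computed and drawn, here with the corrplot package and simulated stand-ins for the metric columns (only the names NBFEUX, FWI and TASQ50 are taken from the text):

```r
library(corrplot)

set.seed(1)
# Hypothetical stand-in for the seasonal metrics table
metrics <- data.frame(NBFEUX = rpois(100, 2),
                      FWI    = runif(100, 0, 30),
                      TASQ50 = rnorm(100, 20, 3))

M <- cor(metrics, use = "pairwise.complete.obs")
corrplot(M, method = "color", tl.cex = 0.8)  # colored correlation matrix
```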

## *3.2. Hazard Unit*

The results below show the comparison of the estimates from the machine learning-based methods between the real and modelled scenarios in order to predict the burnt surface and the probability of fire. We also integrated the projections at horizon 2050 under the RCP 4.5 IPCC scenario.

## 3.2.1. Burnt Surface

For this purpose, we started with the statistical approach based on Burr's law. We obtained a similar distribution for the majority of the low-intensity fires, and for some of the extreme events. The quality of the simulation was determined using a quantile–quantile diagram, which shows a soft Burr's law overestimating the area of the burnt surface according to the seasons. The average error was 0.7. The densities are coherent, with a very high probability of low fire intensity and a very low probability of extreme fire: the larger the fire, the lower its probability (Figure 7). We considered this a satisfactory simulation, and drew several thousand values from this law in order to calculate the average burnt surface.

**Figure 7.** Burnt surface simulated with Burr's law (**a**), and compared to the observed data from the Prométhée database (**b**).

In a second analysis, the neural network was tested for the estimation of the burnt surface. Figure 8 highlights the average absolute difference between our predicted values and the real ones. The average error on the learning base is in blue, and the error on the test base is given in red. The error on the test base is 15.6 ha. The use of the neural network highlights the need to focus on the prediction of the occurrence or absence of fire events.

**Figure 8.** Comparison of the squared deviation for the neural network based on the Prométhée database from 1973 to 2005 concerning the burnt surface.

Regarding the training data available and the machine learning outputs, we assume that Burr's law provides better results than the neural network concerning the estimation of the burnt surface.

## 3.2.2. Fire Occurrence

In order to model the fire occurrence, we tested three machine learning-based methods and one GIS-based approach.

First, we tried the decision tree. At the least, the learning base will contain 50% of the lines. We still obtained a high number of false negatives and badly classified results compared to the other machine learning methods (Figure 9). We could use a deeper tree, but the result would not change much, and the tree would have too many leaves to be interpretable. The main variables are practically always the same, no matter which sample we take. The confusion matrix associated with the tree based on the learning base reveals 22.03% badly classified results, 40.01% false positives and 12.38% false negatives. The results are similar for the validation base, with 24.50% badly classified, 43.56% false positives and 13.48% false negatives. Based on this method, the false negatives are too high and the results are insufficient.

**Figure 9.** Extract of the decision tree developed for the study.

Next, we tested the Support Vector Machine (SVM). The objective was to linearly split the true and false results within the space of meteorological indexes (of dimension 14). The svm function of the e1071 package allows the realization of an SVM by choosing among four kernel functions: linear, polynomial, radial or sigmoid. After different tests, the radial kernel provided better results. Figure 10 provides the error rates on the learning base and the tests for a variation of the γ parameter. Among the three presented methods and related results, the three are similar in terms of performance. The most relevant rates to consider in the context of fire occurrence are the false negatives and badly classified results, at around 14% (i.e., the number of fires that the model has not predicted). We assume that the models have good performance (Figure 10).

**Figure 10.** Error rates according to the parameter γ for a radial kernel.
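A minimal R sketch of such a γ sweep with e1071::tune.svm, on the same kind of simulated learning base as the SVM sketch above (the γ grid is illustrative):

```r
library(e1071)

set.seed(5)
# Hypothetical learning base, as in the earlier SVM sketch
db <- data.frame(fire = factor(rbinom(1000, 1, 0.64)),
                 fwi  = runif(1000, 0, 30),
                 pr   = rexp(1000, 0.1))

# Cross-validated error rate for each candidate gamma, radial kernel
tuned <- tune.svm(fire ~ ., data = db, kernel = "radial",
                  gamma = 10^(-3:1))
summary(tuned)          # error rate per gamma value, as in Figure 10
tuned$best.parameters   # gamma retained by the sweep
```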

Then, in the continuity of the tests, we tested the random forest method. The implementation of the random forest on the learning base does not highlight false negatives and positives arising from phenomena of overlearning. Nevertheless, the random forest provides similar results from a dozen trees onwards; the rate of bad classification was around 25% for a random forest predicting the occurrence of fires larger than 10 ha from the FWI and the Q50 of the daily minimal temperature.

We also tried to estimate the prediction of a burnt area of more than 10 ha; the best model was a random forest (Figure 11). The results are coherent and applicable to the historical datasets and for the 2050 future climate.

**Figure 11.** The figure compares (**a**) the modelled prediction of fires of 10 ha for the summer of 2005 using a random forest, and observations from the same year (**b**).

We observe that the results of the projection are different according to the method of projection, as the models are calibrated on the Promethée area, and are projected to the entire country (Figure 12). Each model has at least 20% badly classified results and 40% false positives.

**Figure 12.** Comparison of the results from the three machine learning tools regarding the probability of at least one fire within the 20 km DFCI mesh during summer 2050 under RCP 4.5. (**a**) The decision tree projection is the most moderate model; we observed the exposure of Landes, the Rhône valley and the Vosges. (**b**) The random forest's projection highlights the Mediterranean and Corse areas' exposure, with an increasing exposure of the Rhône valley, Landes, Bretagne, Nord-Pas-de-Calais and Pays-de-Loire; this model seems to be the most coherent one. (**c**) The SVM's projection is the most pessimistic model, generating fire events in the large majority of the territory; the results are more related to the RCP 8.5 pessimistic scenario.

The results are highly coherent for the Mediterranean area. We observe some similarities, especially an increasing exposure of Bretagne, Pays de la Loire, Centre Val de Loire and the Atlantic coast (Landes forested areas), and an increasing exposure of the Mediterranean area (Occitanie and Provence Alpes Côte d'Azur) [40].

Finally, the statistically significant hot spots and cold spots are represented on the map. The red areas indicate that, over time, there is an aggregation of a high number of forest fires. The blue areas highlight a smaller number of fire events. Each hexagon is classified according to the timescale. The geographical analysis underlines the exposure of the areas near Béziers and Perpignan, which are areas of oscillating hot spots. The rest of the Mediterranean area is exposed in the same manner, without a strong evolution in time (oscillating cold spots). For North Corse, at least 90% of the temporal intervals were statistically significant hot spots (Figure 13).

**Figure 13.** Emerging hot spot analysis for the Promethée dataset 1973–2019.

## *3.3. Damage Model*

The R code provides, for each year, a table with the potential costs of a fire for each DFCI mesh. Within this model, we assume that the burnt surface is entirely inhabited. Under the hypothesis of RCP 4.5 at horizon 2050, using a random forest model predicting the occurrence or not of fires greater than 10 ha, we estimate that the damage will be, in 2050, around 35 M€ on average for residential insured areas only. The evolution of the insured values is not considered, nor are the land-use changes. The spatial repartition of the future areas exposed to wildfire events further north in France highlights the increase of the economic exposure (Figure 12). This evolution can be compared to the results of Moncoulon et al. [26] on geotechnical drought and shrinking swelling clay (SSC), in that they reveal an increasing exposure of the southern communities, as well as those in the Atlantic area.

## **4. Discussion**

Despite the many limitations associated with simulation modelling and machine learning-based methods in the experiments discussed above, the outputs from this work provide useful information on the exposure of France to wildfires. This work introduced the foundation of the Cat model for the assessment of forest fire exposure and its projection to horizon 2050. We acknowledge the limitations in the use of the model for the prediction of building exposure, considering the spatial resolution chosen. First and foremost, our predictions were only applied to the RCP 4.5 scenario. This study focused on wildfire disasters from an asset context, and we recognize that wildfires also threaten human lives and ecosystems, and can have cascading impacts on floods, landslides and potable water after severe fire events. Furthermore, the model could consider not only the physical damage but also the business interruption or critical infrastructure issues (highways, secondary roads, etc.). Furthermore, we did not consider the currently implemented fire prevention programs. We considered the exposure to be constant at the 2050 horizon. As population growth in littoral areas will be higher in the future, it will increase the exposure to fire events and greatly influence the WUI [26,41,42]. Regarding the damage model's development, wildfire-related claims data could support the determination of the damage functions; nonetheless, to our knowledge, that kind of data is not available in France. Downscaling the model could also make the results and damage estimates more precise.

Although climate change is considered under the optimistic scenario RCP 4.5, the results of the future exposure are significant enough to start raising and developing a risk culture in the future exposed areas (Bretagne, Alsace, etc.), and to maintain the currently well-structured prevention processes in Southern France. Numerous studies have focused on the asset level and considered the fire conditions, landscape and properties' structures. Here, in order to obtain a global vision of the exposure of the entire area of France, we used simplistic models. Obtaining new data is challenging, as it requires waiting for future wildfires, which potentially generate large losses, and ensuring that data of sufficient quality are collected. The relatively small number of fire events, in terms of number or burned acres, and the very low number of burnt assets in France mean that the model overestimates the number of events, and that the burnt surface is not easily calibrated on the training data. Nevertheless, it offers a new visibility of France's exposure to this kind of natural disaster for the next few years. It provides elements for discussion on the issue of the underwriting of fire risk within the Nat Cat scheme.

## **5. Conclusions**

This model synthesizes information for the French insurance sector, and contributes to understanding and reducing wildfire losses. CCR and Euria Brest developed a first-of-its-kind France Cat model projecting changes to wildfire potential under the RCP 4.5 scenario at a granularity of about 20 × 20 km. Finally, the best model for burnt-surface prediction is Burr's law, and the random forest is best for the fire occurrence.

In the future, this model could be combined with GIS analysis (distance to vegetation, slopes, and fuel type) and with satellite imagery analysis in order to make the exposure analysis more precise.

New machine learning and remote sensing data could be used to develop specific damage curves for household buildings for the vulnerability models, such as those created for hurricane damage. This is the first time, as far as we know, that a reinsurance company developed, with an institute, a prototype model that links machine learning and insurance data, and applied these models to the estimation of the expected financial loss from wildfires. Likewise, the model could be applied to other countries, as well as the pessimistic scenario RCP 8.5. The different improvements will open the door to explore a wide range of exposure management in order to reduce the climate change impact, and to support the community for preventive measures. Strong decisions have to be taken in order to avoid making 2021 the last coolest year of the rest of our lives.

This paper offers new visibility for the improvement of the preparedness in future potentially affected areas. We hope that this work will support future potentially exposed areas to integrate the analysis within their disaster risk prevention and resilience plans. Robust cat models can improve the accuracy of the predictions of the locations of the greatest risks to assets, and could provide an indication of the implementation of preventive measures. The proposed methodology could serve as a reference for wildfire risk assessment, and can be replicated elsewhere.

**Author Contributions:** Conceptualization, D.M. and R.M.; methodology, B.G., E.B.-S. and A.B.; software, B.G., E.B.-S. and A.B.; validation, D.M. and R.M.; formal analysis, B.G., E.B.-S. and A.B.; investigation, B.G., E.B.-S. and A.B.; resources, B.G., E.B.-S. and A.B.; data curation, B.G., E.B.-S. and A.B.; writing—original draft preparation, B.G., E.B.-S., A.B. and R.M.; writing—review and editing, R.M.; visualization, B.G., E.B.-S. and A.B.; supervision, D.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **Appendix A**

**Table A1.** Developed metrics from the overall database to be integrated within the machine learning tools.


## **References**


## *Article* **The Manning's Roughness Coefficient Calibration Method to Improve Flood Hazard Analysis in the Absence of River Bathymetric Data: Application to the Urban Historical Zamora City Centre in Spain**

**Julio Garrote <sup>1,\*</sup>, Miguel González-Jiménez <sup>2</sup>, Carolina Guardiola-Albert <sup>2</sup> and Andrés Díez-Herrero <sup>2</sup>**


**Featured Application: The methodology proposed in this manuscript makes it possible to improve the estimation of flood zones and their flow depth values in situations where there are no available bathymetric data of the channel (or they are scarce and do not allow for its shape reconstruction). It could improve flood risk assessment too.**

**Abstract:** The accurate estimation of flood risk depends on, among other factors, a correct delineation of the floodable area and its associated hydrodynamic parameters. This characterization becomes fundamental in the flood hazard analyses that are carried out in urban areas. To achieve this objective, it is necessary to have a correct characterization of the topography, both inside the riverbed (bathymetry) and outside it. Outside the riverbed, the LiDAR data led to an important improvement, but not so inside the riverbed. To overcome these deficiencies, different models with simplified bathymetry or modified inflow hydrographs were used. Here, we present a model that is based upon the calibration of the Manning's *n* value inside the riverbed. The use of abnormally low Manning's *n* values made it possible to reproduce both the extent of the flooded area and the flow depth value within it (outside the riverbed) in an acceptable manner. The reduction in the average error in the flow depth value from 50–75 cm (models without bathymetry and "natural" Manning's *n* values) to only about 10 cm (models without bathymetry and "calibrated" Manning's *n* values), was propagated towards a reduction in the estimation of direct flood damage, which fell from 25–30% to about 5%.

**Keywords:** flood risk; cultural heritage sites; bathymetric data; Manning's roughness coefficient; hydrodynamic modelling

## **1. Introduction**

Floods are probably the most frequently recurring natural phenomena affecting society (humans and goods) in terms of space and time, regardless of their geographical location or socioeconomic development, as shown by the data collected by the International Disasters Database for the period 1900–2018 (CRED, 2020). This is the main reason why flood risk management has become an essential tool from both a social and economic perspective, with the objective of reducing losses associated with both factors. Furthermore, urban historical centers are characterized by the presence of multiple types of cultural heritage sites, which give them a priceless value, considering the impossibility of restoring or recovering from the possible damage caused by natural hazards, in this specific case, by river flooding. This duality, and above all the irreversibility, of flood damage to cultural heritage (which cannot be reproduced once it has been destroyed) makes the prediction and assessment of the flood risks that may affect these sites a critical
task for their preservation. Given this situation, the first natural disaster management strategies that included cultural heritage among its objectives began to be developed in the 1990s. Among these initiatives, one could highlight the "Carta del Rischio" [1], which has been developed by the Italian Central Institute for Restoration since 1992, or the "Noah's Ark" project of the European Union [2], launched in 2002. At specific sites, only a few works provide applications to case studies at different scales (from regional to local scales) [3–7].

After assessing direct tangible damage (i.e., the direct damage resulting from the physical contact of floodwater with property and its contents), it was found that the economic flood losses have been increasing throughout the past half-century. In the last decade (2008–2018), the economic losses associated with floods exceeded 35 billion USD [8] and, within this period, the flood losses exceeded 19 billion USD in 2012 alone [9–11].

The design, sizing, implementation and effectiveness of flood damage mitigation measures require rigorous risk analyses [12]. Within the aspects that a flood risk analysis includes, the greatest technical efforts are usually concentrated on flood hazard assessments. Flood vulnerability analysis is the other determining factor regarding the correct estimation of risk, mainly from the use of magnitude–damage models (e.g., [13–19]). Since the most common methods of flood hazard analysis are based on hydrological and hydraulic models [20], the greatest uncertainties and sources of error may come from the input data to these models. In the case of hydraulic or hydrodynamic models, the geometry of the channel (topography and bathymetry) [21–24], the boundary conditions (roughness, flow regime, etc. [25]) and the Manning's roughness coefficient (e.g., [26–28]) are determinants of the goodness of the model and results. In short, the effectiveness of risk mitigation may depend on the correct estimation of parameters such as the roughness of the terrain surface and the detailed bathymetric characterization of the riverbed.

In addition to the key points mentioned above, the urban character of cities' historical centers must be taken into account. This latter characteristic determines the preferred flow direction of floods inside the urban areas, turning the streets into improvised river channels. To achieve all these objectives, it is essential to have topographic data that are capable of reproducing the geometry and variability of the terrain [22,24,29,30], with the information coming from LiDAR (light detection and ranging) sensors being the most widely used today to derive DEMs (digital elevation models). However, most LiDAR data are not capable of reproducing the river channel morphology due to its inability to penetrate water bodies. Airborne LiDAR sensors (one of the possible sources of LiDAR data) are usually not capable of penetrating water bodies (turbid, turbulent or deep streams and rivers), but they constitute the most common LiDAR data due to their cost-effectiveness for wide areas relative to other LiDAR sources. In fact, as is discussed by Kinzel et al. [31], the available bathymetric LiDAR techniques are usually not designed for shallow waters and are not optimized for providing the spatial resolution necessary for mapping small-to-medium-sized rivers.

However, an accurate representation of river bathymetry (bed topography) plays a critical role in multiple hydrologic and hydraulic applications, including but not limited to flood modelling [23]. To solve these limitations, the acquisition of topographic data through ground surveys and subsequent combination with subaerial LiDAR data may be an efficient solution [32–34]. However, this approach has only been used for short river reaches. When the river reach length increases, so do the logistical and cost considerations. Therefore, under the scenario of long river reach study areas, it is common to use alternative methods or models for estimating bathymetry for use in hydraulic or hydrologic analyses. These alternative models assumed a general simplification of river bathymetric geometry, both from the use of simple geometric shapes (triangular, trapezoidal or parabola for a river cross-section) or from other geomorphological and hydraulic methods. The former approach is more frequent [23,35–38]. From a hydraulic perspective, the horizontally divided channel method (HDCM) was previously used [39,40] to solve the absence of bathymetric data. The HDCM separately considered the flow above and below the bank top and is a better option [39] for dividing the flow than the vertically divided channel
method (VDCM) when the floodplain roughness is significantly greater than the channel roughness. When the flow is horizontally divided, the lower part of the flow fits with the bankfull flow (taken from the dominant discharge concept), which has a return period of 1-2 years according to the field observations of Wolman and Miller [41]. However, this bankfull flow return period is dependent on local meteorological characteristics, thus it ranges from about 2 years in the north of Spain to 5 or more years in the southern and southeastern parts of the country. Once the bankfull flow is defined, it is detracted from the inflow hydrograph. In the same way, Chone et al. [40] used LiDAR topography to subtract the flow rate at the LiDAR date from the inflow hydrograph as a solution for the absence of river bathymetric data.

Whatever the proposed methodology, model calibration (when possible) is one of the main tasks that will ensure the quality of the results obtained. To carry out this calibration process, the availability of field data related to the event to be reproduced is essential. When this information is available, model calibration by varying the value of the Manning coefficient is one of the most frequent approaches, which is described by Ardıçlıoğlu and Kuriqi [42]. However, valid information is not always available for the calibration of hydraulic models. This situation is more frequent when we do not try to model a specific event, but rather intend to model a design flow event that has a low frequency of occurrence (high return period). In these cases, and in many others, the lack of available information to calibrate the hydraulic models is compensated by the consideration of a benchmark model against which the results obtained in the rest of the models will be compared [43]. In general, this control model is defined by the availability of more or better data for it.

Several studies have already shown that incorporating bathymetry provides more accurate hydraulic simulations and flooding area estimations, which improve flood hazard analysis at the same time as flood risk assessments. Therefore, the goal of this study was not to reinforce these past findings but to try to open up another approach to improve flood hazard assessment in floodplains, that is, improving flood hazard mapping and main flow variables (flow depth) outside the main river channel (that is, into the flood plains) through the use of calibrated but not "natural" values of Manning's roughness coefficient. In other words, we looked for a Manning value or range of values that compensated for the effect that the absence of bathymetric data would have on the hydraulic modelling results. This new approach was supported by the assumption that most flood-exposed elements (people and goods) are located out of the river channel and not inside it; therefore, the flow parameters (depth and velocity) in the river channel are not really key points for flood hazard assessment in most cases. Under these assumptions, the use of two LiDAR DEMs (with and without bathymetric data) for the Douro River reach in Zamora city (Spain) allowed us to calibrate a Manning's roughness coefficient for the main river channel (which is consistent with the value obtained previously from three cross-sections of the river) so that the flooded area and the flow depth in the floodplain converged with the results obtained in the control (or benchmark) model. The hydrodynamic model using the calibrated value shows very similar flooded areas, as well as flow parameters (both for the 500-year and 100-year return period peak flows); thus, it can be used for flood hazard and risk assessment. Furthermore, the results of this new approach were compared with results from previous methodological approaches, such as the modified inflow hydrographs. Moreover, the approach proposed here can be replicated for any kind of river and it is cost-saving, as it does not require large ground surveys.
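For reference, Manning's equation relates the mean flow velocity to the roughness coefficient n, the hydraulic radius R and the energy slope S; a minimal R sketch follows, with purely illustrative values (the calibrated n of this study is not reproduced here):

```r
# Manning's equation: v = (1/n) * R^(2/3) * S^(1/2)  (SI units)
manning_velocity <- function(n, R, S) {
  (1 / n) * R^(2/3) * sqrt(S)
}

# Illustrative comparison: a "natural" n versus an abnormally low calibrated n,
# same hydraulic radius (m) and slope (m/m); all values are hypothetical.
manning_velocity(n = 0.035, R = 3.0, S = 0.0005)  # natural channel roughness
manning_velocity(n = 0.015, R = 3.0, S = 0.0005)  # low n compensating for the
                                                  # missing bathymetry
```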

## **2. Study Area and Data Description**

This study focused on comparing the flood area extension over different hydrodynamic models, where Manning's roughness coefficient was changed in a calibration process. The study area comprises the Douro River reach in Zamora city (Spain), where most of the historical urban center is located on the right side of the Douro River, but the urban center located on the left side is developed over lower-lying and flood-prone areas. The Douro River reach in Zamora (Figure 1A) is 7 km long with a mean width of 225 m, and is characterized by a relatively deep, sand-clayey channel with some fully vegetated sandy bars and with several meanders and a flat floodplain.

**Figure 1.** The study area and cultural heritage sites location map (**A**). The digital elevation model (DEM) from LiDAR data (**B**) not showing bathymetry, which was included from echo sounders with a D-GPS (**C**). Example of a cross-section flow depth calibration (**D**).

Topographic LiDAR data [44] are publicly available in the form of point cloud files, which are filtered and classified following the American Society for Photogrammetry and Remote Sensing (ASPRS) standard classification. The vertical accuracy of the LiDAR data used in this study is reported to be lower than 20 cm [44]. The terrain and building types of the LiDAR classification were used to derive the different DEMs. Due to the absence of LiDAR data in the main river channel (since water has low reflectance), it was modelled using a second-order polynomial interpolation of multiple LiDAR cross-sections. Each of these sections presented a height equal to the minimum LiDAR point value and had a length that was restricted to the channel boundaries. Although a spline and a third-order polynomial interpolation were tried, the second-order one was chosen because it provided a smooth and downstream-constant slope surface. Bathymetry data were obtained in the form of cross-section measurements from boat-mounted echo sounders with D-GPS (differential global positioning system) data acquisition. Both sets of topographic data were combined into a 1 m spatial resolution to create a final DEM (Figure 1C), which was considered to be the "real scenario" (or benchmark scenario). On the other hand, a LiDAR point cloud was also interpolated in the absence of bathymetric data into a 1 m spatial resolution DEM (Figure 1B), which was considered to be the "LiDAR scenario".

The peak flow values that were used for the hydrodynamic models were obtained from a streamflow gauge at the upstream end of the Douro River reach. A generalized extreme values (GEV) distribution was fitted to the annual maximum flow time series, and the 500-year return period peak flow value (2274 m<sup>3</sup> s<sup>−1</sup>) was selected for our analysis upon the assumption that this value can be considered a low-frequency or "extraordinary" flood event. To test the efficiency and validity of the proposal, a second peak flow value was selected: the peak flow associated with a return period of 100 years (1872 m<sup>3</sup> s<sup>−1</sup>). The reason for not selecting a lower peak flow value (one that would contrast more with the 500-year return period) was the need for the hydraulic model to include the overflow of the main channel onto the floodplain. This need was based on the premise that the usefulness of our proposal is only fulfilled in the floodplain; therefore, in both cases (different peak flows), the floodplain should be affected.
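
For readers who wish to reproduce this step, a minimal Python sketch of the GEV fitting is given below; the input file name is an assumption, as the gauge record itself is not distributed with the paper.

```python
import numpy as np
from scipy import stats

# Sketch: fit a GEV distribution to the annual maximum flow series and derive
# the return-period peak flows used in the study. The input file is an assumed
# export of the gauge record (one annual maximum, in m3/s, per line).
annual_max = np.loadtxt("annual_max_flows.txt")

# scipy parameterizes the GEV through genextreme; c is the shape parameter
c, loc, scale = stats.genextreme.fit(annual_max)

for T in (100, 500):
    # peak flow with annual non-exceedance probability 1 - 1/T
    q_T = stats.genextreme.ppf(1.0 - 1.0 / T, c, loc=loc, scale=scale)
    print(f"{T}-year return period peak flow: {q_T:.0f} m3/s")
```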

Finally, PNOA ortho-photographs from 2017 (0.25 m spatial resolution) were used in combination with a field survey to define the values of Manning's roughness coefficient (Manning's *n*); a roughness map was created for the whole study area at a scale of 1:4.000. It contains ten different roughness units, each of which was assigned a unique Manning's *n* according to previously proposed tables [45,46]. For the "real scenario", a Manning's roughness coefficient of 0.027 was assigned to the Douro River's main channel. All other Manning's roughness coefficient values, related to each terrain surface unit, were calculated and kept constant throughout the calibration process.
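
A minimal sketch of how such a roughness map can be turned into model input is shown below; the unit codes and most of the *n* values are illustrative assumptions, with only the channel value of 0.027 taken from the text.

```python
import numpy as np

# Sketch: reclassify a raster of roughness-unit codes into a Manning's n raster.
# The unit codes and most n values below are illustrative; the study mapped ten
# units at 1:4.000 and took their n values from published tables [45,46], with
# 0.027 assigned to the main river channel in the "real scenario".
N_BY_CODE = {
    1: 0.027,  # main river channel ("real scenario" value)
    2: 0.030,  # vegetated sandy bars (assumed)
    3: 0.100,  # dense riparian vegetation (assumed)
    4: 0.040,  # agricultural floodplain (assumed)
}

def roughness_raster(unit_codes: np.ndarray) -> np.ndarray:
    """Map a raster of roughness-unit codes to a raster of Manning's n values."""
    n = np.full(unit_codes.shape, np.nan)
    for code, value in N_BY_CODE.items():
        n[unit_codes == code] = value
    return n
```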

## **3. Methodology**

Figure 2 shows the full methodological approach that was used to define the best calibrated Manning's *n* value in the river channel, which allowed us to map the flooded area and hydraulic conditions in the absence of bathymetric data.

**Figure 2.** Flow chart of the full methodological approach that was used in the Manning's *n* value calibration process, from data sources until the calibrated vs. control scenarios comparison.

The entire methodology and analysis that was carried out to obtain the calibrated Manning's *n* value that best reproduced (in the absence of bathymetric data) the hydraulic conditions and the flooded area in the floodplain associated with a flood event relative to the benchmark model (availability of bathymetric data) was based on a simple premise: the flow through the channel can be expressed in a simplified form from the following equation:

$$Q = A \times V \tag{1}$$

where Q is the total flow (m<sup>3</sup> s<sup>−1</sup>), A is the cross-sectional area (m<sup>2</sup>) and V is the mean flow velocity (m s<sup>−1</sup>).

Starting from this equation, if the absence of bathymetric data meant a decrease in the cross-sectional area of the channel, an increase in the flow velocity of the water in the channel could compensate for the reduction in area and thus allow the flow rate that was circulating through the channel to remain constant. This approach is shown graphically in Figure 3.

**Figure 3.** Basic conceptual diagram for the calibration of the parameter *n*, assuming that the flow rate (Q) is a function of the cross-sectional area of the channel (A<sub>1</sub> and A<sub>2</sub>) and the mean flow velocity (V<sub>1</sub> and V<sub>2</sub>, which, in turn, is dependent on the Manning's *n* values (n<sub>1</sub> and n<sub>2</sub>)). The size of the symbols identifying the variables is directly related to the magnitude of the variables.
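
The compensation premise can be made concrete with Manning's equation, V = (1/n)·R^(2/3)·S^(1/2). The sketch below assumes a wide rectangular channel (hydraulic radius close to the flow depth) and purely illustrative values for the original area and slope; it is not a calculation reported in the paper.

```python
# Sketch: find the Manning's n that keeps Q = A * V constant when the
# cross-sectional area shrinks, using Manning's equation
#   V = (1/n) * R**(2/3) * S**(1/2),
# with a wide rectangular channel assumed (hydraulic radius R ~ flow depth).
# All numbers below are illustrative, not measured values from the study.

def compensating_n(n1: float, a1: float, a2: float, width: float, slope: float) -> float:
    """Manning's n that preserves the discharge when the area drops from a1 to a2."""
    h1, h2 = a1 / width, a2 / width                     # flow depths (wide-channel approx.)
    v1 = (1.0 / n1) * h1 ** (2.0 / 3.0) * slope ** 0.5  # velocity in the full section
    q = a1 * v1                                         # discharge to preserve
    v2 = q / a2                                         # velocity required in the reduced section
    return h2 ** (2.0 / 3.0) * slope ** 0.5 / v2

# n = 0.027 and a ~353 m2 area loss (see Section 4.1) over a 225 m wide channel
print(compensating_n(0.027, a1=1200.0, a2=1200.0 - 353.25, width=225.0, slope=5e-4))
# -> roughly 0.015, i.e., an artificially low n compensates for the lost area
```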

#### *3.1. Comparison of Bathymetric Representation*

Comparisons of DEM differences and flow depth differences, as input and output parameters of the hydrodynamic modelling, were carried out. The first comparison allowed us to quantify the spatially distributed cross-sectional area reduction due to the non-availability of LiDAR data on submerged areas of the main channel. The second one allowed us to spatially assess the distributed differences in flow depth value, as well as to select the optimal model ("LiDAR scenario" plus "best Manning's *n* value") to simulate the flood hazard levels relative to the results of the original, or control, model.

The errors in the bathymetric representation for the "LiDAR scenario" and "real scenario" were compared both using an arithmetic subtraction between the two models (Figure 4A) to obtain a difference value for each pixel of the model and by calculating the mean absolute error (MAE), as shown in Equation (2):

$$\text{MAE} = \frac{\sum_{i=1}^{n_c} |H_{L,i} - H_{r,i}|}{n_c} \tag{2}$$

where H<sub>L,i</sub> is the elevation of the *i*th cell for the "LiDAR scenario" DEM, H<sub>r,i</sub> is the elevation of the *i*th cell for the "real scenario" DEM and n<sub>c</sub> is the number of cells used for the analysis.
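
A minimal sketch of Equation (2) over two co-registered DEM arrays (their loading is assumed):

```python
import numpy as np

def mae(dem_lidar: np.ndarray, dem_real: np.ndarray) -> float:
    """Equation (2): mean absolute error between co-registered DEM rasters,
    computed over the cells that are valid in both."""
    valid = ~np.isnan(dem_lidar) & ~np.isnan(dem_real)
    return float(np.abs(dem_lidar[valid] - dem_real[valid]).mean())
```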


**Figure 4.** Topographic differences between the "real scenario" and the "LiDAR scenario" (**A**), where green colors show lower differences and red colors show higher topographic differences. A grouping analysis (**B**) of the errors without spatial constraints showed a non-uniform pattern.

### *3.2. Hydraulic Modelling and Calibration Process*

The hydrodynamic modelling for all configurations was performed using the 2D Iber software [47], which is a two-dimensional mathematical unsteady flow model that is used to simulate free surface flow in rivers and estuaries and employs a high-resolution finite volume scheme as a numerical method to solve the depth-averaged 2D shallow water equations, also known as the Saint-Venant equations. Iber requires a geometric description of the channel in the form of a 3D mesh for performing hydraulic computations. Only the hydraulic modelling of peak flows was carried out, as these peak flows were related to the maximum flooded area extension.

The calibration process was carried out in a similar way to a sensitivity analysis of the hydrodynamic model to changes in the Manning's *n* parameter. The extreme values of the Manning's *n* for the Douro riverbed were, on the one hand, a minimum value of 0.001 and, on the other hand, a maximum value of 0.027 (similar to the value used in the "real scenario"). No higher values of the Manning's *n* were considered on the basis that, with the same value of the surface roughness parameter and a decrease in the cross-sectional area of the channel (due to the absence of bathymetric data in the "LiDAR scenario"), both the flooded area and the flow depth values should be clearly higher than those obtained in the "real scenario", which here was considered the benchmark model. To complement and complete the analysis, two other scenarios were considered for comparison with the control model. First, a model with a variable and spatially distributed Manning's *n* value was considered (obtained from the previous models using the Manning's *n* value that showed the best fit for each of the pixels within the river channel, as shown in Figure 5). Second, the HDCM model, as applied by Chone et al. [40], was considered by subtracting the flow value at the date of acquisition of the LiDAR topography from the peak flow values of the 500- and 100-year return periods.

**Figure 5.** Map showing the spatial distribution of the Manning's *n* value best fit option along the Douro River channel in Zamora city, Spain. The Manning's *n* value best fit was found from the relationship between the channel bathymetry differences ("real scenario"–"LiDAR scenario") and the flow depth differences.

As the present approach is only useful for floodplain areas and not for river channels, the data collected for the calibration process were located in the Douro River floodplain. The results of different Manning's *n* values inside the river channel were also compared to the benchmark model to achieve a spatially distributed Manning model, where the Manning's *n* value inside the river channel varied for each model cell; for each cell, the value selected depended on which model (with a constant Manning value within the channel) performed best in the comparison against the control (or benchmark) model.
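
A sketch of how such a per-pixel best-fit map (Figure 5) can be assembled from the constant-*n* model outputs is given below; the file names and the ASCII-grid loader are assumptions.

```python
import numpy as np

def load_depth(path: str) -> np.ndarray:
    """Load a flow depth raster exported from Iber as an ASCII grid,
    skipping the standard 6-line Arc/Info header."""
    return np.loadtxt(path, skiprows=6)

# One depth raster per constant-n model; file names are assumed.
n_values = [0.010, 0.011, 0.012, 0.013, 0.014, 0.015, 0.016]
depth_by_n = np.stack([load_depth(f"depth_n{n:.3f}.asc") for n in n_values])
depth_real = load_depth("depth_real_scenario.asc")  # benchmark ("real scenario")

# For each channel pixel, keep the n whose depth is closest to the benchmark.
abs_err = np.abs(depth_by_n - depth_real[None, :, :])
best_n_map = np.asarray(n_values)[np.argmin(abs_err, axis=0)]
```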

#### *3.3. Comparison of Hydrodynamic Model Outputs*

To evaluate the suitability of the different hydrodynamic models that were generated in the calibration process, the flow depth parameter was selected since it was the most commonly used parameter for flood risk analysis through magnitude–damage models as a flood vulnerability approach. The hydraulic outputs (flow depth) generated by the Iber software were exported into geo-referenced ASCII raster files (\*.asc files) for each of the "LiDAR scenario + Manning's *n* value" models, and further outputs were compared to the benchmark model generated using the "real scenario" topography in the GIS software ESRI® ArcGIS 10.6.1. Spatially distributed flow depth differences between calibrated models and the control model, as well as random sample creation and geostatistical analysis input parameter calculations, were also generated inside the ESRI® ArcGIS environment.

The analysis of different results started with descriptive statistics of the flow depth value differences between the "LiDAR scenario" models (with different Manning's *n* values) and the "real scenario" in the set of random samples (nearly 7000 random points for the 500-year return period peak flow and around 3600 random points for the 100-year return period). The mean, median, mode, standard deviation, variance and Nash–Sutcliffe efficiency index were calculated and compared for analysis purposes. Box plots were used to explore the spatial distribution behavior of different levels of depth errors. First, error location points were classified depending on the distance to the river bank with classes of 100 m width. Then, a boxplot was graphed for each spatial class. Distance bands to the river were computed using the Near ArcMap tool [48]. The Matplotlib library [49] for Python was used to perform a descriptive and graphical statistical analysis.
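
A sketch of this statistical step is given below; the CSV export and its column names are assumptions.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Sketch of the descriptive and distance-banded analysis; "samples.csv" is an
# assumed export of the random points with their flow depth error and the
# distance to the riverbank computed with the Near tool [48].
df = pd.read_csv("samples.csv")  # assumed columns: depth_error (m), dist_to_bank (m)
df["band"] = (df["dist_to_bank"] // 100) * 100  # 100 m wide distance classes

print(df["depth_error"].agg(["mean", "median", "std", "var"]))

df.boxplot(column="depth_error", by="band")
plt.axhline(0.0, color="red")  # zero-error reference line
plt.xlabel("Distance to riverbank (m)")
plt.ylabel("Flow depth error (m)")
plt.show()
```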

The Nash–Sutcliffe efficiency (NSE) index was used as given in Equation (3):

$$\text{NSE} = 1 - \frac{\sum\_{i=1}^{n} \left(\text{D}\_{\text{m}} - \text{D}\_{\text{o}}\right)^{2}}{\sum\_{i=1}^{n} \left(\text{D}\_{\text{o}} - \overline{\text{D}}\_{\text{o}}\right)^{2}} \tag{3}$$

where D<sub>m</sub> is the flow depth value of the modelled "LiDAR scenario", D<sub>o</sub> is the flow depth value in the "real scenario" and D̄<sub>o</sub> is the mean flow depth value for the "real scenario".
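
A direct implementation of Equation (3) over the sampled depth arrays might look as follows:

```python
import numpy as np

def nse(d_model: np.ndarray, d_obs: np.ndarray) -> float:
    """Equation (3): Nash-Sutcliffe efficiency of modelled vs. benchmark depths."""
    return 1.0 - np.sum((d_model - d_obs) ** 2) / np.sum((d_obs - d_obs.mean()) ** 2)
```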

Furthermore, the flood inundation extent that resulted from each calibrated Manning's n model was also compared with the "real scenario" model using the F-statistic, as given in Equation (4):

$$F \text{ statistic} = \frac{A\_{Lr}}{A\_L + A\_r - A\_{Lr}} \times 100\tag{4}$$

where A<sub>L</sub> is the modelled inundation area (the "LiDAR scenario"), A<sub>r</sub> is the observed inundation area (the inundation area of the reference model, i.e., the "real scenario") and A<sub>Lr</sub> is the area that is common to both the "real scenario" and the "LiDAR scenario" inundation maps.

The F-statistic index, which was previously used in [23,29,50–52], allowed us to compare the inundation extents resulting from each hydrodynamic model, where the hydrodynamic models differed either in the bathymetry data (as previously used) or in the Manning's *n* value at the main river reach (as was used here). A value of 100 meant a perfect match between the observed and predicted areas of inundation, and a lower F indicated a discrepancy between the two.
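
A minimal sketch of Equation (4) over two boolean flood extent rasters (assumed to be already derived from the model outputs):

```python
import numpy as np

def f_statistic(flooded_lidar: np.ndarray, flooded_real: np.ndarray) -> float:
    """Equation (4): percent overlap of two boolean flood extent rasters.
    Cell counts stand in for areas, since both rasters share a 1 m resolution."""
    a_lr = np.sum(flooded_lidar & flooded_real)  # area common to both maps
    return 100.0 * a_lr / (np.sum(flooded_lidar) + np.sum(flooded_real) - a_lr)
```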

## **4. Results and Discussion**

## *4.1. "Real Scenario" vs. "LiDAR Scenario" Bathymetric Differences*

The use of the MAE index on a random sample of 3000 points (located in the river channel, independent of the samples obtained in the floodplain) made it possible to estimate the error in the bathymetry data between the "real scenario" and the "LiDAR scenario". The MAE index adopted a value of 1.57 m. This was not surprising if we consider the type of riverbed that the Douro River has in the study area. However, when transferring the MAE index value to the average width of the river's cross-section (225 m), we obtained a reduction in the cross-sectional area of 353.25 m<sup>2</sup>. This reduction in the cross-section could cause a reduction in the flow capacity of the channel of 353.25–529.87 m<sup>3</sup> s<sup>−1</sup> at times when the flow was medium–low and its average speed could be considered to be between 1 and 1.5 m s<sup>−1</sup>. At times of flooding, when the average speed of the water could rise to values of 3 m s<sup>−1</sup> or more, the reduction in the flow capacity of the channel could rise to values of 1059 m<sup>3</sup> s<sup>−1</sup> or even more.
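
These figures follow directly from the MAE, the 225 m mean channel width, and the stated velocity ranges:

$$\Delta A = \mathrm{MAE} \times \overline{w} = 1.57~\mathrm{m} \times 225~\mathrm{m} = 353.25~\mathrm{m}^2$$

$$\Delta Q = \Delta A \times V \approx 353.25~\mathrm{m}^2 \times (1\text{--}1.5)~\mathrm{m~s^{-1}} = 353.25\text{--}529.87~\mathrm{m}^3~\mathrm{s}^{-1}$$

and, at flood velocities of about 3 m s<sup>−1</sup>, ΔQ rises to roughly 1059 m<sup>3</sup> s<sup>−1</sup>.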

The absolute error in bathymetric data between the "real scenario" and the "LiDAR scenario" ranged from 0–4.115 m at the Douro River reach, and it showed a clear trend from the river banks to its centerline (Figure 4A). At the same time, a more irregular pattern was observed in the downstream direction, where the application of a grouping analysis over the spatial distribution of errors in bathymetry did not show clear trends when no spatial constraints were used (compare with spatial constraints, such as K-nearest neighbors or Delaunay triangulation techniques, which show artificial grouping due to the requirements of spatial constraints between points in the same group). When no spatial constraints were used, the grouping of the random sample points could be related to the "riffle and pools" longitudinal profile shape (Figure 4B) of natural rivers.

All previous data pointed out the importance of bathymetric data for the correct hydrodynamic modelling of the flood event flooding area, as well as the uncertainty of flood parameter results when using LiDAR data without a bathymetric data improvement. In this sense, Cook and Merwade [29] previously pointed out this problem when they showed a reduction of between 5 and 20% in flooded areas when bathymetric data were combined with LiDAR data, with this reduction range depending on the channel type and hydrodynamic model (1D or 2D). In the same way, Saleh et al. [53] highlighted that, in the context of determining water levels at a specific location, the difference in bathymetry could be very important. Furthermore, they also noted how this difference may be more evident when we use DEMs or remote sensing techniques to identify river geometry at the regional scale, since it is difficult to obtain an accurate bathymetric river representation. Other assessments [23] pointed towards the same conclusion, but there is no quantification of the dependence between bathymetric errors and flooded area differences.

## *4.2. Manning's n Value Calibration*

The Manning's *n* value calibration process was carried out first for the 500-year return period peak flow, and it was done in two steps. First, focusing on a cost-saving approach that would allow its application in places with very different socio-economic development, the calibration was carried out based on the results obtained in a series of cross-sections of the Douro River. At the location of each cross-section, a hydrodynamic model was generated and a best fit Manning's *n* value model was defined (Figure 1D). A non-homogeneous Manning's *n* value was found across the cross-sections, which points to the hydrodynamic simplification that is assumed when a homogeneous surface roughness coefficient is used for the whole river reach. This point is discussed later.

From the different best fit Manning's *n* values of each cross-section, a mean value of *n* = 0.015 was obtained. However, given the non-homogeneous results obtained in the previous step, a second calibration phase was carried out.

In this second phase, the hydrodynamic model was extended throughout the study area. This favored multiple objectives, such as the rapid comparison of results at the locations of the river cross-sections used in the first calibration phase. It also allowed statistical and geostatistical analyses of the results to be carried out, providing a better understanding of them, so that we could finally select, based on scientific criteria, the hydrodynamic model built on the "LiDAR scenario" topography and the best value (or range of values) of the Manning's *n* coefficient.

As a complement to the above, the results obtained from two other models were analyzed. On the one hand, use was made of a model that presented a variable spatial distribution in the value of the roughness coefficient in the channel (obtained from the roughness values that offered a better adjustment relative to the control model, i.e., the "real scenario", for each point in the channel). On the other hand, a model with a roughness coefficient value of 0.027 was considered (similar to the control model), albeit with a peak flow value reduced (from 2274 to 2114 m<sup>3</sup> s<sup>−1</sup>) according to the measured flow rate at the date of acquisition of the LiDAR data.

The calibration, optimization and validation of the parameters in the models require having observational elements of calibration, such as the existence of a gauging station with its available stage–discharge curve in the case of the hydraulic models [54,55]; the existence of water stage plaques of historical floods with a known peak flow value; or other types of high-water marks, such as paleo-hydrological evidence (slackwater deposits or dendro-geomorphological evidence of floods [56]). In addition, it is not sufficient to have a single element for calibration because different combinations of parameters can give convergent results at one point; rather, multiple points are needed to adjust them, and these are not always available [28]. In fact, as pointed out by Hawker et al. [43], the availability of observational elements of calibration is not possible in many cases.

However, the readjustment of compensation between parameters, which is the strategy followed in this work, made it possible to compensate for the deficiencies in bathymetry by readjusting the roughness in a relatively simple way that was applicable to any section or situation, whether or not there were calibration elements.

In short, the methodological approach adopted, although it can be complemented with other approaches, is an innovative strategy that can substantially improve hydrodynamic models, hazard analyses, risk assessments and, finally, the effectiveness of flood risk mitigation measures. Evidently, this was all done in places where there was no bathymetric information and where such information could not be obtained, and only for the floodplain, not for the river channel.

## *4.3. Flow Depth Models Analysis and Optimal Model Selection*

The hydrodynamic output parameter of flow depth, which was derived from each Manning's *n* value calibration model that was constructed upon the topography of the "LiDAR scenario" model, was compared to the control model, i.e., the so-called "real scenario". First, the global results were analyzed both as flow depth differences and as flood area extensions by using descriptive statistics parameters, such as mean, median, variance or standard deviation, and with the use of the Nash–Sutcliffe efficiency index and the F-statistic index. All these statistical indexes are shown in Table 1.

**Table 1.** Statistical data about Manning's *n* value calibration process, considering both hydrodynamic outputs, namely, flow depth and flooding area.


From the analysis of the simulation points as a function of the flow depth values, it was found that the differences between the "real scenario" and the "LiDAR scenario (Manning's *n* value = 0.011)" had the lowest mean deviation, with a perfect average fit (0.000); the lowest median deviation results were found for the "LiDAR scenario (spatially distributed Manning's *n* value)", which slightly underestimated the flow depth (0.005); the lowest mode value was related to the scenario with Manning's *n* value equal to 0.012; the variance of the deviation had a random behavior.

The Nash–Sutcliffe efficiency index was calculated as one minus the ratio of the error variance of the modelled hydrodynamic output divided by the variance of the observed hydrodynamic output. In our assessment, the output was the flow depth parameter.

The NSE index showed very similar values for almost all of the hydrodynamic models. The best value was associated with the hydrodynamic model with a Manning's *n* value of 0.010, but the differences relative to other models (like models with *n* = 0.015, 0.016 or a spatially distributed *n* value) were not significant. All models showed high values of the NSE; therefore, the error variance of the modelled hydrodynamic outputs was much lower than the actual variance of the "real scenario" hydrodynamic model flow depth parameter. From the results of the NSE index, it was difficult to make the selection of the best Manning's calibrated hydrodynamic model.

Just as the results of the flow depth parameter given by the NSE index were not fully significant to select the best calibrated hydrodynamic model option, the results from the F-statistic index (Table 1) produced the same result. The F-statistic index showed many close values for the hydrodynamic models, with Manning's *n* value ranging from 0.010 to 0.018 (always above a value of 90), with the best value of the F-statistic index linked to the Manning's *n* value of 0.015. The F-statistic improvement from Manning's *n* value of 0.010 to 0.015 was only about 1.5%.

In conclusion, the analysis of the flow depth results from all calibrated models did not give us a strong criterion for selecting the best option, which means that all models using Manning's *n* values ranging from 0.010 to 0.016 may be suitable to reproduce the conditions shown by the control model, i.e., the "real scenario" hydrodynamic model. The results shown by these models were similar in quality to those obtained by the use of the variable and spatially distributed Manning's *n* value model, and they were better than the result shown by the HDCM approach. Based on the results obtained up to this point, a second phase of flow depth value analysis was necessary, and a geostatistical approach was carried out in which the distance to the main river channel was taken into account for the flow depth analysis.

When the results were analyzed as a function of the distance to the riverbanks (Figure 6), less deviation was also observed for the LiDAR scenario models with Manning's *n* values ranging from 0.011 to 0.015 (the model with an *n* value of 0.016 began to show a slight deviation towards flow depth overestimation). From the best-calibrated models, those linked to a Manning's *n* value equal to 0.013 or 0.014 probably showed the lowest dispersion against the zero error line.

**Figure 6.** Scatter plot graphical representation of the relationship between the variable flow depth and distance to the riverbank for different Manning's *n* value calibrated models. The red line shows an error equal to zero and the black line shows a logarithmic fitting to the point values.

If the analysis of flow depth values was carried out in intervals of distance from the riverbanks (Figure 7), the scenarios that best adjusted the intervals of maximum–minimum deviation were also those ranging from Manning's *n* value equal to 0.010 to 0.014. The "LiDAR scenario (Manning's *n* value = 0.010)" showed the best results for shorter distances from the riverbank, with a very good fit for distances from 0 to 150 m. The "LiDAR scenario (Manning's *n* value = 0.012)" gave the best results for distances from 150–300 m off the riverbank. Moreover, the "LiDAR scenario (Manning's *n* value = 0.013)" showed a good fit for all distances up to 300 m. However, for distances over 300 m, the "LiDAR scenario (Manning's *n* value = 0.014)" probably showed the best fit from the range of different Manning's *n* values calibrated. For this model, most of the error in the flow depth value lay within an interval of ±10–12 cm, although this error increased to 15–20 cm when we got close to the riverbank. All these models substantially improved the results obtained in the model in which the natural value of Manning's parameter *n* was maintained.

**Figure 7.** Box plot graphics showing the relationship between the flow depth errors and the distance to the riverbank (within 50 m intervals). Positive errors were related to the underestimation of the flow depth value by the calibrated model, and vice versa.

When we analyzed the results that were associated with the spatially distributed Manning's parameter *n* model and those associated with the HDCM model, we observed that in both cases, the results were better than those associated with the model that preserved the natural value of Manning's *n* parameter. However, in neither case did the results approach the best of the models with a constant and calibrated Manning's *n* value. Thus, the HDCM model was one of the worst performers in this study. The bad performance of the HDCM model may have been due to the large difference between flow rates on the date of the LiDAR data and the 500-year return period peak flow, as well as the likely significant differences in flow velocity in each case, where the higher flow velocities would require less of a channel cross-sectional area (Figure 3). On the other hand, the model with a spatially distributed Manning's *n* value provided a very good fit with the control model ("real scenario") up to about 500 m distance from the channel; however, at further distances, it underestimated the flow depth more than the models with a constant Manning's *n* parameter and values between 0.013 and 0.015. Therefore, if the risk is to be assessed at a short distance because this is where the exposed and vulnerable elements are located (farms, transport infrastructure, etc.), the "LiDAR scenario (Manning's *n* value = 0.011)" or the spatially distributed Manning's *n* value model are of interest, while if risk analysis is to be carried out for elements distant from the riverbed (homes and towns far from the river but within a flood zone), the "LiDAR scenario (Manning's *n* value = 0.012 to 0.015)" can be used. This gives rise to an interesting discussion on the need to use different roughness indices depending on the flow rate and its return period, as some authors have already pointed out (although in the opposite direction to these results [55]). This variation in the parameters and indices to be used in hydrological and hydraulic models depending on the magnitude of the event has already been described extensively in the scientific-technical literature for other parameters, such as initial abstractions (curve number) as a function of precipitation intensity.

The coefficient of water bottom friction was investigated extensively and is known to depend on the particle sizes of materials on the river bed. There have been many studies on friction parameter estimation, especially on a relationship between estimated Manning's coefficients and river bed conditions. These range from the classical tables and lists [57,58], to present-day estimations using fractals and connectivity [59,60] from remote sensing information [61], as well as including visual guides [45] and technical determination procedures [62,63]; all of these methods can be grouped in two kinds of approaches: (i) grain size–roughness relationships for different river bottom patches or polygons and (ii) micro-topographical analyses of bathymetrical data.
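
As an example of the first kind of approach, a classical grain size–roughness relationship is Strickler's, n ≈ d50^(1/6)/21.1 with d50 in metres; the study does not state which relation underlies tables [45,46], so the sketch below is only illustrative.

```python
def strickler_n(d50_m: float) -> float:
    """Strickler-type estimate of Manning's n from the median grain size (in m).
    The 21.1 divisor is one common form; coefficients vary between authors."""
    return d50_m ** (1.0 / 6.0) / 21.1

# e.g., a sandy bed with d50 = 1 mm gives n of roughly 0.015
print(strickler_n(0.001))
```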

The first group is used in technical reports and studies of large river reaches for hydrodynamic modelling and civil engineering; the second group is usual in scientific detailed studies of small river channels for geomorphological and ecological analysis. Both approaches are necessary and complementary because they depend on the scale and objectives of roughness estimation. However, it is very important to quantify predictive uncertainty in the hydrodynamic modelling of shallow water flow in response to uncertainty in friction parameterization [64].

However, in any case, a fully spatially distributed Manning's coefficient based upon the physical characteristics of terrain (mainly the riverbed characteristics) was not achieved due to the complex distribution and the high degree of spatial variability in the physical characteristics of the terrain (grain-size and micro-topography distributions) and vegetation. This objective is already feasible for floodplains [61] by using UAVs, but not for the submerged areas of river channels.

Based on the whole scope of the results of our analysis, the use of a spatially distributed Manning's *n* value became necessary. Manning's *n* value is usually calibrated within a river reach using a uniform value for all reaches, although the use of non-uniform values was pointed out by previous studies [65] that used a different Manning's *n* value for each river reach at the Lower Tapi River (India) to get the best-fit calibrated HEC-RAS model. In this sense, Attari and Hosseini [66] showed a methodological framework for the automatic river segmentation into different river reaches that were fitted with a non-uniform Manning's *n* value. Both approaches were utilized prior to the use of a non-uniform value for the roughness coefficient along a sequential river reach segmentation, but they did not use a real spatially distributed Manning's *n* value. Although a more complete study of spatially distributed values of Manning's n parameter would be necessary and convenient, the approximation to its use that was carried out in the present study did not show significantly better results than those of the other models considered.

In our methodological approach, we used the 500-year return period peak flow to develop the methodological framework, while the 100-year return period peak flow was used as the test model. The statistical results of the test model (Table S1 in Supplementary Materials) showed a slight difference from the 500-year return period peak flow model. A similar dependence on the statistic used was observed relative to the hydraulic model, which offered better results compared to the control model. The geostatistical analysis of results for the test model, considering the distance from the riverbank, showed very similar trends (Figures S1 and S2) to those related to the 500-year return period. Therefore, the scatter plot of Figure S1 shows that the best fit with the control (or benchmark) model was linked to Manning's *n* value in the range of 0.014–0.016. The results of the box plot (Figure S2), which were the same as for 500-year return period models, showed differences in the best fit that was linked to the distance to the riverbank. As discussed above, the best fit for shorter distances was obtained with a lower Manning's *n* value (about 0.011), while for distances equal to or greater than 500 m, the model that offered the best results was possibly the one with a Manning's *n* value of 0.016. As for the 500-year return period, the test model linked to the HDCM approach showed an overestimation of the flow depth values for all distances to the riverbank. For all hydraulic models related to the 100-year return period, an increase in uncertainty could be observed at further distances from the riverbank, which was due to the smaller flooding area at the floodplain (and the consequent lower number of sample points at further distances) than the one related to the 500-year return period models.

In general, we observed that the best-fitting Manning's *n* value increased slightly from the methodological development model (500-year return period peak flow) to the methodological test model (100-year return period peak flow). Taking this into account, an optimal range of Manning's *n* values from 0.014 to 0.016 could be defined. Within this range, the flow depth errors in the river floodplain were drastically reduced relative to the model without bathymetry data and a "natural" (0.027 at the study site) Manning's *n* value.

The results from the two peak flows considered in the present assessment point towards the validation of our methodological approach, and the usefulness of calibrated and reduced values of the Manning's *n* coefficient where the topography of the riverbed is not available and its acquisition lies outside the economic budget of flood risk managers.

## *4.4. Local Results at Cultural Heritage Sites in Zamora (Spain)*

Beyond the results of the overall study area (with or without riverbank distance dependence), some control points (Figure 8) that are linked to different housing types in Zamora city were used for the result quality analysis of the Manning's-*n*-value-calibrated models. Four control points were selected, representing different types of buildings in the vicinity of the city of Zamora. Checkpoints 1 and 3 represented buildings in an urban environment with a high building density, checkpoint 2 corresponded to a cultural heritage site (chapel) and, finally, checkpoint 4 corresponded to a house that was isolated among agricultural fields.

In all cases, the absence of bathymetric data in the peak flow hydraulic modelling implied a flow depth overestimation ranging from 50 to 75 cm where no artificial modification of the Manning's *n* value was carried out (i.e., the model with *n* = 0.027). These flow depth errors could be drastically reduced through the Manning's *n* value calibration process, as shown in Figure 8. However, as was also observed previously, the results obtained at the four control points did not show complete homogeneity. Again, the range of values between 0.010 and 0.014 seemed to show the best results overall. However, depending on the spatial location of the control point, it was observed that values lower than 0.010 could also give optimal results, but values higher than 0.014 did not. The selection of only four locations may not be representative for the whole study area, and the previously exposed results (where the best fit was obtained for the Manning's *n* value ranging between 0.014 and 0.016) carry greater confidence, but they were illustrative of the overestimation of flow depth in the absence of bathymetric data without some kind of calibration process.

**Figure 8.** Local Manning's *n* value calibration results at control points in Zamora city. The upper and lower graphs show flow depth values for each calibrated model, where the red line shows the flow depth value for the control model, i.e., the "real scenario".

By transferring these differences in flow depth to the analysis of direct economic damage caused by floods, they can lead to significant differences in damage estimates. Thus, based on a widely used magnitude–damage model [15], we found that a difference in the flow depth of 50–75 cm could lead to a variation in the direct damage estimates of 25–30%. From the same model, we could estimate that, if we associated an average error of 10 cm with the calibrated model with *n* = 0.014, the error in the damage estimate would be around 6%. These percentages of direct economic damage can vary depending on the non-linearity of the distribution of flood damage associated with the flow depth value, as well as the damage model used. Wagenaar et al. [67] pointed out that the resulting uncertainties in estimated damage (due to different models) are in the order of magnitude of a factor of 2 to 5.

Furthermore, from the results obtained, it can be seen that the use of a reduced or lower (relative to that which is naturally associated with the characteristics of the riverbed) Manning's *n* value between 0.011 and 0.016 could lead to an error in the estimation of the flow depth of no more than 25 cm, which, transferred to the estimation of direct damage, meant an approximate damage value error of 12%. Therefore, the use of an artificially lower Manning's *n* value could reduce the error in estimating flood damage by half. This can thus be an interesting starting point for the improvement of flood damage estimates in areas without available bathymetric data (even taking into account the fact that obtaining bathymetric data will always be the best option to achieve the best results). Furthermore, it could serve to carry out hazard analyses and, therefore, more personalized risk analyses depending on the elements at risk to be analyzed and their distances from the riverbanks.
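
A sketch of how such depth errors propagate through a depth–damage curve is given below; the curve points are illustrative, not the model of ref. [15], so the percentages do not reproduce the exact figures quoted above.

```python
import numpy as np

# Sketch: propagate flow depth errors through a depth-damage curve. The curve
# points are illustrative only, not the magnitude-damage model of ref. [15].
depths = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0])       # flow depth (m)
damage = np.array([0.0, 0.20, 0.40, 0.55, 0.65, 0.80])  # damage fraction (assumed)

def damage_fraction(depth_m: float) -> float:
    """Linearly interpolated relative damage for a given flow depth."""
    return float(np.interp(depth_m, depths, damage))

true_depth = 1.0
for err in (0.10, 0.50, 0.75):  # depth errors discussed in the text
    delta = damage_fraction(true_depth + err) - damage_fraction(true_depth)
    print(f"depth error {err:.2f} m -> damage estimate error {delta:+.0%}")
```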

However, from a practical point of view, this would introduce complexity into the systematic production of hazard and risk mapping, such as those of FEMA [68] in the USA, SNCZI [69] in Spain, or the Flood Factor [70], also in the USA. At the same time, it would give dynamism and ease of updating to large-scale local studies, which are optimal for urban areas or vulnerable infrastructures (large dams, nuclear power stations, industrial complexes, etc.).

## **5. Conclusions**

The present manuscript shows a new approach for improving flood hazard maps where bathymetric data are not available (or are scarce, such as a few cross-sections for the whole river reach). The proposed solution to this unavailability of bathymetric data is valid for river floodplains but not inside the main river channel. Unlike the approaches based on the generation of simplified bathymetric shapes, or on hydraulic corrections (the HDCM model), the proposal of this manuscript was based on the calibration of the Manning's *n* value (the surface roughness index). This calibration, which points toward the use of abnormally low values for a natural channel, sought to compensate for the reduction in the channel cross-sectional area with an increase in the flow velocity in the channel itself. The main conclusions that can be derived from this study are as follows:


- A reduction in the error in the percentage of damage from values of 25–30% to errors close to 5%.


**Supplementary Materials:** The following are available online at https://www.mdpi.com/article/ 10.3390/app11199267/s1, Figure S1: Scatter plot of the 100-year return period residual values for calibrated models, Figure S2: Box plot of the 100-year return period residual values for calibrated models, Table S1: Statistical results for the 100-year return period hydraulic models.

**Author Contributions:** All authors have made a significant contribution to the final version of the paper. Conceptualization, J.G.; Hydrodynamic modelling, J.G. and M.G.-J.; Statistical analysis, J.G., M.G.-J. and C.G.-A.; Writing—original draft, J.G. and A.D.-H.; Writing—review and editing, all authors. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the project DRAINAGE, CGL2017–83546–C3–R (MINECO/AEI/FEDER, UE); it is specifically part of the assessments carried out under the subproject DRAINAGE–3–R (CGL2017–83546–C3–3–R). The participation of M.G.-J. is supported by the National System of Youth Guarantee (activity with reference PEJ2018-002477), co-financed under the Youth Employment Operational Program with financial resources from the Youth Employment Initiative (YEI) and the European Social Fund (ESF).

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author. The data are not publicly available due to their file size, which is greater than 4 Gb.

**Acknowledgments:** The authors would like to thank to Gerardo Benito (MNCN, CSIC) and the Duero River Water Authority (CHD) who provided us with the bathymetric data from Zamora city and its surroundings.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **Abbreviations**


## **References**


## *Article* **Detecting Areas Vulnerable to Flooding Using Hydrological-Topographic Factors and Logistic Regression**

**Jae-Yeong Lee \* and Ji-Sung Kim**

Korea Institute of Civil Engineering and Building Technology, 283 Goyangdae-ro, Ilsanseo-gu, Goyang 10223, Korea; jisungk@kict.re.kr

**\*** Correspondence: jaeyeonglee@kict.re.kr

**Abstract:** As a result of rapid urbanization and population movement, flooding in urban areas has become one of the most common types of natural disaster, causing huge losses of both life and property. To mitigate and prevent the damage caused by the recent increase in floods, a number of measures are required, such as installing flood prevention facilities, or specially managing areas vulnerable to flooding. In this study, we presented a technique for determining areas susceptible to flooding using hydrological-topographic characteristics for the purpose of managing flood vulnerable areas. To begin, we collected digital topographic maps and stormwater drainage system data regarding the study area. Using the collected data, surface, locational, and resistant factors were analyzed. In addition, the maximum 1-h rainfall data were collected as an inducing factor and assigned to all grids through spatial interpolation. Next, a logistic regression analysis was performed by inputting hydrological-topographic factors and historical inundation trace maps for each grid as independent and dependent variables, respectively, through which a model for calculating the flood vulnerability of the study area was established. The performance of the model was evaluated by analyzing the receiver operating characteristics (ROC) curve of flood vulnerability and inundation trace maps, and it was found to be improved when the rainfall that changes according to flood events was also considered. The method presented in this study can be used not only to reasonably and efficiently select target sites for flood prevention facilities, but also to pre-detect areas vulnerable to flooding by using real-time rainfall forecasting.

**Keywords:** flood vulnerability; spatial analysis; logistic regression; ROC analysis; flood detection

## **1. Introduction**

Floods can have several causes, and result mainly from hydro-meteorological causes such as typhoons and localized torrential downpours. Recently, changes in atmospheric flow caused by global warming and climate change have brought about major meteorological problems. In particular, in Northeast Asian regions such as Korea, China, and Japan, atmospheric flow stagnated due to the abnormally high temperatures in the polar regions in the summer of 2020. This led to the longest rainy season ever, causing huge losses.

Other causes of flooding include a decrease in the rainwater storage effect of forests due to reckless development, and an increase in impervious areas due to urbanization. In Seoul, Korea, as the Gangnam region began to be developed in earnest after the 1970s, the low-lying areas were newly developed for residential purposes and lost their rainwater storage function [1]. In such densely populated urban areas, the occurrence of flooding will increase further because low-lying areas will be developed to address the shortage of housing supply relative to demand.

To reduce the loss caused by frequent floods in recent years, central and local governments have established measures to prevent such flood damage. However, budget limitations mean that not all areas vulnerable to flooding can be refurbished with flood prevention facilities. For this reason, it is important to prioritize relatively more vulnerable areas, and in some cases, information on vulnerable areas should be provided to residents. Providing such information not only improves the ability of residents to cope with floods through education and training, but also has the effect of restraining the development of relevant areas.

The advantage of using physically based models is their high capability for prognosis and forecasting, while their disadvantage is the high input data demand [2]. For this reason, techniques for identifying flood vulnerable areas using topographic factors have been suggested in various ways by previous studies. The determination of flood vulnerable areas is one of the representative non-structural measures in flood defense, and should be performed reasonably through hydrological and topographic analysis of rainfall-runoff. As such, techniques for determining flood vulnerable areas have been studied by researchers in a number of fields including hydrology, topography, and soil science. Dehortin et al. [3] laid the foundation for calibrating or evaluating surface runoff susceptibility mappings through on-site monitoring that measures surface runoff. Lagadec et al. [4] presented the indicateur du ruissellement intense pluvial (IRIP) technique that maps the characteristics of surfaces that are susceptible to generation, transferal, and accumulation of surface rainfall-runoff. Lee et al. [5] compared the detection rates of flood vulnerability based on topographic factors using general data such as advanced spaceborne thermal emission and reflection radiometer (ASTER) and shuttle radar topography mission (SRTM) data. Lee and Kim [6] analyzed the correlation between topographic factors considering rainfall-runoff characteristics, as well as actual inundation trace data.

Flood vulnerability has been estimated using the physical characteristics of the surfaces on which rainfall-runoff are likely to accumulate, such as lowlands and gentle slopes; more recently, studies have been performed that attempt to use machine learning to calculate flood vulnerability. Logistic regression, a field of machine learning, can suggest vulnerability in the study area in a probabilistic manner through binary classification of past data (yes or no) by connecting topographic factors and natural disasters such as floods and landslides [7–9]. In addition, studies on estimating flood vulnerability using other machine learning techniques are also being conducted by many researchers. Among those, studies using random forests [10,11] and principal component analysis (PCA) [12,13] have been actively conducted. In addition to studies that applied a single technique, studies which compare or connect several techniques have also been conducted. Pradhan and Lee [14] compared and proposed methods of detecting landslide-prone areas with logistic regression and artificial neural network (ANN). Lee et al. [15] compared flood vulnerability estimated using random forests and boosted trees with topographic factors as input data. Li et al. [16] used logistic regression, Naive Bayes, AdaBoost, and random forests to estimate flood vulnerability around the world, and compared the detection capabilities for each model. To reduce the dimensions of various topographic factors, studies on applying logistic regression after PCA [17–19] have also been conducted.

KICT [20] stated that it was necessary to establish special measures for areas prone to flooding and strengthen flood forecast warning systems in order to respond to floods. Shin and Park [21] mentioned that the floods that occurred in Seoul in 2010 and 2011 had a high spatial correlation, and that they occurred repeatedly in the same areas. In particular, it was analyzed that one-third of the areas which flooded in 2011 had previously suffered from floods [21]. On this basis, this study confirmed that flood vulnerable areas in Seoul, the study area, should be determined through an analysis of the topographical causes in areas where floods frequently occur, and should be intensively managed.

A variety of approaches have been conducted to identify flood vulnerable areas, and the most representative of them is the method using numerical models [22–24]. This method designates an expected flooding area by calculating the hydraulic-hydrological characteristics of rainfall-runoff for a hypothetical precipitation scenario with a numerical model. Although numerical models have shown great capabilities for predicting a diverse range of flooding scenarios, they often require various types of hydro-geomorphological monitoring datasets and intensive computation, which prohibits short-term prediction [25]. Previous studies have therefore suggested data-based techniques for determining flood vulnerable areas using hydrological-topographic factors due to the efficiency of data collection and analysis. However, these methods only calculate the flood vulnerability at the planning level, and do not detect floods for various actual events. To supplement this, in this study, a logistic regression model estimating flood vulnerability that changes according to rainfall was developed, and the detection performance was evaluated with a new event.



In spatial data-based flood vulnerability analysis, it is important to select and collect input data that can affect floods. The input data were selected by referring to the topographical factors mainly used in the previous studies [7–19] introduced above (slope, elevation, topographic wetness index, curvature, stream power index, and distance from river, in order of most use). Meanwhile, in Korea, hydrological-topographic data can be easily obtained through the websites [26–29] of government agencies. These data can be regarded as reliable because they are produced with strict quality control.

The purpose of this study is to develop a technique for determining flood vulnerable areas in order to reduce the damage caused by flooding. As shown in Figure 1, this technique can calculate flood vulnerability by estimating logistic regression coefficients taking into account the hydrological-topographic factors in the study area. This methodology can map flood vulnerable areas suitable for each flood event by changing the values according to the rainfall situation. With this, if real-time rainfall forecasting is used, flooding can be predicted.

**Figure 1.** Flow chart of this study.

## **2. Study Area and Materials**

### *2.1. Seoul Metropolitan City*

Seoul metropolitan city (SMC), the capital city of South Korea, has seen continued population growth with the progress of industrialization and urbanization since the 1960s. As a result, this city is not only a densely populated region with more than 10 million people, which is 20% of the total population of the country, on an area of 605 km<sup>2</sup>, but also shows a concentration of capital in highly dense office regions such as Gwanghwamun and Gangnam. In this environment, severe flooding occurred in 2010 and 2011, causing great damage to life and property in Seoul. The flood that occurred on 21 September 2010 flooded 17,905 households and injured one person. The flood of 27 July 2011 inundated 14,809 households, causing 19 deaths and 41 injuries [21]. With flood damage occurring every year since then, the city of Seoul has been striving to prevent such damage by increasing the design frequency of drainage pipes and installing pump stations.

In this study, inundation trace maps generated in 2001 [26] were used to develop a logistic regression model to calculate flood vulnerability for each grid. The inundation trace maps for 2010 and 2011 were used to evaluate the performance of the developed regression model. Figure 2 shows the extent of the study area and the traces of flooding in 2001.


**Figure 2.** Inundation traces that occurred in 2001 in Seoul, the study area.

### *2.2. Hydrological-Topographic Factors*

Hydrological-topographic factors were classified into three topographic factors (surface, locational, and resistant) and one hydrological factor (inducing factor). Elevation, slope, profile curvature, plan curvature, topographic wetness index (TWI), and stream power index (SPI) were considered for surface factors, which are the characteristics of runoff moving on the surface by gravity. For locational factors, distance from river and manhole were considered to indicate the range affected by catchment runoff due to natural factors (river) and artificial factors (manhole). As a resistant factor, pump capacity per drainage area was analyzed to consider the effect of drainage pumps installed to protect against urban flooding. The maximum 1-h rainfall was used as an inducing factor, which is an external factor that can directly affect the occurrence of floods.

### 2.2.1. Surface Factors

The characteristics of surfaces that are vulnerable to flooding are typically lowlands, gentle slopes, and concave terrains, and can be estimated through spatial analysis. In this study, a digital topographic map drawn to a scale of 1:5000 (2018) provided by the NGII [27] was used to calculate the topographic factors of the study area. The digital topographic map was converted into a 30 × 30 m digital elevation model (DEM) through spatial analysis because the contour lines and elevation points were composed in a vector form. Raster calculations were performed with this DEM (elevation) to calculate five surface factors including slope, profile curvature, plan curvature, topographic wetness index, and stream power index (Figure 3).


**Figure 3.** Topographic factors: (**a**) Elevation; (**b**) Slope; (**c**) Profile curvature; (**d**) Plan curvature; (**e**) Topographic wetness index (TWI); (**f**) Stream power index (SPI).

Elevation is the most representative factor explaining the characteristics of a surface that is prone to flooding; more lowlands means the area is more vulnerable to flooding. Since the flow velocity is slow in areas with gentle slopes, the runoff from rainfall accumulates and causes a flood. Curvature can be calculated as the second derivative of the surface, and can be classified into profile curvature and plan curvature, respectively, depending on whether it is calculated in a direction parallel to or perpendicular to the slope. Profile curvature is the curvature in the downward direction of the slope, and flooding is likely to occur in a concave terrain (positive). Plan curvature is the curvature in the horizontal direction of the slope, and runoff is likely to accumulate in a valley (negative). The topographic wetness index (TWI) was derived from the study of Beven and Kirkby [30] and can be calculated as shown in Equation (1). The TWI means that the gentler the slope (*θ*) of the target grid and the larger the basin area (*a*) of the upstream region, the higher the potential wetness index of the region. The stream power index (SPI), which was proposed by Moore et al. [31], represents the degree of sediment movement and erosion from surface runoff, and is calculated as shown in Equation (2).

$$TWI = \ln(a/\tan\theta) \tag{1}$$

$$SPI = \ln(a \times \tan \theta) \tag{2}$$
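For readers who wish to reproduce these indices, the raster calculation below is a minimal sketch of Equations (1) and (2) in Python using NumPy. The upstream contributing area (the hypothetical `flow_acc` array) is assumed to be precomputed, as a GIS package would normally supply it; this is an illustration rather than the authors' implementation.

```python
import numpy as np

def twi_spi(dem, flow_acc, cell_size=30.0):
    """Compute TWI and SPI rasters from a DEM (Equations (1) and (2)).

    dem      : 2-D array of elevations (m), e.g., the 30 x 30 m DEM used here
    flow_acc : 2-D array of upstream contributing cells (assumed precomputed,
               e.g., by a D8 flow-accumulation routine in a GIS package)
    """
    # Slope (radians) from the DEM gradient; the spacing is the cell size.
    dzdy, dzdx = np.gradient(dem, cell_size)
    slope = np.arctan(np.hypot(dzdx, dzdy))

    # Specific catchment area a, per unit contour width.
    a = (flow_acc + 1) * cell_size

    # Small offsets avoid division by zero / log of zero on flat cells.
    tan_s = np.tan(slope) + 1e-6
    twi = np.log(a / tan_s)          # Equation (1)
    spi = np.log(a * tan_s + 1e-6)   # Equation (2)
    return twi, spi
```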

### 2.2.2. Locational Factors

Runoff from rainfall that reaches the ground flows from high to low along the slope by gravity. In natural basins, rainfall runoff gathers to form a river, while in urban areas such runoff is concentrated to manholes through a drainage pipe network. Therefore, areas near rivers or manholes are likely to be vulnerable to flooding when localized torrential downpours exceeding the capacity occur. To calculate the distance from the river and manhole, the locations of the rivers and manholes were identified using a digital topographic map and the distance was computed for each grid (Figure 4a,b).

**Figure 4.** Locational and resistant factors: (**a**) Distance from river; (**b**) Distance from manhole; (**c**) Pump capacity per drainage area.
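As an illustration of how such locational factors can be derived, the sketch below computes each grid cell's distance to the nearest manhole with a k-d tree; the coordinates are hypothetical placeholders, and the same call would be repeated for river cells.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical coordinates (m): grid-cell centers and manhole locations,
# both in the same projected coordinate system.
grid_xy = np.array([[15.0, 15.0], [45.0, 15.0], [15.0, 45.0]])
manhole_xy = np.array([[10.0, 10.0], [50.0, 40.0]])

# Nearest-neighbor distance from every grid center to a manhole.
tree = cKDTree(manhole_xy)
dist_manhole, _ = tree.query(grid_xy)  # one distance value per grid cell
print(dist_manhole)
```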


### 2.2.3. Resistant Factor

In urban areas, drainage pumping stations, which are representative facilities to reduce flood damage in lowlands during localized torrential rain, are installed [32]. In this study, statistical data from the Ministry of Environment (ME) [33] were collected to investigate the location and specifications of drainage pumping stations in the study area. On the other hand, since the specific time of the establishment of drainage pumping stations and that of any increase in capacity could not be confirmed, the year-end statistical data of the year before each flood event, which were applied to the development and verification of this model, were used. It was found that 91 pumping stations were operated in Seoul in 2000, and the total pumping capacity was 118,196 m<sup>3</sup>/min. In addition, there were 239 drainage sections in Seoul, and each drainage pumping station was designed to fit the area of the drainage section where the facility was located. Therefore, to reflect this, pumping capacity (*C*, m<sup>3</sup>/min) was divided by the area (*A*, m<sup>2</sup>) of the drainage section to calculate pumping capacity per drainage area (*P*), as shown in Equation (3) (Figure 4c).

$$P = C/A \tag{3}$$

### 2.2.4. Inducing Factor

Recently, the frequency of localized torrential rains has been increasing due to climate change [34]. In Seoul, which has been affected by this, the number of occurrences of more than 30 mm/h of rainfall increased by 2.3 times throughout the year compared to before 1990, and that of more than 50 mm/h of rainfall increased by 5.3 times [35]. In addition, Son et al. [35] reported that rainfall of 75.0 mm/h and 15.5 mm/h was observed at the Seodaemun (412) and Dobong (406) observatories in Seoul, respectively, at 14:00 on 21 September 2010, showing an approximately 5-fold difference between the two observatories. As such, in terms of the temporal distribution of rainfall, the occurrence frequency of concentrated torrential rains (30 mm/h or more) is increasing, and the spatial distribution also shows a large deviation due to localized heavy rains. Therefore, it was confirmed that topographic and hydrological factors should be connected when estimating flood vulnerability in this study.

Inundation damage in Seoul resulted mainly from inland floods, which occurred in urban lowlands or were caused by rainfall that could overwhelm the drainage infrastructure, rather than fluvial floods [21]. In Korea, when designing drainage pipes to protect against flood, the rainfall duration and the return period generally considered are 1 h and 10–30 years, respectively [36]. Therefore, in this study, maximum 1-h rainfall was used as an inducing factor that causes urban flooding.
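The maximum 1-h rainfall for an event can be extracted from a gauge record with a simple rolling sum. The sketch below is a minimal illustration assuming hypothetical 10-min rainfall depths, so six consecutive readings span one hour.

```python
import pandas as pd

# Hypothetical 10-min rainfall depths (mm) at one observatory.
rain_10min = pd.Series(
    [0.0, 2.5, 8.0, 15.5, 20.0, 12.0, 6.5, 1.0],
    index=pd.date_range("2001-07-15 13:00", periods=8, freq="10min"),
)

# Rolling 1-h totals (six 10-min steps); their maximum is the
# "maximum 1-h rainfall" inducing factor used for each event.
max_1h = rain_10min.rolling(window=6).sum().max()
print(f"maximum 1-h rainfall: {max_1h:.1f} mm")
```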

Korea Meteorological Administration (KMA) [29] provides various types of observation data, such as automated synoptic observing system (ASOS) and automated weather system (AWS), as shown in Table 1. ASOS is installed in the location of the former KMA to perform observation tasks such as observing weather phenomena and data sharing through international cooperation, and AWS is installed in places where observation by a human operator is difficult, such as on mountains or islands, to monitor localized severe weather phenomena in real time [37]. There were a total of 32 rainfall observatories located in and near Seoul, as shown in Table 1, but to secure the reliability of the data required to develop a regression model, it was necessary to select data in consideration of missing observations, and the opening/closure of such observatories. The data from the Gangseo (404) and Gwangjin (413) observatories were excluded because missing data were found at the time of the occurrence of maximum 1-h rainfall observed at a nearby observatory in 2001. Those from Bukaksan (422), Guro (423), Gangbuk (424), and Namhyeon (425) observatories were excluded because they opened after 2001. The selected data were interpolated using the inverse distance weighting (IDW) method to assign the rainfall at the point where the observatories were located to all the relevant grids (Figure 5).
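Inverse distance weighting itself is straightforward to reproduce; the sketch below interpolates hypothetical observatory rainfall onto grid points with the common power-2 weighting, which is one reasonable reading of the IDW step described here rather than the authors' exact configuration.

```python
import numpy as np

def idw(obs_xy, obs_val, grid_xy, power=2.0, eps=1e-12):
    """Inverse distance weighting: weights are 1 / distance**power."""
    # Pairwise distances between grid points and observatories.
    d = np.linalg.norm(grid_xy[:, None, :] - obs_xy[None, :, :], axis=2)
    w = 1.0 / (d**power + eps)       # eps guards a grid point on a gauge
    return (w * obs_val).sum(axis=1) / w.sum(axis=1)

# Hypothetical example: three gauges, two grid cells (coordinates in m).
obs_xy = np.array([[0.0, 0.0], [1000.0, 0.0], [0.0, 1000.0]])
obs_val = np.array([60.0, 100.0, 75.0])          # max 1-h rainfall (mm)
grid_xy = np.array([[200.0, 200.0], [800.0, 100.0]])
print(idw(obs_xy, obs_val, grid_xy))
```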

On the other hand, inundation trace maps provided information on flooded areas, but did not provide information on the date and time of flooding. If information on the time of flooding is not available, such data cannot be linked to rainfall data. Therefore, in this study, the maximum 1-h rainfall occurring in July 2001 was used as an independent variable for logistic regression. After that, to evaluate the performance of the regression equation, flood vulnerability was estimated using the maximum 1-h rainfall in September 2010 and July 2011, and compared with the inundation trace maps.


**Table 1.** Observatories in Seoul and maximum 1-h rainfall.


**Figure 5.** Rainfall interpolation result (2001).

## **3. Methodology**

### *3.1. Multi-Collinearity Test*

Multi-collinearity problems can occur when there is a correlation between two or more variables in a regression model. This problem can cause the calculations to be false, and the logistic parameters to be incorrect or inexact [38]. Among the surface factors used in this study, five independent variables (slope, profile curvature, plan curvature, TWI, and SPI) were calculated from elevation. Applying variables derived from a single raster dataset to a regression model may cause multi-collinearity problems [17]. Therefore, the determination of multi-collinearity is an important step in detecting flood vulnerability using a logistic regression model. The variance inflation factor (*VIF*), one of the indicators used to determine multi-collinearity, can be calculated using the coefficient of determination (*R*<sup>2</sup>) as in Equation (4).

$$VIF = \frac{1}{1 - R^2} \tag{4}$$

Lin [39] stated that variables can be judged to have multi-collinearity when the VIF is greater than 10. Table 2 shows that there is no multi-collinearity problem, as the VIF values for the six independent variables of the surface factors ranged from 1.099 to 2.679. Therefore, it was confirmed that the six surface factors can be used as independent variables in logistic regression analysis to calculate flood vulnerability.

**Table 2.** Variance inflation factors (VIF) of the surface factors.
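Equation (4) can be evaluated by regressing each factor on the remaining ones; the least-squares sketch below is a minimal illustration with a hypothetical factor matrix `X` (rows = grids, columns = surface factors), not the statistical package the authors used.

```python
import numpy as np

def vif(X):
    """VIF per column of X: regress each column on the others (Equation (4))."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        y = X[:, j]
        # Design matrix: the other columns plus an intercept.
        A = np.column_stack([np.delete(X, j, axis=1), np.ones(n)])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out[j] = 1.0 / (1.0 - r2)
    return out

# Hypothetical standardized factor matrix (e.g., slope, curvatures, TWI, SPI).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
print(vif(X))  # values near 1 indicate little collinearity; > 10 is a red flag
```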


### *3.2. Logistic Regression*

Logistic regression is a probability model proposed by Cox [40], which is used for classification and prediction by expressing the relationship between dependent variables and independent variables as a regression equation. It was mainly proposed to classify events in which the dependent variable follows a binomial distribution, such as the relationship between test scores and whether they pass the exam, or patient health status and whether they have a disease.

Odds ratio (*OR*) was introduced to utilize logistic regression for binary classification. OR represents the ratio of the probability, *p*, that an event will occur, and the probability, 1 − *p*, that it will not occur, and it is calculated as follows.

$$OR = \frac{p}{1-p} \tag{5}$$

In addition, the problem with binary classification is that a linear regression analysis cannot be performed, because the dependent variable is represented as "0" or "1", and thus its range differs from that of the independent variables, which have continuous distributions. Accordingly, the dependent variable is mapped from the range [0, 1] to (−∞, ∞) through the logit transformation, which applies the logarithm to the OR. This can be expressed using the following equation.

$$\text{Logit}(OR) = \log\left(\frac{p}{1-p}\right) = Y = \beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n \tag{6}$$

In this study, for the calculation of the regression coefficients (*β<sub>n</sub>*) of Equation (6), the occurrence of flooding events (*Y*) for all grids in the study area and 10 hydrological-topographic factors (*x*<sub>1</sub>–*x*<sub>10</sub>) were used. In addition, the maximum likelihood estimator is used to determine the regression coefficients, including the constant term [41].

Next, a logistic function is used to calculate the flooding probability for target grids using the calculated regression coefficients. The logistic function can be calculated as follows by using the inverse function relation in Equation (6).

$$e^{\beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n} = \frac{p}{1 - p} \tag{7}$$

$$(1 - p)e^{\beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n} = p \tag{8}$$

$$e^{\beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n} = p\left(1 + e^{\beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n}\right) \tag{9}$$

$$p = \frac{1}{1 + e^{-(\beta\_0 + \beta\_1 \mathbf{x}\_1 + \dots + \beta\_n \mathbf{x}\_n)}}\tag{10}$$

The probability of flooding *p* can be obtained by inputting the hydrological-topographic factor for a target grid to Equation (10). This flooding probability (*p*) corresponds to flood vulnerability in this study. The flood vulnerability estimated through the logistic regression has the range [0, 1], and the closer to 1, the higher the probability of flood occurrence.
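The whole procedure of this section, fitting the coefficients by maximum likelihood and then applying Equation (10), maps directly onto scikit-learn. The sketch below is a minimal illustration with hypothetical arrays, where `X` holds the 10 factors per grid and `y` the 0/1 inundation labels; it is not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical inputs: one row per grid cell, one column per factor.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 10))            # 10 hydrological-topographic factors
y = (rng.random(5000) < 0.07).astype(int)  # 0/1 inundation-trace labels

# Maximum-likelihood fit of the coefficients in Equation (6).
model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.intercept_, model.coef_)       # beta_0 and beta_1..beta_n

# Equation (10): flood vulnerability p for each grid.
p = model.predict_proba(X)[:, 1]
# Equivalent manual sigmoid, for comparison:
z = model.intercept_ + X @ model.coef_.ravel()
p_manual = 1.0 / (1.0 + np.exp(-z))
assert np.allclose(p, p_manual)
```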

### *3.3. 2 × 2 Confusion Matrix and ROC Analysis*

In this study, a receiver operating characteristics (ROC) analysis was conducted using a 2 × 2 confusion matrix to check the extent to which the areas with high flood vulnerability calculated using the logistic regression model were consistent with the inundation trace maps. The 2 × 2 confusion matrix and ROC analysis have been mainly used in the medical field, including the performance evaluation of reagents that discriminate negative from positive patients in the diagnostic test of COVID-19, which has been spreading around the world in recent years. This technique has recently been extended and applied to the fields of machine learning and object recognition to evaluate the classification accuracy of artificial intelligence [42,43]. ROC analysis allows us to determine whether a test method is useful by showing a curve for the degree to which an event is detected for each test method [44,45]. To draw this curve, four components of a 2 × 2 confusion matrix should be used.

As shown in Table 3, the 2 × 2 confusion matrix can be composed of 4 different combinations depending on whether the flood vulnerable area and inundation traces on the map coincide. If the flood vulnerable area and inundation traces coincide, it can be expressed as true positives (*TP*) or true negatives (*TN*); otherwise, it is expressed as false positives (*FP*) or false negatives (*FN*). For the plot of the ROC curve, the *x*-axis is calculated as 1 − specificity, where specificity is the ratio of accurately predicted areas (*TN*) among the areas where no actual flooding occurred (*FP* + *TN*). The *y*-axis of the graph shows sensitivity, which is the ratio of the areas selected as flood vulnerable areas (*TP*) among the flooded areas (*TP* + *FN*). When expressed as equations, specificity and sensitivity can be written as Equations (11) and (12), respectively, and the range of values is [0, 1] [45].

$$Specificity = \frac{TN}{FP + TN} \tag{11}$$

$$Sensitivity = \frac{TP}{TP + FN} \tag{12}$$
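With predicted vulnerabilities and observed traces in hand, the confusion-matrix quantities and the ROC curve follow directly; the sketch below uses scikit-learn on hypothetical labels `y` and predictions `p`, as an illustration of Equations (11) and (12).

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_curve, auc

# Hypothetical observed traces (0/1) and predicted vulnerabilities per grid.
rng = np.random.default_rng(3)
y = (rng.random(5000) < 0.07).astype(int)
p = np.clip(0.1 * y + rng.random(5000) * 0.5, 0.0, 1.0)  # weakly informative

# A single cutoff (0.5 here) yields one 2 x 2 confusion matrix ...
tn, fp, fn, tp = confusion_matrix(y, (p >= 0.5).astype(int)).ravel()
sensitivity = tp / (tp + fn)   # Equation (12)
specificity = tn / (fp + tn)   # Equation (11)

# ... while sweeping the cutoff traces out the ROC curve and its AUC.
fpr, tpr, _ = roc_curve(y, p)  # fpr = 1 - specificity
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}, "
      f"AUC={auc(fpr, tpr):.3f}")
```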

**Table 3.** Overview of 2 × 2 confusion matrix.

|  | Flooded (Observed) | Not Flooded (Observed) |
| --- | --- | --- |
| Vulnerable (predicted) | True positive (*TP*) | False positive (*FP*) |
| Not vulnerable (predicted) | False negative (*FN*) | True negative (*TN*) |


In ROC analysis, the performance of a test method can be evaluated by calculating the area under the curve (AUC). It can be evaluated that the closer the AUC is to 1, the better the detection performance is, while the closer the AUC is to 0, the worse the detection performance is. According to Ying et al. [46] and Simundic [47], the AUC can be evaluated as shown in Table 4. In addition, if the ROC curve is distributed below the diagonal with a slope of 1 and the AUC is calculated to be 0.5 or less, it means that the test method is not useful.

**Table 4.** Evaluation of vulnerability according to area under the curve (AUC).


## **4. Results and Discussion**

### *4.1. Logistic Regression Modeling*

For the analysis, the city of Seoul was divided using a grid into 648,174 30 m × 30 m squares excluding rivers, and 47,065 of these were found to have inundation traces. Through this process, the grids where flooding had occurred and those where it had not were classified as 1 and 0, respectively, and these values were entered into *Y* of Equation (6). In addition, the hydrological-topographic factors for each grid were required in the logistic regression equation to estimate the flood vulnerability, as in Equation (13). This study intends to provide information on the changes in vulnerability according to rainfall, rather than calculating unchanged flood vulnerability for each grid by considering only the topographic factors. Therefore, two logistic regression models were developed and their performance was compared: an equation that used only topographic factors (T) as independent variables, and one that also included hydrological factors (TR). As a result, the logistic regression coefficients and constant terms of the two equations were determined, as shown in Equations (14) and (15), respectively.

$$z = \beta_0 + \beta_1 x_1 + \dots + \beta_n x_n \tag{13}$$

$$\begin{aligned} z_{T} = {}& -4.394 - 1.391 \times \text{Elevation} - 0.120 \times \text{Slope} + 0.049 \times \text{Profile Curvature} \\ & + 0.070 \times \text{Plan Curvature} + 0.335 \times \text{TWI} - 0.147 \times \text{SPI} \\ & + 0.240 \times \text{Distance from River} - 5.746 \times \text{Distance from Manhole} \\ & - 0.093 \times \text{Pump Capacity per Area} \end{aligned} \tag{14}$$

$$\begin{aligned} z_{TR} = {}& -4.486 - 1.323 \times \text{Elevation} - 0.206 \times \text{Slope} + 0.074 \times \text{Profile Curvature} \\ & + 0.101 \times \text{Plan Curvature} + 0.374 \times \text{TWI} - 0.163 \times \text{SPI} \\ & + 0.253 \times \text{Distance from River} - 5.610 \times \text{Distance from Manhole} \\ & - 0.081 \times \text{Pump Capacity per Area} + 0.503 \times \text{Rainfall} \end{aligned} \tag{15}$$

With the data for 2001, the flood vulnerability was calculated using the hydrological-topographic factors and the determined regression coefficients for all grids in the study area (Figure 6). In the figure, a darker color indicates that the area is more vulnerable, while a lighter color indicates that the area is less vulnerable. The flood vulnerability was represented by classifying the probability in the range [0, 1] into five classes using the natural breaks method. The idea of the natural breaks method is to minimize variance among objects within the chosen subsets, and maximize variance between the subsets [48]. The five classes included very high (1.00–0.50), high (0.50–0.34), medium (0.34–0.22), low (0.22–0.13), and very low (0.13–0.02). In addition, as areas with a probability of less than 2% were not evaluated to be vulnerable, a vulnerability level was not assigned to these.
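Using the class boundaries quoted above, the binning step can be written directly; the breaks below are the values reported in this study, whereas a natural-breaks routine would normally derive them from the data.

```python
import numpy as np

# Class boundaries reported above (natural breaks on the 2001 vulnerability).
breaks = [0.02, 0.13, 0.22, 0.34, 0.50]
labels = ["not rated", "very low", "low", "medium", "high", "very high"]

p = np.array([0.01, 0.05, 0.18, 0.40, 0.77])   # hypothetical vulnerabilities
classes = np.digitize(p, breaks)                # class index 0 .. 5
print([labels[c] for c in classes])
# -> ['not rated', 'very low', 'low', 'high', 'very high']
```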

**Figure 6.** Results of logistic regression (2001): (**a**) Topographic data; (**b**) Topographic and hydrological data.

Flood vulnerability, which was calculated with two logistic regression equations, was divided into vulnerability considering only topographic factors (Figure 6a) and one that also considered maximum 1-h rainfall, a hydrological factor (Figure 6b). From the difference between Figure 6a,b, it can be seen that the vulnerability varies by region according to the spatial distribution of the maximum 1-h rainfall. When considering the hydrological factor, the area with very high-intensity rainfall of 100–110 mm/h (the area marked in red in the southwest in Figure 5) was more susceptible to flooding than when only topographic factors were considered. On the other hand, the area with rainfall of 60–70 mm/h (the area represented in green in the northwest in Figure 5) was found to be less vulnerable.

An ROC analysis was conducted to quantitatively confirm whether the flood vulnerable areas determined by the technique proposed in this study and those where floods occurred in the past coincided. To plot the ROC curve with 10 points, the flood vulnerability of target areas was divided into 10 equal parts using quantiles. If many floods occurred in areas with high vulnerability in the ROC curve (lower side of the *x*-axis), the sensitivity would increase, and in particular would increase sharply at the beginning of the curve. Consequently, as the AUC increases, it can be evaluated that the technique of this study detects flooded areas well. The ROC curve is shown in Figure 7.


**Figure 7.** ROC curves of flood vulnerability and inundation traces.

Through ROC analysis, it was found that the AUC of flood vulnerability considering only topographic factors and that including rainfall were 0.848 and 0.866, respectively, and both were evaluated as "very good" as shown in Table 4. Further, the precision was calculated to confirm the rate at which flood occurrence was detected for the flood vulnerability, which was classified into five classes. This can be obtained by using the number of samples classified as positive in a 2 × 2 confusion matrix as shown in Equation (16), and the range is [0, 1] (perfect value is 1) [49,50]. The precision for each class was calculated as shown in Figure 8 and, in both cases, it was found that floods were detected at a rate of more than 50% in the very high class, and more than 40% in the high class.

$$Precision = \frac{TP}{TP + FP} \tag{16}$$
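Per-class precision as in Equation (16) reduces to counting flooded grids among the grids assigned to each class; a minimal sketch with hypothetical class assignments and 0/1 labels:

```python
import numpy as np

def precision_per_class(classes, flooded, n_classes=6):
    """Share of actually flooded grids (TP) among grids assigned to each
    vulnerability class (TP + FP), following Equation (16)."""
    out = {}
    for c in range(n_classes):
        mask = classes == c
        out[c] = flooded[mask].mean() if mask.any() else float("nan")
    return out

# Hypothetical data: class index per grid and observed 0/1 inundation flags.
rng = np.random.default_rng(2)
classes = rng.integers(0, 6, size=1000)
flooded = (rng.random(1000) < 0.1).astype(int)
print(precision_per_class(classes, flooded))
```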

Based on the results of this analysis, it is considered that the logistic regression model detects flood occurrence well in the study area. Although the inputting of the hydrological factor did not make a distinct difference, it can be assumed that the vulnerability will change and the detection rate will improve if new rainfall data are input, even without topographic changes. To confirm this, the performance of the logistic regression equation was evaluated using the maximum 1-h rainfall and inundation trace maps in 2010 and 2011.


**Figure 8.** Proportion of flood occurrence in vulnerable areas.

### *4.2. Mapping Vulnerable Areas in Other Flood Events*

To evaluate the performance of the logistic regression model developed in this study, inundation trace maps and the maximum 1-h rainfall data were collected for floods that occurred in September 2010 and July 2011. Of the rainfall observatories, the Bukaksan (422) and Namhyeon (425) observatories were excluded because they had not yet opened in 2010, and the Bukhansan (420) observatory, which was closed in 2011, was also excluded. The rainfall of the observatories was interpolated using the IDW method as shown in Figure 9. In 2010, very high-intensity rainfall of around 100 mm/h occurred in the west of the study area, while relatively low-intensity rainfall of around 60 mm/h was recorded in the north. Overall, in 2011, it rained less than in 2010, except in the south with 100 mm/h of rainfall.

**Figure 9.** Rainfall interpolation results by year: (**a**) September 2010; (**b**) July 2011.

In addition, the year-end statistical data for 2009 and 2010 [51,52] were used to consider drainage pumping stations that were newly built or increased in capacity after the flooding in 2001. By 2009 and 2010, the number of pumping stations had increased from 91 in 2000 to 111, and the total capacity had increased to 155,313 m<sup>3</sup>/min (2009) and 161,279 m<sup>3</sup>/min (2010). The changed pump capacity per drainage area is shown in Figure 10.

In the logistic regression model developed above with data for 2001, the same values as those in 2001 were used for surface and locational factors, and values for 2010 and 2011 were entered for resistant and inducing factors, respectively. The flood vulnerability that was recalculated by inputting pumping capacity and rainfall to this model is shown in Figure 11. Interestingly, as lower rainfall was input compared to that used in the development of the model, the vulnerability in 2010 and 2011 decreased significantly compared to 2001. In addition, in 2010, the high-intensity rainfall in the west increased the vulnerability, and the low-intensity rainfall in the north decreased the vulnerability. In 2011, most areas were calculated to have low vulnerability, except for the increase in vulnerability in some areas due to high-intensity rainfall in the south.


**Figure 10.** Density of drainage pump capacity by year: (**a**) September 2010; (**b**) July 2011.

**Figure 11.** Results of the selection for flood vulnerable areas: (**a**) September 2010; (**b**) July 2011.

An ROC analysis was conducted to quantitatively analyze the extent to which the calculated flood vulnerability in 2010 and 2011 actually detects floods. The ROC curves for 2010 and 2011 are shown in Figure 12. In both cases, it was found that the measure of flood vulnerability (AUC = 0.861, 0.815) that considered the hydrological factors together detected flood occurrence better than that (AUC = 0.841, 0.766) which considered only the topographic factors. The detection rate was calculated as shown in Figure 13. In 2010, among vulnerable areas considering rainfall, flooding occurred at a rate of 66% (T, 57%) in the very high class; 54% (T, 31%) in the high; 47% (T, 18%) in the medium; 33% (T, 12%) in the low; and 11% (T, 7%) in the very low. In 2011, floods occurred at a rate of 36% (T, 17%) in the very high class; 41% (T, 12%) in the high; 31% (T, 9%) in the medium; 28% (T, 6%) in the low; and 9% (T, 4%) in the very low. Through ROC analysis and precision, it was found that the model for calculating flood vulnerability that only considers topographic factors has a disadvantage of overestimating vulnerable areas, but that the detection rate could be improved by up to over four times (in the low class in 2011) when the rainfall was also considered.

**Figure 12.** ROC curves of flood vulnerability and inundation traces for performance evaluation: (**a**) September 2010; (**b**) July 2011.

**Figure 13.** Proportion of flood occurrence in vulnerable areas for performance evaluation: (**a**) September 2010; (**b**) July 2011.

### *4.3. Discussion*

This study proposed a technique for calculating the flood vulnerability that changes according to the rainfall situation using hydrological-topographic factors. Lee et al. [5] suggested that studies using globally available data, such as SRTM and ASTER, are needed so that they can be used even in areas where data are insufficient in flood vulnerability analysis. In addition, they said that it is necessary to develop a technique that can This study proposed a technique for calculating the flood vulnerability that changes according to the rainfall situation using hydrological-topographic factors. Lee et al. [5] suggested that studies using globally available data, such as SRTM and ASTER, are needed so that they can be used even in areas where data are insufficient in flood vulnerability analysis. In addition, they said that it is necessary to develop a technique that can This study proposed a technique for calculating the flood vulnerability that changes according to the rainfall situation using hydrological-topographic factors. Lee et al. [5] suggested that studies using globally available data, such as SRTM and ASTER, are needed so that they can be used even in areas where data are insufficient in flood vulnerability analysis. In addition, they said that it is necessary to develop a technique that can evaluate flood vulnerability in a simple but scientific way that can be applied to areas where there are no data on hydrological observations or poor quality. Against this background, in this study, topographic data that can be used anywhere was used as an independent variable of the logistic regression model, and data on soil or land use, which may not be available

depending on the region, were not added. However, reviewing previous studies, 89% of the floods that occurred in Seoul in 2011 occurred in areas with an impermeability rate of 70% or higher [21]. Further, it was found that 52.1% of the study area consisted of roads and residential and commercial areas, and that 89.4% of floods occurred in these areas. Adding soil impermeability or land use as an independent variable would therefore be a worthwhile direction for future research.
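For concreteness, a minimal sketch of the kind of grid-based logistic regression described here follows, assuming hypothetical factor columns; an impermeability or land-use column could be appended to the feature list in exactly the same way.

```python
# Sketch: the two model variants of this study, topographic-only (T) and
# topographic + rainfall (TR). Column names are hypothetical placeholders
# for the grid-based factors described in the paper.
import pandas as pd
from sklearn.linear_model import LogisticRegression

TOPO = ["elevation", "slope", "flow_accum"]     # assumed factor names
HYDRO = ["rainfall"]                            # event rainfall per cell

def fit_t_and_tr(grid: pd.DataFrame):
    y = grid["flooded"]                         # 1 = historical inundation trace
    model_t = LogisticRegression(max_iter=1000).fit(grid[TOPO], y)
    model_tr = LogisticRegression(max_iter=1000).fit(grid[TOPO + HYDRO], y)
    return model_t, model_tr                    # vulnerability = predict_proba
```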

In this study, flood vulnerability was calculated using hydrological-topographic factors and compared with historical inundation trace data. As a result, there were some cases where flooding occurred in areas with a calculated vulnerability lower than 0.5, and other cases where flooding did not occur even in areas above 0.5. However, it remains uncertain whether areas with relatively low vulnerability are safer. It is true that areas with high vulnerability require intensive management due to their high probability of flooding, but even areas with low vulnerability should be managed with constant attention to reduce flood damage. Since floods arise from very complicated combinations of causes, it may be difficult to detect them using only the factors used in this study. Kim et al. [53] proposed an optimal input data selection method by combining total rainfall, rainfall of various durations, kurtosis, and skewness to predict urban flooding using a deep neural network. If the characteristics of rainfall, such as various durations, kurtosis, and skewness, are considered as inducing factors, the detection accuracy for flood-vulnerable areas could be improved, as sketched below.
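The sketch below derives such rainfall descriptors (event total, rolling maxima over several durations, kurtosis, skewness) from an hourly series, in the spirit of Kim et al. [53]; the window lengths and names are illustrative assumptions, not their actual configuration.

```python
# Sketch: rainfall descriptors in the spirit of Kim et al. [53].
# Durations and names are illustrative assumptions.
import pandas as pd
from scipy.stats import kurtosis, skew

def rainfall_features(series: pd.Series) -> dict:
    """series: hourly rainfall depths (mm) for one event."""
    feats = {"total_mm": float(series.sum())}
    for hours in (1, 3, 6, 12, 24):                      # candidate durations
        feats[f"max_{hours}h_mm"] = float(series.rolling(hours).sum().max())
    feats["kurtosis"] = float(kurtosis(series))
    feats["skewness"] = float(skew(series))
    return feats

# Toy 12-hour event
rain = pd.Series([0, 2, 8, 25, 40, 18, 6, 3, 1, 0, 0, 0], dtype=float)
print(rainfall_features(rain))
```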

The flood vulnerability calculated using hydrological-topographic factors did not take into account the density and importance of the population and capital in the study area. If a flood occurs in a densely populated area, it is difficult for many people to evacuate all at once, so even an area with low vulnerability requires close attention. Similarly, in areas where major social overhead capital (SOC) facilities are located, such as power plants or water supply/wastewater treatment facilities, great damage can be caused to surrounding areas when floods occur. Rehman et al. [54] reviewed scholarly articles related to flood vulnerability from 1990 to 2018, noting that flood vulnerability is being analyzed in social, environmental, and economic contexts, and presented a list of indicators that can be used for future research. In this regard, regions with high socio-economic vulnerability should be rated using different criteria from those applied to other regions when calculating flood vulnerability ratings.

Another point to be improved in the methodology of this study is that the vulnerability of the entire study area was calculated using one logistic regression equation. Accordingly, there is a limitation in that areas with the same hydrological-topographic factors are likely to be determined as vulnerable areas even though flood damage did not occur there. These areas may not actually be flooded due to flood protection facilities such as a retarding basin, or drainage pipe networks that have already been expanded to handle the greater amount of rainfall through local government management. Based on this, a good direction for a future study would be to develop a logistic regression equation for each drainage section rather than only one equation for the entire study area. It is considered that the disaster prevention performance of the drainage section can be reflected indirectly through such equations without using a physical model.
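A minimal sketch of that suggested direction, assuming the grid carries a hypothetical "section" column identifying the drainage section of each cell:

```python
# Sketch: one logistic regression per drainage section instead of a single
# equation for the whole study area. Column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression

FACTORS = ["elevation", "slope", "flow_accum", "rainfall"]

def fit_per_section(grid: pd.DataFrame) -> dict:
    models = {}
    for section, cells in grid.groupby("section"):
        if cells["flooded"].nunique() < 2:   # need flooded and non-flooded cells
            continue                         # e.g. sections protected by basins
        models[section] = LogisticRegression(max_iter=1000).fit(
            cells[FACTORS], cells["flooded"])
    return models
```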

## **5. Conclusions**

In this study, we proposed a technique to detect flood vulnerable areas by simultaneously considering topographic and hydrological factors to reduce damage caused by flooding. To estimate the vulnerability to flooding of the study area, a logistic regression model was developed using historical inundation trace data, and hydrological-topographic factors based on the grid system. The conclusions obtained through this study are as follows.

(1) A logistic regression model was established by dividing into a model that only considered topographic factors (T) and one that included hydrological factors (TR), and the results were compared. When comparing the two models, it was found that the estimated results differed due to the influence of rainfall. In addition, according to the results of ROC analysis and precision calculation, it was found that the method of estimating flood vulnerability that included the hydrological factor was relatively better at detecting the flood occurrence pattern.


**Author Contributions:** Conceptualization, J.-Y.L. and J.-S.K.; methodology, J.-Y.L. and J.-S.K.; software and validation, J.-Y.L.; writing—original draft preparation, J.-Y.L.; writing—review and editing, J.-S.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This study was supported by a Grant (127568) from the Water Management Research Program funded by the Ministry of Environment of the Korean government.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Acknowledgments:** This study was supported by a Grant (127568) from the Water Management Research Program funded by the Ministry of Environment of the Korean government.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## *Article* **Marine Gas Hydrate Geohazard Assessment on the European Continental Margins. The Impact of Critical Knowledge Gaps**

**Ricardo León \*, Miguel Llorente and Carmen Julia Giménez-Moreno**

Geological Survey of Spain (IGME), Department of Research and Prospective Geoscience, Rios Rosas 23, 28003 Madrid, Spain; m.llorente@igme.es (M.L.); j.gimenez@igme.es (C.J.G.-M.) **\*** Correspondence: r.leon@igme.es

**Featured Application: Results can be used by policy makers and companies for planning seafloor activities. The susceptibility assessment and the analysis of its reliability can be applied in other regional works of geohazard assessment.**

**Abstract:** This paper presents a geohazard assessment along the European continental margins and adjacent areas. This assessment is understood in the framework of the seafloor's susceptibility to (i.e., likelihood of) being affected by the presence of hydrate deposits and the subsequent hazardous dissociation processes (liquefaction, explosion, collapse, crater-like depressions or submarine landslides). Geological and geophysical evidence and indicators of marine gas hydrates in the theoretical gas hydrate stability zone (GHSZ) were taken into account as the main factors controlling the susceptibility calculation. Svalbard, the Barents Sea, the mid-Norwegian margin–northwest British Islands, the Gulf of Cádiz, the eastern Mediterranean and the Black Sea have the highest susceptibility. Seafloor areas outside the theoretical GHSZ were excluded from this geohazard assessment. The uncertainty analysis of the susceptibility inference shows extensive seafloor areas with no data and a very low density of data that are defined as critical knowledge gaps.

**Keywords:** gas hydrates; European margins; geohazard assessment

## **1. Introduction**

Marine gas hydrates are crystalline solids forming ice-like marine deposits. They are composed of water molecules surrounding light hydrocarbon gases, such as methane (the most common), ethane and propane, in cage-like lattices [1]. They are common in shallow marine sediments (<1000 m bsf) below 350 mwd under high pressure, relatively low temperature and high hydrocarbon gas saturation in pore water conditions [2–4]. Bacterial methanogenesis, thermogenesis [5] and serpentinised oceanic crust [6,7] are the main sources of CH<sub>4</sub> in continental margin sediment.

Marine gas hydrate is considered an important geohazard feature [8]. Three main environmental parameters control the nucleation and dissociation of marine methane hydrates: seafloor temperature, geothermal gradient and pressure [4]. Depressurization due to drops in sea level and warming of bottom water is the main natural scenario in which hydrate dissociation can take place, driving large-scale natural gas release with potentially profound impacts, generating landslides, pockmarks, collapses, seafloor explosions and gas release [3,9]. These processes have also been hypothesized in the geological record [10]. However, under stable pressure/temperature conditions inside the gas hydrate stability zone (GHSZ), hydrocarbon seepage (such as pockmarks and gas flares) is likely to occur along fluid migration pathways of deep hydrocarbon reservoirs [11]. Nevertheless, these pressure-temperature conditions of shallow sediments may be modified by human activity on deep-water infrastructure such as wellheads, pipelines, production facilities, seabed anchors, cable touchdown areas on the seabed and catenaries in the water column [12].


Geological event inventories are a useful tool for regional risk analysis [13]. Global gas hydrate inventories have been used to make predictions about global hydrate volumes [14,15] and related risks [16], and to make impact projections based on future warming scenarios [17]. In the near future, marine gas hydrates will become a severe geohazard because of the unfavourable consequences of global warming on the marine gas hydrate stability field [11,17]. They will thus trigger seafloor instabilities in gas hydrate areas that are currently stable [17], discharging marine methane from shallow near-shore environments (0–50 m) to the atmosphere [18], lowering pH and causing geochemical changes in the water column due to aerobic oxidation [19]. However, moderate methane submarine emissions are absorbed by fragile chemosynthetic ecosystems that prosper in the vicinity of venting gas seeps [20,21].

Evidence of marine methane hydrates has been reported in eight main regions along the European continental margins, and in adjacent areas such as offshore Greenland and Svalbard, the Norwegian margin, offshore the northern British Islands, the southern Iberian and northwest African margins (the Gulf of Cádiz and Alborán Sea), and the Black, Marmara and eastern Mediterranean seas [22]. However, hydrate-related data (geological, geophysical and oceanographic) are not homogeneously covered in the whole extent of the European continental margins. This issue is especially important for obtaining hydraterelated predictions (e.g., creating a predictive—and quick—static and continuous model for the hydrate stability field along the whole of the European continental margins).

This paper presents, for the first time on the whole of the European margins and adjacent areas, a geohazard assessment (susceptibility analysis) of the presence of marine gas hydrates. It also assesses the main knowledge gaps of hydrate-related information with a pan-European scope, and analyses their impact on the uncertainty of susceptibility inference. Susceptibility is understood as the likelihood of the seafloor to be affected by the presence of hydrate deposits.

## **2. Geological Setting**

The study area offers a wide view of the European margins from Macaronesia (SW corner: 24°15′ N, 36°10′ W) to the Black Sea and the Barents Sea (NE corner: 60°40′ E, 90° N). This hydrate framework covers three great domains on the European continental margins: the Arctic, the northeast Atlantic Sea and the south European Alpine Belt (Figure 1).

Hydrate systems in the Arctic province are located on the west Greenland, Svalbard and western Barents Sea margins. The east Greenland and west Svalbard margins were created during the Cretaceous to Paleogene continental rifting [23]. East Greenland is bordered by a wide continental shelf and deep basins. Gas seepage, bottom-simulating reflector (BSR) levels and pore water anomalies related to offshore Mesozoic sedimentary basins have been reported, associated with thermogenic gas migration along fractures [24]. The west Svalbard margin is composed of glacigenic debris flows, turbidites, hemipelagic sediments and contourites [25], where gas flares, seepages and BSR levels take place [26]. Hydrates are mainly represented in mud volcanoes in south Svalbard (e.g., Håkon–Mosby mud volcano—HM in Figure 1; [27]). Hydrocarbon gases are mainly thermogenic and locally biogenic [28], but abiogenic [29] contributions have been reported.

The Barents Sea and northern Norwegian margins are composed of a complex structure of sedimentary basins (e.g., Møre and Vøring; MB and VB in Figure 1, respectively) and structural highs resulting from the Cenozoic rifting [30]. The sedimentary basins are filled mainly with fine-grained hemipelagic sediments (Miocene–Pliocene) and glacigenic debris flows and contourites (Plio-Pleistocene) [31]. The Storegga Slide (SS in Figure 1) affected a huge sediment volume (mainly of the Møre Basin) as a response to climatic variability, 8200 ya [32]. In the Barents Sea, hydrate indicators are mainly BSR levels, gas chimneys and seepage pipes associated with vertical fluid flow systems and shallow gas.

**Figure 1.** Main tectonic structures and geological domains on the European continental margins and adjacent areas. Location of the study area. HM, Håkon–Mosby mud volcano; VB, Vøring Basin; MB, Møre Basin; SS, Storegga Slide; CGFZ, Charlie–Gibbs fracture zone; EEC, East European craton; Biscay Bay; GFZ, Gloria Fracture Zone; GC, Gulf of Cádiz; AS, Alborán Sea; SAEB, south European Alpine Belt; NAF, North Anatolian Fault; DD, Danube delta fan. (Taken from [23,33,34]).

However, no hydrates have been obtained. The nature of gases is mostly thermogenic, migrating through faults and fractures [35]. On the mid-Norwegian margin, hydrate samples were recovered during the TTR-16 cruise [36], and the indicators are BSR levels, bright spots, gas chimneys and pockmarks [37], all of them circumscribed to the header of the Storegga Slide. Here, gases have a microbial origin with thermogenic components [38].

The northwest British Islands margins (Figure 1) are composed of basins with thick Cenozoic series [39]. Several seepages and gas chimneys are present on the upper slope, controlled by unconformities and fractures and sourced from the Upper Carboniferous and Middle and Upper Jurassic successions [40]. No hydrates have been recovered.

The hydrate province of the south European Alpine Belt (SAEB in Figure 1) is located on the southern Iberian and northwest African margins, in the eastern Mediterranean and in the Black Sea. These areas are located in the context of the Alpine orogeny, owing to the convergence between the African and Eurasian plates [41]. On the southern Iberian and northwest African margins and in the Gulf of Cádiz and Alborán Sea (GC and AS in Figure 1), hydrate samples have been recovered in mud volcanoes [42,43], associated with other hydrocarbon fluid flow structures such as pockmarks and hydrocarbon-derived authigenic carbonate (HDAC) [44]. This fluid flow is controlled by fractures linked to a deep-rooted mud diapirism in the allochthonous unit of the Gulf of Cádiz (AUGC in Figure 1; [45]). Hydrocarbon gases have three origins: thermogenic, mixed thermogenic/biogenic on the subsurface [46] and abiogenic [7].

In the eastern Mediterranean Sea, the main seabed fluid flow areas are the accretionary complex and the Nile delta (AP and ND in Figure 1, respectively), where multiple mud volcanoes, pockmark fields and broad degassing areas with chemosynthetic fauna and authigenic carbonates are present [47]. Hydrates have only been observed in mud volcanoes along the accretionary complex [48]. The potential sources for hydrocarbon are related to late Messinian and Miocene to recent sapropels [49]. In mud volcanoes, gas has a thermogenic signature, while in pockmarks, the signature is predominantly microbial methane [50]. In the Marmara Sea (MS in Figure 1), a pull-apart basin onshore of the North Anatolian Fault (NAF in Figure 1) [51], hydrates of thermogenic origin related to seismic indicators (e.g., bright spots and transparent and chaotic zones) [52], gas flares, mud volcanoes and pockmarks [53] have been acquired.

The Black Sea (2212 mwd) is an extensional, mostly anoxic back-arc basin that contains the largest hydrogen sulphide and methane reservoirs in the world [22]. It is composed of two sub-basins, the eastern and western ones [54]. The evidence and indicators of marine gas hydrates are BSR levels, seismic blanking, bright reflections, pockmarks and mud volcanoes. The northwestern part is dominated by organic-rich delta fan complexes (the Danube [DD in Figure 1] and Dniepr rivers), where gas flares (microbial origin) and BSR levels have been observed linked to their canyon and levee systems [55]. Gas hydrates have been recovered in mud volcanoes related to gas chimneys, active faulting and diapirism in deep areas of the western basin, as well as in shallow (upper-middle slope) areas in the southern and eastern parts [55,56]. In these areas, gas shows a mixed thermogenic and microbial composition in the subsurface.

## **3. Data Source and Methods**

This study used the hydrate-related GIS database of the GARAH project 731166, GeoERA-GE-1, H2020 Environment (https://geoera.eu/projects/garah4/; accessed on 20 December 2020). This GIS database (GARAH*ydrates*; [57]) is INSPIRE-compliant and stores hydrate-related geological, geophysical and oceanographic information (Figure 2). It is the result of a data collection from two main groups of data: (i) data of a pan-European scope from free public databases and project results, such as EMODnet, PERGAMON and MIGRATE; and (ii) data of a regional scope from scientific organizations. The source of the data, the owner and the person/institution of contact are stored in each database record.

The majority of the geological and geophysical evidence and indicators of marine gas hydrates of GARAH*ydrates* came from the results of Work Package 1 of the MIGRATE COST action—ES1405 (https://www.migrate-cost.eu/; data given by MIGRATE COST Action to GARAH project on 31 January 2019) led by the University of Southampton and the National Oceanographic Centre, in which 21 organizations from 15 countries were involved. The MIGRATE database contains 1892 records (vector and raster) and stores information regarding direct and indirect evidence of gas hydrates. The data on direct evidence of gas hydrates are from samples described in publications. The data on indirect evidence include seismic indicators such as BSR levels and areas, gas chimneys, high reflectivity areas and velocity anomalies. Other gas hydrate information includes seabed features (gas seepages areas), heat flow data, sediment thickness models, pore water anomalies, theoretical models of the base of the GHSZ, and relief and bathymetry models.

**Figure 2.** Geological, geophysical and oceanographic data sets used in the present paper. Marine gas hydrate evidence and indicators, and oceanographic variables stored in the GARAH*ydrates* database. (**a**) Study area. (**b**) West Greenland. (**c**) Barents Sea.

EMODnet Geology marine minerals (https://www.emodnet-geology.eu/map-viewer/?p=marine\_minerals; accessed on 1 January 2019) supplied 28 records in polygon shapefile format regarding marine gas hydrate evidence in the Gulf of Cádiz, the Barents Sea and the Black Sea. The PERGAMON database was developed by the geological surveys of Spain (IGME) and Ireland (GSI) in 2011 and 2012 in the framework of the PERGAMON COST action—ES0902 (https://www.cost.eu/actions/ES0902/; data given by PERGAMON COST Action to GARAH project on 1 January 2019). This database supplied seafloor temperature data and theoretical models of the thickness of the GHSZ in the Arctic Sea.

The geothermal gradient data were obtained from the global heat flow database of the International Heat Flow Commission (http://engineering.und.edu/geology-and-geological-engineering/globe-heat-flow-database/index.cfm; accessed on 31 March 2020). The data were downloaded with the ODV application (https://odv.awi.de/; accessed on 1 February 2019). Seafloor temperature is a composite dataset developed by the Geological Survey of Spain (IGME) using CTD data downloaded from the World Ocean Database (https://www.ncei.noaa.gov/products/world-ocean-database; accessed on 1 February 2019) and the British Oceanographic Data Centre (https://www.bodc.ac.uk/; data given by British Geological Survey to GARAH project on 30 November 2018). Finally, bathymetry was obtained from three sources: the EMODnet Bathymetry portal (https://www.emodnet-bathymetry.eu/; accessed on 31 March 2020), IBCAO (https://www.gebco.net/data\_and\_products/gridded\_bathymetry\_data/arctic\_ocean/; accessed on 1 February 2019) and GEBCO (https://www.gebco.net/; accessed on 31 March 2020).

Several records were added or updated using regional data from scientific organizations: (i) British Geological Survey (BGS) technical reports for geophysical indicators in the north of the British Islands [58–60]; (ii) the BGS 250k map series/MCA Civil Hydrography Programme data for pockmark mapping; (iii) marine gas hydrate evidence in the Black Sea from SRDE-Geoinform of Ukraine; and (iv) regional models of the base of the GHSZ for CH<sub>4</sub> and/or CO<sub>2</sub> in the Biscay Bay from the Bureau de Recherches Géologiques et Minières (BRGM).

The thickness of the GHSZ was taken from Núñez-Varela [61]. The hydrate stability field was calculated with the CSMHYD program [62] for a standard composition of biogenic gas [63] and a salinity assumption of 36 psu for the whole study area.
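As a hedged illustration of this kind of calculation (not the CSMHYD workflow itself), the sketch below estimates GHSZ thickness by intersecting a linear geotherm with the seawater methane hydrate phase boundary of Dickens and Quinby-Hunt (1994), 1/T = 3.79e-3 − 2.83e-4·log10(P) with T in K and P in MPa; hydrostatic pressure and the example inputs are simplifying assumptions.

```python
# Sketch: GHSZ thickness from the intersection of a linear geotherm with
# the Dickens & Quinby-Hunt (1994) seawater methane hydrate boundary,
# 1/T = 3.79e-3 - 2.83e-4*log10(P), T in K, P in MPa. This stands in for
# the CSMHYD program used in the paper; hydrostatic pressure and a linear
# geotherm are simplifying assumptions.
import numpy as np

def ghsz_thickness_m(water_depth_m, seafloor_temp_c, gradient_c_per_km,
                     rho=1030.0, g=9.81):
    """Depth below seafloor (m, capped at 3 km) where the geotherm exits
    the hydrate stability field; 0 if the seafloor itself is outside it."""
    z = np.arange(0.0, 3000.0, 1.0)                       # m below seafloor
    p_mpa = rho * g * (water_depth_m + z) / 1e6           # hydrostatic pressure
    t_stab = 1.0 / (3.79e-3 - 2.83e-4 * np.log10(p_mpa))  # boundary temp (K)
    t_sed = seafloor_temp_c + 273.15 + gradient_c_per_km * z / 1000.0
    inside = t_sed < t_stab
    return float(z[inside].max()) if inside[0] else 0.0

# e.g. 1500 m water depth, 4 degC bottom water, 40 degC/km geothermal gradient
print(ghsz_thickness_m(1500.0, 4.0, 40.0))                # roughly 300-400 m
```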

## **4. Results**

## *4.1. Hydrate-Related Information Stored in GARAHydrates*

The hydrate-related information is structured in four levels inside GARAH*ydrates*: (i) geological and geochemical evidence and indicators, (ii) geophysical indicators, (iii) seabed fluid flow structures, and (iv) oceanographic variables (Figure 2). Four types of items describe the information: location items, property metadata, geo-descriptors and references/comments (Table 1). Location items describe the geographical location (coordinates, geological setting, etc.). Property-reference metadata store the owner of the data and contact information. Geo-descriptors describe the geological, geochemical and geophysical characteristics of the evidence or indicator. Finally, references/comments store bibliographic references and other comments of interest of each item of evidence or indicator.

The level of information "geological and geochemical evidence and indicators" stores evidence (e.g., crystals of gas hydrates) and indicators (e.g., degassing structures and pore water anomalies) of gas hydrates acquired by direct sampling. The level "geophysical indicators" stores seismic or electric features of gas hydrate presence in the sediment column, such as high resistivity, BSR levels, bright spots, acoustic blanking facies and gas chimneys. The level "seabed fluid flow structures" stores structures related to fluid migration in areas where evidence or indicators of marine gas hydrates have been observed. Finally, the level "oceanographic variables" stores information about seafloor temperature, geothermal gradient and bathymetry.


**Table 1.** Description of the attributes (items) of the GARAH*ydrates* GIS database. NN, not null; LV, list of values.

More than 136,000 records of hydrate samples, seismic indicators and seabed fluid flow structures have been stored (Table 2). In west Greenland, no hydrates were recovered; there are only six indirect items of evidence or indicators, such as pore water anomalies of chloride. However, BSR levels (~9400 km<sup>2</sup>) and gas flares and seepages (~3500 km<sup>2</sup>) have been mapped. West Greenland–Svalbard–Barents Sea is, together with the Black Sea, one of the most extensive hydrate regions in the study area, with more than 2100 km<sup>2</sup> and 26,300 km<sup>2</sup> of mapped hydrates and BSR levels (58), respectively, as well as numerous gas flares and seepages. Although the mid-Norwegian margin is a very localized hydrate region, it is the only site where hydrates have been recovered on the northwest Atlantic European margins. Here, gas hydrates are linked to the old slumped slope (Storegga), BSR levels and seepage area. On the southern Iberian and northwest African margins and in the Mediterranean Sea, hydrates have only been recovered on mud volcanoes (in the Gulf of Cádiz and eastern Mediterranean Sea). A high number of hydrate samples have been recovered in these regions as a result of many oceanographic cruises by scientific groups. The Black Sea is the hydrate region that has most gas hydrate surfaces mapped (more than 3650 km<sup>2</sup>), with more than 80 mud volcanoes on the seafloor, 91% of them inside the GHSZ.

Finally, three information layers are stored in the oceanographic variable group: seafloor temperature (5896 records), geothermal gradient (4332 records) and bathymetry (composite raster dataset with a cell size ~100 × 100 m).



**Table 2.** Geological and geophysical evidence and indicators of marine gas hydrates stored in GARAH*ydrates.*

## *4.2. Susceptibility Assessment*

The presence of gas hydrates in marine sediments is a geohazard that has not yet been evaluated in the whole of the European continental margins. This work uses the database of marine gas hydrate evidence and indicators developed in the GARAH project to make a pan-European assessment of hydrate presence on its continental margins. This assessment was carried out in two steps: analysis and weighting of factors and susceptibility calculation.

## 4.2.1. Analysis and Weighting of Factors

Several factors were taken into account in this assessment: marine gas hydrate evidence, seismic indicators, seabed fluid flow structures and thickness of the GHSZ.

Evidence of marine gas hydrates is the ground truth of where hydrate exists in the seafloor and/or sub-seafloor. This layer establishes seafloor areas with a moderate–high likelihood of occurrence of dissociation processes in the seafloor or sub-seafloor. The magnitude of the processes will depend on the quantity of hydrate in the sedimentary column and its type within the sediment (massive, in layers, disseminated, etc.).

Seismic indicators show seafloor areas where hydrates could exist. Marine gas hydrates have not been recovered, but there is a moderate to high likelihood of them occurring. In areas where marine gas hydrates have been recovered by direct sampling (geological evidence), seismic indicators allow hydrate occurrence to be inferred.

Seafloor geological structures related to hydrocarbon fluid migration, such as pockmarks, gas flares, mud volcanoes and HDAC are directly linked to deep hydrocarbon reservoirs. The occurrence of these structures reveals a free fluid leakage from the sedimentary structure to the water column. These processes can occur inside the GHSZ, in some cases due to preferential fluid migration from deep reservoirs, and in other cases, as a result of hydrate dissociation processes. This layer of information will thus establish a wide spectrum of the susceptibility of presence of marine gas hydrates and their hazardousness. It will range from low or void in areas outside the GHSZ to moderate to high in areas inside the GHSZ.

The thickness of the GHSZ establishes the theoretical seafloor area where the occurrence of hydrates is physically possible under optimal gas saturation and salinity conditions. Seafloor areas inside the GHSZ will be considered potential areas to be affected by dissociation processes. In addition, the intersection between the base of the GHSZ and the seafloor will be considered a potential strip of the high likelihood of fluid leakage and dissociation processes. The three oceanographic variables taken into account in the thickness calculation were seafloor temperature, geothermal gradient and bathymetry.

Each geological and geophysical item of evidence and indicator was weighted according to the confidence/certainty of finding hydrates at the site. The maximum weight (or confidence) was given to recovered samples of gas hydrates or evidence of hydrate dissociation, such as degassing or liquefaction structures in gravity cores. Seismic indicators of the presence of gas hydrates or hydrocarbon seabed fluid flow such as BSRs, acoustic blanking, amplitude anomalies and the presence of geological structures of seabed fluid flow in the vicinity of the GHSZ were weighted with a lower value, between 0.8 and 0.9, based on expert criteria (Table 3).

Regarding the theoretical GHSZ, the seafloor was weighted in three categories. Seafloor areas outside the theoretical GHSZ were excluded as not likely to be affected by hydrate dissociation processes. On the other hand, any location inside the GHSZ was selected as theoretically likely to suffer dissociation processes. A strip at the up-dip limit of the GHSZ (50 m in thickness) was a critical area for these dissociation processes (Figure 3).


**Table 3.** Weights given to each hydrate-related item of evidence or indicator for the development of the density map of evidence/indicators.

## 4.2.2. Susceptibility Calculation

The proposed methodology analyses the geological hazard by means of the susceptibility assessment. The term "susceptibility" is employed here to define the likelihood of occurrence of hydrates in the sediment column, and subsequently the likelihood of them being affected by dissociation processes resulting from natural or human-induced activities (liquefaction, explosions, collapse, crater-like depressions or submarine landslides). Susceptibility assessment is applied as the first step in a pan-European risk assessment, owing in particular to (i) the regional scope of the assessment (the European continental margins and adjacent areas) and (ii) the current state of European gas hydrate-related information characterized by intensively studied areas with a high density of high-quality data and wide areas of critical knowledge gaps with no data.

The baseline scenario (the initial hypothesis) is that gas hydrate occurrence is only possible in seafloor areas where pressure (bathymetry) and seafloor temperature conditions are inside the theoretical GHSZ. In this zone, the occurrence of gas hydrates is directly related to the presence of evidence (direct samples of hydrates) or indicators of it (e.g., pore water and velocity anomalies, BSRs and gas chimneys), as well as the occurrence of hydrocarbon fluid flow structures. Finally, the likelihood of the seafloor being affected by gas hydrate dissociation processes will be great at the base of the GHSZ and in the vicinity of gas hydrate evidence and indicators.

In order to prove this initial hypothesis, a susceptibility assessment was carried out through map algebra in a GIS environment from a density map of evidence and indicators and the pan-European map of the GHSZ on the seafloor.

The first step for the development of the density map was to create a lattice of evidence and indicators at a resolution of the work scale (5 × 5 km), which will be the final resolution of the susceptibility assessment map. In this lattice, each geographical feature of evidence and indicators was weighted according to Table 3.


**Figure 3.** Theoretical up-dip limit of the GHSZ in the study area (modified from Núñez-Varela, 2020). In the volcanic area of Macaronesia (the Canary and Madeira islands), the up-dip limit has been eliminated because of the absence of hydrocarbon reservoirs on the flanks of the islands.

A hypothesis was then established that considered the database of ground evidence for sites that have been sampled, but occurrences of gas hydrates might not be restricted to these point locations. If a given pixel were located between a ground evidence and an indicator, the likeliness of that pixel containing gas hydrates would be greater than that of a pixel located far from either. Given the discrete nature of the features described within the database and the relative concept of this likeliness, a regionalization technique was applied following a smoothed saturated algorithm of kernel density. Here, the weighting of the features represents an abstract concept of the confidence of having gas hydrates. This technique consists of a kernel density estimation, which fits a smoothly tapered surface to each point or polyline. The search radius (default option) was calculated on the basis of the spatial configuration and the number of input points. This approach corrects for spatial outliers (input points that are very far from the rest), so they will not make the search radius unreasonably large. These values of the weighted density map were then normalized from zero to one (Figure 4) to remove the per area output of the model, providing a relative likeliness of between 0 and 1, where 0 means that there is very little (but not no) confidence of finding significant amounts of gas hydrates at that location and 1 means that there is certainty of finding gas hydrates at that location.
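A compact sketch of this step follows; the paper used the ArcGIS "kernel density" tool with a data-driven search radius, so the fixed bandwidth, projected coordinates and SciPy estimator here are stand-in assumptions.

```python
# Sketch: weighted Gaussian kernel density over the lattice of evidence and
# indicators, min-max normalized to [0, 1]. SciPy's estimator stands in for
# the ArcGIS "kernel density" tool; the bandwidth is an assumption.
import numpy as np
from scipy.stats import gaussian_kde

def normalized_density(xy, weights, grid_x, grid_y, bandwidth=0.1):
    """xy: (2, n) projected coords; weights: confidence per feature (Table 3)."""
    kde = gaussian_kde(xy, bw_method=bandwidth, weights=weights)
    gx, gy = np.meshgrid(grid_x, grid_y)
    dens = kde(np.vstack([gx.ravel(), gy.ravel()])).reshape(gx.shape)
    return (dens - dens.min()) / (dens.max() - dens.min())  # 0 = least confidence
```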

**Figure 4.** Normalized (zero to one) and weighted density map of hydrate evidence and indicators. Lattice of hydrate evidence and indicators overlapped, red dots. Density map developed with the "kernel density" algorithm of ArcGIS®. Parameters: population field, weight (taken from Table 3); cell size, 5000; method, geodesic.

Regarding the weighted map of the theoretical GHSZ (Figure 3), the up-dip limit of the GHSZ in the vicinity of the low-latitude volcanic islands in the Atlantic Ocean (i.e., the Azores, Madeira and the Canary Islands) was not taken into account because of the absence of hydrocarbon reservoirs at these geological sites. Finally, this map was weighted in relation to the mean value of the normalized density map of evidence and indicators (mean = 0.00228; Figure 4). Thus, according to its likelihood of being affected by dissociation processes, the GHSZ on the seafloor was weighted using expert criteria. The strip of up-dip of the GHSZ was weighted with 0.00228 and the rest of the GHSZ with 0.00114 (half the likelihood). Seafloor areas outside the GHSZ were given a value of zero.

The susceptibility assessment was performed by map algebra, taking into account the control maps of density of hydrate evidence and indicators and the weighted map of the GHSZ on the seafloor. Specifically, the final map (Figure 5) was conceived as a segmentation in three levels by quantiles resulting from the addition of the above control maps:

$$Sc = \delta\_{ei} + GHSZ\_w \tag{1}$$

where *Sc* is the susceptibility map; *δei* is the normalized weighted density map of hydrate evidence and indicators; and *GHSZ<sup>w</sup>* is the weighted map of the GHSZ on the seafloor. The final *Sc* value was masked with the positive values of the GHSZ map. Seafloor areas outside the GHSZ have a susceptibility value of zero.


**Figure 5.** Susceptibility assessment of the seafloor to the presence of hydrates on the European continental margins and adjacent areas.

Susceptibility values were segmented into three levels by quantiles: low from 0.009 to 0.0129, middle from 0.0129 to 0.0325 and high from 0.0325 to 1.009. The susceptibility assessment shows seven areas with high values: Svalbard, the northern Norwegian margin–Barents Sea, the continental slope of the mid-Norwegian margin and the North Sea, the Gulf of Cádiz, the eastern Mediterranean and the Black Sea. Moderate values are located on the west Greenland continental shelf, near the northwest British Islands and on the continental slope of the western and northern Mediterranean Sea.
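A schematic rendering of Equation (1) and the quantile segmentation as raster algebra follows, with placeholder arrays; the strip and interior weights follow the values quoted above.

```python
# Sketch: Equation (1) as raster algebra. density: normalized evidence/
# indicator density; in_ghsz / in_strip: boolean masks for the GHSZ and its
# 50 m up-dip strip. Weights 0.00228 / 0.00114 follow the text.
import numpy as np

def susceptibility(density, in_ghsz, in_strip):
    ghsz_w = np.where(in_strip, 0.00228, np.where(in_ghsz, 0.00114, 0.0))
    sc = density + ghsz_w
    sc[~in_ghsz] = 0.0                            # mask outside the GHSZ
    lo, hi = np.quantile(sc[in_ghsz], [1 / 3, 2 / 3])
    levels = np.digitize(sc, [lo, hi])            # 0 low, 1 middle, 2 high
    return sc, levels
```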


## **5. Discussion**


This section analyses the pan-European database of susceptibility of presence of marine gas hydrates from a geohazard point of view, considering the impact of spatial data distribution on the uncertainty value and on the identification of critical knowledge gaps.

## *5.1. Hydrate-Related Knowledge Gaps*

Nucleation and dissociation of marine methane hydrates are directly controlled by the solubility of hydrocarbon gas in the sediment pore water and by three environmental parameters: seafloor temperature, geothermal gradient and pressure (water depth) [3]. However, free public information about these key parameters shows a non-homogeneous continuity along the European continental margins. This issue is especially critical for understanding the behaviour of the GHSZ or making predictions or calculations on it. In particular, it is essential for assessments related to geohazards and risks, assessments of the abundance of sediment-hosted gas hydrates, and assessments of the role of CO<sub>2</sub>-rich hydrates in the geological storage of CO<sub>2</sub>. The issue is also of broad interest to the scientific community: petroleum geologists, biologists and ecologists working on vulnerable ecosystems, researchers on natural hazards and tsunamis, civil engineers and policy makers.

We therefore assessed the critical knowledge gaps in the geological and geophysical evidence and indicators and in the oceanographic variables taken into account in the calculation of the GHSZ thickness. This assessment was carried out by means of density maps of seafloor temperature, geothermal gradient and hydrate evidence and indicators. Owing to the regional (pan-European) scope, an area unit was established at a sedimentary basin scale of 100,000 km<sup>2</sup> (a search radius of ca. 178.4 km) and a surface resolution of 5 km × 5 km.
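A sketch of this density calculation follows, using a cKDTree on projected (metric) coordinates as a stand-in for the geodesic "point density" tool of ArcGIS used in the paper; the radius reproduces the ~100,000 km<sup>2</sup> search circle.

```python
# Sketch: records within a 178.415 km radius, expressed per 100,000 km^2
# (this radius makes the search circle ~100,000 km^2). A cKDTree on metric
# coordinates stands in for the geodesic ArcGIS "point density" tool.
import numpy as np
from scipy.spatial import cKDTree

RADIUS_M = 178_415.0

def record_density(points_xy, grid_xy):
    """Counts per 100,000 km^2 at each grid node; coordinates in metres."""
    counts = cKDTree(points_xy).query_ball_point(
        grid_xy, r=RADIUS_M, return_length=True)
    circle_in_units = np.pi * (RADIUS_M / 1000.0) ** 2 / 1e5  # ~1.0
    return np.asarray(counts) / circle_in_units
```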

Evidence of marine methane hydrates has been reported in eight main regions along the European continental margins (Figure 2): offshore Greenland and Svalbard, the Norwegian margin, offshore the northern British Islands, the southern Iberian and northwest African margins (the Gulf of Cádiz and Alborán Sea), and the Black, Marmara and eastern Mediterranean seas. These areas show a high density of high-quality data resulting from several scientific oceanographic cruises and an intensive scientific survey. Outside the limits of these areas, the lack of evidence and indicators is obvious. In our opinion, the reliability of this lack of evidence is controversial (Figure 6). Although the majority of European continental margins have been prospected for the oil industry, some deep ocean basin areas have not. Therefore, some areas with a lack of evidence (possibly located in deep ocean basins) could be treated as information gaps resulting from a lack of prospection or scientific fluid flow research.

**Figure 6.** Knowledge gap assessment of hydrate evidence and indicators. Density map developed with the "point density" algorithm of ArcGIS®. Pixel value, number of items of evidence and indicators of hydrates per 100,000 km<sup>2</sup>. Parameters: population field, none; cell size, 5000; radius, 178,415 metres; areal units, square kilometres; method, geodesic.

Seafloor temperature and marine geothermal data have a heterogeneous distribution. Marine geothermal data appear to be concentrated with high density in some of the above-mentioned eight main regions with hydrate evidence surveyed by scientific cruises (Figure 7a). On the other hand, seafloor temperature data, the most sensitive variable in the theoretical calculation of the base of the GHSZ [63], are especially concentrated in the Black Sea and on the eastern Atlantic continental shelf (Figure 7b). For the two above datasets, areas with less than 1 record per 100,000 km<sup>2</sup> were selected as knowledge gaps (KG in Figure 7). These knowledge gaps are especially critical (i) in areas where direct hydrate samples have been recovered, (ii) in the vicinity of the up-dip limit of the GHSZ, and (iii) in areas where seabed fluid flow structures have been detected. The critical knowledge gaps for geothermal gradient data are east of Greenland, Svalbard–Barents Sea, the White Sea, northwest of the British Islands and the southeastern Mediterranean Sea; and for seafloor temperature they are east Greenland, the western Barents and White seas, the northern Black Sea and the southeastern Mediterranean Sea (north of Libya). In addition, because of the critical need to understand geothermal gradient in areas that have a relatively high spatial variance, high-resolution coverage is critical, in particular in order to assess the potential for uncertainty predictions for similar areas with no data.

**Figure 7.** (**a**) Knowledge gap analysis of geothermal gradient data. KG, knowledge gap of geothermal gradient data; CKG, critical knowledge gap of geothermal gradient data. (**b**) Knowledge gap analysis of seafloor temperature data. KG, knowledge gap of seafloor temperature data; CKG, critical knowledge gap of seafloor temperature data. Density maps developed with the "point density" algorithm of ArcGIS®. Pixel value, number of data per 100,000 km<sup>2</sup>. Parameters: population field, none; cell size, 5000; radius, 178,415 metres; areal units, square kilometres; method, geodesic.

In general, the public bathymetry data collected (EMODnet Bathymetry and IBCAO) have a quite acceptable quality and have been very useful for the objectives of this hydrate-related pan-European study. The original grid has a cell size of 100 × 100 m and the inference was calculated with a cell size of 5 × 5 km.

The EMODnet Bathymetry dataset has a quality value stored for each pixel/bathymetry datum calculated from derived bathymetric parameters: minimum water depth in metres to the lowest astronomical tide (LAT), average water depth in metres to the LAT, maximum water depth in metres to the LAT, standard deviation of water depth in metres, number of values used for interpolation over the grid cell, the interpolation flag (identification of extrapolated cells), average water depth smoothed by means of a spline function in metres to the LAT, an indicator of the offsets between the average and smoothed water depth as a % of the water depth, and a reference to the prevailing source of data with metadata. Unfortunately, this information (per pixel) is not available on the web portal. Consequently, quantifiable targets for calculating knowledge gaps are not available. However, through a visual analysis, areas with poor accuracy or lack of data were selected, especially on the North African Mediterranean margins (e.g., north of Libya). These areas were classified as bathymetry knowledge gaps (KG in Figure 8b,c). However, owing to the resolution of the inference (5 × 5 km), no critical gaps were determined for the bathymetric data.

## *5.2. Reliability of the Susceptibility Assessment*

In areas where hydrate evidence and indicators have been reported, the assumption of a salinity of 36 psu for the theoretical GHSZ calculated by Núñez-Varela [61] along the study area involves an error in the thickness calculation of ±2–10 m in the Arctic region (33–35 psu), ±5 m in the Mediterranean Sea (38 psu) and ±30 m in the Black Sea (17 psu) [64]. Nevertheless, considering the regional scale of the susceptibility assessment (cell size of 5 × 5 km), these errors are acceptable in the thickness calculation because they lie within the vertical precision of the assessment. The error introduced by the difference in salinity in the theoretical GHSZ is lower than the vertical precision of this calculation, which is set by the variation of the bathymetry over each 5 km along the continental slope: ±15–120 m in the Arctic region, and ±100–200 m in both the Mediterranean and Black seas. Regarding the Black Sea, although errors coming from salinity are still acceptable, the resulting susceptibility should be taken with caution, as values could be higher.

In order to assess the reliability of the susceptibility inference, a qualitative value of uncertainty (very high, high, middle, low and very low) was established as a function of the data density taken into account in the susceptibility calculation (Figure 9). The reliability (*u*) is thus equal to the sum of the density maps of geothermal gradient (*ρgr*), seafloor temperature (*ρst*) and hydrate evidence and indicators (*ρhy*):

$$u = \rho\_{gr} + \rho\_{st} + \rho\_{hy} \tag{2}$$

Five levels of reliability were established. The reliability is considered "very low" with values from 0 to 36 data per 100,000 km<sup>2</sup>, i.e., on average less than ca. 1 datum per 50 km; and "low", "middle", "high" and "very high" from 36 to 144, from 144 to 648, from 648 to 3149, and from 3149 to 15,218 data per 100,000 km<sup>2</sup>, respectively. These levels were defined by geometrical segmentation of the *u*-value, except "very low" and "low", which were defined by expert criteria. Very low reliability areas were catalogued as global knowledge gaps (KG in Figure 9) that are critical (CKG) in the vicinity of the up-dip limit of the GHSZ and hydrocarbon seabed fluid flow structures.
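A minimal sketch of Equation (2) and this five-level classification, assuming the three density grids share a common lattice:

```python
# Sketch: reliability u = sum of the three density maps (Equation (2)),
# binned into the five levels quoted in the text (data per 100,000 km^2).
import numpy as np

EDGES = [36, 144, 648, 3149]                       # class boundaries
LABELS = np.array(["very low", "low", "middle", "high", "very high"],
                  dtype=object)

def reliability(rho_gr, rho_st, rho_hy):
    u = rho_gr + rho_st + rho_hy
    return u, LABELS[np.digitize(u, EDGES)]        # 0..4 -> level names
```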

Areas located in the proximity of the continental shelf and intensively surveyed by oceanographic cruises show the most reliable results in the susceptibility assessment. By contrast, areas distant from the coastline (e.g., the mid-Atlantic Ocean) and areas that are inaccessible because of the presence of icebergs (e.g., east Greenland) or political issues (e.g., north of Libya) have very high uncertainty.

**Figure 8.** Knowledge gaps of bathymetry data. (**a**) EMODnet Bathymetry mosaic in the study area (cell size ca. 200 × 200 m). (**b**) Detail of knowledge gap on the Algerian margin. (**c**) Detail of knowledge gap on the Libyan margin. Details downloaded from the EMODnet Bathymetry web portal (https://portal.emodnet-bathymetry.eu/; accessed on 1 January 2021).

**Figure 9.** Reliability of the susceptibility assessment of hydrate presence on the European continental margins. (**a**) Reliability prediction in the inference of the susceptibility of the seafloor to the presence of hydrates. KG, global knowledge gap; CKG, global critical knowledge gap. (**b**) Reliability and susceptibility assessments superimposed.

## *5.3. Spatial Significance of the Susceptibility Assessment and the Impact of Knowledge Gaps*

Owing to the methodology applied, the hydrate evidence knowledge gaps are directly related to low values of susceptibility. These knowledge gaps may have two possible causes: (i) the catalogue is incomplete because these areas have been poorly surveyed and no records have been recovered, but hydrates may exist, so a high susceptibility may be latent; and (ii) there are no data because there is no evidence of hydrates. In order to resolve this uncertainty, two concepts are added to the susceptibility assessment: the up-dip limit of the GHSZ and the presence of hydrocarbon seabed fluid flow structures. Examples of this situation are the east Greenland shelf, the Irish margin, the western Iberian margin and the western Mediterranean Sea, where no hydrates have been recovered but hydrocarbon seabed fluid flow structures and seismic indicators (e.g., on the Irish margin) have been observed.

High susceptibility values are located in areas with a high density of evidence and indicators. The majority of the gas hydrate evidence stored in the database was recovered in focused seabed fluid flow structures such as mud volcanoes or pockmarks. This is especially significant on the southern European margins in the Gulf of Cádiz and the eastern Mediterranean and Black seas. In these cases, gas hydrates are confined to the feeder systems of the hydrocarbon fluid migration structures, which, subject to certain exceptions, do not exceed 0.1 to 1 km in diameter for pockmarks and 4 km for mud volcanoes. In these areas, there is therefore no continuous spatial variation in the presence of hydrates. Gas hydrates appear in a localized distribution (nugget effect) focused inside the hydrocarbon fluid flow structures, where fluid migration is mainly controlled by faults [45,65]. However, the presence of hydrocarbon fluid flow structures shows a continuous spatial variation in fluid leakage areas. In these areas, the density map shows where hydrate-bearing fluid flow structures are more probable and, subsequently, where the likelihood of the seafloor suffering gas hydrate dissociation processes as a result of natural or human activities could also be high. Finally, although the susceptibility could be high in mud volcano fields, for instance, the real risk or magnitude of dissociation processes will be low because of the typology and internal structure of hydrates inside the sediment. In mud volcanoes, hydrates constitute small (millimetre- to centimetre-scale) crystals or aggregates, and their real volume is low.

Moderate susceptibility values seem to be controlled by the GHSZ and in particular by the optimal theoretical environmental conditions for hydrate presence on the continental shelves of the Arctic region and Mediterranean Sea. In our opinion, the presence of moderate values on the eastern continental shelf of Greenland and their absence on the western Norwegian shelf is directly related to the presence of cooler bottom water masses on the eastern continental shelf of Greenland and the subsequent influence on the theoretical GHSZ. Although no hydrates have been recovered in the Mediterranean Sea, owing to the particular seafloor temperature/pressure conditions (bathymetry) on the continental slope, this area has a slightly elevated likelihood of occurrence of hydrate dissociation processes in the hypothetical presence of hydrocarbon gases in the sediment column.

The future global warming scenario projected by the scientific community [66–68] will increase the assessed susceptibility through the direct effect of the ocean temperature increase on gas hydrate stability [69] and through the isostatic rebound in polar and sub-polar areas [9]. This direct effect (seafloor temperature increase and effective seafloor pressure decrease) will have a high impact on the eastern Greenland shelf, the northwest Norwegian margin, Svalbard and the Barents Sea, and subsequently the susceptibility in these areas will increase greatly. Furthermore, future changes in the thermohaline circulation [67,68] could have dramatic effects at high latitudes on the seafloor temperature and subsequently on hydrate stability.

## **6. Conclusions**

Geological and geophysical evidence and indicators of the presence of marine gas hydrates and the oceanographic variables controlling the GHSZ show a heterogeneous distribution and knowledge gaps (areas with <1 record per 100 km<sup>2</sup>) along the European continental margins. Some of these knowledge gaps have been classified as critical: (i) for the geothermal gradient, data on the east of Greenland, Svalbard, the northern Norwegian margin, the southern Barents Sea and the White Sea, the north of the British Isles, the Gulf of Cádiz, the Bay of Biscay, the north-western Iberian margin and the southwestern Mediterranean Sea; and (ii) for seafloor temperature, data on the east of Greenland, the western Barents Sea and White Sea, the northern Black Sea and the south-eastern Mediterranean Sea.

The susceptibility assessment of the occurrence of hydrate dissociation processes on the seafloor shows high values in Svalbard, the northern Norwegian margin—Barents Sea, the continental slope of the mid-Norwegian margin and the North Sea, the Gulf of Cádiz and the eastern Mediterranean and the Black Sea. Moderate values are observed on the continental shelf of western Greenland, the northwest of the British Isles and the continental slope of the western and northern Mediterranean Sea.

**Author Contributions:** Conceptualization and writing—original draft preparation, R.L. and C.J.G.-M.; methodology, validation, formal analysis and writing—review and editing, R.L., M.L. and C.J.G.-M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 731166, GARAH project (GeoERA- GeoE.171.002 GE-1), EMODnet Bathymetry—High Resolution Seabed Mapping (EASME/EMFF/2018/007).

**Data Availability Statement:** Data used in this paper are available in a public and permanent repository (https://data.mendeley.com/datasets/vbt6hspgpn/draft?preview=1, (accessed on 20 March 2020)) with doi:10.17632/vbt6hspgpn.1.

**Acknowledgments:** We thank C. Guardiola, A. Lounds, the WP3 team of the GARAH project (P. Mata, C. Rochelle, A. Burnol, T. Nielsen, J. Hopper, I. Reguera, M. Stewart, S. Cervel and U. Larsen), and the MIGRATE COST Action coordination and WP1 teams (K. Wallmann, S. Bünz, T. Minshull, H. Marín-Moreno, J. Schicks., J. Bialas, G. Cifci, M. Giustiniani, J. Hopper, V. Magalhaes, Y. Makovsky, M.D. Max, T. Nielsen, S. Okay, I. Ostrovsky, N. O'Neill, A. Plaza-Faverola, C. Rochelle, S. Roy, K. Schwalenberg, K. Senger, S. Vadakkepuliyambatta, A. Vasilev).

**Conflicts of Interest:** The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analysis or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

## **References**


## *Review* **Recent Advancement in Assessment and Control of Structures under Multi-Hazard**

**Matin Jami <sup>1</sup> , Rajesh Rupakhety 1,\* , Said Elias <sup>2</sup> , Bjarni Bessason <sup>3</sup> and Jonas Th. Snæbjörnsson <sup>4</sup>**


**Abstract:** This review presents an up-to-date account of research in multi-hazard assessment and vibration control of engineering structures. A general discussion of the importance of multi-hazard consideration in structural engineering, as well as recent advances in this area, is presented as a background. In terms of performance assessment and vibration control, various hazards are considered with an emphasis on seismic and wind loads. Although multi-hazard problems in civil engineering structures are generally discussed to some extent, the emphasis is placed on buildings, bridges, and wind turbine towers. The scientific literature in this area is vast with rapidly growing innovations. The literature is, therefore, classified by the structure type, and then, subsequently, by the hazard. Main contributions and conclusions from the reported studies are presented in summarized tables intended to provide readers with a quick reference and convenient navigation to related publications for further research. Finally, a summary of the literature review is provided with some insights on knowledge gaps and research needs.

**Keywords:** multi-hazard; earthquake; wind; flood; hazards; hurricane; mitigation; resilience; risk assessment; bridge; building; wind turbine; control system

## **1. Introduction**

Natural hazards, such as earthquakes and wind forces, pose a challenge for human safety and comfort. Forces generated by these natural processes can damage, or even collapse, vulnerable civil engineering structures. The risk to lives and properties posed by natural hazards increases with urbanization, where large cities and metropolitan areas get more and more densely populated. Increasing urbanization and shortage of land result in the need to build taller and more complex structures, which can be more vulnerable to lateral forces created by wind and earthquakes. The effects of natural hazards on civil engineering structures are, therefore, an important field of research.

Between 1998 and 2017, natural disasters affected 4.4 billion people worldwide, caused 1.3 million casualties [1], and resulted in economic losses of 2900 billion USD. During this 20-year period, floods, storms, and earthquakes were the most frequent hazards, accounting for 43.4, 28.2, and 7.8% of all natural disasters, respectively. Although floods were the most frequent hazard during this time, earthquakes and storms have been the deadliest and the costliest, respectively. Floods and earthquakes, combined, killed nearly one million people and resulted in an economic loss of almost 2000 billion USD during this 20-year period. The frequencies, casualties, and economic losses caused by different types of natural hazards between 1998 and 2017 are shown in Figure 1. The numbers in Figure 1, which are based on the CRED report [1], clearly show that earthquakes and storms are the most damaging natural hazards. It is interesting to note that earthquakes have killed more people than all other hazards combined.


**Figure 1.** Frequencies of different natural hazards and their effects during 1998–2017 (based on CRED report [1]).

Different natural processes affect structures and people in different ways. While the simultaneous occurrence of two different types of damaging hazards, such as strong wind and earthquake, is rare, some natural processes can induce secondary hazards. For example, fire and landslides are known to occur after strong earthquakes (see, for example, Ravankah et al. [2]). Moreover, a structure may be exposed to different types of natural hazards, albeit not simultaneously, during its lifetime. Therefore, it needs to be resistant to forces and damage mechanisms imposed by more than one natural process. Structures optimally designed for actions from one type of natural hazard may not necessarily be well equipped to deal with actions from all types of hazards. This leads to the need for hazard mapping, considering different types of natural processes and their interdependencies.

Consideration of multiple hazards in urban development is gaining popularity in the research community. For example, Bathrellos et al. [3] studied probabilities of incidence of floods, landslides, and earthquakes in a specific area in Northeastern Greece, to map multiple hazards and identify areas suitable for urban development. Hicks et al. [4] explore disaster risk reduction from a multi-hazard perspective. Regional multi-hazard mapping for urban development is gaining popularity in research (e.g., [5]). Vulnerability and design of structures against multiple hazards is also gaining popularity in research. As an example, Aly [6], as well as Aly and Abburu [7], discuss some fundamental differences between wind and earthquake-resistant designs of high-rise buildings. A review of studies on the vulnerability of buildings subjected to wind and earthquake forces is presented by Indirli et al. [8]. A framework for life-cycle loss estimation in tall buildings subjected to wind and seismic forces is presented by Venanzi et al. [9]. Civil engineering infrastructure, such as dams, bridges, roads, etc., are lifelines of modern society. Although multi-hazard risk assessment of infrastructure is challenging [10,11], it is an important tool to improve their safety and operability following natural disasters, which is instrumental for social resilience. Various factors affecting the costs and performance of infrastructure in a multi-hazard environment are

discussed in Ettouney and Alampalli [12]. Performance and fragilities of special structures, such as dams and floodwalls exposed to multiple hazards, are studied in Ardebili and Saouma [13] and Bodda [14].

Natural events, such as wind and earthquakes, impose dynamic forces on buildings and other civil engineering structures. Damage caused by such forces depends on the dynamic properties of the structure, as well as the characteristics of the wind forces and ground motion. In most cases, structural damage is a result of excessive vibration. Vibration control, which refers to reducing oscillations of structures exposed to dynamic forces, can, therefore, be used as a protective measure. Vibration control makes use of active, passive, or hybrid secondary devices that are installed on the structure and designed/tuned/actuated for optimal reduction in structural responses such as displacement, acceleration, etc. Base isolation, for example, has been a popular and effective protection against earthquakes (see, for example, [15–19]). Tuned mass dampers (TMD) and other supplemental damping devices of different designs and configurations have also been known to effectively reduce wind and earthquake-induced vibrations of different types of structures (see, for example, [20–22]). Vibration control systems can provide an alternative protection for existing structures where retrofitting or strengthening is considered too costly or not feasible, due to factors such as aesthetics, cultural aspects, etc. Control devices that are effective against the forces generated by one type of natural hazard might not be effective against other hazards. For example, base isolation systems, which are effective for seismic protection of structures, might cause an adverse response during strong wind [23]. Due to the uncertainties in the amplitude and frequency content of dynamic forces induced by wind and earthquakes, and their relationship with the properties of the affected structures, consideration of a multi-hazard scenario is especially important when designing vibration control systems.

This work is an attempt to bring together and synthesize valuable information and conclusions presented in a vast body of research literature on multi-hazard effects and control of civil engineering structures. The work is based on a review and synthesis of published literature. Relevant studies were searched through scholarly databases such as Web of Science, Google Scholar, and Scopus. The keywords used for searching were "multihazard", "vibration-control", "seismic control", "tuned mass dampers", "seismic fragility", and "life-cycle assessment". The search results were then narrowed first by scanning the titles of articles to include only those that indicated relevance to the multi-hazard problem, addressing one or more of the criteria: (a) hazard mapping/quantification, (b) performance assessment, (c) design and/or optimization, (d) fragility assessment, (e) life-cycle and/or cost-benefit analysis, and (f) vibration control. This resulted in more than 400 articles. The Abstract and Conclusion sections of these articles were then studied to further filter out studies that did not address the multi-hazard problem. This resulted in 220 articles. The references listed in these articles were then checked to search for more relevant articles. Special attention was given to state-of-the-art review studies. References listed in studies were checked in detail to search for additional relevant articles. In total, 263 articles were studied and are referenced in this work. Among these, there are 210 journal articles, 17 books/reports, 11 theses, 14 book chapters, and 11 conference papers. These include 18 state-of-the-art reviews.

Initial thematic development of the work was, first and foremost, based on the keywords listed in these articles. The keywords in these articles were extracted, and their frequencies were counted. In total, 699 unique keywords were found. Multi-word keywords were then replaced by a single word (called, here, a reduced keyword) that is representative of the scope of the work. For example, "vibration control" was reduced to "control", "risk assessment" was reduced to "risk", and so on. In some cases, such as "wind turbines", both words were retained. This resulted in 117 keywords. Similar keywords were then grouped together to identify themes/scopes. For example, "seismic", "ground motion", and "earthquakes" were placed under the theme of "Seismic". The number of occurrences of these themes were then counted, and the themes were ranked. Frequency distribution of the most frequent themes is presented in Figure 2. Some reduced keywords

with a low frequency of occurrence are therefore not considered useful in creating an overall theme of the subject being studied and are not shown in the figure. Control is the most frequent theme, and multi-hazard is the third most frequent theme. Seismic and wind loads are the most frequently considered hazards. In terms of structures, bridges, buildings, and wind turbines are frequent themes, while only a few (less than 10) occurrences of other infrastructure and lifelines were encountered. This thematic distribution of the studied articles was used to prepare the main structure of this paper, which is schematically presented in Figure 3.
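The keyword-reduction step described above can be pictured with a minimal sketch; the mapping table and the sample keyword list below are hypothetical stand-ins, not the authors' actual data:

```python
from collections import Counter

# Hypothetical reduction table: raw keyword -> reduced keyword/theme.
REDUCE = {
    "vibration control": "control",
    "risk assessment": "risk",
    "ground motion": "seismic",
    "earthquake": "seismic",
}

def theme_frequencies(keywords):
    """Count reduced-keyword occurrences across a list of article keywords."""
    reduced = (REDUCE.get(k.lower(), k.lower()) for k in keywords)
    return Counter(reduced)

print(theme_frequencies(["Vibration control", "ground motion", "earthquake", "wind"]))
# Counter({'seismic': 2, 'control': 1, 'wind': 1})
```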

"ground motion", and "earthquakes" were placed under the theme of "Seismic". The number of occurrences of these themes were then counted, and the themes were ranked. Frequency distribution of the most frequent themes is presented in Figure 2. Some re-




**Figure 2.** Frequency distribution of the most relevant themes extracted from reduced keywords.

**Figure 3.** Schematic representation of thematic classification and organization of the paper.


Background information about different types of hazards, multi-hazard scenarios, and associated vulnerability and risk is provided in Section 2. This section is not a state-ofthe-art review of these topics but rather background information for the rest of the paper (see Figure 3). The main part of the paper, which is the state-of-the-art review part of the paper, is briefly termed as assessment and control (see Figure 3). The review is based on the themes encountered in the studied literature. Topics such as fragility/vulnerability assessment, life-cycle assessment, multi-hazard assessment, and reliability assessment are covered under the assessment theme, while the control theme mainly deals with vibration suppression. As vibration control is the most dominant theme of the studied papers (see Figure 2), a brief literature review of different control systems is provided in Section 3. The review is primarily classified by two themes: namely, structure and scope of work. Bridges, buildings, and wind turbines are covered in Sections 4–6, respectively. Each of these sections is sub-divided into assessment and control sub-sections. The literature on bridges is dominated by the assessment theme, which is classified into secondary themes such as hazard type, type of bridge, and the main aims of the study. The literature on buildings and wind turbines contains several studies of multi-hazard vibration control. For each of these structures, the studies reviewed here are sub-classified into secondary themes of hazard, type of building/wind turbine, and the type of control device.

## **2. Risk: Hazard, Exposure, and Vulnerability**

Risk related to disasters (disaster risks) can include loss of lives, disrupted economy, damages to the environment, etc. Risk is linked to the combination of hazard, physical exposure, and vulnerability of the infrastructure. The roles of each of these factors are briefly reviewed in the following sections.

## *2.1. Hazard*

The definition of "hazard" in a broader sense is "any external or internal process or event that might degrade the performance of the system on hand" [12]. The United Nations General Assembly [24] defines hazard as "a process, phenomenon or human activity that may cause loss of life, injury or other health impacts, property damage, social and economic disruption or environmental degradation".

Natural events, such as storms, earthquakes, or floods, are well-known hazards with widespread potential to turn into disasters. While these events are mostly sudden and occur in a relatively short time window, slower processes, such as fatigue, corrosion, ageing, etc., can also impact structural performance over their life span.

Among the three elements that constitute disaster risk, hazard is the one that is mostly beyond human control. Nevertheless, a proper understanding of the occurrence frequency, spatio-temporal distribution, and intensity of the hazard is important for disaster risk reduction. Recent advances in sensing technology (data collection, processing, storage, and sharing capabilities) and modelling/computational tools have improved our understanding of how different hazards affect civil engineering structures. Hazards can be of different types. For example, they can be natural events, such as earthquakes, or man-made ones, such as explosions.

Different classifications of hazard have been proposed for multi-hazard studies. For example, Ettouney and Alampalli [12] discuss the classification of hazards based on Temporal, Frequency, and Newtonian characteristics. Temporal characterization distinguishes between simultaneous occurrence, segregation in time, and cascading effects. Frequency characterization distinguishes continuous processes, such as corrosion, from intermittent processes such as earthquakes. Intermittent processes can be further classified as frequent, intermediate, or rare. Newtonian characterization is another useful approach for hazard classification that is generally used in design codes. In design codes, hazards are generally quantified in terms of loads, such as wind load, earthquake load, etc. The impact of these loads can be quantified by different metrics, such as stress or deformation, and evaluated based on Newtonian mechanics. Such hazards have been termed Newtonian [12]. Other processes, such as corrosion, wear and tear, fatigue, etc., are termed non-Newtonian [12]. Natural hazards can also be classified based on their origin and the geo-atmospheric processes associated with them.


The idea that a structure needs to resist different types of hazards during its service life is well-established in civil engineering. For example, design codes and standards have provisions for different types of actions such as dead load, live load, wind load, seismic load, etc. Simultaneous occurrence of multiple actions is addressed in design codes through load combinations. Such recognition of multiple actions and load combinations does not encompass the real extent of multi-hazard effects and interactions. Multi-hazard generally refers to the concept where two or more hazards interact through structural performance. A multi-hazard interaction, for example, can impact risk due to a hazard when a decision regarding structural exposure and vulnerability against frequency, location, and amplitude of another hazard is made. For example, a change in design wind load and/or structural capacity can impact structural vulnerability to earthquake forces. Multi-hazard interactions may result in common or conflicting design solutions. For example, provision of structural ductility is beneficial for both blast and seismic loads.
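To make the contrast concrete, typical strength-design load combinations (shown here schematically in the style of ASCE 7; the exact factors vary between code editions) pair one dominant extreme action with reduced companion actions:

$$
1.2D + 1.6L + 0.5S, \qquad 1.2D + 1.0W + L + 0.5S, \qquad 1.2D + 1.0E + L + 0.2S
$$

where *D*, *L*, *S*, *W* and *E* denote dead, live, snow, wind and earthquake actions. Wind and earthquake never appear in the same combination, reflecting the traditional assumption that distinct extreme hazards do not act simultaneously, which is precisely the simplification that multi-hazard research re-examines.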

Padgett and Kameshwar [25] present a comprehensive classification of multi-hazard combinations for bridges. Although their classification was intended for bridges, it can be generalized for most civil engineering structures, as is presented in Figure 4. Classification of hazard, according to Figure 4, helps to understand potential interactions between different hazards through their effects on structures.

**Figure 4.** Classification of multi-hazard combinations [26] (modified from Padgett and Kameshwar [25]).

Multi-hazard consideration is important for structural safety and reliability. Duthinh and Simiu [27] present an interesting point regarding the traditional practice of treating different hazards independently and designing structural components based on the more demanding hazard. Taking the example of wind and earthquakes, they show that the ASCE Standard 7 provisions are not risk-consistent in the sense that, in regions affected by both strong wind and earthquakes, risks of exceedance of limit states can be up to twice as high as those in regions where only one of these hazards dominates. Kappes et al. [28] discuss the challenges of analyzing multi-hazard risk and existing frameworks to address those challenges. Zaghi et al. [29] present the limitations of modern design codes in adequately addressing multi-hazard risk and emphasize the need for a common nomenclature for multi-hazard design. They also mention several problems and challenges in the multi-hazard design of structures. Different aspects of multi-hazard approaches to mitigate risk to civil engineering infrastructure are further discussed in Gardoni and Lafave [30]. The implications of considering potential multi-hazard effects in the life-cycle cost analysis of an infrastructure are addressed by Jalayer et al. [31] and Fereshtehnejad and Shafieezadeh [32].

## *2.2. Exposure, Multi-Hazard Mapping and Planning*

In the context of disaster risk, the UNDRR (United Nations Office for Disaster Risk Reduction) defines exposure as "the situation of people, infrastructure, housing, production capacities and other tangible human assets located in hazard-prone areas" [33]. Exposure is a necessary factor for disaster risk. Exposure is one of the risk determinants that can be controlled, to some extent, by proper planning. Such decisions are, however, not feasible in cases of existing risk: for example, large cities already built in hazardous areas. Reducing exposure to multi-hazards is more challenging than if only a single hazard is considered. Urban planning, land-use policy-making, environmental protection decisions, etc., need to rely on, and benefit from, multi-hazard considerations.

Local and regional scale mapping of different types of hazards is essential in multi-hazard considerations when analyzing exposure. Although significant advancements have been made in mapping individual hazards, mapping multiple hazards is challenging due to the differences in their physical phenomena, measures of frequency/amplitude, impact on structures, etc.

Barua et al. [34] present a multi-hazard map for different districts of Bangladesh based on a local historical disaster database and a comparison of scenario hazard scales with those in other countries. Their study includes earthquakes, tornadoes, floods, and cyclones, which are combined through a weighting scheme. Pourghasemi et al. [35] present multi-hazard mapping of Fars Province in southern Iran. They consider floods, fires, and landslides. They test two different machine learning algorithms in predicting the distribution of these hazards based on historical data, and make use of different conditioning factors such as aspect, elevation, drainage, annual mean rainfall, etc. They highlight the importance of multi-hazard mapping in land-use planning, sustainable development, and watershed management in the study region. A similar study for the western region of Iran is presented in Pourghasemi et al. [36].

For multi-hazard mapping of relatively small areas, the Analytical Hierarchy Process (AHP) has been proposed as a suitable method (see, for example, [3]). It is a class of Multi-Criteria Decision Analysis (MCA) and relies on the connection between influencing factors and hazards rather than statistics from historical databases. In this sense, the method is subjective in assigning intensities and weights of different hazards. Some examples of AHP application in multi-hazard mapping can be found in Bathrellos et al. [3], Karaman [37], and Khatakho et al. [5].
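Since AHP weights derive from pairwise comparisons rather than historical statistics, a small sketch helps fix ideas. The comparison matrix below is hypothetical (e.g., flood vs. landslide vs. earthquake on Saaty's 1–9 scale); the weights come from the principal eigenvector, and the consistency ratio checks whether the judgments are coherent:

```python
import numpy as np

# Hypothetical pairwise comparison matrix for three hazards
# (rows/columns: flood, landslide, earthquake) on Saaty's 1-9 scale.
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)          # index of the principal eigenvalue
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                         # priority weights of the hazards

# Saaty's consistency check: CI = (lambda_max - n) / (n - 1),
# CR = CI / RI, with random index RI = 0.58 for n = 3.
n = A.shape[0]
cr = (eigvals.real[k] - n) / (n - 1) / 0.58
print(w, cr)                         # judgments are acceptable when CR < 0.1
```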

## *2.3. Vulnerability and Risk*

Vulnerability lies within the characteristics or properties of the elements (structures) at risk, making them susceptible to impacts of hazards. The UNGA [24] defines vulnerability as "the conditions determined by physical, social, economic and environmental factors or processes which increase the susceptibility of an individual, a community, assets or systems to the impacts of hazards". The concept of vulnerability is used in a broad sense and with different meanings in different fields. Vulnerability is the risk determinant that is the most feasible to manage/control/reduce through human action or interference. Vulnerability reduction is, therefore, one of the most effective forms of risk reduction. However, the quantification of the vulnerability of civil engineering structures even to individual hazards, such as earthquakes, is a challenging task with many uncertainties (see, for example, [38–40]). In a multi-hazard scenario, the overall vulnerability can be different from the vulnerability to a single hazard, which makes the definition of vulnerability especially challenging. Its complexity is due to the variations in structural material types, geometries, environments, exposure to hazards, usage, age, maintenance, and many other factors. Quantification of structural vulnerability to different types of hazards is a popular and growing research field. Structural vulnerability is mostly expressed in terms of fragility or vulnerability curves, which quantify, in a probabilistic sense, the chance of exceeding undesired states of damage conditioned on a given intensity of hazard. On a larger geographical scale, vulnerability classification of structures relies on general information about the structures, their usage, and exposure to hazards. Such classifications are commonly used for buildings. This method of vulnerability assessment was used by Nassirpour et al. [41] to rank school infrastructure in the Philippines, considering flood, wind, and earthquake hazards. A vulnerability assessment methodology for buildings, subjected to both single and multi-hazards, was presented by Schwarz et al. [42]. By following the principles of the European Macroseismic Scale 1998 (EMS-98, [43]), they developed vulnerability tables for different hazards (wind, flood, and earthquake). A framework to create multi-dimensional vulnerability models from vulnerability tables was also presented and applied in a few test cities in Germany. Gautam and Dong [44] present multi-hazard damage to structures in central Nepal caused by the 2015 Gorkha Earthquake and the 2017 Chhatiune Khola flash flood. A conceptual model for multi-hazard assessment of the vulnerability of historic buildings is presented by Ayala et al. [45], with an example application considering English parish churches. A comprehensive review of single and multi-hazard vulnerability and risk in historic urban areas is presented by Julia and Ferreira [46]. They also present interesting examples of the use of multi-hazard risk analysis in historic urban areas.
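A common concrete form of such fragility curves, used here purely as an illustration and not tied to any of the cited studies, is the lognormal model $P(D \geq ds \mid IM = im) = \Phi(\ln(im/\theta)/\beta)$, where $\theta$ is the median intensity at which the damage state is reached and $\beta$ the dispersion; the parameter values below are hypothetical:

```python
from math import log
from statistics import NormalDist

def lognormal_fragility(im, theta, beta):
    """Probability of reaching or exceeding a damage state at intensity im.

    theta: median intensity at which the state is reached; beta: dispersion.
    """
    return NormalDist().cdf(log(im / theta) / beta)

# Hypothetical damage state: median PGA 0.4 g, dispersion 0.5.
for pga in (0.1, 0.2, 0.4, 0.8):
    print(pga, round(lognormal_fragility(pga, theta=0.4, beta=0.5), 3))
```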

An interesting methodology for the risk evaluation of offshore structures subjected to ocean waves, wind, and ground motion is proposed by Bhartia and Vanmarcke [47]. They consider failure probabilities under short-term loads, as well as overall risks due to loads of different intensities. Their results show that limit states (of failure), structural characteristics, and features of different types of loads interact in a complex way, controlling the relative importance of different hazards. Aggravating effects in a multi-hazard scenario are clearly demonstrated in their case-study example of ambient (ever-present wind over the sea) and seismic loads. An outline for the identification of different hazards and subsequent risk assessment was introduced by the United States Federal Emergency Management Agency in 1997 [48]. Ciurean et al. [49] present a comprehensive report of recent developments in multi-hazard processes and risks related to research, policy, and industry.

## **3. Vibration Control of Structures**

Most of the structural damage caused by natural forces can be attributed to excessive vibrations. For static loads, vulnerability reduction can be achieved at the design stage by increasing the stiffness and/or strength of structural elements. For existing structures, retrofitting strategies also aim to improve the strength and/or stiffness of the structural elements. Similar strategies can also be used for dynamic forces such as wind and earthquakes, but newer and potentially more cost-effective solutions emerging in the scientific research are percolating to practical applications. These new solutions are not necessarily about increasing structural stiffness and/or strength. They fundamentally rely on changing the dynamic properties of the structures to make them less vulnerable to natural forces. This can, contrary to retrofitting in the traditional sense, even make the overall structure more flexible. A notable example is the well-established base-isolation technology for reducing earthquake-induced vibrations of buildings and bridges. Vibration reduction, also known as vibration control, makes use of different types of devices installed on the structure to reduce vibrations caused by different types of forces. While base-isolation, supplemental damping, and bracing systems to increase lateral stiffness and ductility have been researched and used for a long time, newer control strategies that rely mainly on dynamic devices installed on the primary structure are emerging. Depending on their mode of operation and need for external energy and/or internal feedback mechanisms, vibration control devices can be broadly classified as passive, active, semi-active, or hybrid systems.


Detailed definitions and the basic theory behind these different classes of structural control devices can be found in the seminal work of Housner et al. [50]. Most of these control concepts have been extensively investigated, and many of these systems are already installed in different types of structures.

Passive control devices are the ones most frequently used, as they do not require an external energy supply. Tuned mass dampers (TMDs), fluid viscous dampers (FVD), tuned liquid dampers (TLD), and seismic base isolation (BI) are the most popular passive control systems.

An early state-of-the-art review of seismically base-isolated buildings is presented by Kelly [51], Buckle and Mayes [52], and Jangid and Datta [17]. They discuss different types of base isolation systems and summarize findings of the contemporary literature about their performance in seismic response control, in addition to presenting a parametric study of crucial design parameters for optimal reduction in seismic response. Patil and Reddy [19] present a state-of-the-art review of base isolation systems in seismic response mitigation. They focus on design code provisions for isolated structures and discuss effects of soft-soil and near-fault ground motions. Kunde and Jangid [18] present a state-of-the-art review of seismically isolated bridges and identify some knowledge gaps in the contemporary literature. Soong and Spencer [21] discuss different types of supplemental energy dissipation systems, including passive and active dampers for structural control. They provide an informative timeline of the development of these control technologies and describe the state of the art in the context of seismic-resistant design and the retrofitting of structures. A review of the behavior of structures with passive control systems exposed to seismic loads is presented by Buckle [52]. This study discusses the advantages of passive systems in seismic design and provides several examples of their successful applications. It also highlights limitations of passive control, considering uncertainties in seismic forces and limit states induced by unexpectedly demanding events, and points towards the need for better practical guidelines in their design and implementation.

A comprehensive review on the response control of structures by TMDs is reported by Elias and Matsagar [22]. They review different configurations of dampers, involving one or more tuned masses, installed at different locations of the structure. They report that the findings in the literature support effectiveness of TMDs in reducing wind and earthquake-induced vibrations of certain types of structures. They also identify potential obstacles, such as robustness and reliability, across different levels of loading, especially those that exceed the yield limit, causing inelastic deformations in the structure.
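As background on how such devices are tuned, the classical Den Hartog rules for a TMD attached to an idealized undamped primary structure under harmonic excitation express the optimal frequency and damping ratios as functions of the mass ratio alone; the sketch below evaluates them for an assumed mass ratio not taken from the cited studies:

```python
from math import sqrt

def den_hartog_tmd(mu):
    """Classical Den Hartog tuning for a TMD on an undamped primary system.

    mu: ratio of TMD mass to the primary modal mass.
    Returns (optimal frequency ratio, optimal TMD damping ratio).
    """
    f_opt = 1.0 / (1.0 + mu)                          # damper/structure frequency
    zeta_opt = sqrt(3.0 * mu / (8.0 * (1.0 + mu) ** 3))
    return f_opt, zeta_opt

# Assumed mass ratio: TMD mass equal to 2% of the modal mass.
f, z = den_hartog_tmd(0.02)
print(f, z)  # ~0.980 and ~0.084
```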

A state-of-the-art review of different types of structural control systems was presented by Saeed et al. [53]. Their review includes different control technologies that can be classified as active, semi-active, passive, or hybrid. They conclude that control systems have a huge potential and importance in modern structures.

Symans and Constantinou [54] present a detailed review of semi-active control systems for the seismic protection of structures and conclude that different solutions, such as stiffness control devices, electrorheological dampers, friction control dampers, fluid viscous dampers, etc., have the potential of practical feasibility in full-scale structures. Spencer and Nagarajaiah [55] also report on the state of the art of semi-active technologies for structural vibration control. They report that smart damping devices, such as Magnetorheological (MR) dampers, appear to combine desirable features of both passive and active control solutions, and they offer a viable control solution against wind and earthquake forces.

The literature on the control of structures against wind or seismic forces is vast. As explained above, the state of the art and recent findings in structural control of different kinds have been presented in many works. As an example, one of the first comprehensive state-of-the-art reviews of structural control systems was published more than two decades ago by Housner et al. [50]. Although the performance of different control schemes in a single-hazard scenario, such as wind or an earthquake, is well known and summarized in many works, structural control in a multi-hazard scenario is an emerging field of research. While some interesting research has been published in this field, there is a lack of an overview of the state of the art, ongoing progress, and future directions. Most of the literature in this regard is on seismic and/or wind-induced response reduction in bridges, buildings, and wind turbines. These topics are dealt with separately in the following sections.

## **4. Multi-Hazard Assessment and Control of Bridges**

Bridges are lifelines of modern society. They are vulnerable to different hazards, as evidenced by several failures in the past. On 19 August 2016, a suspension railway bridge in Tolten, Chile collapsed due to train-induced vibrations [56]. On 29 August 2005, during Hurricane Katrina, the Twin Spans Bridge connecting New Orleans to Slidell, Louisiana, United States, suffered extensive damage [57]. On 21 July 2003, Kinzua Bridge in Pennsylvania, United States, was hit by a tornado with 100 mph (45 m/s) winds and collapsed [58]. On 14 January 2003, the Sgt. Aubrey Cosens VC Memorial Bridge in Ontario, Canada collapsed [59] due to fatigue-induced failure of the steel hanger rods supporting the deck. On 17 January 1995, a bridge on the Hanshin Expressway in Kobe, Japan collapsed during the Kobe Earthquake [60]. During the Loma Prieta earthquake in 1989, two famous bridges (the Cypress Street Viaduct and the San Francisco–Oakland Bay Bridge) in California, USA were heavily damaged [61,62]. The collapse of these two bridges killed forty-one persons.

Safety and reliability of bridges are controlled by a diverse set of factors related to the structural form, function, maintenance, and the hazards they are exposed to. Multi-hazard consideration is, therefore, emerging as an important topic in bridge design and safety assessment. Some of the recent advancements in this field are discussed in the following.

## *4.1. Multi-Hazard Assessment of Bridges*

To understand the consequences of multi-hazard effects on the safety/reliability of bridges, a wide range of experimental and analytical studies have been conducted and reported in the literature. One of the most studied scenarios is the interaction of earthquakes with other actions: for example, traffic-load. The interaction between these hazards, when they occur concurrently, can be amplifying or diminishing (see, for example, [63,64]). Cascading effects might also be observed when bridges, partly damaged by earthquakes, are exposed to traffic (see, for example, [65,66]). Ground shaking and liquefaction induced by earthquakes can have complex interactions in bridge response, both amplifying and diminishing (see, for example, [67,68]). Another multi-hazard scenario for bridges is the simultaneous occurrence of high waves and hurricane surge (see, for example, [69–73]). Another scenario is foundation scour due to floods, which may increase the seismic vulnerability of bridges [74–81].

Aging and corrosion of bridge elements cause structural deterioration that can amplify the effect of other hazards, such as earthquakes or wind forces. The effect of deterioration caused by seismic and traffic loads on reinforced concrete bridges is addressed by Deco and Frangopol [82], Choe et al. [83], Kumar et al. [84], Choe et al. [85], Gardoni and Rosowsky [86], Choine et al. [87], Rokneddin et al. [88], and Biondini et al. [89]. Long-span bridges are especially sensitive to wind forces, but they can also be affected by seismic excitation. A framework for the assessment of the vulnerability of long-span bridges subjected to multi-hazards (seismic and wind excitation) is presented by Martin et al. [90]. A summary of recent advances in wind effects on long-span bridges, from a multi-hazard perspective, is presented in Chen et al. [91]. Studies on the multi-hazard effects and performance of bridges are summarized in Table 1.


**Table 1.** A summary of published works on bridges subjected to multi-hazard.



## *4.2. Multi-Hazard Vibration Control of Bridges*

Although the literature on multi-hazard vulnerability and the risk assessment of bridges is vast, retrofitting bridges for multi-hazard protection is an emerging research topic that is gaining interest. Chandrasekaran and Banerjee [113] consider three different retrofit strategies to enhance bridge performance under multi-hazard conditions. Wang et al. [76] note that increasing foundation stiffness can be more beneficial than increasing foundation depth in reducing the seismic vulnerability of bridges subjected to scour. Sung and Su [96] use time-dependent fragility curves to estimate the total direct costs of neutralized RC bridges as a function of ground motion intensity and service time and propose them as a tool to time retrofit campaigns. The benefit of the base isolation system as a control/retrofit solution for increasing the reliability of steel bridges subjected to ground shaking and liquefaction hazards is demonstrated in Wang et al. [124].

To the best of our knowledge, multi-hazard considerations in vibration control of bridges has not been reported in the literature yet.

## **5. Multi-Hazard Assessment and Vibration Control of Buildings**

This section provides a review of studies related to building response to multi-hazard, with emphasis on wind and seismic forces. Building response to seismic forces is controlled by various factors, such as amplitude, duration, and frequency content of ground shaking. It also depends on the characteristics of the building itself and the underlying soil properties. Larger earthquakes produce ground motion with more energy at lower frequencies than smaller earthquakes. Large earthquakes are hazardous to all buildings, particularly to those that have natural frequencies close to the dominant frequency of ground shaking. Such a phenomenon has been observed in ground shaking and building response during past earthquakes (see, for example, [125,126]).

Wind loads contain energy at lower frequencies than seismic ground motions. Low to mid-rise buildings, with relatively high vibration frequencies, are therefore more susceptible to dynamic vibrations caused by seismic forces than those due to wind. Wind forces on such structures could, nevertheless, have undesired effects on components such as roofs, windows, chimneys, etc. Damage to light and improperly anchored roofs in low-rise buildings during strong wind is, therefore, of concern. In super tall buildings, wind generally induces a stronger displacement response than earthquakes. Seismic loads, however, might excite higher vibration modes of such structures, resulting in high floor accelerations. This implies low inter-story drift and, therefore, a lower risk of structural damage, but high floor acceleration can be critical for non-structural components [6,9,127,128]. From a structural point of view, wind loads are, therefore, critical for flexible structures, while seismic loads are more demanding on stiff structures. Occupant comfort and safety is another consideration when it comes to the response of buildings to wind and earthquake loads. Large floor accelerations can cause discomfort to occupants and may pose a safety threat due to moving objects. Floor accelerations in tall buildings are typically higher during moderate to strong ground shaking than during strong wind. Strong seismic loading is, however, typically less frequent than strong wind. From a serviceability point of view, wind action is, therefore, more critical for occupant comfort. Multi-hazard effects in buildings also need to be looked at from a life-cycle cost perspective and the accumulation of damage due to multiple events: for example, the wind response of a structure partially damaged by an earthquake, or vice versa. Damage accumulation and fatigue due to repeated loading from frequent actions, such as moderate to strong wind, is also an important consideration.

## *5.1. Multi-Hazard Assessment of Buildings*

Huang [129] provides a comprehensive account of the dynamic responses of high-rise buildings under multiple hazards, presenting performance assessment methods and case-study investigations of high-rise buildings in Hong Kong. Various factors, such as seismic source-to-site distance, recurrence periods, ground shaking amplitude, building height, damping ratios, properties of wind forces, etc., were considered in the analysis. The results show that seismic loads result in a higher floor acceleration response and higher lateral forces but weaker torsional forces and a lower displacement response compared to wind forces. The height of the buildings was also found to be an important parameter, with wind response being more sensitive to variation in height than seismic response. The results also showed that wind response is more strongly influenced by the level of damping of the building than seismic response.

Chen [127], and Rasigha and Neeladharan [128] report differences in the seismic and wind responses of mid-rise to high-rise buildings. Aly [6], as well as Aly and Abburu [7], present the responses of tall buildings subjected to wind and seismic forces. In these assessments, two tall buildings (76-story and 54-story) were considered for finite element analysis. They found that ground motions excite higher vibration modes in buildings, resulting in lower inter-story drift than wind forces but higher floor accelerations lasting for a shorter time. Wind actions are, therefore, critical from an occupant comfort and serviceability consideration. Tall structures designed for strong wind may possess an adequate capacity against moderate ground shaking, but they might suffer non-structural losses due to high floor accelerations. A framework for life-cycle loss estimation of non-structural damage in tall buildings under wind and seismic loads is presented by Venanzi et al. [9]. Their framework assumes that damaged structures are restored to their original condition after each hazardous event. Hazardous events are not simultaneous, and small maintenance costs are neglected. Their results show that, for drift-dependent damage, wind forces are costlier than seismic forces, whereas seismic forces are costlier for non-structural damage driven by high floor accelerations. These observations are consistent with results reported in other studies [6,7]. Antoun [130] studied the performance of a 74-story building located in Miami to evaluate the expected losses associated with multi-hazard (wind and earthquake) forces. Performance-based approaches were used for earthquake, wind, and hurricane forces. Monetary losses corresponding to structural and non-structural damage, as well as occupant discomfort, were estimated. The study reports that losses due to façade damage are dominant at high probabilities of exceedance, whereas structural damage becomes dominant at lower probabilities of exceedance.

Zhang et al. [131] proposed a framework for the damage risk assessment of high-rise buildings exposed to wind and seismic forces acting separately and concurrently. They used recorded earthquake and wind data, spanning about 47 years, to estimate hazard curves for wind and seismic forces as well as copula-based bi-hazard surfaces. They then performed a multi-hazard fragility assessment and estimated damage probabilities for separate and concurrent hazard models. Their results show that the damage probability due to bi-hazards dominates the total damage probability in most damage states. They highlight the need for multi-hazard considerations in the design and evaluation of tall structures subjected to wind and seismic forces. Damage risk assessment and cost-benefit analysis of mitigation strategies in residential buildings subjected to hurricane and seismic forces are discussed in Li [132], giving a comprehensive overview of the factors that are important in risk assessment, as well as their roles and impacts in hazard mitigation. The risk-cost-benefit framework presented by Li [132], based on life-cycle and scenario-case analyses, incorporates probabilistic modelling of hazards, structural fragility, and expected costs during different service intervals.

Multi-hazard consideration in performance-based engineering and performance-based design criteria, addressing wind and seismic forces, has been researched extensively in the literature. Chiu and Chock [133] present one of the first applications of the performance-based engineering approach in a multi-hazard scenario.

A probabilistic framework for the multi-hazard risk assessment of reinforced concrete buildings subjected to seismic and blast loads is discussed in Asprone et al. [134]. The annual risk of structural collapse, considering seismic action and progressive collapse due to blast forces, is formulated in this study. They conclude that Monte Carlo (MC) simulation is suitable for calculating the probability of progressive collapse, as well as for identifying critical blast scenarios.

The multi-hazard performance of different structural elements, such as columns, frames, plates, and walls, has been reported by many researchers. The resistance capacity of precast segmental columns subjected to impact and cyclic loading is investigated experimentally by Zhang et al. [135]. They found that, compared to monolithic columns, segmental columns (precast segments joined together, often with pre-stressed tendons) possess better ductility and sustain lower residual drift under cyclic loading. Under impact loading, segmental columns were found to have better self-centering capacity. They showed that the shear resistance of such columns can be significantly improved by introducing concrete shear keys, but this comes at some cost related to stress concentration and potential damage to the concrete segments.

Rachel [136] presents a methodology for the resilience assessment of buildings subjected to seismic, wind, fire, and various post-earthquake scenarios. The results of this study showed that post-earthquake fire resilience in moment frame buildings is independent of seismic damage if frame connections are intact. The results also showed that multi-hazard resilience of moment resisting frame buildings can be improved by strengthening and/or fire-proofing gravity columns. Shin [137] presents multi-hazard performance evaluation matrices for retrofitted non-ductile reinforced concrete buildings subjected to seismic and blast loads.

Unnikrishnan and Barbato [138] investigated the effects of multi-hazard interaction on the performance of low-rise wood-frame buildings. Chulahawat and Mahmoud [139] present an algorithm to optimize building systems with suspended floor slabs subjected to wind and seismic hazards, and observe that tall buildings with such systems can be effectively optimized for both wind and seismic forces without a significant trade-off in performance under the individual hazards.

## *5.2. Multi-Hazard Vibration Control of Buildings*

Vibration control of buildings subjected to wind or seismic forces has been extensively researched. Vibration control of buildings in multi-hazard scenarios is, on the other hand, not as extensively studied. Some important studies in this area are summarized in Table 2. Performance assessment of control devices, their optimization, and life-cycle cost analysis are the main issues addressed in these studies, and wind and earthquake forces are the most commonly considered hazards. Most of these studies present traditional control systems such as passive TMDs, passive energy dissipation devices, viscous fluid dampers, and multiple tuned passive TMDs. Some recent advances in this area include inerter-based TMDs (Djerouni et al. [140]; Djerouni et al. [141]; Djerouni et al. [142]; Marian and Giaralis [143]), glass curtain wall TMDs (Bedon and Amadio [144]), and sliding floor isolators (Chulahwat and Mahmoud [139]; Mahmoud and Chulahwat [145]).

**Table 2.** A summary of published works on vibration control of buildings subjected to multi-hazard.


## **6. Multi-Hazard Assessment and Control of Wind Turbines**

The tall and slender geometry of wind turbine towers and the large top mass of the turbine and the rotors make wind turbines sensitive to both wind and seismic excitation. Wind and seismic loading have been the two most common environmental actions considered for research on the performance assessment of wind turbine towers. For offshore turbines, wave loading is also an important factor.

## *6.1. Multi-Hazard Assessment of Wind Turbines*

Maryam [155], as well as Maryam and Gardoni [156], highlight the importance of multi-hazard consideration in the site selection and design of wind turbines. They present a multi-hazard probabilistic framework to evaluate the structural reliability of offshore wind turbines. Considering wind and seismic action, their results show that annual probabilities of failure are higher when seismic action is considered. Comparing two identical wind turbines, one in the Gulf of Mexico and the other off the coast of California, they conclude that, although the latter location is more favorable in terms of power production, its annual probabilities of failure are higher due to higher seismicity. Avossa et al. [157] present a Monte Carlo simulation-based framework for the estimation of multi-hazard fragility curves of wind turbine structures. They provide an example application of the framework, deriving failure probabilities of a prototype wind turbine structure conditioned on wind velocity and peak ground acceleration for different operational states of the turbine. Their results show that aerodynamic damping plays an important role in the seismic fragility. Fragility in the operational state, for seismic action in the fore-aft direction, increases with wind speed up to the rated wind speed, after which it starts to decrease. When the rotor is operating at the rated condition or is parked, the probability of failure is larger than 50% once the peak ground acceleration exceeds about 70% of the acceleration of gravity. Campo and Estrada [158] present similar conclusions regarding the importance of aerodynamic damping, stating that, while wind action is more damaging in the operational state, seismic action can be more threatening when the rotor is parked.

Katsanos et al. [159] report, for a 5 MW offshore wind turbine, that seismic action contributes more than wind and wave action to structural demands such as base moment and tower-top displacement. They also report on the fragility of sensitive equipment located in the nacelle, which is found to be prone to severe damage at moderate ground shaking intensity. Zuo et al. [160] investigated the fragility of a prototype 5 MW offshore wind turbine structure subjected to aerodynamic forces and wave loading. They considered different operational states of the rotor and derived fragility curves for both the supporting tower and the rotor blades. Their results show that, when the wind speed is between the cut-in and cut-out range, the exceedance probability of blade failure is much higher than that of tower failure. They also highlight the impact of aerodynamic damping in reducing wind-induced vibrations of the tower. Zuo et al. [161] studied the effect of soil-structure interaction (SSI) on the 5 MW offshore prototype model. Their results show that the fore-aft displacement demand on the tower is significantly affected by SSI. Asareh et al. [162] investigated the fragility of a 5 MW wind turbine prototype subjected to wind and seismic action. Their results show that failure due to exceedance of tower-tip displacement and rotation is more likely than yielding or buckling of the tower.

## *6.2. Multi-Hazard Vibration Control of Wind Turbines*

Vibration control of wind turbine structures, subjected to the combined actions of wind, waves, and earthquake ground motions, is extensively reported in the literature. These studies are mostly aimed at the optimization and performance assessment of control systems. A summary of relevant studies on the vibration control of wind turbine structures subjected to multi-hazard is presented in Table 3.


**Table 3.** A summary of published works on vibration control of wind turbine structures subjected to multi-hazard.



## **7. Concluding Remarks**

This work is an attempt to summarize a vast body of research literature on multi-hazard effects on structures and their vibration mitigation measures. Aspects such as performance assessment, fragility modelling, life-cycle cost assessment, and vibration control in a multi-hazard scenario are covered. The main emphasis is on wind and seismic actions on major infrastructure, such as bridges, buildings, and wind turbine towers. Understanding multi-hazard scenarios in a probabilistic sense and mapping them out for engineering design is an evolving field. At a local scale, multi-hazard mapping using the Analytical Hierarchy Process (AHP) is gaining popularity. Multi-hazard mapping at regional scales remains a challenging task, demanding more research on unifying frameworks that standardize and unify existing probabilistic hazard assessment methods used for different natural actions. Some recent advances in the multi-hazard vulnerability of buildings include multi-dimensional vulnerability modelling. Fragility modelling in a multi-hazard scenario is still a growing field of knowledge, with many unresolved questions. Examples of such unresolved issues relate to: (i) the definition of intensity measures of hazards that might interact with each other, resulting in overall effects of a different nature than those due to individual hazards; (ii) the definition of joint probabilities of exceedance of the intensities of different types of hazards; and (iii) the lack of empirical data on actual damage recorded in multi-hazard scenarios.

Multi-hazard assessment of bridges is a widely studied topic. Most studies in this area focus on seismic loads and corrosion. Other effects, such as wind loads, scour, traffic loads, etc., in conjunction with seismic loads, have also been reported. Published literature on bridges subjected to seismic loads and corrosion focuses on fragility assessment. Multi-hazard effects in such assessments are generally modelled through fragility increment functions, damage accumulation, and time-dependent fragility curves. Probabilistic load and resistance factors for different hazards affecting bridges are another widely reported research theme. In most cases, such fragility models are intended for life-cycle cost and risk analysis. While the multi-hazard fragility of individual bridges is widely reported, only a few studies deal with bridge networks. More research is needed in capacity modelling, risk metrics, stakeholder perceptions, and load combinations.

The literature on multi-hazard effects on buildings is dominated by wind and earthquake loads. Performance-based engineering frameworks, progressive damage and collapse modelling, resilience assessment, multi-hazard performance evaluation matrices, etc., are some of the recent advances in damage risk and life-cycle cost analysis of buildings. A variety of control systems, such as passive TMDs, passive energy dissipation devices, viscous fluid dampers, and multiple tuned passive TMDs, have been investigated for the control of buildings subjected to wind and seismic forces. Some recent advances in this area seem promising and practically appealing: for example, sliding floor isolators, glass curtain wall TMDs, and variable friction cladding connections (VFCC). While the literature on the vibration control of buildings subjected to wind or seismic action is vast, relatively few studies have addressed their simultaneous occurrence. More research is needed on the probabilistic treatment of multi-hazard load cases, the robustness of control devices against uncertainties in structural properties as well as loading, and the feasibility of control from a life-cycle perspective.

Seismic and wind forces are the two most considered environmental actions in the performance assessment and vibration control of wind turbine structures. Wave action, including the effects of wave/wind misalignment in offshore wind turbines, is also widely researched. Multi-hazard probabilistic frameworks for the reliability assessment of offshore wind turbines are relatively well established, and Monte Carlo simulation-based frameworks for multi-hazard fragility assessment have recently been emerging. For land-based wind turbines, several studies have highlighted the role of aerodynamic damping in the response to the combined action of wind and earthquakes. Most of the published work focuses on the structural fragility of the supporting tower, and offshore wind turbines subjected to wind and waves are the most investigated scenario. The most commonly reported control device is a passive TMD placed in the nacelle, although the use of a TMD on the platform of a barge-type floating turbine has also been reported to be effective. Most studies conclude that control devices are effective in reducing multi-hazard fragility. Recent advances in the multi-hazard control of wind turbines include interesting innovations such as braced viscous dampers and semi-active control systems. The effect of ground motion variability on control performance is an area that needs further study. Control performance against impulsive loads caused by, for example, near-fault ground motions (see, for example, Rupakhety et al. [186]; Elias et al. [187]; Sigurðsson et al. [188]; Jami et al. [189]) also needs to be investigated better. In addition, the fragility of rotor blades and the effects of drivetrain dynamics need more attention.

In most structural vibration control studies reported in the literature, the structure is assumed to remain elastic, which may not be realistic in extreme loading conditions. Inelastic deformations of the structure can de-tune the control device, resulting in lower performance. Control optimization and performance assessment of yielding structures subjected to multi-hazard scenarios, as well as damage accumulation due to multiple hazards occurring over the useful life of a structure, need to be investigated and better understood to facilitate practical applications of control systems in actual engineering projects.

**Author Contributions:** M.J.: Conceptualization, research, literature study, writing, and review; R.R.: methodology, writing, and review; S.E.: review; B.B.: writing, and review; J.T.S.: writing and review. All authors have read and agreed to the published version of the manuscript.

**Funding:** Matin Jami is supported by a doctoral grant from the University of Iceland. RR acknowledges support from the University of Iceland Research Fund.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


## **Debris Flow Classification and Risk Assessment Based on Combination Weighting Method and Cluster Analysis: A Case Study of Debris Flow Clusters in Longmenshan Town, Pengzhou, China**

**Yuanzheng Li 1,2, Junhui Shen 1,2,\*, Meng Huang 1,2 and Zhanghai Peng 1,2**


**Abstract:** Debris flows can damage infrastructure and threaten human life and property, especially in tourist attractions. Therefore, it is crucial to classify debris flows and evaluate their risk. This article takes 14 debris flows in Longmenshan Town, Pengzhou, Sichuan, China, as the research object. Based on on-site geological surveys, combined with drone images and multiple remote sensing images, the essential characteristics of each debris flow are comprehensively determined. A total of nine factors are used as the primary indicators affecting the risk of debris flow: drainage density, roundness, the average gradient of the main channel, maximum elevation difference, the bending coefficient of the main channel, the loose-material supply length ratio, vegetation area ratio, population density, and the loose-material volume of unit area. The subjective weights of each indicator are obtained using the Analytic Hierarchy Process, while the objective weights are obtained using the CRITIC method. On this basis, a distance function is introduced to couple the subjective and objective weights, determine each indicator's combined weights, and obtain the integrated evaluation score values of the different debris flow hazards. Based on the integrated evaluation scores, cluster analysis was used to classify the 14 debris flows, and cluster effectiveness indicators were introduced to assess the effectiveness of the classification. A quantitative standard for the risk of debris flow within the study area was proposed, and finally, a risk assessment of debris flow in the study area was made. Comparison of the results of this paper with the gray correlation method, the coupled synergistic method, and the geological field survey results proves that the proposed method is feasible and provides a reasonable scientific basis for the study of the hazard assessment of regional debris flow clusters and other related issues within the Jianjiang River basin and other areas.

**Keywords:** combination weighting method; cluster analysis; optimization; classification of debris flows; risk assessment; Longmenshan Town

## **1. Introduction**

Debris flow is a common geological hazard widely distributed in mountainous areas. It is a flow composed of water, rock, and soil, and its formation process is highly complex; its eruption is sudden and short-lived [1–3]. Due to their high density, strong fluidity, and fast flow velocity, debris flows are highly destructive. In recent years, frequent debris flows have significantly harmed human life, property, the economy, and the environment, especially in the seismically and geologically active zones of mountainous areas [4–6]. The CMLR (Chinese Ministry of Land and Resources) reports that thousands of disasters occur yearly in China, and mountainous hazards threaten 74 million
people. Specifically, during the decade from 2001 to 2010, mountain hazards caused 9933 deaths and missing persons, excluding the approximately 25,000 deaths caused by landslides, collapses, and mudslides during the Wenchuan earthquake [7]. For debris flow clusters located in active seismic zones and scenic areas in particular, with their dense populations and numerous road networks, a debris flow disaster would cause incalculable losses, so it is essential to classify debris flows and assess their risk in such special areas [8,9].

In recent years, research on the risk assessment of debris flows has mainly followed three approaches: numerical simulation [10], empirical analysis [11], and artificial intelligence [12]. Among them, numerical simulation can not only simulate the movement process of debris flows but also calculate flow velocity and travel distance. However, the parameters required for numerical simulation are generally difficult to obtain, and the simulation process is relatively complex, so the obtained results are not necessarily consistent with the actual situation. Moreover, simulating and calculating each debris flow individually is impractical for regional debris flow clusters [13,14]. Empirical analysis mainly depends on the on-site survey results of engineering geologists, in which human subjectivity plays a leading role. Different researchers may apply different evaluation criteria to the same debris flow risk, so empirical analysis must be combined with other methods to determine debris flow risk comprehensively [15].

With the development of artificial intelligence, extension theory [16], artificial neural networks [17], Bayesian theory [18], genetic algorithms [19], the evidence weight method [20], the grey correlation method, and other methods have been widely used in debris flow risk assessment [21]. When most algorithms are applied to debris flow risk assessment, complex risk assessment indicators need to be selected, indicator weights need to be calculated, and the final evaluation model needs to be determined. However, current methods of determining the weights of evaluation indicators still have shortcomings. Researchers often emphasize either objective indicators or subjective factors, whereas the combined impact of the two needs to be considered comprehensively. Some evaluation models require manual scoring and the determination of grading boundaries, leading to certain deficiencies in the evaluation results of debris flow risk. Therefore, understanding how to combine the influence of subjective and objective weights, and establishing a scientific evaluation model on this basis, has practical significance for the risk assessment of debris flows.

Through an on-site geological survey, combined with UAV images and remote sensing images, 14 debris flows were identified in Longmenshan Town, Pengzhou City, Sichuan Province, China, distributed on both sides of the Baishui River and the Jianjiang River: 5 along the Jianjiang River and 9 along the Baishui River. Considering the rapid and sudden occurrence of debris flows and the many surrounding villages and tourists, it is essential to classify and evaluate the risk of debris flows in this area.

## **2. Study Area**

Pengzhou City is located in the northwest of the Sichuan Basin, on the northwestern edge of the Chengdu Plain, 36 km from the urban area of Chengdu, spanning 103°40′~104°10′ E and 30°54′~31°26′ N. The city covers an area of 1421 km<sup>2</sup>. Longmenshan Town is located in the north of Pengzhou, upstream of the Jianjiang River, 55 km from the Pengzhou urban area, bordering Shifang City in the east, Cifeng Town in the south, the Dujiangyan Irrigation Project in the west, and Wenchuan County in the north. The research area is located in the core area of the Longmenshan Fault, with the Yingxiu-Beichuan Fault passing through the right bank of the Baishui River and the Guanxian-Anxian Fault passing through the downstream Xiaoyudong Town, so the area is structurally controlled. The basal tectonic layer in the study area is the Huangshuihe Group stratum, and the main rock type of the group is the "Pengguan Complex", which has experienced several strong orogenies (the Himalayan, Indosinian, and Chengjiang movements). Finally, neotectonic movements led to the formation of typical mountain canyon geomorphic features.

Longmenshan Town has a humid subtropical climate, with a summer maximum temperature of 24.8 °C and a January minimum of 5.2 °C. The multi-year average annual rainfall is 932.5 mm, concentrated in summer and falling primarily as rainstorms. Owing to the humid subtropical climate, the vegetation in the study area is mainly subtropical alpine forest. Before the 2008 earthquake, the vegetation coverage was over 60%. The "Longmen Mountain National Scenic Area" is located here and is a famous tourist resort. After the Wenchuan earthquake, geological disasters occurred frequently in the area, with debris flows of varying number and scale in 2008, 2009, 2010, 2012, and 2022, causing significant property losses and casualties. The distribution of the remote sensing images and debris flows in the study area is shown in Figures 1–3.


**Figure 1.** Remote sensing image of GF-6 and distribution of 14 debris flow gullies in the research area.

**Figure 2.** Remote sensing images of the research area (Pleiades).

**Figure 3.** Remote sensing images of the research area (GF-2).

## **3. Materials and Methods**

The calculation of the weight of debris flow indicators mainly adopts a single subjective and objective weighting method. The subjective weighting method obtains weights based on individuals' subjective experiences, such as the AHP. It has unique advantages in determining the weights of various indicators at different levels in an extensive system and can fully utilize expert experience in the corresponding field. The objective weighting rule relies entirely on the laws of the data, such as the CRITIC method, which can reflect the relative importance of various factors. Combining the AHP and CRITIC methods can reflect researchers' intuitive understanding of debris flow in the geological field survey stage while taking into account the regularity of the objective data, making the obtained weights more scientific.

The classification of debris flow is a straightforward guide to preventing debris flow disasters, and classification methods are well established [22]. However, because there are many classification methods, the same debris flow can have different classification results under different criteria. The affinity propagation cluster analysis method, which is suitable for analyzing various geostatistical data, is applied to classify debris flows in this paper [23,24]. The final result of debris flow hazard evaluation is obtained based on the calculated debris flow index weights combined with the classification of debris flow hazard. The risk assessment process is shown in Figure 4.

**Figure 4.** Debris flow risk assessment process.

## *3.1. Indicator Selection*

The selection of risk assessment indicators for debris flow mainly considers the primary conditions for the formation and development of debris flow disasters. From the perspective of quantitative evaluation requirements, specific indicators need to reflect the debris flow risk. Geological conditions, material conditions, and trigger conditions play a crucial role in the distribution and activity of debris flows. When selecting debris flow indicators, it is necessary to consider their scientific soundness, representativeness, comprehensiveness, and regional differences. Table 1 shows the indicators selected by researchers worldwide in recent years in studies of debris flow risk assessment; there are clear differences in the selection of evaluation indicators in different regions. Rainfall within the study area is essentially uniform, the debris flow frequency and soil particle size are difficult to obtain accurately, and the stratum lithology and fault length are reflected to a certain extent by the amount of material sources; the study area is also a scenic area with a relatively dense population. Therefore, based on on-site geological surveys combined with the analysis of drone images and multiple remote sensing images, nine factors that are important and closely related to the occurrence of debris flow in the research area were selected as essential indicators for debris flow risk assessment. These are drainage density, roundness, average gradient of the main channel, maximum elevation difference, bending coefficient of the main channel, loose-material supply length ratio, vegetation area ratio, population density, and loose-material volume of unit area. This article utilizes multiple remote sensing images, digital elevation models (DEM), drone stereo aerial photography, and field investigations (Figures 1–3 and 5) (Table 2) to obtain the values of the various debris flow risk assessment indicators in the study area through calculation and delineation.

**Table 1.** Factors frequently used in risk assessment of debris flow.


**Figure 5.** Research area digital elevation model (DEM).

**Table 2.** Data Description.

Drainage density (F1) (km/km<sup>2</sup>): The ratio of the total length of gullies developed within the debris flow basin area to the basin area, comprehensively reflecting the engineering geological conditions within the watershed. This value is calculated using ArcGIS geometry from remote sensing images.

Roundness (F2) (km/km<sup>2</sup>): This refers to the ratio of the length of the main gully of a debris flow to the basin area. In general, at different stages of debris flow development, the plan form of valleys varies, and the degree of danger also varies. This value is calculated using ArcGIS geometry from remote sensing images.

Average gradient of main channel (F3) (°): The ratio of the maximum elevation difference of the main channel to its linear length. The larger the value, the better the hydrodynamic conditions. The value is obtained through the DEM.

Maximum elevation difference (F4) (m): The difference between the highest and lowest elevations in the basin, which provides the kinetic conditions for the occurrence of debris flow disasters. The value is obtained through the DEM.

Bending coefficient of main channel (F5): This refers to the ratio of the main channel length to its linear length, which reflects the degree of channel blockage. The size of the bending coefficient is positively correlated with the blockage coefficient and is related to the flow rate and scale of the debris flow. This value is calculated using ArcGIS geometry from remote sensing images.

Loose-material supply length ratio (F6) (%): This refers to the ratio of the loose-material length along a channel to the total channel length, which reflects the successively supplied sediments. This value is obtained through on-site geological surveys and remote sensing images.

Vegetation area ratio (F7) (%): Low vegetation coverage can cause severe soil erosion in the basin. The value is obtained through drone aerial photography and remote sensing images, and the vegetation coverage is estimated based on the depth of the color. The lighter the color, the lower the vegetation coverage, and this is corrected through drone aerial photography.

Population density (F8) (number of people per km<sup>2</sup>): With the development of the economy and technology, human activities have become one of the essential factors affecting debris flows, and population density can reflect the intensity of human activities. This value is estimated based on the number of buildings using remote sensing images and confirmed through on-site investigations.

Loose-material volume of unit area (F9) (×10<sup>4</sup> m<sup>3</sup>/km<sup>2</sup>): The ratio of the source material quantity of a single debris flow to the basin area. The material source in the gully is one of the basic factors causing debris flow disasters, and the loose-material volume of unit area is directly proportional to the risk of debris flow. The value is obtained by combining a laser rangefinder with remote sensing imagery, and the thickness is obtained by combining field estimation and drilling data.

Table 3 shows the evaluation index values of debris flow risk in the study area.


**Table 3.** Evaluation index values of debris flow risk in the research area.

## *3.2. Combination Weighting Method*

## 3.2.1. CRITIC Method

The CRITIC method is an objective weighting method that reflects the discreteness of the samples and the conflict between factors through the standard deviation and the correlation coefficient. The standard deviation is directly proportional to the degree of discreteness and to the factor weight, while the correlation coefficient is inversely related to the conflict between factors: the larger the correlation coefficient, the smaller the conflict and, hence, the smaller the weight [34]. The CRITIC method takes into account both sample information and factor correlation, and it utilizes the coefficient of variation to make the dispersion reflected by the standard deviation more realistic, which gives it significant advantages [35–37]. The specific calculation steps are as follows:

(1) Assuming *m* samples containing *n* indicators, construct the original data matrix using the indicators:

$$X = \begin{bmatrix} \alpha\_{11} & \alpha\_{12} & \dots & \alpha\_{1n} \\ \alpha\_{21} & \alpha\_{22} & \dots & \alpha\_{2n} \\ \dots & \dots & \dots & \dots \\ \alpha\_{m1} & \alpha\_{m2} & \dots & \alpha\_{mn} \end{bmatrix} \tag{1}$$

(2) Normalization of indicators:

$$q\_{ij} = \frac{\alpha\_{ij} - \min\_i(\alpha\_{ij})}{\max\_i(\alpha\_{ij}) - \min\_i(\alpha\_{ij})} \tag{2}$$

(3) Calculate coefficient of variation:

$$\overline{\alpha\_j} = \frac{\sum\_{i=1}^{m} \alpha\_{ij}}{m} \tag{3}$$

$$S\_j = \sqrt{\frac{1}{m} \sum\_{i=1}^{m} \left(\alpha\_{ij} - \overline{\alpha\_j}\right)^{2}} \tag{4}$$

$$\mu\_j = \frac{S\_j}{\overline{\alpha\_j}} \tag{5}$$

In the formula, *α<sup>j</sup>* is the average value of each indicator; *S<sup>j</sup>* is the standard deviation; *µj* is the coefficient of variation.

(4) Calculate the correlation coefficient matrix:

$$\kappa\_{kl} = \frac{\mathrm{cov}(y\_k, y\_l)}{S\_k S\_l}, \quad k = 1, 2, \dots, n; \ l = 1, 2, \dots, n \tag{6}$$

In the formula, *κkl* represents the correlation coefficient between indicators *k* and *l*, and *cov*(*y<sup>k</sup>*, *y<sup>l</sup>*) represents the covariance between the indicators.

(5) Calculation of indicator information quantity:

$$\omega\_j = \mu\_j \sum\_{i=1}^{n} \left(1 - \kappa\_{ij}\right), \quad j = 1, 2, \dots, n \tag{7}$$

The weights of each indicator are:

$$y\_j = \frac{\omega\_j}{\sum\_{j=1}^{n} \omega\_j}, \quad j = 1, 2, \dots, n \tag{8}$$
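
To make these steps concrete, the following minimal sketch (assuming NumPy; the function name and data are illustrative, not from the paper) applies Equations (2)–(8) to a samples-by-indicators matrix. Equations (3)–(5) are written in terms of the raw values *α* above; here they are applied to the normalized matrix, a common practical choice.

```python
import numpy as np

def critic_weights(X: np.ndarray) -> np.ndarray:
    """CRITIC weights for an (m samples x n indicators) matrix X."""
    # Equation (2): min-max normalization of each indicator (column)
    Q = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    # Equations (3)-(5): mean, standard deviation, coefficient of variation
    mean = Q.mean(axis=0)
    std = Q.std(axis=0)                  # population std (1/m inside the root)
    mu = std / mean                      # coefficient of variation
    # Equation (6): correlation coefficient matrix between indicators
    kappa = np.corrcoef(Q, rowvar=False)
    # Equation (7): information quantity carried by each indicator
    omega = mu * (1.0 - kappa).sum(axis=0)
    # Equation (8): normalize the information quantities to weights
    return omega / omega.sum()

# Example with stand-in data: 14 debris flows x 9 indicators
rng = np.random.default_rng(0)
print(critic_weights(rng.random((14, 9))).round(3))
```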

## 3.2.2. Analytic Hierarchy Process (AHP)

The Analytic Hierarchy Process (AHP) was proposed by the renowned mathematician Saaty and is a simple and feasible decision-making method with significant subjectivity [38]. Its main advantage lies in its ability to determine the weights of various indicators at different levels in a system; it is also simple and convenient to calculate and is therefore widely used [39,40]. The specific calculation steps are as follows:

(1) Establish a tiered hierarchical structure model. The hierarchical structure is generally divided into three layers: target layer, criterion layer, and scheme layer.

(2) Establish the judgment matrix. For the different factors at the same level, establish a judgment matrix by comparing their impact on the target factors. The formula for constructing the judgment matrix is as follows:

$$A = \left(a\_{ij}\right)\_{n \times n}, \quad a\_{ij} > 0, \quad a\_{ji} = \frac{1}{a\_{ij}}, \quad (i, j = 1, 2, \dots, n) \tag{9}$$

Among them, *aij* is the ratio of the influence degree of elements *B<sup>i</sup>* and *B<sup>j</sup>*, usually represented by a scoring method of one to nine, as shown in Table 4 below.


**Table 4.** Definition of comparative importance.

(3) Calculate consistency indicator *CI*. By calculating the eigenvalues and eigenvectors of the judgment matrix, it can be represented as:

$$CI = \frac{\lambda\_{\text{max}} - n}{n - 1} \tag{10}$$

where *λmax* is the maximum eigenvalue of matrix *A*.

(4) Calculate the consistency ratio *CR*:

$$CR = \frac{CI}{RI} \tag{11}$$

Among them, *RI* is the random average consistency index of the judgment matrix (Table 5). When *CR* < 0.1, the matrix is judged to meet the consistency requirement. Otherwise, the matrix does not meet the consistency requirement, and further adjustments are needed until the consistency check is met.

**Table 5.** The random average consistency index.

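The AHP weight and consistency calculations of Equations (9)–(11) reduce to an eigenvalue problem. The sketch below (assuming NumPy; the 3 × 3 judgment matrix is hypothetical, not from the paper) extracts the priority vector from the principal eigenvector and checks consistency against the commonly tabulated *RI* values of Table 5:

```python
import numpy as np

# Commonly tabulated random consistency indices RI for n = 1..9 (Table 5)
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(A: np.ndarray):
    """Return (weights, CR) for a reciprocal judgment matrix A."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)               # principal eigenvalue lambda_max
    w = np.abs(eigvecs[:, k].real)
    w = w / w.sum()                           # priority vector (indicator weights)
    CI = (eigvals[k].real - n) / (n - 1)      # Equation (10)
    CR = CI / RI[n] if RI[n] > 0 else 0.0     # Equation (11); trivial for n <= 2
    return w, CR

# Hypothetical 3x3 judgment matrix built from the 1-9 scale of Table 4
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])
w, CR = ahp_weights(A)
print(w.round(3), CR)   # CR < 0.1 means the matrix passes the consistency check
```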

## 3.2.3. Combination Weighting Rule

The Analytic Hierarchy Process determines the judgment matrix mainly based on the subjective experience of experts, whereas the evaluation process of the CRITIC method relies entirely on the inherent laws of the objective data. In order to reflect the researchers' intuitive understanding of debris flow from the field geological investigation stage while taking into account the regularity of the objective data, and to keep the degree of difference between the weights obtained by the two methods consistent with the degree of difference between their corresponding distribution coefficients, this paper introduces a distance function that couples the weights obtained by the two methods to determine the index weights comprehensively.

Suppose the weight vector obtained by the Analytic Hierarchy Process is *ω<sub>i</sub><sup>c</sup>*, the weight vector obtained by the CRITIC method is *ω<sub>i</sub><sup>y</sup>*, and the distance function between them is denoted as *d*(*ω<sub>i</sub><sup>c</sup>*, *ω<sub>i</sub><sup>y</sup>*) [33]:

$$d(\omega\_i^{c}, \omega\_i^{y}) = \left[\frac{1}{2} \sum\_{i=1}^{n} \left(\omega\_i^{c} - \omega\_i^{y}\right)^2\right]^{\frac{1}{2}} \tag{12}$$

Assume that the combined weights *ω<sub>i</sub><sup>z</sup>* are obtained by the linear weighting method, and that the distribution coefficients of the two weights are *a* and *b*, respectively; then *ω<sub>i</sub><sup>z</sup>* can be expressed as:

$$\omega\_i^{z} = a\omega\_i^{c} + b\omega\_i^{y} \tag{13}$$

To ensure that the degree of variation in the magnitudes of the weights is consistent with that of the distribution coefficients, the distance function and the difference between the distribution coefficients are set equal:

$$d(\omega\_i^{c}, \omega\_i^{y})^2 = (a - b)^2 \tag{14}$$

The two weight distribution coefficients also have to satisfy Equation (15). Combining Equations (14) and (15) yields the coefficients *a* and *b*, and substituting *a* and *b* into Equation (13) yields *ω<sub>i</sub><sup>z</sup>*:

$$a + b = 1\tag{15}$$
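
Solving Equations (14) and (15) gives the distribution coefficients in closed form. The minimal sketch below (assuming NumPy, with illustrative weight vectors) computes the distance of Equation (12) and combines the weights per Equation (13); note that Equations (14) and (15) only fix (*a* − *b*)² = *d*², so the sign choice *a* − *b* = *d*, which gives the AHP weights the larger share, is an assumption of this sketch:

```python
import numpy as np

def combine_weights(w_ahp: np.ndarray, w_critic: np.ndarray) -> np.ndarray:
    """Couple AHP and CRITIC weights via the distance function."""
    # Equation (12): distance between the two weight vectors
    d = np.sqrt(0.5 * np.sum((w_ahp - w_critic) ** 2))
    # Equations (14) and (15): a - b = d (sign assumed), a + b = 1
    a = (1.0 + d) / 2.0
    b = 1.0 - a
    # Equation (13): linear combination of the two weight vectors
    return a * w_ahp + b * w_critic

w_ahp = np.array([0.30, 0.25, 0.20, 0.15, 0.10])     # illustrative AHP weights
w_critic = np.array([0.22, 0.28, 0.18, 0.20, 0.12])  # illustrative CRITIC weights
print(combine_weights(w_ahp, w_critic).round(3))
```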

## *3.3. Cluster Analysis*

As an essential method for studying classification problems, cluster analysis groups similar things together as much as possible and separates things that are more different (Figure 6). The fundamental laws within things can be more clearly recognized through cluster analysis, which plays an essential role in several scientific fields [41–46]. There are certain drawbacks to traditional clustering methods: (1) the number of groups is determined artificially, and (2) the clustering centers are selected artificially. The result of such a division is strongly affected by human factors, which makes the final division unreliable [47–51].

**Figure 6.** Cluster Analysis.

The affinity propagation clustering algorithm overcomes the shortcomings of traditional clustering analysis. Its principle is to achieve efficient and accurate data clustering by iteratively transmitting attraction and attribution information between the data points of a given data set. It is a clustering algorithm based on "information transfer" between data points: it takes the similarity between pairs of data points as input and exchanges accurate and valuable information between data points until an optimal set of class representative points and clusters gradually forms. The main advantages include the following: (1) the number of clusters and the cluster centers are obtained by calculation and do not need to be specified manually; (2) each data point can serve as a potential cluster center; (3) the clustering results are unique; (4) the starting condition of the algorithm is the input correlation matrix; and (5) there is no requirement for the symmetry of the matrix. The specific algorithm flow is as follows [52,53]:

(1) Calculate the data point correlation matrix.

(2) Initialize the responsibility and availability information.

(3) Calculate the responsibility information and the availability information between monitoring points.

(4) Update the responsibility information and availability information.

(5) Calculate the cluster centers.

(6) The maximum number of cycles is reached, and the final result is obtained.

In the study of debris flow classification, traditional clustering analysis methods require the manual determination of the number of classes, which is subjective. Classifying debris flows with the affinity propagation clustering algorithm, by contrast, can be completed without specifying the number of classes or the clustering centers in advance, and the calculation results are unique and reasonable. However, an essential parameter of the affinity propagation algorithm is the reference *p*-value, which reflects the reliability of using data points as clustering centers. The reference *p*-value directly affects the clustering results, and its size is directly proportional to the number of clusters; improper selection of the *p*-value can lead to poor clustering results [54]. In general, when there is no prior knowledge, the *p*-value is set to the median of the similarity matrix and remains unchanged during the clustering process. However, the median of the similarity matrix does not necessarily yield the optimal result [55]. Considering that the algorithm starts from a correlation matrix, and that different input *p*-values lead to different calculation results, the correlation between debris flows is determined before the calculation, and quantum particle swarm optimization (QPSO) [56] is used to optimize the *p*-value, finding the *p*-value that optimizes the objective function and obtaining the classification of debris flows under this optimal *p*-value.
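
As an illustration of this workflow, the sketch below (assuming scikit-learn, with a random stand-in similarity matrix) runs affinity propagation for a set of candidate reference values and scores each clustering with the average Silhouette index defined in Section 3.3.2; a simple grid search over candidate *p*-values stands in here for the QPSO optimization used in the paper:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score

def cluster_with_best_p(S: np.ndarray, candidates):
    """S: precomputed (n x n) similarity matrix. Returns (labels, S_avg, p)."""
    D = 1.0 - S                          # distance matrix for the Silhouette index
    best = (None, -1.0, None)
    for p in candidates:
        ap = AffinityPropagation(affinity="precomputed", preference=p,
                                 random_state=0).fit(S)
        labels = ap.labels_
        if 1 < len(set(labels)) < len(labels):   # Silhouette needs 2..n-1 clusters
            s_avg = silhouette_score(D, labels, metric="precomputed")
            if s_avg > best[1]:
                best = (labels, s_avg, p)
    return best

# Random stand-in similarity matrix for 14 debris flows
rng = np.random.default_rng(1)
S = rng.random((14, 14))
S = (S + S.T) / 2.0
np.fill_diagonal(S, 1.0)
labels, s_avg, p = cluster_with_best_p(S, np.linspace(S.min(), np.median(S), 10))
print(labels, round(s_avg, 3), p)
```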

## 3.3.1. Correlation Calculation

Assume that for any two debris flows, *i* and *j*, each has *z* evaluation indicators. The *k*th evaluation indicator of the two debris flows can be expressed as *i<sup>k</sup>* and *j<sup>k</sup>*. The evaluation indicators of debris flows *i* and *j* can be combined into a set of data pairs (*i<sup>k</sup>*, *j<sup>k</sup>*) (1 ≤ *k* ≤ *z*). For any two data pairs (*i<sup>k</sup>*, *j<sup>k</sup>*) and (*i<sup>l</sup>*, *j<sup>l</sup>*) in the set, when *i<sup>k</sup>* > *i<sup>l</sup>* and *j<sup>k</sup>* > *j<sup>l</sup>*, or *i<sup>k</sup>* < *i<sup>l</sup>* and *j<sup>k</sup>* < *j<sup>l</sup>*, the data pairs are said to be consistent; when *i<sup>k</sup>* > *i<sup>l</sup>* and *j<sup>k</sup>* < *j<sup>l</sup>*, or *i<sup>k</sup>* < *i<sup>l</sup>* and *j<sup>k</sup>* > *j<sup>l</sup>*, the data pairs are said to be inconsistent; when *i<sup>k</sup>* = *i<sup>l</sup>* and *j<sup>k</sup>* = *j<sup>l</sup>*, the data pairs are neither consistent nor inconsistent. If correlation analysis is conducted on the evaluation indicators of two debris flows, the correlation between them can be expressed as [57]:

$$\tau\_{ij} = \frac{2C}{\frac{1}{2}z(z-1)} - 1 = \frac{4C}{z(z-1)} - 1 \tag{16}$$

Among them, *C* is the number of consistent (concordant) data pairs. The value range of *τij* is [−1, +1]. When *τij* = 1, the two debris flows have exactly the same rank correlation; when *τij* = −1, the two debris flows have opposite rank correlations; and when *τij* = 0, the two debris flows are independent of each other. For the risk assessment indicators of debris flow, which have different dimensions, the correlation calculation not only eliminates the impact of the differing dimensions of the evaluation indicators but also serves as the prerequisite for establishing the correlation matrix [58].
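
A minimal sketch of this correlation calculation (assuming SciPy, whose `kendalltau` performs the concordant/discordant pair counting described above; it computes the tau-b variant, which coincides with Equation (16) when there are no ties):

```python
import numpy as np
from scipy.stats import kendalltau

def correlation_matrix(X: np.ndarray) -> np.ndarray:
    """X: (n debris flows x z indicators). Returns the n x n matrix of Equation (17)."""
    n = X.shape[0]
    S = np.ones((n, n))                       # tau of a debris flow with itself is 1
    for i in range(n):
        for j in range(i + 1, n):
            tau, _ = kendalltau(X[i], X[j])   # tau in [-1, 1], Equation (16)
            S[i, j] = S[j, i] = tau
    return S

# Stand-in data: 14 debris flows x 9 evaluation indicators
rng = np.random.default_rng(2)
print(correlation_matrix(rng.random((14, 9))).round(2))
```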

## 3.3.2. Classification and Risk Assessment of Debris Flow

Assuming there are *n* debris flows in total, for a particular debris flow *i*, the correlation between debris flow *i* and each of the *n* − 1 other debris flows is calculated, and the debris flow correlation matrix *Sij* is established:

$$S\_{ij} = \begin{bmatrix} \tau\_{11} & \tau\_{12} & \dots & \tau\_{1n} \\ \tau\_{21} & \tau\_{22} & \dots & \tau\_{2n} \\ \dots & \dots & \dots & \dots \\ \tau\_{n1} & \tau\_{n2} & \dots & \tau\_{nn} \end{bmatrix} \tag{17}$$
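Assembling Equation (17) amounts to filling a symmetric *n* × *n* matrix with the pairwise τ values. A short hedged sketch follows; scipy's `kendalltau` (the τ-b variant) coincides with Equation (16) whenever no ties occur, and the random matrix is only a stand-in for the indicator data:

```python
# Sketch: assembling the symmetric n x n correlation matrix S_ij of
# Equation (17). scipy's kendalltau (tau-b) coincides with Equation (16)
# whenever no tied indicator values occur.
import numpy as np
from scipy.stats import kendalltau

def correlation_matrix(X):          # X: n debris flows x z indicators
    n = X.shape[0]
    S = np.ones((n, n))             # the tau of a flow with itself is 1
    for i in range(n):
        for j in range(i + 1, n):
            t, _ = kendalltau(X[i], X[j])
            S[i, j] = S[j, i] = t
    return S

print(correlation_matrix(np.random.default_rng(0).random((14, 9))))
```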

Affinity propagation clustering analysis is then conducted on the correlation matrix of debris flow. Because different *p*-values produce different classification results, and too many or too few classes do not match the actual risk classification of debris flows, the optimal classification must be identified through a clustering effectiveness function. Among the many clustering indicators, the Silhouette indicator is widely used: it reflects both the intra-class tightness and the inter-class separateness of a clustering result, and it can be used to assess the optimal number of clusters and the quality of the clustering. The Silhouette indicator is therefore chosen to judge the optimal debris flow classification.

Suppose there is a data set with *n* data points, which is divided into *K* clusters *C<sup>i</sup>* (*i* = 1, 2, . . . , *K*). Let *a*(*t*) denote the average dissimilarity of data point *t* in cluster *C<sup>j</sup>* to all other data points within *C<sup>j</sup>*, and let *d*(*t*, *C<sup>i</sup>*) be the average dissimilarity of data point *t* in *C<sup>j</sup>* to all data points in another cluster *C<sup>i</sup>*; then *b*(*t*) = min{*d*(*t*, *C<sup>i</sup>*)}, where *i* = 1, 2, . . . , *K*, *i* ≠ *j*. The Silhouette index of a data point is [59–61]:

$$S(t) = \frac{b(t) - a(t)}{\max\{a(t), b(t)\}} \tag{18}$$

The average *S*(*t*) value *Savg*(*C<sup>i</sup>*) over all data points in cluster *C<sup>i</sup>* can be obtained from *S*(*t*) and reflects the compactness and separation of cluster *C<sup>i</sup>*. The average *S*(*t*) value *Savg* over all data points in the dataset reflects the quality of the clustering results: the larger *Savg*, the better the clustering quality, and the optimal number of clusters corresponds to the maximum *Savg* value. The formula is as follows:

$$S\_{\text{avg}} = \frac{\sum\_{t=1}^{n} S(t)}{n} \tag{19}$$
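Equations (18) and (19) correspond to the standard Silhouette computation, which scikit-learn implements directly. A minimal sketch with illustrative data:

```python
# Sketch: the average Silhouette value S_avg of Equation (19); scikit-learn's
# silhouette_score implements Equations (18)-(19) directly.
import numpy as np
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
X = rng.normal(size=(14, 9))            # illustrative indicator matrix
labels = rng.integers(0, 4, size=14)    # illustrative 4-cluster labelling
print(silhouette_score(X, labels))      # larger S_avg = better clustering
```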

According to the above description, when the quantum particle swarm optimization (QPSO) algorithm is used to optimize the *p*-value, assume that the number of *p*-values to be optimized is *N* and that the *p*-values to be optimized are *P*1, *P*2, *P*3, . . . , *P<sup>N</sup>*; the total number of variables to be solved is then *N*, and the task can be transformed into an *N*-dimensional optimization problem [62,63]. The optimization procedure is as follows:

Step 1: Calculate the correlation matrix *Sij* of debris flow.

Step 2: Initialization. The qubit phases serve as the random initial population and are drawn from the range [0, 2π] by a random number function. Then, combined with the upper and lower limits of the *p*-value variables, the probability amplitudes are converted to the variable space by solving the solution-space transformation formula.

Step 3: The correlation matrix *Sij* and the *p*-value variables are used for the affinity propagation clustering calculation, and the Silhouette index values are obtained from the output classification results. Considering the comprehensive evaluation of the classification results, the average Silhouette index value (*Savg*) is taken as the fitness value of the QPSO algorithm and as the objective function.

Step 4: Nonlinear adjustment of the inertia weight.

Step 5: Update the particle state (update the qubit phase angle and the qubit probability amplitude).

Step 6: Adaptive adjustment of the mutation operator and mutation processing. The quantum NOT gate is used to mutate the particles.

Step 7: The correlation matrix *Sij* and the updated *p*-value variables are used for the affinity propagation clustering calculation, and the average Silhouette index is computed. If the average Silhouette index meets the stopping condition, or if the number of iterations has reached the maximum, the clustering results under the optimized *p*-value are output; otherwise, return to Step 3 and continue the cycle.

Figure 7 shows the flowchart of the debris flow classification algorithm:
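For concreteness, the loop of Steps 1–7 can be sketched in Python. The paper's variant updates qubit probability amplitudes and mutates particles with the quantum NOT gate; those updates are not reproduced here, so the compact quantum-behaved PSO of Sun et al. stands in, optimizing a single preference value against *Savg*. The bounds `lo` and `hi`, the swarm size, and the contraction coefficient `beta` are illustrative assumptions, not the paper's settings:

```python
# Sketch: optimizing the preference (p-value) against S_avg. Quantum-behaved
# PSO (Sun et al.) stands in for the paper's qubit-amplitude QPSO variant.
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import silhouette_score

def fitness(p, S, X):
    ap = AffinityPropagation(affinity="precomputed", preference=p,
                             random_state=0).fit(S)
    k = len(np.unique(ap.labels_))
    if k < 2 or k >= len(X):            # degenerate clustering: reject
        return -1.0
    return silhouette_score(X, ap.labels_)   # S_avg of Equation (19)

def qpso(S, X, lo, hi, n_particles=10, iters=30, beta=0.75, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, n_particles)     # Step 2: random initialization
    pbest = x.copy()
    pfit = np.array([fitness(p, S, X) for p in x])
    g = pbest[np.argmax(pfit)]               # global best preference
    for _ in range(iters):                   # Steps 3-7 until max cycles
        mbest = pbest.mean()                 # mean of the personal bests
        phi = rng.random(n_particles)
        u = 1.0 - rng.random(n_particles)    # u in (0, 1], so log is safe
        attract = phi * pbest + (1 - phi) * g
        sign = rng.choice([-1.0, 1.0], n_particles)
        x = np.clip(attract + sign * beta * np.abs(mbest - x) * np.log(1 / u),
                    lo, hi)
        fit = np.array([fitness(p, S, X) for p in x])
        better = fit > pfit
        pbest[better], pfit[better] = x[better], fit[better]
        g = pbest[np.argmax(pfit)]
    return g, pfit.max()

# Usage (illustrative bounds): best_p, best_savg = qpso(S, X, S.min(), np.median(S))
```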

Based on the clustering analysis method proposed above, debris flows can be divided into different types. However, the clustering analysis results only classify debris flows into different categories, and a further judgment is needed regarding which risk level corresponds to each category of debris flows.

This article calculates the synthetic evaluation score (*D<sup>i</sup>*) for each debris flow risk based on the combined weight values obtained by the combination weighting method (Equation (20)) [33]. The larger *D<sup>i</sup>*, the greater the probability of the debris flow occurring and the more dangerous it is. The indicators and synthetic evaluation scores are used to calculate the correlation between debris flows; based on the correlation calculation (*τij*), the correlation matrix (*Sij*) is established and cluster analysis is performed. Based on the clustering results and synthetic evaluation scores, the risk of debris flow is classified.

$$D\_i = \sum\_{j=1}^{z} \omega\_z(j) q\_{ij} \tag{20}$$

where *ωz*(*j*) is the combination weight of the *j*th evaluation indicator and *qij* is the normalized value of indicator *j* for debris flow *i*.
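As a hedged sketch, Equation (20) reduces to a weighted sum over the normalized indicator values; the weight vector and indicator matrix below are illustrative stand-ins, not the study's data:

```python
# Sketch: Equation (20) as a weighted sum over normalized indicator values.
import numpy as np

def synthetic_scores(Q, w):     # Q: n flows x z normalized indicators q_ij
    return Q @ w                # D_i = sum_j w_j * q_ij

w = np.full(9, 1 / 9)                            # illustrative weights
Q = np.random.default_rng(0).random((14, 9))     # illustrative q_ij in [0, 1]
print(synthetic_scores(Q, w))
```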

## **4. Classification and Risk Assessment Results of Debris Flow in the Study Area**

*4.1. Weight Calculation*

4.1.1. Results of the AHP

Based on the selection of evaluation indicators for debris flow, a hierarchical structure model for evaluating the risk of debris flow groups in Longmenshan Town is constructed (Figure 8).

**Figure 8.** Hierarchical structure for debris flow risk assessment.

According to the hierarchical structure model of debris flow evaluation indicators, each evaluation indicator is graded using the 1–9 scale method, a judgment matrix is constructed (Tables 6–9), and consistency testing is conducted. Finally, the weights of each evaluation indicator are obtained (Table 10). During the evaluation process, a total of 15 experts were selected for scoring, all of whom were from the Sichuan Province Sudden Major Geological Disaster Expert Database.
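For reference, the eigenvector weights and the consistency test for a 1–9 scale judgment matrix can be sketched as follows; the 3 × 3 matrix and its entries are purely illustrative, not the experts' actual scores:

```python
# Sketch: AHP weights and consistency test for a 1-9 scale judgment matrix.
# The matrix is illustrative; RI values are the standard Saaty random indices.
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [1/2, 1.0, 2.0],
              [1/3, 1/2, 1.0]])

vals, vecs = np.linalg.eig(A)
k = np.argmax(vals.real)
w = np.abs(vecs[:, k].real)
w /= w.sum()                            # principal eigenvector -> weights

n = A.shape[0]
CI = (vals[k].real - n) / (n - 1)       # consistency index
RI = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}[n]
CR = CI / RI                            # consistency accepted when CR < 0.1
print(w, CR)
```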


Trigger condition 1/2 1/3 1


**Table 7.** Criterion layer judgment matrix for geology condition.


F2 1/2 1/3 1/2 1/4 1 **Table 8.** Criterion layer judgment matrix for material condition.

F4 3 4 4 1 4


**Table 9.** Criterion layer judgment matrix for trigger condition.

**Trigger Condition F8 F7 CI** *RI CR* 

F9 1 3 0 0 0 F6 1/3 1


**Table 9.** Criterion layer judgment matrix for trigger condition.

4.1.2. Results of the CRITIC Method

Using the data in Table 4, the original evaluation index matrix is established using the CRITIC method; the indicators are normalized, the information content is calculated, and finally Equation (8) is used to obtain the objective weight values (Table 11).
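Since Equation (8) is not reproduced in this section, the following sketch uses the standard CRITIC formulation (contrast intensity times conflict), with random data standing in for Table 4:

```python
# Sketch: CRITIC objective weights (standard contrast x conflict form).
import numpy as np

def critic_weights(X):                  # X: n flows x z raw indicator values
    Xn = (X - X.min(0)) / (X.max(0) - X.min(0))   # min-max normalization
    sigma = Xn.std(axis=0, ddof=1)      # contrast intensity per indicator
    R = np.corrcoef(Xn, rowvar=False)   # correlations between indicators
    C = sigma * (1 - R).sum(axis=0)     # information content C_j
    return C / C.sum()                  # objective weights

print(critic_weights(np.random.default_rng(0).random((14, 9))))
```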

**Table 11.** CRITIC method evaluation index weight.


4.1.3. Results of the Combination Weighting Method

Based on the weight results obtained by the Analytic Hierarchy Process and CRITIC method, the combined weights of each evaluation index are calculated using Equations (12)–(15) (Table 12). According to Equation (12), this is calculated as:

$$d(\omega\_l^c, \omega\_l^y) = \left[\frac{1}{2} \sum\_{l=1}^n \left(\omega\_l^c - \omega\_l^y\right)^2\right]^{\frac{1}{2}} = 0.1746\tag{21}$$

**Table 12.** Combination weighting method for evaluating indicator weight results.


By combining Equations (13) and (14), it can be concluded that:

$$\begin{cases} a - b = 0.1746 \\ a + b = 1 \end{cases} \tag{22}$$

Solving these equations yields:

$$\begin{array}{l} a = 0.5873 \\ b = 0.4127 \end{array} \tag{23}$$

Therefore, the combination weight value obtained by the combination weighting method can be expressed as:

$$\omega\_l^z = 0.5873\omega\_l^c + 0.4127\omega\_l^y \tag{24}$$
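The arithmetic of Equations (21)–(24) can be checked with a short sketch; the weight vectors here are random stand-ins for the AHP and CRITIC results in Tables 10 and 11:

```python
# Sketch: the combination-weighting arithmetic of Equations (21)-(24).
import numpy as np

w_c = np.random.default_rng(0).dirichlet(np.ones(9))   # stand-in AHP weights
w_y = np.random.default_rng(1).dirichlet(np.ones(9))   # stand-in CRITIC weights

d = np.sqrt(0.5 * np.sum((w_c - w_y) ** 2))  # Equation (21)
a = (1 + d) / 2                              # from a - b = d and a + b = 1
b = (1 - d) / 2
w_z = a * w_c + b * w_y                      # Equation (24)
print(d, a, b, w_z.sum())                    # w_z still sums to 1
```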

## *4.2. Classification Results of Debris Flow*

Considering the synthetic weight values obtained by the combination weighting method, the synthetic evaluation score (*D<sup>i</sup>* ) for each debris flow risk degree is calculated (Table 13).


**Table 13.** Synthetic evaluation score (*D<sup>i</sup>*) of debris flow in the research area.

| Number | Debris Flow | *D<sup>i</sup>* | Number | Debris Flow | *D<sup>i</sup>* |
|---|---|---|---|---|---|
| 1 | Xiaoniuquan | 0.5359 | 8 | Manban | 0.2113 |
| 2 | Lianshan | 0.2660 | 9 | Henghe | 0.3963 |
| 3 | Feishuiyan | 0.2692 | 10 | Yushi | 0.7948 |
| 4 | Huilong | 0.4727 | 11 | Longcao | 0.8301 |
| 5 | Yanzidong | 0.5210 | 12 | Meizilin | 0.6718 |
| 6 | Shiliangzi | 0.2716 | 13 | Xujia | 0.2856 |
| 7 | Machang | – | 14 | Baiyan | 0.3064 |

This study selected the 9 evaluation index values (F1~F9) of the 14 debris flows in the research area, as well as the synthetic evaluation score of each debris flow (*D<sup>i</sup>*). Using these 10 indicators, Equation (16) was used to calculate the correlation *τij* between the 14 debris flows. Based on the correlation calculation, the correlation matrix of debris flow in the study area was established according to Equation (17) (Figure 9).

**Figure 9.** Calculation results of debris flow correlation in the study area.

According to the previous description of *Savg*, the larger the calculated *Savg* value, the more scientific and reasonable the classification results are. The correlation matrix shown in Figure 9 is used as the basis for starting the algorithm, and the QPSO-optimized affinity propagation clustering is used to compute the matrix. The calculation results showed that, when the debris flow is divided into four types, the *Savg* index value is the highest, reaching 0.66. Therefore, the classification result is the optimal number of classifications (Table 14).


**Table 14.** The optimal classification results of debris flow in the research area.

## *4.3. Risk Assessment Based on Classification Results*

Based on the above clustering analysis results, debris flows are divided into four categories. However, the clustering results only place the debris flows into categories; they do not indicate which risk level corresponds to each of the four categories, so a further judgment is needed. Table 15 shows that, among Class I debris flows, Longcao Gully has the highest synthetic evaluation score of 0.8301, while Meizilin Gully has the lowest at 0.6718. Among Class II debris flows, the score of the Xiaoniuquan Gully debris flow is the highest at 0.5359, while that of Henghe Gully is the lowest at 0.3963. Among Class III debris flows, Baiyan Gully has the highest score of 0.3064, while Shiliangzi Gully has the lowest at 0.2716. Among Class IV debris flows, Feishuiyan Gully has the highest score of 0.2692, while Manban Gully has the lowest at 0.2113. Therefore, based on the calculation results, the Class I debris flows in the study area are classified as extremely dangerous (0.6718 ≤ *D<sup>i</sup>* ≤ 0.8301), the Class II debris flows as highly dangerous (0.3963 ≤ *D<sup>i</sup>* ≤ 0.5359), the Class III debris flows as moderately dangerous (0.2716 ≤ *D<sup>i</sup>* ≤ 0.3064), and the Class IV debris flows as low-risk (0.2113 ≤ *D<sup>i</sup>* ≤ 0.2692).

**Table 15.** Debris flow risk assessment in study area.
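One way to mirror this judgment programmatically is to rank the clusters by their mean synthetic score *D<sup>i</sup>* and map the ranking onto the four risk levels; the labels and scores below are illustrative stand-ins, not the study's data:

```python
# Sketch: ranking clusters by their mean synthetic score D_i and mapping the
# ranking onto the four risk levels.
import numpy as np

def risk_levels(labels, D):
    names = ("extremely dangerous", "highly dangerous",
             "moderately dangerous", "low-risk")
    order = sorted(np.unique(labels), key=lambda c: -D[labels == c].mean())
    return {int(c): names[r] for r, c in enumerate(order)}

labels = np.array([1, 3, 3, 1, 1, 2, 3, 3, 1, 0, 0, 0, 2, 2])  # 4 clusters
D = np.random.default_rng(0).random(14)                         # stand-in D_i
print(risk_levels(labels, D))
```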


This article compared the debris flow risk assessment results with those obtained by the grey correlation method and the synergistic coupling method (Table 16). Table 16 shows that the results obtained in this article are consistent with those of the synergistic coupling method and are generally one level higher than those of the grey correlation method. The grey correlation results are mainly medium to low risk, with only a few rated as highly dangerous and none as extremely dangerous; compared with the results obtained in this article, they tend to be conservative. The grey correlation results imply that the possibility of debris flow outbreaks in the study area is minimal, which does not match the multiple debris flows that have already erupted there. For example, in July 2009, a debris flow broke out in Yushi Gully and buried some houses (Figures 10 and 11). On 18 August 2012, a significant debris flow disaster occurred in Longmenshan Town [64]. In August 2022, a debris flow disaster broke out in Longcao Gully, and on-site investigations also showed multiple stages of debris flow accumulation in most debris flow channels (Figure 12). The evaluation criteria of the grey correlation method are evidently set too low, so the evaluation results obtained in this article better reflect the actual situation of debris flow outbreaks in the study area.


**Table 16.** Comparative analysis of risk assessment results using different methods.

| Number | Debris Flow | Results of This Article | Results of Grey Correlation Method | Results of Synergistic Coupling Method |
|---|---|---|---|---|
| 1 | Xiaoniuquan | High risk degree | Moderate risk degree | High risk degree |
| 2 | Lianshan | Low risk degree | Low risk degree | Low risk degree |
| 3 | Feishuiyan | Low risk degree | Low risk degree | Low risk degree |
| 4 | Huilong | High risk degree | Moderate risk degree | High risk degree |
| 5 | Yanzidong | High risk degree | Moderate risk degree | High risk degree |
| 6 | Shiliangzi | Moderate risk degree | Low risk degree | Moderate risk degree |
| 7 | Machang | Low risk degree | Low risk degree | Low risk degree |

**Figure 10.** Debris flow broke out in Yushi Gully (2009).

**Figure 11.** A debris flow broke out in Yushi Gully, destroying houses (2009).

**Figure 12.** Multi-phase debris flow buildup in Huilong Gully.

## **5. Discussion**

The accuracy of classification and risk assessment of debris flows is crucial for their prevention and control. According to different classification standards, the same debris flow can belong to several categories simultaneously, and the traditional classification standards lag behind the needs of current debris flow prevention and control [25]. For regional debris flows, the controlling impact factors differ from region to region and need to be selected comprehensively according to the actual situation of each research region [65]. At the same time, the calculation of the weights of the selected influencing factors leaves room for improvement. Most current debris flow risk assessment models require the manual determination of risk classification standards, so the assessment results cannot escape the influence of human subjectivity [33]. This article therefore couples the combination weighting method with affinity propagation clustering analysis to obtain combination weights scientifically and, on this basis, uses clustering analysis to obtain accurate classification standards.

In this article, the reasonable and correct selection of debris flow evaluation indicators and the calculation of indicator weights are prerequisites for using cluster analysis for debris flow classification and risk assessment [66]. This article uses various methods, such as on-site geological surveys, multiple remote sensing images, and drone images, to select nine influencing factors based on the essential characteristics of the debris flow clusters in Longmenshan Town. These influencing factors reflect the geological, material, and trigger conditions of the debris flows within the study area, and the weights of the selected factors are then calculated. The CRITIC method is an objective weight calculation method, but objective methods cannot reasonably exclude singular data during data processing, which may produce incorrect results. The AHP can fully utilize the experience of experts in the corresponding field to calculate weights, which makes it a subjective method; however, different experts may judge the same factor differently, which can also bias the results. Therefore, the combination weighting method, which combines the advantages of the two approaches, yields more scientific indicator weights and is superior to applying either method alone.

Compared with traditional clustering analysis, affinity propagation clustering does not require the number of clusters or the cluster centers to be specified manually, and the clustering results are unique, which are obvious advantages. When affinity propagation clustering is used to classify debris flows and evaluate their risk, optimizing the algorithm further improves its performance, yields the classification results scientifically, quantifies the classification standards, and allows the risk of debris flows to be evaluated correctly. However, this classification and evaluation method has limitations: (1) it does not apply to the risk assessment of individual debris flows; (2) it cannot be applied when the number of debris flows is small; and (3) a certain number of evaluation indicators must be selected, so the method cannot be applied when the number of evaluation indicators is too small.

As the results of this article are based on the primary data of 14 debris flows in Longmenshan Town, Pengzhou City, China, they are regionally specific. Considering the different development characteristics of debris flows in different regions, the differences in regional conditions should be taken into account when applying this method to debris flows elsewhere.

## **6. Conclusions**


**Author Contributions:** Y.L.: Investigation, Methodology, Data curation, Visualization, Writing—original draft. J.S.: Funding acquisition, Data curation. M.H.: Investigation. Z.P.: Investigation. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by The National Natural Science Foundation of China (No. 41572308).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding authors.

**Conflicts of Interest:** The authors declare no conflict of interest.

## **References**


**Disclaimer/Publisher's Note:** The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
