1. Introduction
Landslides are among the most severe and common geological hazards in the world. They are widespread, catastrophic and destructive, prone to triggering chain disasters, and occur mainly in mountainous areas [1, 2]. In the Three Gorges Reservoir Area, the periodic rise and fall of the water level destabilizes the slopes on both sides of the reservoir, which aggravates the reactivation of existing landslides and the instability of potential ones and creates numerous hazards. Landslides directly threaten lives, livelihoods, agriculture, livestock and forests throughout the world, with consequences ranging from minor social disruption to serious economic losses [3, 4]. Landslide research has attracted worldwide attention, mainly because of growing awareness of the socio-economic impact of landslides and the increasing pressure of urbanization on mountain environments [5]. Between 2004 and 2010, 2620 landslide events were recorded worldwide, causing a total of 32,322 fatalities [6]. In China alone, more than 25,000 people have died from landslides over the past 60 years, and landslides have caused economic losses of up to $50 million a year [7]. This grim situation makes measures to prevent and forecast landslide disasters extremely urgent. To minimize losses and damage, research needs to be strengthened, starting from the collection of landslide data for landslide risk assessment.
In 1984, Varnes [8] first proposed the concept of landslide risk, which refers to the possible loss of population and economic activity caused by landslide disasters over a certain period of time. Landslide risk assessment considers the interaction between the disaster-causing body and the disaster-bearing body in order to estimate the casualties or property losses that landslides may cause [9]. The disaster-causing body is the source of danger, considered independently of the objects it threatens, and reflects the natural attributes of landslides. The disaster-bearing body comprises the people, property and other elements exposed to the landslide disaster, and reflects its social consequences. Landslide risk assessment results can therefore intuitively show the distribution of landslide risk in the study area and guide disaster prevention and mitigation work according to local conditions.
Risk assessment methods can be divided into qualitative, quantitative and semi-quantitative methods. Qualitative methods generally depend on the experience of expert engineering geologists and geomorphologists, and may therefore be subjective. Semi-quantitative methods combine qualitative and quantitative approaches. Biçer et al. [9] assessed landslide risk with a semi-quantitative approach in a landslide-prone area in the Eastern Mediterranean region of Turkey and produced a landslide risk index map. Quantitative methods, in contrast, use statistical and/or mathematical modelling techniques. Bonachea et al. [10] established a quantitative model for landslide risk assessment based on estimating the value of the hazard-bearing body at different times and under different working conditions, and their results showed that the quantitative method is feasible. Risk assessment for single landslides has become relatively mature, while landslide risk assessment at the regional scale has received less attention in the literature. Xu et al. [11] took the Ganba landslide in Xuanen County of Hubei Province in China as a case study of landslide risk assessment. Zhang et al. [12] studied the risk of a barrier dam induced by the Caijiaba landslide, finding that debris would block the river channel and form a weir dam. However, the scope of single-landslide risk research is limited, whereas regional-scale research can capture the overall landslide risk situation of a region and thus provide a theoretical basis for management departments.
As the basic premises and core work of landslide risk assessment [13], landslide susceptibility, hazard and vulnerability assessments have been conducted by many scholars in recent decades [14, 15, 16]. As a foundation for landslide prevention and spatial planning, landslide susceptibility mapping (LSM) depicts the likelihood of future landslides in a region [17, 18]. The choice of modeling methodology plays a key role in the effectiveness of LSM [19]. There are many methods of landslide susceptibility assessment, and early studies relied mainly on statistical analysis. Because of the complex nonlinear characteristics of landslide development, many problems in landslide susceptibility assessment have not been systematically solved. With the rapid development of data mining technology, many researchers have begun to use machine learning algorithms to study landslide susceptibility, including random forest (RF) [14, 19], logistic regression (LR) [20, 21], artificial neural networks (ANN) [15, 22], support vector machine (SVM) [23, 24] and other models. Further research shows that, compared with other machine learning algorithms, random forest, a tree-based ensemble algorithm, can achieve better results because of its robust performance and high accuracy, and it requires only a small amount of tuning before model training [25]. Apart from the choice of algorithm, redundant and noisy factors also increase the uncertainty of the model and reduce its predictive ability. Screening for dominant and effective factors therefore helps to improve the accuracy of risk assessment. Factor selection methods can usually be classified into three categories: statistical methods, machine learning methods, and other methods. Commonly used statistical methods include factor analysis [26], correlation coefficients and rough sets [27]. Machine learning methods, including RF [28] and LR [29], have also been employed in the literature. Although these methods can filter out relatively important influencing factors and increase the reliability of LSM to a certain extent, they do not consider the pattern characteristics of spatial data, so the improvement in accuracy is limited. Unlike these methods, GeoDetector [30] takes into account the spatial pattern characteristics between factors and landslide data, so the selected factors are more representative, which improves the accuracy of LSM.
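To make the idea of spatially stratified factor screening concrete, the following is a minimal sketch of the factor-detector q-statistic commonly associated with GeoDetector, q = 1 − Σ N_h σ_h² / (N σ²), where the study units are split into strata h by the classes of one conditioning factor. It is an illustration only, not the configuration used in this study: the function name `q_statistic`, the toy lithology classes and landslide values, the use of population variance, and the convention of returning 0 when the response has no variance are all assumptions.

```python
from collections import defaultdict
from statistics import pvariance


def q_statistic(strata, values):
    """Return q = 1 - sum(N_h * var_h) / (N * var) for one stratified factor.

    strata: class label of the factor for each unit (e.g. a lithology class).
    values: response for each unit (e.g. landslide presence or density).
    Population variance is used, so q equals 1 - SSW / SST.
    """
    if len(strata) != len(values) or not values:
        raise ValueError("strata and values must be non-empty and equally long")

    total_var = pvariance(values)
    if total_var == 0:
        # Assumption: q is undefined for a constant response; report 0.0.
        return 0.0

    # Group the response values by factor class (stratum).
    groups = defaultdict(list)
    for label, value in zip(strata, values):
        groups[label].append(value)

    # Within-strata sum of squares: sum over strata of N_h * var_h.
    within = sum(len(g) * pvariance(g) for g in groups.values())
    return 1.0 - within / (len(values) * total_var)


# Hypothetical toy data: the factor classes fully separate the response.
lithology = ["soft", "soft", "hard", "hard"]
landslide = [1, 1, 0, 0]
q_full = q_statistic(lithology, landslide)

# Same classes, but the response is unrelated to them.
q_none = q_statistic(lithology, [1, 0, 1, 0])
```

A value of q close to 1 means the factor's classes explain most of the spatial variance of the response, while a value close to 0 means they explain almost none, which is the basis for ranking and screening candidate factors.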
In this paper, building on the above concepts, a quantitative assessment of landslide risk based on susceptibility mapping using random forest and GeoDetector was performed in Fengjie County in the Three Gorges Reservoir Area. Firstly, after summarizing the theoretical methods of landslide risk assessment, 1522 historical landslides in Fengjie County were taken as samples, the development and distribution characteristics of the landslides were analyzed in depth, and the conditioning factors were optimized by GeoDetector. Then, the LSM was generated with the RF model. Lastly, landslide risk was assessed in combination with the hazard and vulnerability assessments, and the landslide risk assessment model of Fengjie County was constructed. The risk levels of different regions were obtained and the spatial distribution characteristics of landslide risk were revealed, providing a scientific basis for the prevention and control of landslide disasters in Fengjie County. The highlights of this paper are: (1) GeoDetector was adopted for factor screening; (2) a machine learning method was applied to LSM; (3) landslide risk assessment was conducted at the regional scale.
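As a rough illustration of how hazard and vulnerability layers can be combined into a risk index, the sketch below builds each layer as a weighted overlay of min-max normalized factor grids and then multiplies the two layers cell by cell. Everything specific here is an assumption made for illustration: the function names, the factor values, the weights, and the multiplicative form risk = hazard × vulnerability, which is one common formulation and is not confirmed as the model used in this study.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1]; a constant layer maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_overlay(layers, weights):
    """Weighted sum of normalized layers, with weights rescaled to sum to 1.

    layers:  dict mapping factor name -> list of cell values (same length).
    weights: dict mapping factor name -> non-negative weight.
    """
    if set(layers) != set(weights):
        raise ValueError("layers and weights must have the same factor names")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    lengths = {len(cells) for cells in layers.values()}
    if len(lengths) != 1:
        raise ValueError("all layers must cover the same cells")

    n_cells = lengths.pop()
    scaled = {name: min_max(cells) for name, cells in layers.items()}
    return [
        sum(weights[name] * scaled[name][i] for name in layers) / total
        for i in range(n_cells)
    ]


def risk_index(hazard, vulnerability):
    """Cell-wise product of hazard and vulnerability (assumed formulation)."""
    return [h * v for h, v in zip(hazard, vulnerability)]


# Hypothetical three-cell example with made-up values and weights.
hazard = weighted_overlay(
    {"rainfall": [900, 1100, 1300], "engineering_activity": [0, 2, 4]},
    {"rainfall": 0.6, "engineering_activity": 0.4},
)
vulnerability = weighted_overlay(
    {"population": [10, 200, 1000], "gdp": [1, 5, 9]},
    {"population": 0.5, "gdp": 0.5},
)
risk = risk_index(hazard, vulnerability)
```

In practice the continuous risk index would then be classified into levels (for example very low to very high); the classification scheme is not specified here and is left out of the sketch.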
5. Conclusions
Taking Fengjie County in the Three Gorges Reservoir Area as the research region, this paper studied quantitative risk assessment of landslides based on susceptibility mapping using random forest and GeoDetector. The main conclusions are drawn as follows:
(1) Nineteen dominant factors, such as elevation, lithology and groundwater type, were selected with GeoDetector as susceptibility assessment factors, and the comprehensive landslide susceptibility assessment model was established based on the RF model. Next, the annual average rainfall and human engineering activities were determined as risk assessment indexes, and the hazard assessment model of Fengjie County was established based on GIS software and the weighted superposition method. At the same time, POI kernel density, population, GDP and road cost were taken as the vulnerability evaluation factors, and the landslide vulnerability assessment model was established based on GIS grid technology and the weighted superposition method. On this basis, the landslide distribution and the quantitative risk assessment of Fengjie County were studied.
(2) Most landslides in Fengjie County are distributed on both banks of the reservoir and along the primary and secondary tributaries, with more landslides in the north than in the south. In quantitative terms, most of the study area is at a low risk level, and only a small part is at a high or very high risk level. Very low and low-risk areas accounted for the largest proportion, 73.7%, of the study area. The middle-risk areas and the high and very high-risk areas accounted for 23.79% and 2.5%, respectively. In terms of spatial pattern, the overall risk level is high in the central and eastern urban areas and low in the southern and northern high-altitude areas. Very low and low-risk areas are mostly distributed in the high mountains, where human activity is sparse, so the risk remains low even if a landslide occurs. The middle-risk areas are mostly located near scattered villages along valleys and rivers, so the villagers there face a higher landslide risk. High and very high-risk areas are located along the Three Gorges Reservoir and are concentrated in the central city, which is densely populated and densely built. Under the influence of reservoir water level rise and extreme weather such as heavy rainfall, landslides in these areas can cause serious casualties and property losses once they occur. The risk zoning results are consistent with the actual situation of Fengjie County and can provide a basis for disaster prevention and mitigation and for land space planning in the study area.
(3) The importance rankings of the landslide conditioning factors are in line with basic geological laws and the regional characteristics. Elevation, lithology, and groundwater type are the main factors. In addition, general, sub-key and key prevention and control areas were delineated so that landslide prevention and control management can be targeted. The general control area accounted for 73.71% of the area, with a landslide density of 0.07/km²; it is widely distributed and has low risk. Sub-key control areas are mostly located in the transition zones of the valleys, in landslide and unstable slope areas; they account for 23.79% of the area and have a landslide density of 0.82/km², nearly 12 times that of the general control area. The key prevention and control areas are divided into three sub-regions, which are mostly located around the central towns, landslides and highways, where the landslide density can reach 3–5 landslides/km². Landslide disasters there must be strictly prevented and controlled. Considering both prevention and control, zoning management is carried out according to local conditions and the differences in risk level among regions. This study helps management departments at all levels to make timely and accurate disaster prevention and mitigation decisions, and provides decision-making information for regional urban planning, land resources development, land use development and sustainable social and economic development.
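The zoning figures quoted in the conclusions can be cross-checked with simple arithmetic. The short sketch below recomputes the density ratio between the sub-key and general control areas and sums the area shares. It assumes that the key prevention and control area corresponds to the 2.5% reported for the high and very high-risk areas in conclusion (2); the variable names are illustrative.

```python
# Landslide densities reported in conclusion (3), in landslides per km^2.
general_density = 0.07
sub_key_density = 0.82

# Ratio of sub-key to general density ("nearly 12 times").
density_ratio = sub_key_density / general_density

# Area shares in percent; the key-area share is an assumption taken from
# the high and very high-risk share reported in conclusion (2).
area_shares = {"general": 73.71, "sub_key": 23.79, "key": 2.5}
total_share = sum(area_shares.values())

print(round(density_ratio, 1))  # 11.7
print(round(total_share, 2))    # 100.0
```

Under that assumption the three zones together cover the whole study area, and the density in the sub-key areas is about 11.7 times that in the general control area.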