Article

Landslide Susceptibility Assessment by Machine Learning and Frequency Ratio Methods Using XRAIN Radar-Acquired Rainfall Data

by José Maria dos Santos Rodrigues Neto 1 and Netra Prakash Bhandary 2,*
1 Canaan Geo Research Ltd., Matsuyama 791-1106, Japan
2 Faculty of Collaborative Regional Innovation, Ehime University, Matsuyama 790-8577, Japan
* Author to whom correspondence should be addressed.
Geosciences 2024, 14(6), 171; https://doi.org/10.3390/geosciences14060171
Submission received: 6 May 2024 / Revised: 13 June 2024 / Accepted: 14 June 2024 / Published: 18 June 2024
(This article belongs to the Special Issue Landslide Monitoring and Mapping II)

Abstract

This study compares the efficiency of four methods for producing landslide susceptibility maps (LSMs): random forest (RF), artificial neural network (ANN), and logistic regression (LR) as machine learning (ML) techniques, and frequency ratio (FR) as a statistical method. The study area is located in southern Hiroshima Prefecture in western Japan, a locality known to suffer from rainfall-induced landslide disasters, the most recent one in July 2018. The landslide conditioning factors (LCFs) considered in this study are lithology, land use, altitude, slope angle, slope aspect, distance to drainage, distance to lineament, soil class, and mean annual precipitation. The rainfall LCF data comprise XRAIN (eXtended RAdar Information Network) radar records, which are novel in the task of LSM production. The accuracy of the produced LSMs was calculated with the area under the receiver operating characteristic curve (AUROC), and an automatic hyperparameter tuning and result comparison system based on AUROC scores was utilized. The calculated AUROC scores of the resulting LSMs were 0.952 for the RF method, 0.9247 for the ANN method, 0.9016 for the LR method, and 0.8424 for the FR method. It is also noteworthy that the ML methods are substantially faster and more practical than the FR method and allow for multiple, automatic experiments with different hyperparameter settings, providing fine and accurate outcomes with the given data. The results indicate that ML techniques are more efficient when dealing with hazard assessment problems such as the one exemplified in this study. Although the conclusion that the RF method is the most accurate for LSM production agrees with the findings of other authors in the literature, ML method efficiency may vary depending on the specific study area, and thus the use of an automatic multi-method LSM production system with hyperparameter tuning, such as the one utilized in this study, is advised. It was also found that XRAIN radar-acquired mean annual precipitation data are effective when used as an LCF in LSM production.

1. Introduction

The United Nations Office for Disaster Risk Reduction reports that between 1998 and 2017, more than 4.8 million people were affected, and more than 18,000 lost their lives in landslide disasters [1], which are expected to become more frequent in the next few decades due to urbanization, deforestation, and especially climate change effects [2]. Landslide damage is particularly heavy in urban areas where the topographical settings force urbanization into or near mountainous areas. Although predisposed by local factors attributed to the slope, landslides primarily occur in response to a triggering factor. According to Osanai et al., 17,640 cases of slope failures were triggered by rainfall in Japan between 1972 and 2007 [3].
A recent case of rainfall-induced landslide disaster in Japan was recorded in July 2018. The landslides, together with massive flooding in a large part of Southwest Japan, were caused by heavy rains, an event officially referred to as “Heavy Rains of July, Heisei Year 30”. During the course of about 10 days, from 28 June until 8 July, rainfall records reached as high as 1800 mm on the island of Shikoku and 1200 mm in the Tokai region [4]. The most severe period of the disasters was between 5 and 7 July, when most of the rainfall intensity peaks were observed, and many landslides and floods occurred. Property loss caused by the July 2018 disasters was estimated to be ¥1.16 trillion (i.e., about US$ 10 billion), including damage to industries and public infrastructures [5]. Although emergency warnings were issued for eight prefectures, the death toll caused by the landslides and floods during the July 2018 disasters was above 225. The most affected prefecture was Hiroshima, with 113 fatalities, and one of the most affected areas was Kure City, with 24 deceased due to landslides. Additionally, most transportation lines into the city (except maritime ways) were cut off, and 760 houses were damaged.
A valuable source of data that has been used extensively in recent years for the analysis of rainfall-induced landslides in the context of Japan [6,7,8,9] is XRAIN (eXtended Radar Information Network). Sponsored by the Ministry of Education, Culture, Sports, Science, and Technology (MEXT) and available through the University of Tokyo’s Data Integration & Analysis System (DIAS) platform since 2014, XRAIN offers real-time rainfall measurements using multi-parameter (MP) radars [10]. These radars provide spatially accurate rainfall intensity data. In landslide susceptibility mapping, rainfall-related data are commonly based on point measurements from rain gauges located in or near the municipality’s center. However, rain gauge data are limited to specific points and must be interpolated for the areas in between, potentially leading to inaccuracies, especially over large distances. In contrast, radar-based rainfall data sample a two-dimensional grid around the measurement equipment, providing genuinely measured and geographically accurate results when visualized in GIS for large areas. Given the variation in parameters over short distances in landslide susceptibility mapping, utilizing radar-acquired rainfall data like XRAIN can offer significant benefits.
One of the strategies for minimizing the damage of landslide disasters is the production and use of landslide susceptibility maps (LSMs), which assess the probability of landslide occurrence in an area considering slope failure-related factors and the actual occurrence of past landslides in a GIS platform. An LSM is a vital tool for urban planning and risk management of urbanized areas prone to landslide activity. Currently, there are various methods of calculating spatial landslide probability and producing LSMs. Yilmaz et al. [11] argue that the frequency ratio (FR) method is one of the most practical and efficient methods for landslide susceptibility calculation in the GIS platform. However, advancements in programming and computation technology in recent years have brought about the extensive use of machine learning (ML) methods in a myriad of application areas, including the domain of natural hazards and landslide susceptibility assessment, as advocated by Goetz et al. [12], Youssef and Pourghasemi [13], Ado et al. [14], and others. Liu et al. [15], in a review of the most recent advancements in LSM production, find that although specific methods may perform differently in different study areas, ML-based methods efficiently produce reliable LSMs when sufficient, good-quality data are available. They also argue that the application of hybrid or ensemble models further improves the map’s accuracy, a conclusion also verified by Fang et al. [16] and Ado et al. [14]. These authors also point out that the introduction of generative adversarial networks (GAN) and transfer learning techniques allows the production of accurate maps even in areas with an insufficient landslide inventory.
Many studies also point out that deep learning-based methods present excellent results in LSM production [17,18], but these models require a very large data inventory to satisfactorily train their complex neural networks, which is usually not available in most situations, even in developed countries such as Japan [15]. Huang et al. [19] also attempted the production of LSMs with the approach of utilizing rainfall data by coupling susceptibility maps and rainfall threshold models. A similar approach was also proposed by Palau et al. [20] to integrate an LSM with real-time radar-acquired rainfall data.
Attempts at landslide susceptibility mapping of the area of Kure City have been made by Wu and Nakahita [7] with the logistic regression (LR) method using XRAIN radar-acquired rainfall data as a landslide conditioning factor (LCF) along with soil water index (SWI) and lithology type, as well as landslide data from the July 2018 disasters. Another attempt was made by Rodrigues Neto [21] with the FR method using a wider collection of LCFs.
Among many different ML methods, one of the most recommended for landslide susceptibility assessment is the random forest (RF) method, as also pointed out by Goetz et al. [12], Chen et al. [22], Youssef and Pourghasemi [13], Li et al. [23], Xia et al. [24], Imtiaz et al. [25], Huang et al. [19], and Daviran et al. [26].
As the main validation assessment of produced LSMs, the great majority of recent LSM studies use the area under the curve (AUC) of the receiver operating characteristic (ROC) test, where 0.5 is considered the threshold value and values closer to 1 represent more accurate results [14,27,28,29,30].
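To make the meaning of the AUC score concrete, the following minimal sketch computes it through its rank-based interpretation: the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen non-landslide sample. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive (landslide) sample outscores a negative
    (non-landslide) sample; ties count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for six samples (1 = landslide).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = auroc(labels, scores)
```

A score of 0.5 corresponds to a model that ranks the two classes no better than chance, which is why it serves as the threshold value. The pairwise loop is quadratic in the number of samples, so a sorting-based implementation would be preferable for full-size maps.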
Although much work has been performed in recent years on ML-based LSM production, most studies acknowledge that the presented methods and conclusions are not universally applicable: specific study area circumstances in LCFs, such as topography, lithology, or rainfall characteristics, may affect the susceptibility technique, and different methods may perform differently in different areas [14,15]. For that reason, it is important to experiment with landslide susceptibility mapping methods for specific study areas. Although LSMs for the study area have been produced with the use of XRAIN radar data, no attempt has been made yet to use ML techniques and an extensive LCF collection.
This study aims at improving LSM production methods with the novel aspect of using XRAIN radar-acquired rainfall data as one of the LCFs and producing the maps using random forest (RF), artificial neural networks (ANN), and logistic regression (LR) as three different ML algorithms. For comparison purposes, the statistical method of frequency ratio (FR) was also employed to produce the LSM. An accuracy assessment is performed for the produced maps to find out the most efficient ML algorithm for LSM production in the proposed study area. A new and original computer-code-based method is also proposed to perform LSM production, performance assessment and comparison automatically, a technique which would significantly improve both practicability and accuracy of LSM production in virtually any study area where data are available. The research aims to use Kure City (Hiroshima Prefecture) as a study area and the July 2018 heavy rain-induced disasters as a specific case for the study since both landslide and rainfall records are abundant in the specific event for the area. It is expected that the investigations and experimentations concerning LSM production will not only lead to improvements in landslide disaster prevention strategies through the creation of better LSMs and the development of an original, more practical, and accurate landslide susceptibility mapping method but also provide insights into the characteristics of rainfall-induced rapid-moving landslides in the context of Japan, and in other regions with similar geological, geomorphological, meteorological, and social circumstances.

2. Materials and Methods

2.1. Study Area

For this study, a rectangular area of 390.5 km² (approximately 28 km × 14 km) covering Kure City, in the south of Hiroshima Prefecture, was used (Figure 1). The geographic limits of the study area, in decimal degrees (dd), are Top: 34.325104; Right: 132.800008; Bottom: 34.197426; Left: 132.500202.
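As a quick consistency check of the stated extent, the sketch below converts the four bounding coordinates into approximate ground distances. It assumes a spherical Earth and a small rectangle, so the result is only an approximation; the function name `bbox_size_km` and the radius constant are illustrative choices, not part of the study.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius (spherical approximation)


def bbox_size_km(top, right, bottom, left):
    """Approximate width, height, and area of a small lat/lon rectangle."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    mid_lat = math.radians((top + bottom) / 2.0)
    height = (top - bottom) * km_per_degree
    # Meridians converge toward the poles, so scale by cos(latitude).
    width = (right - left) * km_per_degree * math.cos(mid_lat)
    return width, height, width * height


width_km, height_km, area_km2 = bbox_size_km(
    34.325104, 132.800008, 34.197426, 132.500202
)
```

Under these assumptions the computed width, height, and area come out close to the stated 28 km × 14 km and 390.5 km²; a projected coordinate system would give the exact figure.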
A geographically small port town adjacent to the Seto Inland Sea, Kure started as a shipbuilding facility at the end of the 19th century. Kure was soon made into a major dockyard and military base; the port and the city around it grew quickly due to the Imperial Japanese Navy’s and its facilities’ rapid development until the end of World War Two. However, the area’s flat terrains are cramped and limited by mountains (a common scenario in Japan), which forced the town’s rapid expansion into and near adjacent hills. As a result, a lot of the town’s habitations are located in rugged terrain. In fact, about 14% of the buildings framed in the study area are on slopes with steepness above 20°, a situation susceptible to landslides. This risky scenario was explicit in the heavy damages suffered by the city of Kure in the July 2018 heavy rain disasters.
Topographically, the area mostly comprises rugged terrain: of the 390.5 km² of land in the study area, 191 km² (about 49%) is composed of slopes steeper than 20°. The altitude varies between sea level around the dockyard areas and 839 m at the peak of Mt. Noro. Other prominent peaks in the area include Mt. Enofuji (664 m), Mt. Yasumi (497 m), and Mt. Ege (568 m).
Geological mapping performed by Yamada et al. [32] in the area shows that the region’s lithological setting is composed primarily of igneous rocks of the Late Cretaceous, predominantly biotite granite and hornblende–biotite granite of the Hiroshima Granitic Rocks group, commonly referred to as “Hiroshima Granite”. In the study area, these rocks are distributed in most of the western portion of the map (Mt. Eboshiwa, Mt. Ege, Mt. Myojin, Mt. Tengujo, Mt. Hachimaki, Mt. Mitsuishi, Mt. Yasumi, Mt. Mitsumine, and Mt. Hisago), as well as in a peninsula in the far east of the area (Mt. Mizunoura).
The other Late Cretaceous igneous rocks, which are secondary but still widespread in the area, particularly in Kure City, are volcanic rocks from the Takada Rhyolites and Hikimi Group, namely dacite to rhyolite welded tuff, tuffaceous sandstone, mudstone conglomerate, found in the central–west (Mt. Ozumi, Mt. Enofuji, Mt. Age, and Mt. Tsuchi) and the far northeastern part of the area (Mt. Kanashioku, Mt. Mosuke, Mt. Nada), as well as rhyolite welded tuff with non-welded pyroclastic rock, dacite welded tuff, found in the central east part of the area (Mt. Noro).
The region is occasionally cut by granite porphyry and granophyre dikes of the Late Cretaceous (though younger than the Hiroshima Granitic Rocks), which cross the granitic rocks in NE–SW orientation and the granodioritic and dacitic rocks in NW–SE orientation. The Quaternary deposits in the area, found along river valleys, plains, and deltas, comprise sand, mud, and gravel from the Saijo Formation, as well as gravel, sand, and mud alluviums from the Holocene. Much of the port and dockyard terrain (sometimes directly adjacent to the granitic mountains) comprises artificial reclamation grounds.
Regarding the structural geology arrangement of the study area, the regional structures are primarily dominated by NE–SW lineaments (especially near Mt. Noro), which might be related to the Triassic Maizuru Belt [33]. This orientation governs many morphological aspects of the area, as exemplified by Mt. Yasumi’s peninsula or Kurose River and its valley. Though not so common in a regional aspect, the study area around Kure City also presents an NNW–SSE fault line along the northwestern granites of the area, near Mt. Ege and Mt. Myojin.
The predominant bedrock lithology of the area, coarse biotite granite/granodiorite and hornblende–biotite granite/granodiorite of the Hiroshima Granitic Rocks group, or simply “Hiroshima Granites”, is very easily weathered, changing into a soil commonly referred to as Masado, or “real sand” in Japanese [34]. Masado granitic soil is known to have good permeability and to be very brittle when wet, which makes it prone to losing its structure and stability when rainfall infiltration occurs and, thus, highly susceptible to landslides during heavy rainfall events.
Additionally, according to Chigira [35], granites in mountainous areas are microscopically sheeted at depths of around 50 m in response to the stress of the ridge morphology. According to the author, these micro-sheet configurations stimulate the formation of cracks in the granite due to stress release, temperature change, and water content near the surface, thus facilitating landslide activity in events of heavy rainfall.
The Seto Inland Sea region receives little rainfall compared to the surrounding coastal areas of Japan facing the Sea of Japan and the Pacific Ocean. Although a fairly large part of the oceanic precipitation clouds is blocked either by the Chugoku Mountains to the north or the Shikoku Mountains to the south, and the region is considered relatively dry [36], heavy rainfall is particularly concentrated in mountainous areas. In Kure, the average annual precipitation ranges from 1000 to 1600 mm, characterizing a relatively mild rainy zone. Mountain areas around the Seto Inland Sea, however, reach an average annual precipitation of 2000 mm to 3000 mm. The heaviest rainfall of the year occurs between June and July, when the average monthly precipitation reaches 227 mm [37].

2.2. Landslide Conditioning Factors (LCFs)

In this study, nine LCFs (i.e., slope failure-causing factors), as detailed in the following sub-sections, were used for landslide susceptibility assessment.

2.2.1. Lithology

The physical properties and weathering susceptibility of rocks depend on their lithological setting. Numerous studies (e.g., [11,35,38,39,40,41,42,43]) have explored this relationship. In the study area, granites with medium to coarse granulometry are highly prone to weathering [34], making them susceptible to landslides. Guzzetti et al. [38] also found that slope failure events correlate with sedimentary and tectonic discontinuities, particularly between hard and soft rocks and permeable and impermeable layers.
However, detailed and spatially localized weathering grading requires time-consuming fieldwork, making it impractical for large-scale analysis. Therefore, weathering levels should be considered in the context of the specific location’s climate and geological features. Sediments and deposits in our study area are mostly found in landscapes with low to neutral slope inclinations, leading to a low correlation with the probability of slope failure.
The lithology LCF in this study originates from the geological mapping performed by Yamada et al. [32] and is presented in a 1:200,000 Hiroshima geological map (NI-53-33).

2.2.2. Land Use

Landslides, though a natural process in landscape shaping, can be exacerbated by human influences [43]. Changing drainage patterns for habitation or agriculture is a common human cause of landslide activity. Destabilizing slopes through construction overload or poorly planned re-landscaping also reduces slope stability. Deforestation increases landslide probability due to the importance of vegetation cover. In Japan, the cramped setting of plain areas forces the population onto rugged terrain for habitation, agriculture, or industry, making land use a significant factor in landslide probability compared to more spacious areas. The land use LCF of this study originates from data provided by Japan’s National Land Information Division (NLID), Ministry of Land, Infrastructure, Transport and Tourism (MLIT) [44].

2.2.3. Elevation

The elevation attribute is often used in landslide risk assessment due to its influence on contextual geomorphology and geology [11,40]. Low elevations typically feature gentle terrains with thick soils, while high elevations consist of leveled mountain summits or cliffs with strong bedrock [40]. This leads to a concentration of landslide risk in areas with intermediate elevations. However, the impact of elevation on slope failure occurrence is relative to the regional geology of the research area, so it requires specific consideration and adjustment rather than adopting global characteristics. The altitude LCF of this study is derived from the digital elevation model (DEM) provided by the Geospatial Information Authority of Japan (GSI) [45].

2.2.4. Slope Angle

Slope angle significantly influences landslide susceptibility. In a uniform slope with isotropic material, failure probability increases with the slope gradient [39]. Thus, slopes with low angles have low to insignificant landslide probability. However, actual material stability varies along slopes based on geology and weathering levels, and as a result, slopes with very high gradients are often more resistant to slope failure [40]. In the study area, high-inclination slopes (usually above 50°) are composed of outcropping igneous bedrock sections that are comparatively resistant to slope failure due to limited weathering impact. The slope angle LCF in this study was derived from the DEM provided by the GSI [45].

2.2.5. Slope Aspect

The relation of slope aspect (dipping direction) with landslide probability is not fully understood. Yet, some investigators use it in statistically calculating landslide probability, as patterns between failure occurrence and slope direction are commonly observed. Two widely accepted hypotheses for the slope aspect’s effect on landslides are: (a) it relates to the area’s general physiographic trend, and (b) depending on the slope direction in relation to the sun’s trajectory, moisture may be more or less prone to evaporation, affecting landslides differently based on the main precipitation direction [46]. The slope aspect LCF in this study was derived from the DEM provided by the GSI [45].
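The derivation of slope angle and slope aspect from a DEM can be illustrated with a minimal sketch based on central differences. This is only an assumption about how such rasters can be computed: the GIS tool actually used in the study may apply a different (for example, weighted 3 × 3) kernel. The function name `slope_aspect`, the grid orientation (row 0 at the northern edge), and the toy elevations are hypothetical.

```python
import math


def slope_aspect(dem, row, col, cell_size):
    """Slope (degrees) and downslope aspect (degrees clockwise from north)
    at an interior DEM cell, using central differences.

    Assumes row 0 is the northern edge and columns increase eastward.
    """
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2.0 * cell_size)  # eastward
    dz_dy = (dem[row - 1][col] - dem[row + 1][col]) / (2.0 * cell_size)  # northward
    slope = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0.0 and dz_dy == 0.0:
        return slope, None  # flat cell: aspect is undefined
    # The downslope direction is opposite to the gradient vector.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect


# Hypothetical 3 x 3 DEM (metres) rising toward the east on a 20 m grid.
dem = [
    [0.0, 20.0, 40.0],
    [0.0, 20.0, 40.0],
    [0.0, 20.0, 40.0],
]
slope_deg, aspect_deg = slope_aspect(dem, 1, 1, 20.0)
```

In this toy grid the terrain gains 20 m of height per 20 m cell toward the east, so the centre cell is a 45° slope facing west.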

2.2.6. Distance to Drainage

Drainage systems significantly influence landscape morphology and slope settings and are often associated with increased landslide activity [42]. Slope stability models depend heavily on subsurface water conditions, such as pore water pressure, which are influenced by groundwater systems associated with surface-level drainage systems. In landslide risk assessments, the drainage system factor is usually considered in terms of relative distance to drainage channels or drainage density values [11,40,41]. This study uses distance to drainage as the relevant measure, extracted from the DEM provided by the GSI [45].

2.2.7. Distance to Lineaments

Geological structures are strain marks or “wounds” caused by tectonic forces throughout a rock’s history [47]. They are crucial for understanding rock formation and have practical applications in economic exploration and geotechnics. In map view, these structures are represented as lineaments. Faults, folds, foliations, and joints are common structural defects that weaken the rock’s cohesion and contribute to slope instability. Therefore, geological structures, particularly lineaments, are essential in assessing landslide risk [11,38,39,43,48,49]. In GIS analysis, they are considered as the distance to lineament factor. The distance to lineament LCF in this study is derived from a 1:200,000 geological map (NI-53-33) by Yamada et al. [32].

2.2.8. Soil Classification

The topsoil characteristics of a slope affect its stability and landslide susceptibility because they greatly influence how rainwater infiltrates the slope material, and they are therefore considered an important LCF [50]. The soil class LCF utilized in this study was prepared from a 1:50,000 Kure soil classification map (NI-53-33-7) by Tanimoto et al. [51].

2.2.9. Rainfall

Landslide disasters are often triggered by increased pore water pressure and saturation due to heavy rainfall [34,35,43,52,53,54,55,56]. The July 2018 Southwestern Japan disasters exemplify how landslides are commonly accompanied by flooding. However, the association is not solely due to precipitation and ground saturation. Flooding can cause landslides by cutting the basal section of slopes around stream banks. Additionally, debris flows and mudflows can be mistaken for floods, as they occur in small drainage systems and may happen simultaneously. On the other hand, landslides may deposit debris into rivers, causing sudden water level changes and even dam collapses in severe cases [43].
Although rainfall is commonly regarded as a triggering factor of landslides more than a causative factor, it is found that long-term precipitation localization patterns are recurrent, and the rainfall localization patterns of new rainfall events are generally expected to follow mean annual precipitation localization patterns [31,57,58], a phenomenon mostly associated with topographical interactions, and thus particularly remarkable in areas with rugged terrain such as Kure. For that reason, it is expected that recorded mean annual precipitation may be utilized in the prediction of future rainfall localization, thus making it a relevant LCF.
The mean annual precipitation data utilized in this study cover the years 2016 to 2021 and were derived from XRAIN radar records, which are described in detail in the following section.

2.3. XRAIN Radar-Acquired Rainfall Data

One of the landslide-causing factors commonly used in the production of LSMs is precipitation volume. However, most studies use rainfall data acquired through rain gauge measurement stations, which are only representative of the measurement location and not detailed enough for spatial modeling problems. New strategies in precipitation measurement have been developed in a few countries, such as Japan. XRAIN (eXtended Radar Information Network) is operated by the Ministry of Land, Infrastructure, Transport and Tourism (MLIT), and its data are made available through the University of Tokyo’s Data Integration & Analysis System (DIAS) platform. Starting operation in 2014, XRAIN comprises a real-time rainfall measuring system based on multi-parameter (MP) radars, which allow for more spatially accurate measurements of rainfall volume. In contrast to more conventional point-based or interpolated rain gauge station data (commonly used in landslide susceptibility mapping methods), radar-acquired data allow for two-dimensional sampling of rainfall over an area, which provides a much more detailed picture of the spatial distribution of rainfall.
In this research, the XRAIN rainfall data were downloaded as .zip packed .csv files spaced in 5 min, 30 min, or 1 h intervals. The .csv files comprise tables with cells spatially organized so that each cell represents a 287 × 230 m pixel in a north-oriented grid representing the designated area, and each cell’s value expresses the rainfall intensity in mm/h at the time of measurement. For the study area, each of the files comprised a 97 × 67 grid with 6499 pixels. XRAIN data collection for the Chugoku region (where Kure is located) started in 2016, so the collected data span from 2016 to 2021, summing up to more than 60,000 records.
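The conversion from intensity records to accumulated rainfall can be sketched as follows. This is a minimal illustration that assumes each .csv record is a plain comma-separated grid of intensities in mm/h and that the intensity stays constant over the sampling interval; the actual XRAIN file layout (headers, missing-value codes) may differ, and the function names and toy records are hypothetical.

```python
import csv
import io


def read_intensity_grid(csv_text):
    """Parse one record into a list of rows of rainfall intensity (mm/h)."""
    reader = csv.reader(io.StringIO(csv_text))
    return [[float(value) for value in row] for row in reader if row]


def accumulate_depth(grids, interval_minutes):
    """Sum intensity grids into a rainfall depth (mm) per cell.

    Assumes every grid has the same shape and that each intensity value
    holds for the whole sampling interval.
    """
    hours = interval_minutes / 60.0
    n_rows, n_cols = len(grids[0]), len(grids[0][0])
    total = [[0.0] * n_cols for _ in range(n_rows)]
    for grid in grids:
        for r in range(n_rows):
            for c in range(n_cols):
                total[r][c] += grid[r][c] * hours
    return total


# Two hypothetical 2 x 3 records sampled at a 30 min interval.
records = ["0,6,12\n2,4,0\n", "0,6,0\n2,0,0\n"]
grids = [read_intensity_grid(text) for text in records]
depth_mm = accumulate_depth(grids, 30)
```

Summing such depths over each calendar year and averaging the yearly totals per cell would give a mean annual precipitation grid of the kind used as an LCF.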

2.4. LSM Production Methods

In this study, four different methods of LSM production were tested and compared for prediction accuracy. As a general statistical approach, the frequency ratio (FR) method was used, while the machine learning (ML) techniques random forest (RF), artificial neural network (ANN), and logistic regression (LR) were adopted in order to find which one is the most efficient for the intended objectives. The target area for the LSM production contains 432,258 pixels at 20 m resolution.
In order to compare the above four methods, an identical collection of nine LCFs was used in shapefile form: (a) lithology, (b) land use, (c) altitude, (d) slope angle, (e) slope aspect, (f) distance to drainage, (g) distance from lineaments, (h) soil class, and (i) mean annual precipitation. These LCFs are presented in a map view in Figure 2.
Multi-collinearity of the LCFs, that is, a phenomenon in which two or more factors are too closely correlated, would distort the estimates of the susceptibility assessment model and reduce their accuracy [22,59]. In order to determine the independence of LCFs, many authors utilize the variance inflation factor (VIF) method, where it is generally accepted that VIF > 5 indicates potential multi-collinearity [22,60,61]. The VIF values of the 8 LCFs used in this study are shown in Table 1, ranging from 1.03 for the lowest value to 2.14 for the highest value, indicating that the LCFs are independent and show no multi-collinearity.
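The VIF check can be sketched with the standard identity that the VIF of each factor equals the corresponding diagonal entry of the inverse of the factors' correlation matrix (equivalently, 1/(1 − R²) when that factor is regressed on the others). The sketch below uses only the Python standard library; the helper names and the toy columns are hypothetical and do not reproduce the study's data or software.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Gauss-Jordan matrix inversion with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: perfectly collinear factors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF of each factor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(columns))]


# Hypothetical factors: the third column nearly duplicates the first.
columns = [
    [1.0, 2.0, 3.0, 4.0],
    [1.0, -1.0, -1.0, 1.0],
    [1.1, 1.9, 3.2, 3.8],
]
vif_values = vif(columns)
```

In this toy case the first and third factors exceed the VIF > 5 threshold while the second does not, which is the pattern that would lead to one of the two correlated factors being dropped.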
The DEM used in this study was provided by the GSI of Japan [45]. The land use data were provided by the NLID, MLIT of Japan [44]. Lithology and lineament data were extracted from the Hiroshima Geological map by Yamada et al. [32]. The mean annual precipitation data comprises recorded rainfall between 2016 and 2021 with XRAIN technology. A summary of all the data used in the analysis is shown in Table 2.
The landslide inventory used in this study comprises 1177 landslide points from the July 2018 disasters, which were mapped from aerial photography immediately after the disasters and provided by the GSI of Japan [45] (Figure 1). The landslide points were randomly separated into training and test sets, with a ratio of 70% for the training set (824 landslide points) and 30% for the test set (353 landslide points). These points are distributed over the pixels of the study area, dividing it into landslide pixels and non-landslide pixels, of which the non-landslide pixels are far more numerous.
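A random 70/30 split of this kind can be sketched as below. The seed, the function name `split_inventory`, and the use of integer identifiers in place of real landslide points are assumptions made for illustration; the study does not state how the random separation was implemented.

```python
import random


def split_inventory(points, train_fraction=0.7, seed=42):
    """Shuffle landslide points and split them into training and test sets."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Integer identifiers stand in for the 1177 mapped landslide points.
landslide_ids = range(1177)
train_set, test_set = split_inventory(landslide_ids)
```

With 1177 points, rounding 70% gives 824 training points and leaves 353 for testing, matching the counts reported above.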

2.4.1. Frequency Ratio (FR) Method

The FR method assumes that landslide occurrence is determined by factors that are related to the event and, thus, that new landslides will generally occur under the same conditions as past landslides [11,48,62]. In the FR method, FR values represent the ratio between the landslide occurrence and the area of a given factor class. In order to acquire values representative of FR analysis, the landslide occurrence and factor area values are normalized by calculating the percentage that the specific class represents of the total area or of the total number of landslides in the analysis [11]. For example, in a study area with 100 km² and 20 landslides, a factor class of 10 km² and 5 landslides represents 10% of the analyzed area and 25% of the landslides, resulting in an FR value of 2.5 for that class. In FR calculation, 1 is considered the average value, and thus, values larger than 1 represent a higher correlation of landslide occurrence with the given attribute and, consequently, a higher risk [63]. In some FR method calculations, such as the one conducted by Yan et al. [64], FR values are normalized according to the other classes of FR values in that factor for the sake of a smooth analysis process. After the FR is calculated for each range or type of factor, the values are summed in each pixel of the susceptibility map to calculate the landslide susceptibility index (LSI) for that specific pixel [11,48]:
$$\mathrm{LSI} = \sum \mathrm{FR}$$
Once the LSI values are established for each pixel of the map, a final LSM is produced, where higher LSI values represent a higher risk of landslide occurrence in that location.
In this study, the platforms used for FR calculation were ArcGIS Pro (for data manipulation) and Microsoft Excel (for calculations and graphical presentations). Each of the landslide-related factors was reclassified into 10 classes (for the quantitative factors) or was used in the original classification (for qualitative factors). The number of landslides in each class was found using the spatial join analysis tool in ArcGIS, and the class area was calculated using the calculate geometry tool. With the variables necessary for FR calculation obtained, the FR value for each factor class was calculated in the field calculator. Finally, the LSI value for each 20 m pixel of the study area was obtained by summing up the FR values of the factor classes that the pixel falls into, resulting in the FR-based LSM.
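The FR and LSI calculations described in this section can be sketched in a few lines. The study performed them in ArcGIS Pro and Microsoft Excel; the Python functions, factor names, class labels, and FR table values below are hypothetical and serve only to reproduce the arithmetic, including the worked example of a 10 km² class holding 5 of 20 landslides.

```python
def frequency_ratio(class_area, total_area, class_landslides, total_landslides):
    """FR = (share of landslides in the class) / (share of area in the class)."""
    area_share = class_area / total_area
    landslide_share = class_landslides / total_landslides
    return landslide_share / area_share


def landslide_susceptibility_index(pixel_classes, fr_tables):
    """Sum, over all factors, the FR of the class the pixel falls into."""
    return sum(fr_tables[factor][cls] for factor, cls in pixel_classes.items())


# Worked example from the text: 10 km2 of 100 km2, 5 of 20 landslides.
fr_example = frequency_ratio(10.0, 100.0, 5, 20)

# Hypothetical FR tables for two factors and the classes of one pixel.
fr_tables = {
    "slope_angle": {"0-10": 0.2, "20-30": 1.8},
    "lithology": {"granite": 1.5, "alluvium": 0.1},
}
pixel = {"slope_angle": "20-30", "lithology": "granite"}
lsi = landslide_susceptibility_index(pixel, fr_tables)
```

The worked example yields an FR of 2.5, and the hypothetical pixel accumulates an LSI of 3.3 from its two factor classes; in the study the same sum runs over all LCFs for every 20 m pixel.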

2.4.2. ML Methods

ML is a form of artificial intelligence comprising methods where a system “learns” based on a set of data, looking for patterns in it and how they affect a certain result relative to a problem. This technology has proven successful in completing varied specialized tasks when set up with a sufficient dataset and adequate hyperparameter tuning. The past decade has seen a substantial rise in the use of ML techniques in various areas of application, such as autonomous driving, health care, finance, manufacturing, and energy [65]. The utilization of ML methods in the domain of natural hazards (including landslide susceptibility assessment) is supported by Goetz et al. [12] and Youssef and Pourghasemi [13]. The advantages of ML methods include the ability to adjust their internal structure to the data, to extract knowledge automatically from a dataset, and to build accurate classification and regression models, as well as efficiency and practicability even in large areas [13,66,67].
ML includes various specific techniques with varied degrees of complexity. Although all these techniques employ the common concept in ML of identifying patterns in data and building a prediction model, the specific mechanisms and algorithms used differ.

Random Forest (RF) Method

Much of the literature suggests that the most effective ML technique for the specific case of landslide risk assessment mapping is the RF technique [11,13]. The RF technique is an extension of another ML method, the decision tree. The decision tree technique is a supervised ML method; that is, a set of inputs (samples) and outputs (labels) are given so that the algorithm may figure out a function (model) to find outputs for “unlabeled” inputs. For this specific method, the machine observes a provided dataset and creates a “tree” (i.e., a prediction model) that best represents it. In the “tree”, every time the data are split according to the value of a parameter, a new “branch” forks from the tree at what is called a decision node. The splitting continues for the other parameters, creating new “branches” at decision nodes until it finally ends in a “leaf”, representing a predicted outcome for that particular set of parameters in the data. Due to their simplicity and intelligibility, decision trees are a popular ML technique [68].
However, decision tree learners are prone to creating over-complex trees that do not represent the data in a generalized way, a problem known as overfitting. This limitation is addressed by the RF ML technique, which, as the name suggests, is an ensemble model comprising a “forest” with many decision trees (Figure 3). Each tree is trained as an independent, randomized experiment, and the forest outputs an ensemble-averaged prediction over all of its trees, which reduces overfitting by avoiding reliance on the result of a single (possibly unrepresentative) decision tree. Data used in the RF learning method require little transformation, as the algorithm deals with outliers and missing values [11].
In this study, the RF algorithm utilized was the one provided in the scikit-learn ML library developed by Pedregosa et al. [69] (sklearn.ensemble.RandomForestClassifier), run on the Jupyter Notebook platform with the Python 3.9 programming language.
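As a rough illustration of how this classifier is typically called, the sketch below fits `RandomForestClassifier` on synthetic data. The dataset from `make_classification` is only a stand-in for the LCF table (nine columns, label 1 for landslide pixels and 0 for non-landslide pixels is an assumed encoding), and the parameter values shown are illustrative rather than the ones used in this study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the pixel table: 9 feature columns mimic the nine LCFs.
X, y = make_classification(
    n_samples=2000, n_features=9, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Illustrative settings only; min_samples_leaf is the parameter tuned in the paper.
rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=5, random_state=0)
rf.fit(X_train, y_train)

# predict_proba averages the class probabilities of all trees in the forest;
# column 1 is the probability of the positive (landslide) class.
susceptibility = rf.predict_proba(X_test)[:, 1]
rf_auc = roc_auc_score(y_test, susceptibility)
print(f"AUROC on held-out synthetic data: {rf_auc:.3f}")
```

Using the averaged probability rather than the hard class label is what allows the output to be mapped as a continuous susceptibility value.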

Artificial Neural Network (ANN) Method

Although all ML models are in some way inspired by the human brain, ANNs most directly resemble the brain structure, as is suggested in the method’s name itself. ANNs are composed of nodes called artificial neurons, which work by transmitting signals to the next node, much like synapses of neuron cells in a biological mammalian brain [70].
In its computational model, ANNs acquire, represent, and compute a mapping from a multivariate space of information to another based on a dataset that represents that mapping [71]. The most commonly utilized ANN method is the backpropagation learning algorithm, which consists of creating a network composed of one input layer, one output layer, and various hidden layers in between them [72]. Each layer is composed of processing unit nodes (neurons), and the nodes in one layer are connected to the ones in other layers by attributed weights and transfer functions. Each neuron receives “signals” (represented by real numbers) from the neurons of the previous layer, and its output to the subsequent layer is computed through a non-linear function of the sum of its inputs, and the “strength” of its signal is determined by the weight of the connection, attributed with training. The result is processed from the input layer through hidden layers until reaching the output layer, which represents the result calculated by the model. As in other ML methods, the ANN model goes through a training process equipped with a dataset of known input and results, adjusting the weights of its connections based on a trial-and-error process known as supervised learning. The process goes on until it gets enough “experience” to finely tune the communication between layers so as to minimize errors and achieve the best results [70]. An illustration of the ANN concept is shown in Figure 4.
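The signal flow described above (weighted sum of inputs, non-linear function, output passed to the next layer) can be sketched in plain Python. This is a forward pass only, with untrained random weights; backpropagation is not shown. The sigmoid activation, the helper names, and the random inputs are assumptions made for illustration, and the layer sizes simply mirror nine LCF inputs and the three hidden layers of 20 nodes reported in Section 3.4.

```python
import math
import random


def sigmoid(z):
    """Non-linear transfer function squashing any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def random_layer(n_in, n_out, rng):
    """One layer: a weight vector and a bias for each of its n_out neurons."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def forward(inputs, layers):
    """Propagate the input signal layer by layer until the output layer."""
    signal = inputs
    for weights, biases in layers:
        signal = [
            sigmoid(sum(w * s for w, s in zip(neuron_weights, signal)) + bias)
            for neuron_weights, bias in zip(weights, biases)
        ]
    return signal


rng = random.Random(0)
sizes = [9, 20, 20, 20, 1]  # input, three hidden layers, output
network = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

pixel_features = [rng.random() for _ in range(9)]  # hypothetical scaled LCF values
output = forward(pixel_features, network)
print(output)
```

Training would adjust the entries of `weights` and `biases` so that the single output value approaches the known label of each training sample.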

Logistic Regression (LR) Method

Logistic regression is a well-known statistical model used for calculating the probability of an event with a binary (0 or 1, yes or no) outcome [73]. It is differentiated from linear regression in the sense that the latter predicts continuous outcomes; that is, the outcome may be any possible value between an interval. In this sense, LR prediction is more suitable to apply in cases where the aftermath may either go one way or the other, such as in landslide susceptibility analysis. The slope failure either occurs or does not, with no in-between.
Although the analyzed event in binary logistic regression only has two outcomes, the logistic function calculates probabilistic values between 0 and 1, allowing for a usable result in landslide susceptibility assessment tasks. The function is represented by a sigmoid curve in a two-dimensional graph where the vertical axis represents the dependent binary variable and the horizontal axis represents the independent variables. An example of a typical logistic regression with a sigmoid curve is shown in Figure 5.
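A short sketch of the logistic function makes the mapping from a linear combination of factors to a probability concrete. The coefficients, intercept, and feature values below are hypothetical and chosen only to show that the output always falls between 0 and 1.

```python
import math


def logistic(z):
    """Sigmoid curve: maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def landslide_probability(features, coefficients, intercept):
    """Probability of the positive outcome for one sample."""
    z = intercept + sum(c * x for c, x in zip(coefficients, features))
    return logistic(z)


# Hypothetical fitted parameters for two standardized conditioning factors.
coefficients = [0.8, -0.5]
intercept = -0.2

p_low = landslide_probability([0.0, 2.0], coefficients, intercept)   # z = -1.2
p_high = landslide_probability([3.0, 0.0], coefficients, intercept)  # z = 2.2
print(round(p_low, 3), round(p_high, 3))
```

Negative values of the linear term give probabilities below 0.5 and positive values give probabilities above 0.5, which is the behavior of the sigmoid curve.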

2.5. Performance Assessment

In order to assess the performance of the produced LSMs, the receiver operating characteristic (ROC) analysis was employed in this study. Initially developed for radar accuracy tests, the ROC method is recommended for landslide zoning validation tasks due to its threshold-independent nature [27,28,29]; that is, it does not require a single fixed cutoff value separating landslide from non-landslide predictions, since LSI is a probability assessment, not a deterministic one. Thus, ROC analysis uses multiple thresholds, and the points in the ROC curve represent these possible cutoff thresholds given by the multiple cases in a model (i.e., LSM). Each threshold forms a confusion matrix with four types of outcomes for each pixel: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Each threshold point in the curve then corresponds to a pair of sensitivity and specificity values, where sensitivity, plotted on the y-axis, is the true positive rate (TPR), and 1 − specificity, plotted on the x-axis, is the false positive rate (FPR). TPR and FPR are calculated as follows:
$$TPR = \frac{TP}{TP + FN}$$
$$FPR = \frac{FP}{TN + FP}$$
After the ROC curve is plotted, the area under the curve (AUC) may be calculated from its points, with x and y denoting the FPR and TPR coordinates of each point, as follows:
$$AUC = \sum_{i=1}^{n+1} \frac{1}{2} \sqrt{(x_i - x_{i+1})^2} \times (y_i + y_{i+1})$$
The AUC is used as a metric to assess the quality of the LSM, where a larger area (ranging from 0.5 to 1) represents better prediction performance, that is, how well the model separates the validation landslides throughout the susceptibility zones of the LSM. For that reason, the AUC value is used as the primary measure for LSM accuracy in this study [27,28,29].
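The TPR, FPR, and area calculations in this section can be followed step by step with a small pure-Python sketch. It treats every distinct score as a candidate threshold and integrates the curve with the trapezoidal rule; the six labels and scores are made-up values, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """(FPR, TPR) pairs, one per candidate threshold, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A pixel is predicted positive when its score reaches the threshold.
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Sum of trapezoid areas between consecutive ROC points."""
    return sum(
        0.5 * abs(x1 - x0) * (y0 + y1)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Made-up validation sample: 1 = landslide pixel, 0 = non-landslide pixel.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(labels, scores)
auc = auc_trapezoid(points)
print(round(auc, 4))  # 0.8889, i.e. 8 of the 9 positive/negative pairs are ranked correctly
```

In practice, a library routine such as scikit-learn's `roc_auc_score` returns the same quantity directly from the labels and scores.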
As an additional method for LSM validation, landslide density is checked for each one of the 5 LSI zones attributed in the LSM. It is expected that in an efficient LSM, the landslide density distribution will follow a proportionally direct growth with each zone change from very low to very high.
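This density check can be sketched as follows. The landslide counts and zone areas are hypothetical, and coding the five zones as 1 to 5 for the Pearson correlation reported in Section 3 is an assumption, since the text does not state how the zones were encoded.

```python
import math


def landslide_density(landslide_counts, zone_areas_km2):
    """Landslides per square kilometre in each LSI zone."""
    return [n / a for n, a in zip(landslide_counts, zone_areas_km2)]


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


zones = ["very low", "low", "moderate", "high", "very high"]
counts = [2, 6, 15, 40, 90]              # hypothetical landslide counts
areas = [40.0, 30.0, 25.0, 20.0, 15.0]   # hypothetical zone areas in km2

density = landslide_density(counts, areas)
zone_rank = list(range(1, len(zones) + 1))  # assumed coding: 1 = very low ... 5 = very high

is_increasing = all(a < b for a, b in zip(density, density[1:]))
r = pearson(zone_rank, density)
print(is_increasing, round(r, 2))
```

A strictly increasing density together with a coefficient close to 1 is the pattern expected from an efficient LSM.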

2.6. Automated Hyperparameter Tuning Based on AUROC Analysis Validation

As illustrated in the previous sections, a varied selection of LSM production methods is available, each with its specific hyperparameters and configurations that may yield more or less accurate results in the final LSM. Although many studies have compared LSM production methods based on their accuracy, the optimal choice of method and its parameters may vary depending on the specific context of the LSM area. Although LCFs generally show similar patterns concerning landslide susceptibility, the peculiar characteristics of a specific area’s topography, lithology, or climatological aspects may significantly change the dynamics between LCFs and landslide susceptibility, and different LSM production methods may be more or less proficient in dealing with such dynamics. For that reason, this study takes the novel approach of using an automated, code-based process to experiment with different LSM production methods (namely ML ones), probing different parameters and configurations in each method and thereby allowing for automatic hyperparameter tuning. The accuracy of each experimental LSM is automatically tested through AUROC so that the most efficient method and resultant LSM are identified, all through an automated process built in the Python programming language.
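The general shape of such an automated process can be sketched with scikit-learn. This is not the authors' code: the synthetic dataset, the candidate parameter values, and the single train/validation split are assumptions chosen to keep the example short, and only the overall idea (try each method with each parameter setting, score by AUROC, keep the best) follows the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import ParameterGrid, train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the LCF/landslide table (9 columns mimic the nine LCFs).
X, y = make_classification(
    n_samples=1500, n_features=9, n_informative=5, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Hypothetical search space: one parameter grid per ML method.
search_space = {
    "RF": (RandomForestClassifier,
           {"min_samples_leaf": [1, 10, 50], "n_estimators": [50], "random_state": [0]}),
    "ANN": (MLPClassifier,
            {"hidden_layer_sizes": [(20,), (20, 20, 20)], "max_iter": [500], "random_state": [0]}),
    "LR": (LogisticRegression,
           {"C": [0.1, 1.0], "max_iter": [500]}),
}

results = []
for method, (model_class, grid) in search_space.items():
    for params in ParameterGrid(grid):
        model = model_class(**params).fit(X_train, y_train)
        auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
        results.append((auc, method, params))

# The configuration with the highest validation AUROC is kept.
best_auc, best_method, best_params = max(results, key=lambda item: item[0])
print(best_method, best_params, round(best_auc, 4))
```

Which method wins on this synthetic data says nothing about the Kure study area; the point is only that the ranking is produced without manual intervention.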

3. Results

3.1. LCF FR Values

The FR values calculated for each LCF class are exhibited in Table 3, and graphs showing class dynamics in each LCF are shown in Figure 6.
The lithology LCF FR value calculation results are shown in Figure 6a. It was noted that non-alkaline felsic volcanic rocks (of the Takada Rhyolites/Hikimi Group) present the highest FR value of 1.34, followed by granites of the Hiroshima Granites with an FR value of 0.9. The interpretation that these lithologies, particularly the Hiroshima Granites, are highly susceptible to landslides is backed by much of the literature on the area, such as by Wang et al. [17] and Chigira [35].
The land use LCF FR values (shown in Figure 6b) show that the highest FR values in the area are found in the “golf course” class with 3.72, followed by “forest” with 1.10, and the lowest FR values are found in the “constructions” class, with 0.02. Many large landslides were recorded in golf courses during the July 2018 disasters around Hiroshima Prefecture. It is expected that such zones, when near slopes, are highly prone to landslides due to their lack of vegetation and susceptibility to rainwater saturation. Constructions, although also being areas altered by human activities, are usually cemented and thus do not represent slope failure risk.
FR calculation of classes in the altitude LCF (Figure 6c) shows that FR values peak at the 405 m class with a value of 2.18, bottoming with an FR of 0.21 on either extremity (0 m and 825 m). It is judged that this dynamic is explained by the fact that low-altitude classes are represented by zones with low slope angles, mostly related to valley bottoms. The highest altitudes, on the other hand, are represented by mountaintops with low slope angles or cliffs with steep slope angles but no soil material. Landslide activity peaks at intermediate altitudes in the flanks of mountains and hills.
The FR values for the slope angle LCF are shown in graph form in Figure 6d. It is noticeable that landslide activity peaks at intermediate slope angle values, particularly in the 35 to 40 degrees interval, with an FR value of 1.27, and bottoms in the extreme classes with 0.11 for the 20 to 25 degrees interval and 0.21 for the steeper than 50 degrees interval. Although it may be commonly assumed that steeper slope angles will be more susceptible to landslide because of gravity, that dynamic does not occur in nature due to the simple fact that, in general, the steeper the slope, the lower its amount of soil, with the actual material of the slope becoming increasingly composed of hard rock material, not susceptible to common landslides.
The slope aspect FR values calculated in the Kure study area are shown in Figure 6e. Higher FR values are concentrated in northwestern-bound slopes, with FR peaking at the west at 1.33. Accordingly, lower FR values are found in southwestern-bound slopes, bottoming at 0.67. It is supposed that the specific topographical characteristics of hills in the study area, highly influenced by structural geology aspects, may lead to the scenario in the study area.
Distance to drainage FR values (shown in graph form in Figure 6f) start with a relatively high value of 0.99 in the smallest classes (closest to drainage), peaking in the 200 m class with 1.39. This behavior is expected since areas closer to drainage lines are more prone to be affected by water discharge from rainfall, making the soil saturated and prone to landslide activity.
The distance to lineament LCF values (shown in graph form in Figure 6g) display higher FR values closer to lineaments, peaking at the 400 m class with a value of 1.37 and decreasing with increasing distance from lineaments, reaching the lowest value of 0.85. This is expected since areas closer to faults and lineaments are prone to exhibit more brittle geological features, which may present less slope stability.
The FR values for the soil class LCF, shown in Figure 6h, peak in the “Mih-1” class, with a value of 2.56. “Mih-1”, according to the soil mapping performed by Tanimoto et al. [51], refers to soil unit Mihara-1, classified as “residual immature soils”, a soil type that does not present consolidated structure, being highly unstable and thus prone to landslides, which explains the high FR value calculated.
Finally, the mean annual precipitation LCF presents FR values (shown in graph form in Figure 6i) starting with the lowest values in areas of low precipitation (0.36 for below 2100 mm), increasing until reaching peak FR value in the intermediate 2578 mm class. However, from that point FR values decrease, reaching 1.29 in the highest precipitation class of 2865 mm. This behavior is expected because the highest precipitation zones are usually found in high-altitude areas [31]. As shown in the altitude LCF FR values, high-altitude areas correspond to mountaintops and hard rock hillsides, which are not prone to landslide activity.
It is noticed that all LCFs show a significant variation of FR values between classes, with the lowest FR variance being shown in the distance to lineaments LCF with a still relevant maximum difference of 0.52. This evidences the importance of all LCFs in the susceptibility assessment calculation, as there are considerable tendencies for specific classes to be associated with areas with lower or higher landslide densities.

3.2. FR Method LSM

With all the FR values calculated for each LCF class, LSI was summed for 20 m pixels in the study area in order to construct the final FR LSM for the Kure study area, shown in Figure 7. Visual analysis of the map shows a relatively good distribution of the LSI zones in landslide-affected areas, particularly in the flank of mountains, with very high LSI zones found particularly in the northeastern side of Mt. Noro, heavily affected by landslide activity. It is noticeable, however, that higher LSI zones seem too wide and not always associated with landslides of the July 2018 disasters, as seen near Peak Hayaga, where the FR-based LSM attributed large high and very high LSI zones even though landslide activity was not substantial, an aspect which may result in high false positive rates.
The AUROC analysis of the resultant LSM constructed with the FR method for the Kure study area resulted in a score of 0.8424 (84.24% predictability potential) and is shown in graph form in Figure 8. The value is considered “good” for landslide susceptibility assessment.
The analysis of landslide density per LSI zone for the FR map shows an increasing trend, though slightly exponential, in higher LSI zones. Pearson’s correlation coefficient analysis showed a final value of 0.88, configuring a good correlation between LSI increase and actual landslide density in each zone (Figure 9).

3.3. RF Method LSM

When applying the Kure study area dataset to the utilized RF algorithm, automatic parameter testing found that the best setting of the minimum number of samples required to be at a leaf node (“min_samples_leaf”) was 4560. Using the RF ML model produced the LSM seen in Figure 10. Visual analysis of the map shows excellent distribution of LSI zones, particularly with very high LSI in the northeast flank of Mt. Noro, where landslide activity was substantial during the July 2018 disasters. Although some false positives are still found in areas of Peak Hayaga, and some occasional false negatives can be seen particularly in some areas in the south of the study area, the map seems to sufficiently reflect the actual behavior of landsliding during the July 2018 disasters.
AUROC analysis of the RF LSM resulted in a score of 0.952 (95.2% predictability potential), considered excellent for landslide susceptibility assessment. The analysis is shown in graph form in Figure 11.
The distribution of landslide density for each of the LSI zones of the RF map in the Kure study area also shows a steady upward trend with increasing LSI zones (Figure 12), and Pearson’s correlation coefficient score was found to be 0.93. The steady correlation between LSI zones and landslide density suggests an LSM with good predictability.

3.4. ANN Method LSM

Applying automatic parameter testing of the ANN method in the Kure study area led to an optimal configuration of the hidden layer sizes parameter (“hidden_layer_sizes”) with three hidden layers with 20 nodes each. The resultant LSM produced with the ANN method for the study area is shown in Figure 13.
Visual analysis of the map shows the concentration of high LSI zones near mountain flanks in intermediate altitudes and near lineaments. The LSI distribution seems mostly accurate and related to landslide occurrences, especially in the northeastern flank of Mt. Noro. A few areas, like the flanks of Peak Hayaga and the region of Yanocho, however, present many false positives.
AUROC analysis of the map resulted in a score of 0.9247 (92.47% predictability potential), considered excellent for landslide susceptibility assessment. The AUROC graph is shown in Figure 14.
The correlation of landslide density for each of the LSI zones is shown in Figure 15. It is noticeable that the landslide density has a somewhat abrupt increase for each zone, especially in the intermediate zones. The increase is constant, though, and Pearson’s correlation coefficient was found to be 0.92, which conveys an excellent correlation.

3.5. LR Method LSM

Production of an LSM in the study area with the LR method was executed after automatic parameter testing, resulting in an optimal setting of using solver model (“solver”) “saga”, maximum iterations (“max_iter”) at 268, class weight (“class_weight”) as “balanced”, random state (“random_state”) at 101, and inverse of regularization strength (“C”) as 0.1. The produced LSM is shown in Figure 16.
AUROC analysis of the resultant LSM produced with the LR method in the Kure study area resulted in a score of 0.9016 (90.16% predictability potential), considered excellent for landslide susceptibility assessment. The analysis graph is shown in Figure 17.
Landslide density per LSI zone in the map shows a steady increase until the high LSI zone and then a small decrease in the very high LSI zone. Although the trend is upward in most of the LSI zones, the decrease between the high and very high zones may indicate a less reliable LSM. Pearson’s correlation coefficient, though, resulted in a value of 0.83, conveying a good correlation (Figure 18).
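For reference, the optimal settings reported in Sections 3.3 to 3.5 map onto scikit-learn constructor arguments roughly as shown below. Only the listed arguments come from the text; leaving every other argument at its library default is an assumption, as the remaining settings are not reported.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# RF: best minimum number of samples per leaf found by the automatic testing.
rf_model = RandomForestClassifier(min_samples_leaf=4560)

# ANN: three hidden layers with 20 nodes each.
ann_model = MLPClassifier(hidden_layer_sizes=(20, 20, 20))

# LR: solver, iteration cap, class weighting, seed, and inverse regularization strength.
lr_model = LogisticRegression(
    solver="saga",
    max_iter=268,
    class_weight="balanced",
    random_state=101,
    C=0.1,
)

for model in (rf_model, ann_model, lr_model):
    print(type(model).__name__)
```

Each estimator would then be fitted on the training pixels with `fit` and applied to the whole study area with `predict_proba`.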

3.6. LSM Accuracy Comparison

The production of LSMs for the Kure study area and subsequent validation with AUROC analysis for all the verified four methods (Figure 19) led to the finding that the RF method had the best results, with an AUC of 0.952, relative to a 95.2% predictability potential. The RF-produced LSM for the Kure study area is followed in AUC value by the ANN method (AUC = 0.9247), LR method (AUC = 0.9016), and the statistical FR method (0.8424).
The RF method scored an AUROC of 95.2% in this study, the best among all four methods, and produced an excellent LSM with finely tuned LSI zone distribution and significant landslide density increase with LSI. The assessment that the RF method is the most appropriate for landslide susceptibility mapping has also been indicated by other studies such as Goetz et al. [12]; Chen et al. [22]; Youssef and Pourghasemi [13]; Xia et al. [24]; Imtiaz et al. [25]; and Daviran et al. [26]. The RF’s decision trees are considered good models for tasks with large datasets and a high number of features (LCFs), as is the case of landslide susceptibility mapping. Although individual decision trees are easily subject to mistakes, the RF algorithm, being an ensemble model, is highly resistant to some problems in ML algorithms, such as under/overfitting, and can deal well with datasets such as the ones involved in the production of LSMs.
The ANN method, widely regarded as one of the most appropriate models for LSM production [15,23,74,75], scored an AUROC of 92.47% in this study, second in the list, and exhibiting an appropriate LSI zone distribution in the resultant LSM and steady increase in landslide density per LSI. Being able to work well with large datasets and adaptable to many different tasks, neural network models may still be regarded as one of the leading ML models for many kinds of assessment, as was demonstrated in this study.
Some studies rank the LR method as a high-ranking method for landslide susceptibility mapping [76,77]. In this study, the method performed relatively well with an AUROC score of 90.16%, scoring third among the utilized machine learning methods. The produced LSM shows a slightly exaggerated distribution of high LSI zones (and thus a relatively high false positive rate), which may explain the map’s lower performance when compared to the other ML-based LSMs.
Finally, the traditional statistical FR method ranked below all ML methods in this study, with an AUROC of 0.8424. Despite it being the primary method for LSM production before the advent of ML methods [11,78,79], more recent studies usually reach the same conclusion that the method is outperformed by ML models [11,17,74,80]. It is also noteworthy that, owing to the need for manual processing and manipulation of data in GIS platforms, the FR LSM took the most time and effort to produce, which may be considered a significant blow to efficiency. Moreover, quick and automatic parameter tuning with code-based AUROC analysis is not yet possible with the FR method.

3.7. Impact of XRAIN Radar-Acquired Rainfall Data

In order to assess whether XRAIN data provided a significant impact on the LSM accuracy and, if so, whether it was positive or negative, the RF algorithm (ranked best among the tested methods) was executed multiple times with the same parameter set as in the top-ranking run, but with no XRAIN rainfall data as an LCF for comparison purposes. This test resulted in a mean AUROC score of 0.92, approximately 3.2% below the RF method executed with XRAIN rainfall data included as an LCF.
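A comparison of this kind can be sketched as a simple feature-ablation loop. The example below uses synthetic data in which column 0 merely stands in for the rainfall LCF; the number of repetitions, the seeds, the split ratio, and the helper name `mean_auroc` are all assumptions, so the resulting numbers have no bearing on the values reported above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split


def mean_auroc(X, y, seeds):
    """Mean AUROC of an RF model over several repeated runs."""
    scores = []
    for seed in seeds:
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, stratify=y, random_state=seed
        )
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X_tr, y_tr)
        scores.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    return float(np.mean(scores))


# shuffle=False keeps the informative columns first, so column 0 is informative
# and can play the role of the rainfall LCF in this synthetic setting.
X, y = make_classification(
    n_samples=1000, n_features=9, n_informative=5, shuffle=False, random_state=0
)
seeds = [0, 1, 2]

auc_with = mean_auroc(X, y, seeds)
auc_without = mean_auroc(np.delete(X, 0, axis=1), y, seeds)
difference = auc_with - auc_without
print(round(auc_with, 4), round(auc_without, 4), round(difference, 4))
```

Averaging over several runs, as the study does, reduces the chance that a single favorable random split is mistaken for an effect of the removed factor.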

4. Discussion

The AUROC test AUC value of 0.952 for the RF map, followed by 0.9247 for the ANN method, 0.9016 for the LR method, and 0.8424 for the FR method, evidences the superior efficiency of the ML technique RF in LSM production by a margin of about 11% in accuracy results compared to the lowest scoring LSM. The better efficiency of the RF-based LSM is also evidenced by the steady distribution of landslide density per each LSI zone. When seen in map form, it is noticeable that the RF-based map, when compared to the other 3 LSMs, seems to more finely define high LSI zones fitting with actual landslide points of the July 2018 disasters, characterizing a low false positive rate result. All maps point out high landslide susceptibility in the flanks of the mountains in the central-east region of the map, where there was, in fact, high landslide activity during the July 2018 disasters.
As explored by other authors that have assessed the efficiency of ML techniques in LSM production, it is judged that ML techniques, with their ability to deal with large sets of data and find specific patterns and relations in the intrinsic details of the dataset, perform a more accurate and fine assessment of landslide susceptibility when compared to simpler purely statistical methods such as the FR method. Moreover, as an ensemble model, the RF technique is also less subject to errors that may be present in ML techniques, such as over or underfitting [14,16], which may also explain the good performance of the model in this study. ML techniques also have the advantage of allowing for fine parameter tuning rather easily for users familiar with the method. Namely, this study’s automatic parameter tuning method found a large variance in the RF algorithm’s efficiency when tuning the “min_samples_leaf” parameter, which allowed for the optimal settings that delivered the final resultant LSM with an AUC of 0.952.
One arguable advantage of the FR model, however, is that it evidences the relevance of each LCF in the susceptibility analysis. The calculated FR values can be inspected for each LCF class, and their variation demonstrates that different classes of each factor relate to lower or higher landslide density, allowing for a detailed understanding of how each LCF may affect slope stability. Although this information is not visualized in the final LSM, it may be useful for other landslide mitigation and prevention methods, as well as knowledge for decision-making in situations where an LSM is not readily available. In this study, for example, it was evidenced that for both the slope angle and altitude LCFs, landslide activity peaks at intermediate LCF values (i.e., 35–39-degree slopes, 405–488 m) instead of maximum LCF values, as would be normally assumed. This phenomenon is explained by the hard material with almost no soil content of high-degree slopes or the flat mountain peaks that usually dominate the high-altitude areas of Kure city. These aspects are not universal and may change in areas with different topographical and geological contexts, which evidences the importance of landslide susceptibility assessments for different areas.
The higher mean score of AUROC with LSM production with XRAIN rainfall data included as an LCF (approximately 3.2% above the same algorithm with no XRAIN data) evidences the advantage of utilizing this kind of spatially accurate radar-acquired rainfall data in landslide susceptibility analysis.

5. Conclusions

The production and subsequent validation through AUROC analysis of 14 LSMs for the study area using four different methods, one of them being a traditional statistical method (frequency ratio, or FR) and the other three being ML methods (random forest, or RF; artificial neural networks, or ANN; logistic regression, or LR) led to the final AUROC-assessed ranking from best to worst of RF; ANN; LR; and FR, with respective AUCs of 0.952; 0.9247; 0.9016; and 0.8424. Although all maps showed good to excellent results, the RF method was judged the best with a predictability potential of 95.2%.
The assessment that the RF method is the most appropriate for landslide susceptibility mapping according to AUROC analysis is supported by the majority of the available literature [12,13,22,24,25,26], as well as that the FR method is outperformed by ML methods [11,17,74,80]. It is noteworthy, however, that the FR method provides the unique opportunity to examine all the LCF classes’ FR values (susceptibility to landslides), valuable knowledge to understand the deep dynamics of how environmental factors may affect the probability of slope failure in a particular area, which is not possible with ML methods.
The novelties of this LSM comparison study include the use of high-precision XRAIN radar-acquired rainfall data, which significantly differs from the commonly utilized rain gauge station measurement data in terms of spatial accuracy, which is significantly important in GIS-based susceptibility assessments, and the use of a code-based automated hyperparameter tuning based on iterated AUROC analysis validation, which allows for quick assessment of the ML method’s efficiency, each finely tuned with the best parameters for the specific context of the assessed study area. This method proposed a quick and practical process to achieve the optimal LSM given a dataset of LCFs and landslides. It is hoped that the LSM production process used in this study may provide more prompt and more practical LSM production with excellent results for a variety of areas, allowing for more efficient resilience strategies against landslide disasters.

Author Contributions

Conceptualization, J.M.d.S.R.N. and N.P.B. (supervised); methodology analyses and validations, J.M.d.S.R.N. (checked by N.P.B.); writing—original draft preparation, J.M.d.S.R.N.; writing—review and editing, N.P.B.; supervision, N.P.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research work received no external funding.

Data Availability Statement

In this work, we have primarily used the datasets availed by the Geospatial Information Authority of Japan (https://www.gsi.go.jp/ (accessed on 5 May 2024)) and the Data Integration and Analysis System (DIAS: https://diasjp.net/en/ (accessed on 5 May 2024)) office from The University of Tokyo. We have not created any other special dataset for this study, and what we have utilized in the analyses has been presented in the paper itself.

Acknowledgments

The immensely valuable free access to XRAIN data, widely utilized as a main item in this research work, was kindly permitted for use in this work by the Data Integration and Analysis System (DIAS: https://diasjp.net/en/ (accessed on 5 May 2024)) office from The University of Tokyo. Also, the spatial information and the 2018 landslide data utilized in this study were availed freely by the Geospatial Information Authority of Japan (https://www.gsi.go.jp/ (accessed on 5 May 2024)). This is a part of the first author’s Ph.D. research work, which would not have been possible without the generous scholarship support from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan during his doctoral study at the Graduate School of Science and Engineering of Ehime University (2020~2023).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. United Nations Office for Disaster Risk Reduction. Economic Losses, Poverty and Disasters 1998–2017. 2018, p. 31. Available online: https://www.undrr.org/publication/economic-losses-poverty-disasters-1998-2017 (accessed on 22 June 2023).
  2. Gariano, S.; Guzzetti, F. Landslides in a changing climate. Earth-Sci. Rev. 2016, 162, 227–252.
  3. Osanai, N.; Tomita, Y.; Akiyama, K.; Matsushita, T. Technical Note of National Institute for Land and Infrastructure Management No.530; National Institute for Land and Infrastructure Management, Ministry of Land, Infrastructure, Transport and Tourism: Tokyo, Japan, 2009.
  4. Japan Meteorological Agency. Announcement of Special Warnings in Kyoto and Hyogo Prefectures (Translated Title). 2018. Available online: http://www.jma.go.jp/jma/press/1807/06e/kaisetsu2018070623.pdf (accessed on 15 June 2023).
  5. Ministry of Land, Infrastructure, Transport and Tourism. Press Release: The 2018 Rain Disasters Result in the Heaviest Economic Damage Ever Since Statistical Data Management (Translated Title). 2019. Available online: https://www.mlit.go.jp/common/001301033.pdf (accessed on 5 May 2024).
  6. Moriyama, T.; Hirano, M. Relationship of three hours of cumulative rainfall during concentration time of slope and collapsed area of landslide. In Proceedings of the 21st IAHR-APD Congress 2018, Yogyakarta, Indonesia, 2–5 September 2018; pp. 999–1006. [Google Scholar]
  7. Wu, Y.H.; Nakakita, E. Assessment of landslide hazards using logistic regression with high-resolution radar rainfall observation and geological factor. J. Jpn. Soc. Civ. Eng. Ser. B1 (Hydraul. Eng.) 2019, 75, I_157–I_162. [Google Scholar] [CrossRef] [PubMed]
  8. Marc, O.; Gosset, M.; Saito, H.; Uchida, T.; Malet, J.-P. Spatial Patterns of Storm-Induced Landslides and Their Relation to Rainfall Anomaly Maps. Geophys. Res. Lett. 2019, 46, 11167–11177. [Google Scholar] [CrossRef]
  9. Yokoe, Y.; Kita, M.; Uchida, T.; Kawahara, Y. Characteristics of precipitation system in hiroshima prefecture during the heavy rainfall event of July 2018 observed using xrain data. J. JSCE 2021, 9, 212–220. [Google Scholar] [CrossRef] [PubMed]
  10. Data Integration & Analysis System. XRAIN Realtime Precipitation Data; The University of Tokyo, Sponsored by the Ministry of Education, Culture, Sports, Science and Technology: Tokyo, Japan, 2020; Available online: https://diasjp.net/ (accessed on 10 June 2022).
  11. Yilmaz, I. Landslide susceptibility mapping using FR, logistic regression, artificial neural networks, and their comparison: A case study from Kat landslides (Tokat—Turkey). Comput. Geosci. 2009, 35, 1125–1138. [Google Scholar] [CrossRef]
  12. Goetz, J.N.; Brenning, A.; Petschko, H.; Leopold, P. Evaluating ML and statistical prediction techniques for landslide susceptibility modeling. Comput. Geosci 2015, 81, 1–11. [Google Scholar] [CrossRef]
  13. Youssef, A.H.; Pourghasemi, H.R. Landslide susceptibility mapping using ML algorithms and comparison of their performance at Abha Basin, Asir Region, Saudi Arabia. Geosci. Front. 2021, 12, 639–655. [Google Scholar] [CrossRef]
  14. Ado, M.; Amitab, K.; Maji, A.K.; Jasińska, E.; Gono, R.; Leonowicz, Z.; Jasiński, M. Landslide susceptibility mapping using ML: A literature survey. Remote Sens. 2022, 14, 3029. [Google Scholar] [CrossRef]
  15. Liu, S.; Wang, L.; Zhang, W.; He, Y.; Pijush, S. A comprehensive review of ML-based methods in Landslide susceptibility mapping. Geol. J. 2023, 58, 2283–2301. [Google Scholar] [CrossRef]
  16. Fang, Z.; Wang, Y.; Peng, L.; Hong, H. A comparative study of heterogeneous ensemble-learning techniques for Landslide susceptibility mapping. Int. J. Geogr. Inf. Sci. 2021, 35, 321–347. [Google Scholar] [CrossRef]
  17. Wang, Y.; Fang, Z.; Wang, M.; Peng, L.; Hong, H. Comparative study of Landslide susceptibility mapping with different recurrent neural networks. Comput. Geosci. 2020, 138, 104445. [Google Scholar] [CrossRef]
  18. Azarafza, M.; Azarafza, M.; Akgün, H.; Atkinson, P.M.; Derakhshani, R. Deep learning-based Landslide susceptibility mapping. Sci. Rep. 2021, 11, 24112. [Google Scholar] [CrossRef] [PubMed]
  19. Huang, F.; Chen, J.; Liu, W.; Huang, J.; Hong, H.; Chen, W. Regional rainfall-induced landslide hazard warning based on Landslide susceptibility mapping and a critical rainfall threshold. Geomorphology 2022, 408, 108236. [Google Scholar] [CrossRef]
  20. Palau, R.M.; Hürlimann, M.; Berenguer, M.; Sempere-Torres, D. Debris-Flow Early Warning System at Regional Scale Using Weather Radar and Susceptibility Mapping. Doctoral Dissertation, Colorado School of Mines. Arthur Lakes Library, Golden, CO, USA, 2019. [Google Scholar]
  21. Rodrigues Neto, J.M.S. Analysis of XRAIN Data and Landslide Distribution in Southern Hiroshima during the July 2018 Heavy Rain-Induced Disasters. Master’s Thesis, Ehime University, Ehime, Japan, 2020. [Google Scholar]
  22. Chen, W.; Pourghasemi, H.R.; Panahi, M.; Kornejady, A.; Wang, J.; Xie, X.; Cao, S. Spatial prediction of landslide susceptibility using an adaptive neuro-fuzzy inference system combined with FR, generalized additive model, and support vector machine techniques. Geomorphology 2017, 297, 69–85. [Google Scholar] [CrossRef]
  23. Li, J.; Wang, W.; Li, Y.; Han, Z.; Chen, G. Spatiotemporal Landslide susceptibility mapping incorporating the effects of heavy rainfall: A case study of the heavy rainfall in August 2021 in Kitakyushu, Fukuoka, Japan. Water 2021, 13, 3312. [Google Scholar] [CrossRef]
  24. Xia, D.; Tang, H.; Sun, S.; Tang, C.; Zhang, B. Landslide susceptibility mapping based on the germinal center optimization algorithm and support vector classification. Remote Sens. 2022, 14, 2707. [Google Scholar] [CrossRef]
  25. Imtiaz, I.; Umar, M.; Latif, M.; Ahmed, R.; Azam, M. Landslide susceptibility mapping: Improvements in variable weights estimation through machine learning algorithms—A case study of upper Indus River Basin, Pakistan. Environ. Earth Sci. 2022, 81, 112. [Google Scholar] [CrossRef]
  26. Daviran, M.; Shamekhi, M.; Ghezelbash, R.; Maghsoudi, A. Landslide susceptibility prediction using artificial neural networks, SVMs and random forest: Hyperparameters tuning by genetic optimization algorithm. Int. J. Environ. Sci. Technol. 2023, 20, 259–276. [Google Scholar] [CrossRef]
  27. Beguería, S. Validation and evaluation of predictive models in hazard assessment and risk management. Nat. Hazards. 2006, 37, 315–329. [Google Scholar] [CrossRef]
  28. Corominas, J.; van Westen, C.; Frattini, P.; Cascini, L.; Malet, J.-P.; Fotopoulou, S.; Catani, F.; Van Den Eeckhaut, M.; Mavrouli, O.; Agliardi, F. Recommendations for the quantitative analysis of landslide risk. Bull. Eng. Geol. Environ. 2014, 73, 209–263. [Google Scholar] [CrossRef]
  29. Vakhshoori, V.; Zare, M. Is the ROC curve a reliable tool to compare the validity of LSMs? Geomat. Nat. Hazards Risk 2018, 9, 249–266. [Google Scholar] [CrossRef]
  30. Li, B.; Liu, K.; Wang, M.; He, Q.; Jiang, Z.; Zhu, W.; Qiao, N. Global dynamic rainfall-induced Landslide susceptibility mapping using ML. Remote Sens. 2022, 14, 5795. [Google Scholar] [CrossRef]
  31. Rodrigues Neto, J.M.S.; Bhandary, N.P. Influence of Localized Rainfall Patterns on Landslide Occurrence—A Case Study of Southern Hiroshima with eXtended Radar Information Network Data during the July 2018 Heavy Rain Disasters. Geosciences 2023, 13, 245. [Google Scholar] [CrossRef]
  32. Yamada, N.; Higashimoto, S.; Mizuno, K.; Hiroshima, T.; Suda, Y. Hiroshima 1:200,000 Geological Map, NI-53-33; Geological Survey of Japan: Tsukuba, Japan, 1986. [Google Scholar]
  33. Takemura, S.; Sugamori, Y.; Suzuki, S. Maizuru and Ultra-Tamba zones in the eastern Okayama and southwestern Hyogo prefectures, southwest Japan. J. Geol. Soc. Japan 2009, 115 (Suppl.), 123–137. (In Japanese) [Google Scholar] [CrossRef]
  34. Wang, F.; Wu, Y.-H.; Yang, H.; Tanida, Y.; Kamei, A. Preliminary investigation of the 20 August 2014 debris flows triggered by a severe rainstorm in Hiroshima City, Japan. Geoenvironmental Disasters 2015, 2, 17. [Google Scholar] [CrossRef]
  35. Chigira, M. Micro-sheeting of granite and its relationship with landsliding specifically after the heavy rainstorm in June 1999, Hiroshima Prefecture, Japan. Eng. Geol. 2001, 59, 219–231. [Google Scholar] [CrossRef]
  36. Kamada, M.; Nakagoshi, N. Landscape structure and the disturbance regime at three rural regions in Hiroshima Prefecture, Japan. Landsc. Ecol. 1996, 11, 15–25. [Google Scholar] [CrossRef]
  37. Japan Meteorological Agency. Average Monthly Precipitation in a Year (Translated Title). 2020. Available online: http://www.data.jma.go.jp/ (accessed on 15 June 2023).
  38. Guzzetti, F.; Cardinali, M.; Reichenbach, P. The Influence of Structural Setting and Lithology on Landslide Type and Pattern. Environ. Eng. Geosci. 1996, II, 531–555. [Google Scholar] [CrossRef]
  39. Atkinson, P.M.; Massari, R. Generalised linear modelling of susceptibility to landsliding in the central apennines, Italy. Comput. Geosci. 1998, 24, 373–385. [Google Scholar] [CrossRef]
  40. Dai, F.; Lee, C.; Ngai, Y. Landslide risk assessment and management: An overview. Eng. Geol. 2002, 64, 65–87. [Google Scholar] [CrossRef]
  41. Çevik, E.; Topal, T. GIS-based landslide susceptibility mapping for a problematic segment of the natural gas pipeline, Hendek (Turkey). Environ. Geol. 2003, 44, 949–962. [Google Scholar] [CrossRef]
  42. Peart, M.R.; Ng, K.Y.; Zhang, D.D. Landslides and sediment delivery to a drainage system: Some observations from Hong Kong. J. Asian Earth Sci. 2005, 25, 821–836. [Google Scholar] [CrossRef]
  43. Highland, L.M.; Bobrowsky, P. The Landslide Handbook—A Guide to Understanding Landslides; Geological Survey Circular: Reston, VA, USA, 2008; Volume 1325, 129p. [Google Scholar]
  44. Ministry of Land, Infrastructure, Transport and Tourism. National Land Numerical Information Database (Translated Title); Ministry of Land, Infrastructure, Transport and Tourism: Tokyo, Japan, 2022.
  45. Geospatial Information Authority of Japan. Database. 2022. Available online: https://www.gsi.go.jp/top.html (accessed on 10 June 2022).
  46. Gokceoglu, M.E.C. Assessment of landslide susceptibility for a landslide-prone area (north of Yenice, NW Turkey) by fuzzy approach. Environ. Geol. 2002, 41, 720–730. [Google Scholar] [CrossRef]
  47. Fossen, H. Structural Geology; Cambridge University Press: Cambridge, UK, 2016. [Google Scholar]
  48. Lee, S.; Talib, J.A. Probabilistic landslide susceptibility and factor effect analysis. Environ. Geol. 2005, 47, 982–990. [Google Scholar] [CrossRef]
  49. Ambrosi, C.; Crosta, G.B. Large sackung along major tectonic features in the Central Italian Alps. Eng. Geol. 2006, 83, 183–200. [Google Scholar] [CrossRef]
  50. Sarkar, S.; Kanungo, D.P. An integrated approach for landslide susceptibility mapping using remote sensing and GIS. Photogramm. Eng. Remote Sens. 2004, 70, 617–625. [Google Scholar] [CrossRef]
  51. Tanimoto, T.; Uemoto, S.; Kanazawa, S.; Miyazi, K.; Matsuura, K.; Yoshida, F.; Higashi, T.; Hyodo, H. 1:50,000 Kure Soil Map (NI-53-33-7); Geological Survey of Japan: Tsukuba, Japan, 1985. [Google Scholar]
  52. Guidicini, G.; Iwasa, O.Y. Tentative correlation between rainfall and landslides in a humid tropical environment. Bull. Int. Assoc. Eng. Geol. 1977, 16, 13–20. [Google Scholar] [CrossRef]
  53. Caine, N. The rainfall intensity–duration control of shallow landslides and debris flows. Geogr. Ann. A 1980, 62, 23–27. [Google Scholar]
  54. Guzzetti, F.; Peruccacci, S.; Rossi, M.; Stark, C.P. Rainfall thresholds for the initiation of landslides in central and southern Europe. Meteorol. Atmos. Phys. 2007, 98, 239–267. [Google Scholar] [CrossRef]
  55. Dahal, R.K.; Hasegawa, S. Representative rainfall thresholds for landslides in the Nepal Himalaya. Geomorphology 2008, 100, 429–443. [Google Scholar] [CrossRef]
  56. Dahal, R.K. Rainfall-induced Landslides in Nepal. Int. J. Eros. Control. Eng. 2012, 5, 1–8. [Google Scholar] [CrossRef]
  57. Kansakar, S.R.; Hannah, D.M.; Gerrard, J.; Rees, G. Spatial pattern in the precipitation regime of Nepal. Int. J. Clim. 2004, 24, 1645–1659. [Google Scholar] [CrossRef]
  58. Vélez, A.; Martin-Vide, J.; Royé, D. Spatial analysis of daily precipitation concentration in Puerto Rico. Theor. Appl. Climatol. 2019, 136, 1347–1355. [Google Scholar] [CrossRef]
  59. Zhao, F.; Meng, X.; Zhang, Y.; Chen, G.; Su, X.; Yue, D. Landslide susceptibility mapping of Karakorum highway combined with the application of SBAS-InSAR technology. Sensors 2019, 19, 2685. [Google Scholar] [CrossRef] [PubMed]
  60. O’brien, R.M. A caution regarding rules of thumb for variance inflation factors. Qual. Quant. 2007, 41, 673–690. [Google Scholar] [CrossRef]
  61. Tien Bui, D.; Lofman, O.; Revhaug, I.; Dick, O. Landslide susceptibility analysis in the Hoa Binh province of Vietnam using statistical index and logistic regression. Nat. Hazards. 2011, 59, 1413–1444. [Google Scholar]
  62. Rasyid, A.R.; Bhandary, N.P.; Yatabe, R. Performance of FR and logistic regression model in creating GIS based landslides susceptibility map at Lompobattang Mountain, Indonesia. Geoenvironmental Disasters 2016, 3, 1–16. [Google Scholar] [CrossRef]
  63. Lee, S.; Pradhan, B. Landslide hazard mapping at Selangor, Malaysia using FR and logistic regression models. Landslides 2006, 4, 33–41. [Google Scholar] [CrossRef]
  64. Yan, F.; Zhang, Q.; Ye, S.; Ren, B. A novel hybrid approach for Landslide susceptibility mapping integrating analytical hierarchy process and normalized FR methods with the cloud model. Geomorphology 2018, 327, 170–187. [Google Scholar] [CrossRef]
  65. Carleo, G.; Cirac, I.; Cranmer, K.; Daudet, L.; Schuld, M.; Tishby, N.; Vogt-Maranto, L.; Zdeborová, L. ML and the physical sciences. Rev. Mod. Phys. 2019, 91, 045002. [Google Scholar] [CrossRef]
  66. Felicísimo, A.; Cuartero, A.; Remondo, J.; Quirós, E. Mapping landslide susceptibility with logistic regression, multiple adaptive regression splines, classification and regression trees, and maximum entropy methods: A comparative study. Landslides 2013, 10, 175–189. [Google Scholar] [CrossRef]
  67. Kavzoglu, T.; Colkesen, I.; Sahin, E.K. ML techniques in Landslide susceptibility mapping: A survey and a case study. In Advances in Natural and Technological Hazards Research; Pradhan, S., Vishal, V., Singh, T., Eds.; Landslides: Theory, Practice and Modelling; Springer: Cham, Switzerland, 2019; Volume 50, pp. 283–301. [Google Scholar]
  68. Wu, X.; Vipin Kumar, J.; Quinlan, R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, J.; Ng, A.; Liu, B.; Yu, P.S.; et al. Top 10 algorithms in data mining. Knowl. Inf. Syst. 2008, 14, 1–37. [Google Scholar] [CrossRef]
  69. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  70. Ermini, L.; Catani, F.; Casagli, N. Artificial neural networks applied to landslide susceptibility assessment. Geomorphology 2005, 66, 327–343. [Google Scholar] [CrossRef]
  71. Garrett, J. Where and why artificial neural networks are applicable in civil engineering. J. Comput. Civ. Eng. 1994, 8, 129–130. [Google Scholar] [CrossRef]
  72. Lee, S.; Ryu, J.H.; Min, K.; Won, J.S. Landslide susceptibility analysis using GIS and artificial neural network. Earth Surf. Process. Landf. J. Br. Geomorphol. Res. Group 2003, 28, 1361–1376. [Google Scholar] [CrossRef]
  73. Tolles, J.; Meurer, W.J. Logistic Regression Relating Patient Characteristics to Outcomes. JAMA 2016, 316, 533–534. [Google Scholar] [CrossRef] [PubMed]
  74. Gautam, P.; Kubota, T.; Sapkota, L.M.; Shinohara, Y. Landslide susceptibility mapping with GIS in high mountain area of Nepal: A comparison of four methods. Environ. Earth Sci. 2021, 80, 1–18. [Google Scholar] [CrossRef]
  75. Aslam, B.; Zafar, A.; Khalil, U. Comparison of multiple conventional and unconventional machine learning models for landslide susceptibility mapping of Northern part of Pakistan. Environ. Dev. Sustain. 2022, 1–28. [Google Scholar] [CrossRef]
  76. Yi, Y.; Zhang, Z.; Zhang, W.; Xu, C. Comparison of different machine learning models for landslide susceptibility mapping. In Proceedings of the GARSS 2019—2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 9318–9321. [Google Scholar] [CrossRef]
  77. Nhu, V.-H.; Shirzadi, A.; Shahabi, H.; Singh, S.K.; Al-Ansari, N.; Clague, J.J.; Jaafari, A.; Chen, W.; Miraki, S.; Dou, J.; et al. Shallow Landslide Susceptibility Mapping: A Comparison between Logistic Model Tree, Logistic Regression, Naïve Bayes Tree, Artificial Neural Network, and Support Vector Machine Algorithms. Int. J. Environ. Res. Public Health 2020, 17, 2749. [Google Scholar] [CrossRef]
  78. Lee, S.; Sambath, T. Landslide susceptibility mapping in the Damrei Romel area, Cambodia using frequency ratio and logistic regression models. Environ. Geol. 2006, 50, 847–855. [Google Scholar] [CrossRef]
  79. Akgun, A.; Dag, S.; Bulut, F. Landslide susceptibility mapping for a landslide-prone area (Findikli, NE of Turkey) by likelihood-frequency ratio and weighted linear combination models. Environ. Geol. 2008, 54, 1127–1143. [Google Scholar] [CrossRef]
  80. Himan, S.; Saeed, K.; Baharin, A.; Mazlan, H. Landslide susceptibility mapping at central Zab basin, Iran: A comparison between analytical hierarchy process, frequency ratio and logistic regression models. Catena 2014, 115, 55–70. [Google Scholar]
Figure 1. Location map of the study area in Kure City, Southern Hiroshima, with the landslide points from the July 2018 disasters [31].
Figure 2. LCFs used in the LSM production: (a) lithology, (b) land use, (c) altitude, (d) slope angle, (e) slope aspect, (f) distance to drainage, (g) distance to lineament, (h) soil class, (i) mean annual precipitation. White areas comprise slopes lower than 20° or higher than 50°, which were left out of the analysis for being considered not prone to landslides.
Figure 3. Explanatory illustration of the RF ML method. Each decision tree is composed of decision nodes that evaluate the relationship between LCFs and landslide occurrence until a decision leaf is reached. Green nodes represent the branches followed in a given tree, while yellow nodes represent alternative branches not followed in that tree. “Class A” represents a non-landslide leaf (i.e., result) in a given tree, while “Class B” represents a landslide leaf (i.e., result). The LSI is then calculated by aggregating the votes of all the trees in the “forest”.
Figure 4. Explanatory illustration of the ANN ML method. The decision nodes (neurons) found in the hidden layers process the weights and transfer functions of the previous layers, until an output layer assessing a result is reached.
Figure 5. Sigmoid function curve used in the LR ML method. The curve is fitted to the occurrence (1) or non-occurrence (0) of landslides, which allows the probability of landslide occurrence to be assessed at new points.
Figure 6. Graphs of FR values calculated for LCFs: (a) lithology, (b) land use, (c) altitude, (d) slope angle, (e) slope aspect, (f) distance to drainage, (g) distance to lineament, (h) soil class, and (i) XRAIN mean annual precipitation. Classes with no landslide occurrence (and thus with an FR value of 0) are omitted.
Figure 7. Landslide susceptibility map (LSM) produced with the frequency ratio (FR) statistical method, along with landslide points from the July 2018 disasters.
Figure 8. AUROC analysis graph of the LSM for the Kure study area produced with the FR method, resulting in 0.8424 (84.24% predictability potential).
Figure 9. Landslide density per LSI zone in the LSM for the Kure study area produced with the FR method. The values above bars represent the actual landslide quantity in the respective zone. The resultant Pearson’s correlation coefficient was calculated as 0.88.
Figure 10. LSM produced for the study area with the RF method. AUROC analysis of this map resulted in a predictability potential of 95.2%.
Figure 11. AUROC analysis graph of the LSM produced with the RF method, resulting in 0.952 (95.2% predictability potential).
Figure 12. Landslide density per LSI zone in the LSM produced with the RF method. The values above bars represent the actual landslide quantity in the respective zone. The resultant Pearson’s correlation coefficient was calculated as 0.93.
Figure 13. LSM produced with the ANN method. AUROC analysis of this map resulted in a predictability potential of 92.47%.
Figure 14. AUROC analysis graph of the LSM produced with the ANN method, resulting in 0.9247 (92.47% predictability potential).
Figure 15. Landslide density per LSI zone in the LSM produced with the ANN method. The values above bars represent the actual landslide quantity in the respective zone. The resultant Pearson’s correlation coefficient was calculated as 0.92.
Figure 16. LSM produced with the LR method. AUROC analysis of this map resulted in a predictability potential of 90.16%.
Figure 17. AUROC analysis graph of the LSM produced with the LR method, resulting in 0.9016 (90.16% predictability potential).
Figure 18. Landslide density per LSI zone in the LSM produced with the LR method. The values above bars represent the actual landslide quantity in the respective zone. The resultant Pearson’s correlation coefficient was calculated as 0.83.
Figure 19. Comparison of ROC curves for the four LSM production methods in the study area. The RF method showed the best results with an AUC of 0.952, while the FR method showed the worst results with an AUC of 0.8424.
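The AUROC scores quoted in the captions above can be read as the probability that a randomly chosen landslide cell receives a higher LSI than a randomly chosen non-landslide cell. The following is a minimal sketch of that rank-based calculation, not necessarily the routine used in the study; the function name `auroc` and the sample labels and scores are hypothetical and serve only as an illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    labels: 1 for landslide cells, 0 for non-landslide cells.
    scores: susceptibility index predicted for each cell.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one landslide and one non-landslide cell")
    # Count landslide/non-landslide pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical example data (not taken from the study).
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.7]
print(auroc(labels, scores))
```

The pairwise loop is quadratic in the number of cells, so for full raster maps a sorted-rank implementation or a library routine would likely be preferable.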
Table 1. Variance inflation factor (VIF) for the utilized LCFs of this study. All LCFs display a VIF < 5, evidencing independence between variables and no multi-collinearity.
| LCF | Variance Inflation Factor (VIF) |
|---|---|
| Lithology | 2.147 |
| Land use | 1.037 |
| Altitude | 1.525 |
| Slope angle | 1.029 |
| Slope aspect | 1.002 |
| Distance from drainage | 1.072 |
| Distance from lineaments | 1.187 |
| Mean annual precipitation | 1.674 |
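For readers who want to reproduce a check like the one in Table 1, the VIF of factor j is 1/(1 − R_j²), where R_j² is the coefficient of determination obtained by regressing factor j on all the other factors. The sketch below assumes that every LCF has already been encoded as a numeric column; that encoding, the function name `vif`, and the synthetic data are assumptions for illustration and are not taken from the study.

```python
import numpy as np


def vif(X):
    """Return the variance inflation factor of each column of X (samples x factors)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        # Regress factor j on the remaining factors plus an intercept term.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return factors


# Synthetic illustration: a and b are independent, c is almost a copy of a.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)

ind = vif(np.column_stack([a, b]))     # both values close to 1
col = vif(np.column_stack([a, b, c]))  # a and c well above the threshold of 5
print(ind, col)
```

A value near 1 means the factor is barely explained by the others, which is the situation Table 1 reports for all listed LCFs.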
Table 2. Summary of the data used in the analysis of this research.
| Data | Format | Source |
|---|---|---|
| Landslides from July 2018 disasters | Point-type shapefile | Geospatial Information Authority of Japan (GSI) [45]. |
| Lithology | Polygon-type shapefile | Hiroshima 1:200,000 geological map (NI-53-33) by Yamada et al. [32]. |
| Altitude | 20 m Raster | DEM by GSI [45]. |
| Slope angle | 20 m Raster | (Extracted from) DEM by GSI [45]. |
| Slope aspect | 20 m Raster | (Extracted from) DEM by GSI [45]. |
| Distance to drainage | Polygon-type shapefile | (Extracted from) DEM by GSI [45]. |
| Distance to lineament | Polygon-type shapefile | Extracted from geological map (Yamada et al. [32]). |
| Land use | Polygon-type shapefile | NLID, MLIT of Japan [44]. |
| Soil class | Polygon-type shapefile | 1:50,000 Kure soil map (NI-53-33-7) by Tanimoto et al. [51]. |
| Mean annual precipitation (2016–2021) | Polygon-type shapefile | XRAIN radar-acquired data, Ministry of Education, Culture, Sports, Science and Technology (MEXT), University of Tokyo’s Data Integration & Analysis System (DIAS) (2018). |
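Table 3. FR calculation table for all classes of each LCF. Classes with no landslide occurrence (and thus with an FR value of 0) are omitted.
| Parameter | Class | Landslides | Area (km²) | Landslides (%) | Area (%) | FR |
|---|---|---|---|---|---|---|
| Lithology | Argillite | 5 | 3.55 | 0.47 | 2.10 | 0.22 |
| | Porphyre granite | 31 | 9.97 | 2.89 | 5.88 | 0.49 |
| | Clastics | 29 | 8.53 | 2.71 | 5.03 | 0.54 |
| | Gravel | 2 | 0.45 | 0.19 | 0.26 | 0.71 |
| | Hiroshima granite | 317 | 64.73 | 29.60 | 38.19 | 0.78 |
| | Granitic rock | 106 | 12.87 | 9.90 | 7.59 | 1.30 |
| | Rhyolites | 581 | 69.40 | 54.25 | 40.94 | 1.32 |
| Land use | Deforested area | 200 | 32.58 | 18.67 | 19.20 | 0.97 |
| | Forest | 751 | 106.49 | 70.12 | 62.76 | 1.12 |
| | Crop field | 15 | 13.06 | 1.40 | 7.70 | 0.18 |
| | Rice field | 1 | 4.91 | 0.09 | 2.90 | 0.03 |
| | Others | 104 | 12.64 | 9.71 | 7.45 | 1.30 |
| Altitude (m) | 0–152 | 27 | 20.98 | 2.52 | 12.24 | 0.21 |
| | 153–236 | 70 | 36.01 | 6.54 | 21.01 | 0.31 |
| | 237–320 | 194 | 36.36 | 18.11 | 21.21 | 0.85 |
| | 321–404 | 378 | 33.82 | 35.29 | 19.73 | 1.79 |
| | 405–488 | 279 | 20.48 | 26.05 | 11.95 | 2.18 |
| | 489–572 | 103 | 10.88 | 9.62 | 6.35 | 1.52 |
| | 573–824 | 12 | 6.81 | 1.12 | 3.97 | 0.28 |
| | >825 | 8 | 6.03 | 0.75 | 3.52 | 0.21 |
| Slope angle (degrees) | 20–24 | 3 | 4.51 | 0.28 | 2.63 | 0.11 |
| | 25–29 | 42 | 16.08 | 3.92 | 9.38 | 0.42 |
| | 30–34 | 230 | 38.76 | 21.48 | 22.61 | 0.95 |
| | 35–39 | 487 | 61.39 | 45.47 | 35.82 | 1.27 |
| | 40–44 | 261 | 39.52 | 24.37 | 23.06 | 1.06 |
| | 45–49 | 47 | 10.34 | 4.39 | 6.04 | 0.73 |
| | ≥50 | 1 | 0.76 | 0.09 | 0.45 | 0.21 |
| Slope aspect | NNE | 144 | 18.86 | 13.45 | 11.00 | 1.22 |
| | ENE | 127 | 19.19 | 11.86 | 11.20 | 1.06 |
| | ESE | 119 | 23.01 | 11.11 | 13.43 | 0.83 |
| | SSE | 102 | 24.38 | 9.52 | 14.22 | 0.67 |
| | SSW | 126 | 24.63 | 11.76 | 14.37 | 0.82 |
| | WSW | 141 | 21.50 | 13.17 | 12.55 | 1.05 |
| | WNW | 169 | 20.37 | 15.78 | 11.89 | 1.33 |
| | NNW | 143 | 19.41 | 13.35 | 11.33 | 1.18 |
| Distance to drainage (m) | <50 | 368 | 59.77 | 34.36 | 34.87 | 0.99 |
| | 50–99 | 185 | 25.84 | 17.27 | 15.08 | 1.15 |
| | 100–149 | 178 | 20.54 | 16.62 | 11.98 | 1.39 |
| | 150–199 | 113 | 15.32 | 10.55 | 8.94 | 1.18 |
| | 200–249 | 79 | 11.42 | 7.38 | 6.66 | 1.11 |
| | 250–299 | 50 | 8.82 | 4.67 | 5.15 | 0.91 |
| | 300–349 | 32 | 6.61 | 2.99 | 3.86 | 0.77 |
| | 350–399 | 25 | 5.06 | 2.33 | 2.95 | 0.79 |
| | 400–449 | 15 | 4.01 | 1.40 | 2.34 | 0.60 |
| | ≥450 | 26 | 13.97 | 2.43 | 8.15 | 0.30 |
| Distance to lineament (m) | <250 | 71 | 9.71 | 6.63 | 5.66 | 1.17 |
| | 250–399 | 96 | 11.21 | 8.96 | 6.54 | 1.37 |
| | 400–549 | 44 | 5.98 | 4.11 | 3.49 | 1.18 |
| | 550–699 | 90 | 11.78 | 8.40 | 6.87 | 1.22 |
| | 700–849 | 38 | 5.46 | 3.55 | 3.18 | 1.11 |
| | 850–999 | 67 | 10.67 | 6.26 | 6.22 | 1.00 |
| | 1000–1149 | 40 | 5.24 | 3.73 | 3.06 | 1.22 |
| | 1150–1299 | 75 | 9.79 | 7.00 | 5.71 | 1.23 |
| | 1300–1449 | 35 | 4.49 | 3.27 | 2.62 | 1.25 |
| | ≥1500 | 515 | 97.05 | 48.09 | 56.62 | 0.85 |
| Soil class | Mih-1 | 35 | 1.37 | 3.16 | 0.12 | 2.56 |
| | ZZ3 | 9 | 0.48 | 0.81 | 0.04 | 1.89 |
| | Fut-1 | 15 | 1.01 | 1.36 | 0.09 | 1.48 |
| | Har-1 | 518 | 39.07 | 46.84 | 3.53 | 1.33 |
| | Ser-1 | 9 | 0.70 | 0.81 | 0.06 | 1.28 |
| | Fjs | 2 | 0.18 | 0.18 | 0.02 | 1.11 |
| | Swa | 7 | 0.79 | 0.63 | 0.07 | 0.89 |
| | Tsm | 4 | 0.55 | 0.36 | 0.05 | 0.73 |
| | Gsa-1 | 344 | 50.44 | 31.10 | 4.56 | 0.68 |
| | Isi-1 | 47 | 9.14 | 4.25 | 0.83 | 0.51 |
| | Har-2 | 63 | 13.00 | 5.70 | 1.18 | 0.48 |
| | Tuc | 9 | 2.10 | 0.81 | 0.19 | 0.43 |
| | Trg | 2 | 0.59 | 0.18 | 0.05 | 0.34 |
| | Gsa-2 | 24 | 10.03 | 2.17 | 0.91 | 0.24 |
| | Kmi | 1 | 0.50 | 0.09 | 0.05 | 0.20 |
| | Urt | 9 | 5.13 | 0.81 | 0.46 | 0.18 |
| | Ngz | 1 | 0.80 | 0.09 | 0.07 | 0.13 |
| | Une-3 | 1 | 1.25 | 0.09 | 0.11 | 0.08 |
| | Une-1 | 1 | 1.36 | 0.09 | 0.12 | 0.07 |
| | ZZ2 | 5 | 8.88 | 0.45 | 0.80 | 0.06 |
| | Ebe | 0 | 0.65 | 0.00 | 0.06 | 0.00 |
| | Gos | 0 | 0.22 | 0.00 | 0.02 | 0.00 |
| | Km | 0 | 0.41 | 0.00 | 0.04 | 0.00 |
| | Kri-1 | 0 | 0.48 | 0.00 | 0.04 | 0.00 |
| | Kyt | 0 | 0.09 | 0.00 | 0.01 | 0.00 |
| | Kzs | 0 | 0.18 | 0.00 | 0.02 | 0.00 |
| | Okk | 0 | 0.05 | 0.00 | 0.00 | 0.00 |
| | Tdn | 0 | 0.24 | 0.00 | 0.02 | 0.00 |
| | Tns | 0 | 0.07 | 0.00 | 0.01 | 0.00 |
| | Ttr | 0 | 0.02 | 0.00 | 0.00 | 0.00 |
| | Yad | 0 | 0.03 | 0.00 | 0.00 | 0.00 |
| | Znt | 0 | 0.05 | 0.00 | 0.00 | 0.00 |
| | ZZ | 0 | 0.08 | 0.00 | 0.01 | 0.00 |
| Mean annual precipitation (mm) | <2100 | 14 | 6.28 | 1.31 | 3.66 | 0.36 |
| | 2100–2195 | 136 | 50.26 | 12.70 | 29.32 | 0.43 |
| | 2196–2291 | 381 | 65.62 | 35.57 | 38.28 | 0.93 |
| | 2292–2386 | 233 | 21.41 | 21.76 | 12.49 | 1.74 |
| | 2387–2482 | 133 | 10.84 | 12.42 | 6.33 | 1.96 |
| | 2483–2578 | 91 | 7.38 | 8.50 | 4.31 | 1.97 |
| | 2579–2674 | 53 | 5.82 | 4.95 | 3.40 | 1.46 |
| | 2675–2769 | 27 | 3.38 | 2.52 | 1.97 | 1.28 |
| | 2770–2865 | 3 | 0.37 | 0.28 | 0.22 | 1.29 |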
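The FR column of Table 3 follows from the other columns: for each class, FR is the share of landslides falling in that class divided by the share of area the class occupies. As a minimal sketch, the snippet below recomputes the lithology values from the landslide counts and areas listed in the table. The function name `frequency_ratio` is illustrative, and taking the totals as the sum over the listed classes is an assumption, so small rounding differences from the published figures are to be expected.

```python
# (class, landslide count, class area in km2) from the lithology rows of Table 3.
lithology = [
    ("Argillite", 5, 3.55),
    ("Porphyre granite", 31, 9.97),
    ("Clastics", 29, 8.53),
    ("Gravel", 2, 0.45),
    ("Hiroshima granite", 317, 64.73),
    ("Granitic rock", 106, 12.87),
    ("Rhyolites", 581, 69.40),
]


def frequency_ratio(rows):
    """Return {class: FR}, where FR = landslide share / area share."""
    total_landslides = sum(count for _, count, _ in rows)
    total_area = sum(area for _, _, area in rows)
    return {
        name: (count / total_landslides) / (area / total_area)
        for name, count, area in rows
    }


fr = frequency_ratio(lithology)
for name, value in fr.items():
    print(f"{name}: {value:.2f}")
```

An FR above 1 marks a class that hosts more landslides than its areal share would suggest (here, the rhyolites and granitic rock), while a value below 1 marks the opposite.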
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Rodrigues Neto, J.M.d.S.; Bhandary, N.P. Landslide Susceptibility Assessment by Machine Learning and Frequency Ratio Methods Using XRAIN Radar-Acquired Rainfall Data. Geosciences 2024, 14, 171. https://doi.org/10.3390/geosciences14060171
