#### *2.1. Data*

An important question in studies of expected human losses is whether a conditioning variable is actually useful and needed for assessment and prediction. Many factors affect earthquake casualties, such as the intensity of the earthquake, the vulnerability of houses, and the economic development of the stricken area. However, data on some factors are not available for every earthquake. We chose the following ten features: date, time, magnitude, epicentral intensity, abnormal intensity, focal depth, secondary disasters, population density, economic situation, and damage ratio of different structure types.


9. Most of the fatalities are caused by building damage [2] and this factor is vital to the number of deaths [13]. Therefore, this paper chooses the damage ratio of houses as a feature. Damage ratios consist of collapsed structures, heavy damage, moderate damage, and slight damage.

10. Different earthquake occurrence dates can sometimes lead to aggravation of earthquake damage; for instance, rainy and snowy weather will affect rescue efforts. Dates were processed quarterly, and a year is divided into four quarters in this study (see the short sketch after this list).
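As a minimal, hypothetical sketch of this quarterly date handling (the date format and example dates below are illustrative assumptions, not records from the paper's data set), the month of each earthquake can be mapped to one of four quarters:

```python
from datetime import datetime

def quarter_of(date_str: str) -> int:
    """Map an earthquake date (assumed format 'YYYY-MM-DD') to a quarter 1-4."""
    month = datetime.strptime(date_str, "%Y-%m-%d").month
    return (month - 1) // 3 + 1

# Hypothetical examples
print(quarter_of("2008-05-12"))  # -> 2 (second quarter)
print(quarter_of("1999-11-29"))  # -> 4 (fourth quarter)
```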

#### *2.2. Importance Assessments of 9 Features*

The selected supervised classifiers are random forests (RF), adaptive boosting (AdaBoost), and classification and regression tree (CART). The CART algorithm, a decision tree model and a non-parametric data mining method, has many advantages, including ease of handling numerical and categorical data and multiple-output situations [25]. The CART algorithm is a component learner that uses the Gini index as its division standard, while ensemble learning combines multiple weak classifiers using different methods. The most common ensemble learning methods are the bootstrap aggregating (bagging) method and the boosting method, in which bagging is a parallel algorithm and boosting is a sequential algorithm. The RF algorithm, an extension of the bagging method, exploits random binary trees to discriminate and classify data [7]. The AdaBoost approach, a boosting algorithm, constructs a strong classifier from weak classifiers and updates the weights of the samples based on the learning error, as shown in Figure 2. Each sample in the training data is given an equal weight at first. A weak classifier is trained on the training data and its error rate ε is calculated. Then, the classifier is trained again on the reweighted data: the weights of incorrectly classified samples are increased while the weights of correctly classified samples are reduced relative to the first classification. AdaBoost calculates ε for each weak classifier and assigns a weight α to each classifier. In Figure 2, the first row is the data set, where the different widths of the histograms represent the different weights of the samples. After passing through a classifier, the data set in the third row is weighted by α [26]. The final output is obtained by summing the weighted results. The error rate ε (Equation (1)) and the weight α (Equation (2)) are calculated as follows:

$$
\varepsilon = \frac{\text{N1}}{\text{N2}} \tag{1}
$$

$$\alpha = \frac{1}{2}\ln\left(\frac{1 - \varepsilon}{\varepsilon}\right) \tag{2}$$

where N1 is the number of incorrectly classified samples and N2 is the total number of classified samples.
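To make the weight update concrete, the following minimal sketch evaluates Equations (1) and (2) for a single weak classifier; the sample counts are hypothetical, and the logarithmic form used for Equation (2) is the standard AdaBoost expression rather than a quotation of the study's code.

```python
import numpy as np

def classifier_weight(n_misclassified, n_total):
    """Error rate (Equation (1)) and classifier weight (Equation (2)) of a weak learner."""
    eps = n_misclassified / n_total           # Equation (1): epsilon = N1 / N2
    alpha = 0.5 * np.log((1.0 - eps) / eps)   # Equation (2)
    return eps, alpha

# Hypothetical weak learner that misclassifies 40 of 200 training samples.
eps, alpha = classifier_weight(40, 200)
print(f"epsilon = {eps:.2f}, alpha = {alpha:.3f}")  # epsilon = 0.20, alpha = 0.693
```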

**Figure 2.** Simple example showing that AdaBoost can construct a strong classifier from a set of weak classifiers.

We worked with the CART, RF, and AdaBoost models implemented in Jupyter from the Anaconda Navigator. Data generally need to be standardized in machine learning. We used the StandardScaler preprocessing method of the sklearn library to process the magnitude, focal depth, epicentral intensity, and population density. Equation (3) presents the process:

$$x = (x - \mu)/\sigma \tag{3}$$


where x is the feature value, μ is the mean of the data, and σ is the standard deviation of the data. The parameter selection of each model is shown in Table 2. Table 3 presents the results of verifying the models with the cross validation score function of the sklearn library, and the importance of the nine features in casualty assessment is shown in Figure 3.
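A minimal sketch of this workflow (standardization, cross-validated scoring, and extraction of feature importances) is given below, assuming the data live in a hypothetical Excel file with illustrative column names; the estimator count of 379 appears among the paper's reported parameter values, but its attribution to RF here is an assumption, and the remaining settings are placeholders rather than the exact configuration of Table 2.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical file and column names; the label column holds the casualty class.
df = pd.read_excel("earthquake_cases.xlsx")

# Standardize the four continuous features (Equation (3)).
continuous = ["magnitude", "focal_depth", "epicentral_intensity", "population_density"]
df[continuous] = StandardScaler().fit_transform(df[continuous])

X = df.drop(columns=["casualty_class"])
y = df["casualty_class"]

models = {
    "CART": DecisionTreeClassifier(criterion="gini"),
    "RF": RandomForestClassifier(n_estimators=379, criterion="gini"),
    # SAMME.R as mentioned for Table 2; newer sklearn releases may only accept "SAMME".
    "AdaBoost": AdaBoostClassifier(algorithm="SAMME.R"),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=10)        # Table 3-style verification
    print(name, "mean CV accuracy:", scores.mean())
    importances = model.fit(X, y).feature_importances_  # Figure 3-style importances
    print(dict(zip(X.columns, importances.round(3))))
```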

**Table 2.** Parameters of random forest (RF), CART, and AdaBoost (unset parameters were left at their sklearn defaults; AdaBoost has two classification algorithms, SAMME and SAMME.R, of which SAMME.R is better because it is based on class probability).


**Table 3.** Results of testing the models on an unseen validation set of the 9 features.


**Figure 3.** Comparison of the relative importance of the input variables obtained with the RF, CART, and AdaBoost algorithms. To make the contrast clearer and more readable, the bar chart was drawn with the SigmaPlot software.

#### *2.3. Importance Assessments of Structure Types*


The importance of different structure types was assessed separately because of the complexity of the structure types and their great contribution to the death toll [27]. For the accuracy and comprehensiveness of the assessment, this study listed 43 universal and special structure types from the Earthquake Disasters and Losses Assessment Report in Chinese Mainland from 1992 to 2007: reinforced concrete frame structure, masonry structure, brick-wood structure, civil structure, national brick-wood structure, brick-concrete structure, bucket-piercing frame structure, brick-concrete structure (building of two or more floors), shed, brick-concrete structure (building of only one floor), brick-column civil structure, wing-room, national civil structure, simple house with dry fortified earth wall, brick-column adobe structure, dry brick building, brick structure, timber structure, timber stack structure, timber framework, brick-masonry structure, frame structure, adobe structure, wood-column adobe structure, reinforced frame structure, mixed house, brick adobe structure, general houses (owned by citizens), stone-wood structure, stone structure, simple house, old Tibetan house, stone-grass structure, stone-concrete structure, alunite house, earth rock house, industrial plant, steel frame structure, soil tamper structure, cave dwelling, wooden frame house, flag stone house, and general house.

Structural damage was divided into five grades: collapse, heavy damage, moderate damage, slight damage, and basically undamaged. We chose the collapse and heavy damage of buildings, together with population density, as input parameters. First, deaths were caused almost entirely by the collapse and heavy damage of buildings. Second, population density was closely related to seismic casualties and the number of buildings. We selected only the random forest algorithm to assess the importance of structure types because the mean accuracy of the RF model was higher than that of the other algorithms, as seen in Table 3. Table 4 presents the procedure of RF in feature assessment. The importance of the structure types can be seen in Figure 4.
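A hedged sketch of how such a random-forest importance ranking could be produced is shown below; the file name, the DataFrame columns (collapse and heavy-damage ratios per structure type plus population density), the label name, and the parameter values are illustrative assumptions rather than the paper's actual fields or settings.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical table: per earthquake, collapse/heavy-damage ratios of each
# structure type, population density, and a casualty class label.
struct_df = pd.read_excel("structure_damage.xlsx")

X = struct_df.drop(columns=["casualty_class"])
y = struct_df["casualty_class"]

# Illustrative parameters; the tuned settings of Table 2 are not reproduced here.
rf = RandomForestClassifier(random_state=0).fit(X, y)

# Rank structure types by their contribution, as visualized in Figure 4.
for name, importance in sorted(zip(X.columns, rf.feature_importances_),
                               key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {importance:.3f}")
```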

**Table 4.** Steps of the random forest algorithm.

**Figure 4.** Importance of structure type (the structure types that contribute little to the casualty are not shown; the importance of general houses and the structure types below it is close to zero).

**3. Model**

#### *3.1. Data*

We chose the population density, magnitude, focal depth, epicentral intensity, and time as the input features of the deep learning model. The reasons were as follows:

1) Features were selected from high to low in the order of feature importance, according to their subjectivity, the number of occurrences, and the time at which they could be obtained after the earthquake.

2) The results of the importance assessment show the relative values of the nine factors that contribute to the assessment of the casualty, rather than the absolute value of each factor alone. Time ranked seventh instead of higher in the importance assessment results because we only divided it into two parts. In the deep learning model, we did not classify the time. Therefore, we chose the time because it could be obtained almost at the same time as the earthquake and it had no subjectivity in the deep learning model.

3) The division of the economic status had a strong subjectivity, and the data selected in the test set were the economic situation of the year in which the earthquake occurred. With annual inflation and the depreciation of the currency, the situation of that year may not be applicable to the future.

4) Although the date was more important than the time, the subjectivity of the date was also great, as it was only divided into four seasons.

5) The secondary disasters and abnormal intensity were only divided into yes and no. The combined number of the two phenomena was small and not every earthquake was accompanied by them.

6) The structure types were too complex. The structures destroyed during each earthquake were different and every destruction was divided into five parts.

We collected 289 destructive earthquakes that occurred in the Chinese mainland from 1992 to 2017 from the Earthquake Disasters and Losses Assessment Report in Chinese Mainland (Table A1). The Excel table data were pre-processed with the openpyxl module for the deep learning method, and a data set of 228 earthquake cases without missing values was returned. Among these data, we selected 180 data as
