Classification of Seismaesthesia Information and Seismic Intensity Assessment by Multi-Model Coupling

Lv, Qingzhou; Liu, Wanzeng; Li, Ran; Yang, Hui; Tao, Yuan; Wang, Mengjiao

doi:10.3390/ijgi12020046


by Qingzhou Lv ^1,2,3, Wanzeng Liu ^1,2,4,*, Ran Li ^1,2,4, Hui Yang ^5, Yuan Tao ^2,5 and Mengjiao Wang ^6

^1 Hubei Luojia Laboratory, Wuhan 430079, China
^2 National Geomatics Center of China, Beijing 100830, China
^3 Faculty of Land Resources Engineering, Kunming University of Science and Technology, Kunming 650093, China
^4 Key Laboratory of Spatio-Temporal Information and Intelligent Services, Ministry of Natural Resources of China, Beijing 100830, China
^5 School of Resources and Geosciences, China University of Mining and Technology, Xuzhou 221116, China
^6 Institute of Geographic Sciences and Nature Resources Research, Chinese Academy of Sciences, Beijing 100101, China
^* Author to whom correspondence should be addressed.

ISPRS Int. J. Geo-Inf. 2023, 12(2), 46; https://doi.org/10.3390/ijgi12020046

Submission received: 15 November 2022 / Revised: 20 January 2023 / Accepted: 30 January 2023 / Published: 31 January 2023

(This article belongs to the Special Issue GIS Software and Engineering for Big Data)


Abstract:

Earthquake disaster assessment is one of the most critical aspects of reducing earthquake disaster losses. However, traditional seismic intensity assessment methods are not effective in disaster-stricken areas with insufficient observation data. Social media data contain a large amount of disaster information and have the advantages of timeliness and multiple spatio-temporal scales, opening up a new channel for seismic intensity assessment. Based on earthquake disaster information obtained from the microblog platform by web crawling, a multi-model coupled seismic intensity assessment method is proposed, which is based on the BERT-TextCNN model, constrained by the seismaesthesia intensity attenuation model, and supplemented by ellipse-fitting inverse distance interpolation. Taking four earthquakes in Sichuan Province as examples, the earthquake intensity in the affected areas was evaluated from the perspective of seismaesthesia. The results show that (1) the microblog data contain a large amount of earthquake information, which can help identify the approximate scope of the disaster area; (2) the influences of the subjectivity and uneven spatial distribution of microblog data on the seismic intensity assessment can be reduced by using the seismaesthesia intensity attenuation model and ellipse-fitting inverse distance interpolation; and (3) the accuracy of seismic intensity assessment based on the coupled model is 70.81%. Thus, the model has relatively high accuracy and good universality. It can be used to assess seismic intensity in multiple regions and assist in the formulation of earthquake relief plans.

Keywords:

multi-model coupling; natural language processing; BERT-TextCNN model; social media data; seismic intensity assessment

1. Introduction

Earthquake disasters are known as one of the most destructive disasters because of their suddenness, unpredictability, strongly destructive ability, and wide impact range, which pose a great threat to the safety of people’s lives and property [1,2]. Post-earthquake disaster information acquisition and rapid intensity assessment have always been the most core aspect in earthquake emergency rescue actions of government departments at all levels and are also hot topics in academic research. The internationally common methods for rapid intensity assessment include empirical models based on statistical relations, seismic intensity rapid reporting based on strong motion observation networks, and rapid assessment based on remote sensing methods [3]. In those methods, calculating intensity based on the statistical relation model—which is built on the basis of the statistics of historical earthquake cases—is the most common method for rapid post-earthquake assessment, and particularly its elliptical model is widely used [4]. According to historical earthquake case data, Nie Gaozhong and Xu Jinghai established—by fitting—a rapid assessment model for polar seismic zones with earthquake magnitude and source depth as input parameters, and have applied it to several destructive earthquakes since 2014 [5]. However, their method relies too much on historical earthquake data, and it is difficult to establish an accurate intensity attenuation model for earthquake disaster areas with insufficient historical data, so the accuracy of the assessment results cannot be ensured. The assessment method of intensity rapid reporting based on a strong motion network aims to establish the transformation relationship between seismic intensity and ground motion parameters. Relevant research results have been applied to some large and medium-sized cities to establish similar intensity rapid report systems [6,7]. 
However, due to objective conditions, the distribution of strong motion observation networks in mainland China is uneven. At present, the instrumental intensity given by strong motion observation networks cannot be released to the public as a formal result, so it is still difficult to apply such methods in earthquake emergency rescue [8]. Because satellite remote sensing data offer wide coverage, a short imaging period, and night imaging, remote sensing satellites and unmanned aerial vehicles can be used to survey earthquake areas quickly and comprehensively and to conduct large-scale disaster assessments, which can overcome the shortcomings of statistical models and strong motion observation networks [9]. Wang S. et al. established a remote sensing satellite receiving station and developed a geometric correction algorithm for remote sensing satellite data based on conjugate triangles and affine transformation, and then detected and evaluated earthquake disasters with reference to surface temperature, vegetation index, and aerosol data [10]. However, rapid intensity assessment by remote sensing is greatly limited by image acquisition conditions such as resolution and cloud cover, and it performs poorly in recognizing areas of weak seismic intensity.

At the beginning of the 1990s, the United States Geological Survey (USGS) implemented a project called “Did You Feel It?” (DYFI), in which electronic questionnaires were distributed to the public in earthquake-affected areas; their feelings and responses to the earthquake were summarized, the earthquake intensity results were obtained with postal code as the statistical unit, and the idea of disaster assessment was put forward based on the social perception data [11]. In recent years, the rapid development of big data, mobile Internet, and social media has made it possible to assess seismic intensity based on the concept of “crowdsourcing” [12,13]. Bo Tao et al. designed and developed a public service information system regarding earthquakes to collect disaster situations and intensity reporting information from mobile users through smartphones and to quickly sketch intensity maps [14]. Fan et al. collected mobile phone positioning data and their time-varying situations before and after the Jiuzhaigou earthquake, and analyzed the correlation between the personnel density, flow direction, and seismic intensity [15]. Sakaki T. et al. studied the real-time early seismic warning problem by analyzing and mining the data on Twitter [16]. Xu et al. established a classification mapping table between microblog data and earthquake disasters in accordance with the sign-in data of microblog users describing their feelings about the earthquake, and assessed the seismic intensity with the spatial distribution of microblog data [17]. Although the above methods can be expected to basically achieve the purpose of assessing the intensity of earthquake disasters, the traditional data mining methods based on keyword matching combined with manual interpretation are inefficient in the face of the complex and changeable online text data; it is difficult to fully reflect the advantages of timeliness and diversity of data, which seriously restricts the disaster reduction ability of models.

With the rapid development of computer technology, deep learning technology has gradually matured in natural language processing, making the concept of “citizens as sensors” gradually become a reality [18,19]. By a large number of deep learning models, such as Bidirectional Encoder Representations from Transformers (BERT), Convolutional Neural Networks (CNN), and Generative Pre-training (GPT) models, target information can be automatically extracted from social media data after their training and fine-tuning and complex tasks—such as text clustering and grading through the feature extraction and semantic understanding of data—can be realized [20,21,22,23]. Ruan T. et al. used machine learning methods to analyze the reactions of different groups of people to the earthquake that struck Ridgecrest in 2019 based on the data from Reddit and Twitter platforms [24]. Bo et al. proposed a rapid assessment model based on machine learning methods to estimate seismic intensity [25]. With Bo’s study as a basis, Wu Xinhua and Luan Cuiju proposed that the accuracy of assessment results can be improved through keyword filtering and time relationship recognition [26]. Yao K. et al. used microblog data to extract seismic intensity information and adopted the grid correction method of the comprehensive thermal intensity matrix and the seismic attenuation model to improve the accuracy of the intensity assessment [27]. From the perspective of data, the existing examples of research mostly focus on the mining of social media data, emphasizing the advantages of massive social media data, but ignoring its disadvantages such as a large amount of noise, strong subjectivity, and uneven distribution, which have serious impact on the accuracy of assessment results. 
From the perspective of research methods, existing studies mostly rely on a single intensity assessment method; the related models have weak anti-interference ability and poor universality and are difficult to transfer, so they cannot meet the needs of seismic intensity assessment in different regions. Therefore, this paper proposes a multi-model coupled seismic intensity assessment method, which uses the Bidirectional Encoder Representations from Transformers-Text Convolutional Neural Networks (BERT-TextCNN) model to classify the seismaesthesia level of microblog disaster data, the seismaesthesia intensity attenuation model to address the subjectivity of microblog data, and the ellipse-fitting interpolation method to address insufficient data in some disaster areas, so as to improve the accuracy and universality of the seismic intensity assessment model.

2. Data Acquisition and Processing

2.1. Microblog Data Acquisition

Microblog (Weibo) is a broadcast-style social media platform for sharing short, real-time information through a follow mechanism. Being simple and convenient to use, with rich functions, diverse data, and a solid user base, it is considered one of the most popular and largest social media platforms. The continuous stream of microblog data from hundreds of millions of users is an ideal data source for earthquake disaster assessment. At present, there are four main ways to obtain microblog data: the microblog Application Programming Interface (API), web crawlers based on the Uniform Resource Locator (URL), data source mirroring, and open data platforms. Due to the particularity of microblog text data and the limitations of the microblog platform API, it is often difficult for a single data acquisition method to fully meet the needs of data acquisition in terms of speed, efficiency, depth, and breadth. Therefore, a method combining the API and URL crawling is employed to crawl microblog data, and the specific process is as follows.

(1): Using the open microblog API, earthquake-related microblog posts and their publisher IDs are retrieved by searching keywords such as “earthquake”, “seismaesthesia”, and “shaking” within the earthquake period. Due to the limitations of the API functions, this crawling method cannot obtain the location coordinates of the posts, and it is difficult to crawl the data comprehensively.
(2): The microblog API is used to crawl the follow lists, and the IDs of users located in the same province and city as the current user are selected from that user's followers and followed accounts; these users are more likely to be affected by the earthquake and to publish microblogs that are helpful for seismic intensity assessment. Their microblog content is then crawled to compensate for the incompleteness of the keyword search, expand the amount of disaster data, and improve the accuracy of the assessment.
(3): According to the publisher ID of each microblog post, URL-based web crawler technology is used to obtain the location coordinates and the numbers of comments, likes, and forwards from the post's page. If a post has no coordinate attribute, a null value is returned.
(4): All the crawled data are aggregated by publisher ID, and the earthquake disaster data with location coordinate attributes are screened out as one of the bases for the seismic intensity assessment.
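The aggregation and screening in step (4) can be sketched as follows. This is a minimal illustration rather than the authors' crawler: the record fields (`post_id`, `publisher_id`, `lon`, `lat`) are hypothetical names, the sample posts are invented, and the de-duplication by post ID is an added assumption to cover overlap between the keyword search and the follow-list crawl.

```python
from collections import defaultdict


def screen_located_posts(posts):
    """Merge crawled posts, drop duplicates, and keep only geotagged ones.

    Returns a dict mapping publisher ID -> list of posts with coordinates.
    Field names are hypothetical; adapt them to the actual crawl output.
    """
    seen_post_ids = set()
    by_publisher = defaultdict(list)
    for post in posts:
        # Assumption: steps (1) and (2) may return the same post twice,
        # so repeated post IDs are skipped.
        if post["post_id"] in seen_post_ids:
            continue
        seen_post_ids.add(post["post_id"])
        # Step (3) yields a null value when a post has no coordinates.
        if post["lon"] is None or post["lat"] is None:
            continue
        by_publisher[post["publisher_id"]].append(post)
    return dict(by_publisher)


# Invented sample records, for illustration only.
posts = [
    {"post_id": 1, "publisher_id": "u1", "text": "strong shaking", "lon": 105.3, "lat": 29.2},
    {"post_id": 2, "publisher_id": "u1", "text": "earthquake?", "lon": None, "lat": None},
    {"post_id": 3, "publisher_id": "u2", "text": "felt it", "lon": 104.1, "lat": 30.6},
    {"post_id": 3, "publisher_id": "u2", "text": "felt it", "lon": 104.1, "lat": 30.6},
    {"post_id": 4, "publisher_id": "u3", "text": "news repost", "lon": None, "lat": None},
]
located = screen_located_posts(posts)
```

Only posts that survive this screening carry the spatial attribute needed by the later attenuation and interpolation steps.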

The crawler based on the API and URL method was used to crawl microblogs related to four earthquakes on the microblog platform: the magnitude 6.0 earthquake in Luxian County, Luzhou City, Sichuan Province on 16 September 2021; the magnitude 6.8 earthquake in Luding County, Garze Tibetan Autonomous Prefecture, Sichuan Province on 5 September 2022; the magnitude 6.1 earthquake in Lushan County, Ya’an City, Sichuan Province on 1 June 2022; and the magnitude 6.0 earthquake in Changning County, Yibin City, Sichuan Province on 17 June 2019. The crawled content includes microblog text, publisher ID, post location coordinates, posting time, number of retweets, number of comments, and number of likes (Table 1). Considering the timeliness of the data, a total of 106,913 items posted within 24 h after the earthquakes were screened out, of which 9114 carry location coordinates.

2.2. Other Data Acquisition and Processing

In addition to a large amount of microblog data, earthquake intensity assessment also needs to combine China’s seismic distribution zone data, officially released seismic intensity maps, and historical seismic data such as earthquake magnitude, depth, and epicenter coordinates (Table 2) to predict the spatial morphology of the influenced range of earthquake intensity and verify the accuracy of the model’s assessment results.

The China seismic zone map is a distribution map that shows the susceptibility to earthquakes of various parts of China. Since 2004, China has been studying seismically active fault zones in 21 large cities with concentrated populations. China is mainly divided into five seismic zones: North China, Qinghai-Tibet Plateau, Xinjiang, Taiwan, and South China. The four earthquakes in the scope of this study belong to the Qinghai-Tibet Plateau seismic zone. The spatial distribution pattern and direction of faults in seismic zones are of great significance to delimiting the scope of influence of an earthquake and are indispensable factors in predicting seismic intensity. The data on China’s seismic zones in this paper are derived from the National Earthquake Science Data Center.

Seismic data such as the epicenter location, the magnitude, and the focal depth are the main factors that determine the origin and destructive power of an earthquake disaster. They are often used in rough seismic intensity assessment and are also among the parameters of the seismic intensity attenuation model. Usually, such data are obtained by calculating and analyzing the seismic waves received by the instruments at seismic stations. The historical seismic data in this paper are from the China Earthquake Networks.

An earthquake intensity map indicates the extent to which the ground and buildings in a certain area are affected by an earthquake (or the degree of earthquake impact and damage). According to the damage degree of buildings, the change of the ground surface, the feeling of people during the earthquake, or the reaction degree of objects after the earthquake, the seismic intensity of different areas from the epicenter is assessed, and the isointensity lines are drawn as a description of the damage degree of the earthquake. The earthquake intensity map used in this experiment comes from Sichuan Provincial Earthquake Bureau, which is mainly used to verify the accuracy of the model.

3. Research Methods and Ideas

In this study, social media crawler technology was used to obtain earthquake disaster information from the microblog platform. A seismaesthesia classification system was then established according to the correspondence between earthquake type and seismic intensity in the China Seismic Intensity Classification Table, and the microblog data were divided into five levels corresponding to five earthquake types. Next, manual annotation was employed to build a data set for training the BERT-TextCNN model, which classifies the seismaesthesia level of earthquake-related microblog text. The seismaesthesia intensity attenuation model was then used to calculate the seismaesthesia intensity value of each microblog data point from the spatial and quantitative attributes and the seismaesthesia level of the data. Based on historical earthquake data, the distribution map of China’s seismic zones, and microblog data, the scope of earthquake impact was roughly determined. Finally, ellipse-fitting inverse distance interpolation was used to interpolate and supplement the affected areas with insufficient data from the seismaesthesia intensity values of the existing microblog data points, in order to improve the accuracy of the seismaesthesia assessment result.

In this paper, the Luxian County earthquake in Sichuan Province, for which high-quality data are available, is taken as the reference case for seismic intensity assessment. Its seismaesthesia assessment results are overlaid on the officially released seismic intensity map and analyzed, so as to map the seismaesthesia intensity values of the microblog data to seismic intensity levels and determine the seismic intensity grading thresholds. The three earthquakes in Luding County, Changning County, and Lushan County of Sichuan Province are then taken as examples: the multi-model coupled seismic intensity assessment is carried out, and the model’s assessment accuracy is verified against the officially released seismic intensity maps. The results show that the model is feasible, accurate, and universal (Figure 1).

3.1. Establishment of the Seismaesthesia Classification System of Microblog Data

According to the classification indicators (human perception, object response, building damage, and other phenomena) in the China Seismic Intensity Scale and the correspondence between seismic intensity and earthquake type, we established a mapping between microblog data and the level of seismaesthesia, and built a seismaesthesia classification system for earthquake disaster data. The earthquake-related microblog data are divided into five levels, numbered 0 to 4. Level 0 indicates that no seismaesthesia is expressed in the microblog data; this corresponds to weak earthquakes, with a seismic intensity of degree 1–2. Level 1 indicates that slight seismaesthesia is expressed; this corresponds to felt earthquakes, with a seismic intensity of degree 3–4. Level 2 indicates that obvious seismaesthesia is expressed; this corresponds to moderate earthquakes, with a seismic intensity of degree 5–6. Level 3 indicates that strong seismaesthesia is expressed; this corresponds to strong earthquakes, with a seismic intensity of degree 7–8. Level 4 indicates that drastic seismaesthesia is expressed; this corresponds to disastrous earthquakes, with a seismic intensity of degree 9 or higher (Table 3).
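The five-level system can be written down as a small lookup table. The sketch below reads the numeric ranges as degrees on the China Seismic Intensity Scale, which is an interpretation of the text rather than a reproduction of Table 3; the names `SEISMAESTHESIA_LEVELS` and `level_for_intensity` are hypothetical.

```python
# level -> (seismaesthesia description, earthquake type, (min degree, max degree))
# A max degree of None means "this degree or higher".
SEISMAESTHESIA_LEVELS = {
    0: ("none", "weak", (1, 2)),
    1: ("slight", "felt", (3, 4)),
    2: ("obvious", "moderate", (5, 6)),
    3: ("strong", "strong", (7, 8)),
    4: ("drastic", "disastrous", (9, None)),
}


def level_for_intensity(degree):
    """Return the seismaesthesia level whose intensity range contains `degree`."""
    for level, (_, _, (low, high)) in SEISMAESTHESIA_LEVELS.items():
        if degree >= low and (high is None or degree <= high):
            return level
    raise ValueError(f"intensity degree {degree} is outside the scale")


level = level_for_intensity(6)  # falls in the 5-6 range
```

Keeping the mapping in one table makes it easy to check annotated training labels against the scale.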

3.2. Establishment of the BERT-TextCNN Model

Considering the characteristics of the two models, the BERT model, which is good at word embedding, is combined with the TextCNN model, which is good at text classification. The word embedding results of the BERT model are input into the embedding layer of the TextCNN model, and a BERT-TextCNN model that can classify the seismaesthesia level of earthquake data is constructed.

3.2.1. The Principle of the BERT-TextCNN Model

The BERT model (Bidirectional Encoder Representations from Transformers) is a language understanding model proposed by Google in 2018. Its core structure stacks multiple Transformer encoders, uses bidirectional encoding to understand contextual semantics, and extracts features of the target text so that the word embeddings are accurate and adapt better to downstream tasks. CNNs have been widely used in various fields. In 2014, Yoon Kim put forward an efficient and accurate text classification model, the TextCNN model, which achieves high-precision text classification through operations such as convolution, pooling, concatenation, and normalization.

Firstly, the disaster data crawled from microblog is input into BERT pre-training model, and each text data is vectorized according to the ID sequence of the thesaurus in the BERT pre-training model to obtain the initial word vectors. Next, the word vectors are transferred to the self-attention mechanism in the Transformer Encoder layer to update the attention value of each vector by performing matrix operations. Multiple self-attention mechanisms are combined into a multi-head self-attention mechanism to achieve parallel calculating and grouping processing of attention value, obtain high dimensional feature information of the text, and fully understand the semantic relationship between contextual phrases. Then, the output value of the multi-head self-attention layer is passed to the residual network and the normalization layer for processing to obtain the output result of the Transformer Encoder. Each time the Transformer Encoder is calculated, the model will be fine-tuned to make the output word vector more suitable for the classification requirements of the data. When the data passes through all Transformer Encoder layers, the model training ends. At this time, the word vector contains many text language features. The numerical differences of different word vectors can reflect their language differences. After that, the word vector output by the BERT model is transferred to the TextCNN model, making the extracted text features convolved and pooled so that the features are mapped to higher dimensions for classification. Finally, the classification probability of the text is calculated by the Softmax layer, and the maximum value is taken as the classification result (Figure 2).
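The TextCNN stage of this pipeline (convolution, pooling, concatenation, Softmax, and taking the most probable class) can be illustrated with a framework-free toy. This is only a sketch: the paper's model runs in PyTorch on 768-dimensional BERT outputs, whereas the 2-dimensional "word vectors", kernels, and weights below are invented values chosen to keep the arithmetic visible.

```python
import math


def conv_relu(embeddings, kernel, bias=0.0):
    """Slide a (width x dim) kernel over the token sequence; ReLU per window."""
    width = len(kernel)
    outputs = []
    for start in range(len(embeddings) - width + 1):
        total = bias
        for offset in range(width):
            row = embeddings[start + offset]
            total += sum(w * x for w, x in zip(kernel[offset], row))
        outputs.append(max(0.0, total))
    return outputs


def softmax(logits):
    """Numerically stable Softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def textcnn_forward(embeddings, kernels, weights, biases):
    """Return (predicted class index, class probabilities)."""
    # Max-over-time pooling keeps the strongest response of each kernel;
    # the pooled values of all kernels are concatenated into one vector.
    pooled = [max(conv_relu(embeddings, k)) for k in kernels]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    probs = softmax(logits)
    return probs.index(max(probs)), probs


# Stand-in for BERT output: 4 tokens, 2 dimensions each (invented values).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],              # window of 2 tokens
    [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]],  # window of 3 tokens
]
weights = [[1.0, 0.0], [0.0, 0.5]]  # 2 classes x 2 pooled features
biases = [0.0, 0.0]
label, probs = textcnn_forward(embeddings, kernels, weights, biases)
```

In a real setup the kernels and classifier weights are learned, and several kernels per window size are used.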

3.2.2. The Training and Accuracy Verification of the BERT-TextCNN Model

In this study, 5000 pieces of data were randomly selected from the earthquake disaster data crawled from the microblog platform. The seismaesthesia level reflected by each piece of data was then manually labeled according to the seismaesthesia classification system to build the dataset for the BERT-TextCNN model. Finally, the dataset was divided into a training set (3000 pieces of data), a test set (1000 pieces of data), and a validation set (1000 pieces of data), which were input into the BERT-TextCNN model for training.
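A 3000/1000/1000 split of this kind can be sketched as below. The text does not state how the split was drawn, so the seeded random shuffle is an assumption, and the placeholder samples and labels are invented.

```python
import random


def split_dataset(samples, n_train=3000, n_test=1000, n_val=1000, seed=42):
    """Shuffle labeled samples and cut them into train, test and validation sets."""
    if len(samples) != n_train + n_test + n_val:
        raise ValueError("sample count does not match the requested split sizes")
    shuffled = list(samples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]
    return train, test, val


# Placeholder (text, label) pairs standing in for the annotated posts.
labelled = [(f"post {i}", i % 6) for i in range(5000)]
train, test, val = split_dataset(labelled)
```

Fixing the seed keeps the three subsets reproducible across runs.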

The pre-trained BERT model used in this paper is the Bert_Chinese_L-12_H-768_A-12 Chinese model released by Google. The CNN model is the TextCNN model proposed by Yoon Kim, and the training environment is Python 3.7 with the PyTorch 1.1 framework.

It can be seen from Figure 3 that the model converges rapidly in the training process. With the increase in Epoch number, the loss value of the model decreases rapidly, and the accuracy (Acc) of the training set and test set increases rapidly. When the model reaches the 25th Epoch, the accuracy of the training set is basically stable at 98%, and the accuracy of the test set is slowly improved until the end of the training.

The classification accuracy of the BERT-TextCNN model is verified using the validation set. Precision is the proportion of samples assigned to a class that truly belong to it, recall is the proportion of samples of a class that are correctly identified, and the F1-score is the harmonic mean of precision and recall. Taking the F1-score and Acc as the evaluation indices of classification accuracy, recognition is worst for irrelevant data among the 6 types of data, with an F1-score of only 73.81%. This may be because the text of irrelevant data is complex and diverse while the manually labeled training samples are limited, so feature extraction is incomplete and recognition accuracy suffers. Recognition is best for data with sharp seismaesthesia, with an F1-score of 88.57%, because the language in such posts is intense and the features are distinct and easy to distinguish. The recognition accuracy of the other categories differs little and is basically stable at about 80%. The overall accuracy of the model is 84.56%, which is high enough to support the seismic intensity assessment and analysis (Table 4).
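For reference, the per-class precision, recall, and F1-score and the overall accuracy can be computed as in the following sketch. The label lists are invented toy values, not the paper's validation data.

```python
def per_class_scores(y_true, y_pred, label):
    """Precision, recall and F1-score for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Share of samples whose predicted class equals the true class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
precision, recall, f1 = per_class_scores(y_true, y_pred, label=1)
acc = accuracy(y_true, y_pred)
```

For class 1 in this toy case, two of the three predictions are correct and both true samples are found, so precision is 2/3 and recall is 1.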

3.3. Optimization Method for Seismic Seismaesthesia Assessment

The BERT-TextCNN model can only classify microblog disaster data based on semantic features. However, microblog data are subjective, and different users may describe the same earthquake differently, which produces abnormal classification results. In addition, because the distribution of microblog data is random, some sparsely populated affected areas may have insufficient data, which affects the accuracy of the earthquake intensity assessment. In response to these two problems, this article proposes two solutions.

3.3.1. Seismaesthesia Intensity Attenuation Model

When evaluating seismic intensity based on the seismaesthesia data of microblogs, we found that the greater the earthquake intensity, the more users post microblogs expressing high-intensity seismaesthesia. Such microblogs are more likely to appear close to the epicenter, and as the earthquake magnitude increases, the number of high-intensity seismaesthesia microblogs far from the epicenter gradually increases. In other words, the larger the proportion of microblogs with high-intensity seismaesthesia, the greater the area influenced by the earthquake. Following this pattern and borrowing the idea of the seismic intensity attenuation model, a seismaesthesia intensity attenuation model is proposed. The model uses the influence range of the seismic intensity and the proportion of microblogs with high-intensity seismaesthesia to infer the actual seismaesthesia intensity value of each microblog data point in the target earthquake case. The microblog data are thus constrained by prior knowledge of earthquake behavior and by the spatial and quantitative attributes of the data points, which reduces the influence of subjectivity on the seismaesthesia intensity assessment and maps social perception data to seismaesthesia intensity values. It should be emphasized that the seismaesthesia intensity value calculated by the model is a relative value: for the same data point, the value differs depending on which reference earthquake case is selected. The specific calculation formula is as follows:

Y_{i} = \frac{X_{i}}{D_{i}} [1 - (A - \frac{M}{N})]

(1)

In the above formula, Y_i is the seismaesthesia intensity value of the ith data point; X_i is the average seismaesthesia level of the ith data point and its two nearest data points; D_i is the ratio of the distance of the ith data point from the epicenter to the average epicentral distance of all points with a seismaesthesia level greater than 2; A is the ratio of the number of microblogs with a seismaesthesia level greater than 2 in the reference earthquake to its total number of microblogs (in this paper, the value for the Luxian earthquake is 227/949); M is the number of microblogs with a seismaesthesia level greater than 2 in the target case; and N is the total number of microblog data points in the target case.
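Formula (1) translates directly into code. The sketch below is a minimal illustration: the function and argument names are hypothetical, only the reference ratio 227/949 comes from the text, and the distances and levels in the example are invented.

```python
A_LUXIAN = 227 / 949  # reference ratio A reported for the Luxian earthquake


def distance_ratio(dist_i, high_level_dists):
    """D_i: distance of point i over the mean distance of points with level > 2."""
    mean_dist = sum(high_level_dists) / len(high_level_dists)
    return dist_i / mean_dist


def seismaesthesia_intensity(x_i, d_i, a_ref, m_target, n_target):
    """Formula (1): Y_i = (X_i / D_i) * [1 - (A - M / N)]."""
    if d_i <= 0 or n_target <= 0:
        raise ValueError("d_i and n_target must be positive")
    return (x_i / d_i) * (1.0 - (a_ref - m_target / n_target))


# Invented example: a point 30 km from the epicenter, while the
# high-level points lie 10, 20 and 30 km away (mean 20 km).
d = distance_ratio(30.0, [10.0, 20.0, 30.0])
# If the target case has the same high-level share as the reference,
# the bracket equals 1 and Y_i reduces to X_i / D_i.
y_same = seismaesthesia_intensity(3.0, d, A_LUXIAN, 227, 949)
# A larger high-level share in the target case raises Y_i.
y_more = seismaesthesia_intensity(3.0, d, A_LUXIAN, 300, 1000)
```

The second call shows the intended behavior of the bracket term: when the target earthquake has proportionally more high-level posts than the reference, every point's value is scaled up.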

3.3.2. Inverse Distance Interpolation Model Based on Fitted Ellipse

The inverse distance interpolation method based on fitted ellipse is mainly implemented in three steps. The first step is to determine the seismaesthesia intensity value of data points that we have. The second step is to determine the parameters of the fitted ellipse. In the third step, the relationship between the seismaesthesia intensity value and the distance from the epicenter is obtained as the interpolation function. The formulas are as follows:

μ = \frac{\sum_{i = 1}^{n} x_{i}}{n}; ν = \frac{\sum_{i = 1}^{n} y_{i}}{n}

(2)

\tan θ = \frac{(\sum_{i = 1}^{n} {x_{i}^{'}}^{2} - \sum_{i = 1}^{n} {y_{i}^{'}}^{2}) + \sqrt{{(\sum_{i = 1}^{n} {x_{i}^{'}}^{2} - \sum_{i = 1}^{n} {y_{i}^{'}}^{2})}^{2} + 4 {(\sum_{i = 1}^{n} x_{i}^{'} y_{i}^{'})}^{2}}}{2 \sum_{i = 1}^{n} x_{i}^{'} y_{i}^{'}}

(3)

α = \frac{(θ + β)}{2}

(4)

δ_{x} = \sqrt{\frac{\sum_{i = 1}^{n} {(x_{i}^{'} \cos θ - y_{i}^{'} \sin θ)}^{2}}{n}}

(5)

δ_{y} = \sqrt{\frac{\sum_{i = 1}^{n} {(x_{i}^{'} \sin θ + y_{i}^{'} \cos θ)}^{2}}{n}}

(6)

Y = F (Z)

(7)

In the above formulas, μ and ν are the center coordinates of the fitted ellipse, whose values are the mean values of the x and y coordinates of all points, and n is the total number of data points. x_{i}^{'} and y_{i}^{'} are the relative coordinates of each point from the center of the ellipse, θ is the elliptic rotation angle calculated based on the microblog point data, β is the direction angle of the seismic zone distribution, and α is the final rotation angle of the fitted ellipse. δ_x and δ_y are the major and minor axes of the ellipse, respectively. Y is the interpolation result of the intensity value, F is the interpolation function, and Z is the distance from the epicenter.
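The ellipse parameters of Formulas (2) to (6) can be computed as in the sketch below. It assumes the standard deviational ellipse form of the orientation formula, with all sums taken over the relative coordinates and the cross term (Σ x′y′)²; the zero-cross-term fallback, the function name, and the sample points are additions for illustration. The interpolation function F of Formula (7) is not specified in the text and is therefore not sketched.

```python
import math


def fit_ellipse(points, beta):
    """Fitted-ellipse parameters from planar point coordinates.

    points: list of (x, y) pairs; beta: direction angle of the seismic
    zone in radians. Returns a dict with center, angles and axis lengths.
    """
    n = len(points)
    mu = sum(x for x, _ in points) / n  # Formula (2)
    nu = sum(y for _, y in points) / n
    dev = [(x - mu, y - nu) for x, y in points]  # relative coordinates
    sxx = sum(dx * dx for dx, _ in dev)
    syy = sum(dy * dy for _, dy in dev)
    sxy = sum(dx * dy for dx, dy in dev)
    if sxy == 0:
        # Assumption: Formula (3) is undefined here, so no rotation is used.
        theta = 0.0
    else:
        diff = sxx - syy
        theta = math.atan((diff + math.sqrt(diff ** 2 + 4 * sxy ** 2)) / (2 * sxy))
    alpha = (theta + beta) / 2  # Formula (4)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    delta_x = math.sqrt(sum((dx * cos_t - dy * sin_t) ** 2 for dx, dy in dev) / n)
    delta_y = math.sqrt(sum((dx * sin_t + dy * cos_t) ** 2 for dx, dy in dev) / n)
    return {"center": (mu, nu), "theta": theta, "alpha": alpha,
            "delta_x": delta_x, "delta_y": delta_y}


# Invented points lying on the line y = x.
params = fit_ellipse([(0, 0), (1, 1), (2, 2), (3, 3)], beta=math.pi / 4)
```

For these collinear points the rotation angle comes out as 45 degrees and one axis length collapses to zero. Which of δ_x and δ_y is larger depends on the data, and the sketch returns both without relabelling them.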

4. Cases Analysis and Accuracy Verification

Taking the magnitude 6.0 earthquake in Luxian County, Luzhou City, Sichuan Province on 16 September 2021 as an example, the BERT-TextCNN model is used to classify the seismaesthesia intensity of the microblog data, and the results are optimized with the optimization method for seismaesthesia assessment. The officially released seismic intensity map is then used as a reference to derive the seismic intensity classification index, so that seismic intensity can be evaluated from seismaesthesia data. In addition, three further cases (the magnitude 6.8 earthquake in Luding County, Garze Tibetan Autonomous Prefecture, Sichuan Province on 5 September 2022; the magnitude 6.1 earthquake in Lushan County, Ya’an City, Sichuan Province on 1 June 2022; and the magnitude 6.0 earthquake in Changning County, Yibin City, Sichuan Province on 17 June 2019) are presented to demonstrate the reliability and applicability of the model. For these cases, the multi-model coupling method is used for seismic intensity assessment according to the established seismic intensity classification index, and the accuracy of the assessment results is verified against the officially released seismic intensity maps.

4.1. Seismaesthesia Intensity Assessment and Optimization

After irrelevant and non-seismaesthesia microblog data are excluded, the seismaesthesia intensity of the Luxian County earthquake data is classified with the BERT-TextCNN model. The classification result is visualized and aggregated at the county scale: the mean of the seismaesthesia levels of all data points in a county is taken as that county's seismaesthesia intensity. A mean of 0 indicates a non-seismaesthesia area, 0–1 a slight seismaesthesia area, 1–2 an obvious seismaesthesia area, 2–3 a strong seismaesthesia area, and 3–4 a sharp seismaesthesia area (Figure 4).
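The county-level statistics described here amount to a group-by mean followed by binning. The sketch below illustrates that step under stated assumptions: the function names are hypothetical, each record is assumed to be a (county, level) pair with levels 0 to 4, and a mean that falls exactly on a class boundary is assigned to the lower class, which the text does not specify.

```python
from collections import defaultdict

def county_mean_levels(records):
    """Average the seismaesthesia levels of all data points in each county.

    records: iterable of (county_name, level) pairs, level in 0..4
    """
    levels = defaultdict(list)
    for county, level in records:
        levels[county].append(level)
    return {county: sum(vals) / len(vals) for county, vals in levels.items()}

def area_class(mean_level):
    """Map a county mean to its seismaesthesia area class.

    Assumption: boundary values (1, 2, 3) belong to the lower class.
    """
    if mean_level <= 0:
        return "non-seismaesthesia"
    if mean_level <= 1:
        return "slight"
    if mean_level <= 2:
        return "obvious"
    if mean_level <= 3:
        return "strong"
    return "sharp"

# Made-up records for two hypothetical counties, for illustration only.
sample = [("County A", 3), ("County A", 2), ("County B", 0)]
classes = {c: area_class(m) for c, m in county_mean_levels(sample).items()}
```

With only one or two posts in a county, a single outlying level dominates the mean, which is the sensitivity to sparse data discussed in the following paragraph.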

According to the distribution of seismaesthesia intensity areas, Luxian County, the earthquake center, is correctly evaluated as a strong seismaesthesia area, and the strong seismaesthesia areas are distributed mainly around it. The experimental result is broadly consistent with reality, indicating that microblog data correlate closely with the earthquake disaster situation and can serve as supplementary data for its assessment. However, the result also contains obvious errors. In each of the four brown boxed areas in Figure 4a, the seismaesthesia intensity level varies widely across a small group of data points, which does not match the expected distribution pattern of seismaesthesia intensity and is most likely caused by the subjectivity of microblog data. Counties such as Shizhong and Dongpo in Figure 4a are far from the earthquake center in Luxian County and are unlikely to be strong seismaesthesia areas, whereas counties such as Weiyuan and Dazu are close to the earthquake center, so it is implausible that residents there did not feel the earthquake. The main cause of this error is probably the uneven distribution of microblog data points: these counties have few or no data points, so the subjective judgment of a few individuals dominates the county's final seismaesthesia intensity and the result deviates badly.

In Figure 4b, the differences in seismaesthesia intensity values among the three regions are greatly reduced once the intensity values are calculated with the seismaesthesia intensity attenuation model. The result is much more consistent with reality, indicating that the model effectively reduces the influence of the subjectivity of microblog data. The direction of the seismic zone distribution is calculated with the directional distribution function from spatial statistics, and the standard deviation ellipse is obtained from the microblog data points. The mean of the standard deviation ellipse direction and the seismic zone distribution direction is taken as the direction of the earthquake impact area, which determines the general spatial pattern of the earthquake impact (Figure 5a). The interpolation function is determined from the seismaesthesia intensity values of known microblog points and their distances from the earthquake center (Figure 5b). The inverse distance interpolation algorithm based on the fitted ellipse is then used to fill in areas with insufficient data (Figure 5c), which addresses the uneven distribution of microblog data points and optimizes the assessment result.
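The two interpolation ingredients described above, a function fitted to value against distance and a distance measured in the frame of the fitted ellipse, can be illustrated as follows. This sketch rests on several assumptions that are not taken from the paper: the functional form Y = a + b·ln(1 + Z) is chosen only for illustration, the distance is stretched along the minor axis by a hypothetical `axis_ratio` (major over minor axis length), and all function names are invented. It evaluates the fitted function at an elliptical distance and is not a classical inverse-distance-weighted average.

```python
import math

def fit_log_attenuation(distances, values):
    """Least-squares fit of Y = a + b * ln(1 + Z) (assumed functional form)."""
    xs = [math.log1p(z) for z in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct distances")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def elliptical_distance(point, epicenter, alpha, axis_ratio):
    """Distance from the epicenter measured in the rotated ellipse frame.

    alpha:      rotation angle of the fitted ellipse (radians)
    axis_ratio: major axis length divided by minor axis length (assumption)
    """
    dx = point[0] - epicenter[0]
    dy = point[1] - epicenter[1]
    u = dx * math.cos(alpha) + dy * math.sin(alpha)    # along the major axis
    v = -dx * math.sin(alpha) + dy * math.cos(alpha)   # along the minor axis
    # Stretching v makes values decay faster across the ellipse than along it.
    return math.hypot(u, axis_ratio * v)

def interpolate_value(point, epicenter, alpha, axis_ratio, a, b):
    """Evaluate the fitted function at the elliptical distance of a point."""
    z = elliptical_distance(point, epicenter, alpha, axis_ratio)
    return a + b * math.log1p(z)
```

Under these assumptions, a point on the minor axis is treated as farther from the epicenter than a point at the same Euclidean distance on the major axis, so interpolated values fall off along elliptical rather than circular contours.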

4.2. Determination of Seismic Intensity Grading Threshold

There is a strong correlation between seismaesthesia and seismic intensity level. In the seismaesthesia intensity assessment of the Luxian County earthquake, the officially released seismic intensity ranges are overlaid on the seismaesthesia assessment result to determine the threshold values for seismic intensity grading. The analysis shows that the seismic intensity grading thresholds, like the seismaesthesia intensity itself, are relative values. Different reference earthquakes yield different grading thresholds, and the thresholds are positively correlated with the magnitude of the reference earthquake. Because of the limited range of the official seismic intensity map, only grading thresholds for intensities Ⅴ–Ⅷ can be determined at present.

A seismaesthesia intensity value above 50 corresponds to a seismic intensity of Ⅷ or above. A value between 23 and 50 corresponds to intensity Ⅶ, a value between 10 and 23 to intensity Ⅵ, and a value below 10 to intensity Ⅴ or below (Figure 6).
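The grading rule reduces to a threshold lookup. The following sketch encodes the thresholds 50, 23, and 10 stated above, using the intensity labels of Table 5; the function name is hypothetical, and assigning values that fall exactly on a threshold to the lower class is an assumption, since the text does not specify the boundary cases. As noted above, the thresholds are relative to the reference earthquake and would differ for another reference event.

```python
def intensity_from_seismaesthesia(value):
    """Map a seismaesthesia intensity value to a seismic intensity class.

    Thresholds (50, 23, 10) come from the Luxian County reference case.
    Assumption: a value exactly on a threshold falls in the lower class.
    """
    if value > 50:
        return "VIII or above"
    if value > 23:
        return "VII"
    if value > 10:
        return "VI"
    return "V or below"

# Illustrative values only, not measurements from the study.
examples = {v: intensity_from_seismaesthesia(v) for v in (60, 30, 15, 5)}
```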

4.3. Seismic Intensity Assessment Results and Accuracy Verification

4.3.1. Seismic Intensity Result Based on the BERT-TextCNN Model

The seismaesthesia intensity of microblog data from the three earthquakes in Lushan, Luding, and Changning counties of Sichuan Province is classified with the BERT-TextCNN model to obtain the seismaesthesia level of each microblog data point. The seismaesthesia intensity attenuation model is then used to reduce the influence of the subjectivity of microblog data. The spatial distribution of intensity values and data points is used to delineate the earthquake-affected areas, and the inverse distance spatial interpolation method based on the fitted ellipse fills in affected areas where microblog data are missing. The seismic intensity of the affected areas is then evaluated with the seismic intensity classification thresholds (Figure 7).

The experiment shows that microblog data processed by the BERT-TextCNN model, the seismaesthesia intensity attenuation model, and spatial interpolation play a leading role in seismic intensity evaluation. The results of the evaluation based on multi-model coupling are approximately the same as the actual earthquake disaster situation, and the method can be applied to earthquakes of different intensity levels in different regions, largely realizing the mapping from social perception data to seismic intensity values. Figure 7 and Figure 8 show that the assessment accuracy for Lushan County is noticeably higher than for Changning and Luding counties, which may be due to the more uniform distribution and higher quality of microblog data in Lushan County. This indicates that the multi-model coupling method can reduce, but not completely eliminate, the influence of the data on the assessment results. Further work with more social media data is needed to improve data quality and the assessment accuracy of the BERT-TextCNN model.

4.3.2. Accuracy Verification of Multi-Model Coupling

The accuracy rate, recall rate, and F1-score are used as evaluation metrics, with the officially released seismic intensity map as the reference. For a given intensity, the accuracy rate is the ratio of the correctly assessed area to the total area assessed as that intensity, and the recall rate is the ratio of the correctly assessed area to the true total area of that intensity. The F1-score is the harmonic mean of the accuracy rate and the recall rate (Table 5), and the overall accuracy rate is the mean of the F1-scores.
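The area-based metrics can be sketched on a raster in which every cell covers the same area, so that counting cells stands in for measuring area. This is an illustrative assumption rather than the authors' procedure, and the function name is hypothetical. The two ratios are named neutrally after their denominators; the F1-score is their harmonic mean and is unchanged if the two are swapped.

```python
def area_metrics(assessed, reference, intensity):
    """Area-based agreement for one intensity class.

    assessed, reference: equal-length sequences of intensity labels,
    one per equal-area cell (assumption: area is counted in cells).
    Returns (share_of_assessed, share_of_reference, f1).
    """
    correct = sum(1 for a, r in zip(assessed, reference) if a == r == intensity)
    assessed_total = sum(1 for a in assessed if a == intensity)
    reference_total = sum(1 for r in reference if r == intensity)
    # Correct area over the area the model assigned to this intensity.
    share_of_assessed = correct / assessed_total if assessed_total else 0.0
    # Correct area over the area the official map assigns to this intensity.
    share_of_reference = correct / reference_total if reference_total else 0.0
    denom = share_of_assessed + share_of_reference
    f1 = 2 * share_of_assessed * share_of_reference / denom if denom else 0.0
    return share_of_assessed, share_of_reference, f1
```

For example, with assessed labels [6, 6, 7, 7, 8] and reference labels [6, 7, 7, 7, 8], intensity 7 is correct in two cells, assessed in two, and present in three on the reference map, giving ratios of 1.0 and 2/3 and an F1-score of 0.8.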

Among these cases, the assessment result for Lushan County is the best, with an overall accuracy rate of 80.88%, while the result for Changning County is the weakest, with an overall accuracy rate of 62.57% and large fluctuation across intensities. By intensity, the mean F1-score is highest for intensity Ⅷ at 75.4%, and the mean F1-scores for intensities Ⅵ and Ⅶ are 69.4% and 67.6%, so accuracy varies little between intensity levels. The overall accuracy rate of the model reaches 70.81%, and the model is applicable to seismic intensity assessment in different regions, giving it high accuracy and universality compared with traditional seismic intensity evaluation models.
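The overall figure can be reproduced from the comprehensive accuracy column of Table 5. The short check below simply averages the three per-earthquake values reported there; the variable names are illustrative.

```python
# Comprehensive accuracy (%) per earthquake, as reported in Table 5.
comprehensive = {"Luding": 68.98, "Lushan": 80.88, "Changning": 62.57}

# Overall accuracy: mean over the three verification earthquakes.
overall = sum(comprehensive.values()) / len(comprehensive)
print(round(overall, 2))  # 70.81
```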

5. Discussions

The research value of this paper is mainly reflected in the following three aspects: (1) The mapping relationship between social media data and earthquake intensity level is established. (2) The availability of social media data in seismic intensity assessment is proved. Although the microblog data have problems such as subjectivity and uneven distribution, they can reflect the severity of the earthquake disaster well and can be used as supplementary data for seismic intensity assessment. (3) A multi-model coupled seismic intensity assessment model is proposed, which can effectively reduce the influence of the problems existing in microblog data on seismic intensity assessment and improve the assessment accuracy and universality of the model.

5.1. The Mapping Relationship between Social Media Data and Earthquake Intensity Level

Social media platforms respond quickly to natural disasters such as earthquakes. Usually, within a few minutes of an earthquake, a large volume of related data appears on the microblog platform. Although these data are timely, wide in coverage, and plentiful, they are subjective to a certain extent, and their content is complex and variable, uneven in quality, and lacking in unified standards, which makes them difficult to classify. A scientific and unified classification system is needed before further research and analysis can be carried out. At present, most research on the classification and recognition of disaster data uses labelled data to train machine learning or deep learning models. For example, Lin et al. [27] divided microblog data into categories such as ‘hazards information’, ‘loss information’, ‘rescue information’, and ‘useless information’, and classified earthquake disaster data with a BERT model. This essentially treats earthquake disaster data assessment as a text classification problem in natural language processing. However, simple text classification is not precise enough to reflect differences in how users describe seismaesthesia intensity. For example, “This is the most terrible earthquake I have ever experienced” and “I felt the shaking of an earthquake” both describe a disaster, but the severity they reflect is very different, so a coarse classification system will affect the assessment results.

To reflect the different semantics of different microblog data, we set up a seismaesthesia classification system according to the mapping relationship between seismic intensity grade and earthquake magnitude, as well as the China Seismic Intensity Classification Table. Taking the attenuation law of seismic intensity into account, the mapping relationship between social media data and seismaesthesia intensity level is established. Disaster data with different descriptions are divided into five seismaesthesia levels, and on the basis of this classification system, a microblog earthquake disaster dataset for training natural language processing models is constructed. Although the dataset is small, it is general enough to support the training of various models. With the seismaesthesia intensity attenuation model, the seismaesthesia values of different microblog data can be calculated, which maps social media data to seismaesthesia values. Compared with the traditional classification method, this mapping is more continuous and specific, which helps improve the accuracy of seismic intensity assessment. This mapping relationship is not fixed, however: it is affected by objective conditions such as earthquake magnitude, the amount of microblog data, and the seismic zone distribution. Under different objective conditions, the same text can yield a different seismaesthesia value.

5.2. The Availability of Social Media Data in Seismic Intensity Assessment

In earthquake emergency rescue work, timely and effective rescue depends on the rapid acquisition, processing, and analysis of disaster information. In the past, earthquake workers were often unable to carry out emergency response and disaster assessment quickly because of a lack of data. For example, in the 2008 Wenchuan earthquake, the communication system was severely damaged, creating “information isolated islands” in many parts of the disaster area; disaster information could not be transmitted in time, which brought considerable difficulties to emergency command and rescue deployment [1]. With the development of science and technology, social media platforms, as broad information carriers, have become an indispensable part of people’s lives. Social media such as Facebook, Twitter, YouTube, and Sina Weibo have broken down traditional information barriers and connected the world in a more efficient and intelligent way. Hundreds of millions of people have formed a huge and complex online social network through social media, and a huge amount of data is generated every moment. These data are created by people and can in turn serve as a valuable resource for them.

At present, social media data are widely used in public opinion analysis, disaster assessment, urban planning, and other fields. Their timeliness, volume, breadth, and interactivity have brought new ideas to seismic intensity assessment and made them an important source of disaster information after sudden disaster events. In this experiment, after the 7.0 magnitude earthquake in Lushan in 2019, there were more than 1.2 million microblog posts about the earthquake on the mobile terminal of Sina Weibo within 72 h, and more than 1000 posts with coordinates within 60 min. The information posted on these platforms contains a wealth of earthquake information that can help emergency and rescue efforts, such as locations, feelings, descriptions of observed damage, and calls for help. This illustrates the importance of social media data for seismic intensity assessment. If appropriate methods and technical means can overcome the shortcomings of microblog data, and post-earthquake social media data are mined thoroughly and used in combination with actual needs, this will have theoretical and practical significance for rapid seismic intensity assessment.

5.3. The Proposed Multi-Model Coupled Seismic Intensity Assessment Model

Common methods for seismic intensity assessment include field investigation, the empirical model based on statistical relation, seismic intensity rapid reporting based on strong motion observation networks, the rapid assessment based on remote sensing means, etc. [7]. Compared with these methods, social media data disaster assessment based on deep learning models has the advantages of strong timeliness, low cost, and easy data acquisition. However, due to the shortcomings of social media data such as subjectivity, uneven distribution, and complex types, the method of using a single deep learning or machine learning model to classify text will cause many errors and affect the evaluation accuracy. Moreover, the universality of the model is not high, and it is difficult to adapt to the needs of disaster assessment in different regions.

The multi-model coupling assessment method uses the BERT-TextCNN model to classify the seismaesthesia of microblog disaster data. On the basis of this classification, the seismaesthesia intensity attenuation model takes the different attributes of the data points into account and calculates a seismaesthesia value for each point, which constrains the subjectivity of microblog data and optimizes the evaluation results. Finally, the fitted elliptic attenuation model is used to interpolate disaster areas with insufficient data from the seismaesthesia values of known points, which addresses the uneven distribution of microblog data. The results show that the accuracy of the assessment after multi-model coupling reaches 70.81%, a significant improvement over a single text classification model.

5.4. Limitations and Deficiencies in the Study

Although the seismic intensity assessment results based on the multi-model coupling method are close to the officially released seismic intensity map, the time series analysis of seismic intensity cannot be carried out due to the relatively scarce amount of check-in microblog data. In addition, the mining of social media data in this article is still not comprehensive. For some data without coordinates, the text itself already contains location attributes, and its specific geographic location can be obtained by means of geographical named entity recognition. At the same time, social media contains a large number of pictures, videos, and audio information related to the earthquake damage, which can also be used in the determination of earthquake intensity because they are more objective and reliable than the language description. Combining multiple data for seismic intensity assessment is what we need to study in the future.

6. Conclusions

In this paper, microblog earthquake disaster data are used to assess the earthquake intensity of the affected areas through the multi-model coupling of the BERT-TextCNN model, the seismaesthesia intensity attenuation model, and the ellipse-fitting inverse distance interpolation method, and the assessment accuracy is then verified against the officially released earthquake intensity map. The following three conclusions can be drawn from this study:

First, there is a strong spatial correlation between microblog disaster data and seismic intensity, so these data can respond quickly to sudden earthquakes and roughly delineate earthquake-stricken areas. The comprehensive text classification accuracy of the BERT-TextCNN model is 84.56%, and the F1-scores of the five seismaesthesia classes are all higher than 80%. This shows that the model can accurately distinguish the seismaesthesia intensity level of microblog disaster data.

Second, it is proved that the seismaesthesia intensity attenuation model and the inverse distance interpolation method based on fitting ellipse can effectively constrain the subjectivity of microblog data, solve the problem of uneven distribution of microblog data, and greatly improve the accuracy of seismic intensity assessment.

Third, the seismic intensity assessment model based on multi-model coupling can realize the mapping of social perception data to seismic intensity value. The overall accuracy of the model can reach 70.81%, which is more accurate and universal than traditional methods. Therefore, it can assist in the formulation of earthquake relief work plans, improve rescue efficiency, and reduce losses caused by earthquake disasters.

Author Contributions

Qingzhou Lv wrote the paper and conducted the relevant experiments; Wanzeng Liu revised the paper and gave guidance; Ran Li and Hui Yang verified the manuscript; Yuan Tao revised the paper’s format; Mengjiao Wang verified the paper and realized the visualization. All authors have read and agreed to the published version of the manuscript.

Funding

National Key Research and Development Program of China (No. 2022YFB3904205), The Project Supported by the Open Fund of Hubei Luojia Laboratory (No. 220100037), The Project Supported by the Third Comprehensive Scientific Investigation Project of Xinjiang (2022xjkk1006), the National Natural Science Foundation of China (grant number 41971335 and 51978144), and the Xinjiang Uygur Autonomous Region Key Research and Development Program (2022B01012-1).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Bo, T. Earthquake Disaster Data Mining and Application of Rapid Intensity Assessment Based on Social Media. Inst. Eng. Mech. China Earthq. Adm. 2018. Available online: https://kns.cnki.net/KCMS/detail/detail.aspx?dbname=CDFDLAST2022&filename=1019239057.nh (accessed on 10 November 2022).
2. Partelow, S. Social capital and community disaster resilience: Post-earthquake tourism recovery on Gili Trawangan, Indonesia. Sustain. Sci. 2021, 16, 203–220.
3. Wang, D.; Ni, S.; Li, J. Research status of rapid assessment on seismic intensity. Prog. Geophys. 2013, 28, 1772–1784.
4. Ren, J.; Xu, Z.S.; Duan, Y.H. Identification of earthquake intensity attenuation relationship models in various provinces of China. Seismol. Geom. Obs. Res. 2020, 41, 75–82.
5. Peng, C.; Jiang, P.; Ma, Q.; Wu, P.; Su, J.; Zheng, Y.; Yang, J. Performance evaluation of an earthquake early warning system in the 2019–2020 M6.0 Changning, Sichuan, China, seismic sequence. Front. Earth Sci. 2021, 9, 699941.
6. Liu, S.; Nie, Y.; Gao, W. Application of delaunay triangulation in seismic intensity quick report; case study of Tianjin Strong Motion Network. J. Northwest Seismol. 2013, 34, 405–412.
7. Qiu, Y.; Jiang, C.; Si, Z. Summary of technical methods for optimizing layout of seismic monitoring network. Prog. Geophys. 2019, 35, 866–873.
8. Yan, Z.; De Sheng, C.; Zhong, R.H. The research of building earthquake damage object-oriented segmentation based on multi feature combination with remote sensing image. Procedia Comput. Sci. 2019, 154, 817–823.
9. Wang, S.; Dou, A.; Ding, L.; Yuan, X. Low resolution remote sensing image processing and productions development for earthquake disaster monitoring application. IOP Conf. Ser. Earth Environ. Sci. 2020, 569, 12007.
10. Quitoriano, V.; Wald, D.J. USGS “Did you feel it?”—Science and lessons from 20 years of citizen science-based macroseismology. Front. Earth Sci. 2020, 8, 120–139.
11. Avvenuti, M.; Cresci, S.; La Polla, M.N.; Meletti, C.; Tesconi, M. Nowcasting of earthquake consequences using big social data. IEEE Internet Comput. 2017, 21, 37–45.
12. Zhai, W. A multi-level analytic framework for disaster situational awareness using Twitter data. Comput. Urban Sci. 2022, 2, 23.
13. Bo, T.; Wang, Y.T.; Li, M.Z. Research on seismic information release based on wechat public platform. J. Inst. Disaster Prev. 2016, 18, 62–70.
14. Fan, K.H.; Zhou, Z.H. Analysis on the rapid and accurate assessment of Jiuzhaigou MS 7.0 earthquake. S. China J. Seismol. 2021, 41, 36–39.
15. Sakaki, T.; Okazaki, M.; Matsuo, Y. Tweet analysis for real-time event detection and earthquake reporting system development. IEEE Trans. Knowl. Data Eng. 2012, 25, 919–931.
16. Xu, J.H.; Chu, J.X.; Nie, G.Z. Earthquake disaster information extraction based on location microblog. J. Nat. Disasters 2015, 24, 12–18.
17. Evensen, D.; Varley, A.; Whitmarsh, L.; Devine-Wright, P.; Dickie, J.; Bartie, P.; Napier, H.; Mosca, I.; Foad, C.; Ryder, S. Effect of linguistic framing and information provision on attitudes towards induced seismicity and seismicity regulation. Sci. Rep. 2022, 12, 11239.
18. Yousefzadeh, M.; Hosseini, S.A.; Farnaghi, M. Spatiotemporally explicit earthquake prediction using deep neural network. Soil Dyn. Earthq. Eng. 2021, 144, 106663.
19. Kryvasheyeu, Y.; Chen, H.; Obradovich, N.; Moro, E.; Van Hentenryck, P.; Fowler, J.; Cebrian, M. Rapid assessment of disaster damage using social media activity. Sci. Adv. 2016, 2, e1500779.
20. Yang, T.; Xie, J.; Li, G.; Mou, N.; Li, Z.; Tian, C.; Zhao, J. Social media big data mining and spatio-temporal analysis on public emotions for disaster mitigation. ISPRS Int. J. Geo-Inf. 2019, 8, 29.
21. Ragini, J.R.; Anand, P.R.; Bhaskar, V. Big data analytics for disaster response and recovery through sentiment analysis. Int. J. Inf. Manag. 2018, 42, 13–24.
22. Mendoza, M.; Poblete, B.; Valderrama, I. Nowcasting earthquake damages with Twitter. EPJ Data Sci. 2019, 8, 3.
23. Ruan, T.; Kong, Q.; McBride, S.K.; Sethjiwala, A.; Lv, Q. Cross-platform analysis of public responses to the 2019 Ridgecrest earthquake sequence on twitter and reddit. Sci. Rep. 2022, 12, 1634.
24. Tao, B.; Xiaojun, L.I.; Su, C. Research of seismic intensity rapid assessment based on social media data. Earthquake. Eng. Dyn. 2018, 38, 208–217.
25. Wu, X.H.; Luan, C.J. A method for detecting sudden earthquake events based on micro-blog text classification. Microcomput. Its Appl. 2017, 36, 58–65.
26. Yao, K.; Yang, S.; Tang, J. Rapid assessment of seismic intensity based on Sina Weibo—A case study of the changning earthquake in Sichuan Province, China. Int. J. Disaster Risk Reduct. 2021, 58, 102–120.
27. Lin, S.; Liu, B.; Li, J. Social media information classification of earthquake disasters based on BERT transfer learning model. Geom. Infor. Sci. Wuhan Univer. 2022, 11, 15–30.

Figure 1. Research Framework.

Figure 2. The BERT-TextCNN Model Structure.

Figure 3. Model Training Process.

Figure 4. Preliminary optimization of seismaesthesia intensity assessment in Luxian County; (a) seismaesthesia intensity evaluation result based on the BERT-TextCNN model; (b) evaluation result based on the optimization of the seismaesthesia intensity attenuation model.

Figure 5. Inverse distance interpolation result based on fitting ellipse; (a) predict the approximate earthquake impact range based on microblog data and earthquake related data; (b) interpolation function; (c) the interpolation result.

Figure 6. Determination of Seismic Intensity Grading Threshold.

Figure 7. Luding County Seismic Intensity Evaluation Coupled with Multi-model; (a) basic data distribution map; (b) seismaesthesia value of microblog data points calculated by the seismaesthesia intensity attenuation model; (c) seismaesthesia interpolation result calculated by the fitting ellipse inverse distance interpolation method; (d) interpolation function; (e) official Luding earthquake intensity map; (f) evaluation result of earthquake intensity in Luding County.

Figure 8. Seismic intensity assessment based on multi-model coupling; (a) evaluation result of Changning County; (b) evaluation result of Lushan County.

Table 1. Seismaesthesia Data Crawling Results.

Publisher ID	Microblog Text	x-Coordinates	y-Coordinates	Date	Likes	Comments
6042801999	“I was almost scared to death by a magnitude 6 earthquake”	105.83	28.81	16 September 2021	1	0
1846273200	“It’s the first time that I was awakened by an earthquake, I couldn’t fall asleep” Chongqing	106.5	29.53	16 September 2021	1	0
6291434753	“I immediately picked up my phone to see how many people had been shaken awake”—Peng’an County, Nanchong	106.57	30.83	16 September 2021	0	0
3316876605	“A person was woken up at home by the earthquake and frightened”—Luzhou	105.44	28.89	16 September 2021	1	0
…	…	…	…	…	…	…
…	…	…	…	…	…	…
5745038828	“The earthquake was so strong that it was frightening”—Luzhou	105.44	28.89	16 September 2021	1	0

Table 2. Historical Seismic Data Table.

Earthquake Location	Epicenter X (°)	Epicenter Y (°)	Date	Earthquake Magnitude (Ms)	Depth (km)
Luding County, Sichuan province	102.08	29.59	5 September 2022	6.8	16
Lushan County, Sichuan province	102.94	30.37	1 June 2022	6.1	17
Luxian County, Sichuan province	105.34	29.2	16 September 2021	6.0	10
Changning County, Sichuan province	104.9	28.34	17 June 2019	6.0	16

Table 3. Correspondence Table Between Seismaesthesia and Microblog Data in China.

Earthquake Intensity	Assessment Index: Human Perception	Assessment Index: Object Response	Assessment Index: Building Damage	Earthquake Type	Microblog Data and Seismaesthesia Level
Ⅰ	Nobody felt it	-	-	Weak earthquake	Level 0: data with non-seismaesthesia
Ⅱ	A few people who were still indoors or in high-rise buildings felt it	-	-	Weak earthquake	Level 0: data with non-seismaesthesia
Ⅲ	A few people who were still indoors or in high-rise buildings felt it obviously	Suspended objects swayed slightly	-	Felt earthquake	Level 1: data with slight seismaesthesia
Ⅳ	Most people indoors and a few people outdoors felt it; a few people were awakened from sleep	Suspended objects swung significantly	-	Felt earthquake	Level 1: data with slight seismaesthesia
Ⅴ	Most people outdoors and indoors felt it, most people were awakened from sleep, and a few people fled outside	Water and suspended objects shook sharply	-	Moderate earthquake	Level 2: data with obvious seismaesthesia
Ⅵ	Most people stood unsteadily and fled outdoors	A few light pieces of furniture and objects moved	A few buildings were slightly damaged	Moderate earthquake	Level 2: data with obvious seismaesthesia
Ⅶ	Most people fled outdoors, and the occupants of moving cars felt it	Objects fell from shelves and a few pieces of furniture fell over	A few buildings were moderately damaged	Strong earthquake	Level 3: data with strong seismaesthesia
Ⅷ	Most people strongly felt it; it was difficult to walk	Most indoor objects were dumped and displaced	A few buildings were completely destroyed	Strong earthquake	Level 3: data with strong seismaesthesia
Ⅸ	People had difficulty in standing, and those who were moving fell down	-	Most buildings were badly damaged	Disastrous earthquake	Level 4: data with sharp seismaesthesia
Ⅹ	Cyclists fell and had a feeling of being thrown up	-	Most buildings and bridges were destroyed	Disastrous earthquake	Level 4: data with sharp seismaesthesia
ⅩⅠ	-	-	All buildings were destroyed	Disastrous earthquake	Level 4: data with sharp seismaesthesia
ⅩⅡ	-	-	All buildings were destroyed	Disastrous earthquake	Level 4: data with sharp seismaesthesia

Table 4. BERT-TextCNN Model Accuracy Table.

Text Types	Precision (%)	Recall (%)	F1-Score (%)
Irrelevant data	80.50	68.12	73.81
Data with non-seismaesthesia	86.49	79.81	83.02
Data with slight seismaesthesia	76.89	86.84	81.56
Data with obvious seismaesthesia	81.70	80.65	81.17
Data with strong seismaesthesia	80.69	80.69	80.69
Data with sharp seismaesthesia	92.08	85.32	88.57

Table 5. Accuracy Table of Seismic Intensity Assessment for Multi-Model Coupling.

Epicenter Location	Accuracy (%), Intensity Ⅵ	Accuracy (%), Intensity Ⅶ	Accuracy (%), Intensity Ⅷ	Recall Rate (%), Intensity Ⅵ	Recall Rate (%), Intensity Ⅶ	Recall Rate (%), Intensity Ⅷ	F1-Score (%), Intensity Ⅵ	F1-Score (%), Intensity Ⅶ	F1-Score (%), Intensity Ⅷ	Comprehensive Accuracy (%)
Luding County, Sichuan province	69.92	55.76	60.79	65.67	74.34	94.90	67.68	65.14	74.1	68.98
Lushan County, Sichuan province	73.60	83.82	84.47	75.36	93.24	75.65	74.47	88.28	79.88	80.88
Changning County, Sichuan province	62.92	47.41	74.18	69.75	51.59	70.21	66.16	49.41	72.14	62.57
mean value	68.8	62.3	73.1	70.3	73.1	80.3	69.4	67.6	75.4	70.81

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

