Time Series and Non-Time Series Models of Earthquake Prediction Based on AETA Data: 16-Week Real Case Study

Wang, Chenyang; Li, Chaorun; Yong, Shanshan; Wang, Xin’an; Yang, Chao

doi:10.3390/app12178536

by Chenyang Wang ¹, Chaorun Li ¹, Shanshan Yong ²,*, Xin’an Wang ¹,* and Chao Yang ¹

¹ The Key Laboratory of Integrated Microsystems, Peking University Shenzhen Graduate School, Shenzhen 518055, China

² Faculty of Engineering, Shenzhen MSU-BIT University, Shenzhen 518172, China

* Authors to whom correspondence should be addressed.

Appl. Sci. 2022, 12(17), 8536; https://doi.org/10.3390/app12178536

Submission received: 1 August 2022 / Revised: 22 August 2022 / Accepted: 23 August 2022 / Published: 26 August 2022

Abstract

The Key Laboratory of Integrated Microsystems (IMS) of Peking University Shenzhen Graduate School has deployed its self-developed acoustic and electromagnetics to artificial intelligence (AETA) system on a large scale and at a high density in China to monitor and collect the precursor anomaly signals that occur before earthquakes for seismic prediction. This paper constructs several classic time series and non-time series prediction models and compares them in order to find the most suitable earthquake-prediction model among them. The long short-term memory (LSTM) neural network, which achieves the best results in earthquake prediction based on AETA data extracted from the precursor anomaly signals, is selected for real-earthquake prediction over 16 consecutive weeks.

Keywords:

earthquake prediction; AETA; time series model; non-time series models; LSTM

1. Introduction

Earthquakes can cause huge damage to the natural environment and human society. If the three elements of an earthquake, i.e., time, epicenter, and magnitude, can be predicted accurately, this damage can be greatly reduced.

Acoustic and electromagnetics to artificial intelligence (AETA) is a multi-component earthquake-monitoring and prediction system developed by the Key Laboratory of Integrated Microsystems of Peking University Shenzhen Graduate School, which is used for earthquake prediction by collecting underground earthquake precursor anomaly signals, including electromagnetic (EM) signals and geoacoustic (GA) signals [1]. The importance of EM signals and GA signals to earthquake prediction has been revealed by numerous studies during the last forty years [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]. The AETA team has completed a large-scale deployment in China and has collected over 50 TB of data which laid a good foundation for the establishment of earthquake-prediction models.

Previously, many scholars have attempted to use algorithmic models for earthquake prediction in the field of seismology. Some of these researchers applied non-time series models. In non-time series models, the temporal data are converted to non-temporal data, i.e., the two-dimensional temporal feature matrix is reduced and compressed into a one-dimensional feature vector, and the extracted temporal features are then added to construct a one-dimensional sample set. In 2019, Zhang, Y. et al. proposed an earthquake disaster image information anomaly detection model based on scale-invariant feature transform (SIFT) features and support vector machine (SVM) classification [17]. In 2020, Jozinovic et al. proposed a method based on deep convolutional neural networks (CNN) to predict the degree of ground shaking when an earthquake occurs [18]. Also in 2020, Xiong, P. et al. used a light gradient boosting machine (LightGBM) in a five-fold cross-validation test on benchmarking datasets, which showed a strong capability in discriminating electromagnetic pre-earthquake perturbations [19]. In 2021, Wang, L. et al. developed an efficient seismic slope stability analysis approach by introducing LightGBM to analyze the seismic stability of a hypothetical embankment [20]. In 2022, Saad constructed a random forest (RF) model to detect the location of an upcoming earthquake [21]. The non-time series models developed in this paper include neural network (NN) and SVM models, as well as the tree-ensemble models gradient boosting decision tree (GBDT), RF, and LightGBM.

Other researchers use time series models for earthquake forecasting. Compared with non-time series models, time series models preserve the richness of the data and can learn the gradual changes of various features over time, which helps detect earthquake precursor anomaly signals. In 2017, Kanarachos et al. introduced a signal processing algorithm that combined wavelets, neural networks, and the Hilbert transform to predict earthquake activity [22]. In 2019, Zhou, Y. et al. combined convolutional and recurrent neural networks to pick phases from archived continuous waveforms [23]. Also in 2019, Titos et al. used the gated recurrent unit (GRU) model to exploit temporal and frequency information from continuous seismic data to detect and classify continuous sequences of volcanic seismic events at the Deception Island Volcano, Antarctica [24]. In 2020, Jena et al. developed a recurrent neural network (RNN) model to create an earthquake probability map for the eastern region of India, including the coastal state of Odisha [25]. In 2021, Xu, Y. et al. proposed a framework based on a long short-term memory (LSTM) neural network architecture for real-time regional seismic damage assessment [26]. In 2021, Yan, X. et al. utilized two LSTM models to simulate and forecast hydrological variations based on hydrological time series data from a monitoring site to identify possible precursors to the Lijiang earthquake [27]. In 2021, Huang, Y. et al. introduced a moving-steps strategy and established three recurrent neural network models (simple-RNN, LSTM, and GRU) for the prediction of the slope dynamic response [28]. In 2022, Xue, J. et al. applied deep learning to extract SESs and developed a novel deep learning network based on geoelectric field characteristics by incorporating the LSTM [29]. The main time series models established in this paper are the LSTM prediction model, the GRU prediction model, and the CNN+GRU stacking prediction model.

This paper analyzes and compares various non-time series models and time series models in order to find a more suitable and simpler model for earthquake prediction based on AETA data.

Section 2 introduces the AETA system developed by the IMS lab and the process of constructing the dataset. Section 3 introduces the construction of the non-time series and time series models. Section 4 shows the results of each prediction model. Section 5 compares and analyzes the prediction results of the non-time series and time series models; among them, the LSTM time series model achieves the best results in earthquake prediction based on AETA data. Finally, the research work of this paper is summarized in the conclusion section.

2. AETA System

2.1. AETA Devices and Data Acquisition

AETA, the multi-component earthquake monitoring and prediction system, was developed by the Key Laboratory of Integrated Microsystems of Peking University Shenzhen Graduate School for earthquake prediction. This system consists of two sensors, a terminal, a cloud platform, and a data analysis system. The two sensors, an electromagnetic sensor and a geoacoustic sensor, collect electromagnetic and geoacoustic signals, respectively. The sensors transmit their data through cables to the data-processing terminal, where the data undergo sampling, compression, and filtering; the data are then uploaded through the network to the cloud server for subsequent data analysis, the capture of earthquake precursor anomaly signals, and research on seismic-related activities [30].

The electromagnetic sensor mainly monitors the electromagnetic signal band covering the very low frequency (VLF) and ultra-low frequency (ULF) ranges, with a frequency range of 0.1 Hz to 10 kHz, an amplitude range of 0.1 to 1000 nT, a sensitivity of > 20 mV/nT @ 0.1 Hz to 10 kHz, a resolution of 18 bits, and a sampling rate of 500 Hz at low frequency and 30 kHz at full frequency [31]. The geoacoustic sensor monitors geoacoustic signals in a frequency range of 0.1 Hz to 50 kHz, with a resolution of 18 bits, a sensitivity of 3 LSB/Pa @ 0.1 Hz to 50 kHz, and a sampling rate of 500 Hz at low frequency and 150 kHz at full frequency [32].

The AETA sensor system is low-cost and convenient for large-scale, high-density deployment. Up to now, nearly 300 sets of AETA devices have been deployed in some of the most seismically active regions in China, such as Sichuan, Yunnan, and the surrounding provinces. The AETA system has accumulated over 50 TB of data over 5 years of observation. The rich observation data has laid a good foundation for the establishment and comparative analysis of earthquake-prediction models.

2.2. Data Set Construction

AETA raw data contains electromagnetic signals and geoacoustic signals. Most of the anomalies in AETA data occur within n days before an earthquake, where n is 27 for electromagnetic signals and 10 for geoacoustic signals; the choice of n is justified in Section 3.3. This paper takes the feature data of the preceding n days as the input sample and the following 7 days as the output sample. With the current time denoted $T_{0}$, the preceding n days $T_{-n}$, and the following 7 days $T_{+7}$, the earthquake-prediction tasks are defined as follows:

{\tilde{y}}_{1} = F_{1} (x_{T_{- n}}, x_{T_{- n + 1}}, \dots, x_{T_{- 1}}), {\tilde{y}}_{1} \in [0, 1]

(1)

{\tilde{y}}_{2} = F_{2} (x_{T_{- n}}, x_{T_{- n + 1}}, \dots, x_{T_{- 1}})

(2)

{\tilde{y}}_{2} = [{\tilde{y}}_{mag}, {\tilde{y}}_{lat}, {\tilde{y}}_{lon}], {\tilde{y}}_{mag} \in [3.5, 8], {\tilde{y}}_{lat} \in [22, 34], {\tilde{y}}_{lon} \in [98, 107],

(3)

where ${\tilde{y}}_{1}$ is the result of earthquake prediction and $F_{1}$ is the earthquake prediction model; ${\tilde{y}}_{2}$ is the prediction result of magnitude and epicenter location and $F_{2}$ is the corresponding prediction model; $x$ represents the input feature data; and ${\tilde{y}}_{mag}$, ${\tilde{y}}_{lat}$, and ${\tilde{y}}_{lon}$ represent the predicted magnitude, epicenter latitude, and epicenter longitude, respectively.

In terms of the extraction range of the dataset, a typical earthquake-prone region in China has been selected for earthquake prediction in this paper. To be specific, this region is the Sichuan–Yunnan region (22.00° N ~ 34.00° N, 98.00° E ~ 107.00° E), which has suffered a total of 206 earthquakes with magnitude larger than 3.5 from January 2017 to January 2021, according to the China Earthquake Network (http://www.ceic.ac.cn/history, accessed on 10 March 2021). Figure 1 shows the distribution of stations and earthquakes with magnitude above Ms3.5 in the Sichuan–Yunnan region over 5 years. Moreover, the epicenter distribution shows a typical regional clustering effect.

In order to improve the accuracy of prediction, the Sichuan–Yunnan region is divided into three areas according to the clustering of the earthquake distribution. The three areas are marked with boxes of different colors in Figure 1: area 1 in green, area 2 in brown, and area 3 in pink. The station ranges are given in Equations (4)–(7). Each area is extended by a margin Δ, which reduces the random deviation introduced by dividing the region and allows the station data in the overlapping zones to be used in training the models of both adjacent areas.

area 1 = (30 ° - Δ N ~ 34 ° + Δ N, 102 ° - Δ E ~ 106 ° + Δ E)

(4)

area 2 = (26 ° - Δ N ~ 30 ° + Δ N, 102 ° - Δ E ~ 106 ° + Δ E)

(5)

area 3 = (24 ° - Δ N ~ 28 ° + Δ N, 98 ° - Δ E ~ 102 ° + Δ E)

(6)

Δ = 1 °

(7)
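The area assignment in Equations (4)–(7) can be sketched as a simple bounding-box test. This is a minimal illustration, not the authors' code: the names `AREAS` and `areas_for_station` are hypothetical, and the example coordinates are made up. It shows how a station in an overlapping zone ends up contributing to two area models.

```python
# Illustrative sketch of Equations (4)-(7); names are hypothetical.
DELTA = 1.0  # margin in degrees, Equation (7)

# (lat_min, lat_max, lon_min, lon_max) before adding the margin
AREAS = {
    "area1": (30.0, 34.0, 102.0, 106.0),  # Equation (4)
    "area2": (26.0, 30.0, 102.0, 106.0),  # Equation (5)
    "area3": (24.0, 28.0, 98.0, 102.0),   # Equation (6)
}

def areas_for_station(lat, lon, delta=DELTA):
    """Return every area whose margin-extended box contains the station."""
    hits = []
    for name, (lat_min, lat_max, lon_min, lon_max) in AREAS.items():
        in_lat = lat_min - delta <= lat <= lat_max + delta
        in_lon = lon_min - delta <= lon <= lon_max + delta
        if in_lat and in_lon:
            hits.append(name)
    return hits

# A made-up station near the area 1 / area 2 boundary falls into both boxes.
overlap_station = areas_for_station(30.5, 103.0)
single_station = areas_for_station(25.0, 99.0)
```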

Earthquake-prediction models are built for each of the three areas with the sample matrices obtained by sliding windows [33]. The step of the sliding window is set to 1 day, all features are arranged chronologically by day, and the sequence sample set is generated by moving the window along the time dimension. In order to enhance the robustness of the data, this paper checks the data loss rate of each sliding-window matrix and discards the sample if the loss rate exceeds 0.3. The threshold of 0.3 was chosen experimentally: candidate thresholds from 0.1 to 0.5 were tried with a step size of 0.05, and with 0.3 the amount of remaining data was sufficient and the results on the validation set were good enough. Figure 2 shows the regional sample composition process.
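The windowing and filtering step can be sketched as follows. This is only an illustration under stated assumptions: the function name `sliding_window_samples` is hypothetical, missing values are assumed to be stored as NaN, and the loss rate is assumed to be the fraction of missing entries in a window, which the text does not define explicitly.

```python
import math

def sliding_window_samples(daily_features, window, max_loss_rate=0.3):
    """Build window-by-feature samples with a 1-day step.

    daily_features: list of per-day feature lists; missing values are NaN
    (an assumption). Windows whose fraction of missing entries exceeds
    max_loss_rate are discarded.
    """
    samples = []
    for start in range(len(daily_features) - window + 1):
        block = daily_features[start:start + window]
        total = sum(len(day) for day in block)
        missing = sum(1 for day in block for v in day if math.isnan(v))
        if total > 0 and missing / total > max_loss_rate:
            continue  # too much data lost in this window
        samples.append(block)
    return samples

nan = float("nan")
# Five made-up days with two features each.
days = [[1.0, 2.0], [nan, nan], [3.0, 4.0], [5.0, nan], [6.0, 7.0]]
windows = sliding_window_samples(days, window=2)
```

In this toy input, the two windows that include the fully missing day have a loss rate of 0.5 and are dropped, while the two windows with a loss rate of 0.25 are kept.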

This paper builds the AETA feature library with a total of 95 kinds of featured data whose validity has been verified [34]. The detailed information is shown in Appendix A.

In addition, because earthquakes occur infrequently, the imbalance of positive and negative samples must be considered when constructing the prediction model. Otherwise, it will greatly affect the training of the model, making the model more likely to predict no earthquakes. To address the shortage of positive samples, the synthetic minority oversampling technique (SMOTE) algorithm is applied when constructing the sample set for the non-time series prediction model.

To synthesize new samples, SMOTE iterates over the samples of the minority class, calculates the distance from each sample x to all the remaining minority samples, and takes the k samples closest to it. Then, according to the imbalance ratio of positive and negative samples, some of these k neighbors are randomly selected. Finally, a new sample is generated from each selected neighbor $\hat{x}$ and the original sample x [35]. The equation is as follows:

x_{new} = x + rand (0, 1) \times (\hat{x} - x) .

(8)
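Equation (8) amounts to picking a random point on the line segment between a minority sample and one of its nearest neighbors. The following minimal sketch illustrates this for a single synthetic sample; the function name `smote_sample`, the choice of Euclidean distance, and the toy data are assumptions made for illustration.

```python
import math
import random

def smote_sample(x, minority, k=5, rng=random):
    """Synthesize one sample from x following Equation (8)."""
    # k nearest minority neighbours of x (Euclidean distance assumed)
    neighbours = sorted(
        (s for s in minority if s is not x),
        key=lambda s: math.dist(x, s),
    )[:k]
    x_hat = rng.choice(neighbours)   # randomly selected neighbour
    gap = rng.random()               # rand(0, 1)
    return [xi + gap * (hi - xi) for xi, hi in zip(x, x_hat)]

rng = random.Random(0)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
x_new = smote_sample(minority[0], minority, k=2, rng=rng)
```

With k = 2, the distant sample [5.0, 5.0] is never chosen, so the synthetic point lies between the origin and one of the two nearby samples.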

However, the SMOTE algorithm cannot be used directly for the time series prediction model because its samples are two-dimensional matrices. Therefore, this paper designs a two-dimensional matrix SMOTE algorithm to generate new two-dimensional samples. The idea is to flatten the two-dimensional samples into one-dimensional samples in time order, process them with the SMOTE algorithm, and finally reshape the results into two-dimensional samples in time order. With the SMOTE algorithm, the ratio of positive to negative samples lies between 1:1 and 1:1.5, which allows for more effective and balanced model training. Figure 3 shows the two-dimensional matrix SMOTE algorithm.
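The flatten, interpolate, and reshape idea can be sketched as below. This is an illustrative reading of the description, not the authors' implementation: the helper names are hypothetical, and the neighbor is passed in directly instead of being found by a nearest-neighbor search.

```python
import random

def flatten(sample):
    """Days x features matrix -> one vector, in time order."""
    return [v for day in sample for v in day]

def unflatten(vector, n_features):
    """Inverse of flatten: rebuild the days x features matrix."""
    return [vector[i:i + n_features] for i in range(0, len(vector), n_features)]

def smote_2d(sample, neighbour, rng=random):
    """Interpolate between two 2-D samples via their flattened form."""
    n_features = len(sample[0])
    x, x_hat = flatten(sample), flatten(neighbour)
    gap = rng.random()  # rand(0, 1), shared by all entries as in Equation (8)
    x_new = [a + gap * (b - a) for a, b in zip(x, x_hat)]
    return unflatten(x_new, n_features)

# Two made-up positive samples: 3 days x 2 features each.
a = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
b = [[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]]
synthetic = smote_2d(a, b, rng=random.Random(1))
```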

Because the stations differ in installation time and in total number of samples, 85% of the samples of each station are taken as the training set and 15% as the validation set, both for the earthquake prediction model and for the magnitude and epicenter prediction model.

3. Model Construction

The AETA raw signals, i.e., the electromagnetic signals and the geoacoustic signals, are both time series signals. However, it has not been verified whether the feature data obtained by feature extraction retain temporal information [34]. Therefore, this paper constructs both time series and non-time series models for earthquake prediction.

3.1. Non-Time Series Prediction Model

AETA feature data contain rich information on earthquake precursor anomaly signals and are highly correlated with earthquakes. Therefore, robust non-time series models can be used to identify the hidden outliers in the feature data for earthquake prediction.

3.1.1. LightGBM

For processing non-time series data, GBDT is often used by researchers because it supports distributed training and has low memory consumption. So far, various extended models have been proposed, including categorical boosting (CatBoost) and extreme gradient boosting (XGBoost) [36], but their efficiency and scalability are not ideal when the amount of data is large. By using gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB), LightGBM is more suitable, in terms of decision making and scalability, for seismic prediction based on large amounts of data [37].

3.1.2. NN

An NN is a network consisting of multiple layers of fully connected neurons. Neural networks can be divided into three main parts, with the first layer as the input layer, the last layer as the output layer, and all the middle ones as hidden layers. To be more specific, the training data enter at the input layer and are propagated forward along the network, and the loss is calculated at the output layer. As far as the structure is concerned, a particular neuron is connected to all neurons in the preceding layer and all neurons in the following layer [38]. Each neuron is a perceptron consisting of a linear expression $z = \sum_{i} w_{i} x_{i} + b$ followed by an activation function $σ (z)$. The back propagation (BP) algorithm is regarded as the most basic algorithm for training a neural network; it updates the neuron weight matrix w and the bias vector b by the gradient descent (GD) algorithm.

The commonly used activation functions are sigmoid, tanh, ReLU, and ELU. Common loss functions include cross-entropy, mean absolute deviation, and mean square deviation. Among them, cross-entropy is often used in classification tasks, whereas mean absolute deviation and mean square deviation are commonly used in regression tasks.

3.1.3. Other Models

In addition to the two models mentioned above, other well-performing algorithms in machine learning tasks include the SVM and RF. SVM is mainly used for binary classification and seeks to maximize the classification margin, in both linear and nonlinear forms [39]. RF is a bagging ensemble of trees that decides by majority vote; it is not prone to overfitting and has strong generalization ability [40].

3.2. Time-Series Prediction Models

The time-series prediction models include the LSTM prediction model, the GRU prediction model, and the CNN+GRU stacking prediction model. These models are fully trained on the sample set to predict the three elements of earthquakes. This paper selects the most suitable model among them and the optimal parameter set for seismic prediction by grid search and five-fold cross-validation.

3.2.1. LSTM

LSTM is a recurrent network that takes the previous output and the current input as its input at each time step, thus allowing the model to take historical information into account when performing prediction tasks. Compared with the traditional time-series model RNN, LSTM incorporates the concept of “gates”, which is used to solve the problems of retaining long sequences of historical information and of long-range gradient transfer. To be specific, LSTM introduces cell states, input gates, forget gates, and output gates, which are used for storing information about historical data, inputting current and historical information, controlling the information that needs to be forgotten, and outputting predicted results, respectively. The schematic diagram of the LSTM network architecture is shown in Figure 4.

The input and output of the LSTM are controlled by gates at each moment. The parameter matrix is gradient-updated by the loss of the final loss function, which affects the weights of the gates. Through the feedback regulation of the three gates, important information can be selectively retained, redundant information forgotten, and then beneficial information passed on. The data are processed as shown in the following equations:

f_{t} = σ (W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})

(9)

i_{t} = σ (W_{i} \cdot [h_{t - 1}, x_{t}] + b_{i})

(10)

{\tilde{C}}_{t} = \tanh (W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})

(11)

h_{t} = σ (W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o}) * \tanh (C_{t})

(12)

C_{t} = f_{t} * C_{t - 1} + i_{t} * {\tilde{C}}_{t},

(13)

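where $x_{t}$ is the input, the $W$ matrices are the weight parameters, the $b$ vectors are the bias parameters, $σ$ is the sigmoid activation function, $*$ denotes element-wise multiplication, $C_{t}$ is the cell state, which carries the stored historical information, $h_{t - 1}$ is the output at the previous moment, and $h_{t}$ is the current output.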
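To make Equations (9)–(13) concrete, the following sketch evaluates a single LSTM step with plain Python lists. It is an illustration only: the function and parameter names (`lstm_step`, `W_f`, and so on) are chosen here to mirror the symbols in the equations, and the zero-valued weights are a toy setting that makes the result easy to check by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W . v + b for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    v = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(p["W_f"], v, p["b_f"])]        # Eq. (9)
    i = [sigmoid(z) for z in affine(p["W_i"], v, p["b_i"])]        # Eq. (10)
    c_tilde = [math.tanh(z) for z in affine(p["W_C"], v, p["b_C"])]  # Eq. (11)
    c_t = [fj * cj + ij * gj
           for fj, cj, ij, gj in zip(f, c_prev, i, c_tilde)]       # Eq. (13)
    o = [sigmoid(z) for z in affine(p["W_o"], v, p["b_o"])]
    h_t = [oj * math.tanh(cj) for oj, cj in zip(o, c_t)]           # Eq. (12)
    return h_t, c_t

# Toy setting: hidden size 1, input size 1, all weights and biases zero.
params = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})
h1, c1 = lstm_step([1.0], [0.0], [2.0], params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the new cell state is half of the old one (1.0) and the output is 0.5 · tanh(1.0).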

Figure 5 shows the training structure of the LSTM model in this paper, which mainly contains three LSTM layers with 32 neurons and one fully connected layer with the suitable number of neurons. The LSTM model extracts the anomaly information from input data for n consecutive days and uses it for time sequence analysis.

3.2.2. GRU

GRU has only two gates: the reset gate and the update gate [41]. Because it has one gate fewer than LSTM, its structure is correspondingly simpler. The update gate of GRU combines the roles of the forget gate and the input gate in LSTM, and its principle is similar to that of LSTM. Thus, the two models perform similarly on many tasks. The GRU network architecture schematic is shown in Figure 6.

The data are processed as shown in the following equations:

z_{t} = σ (W_{z} \cdot [h_{t - 1}, x_{t}])

(14)

r_{t} = σ (W_{r} \cdot [h_{t - 1}, x_{t}])

(15)

{\tilde{h}}_{t} = \tanh (W \cdot [r_{t} * h_{t - 1}, x_{t}])

(16)

h_{t} = (1 - z_{t}) * h_{t - 1} + z_{t} * {\tilde{h}}_{t},

(17)

where $x_{t}$ is the input, $W_{z}$, $W_{r}$, and $W$ are the weight parameters, $σ$ is the sigmoid activation function, $*$ denotes element-wise multiplication, $h_{t - 1}$ is the output at the previous moment, and $h_{t}$ is the current output.
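A single GRU step following Equations (14)–(17) can be sketched in the same way. The names `gru_step`, `W_z`, `W_r`, and `W` mirror the symbols in the equations and are otherwise hypothetical; the zero weights are a toy setting chosen so that the result can be verified by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Compute W . v for a list-of-rows matrix W."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gru_step(x_t, h_prev, p):
    v = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    z = [sigmoid(a) for a in matvec(p["W_z"], v)]             # Eq. (14)
    r = [sigmoid(a) for a in matvec(p["W_r"], v)]             # Eq. (15)
    gated = [rj * hj for rj, hj in zip(r, h_prev)] + x_t      # [r_t * h_{t-1}, x_t]
    h_tilde = [math.tanh(a) for a in matvec(p["W"], gated)]   # Eq. (16)
    return [(1 - zj) * hj + zj * gj
            for zj, hj, gj in zip(z, h_prev, h_tilde)]        # Eq. (17)

# Toy setting: hidden size 1, input size 1, all weights zero.
gru_params = {name: [[0.0, 0.0]] for name in ("W_z", "W_r", "W")}
h_next = gru_step([1.0], [2.0], gru_params)
```

Here the update gate is 0.5 and the candidate state is 0, so the new hidden state is half of the previous one.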

3.2.3. CNN+GRU

CNN uses convolutional kernels to effectively grasp local information and perceive the whole from the local by convolutional operation, which has an excellent performance in multidimensional image processing.

This paper proposes a CNN+GRU stacking method for earthquake prediction. First, one-dimensional convolution kernels perform the convolution operation along the time dimension to extract the anomaly information from the two-dimensional feature matrix for n consecutive days. One dimension of this matrix is the number of days n, and the other is the number of features.

After that, the data extracted by multiple convolutional kernels are stitched horizontally to form a new two-dimensional anomaly data matrix. Finally, it is input to the GRU temporal network for classification or regression tasks. The overall architecture of the model is shown in Figure 7.
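The convolution stage can be sketched as follows, under the assumption that each kernel spans all features and slides only along the day axis, as in a standard one-dimensional convolution layer. The function name `conv1d_time` and the two hand-written kernels are purely illustrative; in the actual model the kernel weights would be learned, and the output would be passed to the GRU.

```python
def conv1d_time(matrix, kernels):
    """Valid 1-D convolution (cross-correlation) along the day axis.

    matrix:  n_days x n_features
    kernels: list of k x n_features kernels
    returns: (n_days - k + 1) x n_kernels, one column per kernel
    """
    k = len(kernels[0])
    n_features = len(matrix[0])
    out = []
    for t in range(len(matrix) - k + 1):
        row = []
        for kernel in kernels:
            row.append(sum(kernel[d][f] * matrix[t + d][f]
                           for d in range(k) for f in range(n_features)))
        out.append(row)
    return out

# Made-up input: 4 days x 2 features.
matrix = [[1.0, 0.0], [2.0, 1.0], [4.0, 1.0], [7.0, 3.0]]
diff_kernel = [[-1.0, 0.0], [1.0, 0.0]]  # day-to-day change of feature 0
mean_kernel = [[0.0, 0.5], [0.0, 0.5]]   # 2-day mean of feature 1
features = conv1d_time(matrix, [diff_kernel, mean_kernel])
```

Each kernel yields one column, and placing the columns side by side gives the new two-dimensional matrix that serves as the input sequence of the GRU.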

3.3. Model Parameters

For the non-time series models, the two-dimensional feature matrix is compressed along the time dimension into a one-dimensional dense vector by the principal component analysis (PCA) algorithm, which reduces the dimension while retaining the main components of the feature data [42]. Because the compressed data no longer carry temporal information, time-series features extracted by the Tsfresh algorithm are added to the input of the non-time series models as compensation [43].

In addition, cross-validation is used to increase the robustness of the model [44]. The total sample is divided into k mutually exclusive subsets; each subset serves in turn as the validation set while the model is trained on the remaining subsets, which makes the results more convincing. Figure 8 shows the schematic diagram of the five-fold cross-validation.
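The partition into k mutually exclusive subsets can be sketched as below. The function name `k_fold_indices` is hypothetical, and contiguous, unshuffled folds are an assumption made here for simplicity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) for k mutually exclusive, contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples)
                 if i < start or i >= start + size]
        yield train, val
        start += size

folds = list(k_fold_indices(10, k=5))
```

Every sample appears in exactly one validation fold, and in each round the model would be trained on the other four folds.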

For the time series models, the two-dimensional time series matrix samples are constructed by the sliding window along the time dimension. The optimal model is preserved by monitoring the change in the loss function during cross-validation. In the cross-validation of time series data, the problem of data leakage needs to be considered, and it is addressed by the sliding windows. The quality of the data directly affects the prediction performance of the model, and the amount of information contained differs with the time scale of the sliding window. If the time scale is too short, the amount of information obtained is insufficient. On the other hand, if the time scale is too long, too much useless information is included. Therefore, choosing a suitable sliding window is vital to the final results of the model.

In earthquake prediction, there are usually many more no-earthquake samples than earthquake samples. As a result, a prediction model can obtain a high accuracy simply by classifying almost all samples as no-earthquake samples. Therefore, it is necessary to evaluate the model with the receiver operating characteristic (ROC) curve and the area under the ROC curve (AUC). The ROC curve has the true positive rate (TPR) as the vertical axis and the false positive rate (FPR) as the horizontal axis. The closer the AUC is to 1, the better the prediction model performs [45,46,47]. The TPR, FPR, and AUC are calculated as follows:

TPR = \frac{TP}{TP + FN}

(18)

FPR = \frac{FP}{TN + FP}

(19)

AUC = \frac{1}{2} \sum_{i = 1}^{m - 1} (x_{i + 1} - x_{i}) \cdot (y_{i} + y_{i + 1}),

(20)

where TP is true positive, FP is false positive, TN is true negative, FN is false negative, and $x_{i}$, $x_{i + 1}$ and $y_{i}$, $y_{i + 1}$ represent the FPR and TPR, respectively, under different thresholds of the ROC curve.
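Equations (18)–(20) can be checked on a small example. The sketch below builds the ROC points by sweeping the decision threshold over the predicted scores and then applies the trapezoidal sum of Equation (20). The function names and the four made-up samples are illustrative, and the code assumes that both classes are present.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points, one per distinct threshold, starting at (0, 0)."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for th in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= th and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= th and y == 0)
        points.append((fp / neg, tp / pos))  # Equations (19) and (18)
    return points

def auc_trapezoid(points):
    """Equation (20): trapezoidal area under the ROC curve."""
    return 0.5 * sum((x2 - x1) * (y1 + y2)
                     for (x1, y1), (x2, y2) in zip(points, points[1:]))

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auc_trapezoid(roc_points(y_true, scores))
```

For these four samples the ROC points are (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1), and (1, 1), which gives an AUC of 0.75.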

Taking the processing of electromagnetic signals as an example, eight control experiments were carried out on the time scale of the sliding window, at intervals of 5 days. For each time scale, this paper evaluates the model and counts the number of stations with an AUC index greater than 0.65. The model performance first increases and then decreases as the time scale of the sliding window becomes longer. It is best in the range of 24 to 30 days, so the sliding window size is set to 27 days for electromagnetic signals according to the results in Figure 9a. Similarly, the sliding window size is determined to be 10 days for the geoacoustic signals, as shown in Figure 9b.

3.4. Softmax-AUC Index Weighting Method

Different stations yield different prediction performance even when the same model is used. Thus, this paper proposes a multi-station softmax-AUC index weighting method, which integrates the information of the stations for seismic prediction. Because the AUC index on the validation set measures the prediction performance of each station, only stations with an AUC above 0.65 are selected. The prediction result of each station is 0 or 1, where 0 means that no earthquake is predicted and 1 means that an earthquake is predicted, and the weight of station i is obtained by normalizing $e^{{AUC}_{i}}$. If the risk value (risk_value) is greater than the threshold (threshold_value), which is set to 0.5, the region is predicted to have an earthquake; otherwise, it is not. The equations are defined as follows:

{station_risk_value}_{i} = \frac{e^{{AUC}_{i}}}{\sum_{j} e^{{AUC}_{j}}} * {pred}_{i}

(21)

risk_value = \sum_{i} {station_risk_value}_{i}

(22)

area_pred = \begin{cases} 1, & \text{if } risk_value > threshold_value \\ 0, & \text{if } risk_value \leq threshold_value \end{cases},

(23)

where ${station_risk_value}_{i}$ is the risk index of station i, ${pred}_{i}$ is the prediction result of station i, threshold_value is the threshold, which is set to 0.5, and area_pred is the final regional prediction result.
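Equations (21)–(23) can be sketched in a few lines. The function name `area_prediction` and the example values are made up for illustration, and returning 0 when no station passes the AUC filter is an assumption that the text does not address.

```python
import math

def area_prediction(aucs, preds, auc_min=0.65, threshold=0.5):
    """Softmax-AUC weighted vote over stations, Equations (21)-(23)."""
    # Keep only stations whose validation AUC is above auc_min.
    selected = [(a, p) for a, p in zip(aucs, preds) if a > auc_min]
    if not selected:
        return 0, 0.0  # assumption: no usable station means no alarm
    denom = sum(math.exp(a) for a, _ in selected)
    # Equations (21) and (22): sum of softmax weight times station prediction.
    risk = sum(math.exp(a) / denom * p for a, p in selected)
    # Equation (23)
    return (1 if risk > threshold else 0), risk

# Four made-up stations; the third is dropped because its AUC is below 0.65.
area_pred, risk_value = area_prediction([0.9, 0.7, 0.6, 0.8], [1, 0, 1, 1])
```

Because the weights are a softmax over the selected stations, they sum to 1, so risk_value always lies between 0 and 1 and can be compared directly with the 0.5 threshold.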

3.5. Model Evaluation Indicators

This paper focuses not only on the accuracy of earthquake prediction, but also on missed predictions and false predictions. Therefore, this paper uses the indicators PA, PP, RP, PN, and RN to evaluate the model. The indicators are explained separately below. The closer these indicators are to 1, the better the prediction performance of the model. With the threshold set to 0.5, the region is predicted to have an earthquake if the risk value exceeds the threshold. Based on this condition, the numbers of predicted earthquakes and predicted no-earthquakes can be counted, and the indicators can then be calculated from the prediction results.

Precision all (PA) is the overall accuracy: the number of correct predictions divided by the total number of predictions.

PA = \frac{TP + TN}{TP + FN + FP + TN}

(24)

Precision positive (PP) is the precision of earthquake predictions: the number of correctly predicted earthquakes divided by the number of predicted earthquakes.

PP = \frac{TP}{TP + FP}

(25)

Recall positive (RP) is the recall of earthquakes: the number of correctly predicted earthquakes divided by the total number of actual earthquakes.

RP = \frac{TP}{TP + FN}

(26)

Precision negative (PN) is the precision of no-earthquake predictions: the number of correct no-earthquake predictions divided by the total number of predicted no-earthquakes.

PN = \frac{TN}{FN + TN}

(27)

Recall negative (RN) is the recall of no-earthquakes: the number of correct no-earthquake predictions divided by the total number of actual no-earthquakes.

RN = \frac{TN}{FP + TN}

(28)

where TP is true positive, FP is false positive, TN is true negative, and FN is false negative.
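The five indicators in Equations (24)–(28) follow directly from the four confusion-matrix counts. In the sketch below, the function name `indicators` and the counts are hypothetical and serve only to show the arithmetic; zero denominators are not handled.

```python
def indicators(tp, fp, tn, fn):
    """Equations (24)-(28); assumes every denominator is non-zero."""
    return {
        "PA": (tp + tn) / (tp + fn + fp + tn),  # overall accuracy
        "PP": tp / (tp + fp),                   # precision of earthquakes
        "RP": tp / (tp + fn),                   # recall of earthquakes
        "PN": tn / (fn + tn),                   # precision of no-earthquakes
        "RN": tn / (fp + tn),                   # recall of no-earthquakes
    }

# Made-up counts: 20 predictions in total.
m = indicators(tp=6, fp=2, tn=10, fn=2)
```

For these counts, PA is 16/20 = 0.8, PP and RP are both 6/8 = 0.75, and PN and RN are both 10/12.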

The prediction of magnitude, epicenter latitude, and epicenter longitude is a three-output regression task. This paper uses two indicators, magnitude of absolute mean deviation (mag_mae) and distance average deviation (distance_average), to evaluate the model in terms of the absolute deviation of the magnitude prediction and the distance deviation of the epicenter prediction. The two indicators are explained below:

Magnitude of absolute mean deviation (mag_mae) represents the average of the absolute deviation between the predicted magnitude and the actual magnitude. It can reflect the accuracy of the magnitude prediction. The calculation method is defined as follows:

mag_mae = \frac{1}{m} \sum_{i = 1}^{m} | y_{i} - {\hat{y}}_{i} |,

(29)

where $y_{i}$ represents the actual magnitude, ${\hat{y}}_{i}$ represents the predicted magnitude, and m is the number of samples.

Distance average deviation (distance_average) represents the mean value of the difference between the actual and predicted positions of the epicenter. It can reflect the accuracy of the epicenter prediction. The calculation method is defined as follows:

distance_average = \frac{1}{m} \sum_{i = 1}^{m} geodesic ((y_{i,lat}, y_{i,lon}), ({\hat{y}}_{i,lat}, {\hat{y}}_{i,lon})),

(30)

where $y_{i,lat}$ and $y_{i,lon}$ represent the actual epicenter latitude and longitude, respectively, and ${\hat{y}}_{i,lat}$ and ${\hat{y}}_{i,lon}$ represent the predicted epicenter latitude and longitude, respectively.
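Equations (29) and (30) can be sketched as follows. Note one substitution: the paper's geodesic distance is replaced here by the haversine great-circle distance on a sphere of radius 6371 km, which is a simpler approximation and not necessarily the function the authors used. The function names and the example values are illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation

def haversine_km(p, q):
    """Great-circle distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def mag_mae(actual, predicted):
    """Equation (29): mean absolute deviation of the magnitude."""
    return sum(abs(y - yh) for y, yh in zip(actual, predicted)) / len(actual)

def distance_average(actual, predicted):
    """Equation (30), with haversine standing in for the geodesic distance."""
    return sum(haversine_km(p, q) for p, q in zip(actual, predicted)) / len(actual)

# Made-up values: two events with actual and predicted magnitude and epicenter.
mae = mag_mae([4.0, 5.0], [4.5, 4.0])
dist = distance_average([(30.0, 103.0), (26.0, 100.0)],
                        [(30.0, 103.0), (27.0, 100.0)])
```

In this example the magnitude errors are 0.5 and 1.0, so mag_mae is 0.75; the first epicenter is exact and the second is off by one degree of latitude, so distance_average is half the length of one degree of latitude.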

4. Results

This paper compares the models on four indicators: AUC, RP, distance_average, and mag_mae. To be specific, the AUC represents the overall performance of the model for earthquake prediction, including correct predictions, false predictions, and missed predictions. The RP measures the recall of earthquake prediction, and its expression is the same as that of the TPR. The distance_average and mag_mae evaluate the deviation of the epicenter and magnitude predictions, respectively. For a given indicator, a larger number of stations indicates a better model.

4.1. Prediction Results of Non-Time Series Models

The non-time series models were validated in the Sichuan–Yunnan region, and the results are shown in Table 1. The data in the table represent the number of corresponding stations.

The results of the five non-time series prediction models show that the LightGBM model and the NN model have better prediction results than the other models. The two indicators of the LightGBM model, AUC and distance_average, are better than those of the NN model, but the RP and mag_mae are worse than those of the NN model.

4.2. Prediction Results of the Time Series Models

The time series models were validated in the Sichuan–Yunnan region, and the results are shown in Table 2. Each entry in the table is the number of stations that satisfy the indicator threshold given in the column header.

Overall, the three time series prediction models all perform well. Two indicators of the LSTM model, AUC and distance_average, are better than those of the other models. In the next section, the paper compares the prediction results of the above eight models in detail.

5. Discussion

5.1. Comparison of Non-Time Series Models and Time Series Models

To determine the most suitable earthquake prediction model for the AETA data, this paper compares the performance of three time series models and five non-time series models on the validation set, as shown in Figure 10. The first five are non-time series prediction models, namely LightGBM, NN, SVM, GBDT, and RF, and the last three are time series prediction models, namely LSTM, GRU, and CNN+GRU.

The input of the time series prediction models is two-dimensional time series feature data, which retains as much of the information in the AETA raw signals as possible. In contrast, the input of the non-time series prediction models is compressed and no longer retains the time series information of the raw data. Therefore, it is difficult for these models to learn changes along the time dimension.

According to the bar chart in Figure 10, the three time series prediction models perform better overall than the five non-time series prediction models on the four indicators. Moreover, the LSTM model has the best performance among these prediction models. The LSTM prediction results for each station are shown in Appendix B. The LSTM prediction results for the magnitude and epicenter at each station are shown in Appendix C.

5.2. Real-Earthquake Prediction

LSTM, the best-performing of these prediction models, was chosen to predict earthquakes with a magnitude larger than 3.5 in the Sichuan–Yunnan region (22.00° N–34.00° N, 98.00° E–107.00° E) from April 2021 to July 2021 for 16 consecutive weeks. A prediction was made every Sunday and consisted of a yes/no (Y/N) prediction for the next 7 days in the target region. If the prediction was Y, the location and magnitude were given as well.

The whole prediction process is shown in Figure 11. To make a prediction for the next week, the features are first input into the models of all stations in each region to obtain the risk_value for each region. If the risk_value is less than the threshold_value for the region, no earthquake is expected and the prediction is finished. Otherwise, an earthquake is predicted, and the magnitude and the epicenter latitude and longitude are then predicted.
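The decision step for a single region can be sketched as follows. This is a minimal illustration, not the authors' implementation: the regional risk_value is taken as a given input (how it is aggregated from the station models is not reproduced here), and the two predictor callables and all numeric values are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class WeeklyPrediction:
    earthquake_expected: bool
    magnitude: Optional[float] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None


def decide(risk_value: float,
           threshold_value: float,
           predict_magnitude: Callable[[], float],
           predict_epicenter: Callable[[], Tuple[float, float]]) -> WeeklyPrediction:
    """Apply the regional threshold, then predict magnitude and epicenter if needed."""
    if risk_value < threshold_value:
        # Below the regional threshold: no earthquake expected, stop here.
        return WeeklyPrediction(earthquake_expected=False)
    # Otherwise an earthquake is predicted, so the regression models are queried.
    magnitude = predict_magnitude()
    latitude, longitude = predict_epicenter()
    return WeeklyPrediction(True, magnitude, latitude, longitude)


# Placeholder risk values, threshold, and predictors for illustration only.
quiet_week = decide(0.30, 0.50, lambda: 4.0, lambda: (28.0, 104.0))
alert_week = decide(0.70, 0.50, lambda: 4.0, lambda: (28.0, 104.0))
```

Passing the magnitude and epicenter predictors as callables means they are evaluated only when the threshold is reached, which mirrors the early exit described in the text.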

LSTM is used for real-earthquake prediction for 16 consecutive weeks based on the multi-station decision mechanism. Most of the large prediction deviations occurred in the first month, as it was difficult to set a reasonable regional risk threshold at the beginning. To reduce the prediction deviation, this paper made several historical projections with the model, adjusted the regional risk threshold, and optimized the model, which improved the prediction results. AETA also continuously collects new data, and the model is trained incrementally in batches every two weeks, with the newly collected data used for sample set construction. Every month, the model is trained in a full batch: the training and testing sets are regenerated from all the data and the whole model is retrained. Finally, we save the model that performs best on the validation set.

This paper focuses on the earthquakes with the largest magnitude within a week. Table 3 shows the results of the real-earthquake prediction for 16 consecutive weeks; the letter “N” means no earthquakes.

The number of correct predictions is 10, the number of false predictions is 2, and the number of missed predictions is 4. The real-earthquake prediction achieves an accuracy of 0.625, which is the number of correct predictions divided by the total number of weeks (10/16). Some prediction results have a small deviation, such as the predictions in the eighth, eleventh, twelfth, thirteenth and sixteenth weeks. In addition, the model placed in the top three in the second AETA earthquake prediction competition, which also supports the effectiveness of our model. The website of the competition is https://competition.aeta.io/ (accessed on 1 September 2021).
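These counts can be reproduced from the Y/N outcomes in Table 3. The sketch below assumes that a week counts as correct when the predicted Y/N outcome matches the actual one, regardless of magnitude or epicenter deviation; this assumption is consistent with the reported counts. The list encoding and the `classify` helper are introduced only for illustration.

```python
from collections import Counter
from typing import List, Tuple

# (actual earthquake?, predicted earthquake?) for weeks 1-16, transcribed from
# Table 3, where "N" means no earthquake.
weeks: List[Tuple[bool, bool]] = [
    (False, False),  # week 1
    (False, True),   # week 2
    (False, False),  # week 3
    (False, False),  # week 4
    (True, False),   # week 5
    (True, False),   # week 6
    (True, True),    # week 7
    (True, True),    # week 8
    (False, True),   # week 9
    (True, False),   # week 10
    (True, True),    # week 11
    (True, True),    # week 12
    (True, True),    # week 13
    (True, False),   # week 14
    (True, True),    # week 15
    (True, True),    # week 16
]


def classify(actual: bool, predicted: bool) -> str:
    """Label a week as a correct, false, or missed prediction."""
    if actual == predicted:
        return "correct"
    return "false" if predicted else "missed"


counts = Counter(classify(a, p) for a, p in weeks)
accuracy = counts["correct"] / len(weeks)
```

With this encoding, the tally is 10 correct, 2 false, and 4 missed predictions, giving an accuracy of 10/16 = 0.625.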

However, some actual earthquakes are still not predicted, and the predictions are sometimes inaccurate for earthquakes near Yunnan. We attribute this to the station layout: the stations in Sichuan province are installed densely and are concentrated, so they can capture anomaly signals in time, whereas the stations in Yunnan province are mostly scattered, which results in a lack of monitoring data and anomaly signals. Therefore, the number of stations installed near Yunnan, or the weight of Yunnan in the model, needs to be increased in the future.

6. Conclusions

AETA, a multi-component earthquake detection and prediction system, was developed independently by the Key Laboratory of Integrated Microsystems (IMS) of Peking University Shenzhen Graduate School. After five years of data acquisition, a large amount of data has been accumulated. This paper constructs several non-time series and time series models for earthquake prediction based on the feature data extracted from the AETA raw signals. After comparison and analysis, it is confirmed that the LSTM model achieves the best results for earthquake prediction. The LSTM model is then selected for real-earthquake prediction for 16 consecutive weeks. In addition, this paper proposes a multi-station softmax-AUC index weighting method to address the problem that the prediction performance of the same model varies across stations. This model placed in the top three of the second AETA earthquake prediction competition.

Author Contributions

Conceptualization, C.W. and S.Y.; methodology, C.W. and C.Y.; software, C.W. and C.L.; validation, C.W., S.Y. and X.W.; formal analysis, C.W.; investigation, C.W. and C.L.; resources, S.Y. and X.W.; data curation, C.W., C.L. and C.Y.; writing—original draft preparation, C.W.; writing—review and editing, S.Y. and X.W.; visualization, C.W. and C.L.; supervision, S.Y. and X.W.; project administration, S.Y. and X.W.; funding acquisition, S.Y. and X.W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by a fundamental research grant from the Shenzhen Science and Technology Program, grant number JCYJ20200109120404043, and by the Youth Innovation Talent Project of Guangdong Province Universities, grant number 2021KQNCX112.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available from the authors upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. AETA Feature Library.

Type	Feature	Meaning	Number of EM Features	Number of GA Features
Time domain features	abs_mean	Mean of absolute value	2	2
	var	Variance	2	1
	power	Power	2	1
	skew	Skewness	2	1
	kurt	Kurtosis	2	1
	abs_max	Maximum absolute value	2	1
	abs_top_x	Absolute maximum x% of position	4	2
	energy_sstd	Standard deviation of short-time energy	2	1
	energy_smax	Short-time maximum energy	2	1
	s_zero_rate	Short-time average over-zero rate	0	1
	s_zero_rate_max	Short-time maximum over-zero rate	0	1
Frequency domain features	power_rate_atob	Power from a~bHz in the frequency spectrum	11	11
	frequency_center	Center of gravity frequency	1	1
	mean_square_frequency	Mean square frequency	1	1
	variance_frequency	Frequency variance	1	1
	frequency_entropy	Entropy of the spectrum	1	1
Wavelet transforms	levelx_absmean	Mean value after the reconstruction of layer x	4	4
	levelx_energy	Energy after the reconstruction of layer x	4	4
	levelx_energy_svar	Variance of the energy value after the reconstruction of layer x	4	4
	levelx_energy_smax	Maximum value of energy after the reconstruction of layer x	4	4
Total			51	44

Appendix B

Six indicators (AUC, PA, PP, RP, PN, and RN) are used to evaluate the accuracy of the model as well as its missed and false predictions. The evaluation indicators of the models are described in detail in Section 3.

Table A2. Prediction results of LSTM on area1.

No.	Station	AUC	PA	PP	RP	PN	RN
1	DJY	0.68	0.68	0.64	0.81	0.74	0.55
2	SMSD	0.68	0.68	0.60	1.00	1.00	0.37
3	QC	0.69	0.69	0.69	0.71	0.69	0.67
4	WC	0.67	0.67	0.73	0.55	0.62	0.79
5	BX	0.78	0.78	0.83	0.71	0.74	0.85
6	GZYJ	0.80	0.80	0.87	0.71	0.75	0.89
7	EB	0.66	0.66	0.60	1.00	1.00	0.33
8	GYCT	0.68	0.69	0.66	0.84	0.75	0.53
9	JC	0.82	0.82	0.76	0.96	0.94	0.69
10	DF	0.76	0.76	0.69	1.00	1.00	0.52
11	QCYD	0.76	0.77	0.96	0.54	0.70	0.98
12	QCPS	0.78	0.78	0.87	0.67	0.72	0.90
13	CZ	0.81	0.81	0.73	1.00	1.00	0.62
14	PWHY	0.78	0.77	0.94	0.60	0.68	0.95
15	SPMJ	0.68	0.68	0.72	0.59	0.66	0.77
16	PWBM	0.74	0.75	0.68	1.00	1.00	0.49
17	JCAN	0.66	0.66	0.96	0.33	0.60	0.99
18	YAYJ	0.69	0.69	0.70	0.67	0.68	0.71
19	HS	0.77	0.77	0.69	1.00	1.00	0.55
20	MXDX	0.69	0.70	0.89	0.44	0.63	0.95
21	JZG4	0.71	0.71	0.72	0.67	0.70	0.75
22	JZG5	0.73	0.73	0.71	0.80	0.75	0.65
23	JZG2	0.70	0.70	0.65	0.82	0.77	0.57
24	PWNB	0.69	0.68	0.65	0.75	0.72	0.62
25	JZG1	0.68	0.68	0.68	0.71	0.68	0.66
26	WXZZ	0.80	0.80	0.72	1.00	1.00	0.60
27	HYA	0.67	0.65	0.58	1.00	1.00	0.34
28	DL	0.80	0.80	0.79	0.76	0.81	0.83
29	BK	0.68	0.68	0.70	0.64	0.66	0.72
30	HBY	0.72	0.72	0.75	0.65	0.70	0.79
31	REG	0.88	0.88	0.83	0.96	0.95	0.79
32	EMHW	0.95	0.95	0.91	1.00	1.00	0.90
33	JYZJJ	0.93	0.93	0.87	1.00	1.00	0.86
34	LSSW	0.69	0.70	0.82	0.47	0.65	0.91
35	RXCS	0.96	0.96	0.93	1.00	1.00	0.93
36	ZGDA	0.77	0.77	0.68	1.00	1.00	0.55
37	MYBC	0.86	0.86	0.77	1.00	1.00	0.72

Table A3. Prediction results of LSTM on area2.

No.	Station	AUC	PA	PP	RP	PN	RN
1	MB	0.65	0.65	0.59	0.73	0.72	0.58
2	LB	0.76	0.76	0.77	0.79	0.75	0.72
3	ML	0.73	0.75	0.81	0.57	0.72	0.89
4	EMS	0.70	0.68	0.59	0.89	0.85	0.50
5	XJX	0.66	0.66	0.60	0.68	0.72	0.64
6	DF	0.65	0.65	0.68	0.66	0.62	0.65
7	XCXM	0.65	0.65	0.70	0.59	0.61	0.71
8	LDDZ	0.65	0.66	0.69	0.54	0.64	0.77
9	YAYJ	0.66	0.68	0.62	1.00	1.00	0.32
10	LSBS	0.69	0.69	0.78	0.52	0.64	0.85
11	HYA	0.85	0.84	0.75	1.00	1.00	0.70
12	MSQS	0.67	0.70	0.68	0.52	0.71	0.83
13	EMGQ	0.68	0.73	0.95	0.37	0.69	0.99
14	MBMZ	0.68	0.63	0.92	0.41	0.53	0.95
15	MBRD	0.75	0.76	0.77	0.80	0.75	0.70
16	MBYJ	0.69	0.69	0.65	0.63	0.73	0.74
17	WTQ	0.77	0.77	0.69	0.79	0.83	0.74
18	NJWYYLZ	0.66	0.66	0.61	0.71	0.71	0.60
19	YBYX	0.98	0.98	0.95	1.00	1.00	0.95
20	LSFZJZ	0.75	0.76	0.76	0.65	0.75	0.84
21	ZGDA	0.75	0.75	0.69	0.76	0.80	0.74

Table A4. Prediction results of LSTM on area3.

No.	Station	AUC	PA	PP	RP	PN	RN
1	TH	0.69	0.69	0.62	1.00	1.00	0.38
2	CX	0.80	0.80	0.71	1.00	1.00	0.61
3	QJ	0.68	0.69	0.68	0.73	0.69	0.64
4	LJSD	0.78	0.78	0.78	0.79	0.78	0.77
5	SPI	0.85	0.86	0.78	1.00	1.00	0.71
6	DHZ	0.68	0.67	0.62	0.87	0.79	0.49
7	DR	0.73	0.73	0.87	0.53	0.66	0.92
8	DLSL	0.77	0.74	0.63	0.96	0.95	0.58
9	JN	0.85	0.85	0.86	0.85	0.85	0.86
10	YX	0.68	0.68	0.90	0.40	0.62	0.95
11	KM	0.70	0.71	0.83	0.50	0.66	0.90
12	LJYS	0.67	0.67	0.61	1.00	1.00	0.33
13	LJDZ	0.76	0.76	0.68	1.00	1.00	0.52
14	JZS	0.91	0.91	0.97	0.86	0.87	0.97
15	LJLD	0.69	0.69	0.86	0.47	0.62	0.92
16	TC	0.79	0.79	0.73	0.93	0.90	0.65
17	DQZ	0.68	0.68	0.67	0.73	0.70	0.63
18	JP	0.87	0.87	0.79	1.00	1.00	0.73
19	HH	0.83	0.83	0.75	1.00	1.00	0.66
20	TCMZ	0.80	0.80	0.90	0.67	0.75	0.93
21	LJNL	0.68	0.68	0.77	0.53	0.63	0.83
22	YYLG	0.65	0.64	0.57	1.00	1.00	0.31
23	XCH	0.87	0.87	0.79	1.00	1.00	0.74
24	DLHZ	0.92	0.91	0.84	1.00	1.00	0.83
25	XGLL	0.90	0.90	0.83	1.00	1.00	0.79

Appendix C

Two indicators, mag_mae and distance_average, are used to evaluate the absolute magnitude deviation and the epicenter deviation of the model. The evaluation indicators of the models are described in detail in Section 3.

Table A5. Prediction results of LSTM model for magnitude and epicenter on area1.

No.	Station	Mag_Mae	Distance_Average (km)
1	DJY	0.26	98.83
2	SMWJ	0.16	57.97
3	LXSM	0.44	119.46
4	WC	0.12	96.04
5	GYCT	0.08	102.62
6	SP	0.32	50.16
7	QCYD	0.27	100.69
8	QCPS	0.08	50.48
9	YAYJ	0.14	50.47
10	QCCB	0.18	44.30
11	HS	0.25	47.65
12	JZG2	0.15	19.53
13	LSBS	0.15	50.04
14	MSQS	0.40	48.30
15	HBY	0.17	82.37
16	REG	0.38	72.72
17	WTQ	0.26	26.32
18	NJWYYLZ	0.13	26.88
19	JYZJJ	0.09	14.30
20	LSSW	0.13	18.91
21	RXCS	0.27	63.67
22	LSFZJZ	0.10	16.96
23	LSJJRMZF	0.08	13.22
24	ZGDA	0.08	21.58
25	MYBC	0.10	95.15
26	SMAS	0.13	95.60

Table A6. Prediction results of LSTM model for magnitude and epicenter on area2.

No.	Station	Mag_Mae	Distance_Average (km)
1	CX	0.25	77.79
2	SMWJ	0.25	80.03
3	QW	0.40	82.78
4	GAX	0.20	93.71
5	YYYT	0.13	97.71
6	EB	0.57	79.23
7	MS	0.22	85.35
8	DF	0.33	75.27
9	YM	0.13	96.95
10	LDDZ	0.50	79.81
11	KM	0.21	60.83
12	CZ	0.40	60.74
13	MNLZ	0.19	94.53
14	HYA	0.46	88.26
15	YSHX	0.13	96.26
16	MBQB	0.17	97.19
17	MBMZ	0.39	57.86
18	MBSK	0.41	37.05
19	GYCT	0.18	80.16
20	SMAS	0.22	91.43
21	MBYJ	0.15	85.54
22	JYZJJ	0.24	73.20
23	RXCS	0.30	58.51
24	LSFZJZ	0.22	83.11
25	LSJJRMZF	0.22	78.03
26	ZGDA	0.20	78.62
27	YBCNQXJ	0.64	59.24
28	YBXWSHC	0.27	96.84

Table A7. Prediction results of LSTM model for magnitude and epicenter on area3.

No.	Station	Mag_Mae	Distance_Average (km)
1	TH	0.06	42.34
2	XC	0.26	101.01
3	DC	0.15	99.96
4	DHZ	0.23	42.08
5	XCXM	0.68	117.21
6	DLSL	0.27	113.03
7	YL	0.30	55.52
8	YM	0.31	44.14
9	HA	0.06	109.35
10	YX	0.16	79.27
11	LJYS	0.05	78.47
12	LJGC	0.04	54.87
13	DQZ	0.03	59.05
14	HH	0.03	14.53
15	LJDD	0.06	42.34
16	TCMZ	0.26	101.01
17	LJHP	0.15	99.96
18	TCRH	0.23	42.08
19	LJNL	0.68	117.21
20	XCH	0.27	113.03

Figure 1. Distribution of earthquakes and stations in Sichuan–Yunnan region.

Figure 2. The process of sample construction.

Figure 3. Two-dimensional matrix SMOTE algorithm.

Figure 4. LSTM architecture schematic.

Figure 5. LSTM model structure.

Figure 6. GRU architecture schematic.

Figure 7. CNN+GRU stacking network architecture.

Figure 8. Schematic diagram of five-fold cross-validation.

Figure 9. (a) Effect of window size for electromagnetic signals on prediction model. (b) Effect of time window size for geoacoustic signals on prediction model.

Figure 10. Overall results of the eight prediction models.

Figure 11. Decision process of earthquake-prediction model.

Table 1. The results of non-time series models.

Model	$AUC \geq 0.65$	$RP \geq 0.70$	$Distance_Average \leq 100 km$	$Mag_Mae \leq 0.25$
LightGBM	68	43	52	36
NN	57	49	47	39
SVM	51	39	43	41
GBDT	36	42	39	43
RF	45	37	41	32

Table 2. The results of time series models.

Model	$AUC \geq 0.65$	$RP \geq 0.70$	$Distance_Average \leq 100 km$	$Mag_Mae \leq 0.25$
LSTM	84	55	64	47
GRU	82	58	61	48
CNN+GRU	79	49	56	40

Table 3. The results of real-earthquake prediction.

Week	Actual Magnitude	Predicted Magnitude	Actual Epicenter	Predicted Epicenter
1st week (5 April 2021–11 April 2021)	N	N	N	N
2nd week (12 April 2021–18 April 2021)	N	Ms4.0	N	$(28.38° N, 104.76° E)$
3rd week (19 April 2021–25 April 2021)	N	N	N	N
4th week (26 April 2021–2 May 2021)	N	N	N	N
5th week (3 May 2021–9 May 2021)	Ms3.6	N	$(32.4° N, 104.02° E)$	N
6th week (10 May 2021–16 May 2021)	Ms4.7	N	$(24.43° N, 99.24° E)$	N
7th week (17 May 2021–23 May 2021)	Ms6.4	Ms3.9	$(25.67° N, 99.87° E)$	$(28.41° N, 104.65° E)$
8th week (24 May 2021–30 May 2021)	Ms4.1	Ms4.5	$(25.74° N, 99.95° E)$	$(25.59° N, 99.95° E)$
9th week (31 May 2021–6 June 2021)	N	Ms4.1	N	$(25.64° N, 99.98° E)$
10th week (7 June 2021–13 June 2021)	Ms5.1	N	$(24.34° N, 101.91° E)$	N
11th week (14 June 2021–20 June 2021)	Ms4.2	Ms4.2	$(24.33° N, 101.91° E)$	$(24.53° N, 99.41° E)$
12th week (21 June 2021–27 June 2021)	Ms3.8	Ms4.0	$(32.2° N, 104.94° E)$	$(24.31° N, 101.87° E)$
13th week (28 June 2021–4 July 2021)	Ms4.6	Ms3.9	$(24.31° N, 101.89° E)$	$(32.08° N, 104.57° E)$
14th week (5 July 2021–11 July 2021)	Ms4.7	N	$(24.43° N, 99.24° E)$	N
15th week (12 July 2021–18 July 2021)	Ms4.8	Ms3.9	$(32.97° N, 103.84° E)$	$(28.12° N, 104.64° E)$
16th week (19 July 2021–25 July 2021)	Ms4.1	Ms4.0	$(29.28° N, 105.44° E)$	$(28.14° N, 104.69° E)$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
