Research on Seismic Phase Recognition Method Based on Bi-LSTM Network

Wang, Li; Cai, Jianxian; Duan, Li; Guo, Lili; Shi, Xingxing; Cai, Huanyu

doi:10.3390/app14166917

by Li Wang ^1,2, Jianxian Cai ^1,2,*, Li Duan ^1,2, Lili Guo ^1,2, Xingxing Shi ^1,2 and Huanyu Cai ^1,2

^1 College of Electronic Science and Control Engineering, Institute of Disaster Prevention, Langfang 065201, China

^2 Hebei Key Laboratory of Seismic Disaster Instrument and Monitoring Technology, Institute of Disaster Prevention, Langfang 065201, China

^* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(16), 6917; https://doi.org/10.3390/app14166917

Submission received: 7 June 2024 / Revised: 24 July 2024 / Accepted: 2 August 2024 / Published: 7 August 2024

Abstract: To improve the precision of seismic phase recognition and reduce the misdetection rate, this paper applies deep learning to automatic phase recognition and designs an automatic seismic phase recognition model based on the Bi-LSTM network. The model is trained and tested on the STEAD dataset and compared with the traditional STA/LTA and AIC methods. The experimental results show that, compared with the STA/LTA and AIC methods, the Bi-LSTM network reduces the misdetection rate by about 8–15% and improves the RMSE; in particular, the prediction error for the S-wave is greatly reduced.

Keywords: deep learning; seismic phase recognition; Bi-LSTM

1. Introduction

As part of the basic work of seismic signal processing, seismic phase recognition plays an important role in seismology. In early seismological research, seismic phases were recognized manually, and the various phase data were calculated with the help of a travel time table. Manual recognition requires long training in professional knowledge and skills and relies on personal experience, so it cannot be applied on a large scale. Meanwhile, the speed of manual recognition cannot meet the real-time processing needs of rapid earthquake reports. With the gradual improvement of seismic stations and networks, the total number of seismic stations has reached 1972. The rapid increase in seismic waveform data makes it difficult to continue manual recognition, and its efficiency is gradually decreasing.

To address the problem of the slow speed and low precision of the current manual recognition methods, scholars at home and abroad began to study the automatic recognition methods of seismic phases in the 1960s, and have achieved a series of scientific research results. At present, the automatic recognition methods of seismic phases mainly include two categories: the traditional automatic phase recognition method and the deep learning method emerging in recent years due to big data technology. Among the traditional methods for automatic recognition of seismic phase, the most commonly used methods are STA/LTA (short-term average/long-term average) [1,2] and AIC (Akaike information criterion) [3,4], as well as fractal dimension method, polarization analysis method, and artificial neural network, among others. At the same time, with the advancement of technology in recent years, deep learning technology has developed rapidly. Experts and scholars have begun to try to apply deep learning technology to seismic phase automatic recognition, and deep learning methods for seismic phase automatic recognition have gradually become a research hotspot.

In 1976, Stevenson [3] proposed the application of STA/LTA to pick the arrival time of earthquake events. This method calculates, at each point in the signal, the ratio of the average of the signal over a short-term window (STA) to its average over a long-term window (LTA). The judgment principle is that this ratio increases sharply at the point where the seismic wave arrives. A threshold is therefore set, and the ratio at each point is compared with it: when the ratio exceeds the threshold, that point is taken as the arrival of the seismic wave. The STA/LTA method has been widely used in real-time seismic phase recognition because its principle is simple and its calculation is fast. However, the choice of feature function and threshold has a significant impact on the final recognition accuracy. In addition, because of the random noise and high-frequency pulses generated during seismic recording, the method can only estimate the approximate range of the signal arrival at low signal-to-noise ratios, and it sometimes produces false picks. Some experts and scholars have proposed improved methods based on this approach to address these issues, achieving an increase in picking accuracy [5,6,7,8]. At present, STA/LTA is commonly used to roughly determine the picking range, and other high-precision methods are used to pick the exact time.
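The thresholding rule described above can be illustrated with a minimal sketch. The squared-amplitude feature function, the window lengths, and the function name `sta_lta_pick` are assumptions made for illustration; they are not the configuration used in the comparison experiments.

```python
def sta_lta_pick(signal, n_sta, n_lta, threshold):
    """Return the first sample index whose STA/LTA ratio exceeds threshold, else None."""
    # Feature function: squared amplitude (an assumed, commonly used choice).
    cf = [x * x for x in signal]

    # Prefix sums let each window average be computed in constant time.
    prefix = [0.0]
    for value in cf:
        prefix.append(prefix[-1] + value)

    # Both windows end at sample i - 1; start once the long window is full.
    for i in range(n_lta, len(cf) + 1):
        sta = (prefix[i] - prefix[i - n_sta]) / n_sta
        lta = (prefix[i] - prefix[i - n_lta]) / n_lta
        if lta > 0 and sta / lta > threshold:
            return i - 1
    return None


# Synthetic trace: weak noise followed by a strong onset at sample 500.
noise = [0.01 * (-1) ** k for k in range(500)]
event = [1.0 * (-1) ** k for k in range(200)]
pick = sta_lta_pick(noise + event, n_sta=10, n_lta=100, threshold=3.0)
```

Because the result depends directly on `threshold` and on the two window lengths, the sketch also shows why these choices affect the picking accuracy.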

The Akaike information criterion (AIC) [4] was first proposed by Japanese statistician Hiroshi Akaike. Based on the basic concept of entropy, it is a standard used to measure the accuracy of statistical model fitting. The AIC method has been widely used in seismic phase automatic recognition systems [9,10,11,12], but it is greatly affected by the signal-to-noise ratio of seismic signals. In low signal-to-noise ratio situations, extreme points are not obvious, resulting in large recognition errors and low accuracy.
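As an illustration of how an AIC picker locates an extreme point, the sketch below uses one widely used variance-based formulation, in which the trace is split at every candidate sample k and the split with the smallest AIC value is taken as the arrival. The source does not state which AIC variant was used in the experiments, so the formula, the `margin` parameter, and the function names are assumptions.

```python
import math


def variance(values):
    """Population variance of a non-empty list."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def aic_pick(signal, margin=2):
    """Return the split index k that minimises the variance-based AIC function."""
    n = len(signal)
    best_k, best_aic = None, math.inf
    for k in range(margin, n - margin):
        var_before = variance(signal[:k])
        var_after = variance(signal[k:])
        if var_before <= 0 or var_after <= 0:
            continue  # the logarithm is undefined for a constant segment
        aic = k * math.log(var_before) + (n - k - 1) * math.log(var_after)
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


# Synthetic trace: weak noise followed by a strong onset at sample 300.
noise = [0.01 * (-1) ** k for k in range(300)]
event = [1.0 * (-1) ** k for k in range(300)]
pick = aic_pick(noise + event)
```

When the noise and signal variances are similar, the minimum becomes shallow, which matches the sensitivity to low signal-to-noise ratios noted above. The sketch recomputes both variances for every k and is therefore quadratic in the trace length.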

Fractal theory was first proposed by Mandelbrot in a paper published in the journal Science in 1967 [13], and was later gradually developed into a new theory: fractal geometry. Traditional geometry uses only integers to describe the dimensions of images and curves; fractal geometry breaks this restriction. In fractal theory, integers or decimals can be used to represent the fractal dimension of images or curves, describing their irregularity and variation patterns. Boschetti and Dentith [14] first applied fractal theory to the automatic identification of seismic phases. The principle is that the fractal dimension values of noise signals and seismic signals remain unchanged before and after the arrival of an earthquake, and only change at the moment of the arrival of the seismic phase. The arrival of the seismic phase is the starting point of the fractal dimension value change. Han Xiaojun et al. [15] applied the principle of segmentation when calculating fractal dimension values, which accelerated the calculation and made the results more stable. Cao Maosen et al. [16] proposed the length fractal dimension method after simplifying the Hausdorff and divider methods, which improved computational efficiency and recognition accuracy. Compared with other automatic seismic phase recognition methods, the fractal dimension method performs well under low signal-to-noise ratio conditions, with higher accuracy than other algorithms. Its drawback is that the calculation of the fractal dimension is relatively complex.

Since the 1960s, the polarization analysis method has gradually been applied to the automatic identification of seismic phases. Unlike STA/LTA, AIC, and other methods, which are based on a single component, the polarization analysis method processes the data of all three components of the seismic record simultaneously. In recent years, experts and scholars at home and abroad have been conducting in-depth research on the application of polarization analysis in automatic seismic phase identification systems [17,18,19]. Scholars have found that, in the identification of S-wave phases, the polarization analysis method is much more accurate than the STA/LTA and AIC methods. However, it requires solving for eigenvalues and eigenvectors, which is significantly more computationally intensive than the other methods, and this disadvantage becomes more obvious when fast calculations are needed.

In recent years, with the continuous development of deep learning technology, considering the characteristics of temporal correlation of seismic signals, some experts and scholars have tried to apply the RNN (recurrent neural network) as a deep learning method for the automatic recognition of seismic phases. In 2018, Yao Kaiyi and Li Yingyu [20] applied the LSTM network to the automatic recognition of seismic phases, and the LSTM model achieved good results when the judgment threshold was set at 1 s. Compared with LSTM network, the Bi-LSTM network, as a variant of LSTM, is more suitable for processing temporal data; it has been widely used in other temporal signal recognition fields, and its recognition effect is obviously better than other networks. In 2019, Zhu Lili et al. [21] used Bi-LSTM to recognize and classify heart sound signals of congenital heart disease, and the results showed that the Bi-LSTM network had good recognition effect, and the correct recognition rate of heart sound signals reached 84.2%. In 2019, Wu Changliang and Zhu Bo et al. [22] applied the Bi-LSTM network to the pattern recognition of quality control charts, and the experimental results showed that the Bi-LSTM network had higher precision. Because of the strong time dependence of seismic data, the Bi-LSTM network can learn the feature of history information and future information at the same time, so as to better recognize the seismic phase automatically. Therefore, the Bi-LSTM network is used to construct the automatic phase recognition model to realize the automatic phase recognition with high precision and low misdetection rate.

2. Design of Automatic Seismic Phase Recognition Model Based on Bi-LSTM Network

In this paper, the Bi-LSTM network is used as the main frame structure of the network, and an automatic seismic phase recognition model based on the Bi-LSTM network is designed.

The Bi-LSTM model is a bidirectional long short-term memory network model that combines forward and backward LSTM networks to capture long-term dependencies in sequential data. The design concept of the Bi-LSTM model is to enable the feature data obtained at time t to have both past and future information, thereby improving the efficiency and performance of feature extraction. This model performs feature extraction by inputting the input sequence into two independent LSTM models in both forward and reverse order, and then concatenating the two output vectors to form the final vector representation. This design enables Bi-LSTM to perform well in processing sequence data with bidirectional dependencies. Therefore, the Bi-LSTM network is an improved structure of the recurrent neural network, which can learn based on the time series characteristics of data, reduce the misdetection rate of seismic phase recognition through continuous training, and improve the accuracy of model recognition.

The structure of the automatic seismic phase recognition model based on the Bi-LSTM network is shown in Figure 1. The input data are the original seismic waveform data. After the phase recognition model based on the Bi-LSTM network, the phase feature in the input data can be learned, and the output is the signal label corresponding to the seismic waveform, where the P-wave’s signal label is set to 1, the S-wave is set to 2, and other parts are 0.

In Figure 1, the middle part is the main part of the automatic seismic phase recognition model based on the Bi-LSTM network. It is composed of two LSTM layers of the same size and opposite direction, a forward LSTM layer and a backward LSTM layer, where the number of hidden layers is set to 2 and the number of hidden nodes is set to 100. The forward LSTM layer learns the history information of the seismic phase feature in the waveform, and the backward LSTM layer learns the future information of the seismic phase feature from the opposite direction; combining the history information with the future information achieves better learning of the waveform’s temporal feature. The two LSTM layers have the same structure but different weight parameters. The forward LSTM layer takes the input sequence in forward order and produces an output sequence $h_{a}$, while the backward LSTM layer takes the same input in reverse order and produces an output sequence $h_{b}$. Finally, the output of the Bi-LSTM network is obtained by combining the two sequences, as shown in Equation (1).

$$H = h_{a} \oplus h_{b} \tag{1}$$

Bi-LSTM can simultaneously learn the forward history information and the backward future information in the sequence. The input sequence is fed into the model in two directions, the information of each direction is stored in its own hidden layer, and both hidden layers feed the same output layer. The two hidden layers have the same structure and differ only in the direction of the input sequence. Therefore, the data feature finally learned by the network contains temporal information in both the forward and backward directions. This makes better use of the time correlation of the seismic waveform data and establishes the temporal relationship between phase features, so that the P-wave and S-wave phases of the seismic signals are recognized more accurately and automatic seismic phase recognition with high precision and a low misdetection rate is realized.
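The bidirectional structure of Equation (1) can be sketched with a toy scalar LSTM cell. This is only an illustration of how the forward and backward passes are aligned and combined: the scalar cell, the weight layout, and all function names are assumptions, and the operator in Equation (1) is read here as concatenation of the two outputs at each time step. The model in this paper uses 100 hidden nodes and trained weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, w):
    """One step of a scalar LSTM cell; w maps a gate name to (w_x, w_h, bias)."""
    def pre(name):
        w_x, w_h, bias = w[name]
        return w_x * x + w_h * h + bias

    i = sigmoid(pre("input"))    # input gate
    f = sigmoid(pre("forget"))   # forget gate
    o = sigmoid(pre("output"))   # output gate
    g = math.tanh(pre("cell"))   # candidate cell state
    c_new = f * c + i * g
    return o * math.tanh(c_new), c_new


def run_lstm(sequence, w):
    """Run the cell over a sequence and return the hidden state at every step."""
    h, c, outputs = 0.0, 0.0, []
    for x in sequence:
        h, c = lstm_step(x, h, c, w)
        outputs.append(h)
    return outputs


def bi_lstm(sequence, w_forward, w_backward):
    """Pair forward and backward hidden states at each time step."""
    h_a = run_lstm(sequence, w_forward)
    # Process the reversed input, then flip the result back to forward time order.
    h_b = run_lstm(sequence[::-1], w_backward)[::-1]
    return list(zip(h_a, h_b))


# Hypothetical weights; the two directions use different parameters.
w_fwd = {g: (0.5, 0.5, 0.0) for g in ("input", "forget", "output", "cell")}
w_bwd = {g: (0.3, 0.7, 0.1) for g in ("input", "forget", "output", "cell")}
H1 = bi_lstm([0.1, -0.2, 0.3, 0.9], w_fwd, w_bwd)
H2 = bi_lstm([0.1, -0.2, 0.3, -0.9], w_fwd, w_bwd)
```

Changing only the last input leaves the forward state at the first step untouched but changes the backward state there, which is the sense in which each time step sees both history and future information.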

3. Preparation for the Experiments

This section describes the preparatory work for the experiments with the automatic phase recognition model based on the Bi-LSTM network, which is divided into three aspects: the construction of the automatic phase recognition dataset, the selection of the model evaluation metrics, and the setting of the model parameters. Constructing the dataset requires preprocessing the original data and building the dataset according to the requirements of the model’s input and output. The evaluation metrics are set with reference to the criteria that other researchers have used to judge automatic phase recognition results, with the judgment threshold specified according to the actual data in this paper. The experimental settings include the hardware and development environment of the platform on which the Bi-LSTM-based recognition model is built, the other software tools used, and the hyperparameters that need to be manually tuned when training the model.

3.1. Construction of Automatic Seismic Phase Recognition Dataset

Because the deep learning method requires a large amount of data to train the network model, this paper uses the dataset STEAD (Stanford Earthquake Dataset) as the original dataset, and generates the sample data according to the magnitude, SNR (signal-to-noise ratio), and other conditions [23]. Each seismic waveform datum in this dataset is 60 s of three-component data, including 35 pieces of label information. The data example of the STEAD dataset is shown in Figure 2, and the sample data label information is shown in Table 1. In Figure 2, the blue line represents the first arrival point of the P-wave phase, the red line represents the first arrival point of the S-wave phase, and the green line represents the point where the energy of the seismic signal disappears.

In the original dataset, the data are filtered by magnitude based on the information provided by the labels. Magnitude is a measure of the strength of an earthquake, determined from the amount of energy released by each seismic event as measured by seismometers. The magnitude is usually represented by the letter M, and the internationally recognized Richter scale is divided into nine levels. Earthquakes with M less than 1 are generally referred to as super microseismic events; those with M more than 1 and less than 3 are called weak earthquakes or microseismic events; those with M more than 3 and less than 4.5 are called felt earthquakes; those with M more than 4.5 and less than 6 are called moderately strong earthquakes; those with M more than 6 and less than 7 are called strong earthquakes; those with M more than 7 and less than 8 are called major earthquakes; and those with M of 8 or above are called massive earthquakes. The laboratory has been studying microseismic signals for a long time. Therefore, this paper screened the seismic waveform data for magnitudes between M0.5 and M2.5 and signal-to-noise ratios between 20 dB and 30 dB. A total of 43,700 seismic waveform data were obtained, and 10,000 of them were randomly selected as sample data.
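The screening step can be sketched as a filter over metadata records followed by random sampling. The underscore field names (`source_magnitude`, `snr_db`, `trace_name`) mirror the labels in Table 1 but are assumed here, and averaging the per-component SNR values is an assumption, since the text does not say how the bracketed SNR values are reduced to one number.

```python
import random


def parse_snr(text):
    """Parse an SNR string such as '[52.70 54.09]' into a list of floats."""
    return [float(token) for token in text.strip("[]").split()]


def select_samples(records, n_samples, seed=0):
    """Keep records with M in [0.5, 2.5] and SNR in [20, 30] dB, then sample."""
    kept = []
    for rec in records:
        snr = parse_snr(rec["snr_db"])
        mean_snr = sum(snr) / len(snr)  # assumption: average over components
        if 0.5 <= rec["source_magnitude"] <= 2.5 and 20.0 <= mean_snr <= 30.0:
            kept.append(rec)
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return rng.sample(kept, min(n_samples, len(kept)))


# Hypothetical metadata records for illustration only.
records = [
    {"trace_name": "a", "source_magnitude": 1.2, "snr_db": "[22.0 24.0 26.0]"},
    {"trace_name": "b", "source_magnitude": 3.3, "snr_db": "[52.70 54.09]"},
    {"trace_name": "c", "source_magnitude": 2.0, "snr_db": "[35.0 36.0 37.0]"},
    {"trace_name": "d", "source_magnitude": 0.5, "snr_db": "[20.0 20.0 20.0]"},
]
selected = select_samples(records, n_samples=10)
```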

We preprocess the filtered sample data by cutting each waveform according to the first-arrival sampling points of the P-wave and the S-wave provided by the data labels. A point randomly chosen 3 to 5 s before the arrival of the P-wave is taken as the waveform starting point, the point 10 s after the arrival of the S-wave is taken as the waveform ending point, and the waveform data are cut to a length of 30 s. When the length after cutting is less than 30 s, zero values are added randomly before or after the data as a supplement. After cutting, all the data are normalized to prevent the model from overfitting and affecting the training effect of the model. The normalization operation is shown in Equation (2), where $v_{i}$ represents the amplitude value of the seismic waveform data.

$$v_{i} = \frac{v_{i}}{\max \left| v_{i} \right|} \tag{2}$$
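A minimal sketch of the cutting, zero-padding, and normalization steps is given below, assuming a 100 Hz sampling rate (as stated later for the model input) and a single-component trace. The function name, the 50% chance of padding at the front, and the returned shifted arrival indices are illustrative assumptions.

```python
import random

SAMPLING_RATE = 100               # samples per second
TARGET_LEN = 30 * SAMPLING_RATE   # 30 s window


def cut_pad_normalise(trace, p_sample, s_sample, rng):
    """Cut around the P and S arrivals, pad to 30 s, and scale as in Equation (2)."""
    # Start 3 to 5 s before the P arrival, end 10 s after the S arrival.
    lead = rng.randint(3 * SAMPLING_RATE, 5 * SAMPLING_RATE)
    start = max(0, p_sample - lead)
    end = min(len(trace), s_sample + 10 * SAMPLING_RATE)
    window = list(trace[start:end])[:TARGET_LEN]

    # Put the missing zeros either all in front or all behind, chosen at random.
    pad = TARGET_LEN - len(window)
    pad_front = pad if (pad > 0 and rng.random() < 0.5) else 0
    window = [0.0] * pad_front + window + [0.0] * (pad - pad_front)

    # Equation (2): divide every amplitude by the maximum absolute amplitude.
    peak = max(abs(v) for v in window)
    if peak > 0:
        window = [v / peak for v in window]

    # Arrival indices move with the cut and the front padding.
    shift = pad_front - start
    return window, p_sample + shift, s_sample + shift


# Hypothetical 60 s trace with arrivals at the sample indices shown in Table 1.
trace = [((k % 50) - 25) * 0.5 for k in range(6000)]
window, p_new, s_new = cut_pad_normalise(trace, 600, 1305, random.Random(1))
```

Returning the shifted arrival indices keeps the labels aligned with the cut waveform; if the window had to be truncated to 30 s, the shifted S index could fall outside it and would need separate handling.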

In order to ensure that the dataset matches the model’s input, most of the unused labels in the 35 labels were removed. Only seven pieces of label information, such as trace name, trace start time, p arrival sample, s arrival sample, coda end sample, snr db, and network code, were retained to improve the calculation speed of the model. After preprocessing, we also manually labeled and checked the dataset, removed the blank data, manually modified the obvious errors, and randomly assigned them according to the proportion of 70%, 10% and 20%. A total of 70% of the data was used as the training dataset, 10% as the validation dataset, and 20% as the test dataset.
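The 70%/10%/20% random assignment can be sketched as a shuffle followed by slicing. The function name and the fixed seed are assumptions for illustration.

```python
import random


def split_dataset(samples, fractions=(0.7, 0.1, 0.2), seed=0):
    """Shuffle the samples and split them into training, validation, and test sets."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_train = round(fractions[0] * len(items))
    n_val = round(fractions[1] * len(items))
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder becomes the test set
    return train, val, test


# With 10,000 samples this gives 7000 / 1000 / 2000 items.
train, val, test = split_dataset(range(10000))
```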

3.2. Model Evaluation Metric Selection

The seismic phase identification indicators mainly include root mean square error (RMSE), accuracy (A), and misdetection rate (M). These indicators are used to evaluate the performance of the seismic phase automatic recognition model. RMSE is used to measure the deviation between the predicted and actual values of a model, and is an important indicator for measuring the accuracy of model predictions. In the automatic identification of seismic phases, RMSE can be used to evaluate the accuracy of the model in identifying seismic phases. A refers to the proportion of seismic phases correctly identified by a model to the total number of identified phases. High accuracy means that the model can accurately identify most seismic phases, which is crucial for the analysis of seismic data. M represents the proportion of seismic phases that the model failed to identify to the total number of identified phases. A low misdetection rate means that the model can identify as many seismic phases as possible to avoid missing important information.

During the seismic phase recognition experiment, in order to objectively evaluate the performance of the automatic phase recognition model based on the Bi-LSTM network and ensure the reliability and validity of the experimental results, this paper calculates the RMSE, A, and M of different methods. These three metrics are used as the basis for evaluating the effect of the phase recognition model and whether it meets the practical use standard, and the calculation formulas of the three evaluation indexes are shown in Equations (3)–(5).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (T_{i} - t_{i})^{2}} \tag{3}$$

$$A = \frac{T_{e}}{T_{e} + F_{e}} \tag{4}$$

$$M = \frac{F_{n}}{T_{e} + F_{n}} \tag{5}$$

RMSE is the root mean square error between the model-recognized phase arrival time $T_{i}$ and the manually labeled reference phase arrival time $t_{i}$, which indicates the precision of the model’s automatic phase recognition; n denotes the number of seismic waveform data in the dataset and i denotes the sequence number of the seismic waveform data. A is the accuracy rate of the model’s phase recognition, $T_{e}$ is the number of times the model correctly detects the phase’s arrival time, and $F_{e}$ is the number of times the model wrongly detects the phase’s arrival time. The threshold for correct recognition is ±0.5 s for both P-wave and S-wave arrivals; an error of more than ±0.5 s counts as incorrect recognition. A is thus the ratio of the number of phases correctly recognized by the model within the error range to the total number of phases recognized by the model, which reflects the accuracy of the model’s phase recognition. M is the misdetection rate of the model’s seismic phase recognition, and $F_{n}$ is the number of misdetected phases. The misdetection rate M reflects the completeness of the model’s seismic phase recognition with respect to the manually labeled reference phases and indicates the effectiveness of the model’s seismic phase recognition.
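Equations (3)–(5) can be computed as in the sketch below. Two points are assumptions rather than statements from the text: a phase the model did not pick is represented by `None` and counted in F_n, and the RMSE is taken over the picks that exist.

```python
import math


def evaluate_picks(predicted, reference, tolerance=0.5):
    """Return (RMSE, A, M) for arrival times in seconds; None marks a missed phase."""
    errors = [p - r for p, r in zip(predicted, reference) if p is not None]
    t_e = sum(1 for e in errors if abs(e) <= tolerance)  # correct picks
    f_e = len(errors) - t_e                              # picks outside the tolerance
    f_n = sum(1 for p in predicted if p is None)         # phases with no pick

    nan = float("nan")
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors)) if errors else nan
    accuracy = t_e / (t_e + f_e) if errors else nan
    misdetection = f_n / (t_e + f_n) if (t_e + f_n) else nan
    return rmse, accuracy, misdetection


# Four reference arrivals at 6.0 s: two good picks, one late pick, one miss.
rmse, accuracy, misdetection = evaluate_picks(
    [6.0, 6.3, 7.0, None], [6.0, 6.0, 6.0, 6.0]
)
```

In this example T_e = 2, F_e = 1, and F_n = 1, so A = 2/3 and M = 1/3.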

3.3. Model Parameters Setting

The STEAD training dataset preprocessed above is input into the automatic seismic phase recognition model based on Bi-LSTM network as training data. After repeated training and debugging, the error of model training is reduced. The final network parameters are shown in Table 2.

In this paper, seven pieces of label information of seismic waveform data are used as input data, the input tensor size is (batch_size, n_input, n_steps), and the time window step length is 100. Since a single seismic waveform is 30 s long and the sampling rate is 100 Hz, each waveform contains 3000 points and is arranged as a 30 × 100 tensor. With batch_size set to 100, n_input set to 30, and n_steps set to 100, the input tensor shape of the Bi-LSTM network is 100 × 30 × 100. The label dimension is set to 3, representing P-wave, S-wave, and neither P-wave nor S-wave. The model is trained with the cross-entropy loss function, and the Adam algorithm is used to minimize the loss. The Adam algorithm is computationally efficient and adapts the learning rate, which accelerates the convergence of the model.
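The arithmetic behind the input shape can be checked with a short sketch: a 3000-point waveform is split into 30 consecutive windows of 100 points, and 100 such waveforms form one batch. The function names and the use of nested lists in place of a tensor are assumptions for illustration.

```python
def to_windows(waveform, n_steps=100):
    """Split a waveform into consecutive, non-overlapping windows of n_steps points."""
    if len(waveform) % n_steps != 0:
        raise ValueError("waveform length must be a multiple of n_steps")
    return [waveform[k:k + n_steps] for k in range(0, len(waveform), n_steps)]


def make_batch(waveforms, n_steps=100):
    """Stack windowed waveforms into a nested list with three levels."""
    return [to_windows(w, n_steps) for w in waveforms]


# 100 waveforms of 30 s at 100 Hz, i.e. 3000 points each.
batch = make_batch([[0.0] * 3000 for _ in range(100)])
shape = (len(batch), len(batch[0]), len(batch[0][0]))
```

The resulting `shape` is (100, 30, 100), matching the 100 × 30 × 100 input described above.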

4. Experimental Results and Analysis

The parameter values in Table 2 are finally determined according to multiple debugging, and the STEAD training dataset is used for training. Through the judgment of model convergence and loss value, the trained network model is finally obtained, which can be used for automatic phase recognition experiments to test the effectiveness and stability of the model. In order to verify the phase recognition effect of the phase automatic recognition model based on the Bi-LSTM network, the phase automatic recognition model based on the Bi-LSTM network is compared with the traditional STA/LTA and AIC phase automatic recognition methods. In the comparative experiments, the test dataset used by all the methods is a unified 2000 samples of STEAD test dataset, and the range of SNR is between 20 and 30 dB. According to the evaluation metric proposed in this paper, the results of the automatic phase recognition experiments of all the methods are evaluated and analyzed.

The recognition example of the automatic phase recognition model based on the Bi-LSTM network is shown in Figure 3, and the comparison example of the model results and the manually labeled reference phase results is shown in Table 3. In Figure 3, the rectangular square wave with the height of 1 in the upper part represents the P-wave phase recognized by the phase automatic recognition model based on the Bi-LSTM network, the rectangular square wave with the height of 2 represents the S-wave phase recognized by the model, and the black and red vertical line in the lower part represents the reference P-wave and S-wave phase manually recognized, respectively. The P-wave and S-wave phase results of the automatic recognition model based on the Bi-LSTM network and the reference P-wave and S-wave phase results of manual recognition are shown in Table 3. If the error between the recognition results of the phase automatic recognition model based on the Bi-LSTM network and the manual recognition results is less than ±0.5 s, it is regarded as a correct result. Table 3 also shows the RMSE values of P-wave and S-wave calculated from the example, where the phase recognition data are represented as sampling points, and the RMSE is converted into seconds (s) by the unit of calculation.
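To compare a predicted label sequence with the reference picks, the sample index of each phase has to be converted into seconds. The sketch below assumes that the arrival is read from the leading edge of each rectangular label segment (label 1 for the P-wave, label 2 for the S-wave) and that the sampling rate is 100 Hz; the text does not state how the arrival is extracted from the model output, so this rule and the function names are illustrative.

```python
def first_index(labels, target):
    """Index of the first sample carrying the target label, or None if absent."""
    for idx, value in enumerate(labels):
        if value == target:
            return idx
    return None


def decode_arrivals(labels, sampling_rate=100):
    """Convert a per-sample label sequence into P and S arrival times in seconds."""
    picks = {}
    for name, target in (("P", 1), ("S", 2)):
        idx = first_index(labels, target)
        picks[name] = None if idx is None else idx / sampling_rate
    return picks


# Hypothetical 30 s label sequence: P segment from sample 400, S segment from 1100.
labels = [0] * 400 + [1] * 50 + [0] * 650 + [2] * 50 + [0] * 1850
picks = decode_arrivals(labels)
```

A phase that never appears in the label sequence is returned as `None`, which corresponds to a missed phase when the misdetection rate is computed.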

The comparative experimental results of the automatic seismic phase recognition model based on the Bi-LSTM network and the STA/LTA and AIC methods are shown in Table 4. It can be seen that, compared with the STA/LTA and AIC methods, the automatic phase recognition model based on the Bi-LSTM network reduces the misdetection rate M by about 8–15% and slightly improves the RMSE and A. The experiment shows that the automatic seismic phase recognition model based on the Bi-LSTM network can reduce the problem of misdetection to a certain extent.

5. Conclusions

In this paper, the application of the Bi-LSTM network in automatic seismic phase recognition was studied, and a seismic phase recognition model based on the Bi-LSTM network was designed to solve the problems existing in the current phase recognition methods, with the STEAD dataset used for training and testing. At the same time, the proposed model based on the Bi-LSTM network was compared with the traditional STA/LTA and AIC methods. The experimental results show that when the Bi-LSTM network is applied to phase automatic recognition, compared with the STA/LTA and AIC methods, the misdetection rate is reduced by about 8–15%, and the precision and RMSE are slightly improved. This shows that the automatic seismic phase recognition model based on the Bi-LSTM network can reduce the misdetection rate of automatic phase recognition.

Author Contributions

Conceptualization, L.W. and J.C.; methodology, L.W.; software, H.C.; validation, L.W., L.D. and L.G.; formal analysis, X.S.; investigation, L.D.; resources, L.W.; data curation, J.C.; writing—original draft preparation, L.W.; writing—review and editing, L.W.; visualization, L.W.; supervision, X.S.; project administration, J.C.; funding acquisition, J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Scientific Research Project Item of Hebei Province Education Department, grant number ZC2023104; Science and Technology Innovation Program for Postgraduate students in IDP subsidized by Fundamental Research Funds for the Central Universities, grant number ZY20230322; the Key Laboratory Open Fund Project of Hebei Provincial, grant number FZ224105; and the College Students’ Innovation and Entrepreneurship Training Program Project of Institute of Disaster Prevention, grant number 202311775011.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Stanford Earthquake Dataset can be downloaded from the official website of Stanford Earthquake Dataset (STEAD). This dataset is a global seismic signal dataset for artificial intelligence, supporting the Python programming language.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

STA	Short-term average
LTA	Long-term average
AIC	Akaike information criterion
RNN	Recurrent neural network
LSTM	Long short-term memory
Bi-LSTM	Bi-directional long short-term memory
STEAD	Stanford Earthquake Dataset
SNR	Signal-to-noise ratio

References

1. Fu, J.; Wang, X.; Li, Z.; Meng, H.; Wang, J.; Wang, W.; Tang, C. Automatic Phase-Picking Method for Detecting Earthquakes Based on the Signal-to-Noise-Ratio Concept. Seismol. Res. Lett. 2019, 91, 334–342.
2. Zhu, W.; Beroza, G.C. PhaseNet: A deep-neural-network-based seismic arrival-time picking method. Geophys. J. Int. 2019, 216, 261–273.
3. Stevenson, P.R. Microearthquakes at Flathead Lake, Montana: A study using automatic earthquake processing. Bull. Seismol. Soc. Am. 1976, 66, 61–80.
4. Akaike, H. Statistical Predictor Identification. In Selected Papers of Hirotugu Akaike; Parzen, E., Tanabe, K., Kitagawa, G., Eds.; Springer Series in Statistics; Springer: New York, NY, USA, 1998; pp. 137–151.
5. Allen, R.V. Automatic earthquake recognition and timing from single traces. Bull. Seismol. Soc. Am. 1978, 68, 1521–1532.
6. Allen, R. Automatic phase pickers: Their present use and future prospects. Bull. Seismol. Soc. Am. 1982, 72, S225–S242.
7. Majer, E.; Mcevilly, T.; Albores, A. Seismological studies at Cerro Prieto. Geothermics 1980, 9, 79–88.
8. Baer, M.; Kradolfer, U. An automatic phase picker for local and teleseismic events. Bull. Seismol. Soc. Am. 1987, 77, 1437–1445.
9. Maeda, N. A method for reading and checking phase times in autoprocessing system of seismic wave data. Zisin 1985, 38, 365–379.
10. Liu, X.Q.; Cai, Y.; Zhao, R.; Qu, B.A.; Zhao, Y.G.; Feng, Z.J.; Li, H. Automatic recognition method and application of regional seismic signals. Appl. Geophys. 2014, 11, 128–138.
11. Zhao, D.P.; Liu, X.Q.; Liu, Y.X.; Wang, Z.S.; Zhao, H.; Zhang, Y.L. Application of high-order statistics and AIC method in identifying regional seismic events and initial movements of direct P-waves. Earthq. Geomagn. Obs. Res. 2013, 34, 61–69.
12. Jiang, C.; Wu, J.P.; Fang, L.H. Research on earthquake detection and automatic picking of seismic phases. J. Seismol. 2018, 40, 45–57.
13. Mandelbrot, B. How Long Is the Coast of Britain? Statistical Self-Similarity and Fractional Dimension. Science 1967, 156, 636–638.
14. Boschetti, F.; Dentith, M.D.; List, R.D. A fractal-based algorithm for detecting first arrivals on seismic traces. Geophysics 1996, 61, 1095–1102.
15. Han, X.J.; Shi, Z.J.; Li, Y.L. An improved algorithm for picking up the first arrival of seismic waves using fractal dimension. Pet. Geophys. Explor. 2002, 37, 60–63.
16. Cao, M.S.; Ren, Q.W.; Wan, L.M.; Luo, Y. The Length fractal dimension algorithm picks up the first arrival of seismic waves. Pet. Geophys. Explor. 2004, 39, 509–514.
17. Vidale, J.E. Complex polarization analysis of particle motion. Bull. Seismol. Soc. Am. 1986, 76, 1393–1405.
18. Amoroso, O.; Maercklin, N.; Zollo, A. S-wave identification by polarization filtering and waveform coherence analyses. Bull. Seismol. Soc. Am. 2012, 102, 854–861.
19. Ross, Z.E.; Ben-Zion, Y. Automatic picking of direct P, S seismic phases and fault zone head waves. Geophys. J. Int. 2014, 199, 368–381.
20. Soto, H.; Schurr, B. DeepPhasePick: A method for detecting and picking seismic phases from local earthquakes based on highly optimized convolutional and recurrent deep neural networks. Geophys. J. Int. 2021, 227, 1268–1294.
21. Tao, B.; Pejman, T. Attention-based LSTM-FCN for earthquake detection and location. Geophys. J. Int. 2022, 227, 1568–1576.
22. Yang, Z.; Yang, C.; Li, X.; Min, C. Pattern Recognition of the Vertical Hydraulic Fracture Shapes in Coalbed Methane Reservoirs Based on Hierarchical Bi-LSTM Network. Complexity 2020, 2020, 1734048.
23. Mousavi, S.M.; Sheng, Y.; Zhu, W.; Beroza, G.C. STanford EArthquake Dataset (STEAD): A global data set of seismic signals for AI. IEEE Access 2019, 4, 179464–179476.

Figure 1. Structure of automatic phase recognition model based on Bi-LSTM network.

Figure 2. Data example of STEAD dataset.

Figure 3. Recognition example of the automatic seismic phase recognition model based on the Bi-LSTM network.

Table 1. Label information for a sample record from the STEAD dataset.

Label	Information	Label	Information
back azimuth deg	59.7	source distance deg	0.31
coda end sample	3534	source distance km	34.7
network code	TA	source error sec	0.6709
p arrival sample	600.0	source gap deg	15.649
p status	manual	source depth uncertainty km	0.58
p travel sec	8.819	source id	16012717
p weight	0.5	source latitude	48.5636
receiver code	A04D	source longitude	−123.1136
receiver elevation m	13.0	source magnitude	3.3
receiver latitude	48.7201	source magnitude author	None
receiver longitude	−122.7063	source magnitude type	mb
receiver type	BH	snr db	[52.70 54.09]
s status	manual	source depth km	50.84
s weight	0.5	trace category	earthquake local
source horizontal uncertainty km	3.55464	source origin time	8 February 2011 16:36:36.83
source mechanism strike dip rake	none	trace name	A04D.TA 20110208163638 EV
source origin uncertainty sec	0.94	trace start time	8 February 2011 16:36:39.65
s arrival sample	1305.0
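
The sample-index labels in Table 1 can be related to absolute times with a short calculation. The Python sketch below converts the P and S arrival samples to timestamps and compares the P arrival with the origin time plus the listed P travel time. It assumes a sampling rate of 100 Hz, which is the rate of STEAD waveforms; the function and variable names are illustrative and are not part of the dataset's tooling.

```python
from datetime import datetime, timedelta

SAMPLING_RATE_HZ = 100.0  # assumption: STEAD traces are sampled at 100 Hz

# Values copied from Table 1.
trace_start = datetime(2011, 2, 8, 16, 36, 39, 650000)    # trace start time
source_origin = datetime(2011, 2, 8, 16, 36, 36, 830000)  # source origin time
p_arrival_sample = 600.0
s_arrival_sample = 1305.0
p_travel_sec = 8.819


def sample_to_time(start, sample, rate_hz=SAMPLING_RATE_HZ):
    """Convert a sample index within a trace to an absolute timestamp."""
    return start + timedelta(seconds=sample / rate_hz)


p_time = sample_to_time(trace_start, p_arrival_sample)
s_time = sample_to_time(trace_start, s_arrival_sample)

# Cross-check: origin time plus P travel time should land on the P arrival.
p_time_from_origin = source_origin + timedelta(seconds=p_travel_sec)
p_mismatch_sec = abs((p_time - p_time_from_origin).total_seconds())

# S-P time, commonly used as a rough proxy for source distance.
s_minus_p_sec = (s_arrival_sample - p_arrival_sample) / SAMPLING_RATE_HZ

print("P arrival:", p_time.isoformat())
print("S arrival:", s_time.isoformat())
print("P mismatch vs. origin + travel time (s):", p_mismatch_sec)
print("S-P time (s):", s_minus_p_sec)
```

Under the 100 Hz assumption, the two routes to the P arrival time agree to within a few milliseconds, which suggests the arrival labels are sample indices counted from the trace start time.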

Table 2. Parameter values of the automatic seismic phase recognition model based on the Bi-LSTM network.

Parameter	Parameter’s Meaning	Setting Value
training_iters	Number of training iterations	50,000
batch_size	Batch size of single training	100
n_input	Number of single training batch	30
n_step	Number of seismic data processed in single training	100
n_classes	Label dimension	3
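
One way to keep the settings of Table 2 together in code is a small configuration object. The sketch below is only an illustration: the class name `BiLstmTrainingConfig`, the helper `one_hot`, and the class names in `CLASS_NAMES` are hypothetical, and the paper does not state which three classes the label dimension refers to.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiLstmTrainingConfig:
    """Values copied from Table 2; field meanings follow the table."""
    training_iters: int = 50_000  # number of training iterations
    batch_size: int = 100         # batch size of single training
    n_input: int = 30             # as listed in Table 2
    n_step: int = 100             # as listed in Table 2
    n_classes: int = 3            # label dimension

    def __post_init__(self):
        # All settings in Table 2 are positive counts.
        for name, value in vars(self).items():
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")


# Hypothetical class names; the source only gives the label dimension (3).
CLASS_NAMES = ("noise", "P", "S")


def one_hot(class_index, n_classes):
    """Encode a class index as a one-hot vector of length n_classes."""
    if not 0 <= class_index < n_classes:
        raise ValueError("class_index out of range")
    return [1.0 if i == class_index else 0.0 for i in range(n_classes)]


config = BiLstmTrainingConfig()
p_label = one_hot(CLASS_NAMES.index("P"), config.n_classes)
print(config)
print("one-hot label for P:", p_label)
```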

Table 3. Example comparison between the recognition results of the automatic phase recognition model based on the Bi-LSTM network and the manual reference results.

Station Code	Bi-LSTM Model P-Wave	Manual Reference P-Wave	Bi-LSTM Model S-Wave	Manual Reference S-Wave
TA	389.7	376.2	992.5	981.2
ZQ	492.7	477.5	895.2	913.3
NC	384.6	368.1	1288.4	1276.3
HV	538.6	551.1	1155.2	1177.1
SN	505.8	495.4	1533.8	1521.4
KR	332.8	364.9	1050.2	1022.9
PB	433.2	419.0	1435.6	1418.2
	RMSE of P-wave: 0.4671 s		RMSE of S-wave: 0.4778 s
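
The RMSE between model picks and manual reference picks can be computed as in the following Python sketch, applied here to the seven example rows of Table 3 only. It assumes that the tabulated picks are sample indices at 100 Hz (the STEAD sampling rate), so each difference is divided by the rate to obtain seconds; the function name `rmse_seconds` is illustrative. Because the sketch covers only these example rows, its output need not equal the RMSE values in the last row of the table, which presumably summarize a larger test set.

```python
import math

SAMPLING_RATE_HZ = 100.0  # assumption: picks are sample indices at 100 Hz


def rmse_seconds(predicted, reference, rate_hz=SAMPLING_RATE_HZ):
    """Root-mean-square error between two pick lists, in seconds."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("pick lists must be non-empty and of equal length")
    squared = [((p - r) / rate_hz) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Example rows of Table 3 (TA, ZQ, NC, HV, SN, KR, PB).
p_model = [389.7, 492.7, 384.6, 538.6, 505.8, 332.8, 433.2]
p_manual = [376.2, 477.5, 368.1, 551.1, 495.4, 364.9, 419.0]
s_model = [992.5, 895.2, 1288.4, 1155.2, 1533.8, 1050.2, 1435.6]
s_manual = [981.2, 913.3, 1276.3, 1177.1, 1521.4, 1022.9, 1418.2]

p_rmse = rmse_seconds(p_model, p_manual)
s_rmse = rmse_seconds(s_model, s_manual)
print(f"P-wave RMSE over example rows: {p_rmse:.4f} s")
print(f"S-wave RMSE over example rows: {s_rmse:.4f} s")
```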

Table 4. Comparative experimental results of the automatic seismic phase recognition model based on the Bi-LSTM network with STA/LTA and AIC methods.

Method	Phase	RMSE (s)	A	M
Bi-LSTM	P	0.48	72.13%	23.18%
Bi-LSTM	S	0.49	73.89%	25.34%
STA/LTA	P	0.49	73.45%	37.52%
STA/LTA	S	0.62	76.23%	35.87%
AIC	P	0.43	75.13%	32.81%
AIC	S	0.58	78.92%	30.72%
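
The differences between the methods in Table 4 can be tabulated with a few lines of Python. The sketch below reads column M as the misdetection rate (an assumption based on the wording of the abstract) and reports, for each baseline method and phase, the baseline value minus the Bi-LSTM value in percentage points. The names `MISDETECTION` and `reduction_vs_bilstm` are illustrative.

```python
# Column M of Table 4 (%), keyed by (method, phase).
# Interpreting "M" as the misdetection rate is an assumption.
MISDETECTION = {
    ("Bi-LSTM", "P"): 23.18,
    ("Bi-LSTM", "S"): 25.34,
    ("STA/LTA", "P"): 37.52,
    ("STA/LTA", "S"): 35.87,
    ("AIC", "P"): 32.81,
    ("AIC", "S"): 30.72,
}


def reduction_vs_bilstm(rates):
    """Baseline M minus Bi-LSTM M, in percentage points, per (method, phase)."""
    reductions = {}
    for (method, phase), rate in rates.items():
        if method == "Bi-LSTM":
            continue
        reductions[(method, phase)] = round(rate - rates[("Bi-LSTM", phase)], 2)
    return reductions


reductions = reduction_vs_bilstm(MISDETECTION)
for (method, phase), value in sorted(reductions.items()):
    print(f"{method:8s} {phase}: {value:.2f} percentage points")
```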

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
