1. Introduction
Earthquakes are a natural phenomenon that releases energy from the Earth’s interior, resulting in significant destruction and devastating consequences for human society. China, situated at the intersection of the Pacific Rim Seismic Zone and the Eurasian Seismic Zone, is one of the countries most severely affected by seismic disasters. Conducting scientific research on earthquake precursors and exploring methods for identifying them are crucial for disaster prevention and mitigation. Currently, earthquake precursor identification primarily relies on the observation and analysis of various precursors, including crustal deformation, changes in seismic wave velocity, and geomagnetic field anomalies. Researchers have examined a range of precursor anomalies before earthquakes and have sought to identify seismic events based on the precursor information. Among these studies, Chien et al. [1] investigated the relationship between geothermal precursor anomalies and the locations of future earthquakes in Taiwan Island, Singh et al. [2] analysed the correlation between hydrogen peroxide concentrations in hot spring boreholes and earthquakes in active fault zones, Li et al. [3] developed a short-term earthquake warning technique based on continuous GPS signal anomalies, and Yusof et al. [4] sought to identify earthquake precursor features and verify their correlation with seismic events through signal processing methods. However, these precursor signals often exhibit significant uncertainties in complex geological environments, which can limit the accuracy and timeliness of precursor identification. Consequently, exploring new earthquake precursor signals and their identification methods has become crucial for effective earthquake early warning.
Rock acoustic emission (AE) monitoring techniques based on fracture mechanics are cutting-edge tools for geophysical diagnostics. During the earthquake preparation phase, a large area of crustal rock is subjected to compression, leading to the accumulation of significant strain energy. In regions of high compression before an earthquake, localised fractures in the rock can occur, resulting in concentrated and intense AE phenomena. In fracture mechanics, AE represents the strain energy released during material fracture, which is emitted as stress waves [5,6,7]. These stress waves exhibit a wide frequency range, from 1 THz for nanoscale fractures to 1 Hz for kilometre-scale fractures. The latter frequencies correspond to typical seismic wave frequencies, which can be detected by sensors placed on the Earth’s surface [8]. Therefore, by monitoring rock AE, it is possible to detect AE signals generated by micro-seismic activities before an earthquake, allowing for early earthquake warning. This method has been studied in laboratory settings [9], but it has rarely been applied in the field. Research has indicated that crustal stress changes during the earthquake preparation phase [10], and an increase in AE activity may signify the redistribution of crustal stress in the preparation zone [11,12]. Zimatore et al. [13] analysed AE time series obtained from two monitoring stations located 300 km apart in Italy and found that AE can provide information about crustal stress anomalies related to earthquakes. Studies examining the correlation between AE and seismicity suggest that AE can serve as an earthquake precursor [14,15]. For instance, a sudden and significant increase in AE signals was observed about 400 km from the epicentral area before the occurrence of the Assisi earthquake [16]. Carpinteri et al. [17] identified a strong correlation between AE and seismic sequences in adjacent areas through experimental observations in gypsum mines in northern Italy, Lukovenkova et al. [18] noted a distinct difference in the frequency domain of pre-seismic AE signals compared to the usual background signals during the Chupanov earthquake, and Spivak et al. [19] studied the propagation and perturbation of AE signals in the atmosphere related to earthquakes of magnitudes 5.1 to 6.9 in Albania, Greece, Iran, and Turkey, estimating the energy of these earthquakes based on spectral features. These studies illustrate that AE monitoring techniques offer a viable approach for short-term earthquake warning.
However, during AE monitoring, the data acquisition systems record massive volumes of data that include noise from multiple sources, making it challenging to identify useful information in real time through manual analysis. With the rapid advancement of artificial intelligence [20,21], deep learning has the potential to significantly advance earthquake precursor identification. Deep learning technology provides robust data processing and recognition capabilities, enabling the automatic analysis of large volumes of AE data to reveal inherent earthquake precursor features. In recent years, many scholars have explored deep learning methods for earthquake warning [22,23,24,25]. Banna et al. [26] developed a long short-term memory (LSTM) model for identifying seismic events in Bangladesh within a month, whereas Jozinović et al. [27] utilised a convolutional neural network (CNN) to identify distant peak ground intensity. Additionally, researchers developed CNN-LSTM models for identifying significant earthquakes in Japan [28], daily seismic events in Chile [29], seismic acoustic signals from lab simulations [30], and coal mine earthquakes [31]. While these studies demonstrate impressive advancements in earthquake warning, they also exhibit certain limitations. For instance, many models, including those by Wang et al. [32], might be constrained by their reliance on specific geographic data and may not generalise well to different regions. Shcherbakov et al. [33] combined Bayesian networks with extreme value theory; although innovative, this approach may overlook certain real-time data integration challenges. Furthermore, the use of deep learning by DeVries et al. [34] on over 130,000 main and aftershock pairs could be constrained by the quality and range of input data, potentially impacting warning accuracy. These shortcomings are gaps that this paper aims to address, particularly because real-time forecasting requires the timely processing of massive amounts of data. By improving real-time data integration, enhancing model adaptability, and refining input data quality, we aspire to advance the state of deep learning applications in earthquake precursor identification.
Existing research has not yet produced a comprehensive model that fully addresses the challenges of earthquake precursor identification, primarily due to the complex physical mechanisms that generate earthquakes and their precursors. However, with the continuous advancement of AE monitoring technology and deep learning, it is essential to conduct in-depth studies on earthquake precursor identification. Rocks in the deeper layers of the Earth’s crust typically reflect stress changes more directly, and the signal attenuation of stress wave propagation in rocks is relatively slow, making them more sensitive to micro-seismic activities. Therefore, monitoring AE signals in rocks allows for the acquisition of more accurate information about earthquake precursors while minimising noise interference.
The eastern Guangdong region of China is located within the Southeast Coastal Seismic Zone and the Pacific Rim Seismic Zone, and is recognised as a high-seismic-intensity zone. This paper utilises a dedicated seismic observation station in eastern Guangdong to investigate a rapid earthquake precursor identification method through AE and deep learning. In this study, AE equipment and seismometers are installed in a dedicated all-granite mountain tunnel to collect AE signals and seismic activity from the rocks in the high-intensity zone. The process involves extracting samples containing precursor features, which are then combined with deep learning models for training, validation, and testing. The resulting warning model is designed for rapid earthquake warning. This study offers a new technical approach to earthquake precursor identification that may help to prevent earthquake disasters in the future.
4. Experimental Analysis of Identification Models
4.1. Cross-Validation of Deep Learning Models
Cross-validation studies are conducted to train and validate the model architecture and evaluate the effectiveness of its modules. Three distinct deep learning models are developed using the dataset from the two largest earthquakes, as detailed in Table 3 and Figure 12.
4.2. Optimisation of Neural Network Parameters
A parametric study is conducted through pre-training experiments to select optimal model parameters. Using Model I as an example, the sigmoid activation function is chosen, and data are batch-loaded using the mini-batch gradient descent method. The mini-batch size is set to 20 and the maximum number of training epochs to 1000.
Training results for 10, 20, and 40 neurons, shown in Figure 13a, reveal that increasing the neuron count improves accuracy. A loss value of 0.0232 is achieved with 20 neurons, and further increasing the neuron count improves accuracy by only 2% when reaching 30 neurons. Thus, 20 neurons offer a good balance between training performance and computational cost.
A comparison using 20 neurons with varying numbers of hidden layers is conducted, as shown in Figure 13b. The results indicate that increasing network depth reduces the loss value to 0.0207 with four hidden layers. However, the improvement in accuracy over the three-layer model is minimal, and the additional layer increases the risk of overfitting. Thus, three hidden layers are selected.
In addition, the comparison results for different learning rates and optimisers are presented in Figure 13c and Figure 13d, respectively. The optimisers compared include Stochastic Gradient Descent (SGD), Adaptive Moment Estimation (Adam), and Root Mean Square Propagation (RMSprop), which are widely used in deep learning for parameter optimisation with different convergence characteristics. The results indicate that model training is most effective with a learning rate of 0.005, while the validation accuracy is highest when the SGD optimiser is employed, reaching 98.7%.
Following the pre-training and validation of the network, appropriate hyperparameters can be selected for the subsequent training, validation, and testing of the models.
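As a minimal sketch of the selected configuration, the Python snippet below collects the hyperparameters reported above into one object and builds the forward pass of a fully connected network of that shape. The names `TrainConfig`, `mini_batches`, `init_layers`, and `forward`, the eight input features, the single sigmoid output unit, and the uniform weight initialisation are illustrative assumptions rather than details taken from this study, and the training loop itself is omitted.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as reported in Section 4.2."""
    hidden_layers: int = 3        # chosen over four layers to limit overfitting
    neurons_per_layer: int = 20   # chosen from the neuron-count comparison
    batch_size: int = 20          # mini-batch gradient descent
    max_epochs: int = 1000
    learning_rate: float = 0.005
    optimiser: str = "sgd"
    activation: str = "sigmoid"


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mini_batches(samples, batch_size, rng):
    """Yield shuffled mini-batches; the last batch may be smaller."""
    order = list(range(len(samples)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        yield [samples[i] for i in order[start:start + batch_size]]


def init_layers(n_inputs, cfg, rng):
    """Create weights; each neuron stores its input weights plus a trailing bias."""
    sizes = [n_inputs] + [cfg.neurons_per_layer] * cfg.hidden_layers + [1]
    return [
        [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(layers, features):
    """Return the sigmoid output of the network for one feature vector."""
    activations = list(features)
    for layer in layers:
        activations = [
            sigmoid(w[-1] + sum(wi * ai for wi, ai in zip(w, activations)))
            for w in layer
        ]
    return activations[0]


# Usage with hypothetical data: 90 samples of 8 AE features (assumed sizes).
cfg = TrainConfig()
rng = random.Random(0)
samples = [[rng.random() for _ in range(8)] for _ in range(90)]
layers = init_layers(8, cfg, rng)
batch_sizes = [len(batch) for batch in mini_batches(samples, cfg.batch_size, rng)]
print(batch_sizes)
print(forward(layers, samples[0]))
```

Keeping the hyperparameters in one immutable object makes it straightforward to rerun the comparisons in Figure 13 by changing a single field at a time.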
4.3. Training and Validation of Models
After completing one cycle of training for each model, the loss function values for the training set and validation set are obtained using Equation (5). The total number of training epochs is set to 1000, and the loss values for the training and validation phases across different epochs are illustrated in Figure 14 and summarised in Table 4. The validation set accuracy is presented in Table 5.
The results indicate that after 1000 training cycles, the loss values of the validation set for the three models are 0.0115, 0.0176, and 0.0122, respectively. The corresponding accuracies of the validation set are 98.7%, 97.8%, and 98.3%. The models therefore achieve high accuracy after training and exhibit strong robustness, fast convergence, and effective classification performance.
4.4. Testing and Evaluation of Models
A separate test set, entirely distinct from the training and validation sets, is selected to evaluate the generalisation performance of the models. For comparative analysis, the recognition results of the models are presented in the confusion matrix shown in Figure 15. Out of 456 test samples, the models correctly identified 445, 441, and 443 samples, respectively.
The results for the three models using the test set are compared in Table 6. The results show that all three models achieve high test accuracy. Model I achieves an accuracy of 97.6%, a recall rate of 99.6%, and an F1 score of 0.975, the best of the three models. It is therefore selected as the preferred recognition model in this paper. However, its accuracy is only 0.8% and 0.3% higher than that of Model II and Model III, respectively, suggesting that all three identification models show commendable generalisation performance. The identification model using AE features as inputs provides good recognition of seismic events. The trained model is well-suited for the real-time monitoring of rock AE signals, enabling the automatic warning of seismic events.
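The evaluation metrics used above can be sketched in a few lines of Python. The function `classification_metrics` and the confusion-matrix counts passed to it are hypothetical and serve only to show how accuracy, precision, recall, and the F1 score relate to one another; the only figures taken from the text are the 445 correct identifications out of 456 test samples for Model I.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Accuracy of Model I from the counts reported in the text.
model_i_accuracy = 445 / 456
print(f"{model_i_accuracy:.1%}")  # prints 97.6%

# Hypothetical confusion matrix, for illustration only (not from Figure 15).
example = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(example)
```

Because the F1 score is the harmonic mean of precision and recall, it penalises a model that attains high recall by issuing many false warnings, which is why it is reported alongside accuracy.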
4.5. Identification Results of Seismic Events
To evaluate the effectiveness of the models in identifying seismic events, we cross-test them with real-time data: Model II is evaluated with data from the EQ.2 earthquake, while Model III is tested with data from the EQ.1 earthquake. This cross-testing strategy verifies that the models’ performance is robust across different earthquake events, not just the ones they are trained on.
The identification results of the models are presented in Figure 16. During the selected period, two warnings are issued, accurately corresponding to the two largest seismic events that occurred within that timeframe. This demonstrates the model’s high identification accuracy for significant seismic events.
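The text does not state how per-sample model outputs are converted into discrete warnings. Purely as an illustration, the Python sketch below shows one hypothetical rule: a warning is raised once the predicted probability stays at or above a threshold for several consecutive samples. The function `issue_warnings`, the threshold of 0.5, the run length of three, and the probability sequence are all assumptions and are not taken from this study.

```python
def issue_warnings(probabilities, threshold=0.5, min_consecutive=3):
    """Return the start index of each run of high-probability samples.

    A warning is recorded once, at the moment a run of samples with
    probability >= threshold reaches min_consecutive in length.
    """
    warnings = []
    run_length = 0
    for index, probability in enumerate(probabilities):
        run_length = run_length + 1 if probability >= threshold else 0
        if run_length == min_consecutive:
            warnings.append(index - min_consecutive + 1)
    return warnings


# Hypothetical model outputs over time; the isolated spike at index 8 is ignored.
outputs = [0.1, 0.2, 0.9, 0.95, 0.97, 0.99, 0.3, 0.1, 0.8, 0.2, 0.9, 0.9, 0.9, 0.1]
print(issue_warnings(outputs))  # prints [2, 10]
```

Requiring a short run of positive outputs is one simple way to suppress isolated false positives caused by transient noise, at the cost of a small delay before a warning is raised.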
4.6. Comparison with Various Machine Learning Models
As a comparative study, the proposed deep learning model is experimentally evaluated against traditional machine learning models, including Support Vector Machine (SVM), LightGBM (LGB), and Random Forest (RF). SVM is a supervised learning method that classifies data into two classes by finding an optimal hyperplane that maximises the margin between them; the LGB model, an enhanced gradient boosting decision tree algorithm, combines gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB); and the RF model is a bagging ensemble algorithm that builds training subsets through random sampling, trains individual decision trees, and derives final predictions through majority voting.
Table 7 presents the test results for the SVM, LGB, and RF models. The RF model shows the lowest accuracy at 93.2%, followed by the LGB model at 94.6% and the SVM model at 95.3%. In contrast, the deep learning model achieves higher accuracy, with F1 scores surpassing the SVM, LGB, and RF models by 2.6%, 3.1%, and 5.1%, respectively. These results emphasise the deep learning model’s capability to extract hidden features from complex data and enhance generalisation through the backpropagation algorithm. For earthquake early warning, the model using AE features as inputs achieves higher accuracy and robustness in identifying seismic events than the traditional machine learning algorithms tested. Hence, the trained model is well-suited for online monitoring and early warning of rock AE signals.
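To make the bagging and majority-voting mechanism described for the RF baseline concrete, the sketch below implements it in plain Python. It is a deliberately simplified stand-in: each ensemble member is a one-split decision stump rather than a full decision tree, no random feature subsetting is performed, and the toy data and all function names are hypothetical. It is not the RF implementation used in the comparison.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Pick the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for polarity in (0, 1):
                preds = [polarity if row[j] > t else 1 - polarity for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, j, t, polarity)
    return best[1:]


def predict_stump(stump, row):
    j, t, polarity = stump
    return polarity if row[j] > t else 1 - polarity


def fit_bagged_stumps(X, y, n_trees, rng):
    """Train each stump on a bootstrap sample drawn with replacement."""
    n = len(X)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble


def predict_majority(ensemble, row):
    """Return the class predicted by most ensemble members."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): the first feature separates the two classes.
X = [[0.1, 5], [0.2, 3], [0.3, 4], [0.7, 5], [0.8, 3], [0.9, 4]]
y = [0, 0, 0, 1, 1, 1]
ensemble = fit_bagged_stumps(X, y, n_trees=25, rng=random.Random(0))
print(predict_majority(ensemble, [0.05, 4]), predict_majority(ensemble, [0.95, 4]))
```

An odd number of ensemble members is used so that a binary majority vote can never tie.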
5. Discussion
While this study presents a novel approach to earthquake precursor detection using AE signals and deep learning, we acknowledge several limitations that warrant discussion. First, the 35-day monitoring period, though sufficient for demonstrating proof-of-concept, represents only a snapshot of seismic activity in the region. The two target events (ML = 3.2 and ML = 2.4) provided clear AE precursor patterns, but longer-term monitoring is essential to establish statistical robustness and account for potential variability in precursor signals across different seismic cycles. For instance, seasonal changes in groundwater levels or tectonic stress accumulation rates could influence AE characteristics, and these factors cannot be assessed within our short observation window.
Second, the limited number of seismic events in our dataset restricts the model’s ability to generalise across a broader magnitude range. While the deep learning algorithm achieved high accuracy (97.6%) for the recorded events, its performance for less frequent but potentially more destructive earthquakes (e.g., ML > 5.0) remains untested. Additionally, the absence of weaker events (0 < ML < 1.0) in our analysis leaves open questions about whether micro-seismic activity shares precursor features with larger earthquakes or represents a distinct phenomenon. Future studies should prioritise extended monitoring to capture these less frequent but critical scenarios.
Third, the observed AE precursors correlated with seismic events within a 100–200 km radius; however, the spatial effectiveness of precursors may scale with earthquake magnitude and crustal stress distribution, necessitating long-term multi-site monitoring.
To address these limitations, the research will adopt a phased implementation approach. In the short term, an expanded network of AE sensors will be strategically deployed across geologically diverse sites in eastern Guangdong to acquire comprehensive datasets under varying subsurface conditions. Subsequent long-term monitoring (3–5 years) will systematically examine the correlation between AE patterns and seismic events across the full magnitude spectrum, with particular attention to potential environmental confounders. The methodology’s application in other seismically active regions could further elucidate the differentiation between universal precursor characteristics and region-specific signatures. While the present findings represent an initial proof-of-concept, they provide a substantive framework for advancing the development of next-generation earthquake early-warning systems.