Enhancing Microseismic Signal Classification in Metal Mines Using Transformer-Based Deep Learning

Peng, Pingan; Lei, Ru; Wang, Jinmiao

doi:10.3390/su152014959


by Pingan Peng ¹, Ru Lei ¹,* and Jinmiao Wang ²

¹ School of Resources and Safety Engineering, Central South University, Changsha 410083, China

² School of Environment and Resources, Xiangtan University, Xiangtan 411105, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(20), 14959; https://doi.org/10.3390/su152014959

Submission received: 23 August 2023 / Revised: 5 October 2023 / Accepted: 13 October 2023 / Published: 17 October 2023

(This article belongs to the Special Issue Advances in Intelligent and Sustainable Mining)


Abstract:

As microseismic monitoring technology gains widespread application in mine risk pre-warning, the demand for automatic data processing has become increasingly evident. One crucial requirement that has emerged is the automatic classification of signals. To address this, we propose a Transformer-based method for signal classification, leveraging the global feature extraction capability of the Transformer model. Firstly, the original waveform data were framed, windowed, and feature-extracted to obtain a 16 × 16 feature matrix, serving as the primary input for the subsequent microseismic signal classification models. Then, we verified the classification performance of the Transformer model compared with five microseismic signal classification models, including VGG16, ResNet18, ResNet34, SVM, and KNN. The experimental results demonstrate the effectiveness of the Transformer model, which outperforms previous methods in terms of accuracy, precision, recall, and F1 score. In addition, a comprehensive analysis was performed to investigate the impact of the Transformer model’s parameters and feature importance on outcomes, which provides a valuable reference for further enhancing microseismic signal classification performance.

Keywords:

metal mines; microseismic monitoring; deep learning; transformer; automatic classification

1. Introduction

Due to the depletion of shallow surface mineral deposits, deep mining has become an inevitable trend in addressing the surging energy demand [1]. However, as mining activities extend into deeper layers, high temperatures and geological stresses pose more severe challenges to production safety. At depth, high geological stresses induce destructive ground movements, including rock bursts, roof falls, collapses, and coal and gas outbursts [2,3,4]. To closely monitor subtle structural changes within rock formations during deep underground excavation projects and detect internal indicators of large-scale rock activities preceding significant rock movements, microseismic monitoring technology has seen increased use in deep mining operations [5,6]. This technology captures various signals triggered by signal sources during the monitoring process. With the expansion of monitoring scope and increased data collection frequency, the data obtained through microseismic monitoring systems are rapidly growing. Traditional classification methods rely heavily on manual interventions, displaying inherent subjectivity, unstable classification accuracy, and inefficiency. Consequently, the automatic classification of microseismic signals has become a critical issue for mining microseismic monitoring technology [7].

Numerous researchers have made significant progress in classifying microseismic signals. Zhao et al. [8] introduced an innovative approach that uses the slope value of the starting trend line as a distinctive parameter, which deviates from the traditional use of the starting angle. Shang et al. [9] employed empirical mode decomposition (EMD) and singular value decomposition (SVD) to develop a classification approach. Building upon previous research, Yang et al. [10] integrated wavelet packet decomposition (WPD) with SVD to improve the precision of identifying microseismic signals in mining. In a different vein, He et al. [11] extracted features using Mel-frequency cepstral coefficients (MFCCs) and constructed a Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) to achieve automatic classification of mine microseismic signals. Meanwhile, Shang et al. [12] proposed an artificial neural network model based on principal component analysis (PCA), which demonstrated commendable classification efficacy for microseismic signals. Wei et al. [13] proposed a waveform image classification method grounded in PCA and support vector machines (SVMs), which improved the accuracy of identifying microseismic and blasting events. Bicego et al. [14] improved the performance of the K-Nearest Neighbors algorithm (KNN) through distance correction.

The machine learning-based methods mentioned above have achieved a satisfactory level of classification accuracy. However, processing massive amounts of data remains a challenge. Continuous advances in computing hardware, such as GPUs and TPUs, and ongoing research in neuroscience have provided the computational support and theoretical understanding that have greatly promoted deep learning technology [15]. Peng et al. [16] extracted multiple feature parameters from waveform data and employed the genetic algorithm (GA)-optimized correlation-based feature selection (CFS) method to eliminate redundant and irrelevant features. They subsequently utilized a convolutional neural network (CNN) model for the automatic classification of mine microseismic signals. Zhao et al. [17] introduced the VGG4-CNN deep learning neural network model, which demonstrated a high capability for noise filtering and rapid single-signal response times. Li et al. [18] proposed an ensemble model, which includes ResNet18, for image recognition of spectrograms of microseismic, noise, and electrical signals. Meanwhile, Tang et al. [19] presented an innovative network architecture called ResSCA, which combined a new deep spatial and channel attention (DSCA) module with improved residual connections and CNN. Zhao et al. [20] established a hybrid model fusing singular spectrum analysis (SSA), CNN, and long short-term memory (LSTM). Saad et al. [21] developed a fully automatic real-time amplitude estimation system based on the Vision Transformer network. This system was designed for detecting and estimating the magnitude of earthquakes in real time, resulting in noteworthy outcomes in feature extraction and magnitude determination.

Currently, most deep-learning-based microseismic signal classification methods use CNNs and variants. These models utilize fixed square convolutional kernels and prioritize local features, but their capacity to extract global features remains relatively limited. By comparison, the Transformer model [22], equipped with its multi-head attention mechanism, enables the capture of long-range dependencies. It has already demonstrated impressive performance in natural language processing (NLP) [22], image classification [23], object detection [24], and speech recognition [25]. However, its potential in microseismic monitoring remains unexplored. Therefore, this study aims to leverage the inherent comprehensive feature extraction capability of the Transformer model to classify three common signal classes: microseismic signals, blasting signals, and noise signals. The aim is to enhance classification accuracy and address challenges like manual classification, temporal variability, and subjective interpretation.

The method proposed here comprises three key components: data preprocessing, feature extraction, and the training of six microseismic signal classification models. During the data preprocessing phase, each original waveform is divided into 16 separate frames. Next, a total of 16 feature parameters are extracted from each frame in both the temporal and spectral domains. These include the zero-crossing rate, energy, spectral centroid, spectral roll-off, spectral bandwidth, and Mel-frequency cepstral coefficients (MFCCs) with coefficients from the second to the twelfth. These features serve as the input for the subsequent microseismic signal classification models. Finally, the trained models classify signals based on the extracted features.

Our contribution has several key aspects. Firstly, we present a novel deep learning approach based on the Transformer architecture for the automatic classification of microseismic signals in metal mines. Leveraging the Transformer’s global feature extraction capabilities, our method classifies microseismic signals, blasting signals, and noise signals with classification rates of 96.6%, 95.9%, and 95.8%, respectively. Secondly, our feature extraction process plays a crucial role in enhancing classification performance. We framed and windowed the waveform to extract 16 key features. These features were combined into a 16 × 16 feature matrix, which serves as the input to our Transformer model. This feature engineering approach contributes significantly to the model’s effectiveness across different signals. Moreover, we compared our Transformer model with established models such as VGG16 [26], ResNet18 [27], ResNet34 [27], Support Vector Machine (SVM) [28], and K-Nearest Neighbors (KNN) [29], and we demonstrated its superior performance in terms of accuracy, precision, recall, and F1 score. This highlights the effectiveness of our proposed approach compared to existing methods. In summary, our contributions include the introduction of a novel Transformer-based model to improve the classification accuracy of microseismic signals with a robust feature extraction process.

2. Engineering Background and Data Source

2.1. Engineering Background

Dongguashan Copper Mine is located in the Shizishan Ore Field, Tongling City, China, along the Yangtze River polymetallic metallogenic belt. The spatial relationship between the excavating system and the ore body is shown in Figure 1. The main ore body in the mining area is located on the axis of the anticline, with an elevation from −680 to −1000 m, and most of the area is located below −730 m. The ore body dips at 35° and has a length of 1820 m. The horizontal projection width is 204–882 m, and the thickness is 30–50 m. In addition, the mine microseismic monitoring system, manufactured by Changsha Digital Mine Co., Ltd. (Changsha, China), was installed near the test stope to monitor its stability. Based on the prevailing conditions on-site, seven microseismic signal sensors (six single-component geophones and one triple-component geophone) with a 10,000 Hz sampling frequency were distributed near the −730 m and −790 m sublevels of the test stope.

2.2. Data Source and Its Characteristics

In this study, we use a dataset of 2400 records collected via the microseismic monitoring system at Dongguashan Copper Mine from December 2017 to June 2018. Event types were meticulously annotated through on-site verification and double-checked by operators to ensure accurate correspondence between each data instance and its corresponding event type. Within these 2400 data instances, there were 800 instances each of microseismic signals, blasting signals, and noise signals. The training dataset consisted of 85% (2040 instances) of the data, while the remaining 15% (360 instances) formed the testing dataset. The original waveforms of three signal types—microseismic, blasting, and noise—are shown in Figure 2, and the distribution of maximum amplitudes and dominant frequency for these three signal types are shown in Figure 3.

Figure 2a and Figure 3 show that the microseismic signal typically had a single peak and showed rapid signal attenuation. Its maximum amplitude fell between that of the blasting signals and noise signals, with a broader distribution of dominant frequency. Figure 2b and Figure 3 show that the blasting signal had higher amplitudes than the microseismic signal. It exhibited slower signal attenuation and longer durations, resulting in a broader distribution of maximum amplitude than the microseismic signal. The noise signal, shown in Figure 2c and Figure 3, had lower maximum amplitudes and a lower dominant frequency. Nonetheless, the waveforms of the noise signals were, at times, similar to the microseismic signals, adding to the challenge of differentiation. Because of these similarities among the three signal classes, the direct classification of microseismic signals based solely on the two waveform feature parameters (maximum amplitude and dominant frequency) proved complicated. In addition, signal complexity is amplified by multiple reflections, refractions, and attenuations during transmission, which further complicates signal classification. Consequently, achieving accurate classification of microseismic signals requires a comprehensive consideration of their features in both the temporal and spectral domains.

3. Methods

3.1. Data Preprocessing

To effectively classify microseismic signals, this study applied a series of processes to the original waveform data, including frame segmentation and windowing. These processes transform non-stationary signals into short segments that are closer to stationary [30]. Each original waveform was divided into 16 frames of 380 sampling points each, with an inter-frame shift of 80 samples, so adjacent frames overlap by 300 samples. This overlap ensured continuity and smooth transitions between contiguous frames while mitigating the spectral leakage caused by abrupt changes at the frame boundaries. For the same purpose, the Hamming window was applied to each frame. The mathematical expression for windowing is given by:

S^{'} (n) = S (n) \times W (n)

(1)

W (n) = 0.54 - 0.46 \times \cos (\frac{2 \pi n}{N - 1}), \quad 0 \leq n \leq N - 1

(2)

where S′(n) represents the waveform after windowing, S(n) represents the waveform of each frame, W(n) represents the Hamming window, and N is the window length, i.e., the number of sampling points in each frame (N = 380 here, with 16 frames per waveform).
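The framing and windowing steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the window is evaluated over the 380 samples of one frame, and the minimum waveform length of 1580 samples (380 + 15 × 80) is an assumption that follows from the stated frame length and shift when no padding is used.

```python
import math

FRAME_LEN = 380   # samples per frame (from the text)
FRAME_SHIFT = 80  # shift between consecutive frame starts (from the text)
NUM_FRAMES = 16   # frames per waveform (from the text)


def hamming(length):
    """Hamming window W(n) = 0.54 - 0.46 * cos(2*pi*n / (length - 1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frame_and_window(signal):
    """Split a waveform into overlapping frames and window each frame."""
    # Assumption: no padding, so the waveform must cover all 16 frames.
    needed = FRAME_LEN + (NUM_FRAMES - 1) * FRAME_SHIFT  # 1580 samples
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(signal)}")
    window = hamming(FRAME_LEN)
    frames = []
    for k in range(NUM_FRAMES):
        start = k * FRAME_SHIFT
        frame = signal[start:start + FRAME_LEN]
        # Eq. (1): element-wise product of the frame and the window.
        frames.append([s * w for s, w in zip(frame, window)])
    return frames
```

Each returned frame is then the input to the per-frame feature extraction described in the next subsection.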

3.2. Feature Extraction

After frame segmentation and windowing, features in both the temporal and spectral domains were extracted from each frame. In this study, a total of 16 feature parameters were extracted, including the zero-crossing rate, energy, spectral centroid, spectral roll-off, spectral bandwidth, and Mel-frequency cepstral coefficients (MFCCs) with coefficients from the second to the twelfth.

The zero-crossing rate is used to determine the presence of microseismic signals, blasting signals, and noise signals within a frame, and it is calculated as follows:

ZCR = \frac{1}{T - 1} \sum_{t = 1}^{T - 1} \Pi \{ s_{t} s_{t - 1} < 0 \}

(3)

where ZCR represents the zero-crossing rate, T is the length of signal s, and the indicator function Π{A} takes the value of 1 if the logical value of A is true and 0 otherwise.

Energy is often used to indicate the strength of a signal and is calculated as follows:

E = \sum_{t = 1}^{T} {| s_{t} |}^{2}

(4)

where E represents the energy, and T is the length of signal s.
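As a small illustration of Equations (3) and (4), the two time-domain features can be computed for one frame as follows. The function names are hypothetical and the frame is assumed to be a plain list of samples indexed from 0 to T − 1.

```python
def zero_crossing_rate(s):
    """Fraction of adjacent sample pairs whose product is negative (Eq. 3)."""
    T = len(s)
    if T < 2:
        return 0.0
    crossings = sum(1 for t in range(1, T) if s[t] * s[t - 1] < 0)
    return crossings / (T - 1)


def energy(s):
    """Sum of squared sample magnitudes over the frame (Eq. 4)."""
    return sum(abs(x) ** 2 for x in s)
```

For example, the frame `[1.0, -1.0, 1.0, -1.0]` changes sign between every pair of neighbouring samples, so its zero-crossing rate is 1.0 and its energy is 4.0.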

Spectral centroid reflects the fundamental frequency of the major harmonics within a signal and is calculated as follows:

SC = \frac{\sum_{j = 0}^{T - 1} f (j) E (j)}{\sum_{j = 0}^{T - 1} E (j)}

(5)

where SC represents the spectral centroid, T is the signal length, f(j) indicates the frequency of the j-th point, and E(j) indicates the energy of the frequency f(j).

Spectral roll-off describes the energy distribution across a signal’s spectrum, indicating the frequency below which a certain percentage (typically 85% or 90%; in this study, we used 90%) of the total energy is contained. This frequency, also referred to as the spectral roll-off point, is calculated as follows:

\underset{f_{c} \in 1, \dots, T}{\arg \min} \sum_{i = 1}^{f_{c}} m_{i} \geq 0.9 \cdot \sum_{i = 1}^{T} m_{i}

(6)

where f_c represents the spectral roll-off frequency, m_i represents the amplitude of the i-th frequency component in the spectrum, and T is the signal length.

Spectral bandwidth denotes the frequency range occupied by a signal. Generally, a broader bandwidth indicates a more complex frequency composition, whereas a narrower bandwidth suggests simpler frequency components. The calculation formula is as follows:

B_{p} = {(\sum_{f = 0}^{F_{s} / 2} E_{f} {| f - SC |}^{p})}^{1 / p}

(7)

where B_p represents the spectral bandwidth, F_s is the sampling frequency (the data used in this study had a sampling frequency of 10 kHz), E_f denotes the energy at frequency f, SC denotes the spectral centroid, and p is the order.
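The three spectral features can be sketched from a precomputed spectrum as shown below. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, `freqs` and `energies` are assumed to be equal-length lists of bin frequencies and non-negative bin energies with a non-zero total, and the bandwidth uses the common order-p form with |f − SC| raised to the power p inside the sum. The default p = 2 is also an assumption, since the text does not state the order used.

```python
def spectral_centroid(freqs, energies):
    """Energy-weighted mean frequency of the spectrum."""
    total = sum(energies)  # assumed non-zero
    return sum(f * e for f, e in zip(freqs, energies)) / total


def spectral_rolloff(freqs, magnitudes, fraction=0.9):
    """Lowest frequency at which the cumulative sum reaches `fraction` of the total."""
    threshold = fraction * sum(magnitudes)
    running = 0.0
    for f, m in zip(freqs, magnitudes):
        running += m
        if running >= threshold:
            return f
    return freqs[-1]


def spectral_bandwidth(freqs, energies, p=2):
    """Order-p spread of the spectrum around its centroid."""
    sc = spectral_centroid(freqs, energies)
    spread = sum(e * abs(f - sc) ** p for f, e in zip(freqs, energies))
    return spread ** (1.0 / p)
```

With a flat four-bin spectrum at 0, 100, 200, and 300 Hz, the centroid is 150 Hz and the 90% roll-off point is the last bin, because the first three bins hold only 75% of the total.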

Mel-frequency cepstral coefficients (MFCCs) are among the most widely used feature parameters for signal classification; taking the logarithm of the spectral energy reduces the dynamic range and increases parameter stability. To balance computational complexity and storage requirements, coefficients 2 to 12 of the MFCCs were selected. These coefficients encapsulate the essential signal information while minimizing redundancy. The calculation formula is as follows:

MFCC (n) = \sum_{m = 0}^{M - 1} [\ln (\sum_{i = 1}^{N - 1} {|X (i)|}^{2} H_{m} (i)) \cdot \cos (\frac{π n (m - 0.5)}{M})]

(8)

where MFCC(n) represents the n-th Mel-frequency cepstral coefficient, n = 2,…,12, X represents the signal spectrum, H_m is the m-th filter in the Mel filterbank (with 0 ≤ m < M), and N is the number of points in the spectrum X.

After frame segmentation, windowing, and feature extraction, each original waveform data instance was transformed into a 16 × 16 feature matrix. This matrix served as the initial input to the Transformer model.

3.3. Constructing the Transformer Model

The Transformer model has reshaped sequential data processing and set new benchmarks for performance and adaptability. It forms the core of the automatic classification method proposed in this study. The typical network structure of the Transformer model is mainly composed of an encoder module and a decoder module (Figure 4). The encoder module extracts comprehensive representations from the input sequence, while the decoder module generates the output sequence used for the final classification.

3.3.1. Encoder Module

The encoder module plays a pivotal role in the processing of input sequences. It uses multi-head attention mechanisms that allow it to capture intricate relationships and dependencies within the data. In multi-head attention, the self-attention mechanism is applied multiple times in parallel with different learned linear projections of the query, key, and value matrices. These multiple attention heads operate in parallel, and their outputs are concatenated and linearly projected to generate the final output of the attention layer. The multi-head attention mechanism can be expressed as follows:

MultiHead (Q, K, V) = Concat ({head}_{1}, {head}_{2}, \dots, {head}_{h}) W_{0}

(9)

{head}_{i} = Attention (Q_{i}, K_{i}, V_{i})

(10)

Attention (Q, K, V) = Softmax (\frac{Q K^{T}}{\sqrt{d_{k}}}) V

(11)

where Q, K, and V represent the query, key, and value matrices, respectively, d_k denotes the dimension of the key vectors, and W₀ denotes a learned weight. Moreover, positional encodings are injected to preserve sequence order, ensuring that the model can recognize the sequential nature of the data. Layer normalization techniques further stabilize the training process by normalizing intermediate representations. Additionally, the feedforward neural network (FFN) operates on each position’s encoded information independently and identically. It consists of two linear transformations, separated by a nonlinear activation function:

FFN (x) = f (x W_{1} + b_{1}) W_{2} + b_{2}

(12)

where W₁, b₁, W₂, and b₂ are learned weights and biases, and f denotes the nonlinear activation function, for which we used GELU. Together, these components transform the raw input sequence into rich contextual representations.
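To make Equation (11) concrete, the following sketch computes scaled dot-product attention for a single head using plain Python lists. It is an illustration only: the helper names are hypothetical, the learned projections and the concatenation step of Equation (9) are omitted, and a practical model would use a tensor library such as PyTorch instead.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])                        # dimension of the key vectors
    k_t = [list(col) for col in zip(*k)]   # transpose of K
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights
```

A multi-head layer would apply this function h times to differently projected copies of Q, K, and V and concatenate the results before the final projection.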

3.3.2. Decoder Module

The decoder module has a similar structure to the encoder module, with an additional focus on generating the output sequence progressively. It uses masked multi-head attention, a modification of multi-head attention that prevents each position from attending to subsequent positions in the sequence. This masking ensures that the model’s prediction for each position is based only on the tokens that come before it. Then, after passing through the fully connected layer and the softmax layer, the output is obtained.
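The masking described above can be illustrated with a small sketch. The function names are hypothetical, and the mask is assumed to be the usual lower-triangular (causal) pattern, in which position i may attend to itself and to earlier positions only.

```python
import math


def causal_mask(n):
    """mask[i][j] is True where position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out positions."""
    weights = []
    for row, allowed in zip(scores, mask):
        kept = [s for s, ok in zip(row, allowed) if ok]
        m = max(kept)  # each row keeps at least its diagonal entry
        exps = [math.exp(s - m) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

With equal scores everywhere, the first position places all of its weight on itself, the second splits its weight evenly over the first two positions, and so on.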

4. Results

The model was implemented in PyTorch 1.13.1 and trained on an RTX 3070 GPU. We used the Adam optimizer [31], which trains the neural network by adaptively adjusting the learning rate. The cross-entropy loss function was used to quantify the difference between the predicted outcomes and the actual values, and it is expressed as follows:

L = \frac{1}{N} \sum_{i} L_{i} = - \frac{1}{N} \sum_{i} \sum_{c = 1}^{M} y_{ic} \log (p_{ic})

(13)

where N represents the batch size, M is the number of classes, y_ic is an indicator that takes the value of 1 when the actual class of sample i equals c and 0 otherwise, and p_ic denotes the predicted probability of sample i belonging to class c.
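Equation (13) can be evaluated by hand for a small batch, as in the sketch below. The function name and the example probabilities are hypothetical. Because y_ic is 1 only for the true class, the inner sum reduces to the log-probability assigned to that class.

```python
import math


def cross_entropy(probs, labels):
    """Mean cross-entropy over a batch (Eq. 13).

    probs[i][c] is the predicted probability of class c for sample i;
    labels[i] is the index of the true class of sample i.
    """
    n = len(labels)
    return -sum(math.log(probs[i][labels[i]]) for i in range(n)) / n
```

For two samples with true-class probabilities 0.5 and 0.8, the loss is (ln 2 + ln 1.25) / 2, and it falls to zero as both probabilities approach 1.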

Additionally, we used the confusion matrix, accuracy, precision, recall, and F1 score as the metrics to evaluate multi-class classification performance. The confusion matrix reflects the misclassification between different classes: the diagonal elements represent the proportion (or number) of correctly classified instances, while the off-diagonal elements represent the misclassified instances. Accuracy denotes the ratio of correctly classified samples to the total number of samples. Precision gauges the ratio of true positive predictions to all predicted positive instances. Recall, also known as sensitivity, quantifies the proportion of actual positive instances that were correctly predicted. The F1 score is the harmonic mean of precision and recall and therefore balances the two. Accuracy, precision, recall, and F1 score are calculated as follows:

Accuracy = \frac{TP + TN}{TP + TN + FN + FP}

(14)

Precision = \frac{TP}{TP + FP}

(15)

Recall = \frac{TP}{TP + FN}

(16)

F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}

(17)

where TP represents the count of true positive instances, TN denotes the count of true negative instances, FP denotes the count of false positive instances, and FN denotes the count of false negative instances.
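For a multi-class problem, these quantities are usually computed per class in a one-vs-rest fashion from the confusion matrix. The sketch below shows one way to do this; the function names are hypothetical, and overall accuracy is taken as the sum of the diagonal divided by the total count, which is the multi-class counterpart of Equation (14).

```python
def per_class_metrics(cm):
    """Precision, recall, and F1 for each class of a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        results.append((precision, recall, f1))
    return results


def accuracy(cm):
    """Correctly classified samples divided by all samples."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total
```

As a made-up example (not data from this study), the matrix `[[8, 2, 0], [1, 9, 0], [1, 0, 9]]` gives precision, recall, and F1 of 0.8 for the first class and an overall accuracy of 26/30.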

The Transformer model’s classification results on the test dataset are shown in Table 1. The model performs well in terms of precision, recall, and F1 score on all three types of signals. Its classification rates for microseismic, blasting, and noise signals reached 96.6%, 95.9%, and 95.8%, respectively, which indicates that the Transformer model can make accurate predictions. The F1 score for each signal type is close to the corresponding precision and recall, indicating that the model strikes a balance between the two, with a good trade-off between correctly classifying positive instances and minimizing false positives and false negatives.

The confusion matrix of the Transformer model is shown in Figure 5. As we can see in this figure, out of 117 microseismic signals, 113 were correctly classified. Among the four misclassified microseismic signals, three were misclassified as blasting signals, while the remaining one was classified as a noise signal. Of the 123 blasting signals, 118 were correctly classified, with 5 misclassified instances. Among the 120 noise signals, 115 were correctly identified, while 5 were misclassified. These results show the high classification accuracy of the Transformer model for mining microseismic signals and its comparable performance in distinguishing the other two signal types. This demonstrates the effectiveness of the proposed Transformer-based automatic classification method for mining microseismic signals, blasting signals, and noise signals in reducing the misclassification rates.
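The counts quoted above are enough to check the reported rates. The short script below uses only the per-class totals and correct counts stated in the text; the variable names are illustrative.

```python
# (correct, total) per class for the 360 test samples, as quoted in the text.
reported = {
    "microseismic": (113, 117),
    "blasting": (118, 123),
    "noise": (115, 120),
}

# Per-class classification rate: correct / total for that class.
rates = {name: correct / total for name, (correct, total) in reported.items()}

# Overall accuracy: all correct predictions over all test samples.
overall = (sum(c for c, _ in reported.values())
           / sum(t for _, t in reported.values()))

for name, rate in rates.items():
    print(f"{name}: {rate:.1%}")   # 96.6%, 95.9%, 95.8%
print(f"overall: {overall:.1%}")   # 96.1%
```

The per-class values match the 96.6%, 95.9%, and 95.8% reported for the three signal types, and 346 correct predictions out of 360 give the overall accuracy of 96.1%.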

5. Discussion

In the previous section, we presented the results of the Transformer model experiment, highlighting the performance metrics achieved. Building upon these findings, we now turn to the selection of the model parameters, followed by an exploration of feature importance. Subsequently, we compare methods on the Dongguashan Copper Mine dataset and the Tongkuangyu Copper Mine dataset to provide a comprehensive understanding of the study’s implications.

5.1. Selection of the Transformer Model Parameters

With these results in mind, we examined the parameters of the Transformer model in detail. The learning rate directly affects the model’s training speed, and the batch size is a crucial factor in determining the model’s generalization ability. To assess the effects of different learning rates and batch sizes on the Transformer model, we tested learning rates ranging from 4 × 10⁻⁵ to 1 × 10⁻⁴ and batch sizes of 4, 8, 16, 32, and 64. The accuracy comparison of the Transformer model under different parameters is shown in Table 2. For the same batch size, the accuracy of the Transformer model tends to increase and then decrease as the learning rate increases. A learning rate of 7 × 10⁻⁵ achieves the maximum accuracy at batch sizes of 4, 8, 32, and 64, with accuracies of 95.8%, 96.1%, 95.5%, and 95.6%, respectively. Only at a batch size of 16 is its accuracy slightly lower than that obtained with a learning rate of 4 × 10⁻⁵. Consequently, a learning rate of 7 × 10⁻⁵ offers the best balance of accuracy and stability. Among the batch sizes at which this learning rate achieves maximum accuracy, the highest accuracy of 96.1% is obtained with a batch size of 8. Therefore, we selected a learning rate of 7 × 10⁻⁵ and a batch size of 8 as the optimal parameters for the Transformer model.

5.2. Exploration of Feature Importance

Building on these results, we explored feature importance, which quantifies each input feature’s contribution to the Transformer model’s predictive outcomes. With 16 features as initial inputs, the task was to ascertain the relative importance of each feature during the Transformer model training process. We used the Max-Relevance and Min-Redundancy (MRMR) algorithm [32], which identifies the features most relevant to the final output while minimizing the redundancy between features. It was employed to quantify the importance of each of the 16 features, which was calculated as follows:

\max D (S, c) = \frac{1}{|S|} \sum_{X_{i} \in S} I (X_{i}; c)

(18)

I (X_{i}; c) = \iint p (X_{i}, c) \log \frac{p (X_{i}, c)}{p (X_{i}) p (c)} d X_{i} d c

(19)

where S represents the set of features, X_i corresponds to the i-th feature, c denotes the class variable, and I(X_i; c) denotes the mutual information between feature X_i and class variable c.
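The relevance term in Equation (18) can be illustrated with a discrete plug-in estimate of the mutual information. This sketch makes assumptions that are not in the text: the feature values are taken to be discretized (for example, binned) so that probabilities can be estimated by counting, whereas Equation (19) is written for continuous variables, and the function names are hypothetical. Only the relevance part of MRMR is shown.

```python
import math
from collections import Counter


def mutual_information(xs, cs):
    """Plug-in estimate of I(X; c) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, cs))  # counts of (feature value, class) pairs
    px = Counter(xs)              # counts of feature values
    pc = Counter(cs)              # counts of class labels
    mi = 0.0
    for (x, c), count in joint.items():
        p_xc = count / n
        mi += p_xc * math.log(p_xc / ((px[x] / n) * (pc[c] / n)))
    return mi


def relevance(features, cs):
    """Mean mutual information D(S, c) between each feature in S and the class."""
    return sum(mutual_information(xs, cs) for xs in features) / len(features)
```

A feature that determines a balanced binary class exactly has a mutual information of ln 2 with it, while a feature that is independent of the class scores zero.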

Using the MRMR algorithm, we obtained the importance scores of the 16 features, which were subsequently normalized, as shown in Figure 6. Energy had the largest share, at 36.5%, followed by the spectral bandwidth and coefficient 9 of the MFCCs. The cumulative importance of these three features was 68.1%, which indicates that energy, spectral bandwidth, and coefficient 9 of the MFCCs are the most important features for classifying microseismic signals, blasting signals, and noise signals. These features exhibited a pronounced correlation with the final output results, whereas the relevance of the other features was relatively low.

5.3. Comparison of Methods Using the Dongguashan Copper Mine Dataset

In this section, we verify the classification performance of the Transformer model compared with five commonly used classification models, including VGG16, ResNet18, ResNet34, SVM, and KNN. VGG is a neural network model based on CNN architecture, where VGG16 comprises a total of 16 weight layers (13 convolutional layers and 3 fully connected layers). In reference [17], the VGG model achieved a recognition rate of 94% for mine microseismic signals. ResNet is also a CNN-based model that employs residual connections to allow initial information to bypass certain network layers and propagate directly through deeper layers; this approach prevents information distortion with increasing depth. In reference [18], ResNet was used to predict mine microseismic signals, and the average precision reached 96%. SVM achieves classification by finding a hyperplane in the feature space that maximizes the margin between classes. In reference [9], SVM was applied to classify microseismic and blasting signals and achieved an accuracy rate of 93%. KNN assigns classes to unknown data by calculating the distances between the unknown data and known data points. In reference [14], KNN was employed to predict volcanic earthquake signals with an accuracy rate of 78%.

The optimal hyperparameter combinations for the six microseismic signal classification models are shown in Table 3. The four deep learning models (Transformer, VGG16, ResNet18, and ResNet34) shared identical initial inputs of 16 × 16 feature matrices. Because SVM and KNN cannot accommodate multidimensional inputs directly, an adjustment was necessary: we flattened each 16 × 16 feature matrix into a one-dimensional vector of dimensions 1 × 256. The flattened vector was subsequently used as the initial input for the SVM and KNN classification models.
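The flattening step can be sketched as follows. The row-by-row ordering is an assumption, since the text does not state the order in which the matrix entries are concatenated, and the matrix contents here are placeholders rather than real features.

```python
def flatten(matrix):
    """Concatenate the rows of a 2-D feature matrix into one flat vector."""
    return [value for row in matrix for value in row]


# Placeholder 16 x 16 matrix; real entries would be the extracted features.
feature_matrix = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
vector = flatten(feature_matrix)  # 256 values, one row after another
```

Any fixed ordering works for SVM and KNN as long as the same ordering is used for every sample.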

To validate the Transformer model, we trained the six microseismic signal classification models under their optimal hyperparameters. The overall and per-signal classification performance of the six models is shown in Table 4 and Figure 7a, and Figure 8 shows their confusion matrices. It can be observed from Figure 7a and Figure 8 that, among the evaluated models, the Transformer model had the highest classification rate for microseismic signals (96.6%) and blasting signals (95.9%), while its rate for noise signals (95.8%) was slightly lower than the corresponding rates of VGG16 (96.3%) and ResNet18 (96.3%). Ranked in descending order by total classification rate, the models are Transformer (96.1%), VGG16 (94.4%), ResNet34 (93.8%), SVM (93.7%), ResNet18 (93.4%), and KNN (92.9%).

Furthermore, the evaluation indexes of the classification results of six microseismic signal classification models are shown in Table 5 and Figure 7b. It can be seen that the Transformer model achieves 96.1% in four evaluation indexes for accuracy, precision, recall, and F1 score. Among the six microseismic signal classification models, it is shown that the Transformer model consistently emerges as the leader across all evaluation indicators, followed by VGG16, which achieved 94.4% in each evaluation index. When comparing the classification methods proposed by the aforementioned researchers, our Transformer-based approach, when applied to the Dongguashan Copper Mine dataset, achieved results equal to or better than other methods. These results underscore the effectiveness of our Transformer-based automatic microseismic signal classification method.

5.4. Comparison of Methods Using the Tongkuangyu Copper Mine Dataset

In the previous section, we compared six microseismic signal classification models using the Dongguashan Copper Mine dataset, and the Transformer model yielded favorable results in terms of accuracy, precision, recall, and F1 score. Here, we further compare the six models using the Tongkuangyu Copper Mine dataset. Tongkuangyu Copper Mine is located in Yuanqu County, China, surrounded by mountains on all sides and situated beneath the Yuangulu Peak at the northern end of the Zhongtiao Mountains. Its geographical coordinates are 111°40′ east longitude and 35°19′ north latitude. The highest elevation in the mining area is 1699.2 m, with an erosion reference elevation of 620 m and a relative topographic difference of 300–600 m. The microseismic signal sensors operate at a sampling frequency of 10,000 Hz.

In this section, we used the same model hyperparameters as in the previous section (Table 3), and the data preprocessing procedures were unchanged. The classification performance of the six microseismic signal classification models for the three signal classes (microseismic, blasting, and noise) is shown in Table 6 and Figure 9a, and Figure 10 shows the confusion matrices of the six models. Figure 9a and Figure 10 show that, compared with the previous section, the recognition rates of microseismic, blasting, and noise signals generally improved, to varying degrees, on the Tongkuangyu Copper Mine dataset. Among the evaluated models, the Transformer model still had the highest classification rate for microseismic signals (99.1%), while its classification rate for blasting signals (95.2%) was slightly lower than that of ResNet18 (96.4%). Ranked in descending order of total classification rate, the models are Transformer (97.6%), ResNet34 (97.0%), VGG16 (97.0%), ResNet18 (96.9%), SVM (95.2%), and KNN (94.8%). Moreover, Table 7 and Figure 9b show that the Transformer model again had the highest accuracy, precision, recall, and F1 score, all of which were around 97.5%. Compared with the classification methods proposed by the aforementioned researchers, our Transformer-based approach, when applied to the Tongkuangyu Copper Mine dataset, achieved results superior to those of the other methods. These results emphasize the effectiveness of the Transformer-based automatic microseismic signal classification method for mines. The method extracts underlying features from the initial input and classifies signals based on the differences and similarities among these features. It accurately classifies microseismic signals from monitoring data in complex underground environments, making it a practical solution for field applications.

6. Conclusions

In this study, we present a Transformer-based method for automatic microseismic signal classification that takes advantage of the global feature extraction capability of the Transformer model. After framing and windowing the waveform, we extracted 16 features: zero-crossing rate, energy, spectral centroid, spectral roll-off, spectral bandwidth, and the second to twelfth Mel-frequency cepstral coefficients (MFCCs). These features formed a 16 × 16 feature matrix that served as the input for the Transformer model. With the trained Transformer model, the classification accuracy for microseismic signals, blasting signals, and noise signals reached 96.6%, 95.9%, and 95.8%, respectively. In addition, compared with VGG16, ResNet18, ResNet34, SVM, and KNN, the Transformer model showed superior accuracy, precision, recall, and F1 score on both the Dongguashan Copper Mine dataset and the Tongkuangyu Copper Mine dataset. Furthermore, we analyzed the influence of the Transformer model parameters and the importance of the individual features on the results, which provides a reference for further improving microseismic signal classification performance. These findings demonstrate that classifying microseismic signals with a deep learning approach is effective and simple, and that the approach can be applied to mines other than those investigated in this study.
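
To make the framing, windowing, and feature extraction steps more concrete, the following NumPy sketch computes the five non-MFCC features per frame. It is an illustration under stated assumptions, not the authors' implementation: the function name `frame_features`, the use of 16 non-overlapping frames, the Hamming window, and the 85% roll-off threshold are all choices made for this sketch, and the eleven MFCC columns are omitted.

```python
import numpy as np

def frame_features(signal, fs=10_000, n_frames=16):
    """Return an (n_frames, 5) matrix of per-frame features.

    Columns: zero-crossing rate, energy, spectral centroid (Hz),
    spectral bandwidth (Hz), spectral roll-off at 85% (Hz).
    """
    signal = np.asarray(signal, dtype=float)
    frame_len = len(signal) // n_frames
    frames = signal[: frame_len * n_frames].reshape(n_frames, frame_len)

    # Zero-crossing rate is taken on the raw frame, before windowing.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

    windowed = frames * np.hamming(frame_len)
    energy = np.sum(windowed ** 2, axis=1)

    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    total = power.sum(axis=1) + 1e-12          # guard against silent frames
    centroid = (power * freqs).sum(axis=1) / total
    spread = (power * (freqs - centroid[:, None]) ** 2).sum(axis=1) / total
    bandwidth = np.sqrt(spread)

    # Roll-off: lowest frequency below which 85% of the spectral power lies.
    cumulative = np.cumsum(power, axis=1)
    rolloff_idx = np.argmax(cumulative >= 0.85 * total[:, None], axis=1)
    rolloff = freqs[rolloff_idx]

    return np.column_stack([zcr, energy, centroid, bandwidth, rolloff])

# Synthetic test tone (not mine data): a 500 Hz sine sampled at 10 kHz.
t = np.arange(4096) / 10_000
features = frame_features(np.sin(2 * np.pi * 500 * t))
```

Appending the eleven MFCC columns to each row would give the 16 features per frame described above, so that 16 frames yield a 16 × 16 matrix.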

Author Contributions

Conceptualization, P.P., R.L. and J.W.; data curation, P.P., R.L. and J.W.; formal analysis, P.P., R.L. and J.W.; methodology, P.P., R.L. and J.W.; software, P.P., R.L. and J.W.; writing—original draft, P.P., R.L. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 52104170, the Science and Technology Innovation Program of Hunan Province under Grant 2023RC3069, and the National Key Research and Development Program of China under Grant 2022YFC2904105.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

Gu, D.; Zhou, K. Development Theme of the Modern Metal Mining. Met. Mine 2012, 47, 1–8.
Cai, M.; Li, P.; Tan, W.; Ren, F. Key Engineering Technologies to Achieve Green, Intelligent, and Sustainable Development of Deep Metal Mines in China. Engineering 2021, 7, 1513–1517.
Wang, J.; Xue, Y.; Xiao, J.; Shi, D. Diffusion Characteristics of Airflow and CO in the Dead-End Tunnel with Different Ventilation Parameters after Tunneling Blasting. ACS Omega 2023, 39, 36269–36283.
Xie, C.; Chen, Z.; Xiong, G.; Yang, B.; Shen, J. Study on the Evolutionary Mechanisms Driving Deformation Damage of Dry Tailing Stack Earth–Rock Dam under Short-Term Extreme Rainfall Conditions. Nat. Hazards 2023.
Cai, M. Prediction and Prevention of Rockburst in Metal Mines—A Case Study of Sanshandao Gold Mine. J. Rock Mech. Geotech. Eng. 2016, 8, 204–211.
Liu, D.; Li, X.; Li, X.; Shang, X. K-means clustering method based on LOF and its application in microseismic monitoring. J. Saf. Sci. Technol. 2019, 15, 81–87.
Li, W.; Chen, Z.; Tang, C.; Su, Y.; Tang, L.; Hu, J.; Tao, L. Application of microseismic shape recognition based on signal time-frequency characteristics in rockburst prediction. J. China Inst. Water Resour. Hydropower Res. 2022, 20, 156–531;+556.
Zhao, G.; Ma, J.; Dong, L.; Li, X.; Chen, G.; Zhang, C. Classification of Mine Blasts and Microseismic Events Using Starting-up Features in Seismograms. Trans. Nonferrous Met. Soc. China 2015, 25, 3410–3420.
Shang, X.; Li, X.; Peng, K.; Dong, L.; Wang, Z. Feature extraction and classification of mine microseism and blast based on EMD-SVD. Chin. J. Geotech. Eng. 2016, 38, 1849–1858.
Yang, C.; Wu, J. Feature Extraction and Classification of Mine Microseismic Signal Based on WPD-SVD. Min. Saf. Environ. Prot. 2018, 45, 37–41.
He, Z.; Peng, P.; Liao, Z. An automatic identification and classification method of complex microseismic signals in mines based on Mel-frequency cepstral coefficients. J. Saf. Sci. Technol. 2018, 14, 41–47.
Shang, X.; Li, X.; Morales-Esteban, A.; Chen, G. Improving Microseismic Event and Quarry Blast Classification Using Artificial Neural Networks Based on Principal Component Analysis. Soil Dyn. Earthq. Eng. 2017, 99, 142–149.
Wei, H.; Shu, W.; Dong, L.; Huang, Z.; Sun, D. A Waveform Image Method for Discriminating Micro-Seismic Events and Blasts in Underground Mines. Sensors 2020, 20, 4322.
Bicego, M.; Rossetto, A.; Olivieri, M.; Londono-Bonilla, J.M.; Orozco-Alzate, M. Advanced KNN Approaches for Explainable Seismic-Volcanic Signal Classification. Math. Geosci. 2023, 55, 59–80.
Plebe, A.; Grasso, G. The Unbearable Shallow Understanding of Deep Learning. Minds Mach. 2019, 29, 515–553.
Peng, P.; He, Z.; Wang, L.; Jiang, Y. Automatic Classification of Microseismic Records in Underground Mining: A Deep Learning Approach. IEEE Access 2020, 8, 17863–17876.
Zhao, H.; Liu, R.; Liu, Y.; Zhang, Y.; Gu, T. Research on classification and identification of mine microseismic signals based on deep learning method. J. Min. Sci. Technol. 2022, 7, 166–174.
Li, J.; Tang, S.; Li, K.; Zhang, S.; Tang, L.; Cao, L.; Ji, F. Automatic Recognition and Classification of Microseismic Waveforms Based on Computer Vision. Tunn. Undergr. Space Technol. 2022, 121, 104327.
Tang, S.; Wang, J.; Tang, C. Identification of Microseismic Events in Rock Engineering by a Convolutional Neural Network Combined with an Attention Mechanism. Rock Mech. Rock Eng. 2021, 54, 47–69.
Zhao, Y.; Xu, H.; Yang, T.; Wang, S.; Sun, D. A Hybrid Recognition Model of Microseismic Signals for Underground Mining Based on CNN and LSTM Networks. Geomat. Nat. Hazards Risk 2021, 12, 2803–2834.
Saad, O.M.; Chen, Y.; Savvaidis, A.; Fomel, S.; Chen, Y. Real-Time Earthquake Detection and Magnitude Estimation Using Vision Transformer. J. Geophys. Res. Solid Earth 2022, 127, e2021JB023657.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 3058.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
Zhu, Y.; Xia, Q.; Jin, W. SRDD: A Lightweight End-to-End Object Detection with Transformer. Connect. Sci. 2022, 34, 2448–2465.
Dong, L.; Xu, S.; Xu, B. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 5884–5888.
Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Suykens, J.A.K.; Vandewalle, J. Least Squares Support Vector Machine Classifiers. Neural Process. Lett. 1999, 9, 293–300.
Keller, J.M.; Gray, M.R.; Givens, J.A. A Fuzzy K-Nearest Neighbor Algorithm. IEEE Trans. Syst. Man Cybern. 1985, 4, 580–585.
Paliwal, K.; Wojcicki, K. Effect of Analysis Window Duration on Speech Intelligibility. IEEE Signal Process. Lett. 2008, 15, 785–788.
Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Peng, H.; Long, F.; Ding, C. Feature Selection Based on Mutual Information Criteria of Max-Dependency, Max-Relevance, and Min-Redundancy. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1226–1238.

Figure 1. Spatial relationship between the excavating system and the ore body at Dongguashan Copper Mine. The green lines denote shaft wells, the brown blocks represent the ore body, and the purple lines denote tunnels.

Figure 2. Original waveforms of the three types of signals. (a) Waveform of microseismic signal. (b) Waveform of blasting signal. (c) Waveform of noise signal. The sampling frequency of the sensors was 10 kHz, and the event time is in the format of year–month–day–hour–minute–second–millisecond. The horizontal axis indicates the sampling points, and the vertical axis indicates the amplitude magnitude.

Figure 3. The distribution of the maximum amplitude and dominant frequency of the three types of signals. (a) Maximum amplitude distribution. (b) Dominant frequency distribution.

Figure 4. Structure of the Transformer model.

Figure 5. Confusion matrix of the Transformer model for three types of signals. The horizontal axis indicates the true class, and the vertical axis indicates the predicted class. The diagonal elements represent the number of correctly classified instances, while the off-diagonal elements represent the incorrectly classified instances.

Figure 6. Normalized feature importance distribution. The horizontal axis indicates the 16 features, and the vertical axis indicates the feature importance.

Figure 7. Comparison of signal classification precision and evaluation indexes among six microseismic signal classification models using the Dongguashan Copper Mine dataset. (a) Comparison of signal classification precision among six microseismic signal classification models. (b) Comparison of evaluation indexes among six microseismic signal classification models.

Figure 8. Confusion matrices of classification results among six microseismic signal classification models using the Dongguashan Copper Mine dataset. (a) Transformer. (b) VGG16. (c) ResNet34. (d) ResNet18. (e) SVM. (f) KNN. The horizontal axis indicates the true class, and the vertical axis indicates the predicted class. The diagonal elements represent the proportion of correctly classified instances, while the off-diagonal elements represent the proportion of incorrectly classified instances.

Figure 9. Comparison of signal classification precision and evaluation indexes among six microseismic signal classification models using the Tongkuangyu Copper Mine dataset. (a) Comparison of signal classification precision among six microseismic signal classification models. (b) Comparison of evaluation indexes among six microseismic signal classification models.

Figure 10. Confusion matrices of classification results among six microseismic signal classification models using the Tongkuangyu Copper Mine dataset. (a) Transformer. (b) VGG16. (c) ResNet34. (d) ResNet18. (e) SVM. (f) KNN. The horizontal axis indicates the true class, and the vertical axis indicates the predicted class. The diagonal elements represent the proportion of correctly classified instances, while the off-diagonal elements represent the proportion of incorrectly classified instances.

Table 1. Classification results of the Transformer model.

Signal Class	Precision	Recall	F1 Score
Microseismic	0.966	0.942	0.954
Blasting	0.959	0.983	0.971
Noise	0.958	0.958	0.958
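
As a quick consistency check on Table 1, the F1 score of each class can be recomputed from the reported precision and recall as their harmonic mean. The short sketch below does this in plain Python; the dictionary and function names are chosen for illustration only.

```python
# Precision and recall as reported in Table 1.
table1 = {
    "Microseismic": (0.966, 0.942),
    "Blasting": (0.959, 0.983),
    "Noise": (0.958, 0.958),
}

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Round to three decimals to match the precision used in the table.
f1_by_class = {name: round(f1_score(p, r), 3) for name, (p, r) in table1.items()}
```

The recomputed values (0.954, 0.971, and 0.958) agree with the F1 column of the table.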

Table 2. Accuracy comparison of Transformer model under different parameters.

Learning Rate / Patch Size	4	8	16	32	64
4 × 10⁻⁵	0.947	0.958	0.957	0.943	0.919
6 × 10⁻⁵	0.958	0.942	0.949	0.943	0.925
7 × 10⁻⁵	0.958	0.961	0.949	0.955	0.956
8 × 10⁻⁵	0.950	0.939	0.949	0.952	0.947
1 × 10⁻⁴	0.950	0.947	0.949	0.952	0.928
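
The best cell of Table 2 can be located programmatically. The sketch below is plain Python with the values transcribed from the table; the variable names are illustrative, and the column values are kept as labeled in the table header.

```python
# Accuracy values transcribed from Table 2 (rows: learning rate, columns as labeled there).
column_values = [4, 8, 16, 32, 64]
accuracy_by_lr = {
    4e-5: [0.947, 0.958, 0.957, 0.943, 0.919],
    6e-5: [0.958, 0.942, 0.949, 0.943, 0.925],
    7e-5: [0.958, 0.961, 0.949, 0.955, 0.956],
    8e-5: [0.950, 0.939, 0.949, 0.952, 0.947],
    1e-4: [0.950, 0.947, 0.949, 0.952, 0.928],
}

# Tuples compare by their first element, so max() returns the highest accuracy
# together with the learning rate and column value that produced it.
best_accuracy, best_lr, best_column = max(
    (acc, lr, col)
    for lr, row in accuracy_by_lr.items()
    for acc, col in zip(row, column_values)
)
```

The maximum, 0.961, occurs at a learning rate of 7 × 10⁻⁵ and a column value of 8, which is consistent with the learning rate listed for the Transformer in Table 3.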

Table 3. Optimal hyperparameter combinations for six microseismic signal classification models.

Model	Optimal Hyperparameter Combination
Transformer	{‘learning_rate’: 7 × 10⁻⁵, ‘batch_size’: 8, ‘optimizer’: Adam}
VGG16	{‘learning_rate’: 7 × 10⁻⁵, ‘batch_size’: 64, ‘optimizer’: Adam}
ResNet34	{‘learning_rate’: 7 × 10⁻⁵, ‘batch_size’: 32, ‘optimizer’: Adam}
ResNet18	{‘learning_rate’: 7 × 10⁻⁵, ‘batch_size’: 64, ‘optimizer’: Adam}
SVM	{‘kernel’: ‘rbf’, ‘kernel_scale’: 13.05, ‘C’: 8.88}
KNN	{‘n_neighbors’: 1, ‘weights’: distance}

Table 4. Comparison of signal classification precision among six microseismic signal classification models using the Dongguashan Copper Mine dataset.

Signal Type	Transformer	VGG16	ResNet34	ResNet18	SVM	KNN
Microseismic	0.9658	0.9182	0.9328	0.8972	0.9646	0.9554
Blasting	0.9593	0.9231	0.9487	0.9434	0.9350	0.9268
Noise	0.9583	0.9633	0.9310	0.9626	0.9113	0.9040
Total	0.9612	0.9440	0.9375	0.9344	0.9370	0.9287
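
The ranking and the per-class comparisons drawn from Table 4 can be reproduced with a few lines of plain Python. The values are transcribed from the table; the helper `best_model` is a hypothetical name used only for this sketch.

```python
# Per-class and total classification rates transcribed from Table 4.
models = ["Transformer", "VGG16", "ResNet34", "ResNet18", "SVM", "KNN"]
table4 = {
    "Microseismic": [0.9658, 0.9182, 0.9328, 0.8972, 0.9646, 0.9554],
    "Blasting":     [0.9593, 0.9231, 0.9487, 0.9434, 0.9350, 0.9268],
    "Noise":        [0.9583, 0.9633, 0.9310, 0.9626, 0.9113, 0.9040],
    "Total":        [0.9612, 0.9440, 0.9375, 0.9344, 0.9370, 0.9287],
}

def best_model(signal_type):
    """Name of the model with the highest rate in one row of the table."""
    rates = table4[signal_type]
    return models[rates.index(max(rates))]

# Models sorted from highest to lowest total classification rate.
ranking = sorted(models, key=lambda m: table4["Total"][models.index(m)], reverse=True)
best_per_class = {name: best_model(name) for name in ("Microseismic", "Blasting", "Noise")}
```

Under these values the Transformer leads on microseismic and blasting signals, VGG16 leads on noise signals, and the total ranking is Transformer, VGG16, ResNet34, SVM, ResNet18, KNN.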

Table 5. Comparison of evaluation indexes among six microseismic signal classification models using the Dongguashan Copper Mine dataset.

Evaluation Index	Transformer	VGG16	ResNet34	ResNet18	SVM	KNN
Accuracy	0.9611	0.9438	0.9375	0.9344	0.9360	0.9280
Precision	0.9612	0.9440	0.9375	0.9344	0.9370	0.9287
Recall	0.9611	0.9436	0.9375	0.9342	0.9361	0.9278
F1	0.9610	0.9438	0.9375	0.9342	0.9361	0.9278

Table 6. Comparison of signal classification precision among six microseismic signal classification models using the Tongkuangyu Copper Mine dataset.

Signal Type	Transformer	VGG16	ResNet34	ResNet18	SVM	KNN
Microseismic	0.9912	0.9802	0.9907	0.9901	0.9612	0.9736
Blasting	0.9520	0.9292	0.9365	0.9636	0.9445	0.9104
Noise	0.9836	1.0000	0.9831	0.9541	0.9509	0.9610
Total	0.9756	0.9698	0.9701	0.9693	0.9522	0.9483

Table 7. Comparison of evaluation indexes among six microseismic signal classification models using the Tongkuangyu Copper Mine dataset.

Evaluation Index	Transformer	VGG16	ResNet34	ResNet18	SVM	KNN
Accuracy	0.9750	0.9688	0.9688	0.9688	0.9520	0.9475
Precision	0.9756	0.9698	0.9701	0.9692	0.9522	0.9483
Recall	0.9750	0.9689	0.9684	0.9690	0.9520	0.9475
F1	0.9749	0.9686	0.9684	0.9686	0.9493	0.9476

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
