Research on Gas Concentration Anomaly Detection in Coal Mining Based on SGDBO-Transformer-LSSVM

Liu, Mingyang; Zhang, Longcheng; Yan, Zhenguo; Wang, Xiaodong; Qiao, Wei; Feng, Longfei

doi:10.3390/pr13092699


by Mingyang Liu ¹,², Longcheng Zhang ³,*, Zhenguo Yan ³, Xiaodong Wang ¹,², Wei Qiao ¹,² and Longfei Feng ¹,²

¹ Technology & Engineering, Xi’an Research Institute of China Coal (Group), Corp., Xi’an 710077, China

² State Key Laboratory of Coal Mine Disaster Prevention and Control, Xi’an 710077, China

³ College of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, China

* Author to whom correspondence should be addressed.

Processes 2025, 13(9), 2699; https://doi.org/10.3390/pr13092699

Submission received: 14 July 2025 / Revised: 14 August 2025 / Accepted: 19 August 2025 / Published: 25 August 2025

(This article belongs to the Section Process Control and Monitoring)


Abstract

Methane concentration anomalies during coal mining operations are important triggers of major safety accidents. This study addresses two key weaknesses of existing detection methods: insufficient adaptability in dynamic, complex underground environments and limited ability to characterize non-uniformly sampled data. An intelligent diagnostic model is proposed that integrates an improved Dung Beetle Optimization algorithm (SGDBO) with Transformer-LSSVM, built on a dual-path feature fusion architecture. First, the original sample sequences are unified in length by interpolation to suit deep learning model inputs. In parallel, statistical features of the samples (such as kurtosis and differential standard deviation) are extracted to characterize local mutation behavior. The Transformer network then automatically captures the temporal dependencies of the concentration time series, and its output features are concatenated with the manual statistical features and fed into the LSSVM classifier, forming a complementary diagnostic mechanism. Sine chaotic mapping initialization and a golden sine search mechanism are integrated into DBO, and the resulting SGDBO algorithm optimizes the hyperparameters of the Transformer-LSSVM hybrid model, mitigating the tendency of traditional parameter optimization to fall into local optima. Experiments show that the model significantly improves the classification accuracy and robustness of anomaly curve discrimination. The approach can provide core technical support for coal mine safety monitoring systems and has critical practical value for ensuring safe national energy production.

Keywords:

coal mining; methane concentration; anomaly curve prediction; transformer-LSSVM; SGDBO

1. Introduction

Coal is regarded as the core pillar of global primary energy consumption, and its stable supply profoundly affects world industrial operation [1]. In China, coal has long accounted for more than 60% of the primary energy consumption structure and serves as an indispensable strategic cornerstone for sustained national economic development [2]. Coal mining activities are commonly located deep within complex geological environments, accompanied by a series of severe safety and production challenges. Among these challenges, gas disasters are recognized as vital risk sources that threaten miners’ life safety and mine facility integrity, attributed to their sudden occurrence and destructive nature. Underground methane (CH₄) is identified as the main combustible gas associated with coal seams, and its concentration dynamics form one of the most direct and critical indicators reflecting the safety status of mining operation environments [3].

The dynamic changes in coal mine gas concentration are governed by coupled multi-dimensional, multi-scale factors, and the accuracy and reliability of monitoring data directly determine the effectiveness of early warning systems [4]. Methane exists in free and adsorbed states in coal bodies and surrounding rocks and is released dynamically under mining disturbance. Specifically, desorption of adsorbed gas causes sensor values to rise as mining activities intensify; after operations cease, concentration curves decay exponentially and eventually reach a dynamic equilibrium with the mine ventilation system [5]. This natural release pattern is often disturbed by abnormal events such as ground pressure activity, equipment vibration, or even deliberate sensor shielding, which produce persistent deviation signals. Because such anomalies look very similar to real disaster precursors, traditional threshold methods struggle to identify them. In particular, as mining depth exceeds the kilometer level, gas anomalies under high ground stress present complex patterns, such as transient, continuous, and fluctuating types. The increased ground pressure enhances the adsorption capacity of coal bodies, and free gas is converted to an adsorbed state, contributing to regular increases in methane emission. Meanwhile, deep fault structures are prone to forming gas enrichment zones in which gas levels are significantly higher than in shallow layers [6]. Traditional identification methods based on empirical rules or single statistical models are inadequate in the face of three challenges: the dynamic nature of gas occurrence, temperature–pressure coupling disturbance of adsorption equilibrium, and the superposition of sensor drift and noise [7].

With the digital transformation driven by Industry 4.0, breakthrough progress has been achieved globally in gas concentration prediction and anomaly detection research with machine learning. International researchers have considerably reinforced the model generalization capability and engineering applicability through multi-disciplinary cross-fusion [8]. Paul Morris et al. [9] innovatively employed zeolitic imidazolate frameworks (ZIFs) to construct gas sensor arrays. Abnormal gas components were predicted through thermodynamic adsorption models, and anomaly detection with false positive rates < 0.5% was achieved by SVM models. This provided solutions for humidity-interference-resistant sensor design. Diana Sorg et al. [10] validated the potential of unsupervised learning in dynamic gas monitoring through portable laser methane detectors (LMDs), and their strict quality control protocol offered a reference for mine sensor calibration. Chang et al. [11] transformed gas concentration time series into recurrence plots (RPs) and mined spatiotemporal correlations between sensors using Convolutional Neural Networks (CH₄ correlation coefficient > 0.85 at CH₄ sensor positions T0 and T2), so as to complete the unsupervised diagnosis of faulty sensors. The hierarchical temporal memory algorithm (HTM) developed by Subutai Ahmad et al. [12] first achieved real-time learning and prediction synchronization in streaming data scenarios and refreshed anomaly detection records with an F1-score of 89.3% in NAB benchmark testing.

Byung Wan Jo et al. [13] constructed a mine safety cloud platform based on the Internet of Things, achieving a CH₄/CO anomaly detection accuracy above 90% in coal mines by integrating clustering analysis with spatiotemporal statistical algorithms. J. Praveenchandar et al. [14] designed a hybrid hidden Markov-ANN model that diagnoses sensor faults in dynamic gas mixtures, significantly improving the reliability of industrial environment monitoring. In the field of hybrid model optimization [15], Wang et al. [16] developed a TCN-Transformer architecture that combines a Temporal Convolutional Network (TCN) with a Transformer encoder. Transient fluctuation characteristics were captured through causal dilated convolution, and the prediction R² reached 0.981 when the model was combined with the flood optimization algorithm (FLA). The MMoE multi-disaster fusion model constructed by Liu et al. [17] achieved joint early warning of fire and gas risks, validating the decision-making advantages of deep learning in complex coupled disasters. Prasanjit et al. [18] proposed a prediction model that combines a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network with Internet of Things sensors to improve the safety and production efficiency of underground coal mines.

Although significant progress has been made in existing gas anomaly detection methods, there are several key bottlenecks. Sensor signals in dynamic, complex underground environments are easily influenced by the physical impact and electromagnetic interference, bringing about a sharp decline in the adaptability of traditional models. The spatiotemporal representation capability of deep learning models is insufficient for non-uniformly sampled data, hindering it from capturing the subtle “gradual increase–steep rise” patterns that precede coal and gas outbursts. Hence, a closed-loop diagnostic architecture based on deep time-series modeling, statistical decision optimization, and hyperparameter optimization was proposed in this study to address these problems. Specifically, sample sequence lengths were unified through piecewise cubic spline interpolation, and a dual-stream fusion channel was constructed with artificial statistical features (kurtosis, differential standard deviation) and Transformer [19] time-series features. Local mutation physical significance was preserved, while long-range dependencies were captured. The concatenated features were fed into a Least Squares Support Vector Machine (LSSVM) [20] for decision classification. Meanwhile, an improved Dung Beetle Optimization Algorithm (SGDBO) was proposed by integrating chaotic mapping initialization and a golden sine search mechanism to achieve efficient global optimization of the Transformer-LSSVM model. This can break through the local optimum dilemma of traditional parameter tuning. This research can provide “fast–accurate–stable” technical support for coal mine safety monitoring, as well as vital practical significance for safeguarding national energy strategic security.

2. Concentration Anomaly Diagnosis Theory and Algorithm

2.1. CH₄ Concentration Anomaly Diagnosis Process and Feature Engineering

2.1.1. CH₄ Concentration Anomaly Diagnosis Process

Diagnosing anomalies in the dynamic CH₄ monitoring data of underground coal mines is essentially a hierarchical classification problem on non-stationary time series. The non-stationarity arises from strong interference of the complex underground environment (ventilation disturbances, mechanical vibrations) with gas concentration changes. Consequently, abnormal curves show highly heterogeneous morphologies. For example, “sensor-type impact faults” appear as instantaneous pulse spikes, whereas “borehole gas emission” presents a gradual upward trend. Some categories differ mainly in fluctuation range, as with “local wind stoppage” curves, while others overlap in peak shape, as with “sensor faults” and “roof collapse”. This nonlinear, multi-morphological distribution causes traditional static threshold detection to fail, because a single threshold cannot distinguish morphological features such as spikes, gradual rises, and oscillations.

As shown in Figure 1, a two-level intelligent diagnosis framework was constructed to overcome the classification barriers of traditional methods. The first level applies extreme-value statistics: the maximum concentration over all normal samples is taken as the threshold $\tau$ used to quickly flag risk curves:

$$\tau = \max_{i} \max_{j} x_{ij}, \quad i = 1, \dots, N \tag{1}$$

where $x_{ij}$ represents the $j$-th concentration value of the $i$-th CH₄ concentration curve; $N$ denotes the number of normal samples; $\max_{j} x_{ij}$ is the maximum CH₄ concentration in sample $i$; and $\max_{i} \max_{j} x_{ij}$ is the maximum CH₄ concentration among the $N$ samples. A curve whose maximum value satisfies $x_{\max} > \tau$ is classified as an abnormal sample; otherwise, it is classified as a normal sample.
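The first-level screening in Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function names `fit_threshold` and `screen` and the sample readings are hypothetical.

```python
from typing import Sequence


def fit_threshold(normal_curves: Sequence[Sequence[float]]) -> float:
    """Eq. (1): tau is the largest value observed in any normal curve."""
    return max(max(curve) for curve in normal_curves)


def screen(curve: Sequence[float], tau: float) -> str:
    """First-level decision: a curve whose peak exceeds tau is flagged."""
    return "abnormal" if max(curve) > tau else "normal"


# Hypothetical CH4 readings in percent; curves may differ in length.
normal_curves = [[0.21, 0.25, 0.30], [0.18, 0.42, 0.35, 0.29]]
tau = fit_threshold(normal_curves)
labels = [screen(c, tau) for c in ([0.20, 0.31], [0.25, 0.75, 0.40])]
```

Only curves flagged as abnormal here would be passed on to the second-level classifier.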

At the second level, a Transformer-LSSVM collaborative decision mechanism was introduced. In the Transformer network, CH₄ concentration time-series change features were captured through self-attention. Meanwhile, sensitivity to morphological features was reinforced. Concerning LSSVM, high-dimensional decision surfaces in the feature space were constructed based on radial kernel functions. Precise classification of complex abnormal samples was achieved. This framework integrated time-series deep modeling with statistical decision optimization. Then, a closed-loop diagnostic mode of “feature extraction–statistical decision” was formed.

2.1.2. Feature Engineering

Gas concentration monitoring data in underground coal mines are distributed non-uniformly in time because sensor sampling frequencies fluctuate. As a result, the original sequences differ in length and have irregular time intervals, so they cannot be fed directly into deep learning models. Therefore, a piecewise cubic spline interpolation algorithm was used in this paper to unify sequence lengths. Adjacent data points are fitted with piecewise cubic polynomials, which keeps the curve smooth while closely approximating the real concentration trend. For any non-uniformly sampled sequence $\{(t_k, c_k)\}_{k=1}^{m}$ (where $t_k$ is the timestamp and $c_k$ is the concentration value), the interpolated value on the target uniform time grid $\{t'_i\}_{i=1}^{n}$ is

$$c'_i = \sum_{j=0}^{3} a_j^{(p)} (t'_i - t_p)^{j}, \quad t_p \le t'_i < t_{p+1} \tag{2}$$

where $p$ is the index of the interval in which $t'_i$ is located, and the coefficients $a_j^{(p)}$ are determined by the continuity conditions at the interval boundaries. This method overcomes the step distortion of linear interpolation and the over-smoothing of quadratic interpolation, and it provides input data of unified length and regular spacing for subsequent time-series modeling.
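The resampling step of Eq. (2) might look as follows in plain Python. The sketch assumes natural end conditions (zero second derivative at both ends), which the text does not specify, and the names `second_derivatives` and `resample` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def second_derivatives(t: Sequence[float], c: Sequence[float]) -> List[float]:
    """Solve the tridiagonal system for the spline second derivatives M_k.

    Natural end conditions (M_0 = M_last = 0) are an assumption of this sketch.
    """
    m = len(t)
    h = [t[k + 1] - t[k] for k in range(m - 1)]
    lower, diag = [0.0] * m, [1.0] * m
    upper, rhs = [0.0] * m, [0.0] * m
    for k in range(1, m - 1):
        lower[k], diag[k], upper[k] = h[k - 1], 2.0 * (h[k - 1] + h[k]), h[k]
        rhs[k] = 6.0 * ((c[k + 1] - c[k]) / h[k] - (c[k] - c[k - 1]) / h[k - 1])
    for k in range(1, m):  # forward elimination (Thomas algorithm)
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    M = [0.0] * m
    M[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):  # back substitution
        M[k] = (rhs[k] - upper[k] * M[k + 1]) / diag[k]
    return M


def resample(t: Sequence[float], c: Sequence[float], n: int) -> List[float]:
    """Evaluate Eq. (2) on n uniformly spaced times between t[0] and t[-1]."""
    M = second_derivatives(t, c)
    step = (t[-1] - t[0]) / (n - 1)
    out = []
    for i in range(n):
        x = t[0] + i * step
        # Interval index p with t[p] <= x < t[p+1], clamped at the right end.
        p = min(max(bisect_right(t, x) - 1, 0), len(t) - 2)
        h, dt = t[p + 1] - t[p], x - t[p]
        a0 = c[p]
        a1 = (c[p + 1] - c[p]) / h - h * (2.0 * M[p] + M[p + 1]) / 6.0
        a2 = M[p] / 2.0
        a3 = (M[p + 1] - M[p]) / (6.0 * h)
        out.append(a0 + a1 * dt + a2 * dt ** 2 + a3 * dt ** 3)
    return out


# Hypothetical irregular timestamps (minutes) and CH4 readings (percent).
t = [0.0, 1.0, 2.5, 4.0]
c = [0.20, 0.50, 0.30, 0.40]
uniform = resample(t, c, 9)
```

The four coefficients `a0` to `a3` correspond to $a_j^{(p)}$ in Eq. (2); a library routine with a different boundary condition would give slightly different values near the ends.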

Analyzing abnormal CH₄ concentration sequences requires attention to both the overall distribution and local mutation behavior. Therefore, features were extracted from the sequences before interpolation. The extracted indicators include the maximum value $x_{\max}$, mean value $\bar{x}$, standard deviation $\sigma$, and kurtosis $x_{kurt}$. Among them, $x_{\max}$ characterizes peak risk; $\bar{x}$ indicates the concentration baseline level; and $\sigma$ represents the degree of dispersion. First-order differencing was then performed on the sequences to further characterize the sharpness of the data distribution, especially to detect jumps, mutations, and other drastic changes. From the absolute first-order differences, three further parameters were extracted: the mean $\bar{x}_d$, standard deviation $\sigma_d$, and kurtosis $x_{d\text{-}kurt}$. These parameters rapidly reflect abnormal behavior in the sequences. This feature-processing method balances the overall characteristics of the data with local mutation behavior and establishes a robust feature foundation for the subsequent Transformer-LSSVM fusion model.

The corresponding indicators are calculated as

$$\begin{cases}
x_{\max} = \max_{i} (x_i) \\
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \\
\sigma = \left( \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \right)^{1/2} \\
x_{kurt} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \bar{x}}{\sigma} \right)^4 \\
\bar{x}_d = \frac{1}{n} \sum_{i=1}^{n-1} |x_{i+1} - x_i| \\
\sigma_d = \left( \frac{\sum_{i=1}^{n-1} (|x_{i+1} - x_i| - \bar{x}_d)^2}{n - 1} \right)^{1/2} \\
x_{d\text{-}kurt} = \frac{1}{n} \sum_{i=1}^{n-1} \left( \frac{|x_{i+1} - x_i| - \bar{x}_d}{\sigma_d} \right)^4
\end{cases} \tag{3}$$

where $x_i$ is the $i$-th concentration value of a sequence of length $n$.
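As a rough illustration, the seven indicators of Eq. (3) can be computed directly from a raw sequence. The sketch below follows the divisors as printed in Eq. (3) and sums the differenced terms over the $n - 1$ available differences; the function name `seven_features` and the sample curves are hypothetical, and constant sequences (zero standard deviation) are not handled.

```python
import math
from typing import List, Sequence


def seven_features(x: Sequence[float]) -> List[float]:
    """Return [x_max, mean, std, kurt, mean_d, std_d, kurt_d] as in Eq. (3)."""
    n = len(x)
    # Absolute first-order differences |x_{i+1} - x_i|, n - 1 values.
    d = [abs(x[i + 1] - x[i]) for i in range(n - 1)]
    x_max = max(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    kurt = sum(((v - mean) / std) ** 4 for v in x) / n
    mean_d = sum(d) / n
    std_d = math.sqrt(sum((v - mean_d) ** 2 for v in d) / (n - 1))
    kurt_d = sum(((v - mean_d) / std_d) ** 4 for v in d) / n
    return [x_max, mean, std, kurt, mean_d, std_d, kurt_d]


# Hypothetical curves: a single pulse spike versus a gradual rise.
spike = [0.2, 0.2, 0.9, 0.2, 0.2]
ramp = [0.1, 0.2, 0.3, 0.4, 0.5]
f_spike = seven_features(spike)
f_ramp = seven_features(ramp)
```

On these two toy curves the spike yields the larger kurtosis, which is the kind of contrast the manual feature path is meant to expose.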

2.2. Construction of Prediction Model

2.2.1. Transformer Encoder

The Transformer encoder was utilized as a new type of deep neural network incorporating attention mechanisms. Traditional recurrent and convolutional operations were abandoned. Deep mining of sequence data was achieved through stacked self-attention layers and point-wise feed-forward networks. The architecture is composed of an input embedding layer, a positional encoding layer, and multiple encoder layers connected in series. In this study, dynamic correlations between elements within sequences were first captured through multi-head self-attention mechanisms. Afterward, nonlinear transformations were performed by feed-forward neural networks. Signal transmission paths were finally stabilized through residual connections and layer normalization.

Position-aware gas sequence embedding is crucial for the model to understand dynamic environments. For non-uniformly sampled CH₄ concentration sequences, the high-dimensional matrix $X \in \mathbb{R}^{L \times d_{model}}$ output from feature engineering should be flattened into time-series vectors. Then, absolute position information is injected:

$$\begin{cases}
E_{PE}(P_{pos}, 2i) = \sin \left( P_{pos} / 10000^{2i / d_{model}} \right) \\
E_{PE}(P_{pos}, 2i + 1) = \cos \left( P_{pos} / 10000^{2i / d_{model}} \right) \\
X_{enc} = X + E_{PE}
\end{cases} \tag{4}$$

where $E_{PE}$ represents the positional encoding; $L$ indicates the sequence length; $d_{model}$ signifies the vector dimension; $P_{pos}$ is the vector position; and $X_{enc}$ stands for the encoded CH₄ concentration vector matrix.
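The sinusoidal encoding of Eq. (4) can be sketched as follows. The helper names `positional_encoding` and `add_position` are hypothetical, and the loop variable `i` in the code is the even column index, that is, $2i$ in the notation of Eq. (4).

```python
import math
from typing import List


def positional_encoding(length: int, d_model: int) -> List[List[float]]:
    """Build the L x d_model matrix E_PE of Eq. (4)."""
    pe = [[0.0] * d_model for _ in range(length)]
    for pos in range(length):
        for i in range(0, d_model, 2):  # i is the even column index (2i in Eq. 4)
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def add_position(x: List[List[float]]) -> List[List[float]]:
    """X_enc = X + E_PE, applied element-wise."""
    pe = positional_encoding(len(x), len(x[0]))
    return [[a + b for a, b in zip(row, pe_row)] for row, pe_row in zip(x, pe)]


pe = positional_encoding(3, 4)
x_enc = add_position([[0.0] * 4 for _ in range(3)])
```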

The multi-head self-attention mechanism (MSA) serves as the intelligent correlation hub of the encoder. Its workflow can be decomposed into three cognitive stages: feature focusing, multi-perspective analysis, and knowledge fusion.

$$\begin{cases}
\mathrm{Attention}(Q, K, V) = \mathrm{softmax} \left( \frac{Q K^{T}}{\sqrt{d_k}} \right) V \\
z_i = \mathrm{Attention}(X_{enc} W_i^{Q}, X_{enc} W_i^{K}, X_{enc} W_i^{V}) \\
Z = \mathrm{Concat}(z_1, z_2, \dots, z_H) W^{O}
\end{cases} \tag{5}$$

where $Q$, $K$, and $V$ represent the query matrix, key matrix, and value matrix, respectively; $d_k$ is the dimension of the key vectors; $W_i^{Q}$, $W_i^{K}$, and $W_i^{V}$ denote the parameter matrices for the query, key, and value linear transformations, respectively; $z_i$ indicates the output of the $i$-th attention head; $\mathrm{Concat}(\cdot)$ is the aggregation function over the $H$ attention heads; and $W^{O}$ refers to the parameter matrix for the final linear transformation of the multi-head attention mechanism.
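To make Eq. (5) concrete, the following is a small pure-Python sketch of scaled dot-product attention and head concatenation. The random weights, the sizes (5 time steps, 2 heads), and the helper names are all hypothetical; a real model would use a deep learning framework and trained weights.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    m = max(row)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Tuple[Matrix, Matrix]:
    """Scaled dot-product attention; returns the output and the weights."""
    d_k = len(k[0])
    scores = matmul(q, [list(col) for col in zip(*k)])  # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def multi_head(x: Matrix, heads, w_o: Matrix) -> Matrix:
    """Run every head, concatenate along the feature axis, project with W^O."""
    outs = [attention(matmul(x, wq), matmul(x, wk), matmul(x, wv))[0]
            for wq, wk, wv in heads]
    concat = [sum((o[r] for o in outs), []) for r in range(len(x))]
    return matmul(concat, w_o)


rng = random.Random(0)


def rand_matrix(rows: int, cols: int) -> Matrix:
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


seq_len, d_model, n_heads, d_k = 5, 4, 2, 2
x_enc = rand_matrix(seq_len, d_model)
heads = [tuple(rand_matrix(d_model, d_k) for _ in range(3)) for _ in range(n_heads)]
w_o = rand_matrix(n_heads * d_k, d_model)
z = multi_head(x_enc, heads, w_o)
```

Each row of the attention weights sums to one, so every output step is a weighted average of the value vectors across all time steps.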

Additionally, residual structures were adopted in both the self-attention sublayer and the feed-forward sublayer, enabling input-layer signals to pass directly to the output layer and preventing gradient vanishing. Meanwhile, layer normalization smoothed signal distribution along the feature dimension. Concerning the feed-forward neural network, dimension expansion–activation–dimension reduction operations were performed on the attention output to extract nonlinear interactive features and complete the feature abstraction cycle of a single encoder layer.

2.2.2. Least Squares Support Vector Machine (LSSVM)

Support Vector Machine (SVM) is a supervised learning algorithm based on statistical learning theory. Its core idea is to achieve classification decisions with high generalization capability by constructing maximum-margin hyperplanes. Least Squares Support Vector Machine (LSSVM) is an improved variant of SVM. Compared to SVM, it has better self-learning and self-adaptive capabilities, making it suitable for processing complex CH₄ concentration sequences.

For a given dataset $T = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$, the regression function can be defined as $f(x) = \omega^{T} \varphi(x) + b$, where $x$ denotes the sample input; $y$ represents the sample output; $\varphi(\cdot)$ is the nonlinear mapping into the high-dimensional feature space; and $\omega$ and $b$ refer to the normal vector and intercept of the hyperplane in that space, respectively. Following the risk minimization principle, the regression problem can be transformed into a constrained optimization problem:

$$\begin{cases}
\min_{\omega, e} J(\omega, e) = \frac{1}{2} \omega^{T} \omega + \frac{1}{2} \gamma \sum_{i=1}^{N} e_i^{2} \\
\text{s.t.} \quad y_i = \omega^{T} \varphi(x_i) + b + e_i, \quad i = 1, 2, \dots, N
\end{cases} \tag{6}$$

where $e_i$ represents the slack (error) variable, and $\gamma$ denotes the regularization factor. Introducing Lagrange multipliers $\alpha_i$ yields

$$L(\omega, b, e, \alpha) = J(\omega, e) - \sum_{i=1}^{N} \alpha_i \left[ \omega^{T} \varphi(x_i) + b + e_i - y_i \right] \tag{7}$$

Setting the partial derivatives with respect to $\omega$, $b$, $e$, and $\alpha$ to zero gives the optimality conditions, from which the regression function is obtained as

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b \tag{8}$$

where $K(x, x_i)$ represents the kernel function. Because the RBF kernel has low complexity and good performance, it is adopted in this paper:

$$K(x, x_i) = \exp \left( - \frac{\| x - x_i \|^{2}}{2 \sigma_*^{2}} \right) \tag{9}$$

where $\sigma_*$ indicates the RBF kernel parameter.
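For illustration, setting the derivatives of Eq. (7) to zero and eliminating $\omega$ and $e$ leads to the standard LSSVM linear system $\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & K + I/\gamma \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$. The sketch below solves it with a small Gaussian elimination and predicts with Eq. (8). It uses $\pm 1$ targets for a two-class toy problem; the multi-class handling, the values of `gamma` and `sigma`, and all function names are assumptions made for this example only.

```python
import math
from typing import List, Sequence, Tuple


def rbf(u: Sequence[float], v: Sequence[float], sigma: float) -> float:
    """RBF kernel of Eq. (9)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(u, v)) / (2.0 * sigma ** 2))


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def lssvm_fit(xs, ys, gamma: float, sigma: float) -> Tuple[float, List[float]]:
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    a = [[0.0] + [1.0] * n]
    for i in range(n):
        a.append([1.0] + [rbf(xs[i], xs[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(a, [0.0] + list(ys))
    return sol[0], sol[1:]


def lssvm_predict(x, xs, alpha, b: float, sigma: float) -> float:
    """Eq. (8): kernel expansion over the training points plus the bias."""
    return sum(ai * rbf(x, xi, sigma) for ai, xi in zip(alpha, xs)) + b


# Hypothetical XOR-style toy data with +1 / -1 targets.
xs = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
ys = [1.0, 1.0, -1.0, -1.0]
b, alpha = lssvm_fit(xs, ys, gamma=100.0, sigma=0.5)
preds = [lssvm_predict(x, xs, alpha, b, sigma=0.5) for x in xs]
```

Because training reduces to one linear solve, LSSVM avoids the quadratic program of a standard SVM, at the cost of losing sparsity in $\alpha$.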

2.2.3. Classification Model Based on Transformer-LSSVM

The CH₄ concentration anomaly detection model architecture based on Transformer-LSSVM was developed through dual-path feature fusion. This architecture achieved complementary representation of CH₄ concentration dynamic characteristics. As illustrated in Figure 2, a cubic spline interpolation algorithm was employed to unify the time steps for the original concentration sequences with non-uniform sampling from underground. In addition, standardized time-series inputs were generated.

The deep feature path adopted a Transformer model to process the interpolated sequences. The temporal context of concentration changes was revealed by the model through the position encoding layer. The stacked encoder layers utilized MSA to capture long-range causal correlations and mine global dependency patterns of concentration evolution. Finally, the high-dimensional output was curtailed to a seven-dimensional deep feature vector through the feature compression layer. The representation capability for periodic evolution patterns was preserved.

In parallel, the artificial feature path extracted seven-dimensional key indicators describing curve mutation characteristics, with a focus on the physical interpretability of local anomaly patterns. The outputs of the two paths were concatenated into a 14-dimensional joint vector in the feature fusion layer, and this vector was input into the LSSVM classifier for decision-making. The RBF kernel function maps the joint features into a high-dimensional space in which nonlinear patterns become separable. Finally, the concentration anomaly detection labels were output.

2.3. Algorithm Improvement and Transformer-LSSVM Model Hyperparameter Optimization

The hyperparameter optimization mechanism is the core for efficient model operation and enhanced adaptability. Transformer-LSSVM structural parameters and decision parameters were optimized. An improved dung beetle algorithm (SGDBO) was proposed. Population initialization was enhanced through sine chaotic mapping to increase diversity in parameter space exploration. In addition, the golden sine search mechanism was combined to adaptively adjust the step size. Meanwhile, the local optimal trap of traditional grid search was avoided. The accuracy of CH₄ concentration anomaly detection and model convergence efficiency were significantly improved.

2.3.1. Dung Beetle Optimization Algorithm (DBO)

DBO provides a powerful global exploration capability for Transformer-LSSVM hyperparameter optimization through bionic mechanisms that simulate the adaptive behavior of dung beetle populations in complex environments [21]. In the initialization stage, $N$ dung beetle individuals $\{x_i\}_{i=1}^{N}$ are generated. Each individual corresponds to a $d$-dimensional hyperparameter solution vector $x_i = [r_1, r_2, \dots, r_d]^{T}$. According to individual behavioral characteristics, the population is divided into four roles: rolling beetles, brooding beetles, small beetles, and thieves. A dynamic balance between exploration and exploitation is achieved through role collaboration. The core iterative process involves the following behavioral models:

(1) Rolling navigation. Rolling beetles simulate celestial navigation behavior. Their position update is controlled by the polarized light intensity $k$, deflection coefficient $\alpha$, and step scaling factor $b$:

$$x_i^{t+1} = x_i^{t} + \alpha \cdot k \cdot x_i^{t-1} + b \cdot (x_{best}^{t} - x_i^{t}) \tag{10}$$

where $x_i^{t}$ represents the position of individual $i$ in the $t$-th iteration, and $x_{best}^{t}$ denotes the optimal individual in the $t$-th iteration. The rolling navigation mechanism enables individuals to approach better regions while retaining better solutions from the previous generation.

(2) Dancing obstacle avoidance and direction reset. When rolling beetle individuals encounter local extremes, dancing behavior is triggered to redirect them:

$$x_i^{t+1} = x_i^{t} + \tan(\theta) \cdot \| x_i^{t} - x_i^{t-1} \| \tag{11}$$

where $\theta$ represents the random deflection angle.

(3) Dynamic shrinkage of spawning area. Brooding beetles construct safe spawning areas in the neighborhood of the global optimal solution $x_{best}^{t}$. The boundaries are dynamically adjusted as

$$\begin{cases}
L_b^{*} = \max \left( x_{best}^{t} \odot (1 - R), L_b \right) \\
U_b^{*} = \min \left( x_{best}^{t} \odot (1 + R), U_b \right)
\end{cases} \tag{12}$$

where $R$ denotes the time-varying inertia weight; $L_b$ and $U_b$ are the lower and upper bounds of the search space; and $\odot$ is the Hadamard product. The position update formula for brooding beetles is expressed as

$$B_i^{t+1} = x_{best}^{t} + b_1 (B_i^{t} - L_b^{*}) + b_2 (B_i^{t} - U_b^{*}) \tag{13}$$

where $b_1, b_2 \sim N(0, 1)$ denote Gaussian disturbance terms.

(4) Foraging area gradient tracking. Small beetles adjust their foraging range around the historical optimal position $x_{best}^{b}$:

$$\begin{cases}
L_b^{b} = \max \left( x_{best}^{b} \odot (1 - R), L_b \right) \\
U_b^{b} = \min \left( x_{best}^{b} \odot (1 + R), U_b \right)
\end{cases} \tag{14}$$

Momentum terms and gradient acceleration are introduced in the position update of small beetles:

$$x_i^{t+1} = x_i^{t} + c_1 (x_i^{t} - L_b^{b}) + c_2 (x_i^{t} - U_b^{b}) \tag{15}$$

where $c_1, c_2 \sim U(0, 2)$ indicate exploration intensity.

(5) Theft disturbance to escape stagnation. Thief beetles break population convergence stagnation through random migration:

$$x_i^{t+1} = x_{best}^{b} + S \cdot g \cdot \| x_i^{t} - x_{best}^{b} \| \tag{16}$$

where $S$ signifies the disturbance intensity, and $g \sim N(0, 1)$ represents Gaussian noise.
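The five behavioral models can be written as small update functions. This is a hedged sketch of Eqs. (10) to (16) as printed above, not a full optimizer: the function names are hypothetical, the norms in Eqs. (11) and (16) are read as per-coordinate absolute differences, and one random factor is drawn per call rather than per dimension. Both choices are assumptions.

```python
import math
import random
from typing import List, Sequence, Tuple

Vec = List[float]


def roll(x: Vec, x_prev: Vec, x_best: Vec, alpha: float, k: float, b: float) -> Vec:
    """Eq. (10): rolling beetle update."""
    return [xi + alpha * k * xp + b * (xb - xi)
            for xi, xp, xb in zip(x, x_prev, x_best)]


def dance(x: Vec, x_prev: Vec, theta: float) -> Vec:
    """Eq. (11), with the norm read per coordinate (assumption)."""
    return [xi + math.tan(theta) * abs(xi - xp) for xi, xp in zip(x, x_prev)]


def shrink_bounds(center: Vec, R: float, lb: Vec, ub: Vec) -> Tuple[Vec, Vec]:
    """Eqs. (12) and (14): shrink the search box around a reference position."""
    lo = [max(ci * (1.0 - R), l) for ci, l in zip(center, lb)]
    hi = [min(ci * (1.0 + R), u) for ci, u in zip(center, ub)]
    return lo, hi


def brood(B: Vec, x_best: Vec, lo: Vec, hi: Vec, rng: random.Random) -> Vec:
    """Eq. (13): brood update with Gaussian factors b1 and b2."""
    b1, b2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [xb + b1 * (bi - l) + b2 * (bi - h)
            for bi, xb, l, h in zip(B, x_best, lo, hi)]


def forage(x: Vec, lo: Vec, hi: Vec, rng: random.Random) -> Vec:
    """Eq. (15): small beetle update with c1, c2 drawn from U(0, 2)."""
    c1, c2 = rng.uniform(0.0, 2.0), rng.uniform(0.0, 2.0)
    return [xi + c1 * (xi - l) + c2 * (xi - h) for xi, l, h in zip(x, lo, hi)]


def steal(x: Vec, x_best: Vec, S: float, rng: random.Random) -> Vec:
    """Eq. (16), with the norm read per coordinate (assumption)."""
    g = rng.gauss(0.0, 1.0)
    return [xb + S * g * abs(xi - xb) for xi, xb in zip(x, x_best)]


# Hypothetical two-dimensional search space.
rng = random.Random(0)
lb, ub = [0.0, 0.0], [5.0, 5.0]
x_best = [2.0, 4.0]
lo, hi = shrink_bounds(x_best, R=0.5, lb=lb, ub=ub)
moved = roll([1.0, 2.0], [0.0, 0.0], [3.0, 3.0], alpha=1.0, k=0.1, b=0.5)
egg = brood([2.5, 3.5], x_best, lo, hi, rng)
```

A complete optimizer would also clip every new position to $[L_b, U_b]$ and re-evaluate the fitness after each update.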

2.3.2. Multi-Strategy Enhanced Dung Beetle Optimization Algorithm

(1) Sine Chaotic Mapping Initialization

In swarm intelligence optimization algorithms, population initialization methods directly influence convergence speed and solution accuracy. This study overcame the insufficient diversity problem induced by random initialization in the DBO algorithm. The randomness, ergodicity, and regularity characteristics of chaotic motion were utilized to replace random initialization. Population diversity was maintained by avoiding local optima through this strategy. Global search capability was enhanced. Sine chaotic mapping was selected for DBO algorithm population initialization because of its more significant chaotic characteristics compared to logistic chaotic mapping [22]. GDBO was constructed. Sine chaotic mapping is expressed as

$$x_{n+1} = \sin(2 / x_n), \quad n = 0, 1, \dots, N; \quad -1 \le x_n \le 1; \quad x_n \ne 0 \tag{17}$$
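A population can be seeded from the chaotic sequence of Eq. (17) roughly as follows. The linear rescaling from $[-1, 1]$ to the hyperparameter bounds, the seed value `x0`, and the function name are assumptions of this sketch; the text does not state how chaotic values are mapped to the search space.

```python
import math
from typing import List, Sequence


def sine_chaotic_population(n_individuals: int, lower: Sequence[float],
                            upper: Sequence[float], x0: float = 0.7) -> List[List[float]]:
    """Generate individuals by iterating x_{n+1} = sin(2 / x_n), Eq. (17)."""
    x = x0
    population = []
    for _ in range(n_individuals):
        individual = []
        for lo, hi in zip(lower, upper):
            x = math.sin(2.0 / x)
            if x == 0.0:  # Eq. (17) requires x_n != 0; restart from the seed
                x = x0
            u = (x + 1.0) / 2.0  # assumed rescaling from [-1, 1] to [0, 1]
            individual.append(lo + u * (hi - lo))
        population.append(individual)
    return population


# Hypothetical bounds: hidden-layer nodes and initial learning rate.
lower, upper = [16.0, 1e-4], [128.0, 1e-1]
pop = sine_chaotic_population(5, lower, upper)
```

Unlike pseudo-random sampling, the sequence is fully determined by `x0`, so repeated runs start from the same population unless the seed is varied.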

(2) Golden Sine Search Mechanism

The golden sine search mechanism uses sine functions for iterative optimization and approaches the optimal solution by gradually reducing the search space [23]. DBO has weak local exploitation ability in the later stages of iteration: the position updates of thief beetles are easily disturbed by local optima, which slows convergence and reduces accuracy. To address these problems, the golden sine mechanism was introduced into the position update process of thief beetles, yielding GDBO. The golden sine search is expressed as

$$P_i^{z+1} = P_i^{z} \cdot |\sin(r_1)| + r_2 \cdot \sin(r_1) \cdot \left| y_1 \cdot P_{gbest}^{b} - y_2 \cdot P_i^{z} \right|, \quad u > p_r \tag{18}$$

where $y_1 = -\pi h + \pi (1 - h)$ and $y_2 = -\pi (1 - h) + \pi h$; $h = (\sqrt{5} - 1) / 2$ represents the golden ratio; $u$ refers to a random number within (0, 1); $r_1$ denotes a random number within [0, 2π]; $r_2$ signifies a random number within [0, π]; and $p_r = 0.5$ indicates the probability threshold. The DBO algorithm that integrates sine chaotic mapping initialization and a golden sine search mechanism is referred to as SGDBO in this study.
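Eq. (18) can be sketched coordinate-wise as below. The function names are hypothetical, and because the text gives only the branch for $u > p_r$, the other branch is left as a placeholder that returns the position unchanged; in practice it would presumably fall back to the original thief update of Eq. (16).

```python
import math
import random
from typing import List

H = (math.sqrt(5.0) - 1.0) / 2.0          # golden ratio coefficient h
Y1 = -math.pi * H + math.pi * (1.0 - H)   # y1 in Eq. (18)
Y2 = -math.pi * (1.0 - H) + math.pi * H   # y2 in Eq. (18)


def golden_sine_move(p: List[float], p_best: List[float],
                     r1: float, r2: float) -> List[float]:
    """Eq. (18) applied per coordinate for given random draws r1 and r2."""
    s = math.sin(r1)
    return [pv * abs(s) + r2 * s * abs(Y1 * pb - Y2 * pv)
            for pv, pb in zip(p, p_best)]


def thief_update(p: List[float], p_best: List[float],
                 rng: random.Random, p_r: float = 0.5) -> List[float]:
    """Apply the golden sine move when u > p_r."""
    if rng.random() > p_r:
        r1 = rng.uniform(0.0, 2.0 * math.pi)
        r2 = rng.uniform(0.0, math.pi)
        return golden_sine_move(p, p_best, r1, r2)
    # Placeholder (assumption): the source does not specify this branch.
    return list(p)


rng = random.Random(1)
step = golden_sine_move([1.0], [2.0], math.pi / 2.0, 1.0)
new_pos = thief_update([1.0, 3.0], [2.0, 2.5], rng)
```

With these definitions $y_1 = -y_2$, so the golden-section coefficients weight the best position and the current position symmetrically.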

2.3.3. Hyperparameter Optimization Framework

Hyperparameter optimization for the Transformer-LSSVM model was implemented with SGDBO in a closed-loop iterative framework, whose core process is shown in Figure 3. The model hyperparameter set was first initialized based on prior experience. The set covers the number of Transformer filters (16–128), convolution kernel size (3/5/7), number of attention heads (4–12), number of hidden-layer nodes (16–128), and initial learning rate (10⁻⁴–10⁻¹), as well as the RBF kernel width $\sigma_*$ (10⁻⁴–10⁻¹) and regularization coefficient $\gamma$ (10⁻³–10⁻¹) of LSSVM. During training, the model computes the classification accuracy, that is, the proportion of CH₄ concentration samples whose predicted labels match the actual labels, through forward propagation, and the network weight matrix is updated through backpropagation. SGDBO searches efficiently under dynamic contraction of the search space: in each iteration, the hyperparameter candidate interval is adjusted based on the validation-set loss gradient, and Pareto frontier solutions are adopted as screening criteria. In this way, a bidirectional coupling mechanism between hyperparameter exploration and performance feedback is formed.

Five-fold cross-validation was adopted for Transformer-LSSVM to suppress overfitting and enhance the model’s generalization ability in complex mine environments. The number of epochs was fixed at 50 and the batch size was set to 32 to keep gradient updates stable. The final hyperparameter combination was selected following the Pareto optimality principle, and final predictions were made on the test set. Through this bidirectional coupling of hyperparameter space exploration and model performance feedback, the optimization mechanism significantly enhances the robustness and generalization ability of Transformer-LSSVM in CH₄ concentration anomaly detection for complex mines.

2.4. Evaluation Metric System

With the purpose of comprehensively assessing multi-classification model performance, four core indicators were adopted in this study: accuracy, precision, recall, and F1-score. The calculation formulas for each indicator are

\begin{cases} \mathrm{Accuracy} = \dfrac{m}{n} \\ \mathrm{Precision} = \dfrac{1}{c} \sum_{i=1}^{c} \dfrac{TP_{i}}{TP_{i} + FP_{i}} \\ \mathrm{Recall} = \dfrac{1}{c} \sum_{i=1}^{c} \dfrac{TP_{i}}{TP_{i} + FN_{i}} \\ F1 = \dfrac{1}{c} \sum_{i=1}^{c} \dfrac{2 \times \mathrm{Precision}_{i} \times \mathrm{Recall}_{i}}{\mathrm{Precision}_{i} + \mathrm{Recall}_{i}} \end{cases}

(19)

where $m$ and $n$ represent the number of correctly predicted samples and the total number of samples, respectively; $c$ is the number of categories; $TP_{i}$ denotes the number of samples correctly predicted as category $i$; $FP_{i}$ indicates the number of samples incorrectly predicted as category $i$; $FN_{i}$ signifies the number of samples that actually belong to category $i$ but were incorrectly predicted as other categories; and F1 designates the harmonic mean of precision and recall, computed per category and then averaged.
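As a minimal sketch of how the macro-averaged indicators in Equation (19) can be computed, the function below works from a confusion matrix. The function name, the matrix layout (rows are true categories, columns are predicted categories), and the toy three-class matrix are assumptions for illustration, not taken from the study.

```python
def macro_metrics(confusion):
    """Macro-averaged metrics from confusion[i][j]:
    the number of samples of true category i predicted as category j."""
    c = len(confusion)
    n = sum(sum(row) for row in confusion)
    m = sum(confusion[i][i] for i in range(c))
    precisions, recalls, f1s = [], [], []
    for i in range(c):
        tp = confusion[i][i]
        fp = sum(confusion[r][i] for r in range(c)) - tp  # predicted i, actually other
        fn = sum(confusion[i]) - tp                       # actually i, predicted other
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(2 * p * r / (p + r) if p + r else 0.0)
    return {
        "accuracy": m / n,
        "precision": sum(precisions) / c,
        "recall": sum(recalls) / c,
        "f1": sum(f1s) / c,
    }


# Toy three-class confusion matrix (hypothetical counts).
toy = [[5, 1, 0],
       [0, 4, 0],
       [1, 0, 4]]
print(macro_metrics(toy))
```

Because F1 is averaged per category, it generally differs from the harmonic mean of the macro precision and macro recall.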

3. Experiments and Discussion

3.1. Original Sample Analysis and Sample Classification

This study used data from the underground safety monitoring systems of 12 coal mines in one mining area, from which CH₄ concentration time series covering nearly two years were collected. CH₄ concentrations were easily affected by environmental disturbances such as ventilation disruptions and mechanical vibrations. Moreover, the data were sampled non-uniformly, resulting in highly heterogeneous concentration curve patterns.

As revealed in Figure 4, methane concentrations fluctuated within a relatively stable band under normal operating conditions and were concentrated in low-risk intervals, whereas abnormal operating conditions deviated significantly from this baseline pattern. Extreme-value statistical analysis of normal samples suggests that methane concentration peaks were generally below 0.6%. Based on the measured gas emission volume, gas emission forms, observed gas dynamic phenomena, and measured outburst risk parameters, the coal mines in this mining area were all classified as low-gas mines, with gas over-limit alarm thresholds set at 0.5–0.8%. Accordingly, an abnormality threshold of CH₄ = 0.6% was set by extreme-value statistical methods. This threshold served as the primary segmentation boundary for distinguishing normal and abnormal curves, laying the foundation for hierarchical diagnosis.

Labels were assigned to the different concentration anomalies so that abnormal curves could be treated as a learnable classification task. As shown in Table 1, One-Hot encoding was adopted to label the samples of each category. Curve types are closely related to the physical causes of underground accidents. For example, “sensor-type impact” corresponds to pulse spikes from equipment failures, while “roof collapse” and “borehole gas emission” feature gradual fluctuations from ventilation anomalies. The labels therefore provided the model with physically interpretable supervisory signals. Following the label assignment in Table 1, sensor fault categories (labels 5 and 6) manifested as instantaneous pulses or baseline drift caused by equipment failure. Human intervention categories (labels 2 and 3) showed gradual rises or stepped fluctuations generated by ventilation parameter regulation. Random disaster categories (labels 1 and 4) showed sudden concentration increases stemming from geological stress mutations, which require real-time monitoring and millisecond-level lockout prevention and control to curb gas explosion risks.
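A minimal sketch of the One-Hot labelling follows, using the category order of Table 1. The list and function names are illustrative assumptions.

```python
# Category names in the label order of Table 1 (labels 0 to 6).
CATEGORIES = [
    "Normal sample",
    "Drilling gas outburst",
    "Reverse wind drill",
    "Local wind stoppage",
    "Roof collapse",
    "Sensor calibration is not standardized",
    "Sensor-type impact failure",
]


def one_hot(label, num_classes=len(CATEGORIES)):
    """Return a list with a single 1 at position `label`."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} outside 0..{num_classes - 1}")
    return [1 if k == label else 0 for k in range(num_classes)]


print(CATEGORIES[4], one_hot(4))
```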

3.2. Reconstruction Based on Sample Features

As observed from Figure 5, the piecewise cubic spline interpolation algorithm effectively eliminated the sequence length differences caused by varying sensor sampling rates and generated standardized input data. The interpolated curves preserved the key features of the original morphology, including instantaneous spikes from impact-type failures and gradual rising trends from gas emissions. At the same time, local distortion caused by uneven sampling intervals was markedly reduced. Comparisons before and after interpolation show that the algorithm preserved the morphology of oscillatory anomalies particularly well. This approach avoided both the staircase artifacts of linear interpolation and the excessive smoothing of quadratic interpolation.
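The length-unification step can be sketched as follows. This is a minimal natural cubic spline written with the standard library only; the boundary condition (zero second derivative at both ends), the function names, and the sample values are assumptions, since the text does not specify the exact spline variant or target length used.

```python
from bisect import bisect_right


def spline_second_derivs(t, y):
    """Second derivatives of the natural cubic spline through (t, y)."""
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    m = [0.0] * (n + 1)  # natural boundary: m[0] = m[n] = 0
    if n < 2:
        return h, m
    diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6.0 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
           for i in range(1, n)]
    # Thomas algorithm on the tridiagonal system for the interior knots.
    for k in range(1, n - 1):
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    inner = [0.0] * (n - 1)
    inner[-1] = rhs[-1] / diag[-1]
    for k in range(n - 3, -1, -1):
        inner[k] = (rhs[k] - h[k + 1] * inner[k + 1]) / diag[k]
    m[1:n] = inner
    return h, m


def resample(t, y, length):
    """Evaluate the spline on `length` evenly spaced points over [t[0], t[-1]].

    Requires strictly increasing t and length >= 2.
    """
    h, m = spline_second_derivs(t, y)
    n = len(t) - 1
    step = (t[-1] - t[0]) / (length - 1)
    out = []
    for j in range(length):
        x = t[0] + j * step
        i = min(max(bisect_right(t, x) - 1, 0), n - 1)  # interval containing x
        left, right = t[i + 1] - x, x - t[i]
        out.append(
            (m[i] * left ** 3 + m[i + 1] * right ** 3) / (6.0 * h[i])
            + (y[i] / h[i] - m[i] * h[i] / 6.0) * left
            + (y[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * right
        )
    return out


# Hypothetical non-uniformly sampled CH4 curve (time in minutes, concentration in %).
times = [0.0, 1.0, 2.5, 3.0, 5.0, 6.0]
conc = [0.30, 0.40, 3.20, 0.50, 0.30, 0.30]
print(resample(times, conc, 7))
```

The resampled curve passes through the original measurements, so a spike present at a sampling instant is kept rather than averaged away.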

After sequence lengths were unified, sample features were further extracted (Figure 6, Table 2) to distinguish the essential characteristics of the various sample types. Sensor-type impact failures exhibited significantly higher values in both peak value ($x_{max}$ = 14.24) and overall concentration level ($\bar{x}$ = 9.84) than the other types, implying a risk of continuous high-concentration accumulation. Although sensor calibration anomalies showed an abnormal peak value ($x_{max}$ = 4.0), their mean value ($\bar{x}$ = 0.41) was close to the normal level (0.29), reflecting the characteristics of a local equipment anomaly. Among the local mutation indicators, the differential kurtosis of borehole gas emission ($x_{d-kurt}$ = 69.49) was extremely high, indicating the continuous fluctuation characteristics of the emission process. Meanwhile, the differential indicators for reverse ventilation drills and local ventilation shutdowns (such as $σ_{d}$ < 0.01) approached normal values, conforming to a low-risk gradual disturbance pattern.
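The seven indicators of Table 2 can be sketched as below. The exact definitions belong to the feature engineering section and are not repeated here, so this sketch assumes population moments, non-excess (Pearson) kurtosis, and plain first differences for the differential features; the function and key names are illustrative.

```python
import math


def _moments(values):
    """Mean plus second and fourth central moments (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return mean, m2, m4


def _kurtosis(m2, m4):
    # Non-excess kurtosis; defined as 0 for a constant series to avoid dividing by zero.
    return m4 / (m2 * m2) if m2 > 0 else 0.0


def extract_features(series):
    mean, m2, m4 = _moments(series)
    diffs = [b - a for a, b in zip(series, series[1:])]  # first differences
    d_mean, d_m2, d_m4 = _moments(diffs)
    return {
        "x_max": max(series),
        "x_mean": mean,
        "sigma": math.sqrt(m2),
        "x_kurt": _kurtosis(m2, m4),
        "d_mean": d_mean,
        "sigma_d": math.sqrt(d_m2),
        "d_kurt": _kurtosis(d_m2, d_m4),
    }


# Hypothetical curve with one isolated spike.
print(extract_features([0.3, 0.3, 0.3, 1.3, 0.3, 0.3, 0.3]))
```

A single spike produces one large positive and one large negative difference among otherwise zero differences, which is what drives the differential statistics upward for pulse-type anomalies.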

A dataset containing seven types of samples was constructed (Figure 7), and its distribution is detailed in Table 3. According to Table 3, the dataset comprises 700 samples, of which 584 (83.4%) are abnormal. The abnormal samples cover six typical underground gas concentration anomaly patterns, including impact-type failures and local ventilation shutdowns. A balanced and physically meaningful data foundation was thus provided for subsequent model training.

3.3. Analysis of Classification Results Based on SGDBO-Transformer-LSSVM Model

3.3.1. Algorithm Performance Comparison Based on IEEE CEC 2022

The performance of SGDBO was compared with DBO, SDBO, and GDBO on the IEEE CEC 2022 standard test set to evaluate the global optimization capability of SGDBO. All algorithms were run on the same testing platform with identical experimental parameters to ensure a fair comparison. The processor was a 12th Gen Intel(R) Core(TM) i5-12600KF (3.70 GHz), and all experiments were implemented in MATLAB 2024a. For all four algorithms, the population size was set to 30, the number of iterations to 200, the test function dimension to 20, and the number of runs to 50. The Wilcoxon signed-rank test and rank-sum test results for the convergence of the different algorithms on the different test functions were then obtained (Table 4), together with six statistical indicators: the optimal value, standard deviation, mean value, median value, worst value, and average running time (Table 5).

The Wilcoxon signed-rank test and rank-sum test results suggest that SGDBO demonstrated significant global optimization advantages on the 12 benchmark functions in 20-dimensional space at the significance level α = 0.05. In unimodal function optimization, the signed test p-values of SGDBO for F1, F2, F6, and F9 were all below 1.50 × 10⁻⁸, verifying its global optimization capability for convex test functions. Concerning multimodal function robustness, the signed test p-values of SGDBO for F3, F4, and F10 lay in the range of 7.56 × 10⁻¹⁰ to 8.82 × 10⁻⁴, revealing its stronger stability against noise interference and non-convex functions. The p-values of both types of tests for SGDBO were below 1 × 10⁻⁹ on more than 80% of functions, while GDBO failed to pass the significance tests in scenarios such as F7 (signed test p = 9.58 × 10⁻¹). This confirms that SGDBO had a superior convergence accuracy and statistical significance. Table 5 shows that the average running time of SGDBO was below 0.05 s on test functions F2–F11 (0.069 s on F1), which was close to the time cost of the DBO algorithm.

Furthermore, the combined analysis of box plots (Figure 8) and convergence curves (Figure 9) shows that in unimodal function optimization, such as F1 and F6, the median fitness values of SGDBO reached 15,966.182 and 3678.437, which were 60.1% and 97.5% lower, respectively, than those of DBO. The interquartile ranges were narrowed to below 20% of those of the comparison algorithms, verifying the stability of the algorithm’s solutions. The convergence curves further indicate that although SGDBO converged slightly more slowly than GDBO in the early iterations (0–50) of multimodal functions such as F7 and F11, its sine chaotic strategy strengthened the search in the middle period (iterations 50–150), so that the slope of the convergence curve increased sharply. The optimal solutions on F7 and F11 finally reached 2046.148 and 2832.427, respectively. This characteristic is consistent with the rank-sum test p-values for F10, implying that SGDBO possessed both accuracy advantages and noise robustness when handling high-dimensional, multimodal, and ill-conditioned functions.

Figure 10 displays the average fitness rankings of the different algorithms on the test functions. The radar chart on the left indicates that SGDBO achieved the best result on six test functions. The height of the bar chart on the right represents the average ranking of each algorithm, where a lower ranking (shorter bar) denotes a better performance. SGDBO had the lowest average ranking (1.58), clearly ahead of DBO (3.58), SDBO (2.5), and GDBO (2.33); these values agree with ranking the four algorithms by the mean fitness values in Table 5. Hence, SGDBO integrated with the enhancement strategies effectively balanced exploration and exploitation, providing a reliable tool for hyperparameter optimization of the Transformer-LSSVM model.
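To make the p-values in Table 4 easier to read, the sketch below computes two-sided Wilcoxon rank-sum and signed-rank p-values with a large-sample normal approximation. The approximation, the continuity correction in the rank-sum test, and the absence of a tie correction are assumptions about the computation, not details stated in the text. The example uses 50 synthetic runs per algorithm in which one algorithm is better in every run.

```python
import math


def tied_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def two_sided_p(z):
    """Two-sided tail probability of a standard normal variable."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def rank_sum_p(x, y):
    """Rank-sum test for two independent samples (with continuity correction)."""
    n1, n2 = len(x), len(y)
    ranks = tied_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return two_sided_p(max(abs(w - mean) - 0.5, 0.0) / sd)


def signed_rank_p(x, y):
    """Signed-rank test for paired samples (zero differences dropped)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    ranks = tied_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    return two_sided_p((w_plus - mean) / sd)


# Synthetic fitness values: 50 runs each, complete separation between algorithms.
better = [float(i) for i in range(50)]
worse = [v + 100.0 for v in better]
print(rank_sum_p(better, worse), signed_rank_p(better, worse))
```

With complete separation over 50 runs, these approximations give roughly 7.07 × 10⁻¹⁸ and 7.56 × 10⁻¹⁰, the same values that recur throughout Table 4. Under this reading, the repeated entries are the smallest p-values attainable with 50 runs, which means one algorithm was better in every run on those functions.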
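The average rankings can be reproduced from Table 5 with a short calculation. The sketch below assumes that the ranking is based on the mean fitness of each algorithm per function (lower is better); the dictionary simply transcribes the "Mean" columns of Table 5.

```python
ALGORITHMS = ("DBO", "SDBO", "GDBO", "SGDBO")

# Mean fitness per test function, transcribed from Table 5 (order as in ALGORITHMS).
MEAN_FITNESS = {
    "F1": (40452.169, 13301.656, 45056.143, 21868.149),
    "F2": (1196.172, 750.380, 488.055, 492.173),
    "F3": (667.031, 655.271, 650.264, 619.974),
    "F4": (958.414, 887.944, 889.074, 866.949),
    "F5": (3176.424, 2302.412, 2481.683, 1782.836),
    "F6": (189072.204, 10341.286, 4828.619, 7364.613),
    "F7": (2194.045, 2144.133, 2205.093, 2109.393),
    "F8": (2397.934, 2298.778, 2309.479, 2274.338),
    "F9": (2739.251, 2580.093, 2483.288, 2481.640),
    "F10": (2974.821, 4642.110, 4065.245, 3869.570),
    "F11": (7374.774, 5427.446, 3023.808, 3254.710),
    "F12": (3265.027, 3071.346, 3033.412, 3126.855),
}


def average_ranks(table):
    """Rank algorithms per function (1 = lowest mean), then average the ranks."""
    totals = [0] * len(ALGORITHMS)
    firsts = [0] * len(ALGORITHMS)
    for values in table.values():
        order = sorted(range(len(values)), key=lambda k: values[k])
        for rank, k in enumerate(order, start=1):
            totals[k] += rank
        firsts[order[0]] += 1
    ranks = {name: totals[k] / len(table) for k, name in enumerate(ALGORITHMS)}
    wins = {name: firsts[k] for k, name in enumerate(ALGORITHMS)}
    return ranks, wins


ranks, wins = average_ranks(MEAN_FITNESS)
print({name: round(value, 2) for name, value in ranks.items()}, wins)
```

Under this assumption the averages are 3.58 for DBO, 2.50 for SDBO, 2.33 for GDBO, and 1.58 for SGDBO, with SGDBO ranked first on six functions.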

3.3.2. Hyperparameter Optimization Results

The sample set was first divided into training and testing sets at a 7:3 ratio, applied uniformly to each category to guarantee a consistent distribution of the various abnormal samples. Then, four dung beetle algorithms (the basic DBO and the improved SDBO, GDBO, and SGDBO) were adopted to optimize the hyperparameters of the Transformer-LSSVM model.

The convergence characteristics of the four algorithms are illustrated in Figure 11. DBO converged slowly and ended at a high final convergence value of 0.109, reflecting that the basic algorithm was prone to local optima. SDBO incorporated sine chaotic mapping initialization, which significantly improved early convergence efficiency. However, its final convergence value (0.102) remained limited owing to the lack of fine search capability. GDBO introduced the golden sine search mechanism and demonstrated a strong global exploration capability in the middle iterations, with a convergence value (0.095) superior to the previous two. SGDBO combined both enhancement strategies: its early convergence speed approached that of SDBO, and its mid-term optimization efficiency was comparable to that of GDBO. It reached the lowest convergence value (0.094), and the optimal parameter combination was obtained as filter number 29, convolution kernel size 3, attention head number 9, hidden layer node number 15, learning rate 0.012, kernel width $σ_{*}$ = 0.035, and regularization coefficient $γ$ = 0.009.
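A per-category 7:3 split of this kind can be sketched as follows, using the category counts of Table 3. The rounding rule, the random seed, and the function name are assumptions; the text does not describe how the split was implemented.

```python
import random


def stratified_split(labels, test_ratio=0.3, seed=0):
    """Split sample indices so each category keeps roughly the same ratio."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_ratio)  # assumed rounding rule
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return train, test


# Category counts for labels 0 to 6, taken from Table 3.
COUNTS = [116, 106, 139, 88, 91, 73, 87]
labels = [label for label, count in enumerate(COUNTS) for _ in range(count)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With these counts the split yields 490 training and 210 test samples, which matches the test-set size reported in the conclusions.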

The effectiveness of the hybrid improvement strategy was verified by these results. Population diversity was enhanced by chaotic initialization. Fine-tuning capability was improved by a golden sine search. This synergy has overcome the local convergence bottleneck of traditional algorithms.
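The chaotic initialization can be illustrated with a minimal sketch. It assumes the common form of the sine map, x_{k+1} = a·sin(π·x_k), with a control parameter close to 1; the parameter value, the seed `x0`, the function name, and the two search ranges are hypothetical and may differ from the formulation used for SGDBO.

```python
import math


def sine_map_population(pop_size, bounds, x0=0.37, a=0.99):
    """Generate initial positions by iterating a sine map and scaling to bounds.

    bounds is a list of (low, high) pairs, one per hyperparameter.
    """
    population = []
    x = x0
    for _ in range(pop_size):
        individual = []
        for low, high in bounds:
            x = a * math.sin(math.pi * x)  # iterate stays inside (0, a]
            individual.append(low + x * (high - low))
        population.append(individual)
    return population


# Hypothetical search ranges for two hyperparameters.
search_bounds = [(1e-4, 1e-1), (1.0, 16.0)]
population = sine_map_population(30, search_bounds)
print(population[:3])
```

Compared with independent uniform draws, successive iterates of the map are deterministic for a given seed, which makes an optimization run repeatable while still spreading the initial positions over the range.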

The hyperparameter sensitivity analysis (Figure 12) suggests that model performance responded very differently to the individual hyperparameters. The classification error rate reached its minimum at 29 filters. When the number of filters was reduced to 26, the error rate rose sharply to 0.312, implying that the model was highly sensitive to filter configuration. For the attention heads, the lowest error rate of 0.094 was reached with 9 heads. When the number was increased to 12, the error rate increased by 19%, reflecting that the number of attention heads needed precise control. In contrast, the convolution kernel size caused only a weak fluctuation of 0.004 across the sizes 3, 5, and 7, revealing excellent scale robustness. Moreover, the error rate fluctuated little when the learning rate changed, confirming the adaptive stability of gradient optimization. These results indicate that the filter architecture and attention mechanism should be fixed through pre-training to ensure accuracy, whereas the convolutional feature extraction layer and gradient update mechanism were strongly adaptable under dynamic environments.

3.3.3. Performance Comparison of Transformer-LSSVM

Three sets of comparative experiments were conducted in this study to systematically verify the effectiveness of the Transformer-LSSVM model architecture, involving the comparison between SVM and LSSVM, the comparison between RNN and Transformer, and the comparison between Transformer-LSSVM and individual Transformer and LSSVM models.

The radar chart presents the performance of the different models (Figure 13). LSSVM achieved a test-set accuracy of 0.721, well above the 0.546 of SVM. This shows the benefit of the improved statistical learning algorithm: through its mathematical reformulation of SVM, LSSVM significantly reduced model complexity and enhanced generalization capability. Additionally, the accuracy of the Transformer was improved by 25.32% compared to the RNN, indicating that its parallelized feature extraction mechanism was better suited to the global dependency analysis required in complex time-series anomaly detection.

Transformer-LSSVM traced the outermost polygon on all four axes of the radar chart, namely accuracy (0.906), precision (0.910), recall (0.923), and F1-score (0.912) (Figure 13 and Table 6), and thus enclosed the largest area. By contrast, the RNN contracted considerably inward on the recall axis (0.763), and SVM showed a severe depression on the F1 axis (0.532). The test-set accuracy of Transformer-LSSVM exceeded that of the Transformer (0.866) and LSSVM (0.721) by 4.62% and 25.66% in relative terms, respectively. This suggests that the fusion model had a stronger overall discriminative capability for samples. Its recall of 0.923 validates its superior control over false negative risks (such as unidentified gas anomalies), and its F1-score of 0.912 implies that it balanced precision and recall better than all baseline models.

The prediction error distribution in Figure 14 reveals the confusion patterns of the model for specific anomaly types. Among the 12 error samples in the training set, mutual misclassification between label 2 (reverse ventilation drill) and label 3 (local ventilation stoppage), as defined in Table 1, accounted for the highest proportion (five cases). The same confusion was also prominent among the 19 error samples of the test set (five cases). This high misclassification rate arose because both types of anomalies manifested as slow changes in the ventilation system, with highly similar mean values and fluctuation ranges, so the model had difficulty distinguishing such gradual disturbances.

Meanwhile, label 4 (roof collapse) and label 5 (irregular calibration) were mutually confused in three of the training set error samples and in six of the test set error samples. These two types of anomalies shared strong pulse characteristics (such as instantaneous concentration surges). Although the model captured the pulse patterns, it failed to resolve the difference in physical cause, namely that roof collapse originates from geological impact whereas irregular calibration is an equipment failure, which led to misclassification of pulse-type anomalies.
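The 4.62% and 25.66% figures correspond to relative gains in test-set accuracy, as the short check below illustrates with the values from Table 6. The function name is illustrative.

```python
def relative_gain(new, base):
    """Relative improvement of `new` over `base`, in percent."""
    return (new - base) / base * 100.0


fusion, transformer, lssvm = 0.906, 0.866, 0.721  # test-set accuracy, Table 6
print(round(relative_gain(fusion, transformer), 2))  # gain over the Transformer
print(round(relative_gain(fusion, lssvm), 2))        # gain over LSSVM
```

In absolute terms the same comparisons amount to 4.0 and 18.5 percentage points of accuracy.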
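Mutual confusion counts of this kind can be tallied directly from true and predicted labels. The sketch below is illustrative only: the function name and the label sequences are hypothetical and do not reproduce the data behind Figure 14.

```python
from collections import Counter


def confusion_pairs(y_true, y_pred):
    """Count misclassifications per unordered pair of labels, most frequent first."""
    pairs = Counter()
    for true_label, predicted in zip(y_true, y_pred):
        if true_label != predicted:
            # Sorting makes (2, 3) and (3, 2) count as the same mutual confusion.
            pairs[tuple(sorted((true_label, predicted)))] += 1
    return pairs.most_common()


# Hypothetical labels for ten samples.
y_true = [2, 3, 2, 4, 5, 0, 1, 3, 6, 4]
y_pred = [3, 2, 2, 5, 4, 0, 1, 2, 6, 4]
print(confusion_pairs(y_true, y_pred))
```

Treating the pair as unordered matches the notion of mutual misclassification used above, where errors in either direction between two labels are pooled.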

4. Conclusions and Prospects

In this study, the SGDBO-Transformer-LSSVM fusion model was proposed to significantly improve the accuracy of methane concentration anomaly detection in coal mines. This was achieved through the fusion of global temporal features and local statistical features, along with the collaborative design of the SGDBO intelligent optimization strategy. Based on samples collected from a mining area in Shaanxi, China, model parameters were dynamically adjusted using SGDBO, and this model achieved intelligent optimization and feature fusion. In the test set of 210 samples, a discrimination accuracy of 0.906 and an F1-score of 0.912 were achieved, which outperformed other single models and verified the effectiveness of the feature extraction–statistical decision collaborative architecture.

However, the model still misjudges specific abnormal types: low-frequency confusion exists between local ventilation stoppage and reverse ventilation drills, and between roof caving and sensor calibration. Therefore, future research will continue to explore and construct online incremental learning mechanisms. Different types of gas concentration anomaly samples will be collected, and optimization will be conducted from various aspects, including the model structure and training efficiency. The aim is to provide “fast–accurate–stable” technical support and deployable technical solutions for gas concentration safety monitoring in coal mining.

Author Contributions

Conceptualization, Z.Y.; methodology, L.Z.; software, L.Z.; validation, L.Z., W.Q. and M.L.; formal analysis, L.F.; investigation, L.F.; resources, X.W.; data curation, L.Z.; writing—original draft preparation, L.Z.; writing—review and editing, Z.Y.; visualization, W.Q.; supervision, L.F.; project administration, Z.Y. and M.L.; funding acquisition, M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Natural Science Foundation of Shaanxi Province under grant number 2025JC-YBMS-472. It was also funded by the Science and Technology Innovation Fund of Xi’an Research Institute under grant numbers 2023XAYPT01 and 2025XAYPT01.

Data Availability Statement

The data are not publicly available due to commercial confidentiality, as they contain information that could compromise the privacy of research participants.

Acknowledgments

The authors thank Xi’an University of Science and Technology, the State Key Laboratory of Coal Mine Disaster Prevention and Control, and the other members of the research team for their strong support.

Conflicts of Interest

Mingyang Liu served in the management of Xi’an Research Institute of China Coal, was awarded the Shaanxi Provincial Natural Science Foundation Project (2025JC-YBMS-472) and the Xi’an Research Institute Science and Technology Innovation Fund (2023XAYPT01, 2025XAYPT01), and is the main person in charge of the intelligent construction project of coal mine ventilation in a mining area in Shaanxi, China. The main content of this study is closely related to these intelligent construction projects. Zhenguo Yan is a professor at the College of Safety Science and Engineering and provides technical support to Mingyang Liu and to a project in a mining area in Shaanxi, China. Mingyang Liu and Zhenguo Yan are carrying out the intelligent construction of coal mine ventilation in a mining area in Shaanxi, China, and have signed a number of mine disaster intelligent deduction and early warning projects, which involve the safety production monitoring systems, monitoring data, and network security of multiple mines. Therefore, the dataset and practical engineering applications of this study cannot be made public. The other authors declare no conflicts of interest.

References

Gasparotto, J.; Da Boit Martinello, K. Coal as an Energy Source and Its Impacts on Human Health. Energy Geosci. 2021, 2, 113–120.
Wang, T.; Wu, F.; Dickinson, D.; Zhao, W. Energy Price Bubbles and Extreme Price Movements: Evidence from China’s Coal Market. Energy Econ. 2024, 129, 107253.
Cloney, C.T.; Ripley, R.C.; Pegg, M.J.; Khan, F.; Amyotte, P.R. Lower Flammability Limits of Hybrid Mixtures Containing 10 Micron Coal Dust Particles and Methane Gas. Process Saf. Environ. Prot. 2018, 120, 215–226.
Ray, S.K.; Khan, A.M.; Mohalik, N.K.; Mishra, D.; Mandal, S.; Pandey, J.K. Review of Preventive and Constructive Measures for Coal Mine Explosions: An Indian Perspective. Int. J. Min. Sci. Technol. 2022, 32, 471–485.
Qiu, L.; Peng, Y.; Song, D. Risk Prediction of Coal and Gas Outburst Based on Abnormal Gas Concentration in Blasting Driving Face. Geofluids 2022, 2022, 3917846.
Yang, D.; Peng, K.; Zheng, Y. Study on the Characteristics of Coal and Gas Outburst Hazard under the Influence of High Formation Temperature in Deep Mines. Energy 2023, 268, 126645.
Diaz, J.; Agioutantis, Z.; Hristopulos, D.T.; Schafrik, S.; Luxbacher, K. Time Series Modeling of Methane Gas in Underground Mines. Min. Metall. Explor. 2022, 39, 1961–1982.
Martirosyan, A.V.; Ilyushin, Y.V. The Development of the Toxic and Flammable Gases Concentration Monitoring System for Coalmines. Energies 2022, 15, 8917.
Morris, P.; Simon, C.M. Computationally Predicting the Performance of Gas Sensor Arrays for Anomaly Detection. Sens. Diagn. 2024, 3, 1699–1713.
Sorg, D. Measuring Livestock CH₄ Emissions with the Laser Methane Detector: A Review. Methane 2022, 1, 38–57.
Chang, G.; Chang, H. Underground Abnormal Sensor Condition Detection Based on Gas Monitoring Data and Deep Learning Image Feature Engineering. Heliyon 2023, 9, e22026.
Ahmad, S.; Lavin, A.; Purdy, S.; Agha, Z. Unsupervised Real-Time Anomaly Detection for Streaming Data. Neurocomputing 2017, 262, 134–147.
Jo, B.; Khan, R. An Event Reporting and Early-Warning Safety System Based on the Internet of Things for Underground Coal Mines: A Case Study. Appl. Sci. 2017, 7, 925.
Praveenchandar, J.; Vetrithangam, D.; Kaliappan, S.; Karthick, M.; Pegada, N.K.; Patil, P.P.; Rao, S.G.; Umar, S. IoT-based Harmful Toxic Gases Monitoring and Fault Detection on the Sensor Dataset Using Deep Learning Techniques. Sci. Program. 2022, 2022, 7516328.
Rezazadeh, N.; De Luca, A.; Perfetto, D. Systematic Critical Review of Structural Health Monitoring under Environmental and Operational Variability: Approaches for Baseline Compensation, Adaptation, and Reference-Free Techniques. Smart Mater. Struct. 2025, 34, 73001.
Wang, Y.; Zhang, L.; Yan, Z.; Deng, J.; Huang, Y.; Qin, Z.; Cao, Y.; Wang, Y. TCN-Transformer Spatio-Temporal Feature Decoupling and Dynamic Kernel Density Estimation for Gas Concentration Fluctuation Warning. Fire 2025, 8, 175.
Liu, C.; Wang, E.; Li, Z.; Zang, Z.; Li, B.; Yin, S.; Zhang, C.; Liu, Y.; Wang, J. Research on Multi-Factor Adaptive Integrated Early Warning Method for Coal Mine Disaster Risks Based on Multi-Task Learning. Reliab. Eng. Syst. Saf. 2025, 260, 111002.
Dey, P.; Chaulya, S.K.; Kumar, S. Hybrid CNN-LSTM and IoT-Based Coal Mine Hazards Monitoring and Prediction System. Process Saf. Environ. Prot. 2021, 152, 249–263.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762.
Kumar, L.; Sripada, S.K.; Sureka, A.; Rath, S.K. Effective Fault Prediction Model Developed Using Least Square Support Vector Machine (LSSVM). J. Syst. Softw. 2018, 137, 686–712.
Xue, J.; Shen, B. Dung Beetle Optimizer: A New Meta-Heuristic Algorithm for Global Optimization. J. Supercomput. 2023, 79, 7305–7336.
Alzaidi, A.A.; Ahmad, M.; Ahmed, H.S.; Solami, E.A. Sine-cosine Optimization-based Bijective Substitution-boxes Construction Using Enhanced Dynamics of Chaotic Map. Complexity 2018, 2018, 9389065.
Tanyildizi, E.; Demir, G. Golden Sine Algorithm: A Novel Math-Inspired Algorithm. Adv. Electr. Comput. Eng. 2017, 17, 71–78.

Figure 1. CH₄ concentration anomaly diagnosis process.

Figure 2. SGDBO-Transformer-LSSVM model framework.

Figure 3. Transformer-LSSVM model hyperparameter optimization framework.

Figure 4. Partial CH₄ concentration line graph.

Figure 5. Example image of sample instance and interpolation image.

Figure 6. Line graph of sample feature indicators.

Figure 7. Statistical graph of sample quantities for different types of datasets.

Figure 8. Box plots of various algorithms running on CEC 2022.

Figure 9. Line graphs of various algorithms running on CEC 2022.

Figure 10. Radar chart and average ranking chart.

Figure 11. Iterative convergence diagram of hyperparameters optimized by different algorithms for the Transformer-LSSVM model.

Figure 12. Transformer-LSSVM hyperparameter experiment diagram.

Figure 13. Radar chart of performance indicators for different classification models.

Figure 14. Prediction error distribution chart for SGDBO-Transformer-LSSVM.

Table 1. Methane anomaly curve-type label settings.

Sample Type	Normal Sample	Drilling Gas Outburst	Reverse Wind Drill	Local Wind Stoppage	Roof Collapse	Sensor Calibration Is Not Standardized	Sensor-Type Impact Failure
Sample labels	0	1	2	3	4	5	6

Table 2. Sample features.

Sample Type	$x_{m a x}$	$\bar{x}$	$σ$	$x_{k u r t}$	${\bar{x}}_{d}$	$σ_{d}$	$x_{d - k u r t}$
Normal sample	0.580	0.292	0.122	2.685	0.032	0.066	32.156
Drilling gas outburst	3.190	1.256	0.881	2.470	0.050	0.150	69.488
Reverse wind drill	1.040	0.593	0.319	1.686	0.007	0.006	7.149
Local wind stoppage	1.770	1.366	0.290	2.325	0.006	0.008	4.418
Roof collapse	2.150	0.964	0.399	2.303	0.084	0.159	11.249
Sensor calibration is not standardized	4.000	0.410	0.734	7.597	0.340	0.733	13.079
Sensor-type impact failures	14.240	9.836	5.930	2.098	0.420	2.281	31.747

Table 3. Statistical table of sample quantities for different types of datasets.

	Normal Sample	Drilling Gas Outburst	Reverse Wind Drill	Local Wind Stoppage	Roof Collapse	Sensor Calibration Is Not Standardized	Sensor-Type Impact Failure
sample size	116	106	139	88	91	73	87

Table 4. Results of the Wilcoxon signed-rank test and rank-sum test under 20Dim conditions.

	20Dim-Wilcoxon Sign Test Results			20Dim-Wilcoxon Rank Sum Test Results
	SDBO	GDBO	SGDBO	SDBO	GDBO	SGDBO
F1	8.0311 × 10⁻¹⁰	5.4126 × 10⁻²	1.4993 × 10⁻⁸	6.7195 × 10⁻¹⁷	9.3221 × 10⁻²	4.7599 × 10⁻¹⁶
F2	2.1184 × 10⁻⁷	7.5569 × 10⁻¹⁰	7.5569 × 10⁻¹⁰	1.4066 × 10⁻¹²	7.0661 × 10⁻¹⁸	7.0661 × 10⁻¹⁸
F3	2.0893 × 10⁻⁶	1.0717 × 10⁻⁷	7.5569 × 10⁻¹⁰	1.7435 × 10⁻⁸	9.4600 × 10⁻¹⁰	8.4620 × 10⁻¹⁸
F4	7.5569 × 10⁻¹⁰	8.0311 × 10⁻¹⁰	7.5569 × 10⁻¹⁰	2.4739 × 10⁻¹⁷	7.5510 × 10⁻¹⁷	7.0661 × 10⁻¹⁸
F5	3.3668 × 10⁻⁹	2.5085 × 10⁻⁹	7.5569 × 10⁻¹⁰	3.7684 × 10⁻¹⁵	3.3752 × 10⁻¹⁵	4.2058 × 10⁻¹⁷
F6	7.1595 × 10⁻⁹	7.5569 × 10⁻¹⁰	7.5569 × 10⁻¹⁰	5.9715 × 10⁻¹⁶	7.0661 × 10⁻¹⁸	7.0661 × 10⁻¹⁸
F7	1.4705 × 10⁻⁷	9.5766 × 10⁻¹	1.2659 × 10⁻⁸	5.4558 × 10⁻⁸	3.9455 × 10⁻¹	2.1605 × 10⁻¹³
F8	3.6788 × 10⁻⁵	4.3380 × 10⁻⁴	1.3238 × 10⁻⁷	8.1172 × 10⁻⁷	1.8891 × 10⁻⁵	1.1737 × 10⁻⁹
F9	2.6198 × 10⁻⁸	7.5569 × 10⁻¹⁰	7.5569 × 10⁻¹⁰	3.8499 × 10⁻¹⁴	7.0661 × 10⁻¹⁸	7.0661 × 10⁻¹⁸
F10	1.9919 × 10⁻⁶	1.7039 × 10⁻⁵	8.8243 × 10⁻⁴	1.6865 × 10⁻⁶	1.2192 × 10⁻⁵	2.6798 × 10⁻³
F11	4.0126 × 10⁻⁹	7.5569 × 10⁻¹⁰	7.5569 × 10⁻¹⁰	1.2227 × 10⁻¹³	7.0661 × 10⁻¹⁸	7.0661 × 10⁻¹⁸
F12	6.0236 × 10⁻⁹	1.7742 × 10⁻⁸	1.3553 × 10⁻⁶	7.8032 × 10⁻¹²	1.3203 × 10⁻¹⁴	4.9294 × 10⁻⁷

Table 5. Statistical results of six indicators.

	Optimal Value				Standard Deviation
	DBO	SDBO	GDBO	SGDBO	DBO	SDBO	GDBO	SGDBO
F1	24,674.671	3925.881	16,655.296	5790.734	11,160.400	6475.174	15,965.870	36,967.735
F2	757.579	488.974	451.848	421.759	243.172	312.337	26.543	41.733
F3	651.737	635.763	620.752	607.738	7.604	10.124	13.953	9.645
F4	936.073	855.499	858.776	834.088	14.211	21.499	15.639	15.556
F5	2480.762	1626.982	1791.505	1062.147	384.475	355.811	246.355	424.095
F6	107,947.065	3387.453	1863.746	1928.317	950,206.054	4,887,986.866	3577.495	7225.608
F7	2120.264	2070.648	2055.488	2046.148	34.009	44.206	108.116	50.411
F8	2243.152	2228.167	2223.516	2225.162	124.971	76.270	69.913	57.871
F9	2589.256	2494.595	2480.840	2480.809	141.577	80.328	2.123	2.259
F10	2543.674	2503.869	2501.279	2500.566	1061.003	1138.090	815.026	1131.126
F11	6267.105	3599.669	2933.661	2832.427	561.510	1086.416	89.606	256.312
F12	3029.282	2951.569	2948.774	2959.030	145.235	75.218	97.803	99.314
	Mean				Median
	DBO	SDBO	GDBO	SGDBO	DBO	SDBO	GDBO	SGDBO
F1	40,452.169	13,301.656	45,056.143	21,868.149	40,052.867	11,589.225	45,523.311	15,966.182
F2	1196.172	750.380	488.055	492.173	1177.303	647.486	478.408	475.500
F3	667.031	655.271	650.264	619.974	667.123	654.117	651.243	617.385
F4	958.414	887.944	889.074	866.949	960.150	886.585	886.574	866.805
F5	3176.424	2302.412	2481.683	1782.836	3113.103	2289.769	2508.380	1790.385
F6	189,072.204	10,341.286	4828.619	7364.613	146,829.706	19,081.704	3508.690	3678.437
F7	2194.045	2144.133	2205.093	2109.393	2198.739	2139.450	2166.844	2099.535
F8	2397.934	2298.778	2309.479	2274.338	2387.885	2260.326	2306.254	2248.602
F9	2739.251	2580.093	2483.288	2481.640	2701.778	2570.604	2482.549	2480.976
F10	2974.821	4642.110	4065.245	3869.570	2589.797	4879.866	4302.358	3905.154
F11	7374.774	5427.446	3023.808	3254.710	7333.953	5279.420	2989.825	3151.129
F12	3265.027	3071.346	3033.412	3126.855	3261.336	3065.984	3002.210	3117.147
	Worst Value				Average Runtime (s)
	DBO	SDBO	GDBO	SGDBO	DBO	SDBO	GDBO	SGDBO
F1	71,970.869	36,595.135	83,566.225	27,503.712	0.020	0.025	0.098	0.069
F2	1718.588	2055.138	576.095	598.068	0.077	0.036	0.025	0.022
F3	680.490	682.651	681.282	653.952	0.037	0.045	0.039	0.034
F4	992.650	950.167	969.159	899.935	0.023	0.026	0.026	0.024
F5	4034.685	3399.718	3201.698	3188.911	0.024	0.027	0.027	0.024
F6	1,613,946.819	415,981.307	20,608.741	25,152.072	0.020	0.020	0.022	0.021
F7	2265.255	2261.594	2509.716	2337.577	0.037	0.052	0.044	0.038
F8	2670.251	2480.569	2478.835	2465.056	0.040	0.057	0.047	0.040
F9	3497.811	3041.436	2490.214	2492.729	0.038	0.057	0.047	0.039
F10	6593.482	6909.626	5110.160	6582.753	0.033	0.045	0.039	0.036
F11	8672.914	8216.632	3375.262	3860.248	0.048	0.097	0.054	0.044
F12	3763.936	3234.444	3589.413	3398.466	0.167	0.231	0.188	0.133

Table 6. Performance indicators for different classification models.

Indicators	SGDBO-RNN		SGDBO-SVM		SGDBO-LSSVM		SGDBO-Transformer		SGDBO-Transformer-LSSVM
Indicators	Training Set	Test Set	Training Set	Test Set	Training Set	Test Set	Training Set	Test Set	Training Set	Test Set
Accuracy	0.913	0.707	0.807	0.546	0.891	0.721	0.933	0.866	0.973	0.906
Precision	0.916	0.766	0.851	0.608	0.892	0.702	0.928	0.872	0.984	0.910
Recall	0.911	0.763	0.801	0.574	0.803	0.698	0.923	0.883	0.998	0.923
F1-Score	0.912	0.713	0.806	0.532	0.845	0.694	0.931	0.864	0.992	0.912

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
