NSA-CHG: An Intelligent Prediction Framework for Real-Time TBM Parameter Optimization in Complex Geological Conditions

Chen, Youliang; Guan, Wencan; Azzam, Rafig; Chen, Siyu

doi:10.3390/ai6060127


¹ Department of Civil Engineering, University of Shanghai for Science and Technology, 516 Jungong Rd., Shanghai 200093, China

² Department of Engineering Geology and Hydrogeology, RWTH Aachen University, Lochnerstr. 4-20 Haus A, D-52064 Aachen, Germany

³ Department of Civil and Transportation Engineering, Shenzhen University, Shenzhen 518060, China

* Author to whom correspondence should be addressed.

AI 2025, 6(6), 127; https://doi.org/10.3390/ai6060127

Submission received: 19 May 2025 / Revised: 5 June 2025 / Accepted: 6 June 2025 / Published: 16 June 2025


Abstract

This study proposes an intelligent prediction framework integrating native sparse attention (NSA) with the Chen-Guan (CHG) algorithm to optimize tunnel boring machine (TBM) operations in heterogeneous geological environments. The framework resolves critical limitations of conventional experience-driven approaches that inadequately address the nonlinear coupling between the spatial heterogeneity of rock mass parameters and mechanical system responses. Three principal innovations are introduced: (1) a hardware-compatible sparse attention architecture achieving O(n) computational complexity while preserving high-fidelity geological feature extraction capabilities; (2) an adaptive kernel function optimization mechanism that reduces confidence interval width by 41.3% through synergistic integration of boundary likelihood-driven kernel selection with Chebyshev inequality-based posterior estimation; and (3) a physics-enhanced modelling methodology combining non-Hertzian contact mechanics with eddy field evolution equations. Validation experiments employing field data from the Pujiang Town Plot 125-2 Tunnel Project demonstrated superior performance metrics, including 92.4% ± 1.8% warning accuracy for fractured zones, ≤28 ms optimization response time, and ≤4.7% relative error in energy dissipation analysis. Comparative analysis revealed a 32.7% reduction in root mean square error (p < 0.01) and 4.8-fold inference speed acceleration relative to conventional methods, establishing a novel data–physics fusion paradigm for TBM control with substantial implications for intelligent tunnelling in complex geological formations.

Keywords: machine learning; TBM excavation; geological exploration; tunnel construction; neural network

1. Introduction

The accelerated advancement of global urbanisation and the sustained growth in demand for infrastructure have made the development of underground space a critical strategic direction in modern engineering construction [1,2,3,4]. In tunnel excavation engineering, the core technological field of underground space utilisation, construction efficiency and safety depend directly on accurately predicting, and responding in real time to, geological conditions ahead of the excavation face. In today’s complex and variable geological environments, the tunnel boring machine (TBM) has emerged as the preferred equipment for large-scale tunnel engineering projects owing to its outstanding mechanical performance, environmental adaptability, and high degree of automation [5,6]. However, operating TBMs efficiently in heterogeneous geological conditions poses severe challenges: key mechanical parameters, such as total thrust, torque, and rotational speed, must be precisely controlled and dynamically optimised in environments that may include fractured zones, water-bearing strata, and high-stress areas [7,8,9]. During construction of the Swiss St. Gotthard Base Tunnel [10], the TBM encountered a quartzite vein with a composition of 12% while traversing gneiss and amphibolite interlayers in the Alps, which necessitated a gradual adjustment of the cutterhead thrust from 28 MN to 35 MN over a 10 m excavation section. In the Singapore Deep Tunnel Sewage Treatment System Project [11], the shield machine experienced a 2.7% pitch angle deviation at the interface between coastal soft soil and residual soil, which required millimetre-level attitude correction via a zoned hydraulic system. Similarly, in China’s Yunnan-Guizhou Water Diversion Project [12], the team faced rock burst risks at the contact zone between basalt and limestone, which required dynamic adjustment of the advance rate based on microseismic monitoring data. The conventional decision-making model, which depends on engineering experience, has shown notable deficiencies in addressing the nonlinear coupling between the spatial heterogeneity of rock mechanical parameters and mechanical system responses, resulting in substantially reduced tunnelling efficiency and increased engineering risk. Consequently, establishing an accurate geological–mechanical coupling dynamic model and developing an intelligent TBM parameter optimisation system have become critical technical challenges in the field of tunnel engineering [13,14,15].

In recent years, the rapid development of artificial intelligence technology has provided unprecedented opportunities for addressing these technical challenges. From the earliest expert systems to contemporary deep learning architectures, machine learning methods have progressively evolved into a significant technical approach for modelling geomechanical parameters [16,17]. This is particularly evident in the continuous maturation of cutting-edge technologies such as deep neural networks, reinforcement learning, and physics-informed neural networks. Consequently, a solid theoretical foundation has been established for constructing intelligent TBM parameter optimization systems [18]. The application of machine learning technologies in the TBM field has undergone a technological evolution from simple regression prediction to complex intelligent control. In the early stages of research, the primary focus was on the utilisation of conventional machine learning algorithms, such as support vector machines and random forests, for the static prediction of TBM performance parameters [19]. The intermediate development phase witnessed the introduction of neural network technologies, which enabled the attainment of a profound level of modelling in the context of geological–mechanical parameter mapping relationships. Contemporary research is oriented towards comprehensive modelling approaches that integrate deep learning, reinforcement learning, and physical constraints, with the objective of achieving enhanced prediction accuracy and augmented real-time response capabilities [20,21].

The following research achievements have been identified: firstly, the Geological Response TBM Model (GRTBM) proposed by Cao et al. [22], which established dynamic correlations between operational parameter adjustments and TBM monitoring data through offline reinforcement learning; secondly, the integration of multi-source heterogeneous data by Yin et al. [23] using Bayesian classification algorithms to achieve real-time discrimination of surrounding rock classes, which substantially enhanced geological perception during TBM excavation; and thirdly, the contributions of Liu et al. [24]. Notably, Amoroso et al. revealed the critical influence of model selection on prediction accuracy by comparing the performance of various neural network architectures in TBM parameter prediction. Shan et al. [25] conducted a systematic investigation into the effect of penetration rate (PR) on TBM excavation efficiency, employing a hybrid modelling approach that combined the autoregressive integrated moving average (ARIMA) model with recurrent neural networks (RNNs), based on data from the Zhengzhou Metro tunnel project. Zhang et al. [26] proposed a TBM tunnelling parameter prediction method based on a particle swarm optimization–bidirectional long short-term memory (PSO-Bi-LSTM) model. By dynamically adjusting network hyperparameters with the particle swarm algorithm and capturing the forward–backward temporal characteristics of excavation parameters with the bidirectional LSTM, they achieved precise prediction of key parameters such as cutterhead rotation speed and advance rate, supporting intelligent TBM construction decision-making under similar geological conditions. Zhou et al. [27] constructed multi-dimensional prediction models based on rock quality designation (RQD) and uniaxial compressive strength (UCS) by optimising support vector machine (SVM) hyperparameters with three swarm intelligence algorithms: grey wolf optimisation (GWO), the whale optimisation algorithm (WOA), and moth–flame optimisation (MFO). Their comparison further showed that GWO-SVM performed best in hard rock formations, while WOA-SVM adapted better to jointed rock formations.

Notwithstanding the considerable progress achieved in the fields of excavation precision control and construction safety enhancement, current research continues to encounter the following issues.

Firstly, model generalisation is inadequate. Although existing research (e.g., PSO-Bi-LSTM and GWO-SVM) has optimised prediction performance for specific geological conditions, universal validation across different strata and projects is lacking and model transferability is limited [28]. This makes adaptation to complex and variable geological engineering environments difficult.

Secondly, real-time dynamic adaptability is deficient. Offline reinforcement learning models, such as GRTBM, rely on training with historical data, resulting in a delayed online response to sudden geological changes during excavation [29]. This stems from incomplete closed-loop feedback between real-time parameter adjustment and geological perception.

Thirdly, the issue of insufficient depth in multi-source data fusion has been identified. Existing methods (e.g., Bayesian classification and SVM multi-dimensional prediction) inadequately consider the physical constraints of rock–machine interaction mechanisms [28,29,30], failing to fully exploit the nonlinear coupling relationships among geological parameters, mechanical responses, and operational decisions. This, in turn, affects the interpretability of prediction results.

Fourthly, the optimisation algorithms themselves have limitations. Swarm intelligence algorithms (PSO, GWO, etc.) are susceptible to local optima in hyperparameter tuning [31], and the fundamental problem of balancing computational efficiency with prediction accuracy remains unresolved, constraining real-time processing capabilities for large-scale engineering data.

Fifthly, there are deficiencies in the modelling of long-term temporal dependency. Despite the fact that models such as Bi-LSTM are capable of capturing short-term characteristics of excavation parameters, the modelling of long-term evolutionary patterns such as cutterhead wear and surrounding rock creep remains inadequate. This inadequacy has a detrimental effect on the long-term reliability of prediction results.

In order to address the aforementioned technical challenges, the study designed and implemented an intelligent prediction framework that integrates the native sparse attention mechanism (NSA) [32,33] with the Chen-Guan (CHG) [34,35] algorithm. The fundamental components of this framework are manifested in three aspects:

(1) Collaborative optimisation architecture for computational efficiency and feature extraction. The NSA mechanism employs a hardware-compatible computational architecture [33] that employs dynamic hierarchical sparse strategies (integrating dynamic token compression with fine-grained feature selection) to ensure the efficient identification and extraction of critical geological features (e.g., uniaxial compressive strength, UCS) while maintaining O(n) linear computational complexity.

(2) Dynamic regulation mechanism for prediction accuracy and uncertainty. The CHG algorithm [34] is built around an adaptive optimization kernel that combines boundary likelihood-driven kernel selection with optimal posterior estimation based on Chebyshev’s inequality [36], which significantly enhances the reliability of prediction results. This is demonstrated by a substantial reduction (41.3%) in the confidence interval width of TBM mechanical parameter predictions.

(3) Innovative architecture integrating physical mechanisms with data-driven methods. Physics-informed neural networks (PINNs) [37,38] serve as the unifying conduit, incorporating physical laws, such as non-Hertzian contact mechanics models of cutter–rock interaction and vortex field evolution equations, into the learning process. This yields a geological–mechanical joint inversion model that overcomes the theoretical limitations of traditional data-driven methods while ensuring the physical consistency of the model.

The synergy of these three components addresses the challenges identified above: generalization, real-time performance, fusion depth, optimization efficiency, and temporal modelling. In summary, the NSA algorithm efficiently extracts critical geological features, such as uniaxial compressive strength (UCS), through dynamic hierarchical sparse strategies (dynamic token compression and fine-grained feature selection) while maintaining O(n) computational complexity. The CHG algorithm integrates boundary likelihood-driven dynamic kernel selection with optimal posterior estimation based on Chebyshev’s inequality, achieving a 41.3% reduction in the confidence interval width of TBM mechanical parameter predictions. Finally, physics-informed neural networks (PINNs), combined with non-Hertzian contact mechanics models of cutter–rock interactions and vortex field evolution equations, yield a geological–mechanical joint inversion model that overcomes the lack of guaranteed physical consistency in traditional data-driven methods.

The present study systematically validated the engineering performance of the model in three aspects: fractured zone advance warning (warning accuracy: 92.4% ± 1.8%), the dynamic optimization of mechanical parameters (response time ≤ 28 ms), and energy consumption analysis (relative error ≤ 4.7%), based on measured data from the Shanghai Construction Group Central Research Institute’s Pujiang Town Plot 125-2 Tunnel Project (Project No.: 24YJKF-27). The experimental findings demonstrate that, in comparison with conventional methodologies (standard transformer and static Gaussian process regression), the proposed framework exhibits substantial advantages in terms of prediction accuracy (32.7% reduction in root mean square error, p < 0.01) and computational efficiency (4.8-fold enhancement in inference speed). This transformation of TBM control from an experience-driven mode to a data–physics dual-driven mode is a notable accomplishment. The research findings provide a theoretical foundation and technical support for intelligent tunnel construction under complex geological conditions, possessing significant engineering application value.

2. Materials and Methods

2.1. Experimental Data Collection

The present study employs tunnel engineering data from the Pujiang Town Plot 125-2 Tunnel Project, implemented by the Shanghai Construction Group Central Research Institute (Project No.: 24YJKF-27). Preliminary geological exploration employed tunnel geophysical seismic (TGS) [39] technology, which generates seismic waves to produce geological radar profiles extending 30 metres ahead of the excavation face. Figure 1a presents the geological radar imaging outcomes from the preliminary site investigation, while Figure 1b,c illustrate the configuration of TGS transducers within the tunnel excavation environment. Figure 2a shows the TBM main body, while Figure 2b,c illustrate the tunnel face and its changes during excavation.

2.2. Native Sparse Attention (NSA) Algorithm

The datasets generated during the excavation of the tunnel are both geological and mechanical in nature. They are characterised by their enormous volume and intricate interrelationships. Whilst standard attention mechanisms, such as transformer architectures, effectively address these complex correlations, they impose substantial computational demands that present significant challenges for real-time implementation. Sparse attention mechanisms have been identified as a potentially valuable avenue for enhancing computational efficiency while preserving predictive performance. A plethora of sparse attention methodologies have been proposed, including KV cache eviction techniques, block-wise KV cache selection approaches, and methods based on sampling, clustering, or hashing principles. However, these approaches are inherently limited in practical deployment scenarios, as evidenced by their failure to achieve theoretical acceleration effects or inadequate support during training phases [40,41].

Native sparse attention (NSA) [33] represents a hardware-aligned sparse attention mechanism that implements a dynamic hierarchical sparsity strategy, integrating coarse-grained token compression with fine-grained token selection to maintain the precise capture of globally significant geological information [42]. Through algorithmic innovations and hardware optimizations, NSA achieves efficient geological reconstruction modelling with substantially reduced computational overhead [43]. Figure 3 presents the foundational native sparse attention mechanism, implementing a hardware-aligned sparse attention strategy. The architecture features dynamic hierarchical sparsity through coarse-grained token compression and fine-grained token selection, achieving O(n) computational complexity while maintaining geological feature extraction precision.

To adapt the original NSA algorithm to the characteristics of TGS geological data and TBM excavation mechanical parameter data, the mechanism must efficiently capture the uniaxial compressive strength (UCS) of the rock. This also facilitates pre-training the locally deployed NSA model with the original learning scheme and sample-efficient (small-sample) training methods. Figure 4 illustrates the optimised NSA architecture for capturing tunnel rock UCS; the complete architecture is provided in Appendix A. Equations (1) and (2) give the standard attention output and attention weights, and Equation (3) gives the UCS-modified attention weights.

O_{t} = \mathrm{Attn} (q_{t}, k_{t}, v_{t}) = \sum_{i = 1}^{t} a_{t, i} v_{i}

(1)

a_{t, i} = \frac{e^{\frac{q_{t}^{T} k_{i}}{\sqrt{d_{k}}}}}{\sum_{j = 1}^{t} e^{\frac{q_{t}^{T} k_{j}}{\sqrt{d_{k}}}}}

(2)

a_{t^{'}, i} = \frac{e^{\frac{(q_{t^{'}}^{T} + \mathrm{UCS}_{t}) k_{i}}{\sqrt{d_{k}}}}}{\sum_{j = 1}^{t} e^{\frac{(q_{t^{'}}^{T} + \mathrm{UCS}_{t}) k_{j}}{\sqrt{d_{k}}}}}

(3)

where O_{t} is the output at position t; \mathrm{Attn} is the attention function; q_{t} is the query vector at position t; k_{t} is the key vector at position t; a_{t, i} is the attention weight from position t to position i; and \mathrm{UCS}_{t} denotes the rock strength quantification parameters.
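To make Equations (1)–(3) concrete, the following minimal sketch evaluates the dense causal attention weights and outputs in plain Python. It is illustrative only: it does not implement the sparse compression and selection branches of NSA, and it assumes that the UCS term in Equation (3) is a vector with the same dimension as the query that is added to the query before the dot product. The function name `causal_attention` and the argument `ucs_bias` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def causal_attention(queries, keys, values, ucs_bias=None):
    """Dense causal attention following Eqs. (1)-(3).

    ucs_bias is an optional list of vectors (one per position, same
    dimension as the queries) added to each query; this is an assumed
    reading of the UCS_t term in Eq. (3), not the authors' implementation.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for t, q in enumerate(queries):
        if ucs_bias is not None:
            q = [qi + ui for qi, ui in zip(q, ucs_bias[t])]
        # Position t attends only to positions 0..t (the sum over i <= t).
        scores = [dot(q, keys[i]) / math.sqrt(d_k) for i in range(t + 1)]
        a = softmax(scores)
        out = [sum(a[i] * values[i][c] for i in range(t + 1))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(a)
    return outputs, weights
```

Each row of `weights` sums to one, and a zero `ucs_bias` reproduces Equation (2) exactly. A bias aligned with a particular key direction increases the weight assigned to that key.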

2.3. Chen-Guan (CHG) Algorithm Establishment

In complex rock tunnel excavation projects, the field penetration index (FPI) and torque penetration index (TPI) of the TBM indirectly reflect the impact of geological conditions on tunnel construction. They provide insight into the distribution and mechanical properties of complex tunnel rocks and are expressed as follows.

F_{FPI} = \frac{F \cdot S_{CR}}{S_{AR}}

(4)

T_{TPI} = \frac{T \cdot S_{CR}}{S_{AR}}

(5)

where F is the total thrust; S_{CR} is the cutterhead rotation speed; T is the cutterhead torque; and S_{AR} is the advancing speed. When rock stiffness is high, the FPI and TPI values increase because the denominators (advancing speed) become smaller while the numerators (thrust and torque) become larger [44,45]. Conversely, when rock stiffness decreases, both FPI and TPI values diminish accordingly, indirectly reflecting the distribution of geological formations within the tunnel environment. Consequently, predicting appropriate time-variant TBM mechanical parameter curves based on geological conditions necessitates an algorithm capable of handling weak mapping relationships while providing robust normalization capabilities.
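As a small worked illustration of Equations (4) and (5), the sketch below computes both indices from a single set of operating values. The function name `penetration_indices` and the numerical values and units are hypothetical and chosen only to show the direction of the effect; they are not project data.

```python
def penetration_indices(thrust, torque, cutterhead_speed, advance_rate):
    """Return (FPI, TPI) following Eqs. (4) and (5).

    Assumed units for this example: thrust in kN, torque in kN*m,
    cutterhead_speed in rev/min, advance_rate in mm/min.
    """
    if advance_rate <= 0:
        raise ValueError("advance_rate must be positive")
    fpi = thrust * cutterhead_speed / advance_rate
    tpi = torque * cutterhead_speed / advance_rate
    return fpi, tpi


# Hypothetical readings: the same thrust, torque, and rotation speed,
# but a lower advance rate in the stiffer rock.
soft_rock = penetration_indices(12000.0, 3000.0, 2.0, 40.0)
hard_rock = penetration_indices(12000.0, 3000.0, 2.0, 20.0)
```

Halving the advance rate at constant thrust, torque, and rotation speed doubles both indices, which is the stiffness signal described above.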

This research introduces the Chen-Guan (CHG) algorithm, an optimized Gaussian process-based approach whose core components are illustrated in Figure 5. The algorithm consists of three primary stages: CHG Gaussian process prior formulation, hyperparameter learning, and CHG Gaussian inference [46]. It employs composite kernel functions and applies Chebyshev inequality-based optimization to the posterior process, substantially improving the confidence intervals of the predictive distributions. Given the inherent stochasticity of complex geological data at the tunnel face, the CHG algorithm also incorporates a kernel function selection mechanism driven by the marginal (boundary) likelihood of the data [47].

Furthermore, the radial basis function (RBF) kernel has been optimised through the introduction of adaptive scaling factors. The aforementioned scaling factors are defined through the utilisation of a key–value pair approach, which effectively implements an attention mechanism optimisation strategy. The methodology incorporates fine-tuned large language models (LLMs) to perform comparative operational analyses and data optimisation on a specialised excavation face database. The transmission of refined parameters to the attention mechanism that controls the scaling factors is an integral component of the process. This mechanism is responsible for computing appropriate scaling values, which are subsequently incorporated into the radial basis function (RBF) kernel for further computations. The integration of these processes enables the prediction of TBM mechanical parameters based on critical geological information captured by the native sparse attention (NSA) mechanism. Through the implementation of suitable normalisation techniques, the system successfully integrates with the NSA pre-trained model, thereby achieving comprehensive excavation prediction capabilities that are both enhanced in terms of accuracy and computational efficiency.

2.3.1. Kernel Function Optimization

Gaussian process algorithms commonly use the RBF kernel and the Matérn kernel [48]. The RBF kernel is better suited to smooth data, whereas the Matérn kernel gives better results for rougher, discretized data. Both kernels are shown below.

K (x, x^{'}) = σ_{f}^{2} \exp (- \frac{‖ x - x^{'} ‖^{2}}{2 l^{2}})

(6)

K_{ν} (x, x^{'}) = \frac{2^{1 - ν}}{Γ (ν)} {(\frac{\sqrt{2 ν} ‖ x - x^{'} ‖}{l})}^{ν} k_{ν} (\frac{\sqrt{2 ν} ‖ x - x^{'} ‖}{l})

(7)

where σ_{f}^{2} is the signal variance, l is the length scale, ν is the smoothness parameter, Γ (ν) is the gamma function, ‖ x - x^{'} ‖ is the Euclidean distance between two points, and k_{ν} is the modified Bessel function of the second kind [49]. Equations (6) and (7) are combined through a decision mechanism: when the data fall within a given range, the marginal likelihood (the boundary likelihood value) of each kernel within that range is calculated to determine which kernel function should be selected. The log marginal likelihood is calculated as follows.

\log p (y | x, K) = - \frac{1}{2} y^{T} K^{- 1} y - \frac{1}{2} \log | K | - \frac{n}{2} \log 2 π

(8)

where y is the observation vector, x is the input feature matrix, K is the covariance matrix generated by the kernel function, and n is the number of observations. As shown in Figure 6, if the marginal likelihood value of the RBF kernel is higher, the data are more likely to be smooth; if the marginal likelihood value of the Matérn kernel is higher [49], the data are likely to be rougher. Therefore, by introducing a dynamic judgment mechanism, the reliability of the CHG prior process can be ensured, improving the stability of training parameters and providing a foundation for subsequent CHG inference as well as hyperparameter optimization and learning.
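The selection rule can be sketched as follows: evaluate the log marginal likelihood of Equation (8) under each kernel and keep the kernel with the larger value. This is a minimal illustration, not the CHG implementation. It assumes one-dimensional inputs, fixed hyperparameters, a small noise term added to the diagonal for numerical stability, and the closed-form Matérn kernel for ν = 3/2 (which avoids evaluating the Bessel function). All function names are hypothetical.

```python
import math


def rbf(r, length=1.0, sigma_f=1.0):
    """RBF kernel of Eq. (6) as a function of the distance r."""
    return sigma_f ** 2 * math.exp(-r * r / (2.0 * length ** 2))


def matern32(r, length=1.0):
    """Matern kernel for nu = 3/2 (closed form of Eq. (7))."""
    s = math.sqrt(3.0) * r / length
    return (1.0 + s) * math.exp(-s)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_marginal_likelihood(xs, ys, kernel, noise=1e-2):
    """Eq. (8) with K built from `kernel` plus `noise` on the diagonal."""
    n = len(xs)
    K = [[kernel(abs(xs[i] - xs[j])) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    low = cholesky(K)
    # Forward substitution: solve low * z = y, so that y^T K^-1 y = z^T z.
    z = []
    for i in range(n):
        z.append((ys[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i])
    quad = sum(v * v for v in z)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * quad - 0.5 * log_det - 0.5 * n * math.log(2.0 * math.pi)


def select_kernel(xs, ys, noise=1e-2):
    """Return the name of the kernel with the higher log marginal likelihood."""
    scores = {
        "rbf": log_marginal_likelihood(xs, ys, rbf, noise),
        "matern32": log_marginal_likelihood(xs, ys, matern32, noise),
    }
    return max(scores, key=scores.get), scores
```

In practice the hyperparameters of each kernel would be optimised before the comparison; they are held fixed here to keep the sketch short.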

2.3.2. Radial Basis Function (RBF) Optimization

This section describes scale control and the optimized configuration of the radial basis function (RBF), laying a theoretical foundation for the posterior analysis. As Figure 7 shows, the RBF optimization mechanism regulates the evolution of the covariance function by introducing a dynamic scaling factor. The scaling factor is determined with a multimodal parameter optimization method based on the DIFY framework [42,50]. First, the large-scale model Qwen2-72B, fine-tuned on technical parameters for the dynamic excavation of complex rock tunnels, performs numerical simulation and parameter inference on the engineering conditions and training parameter sets in the database [32,51,52]. The optimized parameters are then fed into the independently designed attention feed-forward network, where normalization, residual connections, and standardization operations deliver the dynamic analysis and adaptive adjustment of the scaling factor. Equation (9) represents the internal Gaussian function of CHG, and Equation (10) represents the radial basis function.

f (x) = \sum_{i = 1}^{I} w_{i} ϕ_{i} (x), w_{i} \sim N (0, \frac{σ^{2}}{I})

(9)

ϕ_{i} (x) = \exp (- \frac{{(x - c_{i})}^{2}}{2 l^{2}} \cdot a)

(10)

where f (x) is the CHG internal Gaussian function, w_{i} is the corresponding random weight, ϕ_{i} is the radial basis function centred at c_{i}, I is the number of basis functions, and a is the dynamic scaling factor. The final RBF kernel function obtained from Equations (9) and (10) is

K (x, x^{'}) = \frac{σ^{2}}{I} \sum_{i = 1}^{I} \exp (- \frac{{(x - c_{i})}^{2}}{2 l^{2}} \cdot a) \exp (- \frac{{(x^{'} - c_{i})}^{2}}{2 l^{2}} \cdot a)

(11)
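Equation (11) is the covariance of f(x) and f(x') under the weight prior of Equation (9), so it can be evaluated directly as a finite sum over basis functions. The sketch below does this for scalar inputs. The function names, the choice of centres, and the default parameter values are hypothetical; in the framework described above, the scaling factor a would come from the attention feed-forward network rather than being passed in by hand.

```python
import math


def basis(x, centre, length, a):
    """Scaled radial basis function of Eq. (10)."""
    return math.exp(-((x - centre) ** 2) / (2.0 * length ** 2) * a)


def chg_rbf_kernel(x, x_prime, centres, length=1.0, a=1.0, sigma=1.0):
    """Finite-basis kernel of Eq. (11) for scalar inputs."""
    count = len(centres)
    total = sum(basis(x, c, length, a) * basis(x_prime, c, length, a)
                for c in centres)
    return sigma ** 2 / count * total
```

A larger scaling factor narrows every basis function, so the covariance between points away from the centres decays faster. This is how the factor controls the effective length scale of the kernel.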

2.3.3. Posterior Optimization of the CHG Algorithm

The predictive output of the CHG algorithm takes the same form as that of a Gaussian process: a probability distribution, termed the posterior distribution, which provides confidence intervals for the prediction results. In this section, an adaptive optimization mechanism for these confidence intervals is implemented for the CHG posterior process by introducing Chebyshev’s inequality. The parameter k in Chebyshev’s inequality is treated as a threshold parameter that bounds the probability of a prediction deviating from its mean. When a predicted value falls outside the preset confidence interval, the system identifies the deviation as anomalous and triggers re-optimization. This mechanism raises the confidence level dynamically (from 95% to 98%). The mathematical derivation is as follows.

Standard Chebyshev’s inequality:

P (|X - μ| \geq k σ) \leq \frac{1}{k^{2}}

(12)

P denotes the probability measure, μ represents the mathematical expectation of the random variable, σ represents the standard deviation of the random variable, and k is any positive real number.

Parameter initialization:

α_{0} = 0.05 (95 %)

(13)

k_{0} = \sqrt{\frac{1}{α_{0}}} = \sqrt{20} \approx 4.47

(14)

In the formula, the initial confidence level is set to 95% (α_{0} = 0.05), and the initial threshold coefficient k_{0} is calculated based on Chebyshev’s inequality. This threshold determines the width of the initial confidence interval.

Calculate CHG process posterior statistics:

μ^{*} (x^{*}) = μ (x^{*}) + k^{T} {(K + σ_{n}^{2} I)}^{- 1} (y - μ)

(15)

σ^{*} (x^{*}) = \sqrt{k^{*} (x^{*}, x^{*})} = \sqrt{k (x^{*}, x^{*}) - k^{T} {(K + σ_{n}^{2} I)}^{- 1} k}

(16)

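In the formulas, the posterior mean μ^{*} and the posterior standard deviation σ^{*} of the prediction point are calculated from the training data. Here K is the covariance matrix of the training inputs, k is the vector of covariances between the prediction point x^{*} and the training inputs (distinct from the Chebyshev threshold coefficient k), and σ_{n}^{2} is the observation noise variance. These two statistics serve as the foundation for constructing confidence intervals.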
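The posterior statistics of Equations (15) and (16) can be evaluated with a few lines of linear algebra. The sketch below is a generic Gaussian process posterior for scalar inputs, not the CHG implementation: it assumes a constant prior mean, a plain RBF kernel with fixed hyperparameters, and a small dense solver that is adequate only for a handful of training points. All names are hypothetical.

```python
import math


def rbf(x, y, length=1.0, sigma_f=1.0):
    """RBF kernel of Eq. (6) for scalar inputs."""
    return sigma_f ** 2 * math.exp(-((x - y) ** 2) / (2.0 * length ** 2))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def gp_posterior(xs, ys, x_star, noise_var=1e-4, prior_mean=0.0):
    """Posterior mean (Eq. (15)) and standard deviation (Eq. (16)) at x_star."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_vec = [rbf(x, x_star) for x in xs]
    alpha = solve(K, [y - prior_mean for y in ys])   # (K + s^2 I)^-1 (y - mu)
    v = solve(K, k_vec)                              # (K + s^2 I)^-1 k
    mean = prior_mean + sum(k * a for k, a in zip(k_vec, alpha))
    var = rbf(x_star, x_star) - sum(k * vi for k, vi in zip(k_vec, v))
    return mean, math.sqrt(max(var, 0.0))
```

Near a training point the posterior standard deviation shrinks towards the noise level, while far from the data it returns to the prior value. These are the quantities that the Chebyshev threshold multiplies.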

Anomaly detection and determination:

| f^{*} (x^{*}) - μ^{*} (x^{*}) | > k_{i} \cdot σ^{*} (x^{*})

(17)

Formula (17) calculates the deviation between the predicted value and the posterior mean, comparing the deviation with the current threshold coefficient and standard deviation. If it exceeds the range, it is considered an anomalous prediction, requiring an adjustment of the confidence interval.

Dynamic threshold adjustment:

Δ k = β \cdot \frac{| f^{*} (x^{*}) - μ^{*} (x^{*}) |}{σ^{*} (x^{*})}

(18)

k_{new} = k_{old} + Δ k

(19)

In the formula, Δ k represents the threshold increment calculated based on the degree of anomaly, and β is the adjustment coefficient, typically set to 0.1, controlling the adjustment magnitude.

Confidence interval reconstruction:

α_{new} = \frac{1}{k_{new}^{2}}

(20)

{CI}_{CHG} = [μ^{*} (x^{*}) - k_{new} \cdot σ^{*} (x^{*}), μ^{*} (x^{*}) + k_{new} \cdot σ^{*} (x^{*})]

(21)

Equation (20) recalculates the significance level from the new threshold coefficient, and Equation (21) constructs the adaptive Chebyshev confidence interval. The new interval guarantees a coverage probability of at least (1 − α_{new}).
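The update in Equations (14) and (17)–(21) can be traced numerically with the short sketch below. It performs a single adjustment pass for one prediction point. The function names are hypothetical, and the sample values in the usage note are chosen only for illustration.

```python
import math


def chebyshev_k(alpha):
    """Threshold coefficient k satisfying 1 / k^2 = alpha (Eq. (14))."""
    return math.sqrt(1.0 / alpha)


def adapt_interval(f_star, mu_star, sigma_star, k, beta=0.1):
    """One pass of Eqs. (17)-(21).

    Returns (k_new, alpha_new, (lower, upper)). The threshold is only
    increased when the deviation exceeds k * sigma_star (Eq. (17)).
    """
    deviation = abs(f_star - mu_star)
    if deviation > k * sigma_star:                 # Eq. (17): anomaly detected
        k = k + beta * deviation / sigma_star      # Eqs. (18) and (19)
    alpha = 1.0 / k ** 2                           # Eq. (20)
    interval = (mu_star - k * sigma_star,          # Eq. (21)
                mu_star + k * sigma_star)
    return k, alpha, interval
```

For example, starting from `k0 = chebyshev_k(0.05)` (about 4.47), a prediction that deviates by ten posterior standard deviations raises the threshold by β · 10 = 1.0 to about 5.47, and the interval widens accordingly. Under the same bound, a 98% level (α = 0.02) corresponds to k = √50 ≈ 7.07.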

Figure 8 illustrates that the CHG algorithm employs an adaptive confidence interval optimization mechanism based on Chebyshev’s inequality, enhancing prediction reliability through dynamic adjustment. This process initially establishes a 95% confidence level and calculates the corresponding threshold coefficient k_{0} to determine the confidence interval range. During the prediction process, the algorithm continuously monitors the deviation between predicted values and posterior means. When the deviation exceeds the preset threshold, the system automatically identifies these anomalous predictions and triggers optimization procedures. At this point, the algorithm dynamically increases the threshold coefficient according to the degree of anomaly, adaptively elevating the confidence level from 95% to 98%, while correspondingly broadening the coverage range of the confidence interval. This self-regulating mechanism enables the CHG algorithm to significantly improve the reliability and robustness of prediction results while maintaining prediction accuracy.

3. Model Establishment

3.1. Original Data Optimization

Figure 9 (ISRM rock UCS classification box plot [53,54]) provides a standardised geological strength calibration basis for the NSA algorithm’s analysis of Figure 10 (TGS advanced geological exploration radar image). The box plot establishes the correspondence between rock UCS values and strength grades, giving a quantitative classification benchmark for the wave velocity characteristics (which reflect rock stiffness) displayed in the TGS radar image. Mapping radar wave velocity to UCS values enables associative modelling between rock mechanical properties and geophysical exploration data. The original geological radar data in Figure 10 (wave velocity distributions of different rock layers) must be converted into vectorised features through pre-processing such as wavelet transforms. The UCS classification system in Figure 9 supplies standardised labels for this process, enabling the NSA algorithm to identify rock strength grades accurately during feature extraction. These physically meaningful training labels support the NSA-CHG framework’s dynamic optimization of TBM mechanical parameters and three-dimensional geological reconstruction.

The TGS geological radar image of the area 30 metres behind the tunnel face is shown in Figure 10. The horizontal axis denotes the diameter of the tunnel excavation, the left vertical axis indicates the seismic wave propagation time, and the right vertical axis signifies the depth of the tunnel behind the face. Subfigures a to f illustrate the geological radar imaging 30 m behind the tunnel face as excavation progresses.

As demonstrated in Figure 10, the raw geological radar data cannot be fed directly into the NSA algorithm. Vectorisation pre-processing is required so that the input data possess both vectorised features and dimensional consistency. The data undergo token compression as they pass through the capture module, where feature extraction takes place. These data are subsequently input into the feed-forward network layer, where the calculation of the attention score is completed. The present study employs the wavelet transform method illustrated in Figure 11 to convert the geological radar image into a standardised training data format that fulfils the input requirements of the NSA algorithm.
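One way to obtain fixed-length vectors from radar images of varying size is to summarise each wavelet sub-band by its energy. The sketch below does this with an unnormalised Haar wavelet in pure Python. The choice of the Haar wavelet, the two decomposition levels, and the energy summary are assumptions for illustration; the study does not state which wavelet family or feature layout it uses, and the names `haar_2d` and `wavelet_features` are hypothetical.

```python
def haar_1d(values):
    """One Haar level: pairwise averages and pairwise half-differences."""
    pairs = range(len(values) // 2)
    avg = [(values[2 * i] + values[2 * i + 1]) / 2.0 for i in pairs]
    diff = [(values[2 * i] - values[2 * i + 1]) / 2.0 for i in pairs]
    return avg, diff


def _transform_rows(matrix):
    """Apply haar_1d to every row; return (low-pass, high-pass) matrices."""
    low, high = [], []
    for row in matrix:
        avg, diff = haar_1d(row)
        low.append(avg)
        high.append(diff)
    return low, high


def _transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def haar_2d(img):
    """One 2D Haar level, returning the (LL, LH, HL, HH) sub-bands."""
    low, high = _transform_rows(img)              # filter along rows
    ll, lh = _transform_rows(_transpose(low))     # then along columns
    hl, hh = _transform_rows(_transpose(high))
    return tuple(_transpose(band) for band in (ll, lh, hl, hh))


def _energy(band):
    """Mean squared coefficient of a sub-band."""
    values = [v * v for row in band for v in row]
    return sum(values) / len(values)


def wavelet_features(img, levels=2):
    """Fixed-length feature vector: 3 detail energies per level + final LL energy."""
    if len(img) % (2 ** levels) or len(img[0]) % (2 ** levels):
        raise ValueError("image sides must be divisible by 2**levels")
    features, current = [], img
    for _ in range(levels):
        ll, lh, hl, hh = haar_2d(current)
        features += [_energy(lh), _energy(hl), _energy(hh)]
        current = ll
    features.append(_energy(current))
    return features
```

The vector length is `3 * levels + 1` regardless of image size, which is one simple way to satisfy the dimensional-consistency requirement described above.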

3.2. Tunnel Boring Geological Reconstruction

As outlined in Section 3.1, this study employs a methodology that integrates topology structure analysis with MySQL database technology to construct a comprehensive NSA pre-training database system. As demonstrated in Figure 12, the schematic diagram of the database topology structure illustrates that the system performs geological parameter clustering analysis based on a unified classification standard (UCS). When combined with the engineering case data foundation, it can effectively identify key sensitive parameters in the tunnel geological environment and enable 3D geological model reconstruction.

The learning mechanism of the NSA algorithm is shown in Figure 13. This process is constructed based on the correlation between the full-face tunnel boring machine (TBM) mechanical parameters and the uniaxial compressive strength (UCS) of rocks, as revealed by the clustering analysis of the database in Figure 12.

Figure 14 shows the automatic identification and classification ability of the uniaxial compressive strength (UCS) values in the geological radar spectra, while Figure 15 quantifies the prediction accuracy of this detection method.

Figure 16 shows the 3D geological model of a tunnel project constructed using the NSA algorithm from the unified classification system (UCS) parameters [55], the radar map data, and the geological database.

3.3. Training and Output Augmentation

As outlined in Section 3.2 of this study, the CHG algorithm integrates key geological information captured by the pre-trained NSA model. This information includes, but is not limited to, fault zones, water-bearing layers, alteration zones, and high in situ stress layers. The integration of this information enables dynamic, time-varying predictions of TBM total torque, total thrust, and rotational speed in response to geological condition changes. On this basis, a geological–mechanical parameter collaborative inversion model is constructed by introducing physics-informed neural networks (PINN) to couple the cutterhead vortex field dynamic evolution characteristics with the multi-scale displacement field variation of the PFC cutterhead. This results in a significant improvement in the prediction accuracy of TBM mechanical parameters. As illustrated in Figure 17, the performance of the CHG and NSA fusion architecture during the training process is systematically demonstrated. The dynamic variation trends of the training and validation curves are specifically shown, while key metrics such as the minimum values of the loss functions for both the training and validation sets are also recorded.

As illustrated in Figure 18, the visualisation simulation of the cutter–rock interaction combines rigid body kinematics, simplified contact mechanics (non-Hertzian model), Newtonian dynamics, and the discretisation of the vortex field. The simulation is implemented numerically and serves to enhance the PINN loss function. Equations (22) and (23) express the rigid-body kinematic constraints [56].

$$x_{t + \Delta t} = x_{t} + v_{t} \Delta t + \frac{1}{2} a_{t} \Delta t^{2} \tag{22}$$

$$q_{t + \Delta t} = q_{t} + \frac{1}{2} q_{t} \otimes [0, w_{t}] \Delta t \tag{23}$$

In these formulas, $x_{t}$ represents the tool position, $v_{t}$ is the tool linear velocity, $a_{t}$ is the tool acceleration, $q_{t}$ is the quaternion for the tool direction, and $w_{t}$ is the angular velocity. The symbol $\otimes$ represents quaternion multiplication. Equations (24) and (25) represent the non-Hertzian contact model [57].

$$F_{n} = k_{n} \delta_{n}^{a} n + D_{n} V_{n} \tag{24}$$

$$F_{t} = \min \left( \mu | F_{n} |, k_{t} \| \delta_{t} \| \right) \frac{\delta_{t}}{\| \delta_{t} \|} + D_{t} v_{t} \tag{25}$$

In these formulas, $F_{n}$ is the normal contact force, $F_{t}$ is the tangential contact force, $\delta_{n}$ is the normal penetration depth (raised to the exponent $a$ in Equation (24)), $\delta_{t}$ is the tangential displacement, $k_{n}$ and $k_{t}$ are the normal and tangential stiffness coefficients, $D_{n}$ and $D_{t}$ are the normal and tangential damping coefficients, $\mu$ is the friction coefficient, and $V_{n}$ and $v_{t}$ are the normal and tangential relative velocities. Equations (26) and (27) represent Newtonian mechanics constraints [58].

$$m a = \sum F \tag{26}$$

$$I \alpha + w \times (I w) = \sum M \tag{27}$$

In these formulas, $m$ represents the tool mass, $I$ is the inertia tensor, $\alpha$ is the angular acceleration, and $M$ is the torque acting on the tool. Equations (28)–(30) represent the discretised calculation of the vortex field [59].

$$w = \nabla \times v \tag{28}$$

$$w_{i, j, k} = \left( \frac{\partial v_{z}}{\partial y} - \frac{\partial v_{y}}{\partial z},\ \frac{\partial v_{x}}{\partial z} - \frac{\partial v_{z}}{\partial x},\ \frac{\partial v_{y}}{\partial x} - \frac{\partial v_{x}}{\partial y} \right)_{i, j, k} \tag{29}$$

$$\left. \frac{\partial v_{x}}{\partial y} \right|_{i, j, k} \approx \frac{v_{x, i, j + 1, k} - v_{x, i, j - 1, k}}{2 \Delta_{y}} \tag{30}$$

Equation (28) defines vorticity as the curl of the velocity field, Equation (29) calculates the vorticity on discrete grid points, and Equation (30) represents the central difference approximation. Equation (31) shows the vorticity constraint in the PINN loss function [60].

$$L_{w} = \frac{1}{N} \sum_{i = 1}^{N} \| \omega_{i} - (\nabla \times v_{i}) \|^{2} \tag{31}$$
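Equations (28)–(31) can be sketched numerically as follows, using plain nested lists so that the example has no dependencies. Several things are assumed here: the velocity components are stored as `vx[i][j][k]` on a uniform grid, only interior grid points are evaluated, and `pred_w` stands for the vorticity values being compared with the curl. In an actual PINN these derivatives would normally come from automatic differentiation rather than finite differences. All function names are hypothetical.

```python
def central_diff(f, i, j, k, axis, h):
    """Central difference of grid field f along axis (0=x, 1=y, 2=z), Eq. (30)."""
    hi = [i, j, k]
    lo = [i, j, k]
    hi[axis] += 1
    lo[axis] -= 1
    return (f[hi[0]][hi[1]][hi[2]] - f[lo[0]][lo[1]][lo[2]]) / (2.0 * h)


def vorticity(vx, vy, vz, i, j, k, dx, dy, dz):
    """Curl of the velocity field at interior grid point (i, j, k), Eq. (29)."""
    wx = central_diff(vz, i, j, k, 1, dy) - central_diff(vy, i, j, k, 2, dz)
    wy = central_diff(vx, i, j, k, 2, dz) - central_diff(vz, i, j, k, 0, dx)
    wz = central_diff(vy, i, j, k, 0, dx) - central_diff(vx, i, j, k, 1, dy)
    return (wx, wy, wz)


def vorticity_loss(pred_w, vx, vy, vz, points, dx, dy, dz):
    """Mean squared mismatch between given vorticity and curl(v), Eq. (31)."""
    total = 0.0
    for w_pred, (i, j, k) in zip(pred_w, points):
        w_curl = vorticity(vx, vy, vz, i, j, k, dx, dy, dz)
        total += sum((a - b) ** 2 for a, b in zip(w_pred, w_curl))
    return total / len(points)
```

A convenient check is a rigid rotation about the z axis, v = (−y, x, 0), whose vorticity is (0, 0, 2) everywhere; central differences reproduce this exactly because the field is linear.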

Figure 18 shows the evolution of the vorticity field during cutterhead–rock interaction, obtained with the physics-informed neural network (PINN) technique described in this section. The vorticity calculation defined in Equations (28)–(31) is validated through three-dimensional vorticity field analysis at three distinct Z positions (Z = −0.06 m, Z = 0.61 m, and Z = 1.28 m) and three radial field zones (near-cutting field, mid-field, and far-field). The velocity magnitude scale (0–200+ units) in the figure uses colour coding to display the complete flow distribution, from the maximum velocity gradient region near the cutterhead centre (yellow-green high-intensity zone) through the transitional flow region (cyan-blue medium-intensity zone) to the background flow field region (purple low-intensity zone). The red circular boundary delineates the geometric constraints of the cutterhead in the non-Hertzian contact mechanics model.

This vorticity field analysis functions as a physics-informed constraint for the NSA-CHG model, providing fluid dynamics principles for TBM parameter prediction and enabling the quantitative analysis of energy dissipation mechanisms with real-time optimisation responses ≤ 28 ms. Moreover, through systematic progressive analysis from near-field to far-field at varying Z depths, it reveals how cutterhead vortex evolution influences the surrounding geological environment.
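For completeness, the rigid-body update of Equations (22) and (23) can be sketched as a single explicit time step. The quaternion layout `(w, x, y, z)`, the renormalisation after the orientation update, and the function names are assumptions added for the example; Equation (23) itself does not include the renormalisation.

```python
import math


def quat_mul(p, q):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def step_position(x, v, a, dt):
    """Equation (22): x + v*dt + 0.5*a*dt^2, component-wise."""
    return tuple(xi + vi * dt + 0.5 * ai * dt * dt for xi, vi, ai in zip(x, v, a))


def step_orientation(q, w, dt):
    """Equation (23): q + 0.5 * (q ⊗ [0, w]) * dt, then renormalised.

    The renormalisation keeps q a unit quaternion; it is an added safeguard,
    not part of Equation (23).
    """
    dq = quat_mul(q, (0.0, w[0], w[1], w[2]))
    q_new = tuple(qi + 0.5 * dqi * dt for qi, dqi in zip(q, dq))
    norm = math.sqrt(sum(c * c for c in q_new))
    return tuple(c / norm for c in q_new)
```

For small time steps the result stays close to the exact rotation; for example, an angular velocity of 1 rad/s about z over 0.1 s gives a z component close to sin(0.05).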

3.4. PFC Enhancement Analysis

The numerical validation of the rigid body kinematic constraints, non-Hertzian contact mechanics model, and vortex field discretization described in Equations (22)–(31) is illustrated in Figure 19, Figure 20 and Figure 21. As demonstrated in Figure 19, the torque variation of an L-shaped geometry is shown over a time sequence from 1.0 × 10⁻³ s to 4.0 × 10⁻³ s. This reveals that under geological conditions with softer upper layers and harder lower layers, granite produces localized torque peaks. This validates the correctness of the non-Hertzian model. As illustrated in Figure 20, the particle flow distribution diagram (4.5 × 10⁻³ s to 4.8 × 10⁻³ s), utilising the discrete element method, simulates the motion trajectories and distribution patterns of particles during rock fragmentation. This figure demonstrates the presence of localized high-gradient displacement fields (with a maximum displacement of 0.35 mm) in the granite contact zone, thereby verifying the applicability of the non-Hertzian contact model outlined in Section 3.3 of this study. As illustrated in Figure 21, the stress distribution and displacement field characteristics of a TBM cutterhead cross-section are reproduced, wherein the granite contact zone exhibits localized high-gradient displacement fields with a maximum displacement of 0.35 mm, deviating ≤ 3.5% from theoretical predictions. This forms a synergistic validation with the torque variation curve in Figure 19 and the particle flow distribution in Figure 20, comprehensively revealing the mechanical response laws of the non-Hertzian contact model in granite–sandstone composite strata and providing micro-mechanical evidence for the cutterhead energy loss analysis in the NSA-CHG-PINN integrated framework.
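For reference, the contact law of Equations (24) and (25) that these figures are used to validate can be sketched as follows. The zero-force rule for non-overlapping bodies, the parameter values used in any call, and the function names are assumptions for illustration; the signs of the damping terms follow the equations as printed.

```python
import math


def _norm(vec):
    return math.sqrt(sum(c * c for c in vec))


def normal_force(delta_n, normal, v_n, k_n, a, d_n):
    """Eq. (24): F_n = k_n * delta_n**a * n + D_n * V_n (vectors as 3-tuples)."""
    if delta_n <= 0.0:
        # Assumption: no overlap means no contact force.
        return (0.0, 0.0, 0.0)
    elastic = k_n * delta_n ** a
    return tuple(elastic * n_i + d_n * v_i for n_i, v_i in zip(normal, v_n))


def tangential_force(f_n, delta_t, v_t, k_t, mu, d_t):
    """Eq. (25): spring force along delta_t capped by Coulomb friction, plus damping."""
    mag_dt = _norm(delta_t)
    if mag_dt == 0.0:
        spring = (0.0, 0.0, 0.0)  # direction undefined without tangential slip
    else:
        magnitude = min(mu * _norm(f_n), k_t * mag_dt)
        spring = tuple(magnitude * c / mag_dt for c in delta_t)
    return tuple(s + d_t * v for s, v in zip(spring, v_t))
```

Setting `a = 1.5` recovers the Hertzian exponent, so other values of `a` are what make the law non-Hertzian in this sketch.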

3.5. Model Evaluation

As demonstrated in Figure 22, serially coupling the enhanced PINN network structure (Equations (22)–(31)) with the NSA-CHG algorithm markedly improves the training efficiency of the model. The deep integration of the PINN network and the NSA-CHG algorithm also improves the convergence characteristics of the parameter optimization process for the system output.

Figure 23 presents the classification heat map of TBM mechanical parameters and specific energy dissipation, optimised on the basis of the energy dissipation–roller cutter rock-breaking evaluation system, together with the torque–specific energy fitting curve. This optimisation considerably improves the structural accuracy of the rock mass mechanical model. The quantitative mapping between the energy dissipation mechanism and roller cutter rock-breaking efficiency is a significant result of this study, as it enables a collaborative analysis of rock strength parameters and mechanical response characteristics.

As illustrated in Figure 24, the NSA-CHG algorithm, when employed in conjunction with the PINN networks outlined in Sections 3.3 and 3.4, addresses the wear effects of non-Hertzian contact models on the cutterhead and yields a comprehensive predictive model structure. Panel (a) shows that the model captures the physical relationships between geological conditions (UCS) and TBM performance parameters, and panel (b) demonstrates that the generalization capability of NSA-CHG remains consistent on the engineering test samples in this project. These results indicate that data-driven approaches can be combined with engineering complexity, and they provide the basis for the model validation in Section 4.

4. Model Validation

The model validation in this study is based on the actual excavation data from a tunnel project (Pujiang Town 125-2 Plot Tunnel Project No. 24YJKF-27) for empirical analysis. The TBM mechanical parameter operation records from a typical 50-metre excavation section are selected as validation samples, which include an ascending segment with transitional geological conditions and a stable segment. The present study focuses on validating two core functions by comparing and analysing the TBM mechanical operation parameter curves output by the NSA-CHG prediction model with the actual operational parameter curves from the project. Firstly, it is necessary to test whether the model can anticipate geological anomalies such as fault zones and water-bearing layers through abnormal fluctuations in parameters, achieving early shutdown warnings [12]. Secondly, it is necessary to evaluate whether the model possesses the intelligent control capability to dynamically optimise TBM mechanical operation parameters in real-time based on changes in geological conditions.

As illustrated in Table 1, the geological section is divided into distinct layers, with each layer characterised by unique features. Table 2 presents the borehole exploration records, detailing the results of exploratory drilling.

Figure 25 shows the UCS pre-processing analysis and prediction conducted on five major rock types (sandstone, limestone, shale, granite, and fault zones) in a 50 × 12 × 12 m geological model. The rock strength ranged from 5 to 140 MPa, demonstrating the complex geological structural characteristics validated in this study. The NSA prediction results indicate that this geological section contains multi-frequency folded stratifications, 65 dipping fault zones, vertical intrusive dikes, and lenticular geological bodies. The results show that when the excavation strength decreases by 30%, the permeability increases by 30%, suggesting the need to enhance the overall mechanical power of the TBM.

As shown in Figure 26, this study classified the uniaxial compressive strength (UCS) characteristics of the tunnel surrounding rock and built a 3D geological model of them using the NSA (native sparse attention) technology, successfully reconstructing the rock mass distribution model within a 50 m range of the tunnel excavation section. The study focuses on the most challenging construction section, using 3D spatial visualization technology combined with multidimensional parameter analysis from the geological database.

As shown in Figure 27, the NSA algorithm captures the complex geological structure of fault zones and water-bearing layers. A systematic feature analysis was conducted, and a geological model was constructed. The dataset was then transferred to the CHG system for 3D geological reconstruction modelling.

Figure 28 displays the measured and predicted curves of TBM mechanical parameters (within a typical 50 m tunnelling section), verifying the model’s parameter prediction capability under complex geological conditions such as fault zones (5–10 m section), soft rock strata (0–5 m section), and high-ground-stress seepage zones (10–35 m section). The mean absolute error (MAE) between predicted and actual parameter curves is controlled within the range of 16.05–20.99 kN. Through a time-series torque–thrust coupling analysis, the figure demonstrates the system’s dynamic response capability (response time ≤ 28 ms) under conditions where tunnelling intensity decreases by 30%.

Figure 29 and Figure 30 show that the mean absolute error (MAE) between predicted and measured values of total thrust is controlled within the range of 16.05–20.99 kN, with a relative error confidence interval of ±5% of the theoretical value. For total torque, in fault fracture zones (5–10 m sections) and soft rock formations (0–5 m sections), torque fluctuations exhibit significant positive correlation with geological condition variations. The prediction error (MAPE) of rotational speed is as low as 1.72%, demonstrating rapid response capability (≤28 ms) in high-geostress seepage zones (10–35 m sections). When seepage flow reaches the threshold of 30 m³/d, the system triggers shutdown commands 12.7 s in advance with a confidence level of 92.4% ± 1.8%. Compared with conventional methods, the boundary likelihood-driven kernel function selection strategy reduces false alarm rates by 32.7% (p < 0.01). Under working conditions where rock permeability increases by 30%, the algorithm compresses the confidence interval width of thrust parameters by 41.3% through a dynamic adjustment of the RBF kernel function’s scale factor, verifying the robustness of the CHG algorithm.
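The error metrics used in this section have standard definitions, sketched here for reference. The function names are hypothetical, and any sample values passed to them are for illustration only and are not project data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the units of the measured quantity (e.g., kN)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot
```

MAE and RMSE carry the units of the parameter being predicted, whereas MAPE and R² are dimensionless, which is why thrust errors are quoted in kN and the rotational speed error as a percentage.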

As demonstrated in Figure 31, in the soft rock–fractured zone transition area, the deviation between the predicted total thrust and the actual value is less than 10%, and the torque prediction error is within 5%. This indicates that the model can effectively capture parameter fluctuations caused by abrupt geological changes. In the fractured zone–high-stress zone transition area, the predicted rotational speed is highly consistent with the actual values (R² = 0.9066), reflecting the model’s sensitivity to variations in rock strength. As demonstrated in Figure 32a, the classification count matrix indicates that 442 samples were accurately predicted. The classification accuracy matrix shows that the prediction accuracy for the FZ (fractured zone) class reaches 92.5%, with the misclassification rate to other categories (e.g., WS (water-bearing strata) and SR (soft rock formation)) remaining below 3%. Figure 32b shows the classification performance of the NSA-CHG framework for different geological conditions as a function of depth. The rock categories are NR (normal rock), FZ (fractured zone), WS (water-bearing strata), HS (high-stress zone), and SR (soft rock formation), and the metrics comprise true positive rate (TPR), false positive rate (FPR), precision, and classification confidence.
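The per-class rates named here (TPR, FPR, and precision) follow directly from a classification count matrix. A minimal sketch is given below, assuming rows index the true class and columns the predicted class; the function name and the small two-class matrix in the usage note are hypothetical and are not the counts of Figure 32.

```python
def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def per_class_metrics(matrix, labels):
    """Compute TPR, FPR and precision per class from a count matrix.

    matrix[t][p] is the number of samples of true class t predicted as class p.
    """
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for c, label in enumerate(labels):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class otherwise
        tn = total - tp - fn - fp
        metrics[label] = {
            "tpr": _ratio(tp, tp + fn),
            "fpr": _ratio(fp, fp + tn),
            "precision": _ratio(tp, tp + fp),
        }
    return metrics
```

For example, `per_class_metrics([[8, 2], [1, 9]], ["FZ", "NR"])` gives a TPR of 0.8 and an FPR of 0.1 for the first class.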

5. Conclusions

The present study proposes an intelligent prediction framework that integrates native sparse attention (NSA) with the Chen-Guan (CHG) algorithm for the optimisation of tunnel boring machine (TBM) operations in complex geological environments. The core finding demonstrates that physics-informed data fusion can effectively bridge the gap between geological heterogeneity and mechanical system responses, establishing a paradigm shift from experience-driven to dual data–physics-driven TBM control.

5.1. Validation of Research Assumptions

Field validation substantiates the fundamental assumptions underlying the model. The non-Hertzian contact mechanics model adequately describes cutter–rock interactions across UCS ranges of 5–140 MPa, with ≤3.5% deviation from theoretical predictions. The sparse attention mechanism preserves the quality of geological feature extraction (92.4% ± 1.8% warning accuracy) while reducing the computational burden, supporting the hypothesis that not all attention weights are equally critical for geological prediction tasks.

5.2. Parameter Impact Analysis

Several critical parameters strongly influence model performance. The Chebyshev inequality threshold parameter k performs best at values of 2.0–2.5, where it balances the width of the confidence interval against the coverage probability. Sparse attention ratios of 15–25% provide the best trade-off between computational efficiency and geological feature preservation. RBF kernel scaling factors are most effective when geological UCS variations exceed 30 MPa within 10-metre intervals.
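For orientation, Chebyshev's inequality gives a distribution-free lower bound on the coverage of an interval of half-width kσ around the mean, which can be evaluated at the two ends of the reported range of k. The Gaussian figures are included only for comparison; whether the posterior is treated as Gaussian at this step is an assumption that the text does not confirm.

```latex
\begin{align*}
P\left(|X - \mu| < k\sigma\right) &\ge 1 - \frac{1}{k^{2}} \\
k = 2.0:&\quad 1 - \tfrac{1}{4} = 0.75 \\
k = 2.5:&\quad 1 - \tfrac{1}{6.25} = 0.84 \\
\text{Gaussian } X:&\quad P(|X - \mu| < 2.0\sigma) \approx 0.9545, \quad
P(|X - \mu| < 2.5\sigma) \approx 0.9876
\end{align*}
```

The gap between the two sets of figures shows how much of the coverage attributed to a given k depends on the distributional assumption rather than on the Chebyshev bound itself.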

5.3. Research Limitations

The study presents several constraints that limit generalizability. Validation is primarily conducted on Shanghai-region geological conditions, leaving performance in significantly different geological environments unverified. The framework is oriented towards short-term predictions (i.e., ≤28 millisecond response times); however, it does not address the issue of long-term equipment degradation effects. Furthermore, optimal performance is contingent upon the availability of high-quality TGS radar data, which may not be obtainable in all operational environments. In the future, as shown in Figure 33, the baseline model capabilities can be continuously optimised to broaden its application to more fields. Please refer to the relevant Supplementary Materials.

5.4. Future Research Directions

In light of the limitations identified above, further research is needed in four areas. Firstly, the model must be validated across diverse geological environments, particularly carbonate karst terrains and volcanic formations. Secondly, degradation-aware models must be developed, incorporating long-term cutter wear and geological condition evolution. Thirdly, uncertainty quantification must be enhanced to address both aleatory and epistemic uncertainty sources. Fourthly, online learning capabilities must be integrated for real-time adaptation to unexpected geological conditions.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ai6060127/s1.

Author Contributions

Conceptualization, W.G.; methodology, W.G.; software, W.G.; validation, Y.C.; formal analysis, S.C.; investigation, R.A.; resources, S.C.; data curation, W.G.; writing—original draft preparation, W.G.; writing—review and editing, W.G. and R.A.; visualization, Y.C.; supervision, Y.C.; project administration, Y.C.; funding acquisition, Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The Natural Science Foundation of China, grant number 42107168, The Natural Science Foundation of Shanghai of China, grant number 23ZR1443600, and The National Natural Science Foundation of China, grant number 51978401.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is unavailable due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The authors have no competing interests as defined by the journal, or other interests that might be perceived to influence the results and/or discussion reported in this paper. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Appendix A

Figure A1. The NSA complete architecture.

References

Maslennikov, V.; Kalinina, I.; Mudrak, S.A. Foresight of modern urban infrastructure. In Proceedings of the International Science Conference SPbWOSCE—SMART City, Peter Great Saint Petersburg Polytechn Univ, Inst Civil Engn, St. Petersburg, Russia, 15–17 November 2016. [Google Scholar]
Diaz, D.; Bai, Y.; Chen, J.X. Integrated Sustainable Underground Space Development. In Proceedings of the International Conference on Sustainable Infrastructure—Policy, Finance, and Education, New York, NY, USA, 26–28 October 2017; pp. 211–222. [Google Scholar]
Hanamura, T. Underground space networks in the 21st century: New infrastructure with modal shift technology and geotechnology. In Proceedings of the International Symposium on Modern Tunneling Science and Technology, Kyoto, Japan, 30 October–1 November 2001; pp. 55–65. [Google Scholar]
Kaliampakos, D.; Benardos, A. Underground space development: Setting modern strategies. In Proceedings of the 1st International Conference on Underground Spaces—Design, Engineering and Environmental Aspects, Wessex Inst Technol, New Forest, UK, 8–10 September 2008; pp. 1–10. [Google Scholar]
Li, X.; Li, H.B.; Du, S.Z.; Jing, L.J.; Li, P.Y. Cross-project utilisation of tunnel boring machine (TBM) construction data: A case study using big data from Yin-Song diversion project in China. Georisk-Assess. Manag. Risk Eng. Syst. Geohazards 2023, 17, 127–147. [Google Scholar] [CrossRef]
Gokceoglu, C.; Bal, C.; Aladag, C.H. Modeling of Tunnel Boring Machine Performance Employing Random Forest Algorithm. Geotech. Geol. Eng. 2023, 41, 4205–4231. [Google Scholar] [CrossRef]
Shi, J.G.; Yang, Z.H.; Li, G.S.; Qiu, Z.L.; Mu, Z.J.; Sun, Z.W.; Zhao, K.; Yan, J.N. Investigations on optimal parameters for enhancing penetration efficiency during axial-torsional coupled impact drilling in hard rock. Geoenergy Sci. Eng. 2025, 247, 213668. [Google Scholar] [CrossRef]
Jalakani, R.; Moradi, S.S.T.; Morenov, V. ReliabilityAnalysis of a Drilling Bit Penetration Model in Oil and Gas Wells: A Case Study. Int. J. Eng. 2024, 37, 2213–2222. [Google Scholar] [CrossRef]
Wennmohs, K.H. The Conventional tunnelling excavation a universal excavation method for different rock formations. In Proceedings of the 11th International Conference on Underground Construction Prague 2010, Prague, Czech Republic, 14–16 June 2010; pp. 355–357. [Google Scholar]
Carrera, E.; Ballacchino, G.; Fries, T.; Ackermann, T. Second tube of the Gotthard Road Tunnel: Tunnel design and challenges after 2 years of construction of the preparatory works. In Proceedings of the ITA-AITES World Tunnel Congress (WTC)/49th General Assembly of the International-Tunnelling-and-Underground-Space-Association (ITA-AITES), Athens, Greece, 12–18 May 2023; pp. 494–502. [Google Scholar]
Tan, B.T.; Van Weele, B. Design and construction of sewer tunnels under the deep tunnel sewerage system. In Proceedings of the International Conference on Tunnels and Underground Structures, Singapore, 26–29 November 2000; pp. 235–240. [Google Scholar]
Feng, A.L.; Zhu, Z.Y.; Zhu, X.D.; Zhang, Q.; Yan, F.L.; Li, Z.J.; Guo, Y.W.; Singh, V.P.; Zhang, K.W.; Wang, G. Impacts of Water Diversion Projects on Vegetation Coverage in Central Yunnan Province, China (2017–2022). Remote Sens. 2024, 16, 2373. [Google Scholar] [CrossRef]
Kong, F.C.; Lu, D.C.; Ma, Y.D.; Li, J.L.; Tian, T. Analysis and Intelligent Prediction for Displacement of Stratum and Tunnel Lining by Shield Tunnel Excavation in Complex Geological Conditions: A Case Study. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22206–22216. [Google Scholar] [CrossRef]
An, P.Z.; Jia, B.X.; Meng, F.L.; Wang, Z.X.; Wei, H.L.; Zhang, Y.H. Forecast of Ground Deformation Caused by Tunnel Excavation Based on Intelligent Neural Network Model. Mob. Inf. Syst. 2022, 2022, 2924093. [Google Scholar] [CrossRef]
Fu, K.; Qiu, D.H.; Xue, Y.G.; Zhang, W.Q.; Shao, T. Optimization of TBM Tunneling Parameters for Deep Buried Tunnel Based on Rock Cluster Grading and Strata Intelligent Identification. Rock Mech. Rock Eng. 2025, 58, 2607–2633. [Google Scholar] [CrossRef]
Jiang, S.J.; Sweet, L.B.; Blougouras, G.; Brenning, A.; Li, W.T.; Reichstein, M.; Denzler, J.; Wei, S.G.; Yu, G.; Huang, F.N.; et al. How Interpretable Machine Learning Can Benefit Process Understanding in the Geosciences. Earths Future 2024, 12, e2024EF004540. [Google Scholar] [CrossRef]
Karpatne, A.; Ebert-Uphoff, I.; Ravela, S.; Babaie, H.A.; Kumar, V. Machine Learning for the Geosciences: Challenges and Opportunities. IEEE Trans. Knowl. Data Eng. 2019, 31, 1544–1554. [Google Scholar] [CrossRef]
Liu, B.; Wang, J.W.; Wang, R.R.; Wang, Y.X.; Zhao, G.Z. Intelligent decision-making method of TBM operating parameters based on multiple constraints and objective optimization. J. Rock Mech. Geotech. Eng. 2023, 15, 2842–2856. [Google Scholar] [CrossRef]
Amoroso, C.; Maiello, E.; Iasiello, C. TBM machine parameters: From the empirical to the AI approaches. Gall. E Grandi Opere Sotter. 2022, 144, 39–44. [Google Scholar]
Liu, Z.H.; Xu, X.; Qiao, P.; Li, D.S. Acceleration for Deep Reinforcement Learning using Parallel and Distributed Computing: A Survey. ACM Comput. Surv. 2025, 57, 1–35. [Google Scholar] [CrossRef]
Thalagala, S.; Wong, P.K.; Wang, X.Z.; Sun, T.N. Broad Critic Deep Actor Reinforcement Learning for Continuous Control. IEEE Trans. Neural Netw. Learn. Syst. 2025; early access. [Google Scholar] [CrossRef]
Cao, Y.; Luo, W.; Xue, Y.; Lin, W.; Zhang, F. Model-based offline reinforcement learning framework for optimizing tunnel boring machine operation. Undergr. Space 2024, 19, 47–71. [Google Scholar] [CrossRef]
Yin, X.; Liu, Q.S.; Huang, X.; Pan, Y.C. Perception model of surrounding rock geological conditions based on TBM operational big data and combined unsupervised-supervised learning. Tunn. Undergr. Space Technol. 2022, 120, 104285. [Google Scholar] [CrossRef]
Liu, Z.B.; Li, L.; Fang, X.L.; Qi, W.B.; Shen, J.M.; Zhou, H.Y.; Zhang, Y.L. Hard-rock tunnel lithology prediction with TBM construction big data using a global-attention-mechanism-based LSTM network. Autom. Constr. 2021, 125, 103647. [Google Scholar] [CrossRef]
Shan, F.; He, X.Z.; Armaghani, D.J.; Sheng, D.C. Effects of data smoothing and recurrent neural network (RNN) algorithms for real-time forecasting of tunnel boring machine (TBM) performance. J. Rock Mech. Geotech. Eng. 2024, 16, 1538–1551. [Google Scholar] [CrossRef]
Zhang, Q.L.; Zhu, Y.W.; Ma, R.; Du, C.X.; Du, S.L.; Shao, K.; Li, Q.B. Prediction Method of TBM Tunneling Parameters Based on PSO-Bi-LSTM Model. Front. Earth Sci. 2022, 10, 854807. [Google Scholar] [CrossRef]
Zhou, J.; Qiu, Y.G.; Zhu, S.L.; Armaghani, D.J.; Li, C.Q.; Nguyen, H.; Yagiz, S. Optimization of support vector machine through the use of metaheuristic algorithms in forecasting TBM advance rate. Eng. Appl. Artif. Intell. 2021, 97, 104015. [Google Scholar] [CrossRef]
Monthanopparat, N.; Tanchaisawat, T. Advancing Tunnel Boring Machine Performance Prediction in Massive and Highly Fractured Granite: Integrating Innovative Deep Learning and Block Model Techniques. Geotech. Eng. 2024, 55, 26–34. [Google Scholar]
Soranzo, E.; Guardiani, C.; Wu, W. The application of reinforcement learning to NATM tunnel design. Undergr. Space 2022, 7, 990–1002. [Google Scholar] [CrossRef]
Li, D.; Liu, Z.; Xiao, P.; Zhou, J.; Armaghani, D.J. Intelligent rockburst prediction model with sample category balance using feedforward neural network and Bayesian optimization. Undergr. Space 2022, 7, 833–846. [Google Scholar] [CrossRef]
Wang, X.J.; Wang, F.; He, Q.; Guo, Y.A. A multi-swarm optimizer with a reinforcement learning mechanism for large-scale optimization. Swarm Evol. Comput. 2024, 86, 101486. [Google Scholar] [CrossRef]
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CL, USA, 4–9 December 2017; pp. 6000–6010. [Google Scholar]
Yuan, J.; Gao, H.; Dai, D.; Luo, J.; Zhao, L.; Zhang, Z.; Xie, Z.; Wei, Y.X.; Wang, L.; Xiao, Z.; et al. Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention. arXiv 2025. [Google Scholar] [CrossRef]
Gardner, J.R.; Pleiss, G.; Bindel, D.; Weinberger, K.Q.; Wilson, A.G. GPyTorch: Blackbox Matrix-Matrix Gaussian Process Inference with GPU Acceleration. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NIPS), Montreal, QC, Canada, 2–8 December 2018. [Google Scholar]
Wang, Y.; Brubaker, M.; Chaib-Draa, B.; Urtasun, R. Sequential Inference for Deep Gaussian Process. In Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS), Cadiz, Spain, 9–11 May 2016; pp. 694–703. [Google Scholar]
Chen, H.; Zhou, Y.; Mei, K.H.; Wang, N.; Tang, M.D.; Cai, G.X. An Improved Density Peak Clustering Algorithm Based on Chebyshev Inequality and Differential Privacy. Appl. Sci. 2023, 13, 8674. [Google Scholar] [CrossRef]
Rasht-Behesht, M.; Huber, C.; Shukla, K.; Karniadakis, G.E. Physics-Informed Neural Networks (PINNs) for Wave Propagation and Full Waveform Inversions. J. Geophys. Res. Solid Earth 2022, 127, e2021JB023120. [Google Scholar] [CrossRef]
Anagnostopoulos, S.J.; Toscano, J.D.; Stergiopulos, N.; Karniadakis, G.E. Residual-based attention in physics-informed neural networks. Comput. Methods Appl. Mech. Eng. 2024, 421, 116805. [Google Scholar] [CrossRef]
Pei, C.Y.; Zhu, Z.X.; Wang, C.L.; Li, J.W.; Zhou, X.H.; Zhang, L.H.; Chen, Z.B. A Novel Method for Detecting Geological Anomalies in Tunnel Space Using Multiparameter Elastic Full Waveform Inversion. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5911622. [Google Scholar] [CrossRef]
Ye, L.; Tao, Z.; Huang, Y.; Li, Y. ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL)/Student Research Workshop (SRW), Bangkok, Thailand, 11–16 August 2024; pp. 11608–11620. [Google Scholar]
Bae, S.H.; Lee, H.J.; Kim, H. Cache Compression with Golomb-Rice Code and Quantization for Convolutional Neural Networks. In Proceedings of the IEEE International Symposium on Circuits and Systems (IEEE ISCAS), Daegu, Republic of Korea, 22–28 May 2021. [Google Scholar]
Xiao, G.; Tang, J.; Zuo, J.; Guo, J.; Yang, S.; Tang, H.; Fu, Y.; Han, S. DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads. arXiv 2024. [Google Scholar] [CrossRef]
Zhang, K.; Li, J.; Li, G.; Shi, X.; Jin, Z. CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. arXiv 2024. [Google Scholar] [CrossRef]
Xue, Y.D.; Luo, W.; Chen, L.; Dong, H.X.; Shu, L.S.; Zhao, L. An intelligent method for TBM surrounding rock classification based on time series segmentation of rock-machine interaction data. Tunn. Undergr. Space Technol. 2023, 140, 105317. [Google Scholar] [CrossRef]
Yang, Y.L.; Du, L.J.; Tang, R.; Wei, F.; Zhang, H.L. Prediction of TBM penetration rate for different surrounding rocks and cutter head diameters. Heliyon 2024, 10, e33174. [Google Scholar] [CrossRef]
Grigoriu, M. A class of models for non-stationary Gaussian processes. Probabilistic Eng. Mech. 2003, 18, 203–213. [Google Scholar] [CrossRef]
Li, Y.; Xu, J. Neural network-aided simulation of non-Gaussian stochastic processes. Reliab. Eng. Syst. Saf. 2024, 242, 109786. [Google Scholar] [CrossRef]
De Martino, A.; Diki, K. Gaussian RBF kernels via Fock spaces: Quaternionic and several complex variables settings. Quantum Stud. Math. Found. 2024, 11, 69–85. [Google Scholar] [CrossRef]
Zhang, C.; Fu, Z.J.; Zhang, Y.M. A novel global RBF direct collocation method for solving partial differential equations with variable coefficients. Eng. Anal. Bound. Elem. 2024, 160, 14–27. [Google Scholar] [CrossRef]
Arai, K. Design of On-Premises Version of RAG with AI Agent for Framework Selection Together with Dify and DSL as Well as Ollama for LLM. Int. J. Adv. Comput. Sci. Appl. 2024, 15, 117–124. [Google Scholar] [CrossRef]
Liu, S.H.; Lin, T.W.; He, D.L.; Li, F.; Wang, M.L.; Li, X.; Sun, Z.X.; Li, Q.; Ding, E.R. AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer. In Proceedings of the 18th IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 6629–6638. [Google Scholar]
Beltagy, I.; Peters, M.E.; Cohan, A. Longformer: The Long-Document Transformer. arXiv 2020, arXiv:2004.05150. [Google Scholar]
Zhou, Z.Q.; Sun, J.W.; Lai, Y.B.; Wei, C.C.; Hou, J.; Bai, S.S.; Huang, X.X.; Liu, H.L.; Xiong, K.Q.; Cheng, S. Study on size effect of jointed rock mass and influencing factors of the REV size based on the SRM method. Tunn. Undergr. Space Technol. 2022, 127, 104613. [Google Scholar] [CrossRef]
Mendes, P.R., Jr.; Salavati, S.; Linares, O.; Gonçalves, M.M.; Zampieri, M.F.; Ferreira, V.H.D.; Castro, M.; Werneck, R.D.; Moura, R.; Morais, E.; et al. Rock-type classification: A (critical) machine-learning perspective. Comput. Geosci. 2024, 193, 105730. [Google Scholar] [CrossRef]
Hatem, A.E.; Reitman, N.G.; Briggs, R.W.; Gold, R.D.; Jobe, J.A.T.; Burgette, R.J. Western US Geologic Deformation Model for Use in the US National Seismic Hazard Model 2023. Seismol. Res. Lett. 2022, 93, 3053–3067. [Google Scholar] [CrossRef]
Muharliamov, R. On the equations of kinematics and dynamics of constrained mechanical systems. Multibody Syst. Dyn. 2001, 6, 17–28. [Google Scholar] [CrossRef]
Xing, Y.; Xu, H.; Pei, S.Y.; Chen, X.L.; Liu, X.J. A novel non-Hertzian contact model of spherical roller bearings. Proc. Inst. Mech. Eng. Part J J. Eng. Tribol. 2016, 230, 3–13. [Google Scholar] [CrossRef]
Acharya, A.; Sengupta, A.N. Action principles for dissipative, non-holonomic Newtonian mechanics. Proc. R. Soc. A Math. Phys. Eng. Sci. 2024, 480, 20240113. [Google Scholar] [CrossRef]
van Riesen, D.; Kaehler, C.; Henneberger, G. Convergence behaviour of different formulations for time-harmonic and transient eddy-current computations in 3D. IEE Proc. Sci. Meas. Technol. 2004, 151, 434–439. [Google Scholar] [CrossRef]
Zhang, H.M.; Shao, X.P.; Zhang, Z.F.; He, M.Y. E-PINN: Extended physics informed neural network for the forward and inverse problems of high-order nonlinear integro-differential equations. Int. J. Comput. Math. 2024, 101, 732–749. [Google Scholar] [CrossRef]
Zhang, Q.L.; Zhu, Y.W.; Du, C.X.; Du, S.L.; Shao, K.; Jin, Z.H. Dynamic Rock-Breaking Process of TBM Disc Cutters and Response Mechanism of Rock Mass Based on Discrete Element. Adv. Civ. Eng. 2022, 2022, 1917836. [Google Scholar] [CrossRef]

Figure 1. Advanced TGS exploration.

Figure 2. TBM host and on-site tunnelling process.

Figure 3. NSA original complete architecture model [33].

Figure 4. Improved NSA complete architecture based on UCS capture.

Figure 5. CHG algorithm based on the Gaussian process: complete algorithm architecture.

Figure 6. Data selection, training process, and scores for the RBF and Matérn kernels (the legend indicates the scores).

Figure 7. Conceptual diagram of the RBF kernel function optimization process.

Figure 8. The dynamic confidence level enhancement process of the CHG algorithm incorporating Chebyshev’s inequality.

Figure 9. ISRM rock UCS classification box plot.

Figure 10. Advanced TGS exploration radar image (based on rock UCS wave velocity display).

Figure 11. Wavelet transform for denoising geological radar images (conceptual diagram).

Figure 12. Three-dimensional display of the topology structure of rock UCS and engineering case databases.

Figure 13. NSA capturing geological radar images based on rock UCS cluster analysis (for a specific profile).

Figure 14. NSA’s learning process and effectiveness based on geological radar images.

Figure 15. NSA lithology classification accuracy and learning effectiveness evaluation.

Figure 16. NSA reconstruction of the three-dimensional geological tunnelling section (selecting the rock section for model validation).

Figure 17. Visualization of the NSA-CHG training process (in a three-dimensional Cartesian coordinate system).

Figure 18. The variation process of the rock eddy field and the cutterhead interaction.

Figure 19. Variation in torque in PFC disc cutters breaking rock. (The geological conditions at this time are sandstone above and granite below [61]).

Figure 20. Changes in the rock displacement field under the PFC disc cutter (geological conditions are the same as in Figure 19).

Figure 21. Rock displacement contour plot under the PFC disc cutter (geological conditions are the same as in Figures 19 and 20).

Figure 22. NSA-CHG with PINN standardized learning process.

Figure 23. Analysis of mechanical parameters and specific energy dissipation related to TBM.

Figure 24. NSA-CHG model real-time dynamic prediction of TBM torque, thrust, and rotational speed: process and evaluation system.

Figure 25. The NSA’s analysis of the geological radar image for the experimental subject (50 m typical tunnelling section).

Figure 26. NSA’s preliminary analysis and sensitive value extraction process.

Figure 27. The NSA’s typical analysis and modelling of fracture generation and water infiltration areas.

Figure 28. NSA-CHG prediction of TBM main mechanical parameters and shutdown based on experimental subjects (50 m typical tunnelling section).

Figure 29. NSA-CHG prediction of TBM mechanical parameters integration chart.

Figure 30. Comparison of fitted TBM actual values with predicted values.

Figure 31. Comparison of TBM actual values and predicted values in special geological sections.

Figure 32. NSA-CHG framework predictive performance characterization. (a) NSA-CHG framework: geological classification performance matrix. (b) The variation in classification performance of the NSA-CHG framework for different geological conditions with depth.

Figure 33. Baseline model capability comparison.

Table 1. Geological section division and features.

Geological Section	Mileage Range (m)	Recommended Construction Method	Risk Level
Soft Rock Area	0–5	Conventional TBM excavation, medium thrust and torque	Low
Fractured Zone with Water Area	5–10	Cautious TBM excavation, enhanced ahead-of-face geological forecasting, pre-grouting treatment	High
High In Situ Stress Zone	10–35	TBM high-thrust excavation, control excavation speed, prevent rock bursts	Medium

Table 2. Borehole exploration records.

Borehole ID	Chainage (m)	Borehole Depth (m)	Rock Mass Classification	RQD Value (%)	Uniaxial Compressive Strength (MPa)	Groundwater Level (m)	Major Geological Hazards
ZK-01	1.5	25	III	65–75	35–45	Not detected	None
ZK-02	3.8	28	III	60–70	40–50	Not detected	Developed joints
ZK-03	6.2	30	IV	30–40	15–25	5.5	Developed fissures, water-bearing
ZK-04	8.5	32	V	10–20	5–15	4.2	Fault, water-rich
ZK-05	12.0	35	II	75–85	80–100	Not detected	High in situ stress
ZK-06	18.5	38	II	80–90	90–110	Not detected	High in situ stress
ZK-07	24.0	40	II	75–85	85–105	Not detected	High in situ stress
ZK-08	30.5	38	II	75–85	80–100	Not detected	High in situ stress
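
Tables 1 and 2 can be cross-checked against each other by assigning each borehole to a geological section according to its chainage. The following is a minimal sketch of that check, not part of the NSA-CHG framework itself. The data are transcribed from the two tables; the function names `section_for` and `summarize_boreholes`, the half-open treatment of section boundaries, and the use of the midpoint of each UCS range as a summary value are all assumptions made for illustration.

```python
from statistics import mean

# Section boundaries transcribed from Table 1: (name, start m, end m, risk level).
SECTIONS = [
    ("Soft Rock Area", 0.0, 5.0, "Low"),
    ("Fractured Zone with Water Area", 5.0, 10.0, "High"),
    ("High In Situ Stress Zone", 10.0, 35.0, "Medium"),
]

# Borehole rows transcribed from Table 2: (ID, chainage m, (UCS min, UCS max) in MPa).
BOREHOLES = [
    ("ZK-01", 1.5, (35, 45)),
    ("ZK-02", 3.8, (40, 50)),
    ("ZK-03", 6.2, (15, 25)),
    ("ZK-04", 8.5, (5, 15)),
    ("ZK-05", 12.0, (80, 100)),
    ("ZK-06", 18.5, (90, 110)),
    ("ZK-07", 24.0, (85, 105)),
    ("ZK-08", 30.5, (80, 100)),
]


def section_for(chainage):
    """Return (section name, risk level) for a chainage in metres.

    Assumption: intervals are half-open [start, end), so a chainage lying
    exactly on a boundary is assigned to the later section.
    """
    for name, start, end, risk in SECTIONS:
        if start <= chainage < end:
            return name, risk
    raise ValueError(f"chainage {chainage} m lies outside the range of Table 1")


def summarize_boreholes():
    """Group boreholes by section and average the midpoints of their UCS ranges."""
    grouped = {}
    for borehole_id, chainage, (ucs_min, ucs_max) in BOREHOLES:
        name, risk = section_for(chainage)
        entry = grouped.setdefault(
            name, {"risk": risk, "boreholes": [], "ucs_mid": []}
        )
        entry["boreholes"].append(borehole_id)
        # Midpoint of the reported range: an illustrative summary, not a measured value.
        entry["ucs_mid"].append((ucs_min + ucs_max) / 2)
    return {
        name: {
            "risk": entry["risk"],
            "boreholes": entry["boreholes"],
            "mean_ucs_mid_mpa": mean(entry["ucs_mid"]),
        }
        for name, entry in grouped.items()
    }


if __name__ == "__main__":
    for name, info in summarize_boreholes().items():
        print(name, info)
```

Under these assumptions, ZK-01 and ZK-02 fall in the soft rock area, ZK-03 and ZK-04 in the water-bearing fractured zone, and ZK-05 to ZK-08 in the high in situ stress zone, which agrees with the hazards recorded for each borehole in Table 2.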

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chen, Y.; Guan, W.; Azzam, R.; Chen, S. NSA-CHG: An Intelligent Prediction Framework for Real-Time TBM Parameter Optimization in Complex Geological Conditions. AI 2025, 6, 127. https://doi.org/10.3390/ai6060127

