1. Introduction
The demand for urbanization and the scarcity of spatial resources continuously drive the development of underground spaces. The shield tunneling method has been widely applied as the mainstream technology for underground space development. However, the shield tunneling process is influenced by various factors, such as the external construction environment and equipment performance, which can lead to various abnormal scenarios. Common abnormal scenarios include sudden geological change [1], mudcake events [2], cutter wear [3], face instability [4], and excessive ground settlement [5]. These not only affect the safety of tunnel construction but also increase project costs. Therefore, our research objective is to develop an anomaly detection model that can detect and identify multiple types of abnormal scenarios in shield tunneling construction, providing effective support for risk monitoring and management in the shield tunneling construction process.
The existing analysis of anomalies in the shield tunneling process mainly focuses on the following three issues: selecting monitoring parameters for anomaly detection, designing anomaly detection models, and identifying anomaly scenarios.
The selection of monitoring parameters for anomaly detection can affect a model’s performance. Existing research on anomaly detection in shield tunneling often selects suitable detection parameters according to the characteristics of abnormal scenarios. One method for monitoring parameter selection is based on empirical research. For example, Zhang et al. [6] used cover depth, advance rate, earth pressure, moisture content, soil elastic modulus, and a standard penetration test value to detect excessive ground settlement during shield tunneling. Similarly, Hu et al. [7] summarized advance speed, total thrust, torque, type of conditioning agent, and seven other parameters associated with mudcake events based on more than 30 literature cases and used them as inputs for the mudcake event detection model. Another method for monitoring parameter selection is using feature selection techniques in machine learning. Kannangara et al. [8] used the Shapley additive explanations (SHAP) method to obtain input features of the excessive ground settlement detection model: torque, vertical deviation, pitching angle, groundwater level, and jack pressure. However, these studies require individual parameter selection for each abnormal scenario, and the selected monitoring parameters are usually specific. Even for the same abnormal scenario, the monitoring parameters summarized by different studies differ (references [6,8]). As a result, these studies are limited in the types of abnormal scenarios they can detect, and they might not be able to detect some unknown anomalies. Therefore, there is a need to find monitoring parameters that are more universally applicable to detecting various abnormal scenarios in the shield tunneling process.
In terms of anomaly detection model design, traditional anomaly detection research usually relies on supervised models, which need labeled data. Zhai et al. [9] proposed a random forest-based classification method that needs labeled operational data as input and predicts the abnormal states in the shield tunneling process. However, anomalies in the shield tunneling process are diverse and constitute a minority of the entire dataset, making it challenging to acquire labeled anomaly samples to train the models. Existing research focuses more on unsupervised methods, which detect anomalies according to whether parameter features change. Xu et al. [10] and Hu et al. [11] proposed anomaly detection models that can learn features of construction parameters during a normal shield tunneling process. They suggested that when anomalies occur, the features of the construction parameters will change. This will lead to significant differences between expected features and actual features, which can then be used to determine the occurrence of anomalies. These methods do not require data with abnormal labels and can achieve anomaly detection only by learning the features of normal data. There is an obvious imbalance between abnormal and normal data in the shield tunneling process, with normal data making up the much larger share. Therefore, compared with supervised anomaly detection algorithms, unsupervised anomaly detection algorithms are more suitable.
The purpose of anomaly identification is to further analyze the nature or causes of an anomaly, which involves identifying the specific abnormal scenario. Current research primarily relies on engineering experience and rule-based judgment to realize anomaly identification. Relevant studies record the occurrence process of anomalies in shield tunneling projects and describe the specific characteristics of parameters before and after the occurrence of anomalies. This provides directional guidance for anomaly identification in shield tunneling. Liu Wei [12] summarized the principles of hydraulic cylinder leakage, reversing valve leakage, and overflow valve leakage in the propulsion system of the shield machine based on expert experience. He then detected these anomalies based on if–then rules and a fuzzy inference engine. However, the shield tunneling process contains many parameters, and various abnormal scenarios involve complex correlations among multiple parameters. Traditional anomaly identification methods based on empirical rules may struggle to accurately describe the characteristics of different abnormal scenarios.
The occurrence of anomalies is often accompanied by changes in the characteristics of energy consumption. This view is supported by several theories. Energy dissipation theory suggests that in normal operating conditions, the transformation and dissipation of energy in the system follow a stable process, while abnormal behavior may result in abnormal energy flow and dissipation [13,14,15]. Catastrophe theory also indicates that anomalies can lead to the accumulation or release of energy, thereby causing a sudden change in energy consumption [15]. These theories provide a new perspective for anomaly detection. Existing studies have demonstrated the feasibility of using energy consumption to provide feedback on the operational state of the system. By monitoring the energy consumption in various stages of construction, different anomalies can be detected in time. Monferrer et al. [16] used the energy consumption of the spindle as an indicator for the real-time monitoring of the CFRP drilling process and detected multiple types of anomalies in the drilling process. Selvaraj et al. [17] and Quiroz et al. [18] collected and extracted energy consumption data features from systems operating in different states. They then constructed detection models using these features to differentiate between normal and abnormal operation modes of the systems, validating the effectiveness of using energy consumption data as the monitoring parameter. Therefore, the energy consumption in a system is often closely related to its stability. Using energy consumption as the detection target provides an effective solution to the selection of monitoring parameters for anomaly detection. In addition, since energy consumption can directly reflect the system’s operational state, anomaly detection methods based on energy consumption can capture different abnormal scenarios, thereby enhancing the generalization ability of the anomaly detection methods.
Inspired by this, energy consumption during the shield tunneling process can serve as an indicator of anomalies. Compared with other operational data, it can make the anomaly detection model more suitable for detecting various abnormal scenarios. Therefore, herein, we analyze and monitor the shield tunneling process from the perspective of energy consumption. Based on the characteristics of the energy consumption data of shield tunneling, we design an anomaly detection model that can improve the accuracy of anomaly detection. In addition, we further identify the scenarios of the detected anomalies based on the correlation among multidimensional variables. This can help engineers focus on specific issues, allowing them to respond in a more targeted and appropriate way. We therefore propose the AD_SI model (Anomaly Detection and Scenario Identification model of shield tunneling) to monitor the shield tunneling process and apply it to actual projects. The proposed method can not only detect various anomalies that occur during the shield tunneling process but also identify the scenarios of anomalies by considering the correlation of multiple construction parameters.
The remainder of this paper is organized as follows: Section 2 summarizes the research status of anomaly detection and identification. The structure and principle of the AD_SI model are introduced in Section 3. Section 4 presents the application of the model in a subway tunnel construction project in Nanjing and compares its performance with other anomaly detection models. Our main conclusions are provided in Section 5.
3. Shield Tunneling Anomaly Detection and Scenario Identification
3.1. System Framework
The framework of the Anomaly Detection and Scenario Identification model (AD_SI) is represented in Figure 1, and it can be divided into two phases: (1) detect abnormal sections in the shield tunneling process, and (2) identify the specific scenarios of anomalies based on anomaly detection. The first phase focuses more on the abnormal change within the time series, and the second phase focuses on the correlation among multidimensional parameters.
The first phase constructs an anomaly detection model using energy consumption as the monitoring parameter. Considering the advantages of unsupervised methods, the proposed anomaly detection model is based on an improved reconstruction model VAE-LSTM (Variational Autoencoder–Long Short-Term Memory). It can report potential abnormal segments by detecting changes in the state of energy consumption time-series data. In addition, recognizing that threshold setting can affect the model’s detection accuracy, we introduce a dynamic threshold setting within the model.
The first phase is accomplished through three steps: (1) Training the anomaly detection model to learn the data features of energy consumption in normal scenarios. (2) Setting a dynamic threshold based on the mean and variance of the reconstruction error of the training set. This threshold will be continuously adjusted with the updating of the training set after the completion of each tunneling ring. (3) Detecting abnormal sections by comparing the reconstruction error of testing data with the dynamic threshold. Data with reconstruction errors exceeding the threshold will be detected as potential anomalies.
The core task of the second phase is to identify scenarios of the abnormal sections. Based on the analysis of the feasibility and advantages of using the parameter correlation to realize abnormal identification in Section 2.2, we represent and identify scenarios of the abnormal sections detected in the first phase by using correlations among different construction parameters.
Therefore, we use two steps to realize the second phase: (1) Extracting parameter correlation from multidimensional data of known scenarios in historical projects. The correlation is used to create a feature matrix representing each known scenario, which is then stored in the parameter correlation base of known shield tunneling scenarios. (2) Extracting the parameter correlation of abnormal sections detected in the first phase. By comparing the parameter correlation of abnormal sections with that of known scenarios, we can obtain the specific abnormal scenario and thus achieve anomaly identification.
3.2. Anomaly Detection Model Based on Dynamic Threshold
The setting of the threshold in an anomaly detection model has an impact on the accuracy of anomaly detection. When the threshold is set too high, the anomaly detection model becomes conservative, leading to cases of missed anomaly detection. On the other hand, if the threshold is set too low, the model becomes overly sensitive, resulting in unnecessary false alarms. A reasonable and accurate threshold setting can reduce false alarms while capturing more abnormal data effectively. Furthermore, noise data and differences in data characteristics under different shield tunneling scenarios can introduce more interference in determining the threshold for the anomaly detection model. Therefore, to improve the performance of the anomaly detection model, we make improvements to the VAE-LSTM model and design an anomaly detection model with a dynamic threshold to monitor energy consumption in the shield tunneling process. The process of the shield tunneling detection model using the dynamic threshold, DT_VAE-LSTM (VAE-LSTM with Dynamic Threshold), is illustrated in Figure 2.
The DT_VAE-LSTM model can be divided into three modules: offline training, dynamic threshold setting, and online detection. The offline training module combines the VAE and LSTM algorithms to implement a reconstruction-based anomaly detection model, using anomaly-free data to train the model and learn data features in normal scenarios. Specifically, the VAE algorithm is used to learn the latent distribution of the data, while the LSTM algorithm can capture the temporal features. The dynamic threshold setting module sets the threshold based on the reconstruction error of the training data and updates it with each iteration of model training. The online detection module uses the trained anomaly detection model to reconstruct the test data, obtaining the reconstruction error. Finally, the anomaly detection result is obtained by comparing the reconstruction error of the test data with the threshold.
DT_VAE-LSTM first preprocesses the collected data. DT_VAE-LSTM primarily monitors the operational segment where the shield machine is working. Therefore, the data preprocessing stage initially removes data at the time of cessation. In addition, data are smoothed and normalized. After data preprocessing, the training data and the data to be detected enter the offline training phase and the online detection phase, respectively.
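The preprocessing steps described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `preprocess`, the use of zero advance speed to mark a stoppage, the moving-average smoother, and min-max normalization are all hypothetical choices, since the text does not specify which smoothing or normalization method is used.

```python
import numpy as np

def preprocess(energy, advance_speed, smooth_window=5):
    """Drop stoppage samples, smooth, and min-max normalize an energy series."""
    energy = np.asarray(energy, dtype=float)
    advance_speed = np.asarray(advance_speed, dtype=float)
    # Assumption: a sample belongs to a stoppage when the advance speed is not positive.
    working = energy[advance_speed > 0]
    # Assumption: moving-average smoothing; mode="valid" avoids padded edges.
    kernel = np.ones(smooth_window) / smooth_window
    smoothed = np.convolve(working, kernel, mode="valid")
    # Assumption: min-max normalization to [0, 1]; guard against a constant series.
    span = smoothed.max() - smoothed.min()
    if span == 0:
        return np.zeros_like(smoothed)
    return (smoothed - smoothed.min()) / span
```

The output is shorter than the input by the removed stoppage samples and by `smooth_window - 1` edge samples.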
3.2.1. Offline Training
During the offline training phase, the model is trained using data from the normal tunneling scenario to learn the distribution characteristics of data in the normal scenario. We first use a sliding window with length $p$ and step size 1 to cut the original training time-series data $X$ into subsequences so that the input of the encoder is a vector of the window size. Assuming a total of $n$ subsequences are obtained after segmentation, given a subsequence input $x_i$, the encoder transforms the input data into a latent representation, and the mean vector and variance vector of the latent space distribution can be obtained. The mean vector and variance vector generate the corresponding latent variable $z$ that satisfies the unit Gaussian distribution. We use $E = \{e_1, e_2, \ldots, e_n\}$ to represent all the embeddings output by the encoder, and $e_i$ represents the embedding of the $i$-th subsequence of the input data. We use a sliding window of length $L$ to segment $E$ and obtain $k$ non-overlapping subsequences (where $k = n/L$ is an integer). Afterwards, the LSTM model uses these subsequences of embeddings as inputs. We have the LSTM model take the first $L-1$ embeddings and predict the next $L-1$ embeddings. This process can be expressed using Formula (1):
$$[\hat{e}_2^{\,j}, \ldots, \hat{e}_L^{\,j}] = \mathrm{LSTM}\left([e_1^{\,j}, \ldots, e_{L-1}^{\,j}]\right) \quad (1)$$
where $e_i^{\,j}$ represents the $i$-th embedding in the $j$-th subsequence of embeddings.
When using $\hat{e}_i^{\,j}$ as the predicted results output by the LSTM model, the decoder in the DT_VAE-LSTM model uses $\hat{e}_i^{\,j}$ to reconstruct the data. The reconstructed data are the output of the decoder, represented as $\hat{x}$. The DT_VAE-LSTM model continuously optimizes its parameters by minimizing the objective function (Formula (2)) to accurately reconstruct the normal time-series data characteristics.
$$\mathcal{L} = \mathcal{L}_{KL} + \mathcal{L}_{rec} \quad (2)$$
The objective function consists of two parts, representing the KL divergence loss and the reconstruction error loss, respectively. $\mathcal{L}_{KL}$ measures the difference between the latent variable distribution and the predefined prior distribution. The reconstruction error loss $\mathcal{L}_{rec}$ measures the error between the reconstructed data $\hat{x}$ and the original data $x$.
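The two segmentation steps of this subsection (overlapping windows for the encoder, then non-overlapping groups of embeddings for the LSTM) can be sketched as below. The helper names `sliding_windows` and `lstm_pairs` are hypothetical, and the encoder is replaced by a fixed random projection purely as a placeholder so that the shapes can be checked; it is not the trained VAE. The window length 48, latent dimension 6, and group length 12 follow the settings reported in Section 4.2.

```python
import numpy as np

def sliding_windows(series, window, step=1):
    """Cut a 1-D series into overlapping windows of a fixed length."""
    series = np.asarray(series, dtype=float)
    count = (len(series) - window) // step + 1
    return np.stack([series[i * step : i * step + window] for i in range(count)])

def lstm_pairs(embeddings, group_len):
    """Split embeddings into non-overlapping groups and build (input, target) pairs."""
    embeddings = np.asarray(embeddings, dtype=float)
    k = len(embeddings) // group_len              # number of complete groups
    groups = embeddings[: k * group_len].reshape(k, group_len, -1)
    # Inputs are the first group_len - 1 embeddings; targets are shifted by one step.
    return groups[:, :-1, :], groups[:, 1:, :]

series = np.sin(np.linspace(0.0, 20.0, 100))      # toy stand-in for an energy series
windows = sliding_windows(series, window=48)       # shape (53, 48)

# Placeholder encoder: a fixed random projection to 6 dimensions, NOT a trained VAE.
rng = np.random.default_rng(0)
embeddings = windows @ rng.standard_normal((48, 6))

inputs, targets = lstm_pairs(embeddings, group_len=12)
print(inputs.shape, targets.shape)                 # (4, 11, 6) (4, 11, 6)
```

Each LSTM input in this sketch has shape (11, 6), which matches the input dimension stated in Section 4.2. Embeddings that do not fill a complete group are dropped here, which is one possible way to handle a remainder.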
3.2.2. Dynamic Threshold Setting
The setting of the threshold also impacts the results of anomaly detection, and it is essential to avoid setting an inappropriate threshold that may cause false negatives or false positives. In anomaly detection, thresholds are typically set as fixed values [34]. However, during the shield tunneling process, differences in data characteristics under different scenarios and local normal fluctuations in the time-series data can interfere with the threshold setting process. The method of using a constant threshold is simple and intuitive, but its lower flexibility leads to poorer performance in anomaly detection. Therefore, the model dynamically adjusts the threshold based on the reconstruction error values of the training data to ensure the reasonability of threshold setting. Assuming the reconstruction error of the training data with a length of
N is denoted as
,
is the reconstruction error of the
i-th sample in the training set, which is calculated as follows:
Considering that the fluctuating data and outliers in the training set may cause significant disturbances, using the maximum value of the reconstruction errors in the training set as the threshold directly is not precise enough. Therefore, the threshold
is dynamically adjusted based on the mean and standard deviation of the reconstruction errors of the training set.
is an ordered set of positive values, and the value is chosen to maximize Formula (5):
where
,
,
.
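As an illustration of a threshold built from the mean and standard deviation of the training reconstruction errors, the following sketch selects a multiplier from an ordered set of positive candidates. Everything specific here is an assumption: the form `mean + z * std`, the candidate grid, and especially the scoring rule, which is a hypothetical heuristic (reward the relative drop in mean and standard deviation after removing errors above the threshold, penalize the number of removed points) and is not claimed to be the criterion of Formula (5).

```python
import numpy as np

def dynamic_threshold(errors, z_candidates=None):
    """Pick a threshold of the form mean + z * std from candidate z values."""
    errors = np.asarray(errors, dtype=float)
    if z_candidates is None:
        z_candidates = np.arange(2.0, 10.5, 0.5)   # hypothetical candidate grid
    mu, sigma = errors.mean(), errors.std()
    if sigma == 0:
        return float(mu)                           # constant errors: nothing to separate
    best_eps = mu + z_candidates[0] * sigma        # fallback if no candidate removes anything
    best_score = -np.inf
    for z in z_candidates:
        eps = mu + z * sigma
        below = errors[errors <= eps]
        n_above = len(errors) - len(below)
        if n_above == 0 or len(below) == 0:
            continue
        # Hypothetical score: relative reduction in mean and std, per removed point.
        score = ((mu - below.mean()) / mu + (sigma - below.std()) / sigma) / n_above
        if score > best_score:
            best_score, best_eps = score, eps
    return float(best_eps)
```

In a rolling setting, this function would be called again on the reconstruction errors of the updated training set after each ring, so that the threshold follows the current data.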
3.2.3. Online Detection
In the online detection phase, the trained DT_VAE-LSTM model is used to reconstruct the time-series data to be tested, and the reconstruction error is used to represent the anomaly score. When detecting whether the data at time $t$ is abnormal, the sub-sequence ending at time $t$ is used as the test data. When using $x_t$ to represent data at time $t$, the DT_VAE-LSTM model can generate the reconstructed data denoted as $\hat{x}_t$ and calculate the reconstruction error between the reconstructed time series and the original time-series data. Finally, the reconstruction error $s_t$ is chosen as the anomaly score at time $t$. The anomaly detection result is determined by comparing $s_t$ with the dynamic threshold $\varepsilon$, and $y_t$ is used to represent the result. When $s_t > \varepsilon$, $x_t$ will be detected as an anomaly and $y_t = 1$.
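The comparison step of online detection can be sketched as follows. The function names are hypothetical, and grouping consecutive flagged time steps into segments is an added convenience for reporting continuous abnormal sections, not a step spelled out in the text.

```python
import numpy as np

def detect(scores, threshold):
    """Flag each time step whose anomaly score exceeds the threshold (1 = anomaly)."""
    scores = np.asarray(scores, dtype=float)
    return (scores > threshold).astype(int)

def abnormal_segments(flags):
    """Group consecutive flagged time steps into (start, end) index pairs, inclusive."""
    segments = []
    start = None
    for t, flag in enumerate(flags):
        if flag and start is None:
            start = t                       # a new abnormal segment begins
        elif not flag and start is not None:
            segments.append((start, t - 1)) # the segment ended at the previous step
            start = None
    if start is not None:
        segments.append((start, len(flags) - 1))
    return segments

flags = detect([0.1, 0.9, 0.8, 0.2, 0.7], threshold=0.5)
print(abnormal_segments(flags))             # [(1, 2), (4, 4)]
```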
3.3. Feature Presentation and Identification of Shield Tunneling Scenario
The anomaly detection model can only indicate whether anomalies occurred during the shield tunneling process. To provide effective references for the construction process, it is necessary to further identify the scenarios of the anomalies that occur. Therefore, we use accumulated historical shield tunneling data and corresponding scenarios to explore the features of known shield tunneling scenarios, providing references for identifying pending shield tunneling scenarios. We achieve the representation and identification of shield tunneling scenarios based on the correlation of multidimensional construction parameters. The model structure is shown in Figure 3.
For a specific shield tunneling scenario $r$, it is assumed that the shield tunneling scenario is represented by the correlation of $m$ construction parameters in a continuous sequence of $n$ rings. The correlation is represented using the Pearson correlation coefficient, which is calculated as follows:
$$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} \quad (6)$$
In Formula (6), $\operatorname{cov}(X, Y)$ represents the covariance between variable $X$ and variable $Y$, while $\sigma_X$ and $\sigma_Y$ represent the standard deviations of variable $X$ and variable $Y$, respectively. Therefore, the final feature matrix representing the shield tunneling scenario $r$ is constructed as follows:
Assuming that there are known scenarios in the historical project data, there will be feature matrices representing the known shield tunneling scenarios. These matrices form a parameter correlation base of known shield tunneling scenarios, denoted as .
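A correlation-based feature matrix of this kind can be sketched with NumPy. The sketch assumes that the feature matrix is the m × m matrix of pairwise Pearson coefficients computed over all samples of the segment; the exact layout of the feature matrix in the model may differ. The function name and the toy parameter series are hypothetical.

```python
import numpy as np

def correlation_features(samples):
    """Return the m x m Pearson correlation matrix of an array of shape (num_samples, m)."""
    samples = np.asarray(samples, dtype=float)
    # rowvar=False treats each column as one construction parameter.
    # Note: a constant column has zero variance and yields NaN entries.
    return np.corrcoef(samples, rowvar=False)

# Toy segment with three hypothetical parameters (e.g. torque, thrust, speed).
base_signal = np.array([1.0, 2.0, 4.0, 3.0, 5.0])
segment = np.column_stack([base_signal, 2.0 * base_signal + 1.0, -base_signal])

features = correlation_features(segment)
print(features.shape)   # (3, 3)
```

A base of known scenarios could then be kept as a mapping from scenario name to feature matrix, for example `{"mudcake event": features}`.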
The different dimensions of the feature matrix representing the shield tunneling scenarios have different meanings. The Mahalanobis distance takes into account the differences between different dimensions when calculating matrix similarity [35]. Therefore, we use the Mahalanobis distance between feature matrices of different shield tunneling scenarios as a measure of similarity for shield tunneling scenarios. A smaller Mahalanobis distance indicates a higher similarity between scenarios. Specifically, when identifying a pending scenario, the first step is to construct a feature matrix $s$ that represents the characteristics of that shield tunneling segment. Next, $s$ is compared with the feature matrices in the parameter correlation base of known shield tunneling scenarios. The matching formula is shown as Formula (7).
$$s\_result = \arg\min_{i} MD(s, i) \quad (7)$$
In Formula (7), $MD(s, i)$ represents the Mahalanobis distance between the feature matrix $s$ and that of a known shield tunneling scenario $i$. A smaller value indicates a higher similarity between the pending scenario and scenario $i$, and $s\_result$ represents the known shield tunneling scenario whose feature matrix is most similar to $s$.
In addition, we also set a similarity threshold $\delta$ based on the distribution of MD values for the same scenarios in the parameter correlation base of known shield tunneling scenarios. If $MD(s, s\_result) \le \delta$, the pending scenario belongs to the shield tunneling scenario $s\_result$. If $MD(s, s\_result) > \delta$, the pending scenario will be identified as a new scenario.
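The matching and thresholding logic can be sketched as below. Several details are assumptions made only for illustration: each symmetric feature matrix is flattened to the vector of its upper-triangle entries, the inverse covariance matrix `inv_cov` is supplied by the caller (for instance, estimated from historical feature vectors), and the names `identify` and `max_distance` are hypothetical.

```python
import numpy as np

def upper_triangle(matrix):
    """Flatten the strictly upper-triangular entries of a square matrix."""
    m = np.asarray(matrix, dtype=float)
    rows, cols = np.triu_indices(m.shape[0], k=1)
    return m[rows, cols]

def mahalanobis(u, v, inv_cov):
    """Mahalanobis distance between two vectors for a given inverse covariance."""
    d = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.sqrt(max(d @ inv_cov @ d, 0.0)))

def identify(pending, base, inv_cov, max_distance):
    """Return the closest known scenario, or 'new scenario' if it is too far away."""
    s = upper_triangle(pending)
    distances = {name: mahalanobis(s, upper_triangle(mat), inv_cov)
                 for name, mat in base.items()}
    best = min(distances, key=distances.get)
    if distances[best] <= max_distance:
        return best, distances[best]
    return "new scenario", distances[best]

def symmetric(a, b, c):
    """Helper for the toy example: 3 x 3 correlation-like matrix from three entries."""
    return np.array([[1.0, a, b], [a, 1.0, c], [b, c, 1.0]])

base = {"mudcake event": symmetric(0.9, 0.1, 0.2),
        "cutter wear": symmetric(-0.5, 0.6, 0.0)}
inv_cov = np.eye(3)   # identity: the distance reduces to Euclidean in this toy case
label, dist = identify(symmetric(0.8, 0.1, 0.2), base, inv_cov, max_distance=0.3)
print(label)          # mudcake event
```

With a non-identity `inv_cov`, dimensions with larger historical variance contribute less to the distance, which is the property that motivates the Mahalanobis distance here.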
4. Engineering Application
We applied the AD_SI model on the K section of a subway tunnel construction project in Nanjing and analyzed its application performance. The AD_SI model allows for dynamic and continuous model training and testing during the shield construction process and continuously adjusts the model’s training data according to the anomaly detection results. Therefore, this model can adapt to the dynamically changing external environment.
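The rolling adjustment of the training data can be sketched as follows. The rule used here, appending only the samples of a completed ring that were not flagged as anomalies and keeping the most recent samples up to a fixed capacity, is an assumption for illustration; the text states that the training data are adjusted according to the detection results but does not give the exact rule. The function name is hypothetical, and the default capacity of 2000 follows the training set size reported in Section 4.2.

```python
from collections import deque

def update_training_set(training, ring_samples, ring_flags, capacity=2000):
    """Add unflagged samples from a completed ring and keep the newest `capacity` samples."""
    window = deque(training, maxlen=capacity)   # older samples fall out automatically
    for sample, flag in zip(ring_samples, ring_flags):
        if not flag:                            # assumption: flagged samples are excluded
            window.append(sample)
    return list(window)

updated = update_training_set([0, 1, 2, 3, 4], [10, 11, 12], [0, 1, 0], capacity=5)
print(updated)   # [2, 3, 4, 10, 12]
```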
4.1. Engineering Background
The left line of the K section in the subway tunnel construction project in Nanjing has a length of 734.514 m, with a total of 616 rings. The shield tunneling process uses an earth pressure balance shield machine with a diameter of 6.2 m. The main geological types that this project traverses are interlayers of silty clay and silt, silty clay with silt and mud, clay, and weathered silty sandstone. The geological conditions within this section exhibit alternating soft and hard characteristics, with significant variations and distinct differences in engineering properties. The geological distributions of the K section are shown in Figure 4.
4.2. Application Overview
We applied the AD_SI model to the detection of the shield tunneling process in the K section from ring 126. The shield tunneling information was obtained from real-time data collected by sensors installed on the shield machine. The sensors collected data at a sampling frequency of one sample per second. We also calculated the corresponding energy consumption based on the real-time current and voltage information of the shield machine collected during the tunneling process and the time taken to advance a unit distance. During the application of the AD_SI model, the construction parameters mainly used are the overall energy consumption of the shield machine, cutterhead torque, cutterhead rotation speed, total thrust, speed, and earth pressure. In the AD_SI model, the sliding window length for cutting the original time series is 48. Both the encoder and decoder of the VAE are set with a four-layer structure, and the dimension of the latent variable is set to 6. The sliding window length for cutting the LSTM input data is set to 12, and the hidden size of the LSTM unit is set to 64. Since the input of the LSTM is the embeddings obtained from the VAE encoder and we have the LSTM model take the first 11 embeddings and predict the next 11 embeddings, the dimension of the input vector of the LSTM network is (11, 6). We collect data samples with each 5 mm advancement of the shield tunneling machine, using 2000 samples to construct the training set. In addition, we implement rolling training for the model, updating the training set when the shield tunneling machine completes one ring of advancement. We also built a base containing feature representations of three known abnormal scenarios based on historical construction projects. The known abnormal scenarios are cutter wear, mudcake event, and geological transition from soft soil to hard rock layer.
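The energy consumption per unit distance mentioned above can be illustrated with a simplified calculation. This sketch assumes that instantaneous power is the plain product of voltage and current, with no power factor or three-phase correction, and that samples are evenly spaced in time; the actual calculation used in the project is not given in the text, and the function name is hypothetical.

```python
import numpy as np

def energy_per_unit_distance(voltage, current, dt_seconds, distance_m):
    """Energy (J) consumed over a segment divided by the distance advanced (m)."""
    voltage = np.asarray(voltage, dtype=float)
    current = np.asarray(current, dtype=float)
    # Assumption: power = U * I for every sample (simplified electrical model).
    power_w = voltage * current
    # Sum power over time; dt_seconds is the fixed sampling interval.
    energy_j = float(np.sum(power_w * dt_seconds))
    return energy_j / distance_m

# Five one-second samples at 400 V and 10 A over a 5 mm advance.
value = energy_per_unit_distance([400.0] * 5, [10.0] * 5, dt_seconds=1.0, distance_m=0.005)
```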
The AD_SI model reported three anomalies within the K section. Three anomalies occurred during the actual construction process, located at rings 210–221, rings 231–240, and rings 258–268, respectively. Figure 5 illustrates the variation in energy consumption of this segment. It also marks the actual range of the three construction risk events and the anomalies detected by the AD_SI model. Table 1 summarizes the anomaly detection results of the AD_SI model and the corresponding actual anomalies.
The AD_SI model initially detected a prolonged and continuous anomaly within the range of ring 210–221 and determined the abnormal scenario to be the geological transition. After the AD_SI model reported the anomaly, the engineers observed and analyzed the muck and concluded that the geological conditions encountered by the shield machine began to change when the shield cutterhead approached near ring 216 (corresponding to the actual ring number 210).
Subsequently, the AD_SI model detected long-lasting anomalies, again starting from ring 231. It identified an abnormal scenario from ring 232 to ring 240, and the abnormal scenario was geological transition. The engineers did not pay much attention to the anomaly reported by the model at the beginning. They then noticed a significant change in parameters starting from ring 235. Subsequently, they analyzed the muck and historical data and concluded that the shield cutterhead gradually traversed from highly weathered silty sandstone to moderately weathered silty sandstone from ring 231 to ring 240. Therefore, the AD_SI model we propose can provide feedback on the changes in the operational status of the shield machine and accurately identify the scenario when the geological condition changes.
Finally, the AD_SI model detected continuous anomalies starting from ring 258 to ring 269, and it determined that this segment’s anomaly corresponded to a mudcake event. According to feedback from the engineers, the shield machine’s efficiency decreased significantly from ring 265. Then, the engineers inspected the shield machine at ring 268. It was discovered that there was a severe mudcake phenomenon inside the earth chamber (see Figure 6). After expert analysis, it was estimated that the mudcake started forming at ring 258. The anomaly continued until ring 268, when the mudcake was cleaned up. Therefore, the AD_SI model we propose demonstrates good detection performance for mudcake events. It can detect anomalies earlier than engineers through energy consumption monitoring and accurately identify specific abnormal scenarios.
4.3. Model Performance Analysis
To validate the performance of the proposed anomaly detection method, we compared it with several comparative models in three comparative experiments, each addressing a different aspect. The settings for the comparative models are shown in Table 2. Firstly, to compare the performance of different anomaly detection algorithms, we compared the anomaly detection results of DT_VAE-LSTM with those of VAE and AE-LSTM. Furthermore, to validate the effectiveness of the dynamic threshold proposed in Section 3.2, we constructed the VAE-LSTM with a fixed threshold. The fixed threshold was set using the maximum reconstruction error of the training dataset obtained during the initial model training. We compared its detection performance with that of the DT_VAE-LSTM. Secondly, to investigate the impact of selecting different monitoring parameters on detection performance, we replaced the monitoring parameter of the anomaly detection model with the cutterhead torque and total thrust. We compared the detection performance of the DT_VAE-LSTM (torque) and DT_VAE-LSTM (thrust) with the DT_VAE-LSTM (energy) model using the same DT_VAE-LSTM algorithm. Lastly, the proposed AD_SI model combined the results of the anomaly detection model and scenario identification model to make the final decision. Therefore, we also verified whether the fusion of the two-stage results contributes to improving anomaly detection performance.
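We used precision, recall, and F1 as evaluation metrics for the model’s performance, as shown in Formulas (8)–(10).
$$\mathrm{Precision} = \frac{TP}{TP + FP} \quad (8)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \quad (9)$$
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \quad (10)$$
TP represents the number of data points correctly labeled as anomalies, FP represents the number of data points incorrectly labeled as anomalies, TN represents the number of data points correctly labeled as normal, and FN represents the number of data points incorrectly labeled as normal.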
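For reference, the three metrics can be computed from point-wise labels as in the following sketch, which uses the standard definitions of precision, recall, and F1. The function name and the convention of returning 0 when a denominator is zero are choices made for this example.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 from binary labels (1 = anomaly, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Convention for this sketch: a metric with a zero denominator is reported as 0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Two true positives, one false positive, one false negative.
p, r, f1 = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

TN does not appear in any of the three formulas, which is why these metrics remain informative when normal data vastly outnumber abnormal data.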
4.3.1. Comparison of Anomaly Detection Algorithms
In comparative experiment 1, we compared the performance of different anomaly detection algorithms when using energy consumption as the monitoring parameter. The detection results and a comparison of the different threshold settings are shown in Figure 7. To provide clearer visualization of threshold settings and anomaly detection results, we add magnified insets of the detection results of VAE-LSTM and DT_VAE-LSTM for abnormal events 2 and 3 in Figure 7. The detection performance of each algorithm is presented in Table 3. The proposed DT_VAE-LSTM model exhibits significantly higher recall values for various abnormal scenarios than other methods. This indicates that the proposed method can more accurately identify abnormal information in the shield tunneling data, resulting in a lower false negative rate. In particular, during the detection process of abnormal event 2 and abnormal event 3, the DT_VAE-LSTM detected anomalies significantly earlier than the AE-LSTM and VAE, demonstrating more stable and reliable anomaly detection performance. Although the VAE-LSTM, which set a fixed threshold in the comparative experiment, somewhat reduces the false positive rate, it lacks adaptability to data in different scenarios, resulting in decreased detection performance when the construction scenario changes. The dynamic threshold method proposed in this article can reduce the influence of noise data and set more accurate thresholds by adapting to data from different shield tunneling scenarios. The VAE-LSTM exhibited noticeable false negatives in the detection of abnormal event 2 and abnormal event 3, with a reduction of 21% and 8% in recall values, respectively, compared to the DT_VAE-LSTM based on dynamic threshold setting. Therefore, the anomaly detection method we propose, which is based on dynamic threshold setting, can achieve a better balance between recall and precision. The F1 score consistently remains the highest, thereby enhancing the overall detection capability of various abnormal scenarios.
4.3.2. Comparison of Monitoring Parameters
To compare the impact of different monitoring parameters on the results, we replaced the monitoring parameter in
Section 3.2 with cutterhead torque and total thrust as comparative models. Cutterhead torque and total thrust are commonly used detection parameters in the shield construction field. The detection results based on different monitoring parameters are shown in
Figure 8, and the anomaly detection performance of each model is shown in
Table 4. Firstly, compared to DT_VAE-LSTM (energy), DT_VAE-LSTM (torque) and DT_VAE-LSTM (thrust) are more susceptible to interference and therefore more prone to false alarms. This is especially true of DT_VAE-LSTM (thrust), whose precision was consistently below 0.6. Such false alarms hinder the precise localization of the anomaly range and thereby reduce the credibility of the anomaly detection results. Secondly, DT_VAE-LSTM (thrust) missed much of abnormal event 1, with a recall value of only 0.53. Thirdly, during the detection of abnormal event 2, DT_VAE-LSTM (torque) reported the anomaly significantly later than DT_VAE-LSTM (energy). Therefore, DT_VAE-LSTM (thrust) and DT_VAE-LSTM (torque) exhibited a higher rate of missed detections, indicating poorer performance. In conclusion, the
F1 score of DT_VAE-LSTM (energy) is higher than those of the other two comparative models on all three abnormal events. Using energy consumption as a monitoring parameter can better reflect the overall status of the shield tunneling process and provide more valuable references for detecting and locating the range of anomalies.
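For readers who want to reproduce the kind of comparison reported above, the following short Python sketch computes precision, recall, and the F1 score from point-wise labels. It assumes that each time step is labeled 0 (normal) or 1 (abnormal) and that the metrics are evaluated per time step; the sample labels are invented for illustration and are not taken from the Nanjing project data.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary point-wise labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)       # true positives
    fp = sum(1 for t, p in pairs if not t and p)   # false alarms
    fn = sum(1 for t, p in pairs if t and not p)   # missed detections
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example: the anomaly spans steps 2-5; the detector raises
# two false alarms and misses the second half of the event.
y_true = [0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [0, 1, 1, 1, 0, 0, 0, 1]
print(precision_recall_f1(y_true, y_pred))
```

False alarms lower precision and missed detections lower recall, so a model must limit both to obtain a high F1 score.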
4.3.3. Comparison of Fusion Effects of Anomaly Detection and Identification
The proposed AD_SI model applies the scenario identification model to the detected abnormal sections once the anomaly detection model has detected continuous anomalies. It then integrates the results of the anomaly detection model and the scenario identification model to make the final judgment. An anomaly detection model used on its own may produce some false positives or false negatives, but incorporating scenario identification as supplementary information helps refine the anomaly detection results. This fusion method therefore combines results from two perspectives, further enhancing the accuracy and robustness of anomaly detection.
Figure 9,
Figure 10 and
Figure 11 show the performance of the AD_SI model and all comparative models on the three abnormal events. The AD_SI model further improves the anomaly detection performance, with its
F1 score being consistently higher than that of any other model.
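The fusion idea described in this subsection can be sketched in a few lines of Python. The sketch below is only one possible reading: it assumes that "continuous anomalies" means a run of at least `min_run` consecutive flagged time steps, and that a flagged section is kept only when a scenario identifier returns a label other than `"normal"`. The function names, the `min_run` value, the labels, and the stand-in identifier are all hypothetical and do not reproduce the AD_SI decision rule.

```python
def find_runs(flags, min_run=3):
    """Return (start, end) pairs, end exclusive, of flagged runs >= min_run."""
    runs, start = [], None
    for i, flagged in enumerate(flags):
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            if i - start >= min_run:
                runs.append((start, i))
            start = None
    if start is not None and len(flags) - start >= min_run:
        runs.append((start, len(flags)))
    return runs


def fuse(flags, identify_scenario, min_run=3):
    """Keep sustained flagged sections that the identifier labels abnormal."""
    confirmed = []
    for start, end in find_runs(flags, min_run):
        label = identify_scenario(start, end)  # hypothetical callback
        if label != "normal":
            confirmed.append((start, end, label))
    return confirmed


# Invented detector output: one isolated flag and two sustained sections.
flags = [0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1]


def toy_identifier(start, end):
    """Stand-in for a scenario identification model (illustration only)."""
    return "mudcake" if start == 3 else "normal"


print(fuse(flags, toy_identifier))
```

Here the isolated flag is discarded as too short, and the second sustained section is discarded because the stand-in identifier labels it as normal, which mirrors how supplementary scenario information can suppress false alarms.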
Hu et al. [
7] designed KMCED, an anomaly detection model driven by both data and knowledge, and applied it to the same shield tunneling project to detect abnormal event 3. We compared the detection performance of the KMCED model and the AD_SI model (
Figure 11). Both KMCED and AD_SI demonstrated good capabilities in capturing abnormal event 3, with recall values exceeding 0.9. However, KMCED had a higher false alarm rate (precision = 0.6), leading to the imprecise localization of the actual range of anomalies and impacting the reliability of the anomaly detection results. Therefore, the AD_SI model performed significantly better in detecting abnormal event 3 (
F1 = 0.81) compared to the KMCED model proposed in reference [
7] (
F1 = 0.56).
5. Conclusions
This paper proposes the AD_SI model for anomaly detection and identification in the shield tunneling process. The model has two main phases: detecting various anomalies and identifying the specific abnormal scenarios. The method introduces energy consumption data as a monitoring parameter for anomaly detection. Because the anomaly detection model is built on energy consumption, it can detect various types of anomalies during the shield tunneling process, which broadens the model’s applicability. In addition, we identify the scenarios of the anomalies based on the relationships among different parameters and thus provide a more straightforward explanation for the detected anomalies.
The model was applied to a subway tunnel construction project in Nanjing, and the following conclusions were obtained:
- (1) The energy consumption data during the shield tunneling process can serve as a monitoring parameter for anomaly detection. The AD_SI model we propose can detect various anomalies in time based on the energy consumption data. Furthermore, the AD_SI model performs better than models that are based on other commonly used monitoring parameters.
- (2) The representation and identification of shield tunneling scenarios can be realized based on the correlation among construction parameters. After detecting prolonged anomalies, the AD_SI model uses the verification based on the correlation of parameters to accurately identify the shield tunneling scenario, thereby realizing anomaly identification. This provides a further explanation for abnormal energy consumption behavior.
- (3) Compared to all the comparative models, the AD_SI model exhibits higher sensitivity, enabling it to detect various anomalies in time. Its warning time is significantly earlier than the engineers’ discovery time, which can provide effective support for risk monitoring and disposal in the shield tunneling process.
However, the AD_SI model still has certain limitations. Firstly, we only considered the linear relationships among the parameters when realizing anomaly identification. Future research could take into consideration the non-linear relationships among the parameters. In addition, we only validated the detection and identification performance of the AD_SI model on the geological transition and mudcake events. Future research could explore applying the model to the detection of other shield tunneling abnormal scenarios, thus further enhancing the model’s practicality.
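The first limitation can be illustrated with a small Python sketch. It assumes that the linear relationship between two parameters is measured by the Pearson correlation coefficient, which is an assumption made here for illustration; the data are synthetic and the variable names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]   # exactly linear in x
quadratic = [v ** 2 for v in x]   # fully determined by x, but not linear

print(pearson_r(x, linear))
print(pearson_r(x, quadratic))
```

The quadratic series depends entirely on `x`, yet its linear correlation with `x` is zero, so a purely linear measure can miss a strong non-linear dependence between parameters.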