Process Monitoring of One-Shot Drilling of Al/CFRP Aeronautical Stacks Using the 1DCAE-GMM Framework

Mattera, Giulio; Marchesano, Maria Grazia; Caggiano, Alessandra; Guizzi, Guido; Nele, Luigi

doi:10.3390/electronics14091777

Open Access Article


by Giulio Mattera ¹,*, Maria Grazia Marchesano ¹, Alessandra Caggiano ², Guido Guizzi ¹ and Luigi Nele ¹

¹ Department of Chemical, Materials and Industrial Production Engineering, University of Naples Federico II, 80125 Naples, Italy

² Department of Industrial, Electronic and Mechanical Engineering, Roma Tre University, 00146 Rome, Italy

* Author to whom correspondence should be addressed.

Electronics 2025, 14(9), 1777; https://doi.org/10.3390/electronics14091777

Submission received: 25 March 2025 / Revised: 20 April 2025 / Accepted: 25 April 2025 / Published: 27 April 2025

(This article belongs to the Special Issue Applications of Artificial Intelligence in Intelligent Manufacturing)


Abstract:

This study explores advanced process monitoring for one-shot drilling of aeronautical stacks made of aluminium 2024 and carbon fibre-reinforced polymer (CFRP) laminates using a 4.8 mm diameter drilling tool and unsupervised machine learning techniques. An experimental campaign is conducted to collect thrust force and torque signals at a 10 kHz sampling rate during the drilling process. These signals are employed for real-time process monitoring, focusing on material change detection and anomaly identification, where anomalies are defined as holes that fail to meet predefined quality criteria. An innovative approach based on unsupervised learning is proposed to enable automatic material change identification, signal segmentation, feature extraction, and hole quality assessment. Specifically, a semi-supervised approach based on a Gaussian Mixture Model (GMM) and 1D Convolutional AutoEncoder (1D-CAE) is employed to detect deviations from normal drilling conditions. The proposed method is benchmarked against state-of-the-art supervised techniques, including logistic regression (LR) and Support Vector Machines (SVMs). Results show that these traditional models struggle with class imbalance, leading to overfitting and limited generalisation, as reflected by the F1 scores of 0.78 and 0.75 for LR and SVM, respectively. In contrast, the proposed semi-supervised approach improves anomaly detection, achieving an F1 score of 0.87 by more effectively identifying poor-quality holes. This study demonstrates the potential of deep learning-based semi-supervised methods for intelligent process monitoring, enabling adaptive control in the drilling process of hybrid stacks and detecting anomalous holes. While the proposed approach effectively handles small and imbalanced datasets, further research into the application of generative AI could enhance performance, aiming for F1 scores above 0.90, thereby supporting adaptation in real industrial environments with high performance.

Keywords:

machine learning; advanced process monitoring; composite materials; multi-material stack; drilling

1. Introduction

In aerospace manufacturing, drilling of multi-material stacks is the most utilised machining process during the assembly of aircraft [1]. These stacks consist of metals such as aluminium- and titanium-based aerospace alloys, as well as composite materials like carbon fibre-reinforced polymer (CFRP) or aramid fibre-reinforced polymer (AFRP) composites [2,3,4]. Finding the optimal balance among time efficiency, cost-effectiveness, and uncompromised product quality has always been a core objective of the aerospace industry. Today, intelligent manufacturing [5] is revolutionising this landscape, enabling the integration of advanced process monitoring systems, with the promise of improving the overall quality of processes in smart factories. These advanced monitoring systems track every stage of the drilling process, identifying potential issues such as misalignment, excessive tool wear [6], or tool breakage [7]. Moreover, detecting the transition between different materials in multi-material stack drilling enables the dynamic adjustment of process parameters [8], helping each multi-layer stack to meet the stringent quality standards required in aerospace applications.

Machine learning algorithms, brought by the Fourth Industrial Revolution, serve as the analytical backbone of these smart systems. They can process vast amounts of data generated during manufacturing operations to extract actionable insights, enable predictive capabilities, and make well-informed decisions that improve production efficiency while reducing the risk of costly rework or failure [9].

In aeronautical stack drilling, different applications have been developed for process monitoring using machine learning techniques. Caggiano et al. [10] utilised a multi-sensor monitoring system comprising force, torque, and acoustic emission sensors to estimate tool wear at the end of each CFRP/CFRP drilling operation using an artificial neural network (ANN). Similarly, Domínguez-Monferrer et al. [11] applied a Random Forest model to estimate tool wear in CFRP drilling.

Hitze et al. [12] used decision trees and ANNs to analyse motor current and machine vibration data, enabling the detection of process parameters and potentially identifying material change. Likewise, Haouoa et al. [13] employed motor currents, a dynamometer, and accelerometers with a Random Forest classifier to detect material change. Cruz et al. [14] utilised a multi-sensor monitoring system with force, torque, and acoustic emission sensors to assess the quality of aluminium/titanium drilling using an ANN. Recently, Lee et al. [15] implemented an Ensemble Neural Network (ENN) to estimate hole quality based on torque signals, specifically considering delamination in CFRP holes.

As highlighted in the literature review, only supervised machine learning methods have been employed for process monitoring applications—regression models for tool wear estimation and classification models for material change point detection and quality assessment. However, industrial datasets are often imbalanced and not always large enough to meet the requirements of supervised approaches [16]. As a result, there is growing interest in leveraging both semi-supervised and unsupervised techniques for industrial applications. These methods allow for the development of classification models, such as material change detection and hole quality estimation, without relying on extensive labelled data or large datasets. Despite this increasing interest, no studies have explored the application of unsupervised learning for process monitoring in multi-material and multi-stack drilling within the aerospace industry.

In this work, two unsupervised machine learning approaches are proposed for material change detection and hole quality assessment in one-shot drilling of multi-material aluminium 2024/CFRP stacks. An experimental drilling campaign was conducted, using an automated system representative of aerospace manufacturing processes. Thrust force and torque signals were collected and analysed to identify material change points, which can serve as a basis for real-time process parameter adjustments or as features for further analysis. These extracted features were then fed to an unsupervised anomaly detection procedure, where anomalies—defined as holes that deviate from the required quality standards—were identified using advanced machine learning techniques. The proposed methodology enhances process monitoring by providing a robust, data-driven approach for detecting deviations in drilling operations and ensuring compliance with aerospace quality standards. Moreover, the novelty lies in the use of an unsupervised approach, which effectively addresses challenges such as class imbalance and limited generalisation commonly encountered in industrial datasets.

2. Materials and Methods

2.1. Data Collection

To demonstrate the methodology for unsupervised learning-based monitoring of the drilling process, an aerospace multi-material stack comprising 2024 aluminium alloy and CFRP is employed. Specifically, the stack consists of a 2.5 mm thick 2024 aluminium alloy sheet, followed by two 3 mm CFRP laminates, resulting in a total stack thickness of 8.5 mm. The 2024 aluminium alloy is subjected to solution heat treatment at 500 °C for 40 min, followed by quenching and natural ageing for 96 h (T42). The CFRP laminates are made of epoxy-impregnated graphite tape plies in the intermediate layers, while the first and last three plies of each laminate consist of two fibreglass plies and one epoxy-impregnated graphite woven fabric ply. For all tests, two tungsten carbide twist drills with a point angle of 120°, a helix angle of 25°, and a diameter of 4.8 mm are employed to drill two series of 50 sequential holes, for a total of 100 holes, at a feed rate of 0.13 mm/rev and a spindle speed of 4500 rev/min. The process parameters were selected based on the tool manufacturer specifications and preliminary drilling tests conducted on the same multi-material stack configuration. A review of the literature identified relevant parameters for CFRP [17,18], aluminium [19,20], and hybrid stacks [21,22], revealing significant differences in conditions such as lubrication, tool coating, diameter, and stack thickness. These parameters were optimised for our specific experimental setup, accounting for differences in tool geometry, thickness, coating, and the absence of lubrication assistance for aluminium. The experimental conditions are summarised in Table 1.

To develop a cognitive monitoring module capable of estimating drill quality at the end of each hole (where a good hole is defined in this study as a hole with a delamination factor below a specified limit), a multi-sensor monitoring system was employed to collect thrust force along the vertical direction (Fz) and cutting torque about the vertical axis (Mz). The experimental setup is illustrated in Figure 1. The specimens are mounted within a custom-designed specimen holder, specifically engineered for the experimental tests. The monitoring system is composed of a Kistler 9272 sensor (Kistler Group, Winterthur, Switzerland), which measures both thrust force and drilling torque; a Kistler 5165A amplifier (Kistler Group, Winterthur, Switzerland); and an NI USB-6361 data acquisition device (DAQ) (National Instruments, Austin, TX, USA), which samples the signals at a frequency of 10 kHz. These components are connected to an industrial PC operating on a Windows platform. Data acquisition and real-time process state diagnosis are performed using the nidaqmx Python library. Subsequently, the collected data are processed through machine learning techniques for further analysis.
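As a rough orientation on signal length, the cutting parameters and sampling rate stated above fix the nominal feed velocity and, from it, an approximate number of samples per hole. The sketch below is plain arithmetic on the stated values; it assumes the full 8.5 mm stack is cut at constant feed and ignores the drill point height and any approach or retract motion, so real recordings can be expected to be somewhat longer.

```python
# Nominal values taken from the text. The calculation ignores drill point
# height, approach and retract, so it is only a lower bound on cutting time.
FEED_MM_PER_REV = 0.13
SPINDLE_REV_PER_MIN = 4500
STACK_THICKNESS_MM = 2.5 + 2 * 3.0   # aluminium sheet + two CFRP laminates
SAMPLING_RATE_HZ = 10_000

feed_velocity_mm_s = FEED_MM_PER_REV * SPINDLE_REV_PER_MIN / 60.0
cutting_time_s = STACK_THICKNESS_MM / feed_velocity_mm_s
samples_per_hole = round(cutting_time_s * SAMPLING_RATE_HZ)

print(f"feed velocity: {feed_velocity_mm_s:.2f} mm/s")
print(f"nominal cutting time: {cutting_time_s:.2f} s")
print(f"samples per channel: {samples_per_hole}")
```

Under these assumptions each hole yields a little under one second of signal per channel, i.e. several thousand samples.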

In this work, a smart monitoring system that learns patterns of optimal quality signals from specific combinations of process parameters was developed. This approach is particularly relevant for industry, where consistent parameters are used to drill stacks of specific thicknesses and material arrangements for different aircraft parts. Distinct models can be created for specific assembly parts, which can be managed using simple if–else logic. Thus, Tool 1 and Tool 2, as reported in Table 1, allow for the assessment of process parameter repeatability and testing of the algorithm with additional data. Every three holes, the tool wear of the drill bit, the delamination factor, and the hole diameter were recorded to monitor the progressive degradation of the tool and its effect on hole quality. Tool wear ($VB$) was measured as the width of the flank wear at $D/6$ of the tool’s diameter, using a Keyence VHX-7000 optical microscope (Keyence Corporation, Osaka, Japan), following standard procedures for assessing tool degradation and quality [23]. The delamination factor ($F_{d}$) was computed as per Equation (1), where $D_{max}$ is the maximum damaged diameter observed at the hole entry or exit, and $D_{nominal}$ is the nominal hole diameter. This metric quantifies the extent of the peel-up and push-out delamination around the hole perimeter, which is critical in composite drilling applications. This parameter is commonly used in aeronautics as a quality criterion, as excessive delamination can compromise the safety of riveted joints [24,25,26]. Figure 2 illustrates the evaluation process of tool wear and the delamination factor used in this study.

$F_{d} = \frac{D_{max}}{D_{nominal}}$ (1)

By analysing the hole diameters and delamination factors, it was observed that all holes remained within the tolerance range for diameter but exhibited an increase in delamination. Consequently, a threshold (THR) was defined for the delamination factor, leading to the classification of 82 holes as “good quality (0)” and 18 as “low quality (1)”. Given the imbalanced nature of the dataset, an unsupervised machine learning approach is proposed to enable process monitoring despite the limited and imbalanced data.
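The labelling rule can be illustrated with a minimal sketch of Equation (1) followed by a threshold test. The threshold value, the damaged diameters, and the choice of a strict inequality are all assumptions made for illustration, since the source does not state the numerical value of THR; only the 4.8 mm nominal diameter comes from the text.

```python
def delamination_factor(d_max_mm: float, d_nominal_mm: float) -> float:
    """Equation (1): ratio of maximum damaged diameter to nominal diameter."""
    if d_nominal_mm <= 0:
        raise ValueError("nominal diameter must be positive")
    return d_max_mm / d_nominal_mm


def quality_label(fd: float, threshold: float) -> int:
    """Return 0 for a good-quality hole and 1 for a low-quality hole."""
    return 1 if fd > threshold else 0


# Hypothetical values: the damaged diameters and the threshold are
# placeholders for illustration, not measurements or limits from the study.
D_NOMINAL_MM = 4.8      # tool diameter stated in the text
EXAMPLE_THR = 1.2       # assumed threshold, not given in the source

for d_max in (5.0, 6.0):
    fd = delamination_factor(d_max, D_NOMINAL_MM)
    print(f"D_max={d_max} mm -> Fd={fd:.3f}, label={quality_label(fd, EXAMPLE_THR)}")
```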

2.2. Material Change Detection and Automatic Signal Segmentation

In this study, a methodology for detecting material changes in multi-material drilling processes is employed to extract features related to the aluminium alloy and CFRP parts. The proposed approach, illustrated in Figure 3, involves segmenting the continuous signal data into fixed-length signal windows and then extracting statistical features such as mean values and standard deviations. These features can be employed to identify material changes using a K-Means clustering approach. In fact, it is well established that both thrust force and torque during the drilling process exhibit significant differences in terms of mean values, with aluminium demonstrating higher resistance compared to CFRP, as well as in standard deviations [27]. CFRP, due to its anisotropic nature, produces more variable and less stable signals, resulting in greater fluctuation in these parameters.

Specifically, following a methodology in line with the one proposed in [28], once all signals are collected, a window length of $w = 500$ samples (corresponding to 50 ms at the 10 kHz sampling rate) is employed to identify clusters. As soon as the thrust force reaches 10 N, indicating the start of drilling, the first 50 ms window is assigned to the closest cluster centroid, which in this case corresponds to the aluminium sheet, the first material drilled in the stack. Subsequent windows are appended to a buffer related to aluminium until the first window corresponding to CFRP is identified, marked by a significant change in the mean and an increase in the standard deviation values. This allows for distinguishing between the sensor signal segments related to aluminium drilling and those related to CFRP drilling, thus enabling the identification of the change between different materials and allowing feature extraction from both signal segments, as shown in Figure 4.
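The windowing and clustering steps can be sketched as follows. This is a simplified illustration under several assumptions: it clusters the windows of a single hole rather than using centroids learned from all signals, it uses the thrust force only, it replaces a library K-Means with a minimal two-cluster Lloyd iteration, and the signal is synthetic. The function names are hypothetical.

```python
import numpy as np


def window_features(signal, window=500):
    """Mean and standard deviation of each non-overlapping window."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) // window
    windows = signal[: n * window].reshape(n, window)
    return np.column_stack([windows.mean(axis=1), windows.std(axis=1)])


def two_means(features, iters=50):
    """Minimal K-Means (K=2, Lloyd's algorithm) on standardised features."""
    z = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-12)
    centroids = z[[0, -1]].copy()          # seed with first and last window
    labels = np.zeros(len(z), dtype=int)
    for _ in range(iters):
        dist = np.linalg.norm(z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        updated = centroids.copy()
        for k in range(2):
            if np.any(labels == k):
                updated[k] = z[labels == k].mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels


def material_change_index(thrust, window=500, start_force=10.0):
    """Sample index of the first window whose cluster differs from the entry cluster."""
    thrust = np.asarray(thrust, dtype=float)
    above = np.flatnonzero(thrust >= start_force)
    if above.size == 0:
        return None                        # drilling never started
    start = int(above[0])
    feats = window_features(thrust[start:], window)
    if len(feats) < 2:
        return None                        # not enough windows to cluster
    labels = two_means(feats)
    changed = np.flatnonzero(labels != labels[0])
    return None if changed.size == 0 else start + int(changed[0]) * window


# Synthetic thrust signal (values are illustrative, not measured data):
# idle, then a high and stable level, then a lower and noisier level.
rng = np.random.default_rng(0)
thrust = np.concatenate([
    rng.normal(0.0, 0.5, 1000),
    rng.normal(300.0, 5.0, 3000),
    rng.normal(120.0, 20.0, 5000),
])
print(material_change_index(thrust))
```

The resolution of the detected change point is one window (50 ms at 10 kHz), which is the trade-off of using fixed-length windows.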

After segmentation, several statistical features are extracted from each segment of both thrust force (Fz) and torque (Mz) signals to characterise the signals for both aluminium (Al) and CFRP components. These features include the mean and the standard deviation for both materials. Additionally, the mean and the standard deviation of the entire signal are considered, resulting in a total of 12 features (two statistics for three segments of two signals), summarised in Table 2. The identification of the change between different materials is essential, both for automatic signal segmentation and for the potential optimisation of the process. By detecting material changes, the system could automatically adjust the process parameters, thus ensuring the optimal settings for each material type. Instead of applying the same parameters across both metal and composite materials, this feedback mechanism could be integrated into a control system to optimise the process for each detected material. This approach is particularly valuable for more complex applications involving different material stacks of various thicknesses.
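A minimal sketch of how the 12 features could be assembled for one hole is given below. It assumes the material change index is already known, and the feature names are hypothetical labels chosen for illustration rather than the names used in Table 2.

```python
import numpy as np


def extract_features(fz, mz, change_idx):
    """Mean and standard deviation of Fz and Mz over the Al, CFRP and full segments."""
    features = {}
    for name, signal in (("Fz", fz), ("Mz", mz)):
        signal = np.asarray(signal, dtype=float)
        segments = {
            "Al": signal[:change_idx],      # before the material change
            "CFRP": signal[change_idx:],    # after the material change
            "all": signal,                  # entire hole
        }
        for seg_name, seg in segments.items():
            features[f"{name}_{seg_name}_mean"] = float(seg.mean())
            features[f"{name}_{seg_name}_std"] = float(seg.std())
    return features


# Tiny synthetic example (not measured data).
example = extract_features(fz=[1, 1, 3, 3], mz=[2, 2, 2, 6], change_idx=2)
print(len(example))
```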

2.3. Machine Learning for Anomaly Detection in Unbalanced Industrial Datasets

As outlined in the introduction, early studies on drill-quality detection typically employed supervised models, which require balanced datasets containing both good and defective samples. However, in aerospace settings, defective samples are relatively rare due to strict production controls, making it difficult to compile sufficiently large labelled sets of anomalies. In this scenario, semi-supervised learning provides a viable alternative by using the more readily available data of “good” holes to detect deviations from a complex pattern of normality learned by the algorithm. To compare semi-supervised with supervised learning, two supervised ML methods are used as baselines in this work, namely logistic regression (LR) and Support Vector Machine (SVM). Logistic regression serves as a simple and interpretable baseline model that passes a linear combination of the features through a sigmoid function to estimate the class probability in binary classification. In this scenario, the model is trained on labelled data, including both good and defective hole samples. Given the features of a new hole, logistic regression predicts the probability $P(y = 1 \mid x)$ that the hole is defective, using the sigmoid function in Equation (2), where $x$ represents the feature vector, $w$ is the weight vector, and $b$ is the bias term.

$P(y = 1 \mid x) = \frac{1}{1 + e^{-(w^{T} x + b)}}$ (2)

A threshold (e.g., 0.5) is then applied to classify the hole as either good or defective based on the predicted probability. A Support Vector Machine is a supervised learning algorithm used for classification and regression. It works by finding an optimal hyperplane that maximises the margin between different classes, ensuring better generalisation to unseen data. For non-linearly separable data, SVM uses kernel functions to transform inputs into a higher-dimensional space, allowing for more flexible decision boundaries. Both LR and SVM require a sufficient number of defective samples to accurately capture the boundary between good and defective classes, an assumption often challenged by real-world aerospace data, like the dataset collected in this study. To address this challenge, this work proposes two semi-supervised learning methods: the Gaussian Mixture Model and Deep Autoencoder.
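As a small illustration of Equation (2) and the thresholding step, the sketch below evaluates the sigmoid of a linear score and applies a 0.5 cut-off. The weights, bias, and feature values are hypothetical and are not fitted to the study data; label 1 denotes a defective hole, consistent with the labelling in Section 2.1.

```python
import math


def defect_probability(x, w, b):
    """Equation (2): sigmoid of the linear score w^T x + b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)                 # numerically stable branch for negative z
    return e / (1.0 + e)


def classify(x, w, b, threshold=0.5):
    """Return 1 (defective) if P(y=1|x) reaches the threshold, else 0 (good)."""
    return 1 if defect_probability(x, w, b) >= threshold else 0


# Hypothetical parameters and features, for illustration only.
w, b = [0.8, -0.5], -0.2
x = [1.0, 0.4]
print(round(defect_probability(x, w, b), 3), classify(x, w, b))
```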

2.4. Anomaly Detection Using Gaussian Mixture Models (GMMs)

In order to identify holes that deviate from typical drilling conditions—specifically those associated with poor hole quality—an unsupervised anomaly detection method based on Gaussian Mixture Models (GMMs) was employed. The goal is to distinguish good holes from defective holes (i.e., those failing to meet the required quality criteria due to tool wear, excessive force, or torque). To achieve this, data were collected and, after applying a segmentation procedure, statistical features were extracted from the two main sensor signal segments (aluminium alloy and CFRP) using a material change detection method. These features were then used as input for the GMM algorithm.

A GMM represents the data as a weighted sum of multiple Gaussian components expressed in Equation (3), where $K$ denotes the number of Gaussian components, and $\pi_{k}$ are the mixing coefficients satisfying $\sum_{k = 1}^{K} \pi_{k} = 1$. Each component is characterised by a mean vector $\mu_{k}$ and a covariance matrix $\Sigma_{k}$, calculated based on the extracted features from the Fz and Mz signals. The GMM parameters are learned via the Expectation Maximisation (EM) algorithm, which iteratively refines the means, covariances, and mixing coefficients to maximise the log-likelihood of the observed data.

$p(x \mid \theta) = \sum_{k = 1}^{K} \pi_{k} \, \mathcal{N}(x \mid \mu_{k}, \Sigma_{k})$ (3)

In this work, the GMM is trained using a semi-supervised learning approach. Therefore, 75% of the good data are used to train the GMM, aiming to represent them with two groups. Although drilling data can exhibit various operational modes even under normal conditions (e.g., variations in force amplitude due to tool wear), we set $K = 2$ to capture two main patterns (Figure 5). The probability score, which indicates how likely a sample is to belong to the learned distribution based on its features, is then used as the anomaly score. By setting the threshold (THR) at the 98th percentile of the scores obtained during training, the procedure to identify abnormal conditions becomes straightforward and follows Equation (4).

$score < THR$ (4)
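The scoring rule of Equations (3) and (4) can be sketched with NumPy alone. The mixture parameters below are hypothetical rather than fitted; in practice they would come from an EM fit on the good holes (for instance with scikit-learn's `GaussianMixture`). One interpretation is assumed and should be checked against the original implementation: since a lower likelihood means a less normal sample, the threshold is placed so that about 98% of the training scores lie above it, i.e. at the 2nd percentile of the training log-likelihoods.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal N(x | mean, cov) for each row of x."""
    d = x.shape[1]
    chol = np.linalg.cholesky(cov)
    whitened = np.linalg.solve(chol, (x - mean).T)
    maha = np.sum(whitened ** 2, axis=0)            # squared Mahalanobis distance
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (d * np.log(2.0 * np.pi) + log_det + maha)


def gmm_log_likelihood(x, weights, means, covs):
    """Log of Equation (3): log sum_k pi_k N(x | mu_k, Sigma_k), per sample."""
    comp = np.stack([np.log(w) + gaussian_logpdf(x, m, c)
                     for w, m, c in zip(weights, means, covs)])
    top = comp.max(axis=0)
    return top + np.log(np.exp(comp - top).sum(axis=0))   # log-sum-exp


# Hypothetical two-component model in a 2-D feature space (not fitted values).
weights = np.array([0.6, 0.4])
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), 0.5 * np.eye(2)]

rng = np.random.default_rng(0)
train_good = np.vstack([
    rng.multivariate_normal(means[0], covs[0], 60),
    rng.multivariate_normal(means[1], covs[1], 40),
])
train_scores = gmm_log_likelihood(train_good, weights, means, covs)

# Assumption: THR keeps about 98% of the normal training samples above it.
thr = np.percentile(train_scores, 2)

new_holes = np.array([[0.2, -0.1], [10.0, -8.0]])     # synthetic feature vectors
is_anomaly = gmm_log_likelihood(new_holes, weights, means, covs) < thr   # Equation (4)
print(is_anomaly)
```

Working with log-likelihoods rather than raw probabilities avoids numerical underflow in higher-dimensional feature spaces and does not change the ordering of the scores.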

2.5. Anomaly Detection Using 1D Convolutional Autoencoder (1DCAE)

A 1D Convolutional Autoencoder (1DCAE) is a specialised neural network designed for sequential data, such as time series. It comprises two primary components:

Encoder: Built using 1D convolutional layers (often followed by pooling operations) to compress the original input sequence into a lower-dimensional representation known as the latent space. By applying convolutional filters along the time dimension, the encoder effectively captures local temporal dependencies—such as recurring patterns or characteristic waveforms—more efficiently than traditional fully connected networks.
Decoder: Responsible for reconstructing the original sequence from the latent representation, typically using transpose convolution layers or upsampling techniques.

The entire architecture is trained end-to-end by minimising the reconstruction error between the original input sequence and the reconstructed output.

Unlike fully connected autoencoders, 1DCAEs exploit local context in the time dimension. Consecutive time steps in real-world signals often exhibit correlations due to periodicity, trends, or discontinuities. By scanning the input with kernels moving along the time axis, the network learns shift-invariant features relevant to temporal patterns, efficiently modelling essential signal characteristics with fewer parameters. This approach preserves the inherent sequential structure of the data, enables faster convergence during training due to parameter sharing in convolutional layers, and enhances the model’s ability to capture local temporal dependencies.
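The effect of sliding one shared kernel along the time axis can be illustrated with a small NumPy sketch of the two building blocks of the encoder, a 1D convolution and max pooling. The kernel is a fixed, hypothetical filter rather than a learned one, and the signal is random. Strictly speaking, the convolution output shifts together with the input (shift equivariance); the sketch checks this property away from the signal borders and also shows how the sequence length shrinks.

```python
import numpy as np


def conv1d_valid(x, kernel):
    """1D 'valid' cross-correlation, the operation a Conv1D filter applies."""
    return np.correlate(x, kernel, mode="valid")


def max_pool1d(x, size):
    """Non-overlapping max pooling; trailing samples that do not fill a window are dropped."""
    n = len(x) // size
    return x[: n * size].reshape(n, size).max(axis=1)


rng = np.random.default_rng(0)
signal = rng.normal(size=64)
kernel = np.array([-1.0, 0.0, 1.0])     # hypothetical fixed filter, not a learned one
shift = 5

out = conv1d_valid(signal, kernel)
out_shifted = conv1d_valid(np.roll(signal, shift), kernel)

# Away from the wrap-around region, shifting the input shifts the feature map.
print(np.allclose(out_shifted[shift:], out[:-shift]))
print(len(signal), "->", len(out), "->", len(max_pool1d(out, 4)))
```

The same three kernel weights are reused at every time step, which is the parameter sharing mentioned above.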

The latent space in a 1DCAE contains a compressed set of features that the model identifies as most relevant for reconstructing the original time series. When trained on “normal” data, the network minimises reconstruction error, thereby retaining essential information about typical signal patterns. This compact embedding functions as an automatic feature extractor—instead of requiring manual feature engineering, the network autonomously discovers patterns such as repeating cycles, underlying trends, and characteristic waveforms.

Given the limitations of GMM on raw data, a 1DCAE is introduced to learn a more compact and informative representation of the time series, reducing dimensionality while retaining essential temporal patterns. This latent representation is expected to better differentiate normal behaviour from anomalies. Finally, GMM is re-applied to the transformed latent space of the 1DCAE, leveraging the learned features to improve clustering accuracy. This hierarchical approach benefits from both the feature extraction capability of the autoencoder and the probabilistic nature of GMM, yielding a more robust anomaly detection framework.

Once trained, the 1DCAE detects anomalies through two primary approaches. Firstly, time-series segments that deviate significantly from learned patterns will yield higher reconstruction errors, indicating potential anomalies. In practice, a threshold for this error can be defined to classify segments as either “normal” or “non-compliant”. Secondly, segments mapping to points in the latent space that are distant from clusters of known “normal” embeddings may be flagged as anomalies. This approach leverages the compact representation to identify outliers.
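The first approach, thresholding the reconstruction error, can be sketched as below. No autoencoder is trained here: a deliberately simple stand-in "reconstructs" every segment as the mean normal training segment, and a trained 1DCAE would take its place. The data are synthetic, and the use of the 98th percentile of the training errors as the threshold is an assumption made for illustration.

```python
import numpy as np


def reconstruction_error(x, x_hat):
    """Mean squared error per segment (rows: segments, columns: time steps)."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    return np.mean((x - x_hat) ** 2, axis=1)


rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
pattern = np.sin(2 * np.pi * 5 * t)
normal_train = pattern + rng.normal(0.0, 0.1, size=(50, t.size))

# Stand-in for a trained 1DCAE: every segment is "reconstructed" as the mean
# normal training segment. A real autoencoder would replace this function.
template = normal_train.mean(axis=0)


def reconstruct(x):
    return np.broadcast_to(template, np.shape(x))


# Assumed rule: threshold at the 98th percentile of the training errors.
train_errors = reconstruction_error(normal_train, reconstruct(normal_train))
threshold = np.percentile(train_errors, 98)

offset_segment = (pattern + 1.0)[None, :]     # synthetic non-compliant segment
is_flagged = reconstruction_error(offset_segment, reconstruct(offset_segment)) > threshold
print(bool(is_flagged[0]), int((train_errors > threshold).sum()))
```

With a percentile-based threshold, a small fraction of the normal training segments is flagged by construction, which sets the baseline false-alarm rate.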

A comprehensive anomaly detection system integrates a 1DCAE with probabilistic clustering through the following steps (Figure 6):

Data Preparation: The dataset is loaded and pre-processed, including segmentation, material change labelling, and normalisation to ensure that the time-series values fall within an appropriate range for neural network training.
Model Architecture: The 1DCAE is implemented with multiple convolutional layers, batch normalisation, and pooling operations for encoding, followed by corresponding transpose convolutions and upsampling for decoding.
Feature Extraction: Once trained, the model extracts latent representations that serve as learned features, capturing significant temporal patterns.
Probabilistic Clustering: These latent representations are input into a Gaussian Mixture Model (GMM), which is trained on normal examples. The GMM estimates the likelihood of each new sample belonging to the normal distribution.
Anomaly Scoring: Data points with abnormally low probability scores are flagged as anomalies.
Evaluation: The system’s performance is assessed by comparing predictions to ground-truth labels, yielding metrics such as precision, recall, and F1-scores.
Visualisation: Throughout the process, visualisations are employed to validate the autoencoder’s reconstruction quality and the GMM’s effectiveness in distinguishing normal from anomalous cases.

This integrated approach leverages both the pattern-learning capabilities of 1DCAEs and the statistical properties of the latent space, providing a robust framework for unsupervised anomaly detection in complex time-series data.

2.6. Evaluation Metrics for Anomaly Detection

To systematically evaluate the effectiveness of the proposed GMM-based and 1DCAE-based anomaly detection, we employ several key performance indicators (KPIs):

Confusion matrix (CM). We define a 2 × 2 confusion matrix with the following categories: True Positives (TP), False Positives (FP), False Negatives (FN), and True Negatives (TN). In this context, a “positive” indicates an anomalous (bad) hole, whereas a “negative” indicates a normal (good) hole.
Precision (P):

$Precision = \frac{TP}{TP + FP},$

indicating what fraction of holes predicted as anomalous are actually anomalous.
Recall (R):

$Recall = \frac{TP}{TP + FN},$

which assesses how many of the truly bad holes are correctly flagged.
F1 Score:

$F1 = 2 \, \frac{P \times R}{P + R}.$

The F1 score is the harmonic mean of precision and recall.

These metrics collectively provide a robust quantitative assessment of how well the proposed approaches identify holes that fail to meet quality standards. Since the available dataset is imbalanced (18 bad holes versus 82 good ones, see Section 2.1), metrics such as recall and the confusion matrix are especially valuable. Indeed, in aerospace settings, the cost of failing to detect a truly faulty hole (false negative) can be prohibitively high, since defective holes result in waste (e.g., the parts cannot be assembled), which justifies an emphasis on recall. Moreover, by analysing the confusion matrix, it is possible to quantify the number of missed anomalies and false alarms, both of which can impact production time. When an anomaly is detected by the algorithm, an operator must inspect the quality of the last hole before deciding whether to continue or stop and replace the tool. Therefore, interpreting the metrics is not a trivial task and requires the right context. In this study, we carefully analyse all relevant metrics to determine the most suitable methodology based on the collected dataset. This analysis aims to guide experts in multi-stack drilling in making similar considerations when applying machine learning methods to solve their problems.

3. Results and Discussion

The following section presents a comprehensive evaluation of the various methodologies applied to detect anomalous hole quality in multi-material stacks used in aerospace drilling. A progressive comparison is conducted between supervised, semi-supervised, and unsupervised learning approaches to highlight their relative strengths and weaknesses. The two supervised classifiers, logistic regression (LR) and Support Vector Machines (SVMs), employed as baseline models to assess their effectiveness in detecting defects, exhibited significant limitations, particularly in handling imbalanced data, leading to overfitting and a failure to detect defective holes in the test dataset.

To overcome these limitations, a semi-supervised approach using a Gaussian Mixture Model (GMM) was introduced. By leveraging only compliant samples for training, this method significantly improved precision and recall, outperforming the supervised models in terms of balanced anomaly detection. However, despite the improved performance, the GMM still exhibited sensitivity to data distribution assumptions and struggled with complex feature interactions.

To further enhance detection performance, a 1D Convolutional Autoencoder (1DCAE) was implemented to extract high-level temporal features from the drilling signals. These learned representations were subsequently used as input for a GMM-based clustering model. This combined 1DCAE-GMM framework demonstrated superior anomaly detection capabilities, achieving a more robust classification of anomalous holes. The use of an autoencoder significantly mitigated overfitting, preserved crucial temporal dependencies in the signals, and provided a more discriminative feature space for anomaly classification. The following subsections provide a detailed analysis of the results obtained by each approach.

3.1. Results of Supervised Learning Models

In this study, two supervised machine learning methods, namely logistic regression and an SVM classifier with a Radial Basis Function kernel (regularisation parameter $C = 0.6$ and kernel coefficient $\gamma = 0.1$), were applied to evaluate their performance in detecting anomalous hole quality in one-shot drilling of multi-material stacks typical of the aerospace sector, with a limited dataset associated with defective holes. Hole quality was measured in terms of diameter and delamination factor using a Keyence VHX-7000 optical microscope, resulting in a dataset consisting of 82 compliant and 18 anomalous holes, as defined in Section 2.1. From the Fz and Mz signals of the Kistler sensor, the 12 features listed in Table 2 were extracted at the end of each hole using an unsupervised change point detection method, which enabled the identification of characteristics in both the aluminium alloy and CFRP parts of the signal. Seventy-five percent of the data were used for training, with the remaining 25% reserved for testing. The metrics and confusion matrices are reported for the entire dataset, so that the supervised and semi-supervised methods can be compared on the same samples. A direct comparison on the respective test sets would not be possible, because the training and test sets are composed differently for the two families of methods.

The test dataset consists of 14 samples, of which only 3 were associated with poor-quality conditions. All 14 holes were classified as good, suggesting a strong overfitting issue. This result is likely due to the high number of good holes in the training dataset compared to anomalous ones, which introduces bias in both logistic regression and SVM models. To provide a comprehensive comparison of the different techniques, the performance metrics for the entire dataset, which includes 100 holes, are reported in Table 3. Logistic regression achieves the best F1 score of 0.78, slightly outperforming SVM, which scores 0.75. This difference is mainly due to the higher number of anomalies detected by logistic regression (12 vs. 11 for the SVM). Both models are affected by overfitting on the good data, as they fail to detect several anomalies even within the training dataset.

Despite the high metrics on the entire dataset, as discussed earlier, the test dataset contained only 3 poor-quality holes, while 15 were included in the training dataset. Therefore, to assess the quality of the supervised ML methods, it is crucial to focus on the test dataset. Neither of the two methods was able to detect the poor-quality conditions there. In practice, an SVM cannot be employed for this task, as it is biased toward classifying every hole as good. LR, on the other hand, may be used, but it tends to miss many anomalies, leading to an increased number of accepted holes that do not meet the quality criteria. To better understand the metrics, the confusion matrices are shown in Figure 7, revealing that LR missed six poor-quality holes, while the SVM missed seven, including practically all those contained in the test dataset. Additionally, logistic regression generated only one false alarm across the entire dataset.
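As a quick check, the metrics in Table 3 can be recomputed from the confusion matrix counts given above. The sketch below assumes the counts implied by the text: 18 anomalous holes in total, six missed and one false alarm for LR, seven missed for the SVM, and no SVM false alarms (inferred from its precision of 1.0). The helper name `prf` is hypothetical.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# (TP, FP, FN), with "positive" meaning a poor-quality hole.
counts = {
    "LR": (12, 1, 6),   # 18 anomalies, 6 missed, 1 false alarm
    "SVM": (11, 0, 7),  # 18 anomalies, 7 missed, precision 1.0 implies 0 false alarms
}
results = {name: prf(*c) for name, c in counts.items()}
for name, (p, r, f1) in results.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Under these assumed counts the recomputed values agree with Table 3 to within about 0.01, the small gaps being consistent with rounding of the reported figures.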

3.2. Results of Gaussian Mixture Model

As discussed in the previous section, in binary classification problems with strong class imbalance, the semi-supervised method offers a way to leverage one-class labels. In this case, 75% of the good data are used to train a two-component GMM and, following the methodology presented in Section 2.5, the probability score is used to classify all the data, with results summarised in Table 4. The resulting F1 score of 0.81 is higher than that achieved by logistic regression. The recall also rises to 0.83, highlighting the superior ability of semi-supervised approaches to detect anomalies and offer a more balanced performance. The precision is reduced to 0.79 due to an increase in the total number of false alarms. The confusion matrix (Figure 8) shows four false alarms, indicating a greater tendency to raise false alarms, together with a higher detection ability, with 15 of the 18 true anomalies detected. This indicates a better performance of the proposed methodology compared to state-of-the-art supervised ML techniques.

3.3. Results of 1D Convolutional Autoencoder

The experimental results of the proposed 1DCAE approach to anomaly detection were promising across multiple evaluation criteria. The outcomes are illustrated in terms of both reconstruction fidelity and classification performance. Figure 9 presents a direct comparison between original and reconstructed signals for two critical variables: normalised thrust force (Fz) and normalised torque (Mz). The original signal (orange line) and its corresponding 1DCAE reconstruction (dashed blue line) demonstrate remarkable alignment throughout the temporal sequence, confirming the model’s ability to capture essential time-series patterns.

Notably, the model achieves high reconstruction fidelity across most segments while exhibiting subtle deviations in regions containing anomalous patterns. This selective reconstruction behaviour is by design: the autoencoder has learned the characteristics of normal operation and therefore struggles to precisely reconstruct anomalous segments. These reconstruction discrepancies serve as valuable indicators for anomaly identification, effectively turning reconstruction error into an anomaly signal.
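To make the idea of reconstruction error as an anomaly signal concrete, the sketch below computes a per-hole mean squared error between original and reconstructed signals. It is an illustration only: the array shapes, the synthetic data and the function name are assumptions, and the classifier actually reported in this work operates on the latent space rather than directly on this error.

```python
import numpy as np

def reconstruction_error(original, reconstructed):
    """Per-sample MSE for arrays shaped (n_samples, n_channels, n_timesteps)."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return ((original - reconstructed) ** 2).mean(axis=(1, 2))

rng = np.random.default_rng(0)
signals = rng.random((3, 2, 128))   # 3 holes, 2 channels (Fz, Mz), 128 points each
recon = signals.copy()              # pretend the first two holes are reconstructed perfectly
recon[2, :, 40:60] += 0.3           # third hole: poor reconstruction in one region

errors = reconstruction_error(signals, recon)
print(errors)
```

A hole whose error stands out from those of normal holes would be a candidate anomaly; how the cut-off is chosen is a separate design decision.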

The 1DCAE model was trained using an optimised configuration for anomaly detection in sequential signals. The network architecture consists of an encoder with three convolutional blocks, each including a 1D convolutional layer (with filter dimensions ranging from 32 to 8), batch normalisation, LeakyReLU activation, and max pooling with a factor of 2. The decoder structure is symmetrical, employing transpose convolution layers and upsampling operations to reconstruct the original signal. The key training hyperparameters are shown in Table 5.
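The architecture described above can be sketched in PyTorch as follows. This is a hedged reconstruction from the text and Table 5 (three encoder blocks with 32, 16 and 8 filters, kernel size 5, batch normalisation, LeakyReLU, max pooling by 2, a mirrored decoder, MSE loss, Adam with a learning rate of 1e-4), not the authors' implementation. The framework choice, the two input channels, the input length of 256, the padding, the nearest-neighbour upsampling and the final sigmoid are all assumptions.

```python
import torch
from torch import nn

def enc_block(c_in, c_out, k=5):
    # Conv -> BatchNorm -> LeakyReLU -> MaxPool(2); padding keeps the length before pooling.
    return nn.Sequential(
        nn.Conv1d(c_in, c_out, kernel_size=k, padding=k // 2),
        nn.BatchNorm1d(c_out),
        nn.LeakyReLU(),
        nn.MaxPool1d(2),
    )

def dec_block(c_in, c_out, k=5):
    # Upsample(2) -> transposed conv -> BatchNorm -> LeakyReLU.
    return nn.Sequential(
        nn.Upsample(scale_factor=2),
        nn.ConvTranspose1d(c_in, c_out, kernel_size=k, padding=k // 2),
        nn.BatchNorm1d(c_out),
        nn.LeakyReLU(),
    )

class CAE1D(nn.Module):
    def __init__(self, channels=2):  # assumption: Fz and Mz as two input channels
        super().__init__()
        self.encoder = nn.Sequential(
            enc_block(channels, 32), enc_block(32, 16), enc_block(16, 8)
        )
        self.decoder = nn.Sequential(
            dec_block(8, 16),
            dec_block(16, 32),
            nn.Upsample(scale_factor=2),
            nn.ConvTranspose1d(32, channels, kernel_size=5, padding=2),
            nn.Sigmoid(),  # assumption: outputs in [0, 1] to match min-max scaled inputs
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = CAE1D()
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4)

x = torch.rand(4, 2, 256)                 # synthetic batch; length must be divisible by 8
recon, latent = model(x)
loss = nn.functional.mse_loss(recon, x)   # one illustrative training step
optimiser.zero_grad()
loss.backward()
optimiser.step()
print(recon.shape, latent.shape)
```

The encoder output `latent` is the representation on which a density model can later be fitted, while `recon` is used only for the reconstruction loss during training.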

For anomaly detection, a GMM was fitted to the latent space extracted from the trained autoencoder. The GMM was configured with four Gaussian components and trained exclusively on data labelled as “normal” (75% of them for training). The threshold for anomaly classification was set at the fifth percentile of the probability scores obtained on the training data, balancing sensitivity in detecting anomalies against the number of false positives. Using the fifth percentile as the threshold accommodates a small degree of anomaly in the training data, which helps reduce overfitting. Allowing for some variability enables the model to better generalise to real-world conditions, where minor deviations occur even in acceptable scenarios. This approach, previously applied in other industrial problems, recognises that not all deviations indicate issues, thereby preventing overfitting to noise or outliers.

Figure 10 presents the confusion matrix for the proposed integrated 1DCAE-GMM anomaly detection system. The classifier, operating on latent representations extracted from the autoencoder, achieved the performance metrics reported in Table 6. The results show improved performance over both the state-of-the-art supervised methods (LR and SVM) and the GMM approach with manual feature extraction. Specifically, while precision remains essentially unchanged (0.78 vs. 0.79), suggesting a slight tendency towards false alarms, recall has increased from 0.83 to 1.0 with respect to the manual feature extraction case, indicating that no anomalies were missed. Among all the proposed approaches, only the 1DCAE-GMM framework was able to detect all anomalies, while maintaining a comparable level of false alarms, highlighting the effectiveness of automatic feature extraction techniques enabled by deep learning.
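The latent-space GMM step described above can be sketched with scikit-learn as follows. The four components, the 75% training share of normal data and the fifth-percentile threshold come from the text; everything else is an assumption for illustration, including the synthetic 4-dimensional "latent" vectors, the diagonal covariance, and the use of `score_samples` (a per-sample log-likelihood) as the probability score.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-ins for flattened latent vectors (hypothetical data).
normal = rng.normal(0.0, 1.0, size=(72, 4))
anomalous = rng.normal(6.0, 1.0, size=(18, 4))

n_train = int(0.75 * len(normal))            # 75% of the normal data for training
train, held_out = normal[:n_train], normal[n_train:]

gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(train)

train_scores = gmm.score_samples(train)      # log-likelihood of each training sample
threshold = np.percentile(train_scores, 5)   # fifth percentile of the training scores

def is_anomaly(x):
    """Flag samples whose score falls below the training-derived threshold."""
    return gmm.score_samples(x) < threshold

flagged_train = is_anomaly(train).mean()
flagged_held_out = is_anomaly(held_out).mean()
flagged_anom = is_anomaly(anomalous).mean()
print(flagged_train, flagged_held_out, flagged_anom)
```

By construction, roughly 5% of the normal training samples fall below the threshold, which is the small tolerance the percentile choice builds in; on real data the share of flagged held-out normal holes determines the false alarm rate.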

3.4. Comparison of Methods

A comparative analysis of the different approaches is presented in Table 7, summarising their key performance metrics and limitations.

The results show that traditional supervised learning models struggle with class imbalance, resulting in poor generalisation under real-world conditions, with the SVM failing to detect any anomalies in unseen data. The semi-supervised GMM approaches, both with manual feature extraction and those enhanced by 1DCAE automatic feature extraction, show slight sensitivity to false alarms. However, the automatic feature extraction by the autoencoder helps avoid missing any poor-quality holes without increasing the number of false alarms. In fact, the convolutional autoencoder effectively learns meaningful features from raw sensor data, improving anomaly separability. The integration of a GMM in the latent space further refines classification by utilising probabilistic modelling on the learned feature representations.

3.5. Limitations and Future Works

The use of small and imbalanced datasets presents significant challenges in machine learning, particularly for supervised learning approaches. Class imbalance can lead to overfitting, where models struggle to generalise effectively, resulting in suboptimal performance, as discussed in the Results and Discussion Section. These challenges are particularly pronounced in industrial applications, where obtaining large, balanced datasets is often hindered by constraints such as privacy concerns, cost, and time limitations. While unsupervised learning has the potential to mitigate some of these issues, the limited amount of data can exacerbate the problem by restricting the model’s ability to capture the full distribution of the data. This limitation underscores the need for alternative approaches to improve model robustness and generalisation. To address it, future work could explore generative AI and more advanced unsupervised learning techniques. Generative models, such as Generative Adversarial Networks (GANs) [29,30] or Variational Autoencoders (VAEs) [31,32], offer the potential to generate synthetic data that can augment the existing dataset, mitigating the effects of class imbalance and limited data. By learning the underlying distribution of the data, these models can produce realistic samples that help the downstream model generalise better.

4. Conclusions

This study presents a novel approach for process monitoring in multi-material stack drilling, aiming to identify drills in anomalous conditions. Specifically, this work proposes the use of unsupervised machine learning to address the limitations of traditional supervised methods when dealing with the highly imbalanced datasets typical of the industrial field. In particular, a K-Means algorithm is introduced to detect material change, enabling automatic signal segmentation. This segmentation facilitates feature extraction from the aluminium 2024 and CFRP sections of the thrust force (Fz) and torque (Mz) signals. Following a microscopic analysis of both drill bits and holes, labels were assigned to the drills based on observations of hole diameter and the delamination factor. Once labelled, 75% of the data were used for training and 25% for testing. Logistic regression and Support Vector Machines (SVMs) achieved F1 scores of 0.78 and 0.75, respectively, with both models being ineffective in detecting drills that were out of compliance, due to the strong class imbalance of the dataset. Then, a Gaussian Mixture Model (GMM) was proposed to estimate the distribution of 75% of the good drill data, with the remaining 25%, along with the anomalous drills, used for testing, and with the probability score serving as the metric to assess whether new data could be sampled from the modelled distribution. A higher F1 score of 0.81 was achieved due to the increased number of anomalous holes detected, but the results also reveal that not all anomalous holes were identified and that semi-supervised approaches are more sensitive to overfitting. Finally, a framework in which features are automatically extracted from raw signals using a 1D Convolutional Autoencoder was proposed. This demonstrates the potential of deep learning to automatically extract meaningful features from data, achieving an F1 score of 0.87, with no missed anomalous holes, while maintaining a level of false alarms comparable to that of the previous GMM method. This study advances intelligent manufacturing by demonstrating how unsupervised and semi-supervised learning techniques can enhance process monitoring in aerospace drilling. Specifically, it emphasises the capability to detect material change and identify poor-quality drilling, facilitating adaptive process control, optimising resource use, reducing rework costs, and ensuring compliance with stringent quality standards in aerospace applications.

Author Contributions

Conceptualisation, G.M., M.G.M., A.C., G.G. and L.N.; methodology, G.M., M.G.M., A.C., G.G. and L.N.; software, G.M. and M.G.M.; validation, G.M., M.G.M., A.C., G.G. and L.N.; formal analysis, G.M., M.G.M., A.C., G.G. and L.N.; investigation, G.M., M.G.M., A.C., G.G. and L.N.; resources, L.N.; data curation, G.M., M.G.M., A.C., G.G. and L.N.; writing—original draft preparation, G.M., M.G.M., A.C., G.G. and L.N.; writing—review and editing, G.M., M.G.M., A.C., G.G. and L.N.; visualisation, G.M., M.G.M., A.C., G.G. and L.N.; supervision, L.N. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the INVITALIA Project NEMESI for their support to this research work.

Data Availability Statement

Data available on request due to restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Domínguez-Monferrer, C.; Ramajo-Ballester, A.; Armingol, J.; Cantero, J. Spot-checking machine learning algorithms for tool wear monitoring in automatic drilling operations in CFRP/Ti6Al4V/Al stacks in the aircraft industry. J. Manuf. Syst. 2024, 77, 96–111.
2. Aamir, M.; Giasin, K.; Tolouei-Rad, M.; Vafadar, A. A review: Drilling performance and hole quality of aluminium alloys for aerospace applications. J. Mater. Res. Technol. 2020, 9, 12484–12500.
3. Franz, G.; Vantomme, P.; Hassan, M.H. A Review on Drilling of Multilayer Fiber-Reinforced Polymer Composites and Aluminum Stacks: Optimization of Strategies for Improving the Drilling Performance of Aerospace Assemblies. Fibers 2022, 10, 78.
4. Liu, S.; Yang, T.; Liu, C.; Jin, Y.; Sun, D.; Shen, Y. Modelling and experimental validation on drilling delamination of aramid fiber reinforced plastic composites. Compos. Struct. 2020, 236, 111907.
5. Kusiak, A. Smart manufacturing. Int. J. Prod. Res. 2018, 56, 508–517.
6. Caggiano, A.; Mattera, G.; Nele, L. Smart tool wear monitoring of CFRP/CFRP stack drilling using autoencoders and memory-based neural networks. Appl. Sci. 2023, 13, 3307.
7. Dominguez-Monferrer, C.; Guerra-Sancho, A.; Caggiano, A.; Nele, L.; Miguélez, M.H.; Cantero, J.L. Multiresolution analysis for tool failure detection in CFRP/Ti6Al4V hybrid stacks drilling in aircraft assembly lines. Mech. Syst. Signal Process. 2024, 206, 110925.
8. Jallageas, J.; Ayfre, M.; Cherif, M.; K’nevez, J.Y.; Cahuc, O. Self-Adjusting Cutting Parameter Technique for Drilling Multi-Stacked Material. SAE Int. J. Mater. Manuf. 2016, 9, 24–30.
9. Kusiak, A. Smart manufacturing must embrace big data. Nature 2017, 544, 23–25.
10. Caggiano, A. Tool wear prediction in Ti-6Al-4V machining through multiple sensor monitoring and PCA features pattern recognition. Sensors 2018, 18, 823.
11. Domínguez-Monferrer, C.; Fernández-Pérez, J.; Santos, R.D.; Miguélez, M.; Cantero, J. Machine learning approach in non-intrusive monitoring of tool wear evolution in massive CFRP automatic drilling processes in the aircraft industry. J. Manuf. Syst. 2022, 65, 622–639.
12. Hintze, W.; Romanenko, D.; Molkentin, L.; Koettner, L.; Mehnen, J. Holistic process monitoring with machine learning classification methods using internal machine sensors for semi-automatic drilling. Procedia CIRP 2022, 107, 972–977.
13. Haoua, A.A.; Rey, P.A.; Cherif, M.; Abisset-Chavanne, E.; Yousfi, W. Material recognition method to enable adaptive drilling of multi-material aerospace stacks. Int. J. Adv. Manuf. Technol. 2024, 131, 779–796.
14. Cruz, C.E.D.; de Aguiar, P.R.; Machado, Á.R.; Bianchi, E.C.; Contrucci, J.G.; Neto, F.C. Monitoring in precision metal drilling process using multi-sensors and neural network. Int. J. Adv. Manuf. Technol. 2013, 66, 151–158.
15. Lee, S.K.H.; Mongan, P.G.; Farhadi, A.; Hinchy, E.P.; O’Dowd, N.P.; McCarthy, C.T. In-situ evaluation of hole quality and cutting tool condition in robotic drilling of composite materials using machine learning. J. Intell. Manuf. 2025.
16. Mattera, G.; Vozza, M.; Polden, J.; Nele, L.; Pan, Z. Frequency informed convolutional autoencoder for in situ anomaly detection in wire arc additive manufacturing. J. Intell. Manuf. 2024, 1–16.
17. Melentiev, R.; Priarone, P.C.; Robiglio, M.; Settineri, L. Effects of tool geometry and process parameters on delamination in CFRP drilling: An overview. Procedia CIRP 2016, 45, 31–34.
18. Barik, T.; Pal, K.; Sahoo, P.; Patra, K. Sensor-based strategies for accurate prediction of drilled hole surface integrity of CFRP/Al7075 hybrid stack. Mech. Adv. Mater. Struct. 2024, 31, 1097–1124.
19. Assurin, S.R.; Mativenga, P.; Cooke, K.; Sun, H.; Field, S.; Walker, M.; Chodynicki, J.; Sharples, C.; Jensen, B.; Mortensgaard, M.F. Establishing cutting conditions for dry drilling of aluminium alloy stack materials. Proc. Inst. Mech. Eng. Part B J. Eng. Manuf. 2024, 238, 214–222.
20. Zhu, Z.; Guo, K.; Sun, J.; Li, J.; Liu, Y.; Zheng, Y.; Chen, L. Evaluation of novel tool geometries in dry drilling aluminium 2024-T351/titanium Ti6Al4V stack. J. Mater. Process. Technol. 2018, 259, 270–281.
21. Panico, M.; Begemann, E.; Gebhardt, A.; Hartmann, F.; Herrmann, T.; Langella, A.; Boccarusso, L. Process parameter auto-adaptation strategy for one-up drilling of CFRP/aluminium hybrid stack. Int. J. Adv. Manuf. Technol. 2024, 135, 4169–4187.
22. Zitoune, R.; Krishnaraj, V.; Almabouacif, B.S.; Collombet, F.; Sima, M.; Jolin, A. Influence of machining parameters and new nano-coated tool on drilling performance of CFRP/Aluminium sandwich. Compos. Part B Eng. 2012, 43, 1480–1488.
23. Dolinšek, S.; Šuštaršič, B.; Kopač, J. Wear mechanisms of cutting tools in high-speed cutting processes. Wear 2001, 250, 349–356.
24. Yu, J.; Shan, Y.; Zhao, Y.; Mo, R. Study on the Influence of Delamination Damage on the Processing Quality of Composite Laminates. Materials 2022, 15, 8572.
25. Huang, T.; Bobyr, M. A review of delamination damage of composite materials. J. Compos. Sci. 2023, 7, 468.
26. Li, Y.; Wang, B.; Zhou, L. Study on the effect of delamination defects on the mechanical properties of CFRP composites. Eng. Fail. Anal. 2023, 153, 107576.
27. Caggiano, A.; Napolitano, F.; Nele, L.; Teti, R. Study on thrust force and torque sensor signals in drilling of Al/CFRP stacks for aeronautical applications. Procedia CIRP 2019, 79, 337–342.
28. Szwajka, K.; Trzepiecinski, T. On the machinability of medium density fiberboard by drilling. BioResources 2018, 13, 8263–8278.
29. Tanaka, F.H.K.D.S.; Aranha, C. Data augmentation using GANs. arXiv 2019, arXiv:1904.09135.
30. Strelcenia, E.; Prakoonwit, S. A survey on gan techniques for data augmentation to address the imbalanced data issues in credit card fraud detection. Mach. Learn. Knowl. Extr. 2023, 5, 304–329.
31. An, J.; Cho, S. Variational autoencoder based anomaly detection using reconstruction probability. Spec. Lect. IE 2015, 2, 1–18.
32. Yao, R.; Liu, C.; Zhang, L.; Peng, P. Unsupervised anomaly detection using variational auto-encoder based feature extraction. In Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management (ICPHM), San Francisco, CA, USA, 17–20 June 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–7.

Figure 1. Experimental setup for the Al/CFRP stack drilling tests: CNC drill press and sensor monitoring system.

Figure 2. (a) Procedure for measuring tool wear. (b) Microscope image showing flank tool wear. (c) Procedure for measuring the delamination factor. (d) Example of effective measurements obtained from the microscope. (e) Microscope image showing CFRP exit delamination.

Figure 3. The employed methodology for automatic segmentation of the signals.

Figure 4. An example of (a) thrust force and (b) torque signals with material change detected and highlighted in red.

Figure 5. The anomaly detection methodology.

Figure 6. The 1D-CAE compresses time-series signals and learns normal drilling behaviour. GMM is then applied on the latent features to detect anomalies in real time.

Figure 7. Confusion matrix of supervised ML methods applied to the collected dataset, namely logistic regression and SVM classifier.

Figure 8. Confusion matrix of the semi-supervised GMM method applied to the collected dataset.

Figure 9. Raw signal of thrust force (Fz) (a) and torque (Mz) (b) in orange vs. the reconstructed signals by 1DCAE in blue.

Figure 10. Confusion matrix of 1D-CAE GMM method applied to the collected dataset.

Table 1. Drilling parameters.

Tool	Feed Rate [mm/rev]	Spindle Speed [rpm]	Number of Holes
Tool 1	0.13	4500	50
Tool 2	0.13	4500	50

Table 2. Summary of the extracted features.

Signal	Aluminium Part	CFRP Part	Entire Signal
Thrust force (Fz)	Mean, std	Mean, std	Mean, std
Torque (Mz)	Mean, std	Mean, std	Mean, std

Table 3. Summary of the results of supervised ML in poor-quality detection.

Model	Precision	Recall	F1 Score
Logistic regression	0.92	0.66	0.78
SVM classifier	1.0	0.61	0.75

Table 4. Summary of the results of the semi-supervised GMM in poor-quality detection.

Model	Precision	Recall	F1 Score
GMM	0.79	0.83	0.81

Table 5. Training hyperparameters and data preprocessing details.

Hyperparameter	Value
Convolutional kernel size	5
Optimiser	Adam with a learning rate of $1 \times 10^{-4}$
Loss function	Mean Squared Error (MSE)
Number of epochs	300
Batch size	32
Data normalisation	Signals were normalised to the range [0, 1] using min-max scaling
Data split	80% for training and 20% for validation

Table 6. Summary of the results of 1D-CAE and GMM in poor-quality detection.

Model	Precision	Recall	F1 Score
1D-CAE GMM	0.78	1.0	0.87

Table 7. Comparison of different methods for anomaly detection.

Method	Precision	Recall	Key Limitations
Logistic Regression	0.92	0.66	High false negative rate in unseen anomalies; susceptible to class imbalance.
SVM (RBF Kernel)	1.0	0.61	Strong bias towards majority class; failure in detecting poor-quality holes in unseen cases.
GMM (Semi-supervised)	0.79	0.83	Sensitivity to false alarms; few missed anomalies.
1DCAE + GMM	0.78	1.0	Increased computational complexity; requires careful tuning; sensitivity to false alarms.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

