1. Introduction
Fatigue damage detection is one of the most important challenges that structural engineers encounter [1]. Although fatigue has been studied for a long time, a predictive framework that accurately and comprehensively estimates the fatigue life of a component remains elusive, mainly because of the vast parameter space that drives such failures in an operational setup [2]. Several factors, such as the load conditions, load history, frequency, part geometry, presence of stress concentrations, material microstructure, and defects, contribute to fatigue failure. While these factors may be accounted for to a certain extent through laboratory-based experiments used to develop predictive models, they are accompanied by far greater uncertainty in operation. Hence, comprehensive experimentation to understand their coupled effects can become prohibitively expensive to perform [3]. Analytical and computational modeling is often performed to augment the understanding of fatigue failures. However, owing to their computational cost, these numerical models are often restricted to reduced-scale geometries. Consequently, a quantifiable, generalized framework that can predict the fatigue behavior of new materials and manufacturing processes remains a major research focus to date [4]. A broad categorization of fatigue-related research is shown in Figure 1a. The focus is either on a ‘prediction’ or a ‘detection’ framework. The prediction framework is mainly targeted toward design criteria that enable a fatigue-resistant component, whereas the detection framework is useful in a working environment. Studies in the prediction framework can be further divided into three main domains, viz., analytical, computational, and empirical. A recent review paper by Liao et al. [5] summarizes four classes of analytical techniques pursued for this problem, viz., nominal stress approaches, local stress-strain approaches, critical distance theories, and weighting control-parameter-based approaches. High-fidelity computational frameworks are mainly being developed using crystal plasticity simulations [6,7]. Empirical frameworks, the oldest and most error-prone of the three, have been documented through several design criteria [8]. The detection framework, on the other hand, is targeted toward developing better sensing mechanisms [9] or improving the data analysis therein [10]. The detection framework provides real-time information about a component’s health and is therefore critical in ensuring the safe operation of fatigue-critical components [11].
The elemental philosophy of the sensor-based approach is underpinned by the sensors’ ability to continuously stream data in which the health information of a component is encoded. By analyzing this stream of data via data-driven approaches, reliable strategies can be developed to detect fatigue damage in real time [12]. Over the past few decades, the rapid growth of deep learning methodologies has ushered in a new era of damage detection algorithms [10,13,14,15,16]. As summarized by Zhao et al. [15], state-of-the-art deep learning methods, such as autoencoders, restricted Boltzmann machines, deep Boltzmann machines, convolutional neural nets (CNNs), and recurrent neural nets (RNNs), have shown promising applications in this field. CNNs have shown a particular proclivity for problems involving imaging datasets [14,16]. However, the majority of complex structures, such as aerospace or automotive components, rely solely on time-series-based sensors, which can be processed with varied methods. For instance, acoustic emission data were analyzed with an autoencoder to localize damage [17]. With Bayesian graph neural nets, Mylonas et al. [18] demonstrated the application of strain gauge sensors. Amiri et al. [19] studied damage detection in spot welds using ultrasonics and artificial neural nets (ANNs). Similarly, Bansode and Billore [20] used ANNs to study fatigue failures in rotary shafts.
Dharmadhikari et al. demonstrated the effectiveness of deep neural networks (DNNs) for fatigue crack detection in notched specimens using ultrasonics [21]. Along similar lines, Amiri et al. [19] and Xu et al. [22] used ultrasonics to study damage detection. In addition to the direct application of such sensors, computational assistance in guided-wave-based damage detection has also been studied with neural nets [23,24]. However, much of the existing research has focused on damage detection in specimens with a fixed geometry, as shown in Figure 1b. It is well known that notches create localized stress concentrations that may significantly alter failure mechanisms. Since DNNs are trained with huge amounts of data, a logical follow-up question is: would a similar volume of training data be needed for every new specimen type? This question becomes even more pressing if the specimens are built with new manufacturing processes (e.g., additive manufacturing [25]) or expensive materials (e.g., nickel-based rhenium-containing superalloys [26]). If a DNN trained on other material systems or geometries can aid in addressing these challenges, it would result in substantial cost savings. Although transfer learning approaches exist for such problems [27], they often result in individual models for each specimen geometry and, therefore, may lead to an intractable number of trained models. While these questions are among the first of their kind in the applied mechanics field, they are not new in other domains, such as natural language processing [28].
A common theme among language research problems is developing a unified natural language processing framework for sparsely available language data (such as Urdu or Tibetan) from similar yet vastly available counterparts (such as Hindi or Mandarin) using a mixed learning strategy, represented schematically in Figure 1c. Based on the success of this framework in translating a representative phrase from one language to another, it is hypothesized that a single machine-learning framework can also be developed for fatigue damage detection across different specimen geometries. There are several ways to design specimens with different geometries. In the structural engineering community, the effects of stress concentration have been studied for a long time and have led to well-defined theories for commonly occurring materials and geometries [3]. This article focuses on understanding the application of mixed learning to specimens distinguished by their stress concentration factors (Figure 1a,b).
Two different stress concentration factors (Kt) are considered. The specimens are built from Al7075-T6, an aluminum alloy that is extensively used in aerospace applications. A custom-built fatigue testing apparatus, instrumented with ultrasonic sensors, generates the required time-series data over the entire duration of the tests. The tests provide crack detection at a very early stage (∼45% of fatigue life) owing to the use of a high-resolution confocal microscope. Baseline deep neural nets (DNNs), trained individually for each Kt, show above 95% accuracy. A unified DNN model is developed by mixing the data from both Kt values and training a single network. The unified model shows accuracies similar to the baseline DNNs, indicating the success of the mixed learning process. To understand the impact of the data contributions from the two Kt values, a parametric analysis is conducted by varying the contribution from each. Remarkably, with just 10% training data from both datasets, the mixed DNN reaches close to 92% accuracy, demonstrating its aptitude for success with scarce data for components with new materials or manufacturing processes.
The article is divided into five sections, including the present one. Section 2 summarizes the experimental protocol, followed by the data analysis methodology in Section 3. Section 4 presents the results and discussion, and Section 5 presents the conclusions and future work.
3. Data Analysis Methodology
Figure 9 broadly compares the mixed learning framework to the traditional approach (a separate DNN for each Kt). Figure 9a shows the commonplace DNN training and testing approach observed in ref. [21], where a DNN is trained for one particular problem (or Kt). In the long run, this may create a hurdle due to the vast number of Kt values that can exhibit minute differences. Mixed learning, as shown in Figure 9, tackles this problem by creating a single DNN that can adapt to multiple Kt values without any modifications. The following paragraphs elaborate the step-by-step procedure that leads to mixed learning.
3.1. Dataset Bifurcation Strategy, Training of Baseline DNNs, and Transductive Analysis
The first step of training any DNN is to create a training and testing split of the available data from both Kt values (i.e., Kt1 and Kt2), as shown in Figure 10a. Although fairly common in all machine learning analyses, the distribution of this split is often ad hoc (80–20% in this case) and is based on an informed estimate of the problem. This step is mentioned explicitly to emphasize the subsequent parametric analysis, which delves into the effects of such data splits. At this stage, however, following the train–test split, a separate DNN (Figure 10b) is trained and tested for each Kt to create a baseline for further performance comparison. The DNNs are represented using a consistent nomenclature Dx–y, where x and y denote the Kt datasets used for training and testing, respectively. For example, the baseline DNN for Kt1 (denoted as D1–1) is trained using 80% of the available Kt1 data and tested on the remaining 20%. The DNN has a fully connected structure with seven dense hidden layers and one dense output layer with a single neuron. The network receives raw, unprocessed signal data as its input. The hidden layers use the ReLU (rectified linear unit) activation function [35], while the output layer uses a sigmoid activation function [35]. The model is inspired by the encoder–decoder architecture [35], where inputs are compressed to a 2D latent space and then expanded again to reconstruct the original input. The low-dimensional latent space created by such a structure can help in interpreting the behavior of the DNN in future studies. Logistic regression [35] is carried out on the reconstructed output. Since the task is a mutually exclusive binary classification problem, binary cross-entropy [35] is used as the loss function.
The hyperparameters of the model, i.e., the number of neurons in each hidden layer and the learning rate, are selected through a grid-search algorithm using KerasTuner [36] to ensure optimality in terms of accuracy and speed of convergence. The optimum network (Figure 10b) has 428, 132, 96, 2, 96, 132, and 428 neurons in the seven hidden layers, with a learning rate of 0.0004 for the Adam optimizer [36]. The fully connected DNN architecture is shown in Figure 10c. The vast volume of data ensures that the computation rarely encounters over-fitting; hence, techniques such as regularization and dropout have not been used in this model. Following the construction of these baseline DNNs, their pre-trained capabilities are tested on data from the other Kt without any further training. D1–2 and D2–1, therefore, evaluate the universal applicability of the pretrained DNNs (D1–1 and D2–2) across different stress concentration factors, as shown in Figure 10d. Such an analysis is also termed ‘transductive’ analysis in the machine learning literature.
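For concreteness, the baseline architecture can be sketched as a NumPy forward pass. This is a minimal illustration, not the trained Keras model: the input window length of 1000 samples is an assumed placeholder, while the hidden-layer widths are those reported above.

```python
import numpy as np

HIDDEN = [428, 132, 96, 2, 96, 132, 428]  # widths reported for the seven dense layers
INPUT_DIM = 1000  # hypothetical length of one raw ultrasonic signal window

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(rng, input_dim=INPUT_DIM, hidden=HIDDEN):
    # Fully connected layers, ending in a single sigmoid output neuron
    dims = [input_dim] + hidden + [1]
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(dims[:-1], dims[1:])]

def forward(params, x):
    # ReLU through the hidden layers (compressing to the 2D latent space and
    # expanding again), sigmoid at the output for the cracked/healthy probability
    h = x
    for W, b in params[:-1]:
        h = relu(h @ W + b)
    W, b = params[-1]
    return sigmoid(h @ W + b)

rng = np.random.default_rng(0)
params = init_params(rng)
probs = forward(params, rng.standard_normal((4, INPUT_DIM)))
assert probs.shape == (4, 1)
assert np.all((probs >= 0.0) & (probs <= 1.0))
```

In the actual pipeline, the weights are of course learned by minimizing binary cross-entropy with the Adam optimizer rather than left at their random initialization.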
3.2. The Mixed Learning Approach
D1–2 and D2–1 demonstrate that each Kt may need a customized DNN under traditional supervised machine learning. This is not a sustainable solution in the long run due to the multitude of DNNs that would be needed for different Kt values [8]. Therefore, to avoid generating individual DNNs, the mixed learning approach pools the data from multiple sources to train a single network. As shown in Figure 11, the training data from Kt1 and Kt2 are used together to train a single DNN, denoted by Dm. In doing so, the network is trained to work invariantly on both datasets without the need for a Kt identification label.
The mixed DNN is trained to learn these properties through its multi-layered, fully connected network. Although the procedure remains fairly straightforward, its novel application to fatigue damage detection raises several questions that need to be thoroughly studied. Specifically, understanding the implications of varying the amount of data from both Kt values can prove beneficial to the structural engineering community, which often deals with data and testing limitations. For instance, new materials or expensive manufacturing techniques have limited testing information. Under such circumstances, can an accurate damage detection model be built with such mixed learning? How much data is essential for a reliable damage detection DNN? To answer such questions, the mixed DNN is further probed with a parametric analysis by varying the training data from both Kt values. The generalized behavior of the mixed DNN is thereby studied using training fractions p1 and p2, where p1 and p2 represent the training data volumes from Kt1 and Kt2, respectively, varied from 10% to 80%. Since low data can lead to unreliable, underfitted models, the training–testing split in this analysis is such that models trained with a lower percentage of the training data are tested on a higher percentage of testing data. In this way, the reliability of the models is also ensured.
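The parametric pooling described above can be sketched as follows. This is a hypothetical illustration: the fractions drawn from the two Kt datasets vary independently, the remainder of both serves as the test set, and the arrays merely stand in for labeled signal windows.

```python
import numpy as np

def mixed_split(data1, data2, p1, p2, rng):
    """Pool a fraction p1 of the Kt1 data and a fraction p2 of the Kt2 data
    into one training set; the remainder of both becomes the (larger) test set."""
    def split(data, p):
        idx = rng.permutation(len(data))
        n_train = int(p * len(data))
        return data[idx[:n_train]], data[idx[n_train:]]
    train1, test1 = split(data1, p1)
    train2, test2 = split(data2, p2)
    train = np.concatenate([train1, train2])
    test = np.concatenate([test1, test2])
    return train, test

rng = np.random.default_rng(0)
kt1 = np.arange(100)  # placeholder for 100 labeled Kt1 signal windows
kt2 = np.arange(200)  # placeholder for 200 labeled Kt2 signal windows
train, test = mixed_split(kt1, kt2, p1=0.1, p2=0.1, rng=rng)
# with only 10% of each dataset used for training, 90% is held out for testing
assert len(train) == 10 + 20 and len(test) == 90 + 180
```

A single network trained on the pooled `train` set then serves both Kt values, which is the essence of the mixed learning approach.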
In summary, all DNNs are built with the objective of identifying the health of a specimen by looking only at the ultrasonic signals. To achieve this target, a ground truth needs to be established between the healthy and cracked signals. Since the ultrasonic signals (and sensors) alone cannot segregate healthy signals from cracked ones, a confocal microscope is used as an additional information source. The confocal microscope provides the instant at which a crack emerges. During each fatigue test, all signal data acquired after this instant are labeled as cracked. The labeled data from all the specimens are then pooled together, and a training–testing data split is created. The training data are used to train the DNNs, and the testing data (which are previously unseen) are used to evaluate the capability of the DNNs to correctly distinguish healthy and cracked signals. This methodology, therefore, attempts to emulate a real scenario where a lab-trained, confocal-aided DNN is deployed to identify a crack by processing an ultrasonic signal alone.
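The confocal-aided labeling step can be illustrated with a small sketch; the acquisition times and the crack instant below are hypothetical values standing in for the measured ones.

```python
import numpy as np

def label_signals(acquisition_times, crack_instant):
    """Label every ultrasonic window acquired at or after the confocal-detected
    crack instant as 1 ('cracked') and every earlier window as 0 ('healthy')."""
    return (np.asarray(acquisition_times) >= crack_instant).astype(int)

# e.g., five windows acquired at these cycle counts, with the crack first
# observed by the confocal microscope at 35,000 cycles
labels = label_signals([10_000, 20_000, 30_000, 40_000, 50_000], crack_instant=35_000)
assert labels.tolist() == [0, 0, 0, 1, 1]
```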
3.3. Performance Metrics
Since the goal of all DNNs discussed in this paper is binary classification, their performance is best represented using a confusion matrix [36], which visualizes the capability of the classifier to accurately predict a healthy or a cracked signal. The confusion matrix yields three quantifiable metrics, viz., sensitivity (true positive rate), specificity (true negative rate), and accuracy (the average of sensitivity and specificity). Following the typical positive–negative terminology of the machine learning literature, a positive occurrence here corresponds to the cracked state and a negative occurrence to the healthy state. Accordingly, sensitivity is the percentage of correctly diagnosed cracked data, and specificity is the percentage of correctly recognized healthy data. It is imperative for all DNNs to have high sensitivity in this damage detection problem, particularly in safety-critical environments, as this ensures the reliable detection of cracked components. High overall accuracy indicates good all-round performance in identifying both classes of data.
5. Conclusions and Future Work
This paper presents a fatigue crack detection paradigm using ultrasonic signals and deep learning across different stress concentration factors (Kt). Two baseline DNNs are trained for two different stress concentration factors (Kt1 and Kt2), achieving accuracies of 95.8% and 96.1%, respectively. A transductive analysis is conducted to understand the capability of these pre-trained DNNs to detect damage for a different Kt. The analysis shows a steep drop in performance, with accuracies of roughly 60%, indicating the disparity in the seemingly identical data from the two Kt values. To build a unified damage detection DNN, a mixed learning approach is developed by combining the data from both Kt values and training a single network. The mixed approach successfully demonstrates performance close to that of the baseline DNNs. Delving further into the properties of the mixed DNN, a parametric analysis is conducted by varying the amount of training data used from each Kt. A gradual increase in performance is observed as the percentage of training data increases from 10% to 80% for both Kt values. Remarkably, even with low training data, accuracies above 90% are observed. The study, therefore, provides a basis for retraining with scarcely available data.
All DNNs in this analysis are developed in-house without using any existing pretrained networks. A study on fatigue-crack detection using a scattered-wave two-dimensional cross-correlation imaging method [38] reports an accuracy of 96% for the detection of 5-mm-long cracks and 99% for 10-mm-long cracks. The proposed method, in contrast, detects cracks with crack opening displacements on the order of 3 micrometers with a maximum accuracy of 96% using relatively inexpensive ultrasonic sensors, showing that it is not only feasible but superior to existing techniques under certain scenarios. From a data analysis perspective, a common competitor to the mixed learning approach is transfer learning [35], which can also be used to solve a similar problem. A preceding study [27] with transfer learning shows accuracies similar to the mixed learning approach. However, mixed learning triumphs over transfer learning by eliminating the need for multiple DNNs. The combination of high accuracy, ease of deployment, and parametric retraining can, therefore, make such DNNs a strong choice for fatigue-damage detection across industries.
However, the study does have some shortcomings that need to be addressed. The current approach treats the question at hand as a binary classification problem without considering the sequence information present in the time series. This is one reason why most of the erroneous predictions occur around the region of the time series where the crack first appears. Such behavior is expected because there is very little visual or mathematical difference between the signals immediately before and after short-crack initiation. Incorporating sequence information within the models, using long short-term memory (LSTM) networks or generative adversarial networks (GANs), might boost the accuracy further. Moreover, DNNs serve very much as a ‘black-box’ model: it is difficult to understand what features the model has learned in order to solve the fatigue crack detection problem. Employing an encoder–decoder-based structure allows the extraction of features from a 2D latent space. A systematic study of these features may allow a deeper understanding of the mechanics of fatigue failure and thus serves as an area of interest for future investigations.