1. Introduction
Operational load monitoring (OLM) is the process of investigating the characteristics of a structure under mechanical loads in its operating environment, by acquiring and saving data from sensors mounted on the structure. The gathered data are then processed to estimate the remaining in-service life of the structure, taking into account the number and characteristics of load cycles that the structure has withstood so far [1,2,3]. OLM processes are mainly used in the aerospace industry for monitoring aircraft structures in order to maximize their in-service life while ensuring flight safety [4,5]. Such an approach reduces the costs of inspections and increases aircraft availability [6].
A process related to OLM is structural health monitoring (SHM), in which the mechanical system is monitored using sensor data, often in real time, in order to detect a potential reduction in system performance, or even damage [7,8,9]. In the aerospace industry, both processes are often performed simultaneously (Structural Health and Usage Monitoring Systems, HUMS [1,10]): the strains and stresses at given points are measured, and a real-time system combines the data with flight parameters to determine the current state of the structure; the data are then used to compute fatigue factors of some components [11]. Sometimes, OLM is considered a special case of SHM [2].
Typical sensors used for OLM are embedded (intrinsic optical fiber) and surface-mounted (strain gauge) strain sensors [12,13]. These sensors should be placed at critical points (places of expected stress and strain concentrations). Due to the variable nature of the loads, and therefore the variable distributions of strains and stresses during operation, the number of required sensors may be high. As the accuracy of OLM depends on the number of sensors, and that number is limited, flight parameters (such as speed, acceleration and mass) are often monitored to estimate the state of the structure at other locations [14]. In the author's earlier work [15], it was shown that, by using artificial neural networks (ANNs), the instantaneous state of the whole structure could be estimated quite accurately from a relatively small number of sensors. The neural network was trained with data obtained from finite element simulations of the structure under many different load cases, and then tested with data acquired from strain gauges in a series of experiments.
This work analyzes the possibility of applying other machine learning and artificial intelligence algorithms to OLM, to estimate the state of the monitored structure from a reduced number of strain sensor readings. Three techniques are considered: adaptive neuro-fuzzy inference systems (ANFIS), support-vector machines (SVM) and Gaussian processes for machine learning (GPML). Each technique was used to build several models with different parameters, which were trained with simulation data. All of the models were tested with experimental data, and their accuracy and computational time were measured (taking into account possible real-time OLM applications). A detailed comparison of the obtained results is provided.
As machine learning and artificial intelligence methods applied to OLM have potential for real-time use, the possibilities of deploying such models to real-time or embedded systems are considered. For ANNs, the most popular approach is to deploy networks that have already been trained (on a capable platform, such as a PC) to the embedded device. As ANNs are algorithms of relatively low complexity, a wide variety of such applications can be found in the literature, including classification of signals for arrhythmia detection [16], evaluation of fiber–metal laminates [17], Secchi depth calculation [18], power factor correction [19], and hybrid simulation [20]. Deploying SVM algorithms to real-time or embedded systems is also popular. Zhang et al. [21] used an SVM to detect ventricular fibrillation in real time, and Arsalane et al. [22] deployed an SVM to an embedded system for meat freshness evaluation based on machine vision. Dey et al. [23] studied the performance of SVM classification algorithms running on embedded processors and suggested parallel computations. There are also examples of SVMs implemented in FPGA-based systems, such as a speech recognition system [24] or melanoma detection [25]. For ANFIS, there are many examples of real-time applications in automatic control, such as the control of hydraulic systems [26], an active suspension system for passenger comfort improvement [27], robot arms [28], and mobile robot navigation for target seeking and obstacle avoidance [29]. Karakuzu et al. [30] and Maldonado et al. [31] deployed ANFIS algorithms to FPGAs. Danzomo et al. [32] implemented an ANFIS-based algorithm on an Arduino microcontroller, Sakib et al. [33] on a Raspberry Pi for a flood monitoring system, and Mishra et al. [34] on a real-time system with a DSP 2812. Although real-time applications of GPML models are known [35,36], no examples of deploying these algorithms to embedded devices were found, which is probably due to the computational complexity of these models (they require all the training data to make predictions).
2. Materials and Methods
A typical structure for aerospace applications is considered: a hat-stiffened composite panel made of carbon/epoxy woven laminate, presented in Figure 1a. A detailed description of the geometry, experimental measurements of the mechanical properties, and the building and validation of a finite element model can be found in the author's earlier work [15].
Six strain gauges (SG1–SG6) are mounted on the panel in the positions presented in Figure 1b,c. Strains are measured in the longitudinal direction. The finite element model of the panel is presented in Figure 1d. It consists of 26,357 surface elements and 79,217 nodes. At the edge marked BC1, displacements in directions X and Y are fixed; at the opposite edge (marked BC2), displacements in direction Y are fixed. The panel is loaded vertically at any point on the line BC3 (the load is applied to a 5 mm circular surface, and displacements along the Z axis are fixed within the circle).
The experimental tests were performed using an MTS Insight 10 universal testing machine with a 500 N load cell (MTS Systems Corporation, Eden Prairie, MN, USA) (Figure 2). The panel was mounted in a specially designed stiff frame (to mimic the boundary conditions BC1–BC3) and loaded at points LC1–LC3, whose locations are presented in Figure 1b. Data from the strain gauges and the load cell were acquired using an HBM AB22A hardware system (Hottinger Baldwin Messtechnik, Darmstadt, Germany) and Catman Easy software. The force signal from the load cell was received at the analog input of the acquisition system. Data from the strain gauges and the load cell were acquired at a sampling frequency of 20 Hz. No preprocessing was applied to the measured signals.
For OLM purposes, the maximum inverse safety factor of the whole structure was observed during loading with different force values and at different points. The safety factor is usually defined as the margin to failure; the inverse safety factor ISF_Ci is defined as the quotient of the applied load Q and the failure load Q_Ci for a given failure criterion C_i:
ISF_Ci = Q / Q_Ci (1)
When there are n failure criteria (i = 1, 2, …, n), the global inverse safety factor ISF is calculated as the maximum over all criteria:
ISF = max_i ISF_Ci (2)
In the presented research, three failure criteria were considered: the maximum stress criterion C_1, the maximum strain criterion C_2 and the Tsai-Wu [37] criterion C_3:
where j, k = 1, 2, …, 6, σ_j and ε_j are stresses and strains in Voigt notation, respectively, F_j are parameters calculated from the stress limits σ_lim,j, and F_jk are coupling coefficients. In criteria C_1 and C_2, different stress limits σ_lim,j and strain limits ε_lim,j were taken for tension and compression.
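The way a per-criterion inverse safety factor and the global ISF combine can be sketched as follows. This is a minimal illustration, assuming a maximum-stress-type check in which each stress component is divided by its tension or compression limit; the function names and all numerical values are hypothetical and are not taken from the study.

```python
def isf_max_stress(stress, lim_tension, lim_compression):
    """Inverse safety factor for a maximum-stress-type criterion (assumed form).

    stress: stress components (Voigt notation); limits are positive magnitudes.
    Returns the largest ratio |stress_j| / limit_j over all components.
    """
    ratios = []
    for s, lt, lc in zip(stress, lim_tension, lim_compression):
        limit = lt if s >= 0.0 else lc  # tension and compression limits differ
        ratios.append(abs(s) / limit)
    return max(ratios)


def global_isf(isf_per_criterion):
    """Global ISF: the maximum over all considered failure criteria."""
    return max(isf_per_criterion)


# Hypothetical values, for illustration only
stress = [120.0, -40.0, 15.0]
lim_t = [600.0, 600.0, 90.0]
lim_c = [500.0, 500.0, 90.0]

isf_c1 = isf_max_stress(stress, lim_t, lim_c)
isf = global_isf([isf_c1, 0.15, 0.18])  # 0.15 and 0.18 stand in for the other criteria
```

A value of ISF below 1 means that the applied load is below the failure load for every criterion; the criterion with the largest ratio governs.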
From the finite element (FE) model, 1241 data samples relating the strains at positions SG1–SG6 to the value of ISF (Supplementary S7 to [15]) were generated. To generate the data, forces of different values were applied at 31 spots along line BC3 (Figure 1d). Figure 3 illustrates the nonlinear behavior of the ISF, plotted against the location of the applied force (all generated data samples with the same example force value of 40 N).
Reference [15] presents a neural network, trained with 70% of the abovementioned data generated from FE simulations, that predicts the maximum value of ISF for the whole structure from the strain gauge data (Figure 4). The performance of the network on strain values acquired in a series of experiments was measured.
This work compares the results of the ANN presented in [15] with those of three other machine learning methods (similar prediction models, with the inputs and output presented in Figure 4): ANFIS, SVM, and GPR. The comparison takes into account the accuracy of the results and the computational effort of making predictions. In the presented research, white Gaussian noise was added to the input data (SG1–SG6) generated from FE simulations (Supplementary Materials S1) in order to build models that are more robust to noisy data.
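The noise augmentation step can be sketched as below. The noise level used in the study is not stated here, so the standard deviation `sigma`, the seed and the sample values are assumptions chosen only for illustration.

```python
import random


def add_white_gaussian_noise(samples, sigma, seed=0):
    """Return a copy of `samples` with zero-mean Gaussian noise added to every value.

    samples: list of rows, one row per data sample (e.g., six strain readings).
    sigma:   assumed noise standard deviation, in the same units as the strains.
    """
    rng = random.Random(seed)  # seeded so that the augmentation is repeatable
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in samples]


# Two hypothetical simulated samples with six strain values each
clean = [
    [100.0, 80.0, 60.0, 40.0, 20.0, 10.0],
    [210.0, 150.0, 95.0, 70.0, 35.0, 12.0],
]
noisy = add_white_gaussian_noise(clean, sigma=2.0, seed=42)
```

Only the model inputs are perturbed; the target ISF values stay as computed by the FE model, so the model learns to map noisy strains to the noise-free reference output.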
Before building the models, a grid search was performed to find the best hyperparameter values, according to the algorithm presented in Figure 5. For each set of hyperparameter values, the model was trained 10 times, each time with a randomly chosen 70% of the reference data. Each time, the model was tested with the remaining 30% of the reference data and with experimental data, and mean squared errors (MSEs) were computed. Each set of hyperparameter values was evaluated by the average MSE over the 10 iterations.
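The repeated random hold-out evaluation can be sketched as follows. This is a simplified illustration, not the procedure of Figure 5 itself: a one-dimensional k-nearest-neighbour regressor stands in for the actual models, `k` plays the role of the hyperparameter, the data are synthetic, and the additional test on experimental data is omitted.

```python
import random


def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def knn_predict(train_x, train_y, x, k):
    """Stand-in model: average target of the k nearest training points."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))[:k]
    return sum(train_y[i] for i in nearest) / k


def evaluate(k, xs, ys, repeats=10, train_fraction=0.7, seed=0):
    """Average hold-out MSE over `repeats` random 70/30 splits."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(xs))
    errors = []
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        preds = [knn_predict(train_x, train_y, xs[i], k) for i in test]
        errors.append(mse([ys[i] for i in test], preds))
    return sum(errors) / len(errors)


# Synthetic nonlinear reference data
xs = [i / 10 for i in range(100)]
ys = [x ** 2 for x in xs]

grid = [1, 3, 15]  # candidate hyperparameter values
scores = {k: evaluate(k, xs, ys) for k in grid}
best_k = min(scores, key=scores.get)
```

Averaging over several random splits reduces the chance that a hyperparameter value is selected only because of one favourable partition of the data.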
After the best combination of hyperparameter values was chosen, the models were built using the algorithm presented in Figure 6. One thousand training attempts were performed for each regression model. The attempt selected was the one that gave the lowest mean squared error (MSE) of ISF when fed with input data acquired in a series of experiments, compared with the reference output obtained from FE simulations (for the known applied force).
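The best-of-many-attempts selection can be sketched as below. A least-squares straight line stands in for the regression model, and both the reference data and the "experimental" points are synthetic; only the selection logic (train repeatedly on random subsets, keep the attempt with the lowest MSE on the experimental inputs) reflects the description above.

```python
import random


def fit_line(xs, ys):
    """Stand-in model: least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def model_mse(model, xs, ys):
    slope, intercept = model
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def best_of_n(ref_x, ref_y, exp_x, exp_y, attempts=1000, train_fraction=0.7, seed=0):
    """Train `attempts` models on random subsets; keep the one with the lowest MSE
    on the experimental inputs against their reference outputs."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(ref_x))
    best_model, best_err = None, float("inf")
    for _ in range(attempts):
        idx = rng.sample(range(len(ref_x)), n_train)
        model = fit_line([ref_x[i] for i in idx], [ref_y[i] for i in idx])
        err = model_mse(model, exp_x, exp_y)
        if err < best_err:
            best_model, best_err = model, err
    return best_model, best_err


# Synthetic data, for illustration only
data_rng = random.Random(1)
ref_x = [float(i) for i in range(50)]
ref_y = [2.0 * x + 1.0 + data_rng.gauss(0.0, 1.0) for x in ref_x]
exp_x = [5.0, 15.0, 25.0]
exp_y = [2.0 * x + 1.0 for x in exp_x]

best_model, best_err = best_of_n(ref_x, ref_y, exp_x, exp_y)
```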
The algorithms presented in Figure 5 and Figure 6 were applied to the ANFIS, SVM and GPR models.
ANFIS is a fuzzy inference system implemented in the framework of adaptive networks [38]. It combines the benefits of fuzzy inference systems and adaptive neural networks. In fuzzy systems, the output is mapped from the input based on fuzzy logic (involving membership functions, fuzzy logic operators and if-then rules). Artificial neural networks, on the other hand, are collections of artificial neurons, which are mathematical models of biological neurons. The artificial neurons (nodes) are connected with directional links to transmit signals. If the outputs of some nodes depend on a set of parameters (which are changed during learning to minimize the error), then the network is an adaptive neural network [39,40]. The ANFIS structure consists of five layers, as presented in Figure 7 for a network with two inputs x_1 and x_2.
Layer 1 is the fuzzification layer, where the membership degrees of the inputs are computed using the membership functions (MFs) μ(x_i). In the second layer (the rule layer), the firing strength of each rule is determined as the product of its inputs:
for i = 1, 2. In the third layer, the input values (firing strengths) are normalized as follows:
for i = 1, 2. The fourth layer is known as the defuzzification layer, with adaptive nodes that perform the following function:
where p_i, q_i and r_i are the consequent parameters for rule i = 1, 2. The last, fifth layer is the output layer, which calculates the sum of its inputs.
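The five layers can be traced in a short forward pass. The sketch below assumes the common first-order Sugeno form with two inputs and two rules, in which rule i pairs the i-th membership function of each input and has the consequent p_i·x_1 + q_i·x_2 + r_i. Gaussian membership functions and all parameter values are illustrative choices, not those of the trained models.

```python
import math


def gauss_mf(x, sigma, c):
    """Gaussian membership function (an assumed choice for this illustration)."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def anfis_forward(x1, x2, premise, consequent):
    """Forward pass of a two-input, two-rule ANFIS (first-order Sugeno, assumed)."""
    # Layer 1: fuzzification, membership degree of each input for each rule
    mu1 = [gauss_mf(x1, sigma, c) for sigma, c in premise["x1"]]
    mu2 = [gauss_mf(x2, sigma, c) for sigma, c in premise["x2"]]
    # Layer 2: firing strength of rule i as the product of its membership degrees
    w = [m1 * m2 for m1, m2 in zip(mu1, mu2)]
    # Layer 3: normalization of the firing strengths
    total = sum(w)
    w_norm = [wi / total for wi in w]
    # Layer 4: normalized firing strength times the linear consequent of the rule
    rule_out = [wn * (p * x1 + q * x2 + r) for wn, (p, q, r) in zip(w_norm, consequent)]
    # Layer 5: the output is the sum of the rule outputs
    return sum(rule_out), w_norm


# Hypothetical premise (sigma, c) and consequent (p, q, r) parameters
premise = {"x1": [(1.0, 0.0), (1.0, 2.0)], "x2": [(1.0, 0.0), (1.0, 2.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 1.0)]

y, w_norm = anfis_forward(1.0, 1.0, premise, consequent)
```

With the inputs placed midway between the two membership function centres, both rules fire equally, so the output is the plain average of the two rule consequents.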
ANFIS can learn from training data in the same way as ANNs; however, the advantage of ANFIS over ANN is that the hidden layer in ANFIS is already determined by the fuzzy inference system, so its size does not have to be found. ANFIS techniques have been used for prediction tasks in many different areas of engineering, such as biomechanics [41], geotechnics [42], production technology [43], failure analysis [44] and structural mechanics [45].
To find the most suitable ANFIS for the given problem, the algorithms presented in Figure 5 and Figure 6 were implemented in MATLAB (using the Fuzzy Logic Toolbox [46]). The following types of membership functions μ(x_i) were considered: generalized bell-shaped (9), Gaussian (10), Gaussian combination (where the left and right curves are of type (10)), triangular (11), trapezoidal (12), difference between two sigmoidal functions (where a single function is given by (13)), product of two sigmoidal functions (where a single function is given by (13)), and Pi-shaped (14):
where a, b, c, d are the parameters to fit. The type of membership function was the hyperparameter optimized with the algorithm from Figure 5.
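Several of the listed membership function types can be written compactly. The sketch below uses the textbook forms of the generalized bell, Gaussian, triangular, trapezoidal and sigmoidal functions; the parameter ordering and the name `sigma` for the Gaussian width are assumptions of this illustration, and the Pi-shaped and two-sided variants are omitted for brevity.

```python
import math


def gbell_mf(x, a, b, c):
    """Generalized bell: width a, slope b, centre c."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))


def gauss_mf(x, sigma, c):
    """Gaussian: width sigma, centre c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))


def tri_mf(x, a, b, c):
    """Triangular: feet at a and c, peak at b (a < b < c)."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)


def trap_mf(x, a, b, c, d):
    """Trapezoidal: feet at a and d, shoulders at b and c (a < b <= c < d)."""
    return max(min((x - a) / (b - a), 1.0, (d - x) / (d - c)), 0.0)


def sig_mf(x, a, c):
    """Sigmoidal: slope a, crossover point c."""
    return 1.0 / (1.0 + math.exp(-a * (x - c)))


def dsig_mf(x, a1, c1, a2, c2):
    """Difference between two sigmoidal functions."""
    return sig_mf(x, a1, c1) - sig_mf(x, a2, c2)


def psig_mf(x, a1, c1, a2, c2):
    """Product of two sigmoidal functions."""
    return sig_mf(x, a1, c1) * sig_mf(x, a2, c2)


# Membership degrees of x = 0.25 in a few example sets
degrees = {
    "triangular": tri_mf(0.25, 0.0, 0.5, 1.0),
    "trapezoidal": trap_mf(0.25, 0.0, 0.2, 0.8, 1.0),
    "gaussian": gauss_mf(0.25, 0.2, 0.5),
}
```

The piecewise-linear types (triangular, trapezoidal) need only comparisons and divisions, whereas the others require evaluating exponentials or powers for every membership degree.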
Among machine learning tools, support vector machines (SVMs) are supervised learning algorithms used in classification and regression problems. SVMs are based on the statistical learning theory proposed by Vapnik [47], in which training data are represented as points in space, with the classes separated by a gap that is made as wide as possible. When new data are introduced for classification, they are mapped into the same space and classified based on the location of each point. In order to use SVMs for nonlinear classification, the kernel trick is applied, in which inputs are mapped to a higher-dimensional space [48]. Points in the new space are defined by a kernel function k(x, y). A hyperplane is found for which the distance to the nearest data points on both sides (defined as the margin) is as large as possible. The hyperplane is defined as:
where w is the normal vector to the hyperplane and b is a constant. In SVMs used for regression, the model depends only on a subset of the training data, determined by the parameter ε (which defines how much error is acceptable in the model). When training a regression SVM [49], the following minimization is performed:
where
The weight vector w can be expressed as a linear combination of the training data:
α_i and α_i* being the coefficients associated with the training points. The regression function takes the form:
SVMs have been shown to be effective regression tools for nonlinear data [50,51,52,53]; it is therefore worthwhile to measure the effectiveness of SVMs for structure state prediction in OLM.
The computations were performed using MATLAB and the Statistics and Machine Learning Toolbox [54]. Three different kernel functions were considered: Gaussian
linear
and polynomial
The optimized hyperparameters were ε (half of the width of the epsilon-insensitive band) for kernels (20)–(22), the kernel scale for (20), and the polynomial order q for kernel function (22).
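The three kernel types and the way a trained regression SVM makes a prediction can be sketched as follows. The kernel forms are the commonly used ones and are assumed here (Gaussian with the inputs divided by the kernel scale, linear as a dot product, polynomial as (1 + x·y)^q). In `svr_predict`, each entry of `coeffs` stands for the difference α_i − α_i* of one support vector. All numerical values are hypothetical.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def gaussian_kernel(x, y, scale=1.0):
    """exp(-||(x - y) / scale||^2), with `scale` acting as the kernel scale."""
    return math.exp(-sum(((a - b) / scale) ** 2 for a, b in zip(x, y)))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, q=2):
    """(1 + x.y)^q, with q the polynomial order."""
    return (1.0 + dot(x, y)) ** q


def svr_predict(x, support_vectors, coeffs, b, kernel):
    """f(x) = sum_i coeffs[i] * k(sv_i, x) + b, where coeffs[i] = alpha_i - alpha_i*."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + b


# Hypothetical trained model with two support vectors
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coeffs = [0.5, -0.25]
bias = 0.1

y_lin = svr_predict([2.0, 2.0], support_vectors, coeffs, bias, linear_kernel)
y_rbf = svr_predict([2.0, 2.0], support_vectors, coeffs, bias,
                    lambda u, v: gaussian_kernel(u, v, scale=2.0))
```

The prediction cost is proportional to the number of support vectors times the cost of one kernel evaluation, which is one reason why a Gaussian-kernel model can be slower than a linear one.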
The constraint C for the coefficients α_i and α_i* was calculated as a quotient of the interquartile range of the target vector y:
The optimization routine used was Sequential Minimal Optimization.
The third technique considered is Gaussian process regression (GPR), a GPML method [55]. It is based on a Bayesian framework, in which the predictive output is obtained from a prior conditional probability. The key assumption is that the probability distribution for any input vector x ∈ R (and target values y ∈ R) over the target function y = f(x) follows the Gaussian distribution:
where μ(x) is the associated mean function and K(x, x_i) is the associated covariance (kernel) function [56,57]. The model parameters are chosen in an optimization process. The output value at an unseen point is predicted from the similarity between points, expressed by K [58]. The main disadvantage of GPR is that it uses the whole training data set for predictions; however, it has been shown to be an efficient prediction tool [57,58,59], also when dealing with nonlinearities and uncertainties between inputs and targets [60,61,62]. Moreover, it has been reported that GPR often requires less input data than ANNs [63,64,65].
Four GPR models were found using the algorithms presented in Figure 5 and Figure 6. A different covariance function was used in each model:
where σ_f, σ_l and α are the parameters of the functions [66]. Equation (26) represents the squared exponential covariance function (used in model 1), (27) the exponential function (model 2), (28) the Matern 5/2 function (model 3), and (29) the rational quadratic function (model 4). The optimized hyperparameter for each model was the initial value of the noise standard deviation σ_0.
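The four covariance function types and the way a GPR prediction draws on the whole training set can be sketched as below. The covariance formulas are the standard textbook forms, written in terms of the distance r between two points, and are assumed here. The predictive mean assumes a zero mean function and fixed parameters, whereas in practice the parameters are fitted; the training data are synthetic.

```python
import math


def sq_exp(r, sf, sl):
    return sf ** 2 * math.exp(-0.5 * (r / sl) ** 2)


def exponential(r, sf, sl):
    return sf ** 2 * math.exp(-r / sl)


def matern52(r, sf, sl):
    s = math.sqrt(5.0) * r / sl
    return sf ** 2 * (1.0 + s + s ** 2 / 3.0) * math.exp(-s)


def rat_quad(r, sf, sl, alpha):
    return sf ** 2 * (1.0 + r ** 2 / (2.0 * alpha * sl ** 2)) ** (-alpha)


def dist(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def gpr_predict_mean(x_new, X, y, kernel, noise_std):
    """Zero-mean GP predictive mean: k(x_new, X) (K + noise_std^2 I)^-1 y."""
    n = len(X)
    K = [[kernel(dist(X[i], X[j])) + (noise_std ** 2 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    weights = solve(K, y)  # depends on all training data
    return sum(w * kernel(dist(x_new, xi)) for w, xi in zip(weights, X))


# Synthetic one-dimensional training data
X = [[0.0], [1.0], [2.0]]
y = [0.0, 1.0, 4.0]
kernel = lambda r: sq_exp(r, sf=1.0, sl=1.0)
pred = gpr_predict_mean([1.0], X, y, kernel, noise_std=1e-3)
```

Every prediction evaluates the covariance between the new point and each training point, so the cost of one prediction grows with the size of the training set, in line with the disadvantage noted above.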
4. Discussion
Three machine learning techniques (ANFIS, SVM, and GPR) were applied to predict the nonlinear ISF of an aerostructure subjected to different loads. The results were compared with each other and with the ANN model obtained in previous work, in terms of accuracy and required computational effort, taking into account potential real-time applications in operational load monitoring.
To find the most suitable ANFIS, models with different membership function types were considered. The grid search showed that the triangular function gives the most accurate model, for both two and three MFs per input. The computational effort required to make predictions with the trained model grows steeply with the number of membership functions per input. Therefore, it was decided to stop at two MFs per input, as the model accuracy was considered sufficient and did not improve much with three MFs per input.
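The growth in model size can be made concrete with a short calculation. The sketch assumes grid partitioning of the input space (one rule for every combination of membership functions, a common default when generating an ANFIS) and first-order Sugeno consequents; whether the models in this study were generated exactly this way is an assumption. The six inputs correspond to the strain gauges SG1–SG6.

```python
def n_rules(n_inputs, mfs_per_input):
    """Number of rules under grid partitioning: one per combination of MFs."""
    return mfs_per_input ** n_inputs


def n_consequent_params(n_inputs, mfs_per_input):
    """First-order Sugeno: each rule has one coefficient per input plus a constant."""
    return n_rules(n_inputs, mfs_per_input) * (n_inputs + 1)


N_INPUTS = 6  # strain gauges SG1-SG6

rules_2 = n_rules(N_INPUTS, 2)               # 2^6 = 64 rules
rules_3 = n_rules(N_INPUTS, 3)               # 3^6 = 729 rules
params_2 = n_consequent_params(N_INPUTS, 2)  # 64 * 7 = 448
params_3 = n_consequent_params(N_INPUTS, 3)  # 729 * 7 = 5103
```

Under these assumptions, moving from two to three MFs per input multiplies the number of rules evaluated for every prediction by more than eleven.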
For SVM models, three different kernel functions were considered, and a grid search of hyperparameter values was performed for each. Of the considered models, the one with the Gaussian kernel function (SVM1) was more accurate than the other two by one order of magnitude. The ISF results of the SVM tests (Figure 9) suggest that SVM2 was not immune to noisy experimental input data, although all the models were trained on numerical data with added white Gaussian noise. This is of great importance, as OLM always deals with noisy measured data. SVM1 was chosen for further consideration although its computational time was higher than that of SVM2 and SVM3, because the latter two did not perform well even when tested with numerical data similar to those used in training, which is not acceptable.
Four GPR models with different covariance functions were considered. Their computational times and accuracies were quite close to each other. From the plots of ISF results (Figure 11) alone, it would be hard to identify the best-performing model without computing error values. Although all GPR models could be considered acceptable, analysis of the numerical data in Table 4 showed that GPR1 outperformed the other three models, as it best met both considered criteria (computational time and RMSE for experimental input).
From the abovementioned models, the best one of each type was chosen for the final comparison. The accuracy of all the chosen models is acceptable and of the same order of magnitude. However, the computation times of the model types differ by orders of magnitude, with ANN being the fastest and ANFIS the slowest. When fed with experimentally acquired input that exceeded the training data range, only the ANN returned proper results outside the limits. This suggests that ANNs may have the strongest generalization abilities and could cope better when an unusual or unpredictable event occurs.
In the statistical analysis of the accuracy of the trained models (the repeatability study), the ANNs fared the worst of the four considered methods. Their standard deviation is three orders of magnitude higher than for all other model types. The histogram for ANNs could not be presented together with the others, as it spreads over a few orders of magnitude, and a logarithmic scale had to be introduced for visibility. This means that training an ANN that meets the requirements for OLM is a far harder task, which could require significantly more attempts than the other presented methods. The other issue with ANNs is that there is no universal rule for how many neurons should be placed in the hidden layer. If the number is too low, the accuracy of the model will not meet the requirements; if it is too high, the model could become overfitted and also be of low accuracy.
Among the three other methods, SVM had the lowest average value and standard deviation of RMSE. Moreover, its computational time is lower than that of the ANFIS and GPR models. Therefore, it is recommended for OLM systems without real-time constraints, or for OLM systems where this method meets the time constraints.
For real-time systems where the computational power is not high (e.g., where the computations are performed in an embedded system with ARM architecture) or where the required time step is relatively small, two methods should be considered: ANN and SVM. ANN is the fastest because of the simple floating-point operations in each layer; however, finding an accurate model could be hard or even impossible. In this case, SVM could be easier to implement, as it showed good repeatability (in the histogram and the standard deviation value).