1. Introduction
The development of automatic damage-detection strategies has played a key role in the condition-based maintenance of mechanical, civil and aerospace structures [1,2]. Vibration-based damage identification approaches have been widely adopted for long-term continuous monitoring [3,4]. The fundamental idea of these approaches is that damage-induced changes in physical properties (mass, damping and stiffness) are reflected in changes in modal parameters (eigenfrequencies, modal damping and mode shapes) [5,6]. According to this principle, modal parameters can be adopted to describe the state of health of a monitored structure and thus be used to define effective damage features.
A limitation of modal-based damage-detection strategies is that structural and physical properties are also dependent on environmental and operational conditions and not only on damage [7,8]. Furthermore, environmental and operational variations often cause changes in vibration properties that are greater than those caused by damage [9]. Many works in the literature have suggested strategies to overcome this limitation [10,11,12,13], but there have been few applications to structures under real operating conditions, especially when damage is present. The authors of this work propose an automatic algorithm that can be used for the structural health monitoring of beam-like structures; this algorithm was validated with a one-of-a-kind application where real damage was detected on full-scale structural elements under the effects of an uncontrolled environment.
More specifically, the test case was represented by tie-rods, i.e., tensioned slender beams widely adopted in civil structures to balance the lateral forces of arches and vaults (though the proposed strategy can work for any kind of tensioned beam, such as the ties and struts of space frames or the diagonal braces of a truss girder). Despite their simple geometry, the interpretation of tie-rod modal parameters is complex due to many different physical variables (e.g., axial load and constraint characteristics) that cause changes in the structural properties under operating conditions [14]. Moreover, the above-mentioned physical variables are generally affected by significant uncertainty when real tie-rods are considered.
A review of the state of the art on the topic of vibration-based tie-rod monitoring revealed that the attention of researchers has been mainly focused on the identification of the axial load as a means to assess the state of health of structures (the arch or vault where the tie-rod is adopted) [15]. Indeed, the only possible direct measurement of tie-rod tension must be obtained through the adoption of strain gauges calibrated during a tensioning procedure, which is not a viable solution when operating tie-rods are considered. Thus, many research activities have been devoted to the development of indirect approaches to estimate the axial load from vibration measurements (some examples can be found in [16,17,18,19,20,21,22]).
A common aspect of the above-mentioned works is that none of them considered that a tie-rod itself can be subject to damage. Conversely, deteriorative phenomena, such as corrosion, may cause a tie-rod to lose its functionality and consequently cause the collapse of the structure. Recently, a crack identification method for tie-rods was presented in [23] based on a comparison between the modal parameters of a damaged tie-rod and those of a healthy reference one undergoing the same environmental conditions. The approach mentioned in the study was intended to be adopted through in situ tests, not continuously. Moreover, since the approach requires a comparison with an undamaged twin structure, its adoption in real monitoring applications may be difficult.
To overcome these limits, the authors of [24] proposed a vibration-based, data-driven approach to tie-rod damage detection that showed the potential to detect damage under the effects of an uncontrolled environment without knowledge of physical variables (e.g., tie-rod tension) and without the need for a reference structure. The key point of the proposed approach is that when a pattern of modal parameters is considered instead of just a single one, the damage-detection problem can be treated as a multivariate outlier detection problem. Previous studies [10,25] have shown that calculating the Mahalanobis squared distance (MSD) of new observations with respect to a baseline period helps mitigate the effects of environmental and operational variations and allows for damage detection. The strategy was validated in an uncontrolled environment to detect damage simulated through the addition of concentrated masses on a tie-rod.
In line with these promising results, three main contributions to the state of the art are made in this paper. First, attention is paid to minimizing the number of required sensors, developing a damage-detection strategy that can be carried out by adopting a single accelerometer on a monitored tie-rod. This strategy relies on a damage feature defined only by tie-rod eigenfrequencies which could be identified with a single accelerometer properly placed on the monitored tie-rod. This allows for a simpler experimental set-up with respect to the one adopted in [24]. The obvious economic consequences of adopting a simple experimental set-up can help the transition from research to real applications.
A second important contribution of this paper is the automation of the method, so that the strategy can be adopted without any human supervision. We present an automatic data-cleansing procedure that was developed and successfully tested on long-term data acquired in an uncontrolled environment. This paper shows how the proposed procedure can significantly improve the performance of a damage-detection strategy based on the MSD.
Finally, the experiment designed to validate the approach represents a one-of-a-kind test case because data referring to the real evolution of a corrosion process over time were considered in this work. This also represents a rare application in the field of structural health monitoring, since the majority of the strategies presented in the literature were only validated with simulated damage in a laboratory environment. In this research, instead of producing discrete changes to structural properties (such as the addition of a concentrated mass to simulate a reduction in bending stiffness), real continuous damage evolution was considered over several months, which allowed us to replicate the most realistic and challenging possible scenario to test the effectiveness of our damage-detection strategy for continuous structural health monitoring.
The paper is organized as follows: The adopted vibration-based damage-detection strategy is described, the mathematical background on the MSD is provided, and a damage index is introduced in Section 2. The adoption of an effective data-cleansing algorithm is a necessary step to use the strategy without human supervision, and this algorithm is presented in Section 3. The experimental set-up and the corrosion process that allowed for the introduction of real damage in the monitored tie-rods are described in Section 4. The effectiveness of the damage-detection strategy is discussed in Section 5, where the results of the experimental campaign are presented. The strengths of the method, current limitations, and future developments are discussed in Section 6. Finally, conclusions are drawn in Section 7.
2. Damage-Detection Strategy
In this section, the vibration-based damage-detection strategy, in which eigenfrequencies of the monitored structure are used to define a damage feature, is presented. The details related to the extraction of tie-rod eigenfrequencies using a single sensor under operating conditions, together with the data-cleansing procedure proposed to apply the strategy in an uncontrolled environment, are presented in the next section. The discussion presented in this section is generally valid for any structure, regardless of the approach adopted to identify the eigenfrequencies.
If a number $n$ of vibration modes is considered, the $n$ eigenfrequencies $f_m$ (with $m = 1, \dots, n$) can be stored in a feature vector $\mathbf{x}$, defined as follows:

$$\mathbf{x} = [f_1, f_2, \dots, f_n]^T \tag{1}$$

where the superscript “T” indicates the transpose. More generally, $f_1$ indicates the first considered eigenfrequency, not necessarily the one associated with the first vibration mode. From a continuous monitoring perspective, the identification of the eigenfrequencies can be repeated a number of times $N$ and the eigenfrequencies can be stored in a matrix $\mathbf{X}$ of size $N \times n$, as follows:

$$\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N]^T = \begin{bmatrix} f_{1,1} & f_{2,1} & \cdots & f_{n,1} \\ \vdots & \vdots & & \vdots \\ f_{1,N} & f_{2,N} & \cdots & f_{n,N} \end{bmatrix} \tag{2}$$

with $\mathbf{x}_k$ the $k$-th observation of the feature vector, $k = 1, \dots, N$.
The matrix $\mathbf{X}$ represents a multivariate feature set where every column contains the trend of each of the considered eigenfrequencies over time. In order to develop an unsupervised learning damage-detection strategy, the behavior of a feature during a reference period (baseline) must be observed and statistically characterized. The matrix containing the eigenfrequencies associated with the baseline period is named $\mathbf{X}_b$ hereafter.
Damage Index
A new observation of the feature vector during the monitoring period, when the tie-rod health state is unknown, is referred to as $\mathbf{x}_{new}$, and multivariate metrics can be adopted to check whether $\mathbf{x}_{new}$ is an outlier with respect to $\mathbf{X}_b$. A damage index $D$ can be defined by calculating the MSD between $\mathbf{x}_{new}$ and $\mathbf{X}_b$ according to the following expression:

$$D = (\mathbf{x}_{new} - \boldsymbol{\mu})^T \, \boldsymbol{\Sigma}^{-1} \, (\mathbf{x}_{new} - \boldsymbol{\mu}) \tag{3}$$

where $\boldsymbol{\mu}$ is a column vector of size $n \times 1$ whose $m$-th element is the mean of the $m$-th column of $\mathbf{X}_b$, $\boldsymbol{\Sigma}$ is the covariance matrix [26] associated with $\mathbf{X}_b$, and the superscript “$-1$” means the inverse. The index $D$ is the result of a multivariate discordancy test that can be compared against a threshold $t$ to determine whether $\mathbf{x}_{new}$ is judged to be statistically likely or unlikely to have come from the generating process of the multivariate dataset $\mathbf{X}_b$; if $D > t$, the new observation $\mathbf{x}_{new}$ is considered to be an outlier with respect to $\mathbf{X}_b$ and damage is detected. The distribution of the usual (undamaged) condition is assumed to be Gaussian, and the threshold calculation can be carried out in terms of chi-squared statistics or by adopting a numerical method. The latter approach, based on the Monte Carlo method, was adopted in this work following the procedure explained in [27], which comprises the following steps:
1. For every iteration, a matrix with the same size as $\mathbf{X}_b$ (i.e., a matrix with $N$ rows and $n$ columns) is considered, where every element is generated from a normal distribution with zero mean and unit standard deviation.
2. The MSD is calculated between every row of the matrix and the matrix itself, obtaining $N$ values of the MSD. The maximum value is stored for every iteration.
3. The procedure is repeated for a large number of trials. All the resulting maxima are sorted in terms of magnitude. The critical value for a 5% test of discordancy is given by the MSD in the array above which 5% of the trials occur. In this way, a threshold $t_{inc}$, also known as the “inclusive threshold”, is obtained. This threshold must be used in cases where the baseline set also contains observations related to the damage condition.
If the baseline set does not include data related to the damage condition, as in the considered case, one must adopt another threshold $t_{exc}$ (also known as the “exclusive threshold”) that can be evaluated according to the following expression [1]:
The threshold level is dependent on both the number of observations ($N$) and the number of variables ($n$) of the problem being studied. For a given period, the dimensions of the baseline set are dependent on the parameters adopted to obtain a stable automatic identification, as is discussed in the next section.
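As an illustration of the damage index and of the Monte Carlo threshold procedure, a minimal Python sketch is given here. It assumes NumPy; the function names, the number of trials and the synthetic baseline are hypothetical choices made for this example. The conversion from the inclusive to the exclusive threshold uses the relation commonly reported in the outlier-analysis literature, which is only assumed (not verified against [1]) to coincide with the expression adopted in this work.

```python
import numpy as np

def mahalanobis_sq(x_new, baseline):
    """MSD between one observation (n,) and a baseline matrix (N, n)."""
    mu = baseline.mean(axis=0)
    sigma = np.atleast_2d(np.cov(baseline, rowvar=False))
    diff = x_new - mu
    return float(diff @ np.linalg.solve(sigma, diff))

def inclusive_threshold(n_obs, n_var, trials=1000, percentile=95.0, seed=0):
    """Monte Carlo estimate of the inclusive threshold (5% discordancy test)."""
    rng = np.random.default_rng(seed)
    maxima = np.empty(trials)
    for k in range(trials):
        # Step 1: Gaussian matrix with the same size as the baseline
        z = rng.standard_normal((n_obs, n_var))
        # Step 2: MSD of every row with respect to the matrix itself
        diff = z - z.mean(axis=0)
        sigma_inv = np.linalg.inv(np.atleast_2d(np.cov(z, rowvar=False)))
        d = np.einsum("ij,jk,ik->i", diff, sigma_inv, diff)
        maxima[k] = d.max()
    # Step 3: value above which 5% of the stored maxima occur
    return float(np.percentile(maxima, percentile))

def exclusive_threshold(t_inc, n_obs):
    """Assumed inclusive-to-exclusive conversion (outlier-analysis literature)."""
    n = n_obs
    return (n - 1) * (n + 1) ** 2 * t_inc / (n * (n ** 2 - (n + 1) * t_inc))

# Synthetic baseline: 200 observations of 3 features (hypothetical data)
rng = np.random.default_rng(1)
baseline = rng.standard_normal((200, 3))
t_inc = inclusive_threshold(200, 3, trials=200)
t_exc = exclusive_threshold(t_inc, 200)

d_center = mahalanobis_sq(baseline.mean(axis=0), baseline)      # baseline mean
d_far = mahalanobis_sq(baseline.mean(axis=0) + 10.0, baseline)  # shifted point
damage_detected = d_far > t_exc
```

In this sketch the covariance is inverted once per trial; for larger feature sets a Cholesky-based solve would be a reasonable alternative.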
A key point of the proposed approach is that eigenfrequencies are used to synthetically represent the current state of the monitored tie-rod because they are representative of all the physical variables that mostly influence its dynamic behavior (e.g., the axial load). Even if these variables change due to environmental and operational variations, the strategy does not require knowledge of them. Indeed, as proven in [25], in order to filter out variability due to environmental and operational conditions, this variability must be included in the samples used to compute the covariance matrix $\boldsymbol{\Sigma}$. If an adequate baseline set is considered as reference, the MSD becomes almost insensitive to the variations due to environmental effects, and, in this sense, the anomaly detection performance improves when a baseline set containing a full range of environmental conditions is used [10].
A critical aspect of developing a completely automatic damage-detection algorithm is related to the quality of the features used to evaluate
and
[
28]. Indeed, when the identification of eigenfrequencies is automatically carried out by exploiting the excitation coming from an uncontrolled environment, many possible sources of error can lead to non-reliable results, as discussed in the next section.
If such corrupted data are included in the baseline matrix, they increase the baseline dispersion and make the damage index less sensitive to real outlier data related to damage. Moreover, if a feature vector contains wrong identifications, the damage index can exceed the threshold when damage is not present. In order to obtain an automatic algorithm that can work in real applications, a data-cleansing procedure was developed to discard features containing wrongly identified eigenfrequencies without the supervision of an expert, as explained in the next section.
3. Automatic Identification and Data Cleansing
Since the procedure described in Section 2 can be successfully used to detect damage in a tie-rod, as shown in [24], a fundamental step toward real applications is the development of a completely automatic algorithm with no need for human expert supervision. The automation of the procedure to obtain the data needed by the strategy described in Section 2 is explained below.
Eigenfrequencies can be automatically identified from the dynamic response of a monitored structure through the adoption of automated operational modal analysis (OMA) [16,29,30] techniques. Among the different possible approaches, a single-degree-of-freedom (SDOF) modal identification technique [31] was adopted in this work to identify the eigenfrequencies of tie-rod bending vibration modes in the vertical plane. Indeed, the monitored tie-rods showed lightly coupled modes that were not closely spaced in frequency and not heavily damped, thus allowing for the adoption of the simple and fast approach described below. However, before discussing the details of the specific case, it is worth mentioning that the data-cleansing procedure and the damage-detection approach proposed here can be carried out regardless of the technique adopted to identify the eigenfrequencies comprising feature vector $\mathbf{x}$.
When the environment provides random excitation to a tie-rod, each eigenfrequency can be identified through a best fitting between the experimental power spectrum of the response ($G_{exp}(\omega)$, function of the angular frequency $\omega = 2\pi f$, where $f$ is the frequency expressed in Hz) and the analytical power spectrum of the response of an SDOF mechanical system with eigenfrequency $f_m$ excited by white noise, defined by the following expression [31]:
where $i$ is the imaginary unit, $\xi_m$ is the $m$-th modal damping ratio, $A_m$ is a constant (function of the white noise level, the eigenvector component at the measurement point and the modal participation factor), and $B_m$ is the contribution of the out-of-band modes. To allow for more compact notation, these parameters are grouped in a vector $\boldsymbol{\theta}_m$ that collects $f_m$, $\xi_m$, $A_m$ and $B_m$. This simple technique comes with the advantage that a single accelerometer can be adopted if the sensor is placed in a position that is not close to a node of the considered vibration mode.
Welch’s method, based on the frequency-averaging approach, can be used to calculate an experimental power spectrum [32,33]. The approach requires an initial tuning of some processing parameters that can be set once prior to the automatic monitoring. These parameters are related to the duration of the record to analyze $T$ (which determines the amount of time between two observations of the damage feature vector), the duration of the sub-records $T_{sub}$ used for the averaging procedure (which determines the frequency resolution of the power spectrum $\Delta f = 1/T_{sub}$), the percentage of overlap between two sub-records, and the type of window adopted on every sub-record [34]. The results presented below refer to the considered case study, where a power spectrum was estimated every hour ($T = 1$ h) using sub-records of fixed duration $T_{sub}$, an overlap of 50%, and a Hanning window.
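To make the role of the processing parameters concrete, the following minimal sketch estimates a power spectrum with Welch’s method through `scipy.signal.welch`. The sampling frequency is the one reported for the experimental set-up; the record duration is shortened to keep the example fast, and the sub-record duration and the synthetic signal are hypothetical values chosen only for illustration.

```python
import numpy as np
from scipy.signal import welch

fs = 512.0        # sampling frequency in Hz (value reported for the set-up)
t_record = 60.0   # record duration in s (shortened; one hour in the case study)
t_sub = 10.0      # sub-record duration in s (hypothetical value)

# Synthetic response: a 50 Hz component buried in white noise (hypothetical)
rng = np.random.default_rng(0)
t = np.arange(int(fs * t_record)) / fs
signal = np.sin(2.0 * np.pi * 50.0 * t) + 0.5 * rng.standard_normal(t.size)

# Hanning window and 50% overlap between consecutive sub-records
nperseg = int(fs * t_sub)
freqs, psd = welch(signal, fs=fs, window="hann",
                   nperseg=nperseg, noverlap=nperseg // 2)

df = freqs[1] - freqs[0]          # frequency resolution, equal to 1 / t_sub
f_peak = freqs[np.argmax(psd)]    # frequency of the dominant spectral peak
```

A longer sub-record gives a finer frequency resolution but fewer averages for a given record duration, which is the trade-off behind the initial tuning mentioned above.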
In an uncontrolled environment, the averaging process may not always allow for a good reconstruction of the power spectrum in the frequency bands where the hypothesis of SDOF is made; for this reason, the best fitting procedure may fail or converge to wrong solutions. When this occurs, the wrong estimates of the eigenfrequencies should not be used to define damage feature vectors. We adopted a data-oriented approach that must be seen in the context of continuous monitoring when a huge amount of data are available: identification is always carried out, and wrong identifications are automatically detected and discarded by analyzing the obtained eigenfrequencies.
The proposed data-cleansing procedure comprises two stages. The first takes place after every eigenfrequency identification, considering vibration modes one at a time, and it is described in Section 3.1. In the second stage of data cleansing (presented in Section 3.2), multiple vibration modes are considered together over a period to detect and remove outliers that are still present after the first stage.
To show the effect of each stage, we discuss an example based on two weeks of data in which three eigenfrequencies were considered. All the frequencies identified without adopting any data cleansing are presented in Figure 1 as two scatter plots (the pair $f_1$, $f_2$ in Figure 1a and the pair $f_1$, $f_3$ in Figure 1b). The outliers were those observations that significantly deviated from the majority, and they were due to the wrong identification of eigenfrequencies (some examples are circled in Figure 1a,b).
3.1. First Stage
Every $m$-th eigenfrequency is considered separately every time identification is carried out. The automatic identification procedure considering only one eigenfrequency is described below.
1. Initialization step: This step is only intended to define the initial frequency range in which the SDOF hypothesis is assumed. The first approximate value of the eigenfrequency $\tilde{f}_m$ must be indicated, along with a value $\delta_m$, such that the power spectrum is considered only in the range of frequencies between $\tilde{f}_m - \delta_m$ and $\tilde{f}_m + \delta_m$. This can be done by roughly identifying the resonance after a visual inspection of the power spectrum at the beginning of the monitoring period. Example initialization parameters used to obtain the eigenfrequencies of the example reported in Figure 1 are presented in Table 1. In this case, these values were defined after a visual inspection of the power spectrum shown in Figure 2, which was obtained from a record of duration $T$ with sub-records of duration $T_{sub}$, an overlap of 50%, and a Hanning window.
Figure 2.
Example of the power spectrum of the response of a tie-rod of the experimental set-up. In grey, the frequency ranges where the SDOF hypothesis is assumed.
2. Assessing the quality of the fitting: When a new record of data of duration $T$ is available, the adopted automatic OMA technique is applied to identify the target eigenfrequency. The output of this step is the evaluation of an index that can quantify the quality of the identification. As mentioned above, the best fitting approach was adopted in this work. For this reason, the experimental power spectrum $G_{exp}(\omega)$ was first calculated, and only the portion related to the frequencies in the considered range was taken into account. The eigenfrequency $\hat{f}_m$ was estimated by adopting the simplex search method [35] to search for the solution of the minimization problem:
In Equations (6) and (7), , and .
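If the ideal power spectrum of the response corresponding to the solution of the minimization problem $\hat{\boldsymbol{\theta}}_m$ is $\hat{G}_{SDOF}(\omega)$, the $R^2$ index can be calculated to quantify the quality of the fitting as follows:

$$R^2 = 1 - \frac{\sum_{\omega} \left( G_{exp}(\omega) - \hat{G}_{SDOF}(\omega) \right)^2}{\sum_{\omega} \left( G_{exp}(\omega) - \bar{G}_{exp} \right)^2} \tag{7}$$

where $\bar{G}_{exp}$ is the mean of $G_{exp}(\omega)$ in the considered frequency range. The index $R^2$ can be evaluated once the modal parameters are estimated regardless of the adopted identification method.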
Ideally, $R^2$ equals 1 (or is very close to 1) if the estimated power spectrum perfectly overlaps with the experimental one. An example is reported in Figure 3a: in this case, the experimental power spectrum (black thin line) and the ideal power spectrum of the response of an SDOF system with an eigenfrequency equal to approximately 111.45 Hz (blue thick line) showed a good match. Lower values of $R^2$ are associated with the misidentification of the modal parameters that can occur due to a lack of excitation of the considered vibration mode. An example is presented in Figure 3b. In this case, the vibration mode was not excited, as can be noticed by comparing the amplitude of the experimental power spectrum with that of Figure 3a. In this case, the coefficient $R^2$ was approximately equal to 0.5. Other conditions that may be associated with a low $R^2$ are those associated with a poor signal-to-noise ratio that does not allow for an accurate estimate of the power spectrum in the considered frequency range, as can be observed via the shape of the experimental power spectrum presented in Figure 3c. The $R^2$ index, in this case, was close to 0.8.
3. Updating $\tilde{f}_m$ and storing $\hat{f}_m$ in $\mathbf{x}$: According to what was previously observed, a threshold level $R^2_{th}$ can be set to discard wrong identifications. If an $R^2 \geq R^2_{th}$ is associated with $\hat{f}_m$, this value is considered as the first guess estimate for the next iteration, so $\tilde{f}_m$ is updated such that $\tilde{f}_m = \hat{f}_m$ (and consequently, the considered frequency range is updated). Furthermore, $\hat{f}_m$ is stored in the $m$-th row of the feature vector $\mathbf{x}$. Conversely, if $R^2 < R^2_{th}$, $\tilde{f}_m$ is not updated and no feature vector is obtained for the considered record of data. Regardless of the results obtained with the other eigenfrequencies, since the feature vector must always contain the same number of elements, if even one of the considered eigenfrequencies is not correctly identified, no feature vector can be obtained for the considered record of data. The procedure iterates starting from step 2.
To summarize, at this stage, only eigenfrequency estimates coming from fittings characterized by $R^2 \geq R^2_{th}$ are preserved, with decisions made after considering only one record of data at a time and every eigenfrequency separately. This first simple check on the $R^2$ value effectively points out the presence of clearly corrupted data (the results shown below were obtained with a fixed threshold level $R^2_{th}$). For the example of Figure 1, the strategy was adopted on each of the three eigenfrequencies. The effect of this first stage of data cleansing can be observed in Figure 4, where red-filled circles are associated with identifications that did not satisfy the condition $R^2 \geq R^2_{th}$ and were discarded. Although a number of outliers were correctly detected and removed by the first stage, there were still observations that deviated from the majority of the population, meaning that some wrong eigenfrequency identifications could still be associated with a high $R^2$. An example related to the presence of a harmonic disturbance is presented in Figure 3d: in this case, the solution provided by the simplex search method is the frequency of the harmonic disturbance at 110 Hz, underestimating the correct eigenfrequency value for the considered vibration mode and still resulting in an $R^2 \geq R^2_{th}$. In general, to remove outliers still present after the check on $R^2$, another stage of data cleansing is needed.
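The first stage can be sketched in Python as follows. This is only an illustration under stated assumptions: the SDOF spectrum is written in a generic normalized form (a resonance peak of height `a` over a constant floor `b`), which is not claimed to be the exact expression used in this work; the cost function is a plain sum of squared differences; the simplex search is the Nelder-Mead option of `scipy.optimize.minimize`; and the synthetic spectrum, the first guess and the threshold value `r2_threshold` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def sdof_psd(f, theta):
    """Assumed SDOF power spectrum: peak of height a at f_m plus a floor b."""
    f_m, xi, a, b = theta
    num = (2.0 * xi * f_m ** 2) ** 2
    den = (f_m ** 2 - f ** 2) ** 2 + (2.0 * xi * f_m * f) ** 2
    return a * num / den + b

def fit_sdof(f, g_exp, f_guess, xi_guess=0.01):
    """Simplex-search fit in the selected band; returns (f_hat, R^2)."""
    g_norm = g_exp / g_exp.max()                # scale the spectrum to O(1)
    theta0 = np.array([f_guess, xi_guess, 1.0, 0.0])

    def cost(theta):
        return np.sum((g_norm - sdof_psd(f, theta)) ** 2)

    res = minimize(cost, theta0, method="Nelder-Mead",
                   options={"xatol": 1e-8, "fatol": 1e-12,
                            "maxiter": 4000, "maxfev": 8000})
    g_fit = sdof_psd(f, res.x)
    ss_res = np.sum((g_norm - g_fit) ** 2)
    ss_tot = np.sum((g_norm - g_norm.mean()) ** 2)
    return float(res.x[0]), float(1.0 - ss_res / ss_tot)

# Synthetic band around a resonance at 111.45 Hz with 2% multiplicative noise
rng = np.random.default_rng(0)
f = np.arange(108.0, 115.0, 0.05)
g_exp = sdof_psd(f, [111.45, 0.004, 3.0, 0.05])
g_exp = g_exp * (1.0 + 0.02 * rng.standard_normal(f.size))

f_tilde = 111.3                       # first guess from the previous record
f_hat, r2 = fit_sdof(f, g_exp, f_guess=f_tilde)

r2_threshold = 0.9                    # hypothetical threshold level
accepted = r2 >= r2_threshold
if accepted:
    f_tilde = f_hat                   # update the first guess for the next record
```

When the fit is rejected, `f_tilde` is left unchanged and no feature vector is produced for that record, mirroring step 3 of the procedure.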
3.2. Second Stage
This time, multiple observations of the feature vector (i.e., matrices $\mathbf{X}$) are considered. The second strategy for outlier removal considers the trend in time of the eigenfrequencies over a short-term period, assuming that in such a short time window, eigenfrequency variations are only caused by axial load variations, like those caused by temperature variations, and not by damage. This assumption is valid when considering long-term deteriorative phenomena that do not significantly evolve in a short period, e.g., one or two weeks.
From analytical models presented in the literature [36,37], every tie-rod squared eigenfrequency is linearly dependent on the axial load. Consequently, squared eigenfrequencies of different vibration modes are related to each other with a linear relationship if the axial load is the only changing variable.
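As a worked illustration of this linear dependence, consider the simplified case of a pinned–pinned Euler–Bernoulli beam under axial tension. This model is an assumption made here for clarity and does not describe the real constraints of a tie-rod. The symbols $P$ (axial load), $EI$ (bending stiffness), $\rho A$ (mass per unit length) and $L$ (free length) are introduced only for this example.

```latex
% Squared eigenfrequency of the m-th bending mode (pinned-pinned, tensioned beam)
f_m^2 = \frac{m^2}{4 L^2 \rho A} \left( P + \frac{m^2 \pi^2 EI}{L^2} \right)

% Eliminating P between mode 1 and mode j gives a straight line in the
% (f_1^2, f_j^2) plane, with slope j^2 and an intercept independent of P:
f_j^2 = j^2 \, f_1^2 + \frac{j^2 (j^2 - 1)\, \pi^2 EI}{4 L^4 \rho A}
```

Under this assumption, a change in axial load moves an observation along the line, whereas a wrongly identified eigenfrequency moves it away from the line.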
If a set $\mathbf{X}$ is considered, the columns of the matrix are the trends of the identified eigenfrequencies in a short-term period, and they can be indicated as vectors $\mathbf{f}_m$, with $m = 1, \dots, n$. All the elements in vectors $\mathbf{f}_m$ can be squared, and the resulting vectors are indicated as $\mathbf{s}_m$ hereafter, with $m = 1, \dots, n$. As an example, two vectors $\mathbf{s}_1$ and $\mathbf{s}_j$ are considered now, with $j \neq 1$: if the only changing variable is the axial load, a scatter plot showing the values of $\mathbf{s}_j$ as a function of the values of $\mathbf{s}_1$ should result in points lying on a line (or scattered around a line when dispersion due to identification uncertainty is considered). Conversely, abnormal observations that cannot be explained by the assumed underlying linear model would deviate from the linear trend.
To exploit this idea in an automatic way, all the $n - 1$ couples made by the lowest squared eigenfrequency and one of the other $n - 1$ squared eigenfrequencies are considered separately (in the discussed example where three eigenfrequencies $f_1$, $f_2$ and $f_3$ are considered, two couples can be made: $f_1$-$f_2$ and $f_1$-$f_3$). For every couple, the linear trend is first estimated by only considering a subset of observations, as explained in Section 3.2.1. Once the linear trend is characterized, data are fitted to the line and observations that show a high residual are removed, as explained in Section 3.2.2.
3.2.1. Linear Trend Estimate
First, the linear trend between $\mathbf{s}_1$ and $\mathbf{s}_j$ must be estimated over a short-term period (two weeks of data in the considered example). To estimate the coefficients of the linear trend, it is important to consider that, at this stage, outliers may be present in one or both vectors. Evaluating the coefficients of the linear trend while also considering these outliers can lead to biased results, so the identification of the linear trend is carried out only on a subset of observations. A pre-selection of data is carried out using the Hampel Identifier [28] on each of the two vectors. The Hampel Identifier is a variation of the three-sigma rule of statistics that is robust against outliers. When data contain outliers, even a single out-of-scale observation can cause significant changes of the sample mean and variance. For this reason, the median and the median absolute deviation (MAD) are used to estimate the data mean and standard deviation, respectively.
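Only data of $\mathbf{s}_1$ and $\mathbf{s}_j$ that are less than 1 scaled MAD (see Appendix A) distant from the local median over a moving window of 72 h (green-filled circles in Figure 5) are considered to estimate linear trend coefficients. The vectors that contain the data used to carry out a linear fit are indicated with the symbols $\tilde{\mathbf{s}}_1$ and $\tilde{\mathbf{s}}_j$. The coefficients of the linear regression $a_j$ and $b_j$ are those coming from the least squares solution of the linear problem:

$$\begin{bmatrix} \tilde{\mathbf{s}}_1 & \mathbf{1} \end{bmatrix} \begin{bmatrix} a_j \\ b_j \end{bmatrix} = \tilde{\mathbf{s}}_j \tag{8}$$

where $\mathbf{1}$ is a column vector of the same size as $\tilde{\mathbf{s}}_1$ or $\tilde{\mathbf{s}}_j$, with all elements equal to 1.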
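A minimal sketch of this pre-selection and of the least squares fit is given below. It assumes hourly observations, so that a 72 h window corresponds to 72 samples, and it uses the factor 1.4826 that is customary for the scaled MAD of Gaussian data (assumed here to match the definition in Appendix A). The function names and the synthetic data are hypothetical.

```python
import numpy as np

def hampel_keep(v, window=72, k=1.0):
    """True where a sample lies within k scaled MADs of the local median."""
    v = np.asarray(v, dtype=float)
    half = window // 2
    keep = np.zeros(v.size, dtype=bool)
    for i in range(v.size):
        local = v[max(0, i - half): i + half + 1]   # centered moving window
        med = np.median(local)
        scaled_mad = 1.4826 * np.median(np.abs(local - med))
        keep[i] = abs(v[i] - med) <= k * scaled_mad
    return keep

def linear_trend(s1, sj, window=72, k=1.0):
    """Least squares line sj = a * s1 + b on the pre-selected observations."""
    mask = hampel_keep(s1, window, k) & hampel_keep(sj, window, k)
    design = np.column_stack([s1[mask], np.ones(mask.sum())])
    (a, b), *_ = np.linalg.lstsq(design, sj[mask], rcond=None)
    return float(a), float(b), mask

# Two weeks of hourly squared eigenfrequencies with a daily cycle (synthetic)
rng = np.random.default_rng(0)
hours = np.arange(336)
s1 = 100.0 + 5.0 * np.sin(2.0 * np.pi * hours / 24.0)
s1 = s1 + 0.05 * rng.standard_normal(hours.size)
sj = 4.0 * s1 + 30.0 + 0.2 * rng.standard_normal(hours.size)
sj[50] += 40.0    # wrong identification of the higher mode
s1[200] -= 20.0   # wrong identification of the lowest mode

a, b, mask = linear_trend(s1, sj)
```

The pre-selection is deliberately strict: it may drop valid observations, which is acceptable because its only purpose is to obtain unbiased trend coefficients.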
In the example, the two couples of squared eigenfrequencies resulting from the first stage of the data-cleansing strategy are reported in Figure 5 (all the black and green-filled circles). The observations corresponding to green-filled circles were those considered to calculate $a_2$ and $b_2$ (i.e., the coefficients of the black-dashed line in Figure 5a) and $a_3$ and $b_3$ (i.e., the coefficients of the black-dashed line in Figure 5b).
3.2.2. Discard with Residual
Once the coefficients $a_j$ and $b_j$ are known, vectors $\mathbf{s}_1$ and $\mathbf{s}_j$ are considered at this stage to finally carry out outlier removal based on the residuals $\mathbf{r}_j$ of the linear fitting:

$$\mathbf{r}_j = \mathbf{s}_j - \left( a_j \mathbf{s}_1 + b_j \mathbf{1} \right) \tag{9}$$

The median and the scaled MAD of the elements of vector $\mathbf{r}_j$ are calculated. Every element that is more than 2 scaled MAD away from the median is considered an outlier and the corresponding row of $\mathbf{X}$ is removed. This check can be carried out on a moving window of 72 h, as in the previous step, but it is also possible to use a broader window. In the considered case, as an example, the outliers identified with a two-week window are those reported with red-filled circles in Figure 6. Similarly to what was discussed in Section 3.1, when the check on either $\mathbf{r}_2$ or $\mathbf{r}_3$ marks an observation as an outlier, the corresponding three eigenfrequency estimates $f_1$, $f_2$ and $f_3$ are discarded. For this reason, when an outlier was detected, e.g., because of the check on $\mathbf{r}_2$, the corresponding observation is marked with a red-filled circle in Figure 6a,b.
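The residual-based discard can be sketched as follows, using a single window covering the whole period. The trend coefficients are taken as known here (in practice they come from the linear trend estimate), and the synthetic eigenfrequency matrix, the function name and the factor 1.4826 for the scaled MAD are assumptions made for this example.

```python
import numpy as np

def residual_outliers(s1, sj, a, b, k=2.0):
    """True where the residual is more than k scaled MADs from its median."""
    r = sj - (a * s1 + b)
    med = np.median(r)
    scaled_mad = 1.4826 * np.median(np.abs(r - med))
    return np.abs(r - med) > k * scaled_mad

# Synthetic matrix of three eigenfrequencies over 300 observations
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 300)                 # load-driven variation
s1 = 400.0 + 20.0 * u
s2 = 4.0 * s1 + 50.0 + 0.5 * rng.standard_normal(300)
s3 = 9.0 * s1 + 300.0 + 1.0 * rng.standard_normal(300)
X = np.sqrt(np.column_stack([s1, s2, s3]))
X[10, 1] *= 1.05                                # one wrong identification

S = X ** 2                                      # squared eigenfrequencies
flag_2 = residual_outliers(S[:, 0], S[:, 1], a=4.0, b=50.0)
flag_3 = residual_outliers(S[:, 0], S[:, 2], a=9.0, b=300.0)

# A row is discarded when either couple marks it as an outlier
flagged = flag_2 | flag_3
X_clean = X[~flagged]
```

With a 2 scaled MAD limit, a small share of valid observations is also discarded; in continuous monitoring this loss is tolerable because many observations are available.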
In conclusion, the total effect of the proposed strategy on the previous example can be observed in Figure 7, where the eigenfrequencies selected and those discarded by the automatic data-cleansing procedure are indicated with blue circles and red-filled circles, respectively. As shown in Section 5, the adoption of the data-cleansing strategy allowed the automatic application of the algorithm introduced in Section 2 to successfully detect real damage in an uncontrolled environment. Before discussing the results, the test case is presented in the next section.
4. Experimental Set-Up
The experimental set-up (see
Figure 8a) comprised two nominally identical full-scale aluminum tie-rods with a free length of 4 m and a cross-section of
. The set-up was located in the laboratories of Politecnico di Milano, specifically in a room where numerous activities (mainly human activities and those related to laboratory testing machines) take place throughout the day. Furthermore, the temperature is intentionally not controlled, so environmental and operational variations are those of an uncontrolled environment. More specifically, throughout the monitoring period, the minimum and maximum observed laboratory temperatures were approximately 6 and 29 °C, respectively, with daily thermal excursions ranging from 3 to 8 °C.
The tie-rods were equipped with sensors to replicate a long-term structural health monitoring system for research purposes (e.g., [24,38]). More specifically, each tie-rod was equipped with four general-purpose industrial piezo-electric accelerometers (PCB 603C01 model with a sensitivity of 10.2 mV/(m/s²) and full scale of ±490 m/s²). The choice for general-purpose industrial accelerometers comes from the decision to not adopt high-end sensors, which are typical of laboratory environments and not representative of real applications. Furthermore, strain gauges comprising a calibrated full Wheatstone bridge were used to measure the axial load, and the laboratory temperature was measured with a thermocouple close to the tie-rods.
Data were acquired with NI9234 modules with anti-aliasing filter on board at a sampling frequency of 512 Hz, obtaining a bandwidth of approximately 200 Hz that included the range of frequency significantly excited by the operative environment. After some preliminary tests, it was observed that under normal conditions, the operating environment usually provided a broadband excitation that significantly decreased above 200 Hz.
The data presented below were acquired with an accelerometer placed at a fixed position $x/L$ ($L$ is the beam free length and $x$ is the longitudinal distance from the constraint according to the scheme in Figure 8b). In order to adopt the proposed damage-detection strategy using only one sensor, the accelerometer position was selected so as to avoid the vibration nodes of the first six bending vibration modes in the vertical plane. Figure 9 shows the analytical mode shapes of the six considered modes for a pinned–pinned beam subject to axial load (the simple pinned–pinned model was used here to show the concept behind the accelerometer position selection). The coordinate $x/L$ is reported on the x-axis, and every $m$-th mode shape $\phi_m$, with $m = 1, \dots, 6$ and normalized to one, is presented on the y-axis. The position of the accelerometer is indicated by a black solid vertical line: it is possible to observe that such a position potentially allowed for the identification of the considered vibration modes.
As a general rule of thumb, the position of the accelerometer must be chosen after considering how many vibration modes are identifiable given the operational environment. When the number of modes is known, the accelerometer must be placed as far as possible from the constraints in a position that is not a vibration node for the considered modes. This choice is important to correctly identify the eigenfrequencies from structural dynamic responses. In practical cases, even though the mode shapes of a real tensioned beam are not exactly coincident with those of simplified analytical models, the positions of modal nodes do not significantly differ and can thus be avoided. Alternatively, for a more accurate selection of the position of the sensor, one can consider carrying out a preliminary experimental campaign at the beginning of the monitoring aimed at identifying the mode shapes. In this case, OMA can be carried out by using an adequate number of accelerometers; once the vibration modes are reconstructed, the single sensor used for long-term monitoring can be placed while avoiding modal nodes. Moreover, when multiple options are available, the choice must fall on the position that can provide the best signal-to-noise ratio (e.g., far from the constraints, since the eigenvector components are generally low close to them). A better quality of vibration data in terms of the signal-to-noise ratio increases the reliability of the automatic identification process, which has an impact on the proposed damage-detection algorithm performance.
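As an illustration of this rule of thumb, the sketch below evaluates the analytical mode shapes of a pinned–pinned beam, sin(nπx/L), at candidate positions and selects the position whose smallest modal amplitude over the first six modes is the largest. The selection criterion, the grid resolution and the function names are assumptions made for illustration only and are not the procedure adopted by the authors; for a real tensioned beam, the node positions differ slightly from those of this simplified model.

```python
import math


def mode_amplitudes(xi, n_modes=6):
    """Absolute mode-shape values |sin(n*pi*xi)| for n = 1..n_modes.

    xi is the normalized position x/L along a pinned-pinned beam
    (simplified analytical model, assumed here for illustration).
    """
    return [abs(math.sin(n * math.pi * xi)) for n in range(1, n_modes + 1)]


def best_position(n_modes=6, n_grid=1000):
    """Grid search for the xi that maximizes the smallest modal amplitude.

    Hypothetical criterion: a position is good when none of the
    considered modes has a node (amplitude close to zero) there.
    """
    candidates = [i / n_grid for i in range(1, n_grid)]
    return max(candidates, key=lambda xi: min(mode_amplitudes(xi, n_modes)))


# Mid-span is a node of every even mode, so it is a poor single-sensor position.
midspan = mode_amplitudes(0.5)

# Candidate position suggested by the simplified model.
xi_best = best_position()
worst_amplitude = min(mode_amplitudes(xi_best))
```

A max-min criterion is only one possible choice; it favours positions where all the considered modes are observable, which is the requirement when a single sensor is used.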
For our experiment, vibration data were continuously acquired under the effect of uncontrolled environmental and operational variations for approximately one year, first to define the baseline set and then to monitor the evolution of real damage during a corrosion process, as described in the next sub-section.
The Corrosion Process
A key part of the experiment was the introduction of real damage to the tie-rods, which represents a one-of-a-kind application in the literature of tie-rod damage detection and, more generally, structural health monitoring. A corrosion process was started on one of the two tie-rods. The type of corrosion attack reproduced in the experiment is referred to as “general corrosion”, where the electrochemical reactions between the metal and the chemical to which it is exposed cause a uniform loss of the metal thickness over the entire exposed surface. Although aluminum is a chemically very reactive metal, its behavior is made stable due to the formation of a protective adherent oxide film on the surface. This film is generated in a natural way, and it is immediately reproduced in the presence of oxygen, thus protecting the substrate from further oxidation phenomena. Only when the natural protection provided by the oxide is destroyed under the action of chemical agents and its regeneration is inhibited can corrosion occur in its various forms [
39].
The natural film can be attacked and dissolved by both strongly alkaline and acidic solutions; the most severe attacks have been recorded in the presence of concentrated solutions of sodium hydroxide (NaOH) and hydrofluoric acid (HF) [
39]. In line with this observation, a general corrosion process was induced on a portion of each of the two tie-rods by alternating the direct application of HF (
Figure 10a) and NaOH (
Figure 10b) on the top surface. First, HF was used to dissolve the protective film so that the general corrosion process could start more easily. Then, a highly concentrated solution of NaOH and water was applied to corrode the portion of the tie-rod. The reaction did not self-feed, and once the NaOH stopped corroding, the formation of the protective oxide resumed. Moreover, since the corrosion products of NaOH helped the formation of the oxide, they were brushed away from the surface of the tie-rods, and the procedure was periodically repeated, starting again from the application of HF.
One of the two tie-rods was attacked for the first time on 8 November 2020 at
, thus introducing damage close to the constraints. The extent of the area subject to general corrosion was approximately 5 cm, and the final appearance of the tie-rod on 25 May 2021 is presented in
Figure 10c. On 23 February 2021, the same procedure started on the second tie-rod at
, over a length of 10 cm. The final appearance of the tie-rod on 25 May 2021 can be observed in
Figure 10d.
5. Results
In this section, the results of the experimental campaign are presented. First, damage farther from the constraints is considered, so the results refer to the tie-rod corroded at
. In
Figure 11, the timeline of the experiment is reported. The acquired data can be divided into three sets: baseline, validation and corrosion. Data belonging to the baseline set were those used to build the baseline matrix, which was the reference for the calculation of the damage index DI through the MSD (see
Section 2). The validation set contained data acquired when damage was not present on the tie-rod but that were not included in the baseline matrix; this set was used to check that the damage index did not exceed the threshold level when damage was not present, which would cause false alarms. Finally, the corrosion set contained data referring to the period when the chemical attack was ongoing and damage was progressing.
Four pictures of different states of the tie-rod during the corrosion process with the respective dates are presented in
Figure 12: labels C1, C2, C3 and C4 are used in
Figure 13,
Figure 14 and
Figure 15 to enable direct references to the tie-rod condition.
Moreover, the magnitude of damage was quantified by measuring the reduction in the height of the tie-rod cross section at the center of the corroded area. In
Table 2, the percentage of reduction in the height of the cross section with respect to the initial condition is reported for every damage condition (conditions C5, C6, C7, and C8 are related to the case of damage close to the constraints, as discussed below).
The different temperature conditions corresponding to the different corrosion conditions can be distinguished from the trend of the laboratory temperature over time presented in
Figure 13, where different vertical lines with markers refer to different stages of the corrosion process shown in
Figure 12 (in this case, labels C5, C6, C7 and C8 refer to the case of damage close to the constraints, as described below).
Figure 14 and
Figure 15 present the damage index DI, which was calculated between every observation of the feature vector and the baseline matrix, considering three tie-rod eigenfrequencies, more specifically those of the third, fourth and fifth vibration modes. In the following figures, black crosses indicate DI calculated on baseline data, blue circles indicate DI calculated on the validation set, and red triangles indicate DI calculated during the corrosion process. Different vertical lines with markers refer to different stages of the corrosion process according to
Figure 12. Finally, the horizontal black-dashed line represents the threshold level
for the MSD-based outlier detection (see
Section 2).
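The calculation of the damage index can be sketched as follows. This is a minimal illustration, assuming that DI is the Mahalanobis squared distance between a new feature vector (here, three eigenfrequencies) and the baseline observations; the exact formulation and the definition of the threshold are those given in Section 2. The function names and all numerical values are hypothetical and serve only to make the example runnable.

```python
def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (normalized by n - 1) of the baseline rows."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [row[j] - mu[j] for j in range(p)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    p = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(m[r][k]))
        m[k], m[piv] = m[piv], m[k]
        for r in range(k + 1, p):
            f = m[r][k] / m[k][k]
            for c in range(k, p + 1):
                m[r][c] -= f * m[k][c]
    y = [0.0] * p
    for k in reversed(range(p)):
        s = sum(m[k][c] * y[c] for c in range(k + 1, p))
        y[k] = (m[k][p] - s) / m[k][k]
    return y


def damage_index(x, mu, cov):
    """Mahalanobis squared distance (x - mu)^T cov^-1 (x - mu)."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    return sum(di * yi for di, yi in zip(d, solve(cov, d)))


# Hypothetical baseline: each row holds three eigenfrequencies in Hz.
baseline = [
    [30.0, 52.1, 80.3],
    [30.2, 52.0, 80.6],
    [29.9, 52.4, 80.1],
    [30.1, 52.3, 80.5],
    [30.3, 52.2, 80.2],
]
mu = mean_vector(baseline)
cov = covariance(baseline, mu)

# DI of a new (hypothetical) observation, to be compared with the threshold.
di_new = damage_index([29.5, 51.6, 79.4], mu, cov)
```

Solving the linear system instead of explicitly inverting the covariance matrix is a design choice of this sketch; it gives the same distance and avoids an extra matrix inversion.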
If the automatic procedure does not include any data-cleansing process, different problems arise, as shown in
Figure 14. Due to the presence of a high number of wrong eigenfrequency identifications, the scatter of the baseline data was high. This caused the method to be less sensitive to damage, which was not detected (red triangles are always significantly below the damage threshold). Moreover, no trend related to an evolving state of damage could be assessed if red triangles were considered and compared with baseline (black crosses) or validation data (blue circles).
The results presented in
Figure 15 were obtained with data that were automatically selected by the data-cleansing procedure. Since the baseline set, by definition, only included data acquired when the health state of the tie-rod was known and damage was not present, the data-cleansing process explained in
Section 3.2 was conducted while considering all the observations in the baseline set. For the other two sets (validation and corrosion), the procedure was instead carried out with two weeks of data at a time because the structural properties of the tie-rod could have changed due to the possible onset and evolution of damage.
It is possible to observe that the situation was significantly improved: all the black crosses fall below the threshold, in accordance with the fact that no outlier related to damage was present in the baseline set. Furthermore, the blue circles related to the validation set do not fall above the threshold level, thus not causing false positives. During the corrosion process, the index values represented by red triangles deviated from the range of black crosses and blue circles, showing an increasing trend that can be more easily assessed by looking at a moving average trend reported in green (obtained with a moving average window of duration equal to one day that was shifted every hour).
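The smoothing described above can be sketched as follows, assuming that the DI observations come with timestamps expressed in hours. The function name, the labelling of each window by its end time and the handling of empty windows are illustrative assumptions, not details taken from the authors' implementation.

```python
def moving_average(times_h, values, window_h=24.0, step_h=1.0):
    """Moving average over windows of window_h hours shifted by step_h hours.

    times_h: observation times in hours, sorted in ascending order.
    values:  damage-index values observed at those times.
    Returns a list of (window_end_time, mean_value) pairs; windows that
    contain no observation are skipped.
    """
    out = []
    if not times_h:
        return out
    t_end = times_h[0] + window_h
    last = times_h[-1]
    while t_end <= last + step_h:
        # Observations falling in the half-open window [t_end - window_h, t_end).
        in_window = [v for t, v in zip(times_h, values)
                     if t_end - window_h <= t < t_end]
        if in_window:
            out.append((t_end, sum(in_window) / len(in_window)))
        t_end += step_h
    return out


# Hypothetical hourly DI values: a step change after the first day.
times = list(range(48))
di_values = [1.0] * 24 + [3.0] * 24
smoothed = moving_average(times, di_values)
```

With a one-day window, a step change in DI is fully reflected in the averaged trend only one day after it occurs, so the smoothing trades detection delay for a lower scatter.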
When the moving average first crossed the threshold, the tie-rod was not yet in the condition C2, so the reduction in the height of the cross section was between 2% and 5%. The result is remarkable considering that the damage state C2 was barely visible during a visual inspection of the tie-rod, with reference to the picture in
Figure 12.
The performance of the automatic damage-detection algorithm can be represented through the adoption of a receiver operating characteristic (ROC) curve [
40]. This graphical tool is widely adopted to illustrate the capability of a binary classifier to detect damage as a threshold is varied. An ROC curve is obtained by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold levels. The TPR is the ratio between the number of positives correctly identified as positives (number of red triangles above the threshold) and the total number of positives (total number of red triangles). The FPR is the ratio between the number of false positives (number of blue circles above the threshold) and the number of negatives (total number of blue circles). The ROC curve of a perfect classifier is composed of two straight lines, one from the origin (0,0) to the top left corner (0,1) and one from (0,1) to the top right corner (1,1), while a random classifier is represented by a diagonal from (0,0) to (1,1). The resulting plot can be used to compare the relative performance of different classifiers and to determine whether a classifier performs better than random guessing.
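The construction of the ROC curve from the definitions above can be sketched as follows. The sketch assumes two lists of DI values, one from the corrosion set (positives) and one from the validation set (negatives); the function name and the sample values are hypothetical. Values equal to the threshold are counted as exceeding it, which is an assumption made so that the lowest threshold yields the point (1,1).

```python
def roc_points(di_damaged, di_healthy):
    """Return the ROC curve as a list of (FPR, TPR) points.

    di_damaged: DI values observed with damage present (positives).
    di_healthy: DI values observed without damage (negatives).
    Every observed DI value is used once as a candidate threshold,
    from the highest to the lowest.
    """
    thresholds = sorted(set(di_damaged) | set(di_healthy), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every observation
    for thr in thresholds:
        # Fraction of positives flagged as damaged at this threshold.
        tpr = sum(v >= thr for v in di_damaged) / len(di_damaged)
        # Fraction of negatives wrongly flagged as damaged (false alarms).
        fpr = sum(v >= thr for v in di_healthy) / len(di_healthy)
        points.append((fpr, tpr))
    return points


# Hypothetical, perfectly separated sets: the curve passes through (0, 1).
curve = roc_points([5.0, 6.0], [1.0, 2.0])
```

For perfectly separated sets, as in the usage line, the points follow the two straight segments of the ideal classifier; swapping the two lists gives a curve that lies below the diagonal.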
Figure 16 shows a comparison of the ROC curves with and without the automatic data-cleansing procedure, as indicated by black-solid and black-dashed lines, respectively.
Figure 16a was derived from DI calculated every hour, while
Figure 16b was derived from data obtained with the moving average. It is possible to observe that the data-cleansing algorithm was fundamental for the strategy to be automatically adopted for damage detection (compare the black-dashed line with the black-solid line in
Figure 16a,b). Indeed, the black-dashed lines indicate that the damage-detection algorithm’s performance was worse than that of a random classifier; conversely, the black-solid lines indicate behavior that was very close to that of a perfect classifier.
The same conclusions can be drawn for the other tie-rod experiment that considered damage close to the constraints (damage at
). In this case, the timeline of the experiment is reported in
Figure 17, again adopting the same labels previously used to identify the different datasets.
As in the previous case, different states of the corrosion process are presented in
Figure 18. The labels adopted in
Figure 19, where DI is presented for the second tie-rod, are the same as those reported in
Figure 18. The severity of the damage in the different conditions is quantified by the percentage reduction in cross-section height reported in
Table 2, while the temperature conditions are shown in
Figure 13. The damage index DI was evaluated while considering three tie-rod eigenfrequencies, more specifically those of the fourth, fifth and sixth vibration modes.
Also in this case, the strategy could successfully detect damage, and the results were remarkable considering the extent and severity of the damage in a location close to the constraints where the index is less sensitive to damage, as shown in [
24]. Moreover, it is worth noticing that, in this case, the baseline set was shorter than that of the case with damage farther from the constraints (compare timelines reported in
Figure 11 and
Figure 17) because the corrosion attack started earlier and the tie-rod was used for another experimental test between 1 January 2020 and 22 April 2020. Since the performance of MSD-based damage detection improves when a large-enough baseline set is adopted to capture a full range of environmental conditions [
10,
25], the results presented in
Figure 19 could improve if a larger baseline set were adopted. Finally, the ROC curves for this case are presented in
Figure 20: again, it is possible to observe how the performance of the damage-detection algorithm was improved by the adoption of the data-cleansing procedure.