  • Proceeding Paper
  • Open Access

22 September 2021

A Generative Adversarial Network Based Autoencoder for Structural Health Monitoring †

1 Dipartimento di Ingegneria Civile ed Ambientale, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy
2 Laboratoire MSSMat UMR 8579, Université Paris Saclay—CNRS—CentraleSupélec, 3 rue Joliot Curie, 91190 Gif-sur-Yvette, France
3 Dipartimento di Matematica, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy
* Author to whom correspondence should be addressed.

Abstract

Civil structures, infrastructures and lifelines are constantly threatened by natural hazards and climate change. Structural Health Monitoring (SHM) has therefore become an active field of research, in view of online structural damage detection and long-term maintenance planning. In this work, we propose a new SHM approach leveraging a deep Generative Adversarial Network (GAN), trained on synthetic time histories representing the structural response of a multistory building to earthquake ground motion, in both damaged and undamaged conditions. In the prediction phase, the GAN generates plausible signals for different damage states based only on undamaged recorded or simulated structural responses, thus without the need to rely upon real recordings linked to damaged conditions.

1. Introduction

Bridges, power generation systems, aircraft, buildings and rotating machinery are only a few instances of the structural and mechanical systems that play an essential role in modern society, even though the majority of them are approaching the end of their original design life [1]. Since their wholesale replacement would be unsustainable from an economic standpoint, alternative strategies for early damage detection have been actively developed, so as to extend the service life of these infrastructures. Furthermore, the advent of novel materials whose long-term behaviour is still not fully understood drives the effort towards effective Structural Health Monitoring (SHM), resulting in savings of human lives and resources [1].
SHM consists of three fundamental steps: (i) measurement, at regular intervals, of the dynamic response of the system; (ii) selection of damage-sensitive features from the acquired data; (iii) statistical analysis of those features to assess the current health state of the structure. To characterize the damage state of a system, the hierarchical method originally proposed in [2] represents the currently adopted standard. It prescribes several consecutive identification phases, to be tackled in order: the existence of the damage, its location, its type, its extent, and the system's prognosis. Damaged states are identified by comparison with a reference condition, assumed to be undamaged. Detecting the damage location relies upon a wider awareness of the structural behaviour and of the way in which it is influenced by damage. This information, along with the knowledge of how the observed features are altered by different kinds of damage, makes it possible to determine the type of damage. The last two phases require an accurate characterization of the damage mechanisms, in order to classify the damage severity and to estimate the Remaining Useful Life (RUL).
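As a minimal illustration of steps (i)-(iii) (not part of the method proposed here), the following sketch extracts a classical damage-sensitive feature, the dominant vibration frequency, and flags damage when it drops relative to an undamaged reference; the synthetic signals and the 5% tolerance are purely hypothetical:

```python
import numpy as np

def dominant_frequency(signal, dt):
    """Damage-sensitive feature: frequency (Hz) of the largest spectral peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=dt)
    return freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC component

def is_damaged(signal, dt, f_ref, tol=0.05):
    """Step (iii): flag damage when the dominant frequency drops by more
    than a tolerance `tol` relative to the undamaged reference f_ref."""
    return (f_ref - dominant_frequency(signal, dt)) / f_ref > tol

# Synthetic check: a 5 Hz "undamaged" response vs. a softened 4 Hz one
dt = 0.01
t = np.arange(0.0, 10.0, dt)
undamaged = np.sin(2 * np.pi * 5.0 * t)
damaged = np.sin(2 * np.pi * 4.0 * t)  # stiffness loss lowers the frequency
f_ref = dominant_frequency(undamaged, dt)
print(is_damaged(undamaged, dt, f_ref), is_damaged(damaged, dt, f_ref))  # False True
```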
All the steps mentioned above rely on continuous data acquisition and processing to obtain information about the current health condition of a system. In the last few years, the concept of the Digital Twin has emerged, combining data assimilation, machine learning and physics-based numerical simulations [1], the latter being essential to fully understand the physics of the structure and of the damage mechanisms. Neural networks [3] are a suitable tool for extracting the dominant features from a set of data, especially generative models such as Generative Adversarial Networks (GANs) [4] and Variational Autoencoders (VAEs) [5].
In this paper, an application of the generative neural network RepGAN, proposed in [6], is presented in the context of SHM. Section 2 provides an overview of existing work. In Section 3, the application of RepGAN to Structural Health Monitoring is presented. In Section 4, extensive numerical results are illustrated, while Section 5 gathers some concluding remarks.

3. Methods

With the purpose of learning a semantically meaningful and disentangled representation of the SHM time histories, in this study we adopted the architecture called RepGAN, originally proposed in [6]. RepGAN is based on an encoder-decoder structure (both represented by deep CNNs made of stacked 1D convolutional blocks), with a latent space $Z = (C, S, N)$. $C \in \{0, 1\}^{d_C}$ is a categorical variable representing the damage class(es), with $C \sim p_C$, which is generally chosen as a categorical distribution over $d_C$ classes, i.e., $p_C = \mathrm{Cat}(d_C)$. $S \in \mathbb{R}^{d_S}$ is a continuous variable of dimension $d_S$, with $S \sim p_S$, generally $p_S = \mathcal{N}(0, I)$ or the uniform distribution $p_S = \mathcal{U}(-1, 1)$. Finally, $N \in \mathbb{R}^{d_N}$ is a random noise vector of $d_N$ independent components, with $N \sim p_N$, generally $p_N = \mathcal{N}(0, I)$. RepGAN adopts the conceptual frameworks of VAEs and InfoGAN, combining the learning of the two representations $x \to \hat{z} \to \hat{x}$ and $z \to \hat{x} \to \hat{z}$, respectively. The $x \to \hat{z} \to \hat{x}$ scheme must learn to map multiple data instances $x^{(i)}$ into their images $\hat{z}^{(i)} = F_{\theta_X}(x^{(i)})$ in a latent manifold (via the encoder $F_{\theta_X}$) and back into a distinct instance in data space $\hat{x}^{(i)} = G_{\theta_Z} \circ F_{\theta_X}(x^{(i)})$ (via the decoder $G_{\theta_Z}$), providing satisfactory reconstruction. The $z \to \hat{x} \to \hat{z}$ scheme maps multiple latent instances into the same data representation, in order to guarantee good generation and clustering performance. Combining the two surjective mappings, in RepGAN the two learning tasks $x \to \hat{z} \to \hat{x}$ and $z \to \hat{x} \to \hat{z}$ are performed together, with shared parameters, so as to obtain a bijective mapping $x \leftrightarrow z$. In practice, the training of $z \to \hat{x} \to \hat{z}$ is iterated five times more often than that of $x \to \hat{z} \to \hat{x}$. This ability to learn a bidirectional mapping between the input space and the latent space is achieved through a symmetric adversarial process. The Empirical Loss function can be written as:
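The encoder half of this structure can be sketched as follows. This is an illustrative, numpy-only mock-up, not the authors' Keras implementation: the kernel sizes, strides, channel counts, latent dimensions $d_C$, $d_S$, $d_N$, and the crude flatten/projection at the end are all assumptions made for the example.

```python
import numpy as np

def conv1d_relu(x, w, stride=2):
    """Valid strided 1D convolution + ReLU; x: (length, c_in), w: (k, c_in, c_out)."""
    k, c_in, c_out = w.shape
    n_out = (x.shape[0] - k) // stride + 1
    out = np.empty((n_out, c_out))
    for i in range(n_out):
        out[i] = np.tensordot(x[i * stride:i * stride + k], w, axes=([0, 1], [0, 1]))
    return np.maximum(out, 0.0)

def encoder(x, weights, d_C=2, d_S=4, d_N=8):
    """Map a time history x to (c, s, n): class probabilities over d_C damage
    classes, a continuous code s, and residual noise n, as in Z = (C, S, N)."""
    h = x[:, None]                         # single input channel
    for w in weights:
        h = conv1d_relu(h, w)
    h = h.ravel()[:d_C + d_S + d_N]        # crude flatten/projection (assumed)
    c_logits, s, n = h[:d_C], h[d_C:d_C + d_S], h[d_C + d_S:]
    c = np.exp(c_logits - c_logits.max())  # softmax over damage classes
    return c / c.sum(), s, n

rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 1, 4)) * 0.1,   # two stacked conv blocks
           rng.normal(size=(8, 4, 4)) * 0.1]
c, s, n = encoder(rng.normal(size=2048), weights)
print(c.shape, s.shape, n.shape)  # (2,) (4,) (8,)
```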
$$\mathcal{L}_S = D_{JS}\!\left(p_{\hat{X}|(C,S,N)} \,\|\, p_X\right) + D_{JS}\!\left(q_{\hat{C}|X} \,\|\, p_C\right) + D_{JS}\!\left(q_{\hat{S}|X} \,\|\, p_S\right) + D_{JS}\!\left(q_{\hat{N}|X} \,\|\, p_N\right) - \mathbb{E}_{p_C} \mathbb{E}_{p_{\hat{X}|C}}\!\left[\ln q_{\hat{C}|X}\right] - \mathbb{E}_{p_S} \mathbb{E}_{p_{\hat{X}|S}}\!\left[\ln q_{\hat{S}|X}\right] - \mathbb{E}_{p_X} \mathbb{E}_{q_{(C,S,N)|X}}\!\left[\ln p_{X|(C,S,N)}\right]$$
The terms:
  • $-\mathbb{E}_{p_C} \mathbb{E}_{p_{\hat{X}|C}} [\ln q_{\hat{C}|X}]$, minimizing the conditional entropy $S(C|X)$;
  • $-\mathbb{E}_{p_S} \mathbb{E}_{p_{\hat{X}|S}} [\ln q_{\hat{S}|X}]$, minimizing the conditional entropy $S(S|X)$;
are introduced in order to constrain a deterministic and injective encoding mapping (see Appendix B). On the other hand, the term
  • $-\mathbb{E}_{p_X} \mathbb{E}_{q_{(C,S,N)|X}} [\ln p_{X|(C,S,N)}]$
penalizes the learning scheme so as to minimize the conditional entropy $S(X|(C,S,N))$, i.e., to grant a good reconstruction.
Following the original RepGAN formulation:
  • $\mathbb{E}_{p_X} \mathbb{E}_{q_{(C,S,N)|X}} [\ln p_{X|(C,S,N)}]$ is enforced by penalizing the $L_1$-norm $|X - G_{\theta_Z} \circ F_{\theta_X}(X)|$;
  • $\mathbb{E}_{p_S} \mathbb{E}_{p_{\hat{X}|S}} [\ln q_{\hat{S}|X}]$ corresponds to the InfoGAN $L_I$ penalty, and it is maximized via the reparametrization trick (structuring the $S$ branch of the encoder-decoder structure as a VAE, see [5]).
Finally, $\mathbb{E}_{p_C} \mathbb{E}_{p_{\hat{X}|C}} [\ln q_{\hat{C}|X}]$ is maximized in a supervised way, considering the actual class of the labeled signals $x^{(i)}$: $x_d^{(i)}$ corresponding to a damaged structure and $x_u^{(i)}$ to an undamaged one, respectively. RepGAN thus provides an informative and disentangled latent space, with the damage class encoded in $C$. The most significant aspect of the approach is its efficiency in generating reasonable signals for different damage states on the sole basis of undamaged recorded or simulated structural responses. Both the generators $F_{\theta_X}$, $G_{\theta_Z}$ and the discriminators $D_{\omega_X}$, $D_{\omega_C}$, $D_{\omega_S}$ and $D_{\omega_N}$ are parametrized via 1D CNNs (including strided 1D CNNs), following [8]. Our RepGAN model has been designed using the Keras API and trained on an Nvidia Tesla K40 GPU (on the supercomputer Ruche, the cluster of the Mésocentre Moulon of Paris-Saclay University).
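The three penalties above can be sketched as stand-alone functions. This is an illustrative numpy mock-up (function names and shapes are assumptions, and the adversarial JS terms are omitted), not the actual RepGAN training code:

```python
import numpy as np

def l1_reconstruction(x, x_hat):
    """L1 penalty |X - G(F(X))| enforcing E ln p_{X|(C,S,N)}."""
    return float(np.mean(np.abs(x - x_hat)))

def class_cross_entropy(c_true, c_pred, eps=1e-12):
    """Supervised term maximizing E ln q_{C|X} on labeled damaged/undamaged signals."""
    return float(-np.mean(np.sum(c_true * np.log(c_pred + eps), axis=-1)))

def reparametrize(mu, log_var, rng):
    """VAE-style trick on the S branch: s = mu + sigma * eps, eps ~ N(0, I)."""
    return mu + np.exp(0.5 * log_var) * rng.normal(size=np.shape(mu))

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 2048))                   # a toy batch of time histories
onehot = np.array([[1.0, 0.0], [0.0, 1.0]])      # undamaged / damaged labels
z_s = reparametrize(np.zeros(4), np.zeros(4), rng)
print(l1_reconstruction(x, x))                   # 0.0 for a perfect reconstruction
print(class_cross_entropy(onehot, onehot) < 1e-9)  # ~0 for perfect classification
```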

4. Results and Discussion

In the following, a case study is considered in order to prove the ability of the new architecture to achieve the three fundamental tasks of semantic generation, clustering and reconstruction. The reference example is a shear building subject to earthquake ground motions whose signals are taken from the STEAD seismic database [14]. STEAD [14] is a high-quality, large-scale, global data set of local earthquake and non-earthquake signals recorded by seismic instruments. In this work, local earthquake waveforms (recorded within 350 km of the earthquakes) have been considered. The seismic data consist of three waveforms of 60 s duration, recorded in the east–west, north–south, and vertical directions, respectively. The structure is composed of 39 storeys. The mass and the stiffness of each floor, in undamaged conditions, are $m = 625 \times 10^3$ kg and $k = 1 \times 10^9$ kN/m, respectively. Damage is simulated through the degradation of stiffness; in the present case, the stiffness reduction has been set equal to 50% of the above-mentioned value. The structural response of the system is evaluated considering one degree of freedom (dof) per floor. To take damping effects into account, a Rayleigh damping model has been adopted.
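A minimal sketch of the structural model described above: the mass and tridiagonal shear-type stiffness matrices are assembled with one dof per floor, and the 50% stiffness degradation is applied to the first storey (an assumption made for illustration, since the degraded storeys are not specified here). The frequency drop induced by damage is unit-independent, so the stiffness is entered as a plain number:

```python
import numpy as np

def shear_building_matrices(n, m, storey_stiffness):
    """Lumped-mass shear building with one dof per floor: M = m*I and the
    standard tridiagonal shear-type stiffness matrix K."""
    k = np.asarray(storey_stiffness, dtype=float)   # k_1 ... k_n, ground up
    M = m * np.eye(n)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = k[i] + (k[i + 1] if i < n - 1 else 0.0)
        if i > 0:
            K[i, i - 1] = K[i - 1, i] = -k[i]
    # A Rayleigh damping matrix would be C = a*M + b*K, with a, b
    # fitted to target modal damping ratios.
    return M, K

def fundamental_frequency(M, K):
    """Lowest natural frequency from the generalized eigenproblem K v = w^2 M v."""
    w2 = np.sort(np.linalg.eigvals(np.linalg.solve(M, K)).real)
    return float(np.sqrt(w2[0]) / (2.0 * np.pi))

n, m, k = 39, 625e3, 1e9
M, K_u = shear_building_matrices(n, m, [k] * n)                    # undamaged
M, K_d = shear_building_matrices(n, m, [0.5 * k] + [k] * (n - 1))  # damaged
f_u, f_d = fundamental_frequency(M, K_u), fundamental_frequency(M, K_d)
print(f_d < f_u)  # True: stiffness degradation lowers the natural frequencies
```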
The following results have been obtained considering 100 signals in both undamaged and damaged conditions, for a total of 200 samples, with separate training and validation data sets. Each signal is composed of 2048 time steps with $dt = 0.04$ s. The training process has been performed over 2000 epochs. The reconstruction capability of the proposed network has been evaluated through the Goodness-of-Fit (GoF) criteria [15], where both the fit in envelope (EG) and the fit in phase (PG) are measured. An example is shown in Figure 1. The values of 9.17 and 9.69 for EG and PG, respectively, testify to the excellent reconstruction quality.
Figure 1. Time–Frequency Goodness-of-Fit criterion: the black line represents the original time history $x^{(i)}$, while the red time history depicts the result of the RepGAN reconstruction $G_Z \circ F_X(x^{(i)})$. The GoF is evaluated on a scale from 0 to 10: the higher the score, the better the reconstruction. The panels report the Frequency Envelope Goodness (FEG), Time–Frequency Envelope Goodness (EG), Time Envelope Goodness (TEG), Frequency Phase Goodness (FPG), Time–Frequency Phase Goodness (PG) and Time Phase Goodness (TPG).
The capability of reproducing signals for different damage scenarios can be appreciated from Figure 2, which presents the original structural response (black) and the corresponding generated one (orange) in both undamaged (left panel) and damaged (right panel) conditions. Regarding the classification capability, the classification report and the confusion matrix in Figure 3 highlight that the model is able to correctly assign the damage class to the considered time histories.
Figure 2. Examples of reconstructed signals for undamaged (left) and damaged (right) time histories. The black lines represent the original time histories $x_u^{(i)}$ and $x_d^{(i)}$, respectively; the orange time histories represent the RepGAN reconstructions $G_Z \circ F_X(x_u^{(i)})$ and $G_Z \circ F_X(x_d^{(i)})$, respectively. The proposed examples represent the normalized displacement of the 1st floor of the building under consideration.
Figure 3. Evaluation of the classification ability of the model. The left panel reports the precision, recall, F1-score and accuracy values. A precision of 1.0 for a class C means that every item labelled as belonging to class C does indeed belong to class C, whereas a recall of 1.0 means that every item from class C was labelled as belonging to class C. The F1-score is the harmonic mean of precision and recall, and the accuracy is the proportion of correct predictions among the total number of cases examined. The right panel shows the confusion matrix, which makes it possible to visualize the performance of the model: each row of the matrix represents the instances in the actual class, while each column represents the instances in the predicted class.
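The quantities reported in Figure 3 can be reproduced from first principles. The sketch below uses toy labels (0 = undamaged, 1 = damaged), not the paper's data:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=2):
    """Rows: actual class; columns: predicted class (as in Figure 3)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def precision_recall_f1(cm, c):
    """Per-class precision, recall and F1-score from a confusion matrix."""
    tp = cm[c, c]
    precision = tp / cm[:, c].sum()   # of all predicted c, how many are truly c
    recall = tp / cm[c, :].sum()      # of all true c, how many were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: 0 = undamaged, 1 = damaged
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 1, 1, 1, 1]
cm = confusion_matrix(y_true, y_pred)
print(cm)
print(precision_recall_f1(cm, 1))
```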

5. Conclusions

In this paper, we introduce an SHM method based on a deep Generative Adversarial Network. Trained on synthetic time histories that represent the structural response of a multistory building in both damaged and undamaged conditions, the new model achieves high classification accuracy (Figure 3) and satisfactory reconstruction quality (Figures 1 and 2), resulting in a good bidirectional mapping between the input space and the latent space. The major innovation of the proposed method, however, is its ability to generate reasonable signals for different damage states based only on undamaged recorded or simulated structural responses. As a consequence, real recordings linked to damaged conditions are not required. In future work, we intend to extend the approach to real-time data, and to consider a dataset consisting of a far larger number of time histories.

Author Contributions

Conceptualization, G.C., L.R., F.G., S.M. and A.C.; methodology, G.C. and F.G.; software, G.C., L.R. and F.G.; validation, G.C. and F.G.; formal analysis, G.C. and F.G.; investigation, G.C. and F.G.; resources, F.G.; data curation, G.C., L.R. and F.G.; writing–original draft preparation, G.C., L.R., M.T., F.G., S.M., A.M. and A.C.; writing–review and editing, G.C. and F.G.; visualization, G.C. and F.G.; supervision, F.G., S.M. and A.C.; project administration, F.G. and A.C.; Funding acquisition, G.C., F.G. and A.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

All data generated during the study are available from the corresponding author upon reasonable request.

Acknowledgments

The training and testing of the neural network have been performed exploiting the supercomputer resources of the Mésocentre Moulon (http://mesocentre.centralesupelec.fr, last accessed 14 February 2022), the cluster of CentraleSupélec and ENS Paris-Saclay, hosted within Paris-Saclay University and funded by the Contrat Plan État Région (CPER). This work has been developed thanks to the scholarship “Tesi all’estero—a.y. 2020/2021—second call” funded by Politecnico di Milano.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Shannon’s Entropy

  • Shannon’s entropy for a probability density function p X :
    $S(X) = S[p_X] = \mathbb{E}_{X \sim p_X}\!\left[\ln \frac{1}{p_X}\right] = -\mathbb{E}_{X \sim p_X}[\ln p_X] \ge 0$
  • Conditional Shannon’s entropy for X and Z:
    $S(X|Z) = \mathbb{E}_{Z \sim p_Z}\big[S[p_{X|Z}]\big] = \mathbb{E}_{(X,Z) \sim p_{XZ}}\!\left[\ln \frac{1}{p_{X|Z}}\right]$
    $S(X, Z) = S(Z|X) + S(X) = S(X|Z) + S(Z)$
  • Cross-entropy:
    $S(p_{XZ} \,\|\, q_{XZ}) = \mathbb{E}_{(X,Z) \sim p_{XZ}}\!\left[\ln \frac{1}{q_{XZ}}\right] = \mathbb{E}_{X \sim p_X} \mathbb{E}_{Z \sim p_{Z|X}}\!\left[\ln \frac{1}{q_{XZ}}\right]$
  • Given a data set of independent and identically distributed (i.i.d.) samples $\mathcal{S} = \{x^{(i)}\}_{i=1}^{N}$, the true yet unknown probability $p_X$ of extracting an instance $x^{(i)}$ can be approximated by the likelihood $p_{\theta_X}\big(\{x^{(i)}\}_{i=1}^{N}\big)$, whose entropy is
    $S(p_{\theta_X}) = -\ln p_{\theta_X}\big(\{x^{(i)}\}_{i=1}^{N}\big) = -\sum_{i=1}^{N} \ln p_{\theta_X}\big(x^{(i)}\big)$
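These identities can be checked numerically on a small discrete joint distribution; the joint $p_{XZ}$ below is arbitrary and purely illustrative:

```python
import numpy as np

def entropy(p):
    """Shannon entropy S = -sum p ln p (natural log, as in the text)."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# A small discrete joint distribution p_XZ over 2 x 3 outcomes
p_xz = np.array([[0.10, 0.25, 0.15],
                 [0.20, 0.05, 0.25]])
p_x, p_z = p_xz.sum(axis=1), p_xz.sum(axis=0)

# Conditional entropy S(X|Z) computed directly from p_{X|Z} = p_XZ / p_Z
S_x_given_z = float(-np.sum(p_xz * np.log(p_xz / p_z)))
print(np.isclose(entropy(p_xz), S_x_given_z + entropy(p_z)))  # chain rule holds
print(entropy(p_x) - S_x_given_z >= 0)  # I(X,Z) = S(X) - S(X|Z) >= 0
```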

Appendix B. Kullback–Leibler Divergence

  • Kullback–Leibler divergence (non-symmetric):
    $D_{KL}(p_{XZ} \,\|\, q_{XZ}) = \mathbb{E}_{(X,Z) \sim p_{XZ}}\!\left[\ln \frac{p_{XZ}}{q_{XZ}}\right] = -S(p_{XZ}) + S(p_{XZ} \,\|\, q_{XZ}) \le S(p_{XZ} \,\|\, q_{XZ})$
    $D_{KL}(p_{XZ} \,\|\, q_{XZ}) + S(X) = -S(Z|X) + S(p_{XZ} \,\|\, q_{XZ}) = -\mathbb{E}_{X \sim p_X}\big[S[p_{Z|X}]\big] + S(p_{XZ} \,\|\, q_{XZ}) \le S(p_{XZ} \,\|\, q_{XZ})$
  • Mutual Information between $X$ and $Z$:
    $I(X, Z) = S(X) - S(X|Z) \ge 0$
    If $p_{X|Z} = p_X$ (i.e., $(X, Z)$ are independent), then $I(X, Z) = 0$. If $p_{Z|X} = \delta(Z - f(X))$ with $f$ deterministic (and injective), then $I(X, Z) = \max I(X, Z) = S(X)$.
  • $S(Z|X) = -\mathbb{E}_{Z \sim p_Z} \mathbb{E}_{X \sim p_{X|Z}}\!\left[\ln p_{Z|X}\right] = -\mathbb{E}_{X \sim p_X} \mathbb{E}_{Z \sim p_{Z|X}}\!\left[\ln \frac{p_{Z|X}}{q_{Z|X}}\right] - \mathbb{E}_{Z \sim p_Z} \mathbb{E}_{X \sim p_{X|Z}}\!\left[\ln q_{Z|X}\right] = -\mathbb{E}_{X \sim p_X}\!\left[D_{KL}\big(p_{Z|X} \,\|\, q_{Z|X}\big)\right] - \mathbb{E}_{Z \sim p_Z} \mathbb{E}_{X \sim p_{X|Z}}\!\left[\ln q_{Z|X}\right] \le -\mathbb{E}_{Z \sim p_Z} \mathbb{E}_{X \sim p_{X|Z}}\!\left[\ln q_{Z|X}\right]$
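The last chain implies the variational bound $S(Z|X) \le -\mathbb{E}[\ln q_{Z|X}]$ for any approximate posterior $q_{Z|X}$, since the KL gap is non-negative. A numeric check on an arbitrary discrete joint (illustrative values):

```python
import numpy as np

# An arbitrary discrete joint p_XZ over 2 x 3 outcomes (illustrative values)
p_xz = np.array([[0.10, 0.25, 0.15],
                 [0.20, 0.05, 0.25]])
p_x = p_xz.sum(axis=1, keepdims=True)
p_z_given_x = p_xz / p_x                      # true posterior p_{Z|X}

# Conditional entropy S(Z|X) = -E_{p_XZ} ln p_{Z|X}
S_z_given_x = float(-np.sum(p_xz * np.log(p_z_given_x)))

rng = np.random.default_rng(0)
q = rng.random((2, 3))
q /= q.sum(axis=1, keepdims=True)             # an arbitrary (wrong) q_{Z|X}
cross = float(-np.sum(p_xz * np.log(q)))      # -E_{p_XZ} ln q_{Z|X}
print(S_z_given_x <= cross)                   # True: the KL gap is non-negative
```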

Appendix C. Generative Adversarial Networks (GAN)

  • Given $X$ belonging to the probabilistic space $(\Omega_X, \mathcal{E}_X, \mathrm{Pr}_X)$, with class $C \in \{d, g\}$ (“d” corresponding to data and “g” to generated), and a discriminator $D : \Omega_X \to [0, 1]$ acting as an expert/critic:
    - $P(C = d) = \alpha$; $P(C = g) = 1 - \alpha$
    - $P(C = d \mid x^{(i)}) = D(x^{(i)})$
    - $P(C = g \mid G(z^{(i)})) = 1 - D(G(z^{(i)}))$
    The conditional entropy of the class then reads:
    $S(C|X) = -\mathbb{E}_{X \sim p_X} \mathbb{E}_{C \sim p_{C|X}}\!\left[\ln p_{C|X}\right] = -\mathbb{E}_{C \sim p_C} \mathbb{E}_{X \sim p_{X|C}}\!\left[\ln p_{C|X}\right]$
    $S(C|X) = -\alpha\, \mathbb{E}_{X \sim p_{X|C=d}}\!\left[\ln p(C = d \mid X)\right] - (1 - \alpha)\, \mathbb{E}_{X \sim p_{X|C=g}}\!\left[\ln p(C = g \mid X)\right]$
    $S(C|X) = -\alpha\, \mathbb{E}_{X \sim p_X}\!\left[\ln D(X)\right] - (1 - \alpha)\, \mathbb{E}_{Z \sim p_Z}\!\left[\ln \big(1 - D(G(Z))\big)\right]$
    For tuneable conditional probability distributions $D_\omega$:
    $\max I(X, C) \le S(C) + \max\,[-S(C|X)] = S(C) - \min S(C|X)$
    $\max I(X, C) \le S(C) + \min_G \max_D \big\{ \alpha\, \mathbb{E}_{X \sim p_X}\!\left[\ln D(X)\right] + (1 - \alpha)\, \mathbb{E}_{Z \sim p_Z}\!\left[\ln \big(1 - D(G(Z))\big)\right] \big\}$
    Thus, the min-max game $\min_G \max_D\,[-S(C|X)]$ bounds the Mutual Information between $C$ and $X$ from above, and maximizing $I(X, C)$ amounts to minimizing $S(C|X)$. For an optimally trained $G$, $D$ must not be able to discriminate between $x^{(i)}$ and $\hat{x}^{(i)}$; therefore $\alpha = \frac{1}{2}$.
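For fixed densities, $S(C|X)$ is minimized pointwise by the posterior discriminator $D^*(x) = \alpha p_d(x) / (\alpha p_d(x) + (1-\alpha) p_g(x))$, a standard GAN result [4]. A small numeric check on an arbitrary discrete support (the densities below are illustrative):

```python
import numpy as np

a = 0.5                             # alpha = P(C = d)
p_d = np.array([0.5, 0.3, 0.2])     # "data" density on a 3-point support
p_g = np.array([0.2, 0.3, 0.5])     # "generated" density

def neg_cond_entropy(D):
    """-S(C|X) = a*E_d[ln D] + (1-a)*E_g[ln(1-D)] on the discrete support."""
    return float(a * np.sum(p_d * np.log(D)) + (1 - a) * np.sum(p_g * np.log(1 - D)))

D_star = a * p_d / (a * p_d + (1 - a) * p_g)   # posterior discriminator
rng = np.random.default_rng(0)
for _ in range(100):                            # D* beats 100 random discriminators
    D = rng.uniform(0.01, 0.99, size=3)
    assert neg_cond_entropy(D) <= neg_cond_entropy(D_star) + 1e-12
print(D_star[1])  # 0.5 where p_d = p_g, consistent with alpha = 1/2 at the optimum
```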

Appendix D. Standard Autoencoder

In the standard Autoencoder formulations [16,17], F and G are trained by maximizing I ( X , Z ) , namely:
$F, G = \arg\max_{F,G} I(X, Z) = \arg\min_{F,G} S(X|Z) = \arg\min_{F,G} \mathbb{E}_{X \sim p_X} \mathbb{E}_{Z \sim q_{Z|X}}\!\left[\ln \frac{1}{p_{X|Z}}\right]$
If the encoder and the decoder are parametrized as neural networks, respectively as $F_{\theta_X}$ and $G_{\theta_Z}$, the AE loss can be approximated by the Empirical Loss:
$\theta_X, \theta_Z = \arg\max_{\theta_X, \theta_Z} \sum_{i=1}^{N} \ln p_{X|Z}\!\left(x^{(i)} \,\middle|\, Z = F_{\theta_X}\big(x^{(i)}\big)\right)$
Since the Gaussian distribution has maximum entropy among all probability distributions supported on the entire real line, the Empirical Loss in Equation (A2) can be maximized by choosing $p_{X|Z} = \mathcal{N}\big(G_{\theta_Z}(Z), \sigma^2 I\big)$:
$\theta_X, \theta_Z = \arg\max_{\theta_X, \theta_Z} \sum_{i=1}^{N} -\left[\frac{1}{2\sigma^2} \left\| x^{(i)} - G_{\theta_Z} \circ F_{\theta_X}\big(x^{(i)}\big) \right\|^2 + \frac{d_Z}{2} \ln\!\left(2\pi\sigma^2\right)\right]$
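With the Gaussian choice $p_{X|Z} = \mathcal{N}(G_{\theta_Z}(Z), \sigma^2 I)$, maximizing the log-likelihood is therefore equivalent to minimizing the squared reconstruction error. A direct check (the toy vectors and $\sigma^2 = 1$ are assumptions of the example):

```python
import numpy as np

def gaussian_log_likelihood(x, x_hat, sigma2):
    """ln N(x; x_hat, sigma^2 I) for a d-dimensional reconstruction x_hat."""
    d = x.size
    return float(-0.5 / sigma2 * np.sum((x - x_hat) ** 2)
                 - 0.5 * d * np.log(2.0 * np.pi * sigma2))

rng = np.random.default_rng(0)
x = rng.normal(size=16)
good = x + 0.01 * rng.normal(size=16)   # small reconstruction error
bad = x + 1.00 * rng.normal(size=16)    # large reconstruction error
# Higher likelihood corresponds to smaller squared reconstruction error
print(gaussian_log_likelihood(x, good, 1.0) > gaussian_log_likelihood(x, bad, 1.0))  # True
```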

References

  1. Farrar, C.R.; Worden, K. Structural Health Monitoring: A Machine Learning Perspective; Wiley: Oxford, UK, 2013.
  2. Rytter, A. Vibrational Based Inspection of Civil Engineering Structures. Ph.D. Thesis, University of Aalborg, Aalborg, Denmark, 1993.
  3. Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995.
  4. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 28th Annual Conference on Neural Information Processing Systems (NIPS 2014), Montreal, QC, Canada, 8–13 December 2014.
  5. Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114.
  6. Zhou, Y.; Gu, K.; Huang, T. Unsupervised Representation Adversarial Learning Network: From Reconstruction to Generation. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019.
  7. Lindsay, B.G. Mixture Models: Theory, Geometry and Applications. In NSF-CBMS Regional Conference Series in Probability and Statistics; Institute of Mathematical Statistics and American Statistical Association: Suitland, MD, USA, 1995; Volume 5, p. 163. ISBN 0-940600-32-3.
  8. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv 2015, arXiv:1511.06434.
  9. Makhzani, A.; Shlens, J.; Jaitly, N.; Goodfellow, I. Adversarial Autoencoders. In Proceedings of the International Conference on Learning Representations, San Juan, Puerto Rico, 2–4 May 2016.
  10. Dumoulin, V.; Belghazi, I.; Poole, B.; Mastropietro, O.; Lamb, A.; Arjovsky, M.; Courville, A. Adversarially Learned Inference. arXiv 2016, arXiv:1606.00704v3.
  11. Donahue, J.; Krähenbühl, P.; Darrell, T. Adversarial Feature Learning. arXiv 2017, arXiv:1605.09782.
  12. Li, C.; Liu, H.; Chen, C.; Pu, Y.; Chen, L.; Henao, R.; Carin, L. ALICE: Towards understanding adversarial learning for joint distribution matching. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 5501–5509.
  13. Chen, X.; Duan, Y.; Houthooft, R.; Schulman, J.; Sutskever, I.; Abbeel, P. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets. In Proceedings of the 30th International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016.
  14. Mousavi, S.M.; Sheng, Y.; Zhu, W.; Beroza, G.C. STanford EArthquake Dataset (STEAD): A Global Data Set of Seismic Signals for AI. IEEE Access 2019, 7, 179464–179476.
  15. Kristekova, M.; Kristek, J.; Moczo, P. Time-frequency misfit and goodness-of-fit criteria for quantitative comparison of time signals. Geophys. J. Int. 2009, 178, 813–825.
  16. Vincent, P.; Larochelle, H.; Bengio, Y.; Manzagol, P.-A. Extracting and Composing Robust Features with Denoising Autoencoders. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 1096–1103.
  17. Vincent, P.; Larochelle, H.; Lajoie, I.; Bengio, Y.; Manzagol, P.-A. Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion. J. Mach. Learn. Res. 2010, 11, 3371–3408.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
