Article

Deep Machine Learning for Acoustic Inspection of Metallic Medium

by Brittney Jarreau 1,*,†, Sanichiro Yoshida 2,† and Emily Laprime 3,†
1 Integrated Science and Technology, Southeastern Louisiana University, Hammond, LA 70402, USA
2 Department of Chemistry and Physics, Southeastern Louisiana University, Hammond, LA 70402, USA
3 Engineering and Applied Sciences, University of New Orleans, Lakeshore Dr., New Orleans, LA 70108, USA
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Vibration 2022, 5(3), 530-556; https://doi.org/10.3390/vibration5030030
Submission received: 17 June 2022 / Revised: 21 August 2022 / Accepted: 23 August 2022 / Published: 28 August 2022

Abstract: Acoustic non-destructive testing is widely used to detect signs of damage. However, an experienced technician is typically responsible for interpreting the result, and often the evaluation varies depending on the technician’s opinion. The evaluation is especially challenging when the acoustic signal is analyzed in the near field, as Fresnel-range diffraction complicates the data. In this study, we propose a Convolutional Neural Network (CNN) algorithm to detect anomalies, bearing in mind its future application to micro-scale specimens such as biomedical materials. Data are generated by emitting a continuous sound wave at a single frequency through a metal specimen with a sub-millimeter anomaly and collecting the transmitted signal at several lateral locations on the opposite side (the observation plane) of the specimen. The distance between the anomaly and the observation plane falls in the quasi-Fresnel diffraction regime. The use of transmitted signals is essential to evaluate the phase shift due to the anomaly, which contains information about the substance in the anomaly. We have developed a seven-layered CNN to analyze the acoustic signal in the frequency domain. The CNN takes as input spectrograms representing the change in the amplitude and phase of the Fourier transform over the lateral position on the observation plane and classifies the anomaly into nine classes according to the lateral location of the anomaly relative to the probing signal and the material of the anomaly. The CNN performed excellently, achieving a validation accuracy as high as 99.9%. This result clearly demonstrates the CNN’s ability to extract features in the input signal that are undetectable to humans.

1. Introduction

Acoustic waves carry acoustic energy, which is the product of the force acting on a segment of the medium through which the acoustic wave travels and the displacement within that medium. Acoustic waves are, in general, longitudinal in nature. However, when an acoustic wave travels through a solid medium, transverse modes are excited in various circumstances. These propagation characteristics allow us to use acoustic waves to probe events occurring inside a structure: we can investigate the elastic properties of the medium and utilize the conversion between longitudinal and transverse modes to probe the structure of the specimen [1].
Acoustic probing for anomalies has a long history [2,3]. Commonly used techniques generally detect the acoustic signal reflected by anomalies and present it as a two-dimensional image. Acoustic microscopy [4] is one of these techniques. It is capable of detecting microscopic anomalies. Time-resolved [5] and V(z) curve techniques [6,7] of scanning acoustic microscopy enable us to obtain longitudinal information. Thanks to the advancements in these techniques, it is relatively easy to locate a hidden anomaly and image its lateral profile.
Improvement in image quality is a remaining challenge in these conventional techniques. Unlike optical microscopy, achieving high spatial resolution in acoustics is challenging. The difficulty is due to the relatively long wavelength. Sound waves are intrinsically longer in wavelength than light. In addition, these conventional techniques use the amplitude of the reflected acoustic signal. When the acoustic impedance of the anomaly is similar to the surrounding material, the reflectivity becomes low, causing a reduction in the contrast of the resulting image. For example, imaging a biological cell is not easy. Since the targeted cell and surrounding tissue have similar acoustic impedance, the contrast is low [8]. The combination of low spatial resolution and low contrast degrades the image quality.
An acoustic signal, like any wave, can be characterized by its amplitude and phase. The amplitude represents the amount of acoustic energy carried by the acoustic wave. The phase at a given point provides the acoustic path length from the source of the acoustic signal. Given that we have both an input signal and output signal, a transfer function representing the change in the signal can be obtained. Using the transfer function, we are also able to examine how much the signal is changed when passing through a medium. Since the phase is directly related to the elastic constant, it is expected to contain more precise information on anomalies [9].
This study tests the feasibility of using neural networks to interpret acoustic signals. While acoustic signals are easy to collect, their interpretation is not straightforward. To compensate for the challenges posed by low-quality acoustic imagery, the phase and amplitude of the transmitted acoustic signal are analyzed by a machine learning algorithm. The use of transmitted signals solves the low-reflectivity issue, and the machine learning algorithm is able to compensate for the low spatial resolution. We are therefore able to use conventional techniques to locate the anomaly and image its lateral profile, and then use the proposed method to obtain more detailed information.
The promising results from this initial experiment lead to an interest in applying this technique to biological materials, such as the characterization of biofilms. Applications would include identifying the density of bacteria, estimating film thickness, and detecting abnormal cells. The elastic constant of a cell is similar to that of water, so a cell can be difficult to distinguish from the surrounding water. This results in low reflectivity [10] and consequently low contrast in the acoustic imagery, such that the human eye cannot pinpoint the cell. This experiment gives hope that although humans cannot make this determination, a CNN will be able to detect these differences.
The remainder of this article is organized as follows. The rest of the current section introduces the motivation and theory behind this experiment. Section 2 presents the data collection process and the design of the convolutional neural network: the experimental setup used for generating and collecting the acoustic signals, the method used to create the spectrograms, and the architecture of the model used for the classification task. Section 3 details the performance of the model. In Section 4, the performance of the CNN is evaluated and justified with computer-scientific and physical arguments. Finally, Section 5 presents the conclusion and recommendations for possible future work.

1.1. Motivation

Visual inspection is the most widely used method for the detection of anomalies and defects and is relatively inexpensive in many scenarios. Its major disadvantage is that it can only catch defects on the exterior of the system that are large enough to be seen by the eye. Another popular method of non-destructive testing is ultrasonic testing (UT), in which ultrasonic waves are used to detect defects in a material by examining the wave for reflections. The resulting signal is often clouded with noise and attenuation, so analysis must be done by a skilled operator. The limitation of this method is that data acquisition and evaluation depend on the expertise of the technician, which makes it difficult to arrive at timely and non-subjective readings [11]. In some situations, such as the inspection of a ship’s hull or an underwater structure, this is a costly and, until recently, dangerous operation. Recent advances in technology have shifted the inspection of underwater structures such as ship hulls from requiring divers to using remotely operated sonar and imaging systems that collect data for analysis by a trained surveyor [12].
Machine learning methods have proven capable of learning complex patterns in areas such as speech recognition, object detection, and behavioral analysis [13,14,15]. With proper preprocessing, similar methods can be applied to acoustic data to detect complex patterns while filtering through the noise in the data. This has been demonstrated in acoustic emission testing [16]. Acoustic emission (AE) is the term for the transient elastic wave released by a solid when irreversible changes occur. AE testing differs from ultrasonic testing in that AE utilizes a sound wave produced by the object being tested, whereas ultrasonic testing must utilize an external sound wave to probe the object. In 2020, Haile et al. [16] demonstrated the success of convolutional neural networks (CNNs) and long short-term memory recurrent neural networks [17] in the detection of reflection boundaries in an acoustic emission wave; the models achieved upwards of 80% accuracy [16]. A similar approach was taken by Barat et al. in 2017 to classify acoustic emission signals using a CNN [18]. That model aimed to isolate AE signals by recognizing the dispersion structure in the spectrogram of the AE signal [18] and was able to determine the presence of AE events with 98% accuracy [18].
In 2020, an approach similar to that seen in the experiments on AE analysis was applied to detecting anomalous sounds such as breaking glass, screams, and gunshots [19]. As with the other AE analyses, Zhao transforms the sound waves into short-time Fourier transform spectrograms and feeds these images to a CNN. The CNN then learns to detect when a sound fitting the anomalous profiles appears in the data. The model achieved 96% accuracy on the controlled dataset; however, Zhao notes that the model was trained and tested on a noise-free dataset, which makes the sounds easier to detect and classify [19]. From these acoustic classification experiments, it is clear that convolutional neural networks are effective at classifying AE signals and known anomalous sounds, which motivates applying this approach to acoustic waves collected during material testing to detect general anomalies in the media being tested.
In a study of Gaussian beam theory in 2019, an approach for examining materials using a continuous acoustic wave was laid out. Utilizing a continuous wave reduces noise in the data and allows the properties of Hermite Gaussian beams, typically only used in light sources, to be applied to acoustic data. This study revealed how continuous waves propagate inside materials [9]. The foundation of this study in combination with the experiments performed on AE testing can be utilized to create a machine learning model which can detect reflection boundaries inside of a metallic medium.

1.2. Theory

Consider in Figure 1 that the acoustic wave emitted by an acoustic transmitter propagates through a medium containing an anomaly. The sensor detects the acoustic signal by scanning over the observation plane. At the interfaces with the anomaly, the acoustic wave experiences partial reflection which results in reduction in amplitude and phase shift. At the exit surface of the anomaly, the wavefront is deformed because the acoustic path length inside of the anomaly differs from that of the bulk material. Figure 1a is a schematic representation of this deformation. The change experienced in the wavefront depends on factors such as the shape of the interface and the acoustic impedance of the anomaly, which are generally unknown. Consequently, it is difficult to produce an analytical function for the acoustic signal received by the sensor using the position on the observation plane.
While an analytical expression is not readily obtained, the sensor signal at a given point on the observation plane can be characterized as follows. The acoustic wave emitted by the transducer diverges as it approaches the anomaly due to diffraction. At the plane that contacts the entrance to the anomaly (the entrance plane of anomaly) the wave has a certain spatial profile. According to the Huygens-Fresnel theory [20], we can express the acoustic signal at a point on the observation plane as the superposition of all the element waves initiated at the entrance plane of the anomaly. Figure 1b illustrates the situation schematically for an arbitrarily selected point P. Thus, at each point on the observation plane, the acoustic signal is characterized by a pair of amplitude and phase. Depending on the location relative to the anomaly, the amplitude and phase values vary along the observation plane. Since the amplitude and phase are affected by the shape and acoustic impedance of the anomaly, the spatial profile of the sensor signal (the x-dependence of the amplitude and phase) contains the acoustic information of the anomaly. We can interpret the phenomenon as the diffraction by the anomaly [21].
The information contained in the spatial profile of the sensor signal is widely used in acoustic technology. For example, Scanning Acoustic Microscopy [22] expresses the aperture that the acoustic signal passes through in the frequency domain as the aperture function [23]. When the distance between the observation plane and the aperture is much greater than the aperture size, the diffraction by the aperture is considered to be Fraunhofer diffraction [21,23]. Under this condition, the spatial profile of the signal represents the Fourier transform of the aperture function. When the distance between the aperture and the observation point is comparable to or less than the aperture size, the diffraction is considered Fresnel diffraction [24]. Under this condition, the aperture function cannot be expressed as a simple function; however, the spatial profile of the signal still contains the shape of the aperture. In the case of this experiment, the diffraction is near the Fresnel range. Therefore, the anomaly cannot be expressed as a simple function, and a machine learning algorithm is introduced.
As discussed above, the acoustic wave experiences partial reflection at the interface with the anomaly, which causes a reduction in amplitude and a phase shift in the transmitted signal. These changes depend on various factors such as the shape of the interface and the acoustic impedance of the anomaly, which are generally unknown. Consequently, it is difficult to express the acoustic signal received by the sensor analytically as a function of the position on the observation plane. However, no matter how complicated the shape and acoustic properties of the anomaly may be, we can presume that the sensor signal at a given location on the observation plane (x in Figure 1) can be represented by the amplitude and phase of the acoustic wave at that location. This presumption is guaranteed by the Huygens-Fresnel theory [25], i.e., the wave at an observation point P is the superposition of all the element waves initiated at the source. The amplitude and phase at point P are those of the superposed wave at this point.
The above discussion indicates that the x-dependence of the sensor signal (after scanning over the entire observation plane) is uniquely determined by the shape and acoustic property of the anomaly. This in turn indicates that theoretically we can characterize the anomaly by analyzing the x-dependence of the sensor signal (the x-profile of sensor signal). However, in reality, it is not easy to differentiate the sensor signals even to classify the type of anomaly. Figure 2 presents the sensor signal for a simple case, as an example. Here the cases are (a) having a steel plate with no anomaly between the acoustic transmitter and an aluminum metal plate, (b) having the steel plate of the same thickness as (a) with a circular vacancy, and (c) filling the vacancy in the same steel plate as (b) with silicon. A human eye can hardly differentiate the x-profile of sensor data among the cases. This is where we can utilize the power of a machine learning algorithm.

Frequency Domain Analysis

Another complexity is that a considerable amount of the transmitter signal can be reflected at the output surface of the transmitter. The amount of reflection depends on the anomaly condition and is contained in the sensor signal. Therefore, a simple analysis of the sensor signal does not isolate the amplitude and phase changes due to the anomaly; it is necessary to remove the effect of the reflection at the transmitter’s output surface. This is evident in the initial results from the Phase and Amplitude models and is supported further by the K-fold analysis presented in the results section. In the time domain, it is difficult to remove this effect. In the frequency domain, however, we can eliminate it by finding the transfer function H(ω):
H(ω) = F(S_out) / F(S_in)
Here F(S_in) is the Fourier transform of the input signal from the acoustic transmitter, and F(S_out) is the Fourier transform of the output signal acquired by the acoustic sensor. If the transmitter signal is a sinusoidal function of a constant frequency, as in the present case, the Fourier spectrum of F(S_in) is single-peaked at the driving frequency. If the medium between the transmitter and sensor is elastic, the Fourier spectrum of F(S_out) is also single-peaked at the driving frequency. The amplitude of the transfer function represents the reduction in the acoustic energy for various reasons, such as scattering or velocity damping. The phase of the transfer function indicates the acoustic path length of the acoustic signal. Thus, the amplitude and phase information exhibit the structural change of the medium. An anomaly in the acoustic path alters both the amplitude and phase of the transfer function. By analyzing the transfer function as a function of the sensor location, we can extract a feature of a given anomaly.
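As a sketch of this computation, the transfer function at the driving frequency can be estimated from sampled input and output waveforms with NumPy. The signal parameters below are illustrative, not the experimental values:

```python
import numpy as np

def transfer_function(s_in, s_out, fs, f_drive):
    """Estimate H(omega) = F(S_out) / F(S_in) at the driving frequency.

    s_in, s_out : sampled transmitter and sensor waveforms
    fs          : sampling frequency in Hz
    f_drive     : driving frequency in Hz (single-frequency excitation)
    """
    S_in = np.fft.rfft(s_in)
    S_out = np.fft.rfft(s_out)
    freqs = np.fft.rfftfreq(len(s_in), d=1.0 / fs)
    k = np.argmin(np.abs(freqs - f_drive))  # bin nearest the driving frequency
    H = S_out[k] / S_in[k]
    return np.abs(H), np.angle(H)           # amplitude ratio and phase shift

# Hypothetical sanity check: a pure attenuation and phase lag is recovered.
fs, f0, n = 500e3, 10e3, 5000               # illustrative numbers only
t = np.arange(n) / fs
s_in = np.sin(2 * np.pi * f0 * t)
s_out = 0.5 * np.sin(2 * np.pi * f0 * t - 0.3)  # 0.5x amplitude, 0.3 rad lag
amp, phase = transfer_function(s_in, s_out, fs, f0)
```

With an integer number of cycles in the record, as here, the amplitude ratio evaluates to 0.5 and the phase to −0.3 rad, confirming that the anomaly-induced changes survive the division.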

2. Materials and Methods

Data for this experiment are collected using a Picoscope 5000 Series oscilloscope to produce and receive acoustic waves. At the center of the system lies a metallic specimen; in this experiment, the specimen is a steel ring. The acoustic wave is emitted at three different positions: centered under the specimen, to the left of the specimen, and to the right of the specimen. The emitted signal is then recorded at 15 locations in a sweeping fashion so that the transmitted wave is captured at each point along the length of the specimen. This section details the use of the oscilloscope and the propagation of acoustic waves, the experimental designs, and the data collection process. The setup of the data collection process is shown in Figure 3.
The data for this experiment is generated and captured through the use of a Picoscope oscilloscope. The experimental design utilizes two transducers, one Panametric Accuscan (Olympus A101s-RM) and one Panametric Videoscan (V101-RM) attached to the oscilloscope. These transducers belong to a class known as piezoelectric transducers [26,27,28]. These devices are now widely used for areas such as structural health monitoring since they can be used as either actuators or sensors [29]. In this experiment, the accuscan transducer is responsible for emitting the signal and the videoscan transducer is responsible for receiving the wave after it has passed through the medium.

2.1. System Design and Data Collection

For this experiment, data were collected using an oscilloscope operating at a sampling frequency of 500 MHz. The controlled signal was emitted at a frequency of 400 kHz, selected because the acoustic transducers used in this experiment have the highest sensitivity around 400 kHz. As the goal of the model is to detect the amplitude and phase change observed in the transmitted wave, the use of this frequency, which is lower than the industry standard of 5 MHz and hence corresponds to a longer wavelength, does not compromise the sensitivity. The accuscan transducer (the transmitter) sits beneath the medium being tested (displayed as the left-hand side of Figure 1a). An aluminum plate sits atop the medium to enable data collection from a wider range of locations. Finally, the videoscan transducer (the sensor) sits on top of the aluminum observation plate (displayed as the right-hand side of Figure 1a). A layer of water is added between each component to improve acoustic coupling. Water is added by placing four drops between each component at the beginning of the experiment and is replenished at the start of each new trial and at the midpoint of each trial. In this way, a sine-like wave of 400 kHz travels from the transmitter through the medium and up to the sensor. The transmitter was driven by the built-in oscillator of the Picoscope 5000 Series oscilloscope. The driving signal was fed into channel A, which triggers the oscilloscope, and the sensor output was fed into channel B of the oscilloscope.
The experiment is carried out using three designs. The first design, called centered, centers the medium and the accuscan transducer in the middle of the aluminum observation plate. This design is meant to capture the effect of the hole blocking the strongest point of the acoustic wave compared to the specimens without the defect. The next design, called shifted-left, shifts the hole to the left of the accuscan transducer by moving the accuscan transducer to the right by 1.55 cm. The final design, called shifted-right, shifts the hole to the right of the accuscan transducer by moving the accuscan transducer to the left by 1.55 cm. All designs are depicted in Figure 4.
Figure 5 illustrates the hardware configuration used in the present experiment. A steel disc of 62.9 mm in diameter and 4.5 mm thickness was placed between the acoustic transmitter and an aluminum plate of 82.5 mm in length, 120.0 mm in width, and 5.0 mm in thickness. The acoustic sensor scanned over the surface of the aluminum plate along its width. The steel disc had a hole of 6.87 mm in diameter. These discs can be viewed in Figure 6. The hole was empty for the “Hole” cases and filled with silicon rubber for the “Hole Filled” cases. For the “No Hole” cases, a steel disc of the same dimension without a hole was used. (see the next paragraph and the nine labels discussed in Section 2.2). Each of the three experiments shown in Figure 4 is performed on the “No Hole”, “Hole Filled”, and “Hole” cases.
The goal of the experiment is to find distinct patterns which indicate the presence of a defect. The observation plate has markings at each centimeter of the length of the plate. Measurements are taken by aligning the edge nearest to the center of the receiver with each centimeter marking. Additional measurements are taken at the center of the plate for each case such that the transducer is centered on the plate and at each half centimeter to the left and right of the center marking. These extra measurements were added to capture the rapidly changing trend seen when the transducer is over the center of the medium. The usage of the observation plate is laid out as depicted in Figure 7.
Data was captured by sweeping from the center outward until the transducer leaves the observation plate toward both the left and the right sides. At each location, a minimum of fifteen waveforms are taken. Each waveform is saved to a CSV file containing the timestamps of each sample in the time series, the amplitude of the emitted wave, and the amplitude of the received wave. Each sweep constitutes a single trial and for each case, four to five trials were run. In this way, a database of 9450 CSV files was created.

2.2. Preprocessing

To get from the format collected by the oscilloscope, CSV, to a format suitable for a convolutional neural network, inspiration was taken from a study on structural health monitoring, “Deep machine learning for detection of acoustic wave reflections” [16]. In that study, Haile et al. convert the collected acoustic emission data from waveforms to short-time Fourier transforms (STFT) in order to create an image that represents the change in the frequency spectrum over time. In our study, however, we utilize a continuous wave, so there is no notable change over time. We adapt the short-time Fourier transform to represent the change in the frequency spectrum over distance, which is necessary to relay the spatial relationship of each x reading in the trial to the other locations. We call this method the short distance Fourier transform (SDFT).
Each SDFT is created by taking the fast Fourier transform (FFT) of each of the fifteen periods collected at each x location in the trial. For example, the first trial has locations −4 to 7 cm where data was collected. At each of those locations, the waveform collected is split into equally sized waveforms labeled between 1 and 15. Once all FFTs are taken, an FFT for each location is selected based on matching Trial and waveform numbers. For example, the first set of selected FFTs in number order would be those collected as waveform 1 of Trial 1. Each of these FFTs is then stacked into an image as a column of the SDFT image in order from −4 to 7 cm. These SDFTs are then saved to create a database that stores each SDFT under its corresponding label. This process is represented as pseudocode in Figure 8.
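The stacking procedure described above can be sketched as follows. This is a minimal illustration with hypothetical variable names, not the authors' implementation (the pseudocode in Figure 8 is authoritative):

```python
import numpy as np

def build_sdft_images(waveforms_by_x, n_splits=15):
    """Stack per-location FFT amplitude spectra into SDFT images.

    waveforms_by_x : list of 1-D arrays, one sampled waveform per x location,
                     ordered from the leftmost to the rightmost position.
    n_splits       : number of equal segments each waveform is divided into.
    Returns n_splits images; image k uses segment k of every location, so each
    x location becomes one column of the image.
    """
    images = []
    for k in range(n_splits):
        columns = []
        for w in waveforms_by_x:
            seg = np.array_split(w, n_splits)[k]   # k-th equal segment
            columns.append(np.abs(np.fft.rfft(seg)))  # its FFT amplitude
        images.append(np.stack(columns, axis=1))   # locations -> columns
    return images

# Hypothetical example: 15 x locations, 3000 samples per waveform.
waves = [np.sin(2 * np.pi * 0.05 * np.arange(3000)) for _ in range(15)]
imgs = build_sdft_images(waves)
```

Each of the fifteen resulting images pairs segment k across all locations, mirroring the "matching Trial and waveform numbers" selection described above.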
The database created contains nine labels listed in Table 1. Each label has a corresponding directory containing all data collected to support that label. These directories contain four to six trials with directories labeled 7 cm to −4 cm, and each positional directory contains fifteen waveforms in the CSV format. The CSV files are relatively simple, containing only the timestamp, the amplitude of signal A (the input signal) in V, and the amplitude of signal B (the output signal) in mV.
In the Amplitude and Phase models, the Fourier transform is calculated for each waveform in a trial. This encapsulates fifteen individual waveforms spanning 7 cm to −4 cm, including three additional points at 2.5, 1.5, and −0.5 cm. The central point is considered to be 1.5 cm because an offset equal to the radius of the transducer, 1.5 cm, is applied. From each Fourier transform, the phase and amplitude data are extracted; the corresponding phase or amplitude is then cut down to only the range with the most variation, which in this experiment is 300 to 500 kHz. These subsamples of phase and amplitude are then stacked as columns in the final image. In this way, 630 SDFT images are created, which can be fed to the CNN to represent the change in phase or amplitude in the Fourier domain over distance.
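The band-limited extraction of one column could look like the sketch below. Mapping the band edges to FFT bin indices by rounding is our assumption; the paper does not specify how the 300–500 kHz range is indexed:

```python
import numpy as np

def band_columns(waveform, fs, f_lo=300e3, f_hi=500e3):
    """Amplitude and phase of the FFT restricted to the [f_lo, f_hi] band."""
    spectrum = np.fft.rfft(waveform)
    n = len(waveform)
    lo = int(round(f_lo * n / fs))   # nearest FFT bin to each band edge
    hi = int(round(f_hi * n / fs))
    band = spectrum[lo:hi + 1]
    return np.abs(band), np.angle(band)

# Illustrative numbers: 500 MHz sampling as in the experiment, 400 kHz tone.
fs = 500e6
w = np.sin(2 * np.pi * 400e3 * np.arange(50000) / fs)
amp_col, phase_col = band_columns(w, fs)   # one column of the SDFT image
```

For this record length the band spans 21 bins, and the 400 kHz driving tone peaks in the middle of the band, as expected.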
Similarly, in the Phase Transfer Function and Amplitude Transfer Function models, the images fed in represent the change in phase or amplitude in the Fourier Domain over distance. The main difference in the preprocessing steps is the calculation of the transfer function. This is done by taking the Fourier Transform of the received signal and dividing it by the Fourier Transform of the input signal. In this way, we can use the change of amplitude and phase from one end of the system to the other as a feature in our training.

2.3. Convolutional Neural Network

2.3.1. Background

Concepts such as deep learning and neural networks were born from the desire to mimic how the human brain processes information. Artificial neural networks (ANNs) are composed of a complex network of processing units referred to as neurons which, similar to the human brain, are able to quickly accept and process large amounts of input. Each neuron consists of inputs and a single output and utilizes an activation function that determines how the neuron maps its inputs to its output. These neurons are typically arranged in three types of layers: the input layer, the hidden layers, and the output layer [30].
The use of a network of neurons is necessary to be able to identify non-linear relationships to solve complex problems. Two regularly used classifications of ANN are the recurrent neural network (RNN) and the convolutional neural network (CNN). A CNN is typically made up of an input layer, hidden layers, pooling layers, and fully connected layers.
The hidden (convolutional) layers perform convolutions between an input array of features and a kernel. The more kernels, often referred to as filters, that are applied, the more features a convolution layer can extract. Finally, an activation function is applied to introduce nonlinearity. This is the step that brings the output of the layer from linear convolution operations to a nonlinear relationship: the activation transforms the feature maps produced by the convolution operations into a nonlinear input to the next layer in the network [31].
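As a concrete illustration of the convolution-plus-activation step, here is a plain single-kernel sketch (as in most CNN frameworks, the operation implemented is technically cross-correlation; the kernel and input values are hypothetical):

```python
import numpy as np

def conv2d_valid(x, kernel):
    """2-D 'valid' convolution (cross-correlation) of a feature map with one
    kernel, followed by a ReLU activation."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # sum of the elementwise product over the receptive field
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0)   # nonlinearity applied to the feature map

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input
k = np.ones((3, 3)) / 9.0                      # a simple averaging kernel
fmap = conv2d_valid(x, k)                      # 2x2 output feature map
```

A convolution layer with 64 filters simply repeats this with 64 different kernels, producing 64 such feature maps.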
The commonly selected activation functions are ReLU, Tanh, and Sigmoid. Early researchers mainly used Sigmoid and Tanh but ran into an issue: when the input takes a value too far from zero, these functions saturate and the model is no longer sensitive to small changes in the data. As a result, the weights are not updated during back-propagation and training cannot be completed. ReLU was introduced to resolve the issues seen with Tanh and Sigmoid and has since become the default activation function for most neural networks [32]. The ReLU formula is as follows:
F(x) = max(0, x)
A pooling layer is often applied to reduce the dimensions of the input layers created by the hidden layers. Pooling merges similar features by combining the features within the given kernel size into a single entry according to the pooling method applied. Max-pooling is the most common method. This method selects the highest value of the kernel to pass through to the output tensor. Another option is average pooling which calculates the average in the kernel [31].
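A max-pooling operation of the kind described can be sketched in a few lines of NumPy; the 2×2 kernel, stride equal to the kernel size, and input values here are illustrative:

```python
import numpy as np

def max_pool2d(x, k=2):
    """k x k max pooling with stride k on a 2-D feature map (no padding)."""
    h, w = x.shape[0] // k * k, x.shape[1] // k * k   # trim to a multiple of k
    x = x[:h, :w].reshape(h // k, k, w // k, k)
    return x.max(axis=(1, 3))   # keep the largest value in each k x k block

fmap = np.array([[1, 2, 5, 6],
                 [3, 4, 7, 8],
                 [9, 1, 2, 3],
                 [5, 6, 4, 0]])
pooled = max_pool2d(fmap)   # -> [[4, 8], [9, 4]]
```

Swapping `max` for `mean` in the reduction gives average pooling.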
Fully connected layers connect all input nodes from the incoming layer to all output nodes in the next layer. These layers ultimately combine the generated outputs from the hidden layers with a one-dimensional vector consisting of probabilities of each feature belonging to a label. Each node in a fully connected layer has learnable weight mapping inputs features to output categories. These layers provide a connection between the hidden layers and the final output layer [31].
The output layer is responsible for making the final determination of which label the input sample has the highest probability of belonging to. The most commonly used activation function for this layer is Softmax [33]. Depending on the context, the softmax function also goes by the name of the Boltzmann distribution. While the softmax function can take many forms, the most commonly accepted form is shown in the following equation [34].
σ(z)_i = exp(z_i) / Σ_{j=1}^{K} exp(z_j)
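In code, softmax maps a vector of raw scores to a probability distribution; the max-subtraction below is a standard numerical-stability trick that leaves the result unchanged (an illustrative sketch, not the study's code):

```python
import numpy as np

def softmax(z):
    # Subtracting max(z) before exponentiating avoids overflow
    # without changing the resulting probabilities.
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p = softmax(logits)
print(p.argmax())           # index of the most probable class -> 0
print(round(p.sum(), 10))   # probabilities sum to one -> 1.0
```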

2.3.2. Present Model

The CNNs used in this experiment all have the same basic seven-layer design: an input layer, two convolutional layers, a flattening layer, two fully connected layers, and a Softmax layer. The only difference between the models is the input data. All input data are normalized between zero and one, and up to three pixel values’ worth of noise is added to introduce variation into the data. The model design is illustrated in Figure 9.
The input layer accepts images, in this case SDFTs, of size 200 × 15 × 1. These images are fed to the first convolution layer, which contains 64 filters and applies a 5 × 5 filter with a stride size of one. Because the images are so narrow, the output of the first convolution layer is fed directly to the next convolution layer, which also contains 64 filters and applies a 3 × 3 filter with a stride size of one. The output of the second convolution layer then goes to a flattening layer. After flattening, the output goes through two fully connected layers, each of which applies a dropout percentage of 40%. Finally, the features from the second fully connected layer are passed to the output layer, which selects one of the nine labels. These layers are described in detail below.
  • Input: The CNN accepts images sized 200 × 15 × 1. These have been normalized between 0 and 1 and augmented with light noise to avoid overfitting. A sample of these input images can be seen in Figure 10. The images are enlarged there for visibility, but at the time they are fed to the CNN they have dimensions of 200 pixels in height and 15 pixels in width.
  • Convolution Layers: For the first of these layers, convolution is performed with 64 filters. These filters have a 5 × 5 receptive field applied using a stride size of one. Finally, the ReLU activation function is applied to the feature maps. This feature map is then fed directly to the second of these layers where convolution is performed with 64 filters. These filters have a 3 × 3 receptive field applied using a stride size of one. Finally, the ReLU activation function is applied to the new feature maps.
  • Flattening: This layer takes the feature maps which result from the first two convolution layers and flattens them into a one-dimensional vector. This vector is fed to the fully connected layers.
  • Fully Connected Layers: In the first fully connected layer, the flattened output is connected to 720 hidden units, and the ReLU activation function is applied to produce the next feature vector. In the second fully connected layer, the output is connected to only 80 hidden units; again the ReLU activation function is applied to produce the feature vector used by the final layer of the CNN. A dropout of 0.4 was applied to each fully connected layer to avoid overfitting the training data.
  • Softmax: The softmax layer outputs a nine-dimensional vector representing the probability of the input belonging to each of the nine labels.
The model was trained using Stochastic Gradient Descent (SGD) [35] with the Adam optimizer [36] and the sparse categorical cross-entropy loss function. For these purposes, the built-in functions from Scikit-Learn were used. Scikit-Learn’s SGD is an implementation of a plain SGD learning routine supporting various loss functions and penalties for classification. The Adam optimizer is a form of SGD which uses momentum and adaptive learning rates to converge faster. The learning rate of the Keras Adam optimizer used in this model was set to 5 × 10⁻³. The data were split into two sets, with 50% dedicated to training and 50% to validation. This aggressive train-test split was selected because the model appears to learn effectively on a small portion of the data. The model was trained in batches of 32 spectrograms for a maximum of ten epochs. Early stopping, which looks for stabilization in the loss function to determine when the model should stop training, was applied to avoid overfitting.
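The architecture and training configuration described above can be sketched in Keras as follows. This is a reconstruction from the description in the text rather than the authors’ code; details not stated in the text, such as the padding mode (Keras’ default “valid” is assumed) and the early-stopping patience, are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

def build_model():
    # Seven-layer design: input, two convolutions, flatten,
    # two fully connected layers with 40% dropout, and softmax output.
    model = models.Sequential([
        layers.Input(shape=(200, 15, 1)),          # 200 x 15 SDFT images
        layers.Conv2D(64, (5, 5), strides=1, activation="relu"),
        layers.Conv2D(64, (3, 3), strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(720, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(80, activation="relu"),
        layers.Dropout(0.4),
        layers.Dense(9, activation="softmax"),     # nine labels
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=5e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

# Early stopping halts training once the monitored loss stabilizes.
stopper = callbacks.EarlyStopping(monitor="val_loss", patience=2,
                                  restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           batch_size=32, epochs=10, callbacks=[stopper])
```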

3. Results

3.1. Results of Measurements

3.1.1. Time Domain Signals

This section presents the raw signal collected in the experimental procedure. Here, the signal is analyzed to produce an interpretation of its physical significance and to ensure the integrity of the data. Figure 11 shows the signals observed in Channels A and B for the (a) no hole centered, (b) hole centered, and (c) hole filled centered cases. Each waveform is normalized to its peak value so that the crests are aligned. The waveforms in Figure 11 were taken at the location x = 0 cm, meaning the sensor lies directly above the transmitter. With this configuration, the delay of the sensor (Channel B) signal relative to the driving voltage (Channel A) should represent the travel time of the acoustic signal inside the steel disc-aluminum plate assembly.
In the no hole case, the travel time can be evaluated from the thicknesses of the steel disc and aluminum plate with the acoustic velocity in the respective materials. Figure 11a shows that the first crests of Channel A and Channel B in this time segment appear at 43.10 μs and 42.79 μs, respectively. This indicates that the time lag between the driving voltage and sensor signal is 0.2560 + 2.5N μs. Here, 0.2560 μs is the distance between the first crest of Channel A and the first crest of Channel B read from Figure 11a, 2.5 μs is the period of the 400 kHz driving frequency, and N is an integer. N is included to allow for the possibility that the sensor signal lags the driving signal by N periods. In addition, the acoustic transmitter is expected to have an electric circuit delay. Calling the electric delay δ, we obtain the following equation.
0.2560 + 2.5N − δ = L_st / v_st + L_Al / v_Al    (4)
On the right-hand side of Equation (4), L_st and v_st are the thickness of the steel disc and the acoustic velocity in steel; the other two terms are the same quantities for the aluminum plate. Using the acoustic velocity in steel (5940 m/s) [37] and in aluminum (6320 m/s) [38] along with their thicknesses of 4.5 mm and 5.0 mm, we find the right-hand side to be 1.5487 μs. From this information, we determine that N must be 1, and subsequently, δ = 1.2073 μs.
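The delay arithmetic can be checked directly. The short script below uses the quoted velocities together with a steel disc thickness of 4.5 mm and an aluminum plate thickness of 5.0 mm, the combination that reproduces the stated 1.5487 μs:

```python
# Check the travel-time arithmetic of Equation (4).
L_st, v_st = 4.5e-3, 5940.0   # steel disc: thickness (m), acoustic velocity (m/s)
L_al, v_al = 5.0e-3, 6320.0   # aluminum plate: thickness (m), velocity (m/s)

travel_us = (L_st / v_st + L_al / v_al) * 1e6   # right-hand side, in microseconds
N = 1                                           # number of whole periods of lag
delta_us = 0.2560 + 2.5 * N - travel_us         # electric circuit delay

print(round(travel_us, 4))  # -> 1.5487
print(round(delta_us, 4))   # -> 1.2073
```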
Assuming that the electric delay time δ is common to the hole and hole filled cases, we can solve Equation (4) for the acoustic velocity in these cases from the time delays observed in Figure 11b,c. As a result, we obtain an acoustic velocity of 1385 m/s for the hole case and 1317 m/s for the hole filled case. The acoustic velocity of 1385 m/s is slightly lower than the acoustic velocity in water at room temperature (1480 m/s) [39] and significantly higher than that in air (343 m/s) [39]. It is likely that the hole is mostly filled with water and contains some air; this mixture reduces the density of the medium, lowering the acoustic velocity. The acoustic velocity of 1317 m/s estimated for silicone rubber falls in the range of literature values [38,40].

3.1.2. Modal Analysis

Although analytical solutions are hard to obtain, it is clear that the sensor signal has a pair of amplitude and phase at each sensor position. This leads to the expansion of the spatial profile (the x-dependence) of the sensor signal into a Hermite–Gaussian series [41]. The acoustic signal emitted by the transmitter is approximately a Gaussian beam (the lowest mode of a Hermite–Gaussian series), possibly containing low-order Hermite–Gaussian terms. The spatial profile of the sensor signal is expected to consist of the Hermite–Gaussian terms of the transmitter signal and additional higher-order modes in association with the wavefront distortion (see Figure 1) caused by the anomaly.
Figure 12 shows the spatial profile of the sensor signal’s amplitude measured in a dataset. Here Figure 12a compares the experimental amplitude profiles for the no hole, hole, and hole filled cases. Figure 12b–d shows the Hermite–Gaussian fit to each, using TEM00, TEM10, and TEM20 with the following expression.
S(x) = [1 + α H₁(√2 x/w) + β H₂(√2 x/w)] exp(−x²/w²)    (5)
where w is the spot size, H₁(ξ) and H₂(ξ) are the first- and second-order Hermite polynomials [42], and α and β are the coefficients of the first- and second-order terms. To fit the experimental data, the first- and second-order Hermite polynomial terms defined by Equations (6) and (7) and a spot size of 1.25 cm were used.
H₁(ξ) = 2ξ    (6)
H₂(ξ) = 4ξ² − 2    (7)
In Figure 12b–d, the values of α and β that best fit each case are indicated in the legend. A spot size of 1.25 cm was used for all cases.
The Hermite–Gaussian polynomial fits the experimental plot reasonably well for all three cases with the respective coefficients. To verify the accuracy of the polynomial fits, Figure 13 compares the phase of the experimental data (Figure 12b) with the phase of the best-fit Hermite–Gaussian polynomial, showing reasonable agreement between experiment and theory as well. These observations indicate that the x-dependence of the sensor signal represents the physically significant waveform under each condition.
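The fitting procedure can be sketched with SciPy. The profile below implements Equation (5) with the Hermite polynomials of Equations (6) and (7) and the fixed spot size of 1.25 cm; the data here are synthetic stand-ins for the measured profiles, and the sensor positions are hypothetical:

```python
import numpy as np
from scipy.optimize import curve_fit

W = 1.25  # spot size (cm), fixed for all cases

def hg_profile(x, alpha, beta):
    # S(x) = [1 + a*H1(sqrt(2)x/w) + b*H2(sqrt(2)x/w)] * exp(-x^2/w^2)
    xi = np.sqrt(2.0) * x / W
    h1 = 2.0 * xi                  # first-order Hermite polynomial
    h2 = 4.0 * xi**2 - 2.0         # second-order Hermite polynomial
    return (1.0 + alpha * h1 + beta * h2) * np.exp(-x**2 / W**2)

x = np.linspace(-4.0, 4.0, 15)        # hypothetical sensor positions (cm)
y = hg_profile(x, 0.05, -0.10)        # synthetic "measured" amplitude profile
(alpha, beta), _ = curve_fit(hg_profile, x, y)
print(round(alpha, 3), round(beta, 3))   # recovers the generating coefficients
```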

3.1.3. Amplitude and Phase of Sensor Signal and Transfer Function

For various reasons, such as the formation of a distilled water layer between the acoustic transducer and the steel ring, acoustic reflection can occur at the input surface of the steel plate. This reflection can affect the electric circuit. Thus, the actual acoustic signal coupled into the system varies between measurements, adding ambiguity to the signal. To eliminate this ambiguity, the transfer function between the Channel A signal and the Channel B signal was evaluated. Figure 14 compares the amplitude and phase of the sensor signals with those of the driving voltage (Channel A) to sensor signal transfer function. The amplitude and phase of both the sensor signal and the transfer function were used to classify the cases in the CNN, as we detail in the following sections.
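One common way to estimate such a transfer function is to take the ratio of the Fourier transforms of the sensor signal and the driving voltage at the excitation frequency; the following is a generic sketch of the idea, not necessarily the exact procedure used in this study:

```python
import numpy as np

def transfer_function(drive, sensor, fs, f0):
    # H(f0) = B(f0) / A(f0): dividing the sensor spectrum by the
    # driving-voltage spectrum removes run-to-run variation in the
    # signal actually coupled into the system.
    freqs = np.fft.rfftfreq(len(drive), d=1.0 / fs)
    k = np.argmin(np.abs(freqs - f0))      # FFT bin nearest to f0
    H = np.fft.rfft(sensor)[k] / np.fft.rfft(drive)[k]
    return np.abs(H), np.angle(H)          # amplitude and phase of H

# Synthetic check: sensor = drive scaled by 0.5 and delayed by pi/4 rad.
fs, f0, n = 4.0e6, 400.0e3, 1000           # sample rate, drive frequency, samples
t = np.arange(n) / fs
drive = np.sin(2 * np.pi * f0 * t)
sensor = 0.5 * np.sin(2 * np.pi * f0 * t - np.pi / 4)
amp, phase = transfer_function(drive, sensor, fs, f0)
print(round(amp, 3), round(phase, 3))      # -> 0.5 -0.785
```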

3.2. Results of CNN

3.2.1. CNN Performance

Using the configuration described in the section on model design, the models were trained on 315 training images and evaluated against 315 test images. Typically, a CNN is evaluated with a 60% training and 40% validation split, but, since the model was performing so well, equal portions were allocated to training and testing to strengthen the validation. The Phase model performs poorly, averaging only 18% accuracy; the Amplitude model performs well, averaging 94.1% accuracy; and the Transfer Function Phase and Amplitude models each perform near-perfectly, averaging 99.5% and 99.9% accuracy, respectively. The accuracy and loss trends of each model during the training epochs are illustrated in Figure 15.

3.2.2. K-Fold Validation

To further support the performance of the models, K-fold validation was run in two configurations for each model, with training and validation run on five folds. In one configuration, the data were split into 504 training images and 126 test images; in the second, these numbers were reversed to 126 training images and 504 test images. The first goal was to verify the theory that the sensor signal phase and amplitude models could perform better on a surplus of training data but worse on a small training set because of the amount of noise generated by the reflection boundaries when the signal first enters the system. The second goal was to support the theory that the transfer function’s phase and amplitude do not experience this same noise and can therefore maintain an average accuracy of over 90% when trained on either a large or a small dataset. These results are shown in Figure 16.
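The five-fold splits described above can be reproduced with scikit-learn’s KFold; the 630 samples split naturally into 504/126 per fold. The features and labels below are random placeholders standing in for the spectrograms:

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.random.rand(630, 20)           # placeholder features (630 spectrograms)
y = np.random.randint(0, 9, 630)      # placeholder labels (nine classes)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kf.split(X):
    # First configuration: 504 training / 126 test images per fold.
    # Swapping the two index sets gives the reversed configuration
    # (126 training / 504 test images).
    assert len(train_idx) == 504 and len(test_idx) == 126
print("all five folds: 504 train / 126 test")
```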

4. Discussion

4.1. Correlation Analysis

Correlation analysis was performed to determine the correlation of various data points to the 9 labels and to determine the separability of the labels. Figure 17 shows the correlation of all attributes for each of the 4 data types: Amplitude, Transfer Function Amplitude, Phase, and Transfer Function Phase. This figure contains the correlation matrices for all 9 labels listed in Section 2.2. The matrix is arranged as follows. The first through the 15th rows represent the correlation of the measurements taken at each lateral location from x = −4 cm to x = +7 cm, whereas the bottom row indicates the correlation with the labels. For example, the (3,2) component of the matrix presents the correlation of data collected at x = +5 cm with data collected at x = +6 cm. For the generation of the correlation matrix, the labels are defined as follows: “Hole Filled”: 0, “Hole”: 1, “No Hole”: 2, “Hole Filled Right”: 3, “Hole Filled Shifted”: 4, “Hole Right”: 5, “Hole Shifted”: 6, “No Hole Right”: 7, “No Hole Shifted”: 8.
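A correlation matrix of this form, one column per x-location plus an integer-coded label column, can be produced with pandas; the data below are synthetic placeholders standing in for the measured values:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
labels = rng.integers(0, 9, size=120).astype(float)   # label codes 0-8
# One column per lateral measurement location, weakly tied to the label.
df = pd.DataFrame({f"x{i}": rng.normal(size=120) + 0.1 * labels
                   for i in range(15)})
df["label"] = labels

corr = df.corr()   # 16 x 16 Pearson matrix; the last row/column holds
                   # each location's correlation with the label.
print(corr.shape)  # -> (16, 16)
```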
Figure 18 is the same type of matrix as Figure 17. In this case, the correlations are shown for just the basic labels of the Hole, No Hole, and Hole Filled cases. Comparison of Figure 17 and Figure 18 illustrates how the correlation of the features changes as the labels change in complexity. When the labeling has decreased complexity (distinguishing only among the Hole, No Hole, and Hole Filled cases), the correlation of the features increases both with each other and with the labels; however, when the complexity is increased, fewer features have high levels of correlation.
We can make the following observations in Figure 17 and Figure 18. First, data collected at each x-location show higher correlations with data collected at nearby locations than with data collected some distance away. Figure 19 exhibits this observation explicitly. Here, Figure 19a plots the correlation of the amplitude of the sensor signal (Channel B signal) obtained at x = 2 cm with the data obtained at other x locations, and (b) shows the same correlation for the Channel A to Channel B transfer function. Notice that the sensor signal phase data have significantly lower correlations to other x locations than the sensor signal amplitude data or the transfer function amplitude and phase data. The amplitude and phase are moderate functions of x regardless of the hole condition. This observed behavior indicates that the sensor signal phase data do not accurately characterize the anomaly, which is consistent with the poor training performance of the sensor signal phase data observed in Figure 15.
Second, the amplitude data vary with the labeling in a monotonic relation, indicating that the amplitude and the labeling have the same trend in the sequence of data taking. For instance, if we assign the label values in ascending order from the hole condition showing the lowest amplitude to the one with the highest amplitude, the correlation takes a positive value at all x positions; if the labeling is descending, the correlation becomes negative. Figure 20a indicates this tendency. The amplitude with the partial labeling shows positive correlations to the label at all x positions in the sensor signal and transfer function, with the only exception being the transfer function data at x = 4 cm. The negative correlation at x = 4 cm is likely due to the low S/N (signal-to-noise) ratio at this x position. We will revisit this discussion shortly. In the case of the full labeling, the correlation is negative in the 0.5 < x < 2.5 cm range and positive elsewhere. The amplitude, and hence the S/N ratio, is high in 0.5 < x < 2.5 cm. The negative correlation indicates that the amplitude data and labeling have opposite trends. (Note that we use different labeling for the partial and full labeling cases.) We suspect the positive correlation outside 0.5 < x < 2.5 cm is due to the low S/N ratio.
Note in Figure 20a that the full-labeling case has a smaller peak correlation and a narrower highly correlated x range. This observation indicates that accurate labeling becomes more difficult as the number of labels increases, which we can understand intuitively. The amplitude correlations of the sensor signal and transfer function behave similarly, as in the case of partial labeling.
The correlation of the phase data, on the other hand, oscillates around zero, as seen in Figure 20b. This behavior indicates that the phase value changes its sign depending on the x position. This change in sign is the result of evaluating the phase value as the inverse tangent of the complex number associated with the frequency domain data. Figure 20b indicates that the sensor signal phase has a considerably lower correlation to the label than the transfer function phase. Again, this observation is consistent with the poor training performance of the sensor signal phase.
Consider the correlation behaviors of the amplitude and phase data based on physics. Figure 21 compares the correlation of the amplitude and phase data to the label with the x profile of the best-fit Hermite–Gaussian function for the no hole, hole, and hole filled conditions shown in Figure 12. Here, Figure 21a plots the correlation of the sensor signal amplitude to the label, (b) shows the transfer function phase’s correlation to the label, (c) is the amplitude of the Hermite–Gaussian functions best-fit to the three hole conditions, and (d) is the phase of the Hermite–Gaussian functions. Note that the phase in (d) results from the inverse tangent of the Hermite–Gaussian fit. Since the inverse tangent folds (wraps) the phase value, the phase data in this figure range from −π to π.
Comparison of Figure 21a,c indicates that the amplitude data have high correlations in −2 < x < 2 cm, where the Hermite–Gaussian fits show a significant difference among the hole conditions. Comparison of Figure 21b,d reveals that the −π to π folding of the inverse tangent causes the positive-to-negative oscillatory behavior of the phase correlation. The tangent function varies rapidly near ±π/2, which means that the inverse tangent is sensitive to small phase shifts at the entrance of the anomaly. This observation explains the significantly lower correlation observed in the sensor phase data. By using the transfer function, this effect at the entrance to the anomaly can be eliminated.
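The sign flips produced by the ±π folding can be demonstrated directly: a phase that grows smoothly with x, once passed through the inverse tangent, oscillates between positive and negative values. This is a generic illustration, not the study’s data:

```python
import numpy as np

x = np.linspace(-4.0, 4.0, 9)                # lateral positions (arbitrary units)
true_phase = 1.5 * x                         # smooth, monotonic phase (rad)
wrapped = np.angle(np.exp(1j * true_phase))  # atan2 folds it into (-pi, pi]

print(np.all(np.abs(wrapped) <= np.pi))      # -> True: folded into (-pi, pi]
# The underlying smooth trend is recoverable by unwrapping:
print(np.allclose(np.diff(np.unwrap(wrapped)), np.diff(true_phase)))  # -> True
```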

4.2. Separability

Figure 22 and Figure 23 were also produced to help visualize the separation of the labels. Similar to Figure 17 and Figure 18, these illustrate how the separation of the labels based on various features changes as the labels change in complexity. Figure 22 shows the separation of all nine labels, while Figure 23 exhibits the separation of the case with only three labels. Before producing these graphs, the data were reduced by removing redundant features (those highly correlated with another feature) and by dropping unimportant features (those which did not correlate with the label). As observed in Section 4.1, data from neighboring x-locations have mutually high correlations, and considering the separation for all of these data sets is therefore redundant. In both the nine-label case and the three-label case, it is clear there is no separation of the phase data. It is difficult to discern the separation of the classes with all nine labels present. However, there are several features which, when selected, allow clear separation of the hole and no hole labels when there are only three labels, and partial separation of the hole filled label as well. This indicates that simpler models, such as linear models, would perform well on the Amplitude, Transfer Function Phase, and Transfer Function Amplitude data in the three-label case, but would likely not distinguish all nine labels. It is also evident that models attempting to make a determination using only the phase data are unlikely to make a clear determination in either the three- or nine-label case.
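The feature-reduction step described above, dropping features that duplicate another feature and features uncorrelated with the label, can be sketched as follows; the thresholds and the synthetic data are illustrative assumptions, not values from the study:

```python
import numpy as np
import pandas as pd

def reduce_features(df, label_col="label", redundant=0.95, unimportant=0.2):
    # Keep a feature only if it correlates with the label (important)
    # and does not nearly duplicate an already-kept feature (redundant).
    corr = df.corr().abs()
    keep = []
    for f in (c for c in df.columns if c != label_col):
        if corr.loc[f, label_col] < unimportant:
            continue                          # uncorrelated with the label
        if any(corr.loc[f, k] > redundant for k in keep):
            continue                          # duplicates a kept feature
        keep.append(f)
    return keep

rng = np.random.default_rng(1)
label = rng.integers(0, 3, 200).astype(float)
df = pd.DataFrame({
    "a": label + 0.1 * rng.normal(size=200),   # informative feature
    "b": rng.normal(size=200),                 # pure noise
    "label": label,
})
df["a_copy"] = 2.0 * df["a"] + 0.001 * rng.normal(size=200)  # redundant copy
print(reduce_features(df))    # keeps "a"; drops the near-duplicate "a_copy"
```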
We can interpret the label separation based on the physical argument made for the correlation matrices. The diagonal components of Figure 23 represent the histograms of the amplitude or phase values for each label value. The (5,5) component of the Amplitude matrix at the upper left of Figure 23 indicates that as the amplitude value increases, the label value tends toward the No Hole case. The same physical argument holds for the phase matrix. In this case, the locations of the histograms representing the Hole and No Hole label values alternate on the phase value axis. This alternation happens because the wrapped phase value changes sign with the x-location.
As anticipated from the correlation and separation analysis, the phase model performs poorly. This poor performance in fact supports the correctness of the model: the phase of the sensor signal includes the phase change at the input face, i.e., the phase change occurring between the output face of the acoustic transmitter and the input surface of the material dominates the overall phase at the sensor. The phase model CNN results thus indicate that the model is accurately probing the characteristics of the material.
The near-perfect accuracy experienced by the transfer function phase model is also anticipated, although not clearly evident in the analysis, because the phase change due to the material to be tested is the most indicative of a change in the material. This was verified through higher-order mode analyses. The reasoning for the excellent behavior of the transfer function amplitude model is similar and can also be verified through higher-order mode analyses.
The strong performance of the amplitude model is also expected, for the same reason that we can expect an excellent prediction using the transfer function: the amplitude is sensitive to changes in the material with low interference at the reflection boundary. The trend in the amplitude model, however, is not a smooth rise in accuracy. This is likely caused by the strong sensitivity of the amplitude to the location of the sensor, meaning slight deviations can cause largely different trends and making the raw amplitude more difficult to learn.
In each of the well-performing models (Transfer Function Phase, Transfer Function Amplitude, and Amplitude), it is clear that the CNN can learn a set of discriminating features which enables it to divide the dataset into nine different labels. These labels convey the nature of the anomaly as well as its position with respect to the sensors. Utilizing SDFTs, the CNN can leverage both spatial and frequency domain features to capture the change in the signal as a function of distance from the sensor. This enables the CNN to form a picture of the signal as it passes through the entire face of the material being tested and to choose the relevant discriminatory features for learning. The results presented in this section show that the proposed architecture can accurately classify both the material being tested and that material’s location in the system.

5. Conclusions

In order to increase the speed with which material testing can occur and to expand its application to a wider range of materials, an increase in autonomy is required. This study aimed at creating a model which can detect the presence and direction of a defect in a system. Data were collected using a continuous-wave method to probe a metallic disc for evidence of damage. These data were then processed to form normalized SDFTs representing the phase, amplitude, transfer function phase, and transfer function amplitude. A model was produced for each of these measures to demonstrate the use of CNNs for material testing. Our results suggest that a CNN can distinguish damaged and intact materials on a near-perfect basis using the transfer function amplitude, the transfer function phase, or simply the amplitude of the received signal.
The results of this experiment lead to the conclusion that utilizing the change in signal allows both phase and amplitude data to be leveraged in examining the integrity of a structure. This is an important finding, as it allows this research to be extended to cases requiring analysis in the Fresnel range, where diffraction is complex: the phase shift can no longer be considered constant, the amplitude is no longer evenly distributed, and the detected signal becomes a much more complicated function of the lateral spatial variable. From our experiment, it is clear that the sensor signal phase data by themselves are too noisy to use for classification of acoustic data; however, the use of transfer function phase data shows promising results.
This research may be used to increase the automation of materials testing since both the presence of damage and the direction of the damaged component can be determined. This leads to future work wherein it is necessary to determine how this model performs on various degrees of damage in a structure, thin structures, and more complex objects. This will be important in determining to what extent damage can be classified. Ultimately, this research aims to extend toward the classification of other non-metallic materials such as biofilms.

Author Contributions

Conceptualization, S.Y.; methodology, B.J., S.Y. and E.L.; software, B.J.; validation, B.J., S.Y. and E.L.; formal analysis, B.J. and S.Y.; investigation, B.J., S.Y. and E.L.; data curation, B.J.; writing—original draft preparation, B.J.; writing—review and editing, B.J. and S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data collected in the experiments and used in this paper can be downloaded at: https://drive.google.com/file/d/1uknpggkey4bzyoEb9ZXxrqlJL5DcBXD0/view?usp=sharing accessed on 25 August 2022.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CNN: Convolutional Neural Network
AE: Acoustic Emissions
FFT: Fast Fourier Transform
SDFT: Short Distance Fourier Transform

References

  1. Graff, K.F. Wave Motion in Elastic Solids; Oxford Univ. Press: Oxford, UK, 1975. [Google Scholar]
  2. Krautkrämer, J.; Krautkrämer, H. Ultrasonic Testing of Materials; Springer: Berlin/Heidelberg, Germany, 1990. [Google Scholar]
  3. Drinkwater, B.W.; Wilcox, P.D. Ultrasonic arrays for non-destructive evaluation: A review. NDT & E Int. 2006, 39, 525–541. [Google Scholar]
  4. Zinin, P.V.; Arnold, W.; Weise, W.; Slabeyciusova-Berezina, S. Theory and Applications of Scanning Acoustic Microscopy and Scanning Near-Field Acoustic Imaging. In Ultrasonic and Electromagnetic NDE for Structure and Material Characterization, 1st ed.; Kundu, T., Ed.; CRC Press: Boca Raton, FL, USA, 2012; Chapter 11; pp. 611–688. [Google Scholar]
  5. Juntarapaso, Y.; Miyasaka, C.; Tutwiler, R.L.; Anastasiadis, P. Contrast Mechanisms for Tumor Cells by High-frequency Ultrasound. Open Neuroimaging J. 2018, 12, 105–119. [Google Scholar] [CrossRef]
  6. Ono, Y.; Kushibiki, J. Experimental study of construction mechanism of V(z) curves obtained by line-focus-beam acoustic microscopy. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2000, 47, 1042–1050. [Google Scholar] [CrossRef] [PubMed]
  7. Duquennoy, M.; Ourak, M.; Xu, W.J.; Nongaillard, B.; Ouaftouh, M. Observation of V(z) curves with multiple echoes. NDT Int. 1995, 28, 147–153. [Google Scholar] [CrossRef]
  8. Miyasaka, C.; Yoshida, S. Overview of Recent Advancement in Ultrasonic Imaging for Biomedical Applications. Open Neuroimaging J. 2018, 12, 133–157. [Google Scholar] [CrossRef]
  9. Miyaska, K.; Laprime, E.; Yoshida, S.; Sasaki, T. Application of Gaussian Beam to Ultrasonic Testing. In The Abstracts of ATEM: International Conference on Advanced Technology in Experimental Mechanics: Asian Conference on Experimental Mechanics; The Japan Society of Mechanical Engineers: Tokyo, Japan, 2019; p. 1008B1415. [Google Scholar]
  10. Hilderbrand, J.A.; Rugar, D.; Johnston, R.N.; Quate, C.F. Acoustic microscopy of living cells. Proc. Natl. Acad. Sci. USA 1981, 78, 1656–1660. [Google Scholar] [CrossRef]
  11. Mahesh, B. Application of Non-Destructive Testing in Oil and Gas Industries. Int. J. Res. Eng. Sci. Manag. 2020, 2, 613–615. [Google Scholar]
  12. Hover, F.S.; Eustice, R.M.; Kim, A.; Englot, B.; Johannsson, H.; Kaess, M.; Leonard, J.J. Advanced perception, navigation and planning for autonomous in-water ship hull inspection. Int. J. Robot. Res. 2012, 31, 1445–1464. [Google Scholar] [CrossRef]
  13. Anusha, R.; Subhashini, P.; Jyothi, D.; Harshitha, P.; Sushma, J.; Mukesh, N. Speech Emotion Recognition using Machine Learning. In Proceedings of the 2021 5th International Conference on Trends in Electronics and Informatics (ICOEI), Tirunelveli, India, 3–5 June 2021; pp. 1608–1612. [Google Scholar] [CrossRef]
  14. Xiang, Z.; Zhang, R.; Seeling, P. Machine learning for object detection. In Computing in Communication Networks; Academic Press: Cambridge, MA, USA, 2020; pp. 325–338. [Google Scholar]
  15. Richter, T.; Fishbain, B.; Markus, A.; Richter-Levin, G.; Okon-Singer, H. Using machine learning-based analysis for behavioral differentiation between anxiety and depression. Sci. Rep. 2020, 10, 16381. [Google Scholar] [CrossRef]
  16. Haile, M.A.; Zhu, E.; Hsu, C.; Bradley, N. Deep machine learning for detection of acoustic wave reflections. Struct. Health Monit. 2020, 19, 1340–1350. [Google Scholar] [CrossRef]
  17. Brownlee, J. A Gentle Introduction to Long Short-Term Memory Networks by the Experts. Available online: https://machinelearningmastery.com/gentle-introduction-long-short-term-memory-networks-experts/ (accessed on 11 June 2022).
  18. Barat, V.; Kostenko, P.; Bardakov, V.; Terentyev, D. Acoustic signals recognition by convolutional neural network. Int. J. Appl. Eng. Res. 2017, 12, 3461–3469. [Google Scholar]
  19. Zhao, J. Anomalous Sound Detection Based on Convolutional Neural Network and Mixed Features. J. Phys. Conf. Ser. 2020, 1621, 012025. [Google Scholar] [CrossRef]
  20. Smith, D.G. Ebook Topic: Huygens’ and Huygens-Fresnel Principles. In Field Guide to Physical Optics; SPIE: Bellingham, WA, USA, 2013. [Google Scholar]
  21. Guenther, B.D.; Steel, D. Encyclopedia of Modern Optics; Academic Press: Cambridge, MA, USA, 2018; p. 69. [Google Scholar]
  22. Zinin, P.V.; Arnold, W.; Weise, W.; Berezina, S. Ultrasonic and Electromagnetic NDE for Structure and Material Characterization, 1st ed.; Kundu, T., Ed.; CRC Press: Boca Raton, FL, USA, 2012; Chapter 11. [Google Scholar]
  23. Weise, W.; Zinin, P.; Briggs, A.; Wilson, T.; Boseck, S. Examination of the two-dimensional pupil function in coherent scanning microscopes using spherical particles. J. Acoust. Soc. Am. 1998, 104, 181–191. [Google Scholar] [CrossRef] [PubMed]
  24. Soskind, Y.G. Ebook Topic: Fresnel Diffraction. In Field Guide to Diffractive Optics; SPIE: Bellingham, WA, USA, 2011. [Google Scholar]
  25. Goodman, J.W. Introduction to Fourier Optics. In McGraw-Hill Physical and Quantum Electronics Series; Roberts & Company Publishers: Greenwood Village, CO, USA, 2005. [Google Scholar]
  26. Lippmann, G. Principe de la conservation de l’électricité [Principle of the conservation of electricity]. Annales de chimie et de physique 1881, 24, 145. (In French) [Google Scholar]
  27. Curie, J.; Curie, P. Contractions et dilatations produites par des tensions dans les cristaux hémièdres à faces inclinées [Contractions and expansions produced by voltages in hemihedral crystals with inclined faces]. Comptes Rendus 1881, 93, 1137–1140. (In French) [Google Scholar]
  28. Curie, J.; Curie, P. Développement par compression de l’électricité polaire dans les cristaux hémièdres à faces inclinées. Bull. Minéral 1880, 3, 90–93. [Google Scholar] [CrossRef]
  29. Qing, X.; Li, W.; Wang, Y.; Sun, H. Piezoelectric Transducer-Based Structural Health Monitoring for Aircraft Applications. Sensors 2019, 19, 545. [Google Scholar] [CrossRef]
  30. Shiruru, K. An Introduction To Artificial Neural Network. Int. J. Adv. Res. Innov. Ideas Educ. 2016, 1, 27–30. [Google Scholar]
  31. Vaz, J.M.; Balaji, S. Convolutional Neural Networks (CNNs): Concepts and applications in pharmacogenomics. Mol. Divers. 2021, 25, 1569–1584. [Google Scholar] [CrossRef]
  32. Hao, W.; Yizhou, W.; Yaqin, L.; Zhili, S. The Role of Activation Function in CNN. In Proceedings of the 2020 2nd International Conference on Information Technology and Computer Application (ITCA), Guangzhou, China, 18–20 December 2020; IEEE: Piscataway, NJ, USA; pp. 429–432. [Google Scholar]
  33. Brownlee, J. Softmax Activation Function with Python. Available online: https://machinelearningmastery.com/softmax-activation-function-with-python/ (accessed on 11 June 2022).
  34. Gao, B.; Pavel, L. On the Properties of the Softmax Function with Application in Game Theory and Reinforcement Learning. arXiv 2017, arXiv:1704.00805. [Google Scholar]
  35. Srinivasan, A.V. Stochastic Gradient Descent—Clearly Explained!! Available online: https://towardsdatascience.com/stochastic-gradient-descent-clearly-explained-53d239905d31 (accessed on 11 June 2022).
  36. Brownlee, J. Gentle Introduction to the Adam Optimization Algorithm for Deep Learning. Available online: https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/ (accessed on 11 June 2022).
  37. The Engineering ToolBox. Available online: https://www.engineeringtoolbox.com/sound-speed-solids-d_713.html (accessed on 11 June 2022).
  38. Material Sound Velocities. Available online: https://www.olympus-ims.com/en/ndt-tutorials/thickness-gauge/appendices-velocities/ (accessed on 11 June 2022).
  39. Baird, C.S. Science Questions with Surprising Answers. Available online: https://www.wtamu.edu/~cbaird/sq/mobile/2013/11/12/how-does-sound-going-slower-in-water-make-it-hard-to-talk-to-someone-underwater/ (accessed on 11 June 2022).
  40. Folds, D.L. Speed of sound and transmission loss in silicone rubbers at ultrasonic frequencies. J. Acoustical Soc. Am. 1974, 56, 1295. [Google Scholar] [CrossRef]
  41. RP Photonics Encyclopedia. Hermite–Gaussian Modes. Available online: https://www.rp-photonics.com/hermite_gaussian_modes.html (accessed on 11 June 2022).
  42. Weisstein, E.W. Hermite Polynomial—From Wolfram MathWorld. Available online: https://mathworld.wolfram.com/HermitePolynomial.html (accessed on 11 June 2022).
Figure 1. Acoustic wave traveling through an anomaly in bulk material. (a) the wave incident on the anomaly and the wave diffracted by it; (b) the signal captured by the sensor, represented by element waves according to Huygens’ principle.
Figure 2. Experimental data normalized to the maximum value.
Figure 3. Data collection setup. From top down: the receiver, observation plate, specimen, and transmitter. The wires connected to these components are attached to a computer that monitors the transmitted and received signals.
Figure 4. The experimental designs: (a) centered, (b) shifted left, (c) shifted right. In each design, the long rectangular object is the aluminum observation plate, the small rectangular object is the specimen under test, and the dark gray ovals are the transducers.
Figure 5. (a) Hardware configuration. Parts are not proportionally scaled. (b) Parts and dimensions.
Figure 6. The steel rings used in this experiment with the one for the Hole cases on the left and the one for the No Hole cases on the right.
Figure 7. Data acquisition. The markings on the plate are measurements in cm. The circles depict the sensors, with the arrows indicating the direction of movement. As shown in the figure, the left edge is aligned when moving right, and the right edge is aligned when moving left. The measurement is always counted from the right edge, so in this figure sensor A reads 0.5 cm while sensor B reads 5 cm.
Figure 8. Pseudocode describing the process of creating SDFTs for CNN processing.
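The construction outlined in the Figure 8 pseudocode can be sketched in Python as follows. This is a minimal illustration under assumed shapes (one time-domain trace per lateral sensor position); the function name `make_sdft_image` and all variable names are hypothetical, not the authors' code. A phase image would be built the same way with `np.angle` in place of `np.abs`.

```python
import numpy as np

def make_sdft_image(signals, n_bins=200):
    """Stack the DFT amplitude of each lateral-position trace as one
    column, giving an (n_bins, n_positions, 1) image for the CNN.

    `signals` is a 2D array with one time-domain trace per row.
    Illustrative sketch only; shapes follow Figure 10 (200 x 15 x 1).
    """
    columns = []
    for trace in signals:
        spectrum = np.fft.rfft(trace)              # one-sided DFT of the trace
        columns.append(np.abs(spectrum[:n_bins]))  # keep the first n_bins amplitudes
    image = np.stack(columns, axis=1)              # shape (n_bins, n_positions)
    image /= image.max()                           # normalize to the maximum value
    return image[..., np.newaxis]                  # add channel axis

# Example: 15 lateral positions, 1000 samples each
rng = np.random.default_rng(0)
signals = rng.standard_normal((15, 1000))
img = make_sdft_image(signals)
print(img.shape)  # (200, 15, 1)
```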
Figure 9. The CNN for this experiment has seven layers. The first convolution layer uses a kernel of 5 × 5 with a stride of one, the second convolution layer uses a kernel of 3 × 3 with a stride of one, and all activation functions are ReLU.
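The operation of the first layer in Figure 9 (a 5 × 5 kernel, stride one, ReLU activation) can be illustrated with a plain NumPy forward pass. This is a didactic single-channel sketch, not the trained network; in practice a deep-learning framework applies many such kernels per layer, with learned weights.

```python
import numpy as np

def conv2d_relu(image, kernel):
    """'Valid' 2D convolution with stride one followed by ReLU -- the
    operation performed by the network's first layer (5 x 5 kernel).
    Single channel, single kernel, for illustration only.
    """
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Inner product of the kernel with the window at (i, j)
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)  # ReLU activation

# A 200 x 15 input (one spectrogram) through a 5 x 5 averaging kernel
spectrogram = np.random.default_rng(1).standard_normal((200, 15))
feature_map = conv2d_relu(spectrogram, np.ones((5, 5)) / 25.0)
print(feature_map.shape)  # (196, 11)
```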
Figure 10. An example of a 200 × 15 × 1 STFT for a case with a Hole (left) and a case without a Hole (right). Images have been expanded so that the 15 columns of the image are more visible.
Figure 11. Time-domain signal observed in channel A and channel B. (a) no hole centered; (b) hole centered; (c) hole-filled centered cases.
Figure 12. (a) Measured sensor signals; (b) Hermite–Gaussian fit for the No Hole case; (c) Hermite–Gaussian fit for the Hole case; (d) Hermite–Gaussian fit for the Hole Filled case.
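The Hermite–Gaussian profiles used for the fits in Figure 12 can be evaluated with NumPy's physicists' Hermite polynomials. The unnormalized form H_n(√2 x/w) exp(−x²/w²) and the parameter names below are an assumption for illustration; an actual fit would pass such a function to a least-squares routine (e.g., scipy.optimize.curve_fit) with the order, width, and center as fit parameters.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_gaussian(x, n, w):
    """Unnormalized Hermite-Gaussian profile of order n and width w:
    H_n(sqrt(2) x / w) * exp(-x**2 / w**2).
    Illustrative fitting function for the lateral amplitude profile.
    """
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0  # select the n-th physicists' Hermite polynomial
    return hermval(np.sqrt(2) * x / w, coeffs) * np.exp(-x**2 / w**2)

x = np.linspace(-3, 3, 201)
hg0 = hermite_gaussian(x, 0, 1.0)  # fundamental: a symmetric Gaussian
hg1 = hermite_gaussian(x, 1, 1.0)  # first order: antisymmetric profile
```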
Figure 13. Experimental and theoretical phase as a function of x.
Figure 14. Amplitude of (a) the sensor signal and (b) the driving-voltage-to-sensor-signal transfer function, and phase of (c) the sensor signal and (d) the transfer function.
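The driving-voltage-to-sensor-signal transfer function of Figure 14 can be sketched as the ratio of the two spectra, H(f) = Y(f)/X(f); dividing out the drive removes the source's own amplitude and phase so that only the medium's response remains. Variable names and the synthetic check are illustrative, not the authors' code.

```python
import numpy as np

def transfer_function(drive, sensor):
    """Amplitude and phase of H(f) = Y(f)/X(f), from the DFTs of the
    drive and sensor traces. Illustrative sketch only."""
    X = np.fft.rfft(drive)
    Y = np.fft.rfft(sensor)
    # Divide only where the drive has meaningful energy to avoid noise blow-up
    H = np.divide(Y, X, out=np.zeros_like(Y),
                  where=np.abs(X) > 1e-6 * np.abs(X).max())
    return np.abs(H), np.angle(H)

# Synthetic check: sensor is a scaled, phase-delayed copy of a 50 Hz drive tone
t = np.linspace(0.0, 1.0, 2000, endpoint=False)
drive = np.sin(2 * np.pi * 50 * t)
sensor = 0.5 * np.sin(2 * np.pi * 50 * t - np.pi / 4)
amp, phase = transfer_function(drive, sensor)
k = 50  # DFT bin of the 50 Hz drive (1 Hz bin spacing)
# amp[k] recovers the 0.5 scaling; phase[k] recovers the -pi/4 shift
```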
Figure 15. The accuracy and loss for each model over each training epoch. All models except the phase model are capable of learning to distinguish the nine labels.
Figure 16. The accuracy for each fold in the K-fold validation is displayed for each configuration. It is evident that all models experience some decrease in accuracy on the small training set, but the transfer function models maintain excellent performance.
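The K-fold scheme behind Figure 16 amounts to shuffling the sample indices, splitting them into k folds, and letting each fold serve once as the validation set while the remaining folds train the model. The helper below is a hypothetical sketch of that index bookkeeping, not the authors' code.

```python
import numpy as np

def kfold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index arrays for k-fold cross-validation.
    Each of the k folds is used once for validation. Illustrative helper."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train, val

# Example: 100 samples, 5 folds of 20 validation samples each
splits = list(kfold_indices(100, 5))
```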
Figure 17. The correlation matrices for the Amplitude data (top left), the Transfer Function Amplitude data (top right), the Phase data (bottom left), and the Transfer Function Phase data (bottom right) for the full set of labels.
Figure 18. The correlation matrices for the Amplitude data (top left), the Transfer Function Amplitude data (top right), the Phase data (bottom left), and the Transfer Function Phase data (bottom right) for the set of Hole, No Hole, or Hole Filled only.
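One way to obtain a label-correlation matrix like those in Figures 17 and 18 is to correlate the mean input of each class. The sketch below assumes flattened mean spectrograms compared by Pearson correlation via `np.corrcoef`; the paper's exact matrix construction may differ.

```python
import numpy as np

def class_correlation_matrix(samples_by_class):
    """Pearson correlation between the per-class mean inputs, flattened
    to vectors -- one measure of how similar the labels' signatures are.
    `samples_by_class` maps label -> 2D array of flattened samples."""
    means = [np.asarray(v).mean(axis=0) for v in samples_by_class.values()]
    return np.corrcoef(np.stack(means))

# Synthetic example with three of the labels, 10 samples of length 50 each
rng = np.random.default_rng(2)
data = {lbl: rng.standard_normal((10, 50)) + i
        for i, lbl in enumerate(["Hole", "No Hole", "Hole Filled"])}
C = class_correlation_matrix(data)
print(C.shape)  # (3, 3)
```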
Figure 19. (a) sensor signal amplitude and phase, (b) transfer function amplitude and phase.
Figure 20. (a) Amplitude correlation to label, (b) phase correlation to label.
Figure 21. (a) Amplitude correlation to label, (b) phase correlation to label, (c) amplitude of Hermite–Gaussian fits, (d) phase of Hermite–Gaussian fits.
Figure 22. The separation matrices for the Amplitude data (top left), the Transfer Function Amplitude data (top right), the Phase data (bottom left), and the Transfer Function Phase data (bottom right).
Figure 23. The separation matrices for the Amplitude data (top left), the Transfer Function Amplitude data (top right), the Phase data (bottom left), and the Transfer Function Phase data (bottom right).
Table 1. Database Labels Used by the Model.
Label: Description
Hole: the specimen with a hole, placed over the center of the transmitter
No Hole: the specimen with no hole, placed over the center of the transmitter
Hole Filled: the specimen with a hole filled with silicone, placed over the center of the transmitter
Hole Right: the specimen with a hole, placed so that the center is to the right of the transmitter
No Hole Right: the specimen with no hole, placed so that the center is to the right of the transmitter
Hole Filled Right: the specimen with a hole filled with silicone, placed so that the center is to the right of the transmitter
Hole Left: the specimen with a hole, placed so that the center is to the left of the transmitter
No Hole Left: the specimen with no hole, placed so that the center is to the left of the transmitter
Hole Filled Left: the specimen with a hole filled with silicone, placed so that the center is to the left of the transmitter

Jarreau, B.; Yoshida, S.; Laprime, E. Deep Machine Learning for Acoustic Inspection of Metallic Medium. Vibration 2022, 5, 530-556. https://doi.org/10.3390/vibration5030030