3.1. Optimizing the Deep Convolutional Neural Network
In this study, a monitoring system that evaluates thruster health from the flow noise signal generated by the AUV is constructed using a CNN. This requires an appropriate CNN architecture and an optimization process. The final CNN architecture is determined by composing two types of artificial neural networks and comparing their accuracy.
3.1.1. Architectural Optimization of Convolutional Neural Networks
In general, the depth of an artificial neural network is known to correlate positively with its performance. However, when an excessively deep neural network is constructed, the number of hyperparameters that determine the characteristics of the network increases, and overfitting may occur. Therefore, it is essential to find the appropriate number of layers through repeated trials.
In this study, as shown in Figure 8, two types of neural networks were created and their accuracy was compared. In Type-A, filters with a size of 10 × 10 were used, and each of the four layers consisted of a convolution layer and a ReLU function. The numbers of filters used in the successive layers were 8, 8, 16, and 32. The resulting feature map was converted into a one-dimensional vector through flattening, and the final output was then derived through the FCL and the softmax layer. In Type-B, filters with a size of 5 × 5, smaller than those of Type-A, were applied, and each of the six layers consisted of a convolution layer and a ReLU function. The numbers of filters used in the successive layers were 8, 16, 16, 32, 64, and 64. GAP was then applied to convert the feature map into a one-dimensional vector of length 64, which was passed through a layer combining an FCL and a dropout layer. Finally, the output was derived through the softmax layer in the same way as in Type-A.
The size of the input image used in this study is 400 × 400 × 3. For the training set, 70% of the data were randomly selected from the entire data set, and the remaining 30% were assigned to the test set.
Unlike Type-A, Type-B uses smaller filters to reduce memory, but uses more filters and a deeper network. As the network deepens, overfitting may occur. To prevent this, the classification layer of Type-B combines GAP, an FCL, and dropout instead of consisting of an FCL alone. In particular, GAP is known to be effective in processing high-dimensional feature maps derived from a large number of filters, so it is well suited to Type-B.
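As a rough illustration of how filter size, filter count, and GAP affect the size of the two networks, the following minimal sketch counts the convolutional weights implied by the filter sizes and filter counts given above and shows how GAP reduces a feature map to one value per channel. The function names are hypothetical, the three-class output (normal, single-blade damage, double-blade damage) is taken from the data set description, and stride, padding, and any other layer settings are not considered because they are not stated here.

```python
def conv_params(kernel, channels):
    """Weights plus biases for a stack of kernel x kernel convolution layers.

    `channels` lists the input channel count followed by the number of
    filters in each successive layer.
    """
    total = 0
    for c_in, c_out in zip(channels, channels[1:]):
        total += kernel * kernel * c_in * c_out + c_out
    return total


def fc_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def global_average_pool(feature_map):
    """Collapse each channel (a 2-D list of values) to its mean."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_map]


# 3 input channels (RGB image) followed by the filter counts from the text.
type_a_conv = conv_params(10, [3, 8, 8, 16, 32])
type_b_conv = conv_params(5, [3, 8, 16, 16, 32, 64, 64])

# After GAP the classifier of Type-B sees 64 values, whatever the map size.
type_b_head = fc_params(64, 3)

# Toy 64-channel feature map (4 x 4 per channel) to show the GAP output shape.
toy_map = [[[float(c)] * 4 for _ in range(4)] for c in range(64)]
pooled = global_average_pool(toy_map)

print(type_a_conv, type_b_conv, type_b_head, len(pooled))
```

Because GAP fixes the classifier input at the number of channels, the FCL that follows stays small regardless of the spatial size of the last feature map, whereas a flattened feature map grows with that size.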
3.1.2. Comparison of Trained Neural Networks of Two Architecture Types
The two neural networks, designed with different concepts, were trained, and their accuracy and overfitting were compared and analyzed from the viewpoint of constructing a health monitoring system. For both Type-A and Type-B, the training settings were those listed in Table 3.
Figure 9 shows the accuracy and loss curves obtained during training. For Type-A, the accuracy and loss fluctuate heavily during the first 20 iterations and become relatively stable afterwards. However, between 20 and 100 iterations, the accuracy decreases in places and the loss increases in places. In well-behaved training, the accuracy increases and the loss decreases steadily with iteration. If the loss decreases and then shows bumps where the trend reverses in a specific section, overfitting can be suspected. For Type-A, some overfitting therefore appears to have occurred midway through training. Conversely, for Type-B, both accuracy and loss converge smoothly over the entire iteration range, and the loss in particular decreases continuously without any intermediate bump.
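The "bump" criterion described above can be made concrete with a simple check on a recorded loss history. The following is only a sketch under stated assumptions: the function name, the tolerance parameter, and the loss values are hypothetical and are not taken from the training runs in Figure 9.

```python
def rising_steps(loss, tolerance=0.0):
    """Return the iteration indices at which the loss rose by more than `tolerance`."""
    return [i for i in range(1, len(loss)) if loss[i] - loss[i - 1] > tolerance]


# Hypothetical loss histories, not values from the study.
smooth = [1.10, 0.80, 0.55, 0.40, 0.31, 0.25]
bumpy = [1.10, 0.70, 0.45, 0.52, 0.60, 0.41]

print(rising_steps(smooth))  # history that decreases at every step
print(rising_steps(bumpy))   # history with a section where the loss climbs
```

A non-zero tolerance can be used to ignore small step-to-step noise, so that only sustained reversals of the trend are flagged.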
Table 4 compares the final accuracy and loss after training for the two types of neural networks. As expected from the trend shown in Figure 9, Type-B exhibits better accuracy and lower loss than Type-A. From these results, it can be concluded that the Type-B architecture is more suitable for the health monitoring system than Type-A. All subsequent neural networks were trained using the Type-B architecture.
3.2. Accuracy Analysis According to Pre-Processing Method
In this section, the effect of data pre-processing on the development of the AUV health monitoring system was analyzed. In the flow noise generated by the AUV, the BPF noise from the thruster is strongly dominant over other noise sources. Furthermore, damaged thrusters generally generate higher levels of BPF noise than thrusters in normal condition. Given these acoustic characteristics, the accuracy and utility of the resulting health monitoring system can be expected to vary depending on how the collected flow noise data are pre-processed.
Therefore, in this section, several variants of the pre-processing technique applied to the flow noise data were examined, together with the resulting change in the monitoring system. All variants share the wavelet transform as the basic data processing technique, but they differ in the acoustic scale. The flow noise data can be expressed either on a linear pressure scale or on a dB scale, that is, as a sound pressure level (SPL). In addition, the noise data can be normalized to the maximum value. As mentioned above, because a damaged thruster generates higher-than-normal flow noise, a failure is likely to be identified simply by an increase in the BPF noise level. When the neural network is trained with data normalized to the maximum noise value, it can instead be expected to diagnose the failure by recognizing the noise distribution characteristics rather than the noise level.
Therefore, in this study, three different data pre-processing methods were selected as follows:
3.2.1. Linear Scale
First, Figure 10 compares the scalograms obtained in the normal and damaged conditions of the thruster. In all three cases, a high noise level occurs at the first BPF (4 × 27 = 108 Hz), which is the product of the number of blades (4) and the rotational speed (27 rps). Since the noise level is expressed on a linear scale, only the BPF noise is emphasized in the distribution of the flow noise. In particular, for a thruster-propelled vehicle such as the target AUV, BPF noise is dominant in the generated flow noise, so the distribution characteristics of other noise sources are masked by it. Unlike on the decibel scale, noise sources other than the BPF noise appear at an excessively low level on the linear scale. Therefore, in the image data pre-processed on the linear pressure scale, the change in the BPF noise level due to thruster damage is emphasized most, and the change in the distribution characteristics of other noise sources is not prominent.
3.2.2. Decibel Scale
Figure 11 shows the scalograms of flow noise expressed on the decibel scale for each thruster state. Unlike in the previous case, because the logarithmic decibel scale is applied, noise sources weaker than the BPF noise are properly resolved. A high noise level again appears at the first BPF, but the decibel scale also reveals the distribution characteristics of the other noise sources. Looking at the individual noise components, the BPF noise has a high level and therefore appears mainly in the red component, whereas the other noise sources are distributed in the green and blue components. However, the change in the BPF noise level due to thruster damage is still the most prominent feature throughout the image.
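The difference between the two scales can be illustrated with the standard SPL definition, 20·log10(p / p_ref). The sketch below is a minimal example under stated assumptions: the reference pressure of 1 µPa is the value commonly used for underwater sound and is assumed here, and the two pressure amplitudes are hypothetical rather than measured values.

```python
import math

P_REF = 1e-6  # Pa; 1 micropascal reference pressure (assumed, not stated in the text)


def spl_db(pressure_pa, p_ref=P_REF, floor=1e-12):
    """Sound pressure level in dB for a pressure amplitude in Pa.

    `floor` avoids log10(0) for silent bins.
    """
    return 20.0 * math.log10(max(abs(pressure_pa), floor) / p_ref)


# Hypothetical amplitudes: a dominant tonal peak and a weaker background source.
bpf_peak = 0.01     # Pa
background = 0.0001  # Pa

linear_ratio = background / bpf_peak            # fraction of the colour range on a linear scale
db_gap = spl_db(bpf_peak) - spl_db(background)  # separation on the decibel scale

print(linear_ratio, spl_db(bpf_peak), spl_db(background), db_gap)
```

On a linear colour axis the weaker component occupies only a small fraction of the range set by the peak, whereas on the decibel axis the two components are separated by a fixed number of decibels and both remain visible.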
3.2.3. Normalized Decibel Scale
Figure 12 shows the result of normalizing the distribution of flow noise to the maximum noise value. The noise distribution data were normalized by dividing the noise level across the scalogram by the maximum value of noise derived from each scalogram, and then unifying the maximum value to 86 dB for all data. Through this process, the significant change in BPF noise level due to thruster damage, which appeared in the previous technique, was compensated for. As can be seen from the comparison of the noise distribution results, since the maximum values of all scalograms are unified, there is no change in the level of BPF noise depending on the damage condition of the thruster. Instead, it can be seen that the distribution characteristics of noise sources other than BPF noise are actively changing depending on the damage state of the thruster. This feature is expected to help the neural network not only focus on the BPF noise source in the learning process, but comprehensively recognize the distribution characteristics of other noise sources.
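One possible reading of this normalization is sketched below. Dividing a pressure by the scalogram maximum corresponds to subtracting the peak level in decibels, after which every scalogram is shifted so that its peak sits at 86 dB. This interpretation, the function name, and the example levels are assumptions made for illustration and may differ from the procedure actually used.

```python
def normalize_db(scalogram_db, target_max_db=86.0):
    """Shift a scalogram given in dB so that its maximum equals `target_max_db`."""
    peak = max(max(row) for row in scalogram_db)
    return [[value - peak + target_max_db for value in row] for row in scalogram_db]


# Hypothetical 2 x 2 scalograms in dB (rows: scales, columns: time).
normal = [[100.0, 70.0], [90.0, 60.0]]
damaged = [[110.0, 70.0], [95.0, 75.0]]

normal_n = normalize_db(normal)
damaged_n = normalize_db(damaged)

print(normal_n)
print(damaged_n)
```

After the shift both scalograms share the same peak value, so any remaining difference between them lies in how the other components are distributed relative to that peak.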
3.2.4. Comparative Analysis of Neural Networks According to Pre-Processing Method
Table 5 shows the configuration of the data set used to train the neural network. The training set and validation set were divided in a ratio of 7 to 3, and training was performed using 704, 640, and 560 samples for the normal, single-blade damage, and double-blade damage conditions, respectively.
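A 7-to-3 random split of this kind can be sketched as follows. The split is performed per class here so that each condition keeps the same ratio; this stratification, the function name, the fixed seed, and the placeholder sample names are assumptions for illustration and are not details given in the text.

```python
import random


def split_by_class(samples_by_class, train_fraction=0.7, seed=0):
    """Randomly split each class into training and validation parts."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(train_fraction * len(shuffled))
        train += [(s, label) for s in shuffled[:cut]]
        val += [(s, label) for s in shuffled[cut:]]
    return train, val


# Placeholder data set: ten hypothetical scalogram file names per condition.
data = {
    label: [f"{label}_{i:03d}.png" for i in range(10)]
    for label in ("normal", "single_blade", "double_blade")
}

train_set, val_set = split_by_class(data)
print(len(train_set), len(val_set))
```

Splitting each class separately keeps the class proportions identical in the two subsets, which is convenient when the classes have different sizes.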
Table 6 compares the accuracy and loss obtained when validating the three neural networks trained with data from the different pre-processing methods. First, all three neural networks showed a high accuracy of over 99%, which indicates that the health monitoring system for the AUV thruster developed in this study is effective. Second, there was a slight difference in accuracy depending on the pre-processing technique. The two neural networks using the decibel scale showed 100% accuracy, whereas the neural network using the linear scale showed a slightly lower accuracy of 99.90%. As shown in the comparative analysis of the pre-processed images, on the linear scale it is difficult to examine the distribution characteristics of noise sources other than the BPF noise. The decrease in accuracy is therefore presumably caused by the tendency of the neural network to over-rely on the level of the BPF noise during learning. Even in the normal condition of the thruster, flow noise similar to that of the single-blade damage condition occurs on rare occasions, so recognizing a failure based on the BPF noise level alone may lower the accuracy.
3.3. Performance of AUV Health Monitoring System for Off-Training Conditions
As mentioned above, the developed neural networks exhibit high identification accuracy close to 100% under the learned conditions. In this section, the classification accuracy was analyzed for the derived neural networks under different conditions from the data used for learning. The off-training conditions selected for the test are as follows.
Normal thruster, moderate operation speed (1.2 m/s);
Normal thruster, rapid operation speed (1.5 m/s);
Single-blade damaged thruster, silent operation speed (0.7 m/s).
The purpose of the tests under off-training conditions is to analyze the potential performance and scalability of the trained neural network. In particular, it was examined whether the developed system could capture the noise distribution characteristics associated with the state of the thruster and classify them accurately, even for data not used for training. As mentioned earlier, damaged thrusters typically produce higher noise levels than normal thrusters. In addition, the faster the thruster rotates, the higher the noise level it generates. Therefore, for the off-training test, conditions were chosen that would be as confusing as possible for the neural network to classify. Two conditions were selected in which the thruster is in a normal state but operates at a high speed, generating high noise similar to that of a damaged thruster. In addition, one condition was selected in which the thruster is damaged but operates at a low speed, generating low noise similar to that of a thruster in a normal state.
Table 7 provides information on thruster conditions and operating speeds for the selected off-training test conditions. For the first two conditions of moderate and rapid operation, the thruster was in normal condition and the operating speed was increased from 1.0 m/s to 1.2 m/s and 1.5 m/s, respectively. As the operation speed of the AUV increased, the self-navigation point also changed accordingly, and the thruster rotation speed increased from 27 rps to 32 rps and 37 rps in moderate and rapid operation, respectively. In the damaged thruster condition, silent operation was applied to reduce the operating speed to 0.7 m/s, and accordingly, the self-navigation point also changed, reducing the thruster rotational speed from 27 rps to 23 rps.
Figure 13 shows scalograms of the flow noise generated under off-training conditions, expressed on the linear scale. Figure 14 and Figure 15 show scalograms of the same flow noise expressed on the decibel scale and the normalized decibel scale, respectively.
In the normal–moderate condition, the operating speed (1.2 m/s) is higher than the cruising speed (1.0 m/s) and the thruster rotation speed increases, so a higher level of BPF noise is generated at a BPF (128 Hz) higher than the BPF at the cruising speed (108 Hz). However, since the operating speed is not significantly different, the noise distribution does not differ significantly from that of the normal condition at the cruising speed. In the normal–rapid condition, the operating speed (1.5 m/s) is higher than the moderate speed (1.2 m/s), so the first BPF increases further (148 Hz) and the noise level is also higher. In the single-blade damage–silent condition, the operating speed (0.7 m/s) is lower than the cruising speed (1.0 m/s) and the rotation speed of the thruster is reduced, so the BPF noise is generated at a lower frequency (92 Hz) than in the cruising condition.
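The BPF values quoted for each operating condition follow from the relation used earlier for the cruising condition, the first BPF being the number of blades multiplied by the rotational speed. The short sketch below reproduces them from the four-blade count and the rotation speeds given in the text; the function and dictionary names are illustrative only.

```python
def blade_passing_frequency(n_blades, rotation_rps, harmonic=1):
    """Blade passing frequency in Hz for a given harmonic order."""
    return harmonic * n_blades * rotation_rps


N_BLADES = 4  # from the cruising case, 4 x 27 = 108 Hz

# Thruster rotation speeds in rps for each operating condition.
rotation_speed = {"cruising": 27, "moderate": 32, "rapid": 37, "silent": 23}

first_bpf = {
    name: blade_passing_frequency(N_BLADES, rps)
    for name, rps in rotation_speed.items()
}
print(first_bpf)
```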
In the linear-scale results in Figure 13, as mentioned above, the BPF noise is overly emphasized and the distribution of other noise sources is not revealed. Furthermore, in the normal–rapid condition, it will be difficult for the neural network to classify the thruster as normal, because the BPF noise level appears similar to that of the single-blade damaged condition in Figure 10 even though the thruster is in a normal condition. In the decibel-scale results in Figure 14, the distribution of noise sources other than BPF noise is also revealed. However, the normal–rapid condition still appears similar to the noise distribution of the single-blade damaged condition in Figure 11.
On the other hand, in the results expressed on the RGB-normalized dB scale in Figure 15, the maximum value is normalized and the distribution characteristics of all noise sources are emphasized. Accordingly, the result for the normal–rapid condition shows a distribution similar to that of the normal condition in Figure 12. Therefore, the neural network appears able to classify the normal–rapid results correctly as the normal condition.
Table 8 shows the accuracy obtained by testing the neural networks under the off-training conditions. Three neural networks using different pre-processing methods were tested. As mentioned above, in the normal–moderate condition, the moderate speed did not differ significantly from the cruising speed, so the distribution and magnitude of the BPF noise did not change much. Accordingly, an accuracy of 87.5% was obtained with all three neural networks. In the normal–rapid condition, by contrast, correct classification is very difficult because the results are very similar to those of the single-blade damage condition at cruising speed. As a result, the neural networks using the linear scale and the decibel scale showed an accuracy of 0%. The neural network using the normalized dB scale performed better, with 50% accuracy, because normalization made it focus on the distribution of noise sources. In the single-blade damage–silent condition, the neural network using the linear scale showed an accuracy of 66.7%, whereas both neural networks using the decibel scale showed an accuracy of 100%. This seems to indicate that the BPF noise level was overemphasized in the training of the linear-scale neural network, so that the BPF noise level generated in the single-blade damage–silent condition was mistaken for that of a normal state.