Article

Research on a Feature Enhancement Extraction Method for Underwater Targets Based on Deep Autoencoder Networks

China Ship Research and Development Academy, Beijing 100192, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(4), 1341; https://doi.org/10.3390/app14041341
Submission received: 8 October 2023 / Revised: 30 November 2023 / Accepted: 30 January 2024 / Published: 6 February 2024
(This article belongs to the Special Issue Underwater Acoustic Signal Processing)

Abstract
The low-frequency line spectrum of the radiated noise of underwater acoustic targets contains features that describe the intrinsic properties of the target and make the target susceptible to exposure. To extract the line spectral features of underwater acoustic targets, this paper proposes a method combining image processing with a deep autoencoder (DAE) network to enhance the weak low-frequency line spectrum of underwater targets in an extremely low signal-to-noise ratio environment, based on measured data from large underwater vehicles. A Gauss–Bernoulli restricted Boltzmann machine (G–BRBM) for real-valued signal processing was designed and programmed, and a greedy algorithm was introduced. On this basis, the encoding and decoding mechanism of the DAE network was used to eliminate interference from environmental noise. The weak line spectral features in the 10–300 Hz band were effectively enhanced and extracted under an extremely low signal-to-noise ratio, and the reconstructed line spectral features were obtained. Data from large underwater vehicles detected by far-field sonar arrays were processed, and the results show that the proposed method adaptively enhances the line spectrum in a data-driven manner. The DAE method more than doubled the extractable line spectral density in the 10–300 Hz band. Compared with traditional feature enhancement extraction methods, the DAE method has clear advantages for the extraction of weak line spectra.

1. Introduction

The low-frequency line spectrum (10–300 Hz) generated by mechanical operations is a crucial attribute for underwater acoustic target identification; it is primarily linked to the periodic running speed of mechanical equipment and the resonance frequencies of the target's structure. Most of the line spectrum originates from the large reciprocating and rotary mechanical equipment, as well as pumps, found on vessels throughout the sea, with an intensity of around 160 dB. In actual detection, the primary cause of a low signal-to-noise ratio (SNR) is that the target's inherent noise is equal to or lower than the ambient ocean noise; additional interference from other ships also plays a role. The efficient extraction of weak line spectral characteristics in low-SNR environments is therefore a crucial issue in underwater acoustic target recognition. Traditional feature enhancement extraction techniques often rely on good feature engineering [1]. Zare et al. [2] proposed a hydroacoustic image feature extraction method based on two-dimensional permutation entropy (PE) coding, and Li et al. [3] proposed a feature extraction method based on hierarchical entropy (HE). Zare et al. [4] combined permutation entropy and slope entropy as features of intrinsic mode functions, which helped to improve the accuracy and stability of underwater signal recognition. In addition, scale-invariant feature transform (SIFT) [5], histograms of oriented gradients (HOG) [6], and line spectrum enhancement based on time-accumulation algorithms are traditional feature extraction techniques. With the emergence of deep learning theory, scholars have designed many deep neural networks for feature extraction. Yu et al. [7] used DNNs to learn features from wavelet packet component energy that are more advanced and discriminative than spectral features extracted by traditional methods.
Feng [8] applied multiple deep structures to SAR feature extraction and obtained excellent experimental results. Chen et al. [9] patched occluded images in a real environment by fusing the out-of-field features of the occluded region with in-field features to form an optimized map of the incomplete image, which could then be used for feature extraction and for predicting the partially occluded regions of the complete image; however, this network model was only applied to the original image and does not address the denoising of global noise. The stacked convolutional sparse denoising autoencoder (SCSDA) model proposed by Wang et al. [10] combines a sparse autoencoder and a convolutional neural network to extract the deep features of underwater sound signals with impressive denoising performance. Shi et al. [11] used a convolutional denoising autoencoder network (CDAE) for end-to-end infrared small target detection, which is more efficient and accurate than traditional methods; additionally, the structural loss they proposed helps to preserve background texture features during the encoding process and enables the unsupervised extraction of certain features. The underwater denoising autoencoder (UDAE) model proposed by Hashisho et al. [12] integrates the autoencoder with convolutional layers to extract invariant features from underwater images and enhance their clarity. The deep convolutional denoising autoencoder proposed by Testolin et al. [13] can remove noise from received signals quickly and efficiently and performs well at low SNRs, but the method requires an active sound source, so its covertness is low. In the last three years, Chen et al. [14] have used a LOFAR–CNN deep neural network, Hyunsoo et al. [15] have enhanced hydroacoustic images using a denoising autoencoder, and Ni et al. [16] have used an improved autoencoder network with stacking, sparsity, and denoising functions for DEMON spectral feature extraction.
Chen et al. [17,18] used neural networks to extract shallow features at different levels, reconstructing an image with the idea of multilevel feature fusion; they also reduced the number of parameters with an asymmetric residual module to finally obtain a super-resolution reconstructed image. Although these works show the excellent performance of neural networks for image feature extraction, the images themselves do not involve denoising or spectral problems. Su et al. [19] reviewed the application of neural networks in image restoration, an important part of which is the processing of image features by neural networks, and found that the quality of the features extracted by neural networks deeply affects the quality of the final image. In the denoising of hydroacoustic data, Liu et al. [20] used 3D features combined with a CRNN network model to identify underwater acoustic targets and used data augmentation to improve the subsequent target identification rate; however, the neural network required a large number of labeled data to train, and its application in many scenarios may be limited. Khishe [21] used an autoencoder to extract features automatically, which allows the optimal combination of features to be selected based on type and dimensionality without human intervention; in combination with recurrent and wavelet networks, the temporal features of spectrograms can be received and fixed.
In many cases, supervised neural networks such as CNNs [22,23,24] require the use of gradient descent on a labeled training set to reduce the training error. Minimizing the objective function in this way has the following limitations:
  • Labeled data are required;
  • Some works train the network with supervised learning. When the network has few layers, the parameters can usually converge to a reasonable range. However, when this method is used to train deep networks, the results are not satisfactory. In particular, training a neural network with supervised learning usually involves solving a highly non-convex optimization problem, such as minimizing the training error. For deep networks, the search region of such non-convex problems is filled with a large number of local extrema, so gradient descent does not work well;
  • As the depth of the network increases, the magnitude of the back-propagated gradient decreases, so the derivative of the loss function with respect to the weights of the initial layers becomes very small. Using gradient descent, the weights of the initial layers then change so slowly that the network cannot learn effectively from the samples.
To address the above problems, this paper uses the layer-by-layer greedy algorithm [25]. Its main idea is to compute one layer of the network at a time: first a network containing one hidden layer is computed, and only when this layer has been computed do we start computing a network with two hidden layers. In each step, we fix the first k−1 layers already obtained and then add the k-th layer (i.e., the output of the first k−1 layers is taken as its input). During the computation of the model, an unsupervised method is used to initialize the weights of the deep network with the weights obtained from each layer individually. Finally, the entire network is fine-tuned. Our model takes the LOFAR spectrum as input and only requires the superposition of multiple autoencoder layers, which is computationally simple, allows deep line spectral features to be extracted, and realizes the denoising and enhancement of the spectrogram.
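The layer-by-layer scheme can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: each layer is stood in for by a tied-weight sigmoid autoencoder, and the helper names (`train_layer`, `greedy_pretrain`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_layer(X, n_hidden, lr=0.1, epochs=100):
    """Train one tied-weight autoencoder layer to reconstruct X."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, (n_in, n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(n_in)
    for _ in range(epochs):
        H = sigmoid(X @ W + b)        # encode
        Z = sigmoid(H @ W.T + c)      # decode with tied weights
        dZ = (Z - X) * Z * (1 - Z)    # output-layer residual
        dH = (dZ @ W) * H * (1 - H)   # hidden-layer residual
        W -= lr * (X.T @ dH + dZ.T @ H) / len(X)
        b -= lr * dH.mean(axis=0)
        c -= lr * dZ.mean(axis=0)
    return W, b

def greedy_pretrain(X, layer_sizes):
    """Train the k-th layer only after layers 1..k-1 are fixed; each new
    layer takes the codes of the already-trained layers below it as input."""
    params, codes = [], X
    for n_hidden in layer_sizes:
        W, b = train_layer(codes, n_hidden)
        params.append((W, b))
        codes = sigmoid(codes @ W + b)   # fixed output feeds the next layer
    return params

X = rng.random((64, 20))                 # toy stand-in for LOFAR patches
params = greedy_pretrain(X, [16, 8])
```

After pretraining, the stacked weights would initialize the full deep network, which is then fine-tuned end to end as described above.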
Considering that the actual marine environment is very complex, the acquired underwater target data are often heavily contaminated by background noise. In a very low signal-to-noise ratio environment, the classical binary-valued restricted Boltzmann machine (RBM) cannot fully express the information of underwater targets, and traditional methods cannot repair broken and weak line spectra. In this paper, a feature enhancement extraction method combining the two-dimensional time–frequency spectrum (LOFAR spectrum) with a deep autoencoder (DAE) network is proposed. Using the encoding and decoding mechanism of the DAE, the spectrogram is compressed and segmented into certain frequency intervals whose outputs are calculated one by one, and the optimal weights are selected for the synchronous optimization of the frequency intervals during fine-tuning, so as to suppress the background noise and reconstruct the LOFAR spectrum.
The main contributions of this paper are as follows:
  • A G-BRBM for real-valued signals is designed and programmed, and the parameters of the DAE are optimized with the layer-by-layer greedy and fine-tuning algorithms;
  • The proposed DAE is applied to the enhanced extraction of line spectral image features; it can adaptively repair the feature line spectrum, remove noise, and improve the contrast between the weak line spectrum and the background;
  • The spectral image features processed by the DAE are clear and the line spectra are regularized, which is conducive to the subsequent application of the features.

2. DAE for Feature Line Spectrum Enhanced Extraction

2.1. DAE Network Principles

Each layer of our proposed network model is an RBM structure with an energy function:
E(v,h) = \sum_{i \in \mathrm{vis}} \frac{(v_i - a_i)^2}{2\sigma_i^2} - \sum_{j \in \mathrm{hid}} b_j h_j - \sum_{i,j} \frac{v_i}{\sigma_i} h_j w_{ij}

where \sigma_i is the standard deviation of the visible-layer Gaussian noise, which generally takes the value of 1. This yields:

p(v_i \mid h, \theta) = \mathcal{N}\Big( a_i + \sum_{j} w_{ij} h_j ,\; 1 \Big)

where \mathcal{N}(\mu, V) is a Gaussian distribution with mean \mu and standard deviation V.
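The conditionals above can be put to work in a single training update. The following is an illustrative sketch (not the authors' code) of one contrastive-divergence (CD-1) step for a Gauss–Bernoulli RBM with σ_i = 1: hidden units are binary, and visible units are resampled from the Gaussian conditional.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, a, b, lr=0.01):
    """One CD-1 update for a Gauss-Bernoulli RBM (sigma_i = 1)."""
    ph0 = sigmoid(v0 @ W + b)                       # p(h=1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    v1 = a + h0 @ W.T + rng.normal(size=v0.shape)   # v ~ N(a + hW^T, 1)
    ph1 = sigmoid(v1 @ W + b)                       # p(h=1 | v1)
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)   # positive - negative stats
    a += lr * (v0 - v1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)
    return W, a, b

n_vis, n_hid = 10, 4
W = rng.normal(0.0, 0.1, (n_vis, n_hid))
a, b = np.zeros(n_vis), np.zeros(n_hid)
batch = rng.normal(size=(32, n_vis))                # real-valued inputs
W, a, b = cd1_step(batch, W, a, b)
```

The Gaussian visible units are what let this RBM model real-valued spectral amplitudes, in contrast to the binary-valued RBM discussed in the Introduction.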
When an unsupervised method is used to compute an autoencoder network layer by layer, distortion can be introduced into the input data or hidden layers by randomly zeroing the values of the input-layer nodes with a certain probability, which allows the denoising autoencoder network to obtain more robust features during layer-wise processing. Using denoising autoencoder networks, the original data can be reconstructed from input data containing noise [26].
In Figure 1, the original data are x; x is changed to x̂ by adding random noise, x̂ is used as the input to the autoencoder, the value of the hidden layer is y, and x is then reconstructed from y, giving the reconstructed value z. By forcing the output to match the initial undistorted input, the denoising autoencoder network avoids learning a trivial constant solution. Since the noise is added randomly, the resulting network is more accurate when tested with similarly distorted test data. Finally, since each set of distorted input samples is different, this inherently increases the effective size of the data sample significantly and thus mitigates overfitting.
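The corruption step described above amounts to a masking operation; a minimal sketch (illustrative, with a hypothetical `corrupt` helper) is:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, drop_prob=0.3):
    """Masking corruption used by denoising autoencoders: each input
    component is independently zeroed with probability drop_prob, while
    the clean x remains the reconstruction target."""
    mask = rng.random(x.shape) >= drop_prob   # keep each entry w.p. 1 - drop_prob
    return x * mask

x = np.ones((4, 8))          # clean input x
x_hat = corrupt(x)           # distorted input x̂ fed to the encoder
# training pairs are (x_hat, x): the network must undo the corruption
```

Because a fresh mask is drawn for every presentation, each epoch effectively sees a different distorted copy of the same sample.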
After the DAE network is obtained by processing the restricted Boltzmann machines with the layer-by-layer greedy algorithm, it is fine-tuned according to the discriminative criterion of the underwater target; the purpose of fine-tuning is to treat the input, hidden, and output layers of the DAE network as a whole. Fine-tuning can be divided into two parts: the forward propagation of the signal and the back propagation of the error. In forward propagation, the signal is weighted from the input layer to the hidden layer, then passed through the activation function to the output layer, and the result of the output layer is compared with the expected value. If the error is within the acceptable range, the optimization is finished. If the error is too large, the process enters the back propagation of the error: the error signal corrects the weights and bias values layer by layer, from the output layer through the hidden layer to the input layer, so that the error is gradually reduced until it falls within the acceptable range, at which point the optimization ends.

2.2. Introduction to Line Spectrum Enhancement Method

The specific algorithm of our method is as follows:
Suppose we have a divided subset of samples, \{(x_l^{(1)}, y^{(1)}), \ldots, (x_l^{(m_l)}, y^{(m_l)})\}, which contains m samples. For a single sample (x, y), the cost function is as follows:

J(W,b;x,y) = \frac{1}{2} \left\| h_{W,b}(x) - y \right\|^2

Given a dataset consisting of a subset of divided samples containing m samples, the overall cost function is defined as:

J(W,b) = \frac{1}{m}\sum_{i=1}^{m} J(W,b;x^{(i)},y^{(i)}) + \frac{\lambda}{2}\sum_{l=1}^{n_l-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}} \big(W_{ji}^{(l)}\big)^2 = \frac{1}{m}\sum_{i=1}^{m} \Big( \frac{1}{2}\big\| h_{W,b}(x^{(i)}) - y^{(i)} \big\|^2 \Big) + \frac{\lambda}{2}\sum_{l=1}^{n_l-1}\sum_{i=1}^{s_l}\sum_{j=1}^{s_{l+1}} \big(W_{ji}^{(l)}\big)^2

where W and b are the neural network parameters, \lambda is the weight decay coefficient, n_l is the number of network layers, s_l is the number of nodes in the l-th layer, and W_{ji}^{(l)} denotes the connection weight from unit i in the l-th layer to unit j in the (l+1)-th layer.
Our goal is to find the minimum of the function J(W,b) by computing the parameters with the gradient descent algorithm. Each parameter is first initialized to a very small random value near zero; each iteration of gradient descent then updates the parameters W, b according to the following rules:

W_{ij}^{(l)} := W_{ij}^{(l)} - \alpha \frac{\partial}{\partial W_{ij}^{(l)}} J(W,b)

b_{i}^{(l)} := b_{i}^{(l)} - \alpha \frac{\partial}{\partial b_{i}^{(l)}} J(W,b)
The process is:
① A forward propagation calculation is performed to calculate the activation values for each layer of the network using initialization parameters.
  ② For each output unit in the n_l-th layer, the residuals are calculated according to Equation (7):

\delta_i^{(n_l)} = \frac{\partial}{\partial z_i^{(n_l)}} \frac{1}{2} \left\| y - h_{W,b}(x) \right\|^2 = -\big(y_i - a_i^{(n_l)}\big) f'\big(z_i^{(n_l)}\big)

where z_i^{(n_l)} denotes the weighted input sum of the i-th unit of the n_l-th layer.
  ③ For each layer l = n_l - 1, n_l - 2, n_l - 3, \ldots, 2, the residual of the i-th unit of the l-th layer is calculated according to Equation (8):

\delta_i^{(l)} = \Big( \sum_{j=1}^{s_{l+1}} W_{ji}^{(l)} \delta_j^{(l+1)} \Big) f'\big(z_i^{(l)}\big)
  ④ Calculate the required partial derivatives as in Equations (9) and (10):

\frac{\partial}{\partial W_{ij}^{(l)}} J(W,b;x,y) = a_j^{(l)} \delta_i^{(l+1)}

\frac{\partial}{\partial b_i^{(l)}} J(W,b;x,y) = \delta_i^{(l+1)}
⑤ Repeat the above steps until the cost function is less than the given value.
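Steps ①–⑤ can be sketched compactly for a small sigmoid network. This is an illustrative implementation under stated conventions, not the authors' code: `Ws[l]` has shape (s_{l+1}, s_l), so `Ws[l][j, i]` matches W^{(l)}_{ji} in the equations above, and f'(z) = a(1 − a) for the sigmoid.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gd_step(Ws, bs, x, y, alpha=0.5, lam=1e-4):
    # step 1: forward pass, storing the activations a^{(l)} of every layer
    a = [x]
    for W, b in zip(Ws, bs):
        a.append(sigmoid(W @ a[-1] + b))
    # step 2: output residual, delta = -(y - a) f'(z)
    delta = -(y - a[-1]) * a[-1] * (1 - a[-1])
    deltas = [delta]
    # step 3: propagate residuals back through layers n_l-1, ..., 2
    for l in range(len(Ws) - 1, 0, -1):
        delta = (Ws[l].T @ delta) * a[l] * (1 - a[l])
        deltas.insert(0, delta)
    # steps 4-5: partial derivatives plus the weight-decay term, then update
    for l in range(len(Ws)):
        Ws[l] -= alpha * (np.outer(deltas[l], a[l]) + lam * Ws[l])
        bs[l] -= alpha * deltas[l]
    return 0.5 * np.sum((a[-1] - y) ** 2)   # current cost on this sample

Ws = [rng.normal(0.0, 0.5, (3, 2)), rng.normal(0.0, 0.5, (1, 3))]
bs = [np.zeros(3), np.zeros(1)]
x, y = np.array([0.2, 0.9]), np.array([1.0])
losses = [gd_step(Ws, bs, x, y) for _ in range(50)]
```

Repeating the update drives the cost down toward the stopping threshold of step ⑤.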

3. Simulation of Feature Enhancement Extraction Based on Real Ship Data

3.1. DAE Model Validation

In the simulation experiments on line spectral feature enhancement extraction, our hardware configuration is as follows: Intel Core [email protected] GHz CPU (Intel, Santa Clara, CA, USA), NVIDIA GeForce GTX 1660 Ti GPU (NVIDIA, Santa Clara, CA, USA), and 16 GB of RAM, with the model implemented in a Python 3.6.5 environment. The input layer of the deep network model is (400, 400, 3), and the number of hidden layers is five. The loss curve of our model validation process is shown in Figure 2; it can be seen that the loss has gradually converged to near the optimal point.

3.2. Real-Ship LOFAR Spectrum Experiment

The loss curve in Section 3.1 verifies the convergence of the model; this section verifies the accuracy of the model's G-BRBM representation of underwater targets. Comparative experiments on sonar detection spectral feature enhancement based on the deep autoencoder network were conducted using the measured sonar detection LOFAR spectra of large underwater vehicles as an example. The signal of the large underwater vehicles involved in this study was acquired from a hydrophone array approximately 15 km away from the target, and the test was conducted under sea state 3. By setting the parameters of the hydrophone array element number, array sensitivity, detection distance, target sound source level, and test sea state, this paper constructs a real-ship validation environment for the enhanced extraction of the weak line spectral features of large underwater vehicles by the DAE model under low signal-to-noise ratio conditions. Figure 3 depicts a schematic diagram of how the sonar detection signals of the large underwater vehicles were obtained in this study. The experimental environment parameters are listed in Table 1.
The traditional LOFAR line spectrum enhancement approach, employing a time-accumulation algorithm, first transforms the color LOFAR image into a grayscale representation to facilitate line spectrum enhancement. Next, a histogram equalization algorithm is employed to effectively expand the commonly used brightness range, enhancing local contrast without affecting the overall contrast and rendering the image visually distinct. Finally, image convolution denoising is performed to eliminate the Gaussian white noise and salt-and-pepper additive noise induced by background noise in the image, yielding the line spectral enhancement results of the traditional LOFAR spectrum. Figure 4, Figure 5, Figure 6 and Figure 7 display the processing effects at each stage of the traditional method.
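The histogram-equalization stage of the traditional chain can be sketched in a few lines. This is a generic textbook version for an 8-bit grayscale image stored as a list of lists, not the authors' exact implementation:

```python
def equalize_histogram(gray):
    """Stretch a narrow brightness range across the full 0-255 scale by
    remapping each level through the cumulative distribution function."""
    flat = [p for row in gray for p in row]
    n = len(flat)
    hist = [0] * 256
    for p in flat:
        hist[p] += 1
    # cumulative distribution, then rescale to the full 0-255 range
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    lut = [round((c - cdf_min) / max(n - cdf_min, 1) * 255) for c in cdf]
    return [[lut[p] for p in row] for row in gray]

# a low-contrast patch (levels 50-52) is stretched across the full scale
img = [[50, 50, 51], [51, 52, 52]]
eq = equalize_histogram(img)
```

This is exactly the "expand the commonly used brightness" step: nearby gray levels that dominate the histogram are pushed apart, raising local contrast.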
To ensure consistent inputs to the DAE model, a resize operation is often used to adjust the image size. Commonly used resize interpolation modes in Python 3.6.5 include nearest-neighbor interpolation (INTER_NEAREST), bilinear interpolation (INTER_LINEAR), bicubic interpolation (INTER_CUBIC), interpolation based on pixel-area relationships (INTER_AREA), and Lanczos interpolation (INTER_LANCZOS4). The resize operation used in this paper performs INTER_LINEAR by default.
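For intuition, the following pure-Python sketch shows what the default bilinear (INTER_LINEAR) mode computes per channel: each output pixel is a weighted blend of its four nearest source neighbours. It is an illustration of the interpolation idea, not OpenCV's optimized implementation.

```python
def resize_bilinear(img, out_h, out_w):
    """Bilinear resize of a 2-D grayscale image given as a list of lists."""
    in_h, in_w = len(img), len(img[0])
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        # map the output pixel centre back into source coordinates
        y = max(0.0, min((r + 0.5) * in_h / out_h - 0.5, in_h - 1.0))
        y0, wy = int(y), y - int(y)
        y1 = min(y0 + 1, in_h - 1)
        for c in range(out_w):
            x = max(0.0, min((c + 0.5) * in_w / out_w - 0.5, in_w - 1.0))
            x0, wx = int(x), x - int(x)
            x1 = min(x0 + 1, in_w - 1)
            top = img[y0][x0] * (1 - wx) + img[y0][x1] * wx
            bot = img[y1][x0] * (1 - wx) + img[y1][x1] * wx
            out[r][c] = top * (1 - wy) + bot * wy
    return out

small = [[0.0, 1.0], [1.0, 0.0]]
big = resize_bilinear(small, 4, 4)   # upsample a 2x2 patch to 4x4
```

The same blend is applied independently to each of the three RGB channels when the LOFAR image is resized to 400 × 400 × 3.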
According to the target characteristics and perception conditions, underwater acoustic signal LOFAR spectrum estimation in this study is conducted within a frequency band of 10–300 Hz and a frequency resolution of 0.1 Hz, to better delineate details. To maintain the consistency of the input of the DAE model, the LOFAR spectrum is resized to a 400 × 400 × 3 RGB image, serving as the model’s input. The LOFAR spectrum of remote sonar detection of actual ships is obtained after initial processing on sonar detection data. The spectrum is displayed in Figure 8, with a frequency (10–300 Hz) on the horizontal axis and time (15 min) on the vertical axis.
In Figure 8, it is clear that there are some distinctive characteristic line spectra throughout the time course, but because of the large number of interferences in the marine environment, the line spectral features exhibit interference, breaks, deformations, and sporadic components. To further enhance the features for subsequent extraction, the LOFAR spectrum is reconstructed with the encoding and decoding mechanism of the DAE network. After the deep autoencoder network is obtained by applying the layer-by-layer greedy algorithm to the G-BRBM, samples of real ship data for typical operating conditions are separated by certain frequency intervals, as described in Figure 9. The specific steps are as follows:
  • The generated LOFAR spectrum is encoded with the DAE: the image is transformed into matrix form as input for subsequent calculations, and the LOFAR spectrum is scaled into a 400 × 400 × 3 RGB image through the resize operation. The matrix size depends on the frequency resolution of the LOFAR spectrum, the selected bandwidth, and the selected time duration. To let the denoising network model process natural images and compute nonlinear data models with stronger capability, the autoencoder designed in this paper incorporates an activation function f(x), such as sigmoid(x), at the end of each layer;
  • The generated matrix is used as input, and the layer-by-layer greedy algorithm is used to process the G-BRBM to obtain the DAE network, whose kernel computes the separated subsets one by one and obtains the output, so that the background noise in the original data is suppressed while the feature line spectrum is enhanced;
  • Fine-tune according to the discriminative criteria of underwater targets and observe the size of the error. If the error is within the acceptable range, output the results; if the error is too large, enter the back propagation process, in which the error signal corrects the weights and bias values layer by layer from the output layer through the hidden layer to the input layer, so that the error is gradually reduced to within the acceptable range. Finally, the processed matrix is re-decoded and transformed back into an image.
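The band-wise loop in the steps above can be sketched as follows. The `process_band` callable is a hypothetical stand-in for the trained DAE's encode-process-decode pass, and `enhance_lofar` is an assumed helper name:

```python
import numpy as np

def enhance_lofar(spec, band_width, process_band):
    """Split the (time x frequency) LOFAR matrix along the frequency axis,
    process each sub-band independently, and re-assemble the spectrum."""
    bands = [spec[:, i:i + band_width]
             for i in range(0, spec.shape[1], band_width)]
    return np.hstack([process_band(b) for b in bands])

spec = np.random.default_rng(0).random((400, 400))   # resized LOFAR matrix
identity = lambda band: band                          # placeholder for the DAE
out = enhance_lofar(spec, 50, identity)
```

Processing sub-bands one by one keeps each forward pass small and lets the fine-tuning stage optimize the frequency intervals synchronously, as described above.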
Figure 10 shows the result of the DAE’s reconstruction:
The reconstructed LOFAR spectrum based on the DAE method is compared with the original, and statistics such as the number of line spectra, the line spectral distribution density, and the minimum frequency interval in the 10–300 Hz band are further calculated. For ease of observation, the LOFAR plots after and before reconstruction by the DAE method are shown in Figure 10 and Figure 11, respectively.
Figure 12 shows the image obtained by the traditional LOFAR line spectrum enhancement method based on the time-accumulation algorithm, which employs the traditional histogram equalization algorithm to accumulate and sum the corresponding pixel values of the points in the image over the entire time course, finally achieving the enhancement of the line spectral features in the LOFAR spectrum.
Compared with the traditional image-processing-based method, the reconstructed images obtained by our proposed DAE model are clearly more advantageous. As shown in Table 2, within the 10–300 Hz range, the LOFAR spectrum processed by the DAE method contains 18 line spectra, with a line spectral density of 0.064 and a minimum frequency interval of 2.5 Hz between adjacent lines. In contrast, the traditional method yields 11 line spectra, with a line spectral density of 0.038 and a minimum frequency interval of 7.0 Hz. Consequently, the DAE method significantly increases the number of line spectra and the discernible line spectral density, demonstrating its superiority over traditional line spectral enhancement techniques.
In the before-and-after comparison of the extraction results and the comparison with the traditional time-accumulation algorithm, our proposed DAE-based line spectrum enhancement extraction method has the following advantages:
  • Traditional algorithms have a limited enhancement effect on broken line spectra, whereas our proposed DAE can adaptively repair line spectra driven by the hydroacoustic data, remove image noise, and thus enhance line spectral features while maximally retaining the original feature information;
  • The traditional method can enhance the contrast between the line features and the background, but its enhancement effect on weak line spectra is poor; our proposed DAE also increases the contrast between weak line spectra and the background, which is visually more intuitive, provides a higher degree of separation, and is conducive to the identification of weak and very weak line spectra. Compared with traditional methods, the DAE method achieves more than double the extractable line spectral density within the 10–300 Hz band;
  • The features after DAE reconstruction are more regular, and line spectrum crossing and overlapping are reduced, which benefits later-stage feature extraction;
  • The DAE model enhances the line spectral features of the LOFAR spectrum with a fast response time: the average processing time is 26 ms per sample.
In summary, spectral reconstruction based on the DAE method can be used to extract line spectral features in low signal-to-noise ratio environments, and its effect is clearly better than that of traditional feature extraction methods. However, the method is still insufficient at characterizing frequency drift and energy fluctuation, and it needs further improvement on measured data to meet the real demands of feature line spectrum extraction.

4. Conclusions

In this paper, research on a feature enhancement extraction technique based on a DAE network is carried out: a G-BRBM for real-valued signals is designed and programmed, and a greedy algorithm is introduced to enhance the extraction of line spectral features. The validity of the model is verified by the loss curve obtained in a simulation experiment. On this basis, an underwater large-vehicle sonar detection LOFAR spectrum reconstruction experiment is conducted, and the performance of the DAE method is compared with the conventional method. The findings indicate that the G-BRBM employed in our network model helps the model converge to its optimal state. The LOFAR spectral features exhibit a notable enhancement after reconstruction, along with a satisfactory global noise removal effect. Additionally, weak line spectra that are imperceptible to the naked eye are amplified, resulting in improved contrast and regularity. The extractable line spectral density within the 10–300 Hz band is more than doubled compared with traditional time-accumulation-based algorithms. Our proposed method can be applied to the enhancement and extraction of weak line spectral features at very low signal-to-noise ratios, which benefits both the subsequent application of the features and the solution of other low-frequency, low-SNR hydroacoustic problems. However, we also found that the model still needs further improvement in the characterization of statistical features such as line spectral frequency drift and energy fluctuation. We will focus on these problems in subsequent studies by performing LOFAR line spectral enhancement with different bandwidths and resolutions and by introducing attention mechanisms and multiscale feature extraction.

Author Contributions

Conceptualization, F.J. and G.L.; methodology, F.J. and G.L.; software, F.J. and G.L.; validation, F.J., G.L. and S.L.; formal analysis, F.J.; investigation, S.L.; resources, F.J. and G.L.; data curation, G.L.; writing—original draft preparation, G.L. and S.L.; writing—review and editing, S.L. and J.N.; visualization, G.L. and J.N.; supervision, F.J. and J.N.; project administration, F.J. and G.L.; funding acquisition, F.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by National Natural Science Foundation of China, grant number 52371356.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author (privacy reasons).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zhuo, J. A Study on the Auditory-Based Algorithms for Underwater Target Radiated Noise Identification. Master’s Thesis, National University of Defense Technology, Changsha, China, November 2016. [Google Scholar]
  2. Zare, M.; Nouri, N.M. Novel feature extraction of underwater targets by encoding hydro-acoustic signatures as image. Appl. Ocean. Res. 2023, 138, 103627. [Google Scholar] [CrossRef]
  3. Li, W.; Shen, X.; Li, Y. A Comparative Study of Multiscale Sample Entropy and Hierarchical Entropy and Its Application in Feature Extraction for Ship-Radiated Noise. Entropy 2019, 21, 793. [Google Scholar] [CrossRef] [PubMed]
4. Zare, M.; Nouri, N.M. A novel hybrid feature extraction approach of marine vessel signal via improved empirical mode decomposition and measuring complexity. Ocean. Eng. 2023, 271, 113727.
5. Aulinas, J.; Carreras, M.; Llado, X.; Salvi, J.; Garcia, R.; Prados, R.; Petillot, Y.R. Feature extraction for underwater visual SLAM. In Proceedings of the OCEANS 2011 IEEE-Spain, Santander, Spain, 6–9 June 2011.
6. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005.
7. Yu, Y.; Cao, X.; Zhang, X. Underwater Target Classification Using Deep Neural Network. In Proceedings of the 2018 OCEANS-MTS/IEEE Kobe Techno-Oceans (OTO), Kobe, Japan, 28–31 May 2018.
8. Feng, Y. Research of SAR Feature Extraction and Target Recognition Based on Deep Learning. Master's Thesis, University of Electronic Science and Technology of China, Chengdu, China, 31 March 2017.
9. Chen, Y.; Xia, R.; Zou, K.; Yang, K. FFTI: Image inpainting algorithm via features fusion and two-steps inpainting. J. Vis. Commun. Image Represent. 2023, 91, 103776.
10. Wang, X.; Zhao, Y.; Teng, X.; Sun, W. A stacked convolutional sparse denoising autoencoder model for underwater heterogeneous information data. Appl. Acoust. 2020, 167, 107391.
11. Shi, M.; Wang, H. Infrared Dim and Small Target Detection Based on Denoising Autoencoder Network. Mob. Netw. Appl. 2020, 25, 1469–1483.
12. Hashisho, Y.; Albadawi, M.; Krause, T.; von Lukas, U.F. Underwater Color Restoration Using U-Net Denoising Autoencoder. In Proceedings of the 2019 11th International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia, 23–25 September 2019.
13. Testolin, A.; Diamant, R. Underwater Acoustic Detection and Localization with a Convolutional Denoising Autoencoder. In Proceedings of the 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Le Gosier, Guadeloupe, 15–18 December 2019.
14. Chen, J.; Han, B.; Ma, X.; Zhang, J. Underwater Target Recognition Based on Multi-Decision LOFAR Spectrum Enhancement: A Deep-Learning Approach. Future Internet 2021, 13, 265.
15. Jeong, H.; Lee, C.; Park, J.; Park, K. Performance of Denoising Autoencoder for Enhancing Image in Shallow Water Acoustic Communication. J. Korea Inst. Inf. Commun. Eng. 2021, 25, 327–329.
16. Ni, J.; Zhao, M.; Hu, C.; Lv, G.; Guo, Z. Ship Shaft Frequency Extraction Based on Improved Stacked Sparse Denoising Auto-Encoder Network. Appl. Sci. 2022, 12, 9076.
17. Chen, Y.; Liu, L.; Phonevilay, V.; Gu, K.; Xia, R.; Xie, J.; Zhang, Q.; Yang, K. Image super-resolution reconstruction based on feature spectrum attention mechanism. Appl. Intell. 2021, 51, 4367–4380.
18. Chen, Y.; Xia, R.; Yang, K.; Zou, K. MFFN: Image super-resolution via multi-level features fusion network. Vis. Comput. 2023, 40, 489–504.
19. Su, J.; Xu, B.; Yin, H. A survey of deep learning approaches to image restoration. Neurocomputing 2022, 487, 46–65.
20. Liu, F.; Shen, T.; Luo, Z.; Zhao, D.; Guo, S. Underwater target recognition using convolutional recurrent neural networks with 3-D Mel-spectrogram and data augmentation. Appl. Acoust. 2021, 178, 107989.
21. Khishe, M. DRW-AE: A Deep Recurrent-Wavelet Autoencoder for Underwater Target Recognition. IEEE J. Ocean. Eng. 2022, 4, 1083–1098.
22. Park, J.; Jung, D.-J. Identifying Tonal Frequencies in a Lofargram with Convolutional Neural Networks. In Proceedings of the 2019 19th International Conference on Control, Automation and Systems (ICCAS), Jeju, Republic of Korea, 15–18 October 2019.
23. Muhammad, I.; Zheng, J.; Shahid, A.; Muhammad, I.; Zafar, M.; Umar, H. DeepShip: An underwater acoustic benchmark dataset and a separable convolution based autoencoder for classification. Expert Syst. Appl. 2021, 183, 115270.
24. Ji, F.; Ni, J.; Li, G.; Liu, L.; Wang, Y. Underwater Acoustic Target Recognition Based on Deep Residual Attention Convolutional Neural Network. J. Mar. Sci. Eng. 2023, 11, 1626.
25. Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554.
26. Vincent, P.; Larochelle, H.; Bengio, Y.; Manzagol, P. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (ICML '08), Helsinki, Finland, 5–9 July 2008.
Figure 1. Principle of the autoencoder networks for noise removal.
Figure 2. Loss curve for the DAE model.
Figure 3. Schematic diagram of sonar detection of large underwater vehicles.
Figure 4. Sonar detection LOFAR color spectrum.
Figure 5. Sonar detection LOFAR grayscale spectrum.
Figure 6. Histogram equalization results.
Figure 7. Image convolution noise reduction result.
Figure 8. LOFAR spectrum of sonar detection.
Figure 9. Fine-tuning flow chart.
Figure 10. DAE method reconstruction results.
Figure 11. LOFAR spectrum before processing.
Figure 12. Traditional method after processing.
Table 1. Experimental environment and parameters.
Physical Quantity | Parameters
Number of hydrophones | 64
Distance | 15 km
Sea state level | 3
The target's source level | 160 dB
Test frequency band | 10–300 Hz
Frequency resolution | 0.1 Hz
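For reference, the band and resolution in Table 1 fix the size of each spectrum slice in the LOFAR analysis. A minimal sketch of that arithmetic (the constant names are illustrative, not from the paper):

```python
F_LOW_HZ = 10.0    # lower edge of test band (Table 1)
F_HIGH_HZ = 300.0  # upper edge of test band (Table 1)
DF_HZ = 0.1        # frequency resolution (Table 1)

# Number of frequency bins per LOFAR spectrum slice in the
# 10-300 Hz band, counting both band edges.
n_bins = int(round((F_HIGH_HZ - F_LOW_HZ) / DF_HZ)) + 1  # 2901 bins
```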
Table 2. Comparison of the effect of traditional method and DAE method.
Method | Band (Δf) | Number of Line Spectra (Nl) | Line Spectral Density (ρl) | Minimum Frequency Interval (fm)
Traditional method | 10–300 Hz | 11 | 0.037 | 7.0 Hz
DAE | 10–300 Hz | 18 | 0.062 | 2.5 Hz
where ρl = Nl/Δf.
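The densities in Table 2 follow directly from the footnote definition ρl = Nl/Δf. A minimal sketch of the computation (the function name is illustrative; values are taken from Table 2):

```python
def line_spectral_density(n_lines: int, f_low_hz: float, f_high_hz: float) -> float:
    """Number of extracted line-spectrum components per hertz
    of analysis bandwidth (the footnote's rho_l = N_l / delta_f)."""
    bandwidth_hz = f_high_hz - f_low_hz  # delta_f = 300 - 10 = 290 Hz
    return n_lines / bandwidth_hz

# Values from Table 2, 10-300 Hz band:
rho_traditional = line_spectral_density(11, 10.0, 300.0)  # ~0.038 (tabulated as 0.037)
rho_dae = line_spectral_density(18, 10.0, 300.0)          # ~0.062
```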
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Ji, F.; Li, G.; Lu, S.; Ni, J. Research on a Feature Enhancement Extraction Method for Underwater Targets Based on Deep Autoencoder Networks. Appl. Sci. 2024, 14, 1341. https://doi.org/10.3390/app14041341
